Skip to contents

Data and Functionality

vdemlite incorporates data for 297 variables V-Dem indices and indicators from 1970 to the present and is organized around the structure of aggregation outlined in this document and in Appendix A of the V-Dem Codebook. It includes all of the indicators included in this document plus a few background factors and additional commonly-used indicators. The dataset covers 1970 to the present and includes all countries in the V-Dem dataset.

There are three primary functions in the vdemlite package: - fetchdem() provides functionality for downloading V-Dem data included in vdemlite and loads it into R. - summarizedem() provides an interactive table of summary statistics of the data and, for smaller queries, some capacity for visualization. - searchdem() function returns an interactive, searchable table of the dataset.

You can install the latest version of vdemlite from GitHub using the pak package:

#install.packages("pak)
pak::pkg_install("eteitelbaum/vdemlite")

Alternatively, you could use the remotes package or the devtools package:

remotes::install_github("eteitelbaum/vdemlite")
# or
devtools::install_github("eteitelbaum/vdemlite")

Querying V-Dem Data

The main function in vdemlite is fetchdem(), which can be used to query the vdemlite dataset and retrieve data for specific indicators, groups of indicators, years, and countries. The function returns a data frame with the requested data.

fetchdem(indicators = NULL, 
          start_year = 1970, 
          end_year = 2023,
          countries = NULL,
          wide = TRUE)

Simply calling fetchdem() will return the entire vdemlite dataset. But fetchdem() takes up to five arguments that allow the user to conduct more targeted queries: indicators, start_year, end_year, countries, and wide.

The indicators argument is a character vector of V-Dem indicator codes but also groups of variables such as “big_5” (for the five main high-level democracy indexes) or “v2x_libdem_all” (for all of the indices and indicators used in the construction of the liberal democracy index.

The start_year and end_year arguments specify the range of years to fetch data for. The countries argument is a character vector of three-digit iso3c country codes. The wide argument specifies whether to return the data in wide format (the default) or long format.

Basic Usage

Let’s say you wanted to get the Polyarchy score for all countries for the years 1980-2020. You could do this by setting indicators to v2x_polyarchy and start_year and end_year to 1980 and 2020, respectively.

library(vdemlite)

# Polyarchy score for all countries for 1980-2020
polyarchy <- fetchdem(indicators = "v2x_polyarchy",
                       start_year = 1980, end_year = 2020)

We can add indicators to our desired list or filter for specific countries by using the combine function (c()). For example, if you wanted to get the polyarchy and clean elections index for the USA and Sweden, you could set indicators to c("v2x_polyarchy", "v2xel_frefair") and countries to c("USA", "SWE").

# Polyarchy and clean elections index for USA and Sweden for 2000-2020
my_indicators <- fetchdem(indicators = c("v2x_polyarchy", "v2xel_frefair"),
                           countries = c("USA", "SWE"))

By default, fetchdem() returns the data in wide format. However, you can also return the data in long format by setting the wide argument to FALSE.

# Five main high-level indices for all countries and years in long format
big_5 <- fetchdem(indicators = c("v2x_polyarchy", "v2xel_frefair"),
                   countries = c("USA", "SWE"), wide = FALSE)

Querying Groups of Indicators

fetchdem() can also be used to retrieve groups of indexes and indicators that follow the structure of aggregation outlined in Appendix A of the V-Dem Codebook.

Setting indicators to big_5 will retrieve the five main high-level indices for all countries and years.

# Five main high-level indices for all countries and years in long format
big_5 <- fetchdem(indicators = "big_5", wide = FALSE)

There are also groupings available for the high-level, mid-level, and lower-level indices. For example, you could get all of the high-level indices for all countries in 2015 like this:

# All high-level indices for all countries in 2015
high_level_2015 <- fetchdem(indicators = "high_level",
                        start_year = 2015, end_year = 2015)

Similarly, for low-level and mid-level indices you would specify indicators = "low_level" or indicators = "mid_level".

One useful bit of information for leveraging this functionality is that in V-Dem, indexes start with the prefix v2x_. vdemlite leverages this structure to enable the user to retrieve all of the indices and indicators that are used in the construction of a particular index.

Specifically, you can retrieve all of the indices and indicators that are used in the construction of a particular index code by adding suffix _all to an index code. For example, to get all of the indicators that comprise the liberal democracy index for India, you would set indicators to v2x_libdem_all and countries to IND.

# Indicators that comprise the liberal democracy index for India, all years
lib_dem_ind_2010 <- fetchdem(indicators = "v2x_libdem_all",
                            countries = "IND")

Similarly, you can get all of the indicators that comprise the Regimes of the World index for Venezuela by setting indicators to v2x_regime_all and countries to VEN.

# Indicators that comprise the Regimes of the World index for Venezuela, all years
regime_venezuela <- fetchdem(indicators = "v2x_regime_all",
                           countries = "VEN")

There are some additional categories not included in the construction of indicators that are also included in vdemlite including legitimation, civic_space, mass_mobilization, citizen_engagement, academic_space and background (for background factors).

Let’s grab the background factors and polyarchy index for all countries in 2015 and use it to do some analysis.

# Polyarchy and background factors, all countries and years
polyarchy_analysis <- fetchdem(indicators = c("v2x_polyarchy", "background"),
                                start_year = 2015, end_year = 2015)

# Run a regression
model1 <- lm(v2x_polyarchy ~ log(e_gdppc) + 
               e_pechmor + 
               e_pelifeex + 
               log(e_wb_pop), 
             data = polyarchy_analysis)

# Display the results
summary(model1)
#> 
#> Call:
#> lm(formula = v2x_polyarchy ~ log(e_gdppc) + e_pechmor + e_pelifeex + 
#>     log(e_wb_pop), data = polyarchy_analysis)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.60413 -0.17328  0.05856  0.17371  0.41818 
#> 
#> Coefficients:
#>                Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   -0.502659   0.371600  -1.353 0.177994    
#> log(e_gdppc)   0.022868   0.026088   0.877 0.381996    
#> e_pechmor      0.001414   0.001063   1.330 0.185317    
#> e_pelifeex     0.018181   0.004987   3.646 0.000356 ***
#> log(e_wb_pop) -0.022915   0.010431  -2.197 0.029412 *  
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.2245 on 166 degrees of freedom
#>   (8 observations deleted due to missingness)
#> Multiple R-squared:  0.2581, Adjusted R-squared:  0.2403 
#> F-statistic: 14.44 on 4 and 166 DF,  p-value: 3.871e-10

Summarizing V-Dem Data

Sometimes you might be curious to know about the coverage of a given index of indicator and to get a quick look at how the data look for individual countries. For this, vdemlite includes a function called summarizedem() that provides an interactive, table of summary statistics for a given index or indicator:

summarizedem(indicator = NULL,
              start_year = 1970,
              end_year = 2023,
              countries = NULL)

summarizedem() takes four arguments: indicator, start_year, end_year and countries. The indicator argument takes a character string representing exactly one V-Dem indicator code. The start_year and end_year arguments specify the range of years to fetch data for, while the countries argument is a character vector of country codes.

Let’s say you wanted to get a summary of the polyarchy index for all countries and years:

# Summary statistics for the polyarchy index
summarizedem(indicator = "v2x_polyarchy")
Summary of v2x_polyarchy by Country
Years: 1970 - 2023

But you can also be more specific. Maybe you want a summary of the polyarchy index for the USA and Sweden for the years 2000-2020:

# Summary statistics for the polyarchy index for the USA and Sweden for 2000-2020
summarizedem(indicator = "v2x_polyarchy",
              start_year = 2000, end_year = 2020,
              countries = c("USA", "SWE"))
Summary of v2x_polyarchy by Country
Years: 2000 - 2020

Searching for Indicators

If you are wondering whether your favorite V-Dem indicator is available, vdemlite includes a function called searchdem() that, by default, produces a searchable table of all V-Dem indicators included in the dataset.

# Produce a searchable table of the indicators

searchdem()
Variable Tags and Descriptions

The table is based on the Structure of V-Dem Indices, Components and Indicators document and includes four columns: the variable tag; the index or indicator descriptor; the level of the index or indicator (high-level index, mid-level index, lower-level index; or indicator); and a column called “Part” that distinguishes between V-Dem Indicators and Indices (part 1) and indices created using V-Dem data (part 2).

The default behavior is to return all indices and their descriptions but does have an argument index that is a character vector of indices or indicators. If a value for index is provided, the function will return any sub-indices and indicators used to construct the selected indices. For example searchdem("v2x_libdem") would return all of the sub-indices and indicators used to construct the polyarchy score.

It is not possible to specify the groups of indicators that can be used in the fetchdem() function, e.g. big5, v2x_libdem_all, etc. in the indices argument. However, you can search for the “level” groupings (e.g. high_level, mid_level and lower_level) in the table itself.