vdemlite
vdemlite.Rmd
Data and Functionality
vdemlite
incorporates data for 297 variables V-Dem
indices and indicators from 1970 to the present and is organized around
the structure of aggregation outlined in this
document and in Appendix A of the V-Dem
Codebook. It includes all of the indicators included in this
document plus a few background factors and additional commonly-used
indicators. The dataset covers 1970 to the present and includes all
countries in the V-Dem dataset.
There are three primary functions in the vdemlite
package: - fetchdem()
provides functionality for
downloading V-Dem data included in vdemlite
and loads it
into R. - summarizedem()
provides an interactive table of
summary statistics of the data and, for smaller queries, some capacity
for visualization. - searchdem()
function returns an
interactive, searchable table of the dataset.
You can install the latest version of vdemlite
from
GitHub using the pak
package:
#install.packages("pak)
pak::pkg_install("eteitelbaum/vdemlite")
Alternatively, you could use the remotes
package or the
devtools
package:
remotes::install_github("eteitelbaum/vdemlite")
# or
devtools::install_github("eteitelbaum/vdemlite")
Querying V-Dem Data
The main function in vdemlite
is
fetchdem()
, which can be used to query the
vdemlite
dataset and retrieve data for specific indicators,
groups of indicators, years, and countries. The function returns a data
frame with the requested data.
fetchdem(indicators = NULL,
start_year = 1970,
end_year = 2023,
countries = NULL,
wide = TRUE)
Simply calling fetchdem()
will return the entire
vdemlite
dataset. But fetchdem()
takes up to
five arguments that allow the user to conduct more targeted queries:
indicators
, start_year
, end_year
,
countries
, and wide
.
The indicators
argument is a character vector of V-Dem
indicator codes but also groups of variables such as “big_5” (for the
five main high-level democracy indexes) or “v2x_libdem_all” (for all of
the indices and indicators used in the construction of the liberal
democracy index.
The start_year
and end_year
arguments
specify the range of years to fetch data for. The countries
argument is a character vector of three-digit iso3c country codes. The
wide
argument specifies whether to return the data in wide
format (the default) or long format.
Basic Usage
Let’s say you wanted to get the Polyarchy score for all countries for
the years 1980-2020. You could do this by setting
indicators
to v2x_polyarchy
and
start_year
and end_year
to 1980 and 2020,
respectively.
library(vdemlite)
# Polyarchy score for all countries for 1980-2020
polyarchy <- fetchdem(indicators = "v2x_polyarchy",
start_year = 1980, end_year = 2020)
We can add indicators to our desired list or filter for specific
countries by using the combine function (c()
). For example,
if you wanted to get the polyarchy and clean elections index for the USA
and Sweden, you could set indicators
to
c("v2x_polyarchy", "v2xel_frefair")
and
countries
to c("USA", "SWE")
.
# Polyarchy and clean elections index for USA and Sweden for 2000-2020
my_indicators <- fetchdem(indicators = c("v2x_polyarchy", "v2xel_frefair"),
countries = c("USA", "SWE"))
By default, fetchdem()
returns the data in wide format.
However, you can also return the data in long format by setting the
wide
argument to FALSE
.
Querying Groups of Indicators
fetchdem()
can also be used to retrieve groups of
indexes and indicators that follow the structure of aggregation outlined
in Appendix A of the V-Dem Codebook.
Setting indicators to big_5
will retrieve the five main
high-level indices for all countries and years.
# Five main high-level indices for all countries and years in long format
big_5 <- fetchdem(indicators = "big_5", wide = FALSE)
There are also groupings available for the high-level, mid-level, and lower-level indices. For example, you could get all of the high-level indices for all countries in 2015 like this:
# All high-level indices for all countries in 2015
high_level_2015 <- fetchdem(indicators = "high_level",
start_year = 2015, end_year = 2015)
Similarly, for low-level and mid-level indices you would specify
indicators = "low_level"
or
indicators = "mid_level"
.
One useful bit of information for leveraging this functionality is
that in V-Dem, indexes start with the prefix v2x_
.
vdemlite
leverages this structure to enable the user to
retrieve all of the indices and indicators that are used in the
construction of a particular index.
Specifically, you can retrieve all of the indices and indicators that
are used in the construction of a particular index code by adding suffix
_all
to an index code. For example, to get all of the
indicators that comprise the liberal democracy index for India, you
would set indicators
to v2x_libdem_all
and
countries
to IND
.
# Indicators that comprise the liberal democracy index for India, all years
lib_dem_ind_2010 <- fetchdem(indicators = "v2x_libdem_all",
countries = "IND")
Similarly, you can get all of the indicators that comprise the
Regimes of the World index for Venezuela by setting
indicators
to v2x_regime_all
and
countries
to VEN
.
# Indicators that comprise the Regimes of the World index for Venezuela, all years
regime_venezuela <- fetchdem(indicators = "v2x_regime_all",
countries = "VEN")
There are some additional categories not included in the construction
of indicators that are also included in vdemlite
including
legitimation
, civic_space
,
mass_mobilization
, citizen_engagement
,
academic_space
and background
(for background
factors).
Let’s grab the background factors and polyarchy index for all countries in 2015 and use it to do some analysis.
# Polyarchy and background factors, all countries and years
polyarchy_analysis <- fetchdem(indicators = c("v2x_polyarchy", "background"),
start_year = 2015, end_year = 2015)
# Run a regression
model1 <- lm(v2x_polyarchy ~ log(e_gdppc) +
e_pechmor +
e_pelifeex +
log(e_wb_pop),
data = polyarchy_analysis)
# Display the results
summary(model1)
#>
#> Call:
#> lm(formula = v2x_polyarchy ~ log(e_gdppc) + e_pechmor + e_pelifeex +
#> log(e_wb_pop), data = polyarchy_analysis)
#>
#> Residuals:
#> Min 1Q Median 3Q Max
#> -0.60413 -0.17328 0.05856 0.17371 0.41818
#>
#> Coefficients:
#> Estimate Std. Error t value Pr(>|t|)
#> (Intercept) -0.502659 0.371600 -1.353 0.177994
#> log(e_gdppc) 0.022868 0.026088 0.877 0.381996
#> e_pechmor 0.001414 0.001063 1.330 0.185317
#> e_pelifeex 0.018181 0.004987 3.646 0.000356 ***
#> log(e_wb_pop) -0.022915 0.010431 -2.197 0.029412 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Residual standard error: 0.2245 on 166 degrees of freedom
#> (8 observations deleted due to missingness)
#> Multiple R-squared: 0.2581, Adjusted R-squared: 0.2403
#> F-statistic: 14.44 on 4 and 166 DF, p-value: 3.871e-10
Summarizing V-Dem Data
Sometimes you might be curious to know about the coverage of a given
index of indicator and to get a quick look at how the data look for
individual countries. For this, vdemlite
includes a
function called summarizedem()
that provides an
interactive, table of summary statistics for a given index or
indicator:
summarizedem(indicator = NULL,
start_year = 1970,
end_year = 2023,
countries = NULL)
summarizedem()
takes four arguments:
indicator
, start_year
, end_year
and countries
. The indicator
argument takes a
character string representing exactly one V-Dem indicator code. The
start_year
and end_year
arguments specify the
range of years to fetch data for, while the countries
argument is a character vector of country codes.
Let’s say you wanted to get a summary of the polyarchy index for all countries and years:
# Summary statistics for the polyarchy index
summarizedem(indicator = "v2x_polyarchy")
But you can also be more specific. Maybe you want a summary of the polyarchy index for the USA and Sweden for the years 2000-2020:
# Summary statistics for the polyarchy index for the USA and Sweden for 2000-2020
summarizedem(indicator = "v2x_polyarchy",
start_year = 2000, end_year = 2020,
countries = c("USA", "SWE"))
Searching for Indicators
If you are wondering whether your favorite V-Dem indicator is
available, vdemlite
includes a function called
searchdem()
that, by default, produces a searchable table
of all V-Dem indicators included in the dataset.
# Produce a searchable table of the indicators
searchdem()
The table is based on the Structure of V-Dem Indices, Components and Indicators document and includes four columns: the variable tag; the index or indicator descriptor; the level of the index or indicator (high-level index, mid-level index, lower-level index; or indicator); and a column called “Part” that distinguishes between V-Dem Indicators and Indices (part 1) and indices created using V-Dem data (part 2).
The default behavior is to return all indices and their descriptions
but does have an argument index
that is a character vector
of indices or indicators. If a value for index
is provided,
the function will return any sub-indices and indicators used to
construct the selected indices. For example
searchdem("v2x_libdem")
would return all of the sub-indices
and indicators used to construct the polyarchy score.
It is not possible to specify the groups of indicators that can be
used in the fetchdem()
function, e.g. big5
,
v2x_libdem_all
, etc. in the indices
argument.
However, you can search for the “level” groupings
(e.g. high_level
, mid_level
and
lower_level
) in the table itself.