R-miss-tastic

A resource website on missing values - Methods and references for managing missing data

Package:

Amelia

Authors:

James Honaker [aut], Gary King [aut], Matthew Blackwell [aut, cre]

Category:

Multiple Imputation

Use-Cases:

Cross-sectional survey data, High-dimensional datasets with many variables.

Popularity:

CRAN Downloads

Description:

A tool that “multiply imputes” missing data in a single cross-section (such as a survey), in a time series (such as variables collected for each year in a country), or in a time-series-cross-sectional data set (such as variables collected by year for each of several countries). Amelia II implements our bootstrapping-based algorithm, which gives essentially the same answers as the standard IP or EMis approaches, is usually considerably faster than existing approaches, and can handle many more variables. Unlike Amelia I and other statistically rigorous imputation software, it virtually never crashes (but please let us know if you find to the contrary!). The program also generalizes existing approaches by allowing for trends in time series across observations within a cross-sectional unit, as well as priors that let experts incorporate beliefs they have about the values of missing cells in their data. Amelia II also includes useful diagnostics of the fit of multiple imputation models. The program works from the R command line or via a graphical user interface that does not require users to know R.
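As a rough illustration of the command-line workflow described above, the following minimal sketch multiply imputes a time-series-cross-sectional data set. It assumes the Amelia package is installed and uses the `africa` example data that ships with it; the choice of `m = 5` imputations and the seed are illustrative, not recommendations from the package authors.

```r
# Assumption: Amelia is installed, e.g. via install.packages("Amelia")
library(Amelia)

# 'africa' is an example country-year panel bundled with Amelia
# that contains some missing cells
data(africa)

# Imputations are random draws, so fix a seed for reproducibility
set.seed(1)

# ts names the time variable, cs the cross-sectional unit;
# p2s = 0 suppresses progress output
a.out <- amelia(africa, m = 5, ts = "year", cs = "country", p2s = 0)

# a.out$imputations is a list holding the m completed data frames
n_imputations <- length(a.out$imputations)
has_missing <- anyNA(a.out$imputations[[1]])
```

Each completed data frame can then be analysed separately and the results combined, which is the usual multiple imputation workflow.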


Package:

CALIBERrfimpute

Authors:

Anoop Shah [aut, cre], Jonathan Bartlett [ctb], Harry Hemingway [ths], Owen Nicholas [ths], Aroon Hingorani [ths]

Category:

Multiple Imputation

Use-Cases:

Multiple Imputation, MICE and Random Forest

Popularity:

CRAN Downloads

Description:

Functions to impute using random forest under full conditional specifications (multivariate imputation by chained equations). The methods are described in Shah and others (2014) doi:10.1093/aje/kwt312.

Algorithms for Imputation:

mice-compatible methods such as:

  • rfcont - Impute continuous variables using Random Forest within MICE,
  • rfcat - Impute categorical variables using Random Forest within MICE.

Other Algorithms:

  • simdata() - Simulate multivariate data for testing.

Datasets:

none.
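The `rfcont` and `rfcat` methods listed above are meant to be passed to `mice()` by name. The sketch below shows one way this might look. It assumes both mice and CALIBERrfimpute are installed, and it uses the small `nhanes2` example data from mice (not from this package) purely for illustration; the assignment of methods to columns is an assumption based on the column types in that data set.

```r
# Assumption: both packages are installed from CRAN
library(mice)
library(CALIBERrfimpute)

# nhanes2 (from mice): age (factor, complete), bmi (numeric),
# hyp (factor), chl (numeric)
data(nhanes2)

# Imputations are random draws, so fix a seed for reproducibility
set.seed(1)

# One method per column, in column order: "" leaves the complete
# column alone, rfcont handles continuous variables and rfcat
# handles categorical ones
imp <- mice(nhanes2,
            method = c("", "rfcont", "rfcat", "rfcont"),
            m = 5, printFlag = FALSE)

# Extract the first completed data set
completed <- complete(imp, 1)
```

With only 25 rows, `nhanes2` is too small to judge imputation quality; it serves here only to show how the method names plug into `mice()`.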
