R-miss-tastic

A resource website on missing values - Methods and references for managing missing data

Package:

mice

Category:

Multiple Imputation

Use-Cases:

Multiple imputation for mixes of continuous, binary, unordered categorical and ordered categorical data; inspecting the missing data; generating simulated incomplete data

Popularity:

CRAN Downloads

Description:

Multiple imputation using Fully Conditional Specification (FCS) implemented by the MICE algorithm as described in Van Buuren and Groothuis-Oudshoorn (2011) doi:10.18637/jss.v045.i03. Each variable has its own imputation model. Built-in imputation models are provided for continuous data (predictive mean matching, normal), binary data (logistic regression), unordered categorical data (polytomous logistic regression) and ordered categorical data (proportional odds). MICE can also impute continuous two-level data (normal model, pan, second-level variables). Passive imputation can be used to maintain consistency between variables. Various diagnostic plots are available to inspect the quality of the imputations.
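As a rough illustration of "each variable has its own imputation model" and of passive imputation, the sketch below assigns methods per variable on a subset of the bundled boys data and recomputes bmi from the imputed hgt and wgt. The column subset, the seed, and the choice of "norm" for hgt are assumptions made for this example, not recommendations from the package authors.

```r
library(mice)

# Subset of the bundled boys data: bmi is a deterministic function of hgt and wgt
dat <- boys[, c("age", "hgt", "wgt", "bmi")]

# Start from the default per-variable methods, then override two of them
meth <- make.method(dat)
meth["hgt"] <- "norm"                       # Bayesian linear regression
meth["bmi"] <- "~ I(wgt / (hgt / 100)^2)"   # passive imputation: recompute bmi

# Keep bmi out of the models for hgt and wgt to avoid circular feedback
pred <- make.predictorMatrix(dat)
pred[c("hgt", "wgt"), "bmi"] <- 0

imp <- mice(dat, method = meth, predictorMatrix = pred,
            m = 3, maxit = 5, seed = 1, printFlag = FALSE)

# Diagnostic: trace plots of the mean and SD of the imputed values per iteration
plot(imp)
```

Because bmi is visited after hgt and wgt, the imputed bmi values in each completed dataset should agree with the imputed height and weight.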

Last update:

CRAN Release

Algorithms:
  • pmm Predictive mean matching
  • midastouch Weighted predictive mean matching
  • sample Random sample from observed values
  • cart Classification and regression trees
  • rf Random forest imputations
  • mean Unconditional mean imputation
  • norm Bayesian linear regression
  • norm.nob Linear regression ignoring model error
  • norm.boot Linear regression using bootstrap
  • norm.predict Linear regression, predicted values
  • quadratic Imputation of quadratic terms
  • ri Random indicator for nonignorable data
  • logreg Logistic regression
  • logreg.boot Logistic regression with bootstrap
  • polr Proportional odds model
  • polyreg Polytomous logistic regression
  • lda Linear discriminant analysis
  • 2l.norm Level-1 normal heteroscedastic
  • 2l.lmer Level-1 normal homoscedastic, lmer
  • 2l.pan Level-1 normal homoscedastic, pan
  • 2l.bin Level-1 logistic, glmer
  • 2lonly.mean Level-2 class mean
  • 2lonly.norm Level-2 class normal
  • 2lonly.pmm Level-2 class predictive mean matching
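The names in the list above are the strings passed to the method argument of mice(). A minimal sketch, assuming the bundled nhanes data and arbitrary small values for m, maxit and seed:

```r
library(mice)

# One method name applied to every incomplete variable
imp_cart <- mice(nhanes, method = "cart", m = 2, maxit = 2,
                 seed = 42, printFlag = FALSE)

# One method name per column, in column order: age, bmi, hyp, chl
# ("" marks a column that is not imputed; age is complete in nhanes)
imp_mixed <- mice(nhanes, method = c("", "norm", "pmm", "norm.boot"),
                  m = 2, maxit = 2, seed = 42, printFlag = FALSE)

# Methods actually used for each variable
imp_cart$method
imp_mixed$method
```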
Datasets:
  • boys (Growth of Dutch boys)
  • brandsma (Brandsma school data, Snijders and Bosker, 2012)
  • employee (Employee selection data)
  • fdd (SE Fireworks disaster data)
  • fdgs (Fifth Dutch growth study, 2009)
  • leiden85 (Leiden 85+ study)
  • mammalsleep (Mammal sleep data)
  • mgg (Self-reported and measured BMI)
  • nhanes (NHANES example - all variables numerical)
  • pattern (Datasets with various missing data patterns)
  • pattern1 (Datasets with various missing data patterns)
  • pattern2 (Datasets with various missing data patterns)
  • pattern3 (Datasets with various missing data patterns)
  • pattern4 (Datasets with various missing data patterns)
  • popmis (Hox pupil popularity data with missing popularity scores)
  • pops (Project on preterm and small for gestational age infants)
  • potthoffroy (Potthoff-Roy data)
  • selfreport (Self-reported and measured BMI)
  • sleep (Mammal sleep data)
  • tbc (Terneuzen birth cohort)
  • walking (Walking disability data)
  • windspeed (Subset of Irish wind speed data)
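The bundled datasets are convenient for trying the other two use cases named above, inspecting missing data and generating incomplete data. The sketch below inspects nhanes and then uses ampute() on a simulated complete data frame. The simulated columns x, y and z, the 30% proportion and the seed are assumptions chosen for illustration.

```r
library(mice)

# Inspect: missing data pattern and per-variable counts for a bundled dataset
md.pattern(nhanes, plot = FALSE)
colSums(is.na(nhanes))

# Generate: start from complete (here simulated) numeric data ...
set.seed(2024)
complete_df <- data.frame(x = rnorm(1000), y = rnorm(1000), z = rnorm(1000))

# ... and let ampute() delete values in roughly 30% of the rows under MAR
amp <- ampute(complete_df, prop = 0.3, mech = "MAR")
incomplete_df <- amp$amp

# Share of rows with at least one missing value
mean(!complete.cases(incomplete_df))
```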
Further Information:
Input:

data.frame

Example:
# classic MICE/multiple imputation workflow
library("mice")

# Perform imputation - create multiple imputed datasets
imp <- mice(nhanes, maxit = 2, m = 2)

# Fit a linear model on each of the datasets
fit <- with(data = imp, expr = lm(bmi ~ hyp + chl))

# Pool the models/results
summary(pool(fit))
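A possible follow-up to the workflow above, sketched under the same small settings (the seed is an assumption added for reproducibility): request confidence intervals for the pooled estimates and extract the completed datasets with complete().

```r
library(mice)

imp <- mice(nhanes, m = 2, maxit = 2, seed = 1, printFlag = FALSE)
fit <- with(imp, lm(bmi ~ hyp + chl))

# Pooled estimates with 95% confidence intervals
summary(pool(fit), conf.int = TRUE)

# First completed dataset, and all completed datasets stacked in long format
first <- complete(imp, 1)
long  <- complete(imp, action = "long")

dim(first)
dim(long)
```

The long format adds the .imp and .id columns, which identify the imputation number and the original row.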

Here you can have an interactive look at the example:

https://rdrr.io/snippets/embedding/

