R-miss-tastic

A resource website on missing values - Methods and references for managing missing data

Package:

missMDA

Category:

Single and multiple imputation

Use-Cases:

Imputation of incomplete continuous, categorical or mixed datasets, …

Description:

Imputation of incomplete continuous or categorical datasets. Missing values are imputed with a principal component analysis (PCA), a multiple correspondence analysis (MCA) or a multiple factor analysis (MFA) model. Multiple imputation can be performed both with PCA or MCA (as the imputation model) and in PCA or MCA (to assess the uncertainty of their results).

Algorithms:
  • Continuous data (multiple) imputation (with Principal Components Analysis)
  • Contingency table imputation (with Correspondence Analysis)
  • Mixed data (multiple) imputation (with Factorial Analysis of Mixed Data)
  • Categorical data (multiple) imputation (with Multiple Correspondence Analysis)
  • Structured data imputation (with Multiple Factor Analysis)
  • Multilevel mixed data imputation (with Multilevel Factorial Analysis for Mixed Data)
  • Overimputation diagnostic
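
As a rough guide, the algorithms listed above can be matched to functions exported by the package. The mapping in the comments below is a sketch based on the package manual and should be checked against the installed version. The run itself imputes the categorical vnf data set with imputeMCA; the value ncp = 4 is an assumption made for illustration, not a tuned choice.

```r
library(missMDA)

## Assumed correspondence between the algorithms above and package functions
## (check against the manual of the installed version):
##   continuous data         -> imputePCA(), MIPCA()
##   contingency tables      -> imputeCA()
##   mixed data              -> imputeFAMD(), MIFAMD()
##   categorical data        -> imputeMCA(), MIMCA()
##   structured data         -> imputeMFA()
##   multilevel mixed data   -> imputeMultilevel()
##   overimputation          -> Overimpute()

data(vnf)  # categorical data set with missing entries

## ncp = 4 is an assumed number of components for this illustration;
## estim_ncpMCA() can estimate it, at a higher computing cost
res.mca <- imputeMCA(vnf, ncp = 4)

## completeObs is the data set with each NA replaced by a category
print(sum(is.na(vnf)))
print(sum(is.na(res.mca$completeObs)))
```

The other single-imputation functions follow the same pattern: an incomplete data set and a number of components go in, and the completed data set is returned in completeObs.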
Datasets:
  • gene
  • geno
  • orange
  • ozone
  • snorena
  • TitanicNA
  • vnf
Further Information:
  • Josse, J. & Husson, F. (2012). Handling missing values in exploratory multivariate data analysis methods. Journal de la SFdS, 153(2), pp. 79-99. PDF (on HAL)
  • Josse, J. & Husson, F. (2016). missMDA: A package for handling missing values in multivariate data analysis. Journal of Statistical Software, 70(1), pp. 1-31. doi:10.18637/jss.v070.i01. PDF (on HAL)
  • Audigier, V., Husson, F. & Josse, J. (2016). Multiple imputation for continuous variables using a Bayesian principal component analysis. Journal of Statistical Computation and Simulation, 86(11), pp. 2140-2156. PDF (on arXiv)
  • Audigier, V., Husson, F. & Josse, J. (2016). A principal component method to impute missing values for mixed data. Advances in Data Analysis and Classification, 10(1), pp. 5-26. PDF (on arXiv)
  • Audigier, V., Husson, F. & Josse, J. (2017). MIMCA: multiple imputation for categorical variables with multiple correspondence analysis. Statistics and Computing, 27(2), pp. 501-518. PDF (on arXiv)
  • Some videos (on YouTube)
Input:

data.frame, mids

Example:
library(missMDA)

data(orange)

print("print data set with NAs")
print(head(orange))

## First the number of components has to be chosen
## (for the imputation step)
## nb <- estim_ncpPCA(orange,ncp.max=5) ## Time consuming, nb = 2
## Imputation
res.comp <- imputePCA(orange,ncp=2)

print("print data set with the imputations")
print(head(res.comp$completeObs))
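
The single imputation above returns one completed data set. For multiple imputation, a minimal sketch with MIPCA on the same data could look as follows. The value ncp = 2 is carried over from the example above, and nboot = 20 is an arbitrary choice made here to keep the run short.

```r
library(missMDA)

data(orange)

## Draw several plausible completed data sets instead of a single one.
## ncp = 2 follows the single-imputation example;
## nboot = 20 is an assumed, illustrative number of imputed data sets.
set.seed(1)
res.mi <- MIPCA(orange, ncp = 2, nboot = 20)

## res.MI is a list with one completed data set per draw
print(length(res.mi$res.MI))
print(head(res.mi$res.MI[[1]]))
```

Differences between the completed data sets reflect the uncertainty due to the missing values, which a single imputation does not show.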
