Package:
missForest
Category:
Single Imputation
Use-Cases:
Single Imputation of continuous and/or categorical data.
Description:
The function ‘missForest’ in this package imputes missing values, particularly in mixed-type data. It trains a random forest on the observed values of a data matrix and uses it to predict the missing values. It handles continuous and/or categorical data, including complex interactions and non-linear relations. It yields an out-of-bag (OOB) imputation error estimate without needing a test set or elaborate cross-validation, and it can be run in parallel to save computation time.
Algorithms:
- missForest (randomForest)
Datasets:
none
Further Information:
Stekhoven, D. J., & Buehlmann, P. (2012). MissForest - non-parametric missing value imputation for mixed-type data. Bioinformatics, 28(1), 112-118.
“Using the missForest package” (https://stat.ethz.ch/education/semesters/ss2012/ams/paper/missForest_1.2.pdf)
Input:
data.frame
Example:
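library(missForest)
library(VIM)
# Load the sleep data from the VIM package as an example
data(sleep, package = "VIM")
print("before imputation")
summary(sleep)
# Fix the random seed so the imputation is reproducible
set.seed(1)
# Perform the imputation; the completed data are returned in $ximp
erg <- missForest(sleep)
print("after imputation")
summary(erg$ximp)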
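The description mentions mixed-type data and the out-of-bag (OOB) error estimate. As a minimal sketch of both, the following uses the built-in iris data (four numeric columns and one factor) with 10% of the entries removed at random by the package's prodNA helper. The data set, the seed, and the names iris_mis and imp are illustrative choices, not part of the original example.

```r
library(missForest)

set.seed(42)                          # fixes the NA pattern and the forests
data(iris)                            # 4 numeric columns + 1 factor (Species)
iris_mis <- prodNA(iris, noNA = 0.1)  # remove 10% of the entries at random

# variablewise = TRUE requests one OOB error per column instead of
# a single NRMSE (numeric) and PFC (categorical) summary
imp <- missForest(iris_mis, variablewise = TRUE)

# MSE for numeric columns, proportion of falsely classified (PFC) for factors
imp$OOBerror

# The completed data frame has no missing values left
summary(imp$ximp)
```

With the default variablewise = FALSE, imp$OOBerror instead holds one aggregated NRMSE value for the continuous columns and one PFC value for the categorical columns.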
Here you can have an interactive look at the example: https://rdrr.io/snippets/embedding/
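The description also notes that missForest can be run in parallel. A rough sketch of how this might look is given below, assuming the doParallel package is installed and two cores are available. The backend, the core count, and the iris data with artificially removed entries are assumptions made for illustration only.

```r
library(missForest)
library(doParallel)                   # assumed installed; supplies the backend

registerDoParallel(cores = 2)         # register a parallel backend, 2 workers

set.seed(42)                          # fixes which entries are set to NA
data(iris)
iris_mis <- prodNA(iris, noNA = 0.1)  # remove 10% of the entries at random

# parallelize = "forests": the trees of each forest are grown across workers
# parallelize = "variables": different variables are handled on different workers
imp_par <- missForest(iris_mis, parallelize = "forests")
imp_par$OOBerror

stopImplicitCluster()                 # release the workers again
```

Which mode pays off depends on the data: "forests" tends to suit data with few variables and large forests, while "variables" may help more when there are many columns. Results from a parallel run are not necessarily identical to a sequential run with the same seed.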