R-miss-tastic

A resource website on missing values - Methods and references for managing missing data

Package:

MetabImpute

Authors:

Tarek Firzli, Trenton Davis, Emily Higgins

Category:

Single and Multiple Imputation, Left-Censored Missing Data, Metabolomics

Use-Cases:

Single imputation, Metabolomics, Imputation with Biological Replicates

Description:

A package to evaluate missing data, simulate data matrices and missingness, evaluate multiple imputation methods and return statistics on them, and impute using a range of standard imputation approaches. It also includes novel imputation methods that make use of biological or technical replication in the data. Intraclass correlation coefficient (ICC) evaluation methods are included specifically for researchers working with data that has biological or technical replicates. Source code was written by the authors, with code copied and modified from the following GitHub packages: https://github.com/Tirgit/missCompare; https://github.com/WandeRum/GSimp (Wei, R., Wang, J., Jia, E., Chen, T., Ni, Y., & Jia, W. (2017). GSimp: A Gibbs sampler based left-censored missing value imputation approach for metabolomics studies. PLOS Computational Biology); and https://github.com/juuussi/impute-metabo (Kokla, M., Virtanen, J., Kolehmainen, M. et al. Random forest-based imputation outperforms other methods for imputing LC-MS metabolomics data: a comparative study. BMC Bioinformatics 20, 492 (2019). https://doi.org/10.1186/s12859-019-3110-0).
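
The simulate-then-evaluate workflow described above can be illustrated with a small sketch. This is not MetabImpute code: the function names `censor_below`, `half_min_impute` and `rmse_on_masked` are hypothetical, and half-minimum replacement is used only as a common left-censored baseline, assumed here for illustration. The idea is to mask the lowest values of a complete column (mimicking values below a detection limit), impute them, and score the imputation against the known truth.

```python
import math

def censor_below(column, fraction):
    """Mask the lowest `fraction` of values as None (simulated left-censoring).

    Ties at the cutoff are all masked, so slightly more than `fraction`
    of the values can be removed.
    """
    k = int(len(column) * fraction)
    if k == 0:
        return list(column)
    cutoff = sorted(column)[k - 1]
    return [None if v <= cutoff else v for v in column]

def half_min_impute(column):
    """Replace missing values with half of the smallest observed value.

    Assumes at least one value in the column is observed.
    """
    observed = [v for v in column if v is not None]
    fill = min(observed) / 2.0
    return [fill if v is None else v for v in column]

def rmse_on_masked(truth, masked, imputed):
    """Root mean squared error, computed only over the cells that were masked."""
    errs = [(t - est) ** 2
            for t, m, est in zip(truth, masked, imputed) if m is None]
    return math.sqrt(sum(errs) / len(errs)) if errs else 0.0

# Hypothetical intensities for one metabolite across ten samples.
truth = [5.0, 1.0, 8.0, 2.0, 9.0, 4.0, 7.0, 3.0, 6.0, 10.0]
masked = censor_below(truth, 0.2)
imputed = half_min_impute(masked)
print(rmse_on_masked(truth, masked, imputed))
```

Scoring only the masked cells keeps the error measure focused on what the imputation actually had to estimate; repeating the procedure over several imputation methods gives a simple comparison table.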


Package:

pcaMethods

Authors:

Wolfram Stacklies, Henning Redestig, Kevin Wright

Category:

Single Imputation

Use-Cases:

Bayesian and probabilistic PCA on incomplete data, Missing value imputation using PCA methods, Comparison with cluster-based imputation, Unified structure for PCA results and plots

Description:

Provides Bayesian PCA, Probabilistic PCA, Nipals PCA, Inverse Non-Linear PCA and the conventional SVD PCA. A cluster-based method for missing value estimation is included for comparison. BPCA, PPCA and NipalsPCA can be used to perform PCA on incomplete data as well as for accurate missing value estimation. A set of methods for printing and plotting the results is also provided. All PCA methods use the same data structure (pcaRes) to provide a common interface to the PCA results. Initiated at the Max-Planck Institute for Molecular Plant Physiology, Golm, Germany.
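
To illustrate how PCA can be fitted to incomplete data and then used to estimate the missing cells, here is a minimal pure-Python sketch of a NIPALS-style iteration that simply skips missing entries when updating scores and loadings. It is not the pcaMethods implementation: the function name `nipals_impute` and its parameters are hypothetical, missing values are assumed to be encoded as `None`, and with `center=True` every column is assumed to contain at least one observed value.

```python
import math

def nipals_impute(X, n_components=1, center=True, max_iter=500, tol=1e-12):
    """Fill None entries of X (a list of rows) from a low-rank NIPALS-style fit."""
    n, p = len(X), len(X[0])
    obs = [[X[i][j] is not None for j in range(p)] for i in range(n)]
    means = [0.0] * p
    if center:
        # Centre each column on the mean of its observed values only.
        for j in range(p):
            col = [X[i][j] for i in range(n) if obs[i][j]]
            means[j] = sum(col) / len(col)
    # Residuals; missing cells hold 0.0 but are always skipped via `obs`.
    R = [[X[i][j] - means[j] if obs[i][j] else 0.0 for j in range(p)]
         for i in range(n)]
    fit = [list(means) for _ in range(n)]
    for _comp in range(n_components):
        # Start the scores from the residual column with the largest sum of squares.
        start = max(range(p), key=lambda j: sum(R[i][j] ** 2 for i in range(n)))
        t = [R[i][start] for i in range(n)]
        if not any(t):
            break  # nothing left to explain
        load = [0.0] * p
        for _it in range(max_iter):
            # Loadings: regress each column on the scores over observed rows.
            load = []
            for j in range(p):
                num = sum(t[i] * R[i][j] for i in range(n) if obs[i][j])
                den = sum(t[i] ** 2 for i in range(n) if obs[i][j])
                load.append(num / den if den else 0.0)
            norm = math.sqrt(sum(v * v for v in load))
            if norm == 0.0:
                break
            load = [v / norm for v in load]
            # Scores: regress each row on the loadings over observed columns.
            t_new = []
            for i in range(n):
                num = sum(R[i][j] * load[j] for j in range(p) if obs[i][j])
                den = sum(load[j] ** 2 for j in range(p) if obs[i][j])
                t_new.append(num / den if den else 0.0)
            change = sum((a - b) ** 2 for a, b in zip(t_new, t))
            t = t_new
            if change <= tol * sum(v * v for v in t):
                break
        # Add this component to the fit and deflate the observed residuals.
        for i in range(n):
            for j in range(p):
                fit[i][j] += t[i] * load[j]
                if obs[i][j]:
                    R[i][j] -= t[i] * load[j]
    # Observed values are kept as-is; only missing cells take the fitted value.
    return [[X[i][j] if obs[i][j] else fit[i][j] for j in range(p)]
            for i in range(n)]

# Hypothetical example: a rank-one table with two cells removed.
data = [[1.0, None, 3.0], [2.0, 4.0, 6.0], [3.0, 6.0, 9.0], [4.0, 8.0, None]]
print(nipals_impute(data, n_components=1, center=False))
```

Skipping missing cells in each regression step, rather than filling them with a placeholder first, is what lets the same loop serve both as a PCA fit and as an imputation method. The number of components is a tuning choice: too few underfits the structure, too many starts fitting noise.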
