R-miss-tastic

A resource website on missing values - Methods and references for managing missing data


A commented version of this bibliography can be found here.

Citation Year Publication type
Abayomi, K., A. Gelman, and M. Levy. Diagnostics for multivariate imputations. In: Journal of the Royal Statistical Society, Series C (Applied Statistics) 57.3 (2008), pp. 273-291.
DOI
2008 Article
Albert, P. S. and D. A. Follmann. Modeling repeated count data subject to informative dropout. In: Biometrics 56.3 (2000), pp. 667-677.
DOI
2000 Article
Allison, P. D. Missing Data. Quantitative Applications in the Social Sciences. Thousand Oaks, CA, USA: Sage Publications, 2001. ISBN: 9780761916727.
DOI
2001 Book
Andridge, R. and R. J. A. Little. A review of hot deck imputation for survey non-response. In: International Statistical Review 78.1 (2010), pp. 40-64.
DOI
2010 Article
Audigier, V., F. Husson, and J. Josse. A principal component method to impute missing values for mixed data. In: Advances in Data Analysis and Classification 10.1 (2016), pp. 5-26.
DOI
2016 Article
Audigier, V., F. Husson, and J. Josse. Multiple imputation for continuous variables using a Bayesian principal component analysis. In: Journal of Statistical Computation and Simulation 86.11 (2015), pp. 2140-2156.
DOI
2015 Article
Audigier, V., F. Husson, and J. Josse. MIMCA: multiple imputation for categorical variables with multiple correspondence analysis. In: Statistics and Computing 27.2 (2016), pp. 1-18. eprint: 1505.08116.
DOI
2016 Article
Bang, H. and J. M. Robins. Doubly robust estimation in missing data and causal inference models. In: Biometrics 61.4 (2005), pp. 962-973.
DOI
2005 Article
Baraldi, A. N. and C. K. Enders. An introduction to modern missing data analysis. In: Journal of School Psychology 48.1 (2010), pp. 5-37.
DOI
2010 Article
Beretta, L. and A. Santaniello. Nearest neighbor imputation algorithms: a critical evaluation. In: BMC Medical Informatics and Decision Making. Proceedings of the 5th Translational Bioinformatics Conference (TBC 2015): medical informatics and decision making 16.Supp. 3 (2016), p. 74.
DOI
2016 Article
Bartlett, J. W., O. Harel, and J. R. Carpenter. Asymptotically unbiased estimation of exposure odds ratios in complete records logistic regression. In: American Journal of Epidemiology 182.8 (2015), pp. 730–736.
DOI
2015 Article
Beaulac, C. and J. S. Rosenthal. BEST: A decision tree algorithm that handles missing values. In: arXiv preprint (2018). eprint: 1804.10168.
URL
2018 Article
Bengio, Y. and F. Gingras. Recurrent neural networks for missing or asynchronous data. In: Proceedings of the 8th International Conference on Neural Information Processing Systems. (Nov. 27, 1995-Dec. 02, 1995). Cambridge, MA, USA: MIT Press, 1995, pp. 395-401.
URL
1995 Paper
Bertsimas, D., C. Pawlowski, and Y. D. Zhuo. From predictive methods to missing data imputation: an optimization approach. In: The Journal of Machine Learning Research 18.1 (2017), pp. 7133–7171. 2017 Article
Beunckens, C., G. Molenberghs, G. Verbeke, et al. A latent-class mixture model for incomplete longitudinal Gaussian data. In: Biometrics 64.1 (2008), pp. 96–105.
DOI
2008 Article
Bianchi, F. M., L. Livi, K. Ø. Mikalsen, et al. Learning representations of multivariate time series with missing data. In: Pattern Recognition 96 (2019), p. 106973.
DOI
2019 Article
Biessmann, F., D. Salinas, S. Schelter, et al. “Deep” Learning for Missing Value Imputation in Tables with Non-Numerical Data. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM ’18. Torino, Italy: ACM, 2018, pp. 2017–2025. ISBN: 978-1-4503-6014-2.
2018 Paper
Blake, H. A., C. Leyrat, K. Mansfield, et al. Propensity scores using missingness pattern information: a practical guide. In: arXiv preprint (2019). arXiv: 1901.03981 [stat.ME].
URL
2019 Article
Brinis, S., C. Traina, and A. J. Traina. Hollow-tree: a metric access method for data with missing values. In: Journal of Intelligent Information Systems (2019), pp. 1–28.
DOI
2019 Article
Buck, S. F. A method of estimation of missing values in multivariate data suitable for use with an electronic computer. In: Journal of the Royal Statistical Society, Series B 22 (1960), pp. 302-306.
DOI
1960 Article
Burns, R. M. Multiple and replicate item imputation in a complex sample survey. In: Proceedings of the 6th Annual Research Conference. Ed. by Bureau of the Census. Washington DC, USA, 1990, pp. 655-665.
1990 Paper
Candès, E. J., C. A. Sing-Long, and J. D. Trzasko. Unbiased risk estimates for singular value thresholding and spectral estimators. In: IEEE Transactions on Signal Processing 61.19 (2013), pp. 4643-4657.
DOI
2013 Article
Carpenter, J. R., M. G. Kenward, and S. Vansteelandt. A comparison of multiple imputation and doubly robust estimation for analyses with missing data. In: Journal of the Royal Statistical Society: Series A (Statistics in Society) 169.3 (2006), pp. 571–584.
DOI
2006 Article
Carpenter, J. and M. Kenward. Multiple Imputation and its Application. Chichester, West Sussex, UK: Wiley, 2013. ISBN: 9780470740521.
DOI
2013 Book
Chen, T. and C. Guestrin. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. (Aug. 13, 2016-Aug. 17, 2016). New York, NY, USA: ACM, 2016, pp. 785-794. ISBN: 0450342322.
DOI
2016 Paper
Chen, Y. and M. Sadinle. Nonparametric Pattern-Mixture Models for Inference with Missing Data. In: arXiv preprint (2019). arXiv: 1904.11085 [stat.ME].
URL
2019 Article
Chen, J. and J. Shao. Nearest neighbor imputation for survey data. In: Journal of Official Statistics 16.2 (2000), pp. 113-131.
URL
2000 Article
Collins, L. M., J. L. Schafer, and C.-M. Kam. A comparison of inclusive and restrictive strategies in modern missing data procedures. In: Psychological Methods 6.4 (2001), pp. 330-351.
DOI
2001 Article
Cranmer, S. J. and J. Gill. We have to be discrete about this: a non-parametric imputation technique for missing categorical data. In: British Journal of Political Science 43 (2012), pp. 425-449.
DOI
2012 Article
Crookston, N. L. and A. O. Finley. yaImpute: an R package for kNN imputation. In: Journal of Statistical Software 23 (2008), p. 10.
DOI
2008 Article
Dax, A. Imputing Missing Entries of a Data Matrix: A review. In: Journal of Advanced Computing 3.3 (2014), pp. 98-222.
DOI
2014 Article
Dempster, A. P., N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. In: Journal of the Royal Statistical Society, Series B (Methodological) 39.1 (1977), pp. 1-38.
URL
1977 Article
Diggle, P. and M. G. Kenward. Informative drop-out in longitudinal data analysis. In: Journal of the Royal Statistical Society, Series C (Applied Statistics) 43.1 (1994), pp. 49-93.
DOI
1994 Article
Ding, P. and F. Li. Causal Inference: A Missing Data Perspective. In: Statistical Science 33.2 (2018), pp. 214–237.
DOI
2018 Article
Ding, Y. and J. S. Simonoff. An investigation of missing data methods for classification trees applied to binary response data. In: Journal of Machine Learning Research 11.1 (2010), pp. 131-170.
URL
2010 Article
Dong, Y. and C. J. Peng. Principled missing data methods for researchers. In: SpringerPlus 2 (2013), p. 222.
DOI
2013 Article
Enders, C. K. Applied Missing Data Analysis. Guilford Press, 2010, p. 401. ISBN: 9781606236390. 2010 Book
Enders, C. K. A primer on maximum likelihood algorithms available for use with missing data. In: Structural Equation Modeling 8.1 (2001), pp. 128-141.
DOI
2001 Article
Erler, N. S., D. Rizopoulos, and E. M. Lesaffre. JointAI: joint analysis and imputation of incomplete data in R. In: arXiv preprint (2019).
URL
2019 Article
Hunt, L. and M. Jorgensen. Mixture model clustering for mixed data with missing information. In: Computational Statistics & Data Analysis 41.3-4 (2003), pp. 429–440.
DOI
2003 Article
Jiang, W., M. Bogdan, J. Josse, et al. Adaptive Bayesian SLOPE–High-dimensional Model Selection with Missing Values. In: arXiv preprint (2019).
URL
2019 Article
Chi, J. T., E. C. Chi, and R. G. Baraniuk. k-pod: A method for k-means clustering of missing data. In: The American Statistician 70.1 (2016), pp. 91–99.
DOI
2016 Article
Fang, F., J. Zhao, and J. Shao. Imputation-based adjusted score equations in generalized linear models with nonignorable missing covariate values. In: Statistica Sinica 28.4 (2018), pp. 1677–1701.
DOI
2018 Article
Fay, R. E. Alternative paradigms for the analysis of imputed survey data. In: Journal of the American Statistical Association 91.434 (1996), pp. 490-498.
DOI
1996 Article
Fellegi, I. P. and D. Holt. A systematic approach to automatic edit and imputation. In: Journal of the American Statistical Association 71.353 (1976), pp. 17-35.
DOI
1976 Article
Ferrari, P. A., P. Annoni, A. Barbiero, et al. An imputation method for categorical variables with application to nonlinear principal component analysis. In: Computational Statistics & Data Analysis 55.7 (2011), pp. 2410-2420.
DOI
2011 Article
Finkbeiner, C. Estimation for the multiple factor model when data are missing. In: Psychometrika 44.4 (1979), pp. 409-420.
DOI
1979 Article
Fitzmaurice, G. M., G. Molenberghs, and S. R. Lipsitz. Regression Models for Longitudinal Binary Responses with Informative Drop-Outs. In: Journal of the Royal Statistical Society. Series B (Methodological) 57.4 (1995), pp. 691–704.
URL
1995 Article
Follmann, D. and M. Wu. An approximate generalized linear model with random effects for informative missing data. In: Biometrics 51.1 (1995), pp. 151-168.
DOI
1995 Article
Gad, A. M. and N. M. M. Darwish. A shared parameter model for longitudinal data with missing values. In: American Journal of Applied Mathematics and Statistics 1.2 (2013), pp. 30-35.
URL
2013 Article
Gelman, A., G. King, and C. Liu. Not asked and not answered: Multiple imputation for multiple surveys. In: Journal of the American Statistical Association 93.443 (1998), pp. 846–857.
DOI
1998 Article
Gelman, A., I. van Mechelen, G. Verbeke, et al. Multiple Imputation for Model Checking: Completed-Data Plots with Missing and Latent Data. In: Biometrics 61.1 (2005), pp. 74–85.
DOI
2005 Article
Gill, R. D., M. J. Van Der Laan, and J. M. Robins. Coarsening at random: Characterizations, conjectures, counter-examples. In: Proceedings of the First Seattle Symposium in Biostatistics. Springer. 1997, pp. 255–294.
DOI
1997 Paper
Golden, R. M., S. S. Henley, H. White, et al. Consequences of model misspecification for maximum likelihood estimation with missing data. In: Econometrics 7.3 (2019), p. 37.
DOI
2019 Article
Gondara, L. and K. Wang. MIDA: Multiple Imputation using Denoising Autoencoders. In: Proceedings of the 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2018). (Jun. 03, 2018-Jun. 06, 2018). Ed. by D. Phung, V. Tseng, G. Webb, B. Ho, M. Ganji and L. Rashidi. Lecture Notes in Computer Science. Springer International Publishing, 2018, pp. 260-272. ISBN: 3319930404. 2018 Paper
Goodfellow, I., M. Mirza, A. Courville, et al. Multi-Prediction Deep Boltzmann Machines. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. (Dec. 05, 2013-Dec. 10, 2013). Ed. by C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K. Weinberger. Advances in Neural Information Processing Systems 26. Curran Associates, Inc., 2013, pp. 548–556.
URL
2013 Paper
Graham, J. W. Missing data analysis: making it work in the real world. In: Annual Review of Psychology 60 (2009), pp. 549-576.
DOI
2009 Article
Graham, J. W., A. E. Olchowski, and T. E. Gilreath. How many imputations are really needed? Some practical clarifications of multiple imputation theory. In: Prevention Science 8.3 (2007), pp. 206-213.
DOI
2007 Article
Graham, J. W., S. M. Hofer, S. I. Donaldson, et al. Analysis with missing data in prevention research. In: The Science of Prevention: Methodological Advances from Alcohol and Substance Abuse Research. Ed. by K. Bryant, M. Windle and S. West. Washington, DC, USA: American Psychological Association, 1997, pp. 325-366. ISBN: 1-55798-439-5.
DOI
1997 Book
Heckman, J. J. The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. In: Annals of Economic and Social Measurement 5.4 (1976), pp. 475-492.
URL
1976 Article
Heckman, J. Sample selection bias as a specification error. In: Econometrica 47.1 (1979), pp. 153-161.
DOI
1979 Article
Hogan, J. W. and N. M. Laird. Mixture models for the joint distribution of repeated measures and event times. In: Statistics in Medicine 16.1-3 (1997), pp. 239-257.
1997 Article
Hogan, J. W. and T. Lancaster. Instrumental variables and inverse probability weighting for causal inference from longitudinal observational studies. In: Statistical Methods in Medical Research 13.1 (2004), pp. 17-48.
DOI
2004 Article
Honaker, J., G. King, and M. Blackwell. Amelia II: a program for missing data. In: Journal of Statistical Software 45.7 (2011). eprint: arXiv:1501.0228.
DOI
2011 Article
Horton, N. J. and K. P. Kleinman. Much Ado About Nothing - A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models. In: The American Statistician 61.1 (2007), pp. 79-90.
DOI
2007 Article
Hothorn, T., K. Hornik, and A. Zeileis. Unbiased Recursive Partitioning: A Conditional Inference Framework. In: Journal of Computational and Graphical Statistics 15.3 (2006), pp. 651-674.
DOI
2006 Article
Huisman, M. Imputation of missing item responses: some simple techniques. In: Quality & Quantity 34.4 (2000), pp. 331-351.
DOI
2000 Article
Husson, F. and J. Josse. Handling missing values in multiple factor analysis. In: Food Quality and Preference 30 (2013), pp. 77-85.
DOI
2013 Article
Ibrahim, J. G., M. Chen, and S. R. Lipsitz. Missing responses in generalised linear mixed models when the missing data mechanism is nonignorable. In: Biometrika 88.2 (2001), pp. 551-564.
DOI
2001 Article
Ibrahim, J. G., S. R. Lipsitz, and M. Chen. Missing Covariates in Generalized Linear Models When the Missing Data Mechanism is Non-Ignorable. In: Journal of the Royal Statistical Society. Series B (Statistical Methodology) 61.1 (1999), pp. 173-190. 1999 Article
Ilin, A. and T. Raiko. Practical approaches to Principal Component Analysis in the presence of missing values. In: Journal of Machine Learning Research 11 (2010), pp. 1957-2000.
URL
2010 Article
Imbert, A., A. Valsesia, C. Le Gall, et al. Multiple hot-deck imputation for network inference from RNA sequencing data. In: Bioinformatics 34.10 (2018), pp. 1726-1732.
DOI
2018 Article
Ipsen, N., P. Mattei, and J. Frellsen. How to deal with missing data in supervised deep learning? In: ICML Workshop on the Art of Learning with Missing Values (Artemiss). 2020.
URL
2020 Paper
Ipsen, N. B., P. Mattei, and J. Frellsen. not-MIWAE: Deep generative modelling with missing not at random data. In: arXiv preprint (2020).
URL
2020 Article
Jamshidian, M., S. Jalal, and C. Jansen. MissMech: an R package for testing homoscedasticity, multivariate normality, and missing completely at random (MCAR). In: Journal of Statistical Software 56.6 (2014), pp. 1-31.
DOI
2014 Article
Jamshidian, M. and S. Jalal. Tests of homoscedasticity, normality, and missing completely at random for incomplete multivariate data. In: Psychometrika 75.4 (2010), pp. 649-674. eprint: NIHMS150003.
DOI
2010 Article
Jiang, W., J. Josse, and M. Lavielle. Logistic Regression with Missing Covariates–Parameter Estimation, Model Selection and Prediction. In: arXiv preprint (2018). arXiv: 1805.04602 [stat.ME]. 2018 Article
Joenssen, D. W. and U. Bankhofer. Donor limited hot deck imputation: effect on parameter estimation. In: Journal of Theoretical and Applied Computer Science 6.3 (2012), pp. 58-70.
URL
2012 Article
Jones, M. P. Indicator and Stratification Methods for Missing Explanatory Variables in Multiple Linear Regression. In: Journal of the American Statistical Association 91.433 (1996), pp. 222-230.
DOI
1996 Article
Jönsson, P. and C. Wohlin. An evaluation of k-nearest neighbour imputation using Likert data. In: Proceedings of the 10th International Symposium on Software Metrics. (Sep. 14, 2004-Sep. 16, 2004). Chicago, IL, USA: IEEE, 2004, pp. 1530-1435. ISBN: 0769521290.
DOI
2004 Paper
Josse, J., N. Prost, E. Scornet, et al. On the consistency of supervised learning with missing values. In: arXiv preprint (2019). arXiv: 1902.06931 [stat.ML].
URL
2019 Article
Josse, J., J. Pagès, and F. Husson. Multiple imputation in principal component analysis. In: Advances in Data Analysis and Classification 5.3 (2011), pp. 231-246.
DOI
2011 Article
Josse, J., M. Chavent, B. Liquet, et al. Handling missing values with regularized iterative multiple correspondence analysis. In: Journal of Classification 29.1 (2012), pp. 91-116.
DOI
2012 Article
Josse, J., F. Husson, and J. Pagès. Gestion des données manquantes en Analyse en Composantes Principales. In: Journal de la Société Française de Statistique 150.2 (2009), pp. 28-51.
URL
2009 Article
Josse, J. and F. Husson. Handling missing values in exploratory multivariate data analysis methods. In: Journal de la Société Française de Statistique 153.2 (2012), pp. 79-99.
URL
2012 Article
Josse, J. and F. Husson. missMDA: a package for handling missing values in multivariate data analysis. In: Journal of Statistical Software 70.1 (2016), pp. 1-31.
DOI
2016 Article
Kaiser, J. Dealing with missing values in data. In: Journal of Systems Integration 5.1 (2014), pp. 42-51.
DOI
2014 Article
Kallus, N., X. Mao, and M. Udell. Causal Inference with Noisy and Missing Covariates via Matrix Factorization. In: Advances in Neural Information Processing Systems. 2018. eprint: 1806.00811.
URL
2018 Paper
Kalton, G. and D. Kasprzyk. The treatment of missing survey data. In: Survey Methodology 12.1 (1986), pp. 1-16.
URL
1986 Article
Kapelner, A. and J. Bleich. Prediction with missing data via Bayesian additive regression trees. In: Canadian Journal of Statistics 43.2 (2015), pp. 224-239. 2015 Article
Khosravi, P., A. Vergari, Y. Choi, et al. Handling missing data in decision trees: A probabilistic approach. In: arXiv preprint arXiv:2006.16341 (2020). 2020 Article
Kim, J. K. and J. Shao. Statistical Methods for Handling Incomplete Data. Boca Raton, FL, USA: Chapman and Hall/CRC, 2013. ISBN: 9781482205077. 2013 Book
Kohn, R. and C. F. Ansley. Estimation, prediction, and interpolation for ARIMA models with missing data. In: Journal of the American Statistical Association 81.395 (1986), pp. 751-761.
DOI
1986 Article
Kowarik, A. and M. Templ. Imputation with the R Package VIM. In: Journal of Statistical Software 74.7 (2016), pp. 1-16.
DOI
2016 Article
Kropko, J., B. Goodrich, A. Gelman, et al. Multiple Imputation for Continuous and Categorical Data: Comparing Joint Multivariate Normal and Conditional Approaches. In: Political Analysis 22.4 (2014), pp. 497–519.
DOI
2014 Article
Larose, C., D. K. Dey, and O. Harel. The impact of missing values on different measures of uncertainty. In: Statistica Sinica 29.2 (2019), pp. 551–566.
DOI
2019 Article
Lee, K. M., R. Mitra, and S. Biedermann. Optimal design when outcome values are not missing at random. In: Statistica Sinica 28.4 (2018), pp. 1821–1838.
DOI
2018 Article
Lee, K. J., K. Tilling, R. P. Cornish, et al. Framework for the Treatment And Reporting of Missing data in Observational Studies: The Treatment And Reporting of Missing data in Observational Studies framework. In: Journal of Clinical Epidemiology 134 (2021), pp. 79–88.
2021 Article
Little, R. J. A. A test of missing completely at random for multivariate data with missing values. In: Journal of the American Statistical Association 83.404 (1988), pp. 1198-1202.
DOI
1988 Article
Little, R. J. A. Regression with missing X’s: a review. In: Journal of the American Statistical Association 87.420 (1992), pp. 1227-1237.
DOI
1992 Article
Little, R. J. A. Pattern-mixture models for multivariate incomplete data. In: Journal of the American Statistical Association 88.421 (1993), pp. 125-134.
DOI
1993 Article
Little, R. J. A. Modeling the drop-out mechanism in repeated-measures studies. In: Journal of the American Statistical Association 90.431 (1995), pp. 1112-1121.
DOI
1995 Article
Little, R. J. A. and D. B. Rubin. Statistical Analysis with Missing Data. Wiley, 2002, p. 408. ISBN: 0471183865.
DOI
2002 Book
Loh, P. and M. J. Wainwright. High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity. In: Advances in Neural Information Processing Systems. Ed. by J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira and K. Q. Weinberger. Vol. 24. Curran Associates, Inc., 2011, pp. 2726–2734.
URL
2011 Paper
Londschien, M., S. Kovács, and P. Bühlmann. Change point detection for graphical models in presence of missing values. 2019. arXiv: 1907.05409 [stat.ML]. 2019 Misc
Louis, T. A. Finding the Observed Information Matrix when Using the EM Algorithm. In: Journal of the Royal Statistical Society. Series B (Methodological) 44.2 (1982), pp. 226–233.
URL
1982 Article
Lüdtke, O., A. Robitzsch, and S. G. West. Regression models involving nonlinear effects with missing data: A sequential modeling approach using Bayesian estimation. In: Psychological Methods (2019).
DOI
2019 Article
Ma, A. and D. Needell. Stochastic Gradient Descent for Linear Systems with Missing Data. In: Numerical Mathematics: Theory, Methods and Applications 12.1 (2017), pp. 1-20.
DOI
2017 Article
Ma, W. and G. H. Chen. Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption. In: Advances in Neural Information Processing Systems 32. Ed. by H. Wallach, H. Larochelle, A. Beygelzimer, F. d. Alché-Buc, E. Fox and R. Garnett. Curran Associates, Inc., 2019, pp. 14900–14909.
URL
2019 Paper
Mattei, P. and J. Frellsen. MIWAE: Deep generative modelling and imputation of incomplete data sets. In: Proceedings of the 36th International Conference on Machine Learning. Ed. by K. Chaudhuri and R. Salakhutdinov. Vol. 97. Proceedings of Machine Learning Research. 2019, pp. 4413–4423.
URL
2019 Paper
McLachlan, G. J. and T. Krishnan. The EM Algorithm and Extensions. Wiley series in probability and statistics. Hoboken, NJ, USA: Wiley, 2008. ISBN: 9780471201700. 2008 Book
Meng, X. L. and D. B. Rubin. Maximum likelihood estimation via the ECM algorithm: a general framework. In: Biometrika 80.2 (1993), pp. 267-278.
DOI
1993 Article
Meng, X. L. and D. B. Rubin. Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. In: Journal of the American Statistical Association 86.416 (1991), pp. 899-909.
DOI
1991 Article
Meng, X. L. You want me to analyze data I don’t have? Are you insane? In: Shanghai Archives of Psychiatry 24.5 (2012), pp. 287-301.
2012 Article
Miao, W. and E. J. Tchetgen Tchetgen. Identification and inference with nonignorable missing covariate data. In: Statistica Sinica 28.4 (2018), pp. 2049–2067.
DOI
2018 Article
Moeur, M. and A. R. Stage. Most similar neighbor: an improved sampling inference procedure for natural resources planning. In: Forest Science 42.1 (1995), pp. 337-359.
DOI
1995 Article
Mohan, K., F. Thoemmes, and J. Pearl. Estimation with Incomplete Data: The Linear Case. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18. International Joint Conferences on Artificial Intelligence Organization, Jul. 2018, pp. 5082–5088. 2018 Paper
Mohan, K. and J. Pearl. Graphical Models for Processing Missing Data. Tech. rep. R-473-L. Forthcoming, Journal of American Statistical Association (JASA). CA: Department of Computer Science, University of California, Los Angeles, 2019.
URL
2019 Misc
Molenberghs, G., G. Fitzmaurice, M. G. Kenward, et al. Handbook of Missing Data Methodology. Chapman & Hall/CRC Handbooks of Modern Statistical Methods. New York, NY, USA: Chapman and Hall/CRC, 2014. ISBN: 9781439854624. 2014 Book
Molenberghs, G., B. Michiels, M. G. Kenward, et al. Monotone missing data and pattern-mixture models. In: Statistica Neerlandica 52.2 (1998), pp. 153-161.
DOI
1998 Article
Molenberghs, G. and M. G. Kenward. Missing Data in Clinical Studies. Chichester, West Sussex, UK: Wiley, 2007. ISBN: 9780470849811.
DOI
2007 Book
Molnar, F. J., B. Hutton, and D. Fergusson. Does analysis using last observation carried forward introduce bias in dementia research? In: Canadian Medical Association Journal 179.8 (2008), pp. 751-753.
DOI
2008 Article
Moritz, S. and T. Bartz-Beielstein. imputeTS: time series missing value imputation in R. In: The R Journal 9.1 (2017), pp. 207-218.
URL
2017 Article
Moritz, S., A. Sardá, T. Bartz-Beielstein, et al. Comparison of different methods for univariate time series imputation in R. Preprint arXiv 1510.03924. 2015.
URL
2015 Misc
Le Morvan, M., N. Prost, J. Josse, et al. Linear predictor on linearly-generated data with missing values: non consistency and solutions. In: Proceedings of Machine Learning Research. Vol. 108. Proceedings of Machine Learning Research. 2020, pp. 3165–3174. eprint: 2002.00658v2.
URL
2020 Paper
Le Morvan, M., J. Josse, T. Moreau, et al. NeuMiss networks: differentiable programming for supervised learning with missing values. In: Advances in Neural Information Processing Systems, 33. (Dec. 2020). IEEE, 2020. eprint: 2007.01627v4.
URL
2020 Paper
Le Morvan, M., J. Josse, E. Scornet, et al. What’s a good imputation to predict with missing values? 2021.
URL
2021 Misc
Murray, J. S. and J. P. Reiter. Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models With Local Dependence. In: Journal of the American Statistical Association 111.516 (2016), pp. 1466-1479.
DOI
2016 Article
Muzellec, B., J. Josse, C. Boyer, et al. Missing Data Imputation using Optimal Transport. In: International Conference on Machine Learning. PMLR. 2020, pp. 7130–7140. 2020 Paper
Tang, F. and H. Ishwaran. Random forest missing data algorithms. In: Statistical Analysis and Data Mining: The ASA Data Science Journal 10.6 (2017), pp. 363–377.
DOI
2017 Article
Nabi, R., R. Bhattacharya, and I. Shpitser. Full Law Identification In Graphical Models Of Missing Data: Completeness Results. In: arXiv preprint arXiv:2004.04872 (2020).
URL
2020 Article
National Research Council (US). The Prevention and Treatment of Missing Data in Clinical Trials. Washington (DC), USA: National Academies Press, 2010. ISBN: 9780309158145.
DOI
2010 Book
Nguyen, L. T., J. Kim, and B. Shim. Low-Rank Matrix Completion: A Contemporary Survey. In: IEEE Access 7 (2019), pp. 94215–94237.
DOI
2019 Article
Nowicki, R. K., R. Scherer, and L. Rutkowski. Novel rough neural network for classification with missing data. In: 21st International Conference on Methods and Models in Automation and Robotics (MMAR). (Aug. 29, 2016-Sep. 01, 2016). IEEE, 2016, pp. 820–825.
DOI
2016 Paper
O’Kelly, M. and B. Ratitch. Clinical Trials with Missing Data: A Guide for Practitioners. John Wiley & Sons, Ltd, 2014.
DOI
2014 Book
Orchard, T. and M. A. Woodbury. A missing information principle: theory and applications. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Theory of Statistics. Ed. by L. M. Le Cam, J. Neyman and E. L. Scott. Vol. 1. University of California Press, 1972, pp. 697–715.
URL
1972 Paper
Peugh, J. L. and C. K. Enders. Missing data in educational research: a review of reporting practices and suggestions for improvement. In: Review of Educational Research 74.4 (2004), pp. 525–556. 2004 Article
Pigott, T. D. A review of methods for missing data. In: Educational Research and Evaluation 7.4 (2001), pp. 353–383.
DOI
2001 Article
Preisser, J. S., K. K. Lohman, and P. J. Rathouz. Performance of weighted estimating equations for longitudinal binary data with drop-outs missing at random. In: Statistics in Medicine 21.20 (2002), pp. 3035–3054.
DOI
2002 Article
Quartagno, M. and J. R. Carpenter. Multiple imputation for discrete data: Evaluation of the joint latent normal model. In: Biometrical Journal 61.4 (2019), pp. 1003–1019.
DOI
2019 Article
Rahman, G. and Z. Islam. Missing value imputation using decision trees and decision forests by splitting and merging records: Two novel techniques. In: Knowledge-Based Systems 53 (2013), pp. 51–65. 2013 Article
Rao, J. N. K. and J. Shao. Jackknife variance estimation with survey data under hot deck imputation. In: Biometrika 79.4 (1992), pp. 811-822.
DOI
1992 Article
Reilly, M. and M. Pepe. The relationship between hot-deck multiple imputation and weighted likelihood. In: Statistics in Medicine 16.1-3 (1997), pp. 5-19.
DOI
1997 Article
Reiter, J. P. and M. Sadinle. Itemwise conditionally independent nonresponse modelling for incomplete multivariate data. In: Biometrika 104.1 (Jan. 2017), pp. 207-220. eprint: http://oup.prod.sis.lan/biomet/article-pdf/104/1/207/13066719/asw063.pdf. 2017 Article
Rieger, A., T. Hothorn, and C. Strobl. Random forests with missing values in the covariates. Tech. rep. 79. University of Munich, Department of Statistics, 2010.
URL
2010 Misc
Rioux, C., A. Lewin, O. A. Odejimi, et al. Reflection on modern methods: planned missing data designs for epidemiological research. In: International Journal of Epidemiology (2020).
DOI
2020 Article
Robin, G. Low-rank methods for heterogeneous and multi-source data. 2019.
DOI
2019 Misc
Robin, G., O. Klopp, J. Josse, et al. Main Effects and Interactions in Mixed and Incomplete Data Frames. In: Journal of the American Statistical Association 115.531 (2020), pp. 1292-1303. eprint: https://doi.org/10.1080/01621459.2019.1623041. 2020 Article
Robins, J. M., A. Rotnitzky, and L. P. Zhao. Estimation of Regression Coefficients When Some Regressors are not Always Observed. In: Journal of the American Statistical Association 89.427 (1994), pp. 846-866.
DOI
1994 Article
Robins, J. M., A. Rotnitzky, and L. P. Zhao. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. In: Journal of the American Statistical Association 90.429 (1995), pp. 106-121.
DOI
1995 Article
Robins, J. M. and N. Wang. Inference for imputation estimators. In: Biometrika 87.1 (2000), pp. 113-124.
URL
2000 Article
Rosseel, Y. lavaan: an R package for structural equation modeling. In: Journal of Statistical Software 48.2 (2012).
DOI
2012 Article
Rotnitzky, A., J. M. Robins, and D. O. Scharfstein. Semiparametric regression for repeated outcomes with nonignorable nonresponse. In: Journal of the American Statistical Association 93.444 (1998), pp. 1321-1339.
DOI
1998 Article
Rubin, D. B. Inference and missing data. In: Biometrika 63.3 (1976), pp. 581-592.
DOI
1976 Article
Rubin, D. B. Formalizing subjective notions about the effect of nonrespondents in sample surveys. In: Journal of the American Statistical Association 72.359 (1977), pp. 538-543.
DOI
1977 Article
Rubin, D. B. Multiple imputation after 18+ years. In: Journal of the American Statistical Association 91.434 (1996), pp. 473-489.
DOI
1996 Article
Rubin, D. B. Multiple Imputation for Nonresponse in Surveys. Hoboken, NJ, USA: Wiley, 1987. ISBN: 9780471655740.
1987 Book
Sadinle, M. and J. P. Reiter. Sequential Identification of Nonignorable Missing Data Mechanisms. In: Statistica Sinica 28.4 (2018), pp. 1741–1759.
DOI
2018 Article
Sadinle, M. and J. P. Reiter. Sequentially additive nonignorable missing data modeling using auxiliary marginal information. In: arXiv preprint (2019). arXiv: 1902.06043 [stat.ME].
URL
2019 Article
Santos, M. S., R. C. Pereira, A. F. Costa, et al. Generating Synthetic Missing Data: A Review by Missing Mechanism. In: IEEE Access 7 (2019), pp. 11651–11667.
DOI
2019 Article
Schafer, J. L. Analysis of Incomplete Multivariate Data. CRC Monographs on Statistics & Applied Probability. Boca Raton, FL, USA: Chapman and Hall/CRC, 1997. ISBN: 0412040611. 1997 Book
Schafer, J. L. and J. W. Graham. Missing data: our view of the state of the art. In: Psychological Methods 7.2 (2002), pp. 147-177.
DOI
2002 Article
Schafer, J. L. and M. K. Olsen. Multiple Imputation for multivariate missing-data problems: a data analyst’s perspective. In: Multivariate Behavioral Research 33.4 (1998), pp. 545-571.
DOI
1998 Article
Schafer, J. L. Multiple imputation: a primer. In: Statistical Methods in Medical Research 8.1 (1999), pp. 3-15.
DOI
1999 Article
Seaman, S., J. Galati, D. Jackson, et al. What Is Meant by “Missing at Random”? In: Statistical Science 28.2 (2013), pp. 257–268. 2013 Article
Seaman, S. R. and S. Vansteelandt. Introduction to Double Robust Methods for Incomplete Data. In: Statistical Science 33.2 (2018), p. 184.
DOI
2018 Article
Seaman, S. R. and I. R. White. Review of inverse probability weighting for dealing with missing data. In: Statistical Methods in Medical Research 22.3 (2011), pp. 278-295.
DOI
2011 Article
Shao, J. and J. Zhang. A transformation approach in linear mixed-effects models with informative missing responses. In: Biometrika 102.1 (2015), pp. 107-119.
DOI
2015 Article
Sharpe, P. K. and R. J. Solly. Dealing with missing values in neural network-based diagnostic systems. In: Neural Computing & Applications 3.2 (1995), pp. 73-77.
DOI
1995 Article
Simon, G. A. and J. S. Simonoff. Diagnostic plots for missing data in least squares regression. In: Journal of the American Statistical Association 81.394 (1986), pp. 501-509.
DOI
1986 Article
Śmieja, M., Ł. Struski, J. Tabor, et al. Processing of missing data by neural networks. In: Computing Research Repository abs/1805.07405 (2018). eprint: 1805.07405.
URL
2018 Article
Sovilj, D., E. Eirola, Y. Miche, et al. Extreme learning machine for missing data using multiple imputations. In: Neurocomputing 174.A (2016), pp. 220-231.
DOI
2016 Article
Sportisse, A., C. Boyer, and J. Josse. Imputation and low-rank estimation with Missing Not At Random data. In: Statistics and Computing 30.6 (2018), pp. 1629-1643.
DOI
2018 Article
Sportisse, A., C. Boyer, and J. Josse. Estimation with informative missing data in the low-rank model with random effects. In: Advances in Neural Information Processing Systems, 33. (Dec. 2020). IEEE, 2020. eprint: 1906.02493v3.
URL
2020 Paper
Sportisse, A., C. Boyer, A. Dieuleveut, et al. Debiasing Averaged Stochastic Gradient Descent to handle missing values. In: Advances in Neural Information Processing Systems, 33. (Dec. 2020). IEEE, 2020. eprint: 2002.09338v2.
URL
2020 Paper
Stacklies, W., H. Redestig, M. Scholz, et al. pcaMethods – a bioconductor package providing PCA methods for incomplete data. In: Bioinformatics 23.9 (2007), pp. 1164-1167.
DOI
2007 Article
Stage, A. R. and N. L. Crookston. Partitioning error components for accuracy-assessment of near-neighbor methods of imputation. In: Forest Science 53.1 (2007), pp. 62-72.
DOI
2007 Article
Stekhoven, D. J. and P. Bühlmann. MissForest – non-parametric missing value imputation for mixed-type data. In: Bioinformatics 28.1 (2012), pp. 112-118. eprint: 1105.0828.
DOI
2012 Article
Strobl, C., A. L. Boulesteix, and T. Augustin. Unbiased split selection for classification trees based on the Gini Index. In: Computational Statistics & Data Analysis 52.1 (2007), pp. 483-501.
DOI
2007 Article
Stuart, E. A., M. Azur, C. Frangakis, et al. Multiple imputation with large data sets: a case study of the children’s mental health initiative. In: American Journal of Epidemiology 169.9 (2009), pp. 1133-1139.
DOI
2009 Article
Stubbendick, A. L. and J. G. Ibrahim. Maximum Likelihood Methods for Nonignorable Missing Responses and Covariates in Random Effects Models. In: Biometrics 59.4 (2003), pp. 1140–1150.
DOI
2003 Article
Stubbendick, A. L. and J. G. Ibrahim. Likelihood-based inference with nonignorable missing responses and covariates in models for discrete longitudinal data. In: Statistica Sinica 16.4 (2006), pp. 1143–1167.
URL
2006 Article
Su, Y. S., A. Gelman, J. Hill, et al. Multiple imputation with diagnostics (mi) in R: opening windows into the black box. In: Journal of Statistical Software 45 (2011), p. 2.
DOI
2011 Article
Tabouy, T., P. Barbillon, and J. Chiquet. Variational inference for stochastic block models from sampled data. In: Journal of the American Statistical Association 115.529 (2020), pp. 455–466.
DOI
2020 Article
Tanner, M. A. and W. Wong. The calculation of posterior distributions by data augmentation. In: Journal of the American Statistical Association 82.398 (1987), pp. 528-540. 1987 Article
Tchetgen Tchetgen, E. J., L. Wang, and B. Sun. Discrete choice models for nonmonotone nonignorable missing data: identification and inference. In: Statistica Sinica 28.4 (2018), pp. 2069–2088.
DOI
2018 Article
Templ, M., A. Alfons, and P. Filzmoser. Exploring Incomplete data using visualization techniques. In: Advances in Data Analysis and Classification 6.1 (2012), pp. 29-47.
DOI
2012 Article
Thijs, H., G. Molenberghs, B. Michiels, et al. Strategies to fit pattern-mixture models. In: Biostatistics 3.2 (2002), pp. 245-265.
DOI
2002 Article
Tierney, N. and D. Cook. Expanding tidy data principles to facilitate missing data exploration, visualization and assessment of imputations. Monash Econometrics and Business Statistics Working Papers 14/18. Monash University, Department of Econometrics and Business Statistics, 2018.
URL
2018 Misc
Tierney, N. J., F. A. Harden, M. J. Harden, et al. Using decision trees to understand structure in missing data. In: BMJ Open 5.6 (2015), p. e007450.
DOI
2015 Article
Tran, L., X. Liu, J. Zhou, et al. Missing Modalities Imputation via Cascaded Residual Autoencoder. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (Jul. 21, 2017-Jul. 26, 2017). IEEE, 2017, pp. 4971-4980.
DOI
2017 Paper
Troyanskaya, O., M. Cantor, G. Sherlock, et al. Missing value estimation methods for DNA microarrays. In: Bioinformatics 17.6 (2001), pp. 520-525.
DOI
2001 Article
Twala, B. E. T. H., M. C. Jones, and D. J. Hand. Good methods for coping with missing data in decision trees. In: Pattern Recognition Letters 29.7 (2008), pp. 950-956.
DOI
2008 Article
Unnebrink, K. and J. Windeler. Intention-to-treat: methods for dealing with missing values in clinical trials of progressively deteriorating diseases. In: Statistics in Medicine 20.24 (2001), pp. 3931-3946.
DOI
2001 Article
Buuren, S. van, J. P. L. Brand, C. G. M. Groothuis-Oudshoorn, et al. Fully conditional specification in multivariate imputation. In: Journal of Statistical Computation and Simulation 76.12 (2006), pp. 1049-1064.
DOI
2006 Article
Buuren, S. van. Flexible Imputation of Missing Data. Boca Raton, FL: Chapman and Hall/CRC, 2018.
URL
2018 Book
Buuren, S. van and K. Groothuis-Oudshoorn. MICE: multivariate imputation by chained equations in R. In: Journal of Statistical Software 45 (2011), p. 3. eprint: NIHMS150003.
DOI
2011 Article
Buuren, S. van. Multiple imputation of discrete and continuous data by fully conditional specification. In: Statistical Methods in Medical Research 16 (2007), pp. 219-242.
DOI
2007 Article
Wal, W. M. van der and R. B. Geskus. ipw: an R package for inverse probability weighting. In: Journal of Statistical Software 43.13 (2011).
DOI
2011 Article
Velden, M. van de and T. H. A. Bijmolt. Generalized canonical correlation analysis of matrices with missing rows: a simulation study. In: Psychometrika (2006).
DOI
2006 Article
Vansteelandt, S., A. Rotnitzky, and J. Robins. Estimation of regression models for the mean of repeated outcomes under nonignorable nonmonotone nonresponse. In: Biometrika 94.4 (2007), pp. 841–860.
DOI
2007 Article
Vansteelandt, S., J. Carpenter, and M. G. Kenward. Analysis of incomplete data using inverse probability weighting and doubly robust estimators. In: Methodology – European Journal of Research Methods for the Behavioral and Social Sciences 6.1 (2010), pp. 37–48.
DOI
2010 Article
Verbanck, M., J. Josse, and F. Husson. Regularised PCA to denoise and visualise data. In: Statistics and Computing 25.2 (2015), pp. 471-486.
DOI
2015 Article
Verbeke, G., G. Molenberghs, H. Thijs, et al. Sensitivity analysis for nonrandom dropout: a local influence approach. In: Biometrics 57.1 (2001), pp. 7-14.
DOI
2001 Article
Voillet, V., P. Besse, L. Liaubet, et al. Handling missing rows in multi-omics data integration: multiple imputation in multiple factor analysis framework. In: BMC Bioinformatics 17.402 (2016). Forthcoming.
DOI
2016 Article
Wainer, H., ed. Drawing Inferences from Self-Selected Samples. New York, NY, USA: Springer, 1986. 1986 Book
Wang, N. and J. M. Robins. Large-sample theory for parametric multiple imputation procedures. In: Biometrika 85.4 (1998), pp. 935–948.
DOI
1998 Article
White, I. R., J. Carpenter, and N. J. Horton. A mean score method for sensitivity analysis to departures from the missing at random assumption in randomised trials. In: Statistica Sinica 28.4 (2018), pp. 1985–2003.
DOI
2018 Article
Wu, M. C. and R. J. Carroll. Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. In: Biometrics 44.1 (1988), pp. 175-188.
DOI
1988 Article
Xie, X. and X. L. Meng. Dissecting multiple imputation from a multi-phase inference perspective: what happens when God’s, imputer’s and analyst’s models are uncongenial? In: Statistica Sinica 27.4 (2017), pp. 1485–1594.
DOI
2017 Article
Xue, F. and A. Qu. Integrating multi-source block-wise missing data in model selection. In: Journal of the American Statistical Association (2020), pp. 1–36.
DOI
2020 Article
Yang, S., L. Wang, and P. Ding. Identification and estimation of causal effects with confounders subject to instrumental missingness. In: Statistics Methodology Repository (2017).
URL
2017 Article
Yoon, J., J. Jordon, and M. van der Schaar. GAIN: Missing Data Imputation using Generative Adversarial Nets. In: Proceedings of the 35th International Conference on Machine Learning. (Jul. 10, 2018-Jul. 15, 2018). Ed. by J. Dy and A. Krause. Vol. 80. Proceedings of Machine Learning Research. Stockholmsmässan, Stockholm Sweden: PMLR, 2018, pp. 5689–5698.
URL
2018 Paper
Yoon, S. and S. Sull. GAMIN: Generative Adversarial Multiple Imputation Network for Highly Missing Data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, pp. 8456–8464.
URL
2020 Paper
Zhang, H., P. Xie, and E. Xing. Missing Value Imputation Based on Deep Generative Models. In: Computing Research Repository abs/1808.01684 (2018).
URL
2018 Article
Zhang, S. Nearest neighbor selection for iterative kNN imputation. In: Journal of Systems and Software 85.11 (2012), pp. 2541-2552.
DOI
2012 Article
Zhao, Y. Statistical inference for missing data mechanisms. In: Statistics in Medicine 39.28 (2020), pp. 4325–4333.
DOI
2020 Article
Zhao, Y. and M. Udell. Missing value imputation for mixed data via gaussian copula. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020, pp. 636–646.
DOI
2020 Paper
Zhao, J. and Y. Ma. A versatile estimation procedure without estimating the nonignorable missingness mechanism. In: Journal of the American Statistical Association (2021), pp. 1–15.
DOI
2021 Article
Zhou, Y., R. J. A. Little, and J. D. Kalbfleisch. Block-conditional missing at random models for missing data. In: Statistical Science 25.4 (2010), pp. 517–532.
DOI
2010 Article
Zhu, Z., T. Wang, and R. J. Samworth. High-dimensional principal component analysis with heterogeneous missingness. In: arXiv preprint (2019).
URL
2019 Article
