R-miss-tastic

A resource website on missing values - Methods and references for managing missing data


A commented version of this bibliography can be found here.

Citation / Year / Publication type
Ayme, A., C. Boyer, A. Dieuleveut, et al. Near-optimal rate of consistency for linear models with missing values. In: International Conference on Machine Learning. PMLR. 2022, pp. 1211–1243.2022Paper
Mohan, K., J. Pearl, and J. Tian. Graphical models for inference with missing data. In: Advances in neural information processing systems 26 (2013).2013Article
Mohan, K. and J. Pearl. Graphical models for recovering probabilistic and causal queries from missing data. In: Probabilistic and Causal Inference: the Works of Judea Pearl. 2022, pp. 413–432.2022Book
Berrett, T. B. and R. J. Samworth. Optimal nonparametric testing of missing completely at random and its connections to compatibility. In: The Annals of Statistics 51.5 (2023), pp. 2170–2193.2023Article
Little, R. J. and D. B. Rubin. Statistical analysis with missing data. John Wiley & Sons, 2019.2019Book
Verchand, K. A. and A. Montanari. High-dimensional logistic regression with missing data: Imputation, regularization, and universality. In: arXiv preprint arXiv:2410.01093 (2024).2024Article
Ayme, A., C. Boyer, A. Dieuleveut, et al. Random features models: a way to study the success of naive imputation. In: Proceedings of the 41st International Conference on Machine Learning. ICML’24. Vienna, Austria: JMLR.org, 2024.2024Paper
Lobo, A. D. R., A. Ayme, C. Boyer, et al. A primer on linear classification with missing data. In: arXiv preprint arXiv:2405.09196 (2024).2024Article
Mohan, K. and J. Pearl. Graphical models for processing missing data. In: Journal of the American Statistical Association 116.534 (2021), pp. 1023–1037.2021Article
Ayme, A., C. Boyer, A. Dieuleveut, et al. Naive imputation implicitly regularizes high-dimensional linear models. In: International Conference on Machine Learning. PMLR. 2023, pp. 1320–1340.2023Paper
Morvan, M. L. and G. Varoquaux. Imputation for prediction: beware of diminishing returns. In: arXiv preprint arXiv:2407.19804 (2024).2024Article
Zaffran, M., A. Dieuleveut, J. Josse, et al. Conformal prediction with missing values. In: International Conference on Machine Learning. PMLR. 2023, pp. 40578–40604.2023Paper
Molenberghs, G., C. Beunckens, C. Sotto, et al. Every missingness not at random model has a missingness at random counterpart with equal fit. In: Journal of the Royal Statistical Society Series B: Statistical Methodology 70.2 (2008), pp. 371–388.2008Article
Näf, J., E. Scornet, and J. Josse. What Is a Good Imputation Under MAR Missingness? In: arXiv preprint arXiv:2403.19196 (2024).2024Article
Spohn, M., J. Näf, L. Michel, et al. PKLM: A flexible MCAR test using Classification. In: Psychometrika (2025), pp. 1–24.2025Article
Deng, G., C. Han, and D. S. Matteson. Extended missing data imputation via GANs for ranking applications. In: Data Mining and Knowledge Discovery 36.4 (2022), pp. 1498–1520.2022Article
Fang, F. and S. Bao. FragmGAN: generative adversarial nets for fragmentary data imputation and prediction. In: Statistical Theory and Related Fields 8.1 (2024), pp. 15–28.2024Article
Abayomi, K., A. Gelman, and M. Levy. Diagnostics for multivariate imputations. In: Journal of the Royal Statistical Society, Series C (Applied Statistics) 57.3 (2008), pp. 273-291.2008Article
Albert, P. S. and D. A. Follmann. Modeling repeated count data subject to informative dropout. In: Biometrics 56.3 (2000), pp. 667-677.2000Article
Allison, P. D. Missing Data. Quantitative Applications in the Social Sciences. Thousand Oaks, CA, USA: Sage Publications, 2001. ISBN: 9780761916727.2001Book
Andridge, R. and R. J. A. Little. A review of hot deck imputation for survey non-response. In: International Statistical Review 78.1 (2010), pp. 40-64.2010Article
Audigier, V., F. Husson, and J. Josse. A principal component method to impute missing values for mixed data. In: Advances in Data Analysis and Classification 10.1 (2016), pp. 5-26.2016Article
Audigier, V., F. Husson, and J. Josse. Multiple imputation for continuous variables using a Bayesian principal component analysis. In: Journal of Statistical Computation and Simulation 86.11 (2015), pp. 2140-2156.2015Article
Audigier, V., F. Husson, and J. Josse. MIMCA: multiple imputation for categorical variables with multiple correspondence analysis. In: Statistics and Computing 27.2 (2016), pp. 1-18. eprint: 1505.08116.2016Article
Bang, H. and J. M. Robins. Doubly robust estimation in missing data and causal inference models. In: Biometrics 61.4 (2005), pp. 962-973.2005Article
Baraldi, A. N. and C. K. Enders. An introduction to modern missing data analysis. In: Journal of School Psychology 48.1 (2010), pp. 5-37.2010Article
Beretta, L. and A. Santaniello. Nearest neighbor imputation algorithms: a critical evaluation. In: BMC Medical Informatics and Decision Making. Proceedings of the 5th Translational Bioinformatics Conference (TBC 2015): medical informatics and decision making 16.Supp. 3 (2016), p. 74.2016Article
Bartlett, J. W., O. Harel, and J. R. Carpenter. Asymptotically unbiased estimation of exposure odds ratios in complete records logistic regression. In: American journal of epidemiology 182.8 (2015), pp. 730–736.2015Article
Beaulac, C. and J. S. Rosenthal. BEST: A decision tree algorithm that handles missing values. In: arXiv preprint (2018). eprint: 1804.10168.2018Article
Bengio, Y. and F. Gingras. Recurrent neural networks for missing or asynchronous data. In: Proceedings of the 8th International Conference on Neural Information Processing Systems. (Nov. 27, 1995-Dec. 02, 1995). Cambridge, MA, USA: MIT Press, 1995, pp. 395-401.1995Paper
Bertsimas, D., C. Pawlowski, and Y. D. Zhuo. From predictive methods to missing data imputation: an optimization approach. In: The Journal of Machine Learning Research 18.1 (2017), pp. 7133–7171.2017Article
Beunckens, C., G. Molenberghs, G. Verbeke, et al. A latent-class mixture model for incomplete longitudinal Gaussian data. In: Biometrics 64.1 (2008), pp. 96–105.2008Article
Bianchi, F. M., L. Livi, K. Ø. Mikalsen, et al. Learning representations of multivariate time series with missing data. In: Pattern Recognition 96 (2019), p. 106973.2019Article
Biessmann, F., D. Salinas, S. Schelter, et al. "Deep" Learning for Missing Value Imputation in Tables with Non-Numerical Data. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management. CIKM ’18. Torino, Italy: ACM, 2018, pp. 2017–2025. ISBN: 978-1-4503-6014-2.2018Paper
Blake, H. A., C. Leyrat, K. Mansfield, et al. Propensity scores using missingness pattern information: a practical guide. In: arXiv preprint (2019). arXiv: 1901.03981 [stat.ME].2019Article
Brinis, S., C. Traina, and A. J. Traina. Hollow-tree: a metric access method for data with missing values. In: Journal of Intelligent Information Systems (2019), pp. 1–28.2019Article
Buck, S. F. A method of estimation of missing values in multivariate data suitable for use with an electronic computer. In: Journal of the Royal Statistical Society, Series B 22 (1960), pp. 302-306.1960Article
Burns, R. M. Multiple and replicate item imputation in a complex sample survey. In: Proceedings of the 6th Annual Research Conference. Ed. by Bureau of the Census. Washington DC, USA, 1990, pp. 655-665.1990Paper
Candès, E. J., C. A. Sing-Long, and J. D. Trzasko. Unbiased risk estimates for singular value thresholding and spectral estimators. In: IEEE Transactions on Signal Processing 61.19 (2013), pp. 4643-4657.2013Article
Carpenter, J. R., M. G. Kenward, and S. Vansteelandt. A comparison of multiple imputation and doubly robust estimation for analyses with missing data. In: Journal of the Royal Statistical Society: Series A (Statistics in Society) 169.3 (2006), pp. 571–584.2006Article
Carpenter, J. and M. Kenward. Multiple Imputation and its Application. Chichester, West Sussex, UK: Wiley, 2013. ISBN: 9780470740521.2013Book
Chen, T. and C. Guestrin. XGBoost: A Scalable Tree Boosting System. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. (Aug. 13, 2016-Aug. 17, 2016). New York, NY, USA: ACM, 2016, pp. 785-794. ISBN: 0450342322.2016Paper
Chen, Y. and M. Sadinle. Nonparametric Pattern-Mixture Models for Inference with Missing Data. In: arXiv preprint (2019). arXiv: 1904.11085 [stat.ME].2019Article
Chen, J. and J. Shao. Nearest neighbor imputation for survey data. In: Journal of Official Statistics 16.2 (2000), pp. 113-131.2000Article
Collins, L. M., J. L. Schafer, and C. Kam. A comparison of inclusive and restrictive strategies in modern missing data procedures. In: Psychological Methods 6.4 (2001), pp. 330-351.2001Article
Cranmer, S. J. and J. Gill. We have to be discrete about this: a non-parametric imputation technique for missing categorical data. In: British Journal of Political Science 43 (2012), pp. 425-449.2012Article
Crookston, N. L. and A. O. Finley. yaImpute: an R package for kNN imputation. In: Journal of Statistical Software 23 (2008), p. 10.2008Article
Dax, A. Imputing Missing Entries of a Data Matrix: A review. In: Journal of Advanced Computing 3.3 (2014), pp. 98-222.2014Article
Dempster, A. P., N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. In: Journal of the Royal Statistical Society, Series B (Methodological) 39.1 (1977), pp. 1-38.1977Article
Diggle, P. and M. G. Kenward. Informative drop-out in longitudinal data analysis. In: Journal of the Royal Statistical Society, Series C (Applied Statistics) 43.1 (1994), pp. 49-93.1994Article
Ding, P. and F. Li. Causal Inference: A Missing Data Perspective. In: Statistical Science 33.2 (2018), pp. 214–237.2018Article
Ding, Y. and J. S. Simonoff. An investigation of missing data methods for classification trees applied to binary response data. In: Journal of Machine Learning Research 11.1 (2010), pp. 131-170.2010Article
Dong, Y. and C. J. Peng. Principled missing data methods for researchers. In: SpringerPlus 2 (2013), p. 222.2013Article
Enders, C. K. Applied Missing Data Analysis. Guilford Press, 2010, p. 401. ISBN: 9781606236390.2010Book
Enders, C. K. A primer on maximum likelihood algorithms available for use with missing data. In: Structural Equation Modeling 8.1 (2001), pp. 128-141.2001Article
Erler, N. S., D. Rizopoulos, and E. M. Lesaffre. JointAI: joint analysis and imputation of incomplete data in R. In: arXiv preprint (2019).2019Article
Hunt, L. and M. Jorgensen. Mixture model clustering for mixed data with missing information. In: Computational Statistics & Data Analysis 41.3-4 (2003), pp. 429–440.2003Article
Jiang, W., M. Bogdan, J. Josse, et al. Adaptive Bayesian SLOPE–High-dimensional Model Selection with Missing Values. In: arXiv preprint (2019).2019Article
Chi, J. T., E. C. Chi, and R. G. Baraniuk. k-pod: A method for k-means clustering of missing data. In: The American Statistician 70.1 (2016), pp. 91–99.2016Article
Fang, F., J. Zhao, and J. Shao. Imputation-based adjusted score equations in generalized linear models with nonignorable missing covariate values. In: Statistica Sinica 28.4 (2018), pp. 1677–1701.2018Article
Fay, R. E. Alternative paradigms for the analysis of imputed survey data. In: Journal of the American Statistical Association 91.434 (1996), pp. 490-498.1996Article
Fellegi, I. P. and D. Holt. A systematic approach to automatic edit and imputation. In: Journal of the American Statistical Association 71.353 (1976), pp. 17-35.1976Article
Ferrari, P. A., P. Annoni, A. Barbiero, et al. An imputation method for categorical variables with application to nonlinear principal component analysis. In: Computational Statistics & Data Analysis 55.7 (2011), pp. 2410-2420.2011Article
Finkbeiner, C. Estimation for the multiple factor model when data are missing. In: Psychometrika 44.4 (1979), pp. 409-420.1979Article
Fitzmaurice, G. M., G. Molenberghs, and S. R. Lipsitz. Regression Models for Longitudinal Binary Responses with Informative Drop-Outs. In: Journal of the Royal Statistical Society. Series B (Methodological) 57.4 (1995), pp. 691–704.1995Article
Follmann, D. and M. Wu. An approximate generalized linear model with random effects for informative missing data. In: Biometrics 51.1 (1995), pp. 151-168.1995Article
Gad, A. M. and N. M. M. Darwish. A shared parameter model for longitudinal data with missing values. In: American Journal of Applied Mathematics and Statistics 1.2 (2013), pp. 30-35.2013Article
Gelman, A., G. King, and C. Liu. Not asked and not answered: Multiple imputation for multiple surveys. In: Journal of the American Statistical Association 93.443 (1998), pp. 846–857.1998Article
Gelman, A., I. van Mechelen, G. Verbeke, et al. Multiple Imputation for Model Checking: Completed-Data Plots with Missing and Latent Data. In: Biometrics 61.1 (2005), pp. 74–85.2005Article
Gill, R. D., M. J. Van Der Laan, and J. M. Robins. Coarsening at random: Characterizations, conjectures, counter-examples. In: Proceedings of the First Seattle Symposium in Biostatistics. Springer. 1997, pp. 255–294.1997Paper
Golden, R. M., S. S. Henley, H. White, et al. Consequences of model misspecification for maximum likelihood estimation with missing data. In: Econometrics 7.3 (2019), p. 37.2019Article
Gondara, L. and K. Wang. MIDA: Multiple Imputation using Denoising Autoencoders. In: Proceedings of the 22nd Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2018). (Jun. 03, 2018-Jun. 06, 2018). Ed. by D. Phung, V. Tseng, G. Webb, B. Ho, M. Ganji and L. Rashidi. Lecture Notes in Computer Science. Springer International Publishing, 2018, pp. 260-272. ISBN: 3319930404.2018Paper
Goodfellow, I., M. Mirza, A. Courville, et al. Multi-Prediction Deep Boltzmann Machines. In: Proceedings of the 26th International Conference on Neural Information Processing Systems. (Dec. 05, 2013-Dec. 10, 2013). Ed. by C. Burges, L. Bottou, M. Welling, Z. Ghahramani and K. Weinberger. Advances in Neural Information Processing Systems 26. Curran Associates, Inc., 2013, pp. 548–556.2013Paper
Graham, J. W. Missing data analysis: making it work in the real world. In: Annual Review of Psychology 60 (2009), pp. 549-576.2009Article
Graham, J. W., A. E. Olchowski, and T. E. Gilreath. How many imputations are really needed? Some practical clarifications of multiple imputation theory. In: Prevention Science 8.3 (2007), pp. 206-213.2007Article
Graham, J. W., S. M. Hofer, S. I. Donaldson, et al. The Science of Prevention: Methodological Advances from Alcohol and Substance Abuse Research. In: The Science of Prevention: Methodological Advances from Alcohol and Substance Abuse Research. Ed. by K. Bryant, M. Windle and S. West. Washington, DC, USA: American Psychological Association, 1997. Chap. Analysis with missing data in prevention research, pp. 325-366. ISBN: 1-55798-439-5.1997Book
Heckman, J. J. The common structure of statistical models of truncation, sample selection and limited dependent variables and a simple estimator for such models. In: Annals of Economic and Social Measurement 5.4 (1976), pp. 475-492.1976Article
Heckman, J. Sample selection bias as a specification error. In: Econometrica 47.1 (1979), pp. 153-161.1979Article
Hogan, J. W. and N. M. Laird. Mixture models for the joint distribution of repeated measures and event times. In: Statistics in Medicine 16.1-3 (1997), pp. 239-257.1997Article
Hogan, J. W. and T. Lancaster. Instrumental variables and inverse probability weighting for causal inference from longitudinal observational studies. In: Statistical Methods in Medical Research 13.1 (2004), pp. 17-48.2004Article
Honaker, J., G. King, and M. Blackwell. Amelia II: a program for missing data. In: Journal of Statistical Software 45.7 (2011). eprint: arXiv:1501.0228.2011Article
Horton, N. J. and K. P. Kleinman. Much Ado About Nothing - A Comparison of Missing Data Methods and Software to Fit Incomplete Data Regression Models. In: The American Statistician 61.1 (2007), pp. 79-90.2007Article
Hothorn, T., K. Hornik, and A. Zeileis. Unbiased Recursive Partitioning: A Conditional Inference Framework. In: Journal of Computational and Graphical Statistics 15.3 (2006), pp. 651-674.2006Article
Huisman, M. Imputation of missing item responses: some simple techniques. In: Quality & Quantity 34.4 (2000), pp. 331-351.2000Article
Husson, F. and J. Josse. Handling missing values in multiple factor analysis. In: Food Quality and Preference 30 (2013), pp. 77-85.2013Article
Ibrahim, J. G., M. Chen, and S. R. Lipsitz. Missing responses in generalised linear mixed models when the missing data mechanism is nonignorable. In: Biometrika 88.2 (2001), pp. 551-564.2001Article
Ibrahim, J. G., S. R. Lipsitz, and M. Chen. Missing Covariates in Generalized Linear Models When the Missing Data Mechanism is Non-Ignorable. In: Journal of the Royal Statistical Society. Series B (Statistical Methodology) 61.1 (1999), pp. 173-190.1999Article
Ilin, A. and T. Raiko. Practical approaches to Principal Component Analysis in the presence of missing values. In: Journal of Machine Learning Research 11 (2010), pp. 1957-2000.2010Article
Imbert, A., A. Valsesia, C. Le Gall, et al. Multiple hot-deck imputation for network inference from RNA sequencing data. In: Bioinformatics 34.10 (2018), pp. 1726-1732.2018Article
Ipsen, N., P. Mattei, and J. Frellsen. How to deal with missing data in supervised deep learning? In: ICML Workshop on the Art of Learning with Missing Values (Artemiss). 2020.2020Paper
Ipsen, N. B., P. Mattei, and J. Frellsen. not-MIWAE: Deep generative modelling with missing not at random data. In: arXiv preprint (2020).2020Article
Jamshidian, M., S. Jalal, and C. Jansen. MissMech: an R package for testing homoscedasticity, multivariate normality, and missing completely at random (MCAR). In: Journal of Statistical Software 56.6 (2014), pp. 1-31.2014Article
Jamshidian, M. and S. Jalal. Tests of homoscedasticity, normality, and missing completely at random for incomplete multivariate data. In: Psychometrika 75.4 (2010), pp. 649-674. eprint: NIHMS150003.2010Article
Jiang, W., J. Josse, and M. Lavielle. Logistic Regression with Missing Covariates–Parameter Estimation, Model Selection and Prediction. In: arXiv preprint (2018). arXiv: 1805.04602 [stat.ME].2018Article
Joenssen, D. W. and U. Bankhofer. Donor limited hot deck imputation: effect on parameter estimation. In: Journal of Theoretical and Applied Computer Science 6.3 (2012), pp. 58-70.2012Article
Jones, M. P. Indicator and Stratification Methods for Missing Explanatory Variables in Multiple Linear Regression. In: Journal of the American Statistical Association 91.433 (1996), pp. 222-230.1996Article
Jönsson, P. and C. Wohlin. An evaluation of k-nearest neighbour imputation using Likert data. In: Proceedings of the 10th International Symposium on Software Metrics. (Sep. 14, 2004-Sep. 16, 2004). Chicago, IL, USA: IEEE, 2004, pp. 1530-1435. ISBN: 0769521290.2004Paper
Josse, J., N. Prost, E. Scornet, et al. On the consistency of supervised learning with missing values. In: arXiv preprint (2019). arXiv: 1902.06931 [stat.ML].2019Article
Josse, J., J. Pagès, and F. Husson. Multiple imputation in principal component analysis. In: Advances in Data Analysis and Classification 5.3 (2011), pp. 231-246.2011Article
Josse, J., M. Chavent, B. Liquet, et al. Handling missing values with regularized iterative multiple correspondence analysis. In: Journal of Classification 29.1 (2012), pp. 91-116.2012Article
Josse, J., F. Husson, and J. Pagès. Gestion des données manquantes en Analyse en Composantes Principales. In: Journal de la Société Française de Statistique 150.2 (2009), pp. 28-51.2009Article
Josse, J. and F. Husson. Handling missing values in exploratory multivariate data analysis methods. In: Journal de la Société Française de Statistique 153.2 (2012), pp. 79-99.2012Article
Josse, J. and F. Husson. missMDA: a package for handling missing values in multivariate data analysis. In: Journal of Statistical Software 70.1 (2016), pp. 1-31.2016Article
Kaiser, J. Dealing with missing values in data. In: Journal of Systems Integration 5.1 (2014), pp. 42-51.2014Article
Kallus, N., X. Mao, and M. Udell. Causal Inference with Noisy and Missing Covariates via Matrix Factorization. In: Advances in Neural Information Processing Systems. 2018. eprint: 1806.00811.2018Paper
Kalton, G. and D. Kasprzyk. The treatment of missing survey data. In: Survey Methodology 12.1 (1986), pp. 1-16.1986Article
Kapelner, A. and J. Bleich. Prediction with missing data via Bayesian additive regression trees. In: Canadian Journal of Statistics 43.2 (2015), pp. 224-239.2015Article
Khosravi, P., A. Vergari, Y. Choi, et al. Handling missing data in decision trees: A probabilistic approach. In: arXiv preprint arXiv:2006.16341 (2020).2020Article
Kim, J. K. and J. Shao. Statistical Methods for Handling Incomplete Data. Boca Raton, FL, USA: Chapman and Hall/CRC, 2013. ISBN: 9781482205077.2013Book
Kohn, R. and C. F. Ansley. Estimation, prediction, and interpolation for ARIMA models with missing data. In: Journal of the American Statistical Association 81.395 (1986), pp. 751-761.1986Article
Kowarik, A. and M. Templ. Imputation with the R Package VIM. In: Journal of Statistical Software 74.7 (2016), pp. 1-16.2016Article
Kropko, J., B. Goodrich, A. Gelman, et al. Multiple Imputation for Continuous and Categorical Data: Comparing Joint Multivariate Normal and Conditional Approaches. In: Political Analysis 22.4 (2014), pp. 497–519.2014Article
Larose, C., D. K. Dey, and O. Harel. The impact of missing values on different measures of uncertainty. In: Statistica Sinica 29.2 (2019), pp. 551–566.2019Article
Lee, K. M., R. Mitra, and S. Biedermann. Optimal design when outcome values are not missing at random. In: Statistica Sinica 28.4 (2018), pp. 1821–1838.2018Article
Lee, K. J., K. Tilling, R. P. Cornish, et al. Framework for the Treatment And Reporting of Missing data in Observational Studies: The Treatment And Reporting of Missing data in Observational Studies framework. In: Journal of clinical epidemiology 134 (2021), pp. 79–88.2021Article
Little, R. J. A. A test of missing completely at random for multivariate data with missing values. In: Journal of the American Statistical Association 83.404 (1988), pp. 1198-1202.1988Article
Little, R. J. A. Regression with missing X’s: a review. In: Journal of the American Statistical Association 87.420 (1992), pp. 1227-1237.1992Article
Little, R. J. A. Pattern-mixture models for multivariate incomplete data. In: Journal of the American Statistical Association 88.421 (1993), pp. 125-134.1993Article
Little, R. J. A. Modeling the drop-out mechanism in repeated-measures studies. In: Journal of the American Statistical Association 90.431 (1995), pp. 1112-1121.1995Article
Little, R. J. A. and D. B. Rubin. Statistical Analysis with Missing Data. Wiley, 2002, p. 408. ISBN: 0471183865.2002Book
Loh, P. and M. J. Wainwright. High-dimensional regression with noisy and missing data: Provable guarantees with non-convexity. In: Advances in Neural Information Processing Systems. Ed. by J. Shawe-Taylor, R. Zemel, P. Bartlett, F. Pereira and K. Q. Weinberger. Vol. 24. Curran Associates, Inc., 2011, pp. 2726–2734.2011Paper
Londschien, M., S. Kovács, and P. Bühlmann. Change point detection for graphical models in presence of missing values. 2019. arXiv: 1907.05409 [stat.ML].2019Misc
Louis, T. A. Finding the Observed Information Matrix when Using the EM Algorithm. In: Journal of the Royal Statistical Society. Series B (Methodological) 44.2 (1982), pp. 226–233.1982Article
Lüdtke, O., A. Robitzsch, and S. G. West. Regression models involving nonlinear effects with missing data: A sequential modeling approach using Bayesian estimation. In: Psychological methods (2019).2019Article
Ma, A. and D. Needell. Stochastic Gradient Descent for Linear Systems with Missing Data. In: Numerical Mathematics: Theory, Methods and Applications 12.1 (2017), pp. 1-20.2017Article
Ma, W. and G. H. Chen. Missing Not at Random in Matrix Completion: The Effectiveness of Estimating Missingness Probabilities Under a Low Nuclear Norm Assumption. In: Advances in Neural Information Processing Systems 32. Ed. by H. Wallach, H. Larochelle, A. Beygelzimer, F. d. Alché-Buc, E. Fox and R. Garnett. Curran Associates, Inc., 2019, pp. 14900–14909.2019Paper
Mattei, P. and J. Frellsen. MIWAE: Deep generative modelling and imputation of incomplete data sets. In: Proceedings of the 36th International Conference on Machine Learning. Vol. 97. Proceedings of Machine Learning Research. Kamalika Chaudhuri and Ruslan Salakhutdinov, 2019, pp. 4413–4423.2019Paper
McLachlan, G. J. and T. Krishnan. The EM Algorithm and Extensions. Wiley series in probability and statistics. Hoboken, NJ, USA: Wiley, 2008. ISBN: 9780471201700.2008Book
Meng, X. L. and D. B. Rubin. Maximum likelihood estimation via the ECM algorithm: a general framework. In: Biometrika 80.2 (1993), pp. 267-278.1993Article
Meng, X. L. and D. B. Rubin. Using EM to obtain asymptotic variance-covariance matrices: the SEM algorithm. In: Journal of the American Statistical Association 86.416 (1991), pp. 899-909.1991Article
Meng, X. L. You want me to analyze data I don’t have? Are you insane? In: Shanghai Archives of Psychiatry 24.5 (2012), pp. 287-301.2012Article
Miao, W. and E. J. Tchetgen Tchetgen. Identification and inference with nonignorable missing covariate data. In: Statistica Sinica 28.4 (2018), pp. 2049–2067.2018Article
Moeur, M. and A. R. Stage. Most similar neighbor: an improved sampling inference procedure for natural resources planning. In: Forest Science 42.1 (1995), pp. 337-359.1995Article
Mohan, K., F. Thoemmes, and J. Pearl. Estimation with Incomplete Data: The Linear Case. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18. International Joint Conferences on Artificial Intelligence Organization, Jul. 2018, pp. 5082–5088.2018Paper
Mohan, K. and J. Pearl. Graphical Models for Processing Missing Data. Tech. rep. R-473-L. Forthcoming, Journal of American Statistical Association (JASA). CA: Department of Computer Science, University of California, Los Angeles, 2019.2019Misc
Molenberghs, G., G. Fitzmaurice, M. G. Kenward, et al. Handbook of Missing Data Methodology. Chapman & Hall/CRC Handbooks of Modern Statistical Methods. New York, NY, USA: Chapman and Hall/CRC, 2014. ISBN: 9781439854624.2014Book
Molenberghs, G., B. Michiels, M. G. Kenward, et al. Monotone missing data and pattern-mixture models. In: Statistica Neerlandica 52.2 (1998), pp. 153-161.1998Article
Molenberghs, G. and M. G. Kenward. Missing Data in Clinical Studies. Chichester, West Sussex, UK: Wiley, 2007. ISBN: 9780470849811.2007Book
Molnar, F. J., B. Hutton, and D. Fergusson. Does analysis using last observation carried forward introduce bias in dementia research? In: Canadian Medical Association Journal 179.8 (2008), pp. 751-753.2008Article
Moritz, S. and T. Bartz-Beielstein. imputeTS: time series missing value imputation in R. In: The R Journal 9.1 (2017), pp. 207-218.2017Article
Moritz, S., A. Sardá, T. Bartz-Beielstein, et al. Comparison of different methods for univariate time series imputation in R. Preprint arXiv 1510.03924. 2015.2015Misc
Le Morvan, M., N. Prost, J. Josse, et al. Linear predictor on linearly-generated data with missing values: non consistency and solutions. In: Proceedings of Machine Learning Research. Vol. 108. 2020, pp. 3165–3174. eprint: 2002.00658v2.2020Paper
Le Morvan, M., J. Josse, T. Moreau, et al. NeuMiss networks: differentiable programming for supervised learning with missing values. In: Advances in Neural Information Processing Systems, 33. (Dec. 2020). IEEE, 2020. eprint: 2007.01627v4.2020Paper
Le Morvan, M., J. Josse, E. Scornet, et al. What’s a good imputation to predict with missing values? 2021.2021Misc
Murray, J. S. and J. P. Reiter. Multiple Imputation of Missing Categorical and Continuous Values via Bayesian Mixture Models With Local Dependence. In: Journal of the American Statistical Association 111.516 (2016), pp. 1466-1479.2016Article
Muzellec, B., J. Josse, C. Boyer, et al. Missing Data Imputation using Optimal Transport. In: International Conference on Machine Learning. PMLR. 2020, pp. 7130–7140.2020Paper
Tang, F. and H. Ishwaran. Random forest missing data algorithms. In: Statistical Analysis and Data Mining: The ASA Data Science Journal 10.6 (2017), pp. 363–377.2017Article
Nabi, R., R. Bhattacharya, and I. Shpitser. Full Law Identification In Graphical Models Of Missing Data: Completeness Results. In: arXiv preprint arXiv:2004.04872 (2020).2020Article
National Research Council, U. The Prevention and Treatment of Missing Data in Clinical Trials. Washington (DC), USA: National Academies Press, 2010. ISBN: 9780309158145.2010Book
Nguyen, L. T., J. Kim, and B. Shim. Low-Rank Matrix Completion: A Contemporary Survey. In: IEEE Access 7 (2019), pp. 94215–94237.2019Article
Nowicki, R. K., R. Scherer, and L. Rutkowski. Novel rough neural network for classification with missing data. In: 21st International Conference on Methods and Models in Automation and Robotics (MMAR). (Sep. 29, 2016-Sep. 01, 2016). IEEE, 2016, pp. 820–825.2016Paper
O’Kelly, M. and B. Ratitch. Clinical Trials with Missing Data: A Guide for Practitioners. John Wiley & Sons, Ltd, 2014.2014Book
Orchard, T. and M. A. Woodbury. A missing information principle: theory and applications. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Theory of Statistics. Ed. by L. M. Le Cam, J. Neyman and E. L. Scott. Vol. 1. University of California Press, 1972, pp. 697–715.1972Paper
Peugh, J. L. and C. K. Enders. Missing data in educational research: a review of reporting practices and suggestions for improvement. In: Review of Educational Research 74.4 (2004), pp. 525–556.2004Article
Pigott, T. D. A review of methods for missing data. In: Educational Research and Evaluation 7.4 (2001), pp. 353–383.2001Article
Preisser, J. S., K. K. Lohman, and P. J. Rathouz. Performance of weighted estimating equations for longitudinal binary data with drop-outs missing at random. In: Statistics in Medicine 21.20 (2002), pp. 3035–3054.2002Article
Quartagno, M. and J. R. Carpenter. Multiple imputation for discrete data: Evaluation of the joint latent normal model. In: Biometrical Journal 61.4 (2019), pp. 1003–1019.2019Article
Rahman, G. and Z. Islam. Missing value imputation using decision trees and decision forests by splitting and merging records: Two novel techniques. In: Knowledge-Based Systems 53 (2013), pp. 51–65.2013Article
Rao, J. N. K. and J. Shao. Jackknife variance estimation with survey data under hot deck imputation. In: Biometrika 79.4 (1992), pp. 811-822.1992Article
Reilly, M. and M. Pepe. The relationship between hot-deck multiple imputation and weighted likelihood. In: Statistics in Medicine 16.1-3 (1997), pp. 5-19.1997Article
Reiter, J. P. and M. Sadinle. Itemwise conditionally independent nonresponse modelling for incomplete multivariate data. In: Biometrika 104.1 (Jan. 2017), pp. 207-220.2017Article
Rieger, A., T. Hothorn, and C. Strobl. Random forests with missing values in the covariates. Tech. rep. 79. University of Munich, Department of Statistics, 2010.2010Misc
Rioux, C., A. Lewin, O. A. Odejimi, et al. Reflection on modern methods: planned missing data designs for epidemiological research. In: International Journal of Epidemiology (2020).2020Article
Robin, G. Low-rank methods for heterogeneous and multi-source data. 2019.2019Misc
Robin, G., O. Klopp, J. Josse, et al. Main Effects and Interactions in Mixed and Incomplete Data Frames. In: Journal of the American Statistical Association 115.531 (2020), pp. 1292-1303. eprint: https://doi.org/10.1080/01621459.2019.1623041.2020Article
Robins, J. M., A. Rotnitzky, and L. P. Zhao. Estimation of Regression Coefficients When Some Regressors are not Always Observed. In: Journal of the American Statistical Association 89.427 (1994), pp. 846-866.1994Article
Robins, J. M., A. Rotnitzky, and L. P. Zhao. Analysis of semiparametric regression models for repeated outcomes in the presence of missing data. In: Journal of the American Statistical Association 90.429 (1995), pp. 106-121.1995Article
Robins, J. M. and N. Wang. Inference for imputation estimators. In: Biometrika 87.1 (2000), pp. 113-124.2000Article
Rosseel, Y. lavaan: an R package for structural equation modeling. In: Journal of Statistical Software 48.2 (2012).2012Article
Rotnitzky, A., J. M. Robins, and D. O. Scharfstein. Semiparametric regression for repeated outcomes with nonignorable nonresponse. In: Journal of the American Statistical Association 93.444 (1998), pp. 1321-1339.1998Article
Rubin, D. B. Inference and missing data. In: Biometrika 63.3 (1976), pp. 581-592.1976Article
Rubin, D. B. Formalizing subjective notions about the effect of nonrespondents in sample surveys. In: Journal of the American Statistical Association 72.359 (1977), pp. 538-543.1977Article
Rubin, D. B. Multiple imputation after 18+ years. In: Journal of the American Statistical Association 91.434 (1996), pp. 473-489.1996Article
Rubin, D. B. Multiple Imputation for Nonresponse in Surveys. Hoboken, NJ, USA: Wiley, 1987. ISBN: 9780471655740.1987Book
Sadinle, M. and J. P. Reiter. Sequential Identification of Nonignorable Missing Data Mechanisms. In: Statistica Sinica 28.4 (2018), pp. 1741–1759.2018Article
Sadinle, M. and J. P. Reiter. Sequentially additive nonignorable missing data modeling using auxiliary marginal information. In: arXiv preprint (2019). arXiv: 1902.06043 [stat.ME].2019Article
Santos, M. S., R. C. Pereira, A. F. Costa, et al. Generating Synthetic Missing Data: A Review by Missing Mechanism. In: IEEE Access 7 (2019), pp. 11651–11667.2019Article
Schafer, J. L. Analysis of Incomplete Multivariate Data. CRC Monographs on Statistics & Applied Probability. Boca Raton, FL, USA: Chapman and Hall/CRC, 1997. ISBN: 0412040611.1997Book
Schafer, J. L. and J. W. Graham. Missing data: our view of the state of the art. In: Psychological Methods 7.2 (2002), pp. 147-177.2002Article
Schafer, J. L. and M. K. Olsen. Multiple Imputation for multivariate missing-data problems: a data analyst’s perspective. In: Multivariate Behavioral Research 33.4 (1998), pp. 545-571.1998Article
Schafer, J. L. Multiple imputation: a primer. In: Statistical Methods in Medical Research 8.1 (1999), pp. 3-15.1999Article
Seaman, S., J. Galati, D. Jackson, et al. What Is Meant by "Missing at Random"? In: Statistical Science 28.2 (2013), pp. 257–268.2013Article
Seaman, S. R. and S. Vansteelandt. Introduction to Double Robust Methods for Incomplete Data. In: Statistical Science 33.2 (2018), p. 184.2018Article
Seaman, S. R. and I. R. White. Review of inverse probability weighting for dealing with missing data. In: Statistical Methods in Medical Research 22.3 (2011), pp. 278-295.2011Article
Shao, J. and J. Zhang. A transformation approach in linear mixed-effects models with informative missing responses. In: Biometrika 102.1 (2015), pp. 107-119.2015Article
Sharpe, P. K. and R. J. Solly. Dealing with missing values in neural network-based diagnostic systems. In: Neural Computing & Applications 3.2 (1995), pp. 73-77.1995Article
Simon, G. A. and J. S. Simonoff. Diagnostic plots for missing data in least squares regression. In: Journal of the American Statistical Association 81.394 (1986), pp. 501-509.1986Article
Śmieja, M., Ł. Struski, J. Tabor, et al. Processing of missing data by neural networks. In: Computing Research Repository abs/1805.07405 (2018). eprint: 1805.07405.2018Article
Sovilj, D., E. Eirola, Y. Miche, et al. Extreme learning machine for missing data using multiple imputations. In: Neurocomputing 174.A (2016), pp. 220-231.2016Article
Sportisse, A., C. Boyer, and J. Josse. Imputation and low-rank estimation with Missing Not At Random data. In: Statistics and Computing 30.6 (2020), pp. 1629-1643.2020Article
Sportisse, A., C. Boyer, and J. Josse. Estimation with informative missing data in the low-rank model with random effects. In: Advances in Neural Information Processing Systems, 33. (Dec. 2020). IEEE, 2020. eprint: 1906.02493v3.2020Paper
Sportisse, A., C. Boyer, A. Dieuleveut, et al. Debiasing Averaged Stochastic Gradient Descent to handle missing values. In: Advances in Neural Information Processing Systems, 33. (Dec. 2020). IEEE, 2020. eprint: 2002.09338v2.2020Paper
Stacklies, W., H. Redestig, M. Scholz, et al. pcaMethods – a bioconductor package providing PCA methods for incomplete data. In: Bioinformatics 23.9 (2007), pp. 1164-1167.2007Article
Stage, A. R. and N. L. Crookston. Partitioning error components for accuracy-assessment of near-neighbor methods of imputation. In: Forest Science 53.1 (2007), pp. 62-72.2007Article
Stekhoven, D. J. and P. Bühlmann. Missforest-non-parametric missing value imputation for mixed-type data. In: Bioinformatics 28.1 (2012), pp. 112-118. eprint: 1105.0828.2012Article
Strobl, C., A. L. Boulesteix, and T. Augustin. Unbiased split selection for classification trees based on the Gini Index. In: Computational Statistics & Data Analysis 52.1 (2007), pp. 483-501.2007Article
Stuart, E. A., M. Azur, C. Frangakis, et al. Multiple imputation with large data sets: a case study of the children’s mental health initiative. In: American Journal of Epidemiology 169.9 (2009), pp. 1133-1139.2009Article
Stubbendick, A. L. and J. G. Ibrahim. Maximum Likelihood Methods for Nonignorable Missing Responses and Covariates in Random Effects Models. In: Biometrics 59.4 (2003), pp. 1140–1150.2003Article
Stubbendick, A. L. and J. G. Ibrahim. Likelihood-based inference with nonignorable missing responses and covariates in models for discrete longitudinal data. In: Statistica Sinica 16.4 (2006), pp. 1143–1167.2006Article
Su, Y. S., A. Gelman, J. Hill, et al. Multiple imputation with diagnostics (mi) in R: opening windows into the black box. In: Journal of Statistical Software 45 (2011), p. 2.2011Article
Tabouy, T., P. Barbillon, and J. Chiquet. Variational inference for stochastic block models from sampled data. In: Journal of the American Statistical Association 115.529 (2020), pp. 455–466.2020Article
Tanner, M. A. and W. Wong. The calculation of posterior distributions by data augmentation. In: Journal of the American Statistical Association 82.398 (1987), pp. 528-540.1987Article
Tchetgen Tchetgen, E. J., L. Wang, and B. Sun. Discrete choice models for nonmonotone nonignorable missing data: identification and inference. In: Statistica Sinica 28.4 (2018), pp. 2069–2088.2018Article
Templ, M., A. Alfons, and P. Filzmoser. Exploring Incomplete data using visualization techniques. In: Advances in Data Analysis and Classification 6.1 (2012), pp. 29-47.2012Article
Thijs, H., G. Molenberghs, B. Michiels, et al. Strategies to fit pattern-mixture models. In: Biostatistics 3.2 (2002), pp. 245-265.2002Article
Tierney, N. and D. Cook. Expanding tidy data principles to facilitate missing data exploration, visualization and assessment of imputations. Monash Econometrics and Business Statistics Working Papers 14/18. Monash University, Department of Econometrics and Business Statistics, 2018.2018Misc
Tierney, N. J., F. A. Harden, M. J. Harden, et al. Using decision trees to understand structure in missing data. In: BMJ Open 5.6 (2015), p. e007450.2015Article
Tran, L., X. Liu, J. Zhou, et al. Missing Modalities Imputation via Cascaded Residual Autoencoder. In: 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). (Jul. 21, 2017-Jul. 26, 2017). IEEE, 2017, pp. 4971-4980.2017Paper
Troyanskaya, O., M. Cantor, G. Sherlock, et al. Missing value estimation methods for DNA microarrays. In: Bioinformatics 17.6 (2001), pp. 520-525.2001Article
Twala, B. E. T. H., M. C. Jones, and D. J. Hand. Good methods for coping with missing data in decision trees. In: Pattern Recognition Letters 29.7 (2008), pp. 950-956.2008Article
Unnebrink, K. and J. Windeler. Intention-to-treat: methods for dealing with missing values in clinical trials of progressively deteriorating diseases. In: Statistics in Medicine 20.24 (2001), pp. 3931-3946.2001Article
Buuren, S. van, J. P. L. Brand, C. G. M. Groothuis-Oudshoorn, et al. Fully conditional specification in multivariate imputation. In: Journal of Statistical Computation and Simulation 76.12 (2006), pp. 1049-1064.2006Article
Buuren, S. van. Flexible Imputation of Missing Data. Boca Raton, FL: Chapman and Hall/CRC, 2018.2018Book
Buuren, S. van and K. Groothuis-Oudshoorn. MICE: multivariate imputation by chained equations in R. In: Journal of Statistical Software 45 (2011), p. 3. eprint: NIHMS150003.2011Article
Buuren, S. van. Multiple imputation of discrete and continuous data by fully conditional specification. In: Statistical Methods in Medical Research 16 (2007), pp. 219-242.2007Article
Wal, W. M. van der and R. B. Geskus. ipw: an R package for inverse probability weighting. In: Journal of Statistical Software 43.13 (2011).2011Article
Velden, M. van de and T. H. A. Bijmolt. Generalized canonical correlation analysis of matrices with missing rows: a simulation study. In: Psychometrika (2006).2006Article
Vansteelandt, S., A. Rotnitzky, and J. Robins. Estimation of regression models for the mean of repeated outcomes under nonignorable nonmonotone nonresponse. In: Biometrika 94.4 (2007), pp. 841–860.2007Article
Vansteelandt, S., J. Carpenter, and M. G. Kenward. Analysis of incomplete data using inverse probability weighting and doubly robust estimators. In: Methodology – European Journal of Research Methods for the Behavioral and Social Sciences 6.1 (2010), pp. 37–48.2010Article
Verbanck, M., J. Josse, and F. Husson. Regularised PCA to denoise and visualise data. In: Statistics and Computing 25.2 (2015), pp. 471-486.2015Article
Verbeke, G., G. Molenberghs, H. Thijs, et al. Sensitivity analysis for nonrandom dropout: a local influence approach. In: Biometrics 57.1 (2001), pp. 7-14.2001Article
Voillet, V., P. Besse, L. Liaubet, et al. Handling missing rows in multi-omics data integration: multiple imputation in multiple factor analysis framework. In: BMC Bioinformatics 17.402 (2016). Forthcoming.2016Article
Wainer, H., ed. Drawing Inferences from Self-Selected Samples. New York, NY, USA: Springer, 1986.1986Book
Wang, N. and J. M. Robins. Large-sample theory for parametric multiple imputation procedures. In: Biometrika 85.4 (1998), pp. 935–948.1998Article
White, I. R., J. Carpenter, and N. J. Horton. A mean score method for sensitivity analysis to departures from the missing at random assumption in randomised trials. In: Statistica Sinica 28.4 (2018), pp. 1985–2003.2018Article
Wu, M. C. and R. J. Carroll. Estimation and comparison of changes in the presence of informative right censoring by modeling the censoring process. In: Biometrics 44.1 (1988), pp. 175-188.1988Article
Xie, X. and X. L. Meng. Dissecting multiple imputation from a multi-phase inference perspective: what happens when God’s, imputer’s and analyst’s models are uncongenial? In: Statistica Sinica 27.4 (2017), pp. 1485–1594.2017Article
Xue, F. and A. Qu. Integrating multi-source block-wise missing data in model selection. In: Journal of the American Statistical Association (2020), pp. 1–36.2020Article
Yang, S., L. Wang, and P. Ding. Identification and estimation of causal effects with confounders subject to instrumental missingness. In: Statistics Methodology Repository (2017).2017Article
Yoon, J., J. Jordon, and M. van der Schaar. GAIN: Missing Data Imputation using Generative Adversarial Nets. In: Proceedings of the 35th International Conference on Machine Learning. (Jul. 10, 2018-Jul. 15, 2018). Ed. by J. Dy and A. Krause. Vol. 80. Proceedings of Machine Learning Research. Stockholmsmässan, Stockholm Sweden: PMLR, 2018, pp. 5689–5698.2018Paper
Yoon, S. and S. Sull. GAMIN: Generative Adversarial Multiple Imputation Network for Highly Missing Data. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020, pp. 8456–8464.2020Paper
Zhang, H., P. Xie, and E. Xing. Missing Value Imputation Based on Deep Generative Models. In: Computing Research Repository abs/1808.01684 (2018).2018Article
Zhang, S. Nearest neighbor selection for iterative kNN imputation. In: Journal of Systems and Software 85.11 (2012), pp. 2541-2552.2012Article
Zhao, Y. Statistical inference for missing data mechanisms. In: Statistics in Medicine 39.28 (2020), pp. 4325–4333.2020Article
Zhao, Y. and M. Udell. Missing value imputation for mixed data via gaussian copula. In: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020, pp. 636–646.2020Paper
Zhao, J. and Y. Ma. A versatile estimation procedure without estimating the nonignorable missingness mechanism. In: Journal of the American Statistical Association (2021), pp. 1–15.2021Article
Zhou, Y., R. J. A. Little, and J. D. Kalbfleisch. Block-conditional missing at random models for missing data. In: Statistical Science 25.4 (2010), pp. 517–532.2010Article
Zhu, Z., T. Wang, and R. J. Samworth. High-dimensional principal component analysis with heterogeneous missingness. In: arXiv preprint (2019).2019Article