Julie Josse's research focuses on developing methods for handling missing values and for matrix completion. She also specializes in principal component methods for exploring and visualizing complex data structures. Her main fields of application are the biosciences and public health. Julie Josse is dedicated to reproducible research and has developed many packages, including FactoMineR and missMDA, to disseminate her work. She is a member of the R Foundation and of Forwards, a task force to increase the participation of minorities in the R community.
In many application settings, the data have missing features, which makes data analysis challenging. An abundant literature addresses missing data, as do more than 150 R packages. Funded by the R Consortium, we have created the R-miss-tastic platform, along with a dedicated task view, which aims to give an overview of the main references, contributors, and tutorials, and to offer users the keys to analysing their data. This platform highlights that this is an active field of work and that, as usual, different problems require dedicated methods. In this presentation, I will share my experience on the topic. I will start with the inferential framework, where the aim is to best estimate the parameters and their variance in the presence of missing data. Recent multiple imputation methods have focused on taking into account the heterogeneity of the data (multiple sources, variables of different natures, etc.). Then I will present recent results in a supervised-learning setting. A striking one is that the widely used method of imputing with the mean prior to learning can be consistent. That such a simple approach can be relevant may have important consequences in practice.
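For readers unfamiliar with the approach, here is a minimal sketch of what imputing with the mean prior to learning can look like. It is an illustration only, not code from the talk: the function names `column_means` and `mean_impute`, the toy data, and the choice to encode missing entries as NaN are all assumptions made for the example.

```python
import math

def column_means(rows):
    """Mean of each column, computed over observed (non-NaN) entries only."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        # Assumption for this sketch: a fully missing column falls back to 0.0.
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means

def mean_impute(rows, means):
    """Replace each missing entry with the mean of its column."""
    return [[means[j] if math.isnan(v) else v for j, v in enumerate(r)]
            for r in rows]

nan = float("nan")
X_train = [[1.0, 10.0], [nan, 20.0], [3.0, nan]]  # toy training data
X_test = [[nan, 40.0]]                            # toy test data

# The means are learned on the training set and reused on the test set.
means = column_means(X_train)
X_train_imputed = mean_impute(X_train, means)
X_test_imputed = mean_impute(X_test, means)
```

The completed training table would then be passed to any supervised learner, with the test data filled in using the same training means before prediction.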