Advances in mass spectrometry generate large amounts of lipidomic data, which can be helpful in the search for biomarkers. Data processing is crucial and challenging in lipidomics due to the high dimensionality of the data. Many different dimensionality reduction (DR) techniques have been used successfully in the analysis of omics data, and each method has its own advantages and drawbacks. Nonnegative Matrix Factorization (NMF) has been frequently employed in genomics and proteomics to extract information. However, there have been no studies on the use of NMF in the lipidomics field. Here, we adopt NMF as a DR strategy to compress the breast milk lipid data of lean/overweight and obese mothers to better understand the impact on infant growth and atopic disease outcome during early life. We compared the results with those obtained using the well-known Principal Component Analysis (PCA) and show that this new approach is useful for identifying two subsets of lipids, termed metalipids, that differentiate obese and lean/overweight mothers. NMF also helps to improve the visualization and interpretation of trends according to the maternal group. Additionally, the nonnegative values of metalipids enable strong interpretative power, simple extraction of the key lipids that characterize each group, and the direct possibility of combining the reduced data with clinical characteristics. The post-statistical analysis allowed the identification of key lipid markers of atopic disease, infant growth parameters, and associated metabolic pathways.
Keywords: Lipidomics, Dimensionality Reduction, Nonnegative Matrix Factorization, Principal Component Analysis, Breastmilk lipidome
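
To illustrate how NMF compresses a nonnegative data matrix into a few metalipids, the following minimal sketch factorizes a samples-by-lipids matrix X into nonnegative factors W (samples by metalipids) and H (metalipids by lipids). It is not the pipeline used in this study: the data are synthetic, the solver is the standard Lee-Seung multiplicative update for the Frobenius loss, and the names `nmf`, `top_features`, and `lipid_j` are hypothetical choices made for illustration only.

```python
import numpy as np

def nmf(X, rank, n_iter=500, seed=0, eps=1e-9):
    """Approximate nonnegative X (samples x lipids) as W @ H with W, H >= 0.

    Uses Lee-Seung multiplicative updates for the Frobenius loss
    (an assumption for this sketch, not the solver used in the study).
    """
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    W = rng.random((n_samples, rank)) + eps   # sample scores on each metalipid
    H = rng.random((rank, n_features)) + eps  # lipid loadings of each metalipid
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H

def top_features(H, names, k=3):
    """Return the k highest-loading feature names for each row of H."""
    return [[names[j] for j in np.argsort(row)[::-1][:k]] for row in H]

# Synthetic, purely illustrative data: two groups of 20 samples, 12 lipids.
rng = np.random.default_rng(42)
n_per_group, n_lipids = 20, 12
X = rng.random((2 * n_per_group, n_lipids)) * 0.1  # small nonnegative background
X[:n_per_group, :4] += 1.0    # first group enriched in lipids 0-3
X[n_per_group:, 8:] += 1.0    # second group enriched in lipids 8-11
names = [f"lipid_{j}" for j in range(n_lipids)]

W, H = nmf(X, rank=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)

print("relative reconstruction error:", round(float(rel_error), 3))
print("top lipids per metalipid:", top_features(H, names))
```

Because both factors are nonnegative, each row of H can be read directly as an additive lipid signature and each column of W as the amount of that signature in every sample, which is what makes ranking the key lipids per metalipid straightforward. PCA loadings, by contrast, can take either sign, so the same ranking would need an extra convention for handling negative weights.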