Published online by Cambridge University Press: 03 January 2022
Scanning transmission electron microscopy is a crucial tool for nanoscience, achieving sub-nanometric spatial resolution in both imaging and spectroscopic studies. These studies generate large datasets that cannot be analyzed without computational assistance. Machine learning procedures can exploit redundancies and find hidden correlations in such data. Principal component analysis (PCA) is the most popular approach to denoising: it reduces data dimensionality and extracts meaningful information. However, many questions about the accuracy of the reconstructions remain open. We have used experiments and simulations to analyze the effect of PCA on quantitative chemical analysis of binary alloy (AuAg) nanoparticles using energy-dispersive X-ray spectroscopy. Our results demonstrate that the chemical composition distribution can be recovered with very good fidelity when the signal-to-noise ratio exceeds a certain minimal level. Accurate denoising derives from a complex interplay between redundancy (data matrix size), counting noise, and noiseless data intensity variance (associated with sample chemical composition dispersion). We propose several quantitative bias estimators and noise evaluation procedures to help in the analysis and design of experiments. This work demonstrates the high potential of PCA denoising, but it also highlights the limitations and pitfalls that need to be avoided to minimize artifacts and perform reliable quantification.
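The denoising idea summarized above can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, and the matrix size, the two Gaussian peaks standing in for the two elemental signals, the count level, and the helper name `pca_denoise` are all assumptions chosen for illustration. The sketch builds a pixels-by-channels matrix whose composition varies from pixel to pixel, adds Poisson counting noise, and reconstructs the data from the leading principal component obtained by a singular value decomposition.

```python
import numpy as np


def pca_denoise(data, n_components):
    """Reconstruct `data` (pixels x channels) from its leading principal components."""
    mean = data.mean(axis=0)
    centered = data - mean
    # Thin SVD: rows of vt are the principal axes in channel space.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    k = n_components
    return mean + (u[:, :k] * s[:k]) @ vt[:k]


def rmse(a, b):
    """Root-mean-square difference between two arrays of equal shape."""
    return float(np.sqrt(np.mean((a - b) ** 2)))


rng = np.random.default_rng(0)
n_pixels, n_channels = 4096, 256          # assumed data matrix size
energy = np.arange(n_channels)

# Two hypothetical Gaussian peaks, one per element of a binary alloy.
peak_a = np.exp(-0.5 * ((energy - 80) / 4.0) ** 2)
peak_b = np.exp(-0.5 * ((energy - 170) / 4.0) ** 2)

# Per-pixel fraction of element A; its spread sets the noiseless data variance.
frac_a = rng.uniform(0.2, 0.8, size=n_pixels)
peak_height = 20.0                        # assumed counts at the peak maximum
clean = peak_height * (np.outer(frac_a, peak_a) + np.outer(1.0 - frac_a, peak_b))

# Counting (Poisson) noise on every pixel and channel.
noisy = rng.poisson(clean).astype(float)

# After mean-centering, a two-element mixture varies along a single direction,
# so one component is kept here.
denoised = pca_denoise(noisy, n_components=1)

rmse_noisy = rmse(noisy, clean)
rmse_denoised = rmse(denoised, clean)
print(f"RMSE vs. noiseless data: noisy={rmse_noisy:.3f}, denoised={rmse_denoised:.3f}")
```

In this toy setting, lowering `peak_height`, shrinking `n_pixels`, or narrowing the range of `frac_a` makes the signal component harder to separate from noise, which mirrors the interplay between counting noise, redundancy, and composition dispersion described above. The sketch compares per-channel intensities only; it does not test whether quantified compositions are biased.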