Uncategorized 19.02.2026

Detecting outliers in your chemometric models

Julie

Detecting outliers in your chemometric models is not a theoretical exercise. I have seen production lines stopped by an odd batch that no one could explain, and NIR calibrations degraded by three mislabeled samples. Detecting these atypical points preserves performance, reassures quality teams, and saves time. This guide shares a pragmatic, field-tested method to identify, understand, and treat these deviant data points without harming your models.

Detecting outliers in your chemometric models: the real challenge

An isolated point is not necessarily an error. It may indicate instrumental drift, a raw material out of specification, contamination, or a simple weighing error. Ignoring these signals weakens the calibration, inflates predictive uncertainty, and makes your PAT deployments fragile. To decide correctly, distinguish three cases: a sample not representative of the study space, a measurement problem, or a legitimate novelty to be integrated. The treatment differs with the diagnosis.

Proven methods for outlier detection in a chemometric context

In practice, you combine several indicators to avoid false positives. My basic triplet: distance in the score space, residuals relative to the model, and influence. This trio covers the geometry of the data, the deviation from the model, and the impact of a point on the model parameters. Statistical thresholds guide the decision, but visual inspection and process knowledge finish the job.

Essential indicators

  • Multivariate distance (confidence ellipse, Mahalanobis metric), useful for spotting atypical structures.
  • Residuals on X and Y: DModX for X, prediction errors for Y, local influences.
  • Influence measures: leverage, Cook's distance, model stability diagnostics.
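As a minimal sketch of the first and third indicators, the Mahalanobis distance and leverage can be computed directly with NumPy on a toy data matrix (the variable names and the planted outlier are purely illustrative, not from any particular dataset):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
X[-1] += 6.0  # plant one clearly atypical sample

# Mahalanobis distance to the column mean
mu = X.mean(axis=0)
cov = np.cov(X, rowvar=False)
diff = X - mu
d2 = np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

# Leverage: diagonal of the hat matrix H = Xc (Xc'Xc)^-1 Xc' on centered data
Xc = X - mu
H = Xc @ np.linalg.inv(Xc.T @ Xc) @ Xc.T
leverage = np.diag(H)
```

Both vectors flag the same planted sample here; in real use you would compare them against the statistical limits discussed below rather than simply taking the maximum.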

PCA and PLS diagnostics for outlier detection

In Principal Component Analysis (PCA), the scores–residuals duo remains my first reflex. The score cloud reveals the structure; points outside the 95% or 99% ellipse call for verification. The residual plot highlights objects poorly described by the retained components. Multiply the viewing angles to avoid optical illusions.

In PLS, you add the residuals on Y, the influence indices, and the distance to the model space. The DModX tool flags spectra poorly represented by the latent variable space. Prediction errors and the evolution of the PRESS in cross-validation point to samples that suspiciously bias the calibration. The scores plot and the contributions plot help you understand which wavelengths or variables pull an observation toward the outside.
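The PCA side of these diagnostics can be sketched with scikit-learn on synthetic data: T2 comes from the scores, Q from the reconstruction error. The data, the perturbed sample, and the number of components are illustrative assumptions, not a recipe for your matrices:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)
# Toy "spectra": 60 samples built from 2 latent directions plus noise
scores_true = rng.normal(size=(60, 2))
loadings = rng.normal(size=(2, 20))
X = scores_true @ loadings + 0.05 * rng.normal(size=(60, 20))
X[0] += np.sin(np.linspace(0, 3, 20))  # structure the model cannot describe

pca = PCA(n_components=2).fit(X)
T = pca.transform(X)

# Hotelling's T2: scaled squared distance in the score space
t2 = np.sum(T**2 / T.var(axis=0, ddof=1), axis=1)

# Q residuals (SPE): squared reconstruction error per sample
X_hat = pca.inverse_transform(T)
q = np.sum((X - X_hat) ** 2, axis=1)
```

The perturbed sample stands out on Q rather than T2, which is exactly the "poorly described by the retained components" pattern mentioned above.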

Preprocessing and measurement quality: avoiding false outliers at the source

Many “anomalies” disappear when data are properly prepared. Reducing scattering effects, baseline correction, normalization: your pipeline makes the difference between a relevant alert and a statistical mirage. The article on spectral data preprocessing details these key steps to stabilize your models.

  • Baseline correction and smoothing before any modeling.
  • Reduction of illumination variability via SNV and derivatives.
  • Detection of saturation, lamp drift, wavelength shift.

On NIR spectra, a first-order Savitzky–Golay derivative and proper standardization eliminate most of the "false" outliers caused by instrumental artifacts. Better to prevent than to spend hours chasing a problem that doesn't exist.
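A minimal SNV-plus-derivative sketch, assuming NumPy and SciPy are available; the toy spectra below differ only by a multiplicative scatter factor and an additive baseline, which this preprocessing removes exactly (the wavelength grid and band shape are invented for illustration):

```python
import numpy as np
from scipy.signal import savgol_filter

def snv(spectra):
    """Standard Normal Variate: center and scale each spectrum individually."""
    mu = spectra.mean(axis=1, keepdims=True)
    sd = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mu) / sd

wl = np.linspace(1100, 2500, 200)            # nm, illustrative NIR grid
band = np.exp(-((wl - 1700) / 60) ** 2)      # one synthetic absorption band
# Same chemistry, different scatter (a) and baseline (b) per sample
X = np.array([a * band + b for a, b in [(1.0, 0.0), (1.6, 0.3), (0.7, -0.2)]])

X_snv = snv(X)
# First-order Savitzky-Golay derivative removes residual baseline offsets
X_d1 = savgol_filter(X_snv, window_length=11, polyorder=2, deriv=1, axis=1)
```

After SNV, the three spectra collapse onto a single curve: the apparent "outliers" were pure measurement artifacts, which is the point of cleaning before flagging.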

Thresholds and criteria: T2, Q, DModX to quantify abnormality

To move from judgment to decision, consistent and documented thresholds are essential. The classical framework combines Hotelling's T2 statistic for the position in the latent space with Q-residuals (SPE) for the unexplained deviation. The 95% and 99% limits mark the alert and exclusion levels.

  • Leverage: identifies points whose influence on the components is excessive.
  • DModX: distance of a sample to the X-model, useful for PLS and PCA.
  • Studentized residuals on Y: for quantitative calibration.

I recommend displaying T2 and Q simultaneously. A point “T2 high, Q low” is often a valid extreme to integrate into the domain. “Q high, T2 normal” tends to indicate a measurement or preprocessing defect.
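The two limits can be sketched as follows, assuming SciPy: the T2 limit uses the classical F-distribution formula, and the Q limit a chi-square moment-matching approximation fitted on calibration residuals (in the style of Nomikos and MacGregor). Function names and parameters are illustrative:

```python
import numpy as np
from scipy.stats import f, chi2

def t2_limit(n, k, alpha=0.95):
    """Hotelling T2 limit for k latent variables and n calibration samples."""
    return k * (n - 1) / (n - k) * f.ppf(alpha, k, n - k)

def q_limit(q_cal, alpha=0.95):
    """Q (SPE) limit from calibration residuals via chi-square moment matching:
    Q_lim ~ g * chi2(h) with g, h chosen to match the mean and variance."""
    m = q_cal.mean()
    v = q_cal.var(ddof=1)
    g, h = v / (2 * m), 2 * m**2 / v
    return g * chi2.ppf(alpha, h)
```

With these, a sample is plotted in the (T2, Q) plane against the 95% alert and 99% exclusion lines, and the quadrant logic above (high T2 with low Q versus high Q with normal T2) can be read off directly.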

What to do with an outlier? Exclude, correct, or integrate

Automatic removal tends to create more damage than it prevents. The strategy depends on the origin: data-entry or weighing error? Correct it. Noisy spectrum? Re-measure if possible, otherwise adjust the preprocessing pipeline. New product variety? Extend the calibration space.

  • Discard a point only if the cause is established and not representative of future data.
  • Document each decision and keep a before/after version.
  • Test the impact on performance via re-calibration and comparison of indicators.

A simple rule: if exclusion improves one indicator but degrades robustness on independent samples, the cure is worse than the problem. Robust models deserve to be considered before any aggressive purge.

Concrete examples from the lab and the workshop

In NIR on pharmaceutical granules, concentration predictions became unstable one morning. T2 remained calm, but Q shot up. Inspection revealed a change in sachet batch: the optical diffusion had changed. Adjusting the baseline correction and adding a few samples from the new batch solved the problem without removing a single point.

In a dairy, two powder samples showed large Y residuals but coherent chemistry. The spectra showed increased water absorption. After verification, the sampling room's hygrometer was faulty. Repeating the analysis under controlled conditioning was enough, without rebuilding the model.

Reference table: indicators and uses

| Indicator | What it signals | When to use |
| --- | --- | --- |
| Hotelling's T2 | Extreme position in the latent space | Global coherence check |
| Q-residuals (SPE) | Part not explained by the model | Preprocessing fault, local novelty |
| DModX | Distance to the X-model | PLS/PCA: spectra poorly described |
| Leverage | Excessive influence on the components | Calibration sample selection |

Reproducible workflow for outlier detection

A clear procedure simplifies decisions and traceability. Here is the one I teach to teams and apply in industrial support; it adapts to NIR, Raman, or chromatographic matrices.

  • Stabilize measurement: instrument calibration, blank, drift control.
  • Preprocess according to the matrix: SNV, derivatives, smoothing, normalization.
  • Explore by PCA: scores, 95/99% ellipse, Q residuals.
  • Build the PLS or PCR model: choose the number of latent variables by cross-validation.
  • Control influence: leverage, prediction errors, stability of coefficients.
  • Document cases: cause, decision, impact on performance.

To deepen the reading of projections and axes, PCA review remains valuable, especially when outliers nest at the borders of the latent space.

Common mistakes and how to avoid them

Confusing process variability with measurement error. Believing that a "clean" model without outliers is necessarily better. Stacking preprocessing steps until the useful signal is smoothed away. Forgetting that the selection of calibration samples determines everything downstream. These traps are avoided through targeted checks, methodological parsimony, and solid external validation.

  • Check labels and units before any statistics.
  • Compare different preprocessing pipelines, not just their RMSE.
  • Test stability by resampling and independent datasets.

Robust approaches and AI: an extra safety net

When the distribution deviates from normality or classes are imbalanced, robust options take over: M-estimators, robust PCA, penalized PLS. For unsupervised detection, Isolation Forest or autoencoders provide a complementary view, useful for continuous monitoring. However, keep a human in the loop: explaining a flag remains essential for acceptance by quality and production teams.
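As an illustration of the unsupervised option, Isolation Forest in scikit-learn flags the most isolated samples without any reference to Y; the data, contamination rate, and tree count below are illustrative choices, not recommendations:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(5)
X = rng.normal(size=(200, 10))
X[:3] += 7.0  # three clearly anomalous samples

iso = IsolationForest(n_estimators=200, contamination=0.02,
                      random_state=0).fit(X)
flags = iso.predict(X)          # -1 = flagged as anomalous, +1 = inlier
scores = iso.score_samples(X)   # lower score = more isolated
```

Ranking samples by `scores` gives the review queue for the human in the loop; the `-1` flags alone should never trigger automatic exclusion.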

Outlier detection and domain of application: what matters for lasting impact

Beyond thresholds, the central question remains: does my domain of application cover the real variability? A repeatable outlier should often become a future inlier. Gradually widening the space, retraining from scratch, updating thresholds, and monitoring drift ensure the model’s performance in the field.

Useful quick reminder

Before concluding that a point is abnormal, inspect the raw spectrum, the preprocessing pipeline, the scores, the residuals, the contributions, and the repeatability. This simple routine prevents 80% of hasty decisions, saves hours of investigation, and strengthens data governance.

To cement these habits, reread the PCA chapter and refine your preprocessing chain. The following links nicely summarize the basics and traps to avoid: PCA in chemometrics and spectral data preprocessing.

The essentials to remember for outlier detection

Outlier detection is not a binary filter but an investigative process. Combine T2, Q, and DModX; monitor residuals and influence; refine preprocessing; document each decision. Lean toward robust approaches if the data demand it. Your model will gain in precision, confidence, and operational lifespan. If you are just starting, begin with a quick audit of your diagnostics and apply this workflow from your next sample series.

chimiometrie.fr – All rights reserved.