Question for people working with multivariate biomarker data (cytokines, MRI, or omics):

When we perform dimensionality reduction, we often interpret the component scores — but the meaning of these scores differs depending on the method.

In PCA, the scores simply reflect each subject’s position along directions of maximal variance in the data (e.g., “low-to-high overall inflammation”).
In PLS-DA, the scores are supervised: they represent how strongly each subject expresses the multivariate pattern that covaries most strongly with the clinical group labels (e.g., “control-like” vs “disease-like” inflammation profile).

Conceptually, PCA scores describe unsupervised variance, whereas PLS-DA scores act as a supervised index oriented toward group separation.

I’m curious how others in the neurodegeneration / biomarker field interpret or report these scores.
Do you treat PLS-DA scores as quantitative indices (e.g., “inflammation index”) for downstream correlation with clinical outcomes?
How do you handle interpretability when comparing PCA vs PLS-DA components in your studies?

Great question — this distinction between PCA and PLS-DA scores is often overlooked, especially in biomarker and neurodegeneration studies where people instinctively treat all component scores as interchangeable “indices.”

In practice, I treat PCA scores as purely descriptive: they summarize dominant variance structure but are not tied to any biological contrast, so I avoid using them directly as quantitative biomarkers unless the variance they capture aligns with a clear physiological axis.

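PLS-DA scores, on the other hand, can be interpreted as quantitative indices because each latent variable is the combination of markers whose scores covary most strongly with group membership. In neurodegeneration work, it’s fairly common to use the first PLS component as an “activation index” or “pathology index” and correlate it with cognitive decline, imaging markers, or fluid biomarkers, as long as cross-validation confirms stability and guards against overfitting. One caveat: if the outcome also differs between groups, scores fitted on the full sample will correlate with it almost by construction, so out-of-fold scores are the safer input for that step.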

For interpretability, I usually:

  • Compare PCA and PLS-DA loadings side-by-side to see whether supervised separation aligns with natural variance structure.

  • Report PLS-DA scores only when the model holds up under validation: permutation tests for whether the group separation exceeds chance, and bootstrapping for the stability of the loadings.

  • Emphasize that PLS-DA scores reflect supervised discrimination, not intrinsic variance.

Curious to hear how others handle this — especially when PCA and PLS-DA highlight different biological axes.