The intent of this method is to detect, amplify, analyze
and visualize signal and signal heterogeneity in high dimensional datasets
(nVariables >> nObservations), such as whole-genome scans or complex
datasets incorporating a large number of clinical (sub-phenotypes) and
non-clinical (genetics, genomics, metabolomics, ...) variables of various
types (discrete and continuous).
'Taxonomy 3' provides a statistical
framework for large-scale or complex problems, and produces simple answers
that are visually and biologically meaningful.
The method reduces the complexity
and the dimensionality of the data, reveals independent sets of correlated
variables and meaningful sub-groups of observations. Technically, it
is a single multivariate eigenanalysis (without multiple testing) based
on correlations or covariances of contrasts of empirically derived
log Bayes factors (LBFs). The properties of LBFs allow prior knowledge
to be used for data aggregation.
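
The following is a minimal sketch of the kind of computation described
above, not the 'Taxonomy 3' implementation itself. It makes several
assumptions that the text does not confirm: the outcome is binary, each
variable is discrete, the LBF of an observation is the log ratio of the
smoothed empirical frequency of its category in cases versus controls,
the "contrast" is taken to be simple column centring, and aggregation
with prior knowledge is taken to mean summing the LBFs of a predefined
group of variables. The function names and the simulated data are
hypothetical.

```python
import numpy as np


def empirical_lbf(x, y, pseudo=0.5):
    """Per-observation log Bayes factor for one discrete variable.

    Assumption: LBF = log P(x | y=1) - log P(x | y=0), with category
    frequencies estimated from the data and smoothed by a pseudocount.
    """
    x = np.asarray(x)
    y = np.asarray(y).astype(bool)
    cats, codes = np.unique(x, return_inverse=True)
    k = len(cats)
    n1 = np.bincount(codes[y], minlength=k) + pseudo    # counts in cases
    n0 = np.bincount(codes[~y], minlength=k) + pseudo   # counts in controls
    lbf_per_cat = np.log(n1 / n1.sum()) - np.log(n0 / n0.sum())
    return lbf_per_cat[codes]


def eigen_analysis(lbf):
    """Eigen-decompose the covariance matrix of column-centred LBFs."""
    centred = lbf - lbf.mean(axis=0)          # assumed form of the contrast
    cov = np.cov(centred, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)          # symmetric matrix, real spectrum
    order = np.argsort(vals)[::-1]            # largest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    scores = centred @ vecs                   # observation coordinates
    return vals, vecs, scores


# Simulated data (hypothetical): 60 observations, 8 three-level variables,
# the first 3 of which are associated with the binary outcome y.
rng = np.random.default_rng(0)
n, p = 60, 8
y = np.repeat([0, 1], n // 2)
signal = [rng.binomial(2, 0.3 + 0.3 * y) for _ in range(3)]
noise = [rng.integers(0, 3, size=n) for _ in range(p - 3)]
X = np.column_stack(signal + noise)

lbf = np.column_stack([empirical_lbf(X[:, j], y) for j in range(p)])
vals, vecs, scores = eigen_analysis(lbf)

# Aggregation under the stated assumption: LBFs add across variables that
# are treated as conditionally independent, so a predefined group
# (here, arbitrarily, the first three variables) collapses to one column.
aggregated = lbf[:, :3].sum(axis=1)
```

Plotting the first two columns of `scores` against each other gives a
two-dimensional view of the observations, and the corresponding columns
of `vecs` show which variables drive each axis. Replacing `np.cov` with
`np.corrcoef` would give the correlation-based variant, provided no LBF
column is constant.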
Since a primary objective
of 'Taxonomy 3' is to visualize a complex dataset, it does not
produce impenetrable 'black box' solutions, unlike other multivariate
methods such as artificial neural networks or support vector machines.
The authors believe this method can address
several industry and academic needs such as disease understanding, data
integration (integrated biomarker strategies) and decision making in
drug discovery and clinical development.
The main features of the
method can be tested on-line with small datasets: see the on-line
analysis page.