Technically, 'taxonomy 3' is an eigendecomposition based on the
correlations or covariances of contrasts of empirically derived
log Bayes factors.
The main features of the
method can be tested on-line with small datasets: see the on-line
analysis page.
The method can be divided
into the following main steps:
- Calculation of Log Bayes Factors (LBF).
The variables measured in cases and controls (e.g. categorical variables
such as genotypes) are transformed into a new continuous measure:
'LBF'. This measure amplifies each individual's contribution to the
overall between-group distinction (i.e. the case/control distinction),
and produces variable-based discriminatory or classificatory evidence.
LBFs belong to the mathematical family of divergence measures and
have properties that the following steps rely on.
- Aggregation of LBFs (optional). LBFs
can be regrouped (e.g. averaged) before being analyzed further. The
grouping can follow known, biologically meaningful sets of variables
(e.g. genes, ontologies). This allows the supervised biological
exploration of data, and the classification of subjects (sub-phenotypes).
- Principal component analysis. Patterns
of co-occurrence between LBFs are sought. Variables correlated
with the case/control distinction (the between-group signal) are separated
from i) variables correlated with any within-group signal (i.e.
population heterogeneity or population variability observed only in
cases, or only in controls) and from ii) the remaining variables (noise).
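
The three steps above can be sketched in a few lines of Python with NumPy.
This is only an illustration under stated assumptions, not the software's
actual implementation: the LBF of a categorical variable is assumed here to
be the log ratio of the value's empirical frequency in cases to its frequency
in controls, with a pseudocount to avoid zero counts. The function names
(categorical_lbf, lbf_matrix, aggregate, principal_components) and the
pseudocount value are hypothetical.

```python
import numpy as np

def categorical_lbf(values, is_case, pseudocount=0.5):
    """Per-subject LBF for one categorical variable.

    ASSUMED definition: log P(value | case) - log P(value | control),
    with both frequencies estimated from the data plus a pseudocount.
    """
    values = np.asarray(values)
    is_case = np.asarray(is_case, dtype=bool)
    levels, codes = np.unique(values, return_inverse=True)
    k = len(levels)
    case_counts = np.bincount(codes[is_case], minlength=k) + pseudocount
    ctrl_counts = np.bincount(codes[~is_case], minlength=k) + pseudocount
    log_ratio = (np.log(case_counts / case_counts.sum())
                 - np.log(ctrl_counts / ctrl_counts.sum()))
    return log_ratio[codes]  # one LBF per subject

def lbf_matrix(data, is_case):
    """Step 1: transform every column; result is subjects x variables."""
    columns = np.asarray(data).T
    return np.column_stack([categorical_lbf(c, is_case) for c in columns])

def aggregate(lbf, groups):
    """Step 2 (optional): average LBF columns within each known group."""
    return np.column_stack([lbf[:, idx].mean(axis=1)
                            for idx in groups.values()])

def principal_components(lbf):
    """Step 3: eigendecomposition of the LBF correlation matrix."""
    corr = np.corrcoef(lbf, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)  # symmetric matrix
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    return eigvals[order], eigvecs[:, order]
```

With a subjects-by-variables array `data` and a boolean vector `is_case`,
`principal_components(lbf_matrix(data, is_case))` returns eigenvalues in
decreasing order and the matching eigenvectors. Passing `np.cov` in place of
`np.corrcoef` would give the covariance-based variant.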
The method can deal with any type of variable
or outcome (e.g. categorical or continuous). The software we provide
can deal with categorical outcomes (e.g. case/control studies) and variables
of many data types (genetic, discrete, continuous: normal, exponential,
Poisson).
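
For a continuous variable, one plausible way to obtain an LBF is to fit a
density to each group and take the log ratio of the two densities at the
observed value. The sketch below does this for the normal case. It is an
assumption for illustration only: the software's own estimator is not
described here, and the function names normal_logpdf and normal_lbf are
hypothetical.

```python
import math
import statistics

def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return (-0.5 * math.log(2 * math.pi * sd ** 2)
            - (x - mean) ** 2 / (2 * sd ** 2))

def normal_lbf(x, case_values, control_values):
    """ASSUMED LBF for a normal variable: log p(x | case) - log p(x | control).

    Each group's mean and standard deviation are estimated from its own
    observed values.
    """
    case_mean = statistics.mean(case_values)
    case_sd = statistics.stdev(case_values)
    ctrl_mean = statistics.mean(control_values)
    ctrl_sd = statistics.stdev(control_values)
    return (normal_logpdf(x, case_mean, case_sd)
            - normal_logpdf(x, ctrl_mean, ctrl_sd))
```

A positive value means the observation is more typical of cases and a negative
value more typical of controls. Under the same assumption, the exponential and
Poisson cases would swap in the corresponding log density or log mass function.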