- 500 patients affected with a common multifactorial, complex neuropsychiatric disorder (disease A)
- 500 control subjects (population matched)
- circa 5000 Single Nucleotide Polymorphisms (SNPs) among 1200 genes were genotyped.
This dataset and these examples are discussed in more detail in:
Visualizing gene determinants of disease in drug discovery. Delrieu O and Bowman C. Pharmacogenomics. 2006 Apr;7(3):311-29. PMID: 16610942
Gene loadings. Genes of interest for disease A and population heterogeneity
Legend: Eigen analysis. Gene loadings for the first two components. 1500 genes (green crosses) are displayed.
Statistical insight: The figure shows the eigen decomposition of gene-level LBFs for disease A. The first two components are shown. Each eigenvector is a particular weighted set of correlated genes common in the people under study, and is composed of gene loadings. The first eigenvector contains useful information regarding case-control features. Signal (boxes) to noise (green cloud) decomposition of each eigenvector was performed by modeling the distributions of the loadings (WinBUGS was used).
Clinical insight: RedBox genes are 18 genes of physiological interest forming one ontology, which one may want to follow up as a therapeutic target and for disease understanding. The further to the right a gene is, the stronger its case-control signal. BlueBox genes are genes associated with population structure within or shared by cases and controls.
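As an illustration of how gene loadings arise from an eigen decomposition, here is a minimal sketch. It assumes the gene-level LBFs are arranged as a subjects x genes matrix; that layout, the synthetic data, and the function name eigen_analysis are assumptions made for this example and are not taken from the original analysis. The WinBUGS signal/noise modeling of the loadings is not reproduced.

```python
import numpy as np

def eigen_analysis(lbf, n_components=2):
    """Eigen-decompose a subjects x genes matrix (hypothetical layout).

    Returns (eigenvalues, loadings, scores) for the leading components:
      loadings: genes x n_components, eigenvectors of the gene covariance matrix
      scores:   subjects x n_components, centred data projected on the loadings
    """
    centred = lbf - lbf.mean(axis=0)
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)           # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    loadings = eigvecs[:, order]
    scores = centred @ loadings
    return eigvals[order], loadings, scores

# Synthetic stand-in data: 50 "cases", 50 "controls", 40 genes.
rng = np.random.default_rng(0)
n_cases, n_controls, n_genes = 50, 50, 40
lbf = rng.normal(size=(n_cases + n_controls, n_genes))
lbf[:n_cases, :5] += 3.0   # genes 0-4 carry an artificial case/control signal

eigvals, loadings, scores = eigen_analysis(lbf)

# Genes with the largest absolute loading on the first component.
top_genes = np.argsort(np.abs(loadings[:, 0]))[::-1][:5]
print(sorted(top_genes.tolist()))
```

In this toy setting the genes carrying the artificial signal stand out from the cloud of near-zero loadings, which is the kind of pattern the boxes in the figure single out. The sign of an eigenvector is arbitrary, so "to the right" in a plot depends on the orientation chosen.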
Subject scores - Obvious heterogeneity in controls
Legend: Eigen analysis. Subjects' scores for the first two components. 500 cases in pink and 500 controls in blue.
Statistical insight: The X-axis represents score 1, the first component of the eigen analysis, which is aligned with the main case/control distinction. The Y-axis represents score 2, which explains most of the remaining genetic heterogeneity. The patterns observed in this figure are driven solely by the subjects' genetic information. Horizontal differentiation between cases and controls is driven by RedBox genes. Vertical differentiation within controls is driven by BlueBox genes.
Clinical insight: This figure confirms
that the RedBox genes 'pull apart' the cases from the controls.
Also, an obvious pattern of genetic heterogeneity is observed within
the control group. This pattern is driven by the BlueBox genes.
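The reading of the score plot can also be checked numerically. The sketch below uses synthetic data and assumes a subjects x genes matrix of LBFs; the helper standardized_gap, the simulated "RedBox-like" and "BlueBox-like" gene blocks, and all sizes are hypothetical. It computes subject scores by singular value decomposition and measures how far apart two groups sit along each score.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_controls, n_genes = 60, 60, 30
n = n_cases + n_controls
lbf = rng.normal(size=(n, n_genes))

is_case = np.arange(n) < n_cases
lbf[:n_cases, :6] += 3.0            # "RedBox-like" genes: case/control signal

# A second structure present in only half of the controls ("BlueBox-like").
subgroup = np.zeros(n, dtype=bool)
subgroup[n_cases:n_cases + n_controls // 2] = True
lbf[subgroup, 10:14] += 3.0

# Subject scores on the first two components via SVD of the centred matrix.
centred = lbf - lbf.mean(axis=0)
u, s, vt = np.linalg.svd(centred, full_matrices=False)
scores = u[:, :2] * s[:2]

def standardized_gap(values, mask_a, mask_b):
    """Absolute difference in group means, in pooled standard deviations."""
    a, b = values[mask_a], values[mask_b]
    pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return abs(a.mean() - b.mean()) / pooled

gap_case_control = standardized_gap(scores[:, 0], is_case, ~is_case)
gap_within_controls = standardized_gap(scores[:, 1], subgroup, ~is_case & ~subgroup)
print(round(gap_case_control, 1), round(gap_within_controls, 1))
```

Because the simulated control subgroup is not independent of case status, the two components are not perfectly aligned with the two structures here; real data would be expected to show similar mixing.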
Subject scores - Overlay of clinical variable on controls.
Legend: Method of recruitment was overlaid on the previous figure. In reddish, controls who had to make contact with the site (active recruitment), either from an advertisement (CNTL_AD) or via a relative or acquaintance (CNTL_RA). In bluish, controls (CNTL_S) who were directly recruited by the site (passive recruitment). Cases, in grey (OTHERS), have 'null' heat.
Statistical insight: Several methods can be used to match observed heterogeneity with a clinical variable. One option is to add the clinical variable, or its LBF, to the eigen analysis (see dataset 3 below). Here, a recursive partitioning method on score 2 was used (HelixTree). Method of ascertainment came up as strongly significant after correction for multiple testing. In the overlaid heatmap, the color represents the local differential density of 'hot' and 'cold' subjects.
Clinical insight: Follow-up determined that the observed genetic heterogeneity in controls was associated with differences in the ascertainment method for different control individuals. Because this pattern is driven solely by genetic factors, it highlights a potential bias due to the method of recruitment. In other words, willingness to take part or not in a genetic study is driven by ... genetic factors!
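To make the idea of partitioning score 2 by a clinical variable concrete, here is a minimal sketch of a single partitioning step. It is not HelixTree's algorithm: the Welch t statistic, the max-statistic permutation test used to adjust for trying several candidate splits, the function names, and the synthetic score values are all assumptions chosen for illustration. Only the three recruitment labels come from the legend above.

```python
import numpy as np
from itertools import combinations

def candidate_splits(levels):
    """All ways to divide the levels into two non-empty groups (each listed once)."""
    levels = sorted(str(l) for l in levels)
    first, rest = levels[0], levels[1:]
    splits = []
    for r in range(len(rest) + 1):
        for combo in combinations(rest, r):
            left = {first, *combo}
            if len(left) < len(levels):   # skip the split with an empty right side
                splits.append(left)
    return splits

def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    return (a.mean() - b.mean()) / np.sqrt(a.var(ddof=1) / a.size + b.var(ddof=1) / b.size)

def best_split(score, labels, n_perm=1000, seed=0):
    """Best binary split of `labels` with respect to `score`.

    Returns (left_levels, statistic, adjusted_p). The p-value compares the best
    observed statistic with the best statistic under permuted labels, so it
    accounts for having tested every candidate split.
    """
    rng = np.random.default_rng(seed)
    splits = candidate_splits(set(labels))

    def stats(lab):
        out = []
        for left in splits:
            mask = np.isin(lab, list(left))
            out.append(abs(welch_t(score[mask], score[~mask])))
        return np.array(out)

    observed = stats(labels)
    best = int(np.argmax(observed))
    null_max = np.array([stats(rng.permutation(labels)).max() for _ in range(n_perm)])
    p_adj = (1 + np.sum(null_max >= observed[best])) / (n_perm + 1)
    return splits[best], float(observed[best]), float(p_adj)

# Synthetic score 2 values for controls only (made-up effect sizes).
rng = np.random.default_rng(42)
labels = np.array(["CNTL_AD"] * 50 + ["CNTL_RA"] * 50 + ["CNTL_S"] * 100)
score2 = np.concatenate([rng.normal(2, 1, 50), rng.normal(2, 1, 50), rng.normal(0, 1, 100)])

left, stat, p_adj = best_split(score2, labels)
print(left, round(stat, 1), p_adj)
```

A full recursive partitioning would repeat this step within each resulting group and over many candidate variables, which is where a multiple-testing correction matters most.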
Subjects' LBF signal allows individualization of a sub-group of cases. Case/control heatmap overlay.
Legend: The figure shows the result of aggregating gene-level LBFs for the gene pathway/ontology of interest in disease A (the 18 RedBox genes defined above). Mean(LBF) and var(LBF) are plotted for each subject. A heatmap is overlaid, representing the case (reddish) / control (bluish) differentiation. A sub-group of cases can clearly be individualized.
Statistical insight: Aggregating gene-level LBFs for these 18 RedBox genes is similar to describing the LBF signal projected on the first component. This method can be applied to any group of genes that have a high loading 1 or that are part of a previously known ontology. Also, the additive properties of LBFs allow the use of any known dosing model. Here, the figure shows that cases belonging to the individualized sub-group have a relatively low variability compared to the other subjects. This indicates consensus (in the sense of genetic co-occurrence) among those cases across the 18 genes.
Clinical insight: On the lower right-hand side is a set of cases whose genetic complement is distinct from that of other cases and all controls. Such cases have a particular homogeneous complement of gene targets for efficient drug discovery/development.
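The per-subject aggregation behind this figure can be sketched in a few lines. The example assumes a subjects x genes matrix of gene-level LBFs and a list of column indices for the gene set; the function name subject_lbf_summary and the synthetic values are hypothetical.

```python
import numpy as np

def subject_lbf_summary(lbf, gene_idx):
    """Per-subject mean and sample variance of LBFs over a chosen gene set."""
    sub = lbf[:, gene_idx]
    return sub.mean(axis=1), sub.var(axis=1, ddof=1)

# Synthetic stand-in: 200 subjects, 100 genes, an 18-gene set of interest.
rng = np.random.default_rng(2)
lbf = rng.normal(0.0, 1.0, size=(200, 100))
pathway = np.arange(18)

# The first 30 subjects mimic a sub-group with a consistently high signal
# across the gene set (high mean, low variance).
lbf[:30, :18] = rng.normal(3.0, 0.3, size=(30, 18))

mean_lbf, var_lbf = subject_lbf_summary(lbf, pathway)
print(mean_lbf[:30].mean() > mean_lbf[30:].mean(), var_lbf[:30].mean() < var_lbf[30:].mean())
```

Plotting mean_lbf on the X-axis against var_lbf on the Y-axis places the simulated sub-group at high mean and low variance, that is, towards the lower right of the plot.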
Subjects' LBF signal projected on first component. Clinical heatmap overlay.
Legend: The heatmap of the previous figure is replaced with another heatmap showing that a clinical variable (EPQ-E) matches the sub-group of cases with a distinct genetic component. Reddish areas indicate cases with higher than average EPQ-E. Controls (OTHERS) have 'null' heat.
Statistical insight: The same recursive
partitioning method described above was used.
Clinical insight: The cases individualized in the previous figure also have unique clinical characteristics. These characteristics constitute a sub-phenotype that could be used as surrogate inclusion or exclusion criteria to enrich subsequent clinical trials of new drugs against disease A with populations having this genetic pattern.
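The heatmap overlays in these figures are described as the local differential density of 'hot' and 'cold' subjects. One simple way to approximate such a quantity, assuming a binned estimate is acceptable, is the difference between two normalized 2D histograms. The function name differential_density, the bin count, and the synthetic coordinates are assumptions; the original figures may well use a smoother estimator.

```python
import numpy as np

def differential_density(x, y, is_hot, bins=20):
    """Difference between the normalized 2D histograms of hot and cold subjects.

    Subjects with 'null' heat should be filtered out before calling.
    Returns (diff, x_edges, y_edges); diff[i, j] > 0 where hot subjects are
    locally over-represented, < 0 where cold subjects are.
    """
    extent = [[x.min(), x.max()], [y.min(), y.max()]]
    hot, xe, ye = np.histogram2d(x[is_hot], y[is_hot], bins=bins, range=extent)
    cold, _, _ = np.histogram2d(x[~is_hot], y[~is_hot], bins=bins, range=extent)
    hot = hot / max(hot.sum(), 1)     # guard against an empty group
    cold = cold / max(cold.sum(), 1)
    return hot - cold, xe, ye

# Synthetic coordinates: hot subjects cluster to the lower right.
rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(3, 0.5, 200), rng.normal(0, 0.5, 200)])
y = np.concatenate([rng.normal(-1, 0.5, 200), rng.normal(0, 0.5, 200)])
is_hot = np.arange(400) < 200

diff, xe, ye = differential_density(x, y, is_hot, bins=10)

# Centre of the bin where hot subjects are most over-represented.
i, j = np.unravel_index(np.argmax(diff), diff.shape)
hot_x = 0.5 * (xe[i] + xe[i + 1])
hot_y = 0.5 * (ye[j] + ye[j + 1])
print(round(hot_x, 2), round(hot_y, 2))
```

The resulting grid can be drawn under the subject scatter with a diverging color map, so that reddish and bluish areas correspond to positive and negative values.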