|
|
Gene Loadings. Case/Control
direction is not aligned with first component.
Legend: Eigen analysis. Gene
loadings for the first two components. About 300 genes (green
crosses) are displayed. Genes with high loadings are noted.
The case/control line shows the direction of the case/control
distinction.
Statistical insight: Given the
size of the dataset being analyzed and LBF properties, the
case/control direction is expected to be strictly parallel to
the first component. The observed deviation indicates that the
case/control distinction is not responsible for most of the
dataset's genetic variability, which is captured in the first
component.
Clinical insight: The case/control
line is not aligned with most of the genetic variability
of this large dataset. This suggests that the within-group genetic
variability is greater than the between-group genetic variability.
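The alignment check described above can be sketched numerically. The
example below is a minimal illustration on synthetic Gaussian data, not
the analysis behind the figure: the data matrix, the group shift, the
hidden control subgroups, and the helper names `first_components` and
`case_control_angle` are all assumptions introduced here. It measures
the angle between the case-minus-control mean difference and the first
component.

```python
import numpy as np

def first_components(X, k=2):
    """Return the top-k principal axes (as rows) and the subject scores."""
    Xc = X - X.mean(axis=0)                      # center each gene column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:k], U[:, :k] * s[:k]

def case_control_angle(X, is_case):
    """Angle in degrees between the case/control direction and component 1."""
    axes, _ = first_components(X, k=1)
    d = X[is_case].mean(axis=0) - X[~is_case].mean(axis=0)
    cos = abs(d @ axes[0]) / np.linalg.norm(d)   # axes[0] has unit length
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

# Synthetic stand-in for a subjects-by-genes matrix (all values hypothetical).
rng = np.random.default_rng(0)
n, p = 200, 30
is_case = np.arange(2 * n) < n                   # first n rows are cases
X = rng.normal(size=(2 * n, p))
X[is_case, 0] += 1.0                             # modest case/control shift on gene 0
sub = rng.integers(0, 3, size=n)                 # hidden subgroups among controls
X[~is_case, 1] += 4.0 * (sub - 1)                # strong within-control spread on gene 1

angle = case_control_angle(X, is_case)
print(f"angle between case/control direction and component 1: {angle:.1f} degrees")
```

An angle near 0 degrees would mean the case/control distinction drives
the first component; a large angle, as produced here by the simulated
control subgroups, corresponds to the misalignment seen in the figure.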
|
|
|
Subjects' scores. Obvious genetic
heterogeneity in controls.
Legend: Eigen Analysis. Subjects'
scores for the first two components. 1200 cases in pink and
1200 controls in blue.
Statistical insight: The figure shows
individuals plotted on the first two eigenvectors of gene-level
LBFs for Disease B. Substantial inter-individual heterogeneity
is clear. The controls appear to come from three extended diplotype
groupings.
Clinical insight: The patterns
observed in this figure are driven solely by genetic factors
and highlight substantial inter-individual genetic heterogeneity
within controls, as suspected from the figure above.
|
|
|
Biplot shows that genetic heterogeneity among controls
confounds the case/control genetic distinction.
Legend: Eigen analysis biplot.
Cases are in blue, controls in pink, and gene loadings in green.
Genes having a high loading are noted.
Statistical insight: The case-control
dummy variable is important for both components 1 and 2.
Since these components are orthogonal, one should suspect a
confounding factor.
Clinical insight: Genes with a
high loading on component 1 pull cases apart from controls (expected)
but also controls apart from other controls (in the horizontal
direction). Similarly, genes with a high loading on component 2 pull
cases apart from controls (not expected) and also controls apart from
other controls (in the vertical direction). The within-control genetic
heterogeneity confounds the main case/control genetic heterogeneity,
and no firm conclusion can be drawn. Another analysis should be
performed with the 'right', homogeneous group of controls.
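The observation that the case-control dummy variable matters for two
orthogonal components can be checked by correlating the dummy with the
subject scores on each component. The sketch below uses synthetic
Gaussian data with a hypothetical hidden subgroup among controls; the
shifts, sizes, and the helper names `pc_scores` and
`dummy_correlations` are assumptions made for illustration only.

```python
import numpy as np

def pc_scores(X, k=2):
    """Subject scores on the first k principal components."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]

def dummy_correlations(X, is_case, k=2):
    """Pearson correlation of the case/control dummy with each component."""
    scores = pc_scores(X, k)
    dummy = is_case.astype(float)
    return np.array([np.corrcoef(dummy, scores[:, j])[0, 1] for j in range(k)])

# Synthetic subjects-by-genes matrix (all values hypothetical).
rng = np.random.default_rng(1)
n, p = 300, 10
is_case = np.arange(2 * n) < n                   # first n rows are cases
X = rng.normal(size=(2 * n, p))
X[is_case, 0] += 4.0                             # cases differ from controls on gene 0 ...
X[is_case, 1] += 2.0                             # ... and on gene 1
sub = rng.integers(0, 2, size=n)                 # hidden confounder within controls
X[~is_case, 0] += 4.0 * sub                      # control subgroups split along gene 0

corr = dummy_correlations(X, is_case)
print("correlation of dummy with components 1 and 2:", np.round(corr, 2))
```

The signs of the correlations are arbitrary because each component is
defined up to a sign flip, so only the magnitudes are informative. A
dummy that correlates clearly with both components, as in this
simulation, is the pattern that prompts the search for a confounder.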
|
|
|
Subjects' scores.
Legend: Control origin overlaid
on the previous figure. Geographic origin is color-coded. Cases
are in grey. There is a clear match between the controls' geographic
origins and the genetic heterogeneity patterns observed above.
Statistical insight: Follow-up
showed a mixture of four geographic origins among the controls.
The choice of the reference population of controls in a study
markedly affected the genes that were indicated for follow-up
(cf. target discovery). The first eigenvector, in this example,
does not represent the true case-control distinction, since
the within-control genetic variability overwhelms it.
Clinical insight: This example
highlights the importance of the choice of the reference population
of controls.
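One way to quantify the match between geographic origin and the score
patterns is the fraction of score variance explained by origin among
controls (eta squared from a one-way analysis of variance). This is a
minimal sketch on simulated scores, not the follow-up actually
performed: the origin codes, the cluster centers, and the function
name `eta_squared` are hypothetical.

```python
import numpy as np

def eta_squared(values, labels):
    """Fraction of the variance of `values` explained by the grouping `labels`."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    grand = values.mean()
    ss_total = ((values - grand) ** 2).sum()
    ss_between = sum(
        (labels == g).sum() * (values[labels == g].mean() - grand) ** 2
        for g in np.unique(labels)
    )
    return float(ss_between / ss_total)

# Simulated control scores (all values hypothetical).
rng = np.random.default_rng(2)
origin = rng.integers(0, 4, size=400)            # four made-up origin codes
centers = np.array([-3.0, -1.0, 1.0, 3.0])       # one score center per origin
score_structured = centers[origin] + rng.normal(scale=0.5, size=400)
score_unrelated = rng.normal(size=400)           # scores with no origin effect

eta_structured = eta_squared(score_structured, origin)
eta_unrelated = eta_squared(score_unrelated, origin)
print(f"origin explains {eta_structured:.0%} of the structured scores")
print(f"origin explains {eta_unrelated:.0%} of the unrelated scores")
```

A value close to 1 for the control scores on a component would support
the reading that geographic origin, rather than disease status, drives
that component.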
|
|