Taxonomy 3 - A multivariate genetic analysis
Dataset 2
  • 1200 cases affected with metabolic disease B
  • 1200 controls
  • All cases and controls were of Caucasian origin, however:
    • 600 controls were matched (age, sex) and originated from the sites where the cases were collected.
    • 600 other 'banked' controls were not matched with the cases and originated from other sites, geographically very distinct.
  • Circa 1000 SNPs among 300 genes were genotyped

    This dataset and these examples are discussed in more detail in:
    Visualizing gene determinants of disease in drug discovery. Delrieu O and Bowman C.
    Pharmacogenomics. 2006 Apr;7(3):311-29.
    PMID: 16610942 / Pharmacogenomics

Gene Loadings. Case/Control direction is not aligned with first component.

Legend: Eigen Analysis. Gene loadings for the first two components. Circa 300 genes (green crosses) are displayed. Genes having high loadings are labelled. The case/control line shows the direction of the case/control distinction.

Statistical insight: Given the size of the dataset being analyzed and the properties of LBFs, the case/control direction would be expected to be strictly parallel to the first component. The observed deviation indicates that the case/control distinction is not responsible for most of the dataset's genetic variability, which is captured in the first component.

Clinical insight: The case/control line is not aligned with most of the genetic variability of this large dataset. This suggests that the within-group genetic variability is greater than the between-group genetic variability.
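
The alignment check described above can be sketched on synthetic data. This is a minimal illustration, not the authors' method or data: random Gaussian values stand in for the gene-level LBFs, and the function name `case_control_angle`, the effect sizes, and the two control subgroups are all assumptions made for the example. It measures the angle between the case/control mean difference and the first principal component, first with homogeneous controls and then with stratified controls.

```python
import numpy as np

def case_control_angle(X, is_case):
    """Angle (degrees) between the case/control mean difference and the first principal component."""
    Xc = X - X.mean(axis=0)                       # centre each gene column
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    pc1 = vt[0]                                   # unit-length first principal direction
    direction = X[is_case].mean(axis=0) - X[~is_case].mean(axis=0)
    cos = abs(direction @ pc1) / np.linalg.norm(direction)
    return float(np.degrees(np.arccos(np.clip(cos, 0.0, 1.0))))

rng = np.random.default_rng(0)
n_per_group, n_genes = 300, 20                    # hypothetical sizes, smaller than Dataset 2
is_case = np.repeat([True, False], n_per_group)   # first 300 rows are cases

# Homogeneous controls: the only structure is a case shift on gene 0.
X_homog = rng.normal(size=(2 * n_per_group, n_genes))
X_homog[is_case, 0] += 3.0

# Stratified controls: two hypothetical control subgroups are shifted in
# opposite directions on gene 1, so within-control spread dominates.
X_strat = X_homog.copy()
X_strat[300:375, 1] += 6.0
X_strat[375:450, 1] -= 6.0

angle_homog = case_control_angle(X_homog, is_case)
angle_strat = case_control_angle(X_strat, is_case)
print(f"homogeneous controls: {angle_homog:.1f} degrees")
print(f"stratified controls:  {angle_strat:.1f} degrees")
```

In this toy setting a small angle means the case/control distinction drives the first component, while a large angle means some other source of variability does.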

Subjects' scores. Obvious genetic heterogeneity in controls.

Legend: Eigen Analysis. Subjects' scores for the first two components. 1200 cases in pink and 1200 controls in blue.

Statistical insight: The figure shows individuals plotted on the first two Eigen vectors of gene-level LBFs for Disease B. Substantial inter-individual heterogeneity is clear. The controls appear to come from three extended diplotype groupings.

Clinical insight: The patterns observed in this figure are driven solely by genetic factors, and highlight substantial inter-individual genetic heterogeneity within controls, as suspected from the figure above.

Biplot shows that the controls' genetic heterogeneity confounds the case/control genetic distinction.

Legend: Eigen analysis biplot. Cases are in blue, controls in pink, and gene loadings in green. Genes having a high loading are noted.

Statistical insight: The case-control dummy variable is of importance for both components 1 and 2. Since those components are orthogonal, one should suspect a confounding factor.

Clinical insight: Genes having a high loading on component 1 pull cases apart from controls (expected) but also controls from other controls (in the horizontal direction). Similarly, genes having a high loading on component 2 pull cases apart from controls (not expected) and also controls from other controls (in the vertical direction). The within-control genetic heterogeneity confounds the main case/control genetic heterogeneity, and no firm conclusion can be drawn. Another analysis should be performed, with the 'right' and homogeneous group of controls.
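
The observation that the case-control dummy variable matters for two orthogonal components can be reproduced in a small sketch. Everything here is assumed for illustration: synthetic Gaussian values replace the gene-level LBFs, the group sizes are halved relative to Dataset 2, and the case effect (gene 0) and site effect (gene 1, banked controls only) are hypothetical. The code correlates the case/control indicator with the subject scores on components 1 and 2, then does the same for the hidden site indicator.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cases, n_matched, n_banked, n_genes = 600, 300, 300, 20   # hypothetical sizes
n = n_cases + n_matched + n_banked

is_case = np.zeros(n, dtype=bool)
is_case[:n_cases] = True
is_banked = np.zeros(n, dtype=bool)
is_banked[n_cases + n_matched:] = True

X = rng.normal(size=(n, n_genes))
X[is_case, 0] += 3.0       # assumed case effect on gene 0
X[is_banked, 1] += 6.0     # assumed site effect on gene 1 (banked controls only)

# Subject scores on the first two principal components.
Xc = X - X.mean(axis=0)
u, s, _ = np.linalg.svd(Xc, full_matrices=False)
scores = u[:, :2] * s[:2]

# Absolute correlations, because the sign of each component is arbitrary.
dummy = is_case.astype(float)
corr_pc1 = abs(np.corrcoef(dummy, scores[:, 0])[0, 1])
corr_pc2 = abs(np.corrcoef(dummy, scores[:, 1])[0, 1])
corr_site_pc1 = abs(np.corrcoef(is_banked.astype(float), scores[:, 0])[0, 1])

print(f"case/control vs component 1: {corr_pc1:.2f}")
print(f"case/control vs component 2: {corr_pc2:.2f}")
print(f"control site vs component 1: {corr_site_pc1:.2f}")
```

When the case/control indicator correlates appreciably with both components, and a second indicator correlates more strongly with component 1, that second variable is a candidate confounder.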

Subjects' scores.

Legend: Control origin overlaid on previous figure. Geographic origin is color-coded. Cases are in grey. There is a clear match between the controls' geographic origins and the genetic heterogeneity patterns observed above.

Statistical insight: Follow-up showed that the controls were a mixture of four geographic origins. The choice of the reference population of controls markedly affected the genes that were indicated for follow-up (cf. target discovery). In this example, the first Eigen vector does not represent the true case-control distinction, because the within-control genetic variability overwhelms it.

Clinical insight: This example highlights the importance of the choice of the reference population of controls.
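
How the choice of controls can change which genes are flagged for follow-up can be sketched as follows. This is a toy illustration under stated assumptions, not the analysis used on Dataset 2: a per-gene Welch t statistic stands in for the gene-level evidence, the data are synthetic, and the helper `welch_t`, the true effect on gene 0, and the site effect on gene 1 are hypothetical.

```python
import numpy as np

def welch_t(a, b):
    """Per-column Welch t statistic between two groups of rows."""
    va = a.var(axis=0, ddof=1) / a.shape[0]
    vb = b.var(axis=0, ddof=1) / b.shape[0]
    return (a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(va + vb)

rng = np.random.default_rng(2)
n_genes = 20
cases = rng.normal(size=(600, n_genes))
matched = rng.normal(size=(300, n_genes))   # controls from the case sites
banked = rng.normal(size=(300, n_genes))    # controls from distant sites

cases[:, 0] += 0.5      # assumed true case effect on gene 0
banked[:, 1] += 4.0     # assumed site effect on gene 1, unrelated to disease

all_controls = np.vstack([matched, banked])

# Index of the gene with the largest absolute statistic under each control set.
top_all = int(np.argmax(np.abs(welch_t(cases, all_controls))))
top_matched = int(np.argmax(np.abs(welch_t(cases, matched))))

print("top gene using all controls:    ", top_all)
print("top gene using matched controls:", top_matched)
```

In this sketch the pooled control set promotes the site-related gene, whereas restricting to matched controls recovers the gene carrying the assumed case effect.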

 

 

