Taxonomy 3 - A multivariate genetic analysis

Aggregation of LBFs

LBFs can be aggregated over any domain to help answer biologically and clinically relevant questions. This allows ‘supervised’ biological exploration of SNP data and the classification of subjects. For instance, LBF values can be summarised or collapsed (by simple summation, mean, variance, and so on) over:

  • genotypes within a locus to give a SNP level measure (‘SNP LBF’)
  • all SNP loci within a gene to produce a gene level measure (‘gene LBF’)
  • any ontology of genes, i.e. any meaningful group of genes, such as those coding for proteins involved in one biological pathway (‘pathway LBF’)
  • the whole genome
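
The levels listed above can be sketched as successive group-wise reductions. The snippet below is only an illustration, not the Taxonomy 3 implementation: the helper `aggregate`, the SNP and gene identifiers, the mappings and all numeric values are made up, and the choice of summation at gene level and mean at pathway level is an assumption.

```python
from collections import defaultdict
from statistics import fmean


def aggregate(values, groups, how=sum):
    """Collapse `values` (key -> number) into one number per group.

    `groups` maps each key to a group label (hypothetical mapping,
    e.g. SNP -> gene or gene -> pathway). Keys with no group are skipped.
    `how` is the summary function applied to each group's members.
    """
    buckets = defaultdict(list)
    for key, value in values.items():
        if key in groups:
            buckets[groups[key]].append(value)
    return {group: how(members) for group, members in buckets.items()}


# Made-up SNP-level LBFs for a single subject (illustrative numbers only).
snp_lbf = {"rs1": 0.5, "rs2": -0.25, "rs3": 1.0, "rs4": 0.25}

# Hypothetical annotation tables.
snp_to_gene = {"rs1": "GENE_A", "rs2": "GENE_A", "rs3": "GENE_B", "rs4": "GENE_C"}
gene_to_pathway = {"GENE_A": "pathway_X", "GENE_B": "pathway_X", "GENE_C": "pathway_Y"}

gene_lbf = aggregate(snp_lbf, snp_to_gene, how=sum)            # gene LBF
pathway_lbf = aggregate(gene_lbf, gene_to_pathway, how=fmean)  # pathway LBF
genome_lbf = sum(snp_lbf.values())                             # whole genome

print(gene_lbf)
print(pathway_lbf)
print(genome_lbf)
```

Passing the summary function as a parameter keeps the same helper usable for summation, mean or variance at any level of the hierarchy.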

For any subject, a vector of LBF measures is a type of profile. This profile (for example along the genome) can be treated as a stochastic (i.e. non-deterministic) sequence and thus characterised by its mean (first moment) and variance (second central moment). These summaries have a meaning in terms of the biological classificatory signal involved for that person. The pattern of LBF values for a domain reflects the empirical genetic model of that domain for the trait being analyzed.
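
As a rough illustration of summarising a profile by its mean and variance, the sketch below computes both for each subject over the gene LBFs of one ontology. The subject labels and LBF values are invented, the helper `profile_moments` is hypothetical, and the use of the population variance (rather than the sample variance) is an assumption, since the text does not say which is used.

```python
from statistics import fmean, pvariance


def profile_moments(profile):
    """Return (mean, population variance) of one subject's LBF profile."""
    return fmean(profile), pvariance(profile)


# Made-up gene LBF profiles over a 4-gene ontology, one list per subject.
profiles = {
    "case_01": [0.5, 1.5, 0.5, 1.5],
    "case_02": [2.0, 2.0, 2.0, 2.0],
    "control_01": [-1.0, 1.0, -1.0, 1.0],
}

# One (mean, variance) point per subject, ready to be plotted
# with the mean on one axis and the variance on the other.
moments = {subject: profile_moments(p) for subject, p in profiles.items()}

for subject, (mean, var) in moments.items():
    print(f"{subject}: mean={mean:.3f} variance={var:.3f}")
```

Each subject is reduced to a single point, so subjects with similar LBF patterns over the ontology fall close together in the mean/variance plane.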

The examples below use a dataset of 500 cases (affected with a common neuropsychiatric disorder) and 500 control subjects, genotyped for approximately 5000 SNPs in 1500 genes. Gene LBFs are aggregated over 2 ontologies: 'GTPase activity' (4 genes) and 'negative regulation of cell proliferation' (23 genes). The mean and variance of these LBF profiles are plotted for each subject (cases in pink and controls in blue).

The patterns observed in these figures are driven solely by genetic factors (and the case/control status used in the LBF calculation), and highlight substantial inter-individual genetic heterogeneity.

  • The first example clearly shows a sub-group of cases, distinct from other subjects. These cases share a specific 'GTPase activity' pattern that could help define a sub-phenotype and could be of interest for drug discovery or drug development (clinical trial enrichment).

  • The second example clearly shows substantial population heterogeneity in the dataset for this ontology, but does not reveal any obvious contrast between cases and controls.
