|
We use an informal inference technique
that filters SNP data to extract meaning. It relies on a directed,
discrete, multivariate "Bayes Rule" measure of evidence
for the SNP-based distinction of an individual, and it makes no
genetic assumptions.
The log Bayes factors (LBFs) are the observed difference in log
genotype frequencies between trait cases and controls or, more
generally, the difference in the log relative frequencies of
categorical variables between the two groups. This measure transforms
the characterisation of people from a binary domain of SNPs to a
rapidly calculable continuous measure with simple additive properties.
For subject i, categorical variable
j and categorical value k, the LBF value is:

Where:
Si,j,k ∈ {0,1}: presence (1) or absence (0)
of the kth categorical value (genotype) of the jth
categorical variable (SNP) for the ith subject
nu and theta are Bayesian estimates of the
genotype frequencies (non-informative beta prior):
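
The expressions themselves do not appear in this text. As a hedged reading of the definitions given here, and not necessarily the authors' exact form, the LBF and the frequency estimates could be written as below. The assignment of nu to cases and theta to controls, the counts n and N, and the symmetric Beta(a, a) prior (for example a = 1) are all assumptions made for illustration.

```latex
% Assumed form: indicator times the case-control difference in log frequency
\[
  \mathrm{lbf}_{i,j,k} \;=\; S_{i,j,k}\,\bigl(\log \nu_{j,k} - \log \theta_{j,k}\bigr)
\]
% Assumed posterior-mean estimates under a Beta(a, a) prior;
% n = subjects carrying genotype k at locus j, N = subjects in the group
\[
  \nu_{j,k} \;=\; \frac{n^{\mathrm{case}}_{j,k} + a}{N^{\mathrm{case}} + 2a},
  \qquad
  \theta_{j,k} \;=\; \frac{n^{\mathrm{ctrl}}_{j,k} + a}{N^{\mathrm{ctrl}} + 2a}
\]
```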

This Bayes factor, or 'diagnostic' likelihood
ratio (DLR), quantifies the evidence that the ith case
individual does not have the same genome as
a set of controls. LBFs are a directed, or asymmetric, measure
that indicates the 'case-ness' of an individual, i.e. their propensity
to be a non-control. In other words, the LBF values for each individual
(i.e. the individualised difference in information content)
represent that person's relative contribution to the 'case-ness'
in the overall distinction, or 'index of separation', between
the groups.
This extremely simple empirical Bayesian predictive
measure is effectively the sum of case-control contrasts
(i.e. group-by-genotype differences, or interactions) over
genotypes x loci. The contrasts come from a log-linear model of
all subjects in the 2 groups, estimated for each locus on its own
(partial log-linear modelling), and are instantiated by the presence
of the SNP genotype markers in that individual.
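
A minimal sketch of this summation, assuming genotypes are coded 0, 1, 2 per locus and that frequencies are smoothed with a pseudo-count of 1 (one possible reading of the non-informative beta prior). The function names `lbf_weights` and `lbf_scores`, the `prior` parameter, the toy data, and the assignment of `nu` to cases and `theta` to controls are all hypothetical choices for illustration.

```python
import numpy as np

def lbf_weights(genotypes, is_case, n_levels=3, prior=1.0):
    """Return indicators S[i, j, k] and per-locus log frequency contrasts.

    genotypes: (subjects, loci) integer codes in 0..n_levels-1
    is_case:   (subjects,) boolean group labels
    prior:     pseudo-count a of an assumed Beta(a, a) prior
    """
    genotypes = np.asarray(genotypes)
    is_case = np.asarray(is_case, dtype=bool)
    # S[i, j, k] = 1 if subject i carries genotype k at locus j, else 0.
    S = (genotypes[:, :, None] == np.arange(n_levels)).astype(float)

    def smoothed_freq(indicators):
        counts = indicators.sum(axis=0)            # shape (loci, n_levels)
        n_subjects = indicators.shape[0]
        return (counts + prior) / (n_subjects + 2.0 * prior)

    nu = smoothed_freq(S[is_case])       # case frequencies (assumed naming)
    theta = smoothed_freq(S[~is_case])   # control frequencies (assumed naming)
    return S, np.log(nu) - np.log(theta)

def lbf_scores(S, weights):
    """Pick out each subject's contrast per locus, then sum over loci."""
    per_locus = np.einsum("ijk,jk->ij", S, weights)
    return per_locus, per_locus.sum(axis=1)

# Toy data: 6 subjects (3 cases, 3 controls) genotyped at 2 loci.
genotypes = np.array([
    [0, 2],
    [0, 2],
    [0, 1],
    [1, 0],
    [2, 0],
    [1, 0],
])
is_case = np.array([True, True, True, False, False, False])

S, weights = lbf_weights(genotypes, is_case)
per_locus, total = lbf_scores(S, weights)
print(total)  # positive values lean towards 'case-ness'
```

Each locus is handled on its own, so the per-subject total is just a sum of independent per-locus terms; no joint model across loci is fitted in this sketch.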
This method non-linearly deforms, or amplifies,
the SNP data space by re-weighting all subjects with the observed
between-group differences, yielding a continuous measure
of the genetic model involved in that trait.
Because the measure is additive, it can be passed
to standard numerical tools: addition, averaging,
moments, singular value decomposition, eigen analysis, recursive
partitioning, ...
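
As a rough illustration of these additive properties, the sketch below sums, averages, and decomposes a small matrix of per-locus LBF values. The numbers are invented for the example and the variable names are hypothetical; only standard NumPy calls are used.

```python
import numpy as np

# Hypothetical per-locus LBF values: rows are subjects, columns are loci.
lbf = np.array([
    [ 1.2,  0.8, -0.1],
    [ 0.9,  1.1,  0.3],
    [-1.0, -0.7,  0.2],
    [-1.3, -0.9, -0.4],
])
is_case = np.array([True, True, False, False])

# Addition: a subject's overall score is the sum over loci.
subject_score = lbf.sum(axis=1)

# Averaging: mean score within each group.
group_mean = {
    "cases": subject_score[is_case].mean(),
    "controls": subject_score[~is_case].mean(),
}

# Singular value decomposition of the column-centred matrix.
centred = lbf - lbf.mean(axis=0)
U, s, Vt = np.linalg.svd(centred, full_matrices=False)
first_component = U[:, 0] * s[0]   # subject coordinates on the leading axis

print(subject_score, group_mean, s)
```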
|