ABSTRACT
Spinal muscular atrophy is a severe motor neuron disease caused by inactivating mutations in the SMN1 gene, leading to reduced levels of full-length functional SMN protein. SMN is a critical mediator of spliceosomal protein assembly, and complete loss or drastic reduction in protein leads to loss of cell viability. However, the reason for selective motor neuron degeneration when SMN is reduced to levels that are tolerated by all other cell types is not currently understood. Widespread splicing abnormalities have recently been reported at end-stage in a mouse model of SMA, leading to the proposition that disruption of efficient splicing is the primary mechanism of motor neuron death. However, it remains unclear whether splicing abnormalities are present during early stages of the disease, which would be a requirement for a direct role in disease pathogenesis. We performed exon-array analysis of RNA from SMN-deficient mouse spinal cord at three time points: pre-symptomatic (P1), early symptomatic (P7), and late symptomatic (P13). Compared to littermate control mice, SMA mice showed a time-dependent increase in the number of exons showing differential expression, with minimal differences between genotypes at P1 and P7, but substantial variation in late-symptomatic (P13) mice. Gene ontology analysis revealed differences in pathways associated with neuronal development as well as cellular injury. Validation of selected targets by RT-PCR confirmed the array findings and was in keeping with a shift between physiologically occurring mRNA isoforms. We conclude that the majority of splicing changes occur late in SMA and may represent a secondary effect of cell injury, though we cannot rule out significant early changes in a small number of transcripts crucial to motor neuron survival.
Subject(s)
Alternative Splicing/genetics , Muscular Atrophy, Spinal/pathology , Animals , Disease Models, Animal , Exons , Gene Expression Regulation , Mice , Motor Neurons , Protein Isoforms , RNA, Messenger/analysis , Spinal Cord , Time Factors
ABSTRACT
Standard techniques for single marker quantitative trait mapping perform poorly in detecting complex interacting genetic influences. When a genetic marker interacts with other genetic markers and/or environmental factors to influence a quantitative trait, a sample of individuals will show different effects according to their exposure to other interacting factors. This paper presents a Bayesian mixture model, which effectively models heterogeneous genetic effects apparent at a single marker. We compute approximate Bayes factors which provide an efficient strategy for screening genetic markers (genome-wide) for evidence of a heterogeneous effect on a quantitative trait. We present a simulation study which demonstrates that the approximation is good and provide a real data example which identifies a population-specific genetic effect on gene expression in the HapMap CEU and YRI populations. We advocate the use of the model as a strategy for identifying candidate interacting markers without any knowledge of the nature or order of the interaction. The source of heterogeneity can be modeled as an extension.
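The dilution of a heterogeneous effect in a single-marker analysis can be illustrated with a small deterministic sketch. It does not implement the Bayesian mixture model or the approximate Bayes factors described above; it only assumes a hypothetical marker whose per-allele effect (`EFFECT`, an invented value) acts solely in individuals exposed to some interacting factor, and compares the pooled regression slope with the slopes within each exposure group.

```python
def ols_slope(genotypes, traits):
    """Least-squares slope of trait on genotype (allele count 0, 1 or 2)."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_y = sum(traits) / n
    cov = sum((g - mean_g) * (y - mean_y) for g, y in zip(genotypes, traits))
    var = sum((g - mean_g) ** 2 for g in genotypes)
    return cov / var


# Hypothetical per-allele effect, present only when exposure == 1.
EFFECT = 1.0

# Balanced toy design: every genotype appears once in each exposure group.
individuals = [(g, e) for e in (0, 1) for g in (0, 1, 2)]
traits = [EFFECT * g * e for g, e in individuals]


def slope_for(exposure=None):
    """Slope in one exposure group, or in the pooled sample if exposure is None."""
    rows = [
        (g, y)
        for (g, e), y in zip(individuals, traits)
        if exposure is None or e == exposure
    ]
    return ols_slope([g for g, _ in rows], [y for _, y in rows])


pooled_slope = slope_for()
exposed_slope = slope_for(1)
unexposed_slope = slope_for(0)

print("pooled:", pooled_slope)
print("exposed:", exposed_slope)
print("unexposed:", unexposed_slope)
```

In this toy design half of the sample is exposed, so the pooled slope is half of the effect seen in the exposed group, which is the kind of heterogeneity a single-effect model averages away.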
Subject(s)
Genetic Loci , Models, Statistical , Quantitative Trait Loci , Algorithms , Alleles , Bayes Theorem , Computer Simulation , Environment , Genetic Markers , Genotype , Humans , Models, Genetic , Odds Ratio , Software
ABSTRACT
Coalescent theory deals with the dynamics of how sampled genetic material has spread through a population from a single ancestor over many generations and is ubiquitous in contemporary molecular population genetics. Inherent in most applications is a continuous-time approximation that is derived under the assumption that sample size is small relative to the actual population size. In effect, this precludes multiple and simultaneous coalescent events that take place in the history of large samples. If sequences do not recombine, the number of sequences ancestral to a large sample is reduced sufficiently after relatively few generations such that use of the continuous-time approximation is justified. However, in tracing the history of large chromosomal segments, a large recombination rate per generation will consistently maintain a large number of ancestors. This can create a major disparity between discrete-time and continuous-time models and we analyze its importance, illustrated with model parameters typical of the human genome. The presence of gene conversion exacerbates the disparity and could seriously undermine applications of coalescent theory to complete genomes. However, we show that multiple and simultaneous coalescent events influence global quantities, such as total number of ancestors, but have negligible effect on local quantities, such as linkage disequilibrium. Reassuringly, most applications of the coalescent model with recombination (including association mapping) focus on local quantities.
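Why sample size matters for the continuous-time approximation can be sketched with a one-generation calculation. The code below is an illustration, not the analysis from the paper: it assumes a simple Wright-Fisher model without recombination in which each of `n` sampled lineages picks its parent uniformly from `N` parental gene copies (both names and the example values are assumptions), and it computes the probability that a single generation contains more than one coalescence, which the continuous-time approximation ignores.

```python
from math import comb


def one_generation_probabilities(n, N):
    """Return (p_none, p_single, p_multi) for n lineages and N parental copies.

    p_none   : all n lineages have distinct parents (no coalescence)
    p_single : exactly one pair of lineages shares a parent
    p_multi  : anything else (multiple mergers or simultaneous events)
    """
    if not 2 <= n <= N:
        raise ValueError("need 2 <= n <= N")

    # All distinct: N * (N - 1) * ... * (N - n + 1) / N**n
    p_none = 1.0
    for i in range(n):
        p_none *= (N - i) / N

    # One pair merges, the resulting n - 1 groups get distinct parents:
    # comb(n, 2) * N * (N - 1) * ... * (N - n + 2) / N**n
    p_single = comb(n, 2) / N
    for i in range(n - 1):
        p_single *= (N - i) / N

    # Guard against tiny negative values from floating-point rounding.
    p_multi = max(0.0, 1.0 - p_none - p_single)
    return p_none, p_single, p_multi


if __name__ == "__main__":
    # Illustrative values only: a small and a large sample, same population.
    for sample_size in (10, 2000):
        _, _, p_multi = one_generation_probabilities(sample_size, 20000)
        print(sample_size, p_multi)
```

Under these assumptions the third probability is negligible when `n` is small relative to `N` and approaches one when the number of lineages stays large, which is the situation the abstract attributes to a high recombination rate.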
Subject(s)
Genetics, Population/methods , Models, Genetic , Gene Conversion , Genome, Human , Humans , Recombination, Genetic , Time
ABSTRACT
Genome-wide association study (GWAS) data on a disease are increasingly available from multiple related populations. In this scenario, meta-analyses can improve power to detect homogeneous genetic associations, but if there exist ancestry-specific effects, via interactions with genetic background or with a causal effect that co-varies with genetic background, then these will typically be obscured. To address this issue, we have developed a robust statistical method for detecting susceptibility gene-ancestry interactions in multi-cohort GWAS based on closely related populations. We use the leading principal components of the empirical genotype matrix to cluster individuals into "ancestry groups" and then look for evidence of heterogeneous genetic associations with disease or another trait across these clusters. Robustness is improved when there are multiple cohorts, as the signal from true gene-ancestry interactions can then be distinguished from gene-collection artefacts by comparing the observed interaction effect sizes in collection groups relative to ancestry groups. When applied to colorectal cancer, we identified a missense polymorphism in iron-absorption gene CYBRD1 that was associated with disease in individuals of English, but not Scottish, ancestry. The association replicated in two additional, independently collected data sets. Our method can be used to detect associations between genetic variants and disease that have been obscured by population genetic heterogeneity. It can be readily extended to the identification of genetic interactions with other covariates such as measured environmental exposures. We envisage our methodology being of particular interest to researchers with existing GWAS data, as ancestry groups can be easily defined and thus tested for interactions.
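One common way to check for heterogeneous associations across groups is Cochran's Q statistic, sketched below. This is a swapped-in, generic technique and not necessarily the test used by the authors; the per-group effect sizes and standard errors are invented for illustration and are not results from the study. The p-value helper applies only to the two-group case, where Q follows a chi-square distribution with one degree of freedom under the null hypothesis of equal effects.

```python
from math import erfc, sqrt


def cochran_q(effects, std_errors):
    """Return (Q, degrees_of_freedom) for per-group effect estimates.

    Q is the inverse-variance weighted sum of squared deviations of each
    group's effect from the pooled (fixed-effect) estimate.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    return q, len(effects) - 1


def two_group_p_value(q):
    """Upper-tail chi-square probability with 1 degree of freedom."""
    return erfc(sqrt(q / 2.0))


# Hypothetical log odds ratios for one variant in two ancestry groups.
group_effects = [0.30, 0.00]
group_std_errors = [0.08, 0.10]

q_stat, dof = cochran_q(group_effects, group_std_errors)
p_value = two_group_p_value(q_stat)
print("Q =", q_stat, "df =", dof, "p =", p_value)
```

A small p-value here would suggest that the effect differs between the two groups; with more than two ancestry groups a chi-square distribution with the returned degrees of freedom would be needed instead of the one-degree-of-freedom shortcut.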