1.
medRxiv ; 2024 May 05.
Article in English | MEDLINE | ID: mdl-38746151

ABSTRACT

While genome sequencing has transformed medicine by elucidating the genetic underpinnings of both rare and common complex disorders, its utility to predict clinical outcomes remains understudied. Here, we used artificial intelligence (AI) technologies to explore the predictive value of genome sequencing in forecasting clinical outcomes following surgery for congenital heart defects (CHD). We report results for a cohort of 2,253 CHD patients from the Pediatric Cardiac Genomics Consortium with a broad range of complex heart defects, pre- and post-operative clinical variables and exome sequencing. Damaging genotypes in chromatin-modifying and cilia-related genes were associated with an elevated risk of adverse post-operative outcomes, including mortality, cardiac arrest and prolonged mechanical ventilation. The impact of damaging genotypes was further amplified in the context of specific CHD phenotypes, surgical complexity and extra-cardiac anomalies. The absence of a damaging genotype in chromatin-modifying and cilia-related genes was also informative, reducing the risk for adverse postoperative outcomes. Thus, genome sequencing enriches the ability to forecast outcomes following congenital cardiac surgery.

3.
Genome Biol ; 23(1): 253, 2022 12 12.
Article in English | MEDLINE | ID: mdl-36510265

ABSTRACT

BACKGROUND: Short tandem repeats (STRs) compose approximately 3% of the genome, and mutations at STR loci have been linked to dozens of human diseases including amyotrophic lateral sclerosis, Friedreich ataxia, Huntington disease, and fragile X syndrome. Improving our understanding of these mutations would increase our knowledge of the mutational dynamics of the genome and may uncover additional loci that contribute to disease. To estimate the genome-wide pattern of mutations at STR loci, we analyze blood-derived whole-genome sequencing data for 544 individuals from 29 three-generation CEPH pedigrees. These pedigrees contain both sets of grandparents, the parents, and an average of 9 grandchildren per family. RESULTS: We use HipSTR to identify de novo STR mutations in the 2nd generation of these pedigrees and require transmission to the third generation for validation. Analyzing approximately 1.6 million STR loci, we estimate the empirical de novo STR mutation rate to be 5.24 × 10⁻⁵ mutations per locus per generation. Perfect repeats mutate about 2× more often than imperfect repeats. De novo STRs are significantly enriched in Alu elements. CONCLUSIONS: Approximately 30% of new STR mutations occur within Alu elements, which compose only 11% of the genome, but only 10% are found in LINE-1 insertions, which compose 17% of the genome. Phasing these mutations to the parent of origin shows that parental transmission biases vary among families. We estimate the average number of de novo genome-wide STR mutations per individual to be approximately 85, which is similar to the average number of observed de novo single nucleotide variants.


Subject(s)
Extended Family , Microsatellite Repeats , Humans , Mutation , Pedigree , Genome
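As a rough consistency check on the figures quoted in the STR abstract above (a sketch, not the authors' calculation), multiplying the per-locus mutation rate by the approximate number of loci analyzed gives an expected per-individual count close to the reported value of about 85. The same arithmetic expresses the Alu and LINE-1 percentages as fold enrichment relative to genome share. All names below are illustrative, and the locus count is the abstract's rounded "approximately 1.6 million".

```python
# Values quoted in the abstract; variable and function names are illustrative.
MUTATION_RATE = 5.24e-5   # de novo STR mutations per locus per generation
N_LOCI = 1.6e6            # approximate number of STR loci analyzed (rounded)


def expected_mutations(rate: float, n_loci: float) -> float:
    """Expected de novo mutations per individual: rate times number of loci."""
    return rate * n_loci


def fold_enrichment(share_of_mutations: float, share_of_genome: float) -> float:
    """Ratio of the observed share of mutations to the share of the genome."""
    return share_of_mutations / share_of_genome


per_individual = expected_mutations(MUTATION_RATE, N_LOCI)
alu_fold = fold_enrichment(0.30, 0.11)     # 30% of mutations, 11% of genome
line1_fold = fold_enrichment(0.10, 0.17)   # 10% of mutations, 17% of genome

print(f"expected de novo STR mutations per individual: {per_individual:.1f}")
print(f"Alu fold enrichment: {alu_fold:.2f}")
print(f"LINE-1 fold enrichment: {line1_fold:.2f}")
```

A ratio above 1 indicates enrichment and a ratio below 1 indicates depletion, which matches the direction of the abstract's statements about Alu elements and LINE-1 insertions.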
4.
Genes (Basel) ; 12(5)2021 04 27.
Article in English | MEDLINE | ID: mdl-33925651

ABSTRACT

There is strong evidence for a genetic contribution to non-syndromic congenital heart defects (CHDs). However, exome- and genome-wide studies conducted at the variant and gene-level have identified few genome-wide significant CHD-related genes. Gene-set analyses are a useful complement to such studies and candidate gene-set analyses of rare variants have provided insight into the genetics of CHDs. However, similar analyses have not been conducted using data on common genetic variants. Consequently, we conducted common variant analyses of 15 CHD candidate gene-sets, using data from two common types of CHDs: conotruncal heart defects (1431 cases) and left ventricular outflow tract defects (509 cases). After Bonferroni correction for evaluation of multiple gene-sets, the cytoskeletal gene-set was significantly associated with conotruncal heart defects (βS = 0.09; 95% confidence interval (CI) 0.03-0.15). This association was stronger when analyses were restricted to the sub-set of cytoskeletal genes that have been observed to harbor rare damaging genotypes in at least two CHD cases (βS = 0.32, 95% CI 0.08-0.56). These findings add to the evidence linking cytoskeletal genes to CHDs and suggest that, for cytoskeletal genes, common variation may contribute to the risk of CHDs.


Subject(s)
Cytoskeleton/genetics , Heart Defects, Congenital/genetics , Polymorphism, Single Nucleotide/genetics , Case-Control Studies , Genome, Human/genetics , Genotype , Humans , Risk Factors
5.
Genome Biol Evol ; 12(6): 779-794, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32359137

ABSTRACT

Ongoing retrotransposition of Alu, LINE-1, and SINE-VNTR-Alu elements generates diversity and variation among human populations. Previous analyses investigating the population genetics of mobile element insertions (MEIs) have been limited by population ascertainment bias or by relatively small numbers of populations and low sequencing coverage. Here, we use 296 individuals representing 142 global populations from the Simons Genome Diversity Project (SGDP) to discover and characterize MEI diversity from deeply sequenced whole-genome data. We report 5,742 MEIs not originally reported by the 1000 Genomes Project and show that high sampling diversity leads to a 4- to 7-fold increase in MEI discovery rates over the original 1000 Genomes Project data. As a result of negative selection, nonreference polymorphic MEIs are underrepresented within genes, and MEIs within genes are often found in the transcriptional orientation opposite that of the gene. Globally, 80% of Alu subfamilies predate the expansion of modern humans from Africa. Polymorphic MEIs show heterozygosity gradients that decrease from Africa to Eurasia to the Americas, and the number of MEIs found uniquely in a single individual is also distributed in this general pattern. The maximum fraction of MEI diversity partitioned among the seven major SGDP population groups (FST) is 7.4%, similar to, but slightly lower than, previous estimates and likely attributable to the diverse sampling strategy of the SGDP. Finally, we utilize these MEIs to extrapolate the primary Native American shared ancestry component back to Asia and provide new evidence from genome-wide identical-by-descent genetic markers that add additional support for a southeastern Siberian origin for most Native Americans.


Subject(s)
Alu Elements , Genetic Variation , Genome, Human , Long Interspersed Nucleotide Elements , Humans , Phylogeography
6.
Nucleic Acids Res ; 48(6): e36, 2020 04 06.
Article in English | MEDLINE | ID: mdl-32067044

ABSTRACT

Alu retrotransposons account for more than 10% of the human genome, and insertions of these elements create structural variants segregating in human populations. Such polymorphic Alus are powerful markers to understand population structure, and they represent variants that can greatly impact genome function, including gene expression. Accurate genotyping of Alus and other mobile elements has been challenging. Indeed, we found that Alu genotypes previously called for the 1000 Genomes Project are sometimes erroneous, which poses significant problems for phasing these insertions with other variants that comprise the haplotype. To ameliorate this issue, we introduce a new pipeline - TypeTE - which genotypes Alu insertions from whole-genome sequencing data. Starting from a list of polymorphic Alus, TypeTE identifies the hallmarks (poly-A tail and target site duplication) and orientation of Alu insertions using local re-assembly to reconstruct presence and absence alleles. Genotype likelihoods are then computed after re-mapping sequencing reads to the reconstructed alleles. Using a high-quality set of PCR-based genotyping of >200 loci, we show that TypeTE improves genotype accuracy from 83% to 92% in the 1000 Genomes dataset. TypeTE can be readily adapted to other retrotransposon families and brings a valuable toolbox addition for population genomics.


Subject(s)
Interspersed Repetitive Sequences/genetics , Mutagenesis, Insertional/genetics , Software , Whole Genome Sequencing/methods , Databases, Genetic , Gene Frequency/genetics , Genetic Loci , Genetics, Population , Genome, Human , Genotype , Humans
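The TypeTE abstract above states that genotype likelihoods are computed after re-mapping reads to the reconstructed presence and absence alleles. The sketch below illustrates that general idea with a textbook binomial read-count model; it is an assumption made for illustration and is not taken from TypeTE itself. The function names, the fixed error rate, and the genotype labels are all hypothetical.

```python
from math import comb
from typing import Dict


def genotype_likelihoods(presence_reads: int, total_reads: int,
                         error: float = 0.01) -> Dict[str, float]:
    """Binomial likelihood of the read counts under each diploid genotype.

    "0/0" = homozygous for the absence allele, "0/1" = heterozygous,
    "1/1" = homozygous for the presence (insertion) allele.
    `error` is an assumed per-read chance of supporting the wrong allele.
    """
    # Probability that a single read supports the presence allele.
    p_presence = {"0/0": error, "0/1": 0.5, "1/1": 1.0 - error}
    absence_reads = total_reads - presence_reads
    n_choose_k = comb(total_reads, presence_reads)
    return {
        genotype: n_choose_k * p ** presence_reads * (1.0 - p) ** absence_reads
        for genotype, p in p_presence.items()
    }


def call_genotype(presence_reads: int, total_reads: int) -> str:
    """Return the genotype with the highest likelihood."""
    likelihoods = genotype_likelihoods(presence_reads, total_reads)
    return max(likelihoods, key=likelihoods.get)


# Hypothetical read counts: 5 of 10 reads support the insertion allele.
print(call_genotype(5, 10))
```

A real pipeline would typically weight reads by mapping and base quality rather than use a single fixed error rate; the fixed value here only keeps the example short.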
7.
Genome Res ; 29(10): 1567-1577, 2019 10.
Article in English | MEDLINE | ID: mdl-31575651

ABSTRACT

Germline mutation rates in humans have been estimated for a variety of mutation types, including single-nucleotide and large structural variants. Here, we directly measure the germline retrotransposition rate for the three active retrotransposon elements: L1, Alu, and SVA. We used three tools for calling mobile element insertions (MEIs) (MELT, RUFUS, and TranSurVeyor) on blood-derived whole-genome sequence (WGS) data from 599 CEPH individuals, comprising 33 three-generation pedigrees. We identified 26 de novo MEIs in 437 births. The retrotransposition rate estimate for Alu elements, one in 40 births, is roughly half the rate estimated using phylogenetic analyses, a difference in magnitude similar to that observed for single-nucleotide variants. The L1 retrotransposition rate is one in 63 births and is within range of previous estimates (1:20-1:200 births). The SVA retrotransposition rate, one in 63 births, is much higher than the previous estimate of one in 900 births. Our large, three-generation pedigrees allowed us to assess parent-of-origin effects and the timing of insertion events in either gametogenesis or early embryonic development. We find a statistically significant paternal bias in Alu retrotransposition. Our study represents the first in-depth analysis of the rate and dynamics of human retrotransposition from WGS data in three-generation human pedigrees.


Subject(s)
Interspersed Repetitive Sequences/genetics , Phylogeny , Retroelements/genetics , Whole Genome Sequencing , Alu Elements/genetics , Animals , Female , Hominidae/blood , Hominidae/genetics , Humans , Long Interspersed Nucleotide Elements/genetics , Male , Mutation , Pedigree , Polymorphism, Single Nucleotide/genetics
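The rates quoted in the retrotransposition abstract above can be cross-checked against each other with simple arithmetic. The sketch below (illustrative names, not the authors' code) compares the overall rate implied by 26 de novo MEIs in 437 births with the sum of the three per-element rates. The abstract does not give per-element counts, so none are assumed here.

```python
from fractions import Fraction

# Figures quoted in the abstract.
DE_NOVO_MEIS = 26
BIRTHS = 437
BIRTHS_PER_EVENT = {"Alu": 40, "L1": 63, "SVA": 63}  # "one in N births"

# Overall rate from the raw counts, expressed as births per insertion.
overall_births_per_event = Fraction(BIRTHS, DE_NOVO_MEIS)

# Overall rate implied by adding the three per-element rates.
combined_rate = sum(Fraction(1, n) for n in BIRTHS_PER_EVENT.values())
combined_births_per_event = 1 / combined_rate

print(f"from counts:      one in {float(overall_births_per_event):.1f} births")
print(f"from element sum: one in {float(combined_births_per_event):.1f} births")
```

The two figures agree to within about one birth; the small gap is plausibly due to rounding in the quoted per-element rates, though the abstract does not say so.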
8.
Nat Commun ; 10(1): 4722, 2019 10 17.
Article in English | MEDLINE | ID: mdl-31624253

ABSTRACT

The genetic architecture of sporadic congenital heart disease (CHD) is characterized by enrichment in damaging de novo variants in chromatin-modifying genes. To test the hypothesis that gene pathways contributing to de novo forms of CHD are distinct from those for recessive forms, we analyze 2391 whole-exome trios from the Pediatric Cardiac Genomics Consortium. We deploy a permutation-based gene-burden analysis to identify damaging recessive and compound heterozygous genotypes and disease genes, controlling for confounding effects, such as background mutation rate and ancestry. Cilia-related genes are significantly enriched for damaging rare recessive genotypes, but comparatively depleted for de novo variants. The opposite trend is observed for chromatin-modifying genes. Other cardiac developmental gene classes have less stratification by mode of inheritance than cilia and chromatin-modifying gene classes. Our analyses reveal dominant and recessive CHD are associated with distinct gene functions, with cilia-related genes providing a reservoir of rare segregating variation leading to CHD.


Subject(s)
Genes, Dominant , Genes, Recessive , Genetic Predisposition to Disease/genetics , Heart Defects, Congenital/genetics , Mutation , Case-Control Studies , Child , Female , Genome-Wide Association Study , Genotype , Heart Defects, Congenital/pathology , Humans , Male , Phenotype , Exome Sequencing
9.
G3 (Bethesda) ; 8(9): 2881-2888, 2018 08 30.
Article in English | MEDLINE | ID: mdl-30166421

ABSTRACT

Crohn's disease is a complex genetic trait characterized by chronic relapsing intestinal inflammation. Genome wide association studies (GWAS) have identified more than 170 loci associated with the disease, accounting for ∼14% of the disease variance. We hypothesized that rare genetic variation in GWAS positional candidates also contributes to disease pathogenesis. We performed targeted, massively-parallel sequencing of 101 genes in 205 children with Crohn's disease, including 179 parent-child trios and 200 controls, both of European ancestry. We used the gene burden test implemented in VAAST and estimated effect sizes using logistic regression and meta-analyses. We identified three genes with nominally significant p-values: NOD2, RTKN2, and MGAT3. Only NOD2 was significant after correcting for multiple comparisons. We identified eight novel rare variants in NOD2 that are likely disease-associated. Incorporation of rare variation and compound heterozygosity nominally increased the proportion of variance explained from 0.074 to 0.089. We estimated the population attributable risk and total heritability of variation in NOD2 to be 32.9% and 3.4%, respectively, with 3.7% and 0.25% accounted for by rare putatively functional variants. Sequencing probands (as opposed to genotyping) to identify rare variants and incorporating phase by sequencing parents can recover a portion of the missing heritability of Crohn's disease.


Subject(s)
Crohn Disease/genetics , Genetic Variation , Genome-Wide Association Study , Nod2 Signaling Adaptor Protein/genetics , Adolescent , Adult , Child , Child, Preschool , Female , Heterozygote , Humans , Infant , Intracellular Signaling Peptides and Proteins/genetics , Male , N-Acetylglucosaminyltransferases/genetics
10.
Nat Genet ; 49(11): 1593-1601, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28991257

ABSTRACT

Congenital heart disease (CHD) is the leading cause of mortality from birth defects. Here, exome sequencing of a single cohort of 2,871 CHD probands, including 2,645 parent-offspring trios, implicated rare inherited mutations in 1.8%, including a recessive founder mutation in GDF1 accounting for ∼5% of severe CHD in Ashkenazim, recessive genotypes in MYH6 accounting for ∼11% of Shone complex, and dominant FLT4 mutations accounting for 2.3% of Tetralogy of Fallot. De novo mutations (DNMs) accounted for 8% of cases, including ∼3% of isolated CHD patients and ∼28% with both neurodevelopmental and extra-cardiac congenital anomalies. Seven genes surpassed thresholds for genome-wide significance, and 12 genes not previously implicated in CHD had >70% probability of being disease related. DNMs in ∼440 genes were inferred to contribute to CHD. Striking overlap between genes with damaging DNMs in probands with CHD and autism was also found.


Subject(s)
Autistic Disorder/genetics , Cardiac Myosins/genetics , Genetic Predisposition to Disease , Growth Differentiation Factor 1/genetics , Heart Defects, Congenital/genetics , Myosin Heavy Chains/genetics , Vascular Endothelial Growth Factor Receptor-3/genetics , Adult , Autistic Disorder/pathology , Case-Control Studies , Child , Exome , Female , Gene Expression , Genome-Wide Association Study , Heart Defects, Congenital/pathology , Heterozygote , High-Throughput Nucleotide Sequencing , Homozygote , Humans , Male , Mutation , Pedigree , Risk
11.
BMC Genomics ; 18(1): 396, 2017 05 22.
Article in English | MEDLINE | ID: mdl-28532386

ABSTRACT

BACKGROUND: The cost of Whole Genome Sequencing (WGS) has decreased tremendously in recent years due to advances in next-generation sequencing technologies. Nevertheless, the cost of carrying out large-scale cohort studies using WGS is still daunting. Past simulation studies with coverage at ~2x have shown promise for using low coverage WGS in studies focused on variant discovery, association study replications, and population genomics characterization. However, the performance of low coverage WGS in populations with a complex history and no reference panel remains to be determined. RESULTS: South Indian populations are known to have a complex population structure and are an example of a major population group that lacks adequate reference panels. To test the performance of extremely low-coverage WGS (EXL-WGS) in populations with a complex history and to provide a reference resource for South Indian populations, we performed EXL-WGS on 185 South Indian individuals from eight populations to ~1.6x coverage. Using two variant discovery pipelines, SNPTools and GATK, we generated a consensus call set that has ~90% sensitivity for identifying common variants (minor allele frequency ≥ 10%). Imputation further improves the sensitivity of our call set. In addition, we obtained high-coverage for the whole mitochondrial genome to infer the maternal lineage evolutionary history of the Indian samples. CONCLUSIONS: Overall, we demonstrate that EXL-WGS with imputation can be a valuable study design for variant discovery with a dramatically lower cost than standard WGS, even in populations with a complex history and without available reference data. In addition, the South Indian EXL-WGS data generated in this study will provide a valuable resource for future Indian genomic studies.


Subject(s)
Asian People/genetics , Metagenomics , Whole Genome Sequencing , Genetic Variation , Genome, Mitochondrial/genetics , Humans
12.
Proc Natl Acad Sci U S A ; 112(45): 13833-8, 2015 Nov 10.
Article in English | MEDLINE | ID: mdl-26504230

ABSTRACT

Pleistocene residential sites with multiple contemporaneous human burials are extremely rare in the Americas. We report mitochondrial genomic variation in the first multiple mitochondrial genomes from a single prehistoric population: two infant burials (USR1 and USR2) from a common interment at the Upward Sun River Site in central Alaska dating to ∼11,500 cal B.P. Using a targeted capture method and next-generation sequencing, we determined that the USR1 infant possessed variants that define mitochondrial lineage C1b, whereas the USR2 genome falls at the root of lineage B2, allowing us to refine younger coalescence age estimates for these two clades. C1b and B2 are rare to absent in modern populations of northern North America. Documentation of these lineages at this location in the Late Pleistocene provides evidence for the extent of mitochondrial diversity in early Beringian populations, which supports the expectations of the Beringian Standstill Model.


Subject(s)
DNA, Mitochondrial/genetics , Genetic Variation , Haplotypes/genetics , Human Migration/history , Models, Theoretical , Phylogeny , Alaska , Archaeology/methods , Base Sequence , Bayes Theorem , Burial/history , Evolution, Molecular , Geography , High-Throughput Nucleotide Sequencing , History, Ancient , Humans , Infant , Likelihood Functions , Models, Genetic , Molecular Sequence Data , Oligonucleotides/genetics
13.
Science ; 349(6253): aab3761, 2015 09 11.
Article in English | MEDLINE | ID: mdl-26249230

ABSTRACT

In order to explore the diversity and selective signatures of duplication and deletion human copy-number variants (CNVs), we sequenced 236 individuals from 125 distinct human populations. We observed that duplications exhibit fundamentally different population genetic and selective signatures than deletions and are more likely to be stratified between human populations. Through reconstruction of the ancestral human genome, we identify megabases of DNA lost in different human lineages and pinpoint large duplications that introgressed from the extinct Denisova lineage now found at high frequency exclusively in Oceanic populations. We find that the proportion of CNV base pairs to single-nucleotide-variant base pairs is greater among non-Africans than it is among African populations, but we conclude that this difference is likely due to unique aspects of non-African population history as opposed to differences in CNV load.


Subject(s)
DNA Copy Number Variations , Evolution, Molecular , Gene Duplication , Genome, Human/genetics , Population/genetics , Sequence Deletion , Animals , Black People/classification , Black People/genetics , Hominidae/genetics , Humans , Native Hawaiian or Other Pacific Islander/classification , Native Hawaiian or Other Pacific Islander/genetics , Phylogeny , Polymorphism, Single Nucleotide , Selection, Genetic
14.
PLoS One ; 9(8): e104378, 2014.
Article in English | MEDLINE | ID: mdl-25093581

ABSTRACT

BACKGROUND: The genetics involved in Ewing sarcoma susceptibility and prognosis are poorly understood. EWS/FLI and related EWS/ETS chimeras upregulate numerous gene targets via promoter-based GGAA-microsatellite response elements. These microsatellites are highly polymorphic in humans, and preliminary evidence suggests EWS/FLI-mediated gene expression is highly dependent on the number of GGAA motifs within the microsatellite. OBJECTIVES: Here we sought to examine the polymorphic spectrum of a GGAA-microsatellite within the NR0B1 promoter (a critical EWS/FLI target) in primary Ewing sarcoma tumors, and characterize how this polymorphism influences gene expression and clinical outcomes. RESULTS: A complex, bimodal pattern of EWS/FLI-mediated gene expression was observed across a wide range of GGAA motifs, with maximal expression observed in constructs containing 20-26 GGAA motifs. Relative to white European and African controls, the NR0B1 GGAA-microsatellite in tumor cells demonstrated a strong bias for haplotypes containing 21-25 GGAA motifs suggesting a relationship between microsatellite function and disease susceptibility. This selection bias was not a product of microsatellite instability in tumor samples, nor was there a correlation between NR0B1 GGAA-microsatellite polymorphisms and survival outcomes. CONCLUSIONS: These data suggest that GGAA-microsatellite polymorphisms observed in human populations modulate EWS/FLI-mediated gene expression and may influence disease susceptibility in Ewing sarcoma.


Subject(s)
DAX-1 Orphan Nuclear Receptor/genetics , Microsatellite Repeats/genetics , Nucleotide Motifs , Polymorphism, Genetic , Sarcoma, Ewing/diagnosis , Sarcoma, Ewing/genetics , Adolescent , Age Factors , Alleles , Case-Control Studies , Cell Transformation, Neoplastic/genetics , Child , Child, Preschool , Female , Gene Expression , Genetic Loci , Genomics , Humans , Linkage Disequilibrium , Male , Models, Biological , Oncogene Proteins, Fusion/genetics , Oncogene Proteins, Fusion/metabolism , Prognosis , Proto-Oncogene Protein c-fli-1/genetics , Proto-Oncogene Protein c-fli-1/metabolism , RNA-Binding Protein EWS/genetics , RNA-Binding Protein EWS/metabolism , Sarcoma, Ewing/mortality , Young Adult
15.
Am J Obstet Gynecol ; 210(4): 321.e1-321.e21, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24594138

ABSTRACT

OBJECTIVE: We hypothesized that genetic variation affects responsiveness to 17-alpha hydroxyprogesterone caproate (17P) for recurrent preterm birth prevention. STUDY DESIGN: Women of European ancestry with ≥1 spontaneous singleton preterm birth at <34 weeks' gestation who received 17P were recruited prospectively and classified as a 17P responder or nonresponder by the difference in delivery gestational age between 17P-treated and -untreated pregnancies. Samples underwent whole exome sequencing. Coding variants were compared between responders and nonresponders with the use of the Variant Annotation, Analysis, and Search Tool (VAAST), which is a probabilistic search tool for the identification of disease-causing variants, and were compared with a Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway candidate gene list. Genes with the highest VAAST scores were then classified by the online Protein ANalysis THrough Evolutionary Relationships (PANTHER) system into known gene ontology molecular functions and biologic processes. Gene distributions within these classifications were compared with an online reference population to identify over- and under- represented gene sets. RESULTS: Fifty women (9 nonresponders) were included. Responders delivered 9.2 weeks longer with 17P vs 1.3 weeks' gestation for nonresponders (P < .001). A genome-wide search for genetic differences implicated the NOS1 gene to be the most likely associated gene from among genes on the KEGG candidate gene list (P < .00095). PANTHER analysis revealed several over-represented gene ontology categories that included cell adhesion, cell communication, signal transduction, nitric oxide signal transduction, and receptor activity (all with significant Bonferroni-corrected probability values). CONCLUSION: We identified sets of over-represented genes in key processes among responders to 17P, which is the first step in the application of pharmacogenomics to preterm birth prevention.


Subject(s)
Estrogen Antagonists/administration & dosage , Hydroxyprogesterones/administration & dosage , Premature Birth/prevention & control , 17 alpha-Hydroxyprogesterone Caproate , Case-Control Studies , Exome , Female , Genetic Variation , Humans , Nitric Oxide Synthase/genetics , Nitric Oxide Synthase Type I/genetics , Pharmacogenetics , Pregnancy , Prospective Studies , Secondary Prevention , Sequence Analysis, DNA/methods
16.
PLoS Genet ; 9(7): e1003634, 2013.
Article in English | MEDLINE | ID: mdl-23874230

ABSTRACT

Deedu (DU) Mongolians, who migrated from the Mongolian steppes to the Qinghai-Tibetan Plateau approximately 500 years ago, are challenged by environmental conditions similar to native Tibetan highlanders. Identification of adaptive genetic factors in this population could provide insight into coordinated physiological responses to this environment. Here we examine genomic and phenotypic variation in this unique population and present the first complete analysis of a Mongolian whole-genome sequence. High-density SNP array data demonstrate that DU Mongolians share genetic ancestry with other Mongolian as well as Tibetan populations, specifically in genomic regions related with adaptation to high altitude. Several selection candidate genes identified in DU Mongolians are shared with other Asian groups (e.g., EDAR), neighboring Tibetan populations (including high-altitude candidates EPAS1, PKLR, and CYP2E1), as well as genes previously hypothesized to be associated with metabolic adaptation (e.g., PPARG). Hemoglobin concentration, a trait associated with high-altitude adaptation in Tibetans, is at an intermediate level in DU Mongolians compared to Tibetans and Han Chinese at comparable altitude. Whole-genome sequence from a DU Mongolian (Tianjiao1) shows that about 2% of the genomic variants, including more than 300 protein-coding changes, are specific to this individual. Our analyses of DU Mongolians and the first Mongolian genome provide valuable insight into genetic adaptation to extreme environments.


Subject(s)
Adaptation, Physiological/genetics , Altitude Sickness/genetics , Genome, Human , Selection, Genetic , Acclimatization/genetics , Acclimatization/physiology , Alleles , Altitude , Altitude Sickness/pathology , Asian People/genetics , Gene Frequency , Genetics, Population , Genome-Wide Association Study , Humans , Mongolia , Phenotype , Polymorphism, Single Nucleotide , Sequence Analysis, DNA
17.
Genome Res ; 23(7): 1170-81, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23599355

ABSTRACT

Alu retrotransposons are the most numerous and active mobile elements in humans, causing genetic disease and creating genomic diversity. Mobile element scanning (ME-Scan) enables comprehensive and affordable identification of mobile element insertions (MEI) using targeted high-throughput sequencing of multiplexed MEI junction libraries. In a single experiment, ME-Scan identifies nearly all AluYb8 and AluYb9 elements, with high sensitivity for both rare and common insertions, in 169 individuals of diverse ancestry. ME-Scan detects heterozygous insertions in single individuals with 91% sensitivity. Insertion presence or absence states determined by ME-Scan are 95% concordant with those determined by locus-specific PCR assays. By sampling diverse populations from Africa, South Asia, and Europe, we are able to identify 5799 Alu insertions, including 2524 novel ones, some of which occur in exons. Sub-Saharan populations and a Pygmy group in particular carry numerous intermediate-frequency Alu insertions that are absent in non-African groups. There is a significant dearth of exon-interrupting insertions among common Alu polymorphisms, but the density of singleton Alu insertions is constant across exonic and nonexonic regions. In one case, a validated novel singleton Alu interrupts a protein-coding exon of FAM187B. This implies that exonic Alu insertions are generally deleterious and thus eliminated by natural selection, but not so quickly that they cannot be observed as extremely rare variants.


Subject(s)
Alu Elements , Genome, Human , High-Throughput Nucleotide Sequencing , Mutagenesis, Insertional , Retroelements , DNA Replication , Exons , Genetic Loci , High-Throughput Nucleotide Sequencing/methods , Humans , Polymorphism, Genetic , Population Groups/genetics , Reproducibility of Results , Sensitivity and Specificity , Transcription, Genetic
18.
BMC Genet ; 14: 30, 2013 Apr 25.
Article in English | MEDLINE | ID: mdl-23617681

ABSTRACT

BACKGROUND: Because of the role of inflammation in preterm birth (PTB), polymorphisms in and near the interleukin-6 gene (IL6) have been association study targets. Several previous studies have assessed the association between PTB and a single nucleotide polymorphism (SNP), rs1800795, located in the IL6 gene promoter region. Their results have been inconsistent and SNP frequencies have varied strikingly among different populations. We therefore conducted a meta-analysis with subgroup analysis by population strata to: (1) reduce the confounding effect of population structure, (2) increase sample size and statistical power, and (3) elucidate the association between rs1800795 and PTB. RESULTS: We reviewed all published papers for PTB phenotype and SNP rs1800795 genotype. Maternal genotype and fetal genotype were analyzed separately and the analyses were stratified by population. The PTB phenotype was defined as gestational age (GA) < 37 weeks, but results from earlier GA were selected when available. All studies were compared by genotype (CC versus CG+GG), based on functional studies. For the maternal genotype analysis, 1,165 PTBs and 3,830 term controls were evaluated. Populations were stratified into women of European descent (for whom the most data were available) and women of heterogeneous origin or admixed populations. All ancestry was self-reported. Women of European descent had a summary odds ratio (OR) of 0.68, (95% confidence interval (CI) 0.51 - 0.91), indicating that the CC genotype is protective against PTB. The result for non-European women was not statistically significant (OR 1.01, 95% CI 0.59 - 1.75). For the fetal genotype analysis, four studies were included; there was no significant association with PTB (OR 0.98, 95% CI 0.72 - 1.33). Sensitivity analysis showed that preterm premature rupture of membrane (PPROM) may be a confounding factor contributing to phenotype heterogeneity.
CONCLUSIONS: IL6 SNP rs1800795 genotype CC is protective against PTB in women of European descent. It is not significant in other heterogeneous or admixed populations, or in fetal genotype analysis. Population structure is an important confounding factor that should be controlled for in studies of PTB.


Subject(s)
Interleukin-6/genetics , Polymorphism, Single Nucleotide , Premature Birth/genetics , Female , Humans , Premature Birth/epidemiology , Promoter Regions, Genetic , White People/genetics
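To make the significance statements in the IL6 meta-analysis abstract above concrete, the sketch below back-calculates an approximate standard error and two-sided p-value from each reported odds ratio and 95% confidence interval. It assumes the intervals are Wald-type intervals, symmetric on the log-odds scale, which the abstract does not state. The function names are illustrative and the results are approximations, not values reported by the authors.

```python
from math import erfc, log, sqrt


def se_from_ci(lower: float, upper: float, z_crit: float = 1.96) -> float:
    """Standard error of log(OR), assuming a symmetric 95% CI on the log scale."""
    return (log(upper) - log(lower)) / (2.0 * z_crit)


def two_sided_p(odds_ratio: float, lower: float, upper: float) -> float:
    """Approximate two-sided p-value for the null hypothesis OR = 1."""
    z = log(odds_ratio) / se_from_ci(lower, upper)
    return erfc(abs(z) / sqrt(2.0))


# Odds ratios and 95% CIs as quoted in the abstract.
p_european = two_sided_p(0.68, 0.51, 0.91)       # maternal, European descent
p_non_european = two_sided_p(1.01, 0.59, 1.75)   # maternal, other populations
p_fetal = two_sided_p(0.98, 0.72, 1.33)          # fetal genotype

print(f"European descent:  p ~ {p_european:.3f}")
print(f"Other populations: p ~ {p_non_european:.3f}")
print(f"Fetal genotype:    p ~ {p_fetal:.3f}")
```

Under this assumption, only the interval that excludes 1 (women of European descent) yields a p-value below 0.05, consistent with the abstract's conclusions.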
19.
Cancer Genet ; 205(6): 304-12, 2012 Jun.
Article in English | MEDLINE | ID: mdl-22749036

ABSTRACT

The genetics of Ewing sarcoma development remain obscure. The incidence of Ewing sarcoma is ten-fold less in Africans as compared to Europeans, irrespective of geographic location, suggesting population-specific genetic influences. Since GGAA-containing microsatellites within key target genes are necessary for Ewing sarcoma-specific EWS/FLI DNA binding and gene activation, and gene expression is positively correlated with the number of repeat motifs in the promoter/enhancer region, we sought to determine if significant polymorphisms exist between African and European populations which might contribute to observed differences in Ewing sarcoma incidence and outcomes. GGAA microsatellites upstream of two critical EWS/FLI target genes, NR0B1 and CAV1, were sequenced from subjects of European and African descent. While the characteristics of the CAV1 promoter microsatellites were similar across both populations, the NR0B1 microsatellite in African subjects was significantly larger, harboring more repeat motifs, a greater number of repeat segments, and longer consecutive repeats, than in European subjects. These results are biologically intriguing as NR0B1 was the most highly enriched EWS/FLI bound gene in prior studies, and is absolutely necessary for oncogenic transformation in Ewing sarcoma. These data suggest that GGAA microsatellite polymorphisms in the NR0B1 gene might influence disease susceptibility and prognosis in Ewing sarcoma in unanticipated ways.


Subject(s)
Black People/genetics , Caveolin 1/genetics , DAX-1 Orphan Nuclear Receptor/genetics , Microsatellite Repeats , Sarcoma, Ewing/genetics , White People/genetics , Africa/epidemiology , Base Composition , Europe/epidemiology , Humans , Polymorphism, Single Nucleotide , Promoter Regions, Genetic , Proto-Oncogene Protein c-fli-1/genetics , Sarcoma, Ewing/epidemiology , Sarcoma, Ewing/ethnology
20.
BMC Genet ; 13: 39, 2012 May 20.
Article in English | MEDLINE | ID: mdl-22606979

ABSTRACT

BACKGROUND: Populations of the Americas were founded by early migrants from Asia, and some have experienced recent genetic admixture. To better characterize the native and non-native ancestry components in populations from the Americas, we analyzed 815,377 autosomal SNPs, mitochondrial hypervariable segments I and II, and 36 Y-chromosome STRs from 24 Mesoamerican Totonacs and 23 South American Bolivians. RESULTS AND CONCLUSIONS: We analyzed common genomic regions from native Bolivian and Totonac populations to identify 324 highly predictive Native American ancestry informative markers (AIMs). As few as 40-50 of these AIMs perform nearly as well as large panels of random genome-wide SNPs for predicting and estimating Native American ancestry and admixture levels. These AIMs have greater New World vs. Old World specificity than previous AIMs sets. We identify highly-divergent New World SNPs that coincide with high-frequency haplotypes found at similar frequencies in all populations examined, including the HGDP Pima, Maya, Colombian, Karitiana, and Surui American populations. Some of these regions are potential candidates for positive selection. European admixture in the Bolivian sample is approximately 12%, though individual estimates range from 0-48%. We estimate that the admixture occurred ~360-384 years ago. Little evidence of European or African admixture was found in Totonac individuals. Bolivians with pre-Columbian mtDNA and Y-chromosome haplogroups had 5-30% autosomal European ancestry, demonstrating the limitations of Y-chromosome and mtDNA haplogroups and the need for autosomal ancestry informative markers for assessing ancestry in admixed populations.


Subject(s)
American Indian or Alaska Native/genetics , Bolivia/ethnology , DNA, Mitochondrial , Emigration and Immigration , Genetics, Population , Humans , Mexico/ethnology , Phylogeography , Polymorphism, Single Nucleotide , Selection, Genetic