1.
Blood ; 142(2): 172-184, 2023 07 13.
Article in English | MEDLINE | ID: mdl-37001051

ABSTRACT

Trisomy 21, the genetic cause of Down syndrome (DS), is the most common congenital chromosomal anomaly. It is associated with a 20-fold increased risk of acute lymphoblastic leukemia (ALL) during childhood and results in distinctive leukemia biology. To comprehensively define the genomic landscape of DS-ALL, we performed whole-genome sequencing and whole-transcriptome sequencing (RNA-Seq) on 295 cases. Our integrated genomic analyses identified 15 molecular subtypes of DS-ALL, with marked enrichment of CRLF2-r, IGH::IGF2BP1, and C/EBP altered (C/EBPalt) subtypes compared with 2257 non-DS-ALL cases. We observed abnormal activation of the CEBPD, CEBPA, and CEBPE genes in 10.5% of DS-ALL cases via a variety of genomic mechanisms, including chromosomal rearrangements and noncoding mutations leading to enhancer hijacking. A total of 42.3% of C/EBP-activated DS-ALL also have concomitant FLT3 point mutations or insertions/deletions, compared with 4.1% in other subtypes. CEBPD overexpression enhanced the differentiation of mouse hematopoietic progenitor cells into pro-B cells in vitro, particularly in a DS genetic background. Notably, recombination-activating gene-mediated somatic genomic abnormalities were common in DS-ALL, accounting for a median of 27.5% of structural alterations, compared with 7.7% in non-DS-ALL. Unsupervised hierarchical clustering analyses of CRLF2-rearranged DS-ALL identified substantial heterogeneity within this group, with the BCR::ABL1-like subset linked to an inferior event-free survival, even after adjusting for known clinical risk factors. These results provide important insights into the biology of DS-ALL and point to opportunities for targeted therapy and treatment individualization.


Subject(s)
Down Syndrome , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Animals , Mice , Down Syndrome/complications , Down Syndrome/genetics , Mutation , Risk Factors , Genomics , Chromosome Aberrations , Precursor Cell Lymphoblastic Leukemia-Lymphoma/complications , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics
2.
Leukemia ; 37(3): 518-528, 2023 03.
Article in English | MEDLINE | ID: mdl-36658389

ABSTRACT

Childhood B-cell acute lymphoblastic leukaemia (B-ALL) is characterised by recurrent genetic abnormalities that drive risk-directed treatment strategies. Using current techniques, accurate detection of such aberrations can be challenging, due to the rapidly expanding list of key genetic abnormalities. Whole genome sequencing (WGS) has the potential to improve genetic testing, but requires comprehensive validation. We performed WGS on 210 childhood B-ALL samples annotated with clinical and genetic data. We devised a molecular classification system to subtype these patients based on identification of key genetic changes in tumour-normal and tumour-only analyses. This approach detected 294 subtype-defining genetic abnormalities in 96% (202/210) patients. Novel genetic variants, including fusions involving genes in the MAP kinase pathway, were identified. WGS results were concordant with standard-of-care methods and whole transcriptome sequencing (WTS). We expanded the catalogue of genetic profiles that reliably classify PAX5alt and ETV6::RUNX1-like subtypes. Our novel bioinformatic pipeline improved detection of DUX4 rearrangements (DUX4-r): a good-risk B-ALL subtype with high survival rates. Overall, we have validated that WGS provides a standalone, reliable genetic test to detect all subtype-defining genetic abnormalities in B-ALL, accurately classifying patients for the risk-directed treatment stratification, while simultaneously performing as a research tool to identify novel disease biomarkers.


Subject(s)
Precursor B-Cell Lymphoblastic Leukemia-Lymphoma , Precursor Cell Lymphoblastic Leukemia-Lymphoma , Humans , Precursor Cell Lymphoblastic Leukemia-Lymphoma/drug therapy , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/diagnosis , Precursor B-Cell Lymphoblastic Leukemia-Lymphoma/genetics , Computational Biology , Genetic Testing , Whole Genome Sequencing
3.
Nat Genet ; 54(9): 1376-1389, 2022 09.
Article in English | MEDLINE | ID: mdl-36050548

ABSTRACT

Acute lymphoblastic leukemia (ALL) is the most common childhood cancer. Here, using whole-genome, exome and transcriptome sequencing of 2,754 childhood patients with ALL, we find that, despite a generally low mutation burden, ALL cases harbor a median of four putative somatic driver alterations per sample, with 376 putative driver genes identified varying in prevalence across ALL subtypes. Most samples harbor at least one rare gene alteration, including 70 putative cancer driver genes associated with ubiquitination, SUMOylation, noncoding transcripts and other functions. In hyperdiploid B-ALL, chromosomal gains are acquired early and synchronously before ultraviolet-induced mutation. By contrast, ultraviolet-induced mutations precede chromosomal gains in B-ALL cases with intrachromosomal amplification of chromosome 21. We also demonstrate the prognostic significance of genetic alterations within subtypes. Intriguingly, DUX4- and KMT2A-rearranged subtypes separate into CEBPA/FLT3- or NFATC4-expressing subgroups with potential clinical implications. Together, these results deepen understanding of the ALL genomic landscape and associated outcomes.


Subject(s)
Precursor Cell Lymphoblastic Leukemia-Lymphoma , Child , Chromosome Aberrations , Exome/genetics , Genomics , Humans , Mutation , Precursor Cell Lymphoblastic Leukemia-Lymphoma/genetics
4.
Adv Exp Med Biol ; 1361: 163-175, 2022.
Article in English | MEDLINE | ID: mdl-35230688

ABSTRACT

Gene fusions play a prominent role in the oncogenesis of many cancers and have been extensively targeted as biomarkers for diagnostic, prognostic, and therapeutic purposes. Detection methods span a number of platforms, including cytogenetics (e.g., FISH), targeted qPCR, and sequencing-based assays. Before the advent of next-generation sequencing (NGS), fusion testing was primarily targeted to specific genome loci, with assays tailored for previously characterized fusion events. The availability of whole genome sequencing (WGS) and whole transcriptome sequencing (RNA-seq) allows for genome-wide screening for the simultaneous detection of both known and novel fusions. RNA-seq, in particular, offers the possibility of rapid turn-around testing with less dedicated sequencing than WGS. This makes it an attractive target for clinical oncology testing, particularly when transcriptome data can be multi-purposed for tumor classification and additional analyses. Despite considerable efforts and substantial progress, however, genome-wide screening for fusions solely based on RNA-seq data remains an ongoing challenge. A host of technical artifacts adversely impact the sensitivity and specificity of existing software tools. In this chapter, the general strategies employed by current fusion software are discussed, and a selection of available fusion detection tools are surveyed. Despite its current limitations, RNA-seq-based fusion detection offers a more comprehensive and efficient strategy as compared to multiple targeted fusion assays. When thoughtfully employed within a wider ecosystem of diagnostic assays and clinical information, RNA-seq fusion detection represents a powerful tool for precision oncology.


Subject(s)
Neoplasms , Ecosystem , Gene Fusion , High-Throughput Nucleotide Sequencing , Humans , Medical Oncology , Neoplasms/diagnosis , Neoplasms/genetics , Precision Medicine , RNA-Seq , Sequence Analysis, RNA/methods , Software , Exome Sequencing
5.
Sci Rep ; 11(1): 22213, 2021 11 15.
Article in English | MEDLINE | ID: mdl-34782706

ABSTRACT

Rhabdomyosarcomas (RMS) represent a family of aggressive soft tissue sarcomas that present in both children and adults. Pathologic risk stratification for RMS has been based on histologic subtype, with poor outcomes observed in alveolar rhabdomyosarcoma (ARMS) and the adult-type pleomorphic rhabdomyosarcoma (PRMS) compared to embryonal rhabdomyosarcoma (ERMS). Genomic sequencing studies have expanded the spectrum of RMS, with several new molecularly defined entities, including fusion-driven spindle cell/sclerosing rhabdomyosarcoma (SC/SRMS) and MYOD1-mutant SC/SRMS. Comprehensive genomic analysis has previously defined the mutational and copy number spectrum for the more common ERMS and ARMS and revealed corresponding methylation signatures. Comparatively, less is known about epigenetic correlates for the rare SC/SRMS or PRMS histologic subtypes. Herein, we present exome and RNA sequencing, copy number analysis, and methylation profiling of the largest cohort of molecularly characterized RMS samples to date. In addition to ARMS and ERMS, we identify two novel methylation subtypes, one having SC/SRMS histology and defined by MYOD1 p.L122R mutations and the other matching adult-type PRMS. Selected tumors from adolescent patients grouped with the PRMS methylation class, expanding the age range of these rare tumors. Limited follow-up data suggest that pediatric tumors with MYOD1 mutations are associated with an aggressive clinical course.


Subject(s)
Biomarkers, Tumor , DNA Methylation , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Rhabdomyosarcoma/diagnosis , Rhabdomyosarcoma/etiology , Adolescent , Adult , Aged , Child , Child, Preschool , Computational Biology/methods , DNA Copy Number Variations , Diagnosis, Differential , Disease Susceptibility , Female , Humans , Immunohistochemistry , In Situ Hybridization , Infant , Male , Middle Aged , Mutation , Rhabdomyosarcoma/therapy , Whole Genome Sequencing , Young Adult
6.
Cancer Discov ; 11(12): 3008-3027, 2021 12 01.
Article in English | MEDLINE | ID: mdl-34301788

ABSTRACT

Genomic studies of pediatric cancer have primarily focused on specific tumor types or high-risk disease. Here, we used a three-platform sequencing approach, including whole-genome sequencing (WGS), whole-exome sequencing (WES), and RNA sequencing (RNA-seq), to examine tumor and germline genomes from 309 prospectively identified children with newly diagnosed (85%) or relapsed/refractory (15%) cancers, unselected for tumor type. Eighty-six percent of patients harbored diagnostic (53%), prognostic (57%), therapeutically relevant (25%), and/or cancer-predisposing (18%) variants. Inclusion of WGS enabled detection of activating gene fusions and enhancer hijacks (36% and 8% of tumors, respectively), small intragenic deletions (15% of tumors), and mutational signatures revealing of pathogenic variant effects. Evaluation of paired tumor-normal data revealed relevance to tumor development for 55% of pathogenic germline variants. This study demonstrates the power of a three-platform approach that incorporates WGS to interrogate and interpret the full range of genomic variants across newly diagnosed as well as relapsed/refractory pediatric cancers. SIGNIFICANCE: Pediatric cancers are driven by diverse genomic lesions, and sequencing has proven useful in evaluating high-risk and relapsed/refractory cases. We show that combined WGS, WES, and RNA-seq of tumor and paired normal tissues enables identification and characterization of genetic drivers across the full spectrum of pediatric cancers. This article is highlighted in the In This Issue feature, p. 2945.


Subject(s)
Neoplasms , Child , DNA , Humans , Mutation , Neoplasms/genetics , Sequence Analysis, RNA , Exome Sequencing
7.
Virchows Arch ; 476(6): 915-920, 2020 Jun.
Article in English | MEDLINE | ID: mdl-31900635

ABSTRACT

BCOR internal tandem duplications (ITDs) and rearrangements are implicated in the oncogenesis of a subset of undifferentiated sarcomas. To date, BCOR ITD sarcomas have been exclusively found in non-appendicular infantile soft tissues, whereas BCOR-rearranged sarcomas occur in both bones and soft tissues affecting a wider patient age range. Little is known about patient outcome in BCOR ITD sarcomas. We present a BCOR-expressing, primary bone, undifferentiated sarcoma case involving an adolescent male's left tibia that, unexpectedly, harbored a BCOR ITD instead of a BCOR rearrangement. Furthermore, the patient achieved a partial histologic response after receiving a Ewing sarcoma chemotherapy regimen. Our case expands the clinical spectrum of BCOR ITD sarcomas and suggests that childhood and adult BCOR-expressing sarcomas with an undifferentiated histology should be considered for both BCOR rearrangement and ITD screening. Accurate BCOR mutation identification in undifferentiated sarcomas is essential to define their clinical spectrum and to develop effective management strategies.


Subject(s)
Biomarkers, Tumor/genetics , Bone Neoplasms/genetics , Proto-Oncogene Proteins/genetics , Repressor Proteins/genetics , Sarcoma/genetics , Adolescent , Bone Neoplasms/diagnostic imaging , Bone Neoplasms/drug therapy , Bone Neoplasms/pathology , Gene Duplication , Humans , Male , Sarcoma/diagnostic imaging , Sarcoma/drug therapy , Sarcoma/pathology , Tibia/diagnostic imaging , Tibia/pathology
8.
Pediatr Blood Cancer ; 67(2): e28047, 2020 02.
Article in English | MEDLINE | ID: mdl-31736278

ABSTRACT

PURPOSE: To estimate the absolute number of adult survivors of childhood cancer in the U.S. population who carry a pathogenic or likely pathogenic variant in a cancer predisposition gene. METHODS: Using the Surveillance, Epidemiology, and End Results (SEER) Program, we estimated the number of childhood cancer survivors on December 31, 2016 for each childhood cancer diagnosis, multiplied this by the proportion of carriers of pathogenic/likely pathogenic variants in the St. Jude Lifetime Cohort (SJLIFE) study, and projected the resulting number onto the U.S. RESULTS: Based on genome sequence data, 11.8% of 2450 SJLIFE participants carry a pathogenic/likely pathogenic variant in one of 156 cancer predisposition genes. Given this information, we estimate that 21 800 adult survivors of childhood cancer in the United States carry a pathogenic/likely pathogenic variant in one of these genes. The highest estimated absolute number of variant carriers are among survivors of central nervous system tumors (n = 4300), particularly astrocytoma (n = 1800) and other gliomas (n = 1700), acute lymphoblastic leukemia (n = 4300), and retinoblastoma (n = 3500). The most frequently mutated genes are RB1 (n = 3000), NF1 (n = 2300), and BRCA2 (n = 800). CONCLUSION: Given the increasing number of childhood cancer survivors in the United States, clinicians should counsel survivors regarding their potential genetic risk, consider referral for genetic counseling and testing, and, as appropriate, implement syndrome-specific cancer surveillance or risk-reducing measures.


Subject(s)
Cancer Survivors/statistics & numerical data , Genetic Predisposition to Disease , Germ-Line Mutation , Neoplasm Proteins/genetics , Neoplasms/mortality , Adolescent , Adult , Aged , Child , Child, Preschool , Cohort Studies , Female , Follow-Up Studies , Humans , Incidence , Infant , Infant, Newborn , Male , Middle Aged , Neoplasms/epidemiology , Neoplasms/genetics , Prognosis , Risk Factors , Survival Rate , United States/epidemiology , Young Adult
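
The projection described in the METHODS of entry 8 (survivor counts per diagnosis multiplied by the carrier proportion for that diagnosis, then summed) can be sketched as follows. This is only an illustration of the arithmetic, not the authors' code: the function name `estimate_carriers` and the per-diagnosis inputs are hypothetical placeholders, since the abstract does not report per-diagnosis survivor counts or carrier rates. The only figures taken from the abstract are 11.8% and 2450.

```python
def estimate_carriers(survivors_by_dx, carrier_rate_by_dx):
    """Sum over diagnoses of (number of survivors) x (carrier proportion).

    Both arguments are dicts keyed by diagnosis name (hypothetical layout).
    """
    return sum(n * carrier_rate_by_dx[dx] for dx, n in survivors_by_dx.items())


# Placeholder inputs for illustration only; these are NOT values from the study.
survivors = {"diagnosis_A": 1000, "diagnosis_B": 500}
rates = {"diagnosis_A": 0.10, "diagnosis_B": 0.20}
projected = estimate_carriers(survivors, rates)

# Figures stated in the abstract: 11.8% of 2450 SJLIFE participants are carriers.
sjlife_carriers = round(0.118 * 2450)
```

With the abstract's own figures, `sjlife_carriers` comes to about 289 participants; the placeholder projection simply adds the per-diagnosis products.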
9.
Genome Res ; 29(9): 1555-1565, 2019 09.
Article in English | MEDLINE | ID: mdl-31439692

ABSTRACT

Variant interpretation in the era of massively parallel sequencing is challenging. Although many resources and guidelines are available to assist with this task, few integrated end-to-end tools exist. Here, we present the Pediatric Cancer Variant Pathogenicity Information Exchange (PeCanPIE), a web- and cloud-based platform for annotation, identification, and classification of variations in known or putative disease genes. Starting from a set of variants in variant call format (VCF), variants are annotated, ranked by putative pathogenicity, and presented for formal classification using a decision-support interface based on published guidelines from the American College of Medical Genetics and Genomics (ACMG). The system can accept files containing millions of variants and handle single-nucleotide variants (SNVs), simple insertions/deletions (indels), multiple-nucleotide variants (MNVs), and complex substitutions. PeCanPIE has been applied to classify variant pathogenicity in cancer predisposition genes in two large-scale investigations involving >4000 pediatric cancer patients and serves as a repository for the expert-reviewed results. PeCanPIE was originally developed for pediatric cancer but can be easily extended for use for nonpediatric cancers and noncancer genetic diseases. Although PeCanPIE's web-based interface was designed to be accessible to non-bioinformaticians, its back-end pipelines may also be run independently on the cloud, facilitating direct integration and broader adoption. PeCanPIE is publicly available and free for research use.


Subject(s)
Computational Biology/methods , Germ-Line Mutation , Neoplasms/genetics , Child , Cloud Computing , Databases, Genetic , Genetic Predisposition to Disease , High-Throughput Nucleotide Sequencing , Humans , User-Computer Interface
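
Entry 9 lists the variant classes the system handles: SNVs, simple indels, MNVs, and complex substitutions. A minimal sketch of how such classes can be told apart from the REF and ALT alleles of a VCF record is given below. It is not PeCanPIE's implementation; the function name `classify_variant` and the rules are assumptions, and the sketch presumes a single ALT allele that differs from REF and follows the usual VCF anchor-base convention for indels.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair (hypothetical helper, not PeCanPIE code).

    Assumes ref != alt and a single, already-normalized ALT allele.
    """
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) == len(alt):
        # Same length, more than one base: multiple-nucleotide variant.
        return "MNV"
    # Simple indel: the shorter allele is a prefix of the longer one
    # (the shared leading base is the VCF anchor base).
    shorter, longer = sorted((ref, alt), key=len)
    if longer.startswith(shorter):
        return "insertion" if len(alt) > len(ref) else "deletion"
    # Length changes and bases are substituted: complex substitution.
    return "complex"


examples = [("A", "G"), ("AC", "GT"), ("A", "ATG"), ("ATG", "A"), ("ATG", "CA")]
labels = [classify_variant(r, a) for r, a in examples]
```

Real pipelines normally left-align and trim alleles before a step like this, so the same event is not labeled differently depending on how the caller wrote it.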
10.
J Clin Oncol ; 36(20): 2078-2087, 2018 07 10.
Article in English | MEDLINE | ID: mdl-29847298

ABSTRACT

PURPOSE: Childhood cancer survivors are at increased risk of subsequent neoplasms (SNs), but the germline genetic contribution is largely unknown. We assessed the contribution of pathogenic/likely pathogenic (P/LP) mutations in cancer predisposition genes to their SN risk. PATIENTS AND METHODS: Whole-genome sequencing (30-fold) was performed on samples from childhood cancer survivors who were ≥ 5 years since initial cancer diagnosis and participants in the St Jude Lifetime Cohort Study, a retrospective hospital-based study with prospective clinical follow-up. Germline mutations in 60 genes known to be associated with autosomal dominant cancer predisposition syndromes with moderate to high penetrance were classified by their pathogenicity according to the American College of Medical Genetics and Genomics guidelines. Relative rates (RRs) and 95% CIs of SN occurrence by mutation status were estimated using multivariable piecewise exponential regression stratified by radiation exposure. RESULTS: Participants were 3,006 survivors (53% male; median age, 35.8 years [range, 7.1 to 69.8 years]; 56% received radiotherapy), 1,120 SNs were diagnosed among 439 survivors (14.6%), and 175 P/LP mutations were identified in 5.8% (95% CI, 5.0% to 6.7%) of survivors. Mutations were associated with significantly increased rates of breast cancer (RR, 13.9; 95% CI, 6.0 to 32.2) and sarcoma (RR, 10.6; 95% CI, 4.3 to 26.3) among irradiated survivors and with increased rates of developing any SN (RR, 4.7; 95% CI, 2.4 to 9.3), breast cancer (RR, 7.7; 95% CI, 2.4 to 24.4), nonmelanoma skin cancer (RR, 11.0; 95% CI, 2.9 to 41.4), and two or more histologically distinct SNs (RR, 18.6; 95% CI, 3.5 to 99.3) among nonirradiated survivors. CONCLUSION: The findings support referral of all survivors for genetic counseling for potential clinical genetic testing, which should be prioritized for nonirradiated survivors with any SN and for those with breast cancer or sarcoma in the field of prior irradiation.


Subject(s)
Cancer Survivors/statistics & numerical data , Neoplasms, Second Primary/genetics , Neoplasms/genetics , Adolescent , Adult , Aged , Child , Cohort Studies , Female , Genetic Predisposition to Disease , Germ-Line Mutation , Humans , Male , Middle Aged , Neoplasms/epidemiology , Neoplasms, Second Primary/epidemiology , Retrospective Studies , Risk , United States/epidemiology , Whole Genome Sequencing , Young Adult
11.
Nucleic Acids Res ; 45(5): e31, 2017 03 17.
Article in English | MEDLINE | ID: mdl-27899577

ABSTRACT

L1 elements represent the only currently active, autonomous retrotransposon in the human genome, and they make major contributions to human genetic instability. The vast majority of the 500 000 L1 elements in the genome are defective, and only a relatively few can contribute to the retrotransposition process. However, there is currently no comprehensive approach to identify the specific loci that are actively transcribed separate from the excess of L1-related sequences that are co-transcribed within genes. We have developed RNA-Seq procedures, as well as a 1200 bp 5′ RACE product coupled with PacBio sequencing that can identify the specific L1 loci that contribute most of the L1-related RNA reads. At least 99% of L1-related sequences found in RNA do not arise from the L1 promoter, instead representing pieces of L1 incorporated in other cellular RNAs. In any given cell type a relatively few active L1 loci contribute to the 'authentic' L1 transcripts that arise from the L1 promoter, with significantly different loci seen expressed in different tissues.


Subject(s)
Chromosomes, Human/chemistry , Genetic Loci , Genome, Human , Long Interspersed Nucleotide Elements , RNA, Messenger/genetics , Transcription, Genetic , Animals , Chromosome Mapping , Chromosomes, Human/metabolism , DNA, Complementary/genetics , DNA, Complementary/metabolism , Genomic Instability , HeLa Cells , Humans , Mice , NIH 3T3 Cells , Nucleic Acid Amplification Techniques , Promoter Regions, Genetic , RNA, Messenger/metabolism , Sequence Analysis, RNA
12.
BMC Genomics ; 16: 220, 2015 Mar 21.
Article in English | MEDLINE | ID: mdl-25887476

ABSTRACT

BACKGROUND: There are over a half a million copies of L1 retroelements in the human genome which are responsible for as much as 0.5% of new human genetic diseases. Most new L1 inserts arise from young source elements that are polymorphic in the human genome. Highly active polymorphic "hot" L1 source elements have been shown to be capable of extremely high levels of mobilization and result in numerous instances of disease. Additionally, hot polymorphic L1s have been described to be highly active within numerous cancer genomes. These hot L1s result in mutagenesis by insertion of new L1 copies elsewhere in the genome, but also have been shown to generate additional full length L1 insertions which are also hot and able to further retrotranspose. Through this mechanism, hot L1s may amplify within a tumor and result in a continued cycle of mutagenesis. RESULTS AND CONCLUSIONS: We have developed a method to detect full-length, polymorphic L1 elements using a targeted next generation sequencing approach, Sequencing Identification and Mapping of Primed L1 Elements (SIMPLE). SIMPLE has 94% sensitivity and detects nearly all full-length L1 elements in a genome. SIMPLE will allow researchers to identify hot mutagenic full-length L1s as potential drivers of genome instability. Using SIMPLE we find that the typical individual has approximately 100 non-reference, polymorphic L1 elements in their genome. These elements are at relatively low population frequencies relative to previously identified polymorphic L1 elements and demonstrate the tremendous diversity in potentially active L1 elements in the human population.


Subject(s)
Long Interspersed Nucleotide Elements , Polymorphism, Genetic , Sequence Analysis, DNA/methods , Alleles , Cell Line , Chromosome Mapping , Fibroblasts/metabolism , Gene Frequency , Genetic Association Studies , Genome, Human , High-Throughput Nucleotide Sequencing , Humans
13.
PLoS Genet ; 11(3): e1005016, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25761216

ABSTRACT

Alu elements make up the largest family of human mobile elements, numbering 1.1 million copies and comprising 11% of the human genome. As a consequence of evolution and genetic drift, Alu elements of various sequence divergence exist throughout the human genome. Alu/Alu recombination has been shown to cause approximately 0.5% of new human genetic diseases and contribute to extensive genomic structural variation. To begin understanding the molecular mechanisms leading to these rearrangements in mammalian cells, we constructed Alu/Alu recombination reporter cell lines containing Alu elements ranging in sequence divergence from 0%-30% that allow detection of both Alu/Alu recombination and large non-homologous end joining (NHEJ) deletions that range from 1.0 to 1.9 kb in size. Introduction of as little as 0.7% sequence divergence between Alu elements resulted in a significant reduction in recombination, which indicates even small degrees of sequence divergence reduce the efficiency of homology-directed DNA double-strand break (DSB) repair. Further reduction in recombination was observed in a sequence divergence-dependent manner for diverged Alu/Alu recombination constructs with up to 10% sequence divergence. With greater levels of sequence divergence (15%-30%), we observed a significant increase in DSB repair due to a shift from Alu/Alu recombination to variable-length NHEJ which removes sequence between the two Alu elements. This increase in NHEJ deletions depends on the presence of Alu sequence homeology (similar but not identical sequences). Analysis of recombination products revealed that Alu/Alu recombination junctions occur more frequently in the first 100 bp of the Alu element within our reporter assay, just as they do in genomic Alu/Alu recombination events. 
This is the first extensive study characterizing the influence of Alu element sequence divergence on DNA repair, which will inform predictions regarding the effect of Alu element sequence divergence on both the rate and nature of DNA repair events.


Subject(s)
Alu Elements/genetics , DNA End-Joining Repair/genetics , Recombination, Genetic , Animals , DNA Breaks, Double-Stranded , DNA Damage/genetics , Genome, Human , Humans
14.
PLoS One ; 9(1), 2014.
Article in English | MEDLINE | ID: mdl-29364980

ABSTRACT

[This corrects the article DOI: 10.1371/journal.pone.0079402.].

15.
PLoS Genet ; 9(11): e1003925, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24244192

ABSTRACT

The Caribbean basin is home to some of the most complex interactions in recent history among previously diverged human populations. Here, we investigate the population genetic history of this region by characterizing patterns of genome-wide variation among 330 individuals from three of the Greater Antilles (Cuba, Puerto Rico, Hispaniola), two mainland (Honduras, Colombia), and three Native South American (Yukpa, Bari, and Warao) populations. We combine these data with a unique database of genomic variation in over 3,000 individuals from diverse European, African, and Native American populations. We use local ancestry inference and tract length distributions to test different demographic scenarios for the pre- and post-colonial history of the region. We develop a novel ancestry-specific PCA (ASPCA) method to reconstruct the sub-continental origin of Native American, European, and African haplotypes from admixed genomes. We find that the most likely source of the indigenous ancestry in Caribbean islanders is a Native South American component shared among inland Amazonian tribes, Central America, and the Yucatan peninsula, suggesting extensive gene flow across the Caribbean in pre-Columbian times. We find evidence of two pulses of African migration. The first pulse--which today is reflected by shorter, older ancestry tracts--consists of a genetic component more similar to coastal West African regions involved in early stages of the trans-Atlantic slave trade. The second pulse--reflected by longer, younger tracts--is more similar to present-day West-Central African populations, supporting historical records of later transatlantic deportation. Surprisingly, we also identify a Latino-specific European component that has significantly diverged from its parental Iberian source populations, presumably as a result of small European founder population size. 
We demonstrate that the ancestral components in admixed genomes can be traced back to distinct sub-continental source populations with far greater resolution than previously thought, even when limited pre-Columbian Caribbean haplotypes have survived.


Subject(s)
Black People/genetics , Gene Flow , Genetics, Population , Indians, North American/genetics , White People/genetics , Caribbean Region , DNA, Mitochondrial/genetics , Demography , Genomics , Haplotypes , Hispanic or Latino/genetics , Humans
16.
PLoS One ; 8(11): e79402, 2013.
Article in English | MEDLINE | ID: mdl-24244495

ABSTRACT

Retrotransposons comprise approximately half of the human genome and contribute to chromatin structure, regulatory motifs, and protein-coding sequences. Since retrotransposon insertions can disrupt functional genetic elements as well as introduce new sequence motifs to a region, they have the potential to affect the function of genes that harbour insertions as well as those nearby. Partly as a result of these effects, the distribution of retrotransposons across the genome is non-uniform and there are observed imbalances in the orientation of insertions with respect to the transcriptional direction of the containing gene. Although some of the factors underlying the observed distributions are understood, much of the variability remains unexplained. Detailed characterization of retrotransposon density in genes could help inform predictions of the functional consequence of de novo as well as polymorphic insertions. In order to characterize the relationship between genes and inserted elements, we have examined the distribution of retrotransposons and their internal motifs within tissue-specific and housekeeping genes. We have identified that the previously established retrotransposon antisense bias decays at a linear rate across genes, resulting in an equal density of sense and antisense retrotransposons near the 3'-UTR. In addition, the decay of antisense bias across genes is less pronounced among tissue-specific genes. Our results provide support for the scenario in which this linear decay in antisense bias is established by natural selection shortly after retrotransposon integration, and that total antisense bias observed is above and beyond any bias introduced by the integration process itself. Finally, we provide an example of a retrotransposon acting as an eQTL on a coincident gene, highlighting one of several possible avenues through which insertions may modulate gene function.


Subject(s)
RNA, Antisense , Retroelements/genetics , 5' Untranslated Regions , Alu Elements/genetics , Evolution, Molecular , Exons , Gene Frequency , Humans , Long Interspersed Nucleotide Elements/genetics , Nucleotide Motifs , Organ Specificity/genetics , Polymorphism, Genetic , Quantitative Trait Loci
17.
Nat Rev Cardiol ; 10(9): 531-47, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23900355

ABSTRACT

Remarkable progress has been made in understanding the genetic basis of dilated cardiomyopathy (DCM). Rare variants in >30 genes, some also involved in other cardiomyopathies, muscular dystrophy, or syndromic disease, perturb a diverse set of important myocardial proteins to produce a final DCM phenotype. Large, publicly available datasets have provided the opportunity to evaluate previously identified DCM-causing mutations, and to examine the population frequency of sequence variants similar to those that have been observed to cause DCM. The frequency of these variants, whether associated with dilated or hypertrophic cardiomyopathy, is greater than estimates of disease prevalence. This mismatch might be explained by one or more of the following possibilities: that the penetrance of DCM-causing mutations is lower than previously thought, that some variants are noncausal, that DCM prevalence is higher than previously estimated, or that other more-complex genomics underlie DCM. Reassessment of our assumptions about the complexity of the genomic and phenomic architecture of DCM is warranted. Much about the genomic basis of DCM remains to be investigated, which will require comprehensive genomic studies in much larger cohorts of rigorously phenotyped probands and family members than previously examined.


Subject(s)
Cardiomyopathy, Dilated/genetics , Mutation , Animals , Arrhythmogenic Right Ventricular Dysplasia/genetics , Arrhythmogenic Right Ventricular Dysplasia/physiopathology , Cardiomyopathy, Dilated/classification , Cardiomyopathy, Dilated/physiopathology , Cardiomyopathy, Hypertrophic, Familial/genetics , Cardiomyopathy, Hypertrophic, Familial/physiopathology , Genetic Predisposition to Disease , Genomics/methods , Heredity , Humans , Pedigree , Phenotype , Risk Factors , Terminology as Topic
18.
Circ Cardiovasc Genet ; 6(2): 144-53, 2013 Apr.
Article in English | MEDLINE | ID: mdl-23418287

ABSTRACT

BACKGROUND: Familial dilated cardiomyopathy (DCM) is a genetically heterogeneous disease with >30 known genes. TTN truncating variants were recently implicated in a candidate gene study to cause 25% of familial and 18% of sporadic DCM cases. METHODS AND RESULTS: We used an unbiased genome-wide approach, combining linkage analysis and variant filtering across the exome sequences of 48 individuals affected with DCM from 17 families, to identify the genetic cause. Linkage analysis ranked the TTN region as falling under the second highest genome-wide multipoint linkage peak (multipoint logarithm of odds, 1.59). We identified 6 TTN truncating variants carried by individuals affected with DCM in 7 of 17 DCM families (logarithm of odds, 2.99); 2 of these 7 families also had novel missense variants that segregated with disease. Two additional novel truncating TTN variants did not segregate with DCM. Nucleotide diversity at the TTN locus, including missense variants, was comparable with 5 other known DCM genes. The average number of missense variants in the exome sequences from the DCM cases or the ≈5400 cases from the Exome Sequencing Project was ≈23 per individual. The average number of TTN truncating variants in the Exome Sequencing Project was 0.014 per individual. We also identified a region (chr9q21.11-q22.31) with no known DCM genes with a maximum heterogeneity logarithm of odds score of 1.74. CONCLUSIONS: These data suggest that TTN truncating variants contribute to DCM cause. However, the lack of segregation of all identified TTN truncating variants illustrates the challenge of determining variant pathogenicity even with full exome sequencing.


Subject(s)
Cardiomyopathy, Dilated/genetics , Exome/genetics , Genome, Human , Muscle Proteins/genetics , Protein Kinases/genetics , Adolescent , Adult , Aged , Cardiomyopathy, Dilated/metabolism , Cardiomyopathy, Dilated/pathology , Chromosomes, Human, Pair 9 , Connectin , Female , Genetic Heterogeneity , Genetic Linkage , Genetic Loci , Humans , Male , Middle Aged , Mutation, Missense , Odds Ratio , Pedigree , Sequence Analysis, DNA , Young Adult
19.
Ann Hum Genet ; 77(1): 9-21, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23130936

ABSTRACT

Despite the increasing speculation that oxidative stress and abnormal energy metabolism may play a role in Autism Spectrum Disorders (ASD), and the observation that patients with mitochondrial defects have symptoms consistent with ASD, there are no comprehensive published studies examining the role of mitochondrial variation in autism. Therefore, we have sought to comprehensively examine the role of mitochondrial DNA (mtDNA) variation with regard to ASD risk, employing a multi-phase approach. In phase 1 of our experiment, we examined 132 mtDNA single-nucleotide polymorphisms (SNPs) genotyped as part of our genome-wide association studies of ASD. In phase 2, we genotyped the major European mitochondrial haplogroup-defining variants within an expanded set of autism probands and controls. Finally, in phase 3, we resequenced the entire mtDNA in a subset of our Caucasian samples (∼400 proband-father pairs). In each phase we tested whether mitochondrial variation showed evidence of association with ASD. Despite a thorough interrogation of mtDNA variation, we found no evidence to suggest a major role for mtDNA variation in ASD susceptibility. Accordingly, while there may be attractive biological hints suggesting a role for mitochondria in ASD, our data indicate that mtDNA variation is not a major contributing factor to the development of ASD.


Subject(s)
Child Development Disorders, Pervasive/genetics , DNA, Mitochondrial/genetics , Genetic Variation , Adolescent , Adult , Child , Child, Preschool , Genome-Wide Association Study , Haplotypes , Humans , Mutation , Polymorphism, Single Nucleotide , Young Adult
20.
PLoS Genet ; 8(8): e1002842, 2012.
Article in English | MEDLINE | ID: mdl-22912586

ABSTRACT

Alu elements are trans-mobilized by the autonomous non-LTR retroelement, LINE-1 (L1). Alu-induced insertion mutagenesis contributes to about 0.1% of human genetic disease and is responsible for the majority of the documented instances of human retroelement insertion-induced disease. Here we introduce a SINE recovery method that provides a complementary approach for comprehensive analysis of the impact and biological mechanisms of Alu retrotransposition. Using this approach, we recovered 226 de novo tagged Alu inserts in HeLa cells. Our analysis reveals that in human cells marked Alu inserts driven by either exogenously supplied full length L1 or ORF2 protein are indistinguishable. Four percent of de novo Alu inserts were associated with genomic deletions and rearrangements and lacked the hallmarks of retrotransposition. In contrast to L1 inserts, 5' truncations of Alu inserts are rare, as most of the recovered inserts (96.5%) are full length. De novo Alus show a random pattern of insertion across chromosomes, but further characterization revealed an Alu insertion bias favoring insertion near other SINEs and highly conserved elements, with almost 60% landing within genes. De novo Alu inserts show no evidence of RNA editing. Priming for reverse transcription rarely occurred within the first 20 bp (most 5') of the A-tail. The A-tails of recovered inserts show significant expansion, with many at least doubling in length. Sequence manipulation of the construct led to the demonstration that the A-tail expansion likely occurs during insertion due to slippage by the L1 ORF2 protein. We postulate that the A-tail expansion directly impacts Alu evolution by reintroducing new active source elements to counteract the natural loss of active Alus and minimizing Alu extinction.


Subject(s)
Alu Elements/genetics , Long Interspersed Nucleotide Elements/genetics , Mutagenesis, Insertional , Terminal Repeat Sequences/genetics , 3' Flanking Region , 5' Flanking Region , Base Sequence , Endonucleases/genetics , Endonucleases/metabolism , Evolution, Molecular , Exons , Genome, Human , HeLa Cells , Humans , Introns , Molecular Sequence Data , RNA-Directed DNA Polymerase/genetics , RNA-Directed DNA Polymerase/metabolism , Reverse Transcription