Results 1-20 of 45
1.
Nat Rev Genet; 25(7): 476-499, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38467784

ABSTRACT

Short tandem repeats (STRs) are a class of repetitive elements, composed of tandem arrays of 1-6 base pair sequence motifs, that comprise a substantial fraction of the human genome. STR expansions can cause a wide range of neurological and neuromuscular conditions, known as repeat expansion disorders, whose age of onset, severity, penetrance and/or clinical phenotype are influenced by the length of the repeats and their sequence composition. The presence of non-canonical motifs, depending on the type, frequency and position within the repeat tract, can alter clinical outcomes by modifying somatic and intergenerational repeat stability, gene expression and mutant transcript-mediated and/or protein-mediated toxicities. Here, we review the diverse structural conformations of repeat expansions, technological advances for the characterization of changes in sequence composition, their clinical correlations and the impact on disease mechanisms.


Subject(s)
Microsatellite Repeats, Humans, Microsatellite Repeats/genetics, DNA Repeat Expansion/genetics, Human Genome
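
As an illustration of the definition in this abstract (tandem arrays of 1-6 base pair motifs), the following is a minimal sketch of how such arrays might be located in a DNA string. The function name `find_strs`, the `min_copies` threshold of 3, and the example sequence are assumptions made for illustration; they are not taken from the article, and dedicated STR genotyping tools work quite differently.

```python
import re

def find_strs(seq, min_motif=1, max_motif=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem runs.

    min_copies is an arbitrary illustrative threshold, not a published cutoff.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_motif, max_motif + 1):
        # A k-mer followed by at least (min_copies - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_copies - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit (e.g. "AA").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)

print(find_strs("TTCAGCAGCAGCAGTT"))  # [(2, 'CAG', 4)]
```

The sketch only reports perfect repeats; interrupted tracts of the kind the review discusses (non-canonical motifs inside a repeat) would be split into separate runs or missed.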
2.
Cell; 158(5): 1187-1198, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25171416

ABSTRACT

Programmed DNA rearrangements in the single-celled eukaryote Oxytricha trifallax completely rewire its germline into a somatic nucleus during development. This elaborate, RNA-mediated pathway eliminates noncoding DNA sequences that interrupt gene loci and reorganizes the remaining fragments by inversions and permutations to produce functional genes. Here, we report the Oxytricha germline genome and compare it to the somatic genome to present a global view of its massive scale of genome rearrangements. The remarkably encrypted genome architecture contains >3,500 scrambled genes, as well as >800 predicted germline-limited genes expressed, and some posttranslationally modified, during genome rearrangements. Gene segments for different somatic loci often interweave with each other. Single gene segments can contribute to multiple, distinct somatic loci. Terminal precursor segments from neighboring somatic loci map extremely close to each other, often overlapping. This genome assembly provides a draft of a scrambled genome and a powerful model for studies of genome rearrangement.


Subject(s)
Gene Rearrangement, Protozoan Genome, Oxytricha/growth & development, Oxytricha/genetics, Cell Nucleus/metabolism, Chromosomes/metabolism, Molecular Sequence Data, Oxytricha/cytology, Oxytricha/metabolism
3.
Nature; 613(7942): 96-102, 2023 01.
Article in English | MEDLINE | ID: mdl-36517591

ABSTRACT

Expansion of a single repetitive DNA sequence, termed a tandem repeat (TR), is known to cause more than 50 diseases [1,2]. However, repeat expansions are often not explored beyond neurological and neurodegenerative disorders. In some cancers, mutations accumulate in short tracts of TRs, a phenomenon termed microsatellite instability; however, larger repeat expansions have not been systematically analysed in cancer [3-8]. Here we identified TR expansions in 2,622 cancer genomes spanning 29 cancer types. In seven cancer types, we found 160 recurrent repeat expansions (rREs), most of which (155/160) were subtype specific. We found that rREs were non-uniformly distributed in the genome with enrichment near candidate cis-regulatory elements, suggesting a potential role in gene regulation. One rRE, a GAAA-repeat expansion, located near a regulatory element in the first intron of UGT2B7 was detected in 34% of renal cell carcinoma samples and was validated by long-read DNA sequencing. Moreover, in preliminary experiments, treating cells that harbour this rRE with a GAAA-targeting molecule led to a dose-dependent decrease in cell proliferation. Overall, our results suggest that rREs may be an important but unexplored source of genetic variation in human cancer, and we provide a comprehensive catalogue for further study.


Subject(s)
DNA Repeat Expansion, Human Genome, Neoplasms, Humans, Base Sequence, DNA Repeat Expansion/genetics, Human Genome/genetics, Neoplasms/classification, Neoplasms/genetics, Neoplasms/pathology, DNA Sequence Analysis, Gene Expression Regulation, Transcriptional Regulatory Elements/genetics, Introns/genetics, Renal Cell Carcinoma/genetics, Renal Cell Carcinoma/pathology, Cell Proliferation/drug effects, Reproducibility of Results
4.
Cell; 152(3): 406-16, 2013 Jan 31.
Article in English | MEDLINE | ID: mdl-23374338

ABSTRACT

Ciliates are an ancient and diverse group of microbial eukaryotes that have emerged as powerful models for RNA-mediated epigenetic inheritance. They possess extensive sets of both tiny and long noncoding RNAs that, together with a suite of proteins that includes transposases, orchestrate a broad cascade of genome rearrangements during somatic nuclear development. This Review emphasizes three important themes: the remarkable role of RNA in shaping genome structure, recent discoveries that unify many deeply diverged ciliate genetic systems, and a surprising evolutionary "sign change" in the role of small RNAs between major species groups.


Subject(s)
Biological Evolution, Ciliophora/genetics, Genomic Instability, Protozoan RNA/genetics, Untranslated RNA/genetics, Protozoan Genome, Long Noncoding RNA/genetics
5.
Am J Hum Genet; 110(1): 105-119, 2023 01 05.
Article in English | MEDLINE | ID: mdl-36493768

ABSTRACT

Adult-onset cerebellar ataxias are a group of neurodegenerative conditions that challenge both genetic discovery and molecular diagnosis. In this study, we identified an intronic (GAA) repeat expansion in fibroblast growth factor 14 (FGF14). Genetic analysis of 95 Australian individuals with adult-onset ataxia identified four (4.2%) with (GAA)>300 and a further nine individuals with (GAA)>250. PCR and long-read sequence analysis revealed these were pure (GAA) repeats. In comparison, no control subjects had (GAA)>300 and only 2/311 control individuals (0.6%) had a pure (GAA)>250. In a German validation cohort, 9/104 (8.7%) of affected individuals had (GAA)>335 and a further six had (GAA)>250, whereas 10/190 (5.3%) control subjects had (GAA)>250 but none were (GAA)>335. The combined data suggest (GAA)>335 are disease causing and fully penetrant (p = 6.0 × 10^-8, OR = 72 [95% CI = 4.3-1,227]), while (GAA)>250 is likely pathogenic with reduced penetrance. Affected individuals had an adult-onset, slowly progressive cerebellar ataxia with variable features including vestibular impairment, hyper-reflexia, and autonomic dysfunction. A negative correlation between age at onset and repeat length was observed (R^2 = 0.44, p = 0.00045, slope = -0.12) and identification of a shared haplotype in a minority of individuals suggests that the expansion can be inherited or generated de novo during meiotic division. This study demonstrates the power of genome sequencing and advanced bioinformatic tools to identify novel repeat expansions via model-free, genome-wide analysis and identifies SCA50/ATX-FGF14 as a frequent cause of adult-onset ataxia.


Subject(s)
Cerebellar Ataxia, Fibroblast Growth Factors, Friedreich Ataxia, Trinucleotide Repeat Expansion, Adult, Humans, Ataxia/genetics, Australia, Cerebellar Ataxia/genetics, Friedreich Ataxia/genetics, Trinucleotide Repeat Expansion/genetics
6.
Nature; 586(7828): 292-298, 2020 10.
Article in English | MEDLINE | ID: mdl-32999459

ABSTRACT

The RecQ DNA helicase WRN is a synthetic lethal target for cancer cells with microsatellite instability (MSI), a form of genetic hypermutability that arises from impaired mismatch repair [1-4]. Depletion of WRN induces widespread DNA double-strand breaks in MSI cells, leading to cell cycle arrest and/or apoptosis. However, the mechanism by which WRN protects MSI-associated cancers from double-strand breaks remains unclear. Here we show that TA-dinucleotide repeats are highly unstable in MSI cells and undergo large-scale expansions, distinct from previously described insertion or deletion mutations of a few nucleotides [5]. Expanded TA repeats form non-B DNA secondary structures that stall replication forks, activate the ATR checkpoint kinase, and require unwinding by the WRN helicase. In the absence of WRN, the expanded TA-dinucleotide repeats are susceptible to cleavage by the MUS81 nuclease, leading to massive chromosome shattering. These findings identify a distinct biomarker that underlies the synthetic lethal dependence on WRN, and support the development of therapeutic agents that target WRN for MSI-associated cancers.


Subject(s)
Double-Stranded DNA Breaks, DNA Repeat Expansion/genetics, Dinucleotide Repeats/genetics, Neoplasms/genetics, Werner Syndrome Helicase/metabolism, Ataxia Telangiectasia Mutated Proteins/metabolism, Tumor Cell Line, Human Chromosomes/genetics, Human Chromosomes/metabolism, Chromothripsis, DNA Cleavage, DNA Replication, DNA-Binding Proteins/metabolism, Endodeoxyribonucleases/metabolism, Endonucleases/metabolism, Genomic Instability, Humans, Recombinases/metabolism
7.
Nature; 586(7827): 80-86, 2020 10.
Article in English | MEDLINE | ID: mdl-32717741

ABSTRACT

Tandem DNA repeats vary in the size and sequence of each unit (motif). When expanded, these tandem DNA repeats have been associated with more than 40 monogenic disorders [1]. Their involvement in disorders with complex genetics is largely unknown, as is the extent of their heterogeneity. Here we investigated the genome-wide characteristics of tandem repeats that had motifs with a length of 2-20 base pairs in 17,231 genomes of families containing individuals with autism spectrum disorder (ASD) [2,3] and population control individuals [4]. We found extensive polymorphism in the size and sequence of motifs. Many of the tandem repeat loci that we detected correlated with cytogenetic fragile sites. At 2,588 loci, gene-associated expansions of tandem repeats that were rare among population control individuals were significantly more prevalent among individuals with ASD than their siblings without ASD, particularly in exons and near splice junctions, and in genes related to the development of the nervous system and cardiovascular system or muscle. Rare tandem repeat expansions had a prevalence of 23.3% in children with ASD compared with 20.7% in children without ASD, which suggests that tandem repeat expansions make a collective contribution to the risk of ASD of 2.6%. These rare tandem repeat expansions included previously undescribed ASD-linked expansions in DMPK and FXN, which are associated with neuromuscular conditions, and in previously unknown loci such as FGF14 and CACNB1. Rare tandem repeat expansions were associated with lower IQ and adaptive ability. Our results show that tandem DNA repeat expansions contribute strongly to the genetic aetiology and phenotypic complexity of ASD.


Subject(s)
Autism Spectrum Disorder/genetics, DNA Repeat Expansion/genetics, Human Genome/genetics, Genomics, Tandem Repeat Sequences/genetics, Female, Fibroblast Growth Factors/genetics, Genetic Predisposition to Disease, Humans, Intelligence/genetics, Iron-Binding Proteins/genetics, Male, Myotonin-Protein Kinase/genetics, Nucleotide Motifs, Genetic Polymorphism, Frataxin
8.
Hum Mutat; 43(7): 859-868, 2022 07.
Article in English | MEDLINE | ID: mdl-35395114

ABSTRACT

Expansions of short tandem repeats (STRs) have been implicated as the causal variant in over 50 diseases known to date. There are several tools which can genotype STRs from high-throughput sequencing (HTS) data. However, running these tools out of the box only allows around half of the known disease-causing loci to be genotyped. Furthermore, the genotypes estimated at these loci are often underestimated with maximum lengths limited to either the read or fragment length, which is less than the pathogenic cutoff for some diseases. Although analysis tools can be customized to genotype extra loci, this requires proficiency in bioinformatics to set up, limiting their widespread usage by other researchers and clinicians. To address these issues, we have developed new software called STRipy, which is able to target all known disease-causing STRs from HTS data. We created an intuitive graphical interface for STRipy and significantly simplified the detection of STR expansions. Moreover, we genotyped all disease loci for over two and a half thousand samples to provide population-wide distributions to assist with interpretation of results. We believe the simplicity and breadth of STRipy will increase the genotyping of STRs in sequencing data resulting in further diagnoses of rare STR diseases.


Subject(s)
High-Throughput Nucleotide Sequencing, Microsatellite Repeats, Computational Biology, Genotype, High-Throughput Nucleotide Sequencing/methods, Humans, Microsatellite Repeats/genetics, Software
9.
Mol Biol Evol; 38(3): 927-939, 2021 03 09.
Article in English | MEDLINE | ID: mdl-33022053

ABSTRACT

A major challenge in modern biology is understanding how the effects of short-term biological responses influence long-term evolutionary adaptation, defined as a genetically determined increase in fitness to novel environments. This is particularly important in globally important microbes experiencing rapid global change, due to their influence on food webs, biogeochemical cycles, and climate. Epigenetic modifications like methylation have been demonstrated to influence short-term plastic responses, which ultimately impact long-term adaptive responses to environmental change. However, there remains a paucity of empirical research examining long-term methylation dynamics during environmental adaptation in nonmodel, ecologically important microbes. Here, we show the first empirical evidence in a marine prokaryote for long-term m5C methylome modifications correlated with phenotypic adaptation to CO2, using a 7-year evolution experiment (1,000+ generations) with the biogeochemically important marine cyanobacterium Trichodesmium. We identify m5C methylated sites that rapidly changed in response to high (750 µatm) CO2 exposure and were maintained for at least 4.5 years of CO2 selection. After 7 years of CO2 selection, however, m5C methylation levels that initially responded to high-CO2 returned to ancestral, ambient CO2 levels. Concurrently, high-CO2 adapted growth and N2 fixation rates remained significantly higher than those of ambient CO2 adapted cell lines irrespective of CO2 concentration, a trend consistent with genetic assimilation theory. These data demonstrate the maintenance of CO2-responsive m5C methylation for 4.5 years alongside phenotypic adaptation before returning to ancestral methylation levels. These observations in a globally distributed marine prokaryote provide critical evolutionary insights into biogeochemically important traits under global change.


Subject(s)
Biological Adaptation, Biological Evolution, Carbon Dioxide/physiology, DNA Methylation, Trichodesmium/genetics, Epigenome, Phenotype, Genetic Transcription
10.
Am J Hum Genet; 104(6): 1116-1126, 2019 06 06.
Article in English | MEDLINE | ID: mdl-31104771

ABSTRACT

Huntington disease (HD) is caused by a CAG repeat expansion in the huntingtin (HTT) gene. Although the length of this repeat is inversely correlated with age of onset (AOO), it does not fully explain the variability in AOO. We assessed the sequence downstream of the CAG repeat in HTT [reference: (CAG)n-CAA-CAG], since variants within this region have been previously described, but no study of AOO has been performed. These analyses identified a variant that results in complete loss of interrupting (LOI) adenine nucleotides in this region [(CAG)n-CAG-CAG]. Analysis of multiple HD pedigrees showed that this LOI variant is associated with dramatically earlier AOO (average of 25 years) despite the same polyglutamine length as in individuals with the interrupting penultimate CAA codon. This LOI allele is particularly frequent in persons with reduced penetrance alleles who manifest with HD and increases the likelihood of presenting clinically with HD with a CAG of 36-39 repeats. Further, we show that the LOI variant is associated with increased somatic repeat instability, highlighting this as a significant driver of this effect. These findings indicate that the number of uninterrupted CAG repeats, which is lengthened by the LOI, is the most significant contributor to AOO of HD and is more significant than polyglutamine length, which is not altered in these individuals. In addition, we identified another variant in this region, where the CAA-CAG sequence is duplicated, which was associated with later AOO. Identification of these cis-acting modifiers has potentially important implications for genetic counselling in HD-affected families.


Subject(s)
Codon/genetics, Huntington Disease/genetics, Huntington Disease/pathology, Peptides/genetics, Trinucleotide Repeat Expansion/genetics, Adolescent, Adult, Age of Onset, Child, Female, Humans, Male, Middle Aged, Pedigree
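
The distinction this abstract draws between polyglutamine length and the number of uninterrupted CAG repeats can be made concrete with a small sketch. It relies on the fact that CAG and CAA both encode glutamine. The function name `cag_tract_summary` and the choice of n = 40 are hypothetical and purely illustrative; they are not values or methods from the study.

```python
def cag_tract_summary(tract):
    """Summarise an in-frame, coding-strand tract such as 'CAG' * 40 + 'CAACAG'."""
    codons = [tract[i:i + 3] for i in range(0, len(tract), 3)]

    # CAG and CAA both encode glutamine, so both count towards polyglutamine length.
    polyq = 0
    for codon in codons:
        if codon not in ("CAG", "CAA"):
            break
        polyq += 1

    # The uninterrupted run stops at the first codon that is not CAG.
    pure = 0
    for codon in codons:
        if codon != "CAG":
            break
        pure += 1

    return {"polyQ": polyq, "uninterrupted_CAG": pure}

n = 40  # illustrative repeat count only
canonical = "CAG" * n + "CAA" + "CAG"  # reference: (CAG)n-CAA-CAG
loi = "CAG" * n + "CAG" + "CAG"        # loss of interruption: (CAG)n-CAG-CAG

print(cag_tract_summary(canonical))  # {'polyQ': 42, 'uninterrupted_CAG': 40}
print(cag_tract_summary(loi))        # {'polyQ': 42, 'uninterrupted_CAG': 42}
```

Both alleles give the same polyglutamine length, while the loss-of-interruption allele has two more uninterrupted CAG repeats, which is the quantity the abstract reports as the stronger contributor to age of onset.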
11.
Am J Hum Genet; 105(1): 151-165, 2019 07 03.
Article in English | MEDLINE | ID: mdl-31230722

ABSTRACT

Genomic technologies such as next-generation sequencing (NGS) are revolutionizing molecular diagnostics and clinical medicine. However, these approaches have proven inefficient at identifying pathogenic repeat expansions. Here, we apply a collection of bioinformatics tools that can be utilized to identify either known or novel expanded repeat sequences in NGS data. We performed genetic studies of a cohort of 35 individuals from 22 families with a clinical diagnosis of cerebellar ataxia with neuropathy and bilateral vestibular areflexia syndrome (CANVAS). Analysis of whole-genome sequence (WGS) data with five independent algorithms identified a recessively inherited intronic repeat expansion [(AAGGG)exp] in the gene encoding Replication Factor C1 (RFC1). This motif, not reported in the reference sequence, localized to an Alu element and replaced the reference (AAAAG)11 short tandem repeat. Genetic analyses confirmed the pathogenic expansion in 18 of 22 CANVAS-affected families and identified a core ancestral haplotype, estimated to have arisen in Europe more than twenty-five thousand years ago. WGS of the four RFC1-negative CANVAS-affected families identified plausible variants in three, with genomic re-diagnosis of SCA3, spastic ataxia of the Charlevoix-Saguenay type, and SCA45. This study identified the genetic basis of CANVAS and demonstrated that these improved bioinformatics tools increase the diagnostic utility of WGS to determine the genetic basis of a heterogeneous group of clinically overlapping neurogenetic disorders.


Subject(s)
Cerebellar Ataxia/etiology, Computational Biology/methods, Introns, Microsatellite Repeats, Polyneuropathies/etiology, Replication Protein C/genetics, Sensation Disorders/etiology, Vestibular Diseases/etiology, Algorithms, Cerebellar Ataxia/pathology, Cohort Studies, Family, Female, Genomics, Humans, Male, Middle Aged, Polyneuropathies/pathology, Sensation Disorders/pathology, Syndrome, Vestibular Diseases/pathology, Whole Genome Sequencing
12.
N Engl J Med; 380(15): 1433-1441, 2019 04 11.
Article in English | MEDLINE | ID: mdl-30970188

ABSTRACT

We report an inborn error of metabolism caused by an expansion of a GCA-repeat tract in the 5' untranslated region of the gene encoding glutaminase (GLS) that was identified through detailed clinical and biochemical phenotyping, combined with whole-genome sequencing. The expansion was observed in three unrelated patients who presented with an early-onset delay in overall development, progressive ataxia, and elevated levels of glutamine. In addition to ataxia, one patient also showed cerebellar atrophy. The expansion was associated with a relative deficiency of GLS messenger RNA transcribed from the expanded allele, which probably resulted from repeat-mediated chromatin changes upstream of the GLS repeat. Our discovery underscores the importance of careful examination of regions of the genome that are typically excluded from or poorly captured by exome sequencing.


Subject(s)
Inborn Errors of Amino Acid Metabolism/genetics, Ataxia/genetics, Developmental Disabilities/genetics, Glutaminase/deficiency, Glutaminase/genetics, Glutamine/metabolism, Microsatellite Repeats, Mutation, Atrophy/genetics, Cerebellum/pathology, Preschool Child, Female, Genotype, Glutamine/analysis, Humans, Male, Phenotype, Polymerase Chain Reaction, Whole Genome Sequencing
14.
Genome Res; 27(11): 1895-1903, 2017 11.
Article in English | MEDLINE | ID: mdl-28887402

ABSTRACT

Identifying large expansions of short tandem repeats (STRs), such as those that cause amyotrophic lateral sclerosis (ALS) and fragile X syndrome, is challenging for short-read whole-genome sequencing (WGS) data. A solution to this problem is an important step toward integrating WGS into precision medicine. We developed a software tool called ExpansionHunter that, using PCR-free WGS short-read data, can genotype repeats at the locus of interest, even if the expanded repeat is larger than the read length. We applied our algorithm to WGS data from 3001 ALS patients who have been tested for the presence of the C9orf72 repeat expansion with repeat-primed PCR (RP-PCR). Compared against this truth data, ExpansionHunter correctly classified all (212/212, 95% CI [0.98, 1.00]) of the expanded samples as either expansions (208) or potential expansions (4). Additionally, 99.9% (2786/2789, 95% CI [0.997, 1.00]) of the wild-type samples were correctly classified as wild type by this method with the remaining three samples identified as possible expansions. We further applied our algorithm to a set of 152 samples in which every sample had one of eight different pathogenic repeat expansions, including those associated with fragile X syndrome, Friedreich's ataxia, and Huntington's disease, and correctly flagged all but one of the known repeat expansions. Thus, ExpansionHunter can be used to accurately detect known pathogenic repeat expansions and provides researchers with a tool that can be used to identify new pathogenic repeat expansions.


Subject(s)
Amyotrophic Lateral Sclerosis/genetics, DNA Repeat Expansion, Whole Genome Sequencing/methods, Algorithms, C9orf72 Protein/genetics, Genetic Databases, Humans, Precision Medicine, Sensitivity and Specificity, Software
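
The classification figures in this abstract can be rechecked with a few lines of arithmetic. The sketch below recomputes the two proportions from the reported counts. The abstract does not say how the confidence intervals were computed; as an assumption, the helper uses the closed-form exact (Clopper-Pearson) lower limit that applies when every trial succeeds, which gives a value matching the reported lower bound of 0.98 for 212/212. The variable and function names are hypothetical.

```python
# Counts as reported in the abstract.
expanded_total = 212        # RP-PCR-positive samples
expanded_flagged = 208 + 4  # called as expansions + potential expansions
wild_total = 2789           # RP-PCR-negative samples
wild_correct = 2786         # called as wild type

# The two groups should add up to the 3001 patients analysed.
assert expanded_total + wild_total == 3001

sensitivity = expanded_flagged / expanded_total
specificity = wild_correct / wild_total

def exact_lower_limit_all_successes(n, alpha=0.05):
    """Two-sided Clopper-Pearson lower limit when all n trials succeed.

    Assumption: the abstract does not state which interval method was used.
    """
    return (alpha / 2) ** (1 / n)

print(f"sensitivity = {sensitivity:.3f}")  # 1.000
print(f"specificity = {specificity:.3f}")  # 0.999
print(f"lower 95% limit for 212/212 = {exact_lower_limit_all_successes(212):.2f}")  # 0.98
```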
15.
Bioinformatics; 35(22): 4754-4756, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31134279

ABSTRACT

SUMMARY: We describe a novel computational method for genotyping repeats using sequence graphs. This method addresses the long-standing need to accurately genotype medically important loci containing repeats adjacent to other variants or imperfect DNA repeats such as polyalanine repeats. Here we introduce a new version of our repeat genotyping software, ExpansionHunter, that uses this method to perform targeted genotyping of a broad class of such loci. AVAILABILITY AND IMPLEMENTATION: ExpansionHunter is implemented in C++ and is available under the Apache License Version 2.0. The source code, documentation, and Linux/macOS binaries are available at https://github.com/Illumina/ExpansionHunter/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Microsatellite Repeats, Software, Genotype
16.
Genet Med; 21(5): 1121-1130, 2019 05.
Article in English | MEDLINE | ID: mdl-30293986

ABSTRACT

PURPOSE: Current diagnostic testing for genetic disorders involves serial use of specialized assays spanning multiple technologies. In principle, genome sequencing (GS) can detect all genomic pathogenic variant types on a single platform. Here we evaluate copy-number variant (CNV) calling as part of a clinically accredited GS test. METHODS: We performed analytical validation of CNV calling on 17 reference samples, compared the sensitivity of GS-based variants with those from a clinical microarray, and set a bound on precision using orthogonal technologies. We developed a protocol for family-based analysis of GS-based CNV calls, and deployed this across a clinical cohort of 79 rare and undiagnosed cases. RESULTS: We found that CNV calls from GS are at least as sensitive as those from microarrays, while only creating a modest increase in the number of variants interpreted (~10 CNVs per case). We identified clinically significant CNVs in 15% of the first 79 cases analyzed, all of which were confirmed by an orthogonal approach. The pipeline also enabled discovery of a uniparental disomy (UPD) and a 50% mosaic trisomy 14. Directed analysis of select CNVs enabled breakpoint level resolution of genomic rearrangements and phasing of de novo CNVs. CONCLUSION: Robust identification of CNVs by GS is possible within a clinical testing environment.


Subject(s)
DNA Copy Number Variations/genetics, Rare Diseases/genetics, Undiagnosed Diseases/genetics, Adolescent, Child, Preschool Child, Chromosome Mapping/methods, Cohort Studies, Female, Genetic Testing/methods, Human Genome, Genomics/methods, Humans, Infant, Male, Rare Diseases/diagnosis, Undiagnosed Diseases/diagnosis, Whole Genome Sequencing/methods, Young Adult
17.
Environ Microbiol; 19(11): 4700-4713, 2017 11.
Article in English | MEDLINE | ID: mdl-28925547

ABSTRACT

Cytosine methylation has been shown to regulate essential cellular processes and impact biological adaptation. Despite its evolutionary importance, only a handful of bacterial, genome-wide cytosine studies have been conducted, with none for marine bacteria. Here, we examine the genome-wide, C5-methyl-cytosine (m5C) methylome and its correlation to global transcription in the marine nitrogen-fixing cyanobacterium Trichodesmium. We characterize genome-wide methylation and highlight conserved motifs across three Trichodesmium isolates and two Trichodesmium metagenomes, thereby identifying highly conserved, novel genomic signatures of potential gene regulation in Trichodesmium. Certain gene bodies with the highest methylation levels correlate with lower expression levels. Several methylated motifs were highly conserved across spatiotemporally separated Trichodesmium isolates, thereby elucidating biogeographically conserved methylation potential. These motifs were also highly conserved in Trichodesmium metagenomic samples from natural populations suggesting them to be potential in situ markers of m5C methylation. Using these data, we highlight predicted roles of cytosine methylation in global cellular metabolism providing evidence for a 'core' m5C methylome spanning different ocean regions. These results provide important insights into the m5C methylation landscape and its biogeochemical implications in an important marine N2-fixer, as well as advancing evolutionary theory examining methylation influences on adaptation.


Subject(s)
Cytosine/metabolism, DNA Methylation/genetics, Bacterial DNA/metabolism, Trichodesmium/genetics, Base Sequence/genetics, Bacterial DNA/genetics, Bacterial Genome/genetics, Genomics, Nitrogen/metabolism, Nitrogen Fixation/genetics, DNA Sequence Analysis, Trichodesmium/isolation & purification
18.
Nucleic Acids Res; 41(Database issue): D1079-82, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23193296

ABSTRACT

Organisms represented by the root of the universal evolutionary tree were most likely complex cells with a sophisticated protein translation system and a DNA genome encoding hundreds of genes. The growth of bioinformatics data from taxonomically diverse organisms has made it possible to infer the likely properties of early life in greater detail. Here we present LUCApedia (http://eeb.princeton.edu/lucapedia), a unified framework for simultaneously evaluating multiple data sets related to the Last Universal Common Ancestor (LUCA) and its predecessors. This unification is achieved by mapping eleven such data sets onto UniProt, KEGG and BioCyc IDs. LUCApedia may be used to rapidly acquire evidence that a certain gene or set of genes is ancient, to examine the early evolution of metabolic pathways, or to test specific hypotheses related to ancient life by corroborating them against the rest of the database.


Subject(s)
Genetic Databases, Molecular Evolution, Internet, Metabolic Networks and Pathways, Multigene Family, Origin of Life, Phylogeny, Proteins/chemistry, Proteins/classification, Software
19.
BMC Bioinformatics; 15: 215, 2014 Jun 24.
Article in English | MEDLINE | ID: mdl-24962134

ABSTRACT

BACKGROUND: Whole-genome bisulfite sequencing currently provides the highest-precision view of the epigenome, with quantitative information about populations of cells down to single nucleotide resolution. Several studies have demonstrated the value of this precision: meaningful features that correlate strongly with biological functions can be found associated with only a few CpG sites. Understanding the role of DNA methylation, and more broadly the role of DNA accessibility, requires that methylation differences between populations of cells are identified with extreme precision and in complex experimental designs. RESULTS: In this work we investigated the use of beta-binomial regression as a general approach for modeling whole-genome bisulfite data to identify differentially methylated sites and genomic intervals. CONCLUSIONS: The regression-based analysis can handle medium- and large-scale experiments where it becomes critical to accurately model variation in methylation levels between replicates and account for influence of various experimental factors like cell types or batch effects.


Subject(s)
DNA Methylation, Genomics/methods, DNA Sequence Analysis/methods, Sulfites/pharmacology, CpG Islands/genetics, DNA Methylation/drug effects, Statistical Models, Nucleotides/genetics, Regression Analysis
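
To show what a beta-binomial model of bisulfite read counts looks like in its simplest form, here is a minimal sketch of the beta-binomial log-probability for k methylated reads out of n at a single site. The mean/dispersion parameterisation (`mu`, `phi`), the function names and the example numbers are assumptions chosen for illustration; the article's actual regression and software are not reproduced here.

```python
from math import exp, lgamma

def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def betabinom_logpmf(k, n, mu, phi):
    """log P(k methylated reads out of n).

    mu  : mean methylation level, 0 < mu < 1
    phi : dispersion, 0 < phi < 1 (larger means more variation between replicates)
    Assumed parameterisation: alpha = mu(1-phi)/phi, beta = (1-mu)(1-phi)/phi.
    """
    alpha = mu * (1 - phi) / phi
    beta = (1 - mu) * (1 - phi) / phi
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_choose + log_beta(k + alpha, n - k + beta) - log_beta(alpha, beta)

# Example: a site covered by 10 reads with mean methylation 0.3.
probs = [exp(betabinom_logpmf(k, 10, mu=0.3, phi=0.1)) for k in range(11)]
print(sum(probs))  # sums to 1 up to floating-point error
```

In a regression setting, `mu` would typically be tied to experimental factors such as cell type or batch through a link function, and the parameters estimated by maximising the summed log-probabilities over replicates; that fitting step is omitted from this sketch.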
20.
medRxiv; 2024 May 21.
Article in English | MEDLINE | ID: mdl-38826469

ABSTRACT

Approximately 3% of the human genome consists of repetitive elements called tandem repeats (TRs), which include short tandem repeats (STRs) of 1-6 bp motifs and variable number tandem repeats (VNTRs) of 7+ bp motifs. TR variants contribute to several dozen mono- and polygenic diseases but remain understudied and "enigmatic," particularly relative to single nucleotide variants. It remains comparatively challenging to interpret the clinical significance of TR variants. Although existing resources provide portions of necessary data for interpretation at disease-associated loci, it is currently difficult or impossible to efficiently invoke the additional details critical to proper interpretation, such as motif pathogenicity, disease penetrance, and age of onset distributions. It is also often unclear how to apply population information to analyses. We present STRchive (S-T-archive, http://strchive.org/), a dynamic resource consolidating information on TR disease loci in humans from research literature, up-to-date clinical resources, and large-scale genomic databases, with the goal of streamlining TR variant interpretation at disease-associated loci. We apply STRchive (including pathogenic thresholds, motif classification, and clinical phenotypes) to a gnomAD cohort of ~18.5k individuals genotyped at 60 disease-associated loci. Through detailed literature curation, we demonstrate that the majority of TR diseases affect children despite being thought of as adult diseases. Additionally, we show that pathogenic genotypes can be found within gnomAD which do not necessarily overlap with known disease prevalence, and leverage STRchive to interpret locus-specific findings therein. We apply a diagnostic blueprint empowered by STRchive to relevant clinical vignettes, highlighting possible pitfalls in TR variant interpretation. As a living resource, STRchive is maintained by experts, takes community contributions, and will evolve as understanding of TR diseases progresses.
