Results 1 - 20 of 88
1.
Genome Res ; 34(2): 179-188, 2024 03 20.
Article in English | MEDLINE | ID: mdl-38355308

ABSTRACT

A mechanistic understanding of the biological and technical factors that impact transcript measurements is essential to designing and analyzing single-cell and single-nucleus RNA sequencing experiments. Nuclei contain the same pre-mRNA population as cells, but they contain a small subset of the mRNAs. Nonetheless, early studies argued that single-nucleus analysis yielded results comparable to cellular samples if pre-mRNA measurements were included. However, typical workflows do not distinguish between pre-mRNA and mRNA when estimating gene expression, and variation in their relative abundances across cell types has received limited attention. These gaps are especially important given that incorporating pre-mRNA has become commonplace for both assays, despite known gene length bias in pre-mRNA capture. Here, we reanalyze public data sets from mouse and human to describe the mechanisms and contrasting effects of mRNA and pre-mRNA sampling on gene expression and marker gene selection in single-cell and single-nucleus RNA-seq. We show that pre-mRNA levels vary considerably among cell types, which mediates the degree of gene length bias and limits the generalizability of a recently published normalization method intended to correct for this bias. As an alternative, we repurpose an existing post hoc gene length-based correction method from conventional RNA-seq gene set enrichment analysis. Finally, we show that inclusion of pre-mRNA in bioinformatic processing can impart a larger effect than assay choice itself, which is pivotal to the effective reuse of existing data. These analyses advance our understanding of the sources of variation in single-cell and single-nucleus RNA-seq experiments and provide useful guidance for future studies.


Subject(s)
Cell Nucleus; RNA Precursors; Humans; Animals; Mice; RNA-Seq; RNA, Messenger/genetics; Sequence Analysis, RNA/methods; Cell Nucleus/genetics; Gene Expression Profiling/methods; Single-Cell Analysis
2.
Nat Methods ; 19(4): 445-448, 2022 04.
Article in English | MEDLINE | ID: mdl-35396485

ABSTRACT

Structural variants are associated with cancers and developmental disorders, but challenges with estimating population frequency remain a barrier to prioritizing mutations over inherited variants. In particular, variability in variant calling heuristics and filtering limits the use of current structural variant catalogs. We present STIX, a method that, instead of relying on variant calls, indexes and searches the raw alignments from thousands of samples to enable more comprehensive allele frequency estimation.


Subject(s)
Genome; Genomic Structural Variation; Neoplasms; Algorithms; Genomic Structural Variation/genetics; High-Throughput Nucleotide Sequencing; Humans; Neoplasms/genetics; Software
3.
Am J Hum Genet ; 108(4): 597-607, 2021 04 01.
Article in English | MEDLINE | ID: mdl-33675682

ABSTRACT

Each human genome includes de novo mutations that arose during gametogenesis. While these germline mutations represent a fundamental source of new genetic diversity, they can also create deleterious alleles that impact fitness. Whereas the rate and patterns of point mutations in the human germline are now well understood, far less is known about the frequency and features that impact de novo structural variants (dnSVs). We report a family-based study of germline mutations among 9,599 human genomes from 33 multigenerational CEPH-Utah families and 2,384 families from the Simons Foundation Autism Research Initiative. We find that de novo structural mutations detected by alignment-based, short-read WGS occur at an overall rate of at least 0.160 events per genome in unaffected individuals, and we observe a significantly higher rate (0.206 per genome) in ASD-affected individuals. In both probands and unaffected samples, nearly 73% of de novo structural mutations arose in paternal gametes, and we predict most de novo structural mutations to be caused by mutational mechanisms that do not require sequence homology. After multiple testing correction, we did not observe a statistically significant correlation between parental age and the rate of de novo structural variation in offspring. These results highlight that a spectrum of mutational mechanisms contribute to germline structural mutations and that these mechanisms most likely have markedly different rates and selective pressures than those leading to point mutations.


Subject(s)
Family; Genome, Human/genetics; Germ Cells; Germ-Line Mutation/genetics; Mutation Rate; Aging/genetics; Autistic Disorder/genetics; Bias; DNA Copy Number Variations/genetics; DNA Mutational Analysis; Female; Humans; Male; Paternal Age; Point Mutation/genetics
4.
Bioinformatics ; 38(5): 1231-1234, 2022 02 07.
Article in English | MEDLINE | ID: mdl-34864893

ABSTRACT

SUMMARY: We present trfermikit, a software tool designed to detect deletions larger than 50 bp occurring in Variable Number Tandem Repeats using Illumina DNA sequencing reads. In such regions, it achieves a better tradeoff between sensitivity and false discovery than a state-of-the-art structural variation caller, Manta, and complements it by recovering a significant number of deletions that Manta missed. trfermikit is based upon the fermikit pipeline, which performs read assembly, maps the assembly to the reference genome and calls variants from the alignment. AVAILABILITY AND IMPLEMENTATION: https://github.com/petermchale/trfermikit. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Genome; Software; Sequence Analysis, DNA; High-Throughput Nucleotide Sequencing
5.
BMC Bioinformatics ; 23(1): 482, 2022 Nov 14.
Article in English | MEDLINE | ID: mdl-36376793

ABSTRACT

BACKGROUND: Despite numerous molecular and computational advances, roughly half of patients with a rare disease remain undiagnosed after exome or genome sequencing. A particularly challenging barrier to diagnosis is identifying variants that cause deleterious alternative splicing at intronic or exonic loci outside of canonical donor or acceptor splice sites. RESULTS: Several existing tools predict the likelihood that a genetic variant causes alternative splicing. We sought to extend such methods by developing a new metric that aids in discerning whether a genetic variant leads to deleterious alternative splicing. Our metric combines genetic variation in the Genome Aggregation Database with alternative splicing predictions from SpliceAI to compare observed and expected levels of splice-altering genetic variation. We infer genic regions with significantly less splice-altering variation than expected to be constrained. The resulting model of regional splicing constraint captures differential splicing constraint across gene and exon categories, and the most constrained genic regions are enriched for pathogenic splice-altering variants. Building from this model, we developed ConSpliceML. This ensemble machine learning approach combines regional splicing constraint with multiple per-nucleotide alternative splicing scores to guide the prediction of deleterious splicing variants in protein-coding genes. ConSpliceML more accurately distinguishes deleterious and benign splicing variants than state-of-the-art splicing prediction methods, especially in "cryptic" splicing regions beyond canonical donor or acceptor splice sites. CONCLUSION: Integrating a model of genetic constraint with annotations from existing alternative splicing tools allows ConSpliceML to prioritize potentially deleterious splice-altering variants in studies of rare human diseases.


Subject(s)
Alternative Splicing; Rare Diseases; Humans; Rare Diseases/genetics; RNA Splicing; Introns; Exons; Mutation; RNA Splice Sites
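
The observed-versus-expected comparison described in the ConSpliceML abstract above can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not say how expected counts or significance are computed, so the region names, the counts, and the function `observed_expected_ratio` below are all hypothetical, chosen only to show how a lower ratio would point to a more constrained region.

```python
def observed_expected_ratio(observed: int, expected: float) -> float:
    """Ratio of observed to expected splice-altering variants in a region.

    Values well below 1 suggest fewer splice-altering variants than expected,
    which the abstract interprets as constraint. How 'expected' is derived
    is an assumption left outside this sketch.
    """
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return observed / expected


# Hypothetical regions: (observed count, expected count).
regions = {
    "region_A": (2, 10.0),
    "region_B": (9, 10.0),
}

ratios = {
    name: observed_expected_ratio(obs, exp)
    for name, (obs, exp) in regions.items()
}

# The region with the lowest ratio is the most depleted of
# splice-altering variation in this toy data set.
most_constrained = min(ratios, key=ratios.get)
```

A real analysis would also need a statistical test of the depletion, which this sketch deliberately omits.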
6.
BMC Bioinformatics ; 23(1): 490, 2022 Nov 16.
Article in English | MEDLINE | ID: mdl-36384437

ABSTRACT

BACKGROUND: Identification of deleterious genetic variants using DNA sequencing data relies on increasingly detailed filtering strategies to isolate the small subset of variants that are more likely to underlie a disease phenotype. Datasets reflecting population allele frequencies of different types of variants serve as powerful filtering tools, especially in the context of rare disease analysis. While such population-scale allele frequency datasets now exist for structural variants (SVs), it remains a challenge to match SV calls between multiple datasets, thereby complicating estimates of a putative SV's population allele frequency. RESULTS: We introduce SVAFotate, a software tool that enables the annotation of SVs with variant allele frequency and related information from existing SV datasets. As a result, VCF files annotated by SVAFotate offer a variety of metrics to aid in the stratification of SVs as common or rare in the broader human population. CONCLUSIONS: Here we demonstrate the use of SVAFotate in the classification of SVs with regard to their population frequency and illustrate how SVAFotate's annotations can be used to filter and prioritize SVs. Lastly, we detail how best to utilize these SV annotations in the analysis of genetic variation in studies of rare disease.


Subject(s)
Gene Frequency; High-Throughput Nucleotide Sequencing; Software; Humans; Rare Diseases
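
The matching problem raised in the SVAFotate abstract above (deciding whether an SV call corresponds to an entry in a population catalog) is often approached with a reciprocal-overlap test. The sketch below assumes that approach purely for illustration; the abstract does not state how SVAFotate matches calls, and the `SV` class, the 0.5 threshold, the 1% rarity cutoff, and the example coordinates are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int          # 0-based, inclusive
    end: int            # exclusive
    svtype: str         # e.g. "DEL", "DUP"
    af: float = 0.0     # population allele frequency (catalog entries only)


def reciprocal_overlap(a: SV, b: SV) -> float:
    """Smaller of the two overlap fractions; 0.0 if the SVs cannot match."""
    if a.chrom != b.chrom or a.svtype != b.svtype:
        return 0.0
    overlap = min(a.end, b.end) - max(a.start, b.start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a.end - a.start), overlap / (b.end - b.start))


def max_af(query: SV, catalog: list[SV], min_overlap: float = 0.5) -> float:
    """Highest allele frequency among catalog SVs that match the query."""
    matches = [
        ref.af for ref in catalog
        if reciprocal_overlap(query, ref) >= min_overlap
    ]
    return max(matches, default=0.0)


# Hypothetical catalog and query call.
catalog = [
    SV("chr1", 1000, 2000, "DEL", af=0.12),
    SV("chr1", 1500, 5000, "DEL", af=0.30),  # overlaps, but not reciprocally
    SV("chr1", 1000, 2000, "DUP", af=0.40),  # different SV type
]
query = SV("chr1", 1100, 2100, "DEL")

query_max_af = max_af(query, catalog)
is_rare = query_max_af < 0.01  # assumed rarity cutoff
```

Taking the maximum frequency across matching entries is a conservative choice for rare-disease filtering: a call is treated as rare only if no matching catalog entry is common.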
7.
Genome Res ; 29(4): 532-542, 2019 04.
Article in English | MEDLINE | ID: mdl-30858344

ABSTRACT

Coding variants in epigenetic regulators are emerging as causes of neurological dysfunction and cancer. However, a comprehensive effort to identify disease candidates within the human epigenetic machinery (EM) has not been performed; it is unclear whether features exist that distinguish between variation-intolerant and variation-tolerant EM genes, and between EM genes associated with neurological dysfunction versus cancer. Here, we rigorously define 295 genes with a direct role in epigenetic regulation (writers, erasers, remodelers, readers). Systematic exploration of these genes reveals that although individual enzymatic functions are always mutually exclusive, readers often also exhibit enzymatic activity (dual-function EM genes). We find that the majority of EM genes are very intolerant to loss-of-function variation, even when compared to the dosage sensitive transcription factors, and we identify 102 novel EM disease candidates. We show that this variation intolerance is driven by the protein domains encoding the epigenetic function, suggesting that disease is caused by a perturbed chromatin state. We then describe a large subset of EM genes that are coexpressed within multiple tissues. This subset is almost exclusively populated by extremely variation-intolerant genes and shows enrichment for dual-function EM genes. It is also highly enriched for genes associated with neurological dysfunction, even when accounting for dosage sensitivity, but not for cancer-associated EM genes. Finally, we show that regulatory regions near epigenetic regulators are genetically important for common neurological traits. These findings prioritize novel disease candidate EM genes and suggest that this coexpression plays a functional role in normal neurological homeostasis.


Subject(s)
Epigenesis, Genetic; Nervous System Diseases/genetics; Polymorphism, Genetic; Chromatin Assembly and Disassembly; Humans; Loss of Function Mutation; Transcription Factors/genetics; Transcription Factors/metabolism
8.
Bioinformatics ; 37(24): 4860-4861, 2021 12 11.
Article in English | MEDLINE | ID: mdl-34146087

ABSTRACT

SUMMARY: Unfazed is a command-line tool to determine the parental gamete of origin for de novo mutations from paired-end Illumina DNA sequencing reads. Unfazed uses variant information for a sequenced trio to identify the parental gamete of origin by linking phase-informative inherited variants to de novo mutations using read-based phasing. It achieves a high success rate by chaining reads into haplotype groups, thus increasing the search space for informative sites. Unfazed provides a simple command-line interface and scales well to large inputs, determining parent-of-origin for nearly 30 000 de novo variants in under 60 h. AVAILABILITY AND IMPLEMENTATION: Unfazed is available at https://github.com/jbelyeu/unfazed. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms; Software; Sequence Analysis, DNA; Haplotypes; High-Throughput Nucleotide Sequencing
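
The read-based phasing idea in the Unfazed abstract above can be sketched in a few lines: reads that carry the de novo allele and also cover a phase-informative inherited site reveal which parental haplotype the mutation sits on. This is a simplified illustration, not Unfazed's implementation; it ignores the chaining of reads into haplotype groups that the abstract mentions, and the function `phase_de_novo`, its input format, and the example reads are hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def phase_de_novo(
    reads: Iterable[tuple[bool, str]],
    allele_origin: dict[str, str],
) -> Optional[str]:
    """Assign a parent of origin to a de novo mutation by majority vote.

    reads: (carries_de_novo_allele, allele_at_informative_site) per read.
    allele_origin: maps an informative-site allele to the parent it was
        inherited from, e.g. {"A": "paternal", "G": "maternal"}.
    Returns None when no read is informative or the vote is tied.
    """
    votes = Counter()
    for carries_de_novo, allele in reads:
        origin = allele_origin.get(allele)
        # Only reads that carry the de novo allele link it to a haplotype.
        if carries_de_novo and origin is not None:
            votes[origin] += 1
    ranked = votes.most_common()
    if not ranked:
        return None
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # ambiguous evidence
    return ranked[0][0]


# Hypothetical reads spanning both the de novo site and an informative site.
reads = [(True, "A"), (True, "A"), (False, "G"), (True, "A"), (False, "G")]
origin = phase_de_novo(reads, {"A": "paternal", "G": "maternal"})
```

In this toy example every read carrying the de novo allele also carries the paternally inherited allele, so the mutation is assigned to the paternal haplotype.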
9.
Nucleic Acids Res ; 48(12): 6597-6610, 2020 07 09.
Article in English | MEDLINE | ID: mdl-32479598

ABSTRACT

The human genome encodes an order of magnitude more gene expression enhancers than promoters, suggesting that most genes are regulated by the combined action of multiple enhancers. We have previously shown that neighboring estrogen-responsive enhancers exhibit complex synergistic contributions to the production of an estrogenic transcriptional response. Here we sought to determine the molecular underpinnings of this enhancer cooperativity. We generated genetic deletions of four estrogen receptor α (ER) bound enhancers that regulate two genes and found that enhancers containing full estrogen response element (ERE) motifs control ER binding at neighboring sites, while enhancers with pre-existing histone acetylation/accessibility confer a permissible chromatin environment to the neighboring enhancers. Genome engineering revealed that two enhancers with half EREs could not compensate for the lack of a full ERE site within the cluster. In contrast, two enhancers with full EREs produced a transcriptional response greater than the wild-type locus. By swapping genomic sequences, we found that the genomic location of a full ERE strongly influences enhancer activity. Our results lead to a model in which a full ERE is required for ER recruitment, but the presence of a pre-existing permissible chromatin environment can also be needed for estrogen-driven gene regulation to occur.


Subject(s)
Enhancer Elements, Genetic/genetics; Estrogen Receptor alpha/genetics; Nucleotide Motifs/genetics; Transcription, Genetic; Acetylation; Chromatin/genetics; DNA-Binding Proteins/genetics; Gene Expression Regulation/genetics; Genome, Human/genetics; Humans; Promoter Regions, Genetic/genetics
10.
Proc Natl Acad Sci U S A ; 116(19): 9491-9500, 2019 05 07.
Article in English | MEDLINE | ID: mdl-31019089

ABSTRACT

The textbook view that most germline mutations in mammals arise from replication errors is indirectly supported by the fact that there are both more mutations and more cell divisions in the male than in the female germline. When analyzing large de novo mutation datasets in humans, we find multiple lines of evidence that call that view into question. Notably, despite the drastic increase in the ratio of male to female germ cell divisions after the onset of spermatogenesis, even young fathers contribute three times more mutations than young mothers, and this ratio barely increases with parental age. This surprising finding points to a substantial contribution of damage-induced mutations. Indeed, C-to-G transversions and CpG transitions, which together constitute over one-fourth of all base substitution mutations, show genomic distributions and sex-specific age dependencies indicative of double-strand break repair and methylation-associated damage, respectively. Moreover, we find evidence that maternal age at conception influences the mutation rate both because of the accumulation of damage in oocytes and potentially through an influence on the number of postzygotic mutations in the embryo. These findings reveal underappreciated roles of DNA damage and maternal age in the genesis of human germline mutations.


Subject(s)
DNA Breaks, Double-Stranded; DNA Repair; Databases, Nucleic Acid; Germ-Line Mutation; Maternal Age; Adolescent; Adult; Female; Humans; Male; Middle Aged; Oocytes; Pregnancy; Spermatogenesis/genetics