Results 1 - 20 of 26
1.
Science ; 384(6698): eadi5199, 2024 May 24.
Article in English | MEDLINE | ID: mdl-38781369

ABSTRACT

Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multiomics datasets into a resource comprising >2.8 million nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550,000 cell type-specific regulatory elements and >1.4 million single-cell expression quantitative trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.


Subject(s)
Brain , Gene Regulatory Networks , Mental Disorders , Single-Cell Analysis , Humans , Aging/genetics , Brain/metabolism , Cell Communication/genetics , Chromatin/metabolism , Chromatin/genetics , Genomics , Mental Disorders/genetics , Prefrontal Cortex/metabolism , Prefrontal Cortex/physiology , Quantitative Trait Loci
2.
bioRxiv ; 2024 Mar 30.
Article in English | MEDLINE | ID: mdl-38562822

ABSTRACT

Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet, little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multi-omics datasets into a resource comprising >2.8M nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550K cell-type-specific regulatory elements and >1.4M single-cell expression-quantitative-trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.

3.
Cell ; 186(7): 1493-1511.e40, 2023 03 30.
Article in English | MEDLINE | ID: mdl-37001506

ABSTRACT

Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.


Subject(s)
Epigenome , Quantitative Trait Loci , Genome-Wide Association Study , Genomics , Phenotype , Single Nucleotide Polymorphism
4.
Nat Commun ; 11(1): 3696, 2020 07 29.
Article in English | MEDLINE | ID: mdl-32728046

ABSTRACT

ENCODE comprises thousands of functional genomics datasets, and the encyclopedia covers hundreds of cell types, providing a universal annotation for genome interpretation. However, for particular applications, it may be advantageous to use a customized annotation. Here, we develop such a custom annotation by leveraging advanced assays, such as eCLIP, Hi-C, and whole-genome STARR-seq on a number of data-rich ENCODE cell types. A key aspect of this annotation is comprehensive and experimentally derived networks of both transcription factors and RNA-binding proteins (TFs and RBPs). Cancer, a disease of system-wide dysregulation, is an ideal application for such a network-based annotation. Specifically, for cancer-associated cell types, we put regulators into hierarchies and measure their network change (rewiring) during oncogenesis. We also extensively survey TF-RBP crosstalk, highlighting how SUB1, a previously uncharacterized RBP, drives aberrant tumor expression and amplifies the effect of MYC, a well-known oncogenic TF. Furthermore, we show how our annotation allows us to place oncogenic transformations in the context of a broad cell space; here, many normal-to-tumor transitions move towards a stem-like state, while oncogene knockdowns show an opposing trend. Finally, we organize the resource into a coherent workflow to prioritize key elements and variants, in addition to regulators. We showcase the application of this prioritization to somatic burdening, cancer differential expression and GWAS. Targeted validations of the prioritized regulators, elements and variants using siRNA knockdowns, CRISPR-based editing, and luciferase assays demonstrate the value of the ENCODE resource.


Subject(s)
Genetic Databases , Genomics , Neoplasms/genetics , Tumor Cell Line , Neoplastic Cell Transformation/genetics , Gene Regulatory Networks , Humans , Mutation/genetics , Reproducibility of Results , Transcription Factors/metabolism
5.
Cell Syst ; 8(4): 352-357.e3, 2019 04 24.
Article in English | MEDLINE | ID: mdl-30956140

ABSTRACT

Small RNA sequencing has been widely adopted to study the diversity of extracellular RNAs (exRNAs) in biofluids; however, the analysis of exRNA samples can be challenging: they are vulnerable to contamination and artifacts from different isolation techniques, present in lower concentrations than cellular RNA, and occasionally of exogenous origin. To address these challenges, we present exceRpt, the exRNA-processing toolkit of the NIH Extracellular RNA Communication Consortium (ERCC). exceRpt is structured as a cascade of filters and quantifications prioritized based on one's confidence in a given set of annotated RNAs. It generates quality control reports and abundance estimates for RNA biotypes. It is also capable of characterizing mappings to exogenous genomes, which, in turn, can be used to generate phylogenetic trees. exceRpt has been used to uniformly process all ∼3,500 exRNA-seq datasets in the public exRNA Atlas and is available from genboree.org and github.gersteinlab.org/exceRpt.


Subject(s)
Cell-Free Nucleic Acids/chemistry , RNA-Seq/methods , Software , Animals , Cell-Free Nucleic Acids/genetics , Cell-Free Nucleic Acids/metabolism , Humans , Mice , RNA-Seq/standards
6.
Cell ; 177(2): 463-477.e15, 2019 04 04.
Article in English | MEDLINE | ID: mdl-30951672

ABSTRACT

To develop a map of cell-cell communication mediated by extracellular RNA (exRNA), the NIH Extracellular RNA Communication Consortium created the exRNA Atlas resource (https://exrna-atlas.org). The Atlas version 4P1 hosts 5,309 exRNA-seq and exRNA qPCR profiles from 19 studies and a suite of analysis and visualization tools. To analyze variation between profiles, we apply computational deconvolution. The analysis leads to a model with six exRNA cargo types (CT1, CT2, CT3A, CT3B, CT3C, CT4), each detectable in multiple biofluids (serum, plasma, CSF, saliva, urine). Five of the cargo types associate with known vesicular and non-vesicular (lipoprotein and ribonucleoprotein) exRNA carriers. To validate utility of this model, we re-analyze an exercise response study by deconvolution to identify physiologically relevant response pathways that were not detected previously. To enable wide application of this model, as part of the exRNA Atlas resource, we provide tools for deconvolution and analysis of user-provided case-control studies.


Subject(s)
Cell Communication/physiology , RNA/metabolism , Adult , Body Fluids/chemistry , Cell-Free Nucleic Acids/metabolism , Circulating MicroRNA/metabolism , Extracellular Vesicles/metabolism , Female , Humans , Male , Reproducibility of Results , RNA Sequence Analysis/methods , Software
7.
Nat Commun ; 10(1): 1784, 2019 04 16.
Article in English | MEDLINE | ID: mdl-30992455

ABSTRACT

The incomplete identification of structural variants (SVs) from whole-genome sequencing data limits studies of human genetic diversity and disease association. Here, we apply a suite of long-read, short-read, strand-specific sequencing technologies, optical mapping, and variant discovery algorithms to comprehensively analyze three trios to define the full spectrum of human genetic variation in a haplotype-resolved manner. We identify 818,054 indel variants (<50 bp) and 27,622 SVs (≥50 bp) per genome. We also discover 156 inversions per genome and 58 of the inversions intersect with the critical regions of recurrent microdeletion and microduplication syndromes. Taken together, our SV callsets represent a three to sevenfold increase in SV detection compared to most standard high-throughput sequencing studies, including those from the 1000 Genomes Project. The methods and the dataset presented serve as a gold standard for the scientific community allowing us to make recommendations for maximizing structural variation sensitivity for future genome sequencing studies.


Subject(s)
Human Genome/genetics , Genomic Structural Variation , Genomics/methods , Haplotypes/genetics , Algorithms , Chromosome Mapping/methods , Genetic Databases , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation , Whole Genome Sequencing/methods
8.
Science ; 361(6409), 2018 09 28.
Article in English | MEDLINE | ID: mdl-30139913

ABSTRACT

To assess the impact of genetic variation in regulatory loci on human health, we constructed a high-resolution map of allelic imbalances in DNA methylation, histone marks, and gene transcription in 71 epigenomes from 36 distinct cell and tissue types from 13 donors. Deep whole-genome bisulfite sequencing of 49 methylomes revealed sequence-dependent CpG methylation imbalances at thousands of heterozygous regulatory loci. Such loci are enriched for stochastic switching, which is defined as random transitions between fully methylated and unmethylated states of DNA. The methylation imbalances at thousands of loci are explainable by different relative frequencies of the methylated and unmethylated states for the two alleles. Further analyses provided a unifying model that links sequence-dependent allelic imbalances of the epigenome, stochastic switching at gene regulatory loci, and disease-associated genetic variation.


Subject(s)
Allelic Imbalance , DNA Methylation , Disease/genetics , Genetic Epigenesis , Human Genome , Single Nucleotide Polymorphism , Alleles , Binding Sites , CpG Islands , Gene Regulatory Networks , Genetic Loci , Genome-Wide Association Study , Humans , DNA Sequence Analysis , Sulfites/chemistry , Transcription Factors/metabolism
9.
Genome Biol ; 19(1): 38, 2018 03 20.
Article in English | MEDLINE | ID: mdl-29559002

ABSTRACT

Comprehensive and accurate identification of structural variations (SVs) from next generation sequencing data remains a major challenge. We develop FusorSV, which uses a data mining approach to assess performance and merge callsets from an ensemble of SV-calling algorithms. It includes a fusion model built using analysis of 27 deep-coverage human genomes from the 1000 Genomes Project. We identify 843 novel SV calls that were not reported by the 1000 Genomes Project for these 27 samples. Experimental validation of a subset of these calls yields a validation rate of 86.7%. FusorSV is available at https://github.com/TheJacksonLaboratory/SVE .


Subject(s)
Algorithms , Human Genome , Genomic Structural Variation , High-Throughput Nucleotide Sequencing , Humans , DNA Sequence Analysis , Software
10.
Bioinformatics ; 34(1): 1-8, 2018 01 01.
Article in English | MEDLINE | ID: mdl-28961734

ABSTRACT

Motivation: Analysis of RNA sequencing (RNA-Seq) data in human saliva is challenging. Lack of standardization and unification of the bioinformatic procedures undermines saliva's diagnostic potential. Thus, it motivated us to perform this study. Results: We applied principal pipelines for bioinformatic analysis of small RNA-Seq data of saliva of 98 healthy Korean volunteers including either direct or indirect mapping of the reads to the human genome using Bowtie1. Analysis of alignments to exogenous genomes by another pipeline revealed that almost all of the reads map to bacterial genomes. Thus, salivary exRNA has fundamental properties that warrant the design of unique additional steps while performing the bioinformatic analysis. Our pipelines can serve as potential guidelines for processing of RNA-Seq data of human saliva. Availability and implementation: Processing and analysis results of the experimental data generated by the exceRpt (v4.6.3) small RNA-seq pipeline (github.gersteinlab.org/exceRpt) are available from exRNA atlas (exrna-atlas.org). Alignment to exogenous genomes and their quantification results were used in this paper for the analyses of small RNAs of exogenous origin. Contact: dtww@ucla.edu.


Subject(s)
Computational Biology/methods , RNA Sequence Analysis/methods , Software , High-Throughput Nucleotide Sequencing/methods , Humans , RNA , Saliva/chemistry
12.
Nat Commun ; 7: 11106, 2016 Apr 26.
Article in English | MEDLINE | ID: mdl-27112789

ABSTRACT

There is growing appreciation for the importance of non-protein-coding genes in development and disease. Although much is known about microRNAs, limitations in bioinformatic analyses of RNA sequencing have precluded broad assessment of other forms of small-RNAs in humans. By analysing sequencing data from plasma-derived RNA from 40 individuals, here we identified over a thousand human extracellular RNAs including microRNAs, piwi-interacting RNA (piRNA), and small nucleolar RNAs. Using a targeted quantitative PCR with reverse transcription approach in an additional 2,763 individuals, we characterized almost 500 of the most abundant extracellular transcripts including microRNAs, piRNAs and small nucleolar RNAs. The presence in plasma of many non-microRNA small-RNAs was confirmed in an independent cohort. We present comprehensive data to demonstrate the broad and consistent detection of diverse classes of circulating non-cellular small-RNAs from a large population.


Subject(s)
Human Genome , MicroRNAs/genetics , Small Interfering RNA/genetics , Small Nucleolar RNA/genetics , Aged , Female , Gene Expression Regulation , High-Throughput Nucleotide Sequencing , Humans , Longitudinal Studies , Male , MicroRNAs/blood , Middle Aged , Molecular Sequence Annotation , Small Interfering RNA/blood , Small Nucleolar RNA/blood , Real-Time Polymerase Chain Reaction , United States
13.
Nat Commun ; 7: 11101, 2016 Apr 18.
Article in English | MEDLINE | ID: mdl-27089393

ABSTRACT

Large-scale sequencing in the 1000 Genomes Project has revealed multitudes of single nucleotide variants (SNVs). Here, we provide insights into the functional effect of these variants using allele-specific behaviour. This can be assessed for an individual by mapping ChIP-seq and RNA-seq reads to a personal genome, and then measuring 'allelic imbalances' between the numbers of reads mapped to the paternal and maternal chromosomes. We annotate variants associated with allele-specific binding and expression in 382 individuals by uniformly processing 1,263 functional genomics data sets, developing approaches to reduce the heterogeneity between data sets due to overdispersion and mapping bias. Since many allelic variants are rare, aggregation across multiple individuals is necessary to identify broadly applicable 'allelic elements'. We also found SNVs for which we can anticipate allelic imbalance from the disruption of a binding motif. Our results serve as an allele-specific annotation for the 1000 Genomes variant catalogue and are distributed as an online resource (alleledb.gersteinlab.org).
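
The allelic-imbalance measurement described in this abstract can be illustrated with a minimal sketch. It assumes a plain two-sided binomial test against an expected 50:50 paternal-to-maternal read ratio at a single heterozygous site. The abstract itself mentions corrections for overdispersion and mapping bias, which this simplification does not model, and the function names `binomial_two_sided_p` and `allelic_ratio` are hypothetical, not taken from the paper or from alleledb.

```python
from math import exp, lgamma, log


def _log_pmf(i: int, n: int, p: float) -> float:
    """Log of the binomial probability of i successes in n trials (0 < p < 1)."""
    return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log(p) + (n - i) * log(1 - p))


def binomial_two_sided_p(k: int, n: int, p: float = 0.5) -> float:
    """Two-sided binomial p-value: total probability of every outcome
    that is no more likely than the observed count k."""
    if n == 0:
        return 1.0  # no reads, no evidence of imbalance
    pmf = [exp(_log_pmf(i, n, p)) for i in range(n + 1)]
    observed = pmf[k]
    # A small relative tolerance guards against floating-point ties.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


def allelic_ratio(paternal_reads: int, maternal_reads: int) -> float:
    """Fraction of reads at a heterozygous site that map to the paternal allele."""
    total = paternal_reads + maternal_reads
    return paternal_reads / total if total else float("nan")


# Illustrative (made-up) read counts at one heterozygous SNV.
paternal, maternal = 30, 10
ratio = allelic_ratio(paternal, maternal)
p_value = binomial_two_sided_p(paternal, paternal + maternal)
print(f"paternal fraction = {ratio:.2f}, two-sided p = {p_value:.4g}")
```

Under this simplification, a small p-value flags a site as a candidate for allele-specific binding or expression. A beta-binomial null would be one way to account for the overdispersion the abstract refers to, but that is an assumption rather than a description of the authors' pipeline.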


Subject(s)
Chromosome Mapping/methods , Human Genome/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Single Nucleotide Polymorphism , Algorithms , Binding Sites/genetics , Computational Biology/methods , Genetic Databases , Gene Expression , Gene Frequency , Genotype , Human Genome Project , Humans , Internet , Molecular Sequence Annotation/methods , Precision Medicine/methods
14.
Curr Opin Struct Biol ; 35: 125-34, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26658741

ABSTRACT

Structure has traditionally been interrelated with sequence, usually in the framework of comparing sequences across species sharing a common fold. However, the nature of information within the sequence and structure databases is evolving, changing the type of comparisons possible. In particular, we now have a vast amount of personal genome sequences from human populations and a greater fraction of new structures contain interacting proteins within large complexes. Consequently, we have to recast our conception of sequence conservation and its relation to structure, for example by focusing more on selection within the human population. Moreover, within structural biology there is less emphasis on the discovery of novel folds and more on relating structures to networks of protein interactions. We cover this changing mindset here.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Proteins/chemistry , Proteins/genetics , Humans , Isomerism , Mutation , Proteins/metabolism
15.
J Chem Phys ; 138(13): 134315, 2013 Apr 07.
Article in English | MEDLINE | ID: mdl-23574235

ABSTRACT

The effects of the electronic and geometric factors on the global minimum structures of MB9(-) (M = V, Nb, Ta) are investigated using photoelectron spectroscopy and ab initio calculations. Photoelectron spectra are obtained for MB9(-) at two photon energies, and similar spectral features are observed for all three species. The structures for all clusters are established by global minima searches and confirmed by comparison of calculated and experimental vertical electron detachment energies. The VB9(-) cluster is shown to have a planar C2v V©B9(-) structure, whereas both NbB9(-) and TaB9(-) are shown to have Cs M©B9(-) type structures with the central metal atom slightly out of plane. Theoretical calculations suggest that the V atom fits perfectly inside the B9 ring forming a planar D(9h) V©B9(2-) structure, while the lower symmetry of V©B9(-) is due to the Jahn-Teller effect. The Nb and Ta atoms are too large to fit in the B9 ring, and they are squeezed out of the plane slightly even in the M©B9(2-) dianions. Thus, even though all three M©B9(2-) dianions fulfill the electronic design principle for the doubly aromatic molecular wheels, the geometric effect lowers the symmetry of the Nb and Ta clusters.

16.
Phys Chem Chem Phys ; 15(14): 5022-9, 2013 Apr 14.
Article in English | MEDLINE | ID: mdl-23443061

ABSTRACT

A new tool to elucidate chemical bonding in bulk solids, surfaces and nanostructures has been developed. Solid State Adaptive Natural Density Partitioning (SSAdNDP) is a method to interpret chemical bonding in terms of classical lone pairs and two-center bonds, as well as multi-center delocalized bonds. Here we extend the domain of AdNDP to bulk materials and interfaces, yielding SSAdNDP. We demonstrate the versatility of the method by applying it to several systems featuring both localized and many-center chemical bonding, and varying in structural complexity: boron α-sheet, magnesium diboride and the Na8BaSn6 Zintl phase.

17.
Acc Chem Res ; 46(2): 350-8, 2013 Feb 19.
Article in English | MEDLINE | ID: mdl-23210660

ABSTRACT

Atomic clusters have properties intermediate between those of individual atoms and bulk solids, which provide fertile ground for the discovery of new molecules and novel chemical bonding. In addition, the study of small clusters can help researchers design better nanosystems with specific physical and chemical properties. From recent experimental and computational studies, we know that small boron clusters possess planar structures stabilized by electron delocalization both in the σ and π frameworks. An interesting boron cluster is B(9)(-), which has a D(8h) molecular wheel structure with a single boron atom in the center of a B(8) ring. This ring in the D(8h)-B(9)(-) cluster is connected by eight classical two-center, two-electron bonds. In contrast, the cluster's central boron atom is bonded to the peripheral ring through three delocalized σ and three delocalized π bonds. This bonding structure gives the molecular wheel double aromaticity and high electronic stability. The unprecedented structure and bonding pattern in B(9)(-) and other planar boron clusters have inspired the designs of similar molecular wheel-type structures. But these mimics instead substitute a heteroatom for the central boron. Through recent experiments in cluster beams, chemists have demonstrated that transition metals can be doped into the center of the planar boron clusters. These new metal-centered monocyclic boron rings have variable ring sizes, M©B(n) and M©B(n)(-) with n = 8-10. Using size-selected anion photoelectron spectroscopy and ab initio calculations, researchers have characterized these novel borometallic molecules. Chemists have proposed a design principle based on σ and π double aromaticity for electronically stable borometallic cluster compounds, featuring a highly coordinated transition metal atom centered inside monocyclic boron rings. The central metal atom is coordinatively unsaturated in the direction perpendicular to the molecular plane.
Thus, chemists may design appropriate ligands to synthesize the molecular wheels in the bulk. In this Account, we discuss these recent experimental and theoretical advances of this new class of aromatic borometallic compounds, which contain a highly coordinated central transition metal atom inside a monocyclic boron ring. Through these examples, we show that atomic clusters can facilitate the discovery of new structures, new chemical bonding, and possibly new nanostructures with specific, advantageous properties.

18.
J Chem Phys ; 137(23): 234306, 2012 Dec 21.
Article in English | MEDLINE | ID: mdl-23267485

ABSTRACT

We performed a joint photoelectron spectroscopy and ab initio study of two carbon-doped boron clusters, CB(9)(-) and C(2)B(8)(-). Unbiased computational searches revealed similar global minimum structures for both clusters. The comparison of the experimentally observed and theoretically calculated vertical detachment energies revealed that only the global minimum structure is responsible for the experimental spectra of CB(9)(-), whereas the two lowest-lying isomers of C(2)B(8)(-) contribute to the experimental spectra. The planar "distorted wheel" type structures with a single inner boron atom found for CB(9)(-) and C(2)B(8)(-) are different from the quasi-planar structure of B(10)(-), which consists of two inner atoms and eight peripheral boron atoms. The adaptive natural density partitioning chemical bonding analysis revealed that CB(9)(-) and C(2)B(8) clusters exhibit π aromaticity and σ antiaromaticity, which is consistent with their planar distorted structures.

20.
J Am Chem Soc ; 134(1): 165-8, 2012 Jan 11.
Article in English | MEDLINE | ID: mdl-22148745

ABSTRACT

We report the observation of two transition-metal-centered nine-atom boron rings, RhⓒB(9)(-) and IrⓒB(9)(-). These two doped-boron clusters are produced in a laser-vaporization supersonic molecular beam and characterized by photoelectron spectroscopy and ab initio calculations. Large HOMO-LUMO gaps are observed in the anion photoelectron spectra, suggesting that neutral RhⓒB(9) and IrⓒB(9) are highly stable, closed shell species. Theoretical calculations show that RhⓒB(9) and IrⓒB(9) are of D(9h) symmetry. Chemical bonding analyses reveal that these complexes are doubly aromatic, each with six completely delocalized π and σ electrons, which describe the bonding between the central metal atom and the boron ring. This work establishes firmly the metal-doped B rings as a new class of novel aromatic molecular wheels.
