Results 1 - 20 of 47
1.
Cell ; 172(5): 897-909.e21, 2018 02 22.
Article in English | MEDLINE | ID: mdl-29474918

ABSTRACT

X-linked Dystonia-Parkinsonism (XDP) is a Mendelian neurodegenerative disease that is endemic to the Philippines and is associated with a founder haplotype. We integrated multiple genome and transcriptome assembly technologies to narrow the causal mutation to the TAF1 locus, which included a SINE-VNTR-Alu (SVA) retrotransposition into intron 32 of the gene. Transcriptome analyses identified decreased expression of the canonical cTAF1 transcript among XDP probands, and de novo assembly across multiple pluripotent stem-cell-derived neuronal lineages discovered aberrant TAF1 transcription that involved alternative splicing and intron retention (IR) in proximity to the SVA that was anti-correlated with overall TAF1 expression. CRISPR/Cas9 excision of the SVA rescued this XDP-specific transcriptional signature and normalized TAF1 expression in probands. These data suggest an SVA-mediated aberrant transcriptional mechanism associated with XDP and may provide a roadmap for layered technologies and integrated assembly-based analyses for other unsolved Mendelian disorders.


Subjects
Dystonic Disorders/genetics; Genetic Diseases, X-Linked/genetics; Genome, Human; Transcriptome/genetics; Alternative Splicing/genetics; Alu Elements/genetics; Base Sequence; CRISPR-Cas Systems/genetics; Cohort Studies; Family; Female; Genetic Loci; Haplotypes/genetics; High-Throughput Nucleotide Sequencing; Histone Acetyltransferases/genetics; Histone Acetyltransferases/metabolism; Humans; Induced Pluripotent Stem Cells/metabolism; Introns/genetics; Male; Minisatellite Repeats/genetics; Models, Genetic; Nerve Degeneration/genetics; Nerve Degeneration/pathology; Neural Stem Cells/metabolism; Neurons/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; Short Interspersed Nucleotide Elements; TATA-Binding Protein Associated Factors/genetics; TATA-Binding Protein Associated Factors/metabolism; Transcription Factor TFIID/genetics; Transcription Factor TFIID/metabolism
2.
Genome Res ; 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38906680

ABSTRACT

Transcription and translation are intertwined processes where mRNA isoforms are crucial intermediaries. However, methodological limitations in analyzing translation at the mRNA isoform level have left gaps in our understanding of critical biological processes. To address these gaps, we developed an integrated computational and experimental framework called long-read Ribo-STAMP (LR-Ribo-STAMP) that capitalizes on advancements in long-read sequencing and RNA-base editing-mediated technologies to simultaneously profile translation and transcription at both gene and mRNA isoform levels. We also developed the EditsC metric to quantify editing and leverage the single-molecule, full-length transcript information provided by long-read sequencing. Here, we report concordance between gene-level translation profiles obtained with long-read and short-read Ribo-STAMP. We show that LR-Ribo-STAMP successfully profiles translation of mRNA isoforms and links regulatory features, such as upstream open reading frames (uORFs), to translation measurements. We apply LR-Ribo-STAMP to discovering translational differences at both gene and isoform levels in a triple-negative breast cancer cell line under normoxia and hypoxia and find that LR-Ribo-STAMP effectively delineates orthogonal transcriptional and translation shifts between conditions. We also discover regulatory elements that distinguish translational differences at the isoform level. We highlight GRK6, where hypoxia is observed to increase expression and translation of a shorter mRNA isoform, giving rise to a truncated protein without the AGC Kinase domain. Overall, LR-Ribo-STAMP is an important advance in our repertoire of methods that measure mRNA translation with isoform sensitivity.

3.
Nature ; 594(7861): 77-81, 2021 06.
Article in English | MEDLINE | ID: mdl-33953399

ABSTRACT

The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation [1,2]. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes [1,3-5] and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.


Subjects
Evolution, Molecular; Genome/genetics; Genomics; Pan paniscus/genetics; Phylogeny; Animals; Eukaryotic Initiation Factor-4A/genetics; Female; Genes; Gorilla gorilla/genetics; Molecular Sequence Annotation/standards; Pan troglodytes/genetics; Pongo/genetics; Segmental Duplications, Genomic; Sequence Analysis, DNA
4.
Nat Methods ; 18(5): 507-519, 2021 05.
Article in English | MEDLINE | ID: mdl-33963355

ABSTRACT

RNA-binding proteins (RBPs) are critical regulators of gene expression and RNA processing that are required for gene function. Yet the dynamics of RBP regulation in single cells is unknown. To address this gap in understanding, we developed STAMP (Surveying Targets by APOBEC-Mediated Profiling), which efficiently detects RBP-RNA interactions. STAMP does not rely on ultraviolet cross-linking or immunoprecipitation and, when coupled with single-cell capture, can identify RBP-specific and cell-type-specific RNA-protein interactions for multiple RBPs and cell types in single, pooled experiments. Pairing STAMP with long-read sequencing yields RBP target sites in an isoform-specific manner. Finally, Ribo-STAMP leverages small ribosomal subunits to measure transcriptome-wide ribosome association in single cells. STAMP enables the study of RBP-RNA interactomes and translational landscapes with unprecedented cellular resolution.


Subjects
RNA-Binding Proteins/metabolism; RNA/metabolism; Single-Cell Analysis/methods; Animals; Binding Sites; Gene Expression Profiling; HEK293 Cells; Humans; Nanopore Sequencing; RNA/chemistry; RNA-Binding Proteins/chemistry; Sequence Analysis, RNA; Transcriptome
5.
Nucleic Acids Res ; 50(14): 7801-7815, 2022 08 12.
Article in English | MEDLINE | ID: mdl-35253883

ABSTRACT

Centromeres are the chromosomal loci essential for faithful chromosome segregation during cell division. Although centromeres are transcribed and produce non-coding RNAs (cenRNAs) that affect centromere function, we still lack a mechanistic understanding of how centromere transcription is regulated. Here, using a targeted RNA isoform sequencing approach, we identified the transcriptional landscape at and surrounding all centromeres in budding yeast. Overall, cenRNAs are derived from transcription readthrough of pericentromeric regions but rarely span the entire centromere and are a complex mixture of molecules that are heterogeneous in abundance, orientation, and sequence. While most pericentromeres are transcribed throughout the cell cycle, centromere accessibility to the transcription machinery is restricted to S-phase. This temporal restriction is dependent on Cbf1, a centromere-binding transcription factor, that we demonstrate acts locally as a transcriptional roadblock. Cbf1 deletion leads to an accumulation of cenRNAs at all phases of the cell cycle which correlates with increased chromosome mis-segregation that is partially rescued when the roadblock activity is restored. We propose that a Cbf1-mediated transcriptional roadblock protects yeast centromeres from untimely transcription to ensure genomic stability.


Centromeres are essential chromosomal regions that do not encode gene products and instead ensure the accurate partitioning of chromosomes during cell division. Despite the lack of genes, transcription has been detected at centromeres. It has not been clear where this centromeric RNA comes from and how it is regulated. In this study, the authors identified all of the centromeric RNAs at and around budding yeast centromeres during the cell cycle. Unlike RNAs that encode for proteins, centromeric RNAs are a complex mixture of transcripts that result from adjacent RNAs that continue into the centromere. The authors found that most transcription is blocked at the centromere border by a protein called Cbf1. This mechanism shields the centromere from untimely transcription to ensure genome stability.


Subjects
Centromere; Saccharomyces cerevisiae Proteins; Basic Helix-Loop-Helix Leucine Zipper Transcription Factors/metabolism; Centromere/genetics; Centromere/metabolism; Chromosome Segregation/genetics; Saccharomyces cerevisiae/genetics; Saccharomyces cerevisiae/metabolism; Saccharomyces cerevisiae Proteins/genetics; Saccharomyces cerevisiae Proteins/metabolism; Transcription Factors/genetics; Transcription Factors/metabolism; Transcription, Genetic
6.
Genome Res ; 28(10): 1566-1576, 2018 10.
Article in English | MEDLINE | ID: mdl-30228200

ABSTRACT

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits.


Subjects
Brain/metabolism; Segmental Duplications, Genomic; Sequence Analysis, DNA/methods; Sequence Analysis, RNA/methods; Evolution, Molecular; Gene Duplication; Gene Expression Profiling; Humans; Molecular Sequence Annotation; Multigene Family; Open Reading Frames; Pseudogenes
7.
Genome Res ; 28(7): 1029-1038, 2018 07.
Article in English | MEDLINE | ID: mdl-29884752

ABSTRACT

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants, even in genomes as well studied as rat and the great apes, and how these annotations improve cross-species RNA expression experiments.


Subjects
Genome, Human/genetics; Algorithms; Animals; High-Throughput Nucleotide Sequencing/methods; Humans; Molecular Sequence Annotation/methods; RNA/genetics; Rats
8.
Genet Med ; 21(2): 477-486, 2019 02.
Article in English | MEDLINE | ID: mdl-29955105

ABSTRACT

PURPOSE: Rh antigens can provoke severe alloimmune reactions, particularly in high-risk transfusion contexts, such as sickle cell disease. Rh antigens are encoded by the paralogs, RHD and RHCE, located in one of the most complex genetic loci. Our goal was to characterize RH genetic variation in multi-ethnic cohorts, with the focus on detecting RH structural variation (SV). METHODS: We customized analytical methods to estimate paralog-specific copy number from next-generation sequencing (NGS) data. We applied these methods to clinically characterized samples, including four World Health Organization (WHO) genotyping references and 1135 Asian and Native American blood donors. Subsequently, we surveyed 1715 African American samples from the Jackson Heart Study. RESULTS: Most samples in each dataset exhibited SV. SV detection enabled prediction of the immunogenic RhD and RhC antigens in concordance (>99%) with serological phenotyping. RhC antigen expression was associated with exon 2 hybrid alleles (RHCE*CE-D(2)-CE). Clinically relevant exon 4-7 hybrid alleles (RHD*D-CE(4-7)-D) and exon 9 hybrid alleles (RHCE*CE-D(9)-CE) were prevalent in African Americans. CONCLUSION: This study shows custom NGS methods can accurately detect RH SV, and that SV is important to inform prediction of relevant RH alleles. Additionally, this study provides the first large NGS survey of RH alleles in African Americans.


Subjects
Anemia, Sickle Cell/genetics; Genomics; High-Throughput Nucleotide Sequencing; Rh-Hr Blood-Group System/genetics; Black or African American/genetics; Alleles; Anemia, Sickle Cell/epidemiology; Anemia, Sickle Cell/physiopathology; Asian People/genetics; DNA Copy Number Variations/genetics; Ethnicity/genetics; Female; Genomic Structural Variation/genetics; Humans; Indians, North American/genetics; Male; Rh-Hr Blood-Group System/chemistry; Rh-Hr Blood-Group System/immunology; World Health Organization
9.
Metrologia ; 56(1), 2018.
Article in English | MEDLINE | ID: mdl-32863435

ABSTRACT

A detailed analysis of the uncertainties obtained in ac-dc difference measurements with an AC Josephson Voltage Standard (ACJVS) is presented. For audio frequencies and for voltages less than 200 mV, ac-dc transfers with the ACJVS may reduce the combined uncertainty by factors of 2 to 10, compared with conventional methods based on thermal converters. Type A uncertainties are predominantly limited by the thermal transfer standard (TTS), or the digital voltmeter used to acquire the output voltage from the TTS. In agreement with earlier work, the transmission line is the primary contributor to Type B errors for frequencies above 10 kHz. A Monte Carlo sensitivity analysis is used to demonstrate how the uncertainties of transmission line impedance and on-chip inductance impact the accuracy of the rms amplitude conveyed to the TTS.

10.
Nucleic Acids Res ; 43(18): e116, 2015 Oct 15.
Article in English | MEDLINE | ID: mdl-26040699

ABSTRACT

We developed an innovative hybrid sequencing approach, IDP-fusion, to detect fusion genes, determine fusion sites and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating Third Generation Sequencing long reads and Second Generation Sequencing short reads. We applied IDP-fusion to PacBio data and Illumina data from the MCF-7 breast cancer cells. Compared with the existing tools, IDP-fusion detects fusion genes at higher precision and a very low false positive rate. The results show that IDP-fusion will be useful for unraveling the complexity of multiple fusion splices and fusion isoforms within tumorigenesis-relevant fusion genes.


Subjects
Carcinogenesis/genetics; Gene Expression Profiling; Gene Fusion; High-Throughput Nucleotide Sequencing/methods; Breast Neoplasms/genetics; Breast Neoplasms/metabolism; Female; Humans; MCF-7 Cells; Protein Isoforms/genetics; Protein Isoforms/metabolism; Sequence Alignment
11.
Proc Natl Acad Sci U S A ; 110(50): E4821-30, 2013 Dec 10.
Article in English | MEDLINE | ID: mdl-24282307

ABSTRACT

Although transcriptional and posttranscriptional events are detected in RNA-Seq data from second-generation sequencing, full-length mRNA isoforms are not captured. On the other hand, third-generation sequencing, which yields much longer reads, has current limitations of lower raw accuracy and throughput. Here, we combine second-generation sequencing and third-generation sequencing with a custom-designed method for isoform identification and quantification to generate a high-confidence isoform dataset for human embryonic stem cells (hESCs). We report 8,084 RefSeq-annotated isoforms detected as full-length and an additional 5,459 isoforms predicted through statistical inference. Over one-third of these are novel isoforms, including 273 RNAs from gene loci that have not previously been identified. Further characterization of the novel loci indicates that a subset is expressed in pluripotent cells but not in diverse fetal and adult tissues; moreover, their reduced expression perturbs the network of pluripotency-associated genes. Results suggest that gene identification, even in well-characterized human cell lines and tissues, is likely far from complete.


Subjects
Alternative Splicing/genetics; Embryonic Stem Cells/metabolism; Gene Expression Profiling/methods; High-Throughput Nucleotide Sequencing/methods; Protein Isoforms/genetics; Transcriptome/genetics; Embryonic Stem Cells/chemistry; Humans; Male
12.
Nat Methods ; 7(12): 995-1001, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21057495

ABSTRACT

Classical approaches to determine structures of noncoding RNA (ncRNA) probed only one RNA at a time with enzymes and chemicals, using gel electrophoresis to identify reactive positions. To accelerate RNA structure inference, we developed fragmentation sequencing (FragSeq), a high-throughput RNA structure probing method that uses high-throughput RNA sequencing of fragments generated by digestion with nuclease P1, which specifically cleaves single-stranded nucleic acids. In experiments probing the entire mouse nuclear transcriptome, we accurately and simultaneously mapped single-stranded RNA regions in multiple ncRNAs with known structure. We probed in two cell types to verify reproducibility. We also identified and experimentally validated structured regions in ncRNAs with, to our knowledge, no previously reported probing data.


Subjects
Gene Expression Profiling/methods; RNA/chemistry; RNA/genetics; Animals; Base Pairing; Base Sequence; Chromosome Mapping/methods; DNA Primers; Gene Library; Histones/genetics; Humans; Mice; Models, Molecular; Molecular Sequence Data; Neurons/physiology; Nucleic Acid Conformation; RNA, Untranslated/chemistry
13.
bioRxiv ; 2023 Sep 27.
Article in English | MEDLINE | ID: mdl-37808736

ABSTRACT

Resolving the molecular basis of a Mendelian condition (MC) remains challenging owing to the diverse mechanisms by which genetic variants cause disease. To address this, we developed a synchronized long-read genome, methylome, epigenome, and transcriptome sequencing approach, which enables accurate single-nucleotide, insertion-deletion, and structural variant calling and diploid de novo genome assembly, and permits the simultaneous elucidation of haplotype-resolved CpG methylation, chromatin accessibility, and full-length transcript information in a single long-read sequencing run. Application of this approach to an Undiagnosed Diseases Network (UDN) participant with a chromosome X;13 balanced translocation of uncertain significance revealed that this translocation disrupted the functioning of four separate genes (NBEA, PDK3, MAB21L1, and RB1) previously associated with single-gene MCs. Notably, the function of each gene was disrupted via a distinct mechanism that required integration of the four 'omes' to resolve. These included nonsense-mediated decay, fusion transcript formation, enhancer adoption, transcriptional readthrough silencing, and inappropriate X chromosome inactivation of autosomal genes. Overall, this highlights the utility of synchronized long-read multi-omic profiling for mechanistically resolving complex phenotypes.

14.
G3 (Bethesda) ; 12(3), 2022 03 04.
Article in English | MEDLINE | ID: mdl-35100340

ABSTRACT

Understanding hibernation in brown bears (Ursus arctos) can provide insight into some human diseases. During hibernation, brown bears experience periods of insulin resistance, physical inactivity, extreme bradycardia, obesity, and the absence of urine production. These states closely mimic aspects of human diseases such as type 2 diabetes, muscle atrophy, as well as renal and heart failure. The reversibility of these states from hibernation to active season enables the identification of mediators with possible therapeutic value for humans. Recent studies have identified genes and pathways that are differentially expressed between active and hibernation seasons in bears. However, little is known about the role of differential expression of gene isoforms on hibernation physiology. To identify both distinct and novel mRNA isoforms, full-length RNA-sequencing (Iso-Seq) was performed on adipose, skeletal muscle, and liver from three individual bears sampled during both active and hibernation seasons. The existing reference genome annotation was improved by combining it with the Iso-Seq data. Short-read RNA-sequencing data from six individuals were mapped to the new reference annotation to quantify differential isoform usage (DIU) between tissues and seasons. We identified differentially expressed isoforms in all three tissues, to varying degrees. Adipose had a high level of DIU with isoform switching, regardless of whether the genes were differentially expressed. Our analyses revealed that DIU, even in the absence of differential gene expression, is an important mechanism for modulating genes during hibernation. These findings demonstrate the value of isoform expression studies and will serve as the basis for deeper exploration into hibernation biology.


Subjects
Diabetes Mellitus, Type 2; Gene Expression Regulation; Hibernation; Ursidae; Adipose Tissue/metabolism; Animals; Diabetes Mellitus, Type 2/genetics; Diabetes Mellitus, Type 2/metabolism; Hibernation/genetics; Humans; Protein Isoforms/genetics; Protein Isoforms/metabolism; Ursidae/genetics; Ursidae/metabolism
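
Note on record 14: the abstract reports differential isoform usage (DIU) even where the gene as a whole is not differentially expressed. The sketch below illustrates that idea only; it is not the authors' pipeline. The read counts, the isoform names, and the helper names `isoform_fractions` and `usage_shift` are all hypothetical.

```python
def isoform_fractions(counts):
    """Convert per-isoform read counts for one gene into usage fractions."""
    total = sum(counts.values())
    if total == 0:
        # No reads for this gene: report zero usage for every isoform.
        return {name: 0.0 for name in counts}
    return {name: value / total for name, value in counts.items()}


def usage_shift(cond_a, cond_b):
    """Per-isoform change in usage fraction from condition A to condition B."""
    frac_a = isoform_fractions(cond_a)
    frac_b = isoform_fractions(cond_b)
    names = sorted(set(frac_a) | set(frac_b))
    return {n: frac_b.get(n, 0.0) - frac_a.get(n, 0.0) for n in names}


# Hypothetical counts for a single gene in two seasons (not data from the study).
active = {"isoform_1": 80, "isoform_2": 20}
hibernation = {"isoform_1": 30, "isoform_2": 70}

# Gene-level totals are identical, so a gene-level test would see no change.
print(sum(active.values()), sum(hibernation.values()))

# Isoform-level usage nevertheless switches from isoform_1 to isoform_2.
shift = usage_shift(active, hibernation)
for name, delta in shift.items():
    print(name, round(delta, 3))
```

A real analysis would add replicates and a statistical test; the point here is only that usage fractions can move while the gene total stays flat.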
15.
Phys Rev Lett ; 107(25): 255504, 2011 Dec 16.
Article in English | MEDLINE | ID: mdl-22243092

ABSTRACT

We have examined the role of the substrate on electron-phonon coupling in normal-metal films of Mn-doped Al at temperatures below 1 K. Normal metal-insulator-superconductor junctions were used to measure the electron temperature in the films as a function of Joule heating power and phonon temperature. Theory suggests that the distribution of phonons available for interaction with electrons in metal films may depend on the acoustic properties of the substrate, namely, that the electron-phonon coupling constant Σ would be larger on the substrate with smaller sound speed. In contrast, our results indicate that within experimental error (typically ±10%), Σ is unchanged among the two acoustically distinct substrates used in our investigation.

16.
Nat Commun ; 12(1): 5118, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34433829

ABSTRACT

TRP channel-associated factor 1/2 (TCAF1/TCAF2) proteins antagonistically regulate the cold-sensor protein TRPM8 in multiple human tissues. Understanding their significance has been complicated given the locus spans a gap-ridden region with complex segmental duplications in GRCh38. Using long-read sequencing, we sequence-resolve the locus, annotate full-length TCAF models in primate genomes, and show substantial human-specific TCAF copy number variation. We identify two human super haplogroups, H4 and H5, and establish that TCAF duplications originated ~1.7 million years ago but diversified only in Homo sapiens by recurrent structural mutations. Conversely, in all archaic-hominin samples the fixation for a specific H4 haplotype without duplication is likely due to positive selection. Here, our results of TCAF copy number expansion, selection signals in hominins, and differential TCAF2 expression between haplogroups and high TCAF2 and TRPM8 expression in liver and prostate in modern-day humans imply TCAF diversification among hominins potentially in response to cold or dietary adaptations.


Subjects
Gene Duplication; Hominidae/genetics; Membrane Proteins/genetics; Selection, Genetic; Animals; DNA Copy Number Variations; Evolution, Molecular; Genome, Human; Haplotypes; Humans; Neanderthals; Phylogeny
17.
Elife ; 9, 2020 12 02.
Article in English | MEDLINE | ID: mdl-33263279

ABSTRACT

Our understanding of the beads-on-a-string arrangement of nucleosomes has been built largely on high-resolution sequence-agnostic imaging methods and sequence-resolved bulk biochemical techniques. To bridge the divide between these approaches, we present the single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA). SAMOSA is a high-throughput single-molecule sequencing method that combines adenine methyltransferase footprinting and single-molecule real-time DNA sequencing to natively and nondestructively measure nucleosome positions on individual chromatin fibres. SAMOSA data allows unbiased classification of single-molecular 'states' of nucleosome occupancy on individual chromatin fibres. We leverage this to estimate nucleosome regularity and spacing on single chromatin fibres genome-wide, at predicted transcription factor binding motifs, and across human epigenomic domains. Our analyses suggest that chromatin is comprised of both regular and irregular single-molecular oligonucleosome patterns that differ subtly in their relative abundance across epigenomic domains. This irregularity is particularly striking in constitutive heterochromatin, which has typically been viewed as a conformationally static entity. Our proof-of-concept study provides a powerful new methodology for studying nucleosome organization at a previously intractable resolution and offers up new avenues for modeling and visualizing higher order chromatin structure.


Subjects
Chromatin/genetics; DNA/genetics; High-Throughput Nucleotide Sequencing; Nucleosomes/genetics; Single Molecule Imaging; Acetylation; Binding Sites; Chromatin/chemistry; Chromatin/metabolism; DNA/chemistry; DNA/metabolism; Epigenesis, Genetic; Histones/chemistry; Histones/genetics; Histones/metabolism; Humans; K562 Cells; Nucleic Acid Conformation; Nucleosomes/chemistry; Nucleosomes/metabolism; Proof of Concept Study; Protein Conformation; Protein Processing, Post-Translational; Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism; Transcription Factors/genetics; Transcription Factors/metabolism
18.
Nat Commun ; 11(1): 2326, 2020 05 11.
Article in English | MEDLINE | ID: mdl-32393825

ABSTRACT

Most human protein-coding genes are expressed as multiple isoforms, which greatly expands the functional repertoire of the encoded proteome. While at least one reliable open reading frame (ORF) model has been assigned for every coding gene, the majority of alternative isoforms remains uncharacterized due to (i) vast differences of overall levels between different isoforms expressed from common genes, and (ii) the difficulty of obtaining full-length transcript sequences. Here, we present ORF Capture-Seq (OCS), a flexible method that addresses both challenges for targeted full-length isoform sequencing applications using collections of cloned ORFs as probes. As a proof-of-concept, we show that an OCS pipeline focused on genes coding for transcription factors increases isoform detection by an order of magnitude when compared to unenriched samples. In short, OCS enables rapid discovery of isoforms from custom-selected genes and will accelerate mapping of the human transcriptome.


Subjects
Open Reading Frames/genetics; Sequence Analysis, RNA/methods; Humans; Protein Isoforms/genetics; Protein Isoforms/metabolism; RNA, Messenger/genetics; RNA, Messenger/metabolism; Reference Standards; Transcription Factors/genetics
19.
Genome Biol ; 21(1): 202, 2020 08 10.
Article in English | MEDLINE | ID: mdl-32778141

ABSTRACT

BACKGROUND: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human-ape gene families, nuclear pore interacting protein (NPIP). RESULTS: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis. CONCLUSIONS: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.


Subjects
Evolution, Molecular; Gene Duplication; Primates/genetics; Segmental Duplications, Genomic; Animals; Biodiversity; Brain; Chromosome Mapping; Chromosomes; Exons; Gene Fusion; Genome, Human; Genomic Instability; Hominidae; Humans; Phylogeny
20.
Science ; 370(6523), 2020 12 18.
Article in English | MEDLINE | ID: mdl-33335035

ABSTRACT

The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.


Subjects
Genetic Predisposition to Disease; Genome; Macaca mulatta/genetics; Polymorphism, Single Nucleotide; Animals; Genetic Variation; Humans; Molecular Sequence Annotation; Whole Genome Sequencing
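
Note on record 20: the abstract quotes a contig N50 of 46 Mbp for Mmul_10. For readers unfamiliar with the metric, the sketch below computes N50 under its usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name `n50` and the toy contig lengths are illustrative assumptions, not values from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk contigs from longest to shortest, accumulating total length.
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to half the
        # assembly size defines the N50.
        if running >= half:
            return length


# Hypothetical contig lengths (arbitrary units), total = 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # prints 70, since 80 + 70 reaches half of 300
```

A higher N50 means that more of the assembly sits in long contigs, which is why the metric is commonly reported alongside contiguity improvements.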