Results 1 - 20 of 40
1.
Genome Res ; 2024 Jun 21.
Article in English | MEDLINE | ID: mdl-38906680

ABSTRACT

Transcription and translation are intertwined processes where mRNA isoforms are crucial intermediaries. However, methodological limitations in analyzing translation at the mRNA isoform level have left gaps in our understanding of critical biological processes. To address these gaps, we developed an integrated computational and experimental framework called long-read Ribo-STAMP (LR-Ribo-STAMP) that capitalizes on advancements in long-read sequencing and RNA-base editing-mediated technologies to simultaneously profile translation and transcription at both gene and mRNA isoform levels. We also developed the EditsC metric to quantify editing and leverage the single-molecule, full-length transcript information provided by long-read sequencing. Here, we report concordance between gene-level translation profiles obtained with long-read and short-read Ribo-STAMP. We show that LR-Ribo-STAMP successfully profiles translation of mRNA isoforms and links regulatory features, such as upstream open reading frames (uORFs), to translation measurements. We apply LR-Ribo-STAMP to discovering translational differences at both gene and isoform levels in a triple-negative breast cancer cell line under normoxia and hypoxia and find that LR-Ribo-STAMP effectively delineates orthogonal transcriptional and translation shifts between conditions. We also discover regulatory elements that distinguish translational differences at the isoform level. We highlight GRK6, where hypoxia is observed to increase expression and translation of a shorter mRNA isoform, giving rise to a truncated protein without the AGC Kinase domain. Overall, LR-Ribo-STAMP is an important advance in our repertoire of methods that measure mRNA translation with isoform sensitivity.

2.
Nature ; 594(7861): 77-81, 2021 06.
Article in English | MEDLINE | ID: mdl-33953399

ABSTRACT

The divergence of chimpanzee and bonobo provides one of the few examples of recent hominid speciation [1,2]. Here we describe a fully annotated, high-quality bonobo genome assembly, which was constructed without guidance from reference genomes by applying a multiplatform genomics approach. We generate a bonobo genome assembly in which more than 98% of genes are completely annotated and 99% of the gaps are closed, including the resolution of about half of the segmental duplications and almost all of the full-length mobile elements. We compare the bonobo genome to those of other great apes [1,3-5] and identify more than 5,569 fixed structural variants that specifically distinguish the bonobo and chimpanzee lineages. We focus on genes that have been lost, changed in structure or expanded in the last few million years of bonobo evolution. We produce a high-resolution map of incomplete lineage sorting and estimate that around 5.1% of the human genome is genetically closer to chimpanzee or bonobo and that more than 36.5% of the genome shows incomplete lineage sorting if we consider a deeper phylogeny including gorilla and orangutan. We also show that 26% of the segments of incomplete lineage sorting between human and chimpanzee or human and bonobo are non-randomly distributed and that genes within these clustered segments show significant excess of amino acid replacement compared to the rest of the genome.


Subjects
Evolution, Molecular , Genome/genetics , Genomics , Pan paniscus/genetics , Phylogeny , Animals , Eukaryotic Initiation Factor-4A/genetics , Female , Genes , Gorilla gorilla/genetics , Molecular Sequence Annotation/standards , Pan troglodytes/genetics , Pongo/genetics , Segmental Duplications, Genomic , Sequence Analysis, DNA
3.
Nat Methods ; 18(5): 507-519, 2021 05.
Article in English | MEDLINE | ID: mdl-33963355

ABSTRACT

RNA-binding proteins (RBPs) are critical regulators of gene expression and RNA processing that are required for gene function. Yet the dynamics of RBP regulation in single cells is unknown. To address this gap in understanding, we developed STAMP (Surveying Targets by APOBEC-Mediated Profiling), which efficiently detects RBP-RNA interactions. STAMP does not rely on ultraviolet cross-linking or immunoprecipitation and, when coupled with single-cell capture, can identify RBP-specific and cell-type-specific RNA-protein interactions for multiple RBPs and cell types in single, pooled experiments. Pairing STAMP with long-read sequencing yields RBP target sites in an isoform-specific manner. Finally, Ribo-STAMP leverages small ribosomal subunits to measure transcriptome-wide ribosome association in single cells. STAMP enables the study of RBP-RNA interactomes and translational landscapes with unprecedented cellular resolution.


Subjects
RNA-Binding Proteins/metabolism , RNA/metabolism , Single-Cell Analysis/methods , Animals , Binding Sites , Gene Expression Profiling , HEK293 Cells , Humans , Nanopore Sequencing , RNA/chemistry , RNA-Binding Proteins/chemistry , Sequence Analysis, RNA , Transcriptome
4.
Nucleic Acids Res ; 50(14): 7801-7815, 2022 08 12.
Article in English | MEDLINE | ID: mdl-35253883

ABSTRACT

Centromeres are the chromosomal loci essential for faithful chromosome segregation during cell division. Although centromeres are transcribed and produce non-coding RNAs (cenRNAs) that affect centromere function, we still lack a mechanistic understanding of how centromere transcription is regulated. Here, using a targeted RNA isoform sequencing approach, we identified the transcriptional landscape at and surrounding all centromeres in budding yeast. Overall, cenRNAs are derived from transcription readthrough of pericentromeric regions but rarely span the entire centromere and are a complex mixture of molecules that are heterogeneous in abundance, orientation, and sequence. While most pericentromeres are transcribed throughout the cell cycle, centromere accessibility to the transcription machinery is restricted to S-phase. This temporal restriction is dependent on Cbf1, a centromere-binding transcription factor, that we demonstrate acts locally as a transcriptional roadblock. Cbf1 deletion leads to an accumulation of cenRNAs at all phases of the cell cycle which correlates with increased chromosome mis-segregation that is partially rescued when the roadblock activity is restored. We propose that a Cbf1-mediated transcriptional roadblock protects yeast centromeres from untimely transcription to ensure genomic stability.


Centromeres are essential chromosomal regions that do not encode gene products and instead ensure the accurate partitioning of chromosomes during cell division. Despite the lack of genes, transcription has been detected at centromeres. It has not been clear where this centromeric RNA comes from and how it is regulated. In this study, the authors identified all of the centromeric RNAs at and around budding yeast centromeres during the cell cycle. Unlike RNAs that encode proteins, centromeric RNAs are a complex mixture of transcripts that result from adjacent RNAs that continue into the centromere. The authors found that most transcription is blocked at the centromere border by a protein called Cbf1. This mechanism shields the centromere from untimely transcription to ensure genome stability.


Subjects
Centromere , Saccharomyces cerevisiae Proteins , Basic Helix-Loop-Helix Leucine Zipper Transcription Factors/metabolism , Centromere/genetics , Centromere/metabolism , Chromosome Segregation/genetics , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism , Transcription, Genetic
5.
Genome Res ; 28(10): 1566-1576, 2018 10.
Article in English | MEDLINE | ID: mdl-30228200

ABSTRACT

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits.


Subjects
Brain/metabolism , Segmental Duplications, Genomic , Sequence Analysis, DNA/methods , Sequence Analysis, RNA/methods , Evolution, Molecular , Gene Duplication , Gene Expression Profiling , Humans , Molecular Sequence Annotation , Multigene Family , Open Reading Frames , Pseudogenes
6.
Genome Res ; 28(7): 1029-1038, 2018 07.
Article in English | MEDLINE | ID: mdl-29884752

ABSTRACT

The recent introductions of low-cost, long-read, and read-cloud sequencing technologies coupled with intense efforts to develop efficient algorithms have made affordable, high-quality de novo sequence assembly a realistic proposition. The result is an explosion of new, ultracontiguous genome assemblies. To compare these genomes, we need robust methods for genome annotation. We describe the fully open source Comparative Annotation Toolkit (CAT), which provides a flexible way to simultaneously annotate entire clades and identify orthology relationships. We show that CAT can be used to improve annotations on the rat genome, annotate the great apes, annotate a diverse set of mammals, and annotate personal, diploid human genomes. We demonstrate the resulting discovery of novel genes, isoforms, and structural variants (even in genomes as well studied as rat and the great apes) and how these annotations improve cross-species RNA expression experiments.


Subjects
Genome, Human/genetics , Algorithms , Animals , High-Throughput Nucleotide Sequencing/methods , Humans , Molecular Sequence Annotation/methods , RNA/genetics , Rats
7.
Genet Med ; 21(2): 477-486, 2019 02.
Article in English | MEDLINE | ID: mdl-29955105

ABSTRACT

PURPOSE: Rh antigens can provoke severe alloimmune reactions, particularly in high-risk transfusion contexts, such as sickle cell disease. Rh antigens are encoded by the paralogs, RHD and RHCE, located in one of the most complex genetic loci. Our goal was to characterize RH genetic variation in multi-ethnic cohorts, with the focus on detecting RH structural variation (SV). METHODS: We customized analytical methods to estimate paralog-specific copy number from next-generation sequencing (NGS) data. We applied these methods to clinically characterized samples, including four World Health Organization (WHO) genotyping references and 1135 Asian and Native American blood donors. Subsequently, we surveyed 1715 African American samples from the Jackson Heart Study. RESULTS: Most samples in each dataset exhibited SV. SV detection enabled prediction of the immunogenic RhD and RhC antigens in concordance (>99%) with serological phenotyping. RhC antigen expression was associated with exon 2 hybrid alleles (RHCE*CE-D(2)-CE). Clinically relevant exon 4-7 hybrid alleles (RHD*D-CE(4-7)-D) and exon 9 hybrid alleles (RHCE*CE-D(9)-CE) were prevalent in African Americans. CONCLUSION: This study shows custom NGS methods can accurately detect RH SV, and that SV is important to inform prediction of relevant RH alleles. Additionally, this study provides the first large NGS survey of RH alleles in African Americans.


Subjects
Anemia, Sickle Cell/genetics , Genomics , High-Throughput Nucleotide Sequencing , Rh-Hr Blood-Group System/genetics , Black or African American/genetics , Alleles , Anemia, Sickle Cell/epidemiology , Anemia, Sickle Cell/physiopathology , Asian People/genetics , DNA Copy Number Variations/genetics , Ethnicity/genetics , Female , Genomic Structural Variation/genetics , Humans , Indians, North American/genetics , Male , Rh-Hr Blood-Group System/chemistry , Rh-Hr Blood-Group System/immunology , World Health Organization
8.
Nucleic Acids Res ; 43(18): e116, 2015 Oct 15.
Article in English | MEDLINE | ID: mdl-26040699

ABSTRACT

We developed an innovative hybrid sequencing approach, IDP-fusion, to detect fusion genes, determine fusion sites and identify and quantify fusion isoforms. IDP-fusion is the first method to study gene fusion events by integrating Third Generation Sequencing long reads and Second Generation Sequencing short reads. We applied IDP-fusion to PacBio data and Illumina data from the MCF-7 breast cancer cells. Compared with the existing tools, IDP-fusion detects fusion genes at higher precision and a very low false positive rate. The results show that IDP-fusion will be useful for unraveling the complexity of multiple fusion splices and fusion isoforms within tumorigenesis-relevant fusion genes.


Subjects
Carcinogenesis/genetics , Gene Expression Profiling , Gene Fusion , High-Throughput Nucleotide Sequencing/methods , Breast Neoplasms/genetics , Breast Neoplasms/metabolism , Female , Humans , MCF-7 Cells , Protein Isoforms/genetics , Protein Isoforms/metabolism , Sequence Alignment
9.
Proc Natl Acad Sci U S A ; 110(50): E4821-30, 2013 Dec 10.
Article in English | MEDLINE | ID: mdl-24282307

ABSTRACT

Although transcriptional and posttranscriptional events are detected in RNA-Seq data from second-generation sequencing, full-length mRNA isoforms are not captured. On the other hand, third-generation sequencing, which yields much longer reads, has current limitations of lower raw accuracy and throughput. Here, we combine second-generation sequencing and third-generation sequencing with a custom-designed method for isoform identification and quantification to generate a high-confidence isoform dataset for human embryonic stem cells (hESCs). We report 8,084 RefSeq-annotated isoforms detected as full-length and an additional 5,459 isoforms predicted through statistical inference. Over one-third of these are novel isoforms, including 273 RNAs from gene loci that have not previously been identified. Further characterization of the novel loci indicates that a subset is expressed in pluripotent cells but not in diverse fetal and adult tissues; moreover, their reduced expression perturbs the network of pluripotency-associated genes. Results suggest that gene identification, even in well-characterized human cell lines and tissues, is likely far from complete.


Subjects
Alternative Splicing/genetics , Embryonic Stem Cells/metabolism , Gene Expression Profiling/methods , High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics , Transcriptome/genetics , Embryonic Stem Cells/chemistry , Humans , Male
10.
Nat Methods ; 7(12): 995-1001, 2010 Dec.
Article in English | MEDLINE | ID: mdl-21057495

ABSTRACT

Classical approaches to determine structures of noncoding RNA (ncRNA) probed only one RNA at a time with enzymes and chemicals, using gel electrophoresis to identify reactive positions. To accelerate RNA structure inference, we developed fragmentation sequencing (FragSeq), a high-throughput RNA structure probing method that uses high-throughput RNA sequencing of fragments generated by digestion with nuclease P1, which specifically cleaves single-stranded nucleic acids. In experiments probing the entire mouse nuclear transcriptome, we accurately and simultaneously mapped single-stranded RNA regions in multiple ncRNAs with known structure. We probed in two cell types to verify reproducibility. We also identified and experimentally validated structured regions in ncRNAs with, to our knowledge, no previously reported probing data.


Subjects
Gene Expression Profiling/methods , RNA/chemistry , RNA/genetics , Animals , Base Pairing , Base Sequence , Chromosome Mapping/methods , DNA Primers , Gene Library , Histones/genetics , Humans , Mice , Models, Molecular , Molecular Sequence Data , Neurons/physiology , Nucleic Acid Conformation , RNA, Untranslated/chemistry
11.
bioRxiv ; 2023 Sep 27.
Article in English | MEDLINE | ID: mdl-37808736

ABSTRACT

Resolving the molecular basis of a Mendelian condition (MC) remains challenging owing to the diverse mechanisms by which genetic variants cause disease. To address this, we developed a synchronized long-read genome, methylome, epigenome, and transcriptome sequencing approach, which enables accurate single-nucleotide, insertion-deletion, and structural variant calling and diploid de novo genome assembly, and permits the simultaneous elucidation of haplotype-resolved CpG methylation, chromatin accessibility, and full-length transcript information in a single long-read sequencing run. Application of this approach to an Undiagnosed Diseases Network (UDN) participant with a chromosome X;13 balanced translocation of uncertain significance revealed that this translocation disrupted the functioning of four separate genes (NBEA, PDK3, MAB21L1, and RB1) previously associated with single-gene MCs. Notably, the function of each gene was disrupted via a distinct mechanism that required integration of the four 'omes' to resolve. These included nonsense-mediated decay, fusion transcript formation, enhancer adoption, transcriptional readthrough silencing, and inappropriate X chromosome inactivation of autosomal genes. Overall, this highlights the utility of synchronized long-read multi-omic profiling for mechanistically resolving complex phenotypes.

12.
G3 (Bethesda) ; 12(3), 2022 03 04.
Article in English | MEDLINE | ID: mdl-35100340

ABSTRACT

Understanding hibernation in brown bears (Ursus arctos) can provide insight into some human diseases. During hibernation, brown bears experience periods of insulin resistance, physical inactivity, extreme bradycardia, obesity, and the absence of urine production. These states closely mimic aspects of human diseases such as type 2 diabetes, muscle atrophy, as well as renal and heart failure. The reversibility of these states from hibernation to active season enables the identification of mediators with possible therapeutic value for humans. Recent studies have identified genes and pathways that are differentially expressed between active and hibernation seasons in bears. However, little is known about the role of differential expression of gene isoforms on hibernation physiology. To identify both distinct and novel mRNA isoforms, full-length RNA-sequencing (Iso-Seq) was performed on adipose, skeletal muscle, and liver from three individual bears sampled during both active and hibernation seasons. The existing reference genome annotation was improved by combining it with the Iso-Seq data. Short-read RNA-sequencing data from six individuals were mapped to the new reference annotation to quantify differential isoform usage (DIU) between tissues and seasons. We identified differentially expressed isoforms in all three tissues, to varying degrees. Adipose had a high level of DIU with isoform switching, regardless of whether the genes were differentially expressed. Our analyses revealed that DIU, even in the absence of differential gene expression, is an important mechanism for modulating genes during hibernation. These findings demonstrate the value of isoform expression studies and will serve as the basis for deeper exploration into hibernation biology.


Subjects
Diabetes Mellitus, Type 2 , Gene Expression Regulation , Hibernation , Ursidae , Adipose Tissue/metabolism , Animals , Diabetes Mellitus, Type 2/genetics , Diabetes Mellitus, Type 2/metabolism , Hibernation/genetics , Humans , Protein Isoforms/genetics , Protein Isoforms/metabolism , Ursidae/genetics , Ursidae/metabolism
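
Note on record 12: the abstract reports differential isoform usage (DIU) even where total gene expression does not change, but it does not say how DIU was computed. One common way to express the idea is the isoform fraction, meaning each isoform's share of its gene's total expression, compared between conditions. The Python sketch below illustrates that idea only; the function names, isoform identifiers, and counts are hypothetical and are not taken from the study.

```python
from collections import defaultdict


def isoform_fractions(counts, gene_of):
    """Return each isoform's share of its parent gene's total expression.

    counts  : dict mapping isoform id -> expression value (hypothetical data)
    gene_of : dict mapping isoform id -> gene id
    """
    gene_totals = defaultdict(float)
    for isoform, value in counts.items():
        gene_totals[gene_of[isoform]] += value

    fractions = {}
    for isoform, value in counts.items():
        total = gene_totals[gene_of[isoform]]
        # A gene with no expression has no defined usage; report 0.0.
        fractions[isoform] = value / total if total > 0 else 0.0
    return fractions


def usage_shift(counts_a, counts_b, gene_of):
    """Difference in isoform fraction (condition B minus condition A)."""
    frac_a = isoform_fractions(counts_a, gene_of)
    frac_b = isoform_fractions(counts_b, gene_of)
    isoforms = sorted(set(frac_a) | set(frac_b))
    return {i: frac_b.get(i, 0.0) - frac_a.get(i, 0.0) for i in isoforms}


# Toy example: the gene total is 100 in both seasons, yet usage switches.
gene_of = {"geneX.long": "geneX", "geneX.short": "geneX"}
active = {"geneX.long": 80, "geneX.short": 20}
hibernation = {"geneX.long": 20, "geneX.short": 80}
shift = usage_shift(active, hibernation, gene_of)
```

In the toy data the gene-level total is identical in both conditions, so a gene-level test would see no change, while the isoform fractions move by 0.6 in opposite directions. Published DIU tools add statistical testing across replicates, which this sketch omits.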
13.
Nat Commun ; 12(1): 5118, 2021 08 25.
Article in English | MEDLINE | ID: mdl-34433829

ABSTRACT

TRP channel-associated factor 1/2 (TCAF1/TCAF2) proteins antagonistically regulate the cold-sensor protein TRPM8 in multiple human tissues. Understanding their significance has been complicated given the locus spans a gap-ridden region with complex segmental duplications in GRCh38. Using long-read sequencing, we sequence-resolve the locus, annotate full-length TCAF models in primate genomes, and show substantial human-specific TCAF copy number variation. We identify two human super haplogroups, H4 and H5, and establish that TCAF duplications originated ~1.7 million years ago but diversified only in Homo sapiens by recurrent structural mutations. Conversely, in all archaic-hominin samples the fixation for a specific H4 haplotype without duplication is likely due to positive selection. Here, our results of TCAF copy number expansion, selection signals in hominins, and differential TCAF2 expression between haplogroups and high TCAF2 and TRPM8 expression in liver and prostate in modern-day humans imply TCAF diversification among hominins potentially in response to cold or dietary adaptations.


Subjects
Gene Duplication , Hominidae/genetics , Membrane Proteins/genetics , Selection, Genetic , Animals , DNA Copy Number Variations , Evolution, Molecular , Genome, Human , Haplotypes , Humans , Neanderthals , Phylogeny
14.
Elife ; 9, 2020 12 02.
Article in English | MEDLINE | ID: mdl-33263279

ABSTRACT

Our understanding of the beads-on-a-string arrangement of nucleosomes has been built largely on high-resolution sequence-agnostic imaging methods and sequence-resolved bulk biochemical techniques. To bridge the divide between these approaches, we present the single-molecule adenine methylated oligonucleosome sequencing assay (SAMOSA). SAMOSA is a high-throughput single-molecule sequencing method that combines adenine methyltransferase footprinting and single-molecule real-time DNA sequencing to natively and nondestructively measure nucleosome positions on individual chromatin fibres. SAMOSA data allow unbiased classification of single-molecular 'states' of nucleosome occupancy on individual chromatin fibres. We leverage this to estimate nucleosome regularity and spacing on single chromatin fibres genome-wide, at predicted transcription factor binding motifs, and across human epigenomic domains. Our analyses suggest that chromatin is composed of both regular and irregular single-molecular oligonucleosome patterns that differ subtly in their relative abundance across epigenomic domains. This irregularity is particularly striking in constitutive heterochromatin, which has typically been viewed as a conformationally static entity. Our proof-of-concept study provides a powerful new methodology for studying nucleosome organization at a previously intractable resolution and offers up new avenues for modeling and visualizing higher order chromatin structure.


Subjects
Chromatin/genetics , DNA/genetics , High-Throughput Nucleotide Sequencing , Nucleosomes/genetics , Single Molecule Imaging , Acetylation , Binding Sites , Chromatin/chemistry , Chromatin/metabolism , DNA/chemistry , DNA/metabolism , Epigenesis, Genetic , Histones/chemistry , Histones/genetics , Histones/metabolism , Humans , K562 Cells , Nucleic Acid Conformation , Nucleosomes/chemistry , Nucleosomes/metabolism , Proof of Concept Study , Protein Conformation , Protein Processing, Post-Translational , Site-Specific DNA-Methyltransferase (Adenine-Specific)/metabolism , Transcription Factors/genetics , Transcription Factors/metabolism
15.
Nat Commun ; 11(1): 2326, 2020 05 11.
Article in English | MEDLINE | ID: mdl-32393825

ABSTRACT

Most human protein-coding genes are expressed as multiple isoforms, which greatly expands the functional repertoire of the encoded proteome. While at least one reliable open reading frame (ORF) model has been assigned for every coding gene, the majority of alternative isoforms remains uncharacterized due to (i) vast differences of overall levels between different isoforms expressed from common genes, and (ii) the difficulty of obtaining full-length transcript sequences. Here, we present ORF Capture-Seq (OCS), a flexible method that addresses both challenges for targeted full-length isoform sequencing applications using collections of cloned ORFs as probes. As a proof-of-concept, we show that an OCS pipeline focused on genes coding for transcription factors increases isoform detection by an order of magnitude when compared to unenriched samples. In short, OCS enables rapid discovery of isoforms from custom-selected genes and will accelerate mapping of the human transcriptome.


Subjects
Open Reading Frames/genetics , Sequence Analysis, RNA/methods , Humans , Protein Isoforms/genetics , Protein Isoforms/metabolism , RNA, Messenger/genetics , RNA, Messenger/metabolism , Reference Standards , Transcription Factors/genetics
16.
Genome Biol ; 21(1): 202, 2020 08 10.
Article in English | MEDLINE | ID: mdl-32778141

ABSTRACT

BACKGROUND: The complex interspersed pattern of segmental duplications in humans is responsible for rearrangements associated with neurodevelopmental disease, including the emergence of novel genes important in human brain evolution. We investigate the evolution of LCR16a, a putative driver of this phenomenon that encodes one of the most rapidly evolving human-ape gene families, nuclear pore interacting protein (NPIP). RESULTS: Comparative analysis shows that LCR16a has independently expanded in five primate lineages over the last 35 million years of primate evolution. The expansions are associated with independent lineage-specific segmental duplications flanking LCR16a leading to the emergence of large interspersed duplication blocks at non-orthologous chromosomal locations in each primate lineage. The intron-exon structure of the NPIP gene family has changed dramatically throughout primate evolution with different branches showing characteristic gene models yet maintaining an open reading frame. In the African ape lineage, we detect signatures of positive selection that occurred after a transition to more ubiquitous expression among great ape tissues when compared to Old World and New World monkeys. Mouse transgenic experiments from baboon and human genomic loci confirm these expression differences and suggest that the broader ape expression pattern arose due to mutational changes that emerged in cis. CONCLUSIONS: LCR16a promotes serial interspersed duplications and creates hotspots of genomic instability that appear to be an ancient property of primate genomes. Dramatic changes to NPIP gene structure and altered tissue expression preceded major bouts of positive selection in the African ape lineage, suggestive of a gene undergoing strong adaptive evolution.


Subjects
Evolution, Molecular , Gene Duplication , Primates/genetics , Segmental Duplications, Genomic , Animals , Biodiversity , Brain , Chromosome Mapping , Chromosomes , Exons , Gene Fusion , Genome, Human , Genomic Instability , Hominidae , Humans , Phylogeny
17.
Science ; 370(6523), 2020 12 18.
Article in English | MEDLINE | ID: mdl-33335035

ABSTRACT

The rhesus macaque (Macaca mulatta) is the most widely studied nonhuman primate (NHP) in biomedical research. We present an updated reference genome assembly (Mmul_10, contig N50 = 46 Mbp) that increases the sequence contiguity 120-fold and annotate it using 6.5 million full-length transcripts, thus improving our understanding of gene content, isoform diversity, and repeat organization. With the improved assembly of segmental duplications, we discovered new lineage-specific genes and expanded gene families that are potentially informative in studies of evolution and disease susceptibility. Whole-genome sequencing (WGS) data from 853 rhesus macaques identified 85.7 million single-nucleotide variants (SNVs) and 10.5 million indel variants, including potentially damaging variants in genes associated with human autism and developmental delay, providing a framework for developing noninvasive NHP models of human disease.


Subjects
Genetic Predisposition to Disease , Genome , Macaca mulatta/genetics , Polymorphism, Single Nucleotide , Animals , Genetic Variation , Humans , Molecular Sequence Annotation , Whole Genome Sequencing
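
Note on record 17: the abstract quotes a contig N50 of 46 Mbp for Mmul_10. N50 is a standard contiguity statistic: the largest length L such that contigs of length L or longer together cover at least half of the total assembly. The Python sketch below shows that definition on a made-up list of contig lengths; the function name and the example values are illustrative assumptions, not data from the assembly.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; N50 is the length of the
    contig at which the running sum first reaches half of the total length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("N50 is undefined for an empty assembly")

    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Compare 2 * running with total to stay in integer arithmetic.
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in arbitrary units).
toy_contigs = [2, 3, 4, 5, 6]
toy_n50 = n50(toy_contigs)
```

Because N50 depends only on the sorted lengths, merging contigs raises it without adding sequence, which is why it is read as a measure of contiguity and not of completeness or accuracy.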
18.
Science ; 366(6463), 2019 10 18.
Article in English | MEDLINE | ID: mdl-31624180

ABSTRACT

Copy number variants (CNVs) are subject to stronger selective pressure than single-nucleotide variants, but their roles in archaic introgression and adaptation have not been systematically investigated. We show that stratified CNVs are significantly associated with signatures of positive selection in Melanesians and provide evidence for adaptive introgression of large CNVs at chromosomes 16p11.2 and 8p21.3 from Denisovans and Neanderthals, respectively. Using long-read sequence data, we reconstruct the structure and complex evolutionary history of these polymorphisms and show that both encode positively selected genes absent from most human populations. Our results collectively suggest that large CNVs originating in archaic hominins and introgressed into modern humans have played an important role in local population adaptation and represent an insufficiently studied source of large-scale genetic variation.


Subjects
Genetic Introgression , Animals , Chromosome Duplication , Chromosomes, Human, Pair 16/genetics , Chromosomes, Human, Pair 8/genetics , DNA Copy Number Variations , Evolution, Molecular , Genome, Human , Haplotypes , Hominidae/genetics , Humans , Melanesia , Models, Genetic , Neanderthals/genetics , Polymorphism, Genetic , Selection, Genetic , Whole Genome Sequencing
19.
Mol Cell Biol ; 25(22): 10005-16, 2005 Nov.
Article in English | MEDLINE | ID: mdl-16260614

ABSTRACT

A vertebrate homologue of the Fox-1 protein from C. elegans was recently shown to bind to the element GCAUG and to act as an inhibitor of alternative splicing patterns in muscle. The element UGCAUG is a splicing enhancer element found downstream of numerous neuron-specific exons. We show here that mouse Fox-1 (mFox-1) and another homologue, Fox-2, are both specifically expressed in neurons in addition to muscle and heart. The mammalian Fox genes are very complex transcription units that generate transcripts from multiple promoters and with multiple internal exons whose inclusion is regulated. These genes produce a large family of proteins with variable N and C termini and internal deletions. We show that the overexpression of both Fox-1 and Fox-2 isoforms specifically activates splicing of neuronally regulated exons. This splicing activation requires UGCAUG enhancer elements. Conversely, RNA interference-mediated knockdown of Fox protein expression inhibits splicing of UGCAUG-dependent exons. These experiments show that this large family of proteins regulates splicing in the nervous system. They do this through a splicing enhancer function, in addition to their apparent negative effects on splicing in vertebrate muscle and in worms.


Subjects
Caenorhabditis elegans Proteins/physiology , Carrier Proteins/physiology , Neurons/metabolism , RNA Splicing , RNA-Binding Proteins/physiology , Repressor Proteins/physiology , Animals , Blotting, Northern , Blotting, Western , Brain/metabolism , Caenorhabditis elegans , Caenorhabditis elegans Proteins/genetics , Carrier Proteins/genetics , Cross-Linking Reagents/pharmacology , Enhancer Elements, Genetic , Exons , Fibronectins/metabolism , Gene Expression Regulation , HeLa Cells , Hippocampus/metabolism , Humans , Immunohistochemistry , Introns , Mice , Models, Genetic , Muscles/metabolism , Plasmids/metabolism , Promoter Regions, Genetic , Protein Binding , Protein Isoforms , Protein Structure, Tertiary , RNA Interference , RNA Splicing Factors , RNA, Messenger/metabolism , RNA, Small Interfering/metabolism , RNA-Binding Proteins/genetics , Repressor Proteins/genetics , Reverse Transcriptase Polymerase Chain Reaction , Tissue Distribution , Transfection
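
Note on record 19: the abstract describes UGCAUG as a splicing enhancer element found downstream of neuron-specific exons. Locating such elements in a sequence is a plain motif scan. The Python sketch below finds every occurrence, including overlapping ones, in an RNA or DNA string; the function name and the example sequence are hypothetical and serve only to illustrate the search, not the study's analysis.

```python
import re

FOX_MOTIF = "UGCAUG"  # enhancer element named in the abstract


def find_motif(sequence, motif=FOX_MOTIF):
    """Return 0-based start positions of every motif occurrence.

    The input may be RNA or DNA; it is upper-cased and T is read as U.
    A zero-width lookahead lets overlapping occurrences be reported.
    """
    rna = sequence.upper().replace("T", "U")
    pattern = re.compile("(?=" + re.escape(motif) + ")")
    return [match.start() for match in pattern.finditer(rna)]


# Hypothetical intronic fragment with two overlapping copies of the motif.
example_intron = "AAUGCAUGCAUGGG"
positions = find_motif(example_intron)
```

A scan like this only reports where the hexamer occurs. Whether a given site acts as an enhancer depends on context such as its position relative to the regulated exon, which the abstract addresses experimentally.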
20.
Science ; 360(6393), 2018 06 08.
Article in English | MEDLINE | ID: mdl-29880660

ABSTRACT

Genetic studies of human evolution require high-quality contiguous ape genome assemblies that are not guided by the human reference. We coupled long-read sequence assembly and full-length complementary DNA sequencing with a multiplatform scaffolding approach to produce ab initio chimpanzee and orangutan genome assemblies. By comparing these with two long-read de novo human genome assemblies and a gorilla genome assembly, we characterized lineage-specific and shared great ape genetic variation ranging from single- to mega-base pair-sized variants. We identified ~17,000 fixed human-specific structural variants identifying genic and putative regulatory changes that have emerged in humans since divergence from nonhuman apes. Interestingly, these variants are enriched near genes that are down-regulated in human compared to chimpanzee cerebral organoids, particularly in cells analogous to radial glial neural progenitors.


Subjects
Evolution, Molecular , Genome, Human , Hominidae/genetics , Animals , Contig Mapping , Genetic Variation , Humans , Molecular Sequence Annotation , Sequence Analysis, DNA