Search | BVS Integralidade em Saúde

1.

Disruptive CHD8 mutations define a subtype of autism early in development.

Bernier, Raphael; Golzio, Christelle; Xiong, Bo; Stessman, Holly A; Coe, Bradley P; Penn, Osnat; Witherspoon, Kali; Gerdts, Jennifer; Baker, Carl; Vulto-van Silfhout, Anneke T; Schuurs-Hoeijmakers, Janneke H; Fichera, Marco; Bosco, Paolo; Buono, Serafino; Alberti, Antonino; Failla, Pinella; Peeters, Hilde; Steyaert, Jean; Vissers, Lisenka E L M; Francescatto, Ludmila; Mefford, Heather C; Rosenfeld, Jill A; Bakken, Trygve; O'Roak, Brian J; Pawlus, Matthew; Moon, Randall; Shendure, Jay; Amaral, David G; Lein, Ed; Rankin, Julia; Romano, Corrado; de Vries, Bert B A; Katsanis, Nicholas; Eichler, Evan E.

Cell ; 158(2): 263-276, 2014 Jul 17.

Article in English | MEDLINE | ID: mdl-24998929

ABSTRACT

Autism spectrum disorder (ASD) is a heterogeneous disease in which efforts to define subtypes behaviorally have met with limited success. Hypothesizing that genetically based subtype identification may prove more productive, we resequenced the ASD-associated gene CHD8 in 3,730 children with developmental delay or ASD. We identified a total of 15 independent mutations; no truncating events were identified in 8,792 controls, including 2,289 unaffected siblings. In addition to a high likelihood of an ASD diagnosis among patients bearing CHD8 mutations, characteristics enriched in this group included macrocephaly, distinct faces, and gastrointestinal complaints. chd8 disruption in zebrafish recapitulates features of the human phenotype, including increased head size as a result of expansion of the forebrain/midbrain and impairment of gastrointestinal motility due to a reduction in postmitotic enteric neurons. Our findings indicate that CHD8 disruptions define a distinct ASD subtype and reveal unexpected comorbidities between brain development and enteric innervation.

Subjects

Child Development Disorders, Pervasive/genetics; Child Development Disorders, Pervasive/physiopathology; DNA-Binding Proteins/genetics; Transcription Factors/genetics; Adolescent; Amino Acid Sequence; Animals; Brain/growth & development; Brain/pathology; Child; Child Development Disorders, Pervasive/classification; Child Development Disorders, Pervasive/pathology; Child, Preschool; DNA-Binding Proteins/metabolism; Female; Gastrointestinal Tract/innervation; Gastrointestinal Tract/physiopathology; Humans; Macaca mulatta; Male; Megalencephaly/pathology; Molecular Sequence Data; Mutation; Sequence Alignment; Transcription Factors/metabolism; Zebrafish; Zebrafish Proteins/genetics; Zebrafish Proteins/metabolism

2.

Conserved cell types with divergent features in human versus mouse cortex.

Hodge, Rebecca D; Bakken, Trygve E; Miller, Jeremy A; Smith, Kimberly A; Barkan, Eliza R; Graybuck, Lucas T; Close, Jennie L; Long, Brian; Johansen, Nelson; Penn, Osnat; Yao, Zizhen; Eggermont, Jeroen; Höllt, Thomas; Levi, Boaz P; Shehata, Soraya I; Aevermann, Brian; Beller, Allison; Bertagnolli, Darren; Brouner, Krissy; Casper, Tamara; Cobbs, Charles; Dalley, Rachel; Dee, Nick; Ding, Song-Lin; Ellenbogen, Richard G; Fong, Olivia; Garren, Emma; Goldy, Jeff; Gwinn, Ryder P; Hirschstein, Daniel; Keene, C Dirk; Keshk, Mohamed; Ko, Andrew L; Lathia, Kanan; Mahfouz, Ahmed; Maltzer, Zoe; McGraw, Medea; Nguyen, Thuc Nghi; Nyhus, Julie; Ojemann, Jeffrey G; Oldre, Aaron; Parry, Sheana; Reynolds, Shannon; Rimorin, Christine; Shapovalova, Nadiya V; Somasundaram, Saroja; Szafer, Aaron; Thomsen, Elliot R; Tieu, Michael; Quon, Gerald.

Nature ; 573(7772): 61-68, 2019 09.

Article in English | MEDLINE | ID: mdl-31435019

ABSTRACT

Elucidating the cellular architecture of the human cerebral cortex is central to understanding our cognitive abilities and susceptibility to disease. Here we used single-nucleus RNA-sequencing analysis to perform a comprehensive study of cell types in the middle temporal gyrus of human cortex. We identified a highly diverse set of excitatory and inhibitory neuron types that are mostly sparse, with excitatory types being less layer-restricted than expected. Comparison to similar mouse cortex single-cell RNA-sequencing datasets revealed a surprisingly well-conserved cellular architecture that enables matching of homologous types and predictions of properties of human cell types. Despite this general conservation, we also found extensive differences between homologous human and mouse cell types, including marked alterations in proportions, laminar distributions, gene expression and morphology. These species-specific features emphasize the importance of directly studying human brain.

Subjects

Astrocytes/classification; Biological Evolution; Cerebral Cortex/cytology; Cerebral Cortex/metabolism; Neurons/classification; Adolescent; Adult; Aged; Animals; Astrocytes/cytology; Female; Humans; Male; Mice; Middle Aged; Neural Inhibition; Neurons/cytology; Principal Component Analysis; RNA-Seq; Single-Cell Analysis; Species Specificity; Transcriptome/genetics; Young Adult

3.

Shared and distinct transcriptomic cell types across neocortical areas.

Tasic, Bosiljka; Yao, Zizhen; Graybuck, Lucas T; Smith, Kimberly A; Nguyen, Thuc Nghi; Bertagnolli, Darren; Goldy, Jeff; Garren, Emma; Economo, Michael N; Viswanathan, Sarada; Penn, Osnat; Bakken, Trygve; Menon, Vilas; Miller, Jeremy; Fong, Olivia; Hirokawa, Karla E; Lathia, Kanan; Rimorin, Christine; Tieu, Michael; Larsen, Rachael; Casper, Tamara; Barkan, Eliza; Kroll, Matthew; Parry, Sheana; Shapovalova, Nadiya V; Hirschstein, Daniel; Pendergraft, Julie; Sullivan, Heather A; Kim, Tae Kyung; Szafer, Aaron; Dee, Nick; Groblewski, Peter; Wickersham, Ian; Cetin, Ali; Harris, Julie A; Levi, Boaz P; Sunkin, Susan M; Madisen, Linda; Daigle, Tanya L; Looger, Loren; Bernard, Amy; Phillips, John; Lein, Ed; Hawrylycz, Michael; Svoboda, Karel; Jones, Allan R; Koch, Christof; Zeng, Hongkui.

Nature ; 563(7729): 72-78, 2018 11.

Article in English | MEDLINE | ID: mdl-30382198

ABSTRACT

The neocortex contains a multitude of cell types that are segregated into layers and functionally distinct areas. To investigate the diversity of cell types across the mouse neocortex, here we analysed 23,822 cells from two areas at distant poles of the mouse neocortex: the primary visual cortex and the anterior lateral motor cortex. We define 133 transcriptomic cell types by deep, single-cell RNA sequencing. Nearly all types of GABA (γ-aminobutyric acid)-containing neurons are shared across both areas, whereas most types of glutamatergic neurons were found in one of the two areas. By combining single-cell RNA sequencing and retrograde labelling, we match transcriptomic types of glutamatergic neurons to their long-range projection specificity. Our study establishes a combined transcriptomic and projectional taxonomy of cortical cell types from functionally distinct areas of the adult mouse cortex.

Subjects

Gene Expression Profiling; Neocortex/cytology; Neocortex/metabolism; Animals; Biomarkers/analysis; Female; GABAergic Neurons/metabolism; Glutamic Acid/metabolism; Male; Mice; Motor Cortex/anatomy & histology; Motor Cortex/cytology; Motor Cortex/metabolism; Neocortex/anatomy & histology; Organ Specificity; Sequence Analysis, RNA; Single-Cell Analysis; Visual Cortex/anatomy & histology; Visual Cortex/cytology; Visual Cortex/metabolism

4.

Emergence of a Homo sapiens-specific gene family and chromosome 16p11.2 CNV susceptibility.

Nuttle, Xander; Giannuzzi, Giuliana; Duyzend, Michael H; Schraiber, Joshua G; Narvaiza, Iñigo; Sudmant, Peter H; Penn, Osnat; Chiatante, Giorgia; Malig, Maika; Huddleston, John; Benner, Chris; Camponeschi, Francesca; Ciofi-Baffoni, Simone; Stessman, Holly A F; Marchetto, Maria C N; Denman, Laura; Harshman, Lana; Baker, Carl; Raja, Archana; Penewit, Kelsi; Janke, Nicolette; Tang, W Joyce; Ventura, Mario; Banci, Lucia; Antonacci, Francesca; Akey, Joshua M; Amemiya, Chris T; Gage, Fred H; Reymond, Alexandre; Eichler, Evan E.

Nature ; 536(7615): 205-9, 2016 08 11.

Article in English | MEDLINE | ID: mdl-27487209

ABSTRACT

Genetic differences that specify unique aspects of human evolution have typically been identified by comparative analyses between the genomes of humans and closely related primates, including more recently the genomes of archaic hominins. Not all regions of the genome, however, are equally amenable to such study. Recurrent copy number variation (CNV) at chromosome 16p11.2 accounts for approximately 1% of cases of autism and is mediated by a complex set of segmental duplications, many of which arose recently during human evolution. Here we reconstruct the evolutionary history of the locus and identify bolA family member 2 (BOLA2) as a gene duplicated exclusively in Homo sapiens. We estimate that a 95-kilobase-pair segment containing BOLA2 duplicated across the critical region approximately 282 thousand years ago (ka), one of the latest among a series of genomic changes that dramatically restructured the locus during hominid evolution. All humans examined carried one or more copies of the duplication, which nearly fixed early in the human lineage, a pattern unlikely to have arisen so rapidly in the absence of selection (P < 0.0097). We show that the duplication of BOLA2 led to a novel, human-specific in-frame fusion transcript and that BOLA2 copy number correlates with both RNA expression (r = 0.36) and protein level (r = 0.65), with the greatest expression difference between human and chimpanzee in experimentally derived stem cells. Analyses of 152 patients carrying a chromosome 16p11.2 rearrangement show that more than 96% of breakpoints occur within the H. sapiens-specific duplication. In summary, the duplicative transposition of BOLA2 at the root of the H. sapiens lineage about 282 ka simultaneously increased copy number of a gene associated with iron homeostasis and predisposed our species to recurrent rearrangements associated with disease.

Subjects

Chromosomes, Human, Pair 16/genetics; DNA Copy Number Variations/genetics; Evolution, Molecular; Genetic Predisposition to Disease; Proteins/genetics; Animals; Autistic Disorder/genetics; Chromosome Breakage; Gene Duplication; Homeostasis/genetics; Humans; Iron/metabolism; Pan troglodytes/genetics; Pongo/genetics; Proteins/analysis; Recombination, Genetic; Species Specificity; Time Factors

5.

Transcriptional fates of human-specific segmental duplications in brain.

Dougherty, Max L; Underwood, Jason G; Nelson, Bradley J; Tseng, Elizabeth; Munson, Katherine M; Penn, Osnat; Nowakowski, Tomasz J; Pollen, Alex A; Eichler, Evan E.

Genome Res ; 28(10): 1566-1576, 2018 10.

Article in English | MEDLINE | ID: mdl-30228200

ABSTRACT

Despite the importance of duplicate genes for evolutionary adaptation, accurate gene annotation is often incomplete, incorrect, or lacking in regions of segmental duplication. We developed an approach combining long-read sequencing and hybridization capture to yield full-length transcript information and confidently distinguish between nearly identical genes/paralogs. We used biotinylated probes to enrich for full-length cDNA from duplicated regions, which were then amplified, size-fractionated, and sequenced using single-molecule, long-read sequencing technology, permitting us to distinguish between highly identical genes by virtue of multiple paralogous sequence variants. We examined 19 gene families as expressed in developing and adult human brain, selected for their high sequence identity (average >99%) and overlap with human-specific segmental duplications (SDs). We characterized the transcriptional differences between related paralogs to better understand the birth-death process of duplicate genes and particularly how the process leads to gene innovation. In 48% of the cases, we find that the expressed duplicates have changed substantially from their ancestral models due to novel sites of transcription initiation, splicing, and polyadenylation, as well as fusion transcripts that connect duplication-derived exons with neighboring genes. We detect unannotated open reading frames in genes currently annotated as pseudogenes, while relegating other duplicates to nonfunctional status. Our method significantly improves gene annotation, specifically defining full-length transcripts, isoforms, and open reading frames for new genes in highly identical SDs. The approach will be more broadly applicable to genes in structurally complex regions of other genomes where the duplication process creates novel genes important for adaptive traits.

Subjects

Brain/metabolism; Segmental Duplications, Genomic; Sequence Analysis, DNA/methods; Sequence Analysis, RNA/methods; Evolution, Molecular; Gene Duplication; Gene Expression Profiling; Humans; Molecular Sequence Annotation; Multigene Family; Open Reading Frames; Pseudogenes

6.

Disruption of POGZ Is Associated with Intellectual Disability and Autism Spectrum Disorders.

Stessman, Holly A F; Willemsen, Marjolein H; Fenckova, Michaela; Penn, Osnat; Hoischen, Alexander; Xiong, Bo; Wang, Tianyun; Hoekzema, Kendra; Vives, Laura; Vogel, Ida; Brunner, Han G; van der Burgt, Ineke; Ockeloen, Charlotte W; Schuurs-Hoeijmakers, Janneke H; Klein Wassink-Ruiter, Jolien S; Stumpel, Connie; Stevens, Servi J C; Vles, Hans S; Marcelis, Carlo M; van Bokhoven, Hans; Cantagrel, Vincent; Colleaux, Laurence; Nicouleau, Michael; Lyonnet, Stanislas; Bernier, Raphael A; Gerdts, Jennifer; Coe, Bradley P; Romano, Corrado; Alberti, Antonino; Grillo, Lucia; Scuderi, Carmela; Nordenskjöld, Magnus; Kvarnung, Malin; Guo, Hui; Xia, Kun; Piton, Amélie; Gerard, Bénédicte; Genevieve, David; Delobel, Bruno; Lehalle, Daphne; Perrin, Laurence; Prieur, Fabienne; Thevenon, Julien; Gecz, Jozef; Shaw, Marie; Pfundt, Rolph; Keren, Boris; Jacquette, Aurelia; Schenck, Annette; Eichler, Evan E.

Am J Hum Genet ; 98(3): 541-552, 2016 Mar 03.

Article in English | MEDLINE | ID: mdl-26942287

ABSTRACT

Intellectual disability (ID) and autism spectrum disorders (ASD) are genetically heterogeneous, and a significant number of genes have been associated with both conditions. A few mutations in POGZ have been reported in recent exome studies; however, these studies do not provide detailed clinical information. We collected the clinical and molecular data of 25 individuals with disruptive mutations in POGZ by diagnostic whole-exome, whole-genome, or targeted sequencing of 5,223 individuals with neurodevelopmental disorders (ID primarily) or by targeted resequencing of this locus in 12,041 individuals with ASD and/or ID. The rarity of disruptive mutations among unaffected individuals (2/49,401) highlights the significance (p = 4.19 × 10⁻¹³; odds ratio = 35.8) and penetrance (65.9%) of this genetic subtype with respect to ASD and ID. By studying the entire cohort, we defined common phenotypic features of POGZ individuals, including variable levels of developmental delay (DD) and more severe speech and language delay in comparison to the severity of motor delay and coordination issues. We also identified significant associations with vision problems, microcephaly, hyperactivity, a tendency to obesity, and feeding difficulties. Some features might be explained by the high expression of POGZ, particularly in the cerebellum and pituitary, early in fetal brain development. We conducted parallel studies in Drosophila by inducing conditional knockdown of the POGZ ortholog row, further confirming that dosage of POGZ, specifically in neurons, is essential for normal learning in a habituation paradigm. Combined, the data underscore the pathogenicity of loss-of-function mutations in POGZ and define a POGZ-related phenotype enriched in specific features.

Subjects

Autism Spectrum Disorder/genetics; Intellectual Disability/genetics; Transposases/genetics; Adolescent; Adult; Animals; Autism Spectrum Disorder/diagnosis; Child; Child, Preschool; Cohort Studies; Down-Regulation; Drosophila/genetics; Drosophila Proteins/genetics; Drosophila Proteins/metabolism; Exome; Female; Gene Knockdown Techniques; Genome-Wide Association Study; Humans; Infant; Intellectual Disability/diagnosis; Language Development Disorders/diagnosis; Language Development Disorders/genetics; Linear Models; Male; Microcephaly/diagnosis; Microcephaly/genetics; Mutation; Phenotype; Transcription Factors/genetics; Transcription Factors/metabolism

7.

The discovery of integrated gene networks for autism and related disorders.

Hormozdiari, Fereydoun; Penn, Osnat; Borenstein, Elhanan; Eichler, Evan E.

Genome Res ; 25(1): 142-54, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25378250

ABSTRACT

Despite considerable genetic heterogeneity underlying neurodevelopmental diseases, there is compelling evidence that many disease genes will map to a much smaller number of biological subnetworks. We developed a computational method, termed MAGI (merging affected genes into integrated networks), that simultaneously integrates protein-protein interactions and RNA-seq expression profiles during brain development to discover "modules" enriched for de novo mutations in probands. We applied this method to recent exome sequencing of 1116 patients with autism and intellectual disability, discovering two distinct modules that differ in their properties and associated phenotypes. The first module consists of 80 genes associated with Wnt, Notch, SWI/SNF, and NCOR complexes and shows the highest expression early during embryonic development (8-16 post-conception weeks [pcw]). The second module consists of 24 genes associated with synaptic function, including long-term potentiation and calcium signaling with higher levels of postnatal expression. Patients with de novo mutations in these modules are more significantly intellectually impaired and carry more severe missense mutations when compared to probands with de novo mutations outside of these modules. We used our approach to define subsets of the network associated with higher functioning autism as well as greater severity with respect to IQ. Finally, we applied MAGI independently to epilepsy and schizophrenia exome sequencing cohorts and found significant overlap as well as expansion of these modules, suggesting a core set of integrated neurodevelopmental networks common to seemingly diverse human diseases.

Subjects

Autistic Disorder/diagnosis; Autistic Disorder/genetics; Gene Regulatory Networks; Algorithms; Cluster Analysis; Cohort Studies; Databases, Factual; Epilepsy/diagnosis; Epilepsy/genetics; Exome; Genetic Heterogeneity; Humans; Mutation, Missense; Phenotype; Schizophrenia/diagnosis; Schizophrenia/genetics; Sequence Analysis, RNA

8.

Changes in exon-intron structure during vertebrate evolution affect the splicing pattern of exons.

Gelfman, Sahar; Burstein, David; Penn, Osnat; Savchenko, Anna; Amit, Maayan; Schwartz, Schraga; Pupko, Tal; Ast, Gil.

Genome Res ; 22(1): 35-50, 2012 Jan.

Article in English | MEDLINE | ID: mdl-21974994

ABSTRACT

Exon-intron architecture is one of the major features directing the splicing machinery to the short exons that are located within long flanking introns. However, the evolutionary dynamics of exon-intron architecture and its impact on splicing are largely unknown. Using a comparative genomic approach, we analyzed 17 vertebrate genomes and reconstructed the ancestral motifs of both 3' and 5' splice sites, as well as the ancestral length of exons and introns. Our analyses suggest that vertebrate introns increased in length from the shortest ancestral introns to the longest primate introns. An evolutionary analysis of splice sites revealed that weak splice sites act as a restrictive force keeping introns short. In contrast, strong splice sites allow recognition of exons flanked by long introns. Reconstruction of the ancestral state suggests these phenomena were not prevalent in the vertebrate ancestor, but appeared during vertebrate evolution. By calculating evolutionary rate shifts in exons, we identified cis-acting regulatory sequences that became fixed during the transition from early vertebrates to mammals. Experimental validations performed on a selection of these hexamers confirmed their regulatory function. We additionally revealed many features of exons that can discriminate alternative from constitutive exons. These features were integrated into a machine-learning approach to predict whether an exon is alternative. Our algorithm obtains very high predictive power (AUC of 0.91), and using these predictions we have identified and successfully validated novel alternatively spliced exons. Overall, we provide novel insights regarding the evolutionary constraints acting upon exons and their recognition by the splicing machinery.

Subjects

Evolution, Molecular; Exons/physiology; Genome/physiology; Introns/physiology; RNA Splice Sites/genetics; RNA Splicing/genetics; Vertebrates/genetics; Animals; Models, Genetic

9.

FastML: a web server for probabilistic reconstruction of ancestral sequences.

Ashkenazy, Haim; Penn, Osnat; Doron-Faigenboim, Adi; Cohen, Ofir; Cannarozzi, Gina; Zomer, Oren; Pupko, Tal.

Nucleic Acids Res ; 40(Web Server issue): W580-4, 2012 Jul.

Article in English | MEDLINE | ID: mdl-22661579

ABSTRACT

Ancestral sequence reconstruction is essential to a variety of evolutionary studies. Here, we present the FastML web server, a user-friendly tool for the reconstruction of ancestral sequences. FastML implements various novel features that differentiate it from existing tools: (i) FastML uses an indel-coding method, in which each gap, possibly spanning multiple sites, is coded as binary data. FastML then reconstructs ancestral indel states assuming a continuous time Markov process. FastML provides the most likely ancestral sequences, integrating both indels and characters; (ii) FastML accounts for uncertainty in ancestral states: it provides not only the posterior probabilities for each character and indel at each sequence position, but also a sample of ancestral sequences from this posterior distribution, and a list of the k-most likely ancestral sequences; (iii) FastML implements a large array of evolutionary models, which makes it generic and applicable for nucleotide, protein and codon sequences; and (iv) a graphical representation of the results is provided, including, for example, a graphical logo of the inferred ancestral sequences. The utility of FastML is demonstrated by reconstructing ancestral sequences of the Env protein from various HIV-1 subtypes. FastML is freely available for all academic users and is available online at http://fastml.tau.ac.il/.

Subjects

Phylogeny; Software; Computer Graphics; INDEL Mutation; Internet; Probability; Sequence Alignment; env Gene Products, Human Immunodeficiency Virus/genetics

10.

Improving the performance of positive selection inference by filtering unreliable alignment regions.

Privman, Eyal; Penn, Osnat; Pupko, Tal.

Mol Biol Evol ; 29(1): 1-5, 2012 Jan.

Article in English | MEDLINE | ID: mdl-21772063

ABSTRACT

Errors in the inferred multiple sequence alignment may lead to false prediction of positive selection. Recently, methods for detecting unreliable alignment regions were developed and were shown to accurately identify incorrectly aligned regions. While removing unreliable alignment regions is expected to increase the accuracy of positive selection inference, such filtering may also significantly decrease the power of the test, as positively selected regions are fast evolving, and those same regions are often those that are difficult to align. Here, we used realistic simulations that mimic sequence evolution of HIV-1 genes to test the hypothesis that the performance of positive selection inference using codon models can be improved by removing unreliable alignment regions. Our study shows that the benefit of removing unreliable regions exceeds the loss of power due to the removal of some of the true positively selected sites.

Subjects

Models, Genetic; Sequence Alignment/methods; Sequence Alignment/standards; Computer Simulation; Databases, Genetic; Evolution, Molecular; Genes, Viral; HIV-1/genetics; Phylogeny; Selection, Genetic

11.

GUIDANCE: a web server for assessing alignment confidence scores.

Penn, Osnat; Privman, Eyal; Ashkenazy, Haim; Landan, Giddy; Graur, Dan; Pupko, Tal.

Nucleic Acids Res ; 38(Web Server issue): W23-8, 2010 Jul.

Article in English | MEDLINE | ID: mdl-20497997

ABSTRACT

Evaluating the accuracy of multiple sequence alignment (MSA) is critical for virtually every comparative sequence analysis that uses an MSA as input. Here we present the GUIDANCE web-server, a user-friendly, open access tool for the identification of unreliable alignment regions. The web-server accepts as input a set of unaligned sequences. The server aligns the sequences and provides a simple graphic visualization of the confidence score of each column, residue and sequence of an alignment, using a color-coding scheme. The method is generic and the user is allowed to choose the alignment algorithm (ClustalW, MAFFT and PRANK are supported) as well as any type of molecular sequences (nucleotide, protein or codon sequences). The server implements two different algorithms for evaluating confidence scores: (i) the heads-or-tails (HoT) method, which measures alignment uncertainty due to co-optimal solutions; (ii) the GUIDANCE method, which measures the robustness of the alignment to guide-tree uncertainty. The server projects the confidence scores onto the MSA and points to columns and sequences that are unreliably aligned. These can be automatically removed in preparation for downstream analyses. GUIDANCE is freely available for use at http://guidance.tau.ac.il.

Subjects

Sequence Alignment/methods; Software; Human Immunodeficiency Virus Proteins/chemistry; Internet; Sequence Analysis, Protein; Viral Regulatory and Accessory Proteins/chemistry

12.

An alignment confidence score capturing robustness to guide tree uncertainty.

Penn, Osnat; Privman, Eyal; Landan, Giddy; Graur, Dan; Pupko, Tal.

Mol Biol Evol ; 27(8): 1759-67, 2010 Aug.

Article in English | MEDLINE | ID: mdl-20207713

ABSTRACT

Multiple sequence alignment (MSA) is the basis for a wide range of comparative sequence analyses from molecular phylogenetics to 3D structure prediction. Sophisticated algorithms have been developed for sequence alignment, but in practice, many errors can be expected and extensive portions of the MSA are unreliable. Hence, it is imperative to understand and characterize the various sources of errors in MSAs and to quantify site-specific alignment confidence. In this paper, we show that uncertainties in the guide tree used by progressive alignment methods are a major source of alignment uncertainty. We use this insight to develop a novel method for quantifying the robustness of each alignment column to guide tree uncertainty. We build on the widely used bootstrap method for perturbing the phylogenetic tree. Specifically, we generate a collection of trees and use each as a guide tree in the alignment algorithm, thus producing a set of MSAs. We next test the consistency of every column of the MSA obtained from the unperturbed guide tree with respect to the set of MSAs. We name this measure the "GUIDe tree based AligNment ConfidencE" (GUIDANCE) score. Using the Benchmark Alignment data BASE benchmark as well as simulation studies, we show that GUIDANCE scores accurately identify errors in MSAs. Additionally, we compare our results with the previously published Heads-or-Tails score and show that the GUIDANCE score is a better predictor of unreliably aligned regions.

Subjects

Algorithms; Amino Acid Sequence; Base Sequence; Sequence Alignment/methods; Sequence Analysis, DNA/methods; Animals; Computer Simulation; Databases, Factual; Drosophila melanogaster/genetics; Molecular Sequence Data; Phylogeny; ROC Curve; Software
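
Illustrative note on record 12: the abstract describes testing the consistency of every column of the base MSA against a set of MSAs built from perturbed guide trees. The following is a minimal sketch of that idea only, not the published GUIDANCE implementation. It assumes the perturbed alignments have already been produced by an external aligner, and it scores a column as the fraction of perturbed alignments that reproduce it exactly; the actual scoring in the paper may differ. The function names `columns_as_positions` and `column_scores` are hypothetical.

```python
from typing import List, Optional, Tuple

# A column is identified by which residue (0-based index within each
# ungapped sequence) every sequence contributes; None marks a gap.
Column = Tuple[Optional[int], ...]


def columns_as_positions(msa: List[str]) -> List[Column]:
    """Convert an alignment (equal-length gapped strings, '-' for gaps)
    into a list of columns expressed as per-sequence residue indices."""
    if not msa or len({len(row) for row in msa}) != 1:
        raise ValueError("alignment rows must be non-empty and equal length")
    counters = [0] * len(msa)
    columns: List[Column] = []
    for col in range(len(msa[0])):
        entry: List[Optional[int]] = []
        for row, seq in enumerate(msa):
            if seq[col] == "-":
                entry.append(None)
            else:
                entry.append(counters[row])
                counters[row] += 1
        columns.append(tuple(entry))
    return columns


def column_scores(base_msa: List[str],
                  perturbed_msas: List[List[str]]) -> List[float]:
    """For each column of the base alignment, return the fraction of
    perturbed alignments that contain an identical column (simplified,
    assumed scoring rule)."""
    if not perturbed_msas:
        raise ValueError("at least one perturbed alignment is required")
    base_cols = columns_as_positions(base_msa)
    perturbed_sets = [set(columns_as_positions(m)) for m in perturbed_msas]
    n = len(perturbed_sets)
    return [sum(col in s for s in perturbed_sets) / n for col in base_cols]


if __name__ == "__main__":
    base = ["ACG-T", "ACGGT"]
    alternative = ["AC-GT", "ACGGT"]  # same sequences, gap placed differently
    print(column_scores(base, [base, alternative]))
```

In this toy case, the two columns whose gap placement differs between the alignments receive a lower score than the columns that every alignment agrees on. Low-scoring columns could then be filtered before downstream analyses, as the abstracts of records 10 and 11 describe.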

13.

An evolutionary analysis of lateral gene transfer in thymidylate synthase enzymes.

Stern, Adi; Mayrose, Itay; Penn, Osnat; Shaul, Shaul; Gophna, Uri; Pupko, Tal.

Syst Biol ; 59(2): 212-25, 2010 Mar.

Article in English | MEDLINE | ID: mdl-20525631

ABSTRACT

Thymidylate synthases (Thy) are key enzymes in the synthesis of deoxythymidylate, 1 of the 4 building blocks of DNA. As such, they are essential for all DNA-based forms of life and therefore implicated in the hypothesized transition from RNA genomes to DNA genomes. Two evolutionarily unrelated Thy enzymes, ThyA and ThyX, are known to catalyze the same biochemical reaction. Both enzymes are sporadically distributed within each of the 3 domains of life in a pattern that suggests multiple nonhomologous lateral gene transfer (LGT) events. We present a phylogenetic analysis of the evolution of the 2 enzymes, aimed at unraveling their entangled evolutionary history and tracing their origin back to early life. A novel probabilistic evolutionary model was developed, which allowed us to compute the posterior probabilities and the posterior expectation of the number of LGT events. Simulation studies were performed to validate the model's ability to accurately detect LGT events, which have occurred throughout a large phylogeny. Applying the model to the Thy data revealed widespread nonhomologous LGT between and within all 3 domains of life. By reconstructing the ThyA and ThyX gene trees, the most likely donor of each LGT event was inferred. The role of viruses in LGT of Thy is finally discussed.

Subjects

Evolution, Molecular; Gene Transfer, Horizontal; Models, Genetic; Phylogeny; Thymidylate Synthase/genetics; Base Composition; Base Sequence; Classification/methods; Computational Biology; Computer Simulation; Likelihood Functions

14.

Rodent phylogeny revised: analysis of six nuclear genes from all major rodent clades.

Blanga-Kanfi, Shani; Miranda, Hector; Penn, Osnat; Pupko, Tal; DeBry, Ronald W; Huchon, Dorothée.

BMC Evol Biol ; 9: 71, 2009 Apr 02.

Article in English | MEDLINE | ID: mdl-19341461

ABSTRACT

BACKGROUND: Rodentia is the most diverse order of placental mammals, with extant rodent species representing about half of all placental diversity. In spite of many morphological and molecular studies, the family-level relationships among rodents and the location of the rodent root are still debated. Although various datasets have already been analyzed to solve rodent phylogeny at the family level, these are difficult to combine because they involve different taxa and genes. RESULTS: We present here the largest protein-coding dataset used to study rodent relationships. It comprises six nuclear genes, 41 rodent species, and eight outgroups. Our phylogenetic reconstructions strongly support the division of Rodentia into three clades: (1) a "squirrel-related clade", (2) a "mouse-related clade", and (3) Ctenohystrica. Almost all evolutionary relationships within these clades are also highly supported. The primary remaining uncertainty is the position of the root. The application of various models and techniques aimed to remove non-phylogenetic signal was unable to solve the basal rodent trifurcation. CONCLUSION: Sequencing and analyzing a large sequence dataset enabled us to resolve most of the evolutionary relationships among Rodentia. Our findings suggest that the uncertainty regarding the position of the rodent root reflects the rapid rodent radiation that occurred in the Paleocene rather than the presence of conflicting phylogenetic and non-phylogenetic signals in the dataset.

Subjects

Molecular Evolution , Phylogeny , Rodentia/genetics , Animals , Bayes Theorem , Cell Nucleus/genetics , Likelihood Functions , Genetic Models , Rodentia/classification , Sequence Alignment , DNA Sequence Analysis , Species Specificity

15.

Evolutionary modeling of rate shifts reveals specificity determinants in HIV-1 subtypes.

Penn, Osnat; Stern, Adi; Rubinstein, Nimrod D; Dutheil, Julien; Bacharach, Eran; Galtier, Nicolas; Pupko, Tal.

PLoS Comput Biol ; 4(11): e1000214, 2008 Nov.

Article in English | MEDLINE | ID: mdl-18989394

ABSTRACT

A hallmark of the human immunodeficiency virus 1 (HIV-1) is its rapid rate of evolution within and among its various subtypes. Two complementary hypotheses are suggested to explain the sequence variability among HIV-1 subtypes. The first suggests that the functional constraints at each site remain the same across all subtypes, and the differences among subtypes are a direct reflection of random substitutions, which have occurred during the time elapsed since their divergence. The alternative hypothesis suggests that the functional constraints themselves have evolved, and thus sequence differences among subtypes in some sites reflect shifts in function. To determine the contribution of each of these two alternatives to HIV-1 subtype evolution, we have developed a novel Bayesian method for testing and detecting site-specific rate shifts. The RAte Shift EstimatoR (RASER) method determines whether or not site-specific functional shifts characterize the evolution of a protein and, if so, points to the specific sites and lineages in which these shifts have most likely occurred. Applying RASER to a dataset composed of large samples of HIV-1 sequences from different group M subtypes, we reveal rampant evolutionary shifts throughout the HIV-1 proteome. Most of these rate shifts have occurred during the divergence of the major subtypes, establishing that subtype divergence occurred together with functional diversification. We report further evidence for the emergence of a new sub-subtype, characterized by abundant rate-shifting sites. When focusing on the rate-shifting sites detected, we find that many are associated with known function relating to viral life cycle and drug resistance. Finally, we discuss mechanisms of covariation of rate-shifting sites.

Subjects

Biological Adaptation/genetics , Molecular Evolution , HIV-1/genetics , Biological Models , Amino Acid Sequence/physiology , Bayes Theorem , Multiple Viral Drug Resistance/genetics , Genetic Speciation , Geography , HIV Infections/genetics , HIV-1/pathogenicity , Humans , Phylogeny , Proteomics/methods , Time Factors , Virus Internalization

16.

Dissecting the genetic basis of comorbid epilepsy phenotypes in neurodevelopmental disorders.

Chow, Julie; Jensen, Matthew; Amini, Hajar; Hormozdiari, Farhad; Penn, Osnat; Shifman, Sagiv; Girirajan, Santhosh; Hormozdiari, Fereydoun.

Genome Med ; 11(1): 65, 2019 Oct 25.

Article in English | MEDLINE | ID: mdl-31653223

ABSTRACT

BACKGROUND: Neurodevelopmental disorders (NDDs) such as autism spectrum disorder, intellectual disability, developmental disability, and epilepsy are characterized by abnormal brain development that may affect cognition, learning, behavior, and motor skills. High co-occurrence (comorbidity) of NDDs indicates a shared, underlying biological mechanism. The genetic heterogeneity and overlap observed in NDDs make it difficult to identify the genetic causes of specific clinical symptoms, such as seizures. METHODS: We present a computational method, MAGI-S, to discover modules or groups of highly connected genes that together potentially perform a similar biological function. MAGI-S integrates protein-protein interaction and co-expression networks to form modules centered around the selection of a single "seed" gene, yielding modules consisting of genes that are highly co-expressed with the seed gene. We aim to dissect the epilepsy phenotype from a general NDD phenotype by providing MAGI-S with high confidence NDD seed genes with varying degrees of association with epilepsy, and we assess the enrichment of de novo mutation, NDD-associated genes, and relevant biological function of constructed modules. RESULTS: The newly identified modules account for the increased rate of de novo non-synonymous mutations in autism, intellectual disability, developmental disability, and epilepsy, and enrichment of copy number variations (CNVs) in developmental disability. We also observed that modules seeded with genes strongly associated with epilepsy tend to have a higher association with epilepsy phenotypes than modules seeded at other neurodevelopmental disorder genes. Modules seeded with genes strongly associated with epilepsy (e.g., SCN1A, GABRA1, and KCNB1) are significantly associated with synaptic transmission, long-term potentiation, and calcium signaling pathways. On the other hand, modules found with seed genes that are not associated or weakly associated with epilepsy are mostly involved with RNA regulation and chromatin remodeling. CONCLUSIONS: In summary, our method identifies modules enriched with de novo non-synonymous mutations and can capture specific networks that underlie the epilepsy phenotype and display distinct enrichment in relevant biological processes. MAGI-S is available at https://github.com/jchow32/magi-s .

Subjects

Epilepsy/genetics , Gene Regulatory Networks , Genetic Heterogeneity , Neurodevelopmental Disorders/genetics , Comorbidity , Factual Databases , Epilepsy/epidemiology , Humans , Mutation , Neurodevelopmental Disorders/epidemiology , Phenotype , Prognosis

17.

Pepitope: epitope mapping from affinity-selected peptides.

Mayrose, Itay; Penn, Osnat; Erez, Elana; Rubinstein, Nimrod D; Shlomi, Tomer; Freund, Natalia Tarnovitski; Bublil, Erez M; Ruppin, Eytan; Sharan, Roded; Gershoni, Jonathan M; Martz, Eric; Pupko, Tal.

Bioinformatics ; 23(23): 3244-6, 2007 Dec 01.

Article in English | MEDLINE | ID: mdl-17977889

ABSTRACT

Identifying the epitope to which an antibody binds is central for many immunological applications such as drug design and vaccine development. The Pepitope server is a web-based tool that aims at predicting discontinuous epitopes based on a set of peptides that were affinity-selected against a monoclonal antibody of interest. The server implements three different algorithms for epitope mapping: PepSurf, Mapitope, and a combination of the two. The rationale behind these algorithms is that the set of peptides mimics the genuine epitope in terms of physicochemical properties and spatial organization. When the three-dimensional (3D) structure of the antigen is known, the information in these peptides can be used to computationally infer the corresponding epitope. A user-friendly web interface and a graphical tool that allows viewing the predicted epitopes were developed. Pepitope can also be applied for inferring other types of protein-protein interactions beyond the immunological context, and as a general tool for aligning linear sequences to a 3D structure. AVAILABILITY: http://pepitope.tau.ac.il/

Subjects

Algorithms , Epitope Mapping/methods , Peptides/chemistry , Peptides/immunology , Sequence Alignment/methods , Protein Sequence Analysis/methods , Software , Amino Acid Sequence , Binding Sites , Molecular Sequence Data , Protein Binding

18.

Corrigendum: Independent Evolution of Strychnine Recognition by Bitter Taste Receptor Subtypes.

Xue, Ava Yuan; Di Pizio, Antonella; Levit, Anat; Yarnitzky, Tali; Penn, Osnat; Pupko, Tal; Niv, Masha Y.

Front Mol Biosci ; 5: 84, 2018.

Article in English | MEDLINE | ID: mdl-30255025

ABSTRACT

[This corrects the article DOI: 10.3389/fmolb.2018.00009.]

19.

Independent Evolution of Strychnine Recognition by Bitter Taste Receptor Subtypes.

Xue, Ava Yuan; Di Pizio, Antonella; Levit, Anat; Yarnitzky, Tali; Penn, Osnat; Pupko, Tal; Niv, Masha Y.

Front Mol Biosci ; 5: 9, 2018.

Article in English | MEDLINE | ID: mdl-29552563

ABSTRACT

The 25 human bitter taste receptors (hT2Rs) recognize thousands of structurally and chemically diverse bitter substances. The binding modes of human bitter taste receptors hT2R10 and hT2R46, which are responsible for strychnine recognition, were previously established using site-directed mutagenesis, functional assays, and molecular modeling. Here we construct a phylogenetic tree and reconstruct ancestral sequences of the T2R10 and T2R46 clades. We next analyze the binding sites in view of experimental data to predict their ability to recognize strychnine. This analysis suggests that the common ancestor of hT2R10 and hT2R46 is unlikely to bind strychnine in the same mode as either of its two descendants. Estimation of relative divergence times shows that hT2R10 evolved earlier than hT2R46. Strychnine recognition was likely acquired first by the earliest common ancestor of the T2R10 clade before the separation of primates from other mammals, and was highly conserved within the clade. It was probably independently acquired by the common ancestor of T2R43-47 before the homo-ape speciation, lost in most T2Rs within this clade, but enhanced in the hT2R46 after humans diverged from the rest of primates. Our findings suggest hypothetical strychnine T2R receptors in several species, and serve as an experimental guide for further study. Improved understanding of how bitter taste receptors acquire the ability to be activated by particular ligands is valuable for the development of sensors for bitterness and for potential toxicity.

20.

Stepwise prediction of conformational discontinuous B-cell epitopes using the Mapitope algorithm.

Bublil, Erez M; Freund, Natalia Tarnovitski; Mayrose, Itay; Penn, Osnat; Roitburd-Berman, Anna; Rubinstein, Nimrod D; Pupko, Tal; Gershoni, Jonathan M.

Proteins ; 68(1): 294-304, 2007 Jul 01.

Article in English | MEDLINE | ID: mdl-17427229

ABSTRACT

Mapping the epitope of an antibody is of great interest, since it contributes much to our understanding of the mechanisms of molecular recognition and provides the basis for rational vaccine design. Here we present Mapitope, a computer algorithm for epitope mapping. The algorithm input is a set of affinity isolated peptides obtained by screening phage display peptide-libraries with the antibody of interest. The output is usually 1-3 epitope candidates on the surface of the atomic structure of the antigen. We have systematically tested the performance of Mapitope by assessing the effect of the algorithm parameters on the final prediction. Thus, we have examined the effect of the statistical threshold (ST) parameter, relating to the frequency distribution and enrichment of amino acid pairs from the isolated peptides and the D (distance) and E (exposure) parameters which relate to the physical parameters of the antigen. Two model systems were analyzed in which the antibody of interest had previously been co-crystallized with the antigen and thus the epitope is a given. The Mapitope algorithm successfully predicted the epitopes in both models. Accordingly, we formulated a stepwise paradigm for the prediction of discontinuous conformational epitopes using peptides obtained from screening phage display libraries. We applied this paradigm to successfully predict the epitope of the Trastuzumab antibody on the surface of the Her-2/neu receptor in a third model system.

Subjects

Algorithms , Antibodies/metabolism , Epitope Mapping/methods , B-Lymphocyte Epitopes/genetics , Molecular Models , Amino Acid Sequence , Monoclonal Antibodies/genetics , Humanized Monoclonal Antibodies , B-Lymphocyte Epitopes/metabolism , Humans , Molecular Sequence Data , Peptide Library , Trastuzumab
