Search | VHL Regional Portal

1.

Hagfish genome elucidates vertebrate whole-genome duplication events and their evolutionary consequences.

Yu, Daqi; Ren, Yandong; Uesaka, Masahiro; Beavan, Alan J S; Muffato, Matthieu; Shen, Jieyu; Li, Yongxin; Sato, Iori; Wan, Wenting; Clark, James W; Keating, Joseph N; Carlisle, Emily M; Dearden, Richard P; Giles, Sam; Randle, Emma; Sansom, Robert S; Feuda, Roberto; Fleming, James F; Sugahara, Fumiaki; Cummins, Carla; Patricio, Mateus; Akanni, Wasiu; D'Aniello, Salvatore; Bertolucci, Cristiano; Irie, Naoki; Alev, Cantas; Sheng, Guojun; de Mendoza, Alex; Maeso, Ignacio; Irimia, Manuel; Fromm, Bastian; Peterson, Kevin J; Das, Sabyasachi; Hirano, Masayuki; Rast, Jonathan P; Cooper, Max D; Paps, Jordi; Pisani, Davide; Kuratani, Shigeru; Martin, Fergal J; Wang, Wen; Donoghue, Philip C J; Zhang, Yong E; Pascual-Anaya, Juan.

Nat Ecol Evol ; 8(3): 519-535, 2024 Mar.

Article in English | MEDLINE | ID: mdl-38216617

ABSTRACT

Polyploidy or whole-genome duplication (WGD) is a major event that drastically reshapes genome architecture and is often assumed to be causally associated with organismal innovations and radiations. The 2R hypothesis suggests that two WGD events (1R and 2R) occurred during early vertebrate evolution. However, the timing of the 2R event relative to the divergence of gnathostomes (jawed vertebrates) and cyclostomes (jawless hagfishes and lampreys) is unresolved and whether these WGD events underlie vertebrate phenotypic diversification remains elusive. Here we present the genome of the inshore hagfish, Eptatretus burgeri. Through comparative analysis with lamprey and gnathostome genomes, we reconstruct the early events in cyclostome genome evolution, leveraging insights into the ancestral vertebrate genome. Genome-wide synteny and phylogenetic analyses support a scenario in which 1R occurred in the vertebrate stem-lineage during the early Cambrian, and 2R occurred in the gnathostome stem-lineage, maximally in the late Cambrian-earliest Ordovician, after its divergence from cyclostomes. We find that the genome of stem-cyclostomes experienced an additional independent genome triplication. Functional genomic and morphospace analyses demonstrate that WGD events generally contribute to developmental evolution with similar changes in the regulatory genome of both vertebrate groups. However, appreciable morphological diversification occurred only in the gnathostome but not in the cyclostome lineage, calling into question the general expectation that WGDs lead to leaps of bodyplan complexity.

Subject(s)

Hagfishes; Animals; Phylogeny; Hagfishes/genetics; Gene Duplication; Vertebrates/genetics; Genome; Lampreys/genetics

2.

Reconstruction of hundreds of reference ancestral genomes across the eukaryotic kingdom.

Muffato, Matthieu; Louis, Alexandra; Nguyen, Nga Thi Thuy; Lucas, Joseph; Berthelot, Camille; Roest Crollius, Hugues.

Nat Ecol Evol ; 7(3): 355-366, 2023 03.

Article in English | MEDLINE | ID: mdl-36646945

ABSTRACT

Ancestral sequence reconstruction is a fundamental aspect of molecular evolution studies and can trace small-scale sequence modifications through the evolution of genomes and species. In contrast, fine-grained reconstructions of ancestral genome organizations are still in their infancy, limiting our ability to draw comprehensive views of genome and karyotype evolution. Here we reconstruct the detailed gene contents and organizations of 624 ancestral vertebrate, plant, fungi, metazoan and protist genomes, 183 of which are near-complete chromosomal gene order reconstructions. Reconstructed ancestral genomes are similar to their descendants in terms of gene content as expected and agree precisely with reference cytogenetic and in silico reconstructions when available. By comparing successive ancestral genomes along the phylogenetic tree, we estimate the intra- and interchromosomal rearrangement history of all major vertebrate clades at high resolution. This freely available resource introduces the possibility to follow evolutionary processes at genomic scales in chronological order, across multiple clades and without relying on a single extant species as reference.

Subject(s)

Eukaryota; Genome; Animals; Eukaryota/genetics; Phylogeny; Chromosomes; Genomics

3.

Scripting Analyses of Genomes in Ensembl Plants.

Contreras-Moreira, Bruno; Naamati, Guy; Rosello, Marc; Allen, James E; Hunt, Sarah E; Muffato, Matthieu; Gall, Astrid; Flicek, Paul.

Methods Mol Biol ; 2443: 27-55, 2022.

Article in English | MEDLINE | ID: mdl-35037199

ABSTRACT

Ensembl Plants ( http://plants.ensembl.org ) offers genome-scale information for plants, with four releases per year. As of release 47 (April 2020) it features 79 species and includes genome sequence, gene models, and functional annotation. Comparative analyses help reconstruct the evolutionary history of gene families, genomes, and components of polyploid genomes. Some species have gene expression baseline reports or variation across genotypes. While the data can be accessed through the Ensembl genome browser, here we review specifically how our plant genomes can be interrogated programmatically and the data downloaded in bulk. These access routes are generally consistent across Ensembl for other non-plant species, including plant pathogens, pests, and pollinators.

Subject(s)

Databases, Genetic; Genomics; Genome, Plant; Molecular Sequence Annotation; Plants/genetics; Software
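The abstract of entry 3 mentions programmatic interrogation of plant genomes. As a minimal sketch of what one such request might look like, the Python below builds a URL for a gene lookup over REST. The server address, the `lookup/id` path, the `expand` option and the example gene ID are assumptions made for illustration; none of them is stated in the abstract, and the chapter itself should be consulted for the access routes it actually covers.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Ensembl REST server; not stated in the abstract.
REST_SERVER = "https://rest.ensembl.org"


def build_lookup_url(stable_id: str, expand: bool = False) -> str:
    """Build a URL for a REST `lookup/id` request for one stable ID."""
    query = {"content-type": "application/json"}
    if expand:
        query["expand"] = "1"  # ask for nested features such as transcripts
    path = "/lookup/id/" + urllib.parse.quote(stable_id)
    return REST_SERVER + path + "?" + urllib.parse.urlencode(query)


def fetch_json(url: str, timeout: float = 30.0) -> dict:
    """Fetch a URL and decode the JSON body (requires network access)."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # AT3G52430 is used only as an illustrative Arabidopsis thaliana gene ID.
    url = build_lookup_url("AT3G52430")
    print(url)
    # record = fetch_json(url)  # uncomment to query the live service
```

Keeping URL construction separate from the network call lets the request be inspected or logged before anything is sent, which helps when scripting many lookups in bulk.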

4.

Ensembl Genomes 2022: an expanding genome resource for non-vertebrates.

Yates, Andrew D; Allen, James; Amode, Ridwan M; Azov, Andrey G; Barba, Matthieu; Becerra, Andrés; Bhai, Jyothish; Campbell, Lahcen I; Carbajo Martinez, Manuel; Chakiachvili, Marc; Chougule, Kapeel; Christensen, Mikkel; Contreras-Moreira, Bruno; Cuzick, Alayne; Da Rin Fioretto, Luca; Davis, Paul; De Silva, Nishadi H; Diamantakis, Stavros; Dyer, Sarah; Elser, Justin; Filippi, Carla V; Gall, Astrid; Grigoriadis, Dionysios; Guijarro-Clarke, Cristina; Gupta, Parul; Hammond-Kosack, Kim E; Howe, Kevin L; Jaiswal, Pankaj; Kaikala, Vinay; Kumar, Vivek; Kumari, Sunita; Langridge, Nick; Le, Tuan; Luypaert, Manuel; Maslen, Gareth L; Maurel, Thomas; Moore, Benjamin; Muffato, Matthieu; Mushtaq, Aleena; Naamati, Guy; Naithani, Sushma; Olson, Andrew; Parker, Anne; Paulini, Michael; Pedro, Helder; Perry, Emily; Preece, Justin; Quinton-Tulloch, Mark; Rodgers, Faye; Rosello, Marc.

Nucleic Acids Res ; 50(D1): D996-D1003, 2022 01 07.

Article in English | MEDLINE | ID: mdl-34791415

ABSTRACT

Ensembl Genomes (https://www.ensemblgenomes.org) provides access to non-vertebrate genomes and analysis complementing vertebrate resources developed by the Ensembl project (https://www.ensembl.org). The two resources collectively present genome annotation through a consistent set of interfaces spanning the tree of life presenting genome sequence, annotation, variation, transcriptomic data and comparative analysis. Here, we present our largest increase in plant, metazoan and fungal genomes since the project's inception creating one of the world's most comprehensive genomic resources and describe our efforts to reduce genome redundancy in our Bacteria portal. We detail our new efforts in gene annotation, our emerging support for pangenome analysis, our efforts to accelerate data dissemination through the Ensembl Rapid Release resource and our new AlphaFold visualization. Finally, we present details of our future plans including updates on our integration with Ensembl, and how we plan to improve our support for the microbial research community. Software and data are made available without restriction via our website, online tools platform and programmatic interfaces (available under an Apache 2.0 license). Data updates are synchronised with Ensembl's release cycle.

Subject(s)

Databases, Genetic; Genomics; Internet; Software; Animals; Computational Biology; Genome, Bacterial/genetics; Genome, Fungal/genetics; Genome, Plant/genetics; Plants/classification; Plants/genetics; Vertebrates/classification; Vertebrates/genetics

5.

Ensembl 2021.

Howe, Kevin L; Achuthan, Premanand; Allen, James; Allen, Jamie; Alvarez-Jarreta, Jorge; Amode, M Ridwan; Armean, Irina M; Azov, Andrey G; Bennett, Ruth; Bhai, Jyothish; Billis, Konstantinos; Boddu, Sanjay; Charkhchi, Mehrnaz; Cummins, Carla; Da Rin Fioretto, Luca; Davidson, Claire; Dodiya, Kamalkumar; El Houdaigui, Bilal; Fatima, Reham; Gall, Astrid; Garcia Giron, Carlos; Grego, Tiago; Guijarro-Clarke, Cristina; Haggerty, Leanne; Hemrom, Anmol; Hourlier, Thibaut; Izuogu, Osagie G; Juettemann, Thomas; Kaikala, Vinay; Kay, Mike; Lavidas, Ilias; Le, Tuan; Lemos, Diana; Gonzalez Martinez, Jose; Marugán, José Carlos; Maurel, Thomas; McMahon, Aoife C; Mohanan, Shamika; Moore, Benjamin; Muffato, Matthieu; Oheh, Denye N; Paraschas, Dimitrios; Parker, Anne; Parton, Andrew; Prosovetskaia, Irina; Sakthivel, Manoj P; Salam, Ahamed I Abdul; Schmitt, Bianca M; Schuilenburg, Helen; Sheppard, Dan.

Nucleic Acids Res ; 49(D1): D884-D891, 2021 01 08.

Article in English | MEDLINE | ID: mdl-33137190

ABSTRACT

The Ensembl project (https://www.ensembl.org) annotates genomes and disseminates genomic data for vertebrate species. We create detailed and comprehensive annotation of gene structures, regulatory elements and variants, and enable comparative genomics by inferring the evolutionary history of genes and genomes. Our integrated genomic data are made available in a variety of ways, including genome browsers, search interfaces, specialist tools such as the Ensembl Variant Effect Predictor, download files and programmatic interfaces. Here, we present recent Ensembl developments including two new website portals. Ensembl Rapid Release (http://rapid.ensembl.org) is designed to provide core tools and services for genomes as soon as possible and has been deployed to support large biodiversity sequencing projects. Our SARS-CoV-2 genome browser (https://covid-19.ensembl.org) integrates our own annotation with publicly available genomic data from numerous sources to facilitate the use of genomics in the international scientific response to the COVID-19 pandemic. We also report on other updates to our annotation resources, tools and services. All Ensembl data and software are freely available without restriction.

Subject(s)

Computational Biology/methods; Databases, Nucleic Acid; Genomics/methods; SARS-CoV-2/genetics; Vertebrates/genetics; Animals; COVID-19/epidemiology; COVID-19/virology; Humans; Internet; Molecular Sequence Annotation/methods; Pandemics; Vertebrates/classification

6.

The tuatara genome reveals ancient features of amniote evolution.

Gemmell, Neil J; Rutherford, Kim; Prost, Stefan; Tollis, Marc; Winter, David; Macey, J Robert; Adelson, David L; Suh, Alexander; Bertozzi, Terry; Grau, José H; Organ, Chris; Gardner, Paul P; Muffato, Matthieu; Patricio, Mateus; Billis, Konstantinos; Martin, Fergal J; Flicek, Paul; Petersen, Bent; Kang, Lin; Michalak, Pawel; Buckley, Thomas R; Wilson, Melissa; Cheng, Yuanyuan; Miller, Hilary; Schott, Ryan K; Jordan, Melissa D; Newcomb, Richard D; Arroyo, José Ignacio; Valenzuela, Nicole; Hore, Tim A; Renart, Jaime; Peona, Valentina; Peart, Claire R; Warmuth, Vera M; Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Zapata, Valeria Velásquez; Wu, Zhiqiang; Santesmasses, Didac; Mariotti, Marco; Guigó, Roderic; Rupp, Shawn M; Twort, Victoria G; Dussex, Nicolas; Taylor, Helen; Abe, Hideaki; Bond, Donna M; Paterson, James M; Mulcahy, Daniel G.

Nature ; 584(7821): 403-409, 2020 08.

Article in English | MEDLINE | ID: mdl-32760000

ABSTRACT

The tuatara (Sphenodon punctatus), the only living member of the reptilian order Rhynchocephalia (Sphenodontia), which was once widespread across Gondwana [1,2], is an iconic species that is endemic to New Zealand [2,3]. A key link to the now-extinct stem reptiles (from which dinosaurs, modern reptiles, birds and mammals evolved), the tuatara provides key insights into the ancestral amniotes [2,4]. Here we analyse the genome of the tuatara, which, at approximately 5 Gb, is among the largest of the vertebrate genomes yet assembled. Our analyses of this genome, along with comparisons with other vertebrate genomes, reinforce the uniqueness of the tuatara. Phylogenetic analyses indicate that the tuatara lineage diverged from that of snakes and lizards around 250 million years ago. This lineage also shows moderate rates of molecular evolution, with instances of punctuated evolution. Our genome sequence analysis identifies expansions of proteins, non-protein-coding RNA families and repeat elements, the latter of which show an amalgam of reptilian and mammalian features. The sequencing of the tuatara genome provides a valuable resource for deep comparative analyses of tetrapods, as well as for tuatara biology and conservation. Our study also provides important insights into both the technical challenges and the cultural obligations that are associated with genome sequencing.

Subject(s)

Evolution, Molecular; Genome/genetics; Phylogeny; Reptiles/genetics; Animals; Conservation of Natural Resources/trends; Female; Genetics, Population; Lizards/genetics; Male; Molecular Sequence Annotation; New Zealand; Sex Characteristics; Snakes/genetics; Synteny

7.

Publisher Correction: The tuatara genome reveals ancient features of amniote evolution.

Gemmell, Neil J; Rutherford, Kim; Prost, Stefan; Tollis, Marc; Winter, David; Macey, J Robert; Adelson, David L; Suh, Alexander; Bertozzi, Terry; Grau, José H; Organ, Chris; Gardner, Paul P; Muffato, Matthieu; Patricio, Mateus; Billis, Konstantinos; Martin, Fergal J; Flicek, Paul; Petersen, Bent; Kang, Lin; Michalak, Pawel; Buckley, Thomas R; Wilson, Melissa; Cheng, Yuanyuan; Miller, Hilary; Schott, Ryan K; Jordan, Melissa D; Newcomb, Richard D; Arroyo, José Ignacio; Valenzuela, Nicole; Hore, Tim A; Renart, Jaime; Peona, Valentina; Peart, Claire R; Warmuth, Vera M; Zeng, Lu; Kortschak, R Daniel; Raison, Joy M; Zapata, Valeria Velásquez; Wu, Zhiqiang; Santesmasses, Didac; Mariotti, Marco; Guigó, Roderic; Rupp, Shawn M; Twort, Victoria G; Dussex, Nicolas; Taylor, Helen; Abe, Hideaki; Bond, Donna M; Paterson, James M; Mulcahy, Daniel G.

Nature ; 585(7823): E3, 2020 Sep.

Article in English | MEDLINE | ID: mdl-32811988

ABSTRACT

An amendment to this paper has been published and can be accessed via a link at the top of the paper.

8.

The Quest for Orthologs benchmark service and consensus calls in 2020.

Altenhoff, Adrian M; Garrayo-Ventas, Javier; Cosentino, Salvatore; Emms, David; Glover, Natasha M; Hernández-Plaza, Ana; Nevers, Yannis; Sundesha, Vicky; Szklarczyk, Damian; Fernández, José M; Codó, Laia; The Quest for Orthologs Consortium; Gelpi, Josep Ll; Huerta-Cepas, Jaime; Iwasaki, Wataru; Kelly, Steven; Lecompte, Odile; Muffato, Matthieu; Martin, Maria J; Capella-Gutierrez, Salvador; Thomas, Paul D; Sonnhammer, Erik; Dessimoz, Christophe.

Nucleic Acids Res ; 48(W1): W538-W545, 2020 07 02.

Article in English | MEDLINE | ID: mdl-32374845

ABSTRACT

The identification of orthologs (genes in different species that descended from the same gene in their last common ancestor) is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking (http://orthology.benchmarkservice.org). Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.

Subject(s)

Multigene Family; Proteome; Software; Animals; Benchmarking; Consensus; Genomics; Humans; Mice; Phylogeny; Rats

9.

Ensembl Genomes 2020: enabling non-vertebrate genomic research.

Howe, Kevin L; Contreras-Moreira, Bruno; De Silva, Nishadi; Maslen, Gareth; Akanni, Wasiu; Allen, James; Alvarez-Jarreta, Jorge; Barba, Matthieu; Bolser, Dan M; Cambell, Lahcen; Carbajo, Manuel; Chakiachvili, Marc; Christensen, Mikkel; Cummins, Carla; Cuzick, Alayne; Davis, Paul; Fexova, Silvie; Gall, Astrid; George, Nancy; Gil, Laurent; Gupta, Parul; Hammond-Kosack, Kim E; Haskell, Erin; Hunt, Sarah E; Jaiswal, Pankaj; Janacek, Sophie H; Kersey, Paul J; Langridge, Nick; Maheswari, Uma; Maurel, Thomas; McDowall, Mark D; Moore, Ben; Muffato, Matthieu; Naamati, Guy; Naithani, Sushma; Olson, Andrew; Papatheodorou, Irene; Patricio, Mateus; Paulini, Michael; Pedro, Helder; Perry, Emily; Preece, Justin; Rosello, Marc; Russell, Matthew; Sitnik, Vasily; Staines, Daniel M; Stein, Joshua; Tello-Ruiz, Marcela K; Trevanion, Stephen J; Urban, Martin.

Nucleic Acids Res ; 48(D1): D689-D695, 2020 01 08.

Article in English | MEDLINE | ID: mdl-31598706

ABSTRACT

Ensembl Genomes (http://www.ensemblgenomes.org) is an integrating resource for genome-scale data from non-vertebrate species, complementing the resources for vertebrate genomics developed in the context of the Ensembl project (http://www.ensembl.org). Together, the two resources provide a consistent set of interfaces to genomic data across the tree of life, including reference genome sequence, gene models, transcriptional data, genetic variation and comparative analysis. Data may be accessed via our website, online tools platform and programmatic interfaces, with updates made four times per year (in synchrony with Ensembl). Here, we provide an overview of Ensembl Genomes, with a focus on recent developments. These include the continued growth, more robust and reproducible sets of orthologues and paralogues, and enriched views of gene expression and gene function in plants. Finally, we report on our continued deeper integration with the Ensembl project, which forms a key part of our future strategy for dealing with the increasing quantity of available genome-scale data across the tree of life.

Subject(s)

Computational Biology/methods; Databases, Genetic; Genetic Variation; Genome, Bacterial; Genome, Fungal; Genome, Plant; Algorithms; Animals; Caenorhabditis elegans/genetics; Genomics; Internet; Molecular Sequence Annotation; Phenotype; Plants/genetics; Reference Values; Software; User-Computer Interface

10.

Advances and Applications in the Quest for Orthologs.

Glover, Natasha; Dessimoz, Christophe; Ebersberger, Ingo; Forslund, Sofia K; Gabaldón, Toni; Huerta-Cepas, Jaime; Martin, Maria-Jesus; Muffato, Matthieu; Patricio, Mateus; Pereira, Cécile; da Silva, Alan Sousa; Wang, Yan; Sonnhammer, Erik; Thomas, Paul D.

Mol Biol Evol ; 36(10): 2157-2164, 2019 10 01.

Article in English | MEDLINE | ID: mdl-31241141

ABSTRACT

Gene families evolve by the processes of speciation (creating orthologs), gene duplication (paralogs), and horizontal gene transfer (xenologs), in addition to sequence divergence and gene loss. Orthologs in particular play an essential role in comparative genomics and phylogenomic analyses. With the continued sequencing of organisms across the tree of life, the data are available to reconstruct the unique evolutionary histories of tens of thousands of gene families. Accurate reconstruction of these histories, however, is a challenging computational problem, and the focus of the Quest for Orthologs Consortium. We review the recent advances and outstanding challenges in this field, as revealed at a symposium and meeting held at the University of Southern California in 2017. Key advances have been made both at the level of orthology algorithm development and with respect to coordination across the community of algorithm developers and orthology end-users. Applications spanned a broad range, including gene function prediction, phylostratigraphy, genome evolution, and phylogenomics. The meetings highlighted the increasing use of meta-analyses integrating results from multiple different algorithms, and discussed ongoing challenges in orthology inference as well as the next steps toward improvement and integration of orthology resources.

Subject(s)

Evolution, Molecular; Genomics/trends; Multigene Family; Algorithms; Animals; Genomics/methods; Humans
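The abstract of entry 10 distinguishes orthologs (created by speciation) from paralogs (created by gene duplication). One common way to make that distinction operational, sketched here in Python, is to label each internal node of a gene tree with its event and classify every pair of genes by the event at their last common ancestor. The `Node` class, the `classify_pairs` helper and the gene names are hypothetical illustrations, not taken from any of the tools discussed; horizontal gene transfer (xenologs) is not handled.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass
class Node:
    """Gene-tree node: a leaf carries a gene name, an internal node an event."""
    gene: Optional[str] = None
    event: Optional[str] = None  # "speciation" or "duplication"
    children: List["Node"] = field(default_factory=list)


def classify_pairs(
    node: Node, pairs: Optional[Dict[FrozenSet[str], str]] = None
) -> Tuple[List[str], Dict[FrozenSet[str], str]]:
    """Return the genes below `node` and a relation for every gene pair.

    Two genes are labelled by the event at their last common ancestor:
    speciation gives orthologs, duplication gives paralogs.
    """
    if pairs is None:
        pairs = {}
    if not node.children:
        return [node.gene], pairs
    leaf_sets = [classify_pairs(child, pairs)[0] for child in node.children]
    relation = "orthologs" if node.event == "speciation" else "paralogs"
    # Genes from different child subtrees meet for the first time at this node.
    for i, left in enumerate(leaf_sets):
        for right in leaf_sets[i + 1:]:
            for a, b in product(left, right):
                pairs[frozenset((a, b))] = relation
    return [gene for leaves in leaf_sets for gene in leaves], pairs


# Toy tree: one duplication followed by a human/mouse speciation in each copy.
copy_a = Node(event="speciation", children=[Node("human_A"), Node("mouse_A")])
copy_b = Node(event="speciation", children=[Node("human_B"), Node("mouse_B")])
root = Node(event="duplication", children=[copy_a, copy_b])

genes, relations = classify_pairs(root)
for pair, relation in sorted(relations.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(pair), relation)
```

In this toy tree, `human_A` and `mouse_A` come out as orthologs, while `human_A` and `mouse_B` come out as paralogs because their lineages separated at the duplication.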

11.

The comparative genomics and complex population history of Papio baboons.

Rogers, Jeffrey; Raveendran, Muthuswamy; Harris, R Alan; Mailund, Thomas; Leppälä, Kalle; Athanasiadis, Georgios; Schierup, Mikkel Heide; Cheng, Jade; Munch, Kasper; Walker, Jerilyn A; Konkel, Miriam K; Jordan, Vallmer; Steely, Cody J; Beckstrom, Thomas O; Bergey, Christina; Burrell, Andrew; Schrempf, Dominik; Noll, Angela; Kothe, Maximillian; Kopp, Gisela H; Liu, Yue; Murali, Shwetha; Billis, Konstantinos; Martin, Fergal J; Muffato, Matthieu; Cox, Laura; Else, James; Disotell, Todd; Muzny, Donna M; Phillips-Conroy, Jane; Aken, Bronwen; Eichler, Evan E; Marques-Bonet, Tomas; Kosiol, Carolin; Batzer, Mark A; Hahn, Matthew W; Tung, Jenny; Zinner, Dietmar; Roos, Christian; Jolly, Clifford J; Gibbs, Richard A; Worley, Kim C.

Sci Adv ; 5(1): eaau6947, 2019 01.

Article in English | MEDLINE | ID: mdl-30854422

ABSTRACT

Recent studies suggest that closely related species can accumulate substantial genetic and phenotypic differences despite ongoing gene flow, thus challenging traditional ideas regarding the genetics of speciation. Baboons (genus Papio) are Old World monkeys consisting of six readily distinguishable species. Baboon species hybridize in the wild, and prior data imply a complex history of differentiation and introgression. We produced a reference genome assembly for the olive baboon (Papio anubis) and whole-genome sequence data for all six extant species. We document multiple episodes of admixture and introgression during the radiation of Papio baboons, thus demonstrating their value as a model of complex evolutionary divergence, hybridization, and reticulation. These results help inform our understanding of similar cases, including modern humans, Neanderthals, Denisovans, and other ancient hominins.

Subject(s)

Biological Evolution; Genomics/methods; Papio/genetics; Animals; Base Sequence; Female; Gene Flow; Haplotypes/genetics; Humans; Hybridization, Genetic; Male; Phylogeny; Polymorphism, Genetic; Whole Genome Sequencing

12.

Repeat associated mechanisms of genome evolution and function revealed by the Mus caroli and Mus pahari genomes.

Thybert, David; Roller, Masa; Navarro, Fábio C P; Fiddes, Ian; Streeter, Ian; Feig, Christine; Martin-Galvez, David; Kolmogorov, Mikhail; Janousek, Václav; Akanni, Wasiu; Aken, Bronwen; Aldridge, Sarah; Chakrapani, Varshith; Chow, William; Clarke, Laura; Cummins, Carla; Doran, Anthony; Dunn, Matthew; Goodstadt, Leo; Howe, Kerstin; Howell, Matthew; Josselin, Ambre-Aurore; Karn, Robert C; Laukaitis, Christina M; Jingtao, Lilue; Martin, Fergal; Muffato, Matthieu; Nachtweide, Stefanie; Quail, Michael A; Sisu, Cristina; Stanke, Mario; Stefflova, Klara; Van Oosterhout, Cock; Veyrunes, Frederic; Ward, Ben; Yang, Fengtang; Yazdanifar, Golbahar; Zadissa, Amonida; Adams, David J; Brazma, Alvis; Gerstein, Mark; Paten, Benedict; Pham, Son; Keane, Thomas M; Odom, Duncan T; Flicek, Paul.

Genome Res ; 28(4): 448-459, 2018 04.

Article in English | MEDLINE | ID: mdl-29563166

ABSTRACT

Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.

Subject(s)

Evolution, Molecular; Genome/genetics; Muridae/genetics; Phylogeny; Animals; Binding Sites; CCCTC-Binding Factor/genetics; Chromosomes/genetics; Karyotyping/methods; Long Interspersed Nucleotide Elements/genetics; Mice; Retroelements/genetics; Species Specificity

13.

Gearing up to handle the mosaic nature of life in the quest for orthologs.

Forslund, Kristoffer; Pereira, Cecile; Capella-Gutierrez, Salvador; da Silva, Alan Sousa; Altenhoff, Adrian; Huerta-Cepas, Jaime; Muffato, Matthieu; Patricio, Mateus; Vandepoele, Klaas; Ebersberger, Ingo; Blake, Judith; Fernández Breis, Jesualdo Tomás; Boeckmann, Brigitte; Gabaldón, Toni; Sonnhammer, Erik; Dessimoz, Christophe; Lewis, Suzanna.

Bioinformatics ; 34(2): 323-329, 2018 Jan 15.

Article in English | MEDLINE | ID: mdl-28968857

ABSTRACT

The Quest for Orthologs (QfO) is an open collaboration framework for experts in comparative phylogenomics and related research areas who have an interest in highly accurate orthology predictions and their applications. We here report highlights and discussion points from the QfO meeting 2015 held in Barcelona. Achievements in recent years have established a basis to support developments for improved orthology prediction and to explore new approaches. Central to the QfO effort is proper benchmarking of methods and services, as well as design of standardized datasets and standardized formats to allow sharing and comparison of results. Simultaneously, analysis pipelines have been improved, evaluated and adapted to handle large datasets. All this would not have occurred without the long-term collaboration of Consortium members. Meeting regularly to review and coordinate complementary activities from a broad spectrum of innovative researchers clearly benefits the community. Highlights of the meeting include addressing sources of and legitimacy of disagreements between orthology calls, the context dependency of orthology definitions, special challenges encountered when analyzing very anciently rooted orthologies, orthology in the light of whole-genome duplications, and the concept of orthologous versus paralogous relationships at different levels, including domain-level orthology. Furthermore, particular needs for different applications (e.g. plant genomics, ancient gene families and others) and the infrastructure for making orthology inferences available (e.g. interfaces with model organism databases) were discussed, with several ongoing efforts that are expected to be reported on during the upcoming 2017 QfO meeting.

14.

Ensembl 2018.

Zerbino, Daniel R; Achuthan, Premanand; Akanni, Wasiu; Amode, M Ridwan; Barrell, Daniel; Bhai, Jyothish; Billis, Konstantinos; Cummins, Carla; Gall, Astrid; Girón, Carlos García; Gil, Laurent; Gordon, Leo; Haggerty, Leanne; Haskell, Erin; Hourlier, Thibaut; Izuogu, Osagie G; Janacek, Sophie H; Juettemann, Thomas; To, Jimmy Kiang; Laird, Matthew R; Lavidas, Ilias; Liu, Zhicheng; Loveland, Jane E; Maurel, Thomas; McLaren, William; Moore, Benjamin; Mudge, Jonathan; Murphy, Daniel N; Newman, Victoria; Nuhn, Michael; Ogeh, Denye; Ong, Chuang Kee; Parker, Anne; Patricio, Mateus; Riat, Harpreet Singh; Schuilenburg, Helen; Sheppard, Dan; Sparrow, Helen; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Walts, Brandon; Zadissa, Amonida; Frankish, Adam; Hunt, Sarah E; Kostadima, Myrto; Langridge, Nicholas; Martin, Fergal J; Muffato, Matthieu; Perry, Emily.

Nucleic Acids Res ; 46(D1): D754-D761, 2018 01 04.

Article in English | MEDLINE | ID: mdl-29155950

ABSTRACT

The Ensembl project has been aggregating, processing, integrating and redistributing genomic datasets since the initial releases of the draft human genome, with the aim of accelerating genomics research through rapid open distribution of public data. Large amounts of raw data are thus transformed into knowledge, which is made available via a multitude of channels, in particular our browser (http://www.ensembl.org). Over time, we have expanded in multiple directions. First, our resources describe multiple fields of genomics, in particular gene annotation, comparative genomics, genetics and epigenomics. Second, we cover a growing number of genome assemblies; Ensembl Release 90 contains exactly 100. Third, our databases feed simultaneously into an array of services designed around different use cases, ranging from quick browsing to genome-wide bioinformatic analysis. We present here the latest developments of the Ensembl project, with a focus on managing an increasing number of assemblies, supporting efforts in genome interpretation and improving our browser.

Subject(s)

Databases, Genetic; Datasets as Topic; Genome; Information Dissemination; Animals; Epigenomics; Genome, Human; Genome-Wide Association Study; Genomics; High-Throughput Nucleotide Sequencing; Humans; Molecular Sequence Annotation; Vertebrates/genetics; Web Browser

15.

Ensembl 2017.

Aken, Bronwen L; Achuthan, Premanand; Akanni, Wasiu; Amode, M Ridwan; Bernsdorff, Friederike; Bhai, Jyothish; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E; Janacek, Sophie H; Juettemann, Thomas; Keenan, Stephen; Laird, Matthew R; Lavidas, Ilias; Maurel, Thomas; McLaren, William; Moore, Benjamin; Murphy, Daniel N; Nag, Rishi; Newman, Victoria; Nuhn, Michael; Ong, Chuang Kee; Parker, Anne; Patricio, Mateus; Riat, Harpreet Singh; Sheppard, Daniel; Sparrow, Helen; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Walts, Brandon; Wilder, Steven P; Zadissa, Amonida; Kostadima, Myrto; Martin, Fergal J; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Staines, Daniel M; Trevanion, Stephen J; Cunningham, Fiona; Yates, Andrew; Zerbino, Daniel R; Flicek, Paul.

Nucleic Acids Res ; 45(D1): D635-D642, 2017 01 04.

Article in English | MEDLINE | ID: mdl-27899575

ABSTRACT

Ensembl (www.ensembl.org) is a database and genome browser for enabling research on vertebrate genomes. We import, analyse, curate and integrate a diverse collection of large-scale reference data to create a more comprehensive view of genome biology than would be possible from any individual dataset. Our extensive data resources include evidence-based gene and regulatory region annotation, genome variation and gene trees. An accompanying suite of tools, infrastructure and programmatic access methods ensure uniform data analysis and distribution for all supported species. Together, these provide a comprehensive solution for large-scale and targeted genomics applications alike. Among many other developments over the past year, we have improved our resources for gene regulation and comparative genomics, and added CRISPR/Cas9 target sites. We released new browser functionality and tools, including improved filtering and prioritization of genome variation, Manhattan plot visualization for linkage disequilibrium and eQTL data, and an ontology search for phenotypes, traits and disease. We have also enhanced data discovery and access with a track hub registry and a selection of new REST end points. All Ensembl data are freely released to the scientific community and our source code is available via the open source Apache 2.0 license.

Subject(s)

Computational Biology/methods , Databases, Genetic , Genomics/methods , Search Engine , Software , Web Browser , Animals , Data Mining , Evolution, Molecular , Gene Expression Regulation , Genetic Variation , Genome, Human , Humans , Molecular Sequence Annotation , Species Specificity , Vertebrates

16.

Ensembl comparative genomics resources.

Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul.

Database (Oxford) ; 2016, 2016.

Article in English | MEDLINE | ID: mdl-27141089

17.

Standardized benchmarking in the quest for orthologs.

Altenhoff, Adrian M; Boeckmann, Brigitte; Capella-Gutierrez, Salvador; Dalquen, Daniel A; DeLuca, Todd; Forslund, Kristoffer; Huerta-Cepas, Jaime; Linard, Benjamin; Pereira, Cécile; Pryszcz, Leszek P; Schreiber, Fabian; da Silva, Alan Sousa; Szklarczyk, Damian; Train, Clément-Marie; Bork, Peer; Lecompte, Odile; von Mering, Christian; Xenarios, Ioannis; Sjölander, Kimmen; Jensen, Lars Juhl; Martin, Maria J; Muffato, Matthieu; Gabaldón, Toni; Lewis, Suzanna E; Thomas, Paul D; Sonnhammer, Erik; Dessimoz, Christophe.

Nat Methods ; 13(5): 425-30, 2016 May.

Article in English | MEDLINE | ID: mdl-27043882

ABSTRACT

Achieving high accuracy in orthology inference is essential for many comparative, evolutionary and functional genomic analyses, yet the true evolutionary history of genes is generally unknown and orthologs are used for very different applications across phyla, requiring different precision-recall trade-offs. As a result, it is difficult to assess the performance of orthology inference methods. Here, we present a community effort to establish standards and an automated web-based service to facilitate orthology benchmarking. Using this service, we characterize 15 well-established inference methods and resources on a battery of 20 different benchmarks. Standardized benchmarking provides a way for users to identify the most effective methods for the problem at hand, sets a minimum requirement for new tools and resources, and guides the development of more accurate orthology inference methods.

Subject(s)

Computational Biology/standards , Genomics/standards , Phylogeny , Proteomics/standards , Archaea/classification , Archaea/genetics , Bacteria/classification , Bacteria/genetics , Computational Biology/methods , Databases, Genetic , Eukaryota/classification , Eukaryota/genetics , Gene Ontology , Genomics/methods , Models, Genetic , Proteomics/methods , Sequence Analysis, Protein , Sequence Homology , Species Specificity
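
The precision-recall trade-off mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that a method's output and a trusted reference are both available as sets of unordered gene pairs; the gene names, the two "methods" and the function name `precision_recall` are hypothetical and are not taken from the benchmarking service described in the paper.

```python
def precision_recall(predicted, reference):
    """Return (precision, recall) for predicted ortholog pairs.

    Pairs are treated as unordered, so ("a1", "b1") equals ("b1", "a1").
    """
    predicted = {frozenset(p) for p in predicted}
    reference = {frozenset(p) for p in reference}
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical reference set of four ortholog pairs.
reference = [("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")]

# A strict method: few predictions, all of them correct.
strict = [("a1", "b1"), ("a2", "b2")]

# A permissive method: more correct pairs found, but also more false ones.
permissive = [("b1", "a1"), ("a2", "b2"), ("a3", "b3"),
              ("a1", "b2"), ("a4", "b3"), ("a2", "b1")]

strict_scores = precision_recall(strict, reference)
permissive_scores = precision_recall(permissive, reference)
```

In this toy case the strict method scores higher on precision and the permissive one higher on recall, which is the kind of trade-off that makes a single ranking of methods application-dependent.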

18.

ncRNA orthologies in the vertebrate lineage.

Pignatelli, Miguel; Vilella, Albert J; Muffato, Matthieu; Gordon, Leo; White, Simon; Flicek, Paul; Herrero, Javier.

Database (Oxford) ; 2016, 2016.

Article in English | MEDLINE | ID: mdl-26980512

ABSTRACT

Annotation of orthologous and paralogous genes is necessary for many aspects of evolutionary analysis. Methods to infer these homology relationships have traditionally focused on protein-coding genes and evolutionary models used by these methods normally assume the positions in the protein evolve independently. However, as our appreciation for the roles of non-coding RNA genes has increased, consistently annotated sets of orthologous and paralogous ncRNA genes are increasingly needed. At the same time, methods such as PHASE or RAxML have implemented substitution models that consider pairs of sites to enable proper modelling of the loops and other features of RNA secondary structure. Here, we present a comprehensive analysis pipeline for the automatic detection of orthologues and paralogues for ncRNA genes. We focus on gene families represented in Rfam and for which a specific covariance model is provided. For each family ncRNA genes found in all Ensembl species are aligned using Infernal, and several trees are built using different substitution models. In parallel, a genomic alignment that includes the ncRNA genes and their flanking sequence regions is built with PRANK. This alignment is used to create two additional phylogenetic trees using the neighbour-joining (NJ) and maximum-likelihood (ML) methods. The trees arising from both the ncRNA and genomic alignments are merged using TreeBeST, which reconciles them with the species tree in order to identify speciation and duplication events. The final tree is used to infer the orthologues and paralogues following Fitch's definition. We also determine gene gain and loss events for each family using CAFE. All data are accessible through the Ensembl Comparative Genomics ('Compara') API, on our FTP site and are fully integrated in the Ensembl genome browser, where they can be accessed in a user-friendly manner. Database URL: http://www.ensembl.org.

Subject(s)

Computational Biology/methods , RNA, Untranslated/genetics , Vertebrates/genetics , Algorithms , Animals , Ciona intestinalis , Cyprinodontiformes , Evolution, Molecular , Genetic Variation , Genome , Genomics , Humans , Likelihood Functions , Mice , Nucleic Acid Conformation , Open Reading Frames , Phylogeny , Rats , Zebrafish
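
The last inference step in the abstract above (orthologues and paralogues read off a reconciled tree following Fitch's definition) can be sketched as follows. This is an illustration under stated assumptions, not the Ensembl Compara implementation: it assumes each internal node of the gene tree has already been labelled as a speciation or duplication event, and the `Node` class, the function names and the gene names are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import combinations


@dataclass
class Node:
    event: str = ""   # "speciation" or "duplication" (internal nodes only)
    gene: str = ""    # gene name (leaves only)
    children: list = field(default_factory=list)


def leaves(node):
    """Collect the gene names below a node."""
    if not node.children:
        return [node.gene]
    names = []
    for child in node.children:
        names.extend(leaves(child))
    return names


def classify_pairs(node, result=None):
    """Label each gene pair by the event at its last common ancestor:
    speciation gives orthologues, duplication gives paralogues."""
    if result is None:
        result = {}
    if not node.children:
        return result
    kind = "ortholog" if node.event == "speciation" else "paralog"
    groups = [leaves(child) for child in node.children]
    # Pairs split across two children have this node as their last common ancestor.
    for left, right in combinations(groups, 2):
        for a in left:
            for b in right:
                result[frozenset((a, b))] = kind
    for child in node.children:
        classify_pairs(child, result)
    return result


# Toy reconciled tree: an ancestral duplication followed by a
# human/mouse speciation in each of the two copies.
copy_a = Node(event="speciation",
              children=[Node(gene="humanA"), Node(gene="mouseA")])
copy_b = Node(event="speciation",
              children=[Node(gene="humanB"), Node(gene="mouseB")])
tree = Node(event="duplication", children=[copy_a, copy_b])

pairs = classify_pairs(tree)
```

Here `humanA`/`mouseA` and `humanB`/`mouseB` come out as orthologues, while every pair that spans the two copies is a paralogue because its last common ancestor is the duplication node.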

19.

Ensembl comparative genomics resources.

Herrero, Javier; Muffato, Matthieu; Beal, Kathryn; Fitzgerald, Stephen; Gordon, Leo; Pignatelli, Miguel; Vilella, Albert J; Searle, Stephen M J; Amode, Ridwan; Brent, Simon; Spooner, William; Kulesha, Eugene; Yates, Andrew; Flicek, Paul.

Database (Oxford) ; 2016, 2016.

Article in English | MEDLINE | ID: mdl-26896847

ABSTRACT

Evolution provides the unifying framework with which to understand biology. The coherent investigation of genic and genomic data often requires comparative genomics analyses based on whole-genome alignments, sets of homologous genes and other relevant datasets in order to evaluate and answer evolutionary-related questions. However, the complexity and computational requirements of producing such data are substantial: this has led to only a small number of reference resources that are used for most comparative analyses. The Ensembl comparative genomics resources are one such reference set that facilitates comprehensive and reproducible analysis of chordate genome data. Ensembl computes pairwise and multiple whole-genome alignments from which large-scale synteny, per-base conservation scores and constrained elements are obtained. Gene alignments are used to define Ensembl Protein Families, GeneTrees and homologies for both protein-coding and non-coding RNA genes. These resources are updated frequently and have a consistent informatics infrastructure and data presentation across all supported species. Specialized web-based visualizations are also available including synteny displays, collapsible gene tree plots, a gene family locator and different alignment views. The Ensembl comparative genomics infrastructure is extensively reused for the analysis of non-vertebrate species by other projects including Ensembl Genomes and Gramene and much of the information here is relevant to these projects. The consistency of the annotation across species and the focus on vertebrates makes Ensembl an ideal system to perform and support vertebrate comparative genomic analyses. We use robust software and pipelines to produce reference comparative data and make it freely available. Database URL: http://www.ensembl.org.

Subject(s)

Computational Biology/methods , Genome , Genomics , Algorithms , Animals , DNA, Complementary/genetics , Databases, Genetic , Evolution, Molecular , Expressed Sequence Tags , Humans , Phylogeny , Quality Control , RNA, Untranslated/genetics , Sequence Alignment , Sequence Analysis, RNA , Software

20.

Ensembl 2016.

Yates, Andrew; Akanni, Wasiu; Amode, M Ridwan; Barrell, Daniel; Billis, Konstantinos; Carvalho-Silva, Denise; Cummins, Carla; Clapham, Peter; Fitzgerald, Stephen; Gil, Laurent; Girón, Carlos García; Gordon, Leo; Hourlier, Thibaut; Hunt, Sarah E; Janacek, Sophie H; Johnson, Nathan; Juettemann, Thomas; Keenan, Stephen; Lavidas, Ilias; Martin, Fergal J; Maurel, Thomas; McLaren, William; Murphy, Daniel N; Nag, Rishi; Nuhn, Michael; Parker, Anne; Patricio, Mateus; Pignatelli, Miguel; Rahtz, Matthew; Riat, Harpreet Singh; Sheppard, Daniel; Taylor, Kieron; Thormann, Anja; Vullo, Alessandro; Wilder, Steven P; Zadissa, Amonida; Birney, Ewan; Harrow, Jennifer; Muffato, Matthieu; Perry, Emily; Ruffier, Magali; Spudich, Giulietta; Trevanion, Stephen J; Cunningham, Fiona; Aken, Bronwen L; Zerbino, Daniel R; Flicek, Paul.

Nucleic Acids Res ; 44(D1): D710-6, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26687719

ABSTRACT

The Ensembl project (http://www.ensembl.org) is a system for genome annotation, analysis, storage and dissemination designed to facilitate the access of genomic annotation from chordates and key model organisms. It provides access to data from 87 species across our main and early access Pre! websites. This year we introduced three newly annotated species and released numerous updates across our supported species with a concentration on data for the latest genome assemblies of human, mouse, zebrafish and rat. We also provided two data updates for the previous human assembly, GRCh37, through a dedicated website (http://grch37.ensembl.org). Our tools, in particular the VEP, have been improved significantly through integration of additional third party data. REST is now capable of larger-scale analysis and our regulatory data BioMart can deliver faster results. The website is now capable of displaying long-range interactions such as those found in cis-regulated datasets. Finally we have launched a website optimized for mobile devices providing views of genes, variants and phenotypes. Our data is made available without restriction and all code is available from our GitHub organization site (http://github.com/Ensembl) under an Apache 2.0 license.

Subject(s)

Databases, Genetic , Genomics , Molecular Sequence Annotation , Animals , Genes , Genetic Variation , Humans , Internet , Mice , Proteins/genetics , Rats , Regulatory Sequences, Nucleic Acid , Software
