Search | Virtual Health Library

1.

The complete sequence of a human Y chromosome.

Rhie, Arang; Nurk, Sergey; Cechova, Monika; Hoyt, Savannah J; Taylor, Dylan J; Altemose, Nicolas; Hook, Paul W; Koren, Sergey; Rautiainen, Mikko; Alexandrov, Ivan A; Allen, Jamie; Asri, Mobin; Bzikadze, Andrey V; Chen, Nae-Chyun; Chin, Chen-Shan; Diekhans, Mark; Flicek, Paul; Formenti, Giulio; Fungtammasan, Arkarachai; Garcia Giron, Carlos; Garrison, Erik; Gershman, Ariel; Gerton, Jennifer L; Grady, Patrick G S; Guarracino, Andrea; Haggerty, Leanne; Halabian, Reza; Hansen, Nancy F; Harris, Robert; Hartley, Gabrielle A; Harvey, William T; Haukness, Marina; Heinz, Jakob; Hourlier, Thibaut; Hubley, Robert M; Hunt, Sarah E; Hwang, Stephen; Jain, Miten; Kesharwani, Rupesh K; Lewis, Alexandra P; Li, Heng; Logsdon, Glennis A; Lucas, Julian K; Makalowski, Wojciech; Markovic, Christopher; Martin, Fergal J; Mc Cartney, Ann M; McCoy, Rajiv C; McDaniel, Jennifer; McNulty, Brandy M.

Nature ; 621(7978): 344-354, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37612512

ABSTRACT

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications [1-3]. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished [4,5]. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region. We have combined T2T-Y with a previous assembly of the CHM13 genome [4] and mapped available population variation, clinical variants and functional genomics data to produce a complete and comprehensive reference sequence for all 24 human chromosomes.

Subjects

Human Y Chromosomes, Genomics, DNA Sequence Analysis, Humans, Base Sequence, Human Y Chromosomes/genetics, Satellite DNA/genetics, Genetic Variation/genetics, Population Genetics, Genomics/methods, Genomics/standards, Heterochromatin/genetics, Multigene Family/genetics, Reference Standards, Genomic Segmental Duplications/genetics, DNA Sequence Analysis/standards, Tandem Repeat Sequences/genetics, Telomere/genetics

2.

Chiropterans Are a Hotspot for Horizontal Transfer of DNA Transposons in Mammalia.

Paulat, Nicole S; Storer, Jessica M; Moreno-Santillán, Diana D; Osmanski, Austin B; Sullivan, Kevin A M; Grimshaw, Jenna R; Korstian, Jennifer; Halsey, Michaela; Garcia, Carlos J; Crookshanks, Claudia; Roberts, Jaquelyn; Smit, Arian F A; Hubley, Robert; Rosen, Jeb; Teeling, Emma C; Vernes, Sonja C; Myers, Eugene; Pippel, Martin; Brown, Thomas; Hiller, Michael; Rojas, Danny; Dávalos, Liliana M; Lindblad-Toh, Kerstin; Karlsson, Elinor K; Ray, David A.

Mol Biol Evol ; 40(5), 2023 05 02.

Article in English | MEDLINE | ID: mdl-37071810

ABSTRACT

Horizontal transfer of transposable elements (TEs) is an important mechanism contributing to genetic diversity and innovation. Bats (order Chiroptera) have repeatedly been shown to experience horizontal transfer of TEs at what appears to be a high rate compared with other mammals. We investigated the occurrence of horizontally transferred (HT) DNA transposons involving bats. We found over 200 putative HT elements within bats; 16 transposons were shared across distantly related mammalian clades, and 2 other elements were shared with a fish and two lizard species. Our results indicate that bats are a hotspot for horizontal transfer of DNA transposons. These events broadly coincide with the diversification of several bat clades, supporting the hypothesis that DNA transposon invasions have contributed to genetic diversification of bats.

Subjects

Chiroptera, DNA Transposable Elements, Animals, DNA Transposable Elements/genetics, Chiroptera/genetics, Horizontal Gene Transfer, Molecular Evolution, Mammals/genetics, Phylogeny

3.

RepeatModeler2 for automated genomic discovery of transposable element families.

Flynn, Jullien M; Hubley, Robert; Goubert, Clément; Rosen, Jeb; Clark, Andrew G; Feschotte, Cédric; Smit, Arian F.

Proc Natl Acad Sci U S A ; 117(17): 9451-9457, 2020 04 28.

Article in English | MEDLINE | ID: mdl-32300014

ABSTRACT

The accelerating pace of genome sequencing throughout the tree of life is driving the need for improved unsupervised annotation of genome components such as transposable elements (TEs). Because the types and sequences of TEs are highly variable across species, automated TE discovery and annotation are challenging and time-consuming tasks. A critical first step is the de novo identification and accurate compilation of sequence models representing all of the unique TE families dispersed in the genome. Here we introduce RepeatModeler2, a pipeline that greatly facilitates this process. This program brings substantial improvements over the original version of RepeatModeler, one of the most widely used tools for TE discovery. In particular, this version incorporates a module for structural discovery of complete long terminal repeat (LTR) retroelements, which are widespread in eukaryotic genomes but recalcitrant to automated identification because of their size and sequence complexity. We benchmarked RepeatModeler2 on three model species with diverse TE landscapes and high-quality, manually curated TE libraries: Drosophila melanogaster (fruit fly), Danio rerio (zebrafish), and Oryza sativa (rice). In these three species, RepeatModeler2 identified approximately 3 times more consensus sequences matching with >95% sequence identity and sequence coverage to the manually curated sequences than the original RepeatModeler. As expected, the greatest improvement is for LTR retroelements. Thus, RepeatModeler2 represents a valuable addition to the genome annotation toolkit that will enhance the identification and study of TEs in eukaryotic genome sequences. RepeatModeler2 is available as source code or a containerized package under an open license (https://github.com/Dfam-consortium/RepeatModeler, http://www.repeatmasker.org/RepeatModeler/).

Subjects

DNA Transposable Elements/genetics, Genomics/methods, Animals, Drosophila melanogaster/genetics, Genome, Oryza/genetics, Software, Zebrafish/genetics

4.

Survey of COVID-19 Vaccine Attitudes in Predominately Minority Pregnant Women.

Bonilla, Engelbert; Fogel, Joshua; Hubley, Robert; Anand, Rahul; Liu, Paul C.

South Med J ; 116(8): 677-682, 2023 08.

Article in English | MEDLINE | ID: mdl-37536694

ABSTRACT

OBJECTIVES: Despite recommendations for coronavirus disease (COVID-19) vaccination during pregnancy, some pregnant women are concerned about COVID-19 vaccines and decline to be vaccinated. This study focuses on attitudes in a sample of mostly minority pregnant Hispanic and Black women that may influence vaccine hesitancy. METHODS: This was a cross-sectional survey of 400 pregnant women. Participants were provided with a one-page information sheet on pregnancy health, COVID-19 health, and COVID-19 vaccines. They were then asked to complete a survey on attitudes about these topics. RESULTS: We found that attitudes for knowing about the health topics were in the range from agree to strongly agree, whereas attitudes for knowing about topics pertaining to COVID-19 messenger RNA (mRNA) vaccines were in a lower-level range from neutral to agree. Negative vaccine attitudes were significantly associated with decreased agreement for knowing about health attitudes, but not significantly associated with COVID-19 mRNA vaccine attitudes. CONCLUSIONS: COVID-19 vaccine mRNA technology was a lesser understood topic than attitudes for knowing about other health topics. This finding suggests the need for physician intervention and that further education about COVID-19 vaccine mRNA technology may influence patient attitudes toward acceptance of the COVID-19 mRNA vaccine in pregnancy.

Subjects

COVID-19 Vaccines, COVID-19, Pregnancy, Female, Humans, Cross-Sectional Studies, Pregnant Women, COVID-19/epidemiology, COVID-19/prevention & control, Attitude to Health, Vaccination, Messenger RNA

5.

The transposable element-rich genome of the cereal pest Sitophilus oryzae.

Parisot, Nicolas; Vargas-Chávez, Carlos; Goubert, Clément; Baa-Puyoulet, Patrice; Balmand, Séverine; Beranger, Louis; Blanc, Caroline; Bonnamour, Aymeric; Boulesteix, Matthieu; Burlet, Nelly; Calevro, Federica; Callaerts, Patrick; Chancy, Théo; Charles, Hubert; Colella, Stefano; Da Silva Barbosa, André; Dell'Aglio, Elisa; Di Genova, Alex; Febvay, Gérard; Gabaldón, Toni; Galvão Ferrarini, Mariana; Gerber, Alexandra; Gillet, Benjamin; Hubley, Robert; Hughes, Sandrine; Jacquin-Joly, Emmanuelle; Maire, Justin; Marcet-Houben, Marina; Masson, Florent; Meslin, Camille; Montagné, Nicolas; Moya, Andrés; Ribeiro de Vasconcelos, Ana Tereza; Richard, Gautier; Rosen, Jeb; Sagot, Marie-France; Smit, Arian F A; Storer, Jessica M; Vincent-Monegat, Carole; Vallier, Agnès; Vigneron, Aurélien; Zaidman-Rémy, Anna; Zamoum, Waël; Vieira, Cristina; Rebollo, Rita; Latorre, Amparo; Heddi, Abdelaziz.

BMC Biol ; 19(1): 241, 2021 11 09.

Article in English | MEDLINE | ID: mdl-34749730

ABSTRACT

BACKGROUND: The rice weevil Sitophilus oryzae is one of the most important agricultural pests, causing extensive damage to cereal in fields and to stored grains. S. oryzae has an intracellular symbiotic relationship (endosymbiosis) with the Gram-negative bacterium Sodalis pierantonius and is a valuable model to decipher host-symbiont molecular interactions. RESULTS: We sequenced the Sitophilus oryzae genome using a combination of short and long reads to produce the best assembly for a Curculionidae species to date. We show that S. oryzae has undergone successive bursts of transposable element (TE) amplification, representing 72% of the genome. In addition, we show that many TE families are transcriptionally active, and changes in their expression are associated with insect endosymbiotic state. S. oryzae has undergone a high gene expansion rate, when compared to other beetles. Reconstruction of host-symbiont metabolic networks revealed that, despite its recent association with cereal weevils (30,000 years), S. pierantonius relies on the host for several amino acids and nucleotides to survive and to produce vitamins and essential amino acids required for insect development and cuticle biosynthesis. CONCLUSIONS: Here we present the genome of an agricultural pest beetle, which may act as a foundation for pest control. In addition, S. oryzae may be a useful model for endosymbiosis, and studying TE evolution and regulation, along with the impact of TEs on eukaryotic genomes.

Subjects

Beetles, Weevils, Animals, Cell Communication, DNA Transposable Elements/genetics, Edible Grain, Humans, Weevils/genetics

6.

Gibbon genome and the fast karyotype evolution of small apes.

Carbone, Lucia; Harris, R Alan; Gnerre, Sante; Veeramah, Krishna R; Lorente-Galdos, Belen; Huddleston, John; Meyer, Thomas J; Herrero, Javier; Roos, Christian; Aken, Bronwen; Anaclerio, Fabio; Archidiacono, Nicoletta; Baker, Carl; Barrell, Daniel; Batzer, Mark A; Beal, Kathryn; Blancher, Antoine; Bohrson, Craig L; Brameier, Markus; Campbell, Michael S; Capozzi, Oronzo; Casola, Claudio; Chiatante, Giorgia; Cree, Andrew; Damert, Annette; de Jong, Pieter J; Dumas, Laura; Fernandez-Callejo, Marcos; Flicek, Paul; Fuchs, Nina V; Gut, Ivo; Gut, Marta; Hahn, Matthew W; Hernandez-Rodriguez, Jessica; Hillier, LaDeana W; Hubley, Robert; Ianc, Bianca; Izsvák, Zsuzsanna; Jablonski, Nina G; Johnstone, Laurel M; Karimpour-Fard, Anis; Konkel, Miriam K; Kostka, Dennis; Lazar, Nathan H; Lee, Sandra L; Lewis, Lora R; Liu, Yue; Locke, Devin P; Mallick, Swapan; Mendez, Fernando L.

Nature ; 513(7517): 195-201, 2014 Sep 11.

Article in English | MEDLINE | ID: mdl-25209798

ABSTRACT

Gibbons are small arboreal apes that display an accelerated rate of evolutionary chromosomal rearrangement and occupy a key node in the primate phylogeny between Old World monkeys and great apes. Here we present the assembly and analysis of a northern white-cheeked gibbon (Nomascus leucogenys) genome. We describe the propensity for a gibbon-specific retrotransposon (LAVA) to insert into chromosome segregation genes and alter transcription by providing a premature termination site, suggesting a possible molecular mechanism for the genome plasticity of the gibbon lineage. We further show that the gibbon genera (Nomascus, Hylobates, Hoolock and Symphalangus) experienced a near-instantaneous radiation ~5 million years ago, coincident with major geographical changes in southeast Asia that caused cycles of habitat compression and expansion. Finally, we identify signatures of positive selection in genes important for forelimb development (TBX5) and connective tissues (COL1A1) that may have been involved in the adaptation of gibbons to their arboreal habitat.

Subjects

Genome/genetics, Hylobates/classification, Hylobates/genetics, Karyotype, Phylogeny, Animals, Molecular Evolution, Hominidae/classification, Hominidae/genetics, Humans, Molecular Sequence Data, Retroelements/genetics, Genetic Selection, Genetic Transcription Termination

7.

Discovery of a new repeat family in the Callithrix jacchus genome.

Konkel, Miriam K; Ullmer, Brygg; Arceneaux, Erika L; Sanampudi, Sreeja; Brantley, Sarah A; Hubley, Robert; Smit, Arian F A; Batzer, Mark A.

Genome Res ; 26(5): 649-59, 2016 05.

Article in English | MEDLINE | ID: mdl-26916108

ABSTRACT

We identified a novel repeat family, termed Platy-1, in the Callithrix jacchus (common marmoset) genome that arose around the time of the divergence of platyrrhines and catarrhines and established itself as a repeat family in New World monkeys (NWMs). A full-length Platy-1 element is ~100 bp in length, making it the shortest known short interspersed element (SINE) in primates, and harbors features characteristic of non-LTR retrotransposons. We identified 2268 full-length Platy-1 elements across 62 subfamilies in the common marmoset genome. Our subfamily reconstruction and phylogenetic analyses support Platy-1 propagation throughout the evolution of NWMs in the lineage leading to C. jacchus. Platy-1 appears to have reached its amplification peak in the common ancestor of current day marmosets and has since moderately declined. However, identification of more than 200 Platy-1 elements identical to their respective consensus sequence, and the presence of polymorphic elements within common marmoset populations, suggests ongoing retrotransposition activity. Platy-1, a SINE, appears to have originated from an Alu element, and hence is likely derived from 7SL RNA. Our analyses illustrate the birth of a new repeat family and its propagation dynamics in the lineage leading to the common marmoset over the last 40 million years.

Subjects

Alu Elements, Callithrix/genetics, Molecular Evolution, Phylogeny, Retroelements, Animals

8.

The Dfam database of repetitive DNA families.

Hubley, Robert; Finn, Robert D; Clements, Jody; Eddy, Sean R; Jones, Thomas A; Bao, Weidong; Smit, Arian F A; Wheeler, Travis J.

Nucleic Acids Res ; 44(D1): D81-9, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26612867

ABSTRACT

Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download.

Subjects

DNA Transposable Elements, DNA/chemistry, Nucleic Acid Databases, Nucleic Acid Repetitive Sequences, Animals, DNA/classification, Genome, Humans, Internet, Markov Chains, Mice, Molecular Sequence Annotation, Sequence Alignment

9.

The UCSC Genome Browser database: 2015 update.

Rosenbloom, Kate R; Armstrong, Joel; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R; Fujita, Pauline A; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T; Li, Chin H; Miga, Karen H; Nguyen, Ngan; Paten, Benedict; Raney, Brian J; Smit, Arian F A; Speir, Matthew L; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James.

Nucleic Acids Res ; 43(Database issue): D670-81, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25428374

ABSTRACT

Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), 'mined the web' for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled.

Subjects

Nucleic Acid Databases, Genomics, Animals, Cricetinae, Dogs, Ebolavirus/genetics, Gene Expression, Genome, Internet, Mice, Molecular Sequence Annotation, Phenotype, Rats, Software

10.

Relationship estimation from whole-genome sequence data.

Li, Hong; Glusman, Gustavo; Hu, Hao; Caballero, Juan; Hubley, Robert; Witherspoon, David; Guthery, Stephen L; Mauldin, Denise E; Jorde, Lynn B; Hood, Leroy; Roach, Jared C; Huff, Chad D.

PLoS Genet ; 10(1): e1004144, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24497848

ABSTRACT

The determination of the relationship between a pair of individuals is a fundamental application of genetics. Previously, we and others have demonstrated that identity-by-descent (IBD) information generated from high-density single-nucleotide polymorphism (SNP) data can greatly improve the power and accuracy of genetic relationship detection. Whole-genome sequencing (WGS) marks the final step in increasing genetic marker density by assaying all single-nucleotide variants (SNVs), and thus has the potential to further improve relationship detection by enabling more accurate detection of IBD segments and more precise resolution of IBD segment boundaries. However, WGS introduces new complexities that must be addressed in order to achieve these improvements in relationship detection. To evaluate these complexities, we estimated genetic relationships from WGS data for 1490 known pairwise relationships among 258 individuals in 30 families along with 46 population samples as controls. We identified several genomic regions with excess pairwise IBD in both the pedigree and control datasets using three established IBD methods: GERMLINE, fastIBD, and ISCA. These spurious IBD segments produced a 10-fold increase in the rate of detected false-positive relationships among controls compared to high-density microarray datasets. To address this issue, we developed a new method to identify and mask genomic regions with excess IBD. This method, implemented in ERSA 2.0, fully resolved the inflated cryptic relationship detection rates while improving relationship estimation accuracy. ERSA 2.0 detected all 1st through 6th degree relationships, and 55% of 9th through 11th degree relationships in the 30 families. We estimate that WGS data provides a 5% to 15% increase in relationship detection power relative to high-density microarray data for distant relationships. Our results identify regions of the genome that are highly problematic for IBD mapping and introduce new software to accurately detect 1st through 9th degree relationships from whole-genome sequence data.

Subjects

Chromosome Mapping/methods, Population Genetics, Single Nucleotide Polymorphism/genetics, Software, Algorithms, Genetic Linkage, Human Genome, Genomics, Germ-Line Mutation/genetics, High-Throughput Nucleotide Sequencing, Humans, Pedigree

11.

Dfam: a database of repetitive DNA based on profile hidden Markov models.

Wheeler, Travis J; Clements, Jody; Eddy, Sean R; Hubley, Robert; Jones, Thomas A; Jurka, Jerzy; Smit, Arian F A; Finn, Robert D.

Nucleic Acids Res ; 41(Database issue): D70-82, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23203985

ABSTRACT

We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.

Subjects

DNA Transposable Elements, Nucleic Acid Databases, Human Genome, Humans, Internet, Markov Chains, Statistical Models, Molecular Sequence Annotation

12.

Chromosomal haplotypes by genetic phasing of human families.

Roach, Jared C; Glusman, Gustavo; Hubley, Robert; Montsaroff, Stephen Z; Holloway, Alisha K; Mauldin, Denise E; Srivastava, Deepak; Garg, Vidu; Pollard, Katherine S; Galas, David J; Hood, Leroy; Smit, Arian F A.

Am J Hum Genet ; 89(3): 382-97, 2011 Sep 09.

Article in English | MEDLINE | ID: mdl-21855840

ABSTRACT

Assignment of alleles to haplotypes for nearly all the variants on all chromosomes can be performed by genetic analysis of a nuclear family with three or more children. Whole-genome sequence data enable deterministic phasing of nearly all sequenced alleles by permitting assignment of recombinations to precise chromosomal positions and specific meioses. We demonstrate this process of genetic phasing on two families each with four children. We generate haplotypes for all of the children and their parents; these haplotypes span all genotyped positions, including rare variants. Misassignments of phase between variants (switch errors) are nearly absent. Our algorithm can also produce multimegabase haplotypes for nuclear families with just two children and can handle families with missing individuals. We implement our algorithm in a suite of software scripts (Haploscribe). Haplotypes and family genome sequences will become increasingly important for personalized medicine and for fundamental biology.

Subjects

Algorithms, Human Chromosomes/genetics, Genetic Variation, Haplotypes/genetics, Inheritance Patterns/genetics, Genetic Models, Software, Humans, Mutation/genetics, Pedigree, DNA Sequence Analysis/methods

13.

Insights into mammalian TE diversity through the curation of 248 genome assemblies.

Osmanski, Austin B; Paulat, Nicole S; Korstian, Jenny; Grimshaw, Jenna R; Halsey, Michaela; Sullivan, Kevin A M; Moreno-Santillán, Diana D; Crookshanks, Claudia; Roberts, Jacquelyn; Garcia, Carlos; Johnson, Matthew G; Densmore, Llewellyn D; Stevens, Richard D; Rosen, Jeb; Storer, Jessica M; Hubley, Robert; Smit, Arian F A; Dávalos, Liliana M; Karlsson, Elinor K; Lindblad-Toh, Kerstin; Ray, David A.

Science ; 380(6643): eabn1430, 2023 04 28.

Article in English | MEDLINE | ID: mdl-37104570

ABSTRACT

We examined transposable element (TE) content of 248 placental mammal genome assemblies, the largest de novo TE curation effort in eukaryotes to date. We found that although mammals resemble one another in total TE content and diversity, they show substantial differences with regard to recent TE accumulation. This includes multiple recent expansion and quiescence events across the mammalian tree. Young TEs, particularly long interspersed elements, drive increases in genome size, whereas DNA transposons are associated with smaller genomes. Mammals tend to accumulate only a few types of TEs at any given time, with one TE type dominating. We also found an association between dietary habit and the presence of DNA transposon invasions. These detailed annotations will serve as a benchmark for future comparative TE analyses among placental mammals.

Subjects

DNA Transposable Elements, Eutheria, Molecular Evolution, Genetic Variation, Animals, Female, Pregnancy, Long Interspersed Nucleotide Elements, Eutheria/genetics, Datasets as Topic, Feeding Behavior

14.

Accuracy of multiple sequence alignment methods in the reconstruction of transposable element families.

Hubley, Robert; Wheeler, Travis J; Smit, Arian F A.

NAR Genom Bioinform ; 4(2): lqac040, 2022 Jun.

Article in English | MEDLINE | ID: mdl-35591887

ABSTRACT

The construction of a high-quality multiple sequence alignment (MSA) from copies of a transposable element (TE) is a critical step in the characterization of a new TE family. Most studies of MSA accuracy have been conducted on protein or RNA sequence families, where structural features and strong signals of selection may assist with alignment. Less attention has been given to the quality of sequence alignments involving neutrally evolving DNA sequences such as those resulting from TE replication. Transposable element sequences are challenging to align due to their wide divergence ranges, fragmentation, and predominantly-neutral mutation patterns. To gain insight into the effects of these properties on MSA accuracy, we developed a simulator of TE sequence evolution, and used it to generate a benchmark with which we evaluated the MSA predictions produced by several popular aligners, along with Refiner, a method we developed in the context of our RepeatModeler software. We find that MAFFT and Refiner generally outperform other aligners for low to medium divergence simulated sequences, while Refiner is uniquely effective when tasked with aligning high-divergent and fragmented instances of a family.

15.

Methodologies for the De novo Discovery of Transposable Element Families.

Storer, Jessica M; Hubley, Robert; Rosen, Jeb; Smit, Arian F A.

Genes (Basel) ; 13(4), 2022 04 17.

Article in English | MEDLINE | ID: mdl-35456515

ABSTRACT

The discovery and characterization of transposable element (TE) families are crucial tasks in the process of genome annotation. Careful curation of TE libraries for each organism is necessary as each has been exposed to a unique and often complex set of TE families. De novo methods have been developed; however, a fully automated and accurate approach to the development of complete libraries remains elusive. In this review, we cover established methods and recent developments in de novo TE analysis. We also present various methodologies used to assess these tools and discuss opportunities for further advancement of the field.

Subjects

DNA Transposable Elements, DNA Transposable Elements/genetics

16.

Curation Guidelines for de novo Generated Transposable Element Families.

Storer, Jessica M; Hubley, Robert; Rosen, Jeb; Smit, Arian F A.

Curr Protoc ; 1(6): e154, 2021 Jun.

Article in English | MEDLINE | ID: mdl-34138525

ABSTRACT

Transposable elements (TEs) have the ability to alter individual genomic landscapes and shape the course of evolution for species in which they reside. Such profound changes can be understood by studying the biology of the organism and the interplay of the TEs it hosts. Characterizing and curating TEs across a wide range of species is a fundamental first step in this endeavor. This protocol employs techniques honed while developing TE libraries for a wide range of organisms and specifically addresses: (1) the extension of truncated de novo results into full-length TE families; (2) the iterative refinement of TE multiple sequence alignments; and (3) the use of alignment visualization to assess model completeness and subfamily structure. © 2021 Wiley Periodicals LLC. Basic Protocol: Extension and edge polishing of consensi and seed alignments derived from de novo repeat finders. Support Protocol: Generating seed alignments using a library of consensi and a genome assembly.

Subjects

DNA Transposable Elements, Genomics, DNA Transposable Elements/genetics, Humans, Sequence Alignment

17.

The Dfam community resource of transposable element families, sequence models, and genome annotations.

Storer, Jessica; Hubley, Robert; Rosen, Jeb; Wheeler, Travis J; Smit, Arian F.

Mob DNA ; 12(1): 2, 2021 Jan 12.

Article in English | MEDLINE | ID: mdl-33436076

ABSTRACT

Dfam is an open access database of repetitive DNA families, sequence models, and genome annotations. The 3.0-3.3 releases of Dfam (https://dfam.org) represent an evolution from a proof-of-principle collection of transposable element families in model organisms into a community resource for a broad range of species, and for both curated and uncurated datasets. In addition, releases since Dfam 3.0 provide auxiliary consensus sequence models, transposable element protein alignments, and a formalized classification system to support the growing diversity of organisms represented in the resource. The latest release includes 266,740 new de novo generated transposable element families from 336 species contributed by the EBI. This expansion demonstrates the utility of many of Dfam's new features and provides insight into the long term challenges ahead for improving de novo generated transposable element datasets.

18.

TE Hub: A community-oriented space for sharing and connecting tools, data, resources, and methods for transposable element annotation.

Elliott, Tyler A; Heitkam, Tony; Hubley, Robert; Quesneville, Hadi; Suh, Alexander; Wheeler, Travis J.

Mob DNA ; 12(1): 16, 2021 Jun 21.

Article in English | MEDLINE | ID: mdl-34154643

ABSTRACT

Transposable elements (TEs) play powerful and varied evolutionary and functional roles, and are widespread in most eukaryotic genomes. Research into their unique biology has driven the creation of a large collection of databases, software, classification systems, and annotation guidelines. The diversity of available TE-related methods and resources raises compatibility concerns and can be overwhelming to researchers and communicators seeking straightforward guidance or materials. To address these challenges, we have initiated a new resource, TE Hub, that provides a space where members of the TE community can collaborate to document and create resources and methods. The space consists of (1) a website organized with an open wiki framework, https://tehub.org, (2) a conversation framework via a Twitter account and a Slack channel, and (3) bi-monthly Hub Update video chats on the platform's development. In addition to serving as a centralized repository and communication platform, TE Hub lays the foundation for improved integration, standardization, and effectiveness of diverse tools and protocols. We invite the TE community, both novices and experts in TE identification and analysis, to join us in expanding our community-oriented resource.

19.

A common open representation of mass spectrometry data and its application to proteomics research.

Pedrioli, Patrick G A; Eng, Jimmy K; Hubley, Robert; Vogelzang, Mathijs; Deutsch, Eric W; Raught, Brian; Pratt, Brian; Nilsson, Erik; Angeletti, Ruth H; Apweiler, Rolf; Cheung, Kei; Costello, Catherine E; Hermjakob, Henning; Huang, Sequin; Julian, Randall K; Kapp, Eugene; McComb, Mark E; Oliver, Stephen G; Omenn, Gilbert; Paton, Norman W; Simpson, Richard; Smith, Richard; Taylor, Chris F; Zhu, Weimin; Aebersold, Ruedi.

Nat Biotechnol ; 22(11): 1459-66, 2004 Nov.

Article in English | MEDLINE | ID: mdl-15529173

ABSTRACT

A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary. The diverse, nontransparent nature of the data structure complicates the integration of new instruments into preexisting infrastructure, impedes the analysis, exchange, comparison and publication of results from different experiments and laboratories, and prevents the bioinformatics community from accessing data sets required for software development. Here, we introduce the 'mzXML' format, an open, generic XML (extensible markup language) representation of MS data. We have also developed an accompanying suite of supporting programs. We expect that this format will facilitate data management, interpretation and dissemination in proteomics research.

Subjects

Database Management Systems; Databases, Factual; Information Dissemination/methods; Information Storage and Retrieval/methods; Mass Spectrometry/methods; Proteomics/methods; User-Computer Interface; Information Storage and Retrieval/standards; Mass Spectrometry/standards; Proteome/analysis; Proteome/chemistry; Proteome/classification; Proteomics/standards; Software

20.

A call for benchmarking transposable element annotation methods.

Hoen, Douglas R; Hickey, Glenn; Bourque, Guillaume; Casacuberta, Josep; Cordaux, Richard; Feschotte, Cédric; Fiston-Lavier, Anna-Sophie; Hua-Van, Aurélie; Hubley, Robert; Kapusta, Aurélie; Lerat, Emmanuelle; Maumus, Florian; Pollock, David D; Quesneville, Hadi; Smit, Arian; Wheeler, Travis J; Bureau, Thomas E; Blanchette, Mathieu.

Mob DNA ; 6: 13, 2015.

Article in English | MEDLINE | ID: mdl-26244060

ABSTRACT

DNA derived from transposable elements (TEs) constitutes large parts of the genomes of complex eukaryotes, with major impacts not only on genomic research but also on how organisms evolve and function. Although a variety of methods and tools have been developed to detect and annotate TEs, there are as yet no standard benchmarks, that is, no standard way to measure or compare their accuracy. This lack of accuracy assessment calls into question conclusions from a wide range of research that depends explicitly or implicitly on TE annotation. In the absence of standard benchmarks, toolmakers are impeded in improving their tools, annotators cannot properly assess which tools might best suit their needs, and downstream researchers cannot judge how accuracy limitations might impact their studies. We therefore propose that the TE research community create and adopt standard TE annotation benchmarks, and we call for other researchers to join the authors in making this long-overdue effort a success.
