Search | BVS CLAP/SMR-OPS/OMS

Marker genes as predictors of shared genomic function.

Sevigny, Joseph L; Rothenheber, Derek; Diaz, Krystalle Sharlyn; Zhang, Ying; Agustsson, Kristin; Bergeron, R Daniel; Thomas, W Kelley.

BMC Genomics ; 20(1): 268, 2019 Apr 04.

Article in English | MEDLINE | ID: mdl-30947688

ABSTRACT

BACKGROUND: Although high-throughput marker gene studies provide valuable insight into the diversity and relative abundance of taxa in microbial communities, they do not provide direct measures of their functional capacity. Recently, scientists have shown a general desire to predict functional profiles of microbial communities based on phylogenetic identification inferred from marker genes, and recent tools have been developed to link the two. However, to date, no large-scale examination has quantified the correlation between the marker gene based taxonomic identity and protein coding gene conservation. Here we utilize 4872 representative prokaryotic genomes from NCBI to investigate the relationship between marker gene identity and shared protein coding gene content. RESULTS: Even at 99-100% marker gene identity, genomes share on average less than 75% of their protein coding gene content. This occurs regardless of the marker gene(s) used: V4 region of the 16S rRNA, complete 16S rRNA, or single copy orthologs through a multi-locus sequence analysis. An important aspect related to this observation is the intra-organism variation of 16S copies from a single genome. Although the majority of 16S copies were found to have high sequence similarity (> 99%), several genomes contained copies that were highly diverged (< 97% identity). CONCLUSIONS: This is the largest comparison between marker gene similarity and shared protein coding gene content to date. The study highlights the limitations of inferring a microbial community's functions based on marker gene phylogeny. The data presented expands upon the results of previous studies that examined one or few bacterial species and supports the hypothesis that 16S rRNA and other marker genes cannot be directly used to fully predict the functional potential of a bacterial community.

Subject(s)

Bacteria/classification , Bacteria/genetics , Genes, Bacterial/physiology , Genetic Markers , Genome, Bacterial , Metagenome , DNA, Bacterial/genetics , Evolution, Molecular , Genes, Bacterial/genetics , Microbiota , Phylogeny , RNA, Ribosomal, 16S/genetics , Sequence Analysis, DNA/methods

PALADIN: protein alignment for functional profiling whole metagenome shotgun data.

Westbrook, Anthony; Ramsdell, Jordan; Schuelke, Taruna; Normington, Louisa; Bergeron, R Daniel; Thomas, W Kelley; MacManes, Matthew D.

Bioinformatics ; 33(10): 1473-1478, 2017 May 15.

Article in English | MEDLINE | ID: mdl-28158639

ABSTRACT

MOTIVATION: Whole metagenome shotgun sequencing is a powerful approach for assaying the functional potential of microbial communities. We currently lack tools that efficiently and accurately align DNA reads against protein references, the technique necessary for constructing a functional profile. Here, we present PALADIN, a novel modification of the Burrows-Wheeler Aligner that provides accurate alignment, robust reporting capabilities and orders-of-magnitude improved efficiency by directly mapping in protein space. RESULTS: We compared the accuracy and efficiency of PALADIN against existing tools that employ nucleotide or protein alignment algorithms. Using simulated reads, PALADIN consistently outperformed the popular DNA read mappers BWA and NovoAlign in detected proteins, percentage of reads mapped and ontological similarity. We also compared PALADIN against four existing protein alignment tools: BLASTX, RAPSearch2, DIAMOND and Lambda, using empirically obtained reads. PALADIN yielded results seven times faster than the best performing alternative, DIAMOND, and nearly 8000 times faster than BLASTX. PALADIN's accuracy was comparable to all tested solutions. AVAILABILITY AND IMPLEMENTATION: PALADIN was implemented in C, and its source code and documentation are available at https://github.com/twestbrookunh/paladin. CONTACT: anthonyw@wildcats.unh.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subject(s)

Metagenomics/methods , Sequence Alignment/methods , Sequence Analysis, DNA/methods , Software , Algorithms , Bacteria/genetics , Bacteria/metabolism , High-Throughput Nucleotide Sequencing/methods , Humans , Microbiota/genetics

An SNP-based second-generation genetic map of Daphnia magna and its application to QTL analysis of phenotypic traits.

Routtu, Jarkko; Hall, Matthew D; Albere, Brian; Beisel, Christian; Bergeron, R Daniel; Chaturvedi, Anurag; Choi, Jeong-Hyeon; Colbourne, John; De Meester, Luc; Stephens, Melissa T; Stelzer, Claus-Peter; Solorzano, Eleanne; Thomas, W Kelley; Pfrender, Michael E; Ebert, Dieter.

BMC Genomics ; 15: 1033, 2014 Nov 27.

Article in English | MEDLINE | ID: mdl-25431334

ABSTRACT

BACKGROUND: Although Daphnia is increasingly recognized as a model for ecological genomics and biomedical research, there is, as of yet, no high-resolution genetic map for the genus. Such a map would provide an important tool for mapping phenotypes and assembling the genome. Here we estimate the genome size of Daphnia magna and describe the construction of an SNP array based linkage map. We then test the suitability of the map for life history and behavioural trait mapping. The two parent genotypes used to produce the map derived from D. magna populations with and without fish predation, respectively, and are therefore expected to show divergent behaviour and life-histories. RESULTS: Using flow cytometry we estimated the genome size of D. magna to be about 238 Mb. We developed an SNP array tailored to type SNPs in a D. magna F2 panel and used it to construct a D. magna linkage map, which included 1,324 informative markers. The map produced ten linkage groups ranging from 108.9 to 203.6 cM, with an average distance between markers of 1.13 cM and a total map length of 1,483.6 cM (Kosambi corrected). The physical length per cM is estimated to be 160 kb. Mapping infertility genes, life history traits and behavioural traits on this map revealed several significant QTL peaks and showed a complex pattern of underlying genetics, with different traits showing strongly different genetic architectures. CONCLUSIONS: The new linkage map of D. magna constructed here allowed us to characterize genetic differences among parent genotypes from populations with ecological differences. The QTL effect plots are partially consistent with our expectation of local adaptation under contrasting predation regimes. Furthermore, the new genetic map will be an important tool for the Daphnia research community and will contribute to the physical map of the D. magna genome project and the further mapping of phenotypic traits. The clones used to produce the linkage map are maintained in a stock collection and can be used for mapping QTLs of traits that show variance among the F2 clones.
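
The map statistics quoted in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the figures reported above. How the authors computed the average marker spacing is not stated in the abstract, so dividing the map length by the number of within-group intervals (markers minus linkage groups) is an assumption made here for illustration.

```python
# Figures quoted in the abstract above.
genome_size_mb = 238.0      # flow-cytometry genome size estimate, in megabases
map_length_cm = 1483.6      # total map length in cM (Kosambi corrected)
n_markers = 1324            # informative markers on the map
n_linkage_groups = 10       # linkage groups produced

# Physical length per centimorgan, converted from Mb to kb.
kb_per_cm = genome_size_mb * 1000 / map_length_cm

# Assumption (not stated in the abstract): spacing is averaged over the
# intervals between adjacent markers within each linkage group, so there
# are (markers - linkage groups) intervals in total.
n_intervals = n_markers - n_linkage_groups
mean_spacing_cm = map_length_cm / n_intervals

print(f"{kb_per_cm:.1f} kb per cM")
print(f"{mean_spacing_cm:.2f} cM between markers")
```

Under that assumption both values round to the reported figures of about 160 kb per cM and 1.13 cM between markers.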

Subject(s)

Chromosome Mapping , Daphnia/genetics , Phenotype , Polymorphism, Single Nucleotide , Quantitative Trait Loci , Quantitative Trait, Heritable , Animals , Cluster Analysis , Female , Gene Frequency , Genetic Association Studies , Genetic Linkage , Genetic Markers , Genome , Genome Size , Genotype , Lod Score , Male

Simple sequence repeat variation in the Daphnia pulex genome.

Sung, Way; Tucker, Abraham; Bergeron, R Daniel; Lynch, Michael; Thomas, W Kelley.

BMC Genomics ; 11: 691, 2010 Dec 03.

Article in English | MEDLINE | ID: mdl-21129182

ABSTRACT

BACKGROUND: Simple sequence repeats (SSRs) are highly variable features of all genomes. Their rapid evolution makes them useful for tracing the evolutionary history of populations and investigating patterns of selection and mutation across genomes. The recently sequenced Daphnia pulex genome provides us with a valuable data set to study the mode and tempo of SSR evolution, without the inherent biases that accompany marker selection. RESULTS: Here we catalogue SSR loci in the Daphnia pulex genome with repeated motif sizes of 1-100 nucleotides with a minimum of 3 perfect repeats. We then used whole genome shotgun reads to determine the average heterozygosity of each SSR type and the relationship that it has to repeat number, motif size, motif sequence, and distribution of SSR loci. We find that SSR heterozygosity is motif specific, and positively correlated with repeat number as well as motif size. For non-repeat unit polymorphisms, we identify a motif-dependent end-nucleotide polymorphism bias that may contribute to the patterns of abundance for specific homopolymers, dimers, and trimers. Our observations confirm the high frequency of multiple unit variation (multistep) at large microsatellite loci, and further show that the occurrence of multiple unit variation is dependent on both repeat number and motif size. Using the Daphnia pulex genetic map, we show a positive correlation between dimer and trimer frequency and recombination. CONCLUSIONS: This genome-wide analysis of SSR variation in Daphnia pulex indicates that several aspects of SSR variation are motif dependent and suggests that a combination of unit length variation and end repeat biased base substitution contribute to the unique spectrum of SSR repeat loci.
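
The cataloguing criterion described in this abstract (motif sizes of 1-100 nucleotides with a minimum of 3 perfect repeats) can be illustrated with a regular expression. This is a minimal sketch and not the authors' pipeline: the function name `find_perfect_ssrs` and its parameters are hypothetical, and reporting the shortest repeating unit at each position is a design choice made here.

```python
import re

def find_perfect_ssrs(seq, max_motif=100, min_repeats=3):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper for illustration only.
    """
    # The lazy quantifier tries the shortest motif first; the backreference
    # then requires at least (min_repeats - 1) further identical copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# A dimer repeated four times followed by a homopolymer run of five.
print(find_perfect_ssrs("GGACACACACTTTTTG"))
```

A backreference scan like this is convenient for short sequences but would be slow on a whole genome, where a dedicated repeat finder is the more realistic choice.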

Subject(s)

Daphnia/genetics , Genetic Variation , Genome/genetics , Minisatellite Repeats/genetics , Animals , Genetic Loci/genetics , Heterozygote , Microsatellite Repeats/genetics , Point Mutation/genetics , Polymorphism, Genetic , Recombination, Genetic

Intra-genomic variation in the ribosomal repeats of nematodes.

Bik, Holly M; Fournier, David; Sung, Way; Bergeron, R Daniel; Thomas, W Kelley.

PLoS One ; 8(10): e78230, 2013.

Article in English | MEDLINE | ID: mdl-24147124

ABSTRACT

Ribosomal loci represent a major tool for investigating environmental diversity and community structure via high-throughput marker gene studies of eukaryotes (e.g. 18S rRNA). Since the estimation of species' abundance is a major goal of environmental studies (by counting numbers of sequences), understanding the patterns of rRNA copy number across species will be critical for informing such high-throughput approaches. Such knowledge is critical, given that ribosomal RNA genes exist within multi-copy repeated arrays in a genome. Here we measured the repeat copy number for six nematode species by mapping the sequences from whole genome shotgun libraries against reference sequences for their rRNA repeat. This revealed a 6-fold variation in repeat copy number amongst taxa investigated, with levels of intragenomic variation ranging from 56 to 323 copies of the rRNA array. By applying the same approach to four C. elegans mutation accumulation lines propagated by repeated bottlenecking for an average of ~400 generations, we find on average a 2-fold increase in repeat copy number (rate of increase in rRNA estimated at 0.0285-0.3414 copies per generation), suggesting that rRNA repeat copy number is subject to selection. Within each Caenorhabditis species, the majority of intragenomic variation found across the rRNA repeat was observed within gene regions (18S, 28S, 5.8S), suggesting that such intragenomic variation is not a product of selection for rRNA coding function. We find that the dramatic variation in repeat copy number among these six nematode genomes would limit the use of rRNA in estimates of organismal abundance. In addition, the unique pattern of variation within a single genome was uncorrelated with patterns of divergence between species, reflecting a strong signature of natural selection for rRNA function. A better understanding of the factors that control or affect copy number in these arrays, as well as their rates and patterns of evolution, will be critical for informing estimates of global biodiversity.

Subject(s)

Nematoda/genetics , RNA, Ribosomal/genetics , Animals , Caenorhabditis elegans/classification , Caenorhabditis elegans/genetics , Nematoda/classification , Phylogeny
