Results 1 - 5 of 5
1.
PLoS Biol ; 22(3): e3002507, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38451924

ABSTRACT

While the malaria parasite Plasmodium falciparum has low average genome-wide diversity levels, likely due to its recent introduction from a gorilla-infecting ancestor (approximately 10,000 to 50,000 years ago), some genes display extremely high diversity levels. In particular, certain proteins expressed on the surface of human red blood cell-infecting merozoites (merozoite surface proteins (MSPs)) possess exactly 2 deeply diverged lineages that have seemingly not recombined. While of considerable interest, the evolutionary origin of this phenomenon remains unknown. In this study, we analysed the genetic diversity of 2 of the most variable MSPs, DBLMSP and DBLMSP2, which are paralogs (descended from an ancestral duplication). Despite thousands of available Illumina WGS datasets from malaria-endemic countries, diversity in these genes has been hard to characterise as reads containing highly diverged alleles completely fail to align to the reference genome. To solve this, we developed a pipeline leveraging genome graphs, enabling us to genotype them at high accuracy and completeness. Using our newly resolved sequences, we found that both genes exhibit 2 deeply diverged lineages in a specific protein domain (DBL) and that one of the 2 lineages is shared across the genes. We identified clear evidence of nonallelic gene conversion between the 2 genes as the likely mechanism behind sharing, leading us to propose that gene conversion between diverged paralogs, and not recombination suppression, can generate this surprising genealogy; a model that is furthermore consistent with high diversity levels in these 2 genes despite the strong historical P. falciparum transmission bottleneck.
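As a rough illustration of what a "diversity level" can mean numerically, the sketch below computes nucleotide diversity (average pairwise differences per site) on a toy alignment. This is an assumption for illustration only: the abstract does not say which diversity statistic the authors used, and the sequences and the function name `nucleotide_diversity` are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site.

    Assumes the sequences are already aligned and of equal length.
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences
    total_diffs = sum(
        sum(1 for a, b in zip(x, y) if a != b) for x, y in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: two diverged lineages, two copies of each
toy = ["ACGTACGT", "ACGTACGT", "TCGAACCT", "TCGAACCT"]
pi = nucleotide_diversity(toy)
```

In this toy case all differences come from between-lineage pairs, which is the pattern one would expect when a locus carries two deeply diverged lineages.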


Subjects
Hominidae , Falciparum Malaria , Malaria , Parasites , Animals , Humans , Plasmodium falciparum/metabolism , Parasites/metabolism , Gene Conversion , Surface Antigens , Malaria/parasitology , Protozoan Proteins/genetics , Protozoan Proteins/metabolism , Genetic Variation
2.
Genome Biol ; 23(1): 147, 2022 07 05.
Article in English | MEDLINE | ID: mdl-35791022

ABSTRACT

There are many short-read variant-calling tools, with different strengths and weaknesses. We present a tool, Minos, which combines outputs from arbitrary variant callers, increasing recall without loss of precision. We benchmark on 62 samples from three bacterial species and an outbreak of 385 Mycobacterium tuberculosis samples. Minos also enables joint genotyping; we demonstrate on a large (N=13k) M. tuberculosis cohort, building a map of non-synonymous SNPs and indels in a region where all such variants are assumed to cause rifampicin resistance. We quantify the correlation with phenotypic resistance and then replicate in a second cohort (N=10k).
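The idea of combining call sets from several variant callers can be sketched as follows. This is a toy illustration, not Minos's method: the abstract does not describe how conflicting calls are adjudicated, so the `support` score and the rule "keep the best-supported call per site" are assumptions, as are the names `Call` and `merge_calls`.

```python
from typing import NamedTuple


class Call(NamedTuple):
    chrom: str
    pos: int        # 1-based position, as in VCF
    ref: str
    alt: str
    support: float  # hypothetical per-call score, not a Minos field


def merge_calls(*callsets):
    """Union calls from several callers.

    Where two callers report different calls at the same site,
    keep the one with the higher (hypothetical) support score.
    """
    best = {}
    for calls in callsets:
        for call in calls:
            key = (call.chrom, call.pos)
            if key not in best or call.support > best[key].support:
                best[key] = call
    return sorted(best.values(), key=lambda c: (c.chrom, c.pos))


# Hypothetical outputs from two callers
caller_a = [Call("chr1", 100, "A", "G", 0.9), Call("chr1", 250, "C", "T", 0.4)]
caller_b = [Call("chr1", 250, "C", "CA", 0.7), Call("chr1", 400, "G", "A", 0.8)]
merged = merge_calls(caller_a, caller_b)
```

Taking the union is what raises recall; some adjudication step at conflicting sites is what would be needed to avoid losing precision.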


Subjects
High-Throughput Nucleotide Sequencing , Mycobacterium tuberculosis , Bacterial Genome , Genotype , Humans , INDEL Mutation , Mycobacterium tuberculosis/genetics , Single Nucleotide Polymorphism
3.
Genome Biol ; 22(1): 259, 2021 09 06.
Article in English | MEDLINE | ID: mdl-34488837

ABSTRACT

Genome graphs allow very general representations of genetic variation; depending on the model and implementation, variation at different length-scales (single nucleotide polymorphisms (SNPs), structural variants) and on different sequence backgrounds can be incorporated with different levels of transparency. We implement a model which handles this multiscale variation and develop a JSON extension of VCF (jVCF) allowing for variant calls on multiple references, both implemented in our software gramtools. We find gramtools outperforms existing methods for genotyping SNPs overlapping large deletions in M. tuberculosis and is able to genotype on multiple alternate backgrounds in P. falciparum, revealing previously hidden recombination.


Subjects
Algorithms , Genetic Variation , Human Genome , Alleles , Surface Antigens/metabolism , Computer Simulation , Genotyping Techniques , Haplotypes/genetics , Humans , Mycobacterium tuberculosis/genetics , Plasmodium falciparum/genetics , Single Nucleotide Polymorphism/genetics , Reproducibility of Results , Sequence Deletion
4.
Genome Biol ; 22(1): 267, 2021 09 14.
Article in English | MEDLINE | ID: mdl-34521456

ABSTRACT

We present pandora, a novel pan-genome graph structure and algorithms for identifying variants across the full bacterial pan-genome. As much bacterial adaptability hinges on the accessory genome, methods which analyze SNPs in just the core genome have unsatisfactory limitations. Pandora approximates a sequenced genome as a recombinant of references, detects novel variation and pan-genotypes multiple samples. Using a reference graph of 578 Escherichia coli genomes, we compare 20 diverse isolates. Pandora recovers more rare SNPs than single-reference-based tools, is significantly better than picking the closest RefSeq reference, and provides a stable framework for analyzing diverse samples without reference bias.


Subjects
Bacterial Genome , Genomics/methods , Software , Algorithms , Escherichia coli/genetics , Genetic Variation , High-Throughput Nucleotide Sequencing , Nanopore Sequencing , Nucleotides , Sequence Alignment , DNA Sequence Analysis
5.
F1000Res ; 10: 33, 2021.
Article in English | MEDLINE | ID: mdl-34035898

ABSTRACT

Data analysis often entails a multitude of heterogeneous steps, from the application of various command line tools to the usage of scripting languages like R or Python for the generation of plots and tables. It is widely recognized that data analyses should ideally be conducted in a reproducible way. Reproducibility enables technical validation and regeneration of results on the original or even new data. However, reproducibility alone is by no means sufficient to deliver an analysis that is of lasting impact (i.e., sustainable) for the field, or even just one research group. We postulate that it is equally important to ensure adaptability and transparency. The former describes the ability to modify the analysis to answer extended or slightly different research questions. The latter describes the ability to understand the analysis in order to judge whether it is not only technically, but methodologically valid. Here, we analyze the properties needed for a data analysis to become reproducible, adaptable, and transparent. We show how the popular workflow management system Snakemake can be used to guarantee this, and how it enables an ergonomic, combined, unified representation of all steps involved in data analysis, ranging from raw data processing, to quality control and fine-grained, interactive exploration and plotting of final results.
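The target-driven style of workflow that such a system provides can be sketched in plain Python. This is not Snakemake's syntax or API (Snakemake rules are declared in a Snakefile); it is a minimal sketch of the underlying idea, assuming each rule names its inputs and the file it produces. The function `build`, the file names, and the rule table are all hypothetical.

```python
def build(target, rules, existing, log=None):
    """Produce `target` by first producing its inputs, then running its rule.

    `rules` maps an output file to (list of input files, action name).
    `existing` is the set of files already present; it is updated in place.
    Returns the list of actions in the order they would run.
    """
    log = [] if log is None else log
    if target in existing:
        return log  # nothing to do, the file is already there
    if target not in rules:
        raise KeyError(f"no rule produces {target!r}")
    inputs, action = rules[target]
    for dep in inputs:
        build(dep, rules, existing, log)
    log.append(action)
    existing.add(target)
    return log


# Hypothetical analysis: raw reads -> trimming -> alignment and QC -> plots
rules = {
    "trimmed.fq": (["raw.fq"], "trim"),
    "aligned.bam": (["trimmed.fq"], "align"),
    "qc.html": (["trimmed.fq"], "quality_control"),
    "plots.pdf": (["aligned.bam", "qc.html"], "plot"),
}
steps = build("plots.pdf", rules, existing={"raw.fq"})
```

Because every step is declared once with explicit inputs and outputs, the same description serves both to rerun the analysis and to read off what depends on what, and shared steps such as trimming run only once.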


Subjects
Data Analysis , Software , Reproducibility of Results , Workflow