Results 1 - 11 of 11
1.
F1000Res ; 12: 1091, 2023.
Article in English | MEDLINE | ID: mdl-38716230

ABSTRACT

Background: Accurate genome sequences form the basis for genomic surveillance programs, the added value of which was impressively demonstrated during the COVID-19 pandemic by tracing transmission chains, discovering new viral lineages and mutations, and assessing them for infectiousness and resistance to available treatments. Amplicon strategies employing Illumina sequencing have become widely established for variant detection and reference-based reconstruction of SARS-CoV-2 genomes, and are routine bioinformatics tasks. Yet, specific challenges arise when analyzing amplicon data, for example, when crucial and even lineage-determining mutations occur near primer sites. Methods: We present CoVpipe2, a bioinformatics workflow developed at the Public Health Institute of Germany to reconstruct SARS-CoV-2 genomes based on short-read sequencing data accurately. The decisive factor here is the reliable, accurate, and rapid reconstruction of genomes, considering the specifics of the used sequencing protocol. Besides fundamental tasks like quality control, mapping, variant calling, and consensus generation, we also implemented additional features to ease the detection of mixed samples and recombinants. Results: We highlight common pitfalls in primer clipping, detecting heterozygote variants, and dealing with low-coverage regions and deletions. We introduce CoVpipe2 to address the above challenges and have compared and successfully validated the pipeline against selected publicly available benchmark datasets. CoVpipe2 features high usability, reproducibility, and a modular design that specifically addresses the characteristics of short-read amplicon protocols but can also be used for whole-genome short-read sequencing data. Conclusions: CoVpipe2 has seen multiple improvement cycles and is continuously maintained alongside frequently updated primer schemes and new developments in the scientific community. 
Our pipeline is easy to set up and use and can serve as a blueprint for other pathogens in the future due to its flexibility and modularity, providing a long-term perspective for continuous support. CoVpipe2 is written in Nextflow and is freely accessible from https://github.com/rki-mf1/CoVpipe2 under the GPL3 license.
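
As an illustration of one pitfall named in the abstract (low-coverage regions), the following minimal Python sketch masks consensus positions whose read depth falls below a threshold. It is not taken from CoVpipe2: the function name `mask_low_coverage`, the per-base depth list, and the default threshold of 20 reads are all assumptions chosen for the example.

```python
from typing import Sequence


def mask_low_coverage(consensus: str, depth: Sequence[int], min_depth: int = 20) -> str:
    """Return the consensus with every base below min_depth replaced by 'N'.

    Hypothetical helper for illustration only; min_depth=20 is an assumed
    threshold, not a value documented for CoVpipe2.
    """
    if len(consensus) != len(depth):
        raise ValueError("consensus and depth must have the same length")
    # Keep a base only if enough reads support that position.
    return "".join(
        base if d >= min_depth else "N"
        for base, d in zip(consensus, depth)
    )


if __name__ == "__main__":
    seq = "ACGTACGT"
    cov = [30, 25, 5, 0, 40, 19, 20, 21]
    print(mask_low_coverage(seq, cov))
```

Masking with `N` rather than falling back to the reference base avoids reporting a reference allele at positions where the data cannot support any call.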

2.
Epidemiol Infect ; 150: e141, 2022 07 08.
Article in English | MEDLINE | ID: mdl-35912971

ABSTRACT

In daycare centres, the close contact of children with other children and employees favours the transmission of infections. The majority of children <6 years attend daycare programmes in Germany, but the role of daycare centres in the SARS-CoV-2 pandemic is unclear. We investigated the transmission risk in daycare centres and the spread of SARS-CoV-2 to associated households. 30 daycare groups with at least one recent laboratory-confirmed SARS-CoV-2 case were enrolled in the study (10/2020-06/2021). Close contact persons within daycare and households were examined over a 12-day period (repeated SARS-CoV-2 PCR tests, genetic sequencing of viruses, symptom diary). Households were interviewed to gain comprehensive information on each outbreak. We determined primary cases for all daycare groups. The number of secondary cases varied considerably between daycare groups. The pooled secondary attack rate (SAR) across all 30 daycare centres was 9.6%. The SAR tended to be higher when the Alpha variant was detected (15.9% vs. 5.1% with evidence of wild type). The household SAR was 53.3%. Exposed daycare children were less likely to get infected with SARS-CoV-2 than employees (7.7% vs. 15.5%). Containment measures in daycare programmes are critical to reduce SARS-CoV-2 transmission, especially to avoid spread to associated households.
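
The secondary attack rate reported above can be illustrated with a short Python sketch. It assumes the simplest definition of pooling (total secondary cases divided by total exposed contacts across groups); the study may have used a different pooling method, and the `Group` class and the counts below are hypothetical, not data from the paper.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Group:
    contacts: int          # exposed close contacts, excluding the primary case
    secondary_cases: int   # contacts who subsequently tested positive


def pooled_sar(groups: Iterable[Group]) -> float:
    """Pooled secondary attack rate: sum of secondary cases / sum of contacts."""
    groups = list(groups)
    total_contacts = sum(g.contacts for g in groups)
    if total_contacts == 0:
        raise ValueError("no exposed contacts; SAR is undefined")
    return sum(g.secondary_cases for g in groups) / total_contacts


if __name__ == "__main__":
    # Hypothetical counts for three groups (illustration only).
    example = [Group(10, 1), Group(20, 0), Group(10, 3)]
    print(f"pooled SAR = {pooled_sar(example):.1%}")
```

Pooling counts this way weights each group by its number of contacts, so a large group with no secondary cases pulls the estimate down more than a small one would.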


Subjects
COVID-19; SARS-CoV-2; COVID-19/epidemiology; Child; Disease Outbreaks; Humans; Pandemics
3.
Clin Infect Dis ; 75(Suppl 1): S110-S120, 2022 08 15.
Article in English | MEDLINE | ID: mdl-35749674

ABSTRACT

BACKGROUND: Comprehensive pathogen genomic surveillance represents a powerful tool to complement and advance precision vaccinology. The emergence of the Alpha variant in December 2020 and the resulting efforts to track the spread of this and other severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) variants of concern led to an expansion of genomic sequencing activities in Germany. METHODS: At Robert Koch Institute (RKI), the German National Institute of Public Health, we established the Integrated Molecular Surveillance for SARS-CoV-2 (IMS-SC2) network to perform SARS-CoV-2 genomic surveillance at the national scale. SARS-CoV-2-positive samples from laboratories distributed across Germany regularly undergo whole-genome sequencing at RKI. RESULTS: We report analyses of 3623 SARS-CoV-2 genomes collected between December 2020 and December 2021, of which 3282 were randomly sampled. All variants of concern were identified in the sequenced sample set, at ratios equivalent to those in the 100-fold larger German GISAID sequence dataset from the same time period. Phylogenetic analysis confirmed variant assignments. Multiple mutations of concern emerged during the observation period. To model vaccine effectiveness in vitro, we employed authentic-virus neutralization assays, confirming that both the Beta and Zeta variants are capable of immune evasion. The IMS-SC2 sequence dataset facilitated an estimate of the SARS-CoV-2 incidence based on genetic evolution rates. Together with modeled vaccine efficacies, Delta-specific incidence estimation indicated that the German vaccination campaign contributed substantially to a deceleration of the nascent German Delta wave. CONCLUSIONS: SARS-CoV-2 molecular and genomic surveillance may inform public health policies including vaccination strategies and enable a proactive approach to controlling coronavirus disease 2019 spread as the virus evolves.


Subjects
COVID-19; SARS-CoV-2; COVID-19/epidemiology; COVID-19/prevention & control; Genome, Viral; Genomics; Humans; Phylogeny; SARS-CoV-2/genetics; Vaccinology
4.
Nat Commun ; 12(1): 7305, 2021 12 15.
Article in English | MEDLINE | ID: mdl-34911965

ABSTRACT

Metaproteomics has matured into a powerful tool to assess functional interactions in microbial communities. While many metaproteomic workflows are available, the impact of method choice on results remains unclear. Here, we carry out a community-driven, multi-laboratory comparison in metaproteomics: the critical assessment of metaproteome investigation study (CAMPI). Based on well-established workflows, we evaluate the effect of sample preparation, mass spectrometry, and bioinformatic analysis using two samples: a simplified, laboratory-assembled human intestinal model and a human fecal sample. We observe that variability at the peptide level is predominantly due to sample processing workflows, with a smaller contribution of bioinformatic pipelines. These peptide-level differences largely disappear at the protein group level. While differences are observed for predicted community composition, similar functional profiles are obtained across workflows. CAMPI demonstrates the robustness of present-day metaproteomics research, serves as a template for multi-laboratory studies in metaproteomics, and provides publicly available data sets for benchmarking future developments.


Subjects
Bacteria/genetics; Bacterial Proteins/chemistry; Feces/microbiology; Proteomics/methods; Adult; Bacteria/classification; Bacteria/isolation & purification; Bacterial Proteins/genetics; Female; Gastrointestinal Microbiome; Humans; Intestines/microbiology; Laboratories; Mass Spectrometry; Peptides/chemistry; Workflow
5.
Nat Protoc ; 15(10): 3212-3239, 2020 10.
Article in English | MEDLINE | ID: mdl-32859984

ABSTRACT

Metaproteomics, the study of the collective protein composition of multi-organism systems, provides deep insights into the biodiversity of microbial communities and the complex functional interplay between microbes and their hosts or environment. Thus, metaproteomics has become an indispensable tool in various fields such as microbiology and related medical applications. The computational challenges in the analysis of corresponding datasets differ from those of pure-culture proteomics, e.g., due to the higher complexity of the samples and the larger reference databases demanding specific computing pipelines. Corresponding data analyses usually consist of numerous manual steps that must be closely synchronized. With MetaProteomeAnalyzer and Prophane, we have established two open-source software solutions specifically developed and optimized for metaproteomics. Among other features, peptide-spectrum matching is improved by combining different search engines and, compared to similar tools, metaproteome annotation benefits from the most comprehensive set of available databases (such as NCBI, UniProt, EggNOG, PFAM, and CAZy). The workflow described in this protocol combines both tools and leads the user through the entire data analysis process, including protein database creation, database search, protein grouping and annotation, and results visualization. To the best of our knowledge, this protocol presents the most comprehensive, detailed and flexible guide to metaproteomics data analysis to date. While beginners are provided with robust, easy-to-use, state-of-the-art data analysis in a reasonable time (a few hours, depending on, among other factors, the protein database size and the number of identified peptides and inferred proteins), advanced users benefit from the flexibility and adaptability of the workflow.


Subjects
Proteome/analysis; Proteomics/methods; Data Analysis; Databases, Protein; Microbiota; Peptides/chemistry; Software; Workflow
6.
PLoS Comput Biol ; 15(7): e1007208, 2019 07.
Article in English | MEDLINE | ID: mdl-31335917

ABSTRACT

Horizontal gene transfer (HGT) has changed the way we regard evolution. Instead of waiting for the next generation to establish new traits, especially bacteria are able to take a shortcut via HGT that enables them to pass on genes from one individual to another, even across species boundaries. The tool Daisy offers the first HGT detection approach based on read mapping that provides complementary evidence compared to existing methods. However, Daisy relies on the acceptor and donor organism involved in the HGT being known. We introduce DaisyGPS, a mapping-based pipeline that is able to identify acceptor and donor reference candidates of an HGT event based on sequencing reads. Acceptor and donor identification is akin to species identification in metagenomic samples based on sequencing reads, a problem addressed by metagenomic profiling tools. However, acceptor and donor references have certain properties such that these methods cannot be directly applied. DaisyGPS uses MicrobeGPS, a metagenomic profiling tool tailored towards estimating the genomic distance between organisms in the sample and the reference database. We enhance the underlying scoring system of MicrobeGPS to account for the sequence patterns in terms of mapping coverage of an acceptor and donor involved in an HGT event, and report a ranked list of reference candidates. These candidates can then be further evaluated by tools like Daisy to establish HGT regions. We successfully validated our approach on both simulated and real data, and show its benefits in an investigation of an outbreak involving Methicillin-resistant Staphylococcus aureus data.


Subjects
Evolution, Molecular; Gene Transfer, Horizontal; Metagenome; Metagenomics/methods; Models, Genetic; Computational Biology; Computer Simulation; Databases, Genetic/statistics & numerical data; Disease Outbreaks/statistics & numerical data; Genetic Variation; Genome, Bacterial; Helicobacter pylori/genetics; Humans; Metagenomics/statistics & numerical data; Methicillin-Resistant Staphylococcus aureus/genetics; Mutation; Staphylococcal Infections/epidemiology; Staphylococcal Infections/microbiology
7.
Mol Cell Proteomics ; 18(9): 1756-1771, 2019 09.
Article in English | MEDLINE | ID: mdl-31221721

ABSTRACT

Epithelial-mesenchymal transition (EMT) is driven by complex signaling events that induce dramatic biochemical and morphological changes whereby epithelial cells are converted into cancer cells. However, the underlying molecular mechanisms remain elusive. Here, we used a mass spectrometry-based quantitative proteomics approach to systematically analyze the post-translational biochemical changes that drive differentiation of human mammary epithelial (HMLE) cells into mesenchymal cells. We identified 314 proteins out of more than 6,000 unique proteins and 871 phosphopeptides out of more than 7,000 unique phosphopeptides as differentially regulated. We found that the phosphoproteome is more unstable and prone to changes during EMT compared with the proteome, and that multiple alterations at the proteome level are not thoroughly represented by transcriptional data, highlighting the necessity of proteome-level analysis. We discovered cell state-specific signaling pathways, such as Hippo, sphingolipid signaling, and unfolded protein response (UPR), by modeling the networks of regulated proteins and potential kinase-substrate groups. We identified two novel factors for EMT whose expression increased on EMT induction: DnaJ heat shock protein family (Hsp40) member B4 (DNAJB4) and cluster of differentiation 81 (CD81). Suppression of DNAJB4 or CD81 in mesenchymal breast cancer cells resulted in decreased cell migration in vitro and led to reduced primary tumor growth, extravasation, and lung metastasis in vivo. Overall, we performed global proteomic and phosphoproteomic analyses of EMT, and identified and validated new mRNA and/or protein level modulators of EMT. This work also provides a unique platform and resource for future studies focusing on metastasis and drug resistance.


Subjects
Breast Neoplasms/pathology; Epithelial-Mesenchymal Transition/physiology; HSP40 Heat-Shock Proteins/metabolism; Phosphoproteins/metabolism; Tetraspanin 28/metabolism; Animals; Breast Neoplasms/metabolism; Breast Neoplasms/mortality; Cell Line, Tumor; Cell Movement/physiology; Epithelial-Mesenchymal Transition/genetics; Female; HSP40 Heat-Shock Proteins/genetics; Humans; Kaplan-Meier Estimate; Mammary Neoplasms, Experimental/pathology; Mice, Nude; Reproducibility of Results; Tetraspanin 28/genetics
8.
Bioinformatics ; 32(17): i595-i604, 2016 09 01.
Article in English | MEDLINE | ID: mdl-27587679

ABSTRACT

MOTIVATION: Horizontal gene transfer (HGT) is a fundamental mechanism that enables organisms such as bacteria to directly transfer genetic material between distant species. This way, bacteria can acquire new traits such as antibiotic resistance or pathogenic toxins. Current bioinformatics approaches focus on the detection of past HGT events by exploring phylogenetic trees or genome composition inconsistencies. However, these techniques normally require the availability of finished and fully annotated genomes and of sufficiently large deviations that allow detection and are thus not widely applicable. Especially in outbreak scenarios with HGT-mediated emergence of new pathogens, like the enterohemorrhagic Escherichia coli outbreak in Germany 2011, there is need for fast and precise HGT detection. Next-generation sequencing (NGS) technologies facilitate rapid analysis of unknown pathogens but, to the best of our knowledge, so far no approach detects HGTs directly from NGS reads. RESULTS: We present Daisy, a novel mapping-based tool for HGT detection. Daisy determines HGT boundaries with split-read mapping and evaluates candidate regions relying on read pair and coverage information. Daisy successfully detects HGT regions with base pair resolution in both simulated and real data, and outperforms alternative approaches using a genome assembly of the reads. We see our approach as a powerful complement for a comprehensive analysis of HGT in the context of NGS data. AVAILABILITY AND IMPLEMENTATION: Daisy is freely available from http://github.com/ktrappe/daisy. CONTACT: renardb@rki.de. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Chromosome Mapping/methods; Gene Transfer, Horizontal; High-Throughput Nucleotide Sequencing; Phylogeny; Computational Biology/methods; Genome
9.
Bioinformatics ; 30(24): 3484-90, 2014 Dec 15.
Article in English | MEDLINE | ID: mdl-25028727

ABSTRACT

MOTIVATION: The landscape of structural variation (SV) including complex duplication and translocation patterns is far from resolved. SV detection tools usually exhibit low agreement, are often geared toward certain types or size ranges of variation and struggle to correctly classify the type and exact size of SVs. RESULTS: We present Gustaf (Generic mUlti-SpliT Alignment Finder), a sound generic multi-split SV detection tool that detects and classifies deletions, inversions, dispersed duplications and translocations of ≥ 30 bp. Our approach is based on a generic multi-split alignment strategy that can identify SV breakpoints with base pair resolution. We show that Gustaf correctly identifies SVs, especially in the range from 30 to 100 bp, which we call the next-generation sequencing (NGS) twilight zone of SVs, as well as larger SVs >500 bp. Gustaf performs better than similar tools in our benchmark and is furthermore able to correctly identify size and location of dispersed duplications and translocations, which otherwise might be wrongly classified, for example, as large deletions.


Subjects
Genomic Structural Variation; High-Throughput Nucleotide Sequencing/methods; Sequence Analysis, DNA/methods; Humans; Sequence Alignment; Sequence Deletion; Software; Translocation, Genetic
10.
BMC Bioinformatics ; 15: 99, 2014 Apr 09.
Article in English | MEDLINE | ID: mdl-24712884

ABSTRACT

BACKGROUND: Recent advances in rapid, low-cost sequencing have opened up the opportunity to study complete genome sequences. The computational approach of multiple genome alignment allows investigation of evolutionarily related genomes in an integrated fashion, providing a basis for downstream analyses such as rearrangement studies and phylogenetic inference. Graphs have proven to be a powerful tool for coping with the complexity of genome-scale sequence alignments. The potential of graphs to intuitively represent all aspects of genome alignments led to the development of graph-based approaches for genome alignment. These approaches construct a graph from a set of local alignments, and derive a genome alignment through identification and removal of graph substructures that indicate errors in the alignment. RESULTS: We compare the structures of commonly used graphs in terms of their abilities to represent alignment information. We describe how the graphs can be transformed into each other, and identify and classify graph substructures common to one or more graphs. Based on previous approaches, we compile a list of modifications that remove these substructures. CONCLUSION: We show that crucial pieces of alignment information, associated with inversions and duplications, are not visible in the structure of all graphs. If we neglect vertex or edge labels, the graphs differ in their information content. Still, many ideas are shared among all graph-based approaches. Based on these findings, we outline a conceptual framework for graph-based genome alignment that can assist in the development of future genome alignment tools.


Subjects
Genomics/methods; Sequence Alignment/methods; Algorithms; Computer Graphics; Genome
11.
PLoS One ; 5(12): e15661, 2010 Dec 22.
Article in English | MEDLINE | ID: mdl-21203531

ABSTRACT

BACKGROUND: With approximately 1 million cases, colorectal cancer (CRC) is the third most common cancer worldwide. Extensive research is ongoing to decipher the underlying genetic patterns with the hope of improving early cancer diagnosis and treatment. In this direction, the recent progress in next generation sequencing technologies has revolutionized the field of cancer genomics. However, one caveat of these studies remains the large amount of genetic variations identified and their interpretation. METHODOLOGY/PRINCIPAL FINDINGS: Here we present the first work on whole exome NGS of primary colon cancers. We performed 454 whole exome pyrosequencing of tumor as well as adjacent unaffected normal colonic tissue from microsatellite stable (MSS) and microsatellite instable (MSI) colon cancer patients and identified more than 50,000 small nucleotide variations for each tissue. According to predictions based on MSS and MSI pathomechanisms we identified eight times more somatic non-synonymous variations in MSI cancers than in MSS and we were able to reproduce the result in four additional CRCs. Our bioinformatics filtering approach narrowed down the rate of most significant mutations to 359 for MSI and 45 for MSS CRCs with predicted altered protein functions. In both CRCs, MSI and MSS, we found somatic mutations in the intracellular kinase domain of bone morphogenetic protein receptor 1A, BMPR1A, a gene where so far germline mutations are associated with juvenile polyposis syndrome, and show that the mutations functionally impair the protein function. CONCLUSIONS/SIGNIFICANCE: We conclude that with deep sequencing of tumor exomes one may be able to predict the microsatellite status of CRC and in addition identify potentially clinically relevant mutations.


Subjects
Colorectal Neoplasms/genetics; High-Throughput Nucleotide Sequencing/methods; Microsatellite Instability; Microsatellite Repeats; Adenocarcinoma/genetics; Aged; Colonic Neoplasms/genetics; Computational Biology/methods; DNA Mutational Analysis; Genomics; Humans; Male; Middle Aged; Mutation