1.
Nature; 607(7920): 732-740, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35859178

ABSTRACT

Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data [1,2]. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank [3]. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.


Subjects
Biological Specimen Banks, Genetic Databases, Genetic Variation, Human Genome, Genomics, Whole Genome Sequencing, Africa/ethnology, Asia/ethnology, Cohort Studies, Conserved Sequence, Exons/genetics, Human Genome/genetics, Haplotypes/genetics, Humans, INDEL Mutation, Ireland/ethnology, Microsatellite Repeats, Single Nucleotide Polymorphism/genetics, United Kingdom
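
The abstract above states that 585,040,410 observed SNPs represent 7.0% of all possible human SNPs. As a rough consistency check, the sketch below back-calculates the number of possible SNPs and the number of genomic positions that figure implies. It assumes that every position admits three alternative bases; that assumption, and the function names, are illustrative and not taken from the abstract. Because 7.0% is a rounded figure, the results are approximate.

```python
# Figures quoted in the abstract.
OBSERVED_SNPS = 585_040_410
FRACTION_OF_POSSIBLE = 0.070  # "7.0% of all possible human SNPs" (rounded)

# Assumption for illustration: each reference position has 3 alternative bases.
ALT_BASES_PER_POSITION = 3


def implied_possible_snps(observed: int, fraction: float) -> float:
    """Total number of possible SNPs implied by an observed count and fraction."""
    return observed / fraction


def implied_positions(possible_snps: float, alts: int = ALT_BASES_PER_POSITION) -> float:
    """Number of genomic positions implied if each position has `alts` possible SNPs."""
    return possible_snps / alts


possible = implied_possible_snps(OBSERVED_SNPS, FRACTION_OF_POSSIBLE)
positions = implied_positions(possible)
print(f"possible SNPs ~ {possible:.3e}, positions ~ {positions:.3e}")
```

Under these assumptions the implied number of positions is roughly 2.8 billion, which is the same order of magnitude as the human genome.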
2.
Nature; 543(7643): 122-125, 2017 Mar 02.
Article in English | MEDLINE | ID: mdl-28178237

ABSTRACT

Human cells have twenty-three pairs of chromosomes. In cancer, however, genes can be amplified in chromosomes or in circular extrachromosomal DNA (ecDNA), although the frequency and functional importance of ecDNA are not understood. We performed whole-genome sequencing, structural modelling and cytogenetic analyses of 17 different cancer types, including analysis of the structure and function of chromosomes during metaphase of 2,572 dividing cells, and developed a software package called ECdetect to conduct unbiased, integrated ecDNA detection and analysis. Here we show that ecDNA was found in nearly half of human cancers; its frequency varied by tumour type, but it was almost never found in normal cells. Driver oncogenes were amplified most commonly in ecDNA, thereby increasing transcript level. Mathematical modelling predicted that ecDNA amplification would increase oncogene copy number and intratumoural heterogeneity more effectively than chromosomal amplification. We validated these predictions by quantitative analyses of cancer samples. The results presented here suggest that ecDNA contributes to accelerated evolution in cancer.


Subjects
DNA Copy Number Variations/genetics, Molecular Evolution, Gene Amplification/genetics, Genetic Heterogeneity, Genetic Models, Neoplasms/genetics, Oncogenes/genetics, Human Chromosomes/genetics, Cytogenetic Analysis, DNA Mutational Analysis, Human Genome/genetics, Humans, Metaphase/genetics, Neoplasms/classification, Messenger RNA/analysis, Neoplasm RNA/genetics, Reproducibility of Results, Software
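
The abstract above reports that mathematical modelling predicted greater intratumoural heterogeneity from ecDNA amplification than from chromosomal amplification. The toy simulation below is not the authors' model. It is a minimal sketch that assumes replicated ecDNA copies are distributed to daughter cells independently at random, while chromosomal copies are split evenly. All function names and parameter values are hypothetical.

```python
import random
from statistics import pvariance


def daughter_copies_chromosomal(copies: int) -> int:
    """Chromosomal amplicons: replicated copies are split evenly, so the count is inherited unchanged."""
    return copies


def daughter_copies_ecdna(copies: int, rng: random.Random) -> int:
    """ecDNA (toy assumption): each of the 2*copies replicated elements goes to this daughter with probability 1/2."""
    return sum(1 for _ in range(2 * copies) if rng.random() < 0.5)


def simulate(copies: int, n_cells: int, seed: int = 0):
    """Return the variance in daughter copy number for both amplification modes."""
    rng = random.Random(seed)
    chrom = [daughter_copies_chromosomal(copies) for _ in range(n_cells)]
    ecdna = [daughter_copies_ecdna(copies, rng) for _ in range(n_cells)]
    return pvariance(chrom), pvariance(ecdna)


chrom_var, ecdna_var = simulate(copies=20, n_cells=2000)
print(f"chromosomal variance: {chrom_var}, ecDNA variance: {ecdna_var:.2f}")
```

In this sketch the chromosomal variance is zero by construction, whereas random segregation yields a binomial spread of copy numbers after a single division, which is one simple way heterogeneity could arise.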
3.
N Engl J Med; 390(23): 2217-2219, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38899702
4.
Nat Methods; 15(8): 591-594, 2018 Aug.
Article in English | MEDLINE | ID: mdl-30013048

ABSTRACT

We describe Strelka2 ( https://github.com/Illumina/strelka ), an open-source small-variant-calling method for research and clinical germline and somatic sequencing applications. Strelka2 introduces a novel mixture-model-based estimation of insertion/deletion error parameters from each sample, an efficient tiered haplotype-modeling strategy, and a normal sample contamination model to improve liquid tumor analysis. For both germline and somatic calling, Strelka2 substantially outperformed the current leading tools in terms of both variant-calling accuracy and computing cost.


Subjects
Genetic Variation, Germ-Line Mutation, Software, Genetic Databases/statistics & numerical data, Haplotypes, High-Throughput Nucleotide Sequencing/statistics & numerical data, Humans, INDEL Mutation, Genetic Models, Neoplasms/genetics, Whole Genome Sequencing/statistics & numerical data
5.
Appl Environ Microbiol; 82(8): 2494-2505, 2016 Apr.
Article in English | MEDLINE | ID: mdl-26896141

ABSTRACT

Managing ecosystems to maintain biodiversity may be one approach to ensuring their dynamic stability, productivity, and delivery of vital services. The applicability of this approach to industrial ecosystems that harness the metabolic activities of microbes has been proposed but has never been tested at relevant scales. We used a tag-sequencing approach with bacterial small subunit rRNA (16S) genes and eukaryotic internal transcribed spacer 2 (ITS2) to measure the taxonomic composition and diversity of bacteria and eukaryotes in an open pond managed for bioenergy production by microalgae over a year. Periods of high eukaryotic diversity were associated with high and more-stable biomass productivity. In addition, bacterial diversity and eukaryotic diversity were inversely correlated over time, possibly due to their opposite responses to temperature. The results indicate that maintaining diverse communities may be essential to engineering stable and productive bioenergy ecosystems using microorganisms.


Subjects
Bacteria/growth & development, Biota, Eukaryota/growth & development, Industrial Microbiology, Water Microbiology, Bacteria/classification, Bacteria/genetics, Cluster Analysis, Ribosomal DNA/chemistry, Ribosomal DNA/genetics, Ribosomal Spacer DNA/chemistry, Ribosomal Spacer DNA/genetics, Eukaryota/classification, Eukaryota/genetics, Phylogeny, 16S Ribosomal RNA/genetics, DNA Sequence Analysis
6.
Genome Biol; 25(1): 69, 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38468278

ABSTRACT

BACKGROUND: Long-read sequencing can enable the detection of base modifications, such as CpG methylation, in single molecules of DNA. The most commonly used methods for long-read sequencing are nanopore sequencing, developed by Oxford Nanopore Technologies (ONT), and single molecule real-time (SMRT) sequencing, developed by Pacific Biosciences (PacBio). In this study, we systematically compare the performance of CpG methylation detection from long-read sequencing. RESULTS: We demonstrate that CpG methylation detection from 7179 nanopore-sequenced DNA samples is highly accurate and consistent with 132 oxidative bisulfite-sequenced (oxBS) samples, isolated from the same blood draws. We introduce quality filters for CpGs that further enhance the accuracy of CpG methylation detection from nanopore-sequenced DNA, while removing at most 30% of CpGs. We evaluate the per-site performance of CpG methylation detection across different genomic features and CpG methylation rates and demonstrate how the latest R10.4 flowcell chemistry and base-calling algorithms improve methylation detection from nanopore sequencing. Additionally, we show how the methylation detection of 50 SMRT-sequenced genomes compares to nanopore sequencing and oxBS. CONCLUSIONS: This study provides the first systematic comparison of CpG methylation detection tools for long-read sequencing methods. We compare two commonly used computational methods for the detection of CpG methylation in a large number of nanopore genomes, including samples sequenced using the latest R10.4 nanopore flowcell chemistry and 50 SMRT-sequenced samples. We provide insights into the strengths and limitations of each sequencing method as well as recommendations for standardization and evaluation of tools designed for genome-scale modified base detection using long-read sequencing.


Subjects
DNA Methylation, Human Genome, Humans, DNA Sequence Analysis/methods, High-Throughput Nucleotide Sequencing/methods, DNA
7.
Nat Genet; 56(8): 1624-1631, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39048797

ABSTRACT

Gene promoter and enhancer sequences are bound by transcription factors and are depleted of methylated CpG sites (cytosines preceding guanines in DNA). The absence of methylated CpGs in these sequences typically correlates with increased gene expression, indicating a regulatory role for methylation. We used nanopore sequencing to determine haplotype-specific methylation rates of 15.3 million CpG units in 7,179 whole-blood genomes. We identified 189,178 methylation depleted sequences where three or more proximal CpGs were unmethylated on at least one haplotype. A total of 77,789 methylation depleted sequences (~41%) associated with 80,503 cis-acting sequence variants, which we termed allele-specific methylation quantitative trait loci (ASM-QTLs). RNA sequencing of 896 samples from the same blood draws used to perform nanopore sequencing showed that the ASM-QTL, that is, DNA sequence variability, drives most of the correlation found between gene expression and CpG methylation. ASM-QTLs were enriched 40.2-fold (95% confidence interval 32.2, 49.9) among sequence variants associating with hematological traits, demonstrating that ASM-QTLs are important functional units in the noncoding genome.


Subjects
CpG Islands, DNA Methylation, Quantitative Trait Loci, Humans, Genetic Promoter Regions, Haplotypes, Alleles, Gene Expression Regulation, Genetic Variation, Nanopore Sequencing/methods, Human Genome
8.
JAMA Cardiol; 9(2): 165-172, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38150231

ABSTRACT

Importance: Recurrent pericarditis is a treatment challenge and often a debilitating condition. Drugs inhibiting interleukin 1 cytokines are a promising new treatment option, but their use is based on scarce biological evidence and clinical trials of modest sizes, and the contributions of innate and adaptive immune processes to the pathophysiology are incompletely understood. Objective: To use human genomics, transcriptomics, and proteomics to shed light on the pathogenesis of pericarditis. Design, Setting, and Participants: This was a meta-analysis of genome-wide association studies of pericarditis from 5 countries. Associations were examined between the pericarditis-associated variants and pericarditis subtypes (including recurrent pericarditis) and secondary phenotypes. To explore mechanisms, associations with messenger RNA expression (cis-eQTL), plasma protein levels (pQTL), and CpG methylation of DNA (ASM-QTL) were assessed. Data from Iceland (deCODE genetics, 1983-2020), Denmark (Copenhagen Hospital Biobank/Danish Blood Donor Study, 1977-2022), the UK (UK Biobank, 1953-2021), the US (Intermountain, 1996-2022), and Finland (FinnGen, 1970-2022) were included. Data were analyzed from September 2022 to August 2023. Exposure: Genotype. Main Outcomes and Measures: Pericarditis. Results: In this genome-wide association study of 4894 individuals with pericarditis (mean [SD] age at diagnosis, 51.4 [17.9] years, 2734 [67.6%] male, excluding the FinnGen cohort), associations were identified with 2 independent common intergenic variants at the interleukin 1 locus on chromosome 2q14. The lead variant was rs12992780 (T) (effect allele frequency [EAF], 31%-40%; odds ratio [OR], 0.83; 95% CI, 0.79-0.87; P = 6.67 × 10^-16), downstream of IL1B, and the secondary variant was rs7575402 (A or T) (EAF, 45%-55%; adjusted OR, 0.89; 95% CI, 0.85-0.93; adjusted P = 9.6 × 10^-8). The lead variant rs12992780 had a smaller odds ratio for recurrent pericarditis (0.76) than the acute form (0.86) (P for heterogeneity = .03), and rs7575402 was associated with CpG methylation overlapping binding sites of 4 transcription factors known to regulate interleukin 1 production: PU.1 (encoded by SPI1), STAT1, STAT3, and CCAAT/enhancer-binding protein β (encoded by CEBPB). Conclusions and Relevance: This study found an association between pericarditis and 2 independent sequence variants at the interleukin 1 gene locus. This finding has the potential to contribute to development of more targeted and personalized therapy of pericarditis with interleukin 1-blocking drugs.


Subjects
Genome-Wide Association Study, Humans, Male, Adolescent, Female, Genotype, Phenotype, Gene Frequency, Finland
9.
Nat Commun; 14(1): 3855, 2023 Jun 29.
Article in English | MEDLINE | ID: mdl-37386006

ABSTRACT

Microsatellites are polymorphic tracts of short tandem repeats with one to six base-pair (bp) motifs and are some of the most polymorphic variants in the genome. Using 6084 Icelandic parent-offspring trios, we estimate 63.7 (95% CI: 61.9-65.4) microsatellite de novo mutations (mDNMs) per offspring per generation; excluding one-bp repeat motifs (homopolymers), the estimate is 48.2 mDNMs (95% CI: 46.7-49.6). Paternal mDNMs occur at longer repeats than maternal ones, whereas maternal mDNMs are larger, with a mean size of 3.4 bp vs 3.1 bp for paternal ones. mDNMs increase by 0.97 (95% CI: 0.90-1.04) and 0.31 (95% CI: 0.25-0.37) per year of father's and mother's age at conception, respectively. Here, we find two independent coding variants that associate with the number of mDNMs transmitted to offspring. The minor allele of a missense variant (allele frequency (AF) = 1.9%) in MSH2, a mismatch repair gene, increases transmitted mDNMs from both parents (effect: 13.1 paternal and 7.8 maternal mDNMs). A synonymous variant (AF = 20.3%) in NEIL2, a DNA damage repair gene, increases paternally transmitted mDNMs (effect: 4.4 mDNMs). Thus, the microsatellite mutation rate in humans is in part under genetic control.


Subjects
DNA Mismatch Repair, Germ-Line Mutation, Humans, Alleles, Germ-Line Mutation/genetics, Microsatellite Repeats/genetics, Germ Cells
10.
Commun Biol; 6(1): 703, 2023 Jul 10.
Article in English | MEDLINE | ID: mdl-37430141

ABSTRACT

Urticaria is a skin disorder characterized by outbreaks of raised pruritic wheals. In order to identify sequence variants associated with urticaria, we performed a meta-analysis of genome-wide association studies for urticaria with a total of 40,694 cases and 1,230,001 controls from Iceland, the UK, Finland, and Japan. We also performed transcriptome- and proteome-wide analyses in Iceland and the UK. We found nine sequence variants at nine loci associating with urticaria. The variants are at genes participating in type 2 immune responses and/or mast cell biology (CBLB, FCER1A, GCSAML, STAT6, TPSD1, ZFPM1), innate immunity (C4), and NF-κB signaling. The most significant association was observed for the splice-donor variant rs56043070[A] (hg38: chr1:247556467) in GCSAML (MAF = 6.6%, OR = 1.24 (95% CI: 1.20-1.28), P-value = 3.6 × 10^-44). We assessed the effects of the variants on transcripts and on levels of proteins relevant to urticaria pathophysiology. Our results emphasize the role of type 2 immune response and mast cell activation in the pathogenesis of urticaria. Our findings may point to an IgE-independent urticaria pathway that could help address an unmet clinical need.


Subjects
Genome-Wide Association Study, Urticaria, Humans, Mast Cells, Urticaria/genetics, RNA Splicing, Proteome
11.
Nat Genet; 55(11): 1843-1853, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37884687

ABSTRACT

Migraine is a complex neurovascular disease with a range of severity and symptoms, yet mostly studied as one phenotype in genome-wide association studies (GWAS). Here we combine large GWAS datasets from six European populations to study the main migraine subtypes, migraine with aura (MA) and migraine without aura (MO). We identified four new MA-associated variants (in PRRT2, PALMD, ABO and LRRK2) and classified 13 MO-associated variants. Rare variants with large effects highlight three genes. A rare frameshift variant in brain-expressed PRRT2 confers large risk of MA and epilepsy, but not MO. A burden test of rare loss-of-function variants in SCN11A, encoding a neuron-expressed sodium channel with a key role in pain sensation, shows strong protection against migraine. Finally, a rare variant with cis-regulatory effects on KCNK5 confers large protection against migraine and brain aneurysms. Our findings offer new insights with therapeutic potential into the complex biology of migraine and its subtypes.


Subjects
Epilepsy, Migraine Disorders, Migraine with Aura, Humans, Genome-Wide Association Study, Migraine Disorders/genetics, Migraine with Aura/genetics, Phenotype
12.
Nat Commun; 14(1): 3453, 2023 Jun 10.
Article in English | MEDLINE | ID: mdl-37301908

ABSTRACT

Genotypes causing pregnancy loss and perinatal mortality are depleted among living individuals and are therefore difficult to find. To explore genetic causes of recessive lethality, we searched for sequence variants with deficit of homozygosity among 1.52 million individuals from six European populations. In this study, we identified 25 genes harboring protein-altering sequence variants with a strong deficit of homozygosity (10% or less of predicted homozygotes). Sequence variants in 12 of the genes cause Mendelian disease under a recessive mode of inheritance, two under a dominant mode, but variants in the remaining 11 have not been reported to cause disease. Sequence variants with a strong deficit of homozygosity are over-represented among genes essential for growth of human cell lines and genes orthologous to mouse genes known to affect viability. The function of these genes gives insight into the genetics of intrauterine lethality. We also identified 1077 genes with homozygous predicted loss-of-function genotypes not previously described, bringing the total set of genes completely knocked out in humans to 4785.


Subjects
Proteins, Humans, Animals, Mice, Homozygote, Genotype, Proteins/genetics, Recessive Genes
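
The abstract above defines a strong deficit of homozygosity as observing 10% or less of the predicted number of homozygotes. The sketch below shows one way such a check could be written. It assumes the prediction comes from Hardy-Weinberg proportions (expected homozygotes = N × q²), which the abstract does not state, and the function names and example counts are hypothetical.

```python
def expected_homozygotes(n_individuals: int, allele_freq: float) -> float:
    """Expected homozygote count under Hardy-Weinberg proportions (an assumption here)."""
    return n_individuals * allele_freq ** 2


def has_strong_deficit(
    observed: int,
    n_individuals: int,
    allele_freq: float,
    threshold: float = 0.10,  # "10% or less of predicted homozygotes"
) -> bool:
    """True if the observed homozygote count is at most `threshold` of the expected count."""
    expected = expected_homozygotes(n_individuals, allele_freq)
    return expected > 0 and observed <= threshold * expected


# Hypothetical variant: allele frequency 1% in a cohort of 1.52 million individuals.
n = 1_520_000
q = 0.01
print(expected_homozygotes(n, q))    # about 152 expected homozygotes
print(has_strong_deficit(5, n, q))   # 5 observed is well under 10% of expected
print(has_strong_deficit(100, n, q)) # 100 observed is not a strong deficit
```

A real analysis would also need to account for population structure, genotyping error and sampling variance, none of which this sketch attempts.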
13.
Nat Commun; 13(1): 7592, 2022 Dec 08.
Article in English | MEDLINE | ID: mdl-36481753

ABSTRACT

Genome-wide association studies have identified thousands of single nucleotide variants and small indels that contribute to variation in hematologic traits. While structural variants are known to cause rare blood or hematopoietic disorders, the genome-wide contribution of structural variants to quantitative blood cell trait variation is unknown. Here we utilized whole genome sequencing data in ancestrally diverse participants of the NHLBI Trans Omics for Precision Medicine program (N = 50,675) to detect structural variants associated with hematologic traits. Using single variant tests, we assessed the association of common and rare structural variants with red cell-, white cell-, and platelet-related quantitative traits and observed 21 independent signals (12 common and 9 rare) reaching genome-wide significance. The majority of these associations (N = 18) replicated in independent datasets. In genome-editing experiments, we provide evidence that a deletion associated with lower monocyte counts leads to disruption of an S1PR3 monocyte enhancer and decreased S1PR3 expression.


Subjects
Blood Cells, Genome-Wide Association Study, Humans, Whole Genome Sequencing
14.
Genome Biol; 22(1): 28, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33419473

ABSTRACT

A major challenge with long-read sequencing data is their high error rate of up to 15%. We present Ratatosk, a method to correct long reads with short-read data. We demonstrate on 5 human genome trios that Ratatosk reduces the error rate of long reads 6-fold on average, with a median error rate as low as 0.22%. SNP calls in Ratatosk-corrected reads are nearly 99% accurate and indel-call accuracy is increased by up to 37%. An assembly of Ratatosk-corrected reads from an Ashkenazi individual yields a contig N50 of 45 Mbp and fewer misassemblies than a PacBio HiFi reads assembly.


Subjects
Chimera, Human Genome, Female, Genomics, Humans, Male, Single Nucleotide Polymorphism, DNA Sequence Analysis
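
The abstract above reports a contig N50 of 45 Mbp. For readers unfamiliar with the metric, the sketch below computes N50 under its usual definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. The function name and the example lengths are illustrative and unrelated to the Ratatosk assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("at least one contig length is required")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig where the cumulative length reaches half the total.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: cumulative sum always reaches the total")


# Toy example: total length 20; the two longest contigs (6 + 5 = 11) cover half of it.
print(n50([2, 3, 4, 5, 6]))
```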
15.
Nat Commun; 12(1): 730, 2021 Feb 01.
Article in English | MEDLINE | ID: mdl-33526789

ABSTRACT

Thousands of genomic structural variants (SVs) segregate in the human population and can impact phenotypic traits and diseases. Their identification in whole-genome sequence data of large cohorts is a major computational challenge. Most current approaches identify SVs in single genomes and afterwards merge the identified variants into a joint call set across many genomes. We describe the approach PopDel, which directly identifies deletions of about 500 to at least 10,000 bp in length in data of many genomes jointly, eliminating the need for subsequent variant merging. PopDel scales to tens of thousands of genomes as we demonstrate in evaluations on up to 49,962 genomes. We show that PopDel reliably reports common, rare and de novo deletions. On genomes with available high-confidence reference call sets PopDel shows excellent recall and precision. Genotype inheritance patterns in up to 6794 trios indicate that genotypes predicted by PopDel are more reliable than those of previous SV callers. Furthermore, PopDel's running time is competitive with the fastest tested previous tools. The demonstrated scalability and accuracy of PopDel enables routine scans for deletions in large-scale sequencing studies.


Subjects
Human Genome/genetics, Genomic Structural Variation, Metagenomics/methods, Sequence Deletion, Feasibility Studies, Female, High-Throughput Nucleotide Sequencing, Humans, Inheritance Patterns, Male, Reproducibility of Results, DNA Sequence Analysis
16.
Nat Genet; 53(1): 27-34, 2021 Jan.
Article in English | MEDLINE | ID: mdl-33414551

ABSTRACT

Despite the important role that monozygotic twins have played in genetics research, little is known about their genomic differences. Here we show that monozygotic twins differ on average by 5.2 early developmental mutations and that approximately 15% of monozygotic twins have a substantial number of these early developmental mutations specific to one of them. Using the parents and offspring of twins, we identified pre-twinning mutations. We observed instances where a twin was formed from a single cell lineage in the pre-twinning cell mass and instances where a twin was formed from several cell lineages. CpG>TpG mutations increased in frequency with embryonic development, coinciding with an increase in DNA methylation. Our results indicate that the allocation of cells during development shapes genomic differences between monozygotic twins.


Subjects
Human Genome, Germ Cells/metabolism, Monozygotic Twins/genetics, Embryonic Development/genetics, Female, Gene Frequency/genetics, Humans, Male, Mosaicism, Mutation/genetics, Zygote/metabolism
17.
Nat Genet; 53(6): 779-786, 2021 Jun.
Article in English | MEDLINE | ID: mdl-33972781

ABSTRACT

Long-read sequencing (LRS) promises to improve the characterization of structural variants (SVs). We generated LRS data from 3,622 Icelanders and identified a median of 22,636 SVs per individual (a median of 13,353 insertions and 9,474 deletions). We discovered a set of 133,886 reliably genotyped SV alleles and imputed them into 166,281 individuals to explore their effects on diseases and other traits. We discovered an association of a rare deletion in PCSK9 with lower low-density lipoprotein (LDL) cholesterol levels, compared to the population average. We also discovered an association of a multiallelic SV in ACAN with height; we found 11 alleles that differed in the number of a 57-bp-motif repeat and observed a linear relationship between the number of repeats carried and height. These results show that SVs can be accurately characterized at the population scale using LRS data in a genome-wide non-targeted approach and demonstrate how SVs impact phenotypes.


Subjects
Disease/genetics, Genomic Structural Variation, High-Throughput Nucleotide Sequencing, Heritable Quantitative Trait, Alleles, LDL Cholesterol/metabolism, Human Chromosomes/genetics, Female, Gene Frequency/genetics, Humans, Iceland, Linear Models, Male, Proprotein Convertase 9/genetics, Genetic Recombination/genetics, Sequence Deletion/genetics
18.
Nat Commun; 10(1): 5402, 2019 Nov 27.
Article in English | MEDLINE | ID: mdl-31776332

ABSTRACT

Analysis of sequence diversity in the human genome is fundamental for genetic studies. Structural variants (SVs) are frequently omitted in sequence analysis studies, although each has a relatively large impact on the genome. Here, we present GraphTyper2, which uses pangenome graphs to genotype SVs and small variants using short reads. Comparison to the syndip benchmark dataset shows that our SV genotyping is sensitive, and variant segregation in families demonstrates the accuracy of our approach. We demonstrate that incorporating public assembly data into our pipeline greatly improves sensitivity, particularly for large insertions. We validate 6,812 SVs on average per genome using long-read data of 41 Icelanders. We show that GraphTyper2 can simultaneously genotype tens of thousands of whole genomes by characterizing 60 million small variants and half a million SVs in 49,962 Icelanders, including 80 thousand SVs with high confidence.


Subjects
Human Genome, Genomic Structural Variation, Genotyping Techniques/methods, Software, Computer Graphics, Genetic Databases, Population Genetics, Genotyping Techniques/statistics & numerical data, Humans, Iceland, Pedigree, Single Nucleotide Polymorphism, Reproducibility of Results, Workflow
19.
Cell Syst; 7(4): 463-467.e6, 2018 Oct 24.
Article in English | MEDLINE | ID: mdl-30268435

ABSTRACT

Shotgun metaproteomics has the potential to reveal the functional landscape of microbial communities but lacks appropriate methods for complex samples with unknown compositions. In the absence of prior taxonomic information, tandem mass spectra would be searched against large pan-microbial databases, which requires heavy computational workload and reduces sensitivity. We present ProteoStorm, an efficient database search framework for large-scale metaproteomics studies, which identifies high-confidence peptide-spectrum matches (PSMs) while achieving a two-to-three orders-of-magnitude speedup over popular tools. A reanalysis of a urinary tract infection (UTI) dataset of 110 individuals revealed a complex pattern of polymicrobial expression, including sub-types of UTIs, cases of bacterial vaginosis, and evidence of no underlying disease. Importantly, compared to the initial UTI study that restricted the search database to a manually curated list of 20 genera, ProteoStorm identified additional genera that were previously unreported, including a case of infection with the rare pathogen Propionimicrobium.


Subjects
Metagenome, Proteomics/methods, Software, Genetic Databases, Humans, Microbiota/genetics, Proteomics/standards, Urinary Tract Infections/microbiology