Results 1 - 14 of 14
1.
Proteomics ; 24(14): e2300280, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38742951

ABSTRACT

Mass spectrometry proteomics data are typically evaluated against publicly available annotated sequences, but the proteogenomics approach is a useful alternative. A single genome is commonly utilized in custom proteomic and proteogenomic data analysis. We pose the question of whether utilizing numerous different genome assemblies in a search database would be beneficial. We reanalyzed raw data from the exoprotein fraction of four reference Enterobacterial Repetitive Intergenic Consensus (ERIC) I-IV genotypes of the honey bee bacterial pathogen Paenibacillus larvae and evaluated them against three reference databases (from NCBI-protein, RefSeq, and UniProt) together with an array of protein sequences generated by six-frame direct translation of 15 genome assemblies from GenBank. The wide search yielded 453 protein hits/groups, which UpSet analysis categorized into 50 groups based on the success of protein identification by the 18 database components. Nine hits that were not identified by a unique peptide were not considered for marker selection, which discarded the only protein that was not identified by the reference databases. We propose that the variability in successful identifications between genome assemblies is useful for marker mining. The results suggest that various strains of P. larvae can exhibit specific traits that set them apart from the established genotypes ERIC I-V.
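
The six-frame direct translation mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names are hypothetical, the standard codon table is assumed (the bacterial table uses the same codon-to-amino-acid assignments and differs only in permitted start codons), and stop codons are simply written as `*` rather than used to split open reading frames.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; first base varies slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Translate both strands in all three reading frames.

    Keys are '+1', '+2', '+3' for the forward strand and
    '-1', '-2', '-3' for the reverse-complement strand.
    """
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames
```

In a real search database, each frame would typically be split at stop codons and short fragments discarded before writing FASTA entries; those thresholds are not given in the abstract and are left out here.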


Subjects
Bacterial Proteins; Genome, Bacterial; Paenibacillus larvae; Proteogenomics; Virulence Factors; Proteogenomics/methods; Animals; Bees/microbiology; Paenibacillus larvae/genetics; Paenibacillus larvae/pathogenicity; Paenibacillus larvae/metabolism; Virulence Factors/genetics; Virulence Factors/metabolism; Bacterial Proteins/genetics; Bacterial Proteins/metabolism; Genome, Bacterial/genetics; Databases, Protein; Proteomics/methods
2.
Proteomics ; 23(20): e2300188, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37488995

ABSTRACT

Relative and absolute intensity-based protein quantification across cell lines, tissue atlases and tumour datasets is increasingly available in public datasets. These atlases enable researchers to explore fundamental biological questions, such as protein existence, expression location, quantity and correlation with RNA expression. Most studies provide MS1 feature-based label-free quantitative (LFQ) datasets; however, growing numbers of isobaric tandem mass tags (TMT) datasets remain unexplored. Here, we compare traditional intensity-based absolute quantification (iBAQ) proteome abundance ranking to an analogous method using reporter ion proteome abundance ranking with data from an experiment where LFQ and TMT were measured on the same samples. This new TMT method substitutes reporter ion intensities for MS1 feature intensities in the iBAQ framework. Additionally, we compared LFQ-iBAQ values to TMT-iBAQ values from two independent large-scale tissue atlas datasets (one LFQ and one TMT) using robust bottom-up proteomic identification, normalisation and quantitation workflows.
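
As a rough illustration of the iBAQ framework referred to above: iBAQ divides the summed peptide intensities of a protein by its number of theoretically observable peptides, and the TMT variant described in the abstract swaps reporter ion intensities in for MS1 feature intensities. The sketch below assumes that definition; the function names and the input layout are hypothetical and do not come from the paper.

```python
def ibaq(peptide_intensities, n_theoretical_peptides):
    """Summed peptide intensity divided by the number of theoretical peptides.

    For LFQ-iBAQ the intensities would be MS1 feature intensities; for the
    TMT variant they would be reporter ion intensities (an assumption about
    the input, the arithmetic is identical).
    """
    if n_theoretical_peptides <= 0:
        raise ValueError("number of theoretical peptides must be positive")
    return sum(peptide_intensities) / n_theoretical_peptides


def rank_by_abundance(proteins):
    """Rank proteins from most to least abundant by iBAQ.

    `proteins` maps a protein name to (peptide_intensities, n_theoretical).
    """
    scores = {name: ibaq(ints, n) for name, (ints, n) in proteins.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

Comparing the LFQ-based and TMT-based rankings of the same samples then reduces to comparing two such ordered lists, for example with a rank correlation.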

3.
Genet Med ; 24(8): 1618-1629, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35550369

ABSTRACT

PURPOSE: The study aimed to determine the diagnostic yield, optimal timing, and methodology of next-generation sequencing data reanalysis in suspected Mendelian disorders. METHODS: We conducted a systematic review and meta-analysis of studies that conducted data reanalysis in patients with suspected Mendelian disorders. A random-effects model was used to pool the estimated outcome, with subgroup analysis stratified by timing, sequencing methodology, sample size, segregation, use of research validation, and artificial intelligence (AI) variant curation tools. RESULTS: A search of PubMed, Embase, Scopus, and Web of Science between 2007 and 2021 yielded 9327 articles, of which 29 were selected. Significant heterogeneity was noted between studies. Reanalysis had an overall diagnostic yield of 0.10 (95% CI = 0.06-0.13). Literature updates accounted for most new diagnoses. Diagnostic yield was higher after 24 months, although this was not statistically significant. Increased diagnoses were obtained with research validation and data sharing. AI-based tools did not adversely affect reanalysis diagnostic rate. CONCLUSION: Next-generation sequencing data reanalysis can improve diagnostic yield. Owing to the heterogeneity of the studies, the optimal time to reanalysis and the impact of AI-based tools could not be determined with confidence. We propose standardized guidelines for future studies to reduce heterogeneity and improve the quality of the conclusions.
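
The abstract states only that a random-effects model was used to pool the outcome. One common estimator for such models is DerSimonian-Laird, sketched below as an assumption rather than as the authors' method; the function name is hypothetical, and no transformation of proportions (for example logit or arcsine) is applied.

```python
def pool_random_effects(estimates, variances):
    """DerSimonian-Laird random-effects pooling of per-study estimates.

    Returns (pooled_estimate, standard_error, tau_squared), where
    tau_squared is the estimated between-study variance.
    """
    k = len(estimates)
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    # Fixed-effect (inverse-variance) mean, needed for Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, estimates)) / total_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each study.
    re_weights = [1.0 / (v + tau2) for v in variances]
    total_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / total_re
    standard_error = (1.0 / total_re) ** 0.5
    return pooled, standard_error, tau2
```

An approximate 95% confidence interval is the pooled estimate plus or minus 1.96 standard errors; a large tau_squared relative to the within-study variances corresponds to the substantial heterogeneity the abstract reports.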


Subjects
Artificial Intelligence; High-Throughput Nucleotide Sequencing; Humans; Exome Sequencing/methods
4.
Stat Med ; 41(8): 1319-1333, 2022 Apr 15.
Article in English | MEDLINE | ID: mdl-34897784

ABSTRACT

Testing the equality of two proportions is a common procedure in science, especially in medicine and public health. In these domains, it is crucial to be able to quantify evidence for the absence of a treatment effect. Bayesian hypothesis testing by means of the Bayes factor provides one avenue to do so, requiring the specification of prior distributions for parameters. The most popular analysis approach views the comparison of proportions from a contingency table perspective, assigning prior distributions directly to the two proportions. Another, less popular approach views the problem from a logistic regression perspective, assigning prior distributions to logit-transformed parameters. Reanalyzing 39 null results from the New England Journal of Medicine with both approaches, we find that they can lead to markedly different conclusions, especially when the observed proportions are at the extremes (i.e., very low or very high). We explain these stark differences and provide recommendations for researchers interested in testing the equality of two proportions and users of Bayes factors more generally. The test that assigns prior distributions to logit-transformed parameters creates prior dependence between the two proportions and yields weaker evidence when the observations are at the extremes. When comparing two proportions, we argue that this test should become the new default.
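
The contingency-table approach described above, with priors placed directly on the two proportions, has a closed-form Bayes factor when those priors are independent beta distributions. The sketch below assumes Beta(a, b) priors (uniform by default) under both hypotheses; the prior choice and the function names are illustrative assumptions, and the logit-based test the authors recommend is not implemented here.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bf01_independent_beta(y1, n1, y2, n2, a=1.0, b=1.0):
    """Bayes factor in favour of equal proportions (H0) over unequal (H1).

    H0: both groups share one proportion with a Beta(a, b) prior.
    H1: each group has its own proportion with independent Beta(a, b) priors.
    The binomial coefficients are identical under both hypotheses and cancel.
    """
    log_m0 = log_beta(a + y1 + y2, b + (n1 - y1) + (n2 - y2)) - log_beta(a, b)
    log_m1 = (
        log_beta(a + y1, b + n1 - y1) - log_beta(a, b)
        + log_beta(a + y2, b + n2 - y2) - log_beta(a, b)
    )
    return exp(log_m0 - log_m1)
```

Values above 1 favour equality of the two proportions and values below 1 favour a difference. Because the two proportions are independent a priori in this setup, it lacks the prior dependence that the abstract attributes to the logit-based test.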


Subjects
Research Design; Bayes Theorem; Humans; Logistic Models
5.
J Proteome Res ; 19(10): 3906-3909, 2020 Oct 02.
Article in English | MEDLINE | ID: mdl-32786688

ABSTRACT

Metadata is essential in proteomics data repositories and is crucial to interpret and reanalyze the deposited data sets. For every proteomics data set, we should capture at least three levels of metadata: (i) data set description, (ii) the sample to data files related information, and (iii) standard data file formats (e.g., mzIdentML, mzML, or mzTab). While the data set description and standard data file formats are supported by all ProteomeXchange partners, the information regarding the sample to data files is mostly missing. Recently, members of the European Bioinformatics Community for Mass Spectrometry (EuBIC) have created an open-source project called Sample to Data file format for Proteomics (https://github.com/bigbio/proteomics-metadata-standard/) to enable the standardization of sample metadata of public proteomics data sets. Here, the project is presented to the proteomics community, and we call for contributors, including researchers, journals, and consortiums to provide feedback about the format. We believe this work will improve reproducibility and facilitate the development of new tools dedicated to proteomics data analysis.


Subjects
Metadata; Proteomics; Mass Spectrometry; Reproducibility of Results; Software
6.
J Proteome Res ; 17(12): 4160-4170, 2018 Dec 07.
Article in English | MEDLINE | ID: mdl-30175587

ABSTRACT

The practice of data sharing in the proteomics field took off and quickly spread in recent years as a result of collective effort. Nowadays, most journal editors mandate the submission of the original raw mass spectra to one of the databases of the ProteomeXchange consortium. However, with the exception of large institutional initiatives such as PeptideAtlas or the GPMDB, few new studies are based on the reanalysis of mass spectrometry data. A wealth of information is thus left unexploited in public databases and repositories. Here, we present the large-scale reanalysis of 41 publicly available data sets corresponding to experiments carried out on the HeLa cancer cell line using a custom workflow. In addition to the search for new post-translational modification sites and "missing proteins", our main goal is to identify single amino acid variants and evaluate their impact on protein expression and stability through the spectral counting quantification approach. The X!Tandem software was selected to perform the search of a total of 56 363 701 tandem mass spectra against a customized variant protein database, compiled by the application of the in-house MzVar tool on HeLa-specific somatic and genomic variants retrieved from the COSMIC cell line project. After filtering the resulting identifications with a 1% FDR threshold computed at the protein level, 49 466 unique peptides were identified in 7266 protein entries, allowing the validation of 5576 protein entries in accordance with the HPP guidelines version 2.1. A new "missing protein" was observed (FRAT2, NX_O75474, chromosome 10), and 189 new phosphorylation and 392 new protein N-terminal acetylation sites could be identified. Twenty-four variant peptides were also identified, corresponding to 21 variants in 21 proteins. For three of the nine heterozygous cases where both the variant peptide and its wild-type counterpart were detected, the application of a two-tailed sign test showed a significant difference in the abundance of the two peptide versions.
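
The two-tailed sign test applied to the variant and wild-type peptide pairs can be sketched as follows. The abstract does not say how the paired observations were formed; the sketch assumes one difference per data set (for example, variant spectral count minus wild-type spectral count), and the function name is hypothetical.

```python
from math import comb


def sign_test_two_tailed(differences):
    """Exact two-tailed sign test p-value for paired differences.

    Zero differences (ties) are dropped. Under the null hypothesis each
    remaining difference is positive with probability 0.5.
    """
    positives = sum(1 for d in differences if d > 0)
    negatives = sum(1 for d in differences if d < 0)
    n = positives + negatives
    if n == 0:
        return 1.0
    k = min(positives, negatives)
    # Probability of observing k or fewer of the rarer sign, then doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)
```

The test uses only the direction of each difference, not its size, which makes it a conservative choice for noisy spectral-count data.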


Subjects
Databases, Protein; Genetic Variation; Protein Processing, Post-Translational; Proteome/analysis; Acetylation; Amino Acid Sequence; Cell Line, Tumor; HeLa Cells; Humans; Phosphorylation; Proteomics/methods; Software
8.
Clin Chim Acta ; 554: 117795, 2024 Feb 01.
Article in English | MEDLINE | ID: mdl-38262496

ABSTRACT

BACKGROUND: Hematuria is a common condition in pediatric clinical practice. It is related to a wide spectrum of disorders and is highly heterogeneous both clinically and genetically, which makes diagnosis challenging and leaves many pediatric patients with hematuria without an accurate diagnosis or early management. METHODS: In this single-center study, 42 children with hematuria were enrolled at Tianjin Children's Hospital between 2019 and 2020. We analyzed the clinical information and performed whole exome sequencing (WES) for all cases. The identified variants were then classified according to the American College of Medical Genetics and Genomics (ACMG) guidelines for interpreting sequence variants. For the fragment deletion, qPCR was performed to validate the finding and confirm the inheritance pattern. RESULTS: Of the 42 patients, 16 had gross hematuria and 26 had microscopic hematuria. Molecular genetic causes were uncovered in 9 (21.4%) children, including 7 with Alport syndrome (AS), one with polycystic nephropathy and one with lipoprotein glomerulopathy. The genetic findings in the other patients were not related to hematuria. CONCLUSIONS: WES is a rapid and effective way to evaluate patients with hematuria. The analysis of genotype-phenotype correlations in patients with AS indicated that severe variants were associated with early kidney failure. Secondary findings were not rare in Chinese children; clinicians should therefore pay more attention to the clinical interpretation of sequencing results and to appropriate communication with patients and their families.


Subjects
Hematuria; Kidney Diseases; Child; Humans; Hematuria/diagnosis; Hematuria/genetics; Exome Sequencing; Genomics; Genetic Association Studies
9.
Front Genet ; 14: 1122985, 2023.
Article in English | MEDLINE | ID: mdl-37152996

ABSTRACT

Introduction: Exome sequencing (ES) has a diagnostic yield ranging from 25% to 70% in rare diseases and regularly implicates genes in novel disorders. Retrospective data reanalysis has demonstrated strong efficacy in improving diagnosis, but poses organizational difficulties for clinical laboratories. Patients and methods: We applied a reanalysis strategy based on intensive prospective bibliographic monitoring along with direct application of the GREP command-line tool (to "globally search for a regular expression and print matching lines") in a large ES database. For 18 months, we submitted the same five keywords of interest (intellectual disability, (neuro)developmental delay, and (neuro)developmental disorder) to PubMed on a daily basis to identify recently published novel disease-gene associations or new phenotypes in genes already implicated in human pathology. We used the Linux GREP tool and an in-house script to collect all variants of these genes from our database of 5,459 exomes. Results: After GREP queries and variant filtration, we identified 128 genes of interest and collected 56 candidate variants from 53 individuals. We confirmed causal diagnosis for 19/128 genes (15%) in 21 individuals and identified variants of unknown significance for 19/128 genes (15%) in 23 individuals. Altogether, GREP queries for only 128 genes over a period of 18 months permitted a causal diagnosis to be established in 21/2875 undiagnosed affected probands (0.7%). Conclusion: The GREP query strategy is efficient and less tedious than complete periodic reanalysis. It is an interesting reanalysis strategy to improve diagnosis.
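
The GREP-based lookup described above amounts to printing every variant line that mentions a gene of interest. The Python sketch below mimics that behaviour with whole-word matching; it is an illustration only, since the authors' in-house script and the layout of their variant files are not described, and the function name and example gene symbols are hypothetical.

```python
import re


def grep_genes(lines, genes):
    """Return the lines that contain any of the gene symbols as a whole word.

    Whole-word matching stops a query for GENE1 from also returning
    lines that only mention GENE10.
    """
    if not genes:
        return []
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(g) for g in genes) + r")\b"
    )
    return [line for line in lines if pattern.search(line)]
```

On the command line, a comparable query would be `grep -w -F -f genes.txt variants.tsv`, where `genes.txt` (one symbol per line) and `variants.tsv` are placeholder file names.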

10.
Cortex ; 153: 87-96, 2022 Aug.
Article in English | MEDLINE | ID: mdl-35635860

ABSTRACT

Kording and Wolpert (2004), hereafter referred to as KW, describe an experiment where subjects strove for accuracy in a stochastic environment and, on some trials, received mid-trial and post-trial feedback. KW claims that subjects learned the underlying stochastic distribution from the post-trial feedback of previous trials. KW also claims that subjects regarded mid-trial feedback that had a smaller visual size as more precise and they were therefore more sensitive to such mid-trial feedback. KW concludes that the observations are consistent with optimal Bayesian learning. KW has become an extremely influential paper in the large literature arguing that subjects are optimal Bayesian learners in stochastic environments. It is therefore crucial that the KW conclusions follow from their dataset. We note that KW analyzes data that have been both averaged across trials and averaged across other important trial-specific details. We also note that KW mischaracterizes the accuracy of the mid-trial feedback and the relative sizes of the mid-trial feedback. When we analyze the trial-level KW data, we do not find that subjects were more sensitive to mid-trial feedback when it had a smaller visual size. Our trial-level analysis also suggests a recency bias, rather than evidence that the subjects learned the stochastic distribution. In other words, we do not find that the observations are consistent with optimal Bayesian learning. In the KW dataset, it seems that evidence for optimal Bayesian learning is a statistical artifact of analyzing averaged data. Our results from the KW dataset would seem to have important implications for the literature on Bayesian judgments.


Subjects
Feedback, Sensory; Learning; Bayes Theorem; Humans
11.
Virus Res ; 320: 198887, 2022 Oct 15.
Article in English | MEDLINE | ID: mdl-35953004

ABSTRACT

PURPOSE: Japanese encephalitis (JE), caused by the Japanese encephalitis virus (JEV), is the principal cause of viral encephalitis in South-East Asian and Western Pacific countries, accounting for 68,000 cases and up to 20,400 fatalities annually across the world. Despite being a high-risk condition, there is no specific treatment for JE. Given rapid additions in genomics databases and the power of data reanalysis in addressing critical medical questions, the present study was designed to identify novel host factors that might have potential roles in JEV infection. METHODS: We extracted microarray and RNA-Seq data sets from NCBI-GEO and compared mock and JEV-infected samples. Raw data from all the studies were re-analyzed to identify host factors associated with JEV replication. RESULTS: We identified several coding and non-coding host factors that had no prior known role in viral infections. Of these, the coding transcripts Myosin Heavy Chain 10 (MYH10) and Progestin and AdipoQ Receptor Family Member 8 (PAQR8), and the microRNAs hsa-miR-193b-5p, hsa-miR-3714 and hsa-miR-513a-5p were found to be novel host factors deregulated during JEV infection. MYH10 encodes a conventional non-muscle myosin, and mutations in MYH10 have been shown to cause neurological defects. PAQR8 has been associated with epilepsy, which exhibits symptoms similar to JEV infection. JE is a neuro-degenerative disease, and the known involvement of MYH10 and PAQR8 in neurological disorders strongly indicates potential roles of these host factors in JEV infection. Additionally, we observed that MYH10 and PAQR8 had a significant negative correlation with Activating transcription factor 3 (ATF3), which is a previously validated modulator of JEV infection. ATF3 is a transcription factor that binds to the promoters of genes encoding other transcription factors or interferon-stimulated genes and negatively regulates host antiviral responses during JE. CONCLUSION: Our findings demonstrate the significance of data reanalysis in the identification of novel host factors that may become targets for diagnosis or therapy against viral diseases of major concern, such as JE. The deregulated coding and non-coding transcripts identified in this study need further experimental analysis for validation.


Subjects
Encephalitis Virus, Japanese; Encephalitis Viruses, Japanese; Encephalitis, Japanese; MicroRNAs; Encephalitis Virus, Japanese/metabolism; Encephalitis, Japanese/genetics; Humans; MicroRNAs/genetics; MicroRNAs/metabolism; Transcriptome
12.
Methods Mol Biol ; 1977: 217-235, 2019.
Article in English | MEDLINE | ID: mdl-30980331

ABSTRACT

Mass spectrometry-based proteomics is no longer only a qualitative discipline, and can be successfully employed to obtain a truly multidimensional view of the proteome. In particular, systematic protein expression profiling is now a routine part of many studies in the field and beyond. The large growth in the number of quantitative studies is accompanied by a trend to share publicly the associated analysis results and the underlying raw data. This trend, established and strongly supported by public repositories such as the PRIDE database at the European Bioinformatics Institute, opens up enormous possibilities to explore the data beyond the original publications, for instance by reusing, reanalyzing, and performing different flavors of meta-analysis studies. To help researchers and scientists realize this potential, here we describe the mainstream public proteomics resources containing quantitative proteomics data, including the processed analysis results and/or the underlying raw data. We then present and discuss the most important points to consider when attempting to (re)use proteomics data in the public domain. We conclude by highlighting potential pitfalls of (re)using quantitative data and discuss some of our own experiences in this context.


Subjects
Computational Biology; Databases, Protein; Proteomics/methods; Computational Biology/methods; Data Analysis; Humans; Mass Spectrometry; Proteomics/standards; Reproducibility of Results; Web Browser
14.
J Am Dent Assoc ; 146(3): 164-173.e4, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25726343

ABSTRACT

BACKGROUND: It has been proposed that the PST and PerioPredict genetic tests that are based on polymorphisms in interleukin 1 (IL-1) genes identify a subset of patients who experience fewer tooth extractions if provided with 2 annual preventive visits. Economic analyses indicate rationing preventive care to only "high-risk" genotypes, smokers, patients with diabetes, or combinations of these risk factors would reduce the cost of dental care by $4.8 billion annually in the United States. METHODS: Data presented in the study that claimed clinical utility for the PST and PerioPredict tests were obtained for reanalysis using logistic regression to assess whether the PST genetic test, smoking, diabetes, or number of preventive visits were risk factors for tooth extraction during a span of 16 years. Consistency of risk classification by the PST (version 1) and PerioPredict (version 2) genetic tests was evaluated in different ethnic groups from the 1000 Genomes database. RESULTS: Multivariate analyses revealed association of tooth extraction with diabetes (P < .0001), smoking (P < .0001), and number of preventive visits (P = .004), but no support for the PST genetic test (P = .96) nor indication that the benefit of 2 preventive visits was affected by this genetic test (P = .58). Classification of risk was highly inconsistent between the PST (version 1) and PerioPredict (version 2) genetic tests. CONCLUSIONS: Two annual preventive visits were supported as beneficial for all patients, and there was no evidence that the IL-1 PST genetic test has any effect on tooth extraction risk or influences the benefits of 2 annual preventive visits. PRACTICAL IMPLICATIONS: Neither IL-1 PST nor PerioPredict genetic tests are useful for rationing preventive dental care. Further research is needed to identify genetic biomarkers with robust clinical validity and clinical utility to effectively personalize the practice of dentistry.


Subjects
Genetic Testing; Interleukin-1/genetics; Preventive Medicine/methods; Adult; Dental Care/statistics & numerical data; Diabetes Complications/epidemiology; Female; Genetic Markers/genetics; Genetic Predisposition to Disease/genetics; Genetic Testing/methods; Humans; Insurance Coverage; Insurance, Dental; Male; Middle Aged; Polymorphism, Single Nucleotide/genetics; Risk Factors; Smoking/adverse effects; Tooth Diseases/etiology; Tooth Diseases/genetics