Results 1 - 20 of 57
1.
Brief Bioinform ; 24(6), 2023 09 22.
Article in English | MEDLINE | ID: mdl-37930029

ABSTRACT

The principal use of mass cytometry is to identify distinct cell types and changes in their composition, phenotype and function in different samples and conditions. Combining data from different studies has the potential to increase the power of these discoveries in diverse fields such as immunology, oncology and infection. However, current tools lack scalable, reproducible and automated methods to integrate and study data sets from mass cytometry that often use heterogeneous approaches to study similar samples. To address these limitations, we present two novel developments: (1) a pre-trained cell identification model named Immunopred that allows automated identification of immune cells without user-defined prior knowledge of expected cell types and (2) a fully automated cytometry meta-analysis pipeline built around Immunopred. We evaluated this pipeline on six COVID-19 study data sets comprising 270 unique samples and uncovered novel significant phenotypic changes in the wider immune landscape of COVID-19 that were not identified when each study was analyzed individually. Applied widely, our approach will support the discovery of novel findings in research areas where cytometry data sets are available for integration.


Subject(s)
COVID-19 , Neural Networks, Computer , Humans , Flow Cytometry/methods , Phenotype
2.
PLoS Comput Biol ; 19(6): e1010459, 2023 06.
Article in English | MEDLINE | ID: mdl-37352361

ABSTRACT

Phosphoproteomics allows one to measure the activity of kinases that drive the fluxes of signal transduction pathways involved in biological processes such as immune function, senescence and cell growth. However, deriving knowledge of signalling network circuitry from these data is challenging due to a scarcity of phosphorylation sites that define kinase-kinase relationships. To address this issue, we previously identified around 6,000 phosphorylation sites as markers of kinase-kinase relationships (that may be conceptualised as network edges), from which empirical cell-model-specific weighted kinase networks may be reconstructed. Here, we assess whether the application of community detection algorithms to such networks can identify new components linked to canonical signalling pathways. Phosphoproteomics data from acute myeloid leukaemia (AML) cells treated separately with PI3K, AKT, MEK and ERK inhibitors were used to reconstruct individual kinase networks. We used modularity maximisation to detect communities in each network, and selected the community containing the main target of the inhibitor used to treat cells. These analyses returned communities that contained known canonical signalling components. Interestingly, in addition to canonical PI3K/AKT/mTOR members, the community assignments returned TTK (also known as MPS1) as a likely component of PI3K/AKT/mTOR signalling. We drew similar insights from an external phosphoproteomics dataset from breast cancer cells treated with rapamycin and oestrogen. We confirmed this observation with wet-lab experiments showing that TTK phosphorylation was decreased in AML cells treated with AKT and MTOR inhibitors. This study illustrates the application of community detection algorithms to the analysis of empirical kinase networks to uncover new members linked to canonical signalling pathways.


Subject(s)
Leukemia, Myeloid, Acute , Proto-Oncogene Proteins c-akt , Humans , Proto-Oncogene Proteins c-akt/metabolism , Phosphatidylinositol 3-Kinases/metabolism , Signal Transduction , TOR Serine-Threonine Kinases/metabolism , Phosphotransferases/metabolism
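
The abstract of entry 2 describes modularity maximisation on weighted kinase networks followed by selection of the community that contains the inhibitor's main target. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes a simple greedy agglomerative strategy (the abstract does not say which optimiser was used), and the edge list `TOY_EDGES` with its weights is invented for illustration and is not taken from the study.

```python
from itertools import combinations

# Hypothetical toy network: (kinase_a, kinase_b, weight). Weights are made up.
TOY_EDGES = [
    ("PIK3CA", "AKT1", 1.0), ("AKT1", "MTOR", 1.0), ("PIK3CA", "MTOR", 1.0),
    ("MAP2K1", "MAPK1", 1.0), ("MAPK1", "MAPK3", 1.0), ("MAP2K1", "MAPK3", 1.0),
    ("MTOR", "MAP2K1", 0.2),  # weak bridge between the two groups
]


def modularity(edges, communities):
    """Newman modularity Q of a partition of an undirected weighted graph."""
    two_m = 2.0 * sum(w for _, _, w in edges)
    label = {node: i for i, comm in enumerate(communities) for node in comm}
    inside = [0.0] * len(communities)  # edge weight inside each community (both directions)
    total = [0.0] * len(communities)   # summed node strength of each community
    for u, v, w in edges:
        total[label[u]] += w
        total[label[v]] += w
        if label[u] == label[v]:
            inside[label[u]] += 2.0 * w
    return sum(i / two_m - (t / two_m) ** 2 for i, t in zip(inside, total))


def greedy_communities(edges):
    """Merge the pair of communities that most increases Q until no merge helps."""
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    communities = [{n} for n in nodes]
    best_q = modularity(edges, communities)
    while len(communities) > 1:
        candidates = []
        for a, b in combinations(range(len(communities)), 2):
            merged = [c for i, c in enumerate(communities) if i not in (a, b)]
            merged.append(communities[a] | communities[b])
            candidates.append((modularity(edges, merged), merged))
        q, merged = max(candidates, key=lambda item: item[0])
        if q <= best_q:
            break
        best_q, communities = q, merged
    return communities, best_q


def community_containing(communities, target):
    """Return the community that holds the inhibitor target, or None."""
    for comm in communities:
        if target in comm:
            return comm
    return None
```

Calling `greedy_communities(TOY_EDGES)` and then `community_containing(communities, "AKT1")` mirrors the two steps named in the abstract. The exhaustive pairwise search is only practical for small graphs; a dedicated graph library would be the usual choice for real networks.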
3.
Value Health ; 26(7): 1057-1066, 2023 07.
Article in English | MEDLINE | ID: mdl-36804528

ABSTRACT

OBJECTIVES: Clinical outcome assessment (COA) developers must ensure that measures assess aspects of health that are meaningful to the target patient population. Although the methodology for doing this is well understood for certain COAs, such as patient-reported outcome measures, there are fewer examples of this practice in the development of digital endpoints using mobile sensor technology such as physical activity monitors. This study explored the utility of social media data, specifically, posts on online health boards, in understanding meaningful aspects of health related to physical activity in 3 different chronic diseases: fibromyalgia, chronic obstructive pulmonary disease, and chronic heart failure. METHODS: We used machine learning and manual coding to summarize the content of posts extracted from 4 online health boards. Where available, patient age and sex were retrieved from post content or user profiles. We utilized analytical approaches to assess the robustness of findings to differences in the characteristics of online samples compared to the true patient population. Finally, we assessed concept saturation by measuring the convergence of autocorrelations. RESULTS: We identify a number of aspects of health described as important by patients in our samples, and summarize these into concepts for measurement. For chronic heart failure, these included purposeful walking duration and speed, fatigue, difficulty going upstairs, standing, and aspects of physical exercise. Overall and age-adjusted results did not differ considerably for each disease group. CONCLUSIONS: This study illustrates the potential of performing concept elicitation research using social media data, which may provide valuable insight to inform COA development.


Subject(s)
Pulmonary Disease, Chronic Obstructive , Humans , Fatigue , Patient Reported Outcome Measures , Exercise , Machine Learning
4.
Nucleic Acids Res ; 46(10): 4893-4902, 2018 06 01.
Article in English | MEDLINE | ID: mdl-29718325

ABSTRACT

Proteomics informed by transcriptomics (PIT), in which proteomic MS/MS spectra are searched against open reading frames derived from de novo assembled transcripts, can reveal previously unknown translated genomic elements (TGEs). However, determining which TGEs are truly novel, which are variants of known proteins, and which are simply artefacts of poor sequence assembly, is challenging. We have designed and implemented an automated solution that classifies putative TGEs by comparing to reference proteome sequences. This allows large-scale identification of sequence polymorphisms, splice isoforms and novel TGEs supported by presence or absence of variant-specific peptide evidence. Unlike previously reported methods, ours does not require a catalogue of known variants, making it more applicable to non-model organisms. The method was validated on human PIT data, then applied to Mus musculus, Pteropus alecto and Aedes aegypti. Novel discoveries included 60 human protein isoforms, 32 392 polymorphisms in P. alecto, and TGEs with non-methionine start sites including tyrosine.


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Protein Isoforms/genetics , Proteomics/methods , Aedes/genetics , Aedes/metabolism , Animals , Cell Line , Chiroptera/genetics , Chiroptera/metabolism , Codon, Initiator , Humans , Insect Proteins/genetics , Insect Proteins/metabolism , Mice , Open Reading Frames , Polymorphism, Genetic , Reproducibility of Results , Tandem Mass Spectrometry , Tyrosine/genetics
5.
Nucleic Acids Res ; 46(D1): D1223-D1228, 2018 01 04.
Article in English | MEDLINE | ID: mdl-30053269

ABSTRACT

PITDB is a freely available database of translated genomic elements (TGEs) that have been observed in PIT (proteomics informed by transcriptomics) experiments. In PIT, a sample is analyzed using both RNA-seq transcriptomics and proteomic mass spectrometry. Transcripts assembled from RNA-seq reads are used to create a library of sample-specific amino acid sequences against which the acquired mass spectra are searched, permitting detection of any TGE, not just those in canonical proteome databases. At the time of writing, PITDB contains over 74 000 distinct TGEs from four species, supported by more than 600 000 peptide spectrum matches. The database, accessible via http://pitdb.org, provides supporting evidence for each TGE, often from multiple experiments and an indication of the confidence in the TGE's observation and its type, ranging from known protein (exact match to a UniProt protein sequence), through multiple types of protein variant including various splice isoforms, to a putative novel molecule. PITDB's modern web interface allows TGEs to be viewed individually or by species or experiment, and downloaded for further analysis. PITDB is for bench scientists seeking to share their PIT results, for researchers investigating novel genome products in model organisms and for those wishing to construct proteomes for lesser studied species.


Subject(s)
Databases, Factual , Proteins/chemistry , Proteins/genetics , Sequence Analysis, RNA , Algorithms , Amino Acid Sequence , Animals , Data Display , Humans , Internet , Open Reading Frames , Protein Biosynthesis , Protein Isoforms/genetics , Proteomics/methods , Tandem Mass Spectrometry , User-Computer Interface
6.
BMC Genomics ; 18(1): 101, 2017 01 19.
Article in English | MEDLINE | ID: mdl-28103802

ABSTRACT

BACKGROUND: Aedes aegypti is a vector for the (re-)emerging human pathogens dengue, chikungunya, yellow fever and Zika viruses. Almost half of the Ae. aegypti genome consists of transposable elements (TEs). Transposons have been linked to diverse cellular processes, including the establishment of viral persistence in insects, an essential step in the transmission of vector-borne viruses. However, up until now it has not been possible to study the overall proteome derived from an organism's mobile genetic elements, partly due to the highly divergent nature of TEs. Furthermore, as for many non-model organisms, incomplete genome annotation has hampered proteomic studies on Ae. aegypti. RESULTS: We analysed the Ae. aegypti proteome using our new proteomics informed by transcriptomics (PIT) technique, which bypasses the need for genome annotation by identifying proteins through matched transcriptomic (rather than genomic) data. Our data vastly increase the number of experimentally confirmed Ae. aegypti proteins. The PIT analysis also identified hotspots of incomplete genome annotation, and showed that poor sequence and assembly quality do not explain all annotation gaps. Finally, in a proof-of-principle study, we developed criteria for the characterisation of proteomically active TEs. Protein expression did not correlate with a TE's genomic abundance at different levels of classification. Most notably, long terminal repeat (LTR) retrotransposons were markedly enriched compared to other elements. PIT was superior to 'conventional' proteomic approaches in both our transposon and genome annotation analyses. CONCLUSIONS: We present the first proteomic characterisation of an organism's repertoire of mobile genetic elements, which will open new avenues of research into the function of transposon proteins in health and disease. Furthermore, our study provides a proof-of-concept that PIT can be used to evaluate a genome's annotation to guide annotation efforts, which has the potential to improve the efficiency of annotation projects in non-model organisms. PIT therefore represents a valuable new tool to study the biology of the important vector species Ae. aegypti, including its role in transmitting emerging viruses of global public health concern.


Subject(s)
Aedes/metabolism , DNA Transposable Elements/genetics , Genome , Proteome/analysis , Proteomics/methods , Aedes/genetics , Animals , Cell Line , Chromatography, High Pressure Liquid , Contig Mapping , Insect Proteins/analysis , Insect Proteins/isolation & purification , RNA/isolation & purification , RNA/metabolism , Sequence Analysis, RNA , Tandem Mass Spectrometry
7.
Mol Cell Proteomics ; 14(11): 3087-93, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26269333

ABSTRACT

With the recent advent of RNA-seq technology the proteomics community has begun to generate sample-specific protein databases for peptide and protein identification, an approach we call proteomics informed by transcriptomics (PIT). This approach has gained a lot of interest, particularly among researchers who work with nonmodel organisms or with particularly dynamic proteomes such as those observed in developmental biology and host-pathogen studies. PIT has been shown to improve coverage of known proteins, and to reveal potential novel gene products. However, many groups are impeded in their use of PIT by the complexity of the required data analysis. Necessarily, this analysis requires complex integration of a number of different software tools from at least two different communities, and because PIT has a range of biological applications a single software pipeline is not suitable for all use cases. To overcome these problems, we have created GIO, a software system that uses the well-established Galaxy platform to make PIT analysis available to the typical bench scientist via a simple web interface. Within GIO we provide workflows for four common use cases: a standard search against a reference proteome; PIT protein identification without a reference genome; PIT protein identification using a genome guide; and PIT genome annotation. These workflows comprise individual tools that can be reconfigured and rearranged within the web interface to create new workflows to support additional use cases.


Subject(s)
Proteomics/methods , Software , Transcriptome , Algorithms , Data Mining , Databases, Protein , Humans , Mass Spectrometry/statistics & numerical data , Workflow
8.
Proteomics ; 15(18): 3152-62, 2015 Sep.
Article in English | MEDLINE | ID: mdl-26037908

ABSTRACT

The mzQuantML standard has been developed by the Proteomics Standards Initiative for capturing, archiving and exchanging quantitative proteomic data derived from mass spectrometry. It is a rich XML-based format, capable of representing data about two-dimensional features from LC-MS data, and peptides, proteins or groups of proteins that have been quantified from multiple samples. In this article we report the development of an open source Java-based library of routines for mzQuantML, called the mzqLibrary, and associated software for visualising data called the mzqViewer. The mzqLibrary contains routines for mapping (peptide) identifications on quantified features, inference of protein (group)-level quantification values from peptide-level values, normalisation and basic statistics for differential expression. These routines can be accessed via the command line, via a Java programming interface or via a basic graphical user interface. The mzqLibrary also contains several file format converters, including import converters (to mzQuantML) from OpenMS, Progenesis LC-MS and MaxQuant, and exporters (from mzQuantML) to other standards or useful formats (mzTab, HTML, csv). The mzqViewer contains in-built routines for viewing the tables of data (about features, peptides or proteins), and connects to the R statistical library for more advanced plotting options. The mzqLibrary and mzqViewer packages are available from https://code.google.com/p/mzq-lib/.


Subject(s)
Database Management Systems , Databases, Protein/standards , Proteomics/methods , Proteomics/standards , Software
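
The abstract of entry 8 mentions inferring protein-level quantification values from peptide-level values. As a rough illustration of that step, the sketch below groups peptide abundances by protein and summarises each group with the median. The choice of the median, the function name `protein_quant` and the row layout are all assumptions made here; the abstract does not state which operator the mzqLibrary applies, and this is not its Java API.

```python
from collections import defaultdict
from statistics import median


def protein_quant(peptide_rows):
    """Summarise peptide abundances per protein.

    peptide_rows: iterable of (protein_id, peptide_sequence, abundance).
    Returns {protein_id: median abundance of its peptides}.
    """
    grouped = defaultdict(list)
    for protein_id, _peptide, abundance in peptide_rows:
        grouped[protein_id].append(abundance)
    return {protein_id: median(values) for protein_id, values in grouped.items()}


# Hypothetical peptide-level table for one sample (values invented).
TOY_ROWS = [
    ("P1", "AAK", 10.0),
    ("P1", "GGR", 30.0),
    ("P1", "LLK", 20.0),
    ("P2", "VVR", 5.0),
]
```

With `TOY_ROWS`, `protein_quant` returns one value per protein. Swapping `median` for `sum` or a mean changes only the summary step, which is why the grouping is kept separate from it.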
9.
Biochim Biophys Acta ; 1844(1 Pt A): 88-97, 2014 Jan.
Article in English | MEDLINE | ID: mdl-23584085

ABSTRACT

The Human Proteome Organisation - Proteomics Standards Initiative (HUPO-PSI) has been working for ten years on the development of standardised formats that facilitate data sharing and public database deposition. In this article, we review three HUPO-PSI data standards - mzML, mzIdentML and mzQuantML, which can be used to design a complete quantitative analysis pipeline in mass spectrometry (MS)-based proteomics. In this tutorial, we briefly describe the content of each data model, sufficient for bioinformaticians to devise proteomics software. We also provide guidance on the use of recently released application programming interfaces (APIs) developed in Java for each of these standards, which makes it straightforward to read and write files of any size. We have produced a set of example Java classes and a basic graphical user interface to demonstrate how to use the most important parts of the PSI standards, available from http://code.google.com/p/psi-standard-formats-tutorial. This article is part of a Special Issue entitled: Computational Proteomics in the Post-Identification Era. Guest Editors: Martin Eisenacher and Christian Stephan.


Subject(s)
Proteomics , Software , Computational Biology , Humans , Programming Languages
10.
Nat Methods ; 9(12): 1207-11, 2012 Dec.
Article in English | MEDLINE | ID: mdl-23142869

ABSTRACT

Identification of proteins by tandem mass spectrometry requires a reference protein database, but these are only available for model species. Here we demonstrate that, for a non-model species, the sequencing of expressed mRNA can generate a protein database for mass spectrometry-based identification. This combination of high-throughput sequencing and protein identification technologies allows detection of genes and proteins. We use human cells infected with human adenovirus as a complex and dynamic model to demonstrate the robustness of this approach. Our proteomics informed by transcriptomics (PIT) technique identifies >99% of over 3,700 distinct proteins identified using traditional analysis that relies on comprehensive human and adenovirus protein lists. We show that this approach can also be used to highlight genes and proteins undergoing dynamic changes in post-transcriptional protein stability.


Subject(s)
Proteome/chemistry , Proteomics/methods , Transcriptome , Adenoviridae/genetics , Adenoviridae/metabolism , Animals , Arginine/metabolism , CHO Cells , Carbon Isotopes , Chromatography, Liquid , Cricetinae , Cricetulus , Databases, Protein , HeLa Cells , Humans , Lysine/metabolism , Nitrogen Isotopes , Nuclear Proteins/metabolism , Polymorphism, Single Nucleotide , RNA, Messenger/metabolism , RNA-Binding Proteins/metabolism , Sequence Analysis, Protein/methods , Software , Tandem Mass Spectrometry/methods
11.
Mol Cell Proteomics ; 12(8): 2332-40, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23599424

ABSTRACT

The range of heterogeneous approaches available for quantifying protein abundance via mass spectrometry (MS) leads to considerable challenges in modeling, archiving, exchanging, or submitting experimental data sets as supplemental material to journals. To date, there has been no widely accepted format for capturing the evidence trail of how quantitative analysis has been performed by software, for transferring data between software packages, or for submitting to public databases. In the context of the Proteomics Standards Initiative, we have developed the mzQuantML data standard. The standard can represent quantitative data about regions in two-dimensional retention time versus mass/charge space (called features), peptides, and proteins and protein groups (where there is ambiguity regarding peptide-to-protein inference), and it offers limited support for small molecule (metabolomic) data. The format has structures for representing replicate MS runs, grouping of replicates (for example, as study variables), and capturing the parameters used by software packages to arrive at these values. The format has the capability to reference other standards such as mzML and mzIdentML, and thus the evidence trail for the MS workflow as a whole can now be described. Several software implementations are available, and we encourage other bioinformatics groups to use mzQuantML as an input, internal, or output format for quantitative software and for structuring local repositories. All project resources are available in the public domain from the HUPO Proteomics Standards Initiative http://www.psidev.info/mzquantml.


Subject(s)
Mass Spectrometry/standards , Proteomics/standards , Databases, Protein , Mass Spectrometry/methods , Models, Theoretical , Proteomics/methods , Software
12.
Nature ; 455(7216): 1138-42, 2008 Oct 23.
Article in English | MEDLINE | ID: mdl-18948958

ABSTRACT

Metals are needed by at least one-quarter of all proteins. Although metallochaperones insert the correct metal into some proteins, they have not been found for the vast majority, and the view is that most metalloproteins acquire their metals directly from cellular pools. However, some metals form more stable complexes with proteins than do others. For instance, as described in the Irving-Williams series, Cu²⁺ and Zn²⁺ typically form more stable complexes than Mn²⁺. Thus it is unclear what cellular mechanisms manage metal acquisition by most nascent proteins. To investigate this question, we identified the most abundant Cu²⁺-protein, CucA (Cu²⁺-cupin A), and the most abundant Mn²⁺-protein, MncA (Mn²⁺-cupin A), in the periplasm of the cyanobacterium Synechocystis PCC 6803. Each of these newly identified proteins binds its respective metal via identical ligands within a cupin fold. Consistent with the Irving-Williams series, MncA only binds Mn²⁺ after folding in solutions containing at least a 10⁴-fold molar excess of Mn²⁺ over Cu²⁺ or Zn²⁺. However, once MncA has bound Mn²⁺, the metal does not exchange with Cu²⁺. MncA and CucA have signal peptides for different export pathways into the periplasm, Tat and Sec respectively. Export by the Tat pathway allows MncA to fold in the cytoplasm, which contains only tightly bound copper or Zn²⁺ (refs 10-12) but micromolar Mn²⁺ (ref. 13). In contrast, CucA folds in the periplasm to acquire Cu²⁺. These results reveal a mechanism whereby the compartment in which a protein folds overrides its binding preference to control its metal content. They explain why the cytoplasm must contain only tightly bound and buffered copper and Zn²⁺.


Subject(s)
Bacterial Proteins/metabolism , Metals, Heavy/metabolism , Protein Folding , Bacterial Proteins/chemistry , Bacterial Proteins/isolation & purification , Copper/metabolism , Manganese/metabolism , Models, Molecular , Periplasm/metabolism , Protein Binding , Protein Structure, Tertiary , Synechocystis/metabolism , Zinc/metabolism
13.
Bioinform Adv ; 4(1): vbad190, 2024.
Article in English | MEDLINE | ID: mdl-38282976

ABSTRACT

Motivation: Anti-cancer drug response prediction is a central problem within stratified medicine. Transcriptomic profiles of cancer cell lines are typically used for drug response prediction, but we hypothesize that proteomics or phosphoproteomics might be more suitable as they give a more direct insight into cellular processes. However, there has not yet been a systematic comparison between all three of these datatypes using consistent evaluation criteria. Results: Due to the limited number of cell lines with phosphoproteomics profiles we use learning curves, a plot of predictive performance as a function of dataset size, to compare the current performance and predict the future performance of the three omics datasets with more data. We use neural networks and XGBoost and compare them against a simple rule-based benchmark. We show that phosphoproteomics slightly outperforms RNA-seq and proteomics using the 38 cell lines with profiles of all three omics data types. Furthermore, using the 877 cell lines with proteomics and RNA-seq profiles, we show that RNA-seq slightly outperforms proteomics. With the learning curves we predict that the mean squared error using the phosphoproteomics dataset would decrease by ∼15% if a dataset of the same size as the proteomics/transcriptomics was collected. For the cell lines with proteomics and RNA-seq profiles the learning curves reveal that for smaller dataset sizes neural networks outperform XGBoost and vice versa for larger datasets. Furthermore, the trajectory of the XGBoost curve suggests that it will improve faster than the neural networks as more data are collected. Availability and implementation: See https://github.com/Nik-BB/Learning-curves-for-DRP for the code used.
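
The abstract of entry 13 uses learning curves, predictive performance as a function of dataset size, to project how error would change with more data. One common way to extrapolate such a curve is to fit a power law, error ≈ a · n^(−b), by linear regression in log-log space. The sketch below shows that approach only as an assumption: the abstract does not say which functional form the authors fitted, and the sizes and errors used here are synthetic, not results from the paper.

```python
import math


def fit_power_law(sizes, errors):
    """Least-squares fit of errors ~ a * sizes**(-b) in log-log space.

    Returns (a, b). Requires at least two distinct positive sizes and
    strictly positive errors.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


def predict_error(a, b, n):
    """Extrapolated error at training-set size n under the fitted power law."""
    return a * n ** (-b)


# Synthetic learning curve that follows an exact power law (illustration only).
SIZES = [10, 20, 40, 80]
ERRORS = [2.0 * n ** -0.5 for n in SIZES]
```

Fitting `SIZES` and `ERRORS` and then calling `predict_error` at a larger size gives the kind of projection the abstract describes. Real learning curves are noisy and often flatten, so any extrapolation far beyond the observed sizes should be read with caution.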

14.
Clin Mol Hepatol ; 29(2): 417-432, 2023 04.
Article in English | MEDLINE | ID: mdl-36727210

ABSTRACT

BACKGROUND/AIMS: Immune and inflammatory cells respond to multiple pathological hits in the development of nonalcoholic steatohepatitis (NASH) and fibrosis. Relatively little is known about how their type and function change through the non-alcoholic fatty liver disease (NAFLD) spectrum. Here we used multi-dimensional mass cytometry and a tailored bioinformatic approach to study circulating immune cells sampled from healthy individuals and people with NAFLD. METHODS: Cytometry by time of flight using 36 metal-conjugated antibodies was applied to peripheral blood mononuclear cells (PBMCs) from biopsy-proven NASH fibrosis (late disease), steatosis (early disease), and healthy patients. Supervised and unsupervised analyses were used, findings confirmed, and mechanisms assessed using independent healthy and disease PBMC samples. RESULTS: Of 36 PBMC clusters, 21 changed between controls and disease samples. Significant differences were observed between disease stages, with changes in T cells and myeloid cells throughout disease and B cell changes in late stages. Semi-supervised gating and re-clustering showed that disease stages were associated with fewer monocytes with active signalling and more inactive NK cells; B and T cells bearing activation markers were reduced in late stages, while B cells bearing co-stimulatory molecules were increased. Functionally, disease states were associated with fewer activated mucosal-associated invariant T cells and reduced toll-like receptor-mediated cytokine production in late disease. CONCLUSION: A range of innate and adaptive immune changes begin early in NAFLD, and disease stages are associated with a functionally less active phenotype compared to controls. Further study of the immune response in the NAFLD spectrum may give insight into mechanisms of disease with potential clinical application.


Subject(s)
Non-alcoholic Fatty Liver Disease , Humans , Non-alcoholic Fatty Liver Disease/pathology , Liver/pathology , Leukocytes, Mononuclear , Phenotype , Fibrosis
15.
Anal Biochem ; 414(1): 23-30, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21352797

ABSTRACT

The optimization of DNA hybridization for genotyping assays is a complex experimental problem that depends on multiple factors such as assay formats, fluorescent probes, target sequence, experimental conditions, and data analysis. Quantum dot-doped particle bioconjugates have been previously described as fluorescent probes to identify single nucleotide polymorphisms even though this advanced fluorescent material has shown structural instability in aqueous environments. To achieve the optimization of DNA hybridization to quantum dot-doped particle bioconjugates in suspension while maximizing the stability of the probe materials, a nonsequential optimization approach was evaluated. The design of experiment with response surface methodology and multiple optimization response was used to maximize the recovery of fluorescent probe at the end of the assay simultaneously with the optimization of target-probe binding. Hybridization efficiency was evaluated by the attachment of fluorescent oligonucleotides to the fluorescent probe through continuous flow cytometry detection. Optimal conditions were predicted with the model and tested for the identification of single nucleotide polymorphisms. The design of experiment has been shown to significantly improve biochemistry and biotechnology optimization processes. Here we demonstrate the potential of this statistical approach to facilitate the optimization of experimental protocol that involves material science and molecular biology.


Subject(s)
DNA/genetics , Nucleic Acid Hybridization/methods , Quantum Dots , Analysis of Variance , Flow Cytometry/methods , Fluorescent Dyes/chemistry , Humans , Oligonucleotides/chemistry , Oligonucleotides/genetics , Polymorphism, Single Nucleotide , Sensitivity and Specificity
16.
Analyst ; 136(2): 359-64, 2011 Jan 21.
Article in English | MEDLINE | ID: mdl-20967397

ABSTRACT

Previous studies have indicated that volatile compounds specific to bladder cancer may exist in urine headspace, raising the possibility that headspace analysis could be used for diagnosis of this particular cancer. In this paper, we evaluate the use of a commercially available gas sensor array coupled with a specifically designed pattern recognition algorithm for this purpose. The best diagnostic performance that we were able to obtain with independent test data provided by healthy volunteers and bladder cancer patients was 70% overall accuracy (70% sensitivity and 70% specificity). When the data of patients suffering from other non-cancerous urological diseases were added to those of the healthy controls, the classification accuracy fell to 65% with 60% sensitivity and 67% specificity. While this is not sufficient for a diagnostic test, it is significantly better than random chance, leading us to conclude that there is useful information in the urine headspace but that a more informative analytical technique, such as mass spectrometry, is required if this is to be exploited fully.


Subject(s)
Biomarkers, Tumor/urine , Carcinoma, Transitional Cell/urine , Gases/urine , Urinalysis/instrumentation , Urinary Bladder Neoplasms/urine , Aged , Aged, 80 and over , Carcinoma, Transitional Cell/diagnosis , Female , Humans , Male , Middle Aged , Sensitivity and Specificity , Urinary Bladder Neoplasms/diagnosis
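
The abstract of entry 16 reports diagnostic performance as overall accuracy, sensitivity and specificity. For readers less familiar with these measures, the sketch below computes them from confusion-matrix counts. The function name and the example counts are hypothetical; the abstract gives only percentages, not the underlying numbers of patients.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and overall accuracy from confusion-matrix counts.

    tp: diseased correctly flagged    fn: diseased missed
    tn: healthy correctly cleared     fp: healthy wrongly flagged
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Hypothetical counts chosen only so that each metric comes out at 70%.
EXAMPLE = diagnostic_metrics(tp=7, fn=3, tn=7, fp=3)
```

Accuracy weights the two groups by their sizes, so adding a larger, harder control group can lower accuracy and specificity together, which is the pattern the abstract describes when non-cancerous urological cases are added to the controls.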
17.
Mol Cell Proteomics ; 8(4): 696-705, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19011259

ABSTRACT

Multiple reaction monitoring (MRM) of peptides uses tandem mass spectrometry to quantify selected proteins of interest, such as those previously identified in differential studies. Using this technique, the specificity of precursor to product transitions is harnessed for quantitative analysis of multiple proteins in a single sample. The design of transitions is critical for the success of MRM experiments, but predicting signal intensity of peptides and fragmentation patterns ab initio is challenging given existing methods. The tool presented here, MRMaid (pronounced "mermaid"), offers a novel alternative for rapid design of MRM transitions for the proteomics researcher. The program uses a combination of knowledge of the properties of optimal MRM transitions taken from expert practitioners and literature with MS/MS evidence derived from interrogation of a database of peptide identifications and their associated mass spectra. The tool also predicts retention time using a published model, allowing ordering of transition candidates. By exploiting available knowledge and resources to generate the most reliable transitions, this approach negates the need for theoretical prediction of fragmentation and the need to undertake prior "discovery" MS studies. MRMaid is a modular tool built around the Genome Annotating Proteomic Pipeline framework, providing a web-based solution with both descriptive and graphical visualizations of transitions. Predicted transition candidates are ranked based on a novel transition scoring system, and users may filter the results by selecting optional stringency criteria, such as omitting frequently modified residues, constraining the length of peptides, or omitting missed cleavages. Comparison with published transitions showed that MRMaid successfully predicted the peptide and product ion pairs in the majority of cases with appropriate retention time estimates. As the data content of the Genome Annotating Proteomic Pipeline repository increases, the coverage and reliability of MRMaid are set to increase further. MRMaid is freely available over the internet as an executable web-based service at www.mrmaid.info.
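The optional stringency criteria named in the abstract (omitting frequently modified residues, constraining peptide length, omitting missed cleavages) can be illustrated with a minimal Python sketch. This is not MRMaid's implementation: the `Candidate` class, the residue set, the length limits and the placeholder `score` field are all assumptions made for illustration, and the tool's actual transition scoring system is not described in the abstract.

```python
import re
from dataclasses import dataclass

# Hypothetical defaults for illustration only; not MRMaid's actual settings.
FREQUENTLY_MODIFIED = frozenset("MC")  # e.g. Met oxidation, Cys alkylation
MIN_LEN, MAX_LEN = 7, 20


@dataclass(frozen=True)
class Candidate:
    peptide: str      # amino-acid sequence, one-letter codes
    product_ion: str  # e.g. "y7"
    score: float      # placeholder score; higher is better


def has_missed_cleavage(peptide: str) -> bool:
    """True if a tryptic site (K or R followed by a residue other than P)
    occurs inside the peptide. A C-terminal K or R is not counted."""
    return re.search(r"[KR](?=[^P])", peptide) is not None


def filter_and_rank(candidates, min_len=MIN_LEN, max_len=MAX_LEN,
                    omit_modified=True, omit_missed=True):
    """Apply the stringency filters, then sort by descending score."""
    kept = []
    for c in candidates:
        if not (min_len <= len(c.peptide) <= max_len):
            continue  # peptide length constraint
        if omit_modified and FREQUENTLY_MODIFIED & set(c.peptide):
            continue  # contains a frequently modified residue
        if omit_missed and has_missed_cleavage(c.peptide):
            continue  # contains an internal tryptic site
        kept.append(c)
    return sorted(kept, key=lambda c: c.score, reverse=True)


candidates = [
    Candidate("LVNELTEFAK", "y7", 0.90),
    Candidate("AMK", "y2", 0.99),               # too short
    Candidate("SLHTLFGDELCK", "y8", 0.80),      # contains C
    Candidate("AEFVEVTKLVTDLTK", "y9", 0.95),   # internal K: missed cleavage
    Candidate("YLYEIAR", "y5", 0.70),
]
ranked = filter_and_rank(candidates)
```

Each filter is a keyword switch so that it stays optional, as the abstract describes; ranking happens only after filtering.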


Subject(s)
Computational Biology/methods , Internet , Mass Spectrometry/methods , Proteins/analysis , Software , Humans , Peptides/analysis , Time Factors
18.
Food Microbiol ; 28(4): 782-90, 2011 Jun.
Article in English | MEDLINE | ID: mdl-21511139

ABSTRACT

A series of partial least squares (PLS) models were employed to correlate spectral data from FTIR analysis with beef fillet spoilage during aerobic storage at different temperatures (0, 5, 10, 15, and 20 °C) using the dataset presented by Argyri et al. (2010). The performance of the PLS models was compared with a three-layer feed-forward artificial neural network (ANN) developed using the same dataset. FTIR spectra were collected from the surface of meat samples in parallel with microbiological analyses to enumerate total viable counts. Sensory evaluation was based on a three-point hedonic scale classifying meat samples as fresh, semi-fresh, and spoiled. The purpose of the modelling approach employed in this work was to classify beef samples in the respective quality class as well as to predict their total viable counts directly from FTIR spectra. The results obtained demonstrated that both approaches showed good performance in discriminating meat samples in one of the three predefined sensory classes. The PLS classification models showed performances ranging from 72.0 to 98.2% using the training dataset, and from 63.1 to 94.7% using an independent testing dataset. The ANN classification model performed equally well in discriminating meat samples, with correct classification rates from 98.2 to 100% and 63.1 to 73.7% in the train and test sessions, respectively. PLS and ANN approaches were also applied to create models for the prediction of microbial counts. The performance of these was based on graphical plots and statistical indices (bias factor, accuracy factor, root mean square error). Furthermore, results demonstrated reasonably good correlation of total viable counts on meat surface with FTIR spectral data, with PLS models presenting better performance indices compared to ANN.
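The three statistical indices named in the abstract can be computed as in the Python sketch below. It assumes the forms commonly used in predictive microbiology, where the bias factor is 10 raised to the mean of log10(predicted/observed) and the accuracy factor uses the absolute value of that log ratio. The abstract does not state which exact formulas the authors applied, so treat these definitions and the function names as assumptions.

```python
import math


def bias_factor(observed, predicted):
    """10 ** mean(log10(pred / obs)); 1.0 means no systematic over- or under-prediction."""
    n = len(observed)
    return 10 ** (sum(math.log10(p / o) for o, p in zip(observed, predicted)) / n)


def accuracy_factor(observed, predicted):
    """10 ** mean(|log10(pred / obs)|); 1.0 means perfect agreement, always >= 1."""
    n = len(observed)
    return 10 ** (sum(abs(math.log10(p / o)) for o, p in zip(observed, predicted)) / n)


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)


# Made-up values for illustration (not data from the study):
# one two-fold over-prediction and one two-fold under-prediction.
observed = [2.0, 4.0]
predicted = [4.0, 2.0]
bf = bias_factor(observed, predicted)
af = accuracy_factor(observed, predicted)
err = rmse(observed, predicted)
```

In this toy case the two errors cancel in the bias factor but not in the accuracy factor, which is why the two indices are usually reported together.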


Subject(s)
Food Microbiology/methods , Least-Squares Analysis , Meat/microbiology , Models, Biological , Neural Networks, Computer , Animals , Cattle , Colony Count, Microbial , Regression Analysis , Spectroscopy, Fourier Transform Infrared
19.
J AOAC Int ; 94(4): 1026-33, 2011.
Article in English | MEDLINE | ID: mdl-21919335

ABSTRACT

Allergen detection and quantification is an essential part of allergen management as practiced by food manufacturers. Recently, protein MS methods (in particular, multiple reaction monitoring experiments) have begun to be adopted by the allergen detection community to provide an alternative technique to ELISA and PCR methods. MS analysis of proteins in foods provides additional challenges to the analyst, both in terms of experimental design and methodology: (1) choice of analyte, including multiplexing to simultaneously detect several biologically relevant molecules able to trigger allergic reactions; (2) choice of processing-stable peptide markers for different target analytes that should be placed in publicly available databases; (3) markers allowing quantification (e.g., through standard addition or isotopically labeled peptide standards); (4) optimization of protease digestion protocols to ensure reproducible and robust method development; and (5) effective validation of methods and harmonization of results through the use of naturally incurred reference materials spanning several types of food matrix.


Subject(s)
Allergens/analysis , Food Analysis/methods , Mass Spectrometry/methods , Chromatography, High Pressure Liquid/methods , Humans , Reproducibility of Results
20.
BMJ Open ; 11(11): e056601, 2021 11 05.
Article in English | MEDLINE | ID: mdl-34740937

ABSTRACT

OBJECTIVES: Online health forums provide rich and untapped real-time data on population health. Through novel data extraction and natural language processing (NLP) techniques, we characterise the evolution of mental and physical health concerns relating to the COVID-19 pandemic among online health forum users. SETTING AND DESIGN: We obtained data from three leading online health forums: HealthBoards, Inspire and HealthUnlocked, from the period 1 January 2020 to 31 May 2020. Using NLP, we analysed the content of posts related to COVID-19. PRIMARY OUTCOME MEASURES: (1) Proportion of forum posts containing COVID-19 keywords; (2) proportion of forum users making their very first post about COVID-19; (3) proportion of COVID-19-related posts containing content related to physical and mental health comorbidities. RESULTS: Data from 739 434 posts created by 53 134 unique users were analysed. A total of 35 581 posts (4.8%) contained a COVID-19 keyword. Posts discussing COVID-19 and related comorbid disorders spiked in early March to mid-March around the time of global implementation of lockdowns, prompting a large number of users to post on online health forums for the first time. Over a quarter of COVID-19-related thread titles mentioned a physical or mental health comorbidity. CONCLUSIONS: We demonstrate that it is feasible to characterise the content of online health forum user posts regarding COVID-19 and measure changes over time. The pandemic and corresponding public response have had a significant impact on posters' queries regarding mental health. Social media data sources such as online health forums can be harnessed to strengthen population-level mental health surveillance.
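The first outcome measure, the proportion of posts containing a COVID-19 keyword, can be sketched in Python as follows. The keyword list and the function name are hypothetical, since the abstract does not give the study's keyword list or matching rules. The last lines only re-check the arithmetic behind the reported 4.8% from the two counts stated in the abstract.

```python
import re

# Hypothetical keyword list; the study's actual list is not given in the abstract.
COVID_KEYWORDS = ("covid", "coronavirus", "sars-cov-2")
_PATTERN = re.compile("|".join(re.escape(k) for k in COVID_KEYWORDS), re.IGNORECASE)


def keyword_proportion(posts):
    """Fraction of posts whose text contains at least one keyword (case-insensitive)."""
    posts = list(posts)
    if not posts:
        return 0.0
    hits = sum(1 for text in posts if _PATTERN.search(text))
    return hits / len(posts)


# Made-up example posts, for illustration only.
sample_posts = [
    "Worried about COVID and my asthma",
    "Knee pain after running",
    "coronavirus anxiety is keeping me awake",
    "Trouble sleeping lately",
]
sample_share = keyword_proportion(sample_posts)

# Arithmetic check of the figure reported in the abstract.
reported_share_percent = round(100 * 35581 / 739434, 1)
```

Substring matching is the simplest choice and will also match longer tokens such as "covid19"; a real analysis would likely need a curated keyword list and tokenisation.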


Subject(s)
COVID-19 , Social Media , Communicable Disease Control , Humans , Natural Language Processing , Pandemics , SARS-CoV-2