1.
PLoS One ; 15(1): e0226770, 2020.
Article in English | MEDLINE | ID: mdl-31945070

ABSTRACT

Despite the increasing importance of non-targeted metabolomics to answer various life science questions, extracting biochemically relevant information from metabolomics spectral data is still an incompletely solved problem. Most computational tools to identify tandem mass spectra focus on a limited set of molecules of interest. However, such tools are typically constrained by the availability of reference spectra or molecular databases, limiting their applicability for generating structural hypotheses for unknown metabolites. In contrast, recent advances in the field illustrate that the underlying biochemistry can be exposed without relying on metabolite identification, in particular via substructure prediction. We describe an automated method for substructure recommendation motivated by association rule mining. Our framework captures potential relationships between spectral features and substructures learned from public spectral libraries. These associations are used to recommend substructures for any unknown mass spectrum. Our method does not require any predefined metabolite candidates, and therefore it can be used for hypothesis generation or partial identification of unknown unknowns. The method is called MESSAR (MEtabolite SubStructure Auto-Recommender) and is implemented in a free online web service available at messar.biodatamining.be.
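
The idea of learning feature-to-substructure associations and then recommending substructures can be illustrated with a minimal sketch. This is not the MESSAR implementation: the toy library, the feature labels (such as `loss_18.01`), the substructure names, and the thresholds are all hypothetical, and only single-feature rules scored by plain support and confidence are considered.

```python
from collections import Counter
from itertools import product

# Hypothetical toy library: each entry pairs the spectral features of a
# reference spectrum with the substructures of its known molecule.
library = [
    ({"loss_18.01", "frag_91.05"}, {"hydroxyl", "benzyl"}),
    ({"loss_18.01", "frag_77.04"}, {"hydroxyl", "phenyl"}),
    ({"frag_91.05"}, {"benzyl"}),
    ({"loss_18.01"}, {"hydroxyl"}),
]


def mine_rules(library, min_support=0.25, min_confidence=0.6):
    """Return {(feature, substructure): (support, confidence)} for rules
    of the form feature -> substructure that pass both thresholds."""
    n = len(library)
    feature_counts = Counter()
    pair_counts = Counter()
    for features, substructures in library:
        feature_counts.update(features)
        pair_counts.update(product(features, substructures))
    rules = {}
    for (feature, sub), count in pair_counts.items():
        support = count / n                          # fraction of entries with both
        confidence = count / feature_counts[feature]  # P(substructure | feature)
        if support >= min_support and confidence >= min_confidence:
            rules[(feature, sub)] = (support, confidence)
    return rules


def recommend(rules, spectrum_features):
    """Rank substructures for an unknown spectrum by the best confidence
    among the rules whose feature is present in that spectrum."""
    scores = {}
    for (feature, sub), (_, confidence) in rules.items():
        if feature in spectrum_features:
            scores[sub] = max(scores.get(sub, 0.0), confidence)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


rules = mine_rules(library)
ranking = recommend(rules, {"loss_18.01", "frag_91.05"})
```

No candidate molecules are needed at recommendation time: the unknown spectrum is matched only against the mined rules, which mirrors the property the abstract emphasizes.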


Subjects
Biological Products/analysis , Databases, Factual , Metabolome , Pharmaceutical Preparations/analysis , Tandem Mass Spectrometry/methods , Automation , Humans
2.
BioData Min ; 11: 20, 2018.
Article in English | MEDLINE | ID: mdl-30202444

ABSTRACT

Searching for interesting common subgraphs in graph data is a well-studied problem in data mining. Subgraph mining techniques focus on the discovery of patterns in graphs that exhibit a specific network structure that is deemed interesting within these data sets. The definition of which subgraphs are interesting and which are not is highly dependent on the application. These techniques have seen numerous applications and are able to tackle a range of biological research questions, spanning from the detection of common substructures in sets of biomolecular compounds, to the discovery of network motifs in large-scale molecular interaction networks. Thus far, information about the bioinformatics application of subgraph mining remains scattered over heterogeneous literature. In this review, we provide an introduction to subgraph mining for life scientists. We give an overview of various subgraph mining algorithms from a bioinformatics perspective and present several of their potential biomedical applications.

3.
Immunogenetics ; 70(3): 159-168, 2018 03.
Article in English | MEDLINE | ID: mdl-28779185

ABSTRACT

Current T cell epitope prediction tools are a valuable resource in designing targeted immunogenicity experiments. They typically focus on, and can accurately predict, peptide binding and presentation by major histocompatibility complex (MHC) molecules on the surface of antigen-presenting cells. However, recognition of the peptide-MHC complex by a T cell receptor (TCR) is often not included in these tools. We developed a classification approach based on random forest classifiers to predict recognition of a peptide by a T cell receptor and discover patterns that contribute to recognition. We considered two approaches to solve this problem: (1) distinguishing between two sets of TCRs that each bind to a known peptide and (2) retrieving TCRs that bind to a given peptide from a large pool of TCRs. Evaluation of the models on two HIV-1, B*08-restricted epitopes reveals good performance and hints towards structural CDR3 features that can determine peptide immunogenicity. These results are of particular importance as they show that prediction of T cell epitopes and of T cell epitope recognition based on sequence data is a feasible approach. In addition, the validity of our models not only serves as a proof of concept for the prediction of immunogenic T cell epitopes but also paves the way for more general and high-performing models.


Subjects
Epitopes, T-Lymphocyte/immunology , HIV-1/immunology , Peptides/immunology , Receptors, Antigen, T-Cell/immunology , Amino Acid Sequence/genetics , Antigen Presentation/immunology , Antigen-Presenting Cells/immunology , CD8-Positive T-Lymphocytes/immunology , HIV-1/isolation & purification , Humans , Major Histocompatibility Complex/immunology , Protein Binding/immunology
4.
Rapid Commun Mass Spectrom ; 31(17): 1396-1404, 2017 Sep 15.
Article in English | MEDLINE | ID: mdl-28569011

ABSTRACT

RATIONALE: Using mass spectrometry, the analysis of known metabolite structures has become feasible in a systematic high-throughput fashion. Nevertheless, the identification of previously unknown structures remains challenging, partially because many unidentified variants originate from known molecules that underwent unexpected modifications. Here, we present a method for the discovery of unknown metabolite modifications and conjugate metabolite isoforms in a high-throughput fashion. METHODS: The method is based on user-controlled in-source fragmentation which is used to induce loss of weakly bound modifications. This is followed by the comparison of product ions from in-source fragmentation and collision-induced dissociation (CID). Diagonal MS2-MS3 matching allows the detection of unknown metabolite modifications, as well as substructure similarities. As the method relies heavily on the advantages of in-source fragmentation and its ability to 'magically' elucidate unknown modifications, we have named it inSourcerer as a portmanteau of in-source and sorcerer. RESULTS: The method was evaluated using a set of 15 different cytokinin standards. Product ions from in-source fragmentation and CID were compared. Hierarchical clustering revealed that good matches are due to the presence of common substructures. Plant leaf extract, spiked with a mix of all 15 standards, was used to demonstrate the method's ability to detect these standards in a complex mixture, as well as confidently identify compounds already present in the plant material. CONCLUSIONS: Here we present a method that incorporates a classic liquid chromatography/mass spectrometry (LC/MS) workflow with fragmentation models and computational algorithms. The assumptions upon which the concept of the method was built were shown to be valid and the method showed that in-source fragmentation can be used to pinpoint structural similarities and indicate the occurrence of a modification.


Subjects
High-Throughput Screening Assays/methods , Mass Spectrometry/methods , Models, Chemical , Computational Biology , Cytokinins/analysis , Cytokinins/chemistry , High-Throughput Screening Assays/standards , Mass Spectrometry/standards , Metabolome , Plant Extracts/chemistry , Plant Leaves/chemistry
5.
Proteome Sci ; 12(1): 54, 2014.
Article in English | MEDLINE | ID: mdl-25429250

ABSTRACT

BACKGROUND: Mass spectrometry-based proteomics experiments generate spectra that are rich in information. Often only a fraction of this information is used for peptide/protein identification, whereas a significant proportion of the peaks in a spectrum remain unexplained. In this paper we explore how a specific class of data mining techniques termed "frequent itemset mining" can be employed to discover patterns in the unassigned data, and how such patterns can help us interpret the origin of the unexpected/unexplained peaks. RESULTS: First a model is proposed that describes the origin of the observed peaks in a mass spectrum. For this purpose we use the classical correlative database search algorithm. Peaks that support a positive identification of the spectrum are termed explained peaks. Next, frequent itemset mining techniques are introduced to infer which unexplained peaks are associated in a spectrum. The method is validated on two types of experimental proteomic data. First, peptide mass fingerprint data is analyzed to explain the unassigned peaks in a full scan mass spectrum. Interestingly, a large number of experimental spectra reveals several highly frequent unexplained masses, and pattern mining on these frequent masses demonstrates that subsets of these peaks frequently co-occur. Further evaluation shows that several of these co-occurring peaks indeed have a known common origin, and other patterns are promising hypothesis generators for further analysis. Second, the proposed methodology is validated on tandem mass spectrometry data using a public spectral library, where associations within the mass differences of unassigned peaks and peptide modifications are explored. The investigation of the found patterns illustrates that meaningful patterns can be discovered that can be explained by features of the employed technology and found modifications.
CONCLUSIONS: This simple approach offers opportunities to monitor accumulating unexplained mass spectrometry data for emerging new patterns, with possible applications for the development of mass exclusion lists, for the refinement of quality control strategies and for a further interpretation of unexplained spectral peaks in mass spectrometry and tandem mass spectrometry.
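
As an illustration of frequent itemset mining on unexplained peaks, the following is a minimal Apriori-style sketch. It is not the authors' pipeline: the peak masses, the support threshold, and the function name `frequent_itemsets` are hypothetical, and each spectrum is reduced to a set of already binned unexplained masses.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: the binned masses of unexplained peaks in each spectrum.
spectra = [
    {842.5, 1045.6, 2211.1},
    {842.5, 2211.1},
    {842.5, 1045.6, 2211.1, 1475.8},
    {1045.6, 1475.8},
]


def frequent_itemsets(transactions, min_support=0.5, max_size=3):
    """Return {frozenset_of_masses: support} for every set of masses that
    co-occurs in at least min_support of the spectra (Apriori-style search)."""
    n = len(transactions)
    result = {}
    # Level 1 candidates: every individual mass that appears anywhere.
    candidates = {frozenset([item]) for t in transactions for item in t}
    size = 1
    while candidates and size <= max_size:
        counts = Counter()
        for t in transactions:
            for c in candidates:
                if c <= t:  # candidate is contained in this spectrum
                    counts[c] += 1
        frequent = {c: k / n for c, k in counts.items() if k / n >= min_support}
        result.update(frequent)
        # Build candidates one item larger; keep only those whose smaller
        # subsets are all frequent (the Apriori pruning step).
        size += 1
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == size and all(
                frozenset(s) in frequent for s in combinations(union, size - 1)
            ):
                candidates.add(union)
    return result


itemsets = frequent_itemsets(spectra)
```

Itemsets with more than one mass are the co-occurrence patterns of interest; in practice they would still need manual or database-driven interpretation to decide whether the peaks share a common origin.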
