1.
Eur Child Adolesc Psychiatry ; 26(11): 1309-1317, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28455596

ABSTRACT

Psychiatric disorders are amongst the most prevalent and impairing conditions in childhood and adolescence. Unfortunately, it is well known that general practitioners (GPs) and other frontline health providers (i.e., child protection workers, public health nurses, and pediatricians) are not adequately trained to address these ubiquitous problems (Braddick et al. Child and Adolescent mental health in Europe: infrastructures, policy and programmes, European Communities, 2009; Levav et al. Eur Child Adolesc Psychiatry 13:395-401, 2004). Advances in technology may offer a solution to this problem with clinical decision support systems (CDSS) that are designed to help professionals make sound clinical decisions in real time. This paper offers a systematic review of currently available CDSS for child and adolescent mental health disorders prepared according to the PRISMA-Protocols (Preferred Reporting Items for Systematic Reviews and Meta-Analyses Protocols). Applying strict eligibility criteria, the identified studies (n = 5048) were screened. Ten studies, describing eight original clinical decision support systems for child and adolescent psychiatric disorders, fulfilled inclusion criteria. Based on this systematic review, there appears to be a need for a new, readily available CDSS for child neuropsychiatric disorder which promotes evidence-based, best practices, while enabling consideration of national variation in practices by leveraging data-reuse to generate predictions regarding treatment outcome, addressing a broader cluster of clinical disorders, and targeting frontline practice environments.


Subjects
Adolescent Psychiatry/standards , Child Psychiatry/standards , Decision Support Systems, Clinical/standards , Adolescent , Child , Humans
2.
BMC Bioinformatics ; 17: 155, 2016 Apr 08.
Article in English | MEDLINE | ID: mdl-27059896

ABSTRACT

BACKGROUND: Understanding the interactions between antibodies and the linear epitopes that they recognize is an important task in the study of immunological diseases. We present a novel computational method for the design of linear epitopes of specified binding affinity to Intravenous Immunoglobulin (IVIg). RESULTS: We show that the method, called Pythia-design, can accurately design peptides with both high binding affinity and low binding affinity to IVIg. To show this, we experimentally constructed and tested the computationally constructed designs. We further show experimentally that these designed peptides are more accurate than those produced by a recent method for the same task. Pythia-design is based on combining random walks with an ensemble of probabilistic support vector machine (SVM) classifiers, and we show that it produces a diverse set of designed peptides, an important property for developing robust sets of candidates for construction. We show that by combining Pythia-design and the method of (PLoS ONE 6(8):e23616, 2011), we are able to produce an even more accurate collection of designed peptides. Analysis of the experimental validation of Pythia-design peptides indicates that binding of IVIg is favored by epitopes that contain tryptophan and cysteine. CONCLUSIONS: Our method, Pythia-design, is able to generate a diverse set of binding and non-binding peptides, and its designs have been experimentally shown to be accurate.


Subjects
Computational Biology/methods , Epitopes/chemistry , Immunoglobulins, Intravenous/chemistry , Peptides, Cyclic/chemistry , Citrulline/chemistry , Cysteine/chemistry , Humans , Models, Molecular , Reproducibility of Results , Support Vector Machine , Tryptophan/chemistry
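
The abstract of entry 2 mentions random walks guided by classifiers. As a loose illustration only, the sketch below walks through peptide sequence space by mutating one residue per step and keeping the mutant when a scoring function does not decrease. It is not the Pythia-design algorithm: `toy_score` is a hypothetical stand-in for a classifier ensemble (it simply rewards tryptophan and cysteine), and all names and parameters are assumptions.

```python
import random
from typing import Callable

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_score(peptide: str) -> float:
    """Hypothetical scorer: fraction of residues that are W or C.

    A real system would use a trained classifier here; this is a placeholder.
    """
    return (peptide.count("W") + peptide.count("C")) / len(peptide)


def random_walk(start: str, score: Callable[[str], float],
                steps: int, rng: random.Random) -> str:
    """Propose one random point mutation per step; accept non-worsening moves."""
    current = start
    for _ in range(steps):
        pos = rng.randrange(len(current))
        mutant = current[:pos] + rng.choice(AMINO_ACIDS) + current[pos + 1:]
        if score(mutant) >= score(current):
            current = mutant
    return current


start = "AAAAAAAA"
designed = random_walk(start, toy_score, steps=200, rng=random.Random(0))
print(designed, toy_score(designed))
```

Accepting equal-score moves lets the walk drift across plateaus, which is one simple way to obtain diverse candidates from different seeds.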
3.
Diabetologia ; 58(6): 1363-71, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25740695

ABSTRACT

AIMS/HYPOTHESIS: We selected the most informative protein biomarkers for the prediction of incident cardiovascular disease (CVD) in people with type 2 diabetes. METHODS: In this nested case-control study we measured 42 candidate CVD biomarkers in 1,123 incident CVD cases and 1,187 controls with type 2 diabetes selected from five European centres. Combinations of biomarkers were selected using cross-validated logistic regression models. Model prediction was assessed using the area under the receiver operating characteristic curve (AUROC). RESULTS: Sixteen biomarkers showed univariate associations with incident CVD. The most predictive subset selected by forward selection methods contained six biomarkers: N-terminal pro-B-type natriuretic peptide (OR 1.69 per 1 SD, 95% CI 1.47, 1.95), high-sensitivity troponin T (OR 1.29, 95% CI 1.11, 1.51), IL-6 (OR 1.13, 95% CI 1.02, 1.25), IL-15 (OR 1.15, 95% CI 1.01, 1.31), apolipoprotein C-III (OR 0.79, 95% CI 0.70, 0.88) and soluble receptor for AGE (OR 0.84, 95% CI 0.76, 0.94). The prediction of CVD beyond clinical covariates improved from an AUROC of 0.66 to 0.72 (AUROC for Framingham Risk Score covariates 0.59). In addition to the biomarkers, the most important clinical covariates for improving prediction beyond the Framingham covariates were estimated GFR, insulin therapy and HbA1c. CONCLUSIONS/INTERPRETATION: We identified six protein biomarkers that in combination with clinical covariates improved the prediction of our model beyond the Framingham Score covariates. Biomarkers can contribute to improved prediction of CVD in diabetes but clinical data including measures of renal function and diabetes-specific factors not included in the Framingham Risk Score are also needed.


Subjects
Biomarkers/blood , Cardiovascular Diseases/complications , Diabetes Mellitus, Type 2/complications , Aged , Apolipoprotein C-III/blood , Area Under Curve , Cardiovascular Diseases/diagnosis , Case-Control Studies , Diabetes Complications , Diabetes Mellitus, Type 2/diagnosis , Europe , Female , Glomerular Filtration Rate , Glycated Hemoglobin/metabolism , Humans , Insulin/therapeutic use , Interleukin-15/blood , Interleukin-6/blood , Logistic Models , Male , Middle Aged , Natriuretic Peptide, Brain/blood , Peptide Fragments/blood , ROC Curve , Risk Factors , Troponin T/blood
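
The abstract of entry 3 assesses prediction with the area under the receiver operating characteristic curve (AUROC). The sketch below shows one standard way to compute that metric, as the fraction of case/control pairs in which the case receives the higher predicted risk. The function name `auroc` and the example scores are invented for illustration and are not data from the study.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison (equivalent to the Mann-Whitney U statistic).

    It is the probability that a randomly chosen case (label 1) scores higher
    than a randomly chosen control (label 0); ties count as one half.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Made-up predicted risks for three cases (1) and three controls (0).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]
print(auroc(labels, scores))  # 8 of the 9 case/control pairs are ordered correctly
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.59, 0.66 and 0.72 should be read.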
4.
J Biomed Inform ; 57: 369-76, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26325295

ABSTRACT

The increasing prevalence of diabetes and its related complications is raising the need for effective methods to predict patient evolution and for stratifying cohorts in terms of risk of developing diabetes-related complications. In this paper, we present a novel approach to the simulation of a type 1 diabetes population, based on Dynamic Bayesian Networks, which combines literature knowledge with data mining of a rich longitudinal cohort of type 1 diabetes patients, the DCCT/EDIC study. In particular, in our approach we simulate the patient health state and complications through discretized variables. Two types of models are presented, one entirely learned from the data and the other partially driven by literature derived knowledge. The whole cohort is simulated for fifteen years, and the simulation error (i.e. for each variable, the percentage of patients predicted in the wrong state) is calculated every year on independent test data. For each variable, the population predicted in the wrong state is below 10% on both models over time. Furthermore, the distributions of real vs. simulated patients greatly overlap. Thus, the proposed models are viable tools to support decision making in type 1 diabetes.


Subjects
Bayes Theorem , Computer Simulation , Data Mining , Diabetes Complications , Diabetes Mellitus, Type 1 , Humans
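
The abstract of entry 4 defines the simulation error as, for each variable, the percentage of patients predicted in the wrong state. A minimal sketch of that calculation follows; the variable names and state labels are hypothetical, and the function assumes each variable has at least one patient.

```python
from typing import Dict, List


def simulation_error(real: Dict[str, List[str]],
                     simulated: Dict[str, List[str]]) -> Dict[str, float]:
    """Per-variable percentage of patients whose simulated state differs
    from the observed state (lists are aligned patient by patient)."""
    errors = {}
    for variable, observed in real.items():
        predicted = simulated[variable]
        if len(predicted) != len(observed):
            raise ValueError(f"length mismatch for {variable}")
        wrong = sum(o != p for o, p in zip(observed, predicted))
        errors[variable] = 100.0 * wrong / len(observed)
    return errors


# Hypothetical discretized states for four patients at one time point.
real = {"hba1c": ["low", "high", "high", "low"],
        "retinopathy": ["no", "no", "yes", "no"]}
simulated = {"hba1c": ["low", "high", "low", "low"],
             "retinopathy": ["no", "no", "yes", "no"]}
errors = simulation_error(real, simulated)
print(errors)
```

Repeating the calculation for each simulated year would give the error-over-time profile that the abstract summarizes as staying below 10%.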
5.
Diabetologia ; 57(8): 1611-22, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24871321

ABSTRACT

AIMS/HYPOTHESIS: Diabetic nephropathy is a major diabetic complication, and diabetes is the leading cause of end-stage renal disease (ESRD). Family studies suggest a hereditary component for diabetic nephropathy. However, only a few genes have been associated with diabetic nephropathy or ESRD in diabetic patients. Our aim was to detect novel genetic variants associated with diabetic nephropathy and ESRD. METHODS: We exploited a novel algorithm, 'Bag of Naive Bayes', whose marker selection strategy is complementary to that of conventional genome-wide association models based on univariate association tests. The analysis was performed on a genome-wide association study of 3,464 patients with type 1 diabetes from the Finnish Diabetic Nephropathy (FinnDiane) Study and subsequently replicated with 4,263 type 1 diabetes patients from the Steno Diabetes Centre, the All Ireland-Warren 3-Genetics of Kidneys in Diabetes UK collection (UK-Republic of Ireland) and the Genetics of Kidneys in Diabetes US Study (GoKinD US). RESULTS: Five genetic loci (WNT4/ZBTB40-rs12137135, RGMA/MCTP2-rs17709344, MAPRE1P2-rs1670754, SEMA6D/SLC24A5-rs12917114 and SIK1-rs2838302) were associated with ESRD in the FinnDiane study. An association between ESRD and rs17709344, tagging the previously identified rs12437854 and located between the RGMA and MCTP2 genes, was replicated in independent case-control cohorts. rs12917114 near SEMA6D was associated with ESRD in the replication cohorts under the genotypic model (p < 0.05), and rs12137135 upstream of WNT4 was associated with ESRD in Steno. CONCLUSIONS/INTERPRETATION: This study supports the previously identified findings on the RGMA/MCTP2 region and suggests novel susceptibility loci for ESRD. This highlights the importance of applying complementary statistical methods to detect novel genetic variants in diabetic nephropathy and, in general, in complex diseases.


Subjects
Diabetic Nephropathies/genetics , Genetic Loci , Genetic Predisposition to Disease , Kidney Failure, Chronic/genetics , Adult , Bayes Theorem , Female , Genome-Wide Association Study , Humans , Male , Middle Aged , Polymorphism, Single Nucleotide , White People/genetics
6.
BMC Bioinformatics ; 13 Suppl 14: S6, 2012.
Article in English | MEDLINE | ID: mdl-23095471

ABSTRACT

BACKGROUND: Genome Wide Association Studies represent powerful approaches that aim at disentangling the genetic and molecular mechanisms underlying complex traits. The usual "one-SNP-at-a-time" testing strategy cannot capture the multi-factorial nature of this kind of disorder. We propose a Hierarchical Naïve Bayes classification model for taking into account associations in SNP data characterized by Linkage Disequilibrium. Validation shows that our model reaches classification performance superior to that obtained by the standard Naïve Bayes classifier for simulated and real datasets. METHODS: In the Hierarchical Naïve Bayes model implemented, the SNPs mapping to the same region of Linkage Disequilibrium are considered as "details" or "replicates" of the locus, each contributing to the overall effect of the region on the phenotype. A latent variable for each block, which models the "population" of correlated SNPs, can then be used to summarize the available information. The classification is thus performed relying on the conditional probability distributions of the latent variables and on the SNP data available. RESULTS: The developed methodology has been tested on simulated datasets, each composed of 300 cases, 300 controls and a variable number of SNPs. Our approach has also been applied to two real datasets on the genetic bases of Type 1 Diabetes and Type 2 Diabetes generated by the Wellcome Trust Case Control Consortium. CONCLUSIONS: The approach proposed in this paper, called Hierarchical Naïve Bayes, allows the classification of examples for which genetic information on structurally correlated SNPs is available. It improves on Naïve Bayes performance by properly handling the within-loci variability.


Subjects
Bayes Theorem , Diabetes Mellitus, Type 1/genetics , Diabetes Mellitus, Type 2/genetics , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Case-Control Studies , Computer Simulation , Humans , Linkage Disequilibrium , Models, Genetic
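
The abstract of entry 6 uses the standard Naïve Bayes classifier as its baseline. For orientation, the sketch below implements that baseline only (not the hierarchical model) on genotypes coded as minor-allele counts 0, 1 or 2, with Laplace smoothing. The coding scheme, function names and toy data are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Sequence

GENOTYPES = (0, 1, 2)  # assumed coding: number of minor alleles at a SNP


def train_naive_bayes(X: Sequence[Sequence[int]], y: Sequence[int]):
    """Estimate log class priors and per-SNP genotype log-likelihoods."""
    class_counts = Counter(y)
    n_snps = len(X[0])
    # counts[c][j][g] = number of class-c samples with genotype g at SNP j
    counts = {c: [Counter() for _ in range(n_snps)] for c in class_counts}
    for row, label in zip(X, y):
        for j, g in enumerate(row):
            counts[label][j][g] += 1
    log_prior = {c: math.log(n / len(y)) for c, n in class_counts.items()}
    log_lik = {
        c: [
            # Laplace smoothing: add one pseudo-count per possible genotype
            {g: math.log((counts[c][j][g] + 1) / (class_counts[c] + len(GENOTYPES)))
             for g in GENOTYPES}
            for j in range(n_snps)
        ]
        for c in class_counts
    }
    return log_prior, log_lik


def predict(model, row: Sequence[int]) -> int:
    """Return the class with the highest posterior under SNP independence."""
    log_prior, log_lik = model
    scores = {
        c: log_prior[c] + sum(log_lik[c][j][g] for j, g in enumerate(row))
        for c in log_prior
    }
    return max(scores, key=scores.get)


# Toy data: three cases (1) and three controls (0), two SNPs each.
X = [[2, 2], [2, 1], [1, 2], [0, 0], [0, 1], [1, 0]]
y = [1, 1, 1, 0, 0, 0]
model = train_naive_bayes(X, y)
print(predict(model, [2, 2]), predict(model, [0, 0]))
```

Because this baseline treats every SNP as independent given the class, SNPs in strong linkage disequilibrium are effectively counted several times, which is the limitation the hierarchical model is described as addressing.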
7.
BMC Evol Biol ; 11: 159, 2011 Jun 10.
Article in English | MEDLINE | ID: mdl-21663612

ABSTRACT

BACKGROUND: We have recently discovered that the two tryptophans of human β2-microglobulin have distinctive roles within the structure and function of the protein. Deeply buried in the core, Trp95 is essential for folding stability, whereas Trp60, which is solvent-exposed, plays a crucial role in promoting the binding of β2-microglobulin to the heavy chain of the class I major histocompatibility complex (MHCI). We have previously shown that the thermodynamic disadvantage of having Trp60 exposed on the surface is counter-balanced by the perfect fit between it and a cavity within the MHCI heavy chain that contributes significantly to the functional stabilization of the MHCI. Therefore, based on the peculiar differences of the two tryptophans, we have analysed the evolution of β2-microglobulin with respect to these residues. RESULTS: Having defined the β2-microglobulin protein family, we performed multiple sequence alignments and analysed the residue conservation in homologous proteins to generate a phylogenetic tree. Our results indicate that Trp60 is highly conserved, whereas some species have a Leu in position 95; the replacement of Trp95 with Leu destabilizes β2-microglobulin by 1 kcal/mol and accelerates the kinetics of unfolding. Both thermodynamic and kinetic data fit with the crystallographic structure of the Trp95Leu variant, which shows how the hydrophobic cavity of the wild-type protein is completely occupied by Trp95, but is only half filled by Leu95. CONCLUSIONS: We have established that the functional Trp60 has been present within the sequence of β2-microglobulin since the evolutionary appearance of proteins responsible for acquired immunity, whereas the structural Trp95 was selected and stabilized, most likely, for its capacity to fully occupy an internal cavity of the protein thereby creating a better stabilization of its folded state.


Subjects
Phylogeny , Tryptophan/genetics , Tryptophan/metabolism , beta 2-Microglobulin/genetics , beta 2-Microglobulin/metabolism , Amino Acid Sequence , Amyloid/metabolism , Animals , Crystallography, X-Ray , Humans , Models, Molecular , Molecular Sequence Data , Protein Conformation , Protein Folding , Sequence Alignment , Tryptophan/chemistry , beta 2-Microglobulin/chemistry
8.
Stud Health Technol Inform ; 281: 506-507, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042623

ABSTRACT

The i2b2 data warehouse could be a useful tool to support the enrollment phase of clinical studies. The aim of this work is to evaluate its performance on two clinical trials. We also developed an i2b2 extension to help suggest eligible patients for a study. The work showed good results in terms of the ability to implement inclusion/exclusion criteria, and also in terms of the number of identified patients who were actually enrolled and the high number of patients suggested as potentially enrollable.


Subjects
Data Warehousing , Information Storage and Retrieval , Humans
9.
BMC Bioinformatics ; 11: 518, 2010 Oct 16.
Article in English | MEDLINE | ID: mdl-20950483

ABSTRACT

BACKGROUND: Mass spectrometry is an essential technique in proteomics both to identify the proteins of a biological sample and to compare proteomic profiles of different samples. In both cases, the main phase of the data analysis is the procedure to extract the significant features from a mass spectrum. Its final output is the so-called peak list, which contains the mass, the charge and the intensity of every detected biomolecule. The main steps of the peak list extraction procedure are usually preprocessing, peak detection, peak selection, charge determination and monoisotoping. RESULTS: This paper describes an original algorithm for peak list extraction from low- and high-resolution mass spectra. It has been developed principally to improve the precision of peak extraction in comparison to other reference algorithms. It contains many innovative features, including a sophisticated method for managing overlapping isotopic distributions. CONCLUSIONS: The performance of the basic version of the algorithm and of its optional functionalities has been evaluated in this paper on SELDI-TOF, MALDI-TOF and ESI-FTICR ECD mass spectra. Executable files of MassSpec, a MATLAB implementation of the peak list extraction procedure for Windows and Linux systems, can be downloaded free of charge for nonprofit institutions from the following web site: http://aimed11.unipv.it/MassSpec.


Subjects
Mass Spectrometry/methods , Proteins/chemistry , Proteomics/methods , Algorithms , Databases, Protein
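
The abstract of entry 9 lists peak detection as one step of peak list extraction. The sketch below shows only the most naive form of that step, keeping strict local maxima above an intensity threshold. It is not the MassSpec algorithm (which is a MATLAB tool handling preprocessing, charge determination and overlapping isotopic distributions); the function name, threshold and spectrum values are assumptions.

```python
from typing import List, Sequence, Tuple


def detect_peaks(mz: Sequence[float], intensity: Sequence[float],
                 min_intensity: float) -> List[Tuple[float, float]]:
    """Return (m/z, intensity) pairs that are strict local maxima
    with intensity at or above min_intensity."""
    peaks = []
    for i in range(1, len(intensity) - 1):
        if (intensity[i] >= min_intensity
                and intensity[i] > intensity[i - 1]
                and intensity[i] > intensity[i + 1]):
            peaks.append((mz[i], intensity[i]))
    return peaks


# Made-up spectrum sampled every 0.5 m/z units.
mz = [100.0, 100.5, 101.0, 101.5, 102.0, 102.5, 103.0]
intensity = [1, 8, 2, 1, 3, 12, 4]
peaks = detect_peaks(mz, intensity, min_intensity=5)
print(peaks)
```

Real spectra are noisy, so a practical pipeline would smooth and baseline-correct the signal before a step like this.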
10.
BMC Struct Biol ; 10: 18, 2010 Jun 17.
Article in English | MEDLINE | ID: mdl-20565796

ABSTRACT

BACKGROUND: Topological descriptors, other graph measures, and in a broader sense, graph-theoretical methods, have been proven as powerful tools to perform biological network analysis. However, the majority of the developed descriptors and graph-theoretical methods does not have the ability to take vertex- and edge-labels into account, e.g., atom- and bond-types when considering molecular graphs. Indeed, this feature is important to characterize biological networks more meaningfully instead of only considering pure topological information. RESULTS: In this paper, we put the emphasis on analyzing a special type of biological networks, namely bio-chemical structures. First, we derive entropic measures to calculate the information content of vertex- and edge-labeled graphs and investigate some useful properties thereof. Second, we apply the mentioned measures combined with other well-known descriptors to supervised machine learning methods for predicting Ames mutagenicity. Moreover, we investigate the influence of our topological descriptors - measures for only unlabeled vs. measures for labeled graphs - on the prediction performance of the underlying graph classification problem. CONCLUSIONS: Our study demonstrates that the application of entropic measures to molecules representing graphs is useful to characterize such structures meaningfully. For instance, we have found that if one extends the measures for determining the structural information content of unlabeled graphs to labeled graphs, the uniqueness of the resulting indices is higher. Because measures to structurally characterize labeled graphs are clearly underrepresented so far, the further development of such methods might be valuable and fruitful for solving problems within biological network analysis.


Subjects
Computational Biology/methods , Artificial Intelligence , Entropy , Mutagenicity Tests , Software
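
The abstract of entry 10 derives entropic measures for vertex- and edge-labeled graphs. The paper's own measures are not given in the abstract, so the sketch below uses a deliberately simple stand-in: the Shannon entropy of the distribution of vertex labels (for a molecular graph, atom types). It illustrates only why labels add information that pure topology lacks; the function name and examples are assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def label_entropy(labels: Iterable[Hashable]) -> float:
    """Shannon entropy, in bits, of the empirical label distribution."""
    counts = Counter(labels)
    total = sum(counts.values())
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


# Heavy atoms of ethanol (C, C, O) versus propane (C, C, C): the two
# molecules share the same path topology but differ in vertex labels.
print(label_entropy("CCO"))
print(label_entropy("CCC"))
```

With all labels identical the entropy is zero, so the two molecules are indistinguishable by this measure unless the labels are taken into account.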
11.
J Biomed Biotechnol ; 2010: 670125, 2010.
Article in English | MEDLINE | ID: mdl-20625507

ABSTRACT

Protein interactions are crucial in most biological processes. Several in silico methods have been recently developed to predict them. This paper describes a bioinformatics method that combines sequence similarity and structural information to support experimental studies on protein interactions. Given a target protein, the approach selects the most likely interactors among the candidates revealed by experimental techniques, but not yet in vivo validated. The sequence and the structural information of the in vivo confirmed proteins and complexes are exploited to evaluate the candidate interactors. Finally, a score is calculated to suggest the most likely interactors of the target protein. As an example, we searched for GRB2 interactors. We ranked a set of 46 candidate interactors by the presented method. These candidates were then reduced to 21, through a score threshold chosen by means of a cross-validation strategy. Among them, the isoform 1 of MAPK14 was in silico confirmed as a GRB2 interactor. Finally, given a set of already confirmed interactors of GRB2, the accuracy and the precision of the approach were 75% and 86%, respectively. In conclusion, the proposed method can be conveniently exploited to select the proteins to be experimentally investigated within a set of potential interactors.


Subjects
Computational Biology/methods , Protein Interaction Mapping/methods , Amino Acid Motifs , Databases, Protein , GRB2 Adaptor Protein/chemistry , GRB2 Adaptor Protein/metabolism , Humans , Hydrogen Bonding , Mitogen-Activated Protein Kinase 1/chemistry , Mitogen-Activated Protein Kinase 1/metabolism , Mitogen-Activated Protein Kinase 14/chemistry , Mitogen-Activated Protein Kinase 14/metabolism , Models, Molecular , Multiprotein Complexes/chemistry , Multiprotein Complexes/metabolism , Protein Binding , Reproducibility of Results , Sequence Alignment
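
The abstract of entry 11 reports accuracy and precision for the ranking approach. As a reminder of how those two figures are defined, the sketch below computes them from confusion-matrix counts. The counts are invented for illustration; the abstract does not state the underlying numbers.

```python
def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all candidates that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)


def precision(tp: int, fp: int) -> float:
    """Fraction of predicted interactors that are true interactors."""
    return tp / (tp + fp)


# Hypothetical counts: 8 true positives, 2 false positives,
# 7 true negatives, 3 false negatives.
print(accuracy(tp=8, fp=2, tn=7, fn=3))
print(precision(tp=8, fp=2))
```

Precision ignores false negatives, so a method can score well on it while still missing real interactors; that is why the two figures are reported together.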
12.
BMC Bioinformatics ; 10 Suppl 12: S11, 2009 Oct 15.
Article in English | MEDLINE | ID: mdl-19828071

ABSTRACT

BACKGROUND: One of the topics of major interest in proteomics is protein identification. Protein identification can be achieved by analyzing the mass spectrum of a protein sample through different approaches. One of them, called Peptide Mass Fingerprinting (PMF), combines mass spectrometry (MS) data with search strategies in a suitable database of known proteins to provide a list of candidate proteins ranked by a score. To this aim, several algorithms and software tools have been proposed. However, the scoring methods, and especially the statistical evaluation of the results, can be significantly improved. RESULTS: In this work, a Perl procedure for protein identification by PMF, called MsPI (Mass spectrometry Protein Identification), is presented. The implemented scoring methods were derived from the literature. MsPI implements a strategy to remove the contaminant masses present in the acquired spectra. Moreover, MsPI includes a statistical method to assign to each candidate protein a p-value in addition to the scoring value. Results obtained by MsPI on a dataset of 10 protein samples were compared with those achieved using two other software tools, i.e. Piums and Mascot. Piums implements one of the scoring methods available in MsPI, while Mascot is one of the most frequently used software tools in the protein identification field. MsPI scripts are available for download from the web site http://aimed11.unipv.it/MsPI. CONCLUSION: The performance of MsPI seems to be better than that of Piums and Mascot. In fact, on the considered dataset, MsPI includes the "true" proteins in its candidate list nine times out of ten, whereas Piums includes the "true" proteins in its list only four times out of ten. Even though Mascot also correctly includes the "true" proteins in its candidate list nine times out of ten, it provides longer candidate lists, therefore increasing the number of false positives when the molecular weight of the proteins in the sample is approximately known (e.g. from the 1-D/2-D electrophoresis gel). Moreover, since MsPI is a Perl tool, it can be easily extended and customized by end users.


Subjects
Computational Biology/methods , Peptide Mapping/methods , Proteins/chemistry , Software , Algorithms , Databases, Protein , Peptide Library , Proteins/classification , Proteomics/methods
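
The abstract of entry 12 describes Peptide Mass Fingerprinting as matching observed masses against a protein database and ranking candidates by a score. The sketch below shows the simplest possible score, a count of observed masses that fall within a tolerance of a candidate's theoretical peptide masses. It is not one of MsPI's scoring methods (MsPI is written in Perl and also assigns p-values); the protein names, masses and tolerance are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def count_matches(observed: Sequence[float], theoretical: Sequence[float],
                  tol: float) -> int:
    """Number of observed masses within tol Da of any theoretical peptide mass."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def rank_candidates(observed: Sequence[float],
                    database: Dict[str, List[float]],
                    tol: float = 0.2) -> List[Tuple[str, int]]:
    """Score every database protein and sort by descending match count."""
    scored = [(name, count_matches(observed, masses, tol))
              for name, masses in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical observed peak masses and theoretical digests (in Da).
observed = [1045.56, 1479.80, 2211.10]
database = {
    "protein_A": [1045.50, 1479.75, 1800.00],
    "protein_B": [900.40, 2211.05],
}
ranking = rank_candidates(observed, database)
print(ranking)
```

A raw match count favours large proteins with many peptides, which is one reason published tools normalize the score or attach a statistical significance to it.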
13.
Stud Health Technol Inform ; 258: 21-25, 2019.
Article in English | MEDLINE | ID: mdl-30942706

ABSTRACT

i2b2 and REDCap are two widely adopted solutions, used respectively to facilitate data re-use for research purposes and to manage not-for-profit research studies. REDCap provides the design specifications for building a web service that imports data from an external source through a procedure called DDP. In this work we have developed a web service that implements these specifications in order to import data from i2b2. Our approach has been tested with a real REDCap study.


Subjects
Data Warehousing , Data Analysis
14.
Stud Health Technol Inform ; 247: 715-719, 2018.
Article in English | MEDLINE | ID: mdl-29678054

ABSTRACT

Medical reports often contain a lot of relevant information in the form of free text. To reuse these unstructured texts for biomedical research, it is important to extract structured data from them. In this work, we adapted a previously developed information extraction system to the oncology domain, to process a set of anatomic pathology reports in the Italian language. The information extraction system relies on a domain ontology, which was adapted and refined in an iterative way. The final output was evaluated by a domain expert, with promising results.


Subjects
Information Storage and Retrieval , Language , Natural Language Processing , Biomedical Research , Data Mining , Humans , Italy
17.
Stud Health Technol Inform ; 228: 572-6, 2016.
Article in English | MEDLINE | ID: mdl-27577448

ABSTRACT

The i2b2 software is a widely adopted solution for secondary use of clinical data for clinical research, specifically designed for cohort identification. i2b2 is still lacking functionalities for data analysis. The aim of this work is to empower the i2b2 framework enabling clinical researchers to perform statistical analyses for accelerating the process of hypothesis testing. To this aim we have developed a flexible extension of i2b2 able to exploit different statistical engines. We have implemented some first applications for basic statistics and survival analyses, exploiting this extension and accessible through suitable user interfaces designed with a special consideration for usability.


Subjects
Cohort Studies , Health Information Exchange , Search Engine , Databases, Factual , Humans , Information Storage and Retrieval/methods , Software , User-Computer Interface
19.
Adv Bioinformatics ; 2015: 382869, 2015.
Article in English | MEDLINE | ID: mdl-25653679

ABSTRACT

Phosphorylation is a protein posttranslational modification. It is responsible for the activation/inactivation of disease-related pathways, thanks to its role as a "molecular switch." The study of phosphorylated proteins is a key point for proteomic analyses focused on the identification of diagnostic/therapeutic targets. Liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) is the most widely used analytical approach. Although unmodified peptides are automatically identified by consolidated algorithms, phosphopeptides still require automated tools to avoid time-consuming manual interpretation. To improve phosphopeptide identification efficiency, a novel procedure was developed and implemented in a Perl/C tool called PhosphoHunter, here proposed and evaluated. It includes a preliminary heuristic step for filtering out the MS/MS spectra produced by nonphosphorylated peptides before sequence identification. A method to assess the statistical significance of identified phosphopeptides was also formulated. PhosphoHunter performance was tested on a dataset of 1500 MS/MS spectra and compared with two other tools: Mascot and Inspect. Comparisons demonstrated that a strong point of PhosphoHunter is sensitivity, suggesting that it is able to identify real phosphopeptides with superior performance. Performance indexes depend on a single parameter (intensity threshold) that users can tune according to the study aim. All three tools localized >90% of phosphosites.

20.
PLoS One ; 6(8): e23616, 2011.
Article in English | MEDLINE | ID: mdl-21887285

ABSTRACT

The prediction of antibody-protein (antigen) interactions is very difficult due to the huge variability that characterizes the structure of the antibodies. The region of the antigen bound to the antibodies is called the epitope. Experimental data indicate that many antibodies react with a panel of distinct epitopes (positive reaction). Challenge 1 of DREAM5 aims at understanding whether there exist rules for predicting the reactivity of a peptide/epitope, i.e., its capability to bind to human antibodies. DREAM5 provided a training set of peptides with experimentally identified high and low reactivities to human antibodies. On the basis of this training set, the participants in the challenge were asked to develop a predictive model of reactivity. A test set was then provided to evaluate the performance of the model implemented so far. We developed a logistic regression model to predict the peptide reactivity, by facing the challenge as a machine learning problem. The initial features have been generated on the basis of the available knowledge and the information reported in the dataset. Our predictive model had the second best performance of the challenge. We also developed a method, based on a clustering approach, able to generate "in silico" a list of positive and negative new peptide sequences, as requested by the DREAM5 "bonus round" additional challenge. The paper describes the developed model and its results in terms of reactivity prediction, and highlights some open issues concerning the propensity of a peptide to react with human antibodies.


Subjects
Immunoglobulins, Intravenous/metabolism , Knowledge Bases , Peptides/metabolism , Amino Acid Sequence , Amino Acids/metabolism , Cluster Analysis , Humans , Models, Molecular , Molecular Sequence Data , Peptides/chemistry , ROC Curve , Reproducibility of Results