1.
Bioinformatics ; 39(9), 2023 09 02.
Article in English | MEDLINE | ID: mdl-37669154

ABSTRACT

MOTIVATION: Computationally predicting major histocompatibility complex class I (MHC-I) peptide binding affinity is an important problem in immunological bioinformatics, which is also crucial for the identification of neoantigens for personalized therapeutic cancer vaccines. Recent cutting-edge deep learning-based methods for this problem cannot achieve satisfactory performance, especially for non-9-mer peptides. This is because such methods generate the input by simply concatenating the two given sequences: a peptide and (the pseudo sequence of) an MHC class I molecule, which cannot precisely capture the anchor positions of the MHC binding motif for the peptides with variable lengths. We thus developed an anchor position-aware and high-performance deep model, DeepMHCI, with a position-wise gated layer and a residual binding interaction convolution layer. This allows the model to control the information flow in peptides to be aware of anchor positions and model the interactions between peptides and the MHC pseudo (binding) sequence directly with multiple convolutional kernels. RESULTS: The performance of DeepMHCI has been thoroughly validated by extensive experiments on four benchmark datasets under various settings, such as 5-fold cross-validation, validation with the independent testing set, external HPV vaccine identification, and external CD8+ epitope identification. Experimental results with visualization of binding motifs demonstrate that DeepMHCI outperformed all competing methods, especially on non-9-mer peptides binding prediction. AVAILABILITY AND IMPLEMENTATION: DeepMHCI is publicly available at https://github.com/ZhuLab-Fudan/DeepMHCI.


Subjects
Algorithms , Benchmarking , Computational Biology , Epitopes , Peptides
2.
ACS Appl Mater Interfaces ; 15(15): 19470-19479, 2023 Apr 19.
Article in English | MEDLINE | ID: mdl-37023404

ABSTRACT

Efficient dispersion of nanoparticles (NPs) is a crucial challenge in the preparation and application of composites that contain NPs, particularly in coatings, inks, and related materials. Physical adsorption and chemical modification are the two common methods used to disperse NPs. However, the former suffers from desorption, and the latter is more specific and has limited versatility. To address these issues, we developed a novel photo-cross-linked polymeric dispersant, comb-shaped benzophenone-containing poly(ether amine) (bPEA), using a one-pot nucleophilic/cyclic-opening addition reaction. The results demonstrated that the bPEA dispersant forms a dense and stable shell on the surface of pigment NPs through physical adsorption and subsequent chemical photo-cross-linking, which effectively overcomes both the desorption that occurs with physical adsorption and the specificity of chemical modification. Owing to the dispersing effect of bPEA, the obtained pigment dispersions show high solvent, thermal, and pH stability without flocculation during storage. Moreover, the NP dispersants show good compatibility with screen printing, coating, and 3D printing, endowing the ornamental products with high uniformity, color fastness, and less color shading. These properties make bPEA dispersants ideal candidates for preparing dispersions of other NPs.

3.
Virol J ; 19(1): 114, 2022 06 28.
Article in English | MEDLINE | ID: mdl-35765099

ABSTRACT

BACKGROUND: Chronic infection with hepatitis B virus (HBV) has been shown to be strongly associated with the development of hepatocellular carcinoma (HCC). AIMS: The purpose of the study is to investigate the association between HBV preS region quasispecies and HCC development, as well as to develop an HCC diagnosis model using HBV preS region quasispecies. METHODS: A total of 104 chronic hepatitis B (CHB) patients and 117 HBV-related HCC patients were enrolled. The HBV preS region was sequenced using next-generation sequencing (NGS) and the nucleotide entropy was calculated for quasispecies evaluation. Sparse logistic regression (SLR) was used to predict HCC development and prediction performance was evaluated using receiver operating characteristic curves. RESULTS: Entropy of the HBV preS1 and preS2 regions and of several nucleotide positions showed significant divergence between CHB and HCC patients. Using SLR, the classification of HCC/CHB groups achieved a mean area under the receiver operating characteristic curve (AUC) of 0.883 in the training data and 0.795 in the test data. The prediction model was also validated by a completely independent dataset from Hong Kong. The 10 selected nucleotide positions showed significantly different entropy between CHB and HCC patients. The HBV quasispecies also classified three clinical parameters, including HBeAg, HBV DNA, and alkaline phosphatase (ALP), with AUC values greater than 0.6 in the test data. CONCLUSIONS: Using NGS and SLR, the association between HBV preS region nucleotide entropy and HCC development was validated in our study, and this could promote the understanding of the mechanism of HCC progression.
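
The per-position nucleotide entropy mentioned in the METHODS can be illustrated with a minimal sketch. The abstract does not give the formula, so this assumes plain Shannon entropy (in bits) over reads that are already aligned to a common coordinate system; the function names and the toy reads are hypothetical.

```python
import math
from collections import Counter

def position_entropy(reads, position):
    """Shannon entropy (bits) of the nucleotides observed at one aligned position.

    Assumption: reads are pre-aligned strings; reads too short to cover
    the position are skipped.
    """
    counts = Counter(read[position] for read in reads if position < len(read))
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def entropy_profile(reads):
    """Entropy at every position covered by at least the longest read."""
    length = max(len(read) for read in reads)
    return [position_entropy(reads, i) for i in range(length)]

# Toy example: positions 0 and 1 are conserved, positions 2 and 3 vary.
reads = ["ACGT", "ACGA", "ACTA", "ACGA"]
profile = entropy_profile(reads)
print([round(h, 3) for h in profile])
```

A conserved position yields an entropy of zero, while a position with mixed nucleotides yields a positive value, so the profile can serve as a per-position feature vector for a downstream classifier.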


Subjects
Hepatocellular Carcinoma , Liver Neoplasms , Hepatitis B Surface Antigens/genetics , Hepatitis B virus/genetics , Humans , Logistic Models , Nucleotides , Quasispecies
4.
Bioinformatics ; 38(Suppl 1): i220-i228, 2022 06 24.
Article in English | MEDLINE | ID: mdl-35758790

ABSTRACT

MOTIVATION: Computationally predicting major histocompatibility complex (MHC)-peptide binding affinity is an important problem in immunological bioinformatics. Recent cutting-edge deep learning-based methods for this problem are unable to achieve satisfactory performance for MHC class II molecules. This is because such methods generate the input by simply concatenating the two given sequences: (the estimated binding core of) a peptide and (the pseudo sequence of) an MHC class II molecule, ignoring biological knowledge behind the interactions of the two molecules. We thus propose a binding core-aware deep learning-based model, DeepMHCII, with a binding interaction convolution layer, which integrates all potential binding cores (in a given peptide) with the MHC pseudo (binding) sequence by modeling the interaction with multiple convolutional kernels. RESULTS: Extensive empirical experiments with four large-scale datasets demonstrate that DeepMHCII significantly outperformed four state-of-the-art methods under numerous settings, such as 5-fold cross-validation, leave-one-molecule-out, validation with independent testing sets, and binding core prediction. All these results and visualization of the predicted binding cores indicate the effectiveness of our model, DeepMHCII, and the importance of properly modeling biological facts in deep learning for high predictive performance and efficient knowledge discovery. AVAILABILITY AND IMPLEMENTATION: DeepMHCII is publicly available at https://github.com/yourh/DeepMHCII. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subjects
Histocompatibility Antigens Class II , Peptides , Algorithms , Histocompatibility Antigens Class II/metabolism , Peptides/chemistry , Protein Binding , Protein Transport
5.
J Transl Med ; 20(1): 193, 2022 05 04.
Article in English | MEDLINE | ID: mdl-35509104

ABSTRACT

PURPOSE: We develop a new risk score to predict stroke-associated pneumonia (SAP) in patients with acute intracranial hemorrhage (ICH). METHOD: We applied logistic regression to develop a new risk score called ICH-LR2S2. It was derived from a dataset of 70,540 ICH patients enrolled between 2015 and 2018 in the Chinese Stroke Center Alliance (CSCA). During the training of ICH-LR2S2, patients were randomly divided into two groups: 80% for the training set and 20% for model validation. A prospective test set was developed using 12,523 patients recruited in 2019. To further verify its effectiveness, we tested ICH-LR2S2 on an external dataset of 24,860 patients from the China National Stroke Registration Management System II (CNSR II). The performance of ICH-LR2S2 was measured by the area under the receiver operating characteristic curve (AUROC). RESULTS: The incidence of SAP in the dataset was 25.52%. A 24-point ICH-LR2S2 was developed from independent predictors, including age, modified Rankin Scale, fasting blood glucose, National Institutes of Health Stroke Scale admission score, Glasgow Coma Scale score, C-reactive protein, dysphagia, chronic obstructive pulmonary disease, and current smoking. The results showed that ICH-LR2S2 achieved an AUC = 0.749 [95% CI 0.739-0.759], which outperforms the best baseline, ICH-APS (AUC = 0.704) [95% CI 0.694-0.714]. Compared with the previous ICH risk scores, ICH-LR2S2 incorporates fasting blood glucose and C-reactive protein, improving its discriminative ability. Machine learning methods such as XGBoost (AUC = 0.772) [95% CI 0.762-0.782] can further improve our prediction performance. ICH-LR2S2 also performed well when further validated on the external independent cohort of patients (n = 24,860), with AUC = 0.784 [95% CI 0.774-0.794]. CONCLUSION: ICH-LR2S2 accurately distinguishes SAP patients based on easily available clinical features. It can help identify high-risk patients in the early stages of disease.


Subjects
Pneumonia , Stroke , Blood Glucose , C-Reactive Protein , Cerebral Hemorrhage/complications , Humans , Intracranial Hemorrhages/complications , Pneumonia/complications , Prognosis , Prospective Studies , Risk Factors , Stroke/complications
6.
J Infect Dis ; 223(11): 1887-1896, 2021 06 04.
Article in English | MEDLINE | ID: mdl-33049037

ABSTRACT

BACKGROUND: Hepatitis B virus (HBV) infection is one of the main leading causes of hepatocellular carcinoma (HCC) worldwide. However, it remains uncertain how the reverse-transcriptase (rt) gene contributes to HCC progression. METHODS: We enrolled a total of 307 patients with chronic hepatitis B (CHB) and 237 with HBV-related HCC from 13 medical centers. Sequence features comprised multidimensional attributes of rt nucleic acid and rt/s amino acid sequences. Machine-learning models were used to establish HCC predictive algorithms. Model performances were tested in the training and independent validation cohorts using receiver operating characteristic curves and calibration plots. RESULTS: A random forest (RF) model based on combined metrics (10 features) demonstrated the best predictive performances in both cross and independent validation (AUC, 0.96; accuracy, 0.90), irrespective of HBV genotypes and sequencing depth. Moreover, HCC risk scores for individuals obtained from the RF model (AUC, 0.966; 95% confidence interval, .922-.989) outperformed α-fetoprotein (0.713; .632-.784) in distinguishing between patients with HCC and those with CHB. CONCLUSIONS: Our study provides evidence for the first time that HBV rt sequences contain vital HBV quasispecies features in predicting HCC. Integrating deep sequencing with feature extraction and machine-learning models benefits the longitudinal surveillance of CHB and HCC risk assessment.


Subjects
Hepatocellular Carcinoma , Hepatitis B virus , Chronic Hepatitis B , Liver Neoplasms , Quasispecies , Hepatocellular Carcinoma/diagnosis , Hepatocellular Carcinoma/virology , Hepatitis B virus/genetics , High-Throughput Nucleotide Sequencing , Humans , Liver Neoplasms/diagnosis , Liver Neoplasms/virology , Machine Learning , RNA-Directed DNA Polymerase
7.
PLoS Genet ; 14(2): e1007206, 2018 02.
Article in English | MEDLINE | ID: mdl-29474353

ABSTRACT

Hepatitis B virus (HBV) infection is a common problem in the world, especially in China. More than 60-80% of hepatocellular carcinoma (HCC) cases can be attributed to HBV infection in high HBV prevalent regions. Although traditional Sanger sequencing has been extensively used to investigate HBV sequences, next-generation sequencing (NGS) is becoming more commonly used. However, it is unknown whether word pattern frequencies of HBV reads obtained by NGS can be used to investigate HBV genotypes and predict HCC status. In this study, we used NGS to sequence the pre-S region of the HBV sequence of 94 HCC patients and 45 chronic HBV (CHB) infected individuals. Word pattern frequencies among the sequence data of all individuals were calculated and compared using the Manhattan distance. The individuals were grouped using principal coordinate analysis (PCoA) and hierarchical clustering. Word pattern frequencies were also used to build prediction models for HCC status using both K-nearest neighbors (KNN) and support vector machine (SVM). We showed the extremely high power of analyzing HBV sequences using word patterns. Our key findings include that the first principal coordinate of the PCoA was highly associated with the fraction of genotype B (or C) sequences and the second principal coordinate was significantly associated with the probability of having HCC. Hierarchical clustering first groups the individuals according to their major genotypes, followed by their HCC status. Using cross-validation, high values of the area under the receiver operating characteristic curve (AUC), around 0.88 for KNN and 0.92 for SVM, were obtained. In the independent data set of 46 HCC patients and 31 CHB individuals, a good AUC score of 0.77 was obtained using SVM. It was further shown that 3000 reads for each individual can yield stable prediction results for SVM. Thus, another key finding is that word patterns can be used to predict HCC status with high accuracy. Therefore, our study shows clearly that word pattern frequencies of HBV sequences contain much information about the composition of different HBV genotypes and the HCC status of an individual.
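
As a rough illustration of comparing individuals by word pattern frequencies, the sketch below counts k-mers ("words") across one individual's reads, normalizes them to frequencies, and compares two individuals with the Manhattan distance named in the abstract. The word length, the handling of ambiguous bases, and the function names are assumptions for illustration; the abstract does not specify them.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"

def word_frequencies(reads, k):
    """Relative frequency of every possible k-mer over all reads of one individual.

    Assumption: words containing characters outside ACGT (e.g. N) are ignored.
    """
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            word = read[i:i + k]
            if set(word) <= set(ALPHABET):
                counts[word] += 1
    total = sum(counts.values())
    words = ["".join(p) for p in product(ALPHABET, repeat=k)]
    return [counts[w] / total if total else 0.0 for w in words]

def manhattan(u, v):
    """Manhattan (L1) distance between two equal-length frequency vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))

# Toy reads for two hypothetical individuals.
sample_a = ["ACGTAC", "ACGTTT"]
sample_b = ["ACGTAC", "ACGTAC"]
freq_a = word_frequencies(sample_a, 2)
freq_b = word_frequencies(sample_b, 2)
d_ab = manhattan(freq_a, freq_b)
print(round(d_ab, 3))
```

The resulting pairwise distance matrix is the kind of input that PCoA and hierarchical clustering operate on, and the frequency vectors themselves can be passed to a KNN or SVM classifier.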


Subjects
Hepatocellular Carcinoma/virology , Genetic Heterogeneity , Hepatitis B Surface Antigens/genetics , Hepatitis B virus/genetics , Chronic Hepatitis B/virology , Liver Neoplasms/virology , Hepatocellular Carcinoma/epidemiology , Hepatocellular Carcinoma/genetics , DNA Fingerprinting , Viral DNA/analysis , Gene Frequency , Genetic Association Studies/methods , Genotype , Hepatitis B virus/classification , Chronic Hepatitis B/complications , Chronic Hepatitis B/epidemiology , Chronic Hepatitis B/genetics , High-Throughput Nucleotide Sequencing , Humans , Liver Neoplasms/epidemiology , Liver Neoplasms/genetics , Phylogeny , Protein Precursors/genetics
8.
J Gen Virol ; 98(11): 2748-2758, 2017 11.
Article in English | MEDLINE | ID: mdl-29022863

ABSTRACT

In order to investigate if deletion patterns of the preS region can predict liver disease advancement, the preS region of the hepatitis B virus (HBV) genome in 45 chronic hepatitis B (CHB) and 94 HBV-related hepatocellular carcinoma (HCC) patients was sequenced by next-generation sequencing (NGS) and the percentages of nucleotide deletion in the preS region were analysed. Hierarchical clustering and heatmaps based on deletion percentages of preS revealed different deletion patterns between CHB and HCC patients. Intergenotype comparison also indicated divergence in preS deletions between HBV genotype B and C. No significant difference was found in preS deletion patterns between sera and matched adjacent non-tumour tissues. Based on hierarchical clustering, HCC patients were classed into two groups with different preS deletion patterns and different clinical features. Finally, the support vector machine (SVM) model was trained on preS nucleotide deletion percentages and used to predict HCC versus CHB patients. The prediction performance was assessed with fivefold cross-validation and independent cohort validation. The median area under the curve (AUC) was 0.729 after repeating SVM 500 times with fivefold cross-validations. After parameter optimization, the SVM model was used to predict an independent cohort with 51 CHB patients and 72 HCC patients and the AUC was 0.727. In conclusion, the use of the NGS method revealed a prominent divergence in preS deletion patterns between disease groups and virus genotypes, but not between different tissue types. Quantitative NGS data combined with a machine learning method could be a powerful approach for prediction of the status of different diseases.


Subjects
Hepatocellular Carcinoma/virology , Hepatitis B Surface Antigens/genetics , Hepatitis B virus/genetics , Chronic Hepatitis B/virology , Genetic Polymorphism , Sequence Deletion , Adult , Computational Biology , Female , Viral Genome , Genotype , Hepatitis B virus/classification , Hepatitis B virus/isolation & purification , Chronic Hepatitis B/complications , High-Throughput Nucleotide Sequencing , Humans , Machine Learning , Male , Middle Aged , Molecular Diagnostic Techniques
9.
Methods Mol Biol ; 1404: 753-760, 2016.
Article in English | MEDLINE | ID: mdl-27076335

ABSTRACT

Recent computational approaches in bioinformatics can achieve high performance, making them a powerful support for real biological experiments and drawing more attention from biologists than before. In immunology, predicting peptides that can bind to MHC alleles is an important task, which has been tackled by many computational approaches. However, the abundance of methods makes it difficult for immunologists to select an appropriate one. To overcome this problem, we develop an ensemble prediction-based Web server, which we call MetaMHCpan, consisting of two parts: MetaMHCIpan and MetaMHCIIpan, for predicting peptides that can bind MHC-I and MHC-II, respectively. MetaMHCIpan and MetaMHCIIpan use two (MHC2SKpan and LApan) and four (TEPITOPEpan, MHC2SKpan, LApan, and MHC2MIL) existing predictors, respectively. MetaMHCpan is available at http://datamining-iip.fudan.edu.cn/MetaMHCpan/index.php/pages/view/info.


Subjects
Computational Biology/methods , HLA Antigens/metabolism , Internet , Peptides/metabolism , Humans , Protein Binding , Software
10.
BMC Genomics ; 15 Suppl 9: S9, 2014.
Article in English | MEDLINE | ID: mdl-25521198

ABSTRACT

BACKGROUND: Computational prediction of major histocompatibility complex class II (MHC-II) binding peptides can assist researchers in understanding the mechanism of immune systems and developing peptide-based vaccines. Although many computational methods have been proposed, the performance of these methods is far from satisfactory. The difficulty of MHC-II peptide binding prediction comes mainly from the large length variation of binding peptides. METHODS: We develop a novel multiple instance learning-based method called MHC2MIL to predict MHC-II binding peptides. We treat each peptide in MHC2MIL as a bag, and some substrings of the peptide as the instances in the bag. Unlike previous multiple instance learning-based methods that consider only instances of fixed length 9 (9 amino acids), MHC2MIL can handle instances of length 9 and length 11 (11 amino acids) simultaneously. As such, MHC2MIL incorporates important information in the peptide flanking region. Furthermore, when measuring the distances between different instances, MHC2MIL explicitly highlights the amino acids in some important positions. RESULTS: Experimental results on a benchmark dataset show that the performance of MHC2MIL is significantly improved by considering instances of both 9 and 11 amino acids, as well as by emphasizing amino acids at key positions in the instance. The results are consistent with those reported in the literature on MHC-II peptide binding. In addition to five important positions (1, 4, 6, 7 and 9) for HLA (human leukocyte antigen, the name of MHC in humans) DR peptide binding, we also find that position 2 may play some role in the binding process. Using 5-fold cross-validation on the benchmark dataset, MHC2MIL outperforms two state-of-the-art methods, MHC2SK and NN-align, with statistical significance on 12 HLA DP and DQ molecules. In addition, it achieves comparable performance with MHC2SK and NN-align on 14 HLA DR molecules. MHC2MIL is freely available at http://datamining-iip.fudan.edu.cn/service/MHC2MIL/index.html.
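
The bag-of-instances view described in the METHODS can be sketched as follows. The abstract says only that "some substrings" of length 9 and 11 serve as instances; this sketch assumes every contiguous window of those lengths is an instance, which may differ from the selection rule actually used. The function name and the example peptide are hypothetical.

```python
def peptide_to_bag(peptide, lengths=(9, 11)):
    """Return the bag of (start, substring) instances for one peptide.

    Assumption: every contiguous window of each requested length is an
    instance. Peptides shorter than a given length contribute no
    instances of that length.
    """
    bag = []
    for length in lengths:
        for start in range(len(peptide) - length + 1):
            bag.append((start, peptide[start:start + length]))
    return bag

# A 13-residue toy peptide yields five 9-mers and three 11-mers.
bag = peptide_to_bag("PKYVKQNTLKLAT")
for start, instance in bag:
    print(start, instance)
```

In a multiple instance learning setting, a label is attached to the whole bag rather than to any single window, which is what lets a method of this kind cope with peptides of varying length.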


Subjects
Artificial Intelligence , Computational Biology/methods , Histocompatibility Antigens Class II/chemistry , Histocompatibility Antigens Class II/metabolism , Peptides/metabolism , Alleles , Histocompatibility Antigens Class II/genetics , Humans , Protein Binding
11.
BMC Genomics ; 14 Suppl 5: S11, 2013.
Article in English | MEDLINE | ID: mdl-24564280

ABSTRACT

BACKGROUND: Computational methods for the prediction of Major Histocompatibility Complex (MHC) class II binding peptides play an important role in facilitating the understanding of immune recognition and the process of epitope discovery. To develop an effective computational method, we need to consider two important characteristics of the problem: (1) the length of binding peptides is highly flexible; and (2) MHC molecules are extremely polymorphic, and for the vast majority of them there are insufficient training data. METHODS: We develop a novel string kernel method, MHC2SK (MHC-II String Kernel), to measure the similarities among peptides with variable lengths. By considering the distinct features of the MHC-II peptide binding prediction problem, MHC2SK differs significantly from the recently developed kernel-based method, GS (Generic String) kernel, in the way it computes similarities. Furthermore, we extend MHC2SK to MHC2SKpan for pan-specific MHC-II peptide binding prediction by leveraging the binding data of various MHC molecules. RESULTS: MHC2SK outperformed GS in allele-specific prediction using a benchmark dataset, which demonstrates the effectiveness of MHC2SK. Furthermore, we evaluated the performance of MHC2SKpan using various benchmark data sets from several different perspectives: leave-one-allele-out (LOO), 5-fold cross-validation, and independent data testing. MHC2SKpan achieved comparable performance with NetMHCIIpan-2.0 and outperformed NetMHCIIpan-1.0, TEPITOPEpan and MultiRTA with statistical significance. MHC2SKpan can be freely accessed at http://datamining-iip.fudan.edu.cn/service/MHC2SKpan/index.html.


Subjects
Antibody Binding Sites/genetics , Computational Biology/methods , Histocompatibility Antigens Class II/chemistry , Histocompatibility Antigens Class II/metabolism , Peptides/metabolism , Algorithms , Histocompatibility Antigens Class II/genetics , Humans , Molecular Models , Peptides/genetics
12.
PLoS One ; 7(2): e30483, 2012.
Article in English | MEDLINE | ID: mdl-22383964

ABSTRACT

MOTIVATION: Accurate identification of peptides binding to specific Major Histocompatibility Complex Class II (MHC-II) molecules is of great importance for elucidating the underlying mechanism of immune recognition, as well as for developing effective epitope-based vaccines and promising immunotherapies for many severe diseases. Due to the extreme polymorphism of MHC-II alleles and the high cost of biochemical experiments, the development of computational methods for accurate prediction of binding peptides of MHC-II molecules, particularly for the ones with few or no experimental data, has become a topic of increasing interest. TEPITOPE is a widely used computational approach because of its good interpretability and relatively high performance. However, TEPITOPE can be applied to only 51 out of over 700 known HLA-DR molecules. METHOD: We have developed a new method, called TEPITOPEpan, by extrapolating from the binding specificities of HLA-DR molecules characterized by TEPITOPE to those uncharacterized. First, each HLA-DR binding pocket is represented by amino acid residues that have close contact with the corresponding peptide binding core residues. Then the pocket similarity between two HLA-DR molecules is calculated as the sequence similarity of the residues. Finally, for an uncharacterized HLA-DR molecule, the binding specificity of each pocket is computed as a weighted average of the pocket binding specificities over HLA-DR molecules characterized by TEPITOPE. RESULT: The performance of TEPITOPEpan has been extensively evaluated using various data sets from different viewpoints: predicting MHC binding peptides, identifying HLA ligands and T-cell epitopes, and recognizing binding cores. Among the four state-of-the-art competing pan-specific methods, for predicting binding specificities of unknown HLA-DR molecules, TEPITOPEpan was roughly the second best method, next to NetMHCIIpan-2.0. Additionally, TEPITOPEpan achieved the best performance in recognizing binding cores. We further analyzed the motifs detected by TEPITOPEpan against the corresponding immunology literature. Its online server and PSSMs therein are available at http://www.biokdd.fudan.edu.cn/Service/TEPITOPEpan/.
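
The extrapolation step in the METHOD (a similarity-weighted average of pocket specificities) can be sketched as below. This is only an illustration of the weighted-average idea: the abstract does not define the sequence similarity measure, so fraction-of-identical-residues is used here as a stand-in, and the pocket residues, score profiles, and function names are all hypothetical.

```python
def pocket_similarity(residues_a, residues_b):
    """Stand-in similarity: fraction of identical residues at matching positions.

    Assumption: both pockets are described by equal-length residue strings.
    The actual method's sequence similarity measure is not given in the abstract.
    """
    matches = sum(a == b for a, b in zip(residues_a, residues_b))
    return matches / len(residues_a)

def extrapolate_profile(target_residues, characterized):
    """Similarity-weighted average of known pocket profiles.

    characterized: list of (pocket_residues, profile) pairs, where profile
    maps each peptide amino acid to a binding score for that pocket.
    """
    weights = [pocket_similarity(target_residues, res) for res, _ in characterized]
    total = sum(weights)
    if total == 0:
        raise ValueError("no characterized pocket resembles the target pocket")
    amino_acids = characterized[0][1].keys()
    return {
        aa: sum(w * prof[aa] for w, (_, prof) in zip(weights, characterized)) / total
        for aa in amino_acids
    }

# Two hypothetical characterized pockets, scored for two peptide amino acids.
known = [
    ("WLF", {"A": 1.0, "L": -1.0}),
    ("WLY", {"A": 0.0, "L": 2.0}),
]
profile = extrapolate_profile("WLF", known)
print({aa: round(score, 3) for aa, score in profile.items()})
```

Repeating this for each pocket of an uncharacterized molecule would give one score column per pocket, which together form a position-specific scoring matrix of the kind the abstract refers to as PSSMs.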


Subjects
Epitopes/chemistry , HLA-DR Antigens/genetics , HLA-DR Antigens/immunology , Histocompatibility Antigens Class II/genetics , Peptides/chemistry , Algorithms , Alleles , Area Under Curve , Computational Biology/methods , Computer Simulation , X-Ray Crystallography/methods , Gene Expression Regulation , Humans , Ligands , Statistical Models , Peptide Library , Genetic Polymorphism , Protein Binding , Protein Conformation , Reproducibility of Results , T-Lymphocytes/cytology
13.
Brief Bioinform ; 13(3): 350-64, 2012 May.
Article in English | MEDLINE | ID: mdl-21949215

ABSTRACT

Binding of short antigenic peptides to major histocompatibility complex (MHC) molecules is a core step in adaptive immune response. Precise identification of MHC-restricted peptides is of great significance for understanding the mechanism of immune response and promoting the discovery of immunogenic epitopes. However, due to the extremely high MHC polymorphism and huge cost of biochemical experiments, there is no experimentally measured binding data for most MHC molecules. To address the problem of predicting peptides binding to these MHC molecules, recently computational approaches, called pan-specific methods, have received keen interest. Pan-specific methods make use of experimentally obtained binding data of multiple alleles, by which binding peptides (binders) of not only these alleles but also those alleles with no known binders can be predicted. To investigate the possibility of further improvement in performance and usability of pan-specific methods, this article extensively reviews existing pan-specific methods and their web servers. We first present a general framework of pan-specific methods. Then, the strategies and performance as well as utilities of web servers are compared. Finally, we discuss the future direction to improve pan-specific methods for MHC-peptide binding prediction.


Subjects
Major Histocompatibility Complex , Peptides/chemistry , Algorithms , Alleles , Binding Sites , Protein Databases , Epitopes/genetics , Epitopes/immunology , Peptides/metabolism
15.
Nucleic Acids Res ; 38(Web Server issue): W474-9, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20483919

ABSTRACT

As antigenic peptides binding to major histocompatibility complex (MHC) molecules is the prerequisite of cellular immune responses, an accurate computational predictor will be of great benefit to biologists and immunologists for understanding the underlying mechanism of immune recognition as well as facilitating the process of epitope mapping and vaccine design. Although various computational approaches have been developed, recent experimental results on benchmark data sets show that the development of improved predictors is needed, especially for MHC Class II peptide binding. To make the most of current methods and achieve a higher predictive performance, we developed a new web server, MetaMHC, to integrate the outputs of leading predictors by several popular ensemble strategies. MetaMHC consists of two components: MetaMHCI and MetaMHCII for MHC Class I peptide and MHC Class II peptide binding predictions, respectively. Experimental results by both cross-validation and using an independent data set show that the ensemble approaches outperform individual predictors, being statistically significant. MetaMHC is freely available at http://www.biokdd.fudan.edu.cn/Service/MetaMHC.html.


Subjects
Histocompatibility Antigens Class II/metabolism , Histocompatibility Antigens Class I/metabolism , Peptides/metabolism , Software , Animals , Binding Sites , Humans , Internet , Mice , Peptides/chemistry , Peptides/immunology , User-Computer Interface
16.
J Genet Genomics ; 36(5): 289-96, 2009 May.
Article in English | MEDLINE | ID: mdl-19447377

ABSTRACT

Effective identification of major histocompatibility complex (MHC) molecules restricted peptides is a critical step in discovering immune epitopes. Although many online servers have been built to predict class II MHC-peptide binding affinity, they have been trained on different datasets, and thus fail in providing a unified comparison of various methods. In this paper, we present our implementation of seven popular predictive methods, namely SMM-align, ARB, SVR-pairwise, Gibbs sampler, ProPred, LP-top2, and MHCPred, on a single web server named BiodMHC (http://biod.whu.edu.cn/BiodMHC/index.html, the software is available upon request). Using a standard measure of AUC (Area Under the receiver operating characteristic Curves), we compare these methods by means of not only cross validation but also prediction on independent test datasets. We find that SMM-align, ProPred, SVR-pairwise, ARB, and Gibbs sampler are the five best-performing methods. For the binding affinity prediction of class II MHC-peptide, BiodMHC provides a convenient online platform for researchers to obtain binding information simultaneously using various methods.
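
For readers unfamiliar with the AUC measure used to compare the methods, a minimal sketch follows. It computes AUC through its rank interpretation: the probability that a randomly chosen binder receives a higher score than a randomly chosen non-binder, with ties counted as half. The labels and scores are made-up toy values, and the quadratic pairwise loop is chosen for clarity rather than speed.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: 1 for binder, 0 for non-binder.
    scores: predicted scores, higher meaning more likely to bind.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Toy predictions: one binder (score 0.4) is ranked below one non-binder (0.5).
labels = [1, 1, 0, 0, 1]
scores = [0.9, 0.4, 0.5, 0.1, 0.8]
result = auc(labels, scores)
print(round(result, 3))
```

Because AUC depends only on the ranking of scores, it allows predictors that output scores on different scales to be compared on the same test set.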


Subjects
Protein Databases , Histocompatibility Antigens Class II/chemistry , Histocompatibility Antigens Class II/metabolism , Internet , Online Systems , Animals , Humans , Peptides/chemistry , Peptides/metabolism , Protein Binding , Software
17.
Cancer Inform ; 2: 361-71, 2007 Feb 25.
Article in English | MEDLINE | ID: mdl-19458778

ABSTRACT

An important issue in current medical science research is to find the genes that are strongly related to an inherited disease. A particular focus is placed on cancer-gene relations, since some types of cancers are inherited. As biomedical databases have grown rapidly in recent years, an informatics approach to predict such relations from currently available databases should be developed. Our objective is to find implicitly associated cancer-gene pairs from biomedical databases, including the literature database. Co-occurrence of biological entities has been shown to be a popular and efficient technique in biomedical text mining. We have applied a new probabilistic model, called mixture aspect model (MAM) [48], to combine different types of co-occurrences of genes and cancer derived from Medline and OMIM (Online Mendelian Inheritance in Man). We trained the probability parameters of MAM using a learning method based on the expectation-maximization (EM) algorithm. We examined the performance of MAM by predicting associated cancer-gene pairs. Through cross-validation, prediction accuracy was shown to be improved by adding gene-gene co-occurrences from Medline to cancer-gene co-occurrences in OMIM. Further experiments showed that MAM found new cancer-gene relations that are unknown in the literature. Supplementary information can be found at http://www.bic.kyotou.ac.jp/pathway/zhusf/CancerInformatics/Supplemental2006.html.

18.
Bioinformatics ; 22(13): 1648-55, 2006 Jul 01.
Article in English | MEDLINE | ID: mdl-16613909

ABSTRACT

MOTIVATION: Various computational methods have been proposed to tackle the problem of predicting the peptide binding ability for a specific MHC molecule. These methods are based on known binding peptide sequences. However, currently available peptide databases contain relatively few examples and are highly redundant. Existing studies show that MHC molecules can be classified into supertypes in terms of peptide-binding specificities. Therefore, we first give a method for reducing the redundancy in a given dataset based on information entropy, then present a novel approach for prediction by learning a predictive model from a dataset of binders for not only the molecule of interest but also other MHC molecules. RESULTS: We experimented on the HLA-A family with the binding nonamers of A1 supertype (HLA-A*0101, A*2601, A*2902, A*3002), A2 supertype (A*0201, A*0202, A*0203, A*0206, A*6802), A3 supertype (A*0301, A*1101, A*3101, A*3301, A*6801) and A24 supertype (A*2301 and A*2402), whose data were collected from six publicly available peptide databases and two private sources. The results show that our approach significantly improves the prediction accuracy of peptides that bind a specific HLA molecule when we combine binding data of HLA molecules in the same supertype. Our approach can thus be used to help find new binders for MHC molecules.


Subjects
Computational Biology/methods , MHC Class I Genes , HLA-A Antigens/genetics , Histocompatibility Antigens/chemistry , Oligopeptides/chemistry , Peptides/chemistry , Algorithms , Alleles , Genetic Databases , Entropy , Humans , Statistical Models , Protein Binding , Reproducibility of Results