Results 1 - 20 of 137
1.
Molecules ; 29(2)2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38257197

ABSTRACT

Peptide-protein interactions form a cornerstone of molecular biology, governing cellular signaling, structure, and enzymatic activities in living organisms. Improving computational models and experimental techniques to describe and predict these interactions remains an ongoing area of research. Here, we present a computational method for describing and predicting peptide-protein interactions that leverages amino acid frequencies within specific binding cores. Using normalized frequencies, we construct quantitative matrices (QMs), termed 'logo models', derived from sequence logos. The method was developed to predict peptide binding to the HLA-DQ2.5 and HLA-DQ8.1 proteins, which are associated with susceptibility to celiac disease. The models were validated on more than 17,000 peptides, demonstrating their efficacy in discriminating between binding and non-binding peptides. The logo method could be applied to diverse peptide-protein interactions, offering a versatile tool for predictive analysis in molecular binding studies.


Subjects
Celiac Disease , Peptides , Humans , Amino Acids , Molecular Biology , Position-Specific Scoring Matrices
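
The quantitative-matrix idea in entry 1 can be illustrated with a minimal sketch. It assumes the binding cores are already aligned and of equal length, and it scores residues with log-odds against a uniform background plus a pseudocount. That scoring choice is a common position-specific scoring convention used here for illustration; the abstract does not state the authors' exact normalization, and the function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def build_qm(binding_cores, pseudocount=1.0):
    """Build a per-position log-odds matrix from aligned, equal-length cores.

    Assumes every core contains only the 20 standard residues.
    """
    length = len(binding_cores[0])
    background = 1.0 / len(AMINO_ACIDS)  # assumption: uniform background
    total = len(binding_cores) + pseudocount * len(AMINO_ACIDS)
    qm = []
    for pos in range(length):
        counts = Counter(core[pos] for core in binding_cores)
        column = {}
        for aa in AMINO_ACIDS:
            freq = (counts.get(aa, 0) + pseudocount) / total
            column[aa] = math.log2(freq / background)
        qm.append(column)
    return qm

def score_peptide(qm, peptide):
    """Slide the matrix along the peptide; return (best score, start offset)."""
    core_len = len(qm)
    best = None
    for start in range(len(peptide) - core_len + 1):
        window = peptide[start:start + core_len]
        score = sum(qm[i][aa] for i, aa in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best
```

A binder/non-binder call would then compare the best score with a threshold chosen on validation data; the abstract does not give such a threshold.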
2.
Rev. cuba. med. mil ; 52(4)Dec. 2023. illus
Article in Spanish | LILACS, CUMED | ID: biblio-1559881

ABSTRACT

Obesity phenotypes occur in individuals who have the same body mass index but different metabolic profiles and health prognoses. Their presence from early stages of life increases the probability that a woman will enter pregnancy with these characteristics, so a conceptual position for identifying them is needed. In normal-weight pregnant women, it is suggested that the normal-weight obese phenotype be identified when body fat is equal to or greater than 30 percent, or when the sum of the triceps and subscapular skinfolds is at or above the 90th percentile. Of these women, those with values at or above the 75th percentile of the visceral adiposity index and of the lipid accumulation product are considered metabolically obese normal-weight. In obese women, the criteria that define metabolic syndrome in women, with values adjusted for pregnancy, are proposed for identifying metabolic health. The arguments presented demonstrate the suitability of stratifying metabolic risk at the beginning of pregnancy by classifying women into obesity phenotypes, using the anthropometric, biochemical, and clinical indicators that identify metabolic syndrome (AU)


Subjects
Humans , Pregnancy , Phenotype , Body Mass Index , Metabolic Syndrome/metabolism , Position-Specific Scoring Matrices , Obesity, Maternal/metabolism , Obesity/classification , Risk Factors , Adiposity , Indicators and Reagents/metabolism
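
The decision rule proposed in entry 2 for normal-weight pregnant women can be written as a short sketch. The function name, the label for women who do not meet the first criterion, and the `cutoffs` dictionary keys are hypothetical; the percentile cutoffs must come from a reference population, which the abstract does not supply, so the numbers in any call are placeholders only.

```python
def classify_normal_weight(body_fat_pct, skinfold_sum, vai, lap, cutoffs):
    """Classify a normal-weight pregnant woman into an obesity phenotype.

    cutoffs holds reference-population percentiles (hypothetical keys):
      skinfold_p90: 90th percentile of triceps + subscapular skinfold sum
      vai_p75:      75th percentile of the visceral adiposity index
      lap_p75:      75th percentile of the lipid accumulation product
    """
    # Step 1: body fat >= 30 percent OR skinfold sum >= 90th percentile.
    is_nwo = body_fat_pct >= 30.0 or skinfold_sum >= cutoffs["skinfold_p90"]
    if not is_nwo:
        return "not normal-weight obese"
    # Step 2: among those, both VAI and LAP at or above the 75th percentile.
    if vai >= cutoffs["vai_p75"] and lap >= cutoffs["lap_p75"]:
        return "metabolically obese normal-weight"
    return "normal-weight obese"
```

This is an illustration of the logic only, not a clinical tool.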
3.
Int J Biol Macromol ; 244: 124993, 2023 Jul 31.
Article in English | MEDLINE | ID: mdl-37307968

ABSTRACT

Copper ion-binding proteins play an essential role in metabolic processes and are critical factors in many diseases, such as breast cancer, lung cancer, and Menkes disease. Many algorithms have been developed for predicting metal ion classification and binding sites, but none have been applied to copper ion-binding proteins. In this study, we developed a copper ion-binding protein classifier, RPCIBP, which integrates the reduced amino acid composition into the position-specific scoring matrix (PSSM). The reduced amino acid composition filters out a large number of useless evolutionary features, improving the operational efficiency and predictive ability of the model (feature dimension from 2900 to 200, ACC from 83% to 85.1%). Compared with the basic model using only three sequence feature extraction methods (ACC between 73.8% and 86.2% on the training set and between 69.3% and 87.5% on the test set), the model integrating the evolutionary features of the reduced amino acid composition showed higher accuracy and robustness (ACC between 83.1% and 90.8% on the training set and between 79.1% and 91.9% on the test set). The best copper ion-binding protein classifiers, filtered by the feature selection process, were deployed in a user-friendly web server (http://bioinfor.imu.edu.cn/RPCIBP). RPCIBP can accurately predict copper ion-binding proteins, which is convenient for further structural and functional studies and conducive to mechanism exploration and targeted drug development.


Subjects
Copper , Proteins , Position-Specific Scoring Matrices , Proteins/chemistry , Algorithms , Amino Acids/chemistry , Databases, Protein , Computational Biology/methods
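
One way to picture how a reduced amino acid alphabet shrinks PSSM-derived features (entry 3) is to merge the 20 PSSM columns into a few cluster columns. The sketch below is an assumption-laden illustration: the six-cluster grouping is hypothetical, summing columns is one of several possible merge rules, and the abstract does not describe RPCIBP's actual clustering or feature construction.

```python
# Column order of a PSI-BLAST ASCII PSSM.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

# Hypothetical grouping by rough physicochemical class (illustration only).
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]

def reduce_pssm(pssm, clusters=CLUSTERS):
    """Merge the 20 columns of an L x 20 PSSM into one column per cluster."""
    col = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    return [
        [sum(row[col[aa]] for aa in group) for group in clusters]
        for row in pssm
    ]

def mean_profile(reduced):
    """Average each reduced column over the sequence: a fixed-length feature."""
    n_rows = len(reduced)
    n_cols = len(reduced[0])
    return [sum(row[j] for row in reduced) / n_rows for j in range(n_cols)]
```

With six clusters, any feature built on column pairs drops from 400 to 36 dimensions, which is the kind of reduction the abstract reports at a larger scale.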
4.
Int J Biol Macromol ; 243: 125296, 2023 Jul 15.
Article in English | MEDLINE | ID: mdl-37301349

ABSTRACT

Angiogenic proteins (AGPs) play a primary role in the formation of new blood vessels from pre-existing ones. AGPs have diverse applications in cancer, including serving as biomarkers, guiding anti-angiogenic therapies, and aiding in tumor imaging. Understanding the role of AGPs in cardiovascular and neurodegenerative diseases is vital for developing new diagnostic tools and therapeutic approaches. Considering the significance of AGPs, in this research we established, for the first time, a computational model using deep learning for identifying AGPs. First, we constructed a sequence-based dataset. Second, we explored features by designing a novel feature encoder, called position-specific scoring matrix-decomposition-discrete cosine transform (PSSM-DC-DCT), and by using existing descriptors including Dipeptide Deviation from Expected Mean (DDE) and bigram-position-specific scoring matrix (Bi-PSSM). Third, each feature set was fed into a two-dimensional convolutional neural network (2D-CNN) and machine learning classifiers. Finally, the performance of each learning model was validated by 10-fold cross-validation (CV). The experimental results demonstrate that the 2D-CNN with the proposed novel feature descriptor achieved the highest success rate on both training and testing datasets. In addition to being an accurate predictor for identifying angiogenic proteins, our proposed method (Deep-AGP) might be fruitful for understanding cancer, cardiovascular, and neurodegenerative diseases, and for developing novel therapeutic methods and drug design.


Subjects
Machine Learning , Neural Networks, Computer , Position-Specific Scoring Matrices
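
The bigram-PSSM (Bi-PSSM) descriptor named in entry 4 is commonly defined as the sum, over consecutive sequence positions, of the product of the score for residue type i at one position and residue type j at the next. The sketch below follows that common definition; the function name is hypothetical, and any preprocessing of the PSSM (for example scaling scores into probabilities) is left out because the abstract does not specify it.

```python
def bigram_pssm(pssm):
    """Return the flattened bigram feature vector of an L x C PSSM.

    feats[i][j] = sum over k of pssm[k][i] * pssm[k + 1][j],
    giving C * C values (400 for the usual 20 columns).
    """
    n_cols = len(pssm[0])
    feats = [[0.0] * n_cols for _ in range(n_cols)]
    for k in range(len(pssm) - 1):
        cur, nxt = pssm[k], pssm[k + 1]
        for i in range(n_cols):
            for j in range(n_cols):
                feats[i][j] += cur[i] * nxt[j]
    # Flatten row by row so the vector can be fed to a classifier.
    return [value for row in feats for value in row]
```

The output length depends only on the number of columns, so proteins of different lengths map to vectors of the same size.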
5.
Methods ; 209: 10-17, 2023 01.
Article in English | MEDLINE | ID: mdl-36427763

ABSTRACT

Adaptor proteins, also known as signal transduction adaptor proteins, are important proteins in signal transduction pathways and play a role in connecting signal proteins for signal transduction between cells. Studies have shown that adaptor proteins are closely related to some diseases, such as tumors and diabetes. Therefore, it is very meaningful to construct a model that accurately identifies adaptor proteins. In recent years, many studies have used a position-specific scoring matrix (PSSM) and neural network methods to identify adaptor proteins. However, ordinary neural network models cannot correlate the contextual information in PSSM profiles well, so these studies usually reduce the 20×N (N > 20) PSSM to 20×20 dimensions, which loses a large amount of protein information. This research proposes an efficient method that combines a one-dimensional convolutional neural network (1-D CNN) and a bidirectional long short-term memory network (biLSTM) to identify adaptor proteins. The complete PSSM profiles are the input of the model, so the complete information of the protein is retained during training. We performed cross-validation during model training and tested the model on an independent test set. On a data set with 1224 adaptor proteins and 11,078 non-adaptor proteins, five indicators were used to evaluate model performance: specificity, sensitivity, accuracy, area under the receiver operating characteristic curve (AUC), and Matthews correlation coefficient (MCC). On the independent test set, the specificity, sensitivity, accuracy and MCC were 0.817, 0.865, 0.823 and 0.465, respectively. These results show that our method outperforms state-of-the-art methods. This study aims to improve the accuracy of adaptor protein identification and lays a foundation for further research on diseases related to adaptor proteins. It also provides a new idea for applying deep learning models in bioinformatics and computational biology.


Subjects
Deep Learning , Position-Specific Scoring Matrices , Neural Networks, Computer , Software , Adaptor Proteins, Signal Transducing , Algorithms
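
Several entries in this list, including entry 5, report sensitivity, specificity, accuracy, and MCC. A minimal sketch of how these follow from a binary confusion matrix is given below, using the standard definitions; the function name is hypothetical and the counts in any call are illustrative, not taken from the papers.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary classification metrics from confusion counts.

    Assumes both classes are present (tp + fn > 0 and tn + fp > 0).
    """
    sensitivity = tp / (tp + fn)          # true positive rate
    specificity = tn / (tn + fp)          # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator vanishes.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }
```

Because MCC uses all four cells of the confusion matrix, it can be much lower than accuracy on imbalanced data, such as a test set with far more non-adaptor than adaptor proteins.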
6.
J Chem Inf Model ; 62(19): 4820-4826, 2022 10 10.
Article in English | MEDLINE | ID: mdl-36166351

ABSTRACT

Background: SNARE proteins play a vital role in membrane fusion and in cellular physiological and pathological processes. Many potential therapeutics for mental diseases or even cancer based on SNAREs are also being developed. Therefore, there is a pressing need to predict SNAREs for further manipulation of these essential proteins, which demands new and efficient approaches. Methods: Some computational frameworks have been proposed to overcome the hurdles of biological methods, which require considerable time and budget to identify SNAREs. However, the performance of existing frameworks has been unsatisfactory, as they fail to retain the SNARE sequence order and to capture the many hidden features of SNAREs. This paper proposes a novel model built on a multiscan convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to address these limitations. We trained our model on the benchmark dataset with fivefold cross-validation and evaluated it on two different independent datasets. Results: Overall, the multiscan CNN was cross-validated on the training set and excelled in SNARE classification, reaching 0.963 in AUC and 0.955 in AUPRC. In addition, with sensitivity, specificity, accuracy, and MCC of 0.842, 0.968, 0.955, and 0.767, respectively, our proposed framework outperformed previous models in the SNARE recognition task. Conclusions: We believe that our model can contribute to discriminating SNARE proteins from general proteins.


Subjects
Neural Networks, Computer , SNARE Proteins , Position-Specific Scoring Matrices
7.
Comput Biol Med ; 145: 105533, 2022 06.
Article in English | MEDLINE | ID: mdl-35447463

ABSTRACT

DNA-protein interaction is a critical biological process that performs influential activities, including DNA transcription and recombination. DBPs (DNA-binding proteins) are closely associated with different kinds of human diseases (asthma, cancer, and AIDS), while some of the DBPs are used in the production of antibiotics, steroids, and anti-inflammatories. Several methods have been reported for the prediction of DBPs. However, a more intelligent method is still highly desirable for the accurate prediction of DBPs. This study presents an intelligent computational method, Target-DBPPred, to improve DBPs prediction. Important features from primary protein sequences are investigated via a novel feature descriptor, called EDF-PSSM-DWT (Evolutionary difference formula position-specific scoring matrix-discrete wavelet transform) and several other multi-evolutionary methods, including F-PSSM (Filtered position-specific scoring matrix), EDF-PSSM (Evolutionary difference formula position-specific scoring matrix), PSSM-DPC (Position-specific scoring matrix-dipeptide composition), and Lead-BiPSSM (Lead-bigram-position specific scoring matrix) to encapsulate diverse multivariate features. The best feature set from the features of each descriptor is selected using sequential forward selection (SFS). Further, four models are trained using Adaboost, XGB (eXtreme gradient boosting), ERT (extremely randomized trees), and LiXGB (Light eXtreme gradient boosting) classifiers. LiXGB, with the best feature set of EDF-PSSM-DWT, has attained 6.69% and 15.07% higher performance in terms of accuracies using training and testing datasets, respectively. The obtained results verify the improved performance of our proposed predictor over the existing predictors.


Subjects
DNA-Binding Proteins , Wavelet Analysis , Algorithms , Computational Biology/methods , DNA/chemistry , DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , Databases, Protein , Humans , Position-Specific Scoring Matrices , Support Vector Machine
8.
J Bioinform Comput Biol ; 19(4): 2150018, 2021 08.
Article in English | MEDLINE | ID: mdl-34291709

ABSTRACT

DNA-binding proteins (DBPs) perform an influential role in diverse biological activities like DNA replication, slicing, repair, and transcription. Some DBPs are indispensable for understanding many types of human cancers (e.g., lung, breast, and liver cancer) and chronic diseases (e.g., AIDS/HIV, asthma), while other kinds are involved in the design of antibiotics, steroids, and anti-inflammatory drugs. These crucial processes are closely related to DBP types. DBPs are categorized into single-stranded DNA-binding proteins (ssDBPs) and double-stranded DNA-binding proteins (dsDBPs). Few computational predictors have been reported for discriminating ssDBPs from dsDBPs. However, due to the limitations of the existing methods, an intelligent computational system is still highly desirable. In this work, features from protein sequences are discovered by extending the notions of dipeptide composition (DPC), evolutionary difference formula (EDF), and K-separated bigram (KSB) into the position-specific scoring matrix (PSSM). The highly intrinsic information was encoded by a compression approach, the discrete cosine transform (DCT), and the model was trained with a support vector machine (SVM). The prediction performance was further boosted by a genetic algorithm (GA) ensemble strategy. The novel predictor (DBP-GAPred) achieved 1.89%, 0.28%, and 6.63% higher accuracies on jackknife, 10-fold, and independent dataset tests, respectively, than the best existing predictor. These outcomes confirm the superiority of our method over the existing predictors.


Subjects
DNA-Binding Proteins , Support Vector Machine , Algorithms , Amino Acid Sequence , Computational Biology , DNA-Binding Proteins/genetics , Databases, Protein , Humans , Position-Specific Scoring Matrices
9.
Front Immunol ; 12: 592447, 2021.
Article in English | MEDLINE | ID: mdl-33717070

ABSTRACT

The micropolymorphism of major histocompatibility complex class I (MHC-I) can greatly alter the plasticity of peptide presentation, but elucidating the underlying mechanism remains a challenge. Here we investigated the impact of micropolymorphism on peptide presentation by swine MHC-I (termed swine leukocyte antigen class I, SLA-I) molecules via immunopeptidomes, which were determined by our newly developed method combining a random peptide library with mass spectrometry (MS) de novo sequencing (termed RPLD-MS), and via the corresponding crystal structures. The immunopeptidomes of SLA-1*04:01, SLA-1*13:01, and their mutants showed that mutations of residues 156 and 99 could expand and narrow the ranges of peptides presented by SLA-I molecules, respectively. The R156A mutation of SLA-1*04:01 altered the charge properties and enlarged the volume of pocket D, which eliminated the strict restriction on accommodating the third (P3) anchor residue of the peptide and expanded the peptide binding scope. Compared with 99Tyr of SLA-1*04:01, 99Phe of SLA-1*13:01 could not form a conserved hydrogen bond with the backbone of the P3 residue, leading to fewer changes in the pocket properties but a significant decrease in the quantity of the immunopeptidome. This absent force could be compensated by the salt bridge formed by P1-E and 170Arg. These data illustrate two distinct ways in which micropolymorphism alters the peptide-binding plasticity of SLA-I alleles, verifying the sensitivity and accuracy of the RPLD-MS method for determining the peptide binding characteristics of MHC-I in vitro and helping to more accurately predict and identify MHC-I restricted epitopes.


Subjects
Epitopes, T-Lymphocyte/chemistry , Histocompatibility Antigens Class I/chemistry , Molecular Docking Simulation , Molecular Dynamics Simulation , Peptides/chemistry , Amino Acid Motifs , Amino Acid Sequence , Animals , Binding Sites , Chromatography, Liquid , Circular Dichroism , Epitopes, T-Lymphocyte/genetics , Epitopes, T-Lymphocyte/immunology , Histocompatibility Antigens Class I/genetics , Histocompatibility Antigens Class I/immunology , Mutation , Peptide Library , Peptides/genetics , Peptides/immunology , Position-Specific Scoring Matrices , Protein Binding , Protein Conformation , Structure-Activity Relationship , Swine , Tandem Mass Spectrometry , X-Ray Diffraction
10.
Curr Top Med Chem ; 20(21): 1888-1897, 2020.
Article in English | MEDLINE | ID: mdl-32648847

ABSTRACT

OBJECTIVE: Cancer is one of the most serious diseases affecting human health. Among all current cancer treatments, early diagnosis and control significantly increase the chances of cure. Detecting cancer biomarkers in body fluids is now attracting more attention among oncologists. In-silico prediction of body fluid-related proteins, which can serve as cancer biomarkers, can guide labor-intensive and time-consuming biochemical experiments. METHODS: In this work, we propose a novel method for high-throughput identification of cancer biomarkers in human body fluids. We incorporate physicochemical properties into the weighted observed percentages (WOP) and position-specific scoring matrix (PSSM) profiles to enhance the attributes that reflect the evolutionary conservation of body fluid-related proteins. The least absolute shrinkage and selection operator (LASSO) feature selection strategy is used to generate the optimal feature subset. RESULTS: The ten-fold cross-validation results on training datasets demonstrate the accuracy of the proposed model. We also test our proposed method on independent testing datasets and apply it to the identification of potential cancer biomarkers in human body fluids. CONCLUSION: The testing results indicate that our approach generalizes well.


Subjects
Biomarkers, Tumor/analysis , Body Fluids/chemistry , Neoplasm Proteins/analysis , Neoplasms/diagnosis , Position-Specific Scoring Matrices , Support Vector Machine , Chemistry, Physical , Databases, Protein , Humans
11.
Nucleic Acids Res ; 48(7): 3435-3454, 2020 04 17.
Article in English | MEDLINE | ID: mdl-32133533

ABSTRACT

Analysis of ENCODE long RNA-Seq and ChIP-seq (Chromatin Immunoprecipitation Sequencing) datasets for HepG2 and HeLa cell lines uncovered 1647 and 1958 transcripts that interfere with transcription factor binding to human enhancer domains. TFBSs (Transcription Factor Binding Sites) intersected by these 'Enhancer Occlusion Transcripts' (EOTrs) displayed significantly lower relative transcription factor (TF) binding affinities compared to TFBSs for the same TF devoid of EOTrs. Expression of most EOTrs was regulated in a cell line specific manner; analysis for the same TFBSs across cell lines, i.e. in the absence or presence of EOTrs, yielded consistently higher relative TF/DNA-binding affinities for TFBSs devoid of EOTrs. Lower activities of EOTr-associated enhancer domains coincided with reduced occupancy levels for histone tail modifications H3K27ac and H3K9ac. Similarly, the analysis of EOTrs with allele-specific expression identified lower activities for alleles associated with EOTrs. ChIA-PET (Chromatin Interaction Analysis by Paired-End Tag Sequencing) and 5C (Carbon Copy Chromosome Conformation Capture) uncovered that enhancer domains associated with EOTrs preferentially interacted with poised gene promoters. Analysis of EOTr regions with GRO-seq (Global run-on) data established the correlation of RNA polymerase pausing and occlusion of TF-binding. Our results implied that EOTr expression regulates human enhancer domains via transcriptional interference.


Subjects
Enhancer Elements, Genetic , Transcription Factors/metabolism , Transcription, Genetic , Alleles , Binding Sites , Chromatin/chemistry , Chromatin Immunoprecipitation Sequencing , DNA-Directed RNA Polymerases/metabolism , HeLa Cells , Hep G2 Cells , Histone Code , Humans , Position-Specific Scoring Matrices , Promoter Regions, Genetic , RNA-Seq , p300-CBP Transcription Factors/metabolism
12.
Genes (Basel) ; 10(12)2019 11 22.
Article in English | MEDLINE | ID: mdl-31771119

ABSTRACT

The prediction of protein-ligand binding sites is important in drug discovery and drug design. Protein-ligand binding site prediction computational methods are inexpensive and fast compared with experimental methods. This paper proposes a new computational method, SXGBsite, which includes the synthetic minority over-sampling technique (SMOTE) and the Extreme Gradient Boosting (XGBoost). SXGBsite uses the position-specific scoring matrix discrete cosine transform (PSSM-DCT) and predicted solvent accessibility (PSA) to extract features containing sequence information. A new balanced dataset was generated by SMOTE to improve classifier performance, and a prediction model was constructed using XGBoost. The parallel computing and regularization techniques enabled high-quality and fast predictions and mitigated overfitting caused by SMOTE. An evaluation using 12 different types of ligand binding site independent test sets showed that SXGBsite performs similarly to the existing methods on eight of the independent test sets with a faster computation time. SXGBsite may be applied as a complement to biological experiments.


Subjects
Proteins/chemistry , Algorithms , Amino Acid Sequence , Binding Sites , Computational Biology/methods , Ligands , Position-Specific Scoring Matrices , Protein Binding , Proteins/metabolism , Wavelet Analysis
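
The PSSM-DCT feature used in entry 12 compresses a variable-length PSSM into a fixed number of low-frequency coefficients. The sketch below applies an unnormalized one-dimensional DCT-II to each PSSM column and keeps the first few coefficients. Treat it as an assumption: some published variants use a two-dimensional DCT or a normalized transform, and the abstract does not say which form SXGBsite uses. Function names and the `keep` parameter are hypothetical.

```python
import math

def dct_ii(signal):
    """Unnormalized DCT-II: X[k] = sum_t x[t] * cos(pi * (t + 0.5) * k / N)."""
    n = len(signal)
    return [
        sum(x * math.cos(math.pi * (t + 0.5) * k / n) for t, x in enumerate(signal))
        for k in range(n)
    ]

def pssm_dct(pssm, keep=2):
    """Keep the first `keep` DCT coefficients of every PSSM column.

    The result has (number of columns) * keep values regardless of
    sequence length, as long as the sequence has at least `keep` rows.
    """
    features = []
    for column in zip(*pssm):  # iterate over columns of the L x C matrix
        features.extend(dct_ii(column)[:keep])
    return features
```

Low-order coefficients capture the slowly varying trend along the sequence, which is why truncating the transform acts as compression.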
13.
Sci Rep ; 9(1): 11966, 2019 08 19.
Article in English | MEDLINE | ID: mdl-31427604

ABSTRACT

Progesterone receptor (PGR) co-ordinately regulates ovulation, fertilisation and embryo implantation through tissue-specific actions, but the mechanisms for divergent PGR action are poorly understood. Here we characterised PGR activity in mouse granulosa cells using combined ChIP-seq for PGR and H3K27ac and gene expression microarray. Comparison of granulosa, uterus and oviduct PGR-dependent genes showed almost complete tissue specificity in PGR target gene profiles. In granulosa cells 82% of identified PGR-regulated genes bound PGR within 3 kb of the gene and PGR binding sites were highly enriched in proximal promoter regions in close proximity to H3K27ac-modified active chromatin. Motif analysis showed highly enriched PGR binding to the PGR response element (GnACAnnnTGTnC), but PGR also interacted significantly with other transcription factor binding motifs. In uterus PGR showed far more tendency to bind intergenic chromatin regions and low evidence of interaction with other transcription factors. This is the first genome-wide description of PGR action in granulosa cells and systematic comparison of diverse PGR action in different reproductive tissues. It clarifies finely-tuned contextual PGR-chromatin interactions with implications for more targeted reproductive medicine.


Subjects
Chromatin/genetics , Chromatin/metabolism , Gene Expression Regulation , Progesterone/metabolism , Receptors, Progesterone/metabolism , Base Sequence , Binding Sites , Female , Granulosa Cells/metabolism , Histones/metabolism , Humans , Nucleotide Motifs , Organ Specificity , Ovary/metabolism , Ovulation/genetics , Position-Specific Scoring Matrices , Protein Binding , Response Elements
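
The PGR response element quoted in entry 13, GnACAnnnTGTnC, can be located in a DNA sequence with a simple consensus search, reading `n` as any nucleotide. The sketch below is exact consensus matching with a regular expression, which is a simplification: it is not the motif-enrichment analysis of ChIP-seq peaks that the study describes, and the function names are hypothetical. Because this consensus equals its own reverse complement, scanning one strand is enough.

```python
import re

def motif_to_regex(motif):
    """Translate a consensus string into a regex; 'n' or 'N' matches any base."""
    return "".join("[ACGT]" if ch in "nN" else ch.upper() for ch in motif)

def find_sites(sequence, motif="GnACAnnnTGTnC"):
    """Return (start, matched_text) for every occurrence, overlaps included."""
    # A lookahead with a capture group reports overlapping matches as well.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]
```

A position-specific scoring matrix would replace the yes/no match with a graded score per site, which is closer to how binding motifs are usually evaluated.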
14.
J Mol Graph Model ; 92: 86-93, 2019 11.
Article in English | MEDLINE | ID: mdl-31344547

ABSTRACT

Membrane proteins, the most important drug targets, account for around 30% of total proteins encoded by the genome of living organisms. An important role of these proteins is to bind adenosine triphosphate (ATP), facilitating crucial biological processes such as metabolism and cell signaling. There are several reports elucidating ATP-binding sites within proteins. However, such studies on membrane proteins are limited. Our prediction tool, DeepATP, combines evolutionary information in the form of Position Specific Scoring Matrix and two-dimensional Convolutional Neural Network to predict ATP-binding sites in membrane proteins with an MCC of 0.89 and an AUC of 99%. Compared to recently published ATP-binding site predictors and classifiers that use traditional machine learning algorithms, our approach performs significantly better. We suggest this method as a reliable tool for biologists for ATP-binding site prediction in membrane proteins.


Subjects
Adenosine Triphosphate/chemistry , Binding Sites , Membrane Proteins/chemistry , Models, Theoretical , Neural Networks, Computer , Adenosine Triphosphate/metabolism , Algorithms , Amino Acid Motifs , Amino Acid Sequence , Computational Biology/methods , Databases, Protein , Machine Learning , Membrane Proteins/metabolism , Position-Specific Scoring Matrices , ROC Curve , Reproducibility of Results , Web Browser
15.
Comput Methods Programs Biomed ; 177: 81-88, 2019 Aug.
Article in English | MEDLINE | ID: mdl-31319963

ABSTRACT

BACKGROUND AND OBJECTIVES: Clathrin is an adaptor protein that serves as the principal element of the vesicle-coating complex and is important for the membrane cleavage that releases the invaginated vesicle from the plasma membrane. The functional loss of clathrins has been tied to many human diseases, e.g., neurodegenerative disorders, cancer, and Alzheimer's disease. Therefore, creating a precise model to identify its functions is a crucial step towards understanding human diseases and designing drug targets. METHODS: We present a deep learning model using a two-dimensional convolutional neural network (CNN) and position-specific scoring matrix (PSSM) profiles to identify clathrin proteins from high-throughput sequences. Traditionally, 2D CNNs take images as input, so we treated the 20 × 20 PSSM profile as an image of 20 × 20 pixels. The input PSSM profile was then connected to our 2D CNN, in which we set a variety of parameters to improve the performance of the model. Based on the 10-fold cross-validation results, a hyper-parameter optimization process was employed to find the best model for our dataset. Finally, an independent dataset was used to assess the predictive ability of the current model. RESULTS: Our model could identify clathrin proteins with sensitivity of 92.2%, specificity of 91.2%, accuracy of 91.8%, and MCC of 0.83 on the independent dataset. Compared to state-of-the-art traditional neural networks, our method achieved a significant improvement in all typical measurement metrics. CONCLUSIONS: With the proposed study, we provide an effective tool for investigating clathrin proteins, and our achievement could promote the use of deep learning in biomedical research. We also provide the source code and dataset freely at https://www.github.com/khanhlee/deep-clathrin/.


Subjects
Clathrin/chemistry , Deep Learning , Neural Networks, Computer , Position-Specific Scoring Matrices , Algorithms , Cell Membrane/chemistry , Humans , Reproducibility of Results , Sensitivity and Specificity , Software
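
Entry 15 feeds a 20 × 20 PSSM "image" to a 2D CNN, and entry 5 mentions the same reduction from an N-row profile. A common way to obtain a fixed 20 × 20 matrix is to sum all PSSM rows that belong to the same residue type. The sketch below assumes that convention and the PSI-BLAST column order; whether the authors additionally scaled the sums (for example by sequence length) is not stated in the abstracts, and the function name is hypothetical.

```python
# Row and column order follow the PSI-BLAST ASCII PSSM convention.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"

def pssm_to_20x20(sequence, pssm):
    """Collapse an L x 20 PSSM into 20 x 20 by summing rows per residue type.

    out[a][j] is the sum of column j over all positions whose residue is
    amino acid a. Non-standard residues (e.g. 'X') are skipped.
    """
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    out = [[0.0] * 20 for _ in range(20)]
    for residue, row in zip(sequence, pssm):
        r = index.get(residue)
        if r is None:
            continue
        for j, value in enumerate(row):
            out[r][j] += value
    return out
```

The reduction discards the order of residues along the sequence, which is the information loss that entry 5 cites as motivation for using the full profile.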
16.
J Comput Aided Mol Des ; 33(7): 645-658, 2019 07.
Article in English | MEDLINE | ID: mdl-31123959

ABSTRACT

DNA-binding proteins (DBPs) participate in various biological processes including DNA replication, recombination, and repair. In the human genome, about 6-7% of genes encode these proteins. DBPs shape DNA into a compact structure known as chromatin, while some of these proteins regulate chromosome packaging and transcription. In the pharmaceutical industry, DBPs are used as a key component of antibiotics, steroids, and cancer drugs. These proteins are also involved in biophysical, biological, and biochemical studies of DNA. Because of their crucial role in various biological activities, identifying DBPs is an active topic in protein science. A series of experimental and computational methods have been proposed; however, some did not achieve the desired results, while others are inadequate in accuracy and reliability. More intelligent computational predictors are therefore still highly desirable. In this work, we introduce an innovative computational method, DP-BINDER, based on physicochemical and evolutionary information. We captured local, highly decisive features from the physicochemical properties of primary protein sequences via normalized Moreau-Broto autocorrelation (NMBAC), and evolutionary information via position specific scoring matrix-transition probability composition (PSSM-TPC) and pseudo-position specific scoring matrix (PsePSSM), using training and independent datasets. The optimal features were selected from the fused features by support vector machine-recursive feature elimination and correlation bias reduction (SVM-RFE + CBR) and were fed into random forest (RF) and support vector machine (SVM) classifiers. Our method attained 92.46% and 89.58% accuracy with jackknife and ten-fold cross-validation, respectively, on the training dataset, and 81.17% accuracy on the independent dataset for prediction of DBPs. These results demonstrate that our method attained the highest success rate in the literature. The superiority of DP-BINDER over existing approaches is due to several factors, including the abstraction of local dominant features via effective feature descriptors, the use of appropriate feature selection algorithms, and an effective classifier.


Subjects
DNA-Binding Proteins/chemistry , Machine Learning , Algorithms , Animals , Binding Sites , DNA/chemistry , Databases, Protein , Evolution, Molecular , Humans , Position-Specific Scoring Matrices , Support Vector Machine
17.
Sci Rep ; 9(1): 460, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30679521

ABSTRACT

Cardiac hypertrophy is closely correlated with diverse cardiovascular diseases, augmenting the risk of heart failure and sudden death. Long non-coding RNAs (lncRNAs) have been studied in cardiac hypertrophy for their regulatory function. LncRNA MEG3 has been reported in human cancers. However, it is unknown whether MEG3 regulates the growth of cardiac hypertrophy. Therefore, this study aims to investigate the specific role of MEG3 in the progression of cardiac hypertrophy. Here, we found that MEG3 contributed to the pathogenesis of cardiac hypertrophy. MEG3 expression was remarkably increased in the hearts of mice that had undergone transverse aortic constriction (TAC). Moreover, qRT-PCR analysis revealed that MEG3 was upregulated in cardiomyocytes treated with Ang-II. Silencing MEG3 inhibited the increase in size of hypertrophic cardiomyocytes and reversed other hypertrophic responses. Mechanistically, MEG3 could affect cardiac hypertrophy by regulating gene expression: we found that MEG3 could be upregulated by the transcription factor STAT3 and could regulate miR-361-5p and HDAC9 by acting as a ceRNA. Finally, rescue assays were performed for further confirmation. All our findings revealed that STAT3-induced upregulation of lncRNA MEG3 controls cardiac hypertrophy by regulating the miR-361-5p/HDAC9 axis.


Subjects
Cardiomegaly/genetics , Cardiomegaly/metabolism , Gene Expression Regulation , Histone Deacetylases/genetics , MicroRNAs/genetics , RNA, Long Noncoding/genetics , Repressor Proteins/genetics , STAT3 Transcription Factor/metabolism , Animals , Binding Sites , Cardiomegaly/pathology , Male , Mice , Models, Biological , Myocytes, Cardiac/metabolism , Nucleotide Motifs , Position-Specific Scoring Matrices , Protein Binding , RNA Interference
18.
Clin Cancer Res ; 25(2): 698-709, 2019 01 15.
Article in English | MEDLINE | ID: mdl-30327303

ABSTRACT

PURPOSE: There is a growing interest in the use of tumor antigens for therapeutic vaccination strategies. Unfortunately, in most cases, the use of peptide vaccines in patients does not mediate shrinkage of solid tumor masses. EXPERIMENTAL DESIGN: Here, we studied the opportunity to boost peptide vaccination with F8-TNF, an antibody fusion protein that selectively delivers TNF to the tumor extracellular matrix. AH1, a model antigen to investigate CD8+ T-cell immunity in BALB/c mice, was used as vaccine. RESULTS: Peptide antigens alone exhibited only a modest tumor growth inhibition. However, anticancer activity could be substantially increased by combination with F8-TNF. Analysis of T cells in tumors and in draining lymph nodes revealed a dramatic expansion of AH1-specific CD8+ T cells, which were strongly positive for PD-1, LAG-3, and TIM-3. The synergistic anticancer activity, observed in the combined use of peptide vaccination and F8-TNF, was largely due to the ability of the fusion protein to induce a rapid hemorrhagic necrosis in the tumor mass, thus leaving few residual tumor cells. While the cell surface phenotype of tumor-infiltrating CD8+ T cells did not substantially change upon treatment, the proportion of AH1-specific T cells was strongly increased in the combination therapy group, reaching more than 50% of the CD8+ T cells within the tumor mass. CONCLUSIONS: Because both peptide vaccination strategies and tumor-homing TNF fusion proteins are currently being studied in clinical trials, our study provides a rationale for the combination of these 2 regimens for the treatment of patients with cancer.


Subjects
Cancer Vaccines/immunology , Immunoconjugates/administration & dosage , Neoplasms/pathology , Neovascularization, Pathologic , Tumor Necrosis Factor-alpha/administration & dosage , Vaccines, Subunit/immunology , Amino Acid Sequence , Animals , Cancer Vaccines/administration & dosage , Cell Line, Tumor , Combined Modality Therapy , Disease Models, Animal , Female , Histocompatibility Antigens Class I/chemistry , Histocompatibility Antigens Class I/immunology , Humans , Mice , Neoplasms/therapy , Neovascularization, Pathologic/immunology , Neovascularization, Pathologic/therapy , Peptides/chemistry , Peptides/immunology , Position-Specific Scoring Matrices , Vaccines, Subunit/administration & dosage , Xenograft Model Antitumor Assays
19.
Int J Mol Sci ; 19(10)2018 Oct 18.
Article in English | MEDLINE | ID: mdl-30340407

ABSTRACT

Signaling in host plants is an integral part of a successful infection by pathogenic RNA viruses. Therefore, identifying early signaling events in host plants that play an important role in establishing the infection process will help our understanding of the disease process. In this context, phosphorylation constitutes one of the most important post-translational protein modifications, regulating many cellular signaling processes. In this study, we aimed to identify the processes affected by infection with Peanut stunt virus (PSV) and its satellite RNA (satRNA) in Nicotiana benthamiana at the early stage of pathogenesis. To achieve this, we performed proteome and phosphoproteome analyses on plants treated with PSV and its satRNA. The analysis of the number of differentially phosphorylated proteins showed strong down-regulation in phosphorylation in virus-treated plants (without satRNA). Moreover, proteome analysis revealed more down-regulated proteins in PSV and satRNA-treated plants, which indicated a complex dependence between proteins and their modifications. Apart from changes in photosynthesis and carbon metabolism, which are usually observed in virus-infected plants, alterations in proteins involved in RNA synthesis, transport, and turnover were observed. As a whole, this is the first community (phospho)proteome resource upon infection of N. benthamiana with a cucumovirus and its satRNA and this resource constitutes a valuable data set for future studies.


Subjects
Cucumovirus/physiology , Host-Pathogen Interactions , Nicotiana/metabolism , Nicotiana/virology , Plant Diseases/virology , RNA, Satellite , RNA, Viral , Amino Acid Motifs , Amino Acid Sequence , Humans , Phenotype , Phosphoproteins , Phosphorylation , Position-Specific Scoring Matrices , Protein Interaction Mapping , Protein Interaction Maps , Proteome , Proteomics/methods
20.
Front Immunol ; 9: 1695, 2018.
Article in English | MEDLINE | ID: mdl-30100904

ABSTRACT

Identification of B-cell epitopes (BCEs) is a fundamental step for epitope-based vaccine development, antibody production, and disease prevention and diagnosis. Due to the avalanche of protein sequence data discovered in the postgenomic age, it is essential to develop an automated computational method to enable fast and accurate identification of novel BCEs within a vast number of candidate proteins and peptides. Although several computational methods have been developed, their accuracy is unreliable. Thus, developing a reliable model with significant prediction improvements is highly desirable. In this study, we first constructed a non-redundant data set of 5,550 experimentally validated BCEs and 6,893 non-BCEs from the Immune Epitope Database. We then developed a novel ensemble learning framework for an improved linear BCE predictor called iBCE-EL, a fusion of two independent predictors, namely, extremely randomized tree (ERT) and gradient boosting (GB) classifiers, which use, respectively, a combination of physicochemical properties (PCP) and amino acid composition, and a combination of dipeptide and PCP, as input features. Cross-validation analysis on a benchmarking data set showed that iBCE-EL performed better than the individual classifiers (ERT and GB), with a Matthews correlation coefficient (MCC) of 0.454. Furthermore, we evaluated the performance of iBCE-EL on the independent data set. Results show that iBCE-EL significantly outperformed the state-of-the-art method, with an MCC of 0.463. To the best of our knowledge, iBCE-EL is the first ensemble method for linear BCE prediction. iBCE-EL was implemented in a web-based platform, which is available at http://thegleelab.org/iBCE-EL. iBCE-EL contains two prediction modes. The first classifies peptide sequences as BCEs or non-BCEs, while the second gives users the option of mining potential BCEs from protein sequences.
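
The two ideas this abstract relies on, fusing two classifiers and scoring the result with the Matthews correlation coefficient, can be sketched as follows. This is a hedged illustration, not the iBCE-EL code: the abstract does not state how the ERT and GB outputs are fused, so the weighted average of probabilities, the function names `mcc` and `fuse_predictions`, and the default `weight` and `threshold` values are all assumptions made for the example. The MCC formula itself is the standard one computed from the binary confusion matrix.

```python
from math import sqrt


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (0 or 1)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def fuse_predictions(prob_a, prob_b, weight=0.5, threshold=0.5):
    """Combine two classifiers' positive-class probabilities.

    Assumed fusion rule: a weighted average followed by a threshold.
    """
    return [
        1 if weight * a + (1 - weight) * b >= threshold else 0
        for a, b in zip(prob_a, prob_b)
    ]


if __name__ == "__main__":
    # Hypothetical probabilities from two predictors for four peptides.
    ert_probs = [0.9, 0.2, 0.6, 0.1]
    gb_probs = [0.7, 0.4, 0.3, 0.2]
    labels = [1, 0, 1, 0]
    fused = fuse_predictions(ert_probs, gb_probs)
    print(fused, mcc(labels, fused))
```

MCC ranges from -1 to 1 and accounts for all four confusion-matrix cells, which makes it a common choice when the positive and negative classes are unequal in size, as with the 5,550 BCEs and 6,893 non-BCEs described here.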


Subjects
Computational Biology/methods , Epitope Mapping/methods , Epitopes, B-Lymphocyte/immunology , Algorithms , Amino Acid Sequence , Epitopes, B-Lymphocyte/chemistry , Humans , Peptides/chemistry , Peptides/immunology , Position-Specific Scoring Matrices , ROC Curve , Reproducibility of Results , Sensitivity and Specificity , Workflow