Results 1 - 20 of 47
1.
J Cell Biochem ; : e30642, 2024 Aug 20.
Article in English | MEDLINE | ID: mdl-39164870

ABSTRACT

Type III secretion effectors (T3SEs) are bacterial proteins synthesized by Gram-negative pathogens and delivered into host cells via the Type III secretion system (T3SS). These effectors usually play a pivotal role in the interactions between bacteria and hosts. Hence, the precise identification of T3SEs aids researchers in exploring the pathogenic mechanisms of bacterial infections. Since the diversity and complexity of T3SE sequences often make traditional experimental methods time-consuming, it is imperative to explore more efficient and convenient computational approaches for T3SE prediction. Inspired by the promising potential exhibited by pre-trained language models in protein recognition tasks, we proposed a method called PLM-T3SE that utilizes protein language models (PLMs) for effective recognition of T3SEs. First, we utilized PLM embeddings and evolutionary features from the position-specific scoring matrix (PSSM) profiles to transform protein sequences into fixed-length vectors for model training. Second, we employed the extreme gradient boosting (XGBoost) algorithm to rank these features based on their importance. Finally, a multilayer perceptron (MLP) neural network model was used to predict T3SEs based on the selected optimal feature set. Experimental results from the cross-validation and independent test demonstrated that our model exhibited superior performance compared to the existing models. Specifically, our model achieved an accuracy of 98.1%, which is 1.8%-42.4% higher than the state-of-the-art predictors on the same independent test data set. These findings highlight the superiority of PLM-T3SE and the remarkable characterization ability of PLM embeddings for T3SE prediction.

2.
Anal Biochem ; 694: 115603, 2024 Nov.
Article in English | MEDLINE | ID: mdl-38986796

ABSTRACT

The recognition of DNA-binding proteins (DBPs) is a crucial step in understanding their roles in various biological processes such as genetic regulation, gene expression, cell cycle control, DNA repair, and replication within cells. However, conventional experimental methods for identifying DBPs are usually time-consuming and expensive. Therefore, there is an urgent need to develop rapid and efficient computational methods for the prediction of DBPs. In this study, we proposed a novel predictor named PreDBP-PLMs to further improve the identification accuracy of DBPs by fusing the pre-trained protein language model (PLM) ProtT5 embedding with evolutionary features as input to a classic convolutional neural network (CNN) model. First, the ProtT5 embedding was combined with different evolutionary features derived from the position-specific scoring matrix (PSSM) to represent protein sequences. Then, the optimal feature combination was selected and input to the CNN classifier for the prediction of DBPs. Finally, 5-fold cross-validation (CV), leave-one-out CV (LOOCV), and an independent set test were adopted to examine the performance of PreDBP-PLMs on the benchmark datasets. Compared to the existing state-of-the-art predictors, PreDBP-PLMs exhibits an accuracy improvement of 0.5% and 5.2% on the PDB186 and PDB2272 datasets, respectively. These results demonstrate that the proposed method could serve as a useful tool for the recognition of DBPs.


Subject(s)
DNA-Binding Proteins , Neural Networks, Computer , DNA-Binding Proteins/metabolism , DNA-Binding Proteins/chemistry , Computational Biology/methods , Databases, Protein , Humans

3.
Int J Mol Sci ; 25(8)2024 Apr 19.
Article in English | MEDLINE | ID: mdl-38674091

ABSTRACT

Identification of druggable proteins can greatly reduce the cost of discovering new potential drugs. Traditional experimental approaches to exploring these proteins are often costly, slow, and labor-intensive, making them impractical for large-scale research. In response, recent decades have seen a rise in computational methods. These alternatives support drug discovery by creating advanced predictive models. In this study, we proposed a fast and precise classifier for the identification of druggable proteins using a protein language model (PLM) with fine-tuned evolutionary scale modeling 2 (ESM-2) embeddings, achieving 95.11% accuracy on the benchmark dataset. Furthermore, we made a careful comparison to examine the predictive abilities of ESM-2 embeddings and position-specific scoring matrix (PSSM) features by using the same classifiers. The results suggest that ESM-2 embeddings outperformed PSSM features in terms of accuracy and efficiency. Recognizing the potential of language models, we also developed an end-to-end model based on the generative pre-trained transformers 2 (GPT-2) with modifications. To our knowledge, this is the first time a large language model (LLM) GPT-2 has been deployed for the recognition of druggable proteins. Additionally, a more up-to-date dataset, known as Pharos, was adopted to further validate the performance of the proposed model.


Subject(s)
Proteins , Proteins/metabolism , Computational Biology/methods , Drug Discovery/methods , Position-Specific Scoring Matrices , Databases, Protein , Humans , Algorithms

4.
Int J Mol Sci ; 25(15)2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39125602

ABSTRACT

The benzofuran core inhibitors HCV-796, BMS-929075, MK-8876, compound 2, and compound 9B exhibit good pan-genotypic activity against various genotypes of NS5B polymerase. To elucidate their mechanism of action, multiple molecular simulation methods were used to investigate the complex systems of these inhibitors binding to GT1a, 1b, 2a, and 2b NS5B polymerases. The calculation results indicated that these five inhibitors can not only interact with the residues in the palm II subdomain of NS5B polymerase, but also with the residues in the palm I subdomain or the palm I/III overlap region. Interestingly, the binding of inhibitors with longer substituents at the C5 position (BMS-929075, MK-8876, compound 2, and compound 9B) to the GT1a and 2b NS5B polymerases exhibits different binding patterns compared to the binding to the GT1b and 2a NS5B polymerases. The interactions between the para-fluorophenyl groups at the C2 positions of the inhibitors and the residues at the binding pockets, together with the interactions between the substituents at the C5 positions and the residues at the reverse β-fold (residues 441-456), play a key role in recognition and the induction of the binding. The relevant studies could provide valuable information for further research and development of novel anti-HCV benzofuran core pan-genotypic inhibitors.


Subject(s)
Antiviral Agents , Benzofurans , Genotype , Hepacivirus , Viral Nonstructural Proteins , Viral Nonstructural Proteins/antagonists & inhibitors , Viral Nonstructural Proteins/metabolism , Viral Nonstructural Proteins/chemistry , Benzofurans/chemistry , Benzofurans/pharmacology , Hepacivirus/drug effects , Hepacivirus/enzymology , Hepacivirus/genetics , Antiviral Agents/pharmacology , Antiviral Agents/chemistry , Molecular Dynamics Simulation , Molecular Docking Simulation , Binding Sites , Protein Binding , Humans , Enzyme Inhibitors/pharmacology , Enzyme Inhibitors/chemistry , RNA-Dependent RNA Polymerase

5.
Molecules ; 29(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38276629

ABSTRACT

Lysine-specific demethylase 1 (LSD1/KDM1A) has emerged as a promising therapeutic target for treating various cancers (such as breast cancer, liver cancer, etc.) and other diseases (blood diseases, cardiovascular diseases, etc.), owing to its observed overexpression, thereby presenting significant opportunities in drug development. Since its discovery in 2004, extensive research has been conducted on LSD1 inhibitors, with notable contributions from computational approaches. This review systematically summarizes LSD1 inhibitors investigated through computer-aided drug design (CADD) technologies since 2010, showcasing a diverse range of chemical scaffolds, including phenelzine derivatives, tranylcypromine (abbreviated as TCP or 2-PCPA) derivatives, nitrogen-containing heterocyclic (pyridine, pyrimidine, azole, thieno[3,2-b]pyrrole, indole, quinoline and benzoxazole) derivatives, natural products (including sanguinarine, phenolic compounds and resveratrol derivatives, flavonoids and other natural products) and others (including thiourea compounds, Fenoldopam and Raloxifene, (4-cyanophenyl)glycine derivatives, propargylamine and benzohydrazide derivatives and inhibitors discovered through AI techniques). Computational techniques, such as virtual screening, molecular docking and 3D-QSAR models, have played a pivotal role in elucidating the interactions between these inhibitors and LSD1. Moreover, the integration of cutting-edge technologies such as artificial intelligence holds promise in facilitating the discovery of novel LSD1 inhibitors. The comprehensive insights presented in this review aim to provide valuable information for advancing further research on LSD1 inhibitors.


Subject(s)
Biological Products , Enzyme Inhibitors , Enzyme Inhibitors/pharmacology , Enzyme Inhibitors/chemistry , Lysine , Molecular Docking Simulation , Artificial Intelligence , Drug Design , Histone Demethylases/metabolism , Structure-Activity Relationship

6.
Molecules ; 29(11)2024 Jun 04.
Article in English | MEDLINE | ID: mdl-38893524

ABSTRACT

The stimulator of interferon genes (STING) plays a significant role in immune defense and protection against tumor proliferation. Many cyclic dinucleotide (CDN) analogues have been reported to regulate its activity, but the dynamic process involved when the ligands activate STING remains unclear. In this work, all-atom molecular dynamics simulations were performed to explore the binding mode between human STING (hSTING) and four cyclic adenosine-inosine monophosphate analogs (cAIMPs), as well as 2',3'-cGMP-AMP (2',3'-cGAMP). The results indicate that these cAIMPs adopt a U-shaped configuration within the binding pocket, forming extensive non-covalent interaction networks with hSTING. These interactions play a significant role in augmenting the binding, particularly in interactions with Tyr167, Arg238, Thr263, and Thr267. Additionally, the presence of hydrophobic interactions between the ligand and the receptor further contributes to the overall stability of the binding. In this work, the conformational changes in hSTING upon binding these cAIMPs were also studied and a significant tendency for hSTING to shift from open to closed state was observed after binding some of the cAIMP ligands.


Subject(s)
Membrane Proteins , Molecular Dynamics Simulation , Protein Binding , Humans , Membrane Proteins/chemistry , Membrane Proteins/metabolism , Binding Sites , Nucleotides, Cyclic/chemistry , Nucleotides, Cyclic/metabolism , Ligands , Hydrophobic and Hydrophilic Interactions

7.
BMC Biol ; 20(1): 231, 2022 Oct 13.
Article in English | MEDLINE | ID: mdl-36224580

ABSTRACT

BACKGROUND: Antarctica harbors the bulk of the species diversity of the dominant teleost fish suborder, Notothenioidei. However, the forces that shape their evolution are still under debate. RESULTS: We sequenced the genome of an icefish, Chionodraco hamatus, and used population genomics and demographic modelling of sequenced genomes of 52 C. hamatus individuals collected mainly from two East Antarctic regions to investigate the factors driving speciation. The results revealed that four icefish populations with clear reproductive separation were established 15 to 50 kya (thousand years ago) during the last glacial maximum (LGM). Selective sweeps in genes involved in immune responses, cardiovascular development, and photoperception occurred differentially among the populations and were correlated with population-specific microbial communities and acquisition of distinct morphological features in the icefish taxa. Population- and species-specific antifreeze glycoprotein gene expansion and glacial cycle-paced duplication/degeneration of the zona pellucida protein gene families indicated fluctuating thermal environments and periodic influence of glacial cycles on notothenioid divergence. CONCLUSIONS: We present several lines of genomic evidence indicating differential adaptation of C. hamatus populations and notothenioid species divergence in the extreme and unique marine environment. We conclude that geographic separation and adaptation to heterogeneous pathogen, oxygen, and light conditions of local habitats, periodically shaped by the glacial cycles, were the key drivers propelling species diversity in Antarctica.


Subject(s)
Ice Cover , Perciformes , Animals , Antarctic Regions , Fishes/genetics , Genome , Metagenomics , Oxygen , Phylogeny

8.
Molecules ; 28(5)2023 Mar 01.
Article in English | MEDLINE | ID: mdl-36903531

ABSTRACT

The subcellular localization of messenger RNA (mRNA) precisely controls where protein products are synthesized and where they function. However, obtaining an mRNA's subcellular localization through wet-lab experiments is time-consuming and expensive, and many existing mRNA subcellular localization prediction algorithms need to be improved. In this study, a deep neural network-based eukaryotic mRNA subcellular location prediction method, DeepmRNALoc, was proposed, utilizing a two-stage feature extraction strategy that featured bimodal information splitting and fusing for the first stage and a VGGNet-like CNN module for the second stage. The five-fold cross-validation accuracies of DeepmRNALoc in the cytoplasm, endoplasmic reticulum, extracellular region, mitochondria, and nucleus were 0.895, 0.594, 0.308, 0.944, and 0.865, respectively, demonstrating that it outperforms existing models and techniques.


Subject(s)
Deep Learning , Eukaryota , Eukaryota/metabolism , Proteins/metabolism , Endoplasmic Reticulum/metabolism , RNA, Messenger , Computational Biology/methods

9.
RNA ; 26(4): 470-480, 2020 Apr.
Article in English | MEDLINE | ID: mdl-31988191

ABSTRACT

Due to the polyanionic nature of RNAs, the structural folding of RNAs is sensitive to solution salt conditions, yet there is still a lack of deep understanding of the salt effect on the thermodynamics and kinetics of RNAs at the single base-pair level. In this work, the thermodynamic and kinetic parameters for AU base-pair closing/opening at different salt concentrations were calculated by 3-µsec all-atom molecular dynamics (MD) simulations at different temperatures. It was found that for base-pair formation, the enthalpy change ΔH is nearly independent of salt concentration, while the entropy change ΔS exhibits a linear dependence on the logarithm of salt concentration, verifying the empirical assumption based on thermodynamic experiments. Our analyses revealed that this salt concentration dependence of the entropy change mainly results from the dependence of the ion translational entropy change for base-pair closing/opening on salt concentration. Furthermore, the closing rate increases with increasing salt concentration, while the opening rate is nearly independent of salt concentration. Additionally, our analyses revealed that the free energy surface describing the base-pair opening and closing dynamics becomes more rugged as the salt concentration decreases.
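The relationships stated in this abstract can be written compactly. The following is only a sketch under a two-state (open/closed) assumption; the symbols c (salt concentration), ΔS_0 (entropy change at a reference concentration), and a (slope of the logarithmic dependence) are introduced here for illustration and do not come from the abstract.

```latex
\begin{align}
\Delta G(c, T) &= \Delta H - T\,\Delta S(c), \\
\Delta S(c)    &= \Delta S_0 + a \ln c, \\
\frac{k_{\mathrm{close}}}{k_{\mathrm{open}}}
               &= \exp\!\left(-\frac{\Delta G}{k_B T}\right)
                = \exp\!\left(-\frac{\Delta H}{k_B T}\right)
                  \exp\!\left(\frac{\Delta S_0}{k_B}\right) c^{\,a/k_B}.
\end{align}
```

Under these assumptions, a salt-independent ΔH and a salt-independent opening rate imply that the closing rate scales as a power of the salt concentration, which is consistent with the reported increase of the closing rate with salt when a > 0.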


Subject(s)
Molecular Dynamics Simulation , RNA/chemistry , Base Pairing , Osmolar Concentration , Sodium Chloride/chemistry

10.
Molecules ; 27(23)2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36500451

ABSTRACT

Lysine-specific demethylase 1 (LSD1) is a histone-modifying enzyme and a significant target for anticancer drug research. In this work, 40 reported tetrahydroquinoline-derivative inhibitors targeting LSD1 were studied to establish a three-dimensional quantitative structure-activity relationship (3D-QSAR). The established CoMFA (Comparative Molecular Field Analysis; q² = 0.778, R²pred = 0.709) and CoMSIA (Comparative Molecular Similarity Index Analysis; q² = 0.764, R²pred = 0.713) models showed good statistical and predictive properties. Based on the corresponding contour maps, seven novel tetrahydroquinoline derivatives were designed. For further insight, three of the compounds (D1, D4, and Z17) and the template molecule 18x were explored with molecular dynamics simulations, binding free energy calculations by the MM/PBSA method, and ADME (absorption, distribution, metabolism, and excretion) prediction. The results suggested that D1, D4, and Z17 performed better than the template molecule 18x due to the introduction of amino and hydrophobic groups, especially D1 and D4, which will provide guidance for the design of LSD1 inhibitors.


Subject(s)
Antineoplastic Agents , Quantitative Structure-Activity Relationship , Molecular Docking Simulation , Molecular Dynamics Simulation , Hydrophobic and Hydrophilic Interactions , Antineoplastic Agents/pharmacology , Drug Design

11.
RNA ; 25(5): 620-629, 2019 May.
Article in English | MEDLINE | ID: mdl-30770397

ABSTRACT

Small interfering RNAs (siRNAs) or microRNAs (miRNAs) incorporated into the RNA-induced silencing complex with the Argonaute (Ago) protein associate with target mRNAs through base-pairing, which leads to the cleavage or knockdown of the target mRNA. The seed region of the s(m)iRNA is crucial for target recognition. In this work, molecular dynamics simulation was used to study the thermodynamic and kinetic properties of the third seed base binding to the target in the presence of the PIWI/MID domain of Ago. The results showed that in the presence of the PIWI/MID domain, the entropy and enthalpy changes for the association of the seed base with the target were smaller than those in the absence of the protein. The binding affinity was increased due to the reduced entropy penalty, which resulted from the preorganization of the seed base into the A-helix form. In the presence of the protein, the association barrier resulting from the unfavorable entropy loss and the dissociation barrier coming from the disruption of hydrogen bonding and base-stacking interactions were lower than those in the absence of the protein. These results indicate that the seed region is crucial for fast recognition of and association with the correct target.


Subject(s)
Argonaute Proteins/chemistry , Eukaryotic Initiation Factors/chemistry , MicroRNAs/chemistry , Argonaute Proteins/genetics , Argonaute Proteins/metabolism , Binding Sites , Crystallography, X-Ray , Eukaryotic Initiation Factors/genetics , Eukaryotic Initiation Factors/metabolism , Humans , Hydrogen Bonding , Kinetics , MicroRNAs/genetics , MicroRNAs/metabolism , Molecular Dynamics Simulation , Nucleic Acid Conformation , Protein Binding , Protein Conformation, alpha-Helical , Protein Conformation, beta-Strand , Protein Interaction Domains and Motifs , Thermodynamics

12.
Molecules ; 26(24)2021 Dec 07.
Article in English | MEDLINE | ID: mdl-34946497

ABSTRACT

An important cause of cancer proliferation is the change in DNA methylation patterns, characterized by the localized hypermethylation of the promoters of tumor-suppressor genes together with an overall decrease in the level of 5-methylcytosine (5mC). Therefore, identifying the 5mC sites in the promoters is a critical step towards further understanding the diverse functions of DNA methylation in genetic diseases such as cancers and in aging. However, most wet-lab experimental techniques for detecting 5mC sites are time-consuming and laborious. In this study, we proposed a deep learning-based approach, called BiLSTM-5mC, for accurately identifying 5mC sites in genome-wide DNA promoters. First, we randomly divided the negative samples into 11 subsets of equal size, each of which forms a balanced subset when combined with the same number of positive samples. Then, two types of feature vectors, encoded by the one-hot method and the nucleotide property and frequency (NPF) method, were fed into a bidirectional long short-term memory (BiLSTM) network and a fully connected layer to train the 22 submodels. Finally, the outputs of these models were integrated to predict 5mC sites by using the majority vote strategy. Our experimental results demonstrated that BiLSTM-5mC outperformed existing methods on the same independent dataset.
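The data-handling steps around the network (balanced negative subsets, one-hot encoding, and majority voting) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names are hypothetical, the BiLSTM itself is omitted, ties are resolved as negative by assumption, and the strided split gives subsets whose sizes differ by at most one when the count is not divisible by 11.

```python
import random


def split_negatives(negatives, n_subsets=11, seed=0):
    """Shuffle the negative samples and deal them into n_subsets groups."""
    shuffled = list(negatives)
    random.Random(seed).shuffle(shuffled)
    # Strided slicing keeps group sizes within one sample of each other.
    return [shuffled[i::n_subsets] for i in range(n_subsets)]


def one_hot(sequence, alphabet="ACGT"):
    """Encode a DNA string as a list of 4-element 0/1 vectors."""
    return [[1 if base == letter else 0 for letter in alphabet]
            for base in sequence.upper()]


def majority_vote(predictions):
    """Return 1 if more than half of the 0/1 submodel votes are 1, else 0.

    Assumption: a tie (possible with an even number of submodels) counts as 0.
    """
    return int(2 * sum(predictions) > len(predictions))
```

With 11 negative subsets and two encodings, each candidate site would receive 22 votes, one per submodel, which `majority_vote` reduces to a single label.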


Subject(s)
5-Methylcytosine/analysis , Aging/metabolism , DNA/genetics , Deep Learning , Neoplasms/metabolism , 5-Methylcytosine/metabolism , Aging/genetics , DNA Methylation , Humans , Memory, Short-Term , Neoplasms/genetics , Promoter Regions, Genetic/genetics

13.
Molecules ; 26(9)2021 Apr 24.
Article in English | MEDLINE | ID: mdl-33923273

ABSTRACT

Many Gram-negative bacteria use type IV secretion systems to deliver effector molecules to a wide range of target cells. These substrate proteins, which are called type IV secreted effectors (T4SEs), manipulate host cell processes during infection, often resulting in severe disease or even death of the host. Therefore, identification of putative T4SEs has become a very active research topic in bioinformatics due to its vital role in understanding host-pathogen interactions. PSI-BLAST profiles have been experimentally validated to provide important and discriminatory evolutionary information for various protein classification tasks. In the present study, an accurate computational predictor termed iT4SE-EP was developed for identifying T4SEs by extracting evolutionary features from the position-specific scoring matrix and the position-specific frequency matrix profiles. First, four types of encoding strategies were designed to transform protein sequences into fixed-length feature vectors based on the two profiles. Then, a feature selection technique based on the random forest algorithm was utilized to reduce redundant or irrelevant features without much loss of information. Finally, the optimal features were input into a support vector machine classifier to carry out the prediction of T4SEs. Our experimental results demonstrated that iT4SE-EP outperformed most existing methods on the independent dataset test.


Subject(s)
Evolution, Molecular , Gram-Negative Bacteria/genetics , Host-Pathogen Interactions/genetics , Type IV Secretion Systems/genetics , Amino Acid Sequence/genetics , Bacterial Infections/drug therapy , Bacterial Infections/genetics , Bacterial Infections/microbiology , Computational Biology , Gram-Negative Bacteria/pathogenicity , Humans , Type IV Secretion Systems/chemistry

14.
RNA ; 24(9): 1229-1240, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29954950

ABSTRACT

Hepatitis delta virus (HDV) ribozyme performs the self-cleavage activity through folding to a double pseudoknot structure. The folding of functional RNA structures is often coupled with the transcription process. In this work, we developed a new approach for predicting the cotranscriptional folding kinetics of RNA secondary structures with pseudoknots. We theoretically studied the cotranscriptional folding behavior of the 99-nucleotide (nt) HDV sequence, two upstream flanking sequences, and one downstream flanking sequence. During transcription, the 99-nt HDV can effectively avoid the trap intermediates and quickly fold to the cleavage-active state. It is different from its refolding kinetics, which folds into an intermediate trap state. For all the sequences, the ribozyme regions (from 1 to 73) all fold to the same structure during transcription. However, the existence of the 30-nt upstream flanking sequence can inhibit the ribozyme region folding into the active native state through forming an alternative helix Alt1 with the segments 70-90. The longer upstream flanking sequence of 54 nt itself forms a stable hairpin structure, which sequesters the formation of the Alt1 helix and leads to rapid formation of the cleavage-active structure. Although the 55-nt downstream flanking sequence could invade the already folded active structure during transcription by forming a more stable helix with the ribozyme region, the slow transition rate could keep the structure in the cleavage-active structure to perform the activity.


Subject(s)
Hepatitis Delta Virus/genetics , RNA, Catalytic/chemistry , RNA, Catalytic/genetics , Transcription, Genetic , Catalytic Domain , Hepatitis Delta Virus/chemistry , Kinetics , Models, Molecular , Nucleic Acid Conformation , RNA Folding , RNA, Viral/chemistry , RNA, Viral/genetics

15.
Int J Mol Sci ; 20(9)2019 May 11.
Article in English | MEDLINE | ID: mdl-31083553

ABSTRACT

To reveal the working pattern of programmed cell death, knowledge of the subcellular location of apoptosis proteins is essential. Besides the costly and time-consuming method of experimental determination, research into computational locating schemes, focusing mainly on the innovation of representation techniques on protein sequences and the selection of classification algorithms, has become popular in recent decades. In this study, a novel tri-gram encoding model is proposed, which is based on using the protein overlapping property matrix (POPM) for predicting apoptosis protein subcellular location. Next, a 1000-dimensional feature vector is built to represent a protein. Finally, with the help of support vector machine-recursive feature elimination (SVM-RFE), we select the optimal features and put them into a support vector machine (SVM) classifier for predictions. The results of jackknife tests on two benchmark datasets demonstrate that our proposed method can achieve satisfactory prediction performance level with less computing capacity required and could work as a promising tool to predict the subcellular locations of apoptosis proteins.


Subject(s)
Algorithms , Apoptosis Regulatory Proteins/metabolism , Apoptosis , Amino Acids/metabolism , Databases, Protein , Protein Transport , Support Vector Machine

16.
BMC Genomics ; 19(1): 315, 2018 May 02.
Article in English | MEDLINE | ID: mdl-29720106

ABSTRACT

BACKGROUND: Temperature adaptation of biological molecules is fundamental in evolutionary studies but remains unsolved. Fishes living in cold water are adapted to low temperatures through adaptive modification of their biological molecules, which enables their functioning in extreme cold. To study nucleotide and amino acid preference in cold-water fishes, we investigated the substitution asymmetry of codons and amino acids in protein-coding DNA sequences between cold-water fishes and tropical fishes. The former include two Antarctic fishes, Dissostichus mawsoni (Antarctic toothfish) and Gymnodraco acuticeps (Antarctic dragonfish), and two temperate fishes, Gadus morhua (Atlantic cod) and Gasterosteus aculeatus (stickleback); the latter include three tropical fishes, Danio rerio (zebrafish), Oreochromis niloticus (Nile tilapia) and Xiphophorus maculatus (platyfish). RESULTS: Cold-water fishes showed a preference for guanines and cytosines (GCs) in both synonymous and nonsynonymous codon substitutions when compared with tropical fishes. Amino acids coded by GC-rich codons are favored in the temperate fishes, while those coded by AT-rich codons are disfavored. Similar trends were discovered in Antarctic fishes but were statistically weaker. The preference for GC-rich codons in nonsynonymous substitutions tends to increase the ratio of small amino acids in proteins, which was demonstrated by biased small amino acid substitutions in the cold-water species when compared with the tropical species, especially in the temperate species. Prediction and comparison of the secondary structure of the proteomes showed that the frequency of random coils is significantly larger in the cold-water fish proteomes than in those of the tropical fishes.
CONCLUSIONS: Our results suggest that natural selection at cold temperatures might favor biased GC content in the coding DNA sequences, which leads to an increased frequency of small amino acids and consequently more random coils in the proteomes of cold-water fishes.
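The abstract's central quantity, GC content of coding sequences, is simple to compute. As a minimal illustration (not the substitution-asymmetry analysis used in the study), the sketch below computes the overall GC fraction of a sequence and the GC fraction at third codon positions; the function names are hypothetical.

```python
def gc_content(sequence):
    """Fraction of bases in the sequence that are G or C."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    return sum(base in "GC" for base in sequence) / len(sequence)


def gc3_content(coding_sequence):
    """GC fraction at third codon positions, assuming the reading frame starts at base 0."""
    third_positions = coding_sequence[2::3]
    return gc_content(third_positions)
```

Third-position GC content is often examined separately because most synonymous substitutions occur at that position.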


Subject(s)
Cold Temperature , Fish Proteins/chemistry , Fish Proteins/genetics , Fishes/genetics , GC Rich Sequence , Amino Acid Sequence , Amino Acid Substitution , Animals , Protein Structure, Secondary/genetics , Sequence Alignment , Sequence Analysis, RNA

17.
J Chem Phys ; 148(4): 045101, 2018 Jan 28.
Article in English | MEDLINE | ID: mdl-29390847

ABSTRACT

The thermodynamic and kinetic parameters of an RNA base pair with different nearest and next-nearest neighbors were obtained through long-time molecular dynamics simulation of the opening-closing switch process of the base pair near its melting temperature. The results indicate that the thermodynamic parameters of the GC base pair depend on the nearest-neighbor base pair, while the next-nearest-neighbor base pair has little effect, which validates the nearest-neighbor model. The closing and opening rates of the GC base pair also showed nearest-neighbor dependence. At a given temperature, the closing and opening rates of the GC pair with nearest neighbor AU are larger than those with nearest neighbor GC, and the next-nearest neighbor plays little role. The free energy landscape of the GC base pair with nearest neighbor GC is rougher than that with nearest neighbor AU.


Subject(s)
RNA/chemistry , Thermodynamics , Base Pairing , Kinetics , Molecular Dynamics Simulation

18.
BMC Genomics ; 18(1): 436, 2017 Jun 05.
Article in English | MEDLINE | ID: mdl-28583064

ABSTRACT

BACKGROUND: Vibrio parahaemolyticus causes serious seafood-borne gastroenteritis and death in humans. Raw seafood is often subjected to post-harvest processing and low-temperature storage. To date, very little information is available regarding the biological functions of cold shock proteins (CSPs) in the low-temperature survival of the bacterium. In this study, we determined the complete genome sequence of V. parahaemolyticus CHN25 (serotype: O5:KUT). The two main CSP-encoding genes (VpacspA and VpacspD) were deleted from the bacterial genome, and comparative transcriptomic analysis between the mutant and wild-type strains was performed to dissect the possible molecular mechanisms that underlie low-temperature adaptation by V. parahaemolyticus. RESULTS: The 5,443,401-bp V. parahaemolyticus CHN25 genome (45.2% G + C) consisted of two circular chromosomes and three plasmids with 4,724 predicted protein-encoding genes. One dual-gene and two single-gene deletion mutants were generated for VpacspA and VpacspD by homologous recombination. The growth of the ΔVpacspA mutant was strongly inhibited at 10 °C, whereas the VpacspD gene deletion strongly stimulated bacterial growth at this low temperature compared with the wild-type strain. The complementary phenotypes were observed in the reverse mutants (ΔVpacspA-com, and ΔVpacspD-com). The transcriptome data revealed that 12.4% of the expressed genes in V. parahaemolyticus CHN25 were significantly altered in the ΔVpacspA mutant when it was grown at 10 °C. These included genes that were involved in amino acid degradation, secretion systems, sulphur metabolism and glycerophospholipid metabolism along with ATP-binding cassette transporters. However, a low temperature elicited significant expression changes for 10.0% of the genes in the ΔVpacspD mutant, including those involved in the phosphotransferase system and in the metabolism of nitrogen and amino acids. 
The major metabolic pathways that were altered by the dual-gene deletion mutant (ΔVpacspAD) radically differed from those that were altered by single-gene mutants. Comparison of the transcriptome profiles further revealed numerous differentially expressed genes that were shared among the three mutants and regulators that were specifically, coordinately or antagonistically modulated by VpaCspA and VpaCspD. Our data also revealed several possible molecular coping strategies for low-temperature adaptation by the bacterium. CONCLUSIONS: This study is the first to describe the complete genome sequence of V. parahaemolyticus (serotype: O5:KUT). The gene deletions, complementary insertions, and comparative transcriptomics demonstrate that VpaCspA is a primary CSP in the bacterium, while VpaCspD functions as a growth inhibitor at 10 °C. These results have improved our understanding of the genetic basis for low-temperature survival by the most common seafood-borne pathogen worldwide.


Subject(s)
Bacterial Proteins/genetics , Cold Temperature , Cold-Shock Response/genetics , Genomics , Vibrio parahaemolyticus/genetics , Vibrio parahaemolyticus/physiology , Adaptation, Physiological/genetics , Gene Expression Profiling , Mutation , Phenotype
19.
Amino Acids ; 47(3): 461-8, 2015 Mar.
Article in English | MEDLINE | ID: mdl-25583603

ABSTRACT

Knowledge of structural class plays an important role in understanding protein folding patterns. As an intermediate step toward recognizing a protein's three-dimensional structure, protein structural class prediction is considered an important and challenging task. In this study, we first introduce a feature extraction technique based on tri-grams computed directly from the position-specific scoring matrix (PSSM). A total of 8,000 features are extracted to represent a protein. Then, support vector machine-recursive feature elimination (SVM-RFE) is applied for feature selection, and the reduced features are input to a support vector machine (SVM) classifier to predict the structural class of a given protein. To examine the effectiveness of our method, jackknife tests are performed on six widely used benchmark datasets: Z277, Z498, 1189, 25PDB, D640, and D1185. Overall accuracies of 97.1, 98.6, 92.5, 93.5, 94.2, and 95.9% are achieved on these datasets, respectively. Comparison with other prediction methods shows that the proposed method is very promising for protein structural class prediction.
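
The 8,000 features mentioned above match 20 × 20 × 20, one value per ordered triple of amino-acid columns. The abstract does not give the formula, so the following is only a minimal sketch assuming a common tri-gram definition, T[m, n, r] = Σ_i P[i, m] · P[i+1, n] · P[i+2, r], applied to an L × 20 PSSM. The function name `pssm_trigram_features` and the random toy matrix are hypothetical, and any normalization of the PSSM used by the authors is not reproduced here.

```python
import numpy as np

def pssm_trigram_features(pssm: np.ndarray) -> np.ndarray:
    """Flatten PSSM tri-grams into a 20*20*20 = 8000-dimensional vector.

    Assumed definition (not stated in the abstract):
        T[m, n, r] = sum_i P[i, m] * P[i+1, n] * P[i+2, r]
    """
    pssm = np.asarray(pssm, dtype=float)
    if pssm.ndim != 2 or pssm.shape[1] != 20:
        raise ValueError("expected an L x 20 PSSM")
    if pssm.shape[0] < 3:
        raise ValueError("need at least 3 residues to form a tri-gram")
    # Rows i, i+1 and i+2 are aligned by slicing; einsum sums over position i.
    tri = np.einsum("im,in,ir->mnr", pssm[:-2], pssm[1:-1], pssm[2:])
    return tri.reshape(-1)

# Toy input: a random 30-residue "PSSM" standing in for real PSI-BLAST output.
rng = np.random.default_rng(0)
toy_pssm = rng.random((30, 20))
features = pssm_trigram_features(toy_pssm)

# Sanity case: three one-hot rows for columns 0, 1, 2 yield a single tri-gram.
one_hot = np.zeros((3, 20))
one_hot[0, 0] = one_hot[1, 1] = one_hot[2, 2] = 1.0
one_hot_features = pssm_trigram_features(one_hot)
```

Because the sum runs over all positions, every protein maps to a vector of the same length regardless of sequence length, which is what allows a fixed-input classifier such as an SVM to be trained on it.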


Subject(s)
Databases, Protein , Proteins/chemistry , Proteins/genetics , Software , Protein Structure, Tertiary
20.
J Theor Biol ; 366: 8-12, 2015 Feb 07.
Article in English | MEDLINE | ID: mdl-25463695

ABSTRACT

Knowledge of apoptosis proteins plays an important role in understanding the mechanism of programmed cell death. Information on the subcellular location of apoptosis proteins is very helpful for revealing the apoptosis mechanism and understanding the function of these proteins. Because of the time and labor costs associated with large-scale wet-bench experiments, computational prediction of the subcellular location of apoptosis proteins has become very important, and many computational tools have been developed in recent decades. Existing methods differ in the protein sequence representation techniques and classification algorithms adopted. In this study, we first introduce a sequence encoding scheme based on tri-grams computed directly from position-specific scoring matrices, which incorporates the evolutionary information represented in the PSI-BLAST profile together with sequence-order information. The SVM-RFE algorithm is then applied for feature selection, and the reduced vectors are input to a support vector machine classifier to predict the subcellular location of apoptosis proteins. Jackknife tests on three widely used datasets show that our method provides state-of-the-art performance in comparison with other existing methods.
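
The selection-then-classification workflow described above can be sketched with scikit-learn. This is an illustration under stated assumptions, not the authors' implementation: the data are synthetic (`make_classification`), and the linear kernel, the number of retained features (10), and the elimination step (5) are placeholder choices that the abstract does not specify. The jackknife test is taken here to be leave-one-out cross-validation, and feature selection is placed inside the pipeline so it is refit on each training fold; the abstract does not say whether the authors did this.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Synthetic stand-in for tri-gram feature vectors (assumption: real inputs
# would be PSSM-derived vectors with one class label per protein).
X, y = make_classification(
    n_samples=40, n_features=30, n_informative=5, random_state=0
)

# SVM-RFE: repeatedly fit a linear SVM and drop the lowest-weight features.
selector = RFE(SVC(kernel="linear"), n_features_to_select=10, step=5)

# Final classifier trained on the reduced feature set.
model = make_pipeline(selector, SVC(kernel="linear"))

# Jackknife (leave-one-out): each sample is held out once and predicted
# by a model trained on all remaining samples.
scores = cross_val_score(model, X, y, cv=LeaveOneOut())
jackknife_accuracy = scores.mean()
print(f"jackknife accuracy: {jackknife_accuracy:.3f}")
```

Keeping the selector inside the pipeline prevents the held-out sample from influencing which features are retained, which would otherwise inflate the reported accuracy.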


Subject(s)
Algorithms , Apoptosis Regulatory Proteins/metabolism , Position-Specific Scoring Matrices , Databases, Protein , Humans , Protein Transport , ROC Curve , Subcellular Fractions/metabolism , Support Vector Machine