Results 1 - 6 of 6
1.
Brief Bioinform ; 23(6), 2022 11 19.
Article in English | MEDLINE | ID: mdl-36266243

ABSTRACT

Glioblastoma is a fast-growing, aggressive tumor of the brain and spinal cord. Mutations of amino acid residues in target proteins involved in glioblastoma alter protein structure and function and may lead to disease. In this study, we collected 9386 disease-causing (driver) mutations, selected based on recurrence in patient samples and experimental annotation as pathogenic, and 8728 neutral (passenger) mutations. We observed that Arg is highly preferred at the mutant sites of drivers, whereas Met and Ile are preferred in passengers. Inspecting neighboring residues at the mutant sites revealed that the motifs YP, CP and GRH are preferred in drivers, whereas SI, IQ and TVI are dominant in passengers. In addition, we computed other sequence-based features, such as conservation scores, Position Specific Scoring Matrices (PSSM) and physicochemical properties, and developed a machine learning-based method, GBMDriver (GlioBlastoma Multiforme Drivers), for distinguishing between driver and passenger mutations. Our method showed an accuracy and AUC of 73.59% and 0.82, respectively, on 10-fold cross-validation and 81.99% and 0.87 on a blind set of 1809 mutants. The tool is available at https://web.iitm.ac.in/bioinfo2/GBMDriver/index.html. We envisage that the present method will help prioritize driver mutations in glioblastoma and assist in identifying therapeutic targets.


Subject(s)
Glioblastoma , Humans , Glioblastoma/genetics , Machine Learning , Mutation , Proteins/genetics , Amino Acids
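
The neighboring-residue motif analysis described in this abstract can be illustrated with a minimal Python sketch. It assumes that a "motif" is any di- or tri-peptide window that contains the mutant site; the abstract does not state the exact definition used by GBMDriver, so that choice, the function names and the toy sequences below are all hypothetical.

```python
from collections import Counter


def motifs_at_site(sequence, position, max_len=3):
    """Return every window of length 2..max_len that contains the mutant site.

    `position` is 1-based, as is usual for protein residue numbering.
    Assumption: a motif is any contiguous window covering the site.
    """
    idx = position - 1
    if not 0 <= idx < len(sequence):
        raise ValueError("position lies outside the sequence")
    found = []
    for length in range(2, max_len + 1):
        # Slide the window so that it always covers index `idx`.
        for start in range(idx - length + 1, idx + 1):
            end = start + length
            if start >= 0 and end <= len(sequence):
                found.append(sequence[start:end])
    return found


def motif_counts(sites):
    """Count motifs over an iterable of (sequence, 1-based position) pairs."""
    counts = Counter()
    for sequence, position in sites:
        counts.update(motifs_at_site(sequence, position))
    return counts


# Toy, made-up sequences (not data from the study).
driver_sites = [("MGRHYP", 3), ("AYPGRH", 2)]
passenger_sites = [("TVISIQ", 2), ("ASIQTV", 3)]

driver_counts = motif_counts(driver_sites)
passenger_counts = motif_counts(passenger_sites)
```

Comparing `driver_counts` with `passenger_counts` (for example as relative frequencies) gives the kind of motif preference the abstract reports; a real analysis would also need a significance test, which this sketch leaves out.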
2.
Adv Protein Chem Struct Biol ; 139: 141-171, 2024.
Article in English | MEDLINE | ID: mdl-38448134

ABSTRACT

Advancements in genome sequencing have expanded the scope of investigating mutations in proteins across different diseases. Amino acid mutations in a protein alter its structure, stability and function, and some of them lead to disease. Identifying disease-causing mutations is a challenging task that helps in designing therapeutic strategies. Hence, mutation data available in the literature have been curated and stored in several databases, which have been used to develop computational methods for identifying deleterious mutations (drivers) from sequence- and structure-based properties of proteins. In this chapter, we describe the contents of specific databases that hold information on disease-causing and neutral mutations, followed by sequence- and structure-based properties. Further, characteristic features of disease-causing mutations are discussed, along with computational methods for identifying cancer hotspot residues and disease-causing mutations in proteins.


Subject(s)
Databases, Factual , Mutation
3.
Biochim Biophys Acta Mol Basis Dis ; 1869(6): 166721, 2023 08.
Article in English | MEDLINE | ID: mdl-37105446

ABSTRACT

Understanding the molecular basis and impact of mutations at different stages of cancer is a long-standing challenge in cancer biology. Identifying driver mutations experimentally is expensive and time-intensive. In the present study, we collected data on experimentally known driver mutations in 22 different cancer types and classified them into six categories: breast cancer (BRCA), acute myeloid leukaemia (LAML), endometrial carcinoma (EC), stomach cancer (STAD), skin cancer (SKCM), and other cancer types. The dataset contains 5747 disease-prone and 5514 neutral sites in 516 proteins. The analysis of amino acid distribution around mutant sites revealed that the motifs AAA and LR are preferred in disease-prone sites, whereas QPP and QF are dominant in neutral sites. Further, we developed a method using deep neural networks to predict disease-prone sites with amino acid sequence-based features such as physicochemical properties, secondary structure, tri-peptide motifs and conservation scores. We obtained an average AUC of 0.97 for the five cancer types BRCA, LAML, EC, STAD and SKCM on a test dataset and 0.72 for all other cancer types together. Our method showed excellent performance in identifying cancer-specific mutations, with an average sensitivity, specificity, and accuracy of 96.56%, 97.39%, and 97.64%, respectively. We developed a web server for identifying cancer-prone sites, and it is available at https://web.iitm.ac.in/bioinfo2/MutBLESS/index.html. We suggest that our method can serve as an effective tool to identify disease-prone sites and assist in developing therapeutic strategies.


Subject(s)
Breast Neoplasms , Deep Learning , Humans , Female , Proteins/metabolism , Neural Networks, Computer , Amino Acids
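
For readers unfamiliar with the performance measures quoted in this abstract, the sketch below shows the standard definitions of sensitivity, specificity and accuracy for a binary classifier, computed from confusion-matrix counts. The function name and the counts in the example are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity and accuracy from confusion counts.

    tp/fn: disease-prone sites predicted correctly / missed.
    tn/fp: neutral sites predicted correctly / wrongly flagged.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present")
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


# Made-up counts for illustration only.
example = classification_metrics(tp=90, fp=20, tn=80, fn=10)
```

Within a single test set, accuracy is a weighted mean of sensitivity and specificity. Values averaged over several cancer types, as reported in the abstract, need not satisfy that relation.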
4.
Comput Biol Med ; 147: 105708, 2022 08.
Article in English | MEDLINE | ID: mdl-35714506

ABSTRACT

The prolonged transmission of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in the human population has led to demographic divergence and the emergence of several location-specific clusters of viral strains. Although the effect of mutations on the severity and survival of the virus is still unclear, it is evident that certain sites in the viral proteome are more or less prone to mutation. Millions of SARS-CoV-2 sequences collected all over the world provide a unique opportunity to understand viral protein mutations and to develop novel computational approaches for predicting mutational patterns. In this study, we classified mutation sites into low and high mutability classes based on the number of viral isolates containing mutations. Physicochemical features and structural analysis of the SARS-CoV-2 proteins showed that features including residue type, surface accessibility, residue bulkiness, stability and sequence conservation at the mutation site were able to distinguish the low and high mutability sites. We further developed machine learning models using the above-mentioned features to predict low and high mutability sites at different selection thresholds (ranging from 5 to 30% of the topmost and bottommost mutated sites) and observed that performance improved as the selection threshold was reduced (prediction accuracy ranging from 65 to 77%). The analysis will be useful for early detection of SARS-CoV-2 variants of concern and can also be applied to other existing and emerging viruses to help prevent another pandemic.


Subject(s)
COVID-19 , SARS-CoV-2 , COVID-19/genetics , Genome, Viral , Humans , Mutation/genetics , Pandemics , Proteome/genetics , SARS-CoV-2/genetics
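
The selection-threshold step in this abstract, which keeps only the topmost and bottommost fraction of sites ranked by isolate count, can be sketched as follows. This is one plausible reading of the procedure and not the authors' code; the function name, the tie-breaking rule and the toy counts are assumptions.

```python
def split_by_mutability(isolate_counts, fraction):
    """Label the bottom and top `fraction` of sites as low/high mutability.

    isolate_counts: dict mapping a site identifier to the number of viral
    isolates that carry a mutation at that site (toy values in this sketch).
    fraction: selection threshold, e.g. 0.05 to 0.30 per the abstract.
    Sites between the two extremes are left unlabelled.
    """
    if not 0 < fraction <= 0.5:
        raise ValueError("fraction must be in (0, 0.5]")
    if len(isolate_counts) < 2:
        raise ValueError("need at least two sites")
    # Rank by count; ties are broken by site identifier (an assumption).
    ranked = sorted(isolate_counts, key=lambda s: (isolate_counts[s], s))
    k = max(1, int(len(ranked) * fraction))
    low = ranked[:k]
    high = ranked[-k:]
    return low, high


# Made-up isolate counts for ten sites.
toy_counts = {1: 5, 2: 100, 3: 0, 4: 50, 5: 2000,
              6: 7, 7: 1, 8: 300, 9: 12, 10: 900}
low_sites, high_sites = split_by_mutability(toy_counts, 0.2)
```

A smaller `fraction` keeps only the most extreme sites, which makes the two classes easier to separate. That is consistent with the abstract's observation that accuracy improved as the threshold was reduced.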
5.
J Med Entomol ; 59(6): 2176-2181, 2022 11 16.
Article in English | MEDLINE | ID: mdl-36166571

ABSTRACT

The Asian longhorned tick (Haemaphysalis longicornis Neumann), native to East Asia, was first reported in the United States in 2017 and is now established in at least 17 states. Haemaphysalis longicornis feeds on birds in its range outside of the United States, and migratory birds disperse this tick and tick-borne pathogens. However, early studies in the United States did not find H. longicornis on migrating passerine birds. The transport of the parthenogenetic H. longicornis on birds has the potential to greatly expand its range. We report the first discovery of H. longicornis on migratory passerine birds in the Americas.


Subject(s)
Ixodidae , Passeriformes , Ticks , United States , Animals
6.
Mutat Res ; 822: 111737, 2021.
Article in English | MEDLINE | ID: mdl-33508631

ABSTRACT

Lung cancer is a prominent type of cancer with a high mortality rate worldwide. The major lung cancers, lung adenocarcinoma (LUAD) and lung squamous carcinoma (LUSC), occur mainly due to somatic driver mutations in proteins, and screening for such mutations is often costly and time-intensive. Hence, in the present study, we systematically analyzed the preferred residues, residue pairs and motifs of 4172 disease-prone sites in 195 proteins and compared them with 4137 neutral sites. We observed that the motifs LG, QF and TST are preferred in disease-prone sites, whereas GK, KA and ISL are predominant in neutral sites. In addition, Gly, Asp, Glu, Gln and Trp are preferred in disease-prone sites, whereas Ile, Val, Lys, Asn and Phe are preferred in neutral sites. Further, utilizing deep neural networks, we developed a method for predicting disease-prone sites with amino acid sequence-based features such as physicochemical properties, conservation scores, secondary structure, and di- and tri-peptide motifs. The model predicts disease-prone sites with an accuracy of 81%, and with sensitivity, specificity and AUC of 82%, 78% and 0.91, respectively, on 10-fold cross-validation. When the model was tested on a set of 417 disease-causing and 413 neutral sites, we obtained an accuracy and AUC of 80% and 0.89, respectively. We suggest that our method can serve as an effective tool to identify disease-causing and neutral sites in lung cancer. We have developed a web server, CanProSite, for identifying disease-prone sites, and it is freely available at https://web.iitm.ac.in/bioinfo2/CanProSite/.


Subject(s)
Adenocarcinoma of Lung/genetics , Deep Learning , Lung Neoplasms/genetics , Models, Genetic , Neoplasm Proteins/genetics , Adenocarcinoma of Lung/metabolism , Humans , Lung Neoplasms/metabolism
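
The AUC values quoted in this abstract summarize how well predicted scores rank disease-prone sites above neutral ones. As a minimal sketch, the AUC can be computed directly from its pairwise-ranking definition: the fraction of (positive, negative) pairs in which the positive example scores higher, with ties counted as half. The labels and scores below are made up, and the function is a generic illustration rather than part of CanProSite.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: iterable of 1 (disease-prone) or 0 (neutral).
    scores: predicted scores, higher meaning more likely disease-prone.
    Runs in O(n_pos * n_neg), which is fine for small illustrative sets.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Made-up labels and scores for illustration.
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.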