Results 1 - 17 of 17
1.
Article in English | MEDLINE | ID: mdl-38427544

ABSTRACT

Transfer RNAs (tRNAs) are non-coding RNAs. Encouraged by the biological applications discovered for peptides derived from other non-coding genomic regions, we explore the potential of tRNA-encoded peptides (tREPs) as a source of epitope-based vaccines against viral pathogens. Epitope-based vaccines have been identified as an effective strategy to mitigate the safety and specificity concerns observed in vaccine development. We present a computational workflow that uses verified data sources and community-validated predictive tools to produce a ranked list of plausible epitope-based vaccines starting from tRNA sequences. The top epitope for each viral pathogen, bound to its predicted HLA molecule, is computationally validated through 200 ns molecular dynamics (MD) simulations followed by binding free energy calculations. The simulation results indicate that two tRNA-encoded epitope-based vaccines, RRHIDIVV and IMVRFSAE, for Mamastrovirus 3 and Norovirus GII, respectively, are likely candidates. Peptides originating from tRNAs provide unexplored opportunities for vaccine design. Encouraged by our previous experimental study, which established the inhibitory properties of tREPs against infectious parasites, we propose a computationally validated set of tREP-derived peptides as vaccine candidates for viral pathogens.


Subject(s)
Computational Biology , Molecular Dynamics Simulation , Peptides , Transfer RNA , Transfer RNA/genetics , Transfer RNA/chemistry , Computational Biology/methods , Peptides/chemistry , Peptides/genetics , Peptides/immunology , Humans , Viral Vaccines/immunology , Viral Vaccines/genetics , Viral Vaccines/chemistry , Epitopes/chemistry , Epitopes/immunology , Epitopes/genetics , Norovirus/genetics , Norovirus/immunology , Norovirus/chemistry
2.
BMC Bioinformatics ; 24(1): 241, 2023 Jun 07.
Article in English | MEDLINE | ID: mdl-37286944

ABSTRACT

BACKGROUND: RNA sequencing (RNA-Seq) is a technique that uses next-generation sequencing to study a cellular transcriptome, i.e., to determine the amount of RNA present in a given biological sample at a given time. Advances in RNA-Seq technology have produced a large volume of gene expression data for analysis. RESULTS: Our computational model (built on top of TabNet) is first pretrained on an unlabelled dataset of multiple types of adenomas and adenocarcinomas and later fine-tuned on the labelled dataset, showing promising results in estimating the vital status of colorectal cancer patients. We achieve a final cross-validated ROC-AUC score of 0.88 by using multiple modalities of data. CONCLUSION: The results of this study demonstrate that self-supervised learning methods pretrained on a vast corpus of unlabelled data outperform traditional supervised learning methods such as XGBoost, neural networks, and decision trees that have been prevalent in the tabular domain. The results are further improved by the inclusion of multiple modalities of data pertaining to the patients in question. We find that the genes important to the computational model's prediction task, identified through model interpretability and including RBM3, GSPT1, MAD2L1, and others, are corroborated by pathological evidence in the current literature.


Subject(s)
Colorectal Neoplasms , RNA , Humans , RNA-Seq/methods , RNA/genetics , RNA Sequence Analysis/methods , Supervised Machine Learning , Colorectal Neoplasms/genetics , RNA-Binding Proteins/genetics
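
As a reading aid for the ROC-AUC score quoted in the abstract above, the following is a minimal sketch of how that metric can be computed from predicted scores using the rank-based (Mann-Whitney) formulation. The function name `roc_auc` and the toy labels and scores are illustrative assumptions; they are not code or data from the study.

```python
def roc_auc(labels, scores):
    """Probability that a random positive example outscores a random negative one.

    A tie between a positive and a negative counts as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, invented for illustration only.
toy_labels = [1, 0, 1, 0, 1]
toy_scores = [0.9, 0.2, 0.6, 0.7, 0.8]
print(roc_auc(toy_labels, toy_scores))
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes. In a cross-validated setting, this quantity would typically be computed per fold and then averaged.
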
3.
ACS Omega ; 7(37): 32877-32896, 2022 Sep 20.
Article in English | MEDLINE | ID: mdl-36157750

ABSTRACT

Molecular dynamics (MD) simulations probe the conformational repertoire of macromolecular systems using Newton's equations of motion. The time scales of MD simulations allow the exploration of biologically relevant phenomena and can elucidate spatial and temporal properties of the building blocks of life, such as deoxyribonucleic acid (DNA) and proteins, across microsecond (µs) time scales using femtosecond (fs) time steps. A principal bottleneck in extending MD calculations to longer time scales is the long-range electrostatic component of the naive nonbonded force computation, which scales as O(N²), where N is the number of atoms. In this review, we present various methods to determine electrostatic interactions in often-used open-source MD packages as well as the implementation details that facilitate acceleration of the electrostatic interaction calculation.
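
To make the cost of the naive nonbonded computation described in the abstract above concrete, here is a minimal sketch of a direct pairwise Coulomb sum. It visits every unique pair of atoms, so the work grows quadratically with the number of atoms. The function name, the reduced units, and the three-particle toy system are assumptions for illustration; the sketch has no cutoff and no periodic boundary conditions, and it is not taken from any MD package.

```python
import math

COULOMB_CONSTANT = 1.0  # reduced units, an assumption for illustration


def naive_coulomb_energy(positions, charges):
    """Sum q_i * q_j / r_ij over all unique pairs.

    There are N * (N - 1) / 2 pairs, which is the source of the quadratic cost.
    """
    n = len(positions)
    energy = 0.0
    pair_count = 0
    for i in range(n):
        for j in range(i + 1, n):
            r = math.dist(positions[i], positions[j])
            energy += COULOMB_CONSTANT * charges[i] * charges[j] / r
            pair_count += 1
    return energy, pair_count


# Hypothetical three-particle system.
toy_positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
toy_charges = [1.0, -1.0, 1.0]
print(naive_coulomb_energy(toy_positions, toy_charges))
```

Doubling the number of atoms roughly quadruples `pair_count`, which is why accelerated treatments of electrostatics matter for large systems.
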

4.
Chem Biol Drug Des ; 100(2): 169-184, 2022 08.
Article in English | MEDLINE | ID: mdl-35587730

ABSTRACT

The ability to estimate the probability that a drug will receive approval in clinical trials provides natural advantages for optimizing pharmaceutical research workflows. Success rates of clinical trials have deep implications for the cost and duration of development and are under pressure due to stringent regulatory approval processes. We propose a machine learning approach that can predict the outcome of a trial with reliable accuracy, using biological activities, physicochemical properties of the compounds, target-related features, and NLP-based compound representation. Of these, biological activities have never been used as an independent variable for the prediction of clinical trial outcomes. We have extracted the drug-disease pair from clinical trials and mapped target(s) to that pair using multiple data sources. Empirical results demonstrate that ensemble learning outperforms independently trained, small-data ML models. We report results and inferences derived from a random forest classifier with an average accuracy of 93% and an F1 score of 0.96 for the "Pass" class. "Pass" refers to one of the two classes (Pass/Fail) of all clinical trials, and the model performed well in predicting the "Pass" category. Through the analysis of feature contributions to predictive capability, we have demonstrated that bioactivity plays a statistically significant role in predicting clinical trial outcome. A significant effort has gone into the production of the dataset that, for the first time, integrates clinical trial information with protein targets. Cleaned, organized, integrated data and code to map these entities, created as a part of this work, are available open-source. This reproducibility and the freely available code ensure that researchers with access to deeply curated and proprietary clinical trial databases (we only use open-source data in this study) can further expand the scope of the results.


Subject(s)
Algorithms , Machine Learning , Factual Databases , Reproducibility of Results
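
The abstract above reports a per-class F1 score alongside overall accuracy. As a brief illustration of what a per-class F1 score measures, the sketch below derives precision, recall, and F1 for one class from confusion counts. The function name and the counts are invented for illustration and do not reproduce the study's results.

```python
def precision_recall_f1(true_positive, false_positive, false_negative):
    """Precision, recall, and F1 for a single class (for example "Pass")."""
    predicted_positive = true_positive + false_positive
    actual_positive = true_positive + false_negative
    precision = true_positive / predicted_positive if predicted_positive else 0.0
    recall = true_positive / actual_positive if actual_positive else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2.0 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for the positive class.
print(precision_recall_f1(true_positive=8, false_positive=2, false_negative=2))
```

Because F1 ignores true negatives, it can be high for a majority class even when the minority class is predicted poorly, so it is usually read together with the score for the other class.
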
5.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 2175-2179, 2021 11.
Article in English | MEDLINE | ID: mdl-34891719

ABSTRACT

Sepsis is a serious cause of morbidity and mortality and yet its pathophysiology remains elusive. Recently, medical and technological advances have helped redefine the criteria for sepsis incidence, which is otherwise poorly understood. With the recording of clinical parameters and outcomes of patients, enabling technologies, such as machine learning, open avenues for early prognostic systems for sepsis. In this work, we propose a two-phase approach towards prognostic scoring by predicting two outcomes in sepsis patients - Sepsis Severity and Comorbidity Severity. We train and evaluate multiple machine learning models on a dataset of 80 parameters collected from 800 patients at Amrita Institute of Medical Sciences, Kerala, India. We present an analysis of these results and harmonize consistencies and/or contradictions between elements of human knowledge and that of the model, using local interpretable model-agnostic explanations and other methods.


Subject(s)
Machine Learning , Sepsis , Humans , Incidence , India , Sepsis/diagnosis
6.
Chem Biol Drug Des ; 97(3): 665-673, 2021 03.
Article in English | MEDLINE | ID: mdl-33006799

ABSTRACT

Adverse drug reactions (ADRs) are pharmacological events that can arise from various sources, including drug-drug interactions. While many computational studies explore models to predict ADRs originating from single drugs, only a few explore models that predict ADRs from drug combinations. Further, as far as we know, none has developed models using transcriptomic data, specifically the LINCS L1000 drug-induced gene expression data, to predict ADRs for drug combinations. In this study, we use the TWOSIDES database as a source of ADRs originating from two-drug combinations. The 34,549 drug pairs common to these two databases were used to train an artificial neural network (ANN) to predict 243 ADRs that were induced by at least 10% of the drug pairs. Our model predicts the occurrence of these ADRs with an average accuracy of 82% across a multifold cross-validation.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Factual Databases , Drug Combinations , Drug Interactions , Humans , Transcriptome
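
The abstract above restricts the prediction targets to ADRs induced by at least 10% of the drug pairs. A minimal sketch of that kind of prevalence filter follows; the function name, the data layout (a dictionary from drug pair to ADR list), and the toy entries are assumptions for illustration, not the study's actual pipeline or TWOSIDES records.

```python
from collections import Counter


def frequent_labels(pair_to_adrs, min_fraction=0.10):
    """Return the ADR labels reported for at least `min_fraction` of the drug pairs."""
    counts = Counter()
    for adrs in pair_to_adrs.values():
        counts.update(set(adrs))  # count each ADR at most once per pair
    threshold = min_fraction * len(pair_to_adrs)
    return sorted(adr for adr, count in counts.items() if count >= threshold)


# Hypothetical drug pairs and ADR labels.
toy_pairs = {
    ("d1", "d2"): ["nausea", "rash"],
    ("d1", "d3"): ["nausea"],
    ("d2", "d3"): ["headache"],
    ("d2", "d4"): ["nausea"],
}
print(frequent_labels(toy_pairs, min_fraction=0.5))
```

Dropping rare labels in this way keeps each output of a multi-label classifier from being trained on only a handful of positive examples.
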
7.
Mol Inform ; 39(9): e2000013, 2020 09.
Article in English | MEDLINE | ID: mdl-32390334

ABSTRACT

Computational drug- and compound-centered analyses often need to map attributes across multiple drug databases. In this study, we provide a Neo4j repository that integrates two of the most prominent open-source drug databases, DrugBank and ChEMBL, with the goal of establishing an integrated data visualization and analysis tool for drug discovery studies. The drugs present in DrugBank are mapped to their counterparts in ChEMBL. The integration of these resources and their harmonization through knowledge graph serialization in Neo4j lead to the identification of relationships between drugs and other related features that are otherwise spread across two different resources. A common data format, a prerequisite to populating the Neo4j database, enables users to identify new relationships central to drug discovery research, such as Drug Target Interactions (DTI). The resource is freely available at: https://github.com/ambf0632/CompoundDB4j.


Subject(s)
Chemical Databases , Drug Discovery/methods , Computational Biology , Data Curation , Drug Repositioning , Humans
8.
Brief Bioinform ; 20(1): 299-316, 2019 01 18.
Article in English | MEDLINE | ID: mdl-29028878

ABSTRACT

Drug repurposing (a.k.a. drug repositioning) is the search for new indications or molecular targets distinct from a drug's putative activity, pharmacological effect or binding specificities. With the ever-increasing rates of termination of drugs in clinical trials, drug repositioning has risen as one of the effective solutions against the risk of drug failures. Repositioning finds a way to reverse the grim but real trend that Eroom's law portends for the pharmaceutical and biotech industry, and drug discovery in general. Further, the advent of high-throughput technologies to explore biological systems has enabled the generation of zettabytes of data and a massive collection of databases that store them. Computational analytics and mining are frequently used as effective tools to explore this byzantine series of biological and biomedical data. However, advanced computational tools are often difficult to understand or use, thereby limiting their accessibility to scientists without a strong computational background. Hence it is of great importance to build user-friendly interfaces to extend the user base beyond computational scientists, to include life scientists who may have deeper chemical and biological insights. This survey is focused on systematically presenting the available Web-based tools that aid in repositioning drugs.


Subject(s)
Drug Repositioning/methods , Internet , Software , Algorithms , Binding Sites , Computational Biology/methods , Pharmaceutical Databases/statistics & numerical data , Drug Discovery/methods , Drug Discovery/statistics & numerical data , Drug Repositioning/statistics & numerical data , High-Throughput Screening Assays/statistics & numerical data , Humans , Ligands , Search Engine
9.
Brief Bioinform ; 20(5): 1754-1768, 2019 09 27.
Article in English | MEDLINE | ID: mdl-29931155

ABSTRACT

In recent years, the emphasis of scientific inquiry has shifted from whole-genome analyses to an understanding of cellular responses specific to tissue, developmental stage or environmental conditions. One of the central mechanisms underlying the diversity and adaptability of the contextual responses is alternative splicing (AS). It enables a single gene to encode multiple isoforms with distinct biological functions. However, to date, the functions of the vast majority of differentially spliced protein isoforms are not known. Integration of genomic, proteomic, functional, phenotypic and contextual information is essential for supporting isoform-based modeling and analysis. Such integrative proteogenomics approaches promise to provide insights into the functions of the alternatively spliced protein isoforms and provide high-confidence hypotheses to be validated experimentally. This manuscript provides a survey of the public databases supporting isoform-based biology. It also presents an overview of the potential global impact of AS on the human canonical gene functions, molecular interactions and cellular pathways.


Subject(s)
Alternative Splicing , Protein Isoforms/metabolism , Computational Biology , Protein Databases , Humans
10.
Chemphyschem ; 17(23): 3831-3835, 2016 Dec 05.
Article in English | MEDLINE | ID: mdl-27706880

ABSTRACT

Biomimicry is a strategy that makes practical use of evolution to find efficient and sustainable ways to produce chemical compounds or engineer products. Exploring the natural machinery of enzymes for the production of desired compounds is a highly profitable investment, but the design of efficient biomimetic systems remains a considerable challenge. An ideal biomimetic system self-assembles in solution, binds a desired range of substrates and catalyzes reactions with turnover rates similar to the native system. To this end, tailoring catalytic functionality in engineered peptides generally requires site-directed mutagenesis or the insertion of additional amino acids, which entails an intensive search across chemical and sequence space. Here we discuss a novel strategy for the computational design of biomimetic compounds and processes that consists of a) characterization of the wild-type and biomimetic systems; b) identification of key descriptors for optimization; c) an efficient search through sequence and chemical space to tailor the catalytic capabilities of the biomimetic system. Through this proof-of-principle study, we are able to decisively understand and identify whether a given scaffold is useful, appropriate and tailorable for a given, desired task.


Subject(s)
Algorithms , Biomimetic Materials/chemistry , Carbon Dioxide/chemistry , Peptides/chemistry , Peptides/genetics , Catalysis , Protein Engineering , Water/chemistry
11.
Chimia (Aarau) ; 65(9): 667-71, 2011.
Article in English | MEDLINE | ID: mdl-22026176

ABSTRACT

The Laboratory of Computational Chemistry and Biochemistry is active in the development and application of first-principles based simulations of complex chemical and biochemical phenomena. Here, we review some of our recent efforts in extending these methods to larger systems, longer time scales and increased accuracies. Their versatility is illustrated with a diverse range of applications, ranging from the determination of the gas phase structure of the cyclic decapeptide gramicidin S, to the study of G protein coupled receptors, the interaction of transition metal based anti-cancer agents with protein targets, the mechanism of action of DNA repair enzymes, the role of metal ions in neurodegenerative diseases and the computational design of dye-sensitized solar cells. Many of these projects are done in collaboration with experimental groups from the Institute of Chemical Sciences and Engineering (ISIC) at the EPFL.


Subject(s)
Computational Biology/methods , Computational Biology/trends , Chemical Models , Molecular Models , Molecular Dynamics Simulation/trends , Drug Design , Protein Conformation
12.
Genome Res ; 21(2): 216-26, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21177970

ABSTRACT

The Polycomb group (PcG) and Trithorax group (TrxG) of proteins are required for stable and heritable maintenance of repressed and active gene expression states. Their antagonistic function on gene control, repression for PcG and activity for TrxG, is mediated by binding to chromatin and subsequent epigenetic modification of target loci. Despite our broad knowledge about composition and enzymatic activities of the protein complexes involved, our understanding still lacks important mechanistic detail and a comprehensive view on target genes. In this study we use an extensive data set of ChIP-seq, RNA-seq, and genome-wide detection of transcription start sites (TSSs) to identify and analyze thousands of binding sites for the PcG proteins and Trithorax from a Drosophila S2 cell line. In addition to finding a preference for stalled promoter regions of annotated genes, we uncover many intergenic PcG binding sites coinciding with nonannotated TSSs. Interestingly, this set includes previously unknown promoters for primary transcripts of microRNA genes, thereby expanding the scope of Polycomb control to noncoding RNAs essential for development, apoptosis, and growth.


Subject(s)
Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Promoter Regions (Genetic) , Repressor Proteins/metabolism , Animals , Cultured Cells , Non-Histone Chromosomal Proteins/genetics , Non-Histone Chromosomal Proteins/metabolism , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Gene Expression Regulation , MicroRNAs/genetics , MicroRNAs/metabolism , Molecular Sequence Data , Polycomb-Group Proteins , RNA/genetics , RNA/metabolism , Repressor Proteins/genetics , Transcription Factors/genetics , Transcription Factors/metabolism , Transcription Initiation Site
13.
Eur J Med Chem ; 45(12): 6147-51, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20884090

ABSTRACT

Pentamidine and its analogs constitute a class of compounds that are known to be active against Plasmodium falciparum, which causes the most dangerous malarial infection. Malaria is a widespread disease known to affect hundreds of millions of people and presents a perceivable threat of spreading. Hence, there is a need for well-defined scaffolds that lead to new, effective treatment. Here we present a pentamidine-based pharmacophore constructed using GALAHAD that would aid targeted synthesis of leads with enhanced properties, as well as the development of lead scaffolds. The study was supported by high-quality biological in vitro data of 22 compounds against the P. falciparum strains NF54 and K1. The model established reveals the importance of hydrophobic phenyl rings with polar oxygen and amidine substituents and the hydrophobic linking chain for the activity against malaria.


Subject(s)
Antiprotozoal Agents/pharmacology , Pentamidine/analogs & derivatives , Pentamidine/pharmacology , Plasmodium falciparum/drug effects , Antiprotozoal Agents/chemical synthesis , Antiprotozoal Agents/chemistry , Molecular Models , Molecular Structure , Parasitic Sensitivity Tests , Pentamidine/chemical synthesis , Pentamidine/chemistry , Stereoisomerism
14.
BMC Bioinformatics ; 11: 471, 2010 Sep 20.
Article in English | MEDLINE | ID: mdl-20854673

ABSTRACT

BACKGROUND: The enormous throughput and low cost of second-generation sequencing platforms now allow research and clinical geneticists to routinely perform single experiments that identify tens of thousands to millions of variant sites. Existing methods to annotate variant sites using information from publicly available databases via web browsers are too slow to be useful for the large sequencing datasets being routinely generated by geneticists. Because sequence annotation of variant sites is required before functional characterization can proceed, the lack of a high-throughput pipeline to efficiently annotate variant sites can act as a significant bottleneck in genetics research. RESULTS: SeqAnt (Sequence Annotator) is an open source web service and software package that rapidly annotates DNA sequence variants and identifies recessive or compound heterozygous loci in human, mouse, fly, and worm genome sequencing experiments. Variants are characterized with respect to their functional type, frequency, and evolutionary conservation. Annotated variants can be viewed on a web browser, downloaded in a tab-delimited text file, or directly uploaded in a BED format to the UCSC genome browser. To demonstrate the speed of SeqAnt, we annotated a series of publicly available datasets that ranged in size from 37 to 3,439,107 variant sites. The total time to completely annotate these data ranged from 0.17 seconds to 28 minutes 49.8 seconds. CONCLUSION: SeqAnt is an open source web service and software package that overcomes a critical bottleneck facing research and clinical geneticists using second-generation sequencing platforms. SeqAnt will prove especially useful for those investigators who lack dedicated bioinformatics personnel or infrastructure in their laboratories.


Subject(s)
Genomics/methods , Molecular Sequence Annotation/methods , DNA Sequence Analysis/methods , Software , Animals , Base Sequence , Genetic Databases , Genetic Variation , Humans , Internet , Mice
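
The abstract above mentions uploading annotated variants in BED format to the UCSC genome browser. As a small illustration of what that format looks like, the sketch below converts a variant given as a 1-based position into a tab-delimited BED line, which uses a 0-based start and an exclusive end. The function name, the four-column layout, and the example variant are assumptions for illustration; this is not SeqAnt's own output code.

```python
def variant_to_bed_line(chrom, pos_1based, ref_length=1, name="."):
    """Format one variant as a four-column BED line (chrom, start, end, name).

    BED intervals are 0-based and half-open, so a single-base variant at
    1-based position p spans [p - 1, p).
    """
    if pos_1based < 1:
        raise ValueError("positions are expected to be 1-based")
    start = pos_1based - 1
    end = start + ref_length
    return f"{chrom}\t{start}\t{end}\t{name}"


# Hypothetical single-nucleotide variant.
print(variant_to_bed_line("chrX", 100, name="snv1"))
```

The off-by-one conversion between 1-based variant positions and 0-based BED starts is a common source of misplaced tracks, which is why it is made explicit here.
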
15.
Ann Hum Genet ; 73(Pt 5): 502-13, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19573206

ABSTRACT

Novel methods of targeted sequencing of unique regions from complex eukaryotic genomes have generated a great deal of excitement, but critical demonstrations of these methods' efficacy with respect to diploid genotype calling and experimental variation are lacking. To address this issue, we optimized microarray-based genomic selection (MGS) for use with the Illumina Genome Analyzer (IGA). A set of 202 fragments (304 kb total) contained within a 1.7 Mb genomic region on human chromosome X were MGS/IGA sequenced in ten female HapMap samples generating a total of 2.4 GB of DNA sequence. At a minimum coverage threshold of 5X, 93.9% of all bases and 94.9% of segregating sites were called, while 57.7% of bases (57.4% of segregating sites) were called at a 50X threshold. Data accuracy at known segregating sites was 98.9% at 5X coverage, rising to 99.6% at 50X coverage. Accuracy at homozygous sites was 98.7% at 5X sequence coverage and 99.5% at 50X coverage. Although accuracy at heterozygous sites was modestly lower, it was still over 92% at 5X coverage and increased to nearly 97% at 50X coverage. These data provide the first demonstration that MGS/IGA sequencing can generate the very high quality sequence data necessary for human genetics research. All sequences generated in this study have been deposited in NCBI Short Read Archive (http://www.ncbi.nlm.nih.gov/Traces/sra, Accession # SRA007913).


Subject(s)
Human Genome , Oligonucleotide Array Sequence Analysis/methods , DNA Sequence Analysis/methods , /genetics , Human X Chromosomes/genetics , Diploidy , Female , Humans , /genetics
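
The abstract above reports the share of bases called at minimum coverage thresholds of 5X and 50X. A minimal sketch of the depth part of that calculation follows: it counts the fraction of positions whose read depth meets a threshold. The function name and the per-base depths are invented for illustration, and the study's actual calling procedure involves genotype calling beyond a simple depth filter.

```python
def fraction_at_coverage(depths, min_coverage):
    """Fraction of positions whose read depth is at least `min_coverage`."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(depth >= min_coverage for depth in depths) / len(depths)


# Hypothetical per-base read depths for eight positions.
toy_depths = [0, 3, 5, 12, 50, 75, 120, 4]
for threshold in (5, 50):
    print(threshold, fraction_at_coverage(toy_depths, threshold))
```

Raising the threshold trades completeness for accuracy, which matches the pattern in the abstract: fewer bases are called at 50X than at 5X, but those calls are more accurate.
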
16.
J Am Chem Soc ; 131(22): 7618-25, 2009 Jun 10.
Article in English | MEDLINE | ID: mdl-19445463

ABSTRACT

DB921 has a benzimidazole-biphenyl system with terminal amidines that gives the compound a linear conformation with a radius of curvature that does not match the DNA minor groove shape. Surprisingly, the compound binds in the groove with an unusually high equilibrium constant [Miao, Y.; Lee, M. P. H.; Parkinson, G. N.; Batista-Parra, A.; Ismail, M. A.; Neidle, S.; Boykin, D. W.; Wilson, W. D. Biochemistry 2005, 44, 14701-14708]. X-ray crystallographic analysis of DB921 bound to -AATT- in d(CGCGAATTCGCG)(2) showed that the benzimidazole is in position to directly interact with bases at the floor of the groove, while the phenylamidine of DB921 forms indirect contacts with the bases through an interfacial water. The DB921-water pair forms a curved, flexible module with a high K(a) (or a low K(d)) value of binding. To better understand the dynamics of the DB921-DNA complex and how water can be used in the design of compounds to recognize DNA, a 100 ns molecular dynamics simulation of the complex was conducted. In addition to the X-ray conformation, some significantly variant, dynamic conformations, which had additional interfacial water molecules between DB921 and DNA, appeared in the MD simulation. The benzimidazole contacts remained relatively constant through the entire simulation. The biphenylamidine end of the bound molecule, however, undergoes much larger changes in orientation relative to the floor of the groove as well as variations in the type of water interactions. The results provide an understanding of how water couples the linear DB921 compound to the minor groove for tight binding, without a large unfavorable contribution to the entropy of binding.


Subject(s)
Benzimidazoles/chemistry , Biphenyl Compounds/chemistry , DNA/chemistry , Pentamidine/chemistry , Water/chemistry , X-Ray Crystallography , Entropy , Hydrogen Bonding , Kinetics , Molecular Models , Nucleic Acid Conformation
17.
Bioorg Med Chem ; 14(9): 3144-52, 2006 May 01.
Article in English | MEDLINE | ID: mdl-16442293

ABSTRACT

African trypanosomes, Trypanosoma brucei rhodesiense (TBR) and Trypanosoma brucei gambiense (TBG), affect hundreds of thousands of lives in tropical regions of the world. The toxicity of the diamidine pentamidine, an effective drug against TBG, necessitates the design of better drugs. An orally effective prodrug of the diamidine, furamidine (DB75), presently scheduled for phase III clinical trials, has excellent activity against TBG with toxicity lower than that of pentamidine. As part of an effort to develop additional and improved diamidines against African trypanosomes, CoMFA and CoMSIA 3D QSAR analyses have been conducted with furamidine and a set of 25 other structurally related compounds. Two different alignment strategies, based on a putative kinetoplast DNA minor groove target, were used. Due to conserved electrostatic properties across the compounds, models that used only steric and electronic properties did not perform well in predicting biological results. An extended CoMSIA model with additional descriptors for hydrophobic, donor, and acceptor properties had good predictive ability, with q² = 0.699, r² = 0.974, standard error of estimate (SEE) = 0.1, and F = 120.04. The results have been used as a guide to design compounds that, potentially, have better activity against African trypanosomes.


Subject(s)
Antiparasitic Agents/chemistry , Antiparasitic Agents/pharmacology , Pentamidine/chemistry , Pentamidine/pharmacology , Quantitative Structure-Activity Relationship , Animals , Cyclization , Factual Databases , Hydrogen Bonding , Hydrophobic and Hydrophilic Interactions , Molecular Models , Molecular Structure , Static Electricity , Trypanosoma brucei brucei/chemistry , Trypanosoma brucei brucei/drug effects
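
The abstract above quotes a cross-validated q² next to the fitted r². As an illustration of how q² is commonly defined in QSAR work, the sketch below computes it as one minus the ratio of the predictive residual sum of squares (PRESS) to the total sum of squares around the observed mean. The function name and the activity values are invented for illustration; some definitions use a per-fold training-set mean instead, and this is not the CoMSIA software's implementation.

```python
def q_squared(observed, cv_predicted):
    """Cross-validated q2 = 1 - PRESS / SS_total.

    `cv_predicted` holds, for each compound, the prediction made by a model
    that was trained without that compound.
    """
    if len(observed) != len(cv_predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, cv_predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    if ss_total == 0.0:
        raise ValueError("observed values must not all be identical")
    return 1.0 - press / ss_total


# Hypothetical activities and leave-one-out predictions.
toy_observed = [1.0, 2.0, 3.0, 4.0]
toy_cv_predicted = [1.5, 2.0, 2.5, 4.0]
print(q_squared(toy_observed, toy_cv_predicted))
```

Because q² is computed from held-out predictions, it is typically lower than r² for the same model, as in the values reported in the abstract.
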