Results 1 - 20 of 29
1.
Bioinformatics ; 39(8), 2023 08 01.
Article in English | MEDLINE | ID: mdl-37594752

ABSTRACT

MOTIVATION: Increasing efforts are being made in the field of machine learning to advance the learning of robust and accurate models from experimentally measured data and enable more efficient drug discovery processes. The prediction of binding affinity is one of the most frequent tasks of compound bioactivity modelling. Learned models for binding affinity prediction are assessed by their average performance on unseen samples, but point predictions are typically not provided with a rigorous confidence assessment. Approaches such as the conformal predictor framework equip conventional models with a more rigorous assessment of confidence for individual point predictions. In this article, we extend the inductive conformal prediction framework for interaction data, in particular the compound-target binding affinity prediction task. The new framework is based on dynamically defined calibration sets that are specific for each testing pair and provides prediction assessment in the context of calibration pairs from its compound-target neighbourhood, enabling improved estimates based on the local properties of the prediction model. RESULTS: The effectiveness of the approach is benchmarked on several publicly available datasets and tested in realistic use-case scenarios with increasing levels of difficulty on a complex compound-target binding affinity space. We demonstrate that in such scenarios, the novel approach, which combines the applicability domain paradigm with the conformal prediction framework, produces superior confidence assessment with valid and more informative prediction regions compared to other state-of-the-art conformal prediction approaches. AVAILABILITY AND IMPLEMENTATION: The dataset and the code are available on GitHub (https://github.com/mlkr-rbi/dAD).


Subject(s)
Benchmarking , Drug Discovery , Calibration , Machine Learning , Molecular Conformation
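
The idea of a calibration set chosen per test pair can be illustrated with a minimal sketch. This is not the authors' dAD implementation (see their repository for that); it is a generic inductive conformal regressor in which the calibration residuals are taken from the nearest calibration pairs. The Euclidean distance, the feature tuples, and the names `local_calibration` and `conformal_interval` are assumptions made for illustration.

```python
import math

def local_calibration(test_features, calib_set, k_neighbours):
    """Return residuals of the k calibration pairs closest to the test pair.

    calib_set is a list of (features, residual) tuples; Euclidean distance
    is an assumption, any pair similarity could be substituted.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(calib_set, key=lambda item: dist(item[0], test_features))
    return [residual for _, residual in ranked[:k_neighbours]]

def conformal_interval(point_pred, calib_residuals, alpha=0.1):
    """Symmetric prediction interval from absolute calibration residuals."""
    scores = sorted(abs(r) for r in calib_residuals)
    n = len(scores)
    # Conformal quantile rank (1-based): ceil((n + 1) * (1 - alpha)).
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points to support this confidence level.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (point_pred - q, point_pred + q)
```

A small neighbourhood gives more local estimates but limits the confidence levels that can be reached, which is why the sketch returns an unbounded interval when the neighbourhood is too small for the requested `alpha`.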
2.
Molecules ; 26(19), 2021 Sep 23.
Article in English | MEDLINE | ID: mdl-34641295

ABSTRACT

Owing to the sedentary lifestyle of gorgonian corals and the harsh environmental conditions they face, their extracts are recognized as a rich source of novel compounds with various biological activities, of interest to the pharmaceutical and cosmetic industries. The presented study aimed to perform chemical screening of organic extracts and semi-purified fractions obtained from the common Adriatic gorgonian, the sea fan Eunicella cavolini (Koch, 1887), and to explore their ability to exert different biological effects in vitro. Qualitative chemical evaluation revealed the presence of several classes of secondary metabolites and was extended with mass spectrometry analysis and tentative dereplication using the Global Natural Product Social Molecular Networking (GNPS) online platform. Furthermore, fractions F4 and F3 showed the highest phenolic (3.28 ± 0.04 mg GAE/g sample) and carotene (23.11 ± 2.48 mg β-CA/g sample) content, respectively. The fraction F3 inhibited 50% of DPPH (2,2-diphenyl-1-picryl-hydrazyl-hydrate) and ABTS (2,2'-azino-bis (3-ethylbenzthiazolin-6-yl) sulfonic acid) radicals at the concentrations of 767.09 ± 11.57 and 157.16 ± 10.83 µg/mL, respectively. The highest anti-inflammatory potential was exhibited by F2 (IC50 = 198.70 ± 28.77 µg/mL) regarding the inhibition of albumin denaturation and F1 (IC50 = 254.49 ± 49.17 µg/mL) in terms of soybean lipoxygenase inhibition. In addition, pronounced antiproliferative effects were observed for all samples (IC50 ranging from 0.82 ± 0.14 to 231.18 ± 46.13 µg/mL) against several carcinoma cell lines, but also towards non-transformed human fibroblasts, pointing to a generally cytotoxic effect. The antibacterial activity was tested by broth microdilution assay against three human pathogenic bacteria: Escherichia coli, Pseudomonas aeruginosa, and Staphylococcus aureus. The latter was the most affected by fractions F2 and F3. Finally, further purification, isolation and characterization of pure compounds from the most active fractions are under investigation.


Subject(s)
Anthozoa/chemistry , Anti-Bacterial Agents/pharmacology , Anti-Inflammatory Agents/pharmacology , Antineoplastic Agents/pharmacology , Antioxidants/pharmacology , Biological Factors/pharmacology , Animals , Anti-Bacterial Agents/chemistry , Anti-Bacterial Agents/isolation & purification , Anti-Inflammatory Agents/chemistry , Anti-Inflammatory Agents/isolation & purification , Antineoplastic Agents/chemistry , Antineoplastic Agents/isolation & purification , Antioxidants/chemistry , Antioxidants/isolation & purification , Biological Factors/chemistry , Biological Factors/isolation & purification , Cell Line, Tumor , Cell Proliferation/drug effects , Cell Survival/drug effects , Escherichia coli/drug effects , Hep G2 Cells , Humans , MCF-7 Cells , Mass Spectrometry , Microbial Sensitivity Tests , Molecular Structure , Pseudomonas aeruginosa , Secondary Metabolism , Staphylococcus aureus/drug effects
3.
Molecules ; 26(12), 2021 Jun 08.
Article in English | MEDLINE | ID: mdl-34201401

ABSTRACT

The limited number of medicinal products available to treat fungal infections makes control of fungal pathogens problematic, especially since the number of fungal resistance incidents is increasing. Given the high costs and slow development of new antifungal treatment options, repurposing of already known compounds is one of the proposed strategies. The objective of this study was to perform in vitro experimental tests of lead compounds identified in our previous in silico drug repurposing study, which had been conducted on the DrugBank database using a seven-step procedure that includes machine learning and molecular docking. This study identifies siramesine as a novel antifungal agent. This novel indication was confirmed through in vitro testing using several yeast species and one mold. The results showed susceptibility of Candida species to siramesine with a MIC of 12.5 µg/mL, whereas other candidates had no antifungal activity. Siramesine was also effective against in vitro biofilm formation, and already formed biofilm was reduced following 24 h treatment with an MBEC range of 50-62.5 µg/mL. Siramesine is involved in modulation of ergosterol biosynthesis in vitro, which indicates that this pathway is a potential target of its antifungal activity. This implies the possibility of siramesine repurposing, especially since there are already published data about its nontoxicity. Following our in vitro results, we provide additional in-depth in silico analysis of siramesine and structurally similar compounds, providing an extended lead set for further preclinical and clinical investigation, which is needed to clearly define molecular targets and to elucidate its in vivo effectiveness as well.


Subject(s)
Antifungal Agents/chemistry , Antifungal Agents/pharmacology , Indoles/chemistry , Indoles/pharmacology , Spiro Compounds/chemistry , Spiro Compounds/pharmacology , Biofilms/drug effects , Candida/drug effects , Computer Simulation , Drug Repositioning/methods , Ergosterol/metabolism , Machine Learning , Molecular Docking Simulation/methods
4.
Molecules ; 26(14), 2021 Jul 16.
Article in English | MEDLINE | ID: mdl-34299598

ABSTRACT

In this work we introduce a novel filtering and molecular modeling pipeline based on a fingerprint and descriptor similarity procedure, coupled with molecular docking and molecular dynamics (MD), to select potential novel quinone outside inhibitors (QoI) of cytochrome bc1 with the aim of identifying chromophores that are the same as or different from the usual ones. The study was carried out using the yeast cytochrome bc1 complex with its docked ligand (stigmatellin), using all the fungicides from the FRAC code C3 mode of action, 8617 DrugBank compounds and 401,624 COCONUT compounds. The introduced drug repurposing pipeline consists of compound similarity with C3 fungicides, molecular docking, and MD simulations with final QM/MM binding energy determination, while aiming for potential novel chromophores and preserving at least an amide (R1HN(C=O)R2) or ester functional group, as found in almost all up-to-date C3 fungicides. 3D descriptors used for the similarity test were based on the 280 most stable PaDEL descriptors. Hit compounds that passed the fingerprint and 3D descriptor similarity condition and had either an amide or an ester group were submitted to docking, where they further had to satisfy both Chemscore fitness and specific conformation constraints. This rigorous selection resulted in a very limited number of candidates that were forwarded to MD simulations and QM/MM binding affinity estimations by the ORCA DFT program. In this final step, stringent criteria based on (a) sufficiently high frequency of H-bonds; (b) high interaction energy between protein and ligand through the whole MD trajectory; and (c) high enough QM/MM binding energy scores were applied to further filter candidate inhibitors. This elaborate search pipeline finally led to four synthetic lead compounds (DrugBank) and seven natural lead compounds (COCONUT database), tentative new inhibitors of cytochrome bc1. These eleven lead compounds were additionally validated through a comparison of MM/PBSA free binding energy for new leads against those obtained for 19 QoIs.


Subject(s)
Electron Transport Complex III/antagonists & inhibitors , Enzyme Inhibitors/chemistry , Molecular Docking Simulation , Molecular Dynamics Simulation , Saccharomyces cerevisiae Proteins/antagonists & inhibitors , Saccharomyces cerevisiae/enzymology , Drug Evaluation, Preclinical , Electron Transport Complex III/chemistry , Saccharomyces cerevisiae Proteins/chemistry
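
The fingerprint similarity step of such a pipeline can be sketched as follows. The abstract does not name the similarity measure, so the Tanimoto coefficient used here is an assumption (it is a common choice for binary fingerprints), as are the function names and the representation of a fingerprint as a set of on-bit indices.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)

def similar_to_any(candidate_fp, reference_fps, threshold):
    """True if the candidate reaches the threshold against at least one reference."""
    return any(tanimoto(candidate_fp, ref) >= threshold for ref in reference_fps)
```

In a screening setting, `reference_fps` would hold the fingerprints of the known inhibitors and `threshold` would be tuned on them; no threshold value is given in the abstract.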
5.
Sci Rep ; 11(1): 11479, 2021 06 01.
Article in English | MEDLINE | ID: mdl-34075109

ABSTRACT

Widespread use of herbicides results in a global increase in weed resistance. The rotational use of herbicides according to their modes of action (MoAs) and the discovery of novel phytotoxic molecules are the two strategies used against weed resistance. Herein, Random Forest modeling was used to build predictive models and establish a comprehensive characterization of structure-activity relationships underlying herbicide classifications according to their MoAs and weed selectivity. By combining the predictive models with herbicide-likeness rules defined by selected molecular features (numbers of H-bond acceptors and donors, logP, topological and relative polar surface area, and net charge), a virtual stepwise screening platform is proposed for characterizing the phytotoxic properties of low-molecular-weight molecules. The screening cascade was applied to a data set of phytotoxic natural products. The obtained results may be valuable for refinement of herbicide rotation programs as well as for discovery of novel herbicides, primarily among natural products as a source of molecules with novel structures, modes of action and translocation profiles as compared with synthetic compounds.

6.
Entropy (Basel) ; 22(8), 2020 Aug 18.
Article in English | MEDLINE | ID: mdl-33286675

ABSTRACT

Machines usually employ a guess-and-check strategy to analyze data: they take the data, make a guess, check the answer, adjust it with regard to the correct one if necessary, and try again on a new data set. An active learning environment guarantees better performance while training on less, but carefully chosen, data, which reduces the costs of both annotating and analyzing large data sets. This issue becomes even more critical for deep learning applications. Human-like active learning integrates a variety of strategies and instructional models chosen by a teacher to contribute to learners' knowledge, while machine active learning strategies lack versatile tools for shifting the focus of instruction away from knowledge transmission to learners' knowledge construction. We approach this gap by considering an active learning environment in an educational setting. We propose a new strategy that measures the information capacity of data using the information function from the four-parameter logistic item response theory (4PL IRT). We compared the proposed strategy with the most common active learning strategies, Least Confidence and Entropy Sampling. The results of computational experiments showed that the Information Capacity strategy shares similar behavior but provides a more flexible framework for building transparent knowledge models in deep learning.
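
The two baseline acquisition scores and the 4PL item information function can be written down compactly. The sketch below uses the standard textbook definitions; how the authors map model outputs to the IRT parameters is not stated in the abstract, so the parameter names (`a` discrimination, `b` difficulty, `c` lower asymptote, `d` upper asymptote) and the helper `pick_most_informative` are assumptions for illustration only.

```python
import math

def least_confidence(probs):
    """Uncertainty as one minus the probability of the most likely class."""
    return 1.0 - max(probs)

def entropy(probs):
    """Shannon entropy (natural log) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def info_4pl(theta, a, b, c, d):
    """Fisher information of a 4PL item at ability theta.

    P(theta) = c + (d - c) / (1 + exp(-a * (theta - b)))
    I(theta) = a^2 (P - c)^2 (d - P)^2 / ((d - c)^2 P (1 - P))
    """
    p = c + (d - c) / (1.0 + math.exp(-a * (theta - b)))
    return (a ** 2) * (p - c) ** 2 * (d - p) ** 2 / ((d - c) ** 2 * p * (1.0 - p))

def pick_most_informative(pool, score):
    """Index of the pool item with the highest acquisition score."""
    return max(range(len(pool)), key=lambda i: score(pool[i]))
```

With `c = 0` and `d = 1` the information function reduces to the familiar two-parameter form `a^2 * P * (1 - P)`, which peaks where the item difficulty matches the ability.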

7.
Molecules ; 25(9), 2020 May 08.
Article in English | MEDLINE | ID: mdl-32397151

ABSTRACT

Novel machine learning and molecular modelling filtering procedures for drug repurposing have been carried out for the recognition of the novel fungicide targets Cyp51 and Erg2. Classification and regression approaches on molecular descriptors have been performed using forward stepwise multilinear regression (FS-MLR), uninformative-variable elimination partial least squares regression, and a non-linear method called Forward Stepwise Limited Correlation Random Forest (FS-LM-RF). Altogether, 112 prediction models from two different approaches have been built for the descriptor recognition of fungicide hit compounds. Aiming at the fungal targets of sterol biosynthesis in membranes, antifungal hit compounds have been selected for docking experiments from the DrugBank database using the AutoDock4 molecular docking program. The results were verified by Gold Protein-Ligand Docking Software. The best-docked conformation of each high-scoring ligand was submitted to quantum mechanics/molecular mechanics (QM/MM) gradient optimization with final single point calculations taking into account both the basis set superposition error and thermal corrections (with frequency calculations). Finally, seven DrugBank lead compounds were selected based on their high QM/MM scores for the Cyp51 target, and three were selected for the Erg2 target. These lead compounds could be recommended for further in vitro studies.


Subject(s)
Fungicides, Industrial/chemistry , Machine Learning , Molecular Docking Simulation , Molecular Dynamics Simulation , Software , Workflow
8.
Sci Rep ; 9(1): 19537, 2019 12 20.
Article in English | MEDLINE | ID: mdl-31863070

ABSTRACT

Genes with similar roles in the cell cluster on chromosomes, thus benefiting from coordinated regulation. This allows gene function to be inferred by transferring annotations from genomic neighbors, following the guilt-by-association principle. We performed a systematic search for co-occurrence of >1000 gene functions in genomic neighborhoods across 1669 prokaryotic, 49 fungal and 80 metazoan genomes, revealing prevalent patterns that cannot be explained by clustering of functionally similar genes. It is a very common occurrence that pairs of dissimilar gene functions - corresponding to semantically distant Gene Ontology terms - are significantly co-located on chromosomes. These neighborhood associations are often as conserved across genomes as the known associations between similar functions, suggesting selective benefits from clustering of certain diverse functions, which may conceivably play complementary roles in the cell. We propose a simple encoding of chromosomal gene order, the neighborhood function profiles (NFP), which draws on diverse gene clustering patterns to predict gene function and phenotype. NFPs yield a 26-46% increase in predictive power over state-of-the-art approaches that propagate function across neighborhoods, thus providing hundreds of novel, high-confidence gene function inferences per genome. Furthermore, we demonstrate that copy number-neutral structural variation that shapes gene function distribution across chromosomes can predict phenotype of individuals from their genome sequence.


Subject(s)
Multigene Family/genetics , Chromosome Mapping , Gene Ontology , Gene Order/genetics , Phenotype
9.
PLoS One ; 14(7): e0219004, 2019.
Article in English | MEDLINE | ID: mdl-31276469

ABSTRACT

Recent research in machine learning pointed to the core problem of state-of-the-art models which impedes their widespread adoption in different domains. The models' inability to differentiate between noise and subtle, yet significant variation in data leads to their vulnerability to adversarial perturbations that cause wrong predictions with high confidence. The study is aimed at identifying whether the algorithms inspired by biological evolution may achieve better results in cases where brittle robustness properties are highly sensitive to the slight noise. To answer this question, we introduce the new robust gradient descent inspired by the stability and adaptability of biological systems to unknown and changing environments. The proposed optimization technique involves an open-ended adaptation process with regard to two hyperparameters inherited from the generalized Verhulst population growth equation. The hyperparameters increase robustness to adversarial noise by penalizing the degree to which hardly visible changes in gradients impact prediction. The empirical evidence on synthetic and experimental datasets confirmed the viability of the bio-inspired gradient descent and suggested promising directions for future research. The code used for computational experiments is provided in a repository at https://github.com/yukinoi/bio_gradient_descent.


Subject(s)
Machine Learning , Models, Theoretical
10.
Microbiome ; 6(1): 129, 2018 07 10.
Article in English | MEDLINE | ID: mdl-29991352

ABSTRACT

BACKGROUND: The function of many genes is still not known even in model organisms. An increasing availability of microbiome DNA sequencing data provides an opportunity to infer gene function in a systematic manner. RESULTS: We evaluated if the evolutionary signal contained in metagenome phyletic profiles (MPP) is predictive of a broad array of gene functions. The MPPs are an encoding of environmental DNA sequencing data that consists of relative abundances of gene families across metagenomes. We find that such MPPs can accurately predict 826 Gene Ontology functional categories, while drawing on human gut microbiomes, ocean metagenomes, and DNA sequences from various other engineered and natural environments. Overall, in this task, the MPPs are highly accurate, and moreover they provide coverage for a set of Gene Ontology terms largely complementary to standard phylogenetic profiles, derived from fully sequenced genomes. We also find that metagenomes approximated from taxon relative abundance obtained via 16S rRNA gene sequencing may provide surprisingly useful predictive models. Crucially, the MPPs derived from different types of environments can infer distinct, non-overlapping sets of gene functions and therefore complement each other. Consistently, simulations on > 5000 metagenomes indicate that the amount of data is not in itself critical for maximizing predictive accuracy, while the diversity of sampled environments appears to be the critical factor for obtaining robust models. CONCLUSIONS: In past work, metagenomics has provided invaluable insight into ecology of various habitats, into diversity of microbial life and also into human health and disease mechanisms. We propose that environmental DNA sequencing additionally constitutes a useful tool to predict biological roles of genes, yielding inferences out of reach for existing comparative genomics approaches.


Subject(s)
Metagenomics/methods , Multigene Family , Sequence Analysis, DNA/methods , Evolution, Molecular , Gene Ontology , Genomics , Humans , Phylogeny , RNA, Ribosomal, 16S/genetics , Water Microbiology
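
The MPP encoding is described as relative abundances of gene families across metagenomes. A minimal sketch of that encoding, assuming raw per-sample counts as input, could look like this. The input layout, the function name `phyletic_profiles`, and the sample and family labels in the usage note are hypothetical; the paper's actual normalization may differ.

```python
def phyletic_profiles(counts):
    """Build one relative-abundance vector per gene family.

    counts: {metagenome: {gene_family: read_count}}
    returns: (ordered metagenome names, {gene_family: [abundance per metagenome]})
    """
    metagenomes = sorted(counts)
    families = sorted({fam for sample in counts.values() for fam in sample})
    profiles = {fam: [] for fam in families}
    for name in metagenomes:
        sample = counts[name]
        total = sum(sample.values())
        for fam in families:
            # Families absent from a sample get abundance 0.
            profiles[fam].append(sample.get(fam, 0) / total if total else 0.0)
    return metagenomes, profiles
```

For example, calling it on `{"gut": {"famA": 3, "famB": 1}, "ocean": {"famB": 2, "famC": 2}}` yields one two-element vector per family, which could then serve as the feature vector for a function classifier.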
11.
PLoS One ; 12(10): e0187364, 2017.
Article in English | MEDLINE | ID: mdl-29088293

ABSTRACT

Based on a set of subjects and a collection of attributes obtained from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database, we used redescription mining to find interpretable rules revealing associations between those determinants that provide insights into Alzheimer's disease (AD). We extended the CLUS-RM redescription mining algorithm to a constraint-based redescription mining (CBRM) setting, which enables several modes of targeted exploration of specific, user-constrained associations. Redescription mining enabled finding specific constructs of clinical and biological attributes that describe many groups of subjects of different size, homogeneity and levels of cognitive impairment. We confirmed some previously known findings. However, in some instances, as with the attributes testosterone, ciliary neurotrophic factor, brain natriuretic peptide, Fas ligand, the imaging attribute Spatial Pattern of Abnormalities for Recognition of Early AD, as well as the levels of leptin and angiopoietin-2 in plasma, we corroborated previously debatable findings or provided additional information about these variables and their association with AD pathogenesis. Moreover, applying redescription mining to ADNI data resulted in the discovery of one largely unknown attribute: the Pregnancy-Associated Plasma Protein-A (PAPP-A), which we found highly associated with cognitive impairment in AD. Statistically significant correlations (p ≤ 0.01) were found between PAPP-A and clinical tests: Alzheimer's Disease Assessment Scale, Clinical Dementia Rating Sum of Boxes, Mini Mental State Examination, etc. The high importance of this finding lies in the fact that PAPP-A is a metalloproteinase, known to cleave insulin-like growth factor binding proteins. Since it also shares similar substrates with the A Disintegrin and Metalloproteinase (ADAM) family of enzymes that act as α-secretase to physiologically cleave amyloid precursor protein (APP) in the non-amyloidogenic pathway, it could be directly involved in the metabolism of APP very early during the disease course. Therefore, further studies should investigate the role of PAPP-A in the development of AD more thoroughly.


Subject(s)
Alzheimer Disease/pathology , Cognition Disorders/pathology , Algorithms , Humans
12.
Nucleic Acids Res ; 44(21): 10074-10090, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27915291

ABSTRACT

Bacteria and Archaea display a variety of phenotypic traits and can adapt to diverse ecological niches. However, systematic annotation of prokaryotic phenotypes is lacking. We have therefore developed ProTraits, a resource containing ∼545 000 novel phenotype inferences, spanning 424 traits assigned to 3046 bacterial and archaeal species. These annotations were assigned by a computational pipeline that associates microbes with phenotypes by text-mining the scientific literature and the broader World Wide Web, while also being able to define novel concepts from unstructured text. Moreover, the ProTraits pipeline assigns phenotypes by drawing extensively on comparative genomics, capturing patterns in gene repertoires, codon usage biases, proteome composition and co-occurrence in metagenomes. Notably, we find that gene synteny is highly predictive of many phenotypes, and highlight examples of gene neighborhoods associated with spore-forming ability. A global analysis of trait interrelatedness outlined clusters in the microbial phenotype network, suggesting common genetic underpinnings. Our extended set of phenotype annotations allows detection of 57 088 high confidence gene-trait links, which recover many known associations involving sporulation, flagella, catalase activity, aerobicity, photosynthesis and other traits. Over 99% of the commonly occurring gene families are involved in genetic interactions conditional on at least one phenotype, suggesting that epistasis has a major role in shaping microbial gene content.


Subject(s)
Archaea/genetics , Bacteria/genetics , Databases, Genetic , Phenotype , Codon , Computational Biology/methods , Data Mining , Genes, Archaeal , Genes, Bacterial , Genome, Archaeal , Genome, Bacterial , Metagenome , Molecular Sequence Annotation , Multifactorial Inheritance , Reproducibility of Results
13.
Bioinformatics ; 32(23): 3645-3653, 2016 12 01.
Article in English | MEDLINE | ID: mdl-27522084

ABSTRACT

MOTIVATION: The number of sequenced genomes rises steadily, but we still lack knowledge about the biological roles of many genes. Automated function prediction (AFP) is thus a necessity. We hypothesized that AFP approaches that draw on distinct genome features may be useful for predicting different types of gene functions, motivating a systematic analysis of the benefits gained by obtaining and integrating such predictions. RESULTS: Our pipeline amalgamates 5 133 543 genes from 2071 genomes in a single massive analysis that evaluates five established genomic AFP methodologies. While 1227 Gene Ontology (GO) terms yielded reliable predictions, the majority of these functions were accessible to only one or two of the methods. Moreover, different methods tend to assign a GO term to non-overlapping sets of genes. Thus, inferences made by diverse genomic AFP methods display a striking complementarity, both gene-wise and function-wise. Because of this, a viable integration strategy is to rely on a single most-confident prediction per gene/function, rather than enforcing agreement across multiple AFP methods. Using an information-theoretic approach, we estimate that current databases contain 29.2 bits/gene of known Escherichia coli gene functions. This can be increased by up to 5.5 bits/gene using individual AFP methods or by 11 additional bits/gene upon integration, thereby providing a highly-ranking predictor on the Critical Assessment of Function Annotation 2 community benchmark. Availability of more sequenced genomes boosts the predictive accuracy of AFP approaches and also the benefit from integrating them. AVAILABILITY AND IMPLEMENTATION: The individual and integrated GO predictions for the complete set of genes are available from http://gorbi.irb.hr/. CONTACT: fran.supek@irb.hr. SUPPLEMENTARY INFORMATION: Supplementary materials are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Gene Ontology , Genomics/methods , Algorithms , Genome , Machine Learning , Models, Theoretical
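
The integration strategy named in the abstract, keeping the single most-confident prediction per gene/function instead of requiring agreement, can be sketched in a few lines. The input layout and the function name `integrate_most_confident` are assumptions, and the sketch presumes that confidence scores from different methods are already on a comparable scale (for example, calibrated precision), which the abstract does not detail.

```python
def integrate_most_confident(predictions):
    """Keep the highest-confidence call for every (gene, GO term) pair.

    predictions: {method: {(gene, go_term): confidence}}
    returns: {(gene, go_term): (confidence, method)}
    """
    best = {}
    for method, scores in predictions.items():
        for key, conf in scores.items():
            if key not in best or conf > best[key][0]:
                best[key] = (conf, method)
    return best
```

Because the methods tend to cover non-overlapping genes and functions, taking the maximum preserves predictions that only one method can make, whereas a consensus rule would discard them.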
14.
Phys Rev Lett ; 114(24): 248701, 2015 Jun 19.
Article in English | MEDLINE | ID: mdl-26197016

ABSTRACT

Detection of patient zero can give new insights to epidemiologists about the nature of first transmissions into a population. In this Letter, we study the statistical inference problem of detecting the source of epidemics from a snapshot of spreading on an arbitrary network structure. By using exact analytic calculations and Monte Carlo estimators, we demonstrate the detectability limits for the susceptible-infected-recovered model, which primarily depend on the spreading process characteristics. Finally, we demonstrate the applicability of the approach in a case of a simulated sexually transmitted infection spreading over an empirical temporal network of sexual interactions.


Subject(s)
Contact Tracing/methods , Models, Statistical , Sexually Transmitted Diseases/epidemiology , Computer Simulation , Epidemiologic Methods , Humans , Monte Carlo Method , Sexually Transmitted Diseases/transmission
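
A Monte Carlo approach to source detection can be illustrated with a toy estimator: simulate the epidemic from each candidate source and record how often the simulated snapshot reproduces the observed one. This is a simplified sketch, not the estimator from the Letter; the discrete-time SIR dynamics, the exact-match criterion, and all names (`simulate_sir`, `source_match_rates`, `p`, `q`) are assumptions.

```python
import random

def simulate_sir(adj, source, p, q, steps, rng):
    """Discrete-time SIR on a graph; returns (infected, recovered) after `steps`.

    p is the per-contact infection probability, q the per-step recovery probability.
    """
    infected, recovered = {source}, set()
    for _ in range(steps):
        new_infected = set()
        for node in infected:
            for nb in adj[node]:
                susceptible = nb not in infected and nb not in recovered
                if susceptible and rng.random() < p:
                    new_infected.add(nb)
        new_recovered = {n for n in infected if rng.random() < q}
        infected = (infected | new_infected) - new_recovered
        recovered |= new_recovered
    return infected, recovered

def source_match_rates(adj, observed, candidates, p, q, steps, n_sim=200, seed=0):
    """Fraction of simulations from each candidate that reproduce the observed snapshot."""
    rng = random.Random(seed)
    rates = {}
    for s in candidates:
        hits = sum(simulate_sir(adj, s, p, q, steps, rng) == observed for _ in range(n_sim))
        rates[s] = hits / n_sim
    return rates
```

Exact matching becomes very rare on larger networks, so a practical estimator would replace it with a similarity measure between snapshots; the sketch keeps exact matching only because it is the simplest correct instance of the idea.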
15.
Sci Rep ; 4: 5038, 2014 May 22.
Article in English | MEDLINE | ID: mdl-24849598

ABSTRACT

Motivated by recent financial crises, significant research efforts have been put into studying contagion effects and herding behaviour in financial markets. Much less has been said regarding the influence of financial news on financial markets. We propose a novel measure of collective behaviour based on financial news on the Web, the News Cohesiveness Index (NCI), and we demonstrate that the index can be used as a financial market volatility indicator. We evaluate the NCI using financial documents from large Web news sources on a daily basis from October 2011 to July 2013 and analyse the interplay between financial markets and finance-related news. We hypothesise that strong cohesion in financial news reflects movements in the financial markets. Our results indicate that cohesiveness in financial news is highly correlated with and driven by volatility in financial markets.

16.
J Med Chem ; 56(14): 5691-708, 2013 Jul 25.
Article in English | MEDLINE | ID: mdl-23772653

ABSTRACT

P-glycoprotein (P-gp, MDR1) is a promiscuous drug efflux pump of substantial pharmacological importance. Taking advantage of large-scale cytotoxicity screening data involving 60 cancer cell lines, we correlated the differential biological activities of ∼13,000 compounds against cellular P-gp levels. We created a large set of 934 high-confidence P-gp substrates or nonsubstrates by enforcing agreement with an orthogonal criterion involving P-gp overexpressing ADR-RES cells. A support vector machine (SVM) was 86.7% accurate in discriminating P-gp substrates on independent test data, exceeding previous models. Two molecular features had an overarching influence: nearly all P-gp substrates were large (>35 atoms including H) and dense (specific volume of <7.3 Å³/atom) molecules. Seven other descriptors and 24 molecular fragments ("effluxophores") were found enriched in the (non)substrates and incorporated into interpretable rule-based models. Biological experiments on an independent P-gp overexpressing cell line, the vincristine-resistant VK2, allowed us to reclassify six compounds previously annotated as substrates, validating our method's predictive ability. Models are freely available at http://pgp.biozyne.com.


Subject(s)
ATP Binding Cassette Transporter, Subfamily B, Member 1/metabolism , Drug Screening Assays, Antitumor , ATP Binding Cassette Transporter, Subfamily B, Member 1/chemistry , Cell Line, Tumor , Humans , Quantitative Structure-Activity Relationship , Vincristine/pharmacology
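
The two dominant features reported in the abstract translate into a simple pre-screen. The sketch below encodes only those two thresholds; it is not the published SVM or rule-based model, and it assumes that specific volume means molecular volume divided by the atom count (suggested by the unit Å³/atom). The function name is hypothetical.

```python
MIN_ATOMS = 35             # substrates had more than 35 atoms, hydrogens included
MAX_SPECIFIC_VOLUME = 7.3  # cubic angstroms per atom

def passes_size_density_screen(n_atoms, molecular_volume):
    """True if a molecule is both large and dense by the two reported thresholds.

    molecular_volume is in cubic angstroms; n_atoms counts hydrogens.
    """
    if n_atoms <= 0:
        raise ValueError("n_atoms must be positive")
    specific_volume = molecular_volume / n_atoms
    return n_atoms > MIN_ATOMS and specific_volume < MAX_SPECIFIC_VOLUME
```

Since the abstract says nearly all substrates satisfy both conditions, the check is best read as a necessary-condition filter: failing it argues against a compound being a substrate, while passing it does not establish that it is one.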
17.
PLoS Comput Biol ; 9(1): e1002852, 2013.
Article in English | MEDLINE | ID: mdl-23308060

ABSTRACT

New microbial genomes are sequenced at a high pace, allowing insight into the genetics of not only cultured microbes, but a wide range of metagenomic collections such as the human microbiome. To understand the deluge of genomic data we face, computational approaches for gene functional annotation are invaluable. We introduce a novel model for computational annotation that refines two established concepts: annotation based on homology and annotation based on phyletic profiling. The phyletic profiling-based model that includes both inferred orthologs and paralogs (homologs separated by a speciation and a duplication event, respectively) provides more annotations at the same average Precision than the model that includes only inferred orthologs. For experimental validation, we selected 38 poorly annotated Escherichia coli genes for which the model assigned one of three GO terms with high confidence: involvement in DNA repair, protein translation, or cell wall synthesis. Results of antibiotic stress survival assays on E. coli knockout mutants showed high agreement with our model's estimates of accuracy: out of 38 predictions obtained at the reported Precision of 60%, we confirmed 25 predictions, indicating that our confidence estimates can be used to make informed decisions on experimental validation. Our work will contribute to making experimental validation of computational predictions more approachable, both in cost and time. Our predictions for 998 prokaryotic genomes include ~400 000 specific annotations with the estimated Precision of 90%, ~19 000 of which are highly specific (e.g. "penicillin binding," "tRNA aminoacylation for protein translation," or "pathogenesis"), and are freely available at http://gorbi.irb.hr/.


Subject(s)
Gene Expression Profiling , Phylogeny , Escherichia coli/genetics , Bacterial Genes , Theoretical Models
18.
Invest New Drugs ; 30(2): 450-67, 2012 Apr.
Article in English | MEDLINE | ID: mdl-21046426

ABSTRACT

Six recently synthesized cyano-substituted heteroaryles, which do not bind to DNA but are highly cytotoxic against the human tumor cell line HeLa, were analyzed for their antitumor mechanisms of action (MOA). They did not interfere with the expression of human papillomavirus oncogenes integrated in the HeLa cell genome, but they did induce strong G1 arrest and result in the activation of caspase-3 and apoptosis. A computational analysis was performed that compared the antiproliferative activities of our compounds in 13 different tumor cell lines with those of compounds listed in the National Cancer Institute database. The results indicate that interference with cytoskeletal function and inhibition of mitosis are the likely antitumor MOA. Furthermore, a second in silico investigation revealed that the tumor cells that are sensitive to the cyano-substituted compounds show differences in their expression of locomotion genes compared with that of insensitive cell lines, thus corroborating the involvement of the cytoskeleton. This MOA was also confirmed experimentally: the cyano-substituted heteroaryles disrupted the actin and the tubulin networks in HeLa cells and inhibited cellular migration. However, further analysis indicated that multiple MOA may exist that depend on the position of the cyano-group; while cyano-substituted naphthiophene reduced the expression of cytoskeletal proteins, cyano-substituted thieno-thiophene-carboxanilide inhibited the formation of cellular reactive oxygen species.


Subject(s)
Cell Proliferation/drug effects , Heterocyclic Compounds/pharmacology , Uterine Cervical Neoplasms/pathology , Apoptosis/drug effects , Caspase 3/metabolism , Cell Cycle/drug effects , Cell Movement/drug effects , Computer Simulation , Cyclin-Dependent Kinase Inhibitor p21/metabolism , Cyclin-Dependent Kinase Inhibitor p27/metabolism , Cytoskeletal Proteins/metabolism , Cytoskeleton/drug effects , Cytoskeleton/metabolism , Drug Dose-Response Relationship , Female , Viral Gene Expression Regulation/drug effects , HCT116 Cells , HL-60 Cells , HT29 Cells , HeLa Cells , Heterocyclic Compounds/chemical synthesis , Humans , Mitosis/drug effects , Papillomaviridae/genetics , Poly(ADP-Ribose) Polymerase-1 , Poly(ADP-Ribose) Polymerases/metabolism , Reactive Oxygen Species/metabolism , Time Factors , Genetic Transcription/drug effects , Tumor Suppressor Protein p53/metabolism , Uterine Cervical Neoplasms/genetics , Uterine Cervical Neoplasms/metabolism , Uterine Cervical Neoplasms/virology
19.
PLoS One ; 6(7): e21800, 2011.
Article in English | MEDLINE | ID: mdl-21789182

ABSTRACT

Outcomes of high-throughput biological experiments are typically interpreted by statistical testing for enriched gene functional categories defined by the Gene Ontology (GO). The resulting lists of GO terms may be large and highly redundant, and thus difficult to interpret. REVIGO is a Web server that summarizes long, unintelligible lists of GO terms by finding a representative subset of the terms using a simple clustering algorithm that relies on semantic similarity measures. Furthermore, REVIGO visualizes this non-redundant GO term set in multiple ways to assist in interpretation: multidimensional scaling and graph-based visualizations accurately render the subdivisions and the semantic relationships in the data, while treemaps and tag clouds are also offered as alternative views. REVIGO is freely available at http://revigo.irb.hr/.


Subject(s)
Algorithms , Computational Biology/methods , Molecular Sequence Annotation , Gene Expression Regulation , Humans , Internet , Software , Transcription Factors/genetics , Transcription Factors/metabolism , User-Computer Interface
20.
Eur J Med Chem ; 46(8): 3444-54, 2011 Aug.
Article in English | MEDLINE | ID: mdl-21628081

ABSTRACT

18-crown-6 ethers are known to exert their biological activity by transporting K⁺ ions across cell membranes. Using non-linear Support Vector Machines regression, we searched for structural features that influence antiproliferative activity in a diverse set of 19 known oxa-, monoaza- and diaza-18-crown-6 ethers. Here, we show that the logP of the molecule is the most important molecular descriptor, among ∼1300 tested descriptors, in determining biological potency (cross-validated R² = 0.704). The optimal logP was at 5.5 (Ghose-Crippen ALOGP estimate) while both higher and lower values were detrimental to biological potency. After controlling for logP, we found that the antiproliferative activity of the molecule was generally not affected by side chain length, molecular symmetry, or presence of side chain amide links. To validate this QSAR model, we synthesized six novel, highly lipophilic diaza-18-crown-6 derivatives with adamantane moieties attached to the side arms. These compounds have near-optimal logP values and consequently exhibit strong growth inhibition in various human cancer cell lines and a bacterial system. The bioactivities of different diaza-18-crown-6 analogs in Bacillus subtilis and cancer cells were correlated, suggesting conserved molecular features may be mediating the cytotoxic response. We conclude that relying primarily on the logP is a sensible strategy in preparing future 18-crown-6 analogs with optimized biological activity.


Subject(s)
Adamantane/chemistry , Antineoplastic Agents/chemical synthesis , Bacillus subtilis/drug effects , Cell Cycle/drug effects , Cell Survival/drug effects , Crown Ethers/chemical synthesis , Hydrophobic and Hydrophilic Interactions , Algorithms , Antineoplastic Agents/pharmacology , Bacillus subtilis/growth & development , Tumor Cell Line , Crown Ethers/pharmacology , Drug Design , Antitumor Drug Screening Assays , Escherichia coli/drug effects , Escherichia coli/growth & development , Ethers/chemistry , Female , Humans , Inhibitory Concentration 50 , Molecular Models , Neoplasms/drug therapy , Neoplasms/pathology , Quantitative Structure-Activity Relationship , Software , Species Specificity , Structure-Activity Relationship