Results 1 - 20 of 28
1.
J Clin Endocrinol Metab ; 109(2): 402-412, 2024 Jan 18.
Article in English | MEDLINE | ID: mdl-37683082

ABSTRACT

CONTEXT: Thyroid nodule ultrasound-based risk stratification schemas rely on the presence of high-risk sonographic features. However, some malignant thyroid nodules have benign appearance on thyroid ultrasound. New methods for thyroid nodule risk assessment are needed. OBJECTIVE: We investigated polygenic risk score (PRS) accounting for inherited thyroid cancer risk combined with ultrasound-based analysis for improved thyroid nodule risk assessment. METHODS: The convolutional neural network classifier was trained on thyroid ultrasound still images and cine clips from 621 thyroid nodules. Phenome-wide association study (PheWAS) and PRS PheWAS were used to optimize PRS for distinguishing benign and malignant nodules. PRS was evaluated in 73 346 participants in the Colorado Center for Personalized Medicine Biobank. RESULTS: When the deep learning model output was combined with thyroid cancer PRS and genetic ancestry estimates, the area under the receiver operating characteristic curve (AUROC) of the benign vs malignant thyroid nodule classifier increased from 0.83 to 0.89 (DeLong, P value = .007). The combined deep learning and genetic classifier achieved a clinically relevant sensitivity of 0.95, 95% CI [0.88-0.99], specificity of 0.63 [0.55-0.70], and positive and negative predictive values of 0.47 [0.41-0.58] and 0.97 [0.92-0.99], respectively. AUROC improvement was consistent in European ancestry-stratified analysis (0.83 and 0.87 for deep learning and deep learning combined with PRS classifiers, respectively). Elevated PRS was associated with a greater risk of thyroid cancer structural disease recurrence (ordinal logistic regression, P value = .002). CONCLUSION: Augmenting ultrasound-based risk assessment with PRS improves diagnostic accuracy.
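The sensitivity, specificity, and predictive values quoted in this abstract all derive from a 2x2 confusion table. The following minimal sketch shows the arithmetic; the function name and the counts are hypothetical and are not taken from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # malignant nodules correctly flagged
        "specificity": tn / (tn + fp),  # benign nodules correctly cleared
        "ppv": tp / (tp + fp),          # flagged nodules that are malignant
        "npv": tn / (tn + fn),          # cleared nodules that are benign
    }


# Hypothetical counts chosen for illustration only.
metrics = diagnostic_metrics(tp=19, fp=21, tn=59, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With a low prevalence of malignancy, a classifier tuned for high sensitivity tends to show a high negative predictive value and a modest positive predictive value, which is the pattern the abstract reports.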


Subject(s)
Thyroid Neoplasms , Thyroid Nodule , Humans , Thyroid Nodule/diagnostic imaging , Thyroid Nodule/genetics , Genetic Risk Score , Sensitivity and Specificity , Neoplasm Recurrence, Local , Thyroid Neoplasms/diagnostic imaging , Thyroid Neoplasms/genetics , Ultrasonography/methods
2.
J Clin Med ; 12(17)2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37685502

ABSTRACT

While pediatric COVID-19 is rarely severe, a small fraction of children infected with SARS-CoV-2 go on to develop multisystem inflammatory syndrome (MIS-C), with substantial morbidity. An objective method with high specificity and high sensitivity to identify current or imminent MIS-C in children infected with SARS-CoV-2 is highly desirable. The aim was to learn about an interpretable novel cytokine/chemokine assay panel providing such an objective classification. This retrospective study was conducted on four groups of pediatric patients seen at multiple sites of Texas Children's Hospital, Houston, TX who consented to provide blood samples to our COVID-19 Biorepository. Standard laboratory markers of inflammation and a novel cytokine/chemokine array were measured in blood samples of all patients. Group 1 consisted of 72 COVID-19, 70 MIS-C and 63 uninfected control patients seen between May 2020 and January 2021 and predominantly infected with pre-alpha variants. Group 2 consisted of 29 COVID-19 and 43 MIS-C patients seen between January and May 2021 infected predominantly with the alpha variant. Group 3 consisted of 30 COVID-19 and 32 MIS-C patients seen between August and October 2021 infected with alpha and/or delta variants. Group 4 consisted of 20 COVID-19 and 46 MIS-C patients seen between October 2021 and January 2022 infected with delta and/or omicron variants. Group 1 was used to train an L1-regularized logistic regression model which was tested using five-fold cross validation, and then separately validated against the remaining naïve groups. The area under the receiver operating characteristic curve (AUROC) and F1-score were used to quantify the performance of the cytokine/chemokine assay-based classifier.
Standard laboratory markers predict MIS-C with a five-fold cross-validated AUROC of 0.86 ± 0.05 and an F1 score of 0.78 ± 0.07, while the cytokine/chemokine panel predicted MIS-C with a five-fold cross-validated AUROC of 0.95 ± 0.02 and an F1 score of 0.91 ± 0.04, with only sixteen of the forty-five cytokines/chemokines sufficient to achieve this performance. Tested on Group 2 the cytokine/chemokine panel yielded AUROC = 0.98 and F1 = 0.93, on Group 3 it yielded AUROC = 0.89 and F1 = 0.89, and on Group 4 AUROC = 0.99 and F1 = 0.97. Adding standard laboratory markers to the cytokine/chemokine panel did not improve performance. A top-10 subset of these 16 cytokines achieves equivalent performance on the validation data sets. Our findings demonstrate that a sixteen-cytokine/chemokine panel as well as the top ten subset provides a highly sensitive, and specific method to identify MIS-C in patients infected with SARS-CoV-2 of all the major variants identified to date.
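The two metrics used in this abstract can be computed directly from labels and classifier scores. The sketch below is a plain-Python illustration with made-up data; the 0.5 decision threshold and all values are assumptions and do not come from the study.

```python
def auroc(labels, scores):
    """AUROC as the probability that a random positive case outscores a
    random negative case (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_score(labels, predictions):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    tp = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, predictions) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, predictions) if y == 1 and p == 0)
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical labels (1 = MIS-C, 0 = COVID-19) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
predictions = [int(s >= 0.5) for s in scores]  # assumed threshold of 0.5

print(auroc(labels, scores))
print(f1_score(labels, predictions))
```

AUROC is threshold-free, whereas F1 depends on the chosen cut-off, which is one reason the two are commonly reported together.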

3.
medRxiv ; 2023 Apr 04.
Article in English | MEDLINE | ID: mdl-37066407

ABSTRACT

An objective method to identify imminent or current Multisystem Inflammatory Syndrome in Children (MIS-C) infected with SARS-CoV-2 is highly desirable. The aim was to define an algorithmically interpreted novel cytokine/chemokine assay panel providing such an objective classification. This study was conducted on 4 groups of patients seen at multiple sites of Texas Children's Hospital, Houston, TX who consented to provide blood samples to our COVID-19 Biorepository. Standard laboratory markers of inflammation and a novel cytokine/chemokine array were measured in blood samples of all patients. Group 1 consisted of 72 COVID-19, 66 MIS-C and 63 uninfected control patients seen between May 2020 and January 2021 and predominantly infected with pre-alpha variants. Group 2 consisted of 29 COVID-19 and 43 MIS-C patients seen between January and May 2021 infected predominantly with the alpha variant. Group 3 consisted of 30 COVID-19 and 32 MIS-C patients seen between August and October 2021 infected with alpha and/or delta variants. Group 4 consisted of 20 COVID-19 and 46 MIS-C patients seen between October 2021 and January 2022 infected with delta and/or omicron variants. Group 1 was used to train an L1-regularized logistic regression model which was validated using 5-fold cross validation, and then separately validated against the remaining naïve groups. The area under the receiver operating characteristic curve (AUROC) and F1-score were used to quantify the performance of the algorithmically interpreted cytokine/chemokine assay panel. Standard laboratory markers predict MIS-C with a 5-fold cross-validated AUROC of 0.86 ± 0.05 and an F1 score of 0.78 ± 0.07, while the cytokine/chemokine panel predicted MIS-C with a 5-fold cross-validated AUROC of 0.95 ± 0.02 and an F1 score of 0.91 ± 0.04, with only sixteen of the forty-five cytokines/chemokines sufficient to achieve this performance.
Tested on Group 2, the cytokine/chemokine panel yielded AUROC = 0.98 and F1 = 0.93; on Group 3, AUROC = 0.89 and F1 = 0.89; and on Group 4, AUROC = 0.99 and F1 = 0.97. Adding standard laboratory markers to the cytokine/chemokine panel did not improve performance. A top-10 subset of these 16 cytokines achieves equivalent performance on the validation data sets. Our findings demonstrate that a sixteen-cytokine/chemokine panel as well as the top ten subset provides a sensitive, specific method to identify MIS-C in patients infected with SARS-CoV-2 of all the major variants identified to date.

4.
Crit Rev Microbiol ; 49(3): 391-413, 2023 May.
Article in English | MEDLINE | ID: mdl-35468027

ABSTRACT

Staphylococcus aureus is a notorious pathogen posing challenges in the medical industry due to drug resistance and biofilm formation. The horizon of knowledge on S. aureus pathogenesis has expanded with the advancement of data-driven bioinformatics techniques. Mining information from sequenced genomes and their expression data is an economic approach that alleviates wastage of resources and redundancy in experiments. The current review covers how big data bioinformatics has been used in the analysis of S. aureus from publicly available -omics data to uncover mechanisms of infection and inhibition. Particularly, advances in the past two decades in biomarker discovery, host responses, phenotype identification, consolidation of information, and drug development are discussed highlighting the challenges and shortcomings. Overall, the review summarizes the diverse aspects of scrupulous re-analysis of S. aureus proteomic and transcriptomic expression datasets retrieved from public repositories in terms of the efforts taken, benefits offered, and follow-up actions. The detailed review thus serves as a reference and aid for (i) Computational biologists by briefing the approaches utilized for bacterial omics re-analysis concerning S. aureus and (ii) Experimental biologists by elucidating the potential of bioinformatics in biological research to generate reliable postulates in a prompt and economical manner.


Subject(s)
Staphylococcal Infections , Staphylococcus aureus , Humans , Proteomics , Big Data , Staphylococcal Infections/drug therapy , Staphylococcal Infections/microbiology , Computational Biology
5.
AMIA Jt Summits Transl Sci Proc ; 2022: 349-358, 2022.
Article in English | MEDLINE | ID: mdl-35854716

ABSTRACT

Although pharmaceutical products undergo clinical trials to profile efficacy and safety, some adverse drug reactions (ADRs) are only discovered after release to market. Post-market drug safety surveillance - pharmacovigilance - leverages information from various sources to proactively identify such ADRs. Clinical notes are one source of observational data that could assist this process, but their inherent complexity can obfuscate possible ADR signals. In previous research, embeddings trained on observational reports have improved detection of such signals over commonly used statistical measures. Moreover, neural embedding methods which further encode juxtapositional information have shown promise on analogical retrieval tasks, suggesting proximity-based alternatives to document-level modeling for signal detection. This work uses natural language processing and locality sensitive neural embeddings to increase ADR signal recovery from clinical notes, with AUCs of ~0.63-0.71. Constituting a ~50% increase over baselines, our method sets the state-of-the-art for these reference standards when solely leveraging clinical notes.

6.
AMIA Annu Symp Proc ; 2022: 1163-1172, 2022.
Article in English | MEDLINE | ID: mdl-37128462

ABSTRACT

Adverse event reports (AER) are widely used for post-market drug safety surveillance and drug repurposing, with the assumption that drugs with similar side-effects may also have similar therapeutic effects. In this study, we used distributed representations of drugs derived from the Food and Drug Administration (FDA) AER system using aer2vec, a method of representing AER, with drug embeddings emerging from a neural network trained to predict the probability of adverse drug effects given observed drugs. We combined these representations with molecular features to predict permeability of the blood-brain barrier to drugs, a prerequisite to their application to treat conditions of the central nervous system. Across multiple machine learning classifiers, the addition of distributed representations improved performance over prior methods using drug-drug similarity estimates derived from discrete representations of AER system data. Embedding-based approaches outperformed those using discrete statistics, with improvements in absolute AUC of 5% and 9%, corresponding to improvements of 9% and 13% over performance with molecular features only. Performance was retained when reducing embedding dimensions from 500 to 6, indicating that the improvements are attributable neither to overfitting nor to a difference in the number of trainable parameters. These results indicate that aer2vec distributed representations carry information that is valuable for drug repurposing.


Subject(s)
Blood-Brain Barrier , Drug-Related Side Effects and Adverse Reactions , Humans , Pharmaceutical Preparations , Neural Networks, Computer , Machine Learning
7.
J Biomed Inform ; 119: 103833, 2021 07.
Article in English | MEDLINE | ID: mdl-34111555

ABSTRACT

Adverse Drug Events (ADEs) are prevalent, costly, and sometimes preventable. Post-marketing drug surveillance aims to monitor ADEs that occur after a drug is released to market. Reports of such ADEs are aggregated by reporting systems, such as the Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS). In this paper, we consider the topic of how best to represent data derived from reports in FAERS for the purpose of detecting post-marketing surveillance signals, in order to inform regulatory decision making. In our previous work, we developed aer2vec, a method for deriving distributed representations (concept embeddings) of drugs and side effects from ADE reports, establishing the utility of distributional information for pharmacovigilance signal detection. In this paper, we advance this line of research further by evaluating the utility of encoding orthographic and lexical information. We do so by adapting two Natural Language Processing methods, subword embedding and vector retrofitting, which were developed to encode such information into word embeddings. Models were compared for their ability to distinguish between positive and negative examples in a set of manually curated drug/ADE relationships, with both aer2vec enhancements offering advantages in performances over baseline models, and best performance obtained when retrofitting and subword embeddings were applied in concert. In addition, this work demonstrates that models leveraging distributed representations do not require extensive manual preprocessing to perform well on this pharmacovigilance signal detection task, and may even benefit from information that would otherwise be lost during the normalization and standardization process.


Subject(s)
Drug-Related Side Effects and Adverse Reactions , Pharmacovigilance , Adverse Drug Reaction Reporting Systems , Humans , Natural Language Processing , United States , United States Food and Drug Administration
8.
J Biomed Inform ; 119: 103818, 2021 07.
Article in English | MEDLINE | ID: mdl-34022420

ABSTRACT

OBJECTIVE: Study the impact of local policies on near-future hospitalization and mortality rates. MATERIALS AND METHODS: We introduce a novel risk-stratified SIR-HCD model that introduces new variables to model the dynamics of low-contact (e.g., work from home) and high-contact (e.g., work on-site) subpopulations while sharing parameters to control their respective R0(t) over time. We test our model on data of daily reported hospitalizations and cumulative mortality of COVID-19 in Harris County, Texas, from May 1, 2020, until October 4, 2020, collected from multiple sources (USA FACTS, U.S. Bureau of Labor Statistics, Southeast Texas Regional Advisory Council COVID-19 report, TMC daily news, and Johns Hopkins University county-level mortality reporting). RESULTS: We evaluated our model's forecasting accuracy in Harris County, TX (the most populated county in the Greater Houston area) during Phase-I and Phase-II reopening. Not only does our model outperform other competing models, but it also supports counterfactual analysis to simulate the impact of future policies in a local setting, which is unique among existing approaches. DISCUSSION: Mortality and hospitalization rates are significantly impacted by local quarantine and reopening policies. Existing models do not directly account for the effect of these policies on infection, hospitalization, and death rates in an explicit and explainable manner. Our work is an attempt to improve prediction of these trends by incorporating this information into the model, thus supporting decision-making. CONCLUSION: Our work is a timely effort to attempt to model the dynamics of pandemics under the influence of local policies.
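To make the idea of low-contact and high-contact subpopulations concrete, here is a minimal two-group SIR sketch integrated with Euler steps. It is not the authors' SIR-HCD model: it omits the hospitalization, critical, and death compartments, uses constant transmission rates instead of a time-varying R0(t), and every parameter value is a hypothetical placeholder.

```python
def simulate_two_group_sir(beta_low, beta_high, gamma, frac_high, i0, days, dt=0.1):
    """Euler-integrate a two-group SIR model.

    All compartments are fractions of the total population. Both groups are
    exposed to the same pool of infectious people but with different
    transmission rates (beta_low, beta_high).
    """
    s = {"low": (1 - frac_high) * (1 - i0), "high": frac_high * (1 - i0)}
    i = {"low": (1 - frac_high) * i0, "high": frac_high * i0}
    r = {"low": 0.0, "high": 0.0}
    beta = {"low": beta_low, "high": beta_high}
    steps_per_day = round(1 / dt)
    history = []  # total infectious fraction at the end of each day
    for _ in range(days):
        for _ in range(steps_per_day):
            total_i = i["low"] + i["high"]
            for g in ("low", "high"):
                new_inf = beta[g] * s[g] * total_i * dt
                new_rec = gamma * i[g] * dt
                s[g] -= new_inf
                i[g] += new_inf - new_rec
                r[g] += new_rec
        history.append(i["low"] + i["high"])
    return history, s, i, r


# Hypothetical parameters: 30% of the population is high-contact.
history, s, i, r = simulate_two_group_sir(
    beta_low=0.15, beta_high=0.45, gamma=0.1, frac_high=0.3, i0=0.001, days=120
)
print(max(history))
```

A policy change such as reopening could be explored in a sketch like this by moving population between the two groups or by changing the transmission rates partway through the simulation.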


Subject(s)
COVID-19 , Hospitalization , Humans , Pandemics , Policy , SARS-CoV-2 , United States
9.
Philos Trans A Math Phys Eng Sci ; 379(2194): 20200246, 2021 Apr 05.
Article in English | MEDLINE | ID: mdl-33583272

ABSTRACT

Recent advances in computing algorithms and hardware have rekindled interest in developing high-accuracy, low-cost surrogate models for simulating physical systems. The idea is to replace expensive numerical integration of complex coupled partial differential equations at fine time scales performed on supercomputers, with machine-learned surrogates that efficiently and accurately forecast future system states using data sampled from the underlying system. One particularly popular technique being explored within the weather and climate modelling community is the echo state network (ESN), an attractive alternative to other well-known deep learning architectures. Using the classical Lorenz 63 system, and the three tier multi-scale Lorenz 96 system (Thornes T, Duben P, Palmer T. 2017 Q. J. R. Meteorol. Soc. 143, 897-908. (doi:10.1002/qj.2974)) as benchmarks, we realize that previously studied state-of-the-art ESNs operate in two distinct regimes, corresponding to low and high spectral radius (LSR/HSR) for the sparse, randomly generated, reservoir recurrence matrix. Using knowledge of the mathematical structure of the Lorenz systems along with systematic ablation and hyperparameter sensitivity analyses, we show that state-of-the-art LSR-ESNs reduce to a polynomial regression model which we call Domain-Driven Regularized Regression (D2R2). Interestingly, D2R2 is a generalization of the well-known SINDy algorithm (Brunton SL, Proctor JL, Kutz JN. 2016 Proc. Natl Acad. Sci. USA 113, 3932-3937. (doi:10.1073/pnas.1517384113)). We also show experimentally that LSR-ESNs (Chattopadhyay A, Hassanzadeh P, Subramanian D. 2019 (http://arxiv.org/abs/1906.08829)) outperform HSR ESNs (Pathak J, Hunt B, Girvan M, Lu Z, Ott E. 2018 Phys. Rev. Lett. 120, 024102. (doi:10.1103/PhysRevLett.120.024102)) while D2R2 dominates both approaches. 
A significant goal in constructing surrogates is to cope with barriers to scaling in weather prediction and simulation of dynamical systems that are imposed by time and energy consumption in supercomputers. Inexact computing has emerged as a novel approach to helping with scaling. In this paper, we evaluate the performance of three models (LSR-ESN, HSR-ESN and D2R2) by varying the precision or word size of the computation as our inexactness-controlling parameter. For precisions of 64, 32 and 16 bits, we show that, surprisingly, the least expensive D2R2 method yields the most robust results and the greatest savings compared to ESNs. Specifically, D2R2 achieves 68 × in computational savings, with an additional 2 × if precision reductions are also employed, outperforming ESN variants by a large margin. This article is part of the theme issue 'Machine learning for weather and climate modelling'.
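For readers unfamiliar with the benchmark, the classical Lorenz 63 system can be written and integrated in a few lines. The sketch below uses the standard parameter values (sigma = 10, rho = 28, beta = 8/3) and a fourth-order Runge-Kutta step; the step size and initial state are arbitrary choices, and the code does not reproduce the ESN or D2R2 models from the paper.

```python
def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz 63 equations."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(f, state, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
    k4 = f(tuple(s + dt * k for s, k in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def trajectory(initial, dt=0.01, steps=1000):
    """Return the list of states visited, including the initial state."""
    states = [tuple(initial)]
    for _ in range(steps):
        states.append(rk4_step(lorenz63, states[-1], dt))
    return states


states = trajectory((1.0, 1.0, 1.0), dt=0.01, steps=1000)
print(states[-1])
```

Each component of the right-hand side is a polynomial of degree at most two in (x, y, z), which is why a regression on polynomial features of the state can, in principle, represent these dynamics exactly.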

10.
Infect Genet Evol ; 88: 104702, 2021 03.
Article in English | MEDLINE | ID: mdl-33388440

ABSTRACT

Biofilm-forming Staphylococcus aureus is a major threat to the health-care industry. It is important to understand the differences between planktonic and biofilm growth forms in the pathogen since conventional treatments targeting the planktonic forms are not effective against biofilms. The current study conducts a meta-analysis of three public transcriptomic profiles to examine the differences in gene expression between the planktonic and biofilm states of S. aureus using random-effects modeling. Mean effect sizes were calculated for 2847 genes, among which 726 differentially expressed genes were taken for further analysis. Major genes that are discriminatory between the two conditions were mined using supervised learning techniques and validated by high-accuracy classifiers. Ten different feature selection algorithms were applied and used to rank the most important genes in S. aureus biofilms. Finally, an optimal set of 36 genes is presented as candidate genes in biofilm formation or development, shedding light on the novel roles of an acyl-CoA thioesterase enzyme and 10 hypothetical proteins in biofilms. The relevance of the identified gene set was further validated by building five different classification models using SVM, RF, kNN, NB and DT algorithms that were compared with models built from other relevant gene sets and by reviewing the functional role of 25 previously known genes in biofilm development. The study combines meta-analysis of differential expression with supervised machine learning strategies and feature selection for the first time to identify and validate a discriminatory set of genes important in biofilms of S. aureus. The functional roles of the identified genes predicted to be important in biofilms are further scrutinized and can be considered as a signature target list to develop anti-biofilm therapeutics in S. aureus.
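The abstract does not say which random-effects estimator was used. As one common possibility, the sketch below pools per-study effect sizes for a single gene with the DerSimonian-Laird estimator; the estimator choice, function name, and input values are all assumptions made for illustration.

```python
def random_effects_mean(effects, variances):
    """Pool per-study effect sizes with the DerSimonian-Laird estimator.

    Returns (pooled_effect, tau_squared), where tau_squared is the estimated
    between-study variance.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return pooled, tau2


# Hypothetical effect sizes for one gene across three studies.
pooled, tau2 = random_effects_mean([0.8, 1.4, 1.1], [0.10, 0.20, 0.15])
print(pooled, tau2)
```

When the studies agree closely, the between-study variance estimate is truncated at zero and the result coincides with the fixed-effect inverse-variance mean.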


Subject(s)
Biofilms , Staphylococcal Infections/microbiology , Staphylococcus aureus/growth & development , Staphylococcus aureus/genetics , Supervised Machine Learning , Transcriptome , Algorithms , Datasets as Topic , Gene Expression Regulation, Bacterial , Humans , Microarray Analysis , RNA-Seq
11.
Drug Saf ; 43(1): 67-77, 2020 01.
Article in English | MEDLINE | ID: mdl-31646442

ABSTRACT

INTRODUCTION: As a result of the well documented limitations of data collected by spontaneous reporting systems (SRS), such as bias and under-reporting, a number of authors have evaluated the utility of other data sources for the purpose of pharmacovigilance, including the biomedical literature. Previous work has demonstrated the utility of literature-derived distributed representations (concept embeddings) with machine learning for the purpose of drug side-effect prediction. In terms of data sources, these methods are complementary, observing drug safety from two different perspectives (knowledge extracted from the literature and statistics from SRS data). However, the combined utility of these pharmacovigilance methods has yet to be evaluated. OBJECTIVE: This research investigates the utility of directly or indirectly combining an observational signal from SRS with literature-derived distributed representations into a single feature vector or in an ensemble approach for downstream machine learning (logistic regression). METHODS: Leveraging a recently developed representation scheme, concept embeddings were generated from relational connections extracted from the literature and composed to represent drug and associated adverse reactions, as defined by two reference standards of positive (likely causal) and negative (no causal evidence) pairs. Embeddings were presented with and without common measures of observational signal from SRS sources to logistic regressors, and performance was evaluated with the receiver operating characteristic (ROC) area under the curve (AUC) metric. RESULTS: ROC AUC performance with these composite models improves up to ≈ 20% over SRS-based disproportionality metrics alone and exceeds the best prior results reported in the literature when models leverage both sources of information. 
CONCLUSIONS: Results from this study support the hypothesis that knowledge extracted from the literature can enhance the performance of SRS-based methods (and vice versa). Across reference sets, using literature and SRS information together performed better than using either source alone, providing strong support for the complementary nature of these approaches to post-marketing drug surveillance.


Subject(s)
Adverse Drug Reaction Reporting Systems , Product Surveillance, Postmarketing/methods , Humans , Logistic Models , Machine Learning , Pharmacovigilance
12.
Indian J Med Microbiol ; 37(2): 173-185, 2019.
Article in English | MEDLINE | ID: mdl-31745016

ABSTRACT

Context: Vancomycin-intermediate Staphylococcus aureus remains one of the most prevalent multidrug-resistant pathogens causing healthcare infections that are difficult to treat. Aims: This study uses a comprehensive computational analysis to systematically investigate various gene expression profiles of resistant and sensitive S. aureus strains on exposure to antibiotics. Settings and Design: The transcriptional changes leading to the development of multiple antibiotic resistance were examined by an integrative analysis of nine differential expression experiments under selected conditions of vancomycin-intermediate and -sensitive strains for four different antibiotics using publicly available RNA-Seq datasets. Materials and Methods: For each antibiotic, three experimental conditions for expression analysis were selected to identify those genes that are particularly involved in the development of resistance. The results were further scrutinised to generate a resistome that can be analysed for its role in the development of or adaptation to antibiotic resistance. Results: The 99 genes in the resistome are then compiled to create a multiple drug resistome of 25 known and novel genes identified to play a part in antibiotic resistance. The inclusion of agr genes and associated virulence factors in the identified resistome supports the role of the agr quorum sensing system in multiple drug resistance. In addition, enrichment analysis also identified the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways - quorum sensing and two-component system pathways - in the resistome gene set. Conclusion: Further studies on the role of the identified molecular targets such as SAA6008_00181, SAA6008_01127, agrA, agrC and coa in adapting to the pressure of antibiotics at sub-inhibitory concentrations can help in learning the molecular mechanisms that cause resistance in the pathogen as well as in finding other potential therapeutics.


Subject(s)
Drug Resistance, Bacterial , Genes, Bacterial , Signal Transduction , Staphylococcal Infections/microbiology , Staphylococcus aureus/drug effects , Staphylococcus aureus/physiology , Vancomycin/pharmacology , Anti-Bacterial Agents/pharmacology , Gene Expression Regulation, Bacterial/drug effects , Humans , Microbial Sensitivity Tests , RNA-Seq , Virulence Factors
13.
AMIA Annu Symp Proc ; 2019: 717-726, 2019.
Article in English | MEDLINE | ID: mdl-32308867

ABSTRACT

Adverse event report (AER) data are a key source of signal for post marketing drug surveillance. The standard methodology to analyze AER data applies disproportionality metrics, which estimate the strength of drug/side-effect associations from discrete counts of their occurrence at report level. However, in other domains, improvements in predictive modeling accuracy have been obtained through representation learning, where discrete features are replaced by distributed representations learned from unlabeled data. This paper describes aer2vec, a novel representational approach for AER data in which concept embeddings emerge from neural networks trained to predict drug/side-effect co-occurrence. Trained models are evaluated for their utility in identifying drug/side-effect relationships, with improvements over disproportionality metrics in most cases. In addition, we evaluate the utility of an otherwise-untapped resource in the Food and Drug Administration (FDA) AER system - reporter designations of suspected causality - and find that incorporating this information enhances performance of all models evaluated.
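The abstract does not name the disproportionality metrics it uses as baselines. Two widely used examples are the proportional reporting ratio (PRR) and the reporting odds ratio (ROR), both computed from a 2x2 table of report counts. The sketch below uses hypothetical counts and is not drawn from the paper.

```python
def disproportionality(a, b, c, d):
    """Compute PRR and ROR from report-level counts.

    a: reports mentioning the drug and the event
    b: reports mentioning the drug without the event
    c: reports mentioning other drugs with the event
    d: reports mentioning other drugs without the event
    """
    prr = (a / (a + b)) / (c / (c + d))
    ror = (a * d) / (b * c)
    return prr, ror


# Hypothetical counts for one drug/side-effect pair.
prr, ror = disproportionality(a=20, b=80, c=100, d=9800)
print(f"PRR = {prr:.1f}, ROR = {ror:.1f}")
```

Both statistics depend only on the four discrete counts, which is the contrast the abstract draws with embeddings learned from co-occurrence patterns across the full set of reports.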


Subject(s)
Adverse Drug Reaction Reporting Systems , Drug-Related Side Effects and Adverse Reactions/diagnosis , Models, Theoretical , Product Surveillance, Postmarketing , Databases, Factual , Humans , Neural Networks, Computer , United States , United States Food and Drug Administration
14.
AMIA Annu Symp Proc ; 2019: 992-1001, 2019.
Article in English | MEDLINE | ID: mdl-32308896

ABSTRACT

The identification of drug-drug interactions (DDIs) is important for patient safety; yet, compared to other pharmacovigilance work, a limited amount of research has been conducted in this space. Recent work has successfully applied a method of deriving distributed vector representations from structured biomedical knowledge, known as Embedding of Semantic Predications (ESP), to the problem of predicting individual drug side effects. In the current paper we extend this work by applying ESP to the problem of predicting polypharmacy side-effects for particular drug combinations, building on a recent reconceptualization of this problem as a network of drug nodes connected by side effect edges. We evaluate ESP embeddings derived from the resulting graph on a side-effect prediction task against a previously reported graph convolutional neural network approach, using the same data and evaluation methods. We demonstrate that ESP models perform better, while being faster to train, more re-usable, and significantly simpler.


Subject(s)
Drug Interactions , Drug-Related Side Effects and Adverse Reactions , Models, Biological , Neural Networks, Computer , Pharmacovigilance , Polypharmacy , Algorithms , Computational Biology , Data Visualization , Humans , Semantics
15.
Genomics ; 111(6): 1431-1446, 2019 12.
Article in English | MEDLINE | ID: mdl-30304708

ABSTRACT

sRNAs are important post-transcriptional regulators in bacteria. The current study exploits the potential of next-generation technology with computational analyses to develop a whole-genome sRNA-gene network for drug-resistant S. aureus by subjecting public expression profiles to a novel analysis pipeline. Clustering and examination of the resultant global interactome indicated coordinated regulation of numerous processes by various sRNAs, with 9 sRNAs and 10 genes as potential hubs. Ten major sRNA modules were annotated with various functions, among which a major module consisting of Rsa sRNAs was predicted to be a central regulatory unit. In addition, sRNA95, a hub molecule associated with this unit, was predicted to be a vulnerable target. Finally, novel associations between transcriptional regulators and sRNAs were mined, yielding some insights into the association between RNAIII and RsaA. To our knowledge, this is the first study in S. aureus to provide insights into global sRNA-gene interactions and to identify potential sRNAs for exploring sRNA-based therapeutic applications.


Subject(s)
Bacterial Proteins/genetics , Gene Expression Regulation, Bacterial , Genome, Bacterial , RNA, Small Untranslated/genetics , RNA-Seq/methods , Staphylococcus aureus/genetics , Bacterial Proteins/metabolism , Computational Biology , Gene Regulatory Networks , RNA, Small Untranslated/metabolism , Staphylococcal Infections/genetics , Staphylococcal Infections/microbiology , Staphylococcus aureus/growth & development , Staphylococcus aureus/metabolism , Transcriptome
16.
Sci Am ; 319(4): 74-79, 2018 Sep 18.
Article in English | MEDLINE | ID: mdl-30273319
17.
J Am Med Inform Assoc ; 25(10): 1339-1350, 2018 10 01.
Article in English | MEDLINE | ID: mdl-30010902

ABSTRACT

Objective: The aim of this work is to leverage relational information extracted from biomedical literature using a novel synthesis of unsupervised pretraining, representational composition, and supervised machine learning for drug safety monitoring. Methods: Using ≈80 million concept-relationship-concept triples extracted from the literature using the SemRep Natural Language Processing system, distributed vector representations (embeddings) were generated for concepts as functions of their relationships utilizing two unsupervised representational approaches. Embeddings for drugs and side effects of interest from two widely used reference standards were then composed to generate embeddings of drug/side-effect pairs, which were used as input for supervised machine learning. This methodology was developed and evaluated using cross-validation strategies and compared to contemporary approaches. To qualitatively assess generalization, models trained on the Observational Medical Outcomes Partnership (OMOP) drug/side-effect reference set were evaluated against a list of ≈1100 drugs from an online database. Results: The employed method improved performance over previous approaches. Cross-validation results advance the state of the art (AUC 0.96; F1 0.90 and AUC 0.95; F1 0.84 across the two sets), outperforming methods utilizing literature and/or spontaneous reporting system data. Examination of predictions for unseen drug/side-effect pairs indicates the ability of these methods to generalize, with over tenfold label support enrichment in the top 100 predictions versus the bottom 100 predictions. Discussion and Conclusion: Our methods can assist the pharmacovigilance process using information from the biomedical literature. Unsupervised pretraining generates a rich relationship-based representational foundation for machine learning techniques to classify drugs in the context of a putative side effect, given known examples.


Subject(s)
Databases, Bibliographic; Drug-Related Side Effects and Adverse Reactions; Natural Language Processing; Pharmacovigilance; Supervised Machine Learning; Data Mining/methods; Humans; Product Surveillance, Postmarketing/methods; Semantics
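
The composition step in the abstract above (drug and side-effect embeddings combined into pair embeddings, which then feed a supervised classifier) can be sketched as follows. This is a minimal illustration and not the authors' method: the abstract does not name the composition operator or the classifier, so concatenation and a nearest-centroid classifier are assumptions, and every vector and name below (`drug_a`, `effect_x`, and so on) is a made-up toy value.

```python
def compose_pair(drug_vec, effect_vec):
    """Compose a drug/side-effect pair embedding by concatenation (assumed operator)."""
    return list(drug_vec) + list(effect_vec)


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def fit_nearest_centroid(pair_vectors, labels):
    """Assumed stand-in classifier: store one centroid per class label."""
    by_label = {}
    for vec, label in zip(pair_vectors, labels):
        by_label.setdefault(label, []).append(vec)
    return {label: centroid(vecs) for label, vecs in by_label.items()}


def predict(model, pair_vec):
    """Return the label whose centroid is closest to the pair embedding."""
    return min(model, key=lambda label: squared_distance(model[label], pair_vec))


# Toy embeddings, invented for illustration only.
drugs = {
    "drug_a": [0.9, 0.1], "drug_c": [0.8, 0.2],   # labelled as causing effect_x
    "drug_b": [0.1, 0.9], "drug_d": [0.2, 0.8],   # labelled as not causing it
    "drug_e": [0.85, 0.15], "drug_f": [0.15, 0.85],  # unseen drugs
}
effects = {"effect_x": [1.0, 0.0]}

training = [("drug_a", 1), ("drug_c", 1), ("drug_b", 0), ("drug_d", 0)]
pair_vectors = [compose_pair(drugs[d], effects["effect_x"]) for d, _ in training]
labels = [label for _, label in training]

model = fit_nearest_centroid(pair_vectors, labels)
prediction_e = predict(model, compose_pair(drugs["drug_e"], effects["effect_x"]))
prediction_f = predict(model, compose_pair(drugs["drug_f"], effects["effect_x"]))
```

In this toy setup the side effect is held fixed and the drugs vary, which mirrors the abstract's framing of classifying drugs in the context of a putative side effect given known examples.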
18.
Comput Biol Chem ; 75: 101-110, 2018 Aug.
Article in English | MEDLINE | ID: mdl-29763853

ABSTRACT

BACKGROUND AND OBJECTIVES: Travel to elevations above 2500 m is associated with the risk of developing one or more forms of acute altitude illness, such as acute mountain sickness (AMS), high altitude cerebral edema (HACE), or high altitude pulmonary edema (HAPE). Our work aims to identify the functional association of genes involved in high altitude diseases. METHOD: We identified the gene networks responsible for high altitude diseases by applying gene co-occurrence statistics from the literature together with network analysis. First, we mined the PubMed literature on high-altitude diseases and extracted the co-occurring gene pairs. Next, gene pairs were ranked by their co-occurrence frequency. Finally, a gene association network was created using statistical measures to explore potential relationships. RESULTS: Network analysis revealed that EPO, ACE, IL6 and TNF are among the top five genes, each co-occurring with 20 or more genes, while the association between the EPAS1 and EGLN1 genes is strongly substantiated. CONCLUSION: The network constructed in this study proposes a large number of genes that work in toto under high altitude conditions. Overall, the results provide a good reference for further study of the genetic relationships in high altitude diseases.


Subject(s)
Altitude Sickness/genetics; Brain Edema/genetics; Data Mining; Gene Regulatory Networks/genetics; Hypertension, Pulmonary/genetics; Humans
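
The three steps named in the abstract above (extract co-occurring gene pairs, rank them by co-occurrence frequency, then count how many partners each gene has in the resulting network) can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' pipeline: the abstract does not specify the statistical measures used, so plain counts stand in for them, the function names are hypothetical, and the toy documents are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def rank_gene_pairs(documents):
    """Count how often each unordered gene pair is mentioned in the same document."""
    pair_counts = Counter()
    for genes in documents:
        # Deduplicate and sort so (A, B) and (B, A) are counted as one pair.
        for pair in combinations(sorted(set(genes)), 2):
            pair_counts[pair] += 1
    # Most frequent pairs first.
    return pair_counts.most_common()


def gene_degrees(ranked_pairs, min_count=1):
    """Number of distinct partners per gene, using pairs seen at least min_count times."""
    neighbours = defaultdict(set)
    for (gene_a, gene_b), count in ranked_pairs:
        if count >= min_count:
            neighbours[gene_a].add(gene_b)
            neighbours[gene_b].add(gene_a)
    return {gene: len(partners) for gene, partners in neighbours.items()}


# Toy input: genes mentioned per abstract (made up, not data from the study).
toy_documents = [
    ["EPAS1", "EGLN1", "EPO"],
    ["EPAS1", "EGLN1"],
    ["ACE", "EPO"],
    ["EPAS1", "EGLN1", "ACE"],
]

ranked = rank_gene_pairs(toy_documents)
degrees = gene_degrees(ranked)
strong_degrees = gene_degrees(ranked, min_count=2)
```

Raising `min_count` keeps only repeatedly observed pairs, which is one simple way to separate a well-supported association from incidental co-mentions.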
19.
Genes Dis ; 3(3): 228-237, 2016 Sep.
Article in English | MEDLINE | ID: mdl-30258892

ABSTRACT

Protein kinases play an important role in the incidence of neurodegenerative diseases. However, the incidence of these diseases in non-human primates is very low. Small differences among the genomes might influence disease susceptibility. The present study examines the genetic differences in protein kinases between humans and their three closest evolutionary relatives (chimpanzee, gorilla and orangutan) for three neurodegenerative diseases, namely Alzheimer's, Parkinson's and Huntington's diseases. In total, 47 human protein kinases associated with the three neurodegenerative diseases and their orthologs from the three non-human primates were identified and analyzed for possible susceptibility factors in humans. Multiple sequence alignment and pairwise sequence alignment revealed that 18 human protein kinases, including DYRK1A, RPS6KB1, and GRK6, contained significant indels and substitutions. Further phosphorylation site analysis revealed that eight kinases, including MARK2 and LTK, contained phosphorylation sites exclusive to humans, which makes them particular candidates for explaining the difference in disease susceptibility between humans and non-human primates. Finally, pathway analysis of these eight kinases and their targets revealed that they could have long-range consequences in important signaling pathways associated with neurodegenerative diseases.

20.
AMIA Annu Symp Proc ; 2016: 1940-1949, 2016.
Article in English | MEDLINE | ID: mdl-28269953

ABSTRACT

An important aspect of post-marketing drug surveillance involves identifying potential side effects using adverse drug event (ADE) reporting systems and/or Electronic Health Records. These data are noisy, so identified drug/ADE associations must be manually reviewed, a human-intensive process that scales poorly with large numbers of possibly dangerous associations and the rapid growth of the biomedical literature. Recent work has employed Literature Based Discovery methods that exploit implicit relationships between biomedical entities within the literature to estimate the plausibility of drug/ADE connections. We extend this work by evaluating machine learning classifiers applied to high-dimensional vector representations of relationships extracted from the literature as a means to identify substantiated drug/ADE connections. Using a curated reference standard, we show that applying classifiers to such representations improves performance (+≈37% AUC) over previous approaches. These trained systems reproduce outcomes of the manual literature review process used to create the reference standard, but further research is required to establish their generalizability.


Subject(s)
Drug-Related Side Effects and Adverse Reactions/classification; Product Surveillance, Postmarketing/methods; Support Vector Machine; Adverse Drug Reaction Reporting Systems; Electronic Health Records; Humans; Machine Learning; Models, Theoretical; ROC Curve