Search | BVS Bolivia

An open source knowledge graph ecosystem for the life sciences.

Callahan, Tiffany J; Tripodi, Ignacio J; Stefanski, Adrianne L; Cappelletti, Luca; Taneja, Sanya B; Wyrwa, Jordan M; Casiraghi, Elena; Matentzoglu, Nicolas A; Reese, Justin; Silverstein, Jonathan C; Hoyt, Charles Tapley; Boyce, Richard D; Malec, Scott A; Unni, Deepak R; Joachimiak, Marcin P; Robinson, Peter N; Mungall, Christopher J; Cavalleri, Emanuele; Fontana, Tommaso; Valentini, Giorgio; Mesiti, Marco; Gillenwater, Lucas A; Santangelo, Brook; Vasilevsky, Nicole A; Hoehndorf, Robert; Bennett, Tellen D; Ryan, Patrick B; Hripcsak, George; Kahn, Michael G; Bada, Michael; Baumgartner, William A; Hunter, Lawrence E.

Sci Data ; 11(1): 363, 2024 Apr 11.

Article in English | MEDLINE | ID: mdl-38605048

ABSTRACT

Translational research requires data at multiple scales of biological organization. Advancements in sequencing and multi-omics technologies have increased the availability of these data, but researchers face significant integration challenges. Knowledge graphs (KGs) are used to model complex phenomena, and methods exist to construct them automatically. However, tackling complex biomedical integration problems requires flexibility in the way knowledge is modeled. Moreover, existing KG construction methods provide robust tooling at the cost of fixed or limited choices among knowledge representation models. PheKnowLator (Phenotype Knowledge Translator) is a semantic ecosystem for automating the FAIR (Findable, Accessible, Interoperable, and Reusable) construction of ontologically grounded KGs with fully customizable knowledge representation. The ecosystem includes KG construction resources (e.g., data preparation APIs), analysis tools (e.g., SPARQL endpoint resources and abstraction algorithms), and benchmarks (e.g., prebuilt KGs). We evaluated the ecosystem by systematically comparing it to existing open-source KG construction methods and by analyzing its computational performance when used to construct 12 different large-scale KGs. With flexible knowledge representation, PheKnowLator enables fully customizable KGs without compromising performance or usability.

Subjects

Biological Science Disciplines; Knowledge Bases; Pattern Recognition, Automated; Algorithms; Translational Research, Biomedical

Broadening the capture of natural products mentioned in FAERS using fuzzy string-matching and a Siamese neural network.

Dilán-Pantojas, Israel O; Boonchalermvichien, Tanupat; Taneja, Sanya B; Li, Xiaotong; Chapin, Maryann R; Karcher, Sandra; Boyce, Richard D.

Sci Rep ; 14(1): 1272, 2024 Jan 13.

Article in English | MEDLINE | ID: mdl-38218987

ABSTRACT

Increased sales of natural products (NPs) in the US and growing safety concerns highlight the need for NP pharmacovigilance. A challenge for NP pharmacovigilance is ambiguity when referring to NPs in spontaneous reporting systems. We used a combination of fuzzy string-matching and a neural network to reduce this ambiguity. Our aim was to increase the capture of reports involving NPs in the US Food and Drug Administration Adverse Event Reporting System (FAERS). To this end, we used Gestalt pattern-matching (GPM) and a Siamese neural network (SM) to identify potential mentions of NPs of interest in 389,386 FAERS reports with unmapped drug names. A team of health professionals refined the candidates identified in the previous step through manual review and annotation. After candidate adjudication, GPM identified 595 unique NP names and SM 504. There was little overlap between candidates identified by each (non-overlapping: GPM 347, SM 248). We identified a total of 686 novel NP names from FAERS reports. Including these names in the FAERS collection yielded 3,486 additional reports mentioning NPs.
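
The fuzzy string-matching step described in this abstract can be illustrated with a minimal sketch. It assumes Python's standard-library difflib.SequenceMatcher, whose algorithm is closely related to Ratcliff/Obershelp "gestalt pattern matching"; the abstract does not say which implementation the authors used. The function name gpm_candidates, the example strings, and the 0.6 threshold are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def gpm_candidates(unmapped_names, np_terms, threshold=0.8):
    """Return (raw_name, np_term, score) triples scoring at or above threshold.

    Scores are SequenceMatcher ratios in [0, 1]; higher means more similar.
    The threshold is an illustrative assumption, not a value from the paper.
    """
    hits = []
    for raw in unmapped_names:
        cleaned = raw.strip().lower()
        for term in np_terms:
            score = SequenceMatcher(None, cleaned, term.lower()).ratio()
            if score >= threshold:
                hits.append((raw, term, round(score, 3)))
    # Highest-scoring candidates first, ready for manual review.
    return sorted(hits, key=lambda h: -h[2])


# Hypothetical unmapped drug-name strings and NP terms of interest.
names = ["KRATOM POWDER", "green tee extract", "ibuprofen"]
terms = ["kratom", "green tea"]
hits = gpm_candidates(names, terms, threshold=0.6)
for raw, term, score in hits:
    print(f"{raw!r} -> {term!r} (score {score})")
```

Candidates produced this way would still need the manual review and adjudication that the abstract describes, since a similarity score alone cannot confirm that a string refers to a natural product.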

Subjects

Biological Products; Drug-Related Side Effects and Adverse Reactions; United States; Humans; Adverse Drug Reaction Reporting Systems; United States Food and Drug Administration; Neural Networks, Computer; Pharmacovigilance

Ontologizing health systems data at scale: making translational discovery a reality.

Callahan, Tiffany J; Stefanski, Adrianne L; Wyrwa, Jordan M; Zeng, Chenjie; Ostropolets, Anna; Banda, Juan M; Baumgartner, William A; Boyce, Richard D; Casiraghi, Elena; Coleman, Ben D; Collins, Janine H; Deakyne Davies, Sara J; Feinstein, James A; Lin, Asiyah Y; Martin, Blake; Matentzoglu, Nicolas A; Meeker, Daniella; Reese, Justin; Sinclair, Jessica; Taneja, Sanya B; Trinkley, Katy E; Vasilevsky, Nicole A; Williams, Andrew E; Zhang, Xingmin A; Denny, Joshua C; Ryan, Patrick B; Hripcsak, George; Bennett, Tellen D; Haendel, Melissa A; Robinson, Peter N; Hunter, Lawrence E; Kahn, Michael G.

NPJ Digit Med ; 6(1): 89, 2023 May 19.

Article in English | MEDLINE | ID: mdl-37208468

ABSTRACT

Common data models solve many challenges of standardizing electronic health record (EHR) data but are unable to semantically integrate all of the resources needed for deep phenotyping. Open Biological and Biomedical Ontology (OBO) Foundry ontologies provide computable representations of biological knowledge and enable the integration of heterogeneous data. However, mapping EHR data to OBO ontologies requires significant manual curation and domain expertise. We introduce OMOP2OBO, an algorithm for mapping Observational Medical Outcomes Partnership (OMOP) vocabularies to OBO ontologies. Using OMOP2OBO, we produced mappings for 92,367 conditions, 8,611 drug ingredients, and 10,673 measurement results, which covered 68-99% of concepts used in clinical practice when examined across 24 hospitals. When used to phenotype rare disease patients, the mappings helped systematically identify undiagnosed patients who might benefit from genetic testing. By aligning OMOP vocabularies to OBO ontologies, our algorithm presents new opportunities to advance EHR-based deep phenotyping.

Causal feature selection using a knowledge graph combining structured knowledge from the biomedical literature and ontologies: A use case studying depression as a risk factor for Alzheimer's disease.

Malec, Scott A; Taneja, Sanya B; Albert, Steven M; Shaaban, C Elizabeth; Karim, Helmet T; Levine, Arthur S; Munro, Paul; Callahan, Tiffany J; Boyce, Richard D.

J Biomed Inform ; 142: 104368, 2023 Jun.

Article in English | MEDLINE | ID: mdl-37086959

ABSTRACT

BACKGROUND: Causal feature selection is essential for estimating effects from observational data. Identifying confounders is a crucial step in this process. Traditionally, researchers employ content-matter expertise and literature review to identify confounders. Uncontrolled confounding from unidentified confounders threatens validity, conditioning on intermediate variables (mediators) weakens estimates, and conditioning on common effects (colliders) induces bias. Additionally, without special treatment, erroneous conditioning on variables combining roles introduces bias. However, the vast literature is growing exponentially, making it infeasible to assimilate this knowledge. To address these challenges, we introduce a novel knowledge graph (KG) application enabling causal feature selection by combining computable literature-derived knowledge with biomedical ontologies. We present a use case of our approach specifying a causal model for estimating the total causal effect of depression on the risk of developing Alzheimer's disease (AD) from observational data. METHODS: We extracted computable knowledge from a literature corpus using three machine reading systems and inferred missing knowledge using logical closure operations. Using a KG framework, we mapped the output to target terminologies and combined it with ontology-grounded resources. We translated epidemiological definitions of confounder, collider, and mediator into queries for searching the KG and summarized the roles played by the identified variables. We compared the results with output from a complementary method and published observational studies and examined a selection of confounding and combined role variables in-depth. RESULTS: Our search identified 128 confounders, including 58 phenotypes, 47 drugs, 35 genes, 23 collider, and 16 mediator phenotypes. 
However, only 31 of the 58 confounder phenotypes were found to behave exclusively as confounders, while the remaining 27 phenotypes played other roles. Obstructive sleep apnea emerged as a potential novel confounder for depression and AD. Anemia exemplified a variable playing combined roles. CONCLUSION: Our findings suggest combining machine reading and KG could augment human expertise for causal feature selection. However, the complexity of causal feature selection for depression with AD highlights the need for standardized field-specific databases of causal variables. Further work is needed to optimize KG search and transform the output for human consumption.
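
The translation of epidemiological role definitions into graph queries can be illustrated with a small sketch. It uses the textbook definitions restricted to direct edges: a confounder causes both exposure and outcome, a collider is caused by both, and a mediator lies on a path from exposure to outcome. The function classify_roles, the node names, and the toy edges are hypothetical placeholders, not the authors' queries or findings.

```python
def classify_roles(edges, exposure, outcome):
    """Classify nodes by causal role using direct (one-hop) edges only.

    edges is an iterable of (cause, effect) pairs. A node may receive more
    than one role, which corresponds to a "combined role" variable.
    """
    effects = {}  # node -> set of its direct effects
    for cause, effect in edges:
        effects.setdefault(cause, set()).add(effect)

    candidates = {n for edge in edges for n in edge} - {exposure, outcome}
    roles = {}
    for node in sorted(candidates):
        out = effects.get(node, set())
        caused_by_exposure = node in effects.get(exposure, set())
        caused_by_outcome = node in effects.get(outcome, set())
        found = set()
        if exposure in out and outcome in out:
            found.add("confounder")   # node -> exposure and node -> outcome
        if caused_by_exposure and caused_by_outcome:
            found.add("collider")     # exposure -> node <- outcome
        if caused_by_exposure and outcome in out:
            found.add("mediator")     # exposure -> node -> outcome
        if found:
            roles[node] = found
    return roles


# Hypothetical toy graph; c1, m1, k1, and v1 are placeholder variables.
toy_edges = [
    ("c1", "exposure"), ("c1", "outcome"),   # c1: confounder
    ("exposure", "m1"), ("m1", "outcome"),   # m1: mediator
    ("exposure", "k1"), ("outcome", "k1"),   # k1: collider
    ("v1", "exposure"), ("v1", "outcome"),   # v1: confounder ...
    ("exposure", "v1"),                      # ... and also mediator
]
roles = classify_roles(toy_edges, "exposure", "outcome")
for node, found in roles.items():
    print(node, sorted(found))
```

A variable such as v1 above, which matches more than one pattern, is the kind of combined-role case the abstract says requires special treatment.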

Subjects

Alzheimer Disease; Humans; Depression; Pattern Recognition, Automated; Causality; Risk Factors

An evaluation of adverse drug reactions and outcomes attributed to kratom in the US Food and Drug Administration Adverse Event Reporting System from January 2004 through September 2021.

Li, Xiaotong; Ndungu, Patrick; Taneja, Sanya B; Chapin, Maryann R; Egbert, Susan B; Akenapalli, Krishi; Paine, Mary F; Kane-Gill, Sandra L; Boyce, Richard D.

Clin Transl Sci ; 16(6): 1002-1011, 2023 Jun.

Article in English | MEDLINE | ID: mdl-36861661

ABSTRACT

Kratom is a widely used Asian botanical that has gained popularity in the United States due to a perception that it can treat pain, anxiety, and opioid withdrawal symptoms. The American Kratom Association estimates 10-16 million people use kratom. Kratom-associated adverse drug reactions (ADRs) continue to be reported and raise concerns about the safety profile of kratom. However, studies are lacking that describe the overall pattern of kratom-associated adverse events and quantify the association between kratom and adverse events. ADRs reported to the US Food and Drug Administration Adverse Event Reporting System from January 2004 through September 2021 were used to address these knowledge gaps. Descriptive analysis was conducted to analyze kratom-related adverse reactions. Conservative pharmacovigilance signals based on observed-to-expected ratios with shrinkage were estimated by comparing kratom to all other natural products and drugs. Based on 489 deduplicated kratom-related ADR reports, users were young (mean age 35.5 years), and more often male (67.5%) than female patients (23.5%). Cases were predominantly reported since 2018 (94.2%). Fifty-two disproportionate reporting signals in 17 system-organ-class categories were generated. The observed/reported number of kratom-related accidental death reports was 63-fold greater than expected. There were eight strong signals related to addiction or drug withdrawal. An excess proportion of ADR reports were about kratom-related drug complaints, toxicity to various agents, and seizures. Although further research is needed to assess the safety of kratom, clinicians and consumers should be aware that real-world evidence points to potential safety threats.
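
An observed-to-expected ratio with shrinkage can be sketched as follows. The abstract does not give the formula, so this assumes one commonly used form: the expected count is what independence between product and event would predict, and a constant k = 0.5 is added to both observed and expected counts to pull ratios based on few reports toward 1. The function name shrunk_oe and all counts are hypothetical.

```python
import math


def shrunk_oe(n_drug_event, n_drug, n_event, n_total, k=0.5):
    """Return (expected, shrunk_ratio, log2_ratio) for one product-event pair.

    expected is the count predicted if product and event were independent.
    k is the shrinkage constant (0.5 here is an assumption).
    """
    expected = n_drug * n_event / n_total
    ratio = (n_drug_event + k) / (expected + k)
    return expected, ratio, math.log2(ratio)


# Hypothetical counts: 10 reports mention both the product and the event,
# 100 mention the product, 500 mention the event, out of 100,000 reports.
expected, ratio, log_ratio = shrunk_oe(10, 100, 500, 100_000)
print(f"expected={expected:.2f} ratio={ratio:.2f} log2={log_ratio:.2f}")

# With a single observed report and a tiny expected count, the raw ratio
# would be 1000, while the shrunk ratio stays modest.
_, rare_ratio, _ = shrunk_oe(1, 10, 10, 100_000)
print(f"rare pair shrunk ratio={rare_ratio:.2f}")
```

The shrinkage is what makes such signals conservative: a pair needs a reasonable number of reports before its ratio can become large.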

Subjects

Drug-Related Side Effects and Adverse Reactions; Mitragyna; United States/epidemiology; Humans; Male; Female; Adult; Mitragyna/adverse effects; United States Food and Drug Administration; Drug-Related Side Effects and Adverse Reactions/epidemiology; Analgesics, Opioid; Pain

Developing a Knowledge Graph for Pharmacokinetic Natural Product-Drug Interactions.

Taneja, Sanya B; Callahan, Tiffany J; Paine, Mary F; Kane-Gill, Sandra L; Kilicoglu, Halil; Joachimiak, Marcin P; Boyce, Richard D.

J Biomed Inform ; 140: 104341, 2023 Apr.

Article in English | MEDLINE | ID: mdl-36933632

ABSTRACT

BACKGROUND: Pharmacokinetic natural product-drug interactions (NPDIs) occur when botanical or other natural products are co-consumed with pharmaceutical drugs. With the growing use of natural products, the risk for potential NPDIs and consequent adverse events has increased. Understanding mechanisms of NPDIs is key to preventing or minimizing adverse events. Although biomedical knowledge graphs (KGs) have been widely used for drug-drug interaction applications, computational investigation of NPDIs is novel. We constructed NP-KG as a first step toward computational discovery of plausible mechanistic explanations for pharmacokinetic NPDIs that can be used to guide scientific research. METHODS: We developed a large-scale, heterogeneous KG with biomedical ontologies, linked data, and full texts of the scientific literature. To construct the KG, biomedical ontologies and drug databases were integrated with the Phenotype Knowledge Translator framework. The semantic relation extraction systems, SemRep and Integrated Network and Dynamic Reasoning Assembler, were used to extract semantic predications (subject-relation-object triples) from full texts of the scientific literature related to the exemplar natural products green tea and kratom. A literature-based graph constructed from the predications was integrated into the ontology-grounded KG to create NP-KG. NP-KG was evaluated with case studies of pharmacokinetic green tea- and kratom-drug interactions through KG path searches and meta-path discovery to determine congruent and contradictory information in NP-KG compared to ground truth data. We also conducted an error analysis to identify knowledge gaps and incorrect predications in the KG. RESULTS: The fully integrated NP-KG consisted of 745,512 nodes and 7,249,576 edges. 
Evaluation of NP-KG resulted in congruent (38.98% for green tea, 50% for kratom), contradictory (15.25% for green tea, 21.43% for kratom), and both congruent and contradictory (15.25% for green tea, 21.43% for kratom) information compared to ground truth data. Potential pharmacokinetic mechanisms for several purported NPDIs, including the green tea-raloxifene, green tea-nadolol, kratom-midazolam, kratom-quetiapine, and kratom-venlafaxine interactions were congruent with the published literature. CONCLUSION: NP-KG is the first KG to integrate biomedical ontologies with full texts of the scientific literature focused on natural products. We demonstrate the application of NP-KG to identify known pharmacokinetic interactions between natural products and pharmaceutical drugs mediated by drug metabolizing enzymes and transporters. Future work will incorporate context, contradiction analysis, and embedding-based methods to enrich NP-KG. NP-KG is publicly available at https://doi.org/10.5281/zenodo.6814507. The code for relation extraction, KG construction, and hypothesis generation is available at https://github.com/sanyabt/np-kg.
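
A KG path search over subject-relation-object triples can be sketched with a bounded breadth-first search. This is a generic illustration, not the NP-KG code (which the abstract says is available in the linked repository); the function find_paths, the hop limit, and the placeholder triples are hypothetical and do not represent content of NP-KG.

```python
from collections import deque


def find_paths(triples, start, goal, max_hops=3):
    """Return all simple paths from start to goal with at most max_hops edges.

    Each path is a list of (subject, relation, object) triples, shortest first.
    """
    adjacency = {}
    for subj, rel, obj in triples:
        adjacency.setdefault(subj, []).append((rel, obj))

    paths = []
    queue = deque([(start, [])])
    while queue:
        node, path = queue.popleft()
        if node == goal and path:
            paths.append(path)
            continue
        if len(path) == max_hops:
            continue
        visited = {start} | {obj for _, _, obj in path}
        for rel, obj in adjacency.get(node, []):
            if obj not in visited:  # keep paths simple (no repeated nodes)
                queue.append((obj, path + [(node, rel, obj)]))
    return paths


# Placeholder triples with made-up node and relation names.
toy_triples = [
    ("natural_product_A", "inhibits", "enzyme_E"),
    ("enzyme_E", "metabolizes", "drug_D"),
    ("natural_product_A", "interacts_with", "drug_D"),
    ("drug_D", "treats", "condition_C"),
]
paths = find_paths(toy_triples, "natural_product_A", "drug_D")
for path in paths:
    print(" ; ".join(f"{s} -{r}-> {o}" for s, r, o in path))
```

A two-hop path through an enzyme or transporter node is the shape of result that could suggest a mechanistic explanation for an interaction, which would then be compared against ground truth data as the abstract describes.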

Subjects

Biological Ontologies; Biological Products; Pattern Recognition, Automated; Drug Interactions; Semantics; Pharmaceutical Preparations

Bayesian network models with decision tree analysis for management of childhood malaria in Malawi.

Taneja, Sanya B; Douglas, Gerald P; Cooper, Gregory F; Michaels, Marian G; Druzdzel, Marek J; Visweswaran, Shyam.

BMC Med Inform Decis Mak ; 21(1): 158, 2021 May 17.

Article in English | MEDLINE | ID: mdl-34001100

ABSTRACT

BACKGROUND: Malaria is a major cause of death in children under five years old in low- and middle-income countries such as Malawi. Accurate diagnosis and management of malaria can help reduce the global burden of childhood morbidity and mortality. Trained healthcare workers in rural health centers manage malaria with limited supplies of malarial diagnostic tests and drugs for treatment. A clinical decision support system that integrates predictive models to provide an accurate prediction of malaria based on clinical features could aid healthcare workers in the judicious use of testing and treatment. We developed Bayesian network (BN) models to predict the probability of malaria from clinical features and an illustrative decision tree to model the decision to use or not use a malaria rapid diagnostic test (mRDT). METHODS: We developed two BN models to predict malaria from a dataset of outpatient encounters of children in Malawi. The first BN model was created manually with expert knowledge, and the second model was derived using an automated method. The performance of the BN models was compared to other statistical models on a range of performance metrics at multiple thresholds. We developed a decision tree that integrates predictions with the costs of mRDT and a course of recommended treatment. RESULTS: The manually created BN model achieved an area under the ROC curve (AUC) equal to 0.60 which was statistically significantly higher than the other models. At the optimal threshold for classification, the manual BN model had sensitivity and specificity of 0.74 and 0.42 respectively, and the automated BN model had sensitivity and specificity of 0.45 and 0.68 respectively. The balanced accuracy values were similar across all the models. Sensitivity analysis of the decision tree showed that for values of probability of malaria below 0.04 and above 0.40, the preferred decision that minimizes expected costs is not to perform mRDT. 
CONCLUSION: In resource-constrained settings, judicious use of mRDT is important. Predictive models in combination with decision analysis can provide personalized guidance on when to use mRDT in the management of childhood malaria. BN models can be efficiently derived from data to support clinical decision making.
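
The statement that balanced accuracy was similar across models can be checked for the two BN models from the sensitivity and specificity reported in this abstract, since balanced accuracy is the mean of the two. The sketch below only does that arithmetic; the function name is hypothetical, and the values for the other statistical models are not given in the abstract.

```python
def balanced_accuracy(sensitivity, specificity):
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2


# Sensitivity and specificity as reported in the abstract.
manual_bn = balanced_accuracy(0.74, 0.42)
automated_bn = balanced_accuracy(0.45, 0.68)
print(f"manual BN: {manual_bn:.3f}, automated BN: {automated_bn:.3f}")
```

The two models trade sensitivity for specificity in opposite directions, yet their balanced accuracies differ by less than 0.02.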

Subjects

Malaria; Bayes Theorem; Child; Child, Preschool; Decision Trees; Diagnostic Tests, Routine; Humans; Malaria/diagnosis; Malaria/drug therapy; Malawi/epidemiology

Machine Learning Classifiers for Twitter Surveillance of Vaping: Comparative Machine Learning Study.

Visweswaran, Shyam; Colditz, Jason B; O'Halloran, Patrick; Han, Na-Rae; Taneja, Sanya B; Welling, Joel; Chu, Kar-Hai; Sidani, Jaime E; Primack, Brian A.

J Med Internet Res ; 22(8): e17478, 2020 Aug 12.

Article in English | MEDLINE | ID: mdl-32784184

ABSTRACT

BACKGROUND: Twitter presents a valuable and relevant social media platform to study the prevalence of information and sentiment on vaping that may be useful for public health surveillance. Machine learning classifiers that identify vaping-relevant tweets and characterize sentiments in them can underpin a Twitter-based vaping surveillance system. Compared with traditional machine learning classifiers that are reliant on annotations that are expensive to obtain, deep learning classifiers offer the advantage of requiring fewer annotated tweets by leveraging the large numbers of readily available unannotated tweets. OBJECTIVE: This study aims to derive and evaluate traditional and deep learning classifiers that can identify tweets relevant to vaping, tweets of a commercial nature, and tweets with provape sentiments. METHODS: We continuously collected tweets that matched vaping-related keywords over 2 months from August 2018 to October 2018. From this data set of tweets, a set of 4000 tweets was selected, and each tweet was manually annotated for relevance (vape relevant or not), commercial nature (commercial or not), and sentiment (provape or not). Using the annotated data, we derived traditional classifiers that included logistic regression, random forest, linear support vector machine, and multinomial naive Bayes. In addition, using the annotated data set and a larger unannotated data set of tweets, we derived deep learning classifiers that included a convolutional neural network (CNN), long short-term memory (LSTM) network, LSTM-CNN network, and bidirectional LSTM (BiLSTM) network. The unannotated tweet data were used to derive word vectors that deep learning classifiers can leverage to improve performance. 
RESULTS: LSTM-CNN performed the best with the highest area under the receiver operating characteristic curve (AUC) of 0.96 (95% CI 0.93-0.98) for relevance, all deep learning classifiers including LSTM-CNN performed better than the traditional classifiers with an AUC of 0.99 (95% CI 0.98-0.99) for distinguishing commercial from noncommercial tweets, and BiLSTM performed the best with an AUC of 0.83 (95% CI 0.78-0.89) for provape sentiment. Overall, LSTM-CNN performed the best across all 3 classification tasks. CONCLUSIONS: We derived and evaluated traditional machine learning and deep learning classifiers to identify vaping-related relevant, commercial, and provape tweets. Overall, deep learning classifiers such as LSTM-CNN had superior performance and had the added advantage of requiring no preprocessing. The performance of these classifiers supports the development of a vaping surveillance system.
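
The AUC values used to compare classifiers here have a simple interpretation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it directly from that definition; the function name pairwise_auc and the labels and scores are hypothetical, not data from the study.

```python
def pairwise_auc(labels, scores):
    """AUC as the fraction of positive-negative pairs ranked correctly.

    labels are 1 (positive) or 0 (negative); ties count as half a win.
    Quadratic in the number of examples, which is fine for a small sketch.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical classifier scores for four tweets (1 = vape relevant).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
auc = pairwise_auc(toy_labels, toy_scores)
print(f"AUC = {auc:.2f}")
```

Because AUC depends only on the ranking of scores, it allows classifiers with differently scaled outputs to be compared on the same task.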

Subjects

Deep Learning; Machine Learning/standards; Public Health Surveillance/methods; Social Media/standards; Vaping/trends; Humans; Longitudinal Studies
