1.
Pharmacol Rev ; 75(4): 714-738, 2023 Jul.
Article in English | MEDLINE | ID: mdl-36931724

ABSTRACT

Natural language processing (NLP) is an area of artificial intelligence that applies information technologies to process human language, understand it to a certain degree, and use it in various applications. This area has developed rapidly in the past few years and now employs modern variants of deep neural networks to extract relevant patterns from large text corpora. The main objective of this work is to survey the recent use of NLP in the field of pharmacology. As our work shows, NLP is a highly relevant information extraction and processing approach for pharmacology. It has been used extensively, from intelligent searches through thousands of medical documents to finding traces of adverse drug interactions in social media. We split our coverage into five categories to survey modern NLP: methodology, commonly addressed tasks, relevant textual data, knowledge bases, and useful programming libraries. We split each of the five categories into appropriate subcategories, describe their main properties and ideas, and summarize them in tabular form. The resulting survey presents a comprehensive overview of the area, useful to practitioners and interested observers. SIGNIFICANCE STATEMENT: The main objective of this work is to survey the recent use of NLP in the field of pharmacology in order to provide a comprehensive overview of the current state of the area after the rapid developments of the past few years. The resulting survey will be useful to practitioners and interested observers in the domain.


Subjects
Artificial Intelligence , Natural Language Processing , Humans , Information Storage and Retrieval , Electronic Health Records , Records
2.
Front Artif Intell ; 6: 932519, 2023.
Article in English | MEDLINE | ID: mdl-37056912

ABSTRACT

Introduction: Large pretrained language models have recently conquered the area of natural language processing. As an alternative to the predominant masked language modeling introduced in BERT, the T5 model has introduced a more general training objective, namely sequence-to-sequence transformation, which more naturally fits text generation tasks. The monolingual variants of T5 models have been limited to well-resourced languages, while the massively multilingual T5 model supports 101 languages. Methods: We trained two different-sized T5-type sequence-to-sequence models for the morphologically rich Slovene language with far fewer resources. We analyzed the behavior of the new models on 11 tasks: eight classification tasks (named entity recognition, sentiment classification, lemmatization, two question answering tasks, two natural language inference tasks, and a coreference resolution task) and three text generation tasks (text simplification and two summarization tasks on different datasets). We compared the new SloT5 models with the multilingual mT5 model, the multilingual mBART-50 model, and four encoder BERT-like models: multilingual BERT, multilingual XLM-RoBERTa, trilingual Croatian-Slovene-English BERT, and the monolingual Slovene RoBERTa model. Results: On the classification tasks, the SloT5 models mostly lag behind the monolingual Slovene SloBERTa model. However, these models are helpful for generative tasks and provide several useful results. In general, the size of models matters, and currently there is not enough training data for Slovene for successful pretraining of large models. Discussion: While the results are obtained on Slovene, we believe that they may generalize to other less-resourced languages for which such models will be built. We make the training and evaluation code, as well as the trained models, publicly available.

3.
JMIR Med Inform ; 10(2): e30483, 2022 Feb 02.
Article in English | MEDLINE | ID: mdl-35107432

ABSTRACT

BACKGROUND: Cardiovascular disorders in general are responsible for 30% of deaths worldwide. Among them, hypertrophic cardiomyopathy (HCM) is a genetic cardiac disease that is present in about 1 in 500 young adults and can cause sudden cardiac death (SCD). OBJECTIVE: Although the current state-of-the-art methods model the risk of SCD for patients, to the best of our knowledge, no methods are available for modeling the patient's clinical status up to 10 years ahead. In this paper, we propose a novel machine learning (ML)-based tool for predicting disease progression for patients diagnosed with HCM in terms of adverse remodeling of the heart during a 10-year period. METHODS: The method consisted of 6 predictive regression models that independently predict future values of 6 clinical characteristics: left atrial size, left atrial volume, left ventricular ejection fraction, New York Heart Association functional classification, left ventricular internal diastolic diameter, and left ventricular internal systolic diameter. We supplemented each prediction with an explanation generated using the Shapley additive explanation method. RESULTS: The final experiments showed that predictive error is lower on 5 of the 6 constructed models in comparison to experts (on average, by 0.34) or a consortium of experts (on average, by 0.22). The experiments revealed that semisupervised learning and the artificial data from virtual patients help improve predictive accuracy. The best-performing random forest model improved R² from 0.3 to 0.6. CONCLUSIONS: By engaging medical experts to provide interpretation and validation of the results, we determined the models' favorable performance compared to the performance of experts for 5 of 6 targets.

4.
Comput Biol Med ; 135: 104648, 2021 08.
Article in English | MEDLINE | ID: mdl-34280775

ABSTRACT

BACKGROUND: Machine learning (ML) and artificial intelligence are emerging as important components of precision medicine that enhance diagnosis and risk stratification. Risk stratification tools for hypertrophic cardiomyopathy (HCM) exist, but they are based on traditional statistical methods. The aim was to develop a novel machine learning risk stratification tool for the prediction of 5-year risk in HCM. The goal was to determine whether its predictive accuracy is higher than the accuracy of the state-of-the-art tools. METHOD: Data from a total of 2302 patients were used. The data comprised demographic characteristics, genetic data, clinical investigations, medications, and disease-related events. Four classification models were applied to model the risk level, and their decisions were explained using the SHAP (SHapley Additive exPlanations) method. Unwanted cardiac events were defined as the occurrence of sustained ventricular tachycardia (VT), heart failure (HF), ICD activation, sudden cardiac death (SCD), cardiac death, and all-cause death. RESULTS: The proposed machine learning approach outperformed similar existing risk-stratification models for SCD, cardiac death, and all-cause death: it achieved higher AUC by 17%, 9%, and 1%, respectively. The boosted trees achieved the best AUC of 0.82. The resulting model most accurately predicts VT, HF, and ICD with AUCs of 0.90, 0.88, and 0.87, respectively. CONCLUSIONS: The proposed risk-stratification model demonstrates high accuracy in predicting events in patients with hypertrophic cardiomyopathy. The use of a machine-learning risk stratification model may improve patient management, clinical practice, and outcomes in general.


Subjects
Cardiomyopathy, Hypertrophic , Heart Failure , Tachycardia, Ventricular , Artificial Intelligence , Cardiomyopathy, Hypertrophic/epidemiology , Cardiomyopathy, Hypertrophic/genetics , Heart Failure/epidemiology , Humans , Machine Learning , Risk Assessment , Risk Factors , Tachycardia, Ventricular/epidemiology , Tachycardia, Ventricular/genetics
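
The AUC figures quoted in the abstract above can be read as the probability that a randomly chosen patient with an event receives a higher risk score than a randomly chosen patient without one. The following is a minimal sketch of that pairwise definition, not the authors' evaluation code; the function name `auc` and the toy labels and scores are hypothetical.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    labels: 1 for patients with the event, 0 for patients without it.
    scores: predicted risk, higher meaning more likely to have the event.
    """
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_auc = auc(toy_labels, toy_scores)
```

The double loop is quadratic in the number of patients; it is kept here because it mirrors the definition directly, whereas a rank-based computation would be preferable for large cohorts.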
5.
Comput Biol Med ; 134: 104520, 2021 07.
Article in English | MEDLINE | ID: mdl-34118751

ABSTRACT

Virtual population generation is an emerging field in data science with numerous applications in healthcare, notably the augmentation of clinical research databases that lack sufficient population size. However, the impact of data augmentation on the development of AI (artificial intelligence) models to address unmet clinical needs has not yet been investigated. In this work, we assess, for the first time in the literature, whether the aggregation of real with virtual patient data can improve the performance of the existing risk stratification and disease classification models in two rare clinical domains, namely primary Sjögren's Syndrome (pSS) and hypertrophic cardiomyopathy (HCM). To do so, multivariate approaches, such as the multivariate normal distribution (MVND), and straightforward ones, such as Bayesian networks, artificial neural networks (ANNs), and tree ensembles, are compared in terms of their ability to generate high-quality virtual data. Both boosting and bagging algorithms, such as gradient boosting trees (XGBoost), AdaBoost, and Random Forests (RFs), were trained on the augmented data to evaluate the performance improvement for lymphoma classification and HCM risk stratification. Our results revealed the favorable performance of the tree ensemble generators in both domains, yielding virtual data with goodness-of-fit 0.021 and KL-divergence 0.029 in pSS, and 0.029 and 0.027 in HCM, respectively. Applying XGBoost to the augmented data increased accuracy by 10.9%, sensitivity by 10.7%, and specificity by 11.5% for lymphoma classification, and accuracy by 16.1%, sensitivity by 16.9%, and specificity by 13.7% for HCM risk stratification.


Subjects
Algorithms , Artificial Intelligence , Bayes Theorem , Humans , Neural Networks, Computer , Risk Assessment
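
The abstract above reports KL-divergence between real and virtual data as a quality measure, where values near zero indicate that the generated data follow the real distribution closely. The sketch below shows one common way to estimate it for a single categorical variable; the abstract does not state which estimator the authors used, so the function name `kl_divergence`, the `eps` floor for unseen categories, and the example values are all assumptions.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def kl_divergence(real: Sequence[Hashable],
                  virtual: Sequence[Hashable],
                  eps: float = 1e-9) -> float:
    """Estimate KL(real || virtual) for one categorical variable.

    Empirical frequencies stand in for the true distributions. Categories
    missing from the virtual sample are floored at `eps` (an assumption made
    here to avoid division by zero, not part of the definition).
    """
    real_counts = Counter(real)
    virtual_counts = Counter(virtual)
    total = 0.0
    for category in set(real) | set(virtual):
        p = real_counts[category] / len(real)
        q = max(virtual_counts[category] / len(virtual), eps)
        if p > 0:  # terms with p == 0 contribute nothing
            total += p * math.log(p / q)
    return total


# Hypothetical example: a binary clinical attribute in real and virtual cohorts.
real_values = ["yes", "yes", "no", "no"]
virtual_values = ["yes"] * 9 + ["no"]
divergence = kl_divergence(real_values, virtual_values)
```

For continuous attributes the values would first need to be binned or modeled with a density estimate, which this sketch leaves out.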
6.
Mach Learn ; 109(7): 1465-1507, 2020.
Article in English | MEDLINE | ID: mdl-32704202

ABSTRACT

Data preprocessing is an important component of machine learning pipelines, which requires ample time and resources. An integral part of preprocessing is data transformation into the format required by a given learning algorithm. This paper outlines some of the modern data processing techniques used in relational learning that enable data fusion from different input data types and formats into a single table data representation, focusing on the propositionalization and embedding data transformation approaches. While both approaches aim at transforming data into tabular data format, they use different terminology and task definitions, are perceived to address different goals, and are used in different contexts. This paper contributes a unifying framework that allows for improved understanding of these two data transformation techniques by presenting their unified definitions, and by explaining the similarities and differences between the two approaches as variants of a unified complex data transformation task. In addition to the unifying framework, the novelty of this paper is a unifying methodology combining propositionalization and embeddings, which benefits from the advantages of both in solving complex data transformation and learning tasks. We present two efficient implementations of the unifying methodology: an instance-based PropDRM approach, and a feature-based PropStar approach to data transformation and learning, together with their empirical evaluation on several relational problems. The results show that the new algorithms can outperform existing relational learners and can solve much larger problems.
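
As a rough illustration of the propositionalization idea described in the abstract above (turning relational data into a single table), the sketch below flattens a one-to-many relation into binary features on the main table. It is a deliberately simple stand-in and does not reproduce the PropDRM or PropStar approaches; the function name `propositionalize`, the tables, and the column names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def propositionalize(main_rows: List[Dict],
                     related_rows: List[Dict],
                     key: str,
                     attribute: str) -> List[Dict]:
    """Flatten a one-to-many relation into binary columns on the main table.

    For every distinct value v of `attribute` in the related table, each main
    row gains a 0/1 feature saying whether some related row with the same
    `key` carries that value.
    """
    values_by_key = defaultdict(set)
    for row in related_rows:
        values_by_key[row[key]].add(row[attribute])

    feature_values = sorted({row[attribute] for row in related_rows})

    table = []
    for row in main_rows:
        flat = dict(row)  # copy so the input table is left untouched
        for value in feature_values:
            flat[f"has_{attribute}={value}"] = int(value in values_by_key[row[key]])
        table.append(flat)
    return table


# Hypothetical relational data: patients and their prescriptions.
patients = [{"patient_id": 1, "age": 54}, {"patient_id": 2, "age": 61}]
prescriptions = [
    {"patient_id": 1, "drug": "aspirin"},
    {"patient_id": 1, "drug": "statin"},
    {"patient_id": 2, "drug": "statin"},
]
single_table = propositionalize(patients, prescriptions, "patient_id", "drug")
```

The resulting rows can be passed to any learner that expects tabular input, which is the property both data transformation families in the abstract aim for.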

7.
Artif Intell Med ; 91: 82-95, 2018 09.
Article in English | MEDLINE | ID: mdl-29803610

ABSTRACT

Quality of life of patients with Parkinson's disease degrades significantly with disease progression. This paper presents a step towards personalized management of Parkinson's disease patients, based on discovering groups of similar patients. Similarity is based on patients' medical conditions and on changes in the prescribed therapy when the medical conditions change. We present two novel approaches. The first is an algorithm that discovers the impact of symptoms on Parkinson's disease progression. Experiments on the Parkinson Progression Markers Initiative (PPMI) data reveal a subset of symptoms influencing disease progression that are already established in the Parkinson's disease literature, as well as symptoms that clinicians have only recently considered as possible indicators of disease progression. The second is a methodology for detecting patterns of medication dosage changes based on the patient's status. The methodology combines multitask learning using predictive clustering trees with short time series analysis to better understand when a change in medication is required. The experiments on PPMI data demonstrate that, using the proposed methodology, we can identify some clinically confirmed patient symptoms that suggest a medication change. In terms of predictive performance, our multitask predictive clustering tree approach is mostly comparable to the random forest multitask model, but has the advantage of model interpretability.


Subjects
Algorithms , Antiparkinson Agents/therapeutic use , Disease Progression , Parkinson Disease/drug therapy , Parkinson Disease/physiopathology , Antiparkinson Agents/administration & dosage , Biomarkers , Data Mining/methods , Dose-Response Relationship, Drug , Humans , Quality of Life , Severity of Illness Index
8.
IEEE Trans Neural Netw Learn Syst ; 27(5): 926-38, 2016 May.
Article in English | MEDLINE | ID: mdl-26011896

ABSTRACT

There are many problems where the available data are scarce and expensive. We propose a generator of semiartificial data with properties similar to those of the original data, which enables the development and testing of different data mining algorithms and the optimization of their parameters. The generated data allow large-scale experimentation and simulations without danger of overfitting. The proposed generator is based on radial basis function networks, which learn sets of Gaussian kernels. These Gaussian kernels can be used in a generative mode to generate new data from the same distributions. To assess the quality of the generated data, we evaluated their statistical properties, structural similarity, and predictive similarity using supervised and unsupervised learning techniques. To determine the usability of the proposed generator, we conducted a large-scale evaluation using 51 data sets. The results show a considerable similarity between the original and generated data and indicate that the method can be useful in several development and simulation scenarios. We analyze possible improvements in classification performance from adding different amounts of the generated data to the training set, performance on high-dimensional data sets, and the conditions under which the proposed approach is successful.
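
The generative use of Gaussian kernels mentioned in the abstract above can be sketched as sampling from a weighted mixture: pick a kernel, then draw each attribute around that kernel's center. In the sketch below the kernels are written by hand rather than learned by a radial basis function network, and they are assumed to be axis-aligned (one spread per attribute); the names `Kernel` and `generate` are hypothetical.

```python
import random
from typing import List, Sequence, Tuple

# (center, per-attribute standard deviation, mixture weight); hand-specified
# here, whereas the paper obtains its kernels from a trained RBF network.
Kernel = Tuple[Sequence[float], Sequence[float], float]


def generate(kernels: Sequence[Kernel], n: int, seed: int = 0) -> List[List[float]]:
    """Draw n semiartificial instances from a mixture of Gaussian kernels."""
    rng = random.Random(seed)  # local generator keeps the run reproducible
    weights = [weight for _, _, weight in kernels]
    samples = []
    for _ in range(n):
        # Choose a kernel in proportion to its weight ...
        center, spread, _ = rng.choices(kernels, weights=weights, k=1)[0]
        # ... then sample each attribute independently around its center.
        samples.append([rng.gauss(mu, sigma) for mu, sigma in zip(center, spread)])
    return samples


# Hypothetical two-cluster example in two dimensions.
example_kernels = [
    ((0.0, 0.0), (0.5, 0.5), 1.0),
    ((10.0, 10.0), (0.5, 0.5), 1.0),
]
generated = generate(example_kernels, n=200)
```

Sampling attributes independently ignores correlations within a kernel; that simplification is made here for brevity and is not a claim about the published generator.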

9.
Artif Intell Med ; 64(2): 147-58, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25940855

ABSTRACT

OBJECTIVE: Survey data sets are important sources of data, and their successful exploitation is of key importance for informed policy decision-making. We present how a survey analysis approach initially developed for customer satisfaction research in marketing can be adapted for the introduction of clinical pharmacy services into a hospital. METHODS AND MATERIAL: We use a data mining analytical approach to extract relevant managerial consequences. We evaluate the importance of competences for users of a clinical pharmacy with the OrdEval algorithm and determine their nature according to the users' expectations. For this, we need substantially fewer questions than are required by the Kano approach. RESULTS: From 52 clinical pharmacy activities we were able to identify seven activities with a substantial negative impact (i.e., negative reinforcement) on the overall satisfaction with clinical pharmacy services, and two activities with a strong positive impact (upward reinforcement). Using analysis of individual feature values, we identified six performance, 10 excitement, and one basic clinical pharmacists' activity. CONCLUSIONS: We show how the OrdEval algorithm can exploit the information hidden in the ordering of class and attribute values, and their inherent correlation, using a small sample of highly relevant respondents. The visualization of the outputs proves highly useful in our clinical pharmacy research case study.


Subjects
Algorithms , Data Mining/methods , Machine Learning , Pharmacists/organization & administration , Pharmacy Service, Hospital/organization & administration , Attitude of Health Personnel , Consumer Behavior , Health Care Surveys , Health Knowledge, Attitudes, Practice , Health Services Research , Humans , Nursing Staff, Hospital/psychology , Physicians/psychology , Professional Role , Surveys and Questionnaires
10.
Artif Intell Med ; 29(1-2): 25-38, 2003.
Article in English | MEDLINE | ID: mdl-12957779

ABSTRACT

We analyzed the data of a controlled clinical study of the acceleration of chronic wound healing as a result of electrical stimulation. The study involved a conventional conservative treatment, sham treatment, biphasic pulsed current, and direct current electrical stimulation. Data were collected over 10 years and suffice for an analysis with machine learning methods. So far, only a limited number of studies have investigated the wound and patient attributes that affect chronic wound healing. To our knowledge, none has included treatment attributes. The aims of our study are to determine the effects of the wound, patient, and treatment attributes on the wound healing process and to propose a system for predicting the wound healing rate. First, we analyzed which wound and patient attributes play a predominant role in the wound healing process and investigated the possibility of predicting the wound healing rate at the beginning of the treatment based on the initial wound, patient, and treatment attributes. Later, we tried to improve the prediction accuracy by predicting the wound healing rate after a few weeks of follow-up. Using the attribute estimation algorithms ReliefF and RReliefF, we obtained a ranking of the prognostic factors that was comprehensible to experts. We used regression and classification trees to build models for prediction of the wound healing rate. The obtained results are encouraging and may form a basis for an expert system for chronic wound healing rate prediction. If the wound healing rate is known, the provided information can help to formulate appropriate treatment decisions and direct resources towards individuals with a poor prognosis.


Subjects
Algorithms , Artificial Intelligence , Electric Stimulation Therapy , Models, Theoretical , Wound Healing , Decision Trees , Humans , Prognosis , Regression Analysis
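
The attribute ranking described in the abstract above relies on ReliefF and RReliefF. As a hedged illustration of the underlying idea, the sketch below implements the basic two-class Relief estimate: an attribute gains weight when it separates an instance from its nearest neighbour of the other class and loses weight when it differs from the nearest neighbour of the same class. ReliefF extends this to several neighbours, multiple classes, and noisy data, and RReliefF to regression; neither extension is shown. The function name `relief` and the toy data are hypothetical.

```python
from typing import List, Sequence


def relief(X: Sequence[Sequence[float]], y: Sequence[int]) -> List[float]:
    """Basic Relief attribute weights for numeric attributes and two classes.

    Every instance is used once (no random sampling), and each class is
    assumed to contain at least two instances.
    """
    n_attrs = len(X[0])
    # Attribute ranges normalise differences to [0, 1]; constant columns use 1.0.
    spans = []
    for a in range(n_attrs):
        column = [row[a] for row in X]
        spans.append((max(column) - min(column)) or 1.0)

    def diff(a: int, r: Sequence[float], s: Sequence[float]) -> float:
        return abs(r[a] - s[a]) / spans[a]

    def distance(r: Sequence[float], s: Sequence[float]) -> float:
        return sum(diff(a, r, s) for a in range(n_attrs))

    weights = [0.0] * n_attrs
    for i, r in enumerate(X):
        hits = [X[j] for j in range(len(X)) if j != i and y[j] == y[i]]
        misses = [X[j] for j in range(len(X)) if y[j] != y[i]]
        nearest_hit = min(hits, key=lambda s: distance(r, s))
        nearest_miss = min(misses, key=lambda s: distance(r, s))
        for a in range(n_attrs):
            # Reward separation from the other class, penalise spread within the class.
            weights[a] += (diff(a, r, nearest_miss) - diff(a, r, nearest_hit)) / len(X)
    return weights


# Hypothetical toy data: attribute 0 tracks the class, attribute 1 does not.
toy_X = [(0.0, 0.3), (0.1, 0.9), (0.2, 0.1), (0.9, 0.2), (1.0, 0.8), (0.8, 0.5)]
toy_y = [0, 0, 0, 1, 1, 1]
toy_weights = relief(toy_X, toy_y)
```

Sorting attributes by these weights gives a ranking of the kind the abstract describes presenting to experts.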
11.
J Hum Kinet ; 38: 183-9, 2013.
Article in English | MEDLINE | ID: mdl-24235993

ABSTRACT

The International Basketball Federation (FIBA) recently introduced major rule changes that came into effect with the 2010/11 season, most notably moving the three-point arc and changing the shot clock. The purpose of this study was to investigate and quantify how these changes affect the game performance of top-level European basketball players. In order to better understand these changes, we also investigated past seasons and showed the presence of several trends, even in the absence of significant rule changes. A large set of game statistics covering 10 seasons and 2198 Euroleague basketball games in which top European clubs competed was analyzed. Results show that the effects of the rule changes are contrary to trends in recent years.
