1.
Stud Health Technol Inform ; 180: 235-9, 2012.
Article in English | MEDLINE | ID: mdl-22874187

ABSTRACT

Pharmacovigilance is the activity related to the collection, analysis and prevention of adverse drug reactions (ADRs). It supports the safety monitoring of pharmaceutical products. The pharmacovigilance process benefits from traditional statistical approaches and also from qualitative information on semantic relations between close ADR terms, such as SMQs or the hierarchical levels of MedDRA. In this work, our objective is to detect semantic relatedness between MedDRA ADR terms. To achieve this, we combine two approaches: semantic similarity algorithms computed within structured resources and terminology structuring methods applied to a raw list of MedDRA terms. We compare these methods with one another and study their differences and complementarity. The results are evaluated against a gold standard manually compiled within the pharmacovigilance area and also by an expert. Combining the methods leads to improved recall.


Subjects
Adverse Drug Reaction Reporting Systems , Database Management Systems , Databases, Factual , Drug-Related Side Effects and Adverse Reactions/epidemiology , Natural Language Processing , Pharmacovigilance , Vocabulary, Controlled , Artificial Intelligence , France/epidemiology , Humans
2.
Stud Health Technol Inform ; 294: 868-869, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612229

ABSTRACT

We address the problem of semantic labeling of terms in two French medical corpora with a subset of the UMLS. We perform two experiments relying on the structure of words and terms, and on their context: 1) the semantic label of already identified terms is predicted; 2) the terms are detected in raw texts and their semantic label is predicted. Our results show an F-measure above 0.90.


Subjects
Semantics , Unified Medical Language System , Natural Language Processing
3.
Stud Health Technol Inform ; 281: 253-257, 2021 May 27.
Article in English | MEDLINE | ID: mdl-34042744

ABSTRACT

This paper presents a prototype for the visualization of food-drug interactions implemented in the MIAM project, whose objective is to develop methods for the extraction and representation of these interactions and to make them available in the Thériaque database. The prototype provides users with a graphical visualization showing the hierarchies of drugs and foods side by side, with links between them representing the existing interactions, as well as additional details about them, including the number of articles reporting the interaction. The prototype is interactive in the following ways: hierarchies can be easily folded and unfolded, a filter can be applied to view only certain types of interactions, and details about a given interaction are displayed when the mouse is moved over the corresponding link. Future work includes proposing a version more suitable for users who are not health professionals and representing the food hierarchy on the basis of a reference classification.


Subjects
Food-Drug Interactions , Animals , Databases, Factual , Mice
4.
Stud Health Technol Inform ; 160(Pt 2): 1015-9, 2010.
Article in English | MEDLINE | ID: mdl-20841837

ABSTRACT

Acquisition and enrichment of lexical resources is an important research area in computational linguistics. We propose a method for inducing a lexicon of synonyms and for weighting it in order to establish its reliability. The method is based on the analysis of the syntactic structure of complex terms. We apply and evaluate the approach on three biomedical terminologies (MeSH, Snomed Int, Snomed CT). Between 7.7 and 33.6% of the induced synonyms are ambiguous and co-occur with other semantic relations. A virtual reference makes it possible to validate 9 to 14% of the induced synonyms.


Subjects
Semantics , Linguistics , Medical Subject Headings , Natural Language Processing , Systematized Nomenclature of Medicine
5.
Stud Health Technol Inform ; 160(Pt 2): 964-8, 2010.
Article in English | MEDLINE | ID: mdl-20841827

ABSTRACT

Risk factor discovery and prevention is an active research field within the biomedical domain. Despite abundant existing information on risk factors, as found in bibliographical databases or on several websites, accessing this information may be difficult. Methods from Natural Language Processing and Information Extraction can help to access it more easily. Specifically, we show a procedure for analyzing massive amounts of scientific literature and for detecting linguistically marked associations between pathologies and risk factors. This approach allowed us to extract over 22,000 risk factors and associated pathologies. The evaluations performed showed that (1) over 88% of risk factors for coronary heart disease are correct, (2) associated pathologies, when they could be compared to MeSH indexing, are correct in about 70% of cases, and (3) links between risk factors and their pathologies are seldom recorded in existing terminologies.


Subjects
Data Mining/standards , Abstracting and Indexing/methods , Databases, Bibliographic , Disease , Medical Subject Headings , Natural Language Processing , Risk Factors , Semantics , United States
6.
Methods Inf Med ; 48(2): 149-54, 2009.
Article in English | MEDLINE | ID: mdl-19283312

ABSTRACT

OBJECTIVE: Currently, natural language processing (NLP) approaches are not commonly used to improve search and exploration of electronic health records (EHRs) within healthcare information systems. One reason for this is the lack of suitable lexical resources. Indeed, in order to support such tasks, various types of such resources need to be collected or acquired (i.e., morphological, orthographic, synonymous). METHODS: We propose a novel method for the acquisition of synonymy resources. This method is language-independent and relies on the existence of structured terminologies. It makes it possible to uncover hidden synonymy relations between simple words and terms on the basis of their syntactic analysis and the exploitation of their compositionality. RESULTS: Applied to series of synonymous terms from the French subset of the UMLS, the method shows 99% precision. The overlap between the terms thus inferred and the existing sparse synonym resources is very low. In order to better integrate these resources into an EHR search system, we analyzed a sample of clinical queries submitted by healthcare professionals. CONCLUSIONS: Observation of clinical queries shows that they make very little use of the query expansion function and that, whenever they do, synonymy relations are rarely involved.


Subjects
Hospital Information Systems/organization & administration , Medical Records Systems, Computerized , Natural Language Processing , Terminology as Topic , France , Humans
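
The compositional inference described in the abstract of this entry can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes plain whitespace tokenization in place of the syntactic analysis they rely on, and the function name `infer_synonyms` and the sample terms are hypothetical. The idea shown is that when two synonymous terms of equal length differ in exactly one word, the differing words are proposed as a synonym pair.

```python
from itertools import combinations


def infer_synonyms(synonym_sets):
    """Propose word-level synonym pairs from sets of synonymous terms.

    Simplifying assumption: terms are compared token by token rather
    than through a syntactic (head/modifier) analysis.
    """
    pairs = set()
    for terms in synonym_sets:
        for a, b in combinations(terms, 2):
            tokens_a, tokens_b = a.lower().split(), b.lower().split()
            if len(tokens_a) != len(tokens_b):
                continue  # this sketch only aligns terms of equal length
            diff = [(x, y) for x, y in zip(tokens_a, tokens_b) if x != y]
            if len(diff) == 1:
                # Exactly one differing position: propose the two words.
                pairs.add(tuple(sorted(diff[0])))
    return pairs


# Hypothetical sample data, not taken from the UMLS.
sample = [
    ["kidney failure", "renal failure"],
    ["heart attack", "acute myocardial infarction"],
]
inferred = infer_synonyms(sample)
```

In this toy run only the first set yields a pair, because the two terms in the second set have different lengths; a syntactic analysis, as in the abstract, would be needed to align such terms.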
7.
Stud Health Technol Inform ; 264: 1327-1331, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31438141

ABSTRACT

Detecting words that are difficult to understand is a crucial task for ensuring the proper understanding of medical texts such as diagnoses and drug instructions. We propose to combine supervised machine learning algorithms using various features with word embeddings, which contain context information of words. Data in French are manually cross-annotated by seven annotators. On the basis of these data, we propose cross-validation scenarios in order to test the ability of models to generalize when detecting the difficulty of medical words. On data provided by seven annotators, we show that the models generalize from one annotator to another.


Subjects
Algorithms , Comprehension , Language , Natural Language Processing , Supervised Machine Learning
8.
Stud Health Technol Inform ; 136: 809-14, 2008.
Article in English | MEDLINE | ID: mdl-18487831

ABSTRACT

Currently, Natural Language Processing (NLP) approaches are not commonly used to improve search and exploration of electronic health records (EHRs) within healthcare information systems. One reason for this is the lack of suitable lexical resources: various types of such resources need to be collected or acquired. In this work, we propose a novel method for the acquisition of synonymy resources. This method is language-independent and relies on the existence of structured terminologies. It makes it possible to uncover hidden synonymy relations between simple words and terms on the basis of their syntactic analysis and the exploitation of their compositionality. Applied to series of synonymous terms from the French subset of the UMLS, the method shows 99% precision. The overlap between the terms thus inferred and the existing sparse synonym resources is very low.


Subjects
Information Storage and Retrieval , Medical Records Systems, Computerized , Multilingualism , Natural Language Processing , Vocabulary, Controlled , Algorithms , Data Collection , Dictionaries as Topic , France , Knowledge Bases , Unified Medical Language System
9.
Stud Health Technol Inform ; 247: 730-734, 2018.
Article in English | MEDLINE | ID: mdl-29678057

ABSTRACT

Exchanges between diabetic patients on discussion fora make it possible to study their understanding of their disorder, and their behavior and needs when facing health problems. When analyzing these exchanges and behavior, it is necessary to collect information on user profiles. We present an approach combining a lexicon and supervised classifiers for identifying the age and gender of contributors, their disorders and the relation between contributor and patient. Depending on the parameters of the method, precision ranges from 53.48% for disorders to 100% for gender.


Subjects
Data Mining , Diabetes Mellitus , Patients , Humans , Internet , Social Media
10.
CEUR Workshop Proc ; 1609: 28-42, 2016 Sep.
Article in English | MEDLINE | ID: mdl-29308065

ABSTRACT

This paper reports on Task 2 of the 2016 CLEF eHealth evaluation lab, which extended the previous information extraction tasks of the ShARe/CLEF eHealth evaluation labs. The task continued with named entity recognition and normalization in French narratives, as offered in CLEF eHealth 2015. Named entity recognition involved ten types of entities, including disorders, that were defined according to Semantic Groups in the Unified Medical Language System® (UMLS®), which was also used for normalizing the entities. In addition, we introduced a large-scale classification task in French death certificates, which consisted of extracting causes of death as coded in the International Classification of Diseases, tenth revision (ICD10). Participant systems were evaluated against a blind reference standard of 832 titles of scientific articles indexed in MEDLINE, 4 drug monographs published by the European Medicines Agency (EMEA) and 27,850 death certificates using Precision, Recall and F-measure. In total, seven teams participated, including five in the entity recognition and normalization task, and five in the death certificate coding task. Three teams submitted their systems to our newly offered reproducibility track. For entity recognition, the highest performance was achieved on the EMEA corpus, with an overall F-measure of 0.702 for plain entity recognition and 0.529 for normalized entity recognition. For entity normalization, the highest performance was achieved on the MEDLINE corpus, with an overall F-measure of 0.552. For death certificate coding, the highest performance was 0.848 F-measure.
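
As a reminder of how the three measures named in this abstract relate to one another, the following Python sketch computes precision, recall and the balanced F-measure from sets of annotations. It is only an illustration under stated assumptions: the function name, the `(start, end, type)` tuple layout and the sample annotations are hypothetical, and exact-match scoring is assumed here, which may differ from the matching criteria used by the evaluation lab.

```python
def precision_recall_f1(predicted, reference):
    """Compute precision, recall and balanced F-measure over two sets.

    An item counts as correct only if it appears in both sets
    (exact match; an assumption made for this sketch).
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical annotations as (start offset, end offset, entity type).
reference = {(0, 7, "DISO"), (12, 20, "CHEM"), (25, 31, "ANAT"), (40, 48, "DISO")}
predicted = {(0, 7, "DISO"), (12, 20, "CHEM"), (33, 38, "PROC")}

p, r, f = precision_recall_f1(predicted, reference)
```

With these toy sets, two of the three predictions are correct and two of the four reference entities are found, so precision is 2/3, recall is 1/2 and the F-measure is 4/7.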

11.
Stud Health Technol Inform ; 216: 815-20, 2015.
Article in English | MEDLINE | ID: mdl-26262165

ABSTRACT

With the recent and intensive research in the biomedical area, the knowledge accumulated is disseminated through various knowledge bases. Links between these knowledge bases are needed in order to use them jointly. Linked Data, the SPARQL language, and natural language question-answering interfaces provide interesting solutions for querying such knowledge bases. We propose a method for translating natural language questions into SPARQL queries. We use Natural Language Processing tools, semantic resources, and the RDF triples description. The method is developed on 50 questions over 3 biomedical knowledge bases, and evaluated on 27 questions. It achieves 0.78 F-measure on the test set. The method for translating natural language questions into SPARQL queries is implemented as a Perl module available at http://search.cpan.org/ thhamon/RDF-NLP-SPARQLQuery.


Subjects
Information Storage and Retrieval , Natural Language Processing , Databases, Factual , Humans , Information Storage and Retrieval/methods , Semantics
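
To make the general idea of question-to-SPARQL translation in this entry concrete, here is a deliberately small Python sketch. It does not reproduce the authors' method or the interface of their Perl module: the lexicons, the `ex:` prefix, the `http://example.org/` namespace and the function name `question_to_sparql` are all hypothetical, and a simple phrase lookup stands in for the NLP tools and semantic resources mentioned in the abstract.

```python
import re

# Hypothetical lexicon: question phrases mapped to RDF predicates.
PREDICATES = {
    "side effects of": "ex:sideEffect",
    "targets of": "ex:target",
}

# Hypothetical entity dictionary: surface forms mapped to resources.
ENTITIES = {"aspirin": "ex:Aspirin", "ibuprofen": "ex:Ibuprofen"}


def question_to_sparql(question):
    """Translate a simple question into a SPARQL SELECT query string.

    Returns None when no phrase/entity combination is recognized.
    """
    normalized = question.lower().strip(" ?")
    for phrase, predicate in PREDICATES.items():
        match = re.search(re.escape(phrase) + r"\s+(\w+)", normalized)
        if match and match.group(1) in ENTITIES:
            subject = ENTITIES[match.group(1)]
            return (
                "PREFIX ex: <http://example.org/>\n"
                f"SELECT ?answer WHERE {{ {subject} {predicate} ?answer . }}"
            )
    return None


query = question_to_sparql("What are the side effects of aspirin?")
```

The recognized phrase selects the predicate and the recognized entity fills the subject of a single triple pattern; questions outside the two lexicons are left untranslated rather than guessed at.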
12.
Stud Health Technol Inform ; 210: 80-4, 2015.
Article in English | MEDLINE | ID: mdl-25991106

ABSTRACT

While patients can freely access their Electronic Health Records or online health information, they may not be able to correctly understand the content of these documents. One of the challenges is related to the difference between expert and non-expert languages. We propose to investigate this issue within the Information Retrieval (IR) field. The patient queries have to be associated with the corresponding expert documents that provide trustworthy information. Our approach relies on a state-of-the-art IR system called Indri and on semantic resources. Different query expansion strategies are explored. Our system shows up to 0.6740 P@10, up to 0.7610 R@10, and up to 0.6793 NDCG@10.


Subjects
Consumer Health Information/organization & administration , Data Mining/methods , Electronic Health Records/organization & administration , Health Information Systems/organization & administration , Natural Language Processing , User-Computer Interface , Machine Learning , Patient Access to Records
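
The three ranking measures reported in this entry (P@10, R@10 and NDCG@10) can be sketched in Python as follows. This is an illustrative sketch, not the evaluation tooling behind the reported figures: the function names and the sample ranking are hypothetical, and NDCG is computed with the common linear-gain, log2-discount formulation, which is only one of several variants in use.

```python
import math


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def recall_at_k(ranked, relevant, k=10):
    """Fraction of all relevant documents found in the top-k results."""
    if not relevant:
        return 0.0
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)


def ndcg_at_k(ranked, gains, k=10):
    """Normalized discounted cumulative gain over the top-k results.

    `gains` maps each judged document to a graded relevance value;
    unjudged documents count as 0.
    """
    dcg = sum(gains.get(doc, 0) / math.log2(i + 2)
              for i, doc in enumerate(ranked[:k]))
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(gain / math.log2(i + 2) for i, gain in enumerate(ideal))
    return dcg / idcg if idcg else 0.0


# Hypothetical judgments and system ranking, evaluated at k=4.
gains = {"d1": 2, "d2": 1, "d5": 1}
ranking = ["d3", "d1", "d7", "d2"]

p_at_4 = precision_at_k(ranking, set(gains), k=4)
r_at_4 = recall_at_k(ranking, set(gains), k=4)
ndcg_at_4 = ndcg_at_k(ranking, gains, k=4)
```

In this toy example two of the four returned documents are relevant and two of the three relevant documents are retrieved, so P@4 is 0.5 and R@4 is 2/3; NDCG@4 falls below 1 because the most relevant document is not ranked first and one relevant document is missing.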
13.
J Biomed Semantics ; 5: 18, 2014.
Article in English | MEDLINE | ID: mdl-24739596

ABSTRACT

Pharmacovigilance is the activity related to the collection, analysis and prevention of adverse drug reactions (ADRs). This activity is usually performed within dedicated databases (national, European, international...), in which the ADRs declared for patients are usually coded with a specific controlled terminology, MedDRA (Medical Dictionary for Drug Regulatory Activities). Traditionally, the detection of adverse drug reactions is performed with data mining algorithms, while more recently the groupings of close ADR terms are also being exploited. The Standardized MedDRA Queries (SMQs) have become a standard in pharmacovigilance. They are created manually by international boards of experts with the objective of grouping together the MedDRA terms related to a given safety topic. Within MedDRA version 13, 84 SMQs exist, although several important safety topics are not yet covered. The objective of our work is to propose an automatic method for assisting the creation of SMQs using the clustering of semantically close MedDRA terms. The method we experimented with relies on semantic approaches: semantic distance and similarity algorithms, terminology structuring methods and term clustering. The results obtained indicate that the proposed unsupervised methods appear to be complementary for this task: they can generate subsets of the existing SMQs and make this process systematic and less time-consuming.
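
One family of semantic distance measures usable over a hierarchical terminology is path-based: two terms are considered close when few edges separate them through their lowest common ancestor. The Python sketch below illustrates that general idea only; it is not the algorithm evaluated in the paper, and the toy hierarchy, term names and function names are hypothetical and do not reflect actual MedDRA content.

```python
# Hypothetical toy hierarchy (child -> parent); not actual MedDRA content.
PARENT = {
    "hepatitis": "liver disorder",
    "hepatic failure": "liver disorder",
    "liver disorder": "disorder",
    "renal failure": "kidney disorder",
    "kidney disorder": "disorder",
}


def ancestors(term):
    """Return the path from a term up to the root, the term included."""
    path = [term]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def path_distance(a, b):
    """Count the edges between two terms via their lowest common ancestor."""
    up_a, up_b = ancestors(a), ancestors(b)
    depth_in_b = {term: depth for depth, term in enumerate(up_b)}
    for depth_a, term in enumerate(up_a):
        if term in depth_in_b:
            return depth_a + depth_in_b[term]
    raise ValueError("terms do not share an ancestor")


def path_similarity(a, b):
    """Map the edge count to a similarity in (0, 1]; 1 means identical."""
    return 1.0 / (1.0 + path_distance(a, b))


close = path_similarity("hepatitis", "hepatic failure")
distant = path_similarity("hepatitis", "renal failure")
```

Here the two liver terms are two edges apart while the liver and kidney terms are four edges apart, so the first pair receives the higher similarity; a clustering step could then group terms whose pairwise similarity exceeds a chosen threshold.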

14.
Patient Educ Couns ; 92(2): 197-204, 2013 Aug.
Article in English | MEDLINE | ID: mdl-23769423

ABSTRACT

OBJECTIVE: To automatically analyze online discussions related to diabetes and extract information on patient skills for managing this disease. METHODS: Two collections of about 7000 and 23,000 messages from online discussion fora and 174 skills from an available taxonomy are processed with Natural Language Processing methods and semantically enriched. Skills are projected onto the messages to detect those skills that are mentioned by patients. Quantitative and qualitative evaluation is performed. RESULTS: The method recognizes almost all the targeted skills in fora. The quality of skill recognition varies with the method's parameters. Most of the selected messages are relevant to at least one of the associated skills. Manual analysis shows that a substantial number of messages are dedicated to daily self-care and psychosocial skills. CONCLUSION: The study of real exchanges between patients leads to a better understanding of their skills in daily self-management of diabetes. PRACTICE IMPLICATIONS: Our experiments can be useful for a better understanding and better knowledge of self-management of diseases by patients. They can also help refine existing patient education programs.


Subjects
Diabetes Mellitus/therapy , Electronic Mail , Internet , Natural Language Processing , Disease Management , Female , Health Knowledge, Attitudes, Practice , Humans , Male , Self Care
15.
Stud Health Technol Inform ; 192: 1189, 2013.
Article in English | MEDLINE | ID: mdl-23920963

ABSTRACT

Extraction of information related to the medication is an important task within the biomedical area. Our method is applied to different types of documents in three languages. The results indicate that our approach can efficiently update and enrich the existing drug vocabularies.


Subjects
Artificial Intelligence , Databases, Pharmaceutical/classification , Drug Labeling/classification , Natural Language Processing , Pharmaceutical Preparations/classification , Terminology as Topic , Vocabulary, Controlled , Algorithms , Data Mining/methods , England , France , Pattern Recognition, Automated/methods , Semantics , Sweden , Translating
16.
J Am Med Inform Assoc ; 20(5): 820-7, 2013.
Article in English | MEDLINE | ID: mdl-23571851

ABSTRACT

OBJECTIVE: To identify the temporal relations between clinical events and temporal expressions in clinical reports, as defined in the i2b2/VA 2012 challenge. DESIGN: To detect clinical events, we used rules and Conditional Random Fields. We built Random Forest models to identify event modality and polarity. To identify temporal expressions we built on the HeidelTime system. To detect temporal relations, we systematically studied their breakdown into distinct situations; we designed an oracle method to determine the most prominent situations and the most suitable associated classifiers, and combined their results. RESULTS: We achieved F-measures of 0.8307 for event identification, based on rules, and 0.8385 for temporal expression identification. In the temporal relation task, we identified nine main situations in three groups, experimentally confirming shared intuitions: within-sentence relations, section-related time, and across-sentence relations. Logistic regression and Naïve Bayes performed best on the first and third groups, and decision trees on the second. We reached a 0.6231 global F-measure, improving on our official submission by 7.5 points. CONCLUSIONS: Carefully hand-crafted rules obtained good results for the detection of events and temporal expressions, while a combination of classifiers improved temporal link prediction. The characterization of the oracle recall of situations allowed us to point to directions where further work would be most useful for temporal relation detection: within-sentence relations and linking History of Present Illness events to the admission date. We suggest that the systematic situation breakdown proposed in this paper could also help improve other systems addressing this task.


Subjects
Electronic Health Records , Information Storage and Retrieval/methods , Natural Language Processing , Artificial Intelligence , Humans , Time
17.
Biomed Inform Insights ; 6(Suppl 1): 51-62, 2013.
Article in English | MEDLINE | ID: mdl-24052691

ABSTRACT

Medical entity recognition is currently generally performed by data-driven methods based on supervised machine learning. Expert-based systems, where linguistic and domain expertise are directly provided to the system, are often combined with data-driven systems. We present here a case study where an existing expert-based medical entity recognition system, Ogmios, is combined with a data-driven system, Caramba, based on a linear-chain Conditional Random Field (CRF) classifier. Our case study specifically highlights the risk of overfitting incurred by an expert-based system. We observe that it prevents the combination of the 2 systems from obtaining improvements in precision, recall, or F-measure, and analyze the underlying mechanisms through a post-hoc feature-level analysis. Wrapping the expert-based system alone as attributes input to a CRF classifier does boost its F-measure from 0.603 to 0.710, bringing it on par with the data-driven system. The generalization of this method remains to be further investigated.

18.
J Am Med Inform Assoc ; 17(5): 549-54, 2010.
Article in English | MEDLINE | ID: mdl-20819862

ABSTRACT

BACKGROUND: Pharmacotherapy is an integral part of any medical care process and plays an important role in the medical history of most patients. Information on medication is crucial for several tasks such as pharmacovigilance, medical decision-making or biomedical research. OBJECTIVES: Within a narrative text, medication-related information can be buried within other non-relevant data. Specific methods, such as those provided by text mining, must be designed for accessing it, and this is the objective of this study. METHODS: The authors designed a system for analyzing narrative clinical documents to extract from them medication occurrences and medication-related information. The system also attempts to deduce medications not covered by the dictionaries used. RESULTS: Results provided by the system were evaluated within the framework of the I2B2 NLP challenge held in 2009. The system achieved an F-measure of 0.78 and ranked 7th out of 20 participating teams (the highest F-measure was 0.86). The system provided good results for the annotation and extraction of medication names, their frequency, dosage and mode of administration (F-measure over 0.81), while information on duration and reasons was poorly annotated and extracted (F-measure 0.36 and 0.29, respectively). The performance of the system was stable between the training and test sets.


Subjects
Electronic Health Records , Information Storage and Retrieval/methods , Natural Language Processing , Pharmaceutical Preparations , Drug Therapy , Humans , Linguistics , Software Design
19.
AMIA Annu Symp Proc ; 2009: 203-7, 2009 Nov 14.
Article in English | MEDLINE | ID: mdl-20351850

ABSTRACT

The motivation of this work is to study the use of speculation markers within scientific writing: this may be useful for discovering whether these markers are regularly spread across biomedical articles and then for establishing the logical structure of articles. To achieve these objectives, we compute associations between article sections and speculation markers. We use machine learning algorithms to show that there are strong and interesting associations between speculation markers and article structure. For instance, strong markers, which strongly influence the presentation of knowledge, are specific to Results, Discussion and Abstract, while non-strong markers appear with higher regularity within Material and Methods. Our results indicate that speculation is governed by observable usage rules within scientific articles and can help in structuring them.


Subjects
Biomedical Research , Writing , Algorithms , Science
20.
AMIA Annu Symp Proc ; : 252-6, 2008 Nov 06.
Article in English | MEDLINE | ID: mdl-18999042

ABSTRACT

Acquisition and enrichment of lexical resources is acknowledged as an important research topic in the area of computational linguistics. While such resources are often missing, specialized domains, such as biomedicine, offer several structured terminologies. In this paper, we propose a high-quality method for exploiting a structured terminology and inferring an elementary synonym lexicon. The method is based on the analysis of the syntactic structure of complex terms. The inferred synonym pairs are then profiled according to different clues endogenously computed within the same terminology. We apply and evaluate the approach on the Gene Ontology biomedical terminology.


Subjects
Information Storage and Retrieval/methods , Semantics , Vocabulary, Controlled , Computational Biology/methods , Databases as Topic , Gene Expression Profiling/methods , Genomics , Information Management , Microarray Analysis/methods , Molecular Biology/methods