Results 1 - 3 of 3
1.
BMC Med Inform Decis Mak ; 19(Suppl 3): 79, 2019 04 04.
Article in English | MEDLINE | ID: mdl-30943954

ABSTRACT

BACKGROUND: Twitter messages (tweets) cover a wide range of everyday topics, including health-related ones. Analyzing health-related tweets can help us understand the health conditions and concerns people encounter in daily life. In this paper we evaluate an approach to extracting causalities from tweets using natural language processing (NLP) techniques. METHODS: Lexico-syntactic patterns based on dependency parser outputs are used for causality extraction. We focused on three health-related topics: "stress", "insomnia", and "headache." A large dataset consisting of 24 million tweets is used. RESULTS: The results show that the proposed approach achieved an average precision between 74.59% and 92.27% in comparison with human annotations. CONCLUSIONS: Manual analysis of the extracted causalities reveals interesting findings about how Twitter users express themselves on health-related topics.


Subjects
Causality , Information Storage and Retrieval , Natural Language Processing , Text Messaging , Datasets as Topic , Headache , Humans , Sleep Initiation and Maintenance Disorders , Social Media , Psychological Stress
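
The abstract above names lexico-syntactic patterns over dependency parser output but does not list the patterns themselves. The following is a minimal sketch of the general idea, assuming a single hypothetical pattern ("X <causal verb> Y" with a subject and an object attached to the verb). The `Token` type, the `CAUSAL_VERBS` list, and the hand-written parse are all illustrative assumptions, not the authors' implementation.

```python
from typing import NamedTuple


class Token(NamedTuple):
    idx: int      # 1-based position in the sentence
    text: str     # surface form
    head: int     # index of the syntactic head, 0 for the root
    deprel: str   # dependency relation to the head


# Hypothetical trigger verbs; the abstract does not give the real inventory.
CAUSAL_VERBS = {"cause", "causes", "caused", "trigger", "triggers", "triggered"}


def extract_causalities(tokens):
    """Return (cause, effect) pairs for 'X <causal verb> Y' constructions."""
    pairs = []
    for verb in tokens:
        if verb.text.lower() not in CAUSAL_VERBS:
            continue
        # Dependents of the causal verb: subject = cause, object = effect.
        subjects = [t for t in tokens if t.head == verb.idx and t.deprel == "nsubj"]
        objects = [t for t in tokens if t.head == verb.idx and t.deprel in ("obj", "dobj")]
        for subj in subjects:
            for obj in objects:
                pairs.append((subj.text, obj.text))
    return pairs


# Hand-written parse of "Stress causes insomnia" (illustrative, not parser output).
parse = [
    Token(1, "Stress", 2, "nsubj"),
    Token(2, "causes", 0, "root"),
    Token(3, "insomnia", 2, "obj"),
]
print(extract_causalities(parse))
```

In practice the token list would come from a dependency parser rather than being written by hand, and a realistic pattern set would need to cover many more constructions than this one.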
2.
AMIA Annu Symp Proc ; 2018: 1028-1035, 2018.
Article in English | MEDLINE | ID: mdl-30815146

ABSTRACT

Concept detection is an integral step in natural language processing (NLP) applications in the clinical domain. Clinical concepts are detailed (e.g., "pain in left/right upper/lower arm/leg") and expressed in diverse phrase types (e.g., noun, verb, adjective, or prepositional phrase). There are rich terminological resources in the clinical domain that include many concept synonyms. Even with these resources, concept detection remains challenging due to discontinuous and/or permuted phrase occurrences. To overcome this challenge, we investigated an approach to exploiting syntactic information. Syntactic patterns of concept phrases were mined from continuous, non-permuted forms of synonyms, and these patterns were used to detect discontinuous and/or permuted concept phrases. Experiments on 790 de-identified clinical notes showed that the proposed approach can potentially boost the recall of concept detection. Meanwhile, challenges and limitations were noted. In this paper, we report and discuss our preliminary analysis and findings.


Subjects
Natural Language Processing , Automated Pattern Recognition , Semantics , Unified Medical Language System , Algorithms , Electronic Health Records , Humans
3.
J Am Med Inform Assoc ; 20(6): 1168-77, 2013.
Article in English | MEDLINE | ID: mdl-23907286

ABSTRACT

OBJECTIVE: To develop, evaluate, and share: (1) syntactic parsing guidelines for clinical text, with a new approach to handling ill-formed sentences; and (2) a clinical Treebank annotated according to the guidelines. To document the process and findings for readers with similar interest. METHODS: Using random samples from a shared natural language processing challenge dataset, we developed a handbook of domain-customized syntactic parsing guidelines based on iterative annotation and adjudication between two institutions. Special considerations were incorporated into the guidelines for handling ill-formed sentences, which are common in clinical text. Intra- and inter-annotator agreement rates were used to evaluate consistency in following the guidelines. Quantitative and qualitative properties of the annotated Treebank, as well as its use to retrain a statistical parser, were reported. RESULTS: A supplement to the Penn Treebank II guidelines was developed for annotating clinical sentences. After three iterations of annotation and adjudication on 450 sentences, the annotators reached an F-measure agreement rate of 0.930 (while intra-annotator rate was 0.948) on a final independent set. A total of 1100 sentences from progress notes were annotated that demonstrated domain-specific linguistic features. A statistical parser retrained with combined general English (mainly news text) annotations and our annotations achieved an accuracy of 0.811 (higher than models trained purely with either general or clinical sentences alone). Both the guidelines and syntactic annotations are made available at https://sourceforge.net/projects/medicaltreebank. CONCLUSIONS: We developed guidelines for parsing clinical text and annotated a corpus accordingly. The high intra- and inter-annotator agreement rates showed decent consistency in following the guidelines. The corpus was shown to be useful in retraining a statistical parser that achieved moderate accuracy.


Subjects
Electronic Health Records , Guidelines as Topic , Linguistics , Natural Language Processing
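
The abstract above reports agreement as an F-measure but does not say how it was computed. One common way to compare two syntactic annotations is to treat each tree as a set of labeled constituent spans and score the overlap. The sketch below assumes that convention; the function name and the example spans are hypothetical.

```python
def bracket_f_measure(reference, candidate):
    """F-measure between two sets of (label, start, end) constituent spans."""
    ref, cand = set(reference), set(candidate)
    matched = len(ref & cand)
    if matched == 0:
        return 0.0
    precision = matched / len(cand)   # share of candidate spans also in reference
    recall = matched / len(ref)       # share of reference spans also in candidate
    return 2 * precision * recall / (precision + recall)


# Two hypothetical annotations of the same five-token sentence.
annotator_a = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("NP", 3, 5)}
annotator_b = {("S", 0, 5), ("NP", 0, 2), ("VP", 2, 5), ("PP", 3, 5)}

print(bracket_f_measure(annotator_a, annotator_b))
```

Here three of the four spans match on both sides, so precision, recall, and the F-measure all equal 0.75. Because the score is symmetric in precision and recall, swapping which annotator is treated as the reference does not change it.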