Results 1 - 20 of 47
1.
Stud Health Technol Inform ; 290: 632-636, 2022 Jun 06.
Article in English | MEDLINE | ID: mdl-35673093

ABSTRACT

Tools to automate the summarization of nursing entries in electronic health records (EHRs) have the potential to help healthcare professionals obtain a rapid overview of a patient's situation when time is limited. This study explores a keyword-based text summarization method for nursing text that is based on machine learning model explainability for text classification models. This study aims to extract keywords and phrases that provide an intuitive overview of the content in multiple nursing entries in EHRs written during individual patients' care episodes. The proposed keyword extraction method is used to generate keyword summaries from 40 patients' care episodes, and its performance is compared to a baseline method based on word embeddings combined with the PageRank method. The two methods were assessed with manual evaluation by three domain experts. The results indicate that it is possible to generate representative keyword summaries from nursing entries in EHRs and that our method outperformed the baseline method.


Subjects
Electronic Health Records , Episode of Care , Humans , Machine Learning , Natural Language Processing , Research Design , Writing
2.
Stud Health Technol Inform ; 294: 854-858, 2022 May 25.
Article in English | MEDLINE | ID: mdl-35612225

ABSTRACT

In health sciences, high-quality text embeddings may augment qualitative data analysis of large amounts of text by enabling, e.g., searching and clustering of health information. This study aimed to evaluate three different sentence-level embedding methods in clustering sentences in nursing narratives from individual patients' hospital care episodes. Two of these embeddings are generated from language models based on the BERT framework, and the third on the Sent2Vec method. These embedding methods were used to cluster sentences from 20 patient care episodes and the results were manually evaluated. Findings suggest that the best clusters were produced by the embeddings from a BERT model fine-tuned for the proxy task of predicting subject headings for nursing text.


Subjects
Language , Natural Language Processing , Cluster Analysis , Humans , Unified Medical Language System
3.
IEEE/ACM Trans Comput Biol Bioinform ; 19(3): 1772-1781, 2022.
Article in English | MEDLINE | ID: mdl-33306472

ABSTRACT

Over the past decade, the demand for automated protein function prediction has increased due to the volume of newly sequenced proteins. In this paper, we address the function prediction task by developing an ensemble system that automatically assigns Gene Ontology (GO) terms to a given input protein sequence. The ensemble combines the GO predictions made by random forest (RF) and neural network (NN) classifiers. Both RF and NN models rely on features derived from BLAST sequence alignments, taxonomy and protein signature analysis tools. In addition, we report on experiments with an NN model that directly analyzes the amino acid sequence as its sole input, using a convolutional layer. The Swiss-Prot database is used as the training and evaluation data. In the CAFA3 evaluation, which relies on experimental verification of the functional predictions, our submitted ensemble model demonstrates competitive performance, ranking among the top 10 best-performing systems out of over 100 submitted systems. In this paper, we evaluate and further improve the CAFA3-submitted system. Our machine learning models, together with the data pre-processing and feature generation tools, are publicly available as open source software at https://github.com/TurkuNLP/CAFA3.


Subjects
Computer Neural Networks , Proteins , Protein Databases , Proteins/chemistry , Sequence Alignment , Software
4.
J Adv Nurs ; 77(9): 3707-3717, 2021 Sep.
Article in English | MEDLINE | ID: mdl-34003504

ABSTRACT

AIM: To develop a consensus paper on the central points of an international invitational think-tank on nursing and artificial intelligence (AI). METHODS: We established the Nursing and Artificial Intelligence Leadership (NAIL) Collaborative, comprising interdisciplinary experts in AI development, biomedical ethics, AI in primary care, AI legal aspects, philosophy of AI in health, nursing practice, implementation science, leaders in health informatics practice and international health informatics groups, a representative of patients and the public, and the Chair of the ITU/WHO Focus Group on Artificial Intelligence for Health. The NAIL Collaborative convened at a 3-day invitational think-tank in autumn 2019. Activities included a pre-event survey, expert presentations and working sessions to identify priority areas for action, opportunities and recommendations to address these. In this paper, we summarize the key discussion points and notes from the aforementioned activities. IMPLICATIONS FOR NURSING: Nursing's limited current engagement with discourses on AI and health poses a risk that the profession is not part of the conversations that have potentially significant impacts on nursing practice. CONCLUSION: There are numerous gaps and a timely need for the nursing profession to be among the leaders and drivers of conversations around AI in health systems. IMPACT: We outline crucial gaps where focused effort is required for nursing to take a leadership role in shaping AI use in health systems. Three priorities were identified that need to be addressed in the near future: (a) Nurses must understand the relationship between the data they collect and AI technologies they use; (b) Nurses need to be meaningfully involved in all stages of AI, from development to implementation; and (c) There is substantial untapped and unexplored potential for nursing to contribute to the development of AI technologies for global health and humanitarian efforts.


Subjects
Artificial Intelligence , Leadership , Humans , Technology
5.
J Biomed Semantics ; 11(1): 10, 2020 09 01.
Article in English | MEDLINE | ID: mdl-32873340

ABSTRACT

BACKGROUND: Up to 35% of nurses' working time is spent on care documentation. We describe the evaluation of a system aimed at assisting nurses in documenting patient care and potentially reducing the documentation workload. Our goal is to enable nurses to write or dictate nursing notes in a narrative manner without having to manually structure their text under subject headings. In the current care classification standard used in the targeted hospital, there are more than 500 subject headings to choose from, making it challenging and time-consuming for nurses to use. METHODS: The task of the presented system is to automatically group sentences into paragraphs and assign subject headings. For classification, the system relies on a neural network-based text classification model. The nursing notes are initially classified at the sentence level. Subsequently, coherent paragraphs are constructed from related sentences. RESULTS: Based on a manual evaluation conducted by a group of three domain experts, we find that in about 69% of the paragraphs formed by the system the topics of the sentences are coherent and the assigned paragraph headings correctly describe the topics. We also show that the use of a paragraph merging step reduces the number of paragraphs produced by 23% without affecting the performance of the system. CONCLUSIONS: The study shows that the presented system produces a coherent and logical structure for freely written nursing narratives and has the potential to reduce the time and effort nurses are currently spending on documenting care in hospitals.


Subjects
Documentation , Nurses , Automation , Hospitals , Language , Subject Headings
6.
J Am Med Inform Assoc ; 27(1): 81-88, 2020 01 01.
Article in English | MEDLINE | ID: mdl-31605490

ABSTRACT

OBJECTIVE: This study focuses on the task of automatically assigning standardized (topical) subject headings to free-text sentences in clinical nursing notes. The underlying motivation is to support nurses when they document patient care by developing a computer system that can assist in incorporating suitable subject headings that reflect the documented topics. Central in this study is performance evaluation of several text classification methods to assess the feasibility of developing such a system. MATERIALS AND METHODS: Seven text classification methods are evaluated using a corpus of approximately 0.5 million nursing notes (5.5 million sentences) with 676 unique headings extracted from a Finnish university hospital. Several of these methods are based on artificial neural networks. Evaluation is first done in an automatic manner for all methods, then a manual error analysis is done on a sample. RESULTS: We find that a method based on a bidirectional long short-term memory network performs best with an average recall of 0.5435 when allowed to suggest 1 subject heading per sentence and 0.8954 when allowed to suggest 10 subject headings per sentence. However, other methods achieve comparable results. The manual analysis indicates that the predictions are better than what the automatic evaluation suggests. CONCLUSIONS: The results indicate that several of the tested methods perform well in suggesting the most appropriate subject headings on sentence level. Thus, we find it feasible to develop a text classification system that can support the use of standardized terminologies and save nurses time and effort on care documentation.
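
The recall figures in this abstract (0.5435 with 1 suggested heading per sentence, 0.8954 with 10) describe a top-k evaluation. The following is a minimal sketch of such a measure, not the authors' evaluation code. It assumes each sentence has exactly one gold heading, so recall at k reduces to the share of sentences whose gold heading appears among the first k suggestions; the abstract does not state how the average was computed. The function name `recall_at_k` and the toy headings are hypothetical.

```python
from typing import Sequence


def recall_at_k(ranked: Sequence[Sequence[str]], gold: Sequence[str], k: int) -> float:
    """Share of sentences whose gold heading is among the top-k suggestions.

    Assumption: one gold heading per sentence (not confirmed by the abstract).
    """
    if not gold or len(ranked) != len(gold):
        raise ValueError("ranked and gold must be non-empty and of equal length")
    hits = sum(1 for suggestions, g in zip(ranked, gold) if g in suggestions[:k])
    return hits / len(gold)


# Hypothetical toy data: one ranked suggestion list per sentence.
ranked = [
    ["Pain", "Medication", "Nutrition"],
    ["Breathing", "Pain", "Sleep"],
    ["Nutrition", "Sleep", "Pain"],
    ["Medication", "Breathing", "Sleep"],
]
gold = ["Pain", "Pain", "Pain", "Nutrition"]

for k in (1, 2, 3):
    print(k, recall_at_k(ranked, gold, k))
```

Allowing more suggestions can only keep or raise this score, which is consistent with the gap between the 1-heading and 10-heading results reported above.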


Subjects
Abstracting and Indexing/methods , Natural Language Processing , Nursing Records , Standardized Nursing Terminology , Subject Headings , Electronic Health Records , Finland
7.
Stud Health Technol Inform ; 264: 1550-1551, 2019 Aug 21.
Article in English | MEDLINE | ID: mdl-31438226

ABSTRACT

We report on the pilot evaluation of an experimental query-based search functionality that enables phrase-level query rewriting in an unsupervised way. It is intended to support search in clinical text. Qualitative evaluation is done by three clinicians using a prototype search tool. They report that they find the tested search functionality to be beneficial for making query-based searching in clinical text more efficient.


Subjects
Natural Language Processing , Search Engine , Writing
8.
Health Inf Manag ; 48(3): 144-151, 2019 Sep.
Article in English | MEDLINE | ID: mdl-30554532

ABSTRACT

BACKGROUND: The potential for the secondary use of electronic health records (EHRs) is underused due to restrictions in national legislation. For privacy purposes, legislative restrictions limit the availability and content of EHR data provided to secondary users. These limitations do not encourage healthcare organisations to develop procedures to promote the secondary use of EHRs. OBJECTIVE: The objective of this study is to identify factors that restrict the secondary use of unstructured EHRs in academic research in Finland and Sweden. METHOD: A study was conducted to identify the availability-restricting issues that pertain to the academic secondary use of unstructured EHRs. Using semi-structured interviews, 14 domain experts in science, hospital management and business were interviewed to evaluate the efficiency of procedures and technologies that are implemented in secondary use processes. RESULTS: The results demonstrate three aspects that restrict the availability of unstructured EHRs for secondary purposes: (i) the management and (ii) privacy preservation of such data as well as (iii) potential secondary users. CONCLUSION: Based on these categories, two approaches for the secondary use of unstructured EHRs are identified: the protected processing environment and altered data. IMPLICATIONS: The protected processing environment ensures patient privacy by providing unstructured EHRs for exclusive user groups that have preferred use intentions. Compared to the use of such processing environments, data alteration enables the secondary use of unstructured EHRs for a larger user group with various use intentions but yields less valuable content.


Subjects
Electronic Health Records , Information Dissemination , Finland , Humans , Information Dissemination/legislation & jurisprudence , Interviews as Topic , Qualitative Research , Sweden
9.
J Clin Nurs ; 28(9-10): 1555-1567, 2019 May.
Article in English | MEDLINE | ID: mdl-30589139

ABSTRACT

AIMS AND OBJECTIVES: To describe and compare the pain process of patients with cardiac surgery through nurses' and physicians' documentation in the electronic patient records. BACKGROUND: Postoperative pain assessment and management should be documented regularly, to ensure an optimal pain care process for patients. Despite the availability of evidence-based guidelines, pain assessment and documentation remain inadequate. DESIGN: A retrospective patient record review. METHODS: The original data consisted of the electronic patient records of 26,922 patients with a diagnosed heart disease. A total of 1,818 care episodes of patients with cardiac surgery were selected from the data. We used random sampling to obtain 280 care episodes for annotation. These 280 care episodes contained 2,156 physician reports and 1,327 days of nursing notes. We developed an annotation manual and schema, and then we manually conducted semantic annotation on care episodes, using the Brat annotation tool. We analysed the annotation units using thematic analysis. The Consolidated Criteria for Reporting Qualitative Research guideline was followed in reporting where appropriate for this study design. RESULTS: We discovered expressions of six different aspects of the pain process: (a) cause, (b) situation, (c) features, (d) consequences, (e) actions and (f) outcomes. We determined that five of the aspects existed chronologically. However, the features of pain existed simultaneously. They indicated the location, quality, intensity, and temporality of the pain, and they were present in every phase of the patient's pain process. Cardiac and postoperative pain documentation differed in the expressions used and in the quantity and quality of descriptions. CONCLUSION: We could construct a comprehensive pain process of the patients with cardiac surgery from several electronic patient records. The challenge remains how to support systematic documentation for each patient. RELEVANCE TO CLINICAL PRACTICE: The study provides knowledge and guidance on pain process aspects that can be used to achieve effective pain assessment and more comprehensive documentation.


Subjects
Cardiac Surgical Procedures/standards , Documentation/standards , Electronic Health Records/standards , Nursing Records/standards , Pain Measurement/standards , Postoperative Pain/diagnosis , Physicians/standards , Adult , Data Accuracy , Female , Humans , Male , Middle Aged , Qualitative Research , Retrospective Studies , Semantics
10.
Database (Oxford) ; 2018, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30576487

ABSTRACT

Biomedical researchers regularly discover new interactions between chemical compounds/drugs and genes/proteins, and report them in research literature. Knowledge about these interactions is crucially important in many research areas such as precision medicine and drug discovery. The BioCreative VI Task 5 (CHEMPROT) challenge promotes the development and evaluation of computer systems that can automatically recognize and extract statements of such interactions from biomedical literature. We participated in this challenge with a Support Vector Machine (SVM) system and a deep learning-based system (ST-ANN), and achieved an F-score of 60.99 for the task. After the shared task, we have significantly improved the performance of the ST-ANN system. Additionally, we have developed a new deep learning-based system (I-ANN) that considerably outperforms the ST-ANN system. Both ST-ANN and I-ANN systems are centered around training an ensemble of artificial neural networks and utilizing different bidirectional Long Short-Term Memory (LSTM) chains for representing the shortest dependency path and/or the full sentence. By combining the predictions of the SVM and the I-ANN systems, we achieved an F-score of 63.10 for the task, improving our previous F-score by 2.11 percentage points. Our systems are fully open-source and publicly available. We highlight that the systems we present in this study are applicable not only to the BioCreative VI Task 5, but can be effortlessly re-trained to extract any type of relation of interest, with no modifications of the source code required, if a manually annotated corpus is provided as training data in a specific file format.


Subjects
Drug Discovery/methods , Computer Neural Networks , Pharmaceutical Preparations , Proteins , Support Vector Machine , Data Mining , Chemical Databases , Protein Databases , Deep Learning , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/metabolism , Protein Binding , Proteins/chemistry , Proteins/metabolism
11.
Database (Oxford) ; 2018: 1-10, 2018 01 01.
Article in English | MEDLINE | ID: mdl-30239666

ABSTRACT

We present a system for automatically identifying a multitude of biomedical entities from the literature. This work is based on our previous efforts in the BioCreative VI: Interactive Bio-ID Assignment shared task, in which our system demonstrated state-of-the-art performance with the highest achieved results in named entity recognition. In this paper we describe the original conditional random field-based system used in the shared task as well as experiments conducted since, including better hyperparameter tuning and character-level modeling, which led to further performance improvements. For normalizing the mentions into unique identifiers we use fuzzy character n-gram matching. The normalization approach has also been improved with a better abbreviation resolution method and stricter guideline compliance, resulting in vastly improved results for various entity types. All tools and models used for both named entity recognition and normalization are publicly available under open license. Database URL: https://github.com/TurkuNLP/BioCreativeVI_BioID_assignment.
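
The abstract names fuzzy character n-gram matching as the normalization step but gives no details. The sketch below illustrates the general idea only and is not the authors' implementation: it assumes padded character trigrams, Jaccard overlap as the similarity measure, and a fixed acceptance threshold, none of which are confirmed by the source. The lexicon entries and identifiers are hypothetical.

```python
from typing import Dict, Optional, Set


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Lower-cased character n-grams, padded so word edges contribute too."""
    pad = " " * (n - 1)
    padded = f"{pad}{text.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def normalize(mention: str, lexicon: Dict[str, str],
              n: int = 3, threshold: float = 0.3) -> Optional[str]:
    """Return the identifier of the best-matching lexicon name, or None.

    The threshold value is an illustrative assumption.
    """
    grams = char_ngrams(mention, n)
    best_id, best_score = None, 0.0
    for name, identifier in lexicon.items():
        score = jaccard(grams, char_ngrams(name, n))
        if score > best_score:
            best_id, best_score = identifier, score
    return best_id if best_score >= threshold else None


# Hypothetical lexicon mapping entity names to made-up identifiers.
lexicon = {"tumor necrosis factor": "ID:0001", "interleukin 6": "ID:0002"}

print(normalize("Tumour necrosis factor", lexicon))  # spelling variant still matches
print(normalize("aspirin", lexicon))                 # no entry is close enough
```

Matching on character n-grams rather than whole tokens tolerates spelling variants and small typos, which is the usual motivation for this kind of approach.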


Subjects
Algorithms , Fuzzy Logic , Molecular Sequence Annotation
12.
Stud Health Technol Inform ; 247: 725-729, 2018.
Article in English | MEDLINE | ID: mdl-29678056

ABSTRACT

We report on the development and evaluation of a prototype tool aimed at assisting laymen/patients in understanding the content of clinical narratives. The tool relies largely on unsupervised machine learning applied to two large corpora of unlabeled text: a clinical corpus and a general domain corpus. A joint semantic word-space model is created for the purpose of extracting easier-to-understand alternatives for words considered difficult to understand by laymen. Two domain experts evaluate the tool and inter-rater agreement is calculated. When having the tool suggest ten alternatives to each difficult word, it suggests acceptable lay words for 55.51% of them. This and future manual evaluation will serve to further improve performance, where supervised machine learning will also be used.


Subjects
Comprehension , Narration , Natural Language Processing , Semantics , Humans , Supervised Machine Learning , Unsupervised Machine Learning
13.
Comput Inform Nurs ; 36(9): 448-457, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29652677

ABSTRACT

Written patient education materials are essential to motivate and help patients to participate in their own care, but the production and management of a large collection of high-quality and easily accessible patient education documents can be challenging. Ontologies can aid in these tasks, but the existing resources are not directly applicable to patient education. An ontology that models patient education documents and their readers was constructed. The Delphi method was used to identify a compact but sufficient set of entities with which the topics of documents may be described. The preferred terms of the entities were also considered to ensure their understandability. In the ontology, readers may be characterized by gender, age group, language, and role (patient or professional), whereas documents may be characterized by audience, topic(s), and content, as well as the time and place of use. The Delphi method yielded 265 unique document topics that are organized into seven hierarchies. Advantages and disadvantages of the ontology design, as well as possibilities for improvements, were identified. The patient education material ontology can enhance many applications, but further development is needed to reach its full potential.


Subjects
Delphi Technique , Nurse-Patient Relations , Patient Education as Topic/methods , Adult , Female , Humans , Male , Middle Aged , Young Adult
14.
Chemosphere ; 185: 1063-1071, 2017 Oct.
Article in English | MEDLINE | ID: mdl-28764102

ABSTRACT

We propose a cost-effective system for the determination of metal ion concentration in water, addressing a central issue in water resources management. The system combines novel luminometric label array technology with a machine learning algorithm that selects a minimal number of array reagents (modulators) and liquid sample dilutions that enable accurate quantification. The algorithm is able to identify the optimal modulators and sample dilutions, leading to cost reductions since less manual labour and fewer resources are needed. Inferring the ion detector involves a unique type of structured feature selection problem, which we formalize in this paper. We propose a novel Cartesian greedy forward feature selection algorithm for solving the problem. The novel algorithm was evaluated in the concentration assessment of five metal ions and the performance was compared to two known feature selection approaches. The results demonstrate that the proposed system can assist in lowering the costs with minimal loss in accuracy.


Subjects
Metals/analysis , Chemical Models , Chemical Water Pollutants/analysis , Algorithms , Environmental Monitoring , Water
15.
Genome Biol ; 17(1): 184, 2016 09 07.
Article in English | MEDLINE | ID: mdl-27604469

ABSTRACT

BACKGROUND: A major bottleneck in our understanding of the molecular underpinnings of life is the assignment of function to proteins. While molecular experiments provide the most reliable annotation of proteins, their relatively low throughput and restricted purview have led to an increasing role for computational function prediction. However, assessing methods for protein function prediction and tracking progress in the field remain challenging. RESULTS: We conducted the second critical assessment of functional annotation (CAFA), a timed challenge to assess computational methods that automatically assign protein function. We evaluated 126 methods from 56 research groups for their ability to predict biological functions using Gene Ontology and gene-disease associations using Human Phenotype Ontology on a set of 3681 proteins from 18 species. CAFA2 featured expanded analysis compared with CAFA1, with regards to data set size, variety, and assessment metrics. To review progress in the field, the analysis compared the best methods from CAFA1 to those of CAFA2. CONCLUSIONS: The top-performing methods in CAFA2 outperformed those from CAFA1. This increased accuracy can be attributed to a combination of the growing number of experimental annotations and improved methods for function prediction. The assessment also revealed that the definition of top-performing algorithms is ontology specific, that different performance metrics can be used to probe the nature of accurate predictions, and the relative diversity of predictions in the biological process and human phenotype ontologies. While there was methodological improvement between CAFA1 and CAFA2, the interpretation of results and usefulness of individual methods remain context-dependent.


Subjects
Computational Biology , Proteins/chemistry , Software , Structure-Activity Relationship , Algorithms , Protein Databases , Gene Ontology , Humans , Molecular Sequence Annotation , Proteins/genetics
16.
Artif Intell Med ; 67: 25-37, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26900011

ABSTRACT

OBJECTIVE: A major source of information available in electronic health record (EHR) systems is the clinical free text notes documenting patient care. Managing this information is time-consuming for clinicians. Automatic text summarisation could assist clinicians in obtaining an overview of the free text information in ongoing care episodes, as well as in writing final discharge summaries. We present a study of automated text summarisation of clinical notes. It looks to identify which methods are best suited for this task and whether it is possible to automatically evaluate the quality differences of summaries produced by different methods in an efficient and reliable way. METHODS AND MATERIALS: The study is based on material consisting of 66,884 care episodes from EHRs of heart patients admitted to a university hospital in Finland between 2005 and 2009. We present novel extractive text summarisation methods for summarising the free text content of care episodes. Most of these methods rely on word space models constructed using distributional semantic modelling. The summarisation effectiveness is evaluated using an experimental automatic evaluation approach incorporating well-known ROUGE measures. We also developed a manual evaluation scheme to perform a meta-evaluation on the ROUGE measures to see if they reflect the opinions of health care professionals. RESULTS: The agreement between the human evaluators is good (ICC=0.74, p<0.001), demonstrating the stability of the proposed manual evaluation method. Furthermore, the correlation between the manual and automated evaluations is high (> 0.90 Spearman's rho). Three of the presented summarisation methods ('Composite', 'Case-Based' and 'Translate') significantly outperform the other methods for all ROUGE measures (p<0.05, Wilcoxon signed-rank test and Bonferroni correction). CONCLUSION: The results indicate the feasibility of the automated summarisation of care episodes. Moreover, the high correlation between manual and automated evaluations suggests that the less labour-intensive automated evaluations can be used as a proxy for human evaluations when developing summarisation methods. This is of significant practical value for summarisation method development, because manual evaluation cannot be afforded for every variation of the summarisation methods. Instead, one can resort to automatic evaluation during the method development process.
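
The meta-evaluation in this abstract rests on Spearman's rho between manual and automated scores. As a minimal sketch of how that statistic is obtained (not the authors' code), the example below ranks both score lists, averaging ranks for ties, and then computes the Pearson correlation of the ranks. The score values are hypothetical toy numbers, not data from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-method scores: manual ratings vs. an automated measure.
manual = [3.1, 4.5, 2.0, 4.0, 1.5]
automated = [0.30, 0.41, 0.35, 0.44, 0.20]
print(round(spearman_rho(manual, automated), 3))
```

Because only the ordering of the methods matters, a high rho supports using the automated measure to rank candidate methods even when its absolute values are not comparable to human ratings.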


Subjects
Automation , Electronic Health Records , Finland , Heart Diseases/physiopathology , Heart Diseases/therapy , University Hospitals , Humans
17.
BMC Bioinformatics ; 16 Suppl 16: S3, 2015.
Article in English | MEDLINE | ID: mdl-26551766

ABSTRACT

BACKGROUND: Modern methods for mining biomolecular interactions from literature typically make predictions based solely on the immediate textual context, in effect a single sentence. No prior work has been published on extending this context to the information automatically gathered from the whole biomedical literature. Thus, our motivation for this study is to explore whether mutually supporting evidence, aggregated across several documents, can be utilized to improve the performance of state-of-the-art event extraction systems. RESULTS: In the GE task, our re-ranking approach led to a modest performance increase and resulted in the first rank of the official Shared Task results with 50.97% F-score. Additionally, in this paper we explore and evaluate the usage of distributed vector representations for this challenge. CONCLUSIONS: For the GRN task, we were able to produce a gene regulatory network from the EVEX data, warranting the use of such generic large-scale text mining data in network biology settings. A detailed performance and error analysis provides more insight into the relatively low recall rates.


Subjects
Data Mining , Gene Regulatory Networks , Molecular Sequence Annotation , Natural Language Processing
18.
BMC Bioinformatics ; 16 Suppl 16: S4, 2015.
Article in English | MEDLINE | ID: mdl-26551925

ABSTRACT

BACKGROUND: The Turku Event Extraction System (TEES) is a text mining program developed for the extraction of events, complex biomedical relationships, from scientific literature. Based on a graph-generation approach, the system detects events with the use of a rich feature set built via dependency parsing. The TEES system has achieved record performance in several of the shared tasks of its domain, and continues to be used in a variety of biomedical text mining tasks. RESULTS: The TEES system was quickly adapted to the BioNLP'13 Shared Task in order to provide a public baseline for derived systems. An automated approach was developed for learning the underlying annotation rules of event type, allowing immediate adaptation to the various subtasks, and leading to a first place in four out of eight tasks. The system for the automated learning of annotation rules is further enhanced in this paper to the point of requiring no manual adaptation to any of the BioNLP'13 tasks. Further, the scikit-learn machine learning library is integrated into the system, making a wide variety of machine learning methods usable with TEES in addition to the default SVM. A scikit-learn ensemble method is also used to analyze the importances of the features in the TEES feature sets. CONCLUSIONS: The TEES system was introduced for the BioNLP'09 Shared Task and has since then demonstrated good performance in several other shared tasks. By applying the current TEES 2.2 system to multiple corpora from these past shared tasks, an overarching analysis of the most promising methods and possible pitfalls in the evolving field of biomedical event extraction is presented.


Subjects
Data Mining , Software , Algorithms
19.
BMC Med Inform Decis Mak ; 15 Suppl 2: S2, 2015.
Article in English | MEDLINE | ID: mdl-26099735

ABSTRACT

Patients' health-related information is stored in electronic health records (EHRs) by health service providers. These records include sequential documentation of care episodes in the form of clinical notes. EHRs are used throughout the health care sector by professionals, administrators and patients, primarily for clinical purposes, but also for secondary purposes such as decision support and research. The vast amounts of information in EHR systems complicate information management and increase the risk of information overload. Therefore, clinicians and researchers need new tools to manage the information stored in the EHRs. A common use case is, given a (possibly unfinished) care episode, to retrieve the most similar care episodes among the records. This paper presents several methods for information retrieval, focusing on care episode retrieval, based on textual similarity, where similarity is measured through domain-specific modelling of the distributional semantics of words. Models include variants of random indexing and the semantic neural network model word2vec. Two novel methods are introduced that utilize the ICD-10 codes attached to care episodes to better induce domain-specificity in the semantic model. We report on experimental evaluation of care episode retrieval that circumvents the lack of human judgements regarding episode relevance. Results suggest that several of the methods proposed outperform a state-of-the-art search engine (Lucene) on the retrieval task.


Subjects
Clinical Coding/standards , Clinical Decision Support Systems/organization & administration , Electronic Health Records/organization & administration , Episode of Care , Health Information Management/organization & administration , Information Storage and Retrieval/methods , Algorithms , Clinical Coding/methods , Health Information Management/methods , Humans , International Classification of Diseases , Theoretical Models , Semantics