Results 1-19 of 19
1.
J Biomed Inform; 75S: S120-S128, 2017 Nov.
Article in English | MEDLINE | ID: mdl-28694118

ABSTRACT

OBJECTIVE: Our objective was to develop a machine learning-based system to determine the severity of Positive Valence symptoms for a patient, based on information included in their initial psychiatric evaluation. Severity was rated by experts on an ordinal scale of 0-3 as follows: 0 (absent=no symptoms), 1 (mild=modest significance), 2 (moderate=requires treatment), 3 (severe=causes substantial impairment). MATERIALS AND METHODS: We treated the task of assigning Positive Valence severity as a text classification problem. During development, we experimented with regularized multinomial logistic regression classifiers, gradient boosted trees, and feedforward, fully-connected neural networks. We found both regularization and feature selection via mutual information to be very important in preventing models from overfitting the data. Our best configuration was a neural network with three fully connected hidden layers with rectified linear unit activations. RESULTS: Our best performing system achieved a score of 77.86%. The evaluation metric is an inverse normalization of the Mean Absolute Error presented as a percentage number between 0 and 100, where 100 means the highest performance. Error analysis showed that 90% of the system errors involved neighboring severity categories. CONCLUSION: Machine learning text classification techniques with feature selection can be trained to recognize broad differences in Positive Valence symptom severity with a modest amount of training data (in this case 600 documents, 167 of which were unannotated). An increase in the amount of annotated data can increase accuracy of symptom severity classification by several percentage points. Additional features and/or a larger training corpus may further improve accuracy.
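As a rough illustration of the scoring metric this abstract describes, the sketch below assumes the inverse normalization divides the Mean Absolute Error by the worst possible absolute error on the 0-3 scale (i.e., 3); the shared task's official normalization may differ in detail.

```python
def normalized_mae_score(gold, predicted, max_severity=3):
    """Inverse-normalized Mean Absolute Error as a 0-100 percentage.

    Assumes the worst possible absolute error on a 0-3 ordinal scale
    is max_severity; the task's exact normalization may differ.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have equal length")
    mae = sum(abs(g - p) for g, p in zip(gold, predicted)) / len(gold)
    return (1 - mae / max_severity) * 100

# Perfect predictions score 100; off-by-one everywhere scores ~66.7,
# consistent with "90% of errors involved neighboring categories"
# still yielding a respectable score.
print(normalized_mae_score([0, 1, 2, 3], [0, 1, 2, 3]))           # 100.0
print(round(normalized_mae_score([0, 1, 2, 3], [1, 2, 3, 2]), 1))  # 66.7
```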


Subject(s)
Automation; Neural Networks, Computer; Humans; Machine Learning
2.
J Biomed Inform; 61: 97-109, 2016 06.
Article in English | MEDLINE | ID: mdl-27020263

ABSTRACT

OBJECTIVE: Electronic medical records (EMRs) are increasingly repurposed for activities beyond clinical care, such as to support translational research and public policy analysis. To mitigate privacy risks, healthcare organizations (HCOs) aim to remove potentially identifying patient information. A substantial quantity of EMR data is in natural language form and there are concerns that automated tools for detecting identifiers are imperfect and leak information that can be exploited by ill-intentioned data recipients. Thus, HCOs have been encouraged to invest as much effort as possible to find and detect potential identifiers, but such a strategy assumes the recipients are sufficiently incentivized and capable of exploiting leaked identifiers. In practice, such an assumption may not hold true and HCOs may overinvest in de-identification technology. The goal of this study is to design a natural language de-identification framework, rooted in game theory, which enables an HCO to optimize their investments given the expected capabilities of an adversarial recipient. METHODS: We introduce a Stackelberg game to balance risk and utility in natural language de-identification. This game represents a cost-benefit model that enables an HCO with a fixed budget to minimize their investment in the de-identification process. We evaluate this model by assessing the overall payoff to the HCO and the adversary using 2100 clinical notes from Vanderbilt University Medical Center. We simulate several policy alternatives using a range of parameters, including the cost of training a de-identification model and the loss in data utility due to the removal of terms that are not identifiers. In addition, we compare policy options where, when an attacker is fined for misuse, a monetary penalty is paid to the publishing HCO as opposed to a third party (e.g., a federal regulator). 
RESULTS: Our results show that when an HCO is forced to exhaust a limited budget (set to $2000 in the study), the precision and recall of the de-identification of the HCO are 0.86 and 0.8, respectively. A game-based approach enables a more refined cost-benefit tradeoff, improving both privacy and utility for the HCO. For example, our investigation shows that it is possible for an HCO to release the data without spending all their budget on de-identification and still deter the attacker, with a precision of 0.77 and a recall of 0.61 for the de-identification. There also exist scenarios in which the model indicates an HCO should not release any data because the risk is too great. In addition, we find that the practice of paying fines back to an HCO (an artifact of suing for breach of contract), as opposed to a third party such as a federal regulator, can induce an elevated level of data sharing risk, where the HCO is incentivized to bait the attacker to elicit compensation. CONCLUSIONS: A game theoretic framework can lead HCOs to optimized decision making for natural language de-identification investments before sharing EMR data.
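A toy sketch of the Stackelberg structure described here, with invented numbers (not the paper's model or parameters): the HCO (leader) commits to a de-identification budget, the attacker (follower) best-responds by attacking only when its expected gain exceeds its cost, and the leader picks the budget that maximizes its payoff given that response. The sketch reproduces the paper's qualitative finding that an HCO can deter the attacker without exhausting its budget.

```python
def leak_prob(budget):
    """Hypothetical residual-leak probability: more de-identification
    spending, fewer leaked identifiers."""
    return max(0.0, 0.5 - budget / 4000)

def attacker_attacks(budget, gain=5000.0, attack_cost=900.0):
    """Follower's best response: attack only if expected gain beats cost."""
    return leak_prob(budget) * gain > attack_cost

def hco_payoff(budget, utility=3000.0, breach_loss=10000.0):
    """Leader's payoff: data utility minus spending minus expected breach loss."""
    loss = leak_prob(budget) * breach_loss if attacker_attacks(budget) else 0.0
    return utility - budget - loss

# The leader anticipates the follower's response and picks the best budget;
# here deterrence is reached well below the $2000 ceiling.
best_budget = max(range(0, 2001, 100), key=hco_payoff)
```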


Subject(s)
Confidentiality; Electronic Health Records; Natural Language Processing; Humans; Language; Risk
3.
J Law Med; 23(3): 595-608, 2016 Mar.
Article in English | MEDLINE | ID: mdl-27323637

ABSTRACT

For three decades, Australian coroners have been moving steadily away from an historical partnership with the criminal law, and have emerged as independent judicial investigators with a dedicated court, and forensic and administrative support structures. Occasionally, however, a situation may arise where the ghosts of coronial law's quasi-criminal past threaten to reappear, to the detriment of the coronial function. One of these situations might develop following an acquittal on a criminal charge which involved the causing of a death. Issues pertaining to a coroner's duty may remain unresolved following criminal proceedings; and the question has to be posed as to whether the result of the previous criminal prosecution restricts or confines, in any way, the scope of permissible findings by a coroner in a later inquest. This article attempts to address this question in the particular context of Queensland law and its historical antecedents--many of which are common to other jurisdictions--in the hope that it might provoke consideration of the underlying practical and theoretical issues to the future benefit of coronial legal theory.


Subject(s)
Coroners and Medical Examiners/legislation & jurisprudence; Australia; Criminal Law; Humans
4.
J Biomed Inform; 58 Suppl: S189-S196, 2015 Dec.
Article in English | MEDLINE | ID: mdl-26210361

ABSTRACT

OBJECTIVE: In recognition of potential barriers that may inhibit the widespread adoption of biomedical software, the 2014 i2b2 Challenge introduced a special track, Track 3 - Software Usability Assessment, in order to develop a better understanding of the adoption issues that might be associated with the state-of-the-art clinical NLP systems. This paper reports the ease of adoption assessment methods we developed for this track, and the results of evaluating five clinical NLP system submissions. MATERIALS AND METHODS: A team of human evaluators performed a series of scripted adoptability test tasks with each of the participating systems. The evaluation team consisted of four "expert evaluators" with training in computer science, and eight "end user evaluators" with mixed backgrounds in medicine, nursing, pharmacy, and health informatics. We assessed how easy it is to adopt the submitted systems along the following three dimensions: communication effectiveness (i.e., how effective a system is in communicating its designed objectives to intended audience), effort required to install, and effort required to use. We used a formal software usability testing tool, TURF, to record the evaluators' interactions with the systems and 'think-aloud' data revealing their thought processes when installing and using the systems and when resolving unexpected issues. RESULTS: Overall, the ease of adoption ratings that the five systems received are unsatisfactory. Installation of some of the systems proved to be rather difficult, and some systems failed to adequately communicate their designed objectives to intended adopters. Further, the average ratings provided by the end user evaluators on ease of use and ease of interpreting output are -0.35 and -0.53, respectively, indicating that this group of users generally deemed the systems extremely difficult to work with. 
While the ratings provided by the expert evaluators are higher, 0.6 and 0.45, respectively, these ratings are still low, indicating that they also experienced considerable struggles. DISCUSSION: The results of the Track 3 evaluation show that the adoptability of the five participating clinical NLP systems has substantial room for improvement. Remedy strategies suggested by the evaluators included (1) more detailed and operating-system-specific usage instructions; (2) provision of more pertinent onscreen feedback for easier diagnosis of problems; (3) including screen walk-throughs in usage instructions so users know what to expect and what might have gone wrong; (4) avoiding jargon and acronyms in materials intended for end users; and (5) packaging prerequisites required within software distributions so that prospective adopters of the software do not have to obtain each of the third-party components on their own.


Subject(s)
Attitude to Computers; Data Mining/statistics & numerical data; Electronic Health Records/statistics & numerical data; Natural Language Processing; Pattern Recognition, Automated/methods; Software; Data Mining/methods; Humans; Middle Aged; User-Computer Interface
5.
Front Artif Intell; 4: 670009, 2021.
Article in English | MEDLINE | ID: mdl-34738081

ABSTRACT

There is a growing desire to create computer systems that can collaborate with humans on complex, open-ended activities. These activities typically have no set completion criteria and frequently involve multimodal communication, extensive world knowledge, creativity, and building structures or compositions through multiple steps. Because these systems differ from question and answer (Q&A) systems, chatbots, and simple task-oriented assistants, new methods for evaluating such collaborative computer systems are needed. Here, we present a set of criteria for evaluating these systems, called Hallmarks of Human-Machine Collaboration. The Hallmarks build on the success of heuristic evaluation used by the user interface community and past evaluation techniques used in the spoken language and chatbot communities. They consist of observable characteristics indicative of successful collaborative communication, grouped into eight high-level properties: robustness; habitability; mutual contribution of meaningful content; context-awareness; consistent human engagement; provision of rationale; use of elementary concepts to teach and learn new concepts; and successful collaboration. We present examples of how we used these Hallmarks in the DARPA Communicating with Computers (CwC) program to evaluate diverse activities, including story and music generation, interactive building with blocks, and exploration of molecular mechanisms in cancer. We used the Hallmarks as guides for developers and as diagnostics, assessing systems with the Hallmarks to identify strengths and opportunities for improvement using logs from user studies, surveying the human partner, third-party review of creative products, and direct tests. Informal feedback from CwC technology developers indicates that the use of the Hallmarks for program evaluation helped guide development. 
The Hallmarks also made it possible to identify areas of progress and major gaps in developing systems where the machine is an equal, creative partner.

6.
Drug Saf; 44(1): 83-94, 2021 01.
Article in English | MEDLINE | ID: mdl-33006728

ABSTRACT

INTRODUCTION: The US FDA is interested in a tool that would enable pharmacovigilance safety evaluators to automate the identification of adverse drug events (ADEs) mentioned in FDA prescribing information. The MITRE Corporation (MITRE) and the FDA organized a shared task-Adverse Drug Event Evaluation (ADE Eval)-to determine whether the performance of algorithms currently used for natural language processing (NLP) might be good enough for real-world use. OBJECTIVE: ADE Eval was conducted to evaluate a range of NLP techniques for identifying ADEs mentioned in publicly available FDA-approved drug labels (package inserts). It was designed specifically to reflect pharmacovigilance practices within the FDA and model possible pharmacovigilance use cases. METHODS: Pharmacovigilance-specific annotation guidelines and annotated corpora were created. Two metrics modeled the experiences of FDA safety evaluators: one measured the ability of an algorithm to identify correct Medical Dictionary for Regulatory Activities (MedDRA®) terms for the text from the annotated corpora, and the other assessed the quality of evidence extracted from the corpora to support the selected MedDRA® term by measuring the portion of annotated text an algorithm correctly identified. A third metric assessed the cost of correcting system output for subsequent training (averaged, weighted F1-measure for mention finding). RESULTS: In total, 13 teams submitted 23 runs: the top MedDRA® coding F1-measure was 0.79, the top quality score was 0.96, and the top mention-finding F1-measure was 0.89. CONCLUSION: While NLP techniques do not perform at levels that would allow them to be used without intervention, it is now worthwhile exploring making NLP outputs available in human pharmacovigilance workflows.
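The mention-finding F1-measure reported here is the standard harmonic mean of precision and recall over extracted spans. A minimal sketch, assuming exact-span matching (the shared task's official scorer may weight or match mentions differently):

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def mention_scores(gold_spans, predicted_spans):
    """Exact-match mention finding: spans as (start, end) character offsets."""
    tp = len(gold_spans & predicted_spans)
    p = tp / len(predicted_spans) if predicted_spans else 0.0
    r = tp / len(gold_spans) if gold_spans else 0.0
    return p, r, f1(p, r)

# Hypothetical gold and predicted ADE mention spans in one label section.
gold = {(0, 8), (15, 22), (30, 41)}
pred = {(0, 8), (15, 22), (50, 55)}
p, r, f = mention_scores(gold, pred)  # p = r = f = 2/3
```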


Subject(s)
Drug-Related Side Effects and Adverse Reactions; Pharmaceutical Preparations; Adverse Drug Reaction Reporting Systems; Algorithms; Drug-Related Side Effects and Adverse Reactions/diagnosis; Drug-Related Side Effects and Adverse Reactions/epidemiology; Humans; Natural Language Processing; Pharmacovigilance
7.
J Am Med Inform Assoc; 27(9): 1374-1382, 2020 07 01.
Article in English | MEDLINE | ID: mdl-32930712

ABSTRACT

OBJECTIVE: Effective, scalable de-identification of personally identifying information (PII) for information-rich clinical text is critical to support secondary use, but no method is 100% effective. The hiding-in-plain-sight (HIPS) approach attempts to solve this "residual PII problem." HIPS replaces PII tagged by a de-identification system with realistic but fictitious (resynthesized) content, making it harder to detect remaining unredacted PII. MATERIALS AND METHODS: Using 2000 representative clinical documents from 2 healthcare settings (4000 total), we used a novel method to generate 2 de-identified 100-document corpora (200 documents total) in which PII tagged by a typical automated machine-learned tagger was replaced by HIPS-resynthesized content. Four readers conducted aggressive reidentification attacks to isolate leaked PII: 2 readers from within the originating institution and 2 external readers. RESULTS: Overall, mean recall of leaked PII was 26.8% and mean precision was 37.2%. Mean recall was 9% (mean precision = 37%) for patient ages, 32% (mean precision = 26%) for dates, 25% (mean precision = 37%) for doctor names, 45% (mean precision = 55%) for organization names, and 23% (mean precision = 57%) for patient names. Recall was 32% (precision = 40%) for internal and 22% (precision = 33%) for external readers. DISCUSSION AND CONCLUSIONS: Approximately 70% of leaked PII "hiding" in a corpus de-identified with HIPS resynthesis is resilient to detection by human readers in a realistic, aggressive reidentification attack scenario, more than double the rate reported in previous studies but less than the rate reported for an attack assisted by machine learning methods.


Subject(s)
Confidentiality; Data Anonymization; Electronic Health Records; Computer Security; Humans; Natural Language Processing
8.
J Biomed Semantics; 10(1): 10, 2019 05 31.
Article in English | MEDLINE | ID: mdl-31151407

ABSTRACT

BACKGROUND: We introduce TranScriptML, a semantic representation schema for prescription regimens allowing various properties of prescriptions (e.g. dose, frequency, route) to be specified separately and applied (manually or automatically) as annotations to patient instructions. In this paper, we describe the annotation schema, the curation of a corpus of prescription instructions through a manual annotation effort, and initial experiments in modeling and automated generation of TranScriptML representations. RESULTS: TranScriptML was developed in the process of curating a corpus of 2914 ambulatory prescriptions written within the Partners Healthcare network, and its schema is informed by the content of that corpus. We developed the representation schema as a novel set of semantic tags for prescription concept categories (e.g. frequency); each tag label is defined with an accompanying attribute framework in which the meaning of tagged concepts can be specified in a normalized fashion. We annotated a subset (1746) of this dataset using cross-validation and reconciliation between multiple annotators, and used Conditional Random Field machine learning and various other methods to train automated annotation models based on the manual annotations. The TranScriptML schema implementation, manual annotation, and machine learning were all performed using the MITRE Annotation Toolkit (MAT). We report that our annotation schema can be applied with varying levels of pairwise agreement, ranging from low agreement levels (0.125 F for the relatively rare REFILL tag) to high agreement levels approaching 0.9 F for some of the more frequent tags. We report similarly variable scores for modeling tag labels and spans, averaging 0.748 F-measure with balanced precision and recall. The best of our various attribute modeling methods captured most attributes with accuracy above 0.9. 
CONCLUSIONS: We have described an annotation schema for prescription regimens, and shown that it is possible to annotate prescription regimens at high accuracy for many tag types. We have further shown that many of these tags and attributes can be modeled at high accuracy with various techniques. By structuring the textual representation through annotation enriched with normalized values, the text can be compared against the pharmacist-entered structured data, offering an opportunity to detect and correct discrepancies.
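The pairwise agreement figures quoted above can be computed as an F-measure between two annotators' span sets; since F is symmetric in this setting, it does not matter which annotator is treated as the reference. A minimal sketch, with hypothetical TranScriptML-style tags:

```python
def pairwise_f(spans_a, spans_b):
    """Pairwise inter-annotator agreement as F-measure.

    Spans are (start, end, tag) triples; F = 2|A ∩ B| / (|A| + |B|),
    which equals the usual F1 when either annotator is taken as gold.
    """
    if not spans_a or not spans_b:
        return 0.0
    return 2 * len(spans_a & spans_b) / (len(spans_a) + len(spans_b))

# Two annotators agree on DOSE and FREQUENCY but disagree on the
# ROUTE span's right boundary (tags here are illustrative).
a = {(0, 4, "DOSE"), (5, 10, "FREQUENCY"), (12, 16, "ROUTE")}
b = {(0, 4, "DOSE"), (5, 10, "FREQUENCY"), (12, 18, "ROUTE")}
agreement = pairwise_f(a, b)  # 2*2 / (3+3) ≈ 0.667
```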


Subject(s)
Data Curation/methods; Drug Prescriptions/statistics & numerical data; Models, Theoretical; Humans; Time Factors
9.
J Am Med Inform Assoc; 26(12): 1536-1544, 2019 12 01.
Article in English | MEDLINE | ID: mdl-31390016

ABSTRACT

OBJECTIVE: Clinical corpora can be deidentified using a combination of machine-learned automated taggers and hiding in plain sight (HIPS) resynthesis. The latter replaces detected personally identifiable information (PII) with random surrogates, allowing leaked PII to blend in or "hide in plain sight." We evaluated the extent to which a malicious attacker could expose leaked PII in such a corpus. MATERIALS AND METHODS: We modeled a scenario where an institution (the defender) externally shared an 800-note corpus of actual outpatient clinical encounter notes from a large, integrated health care delivery system in Washington State. These notes were deidentified by a machine-learned PII tagger and HIPS resynthesis. A malicious attacker obtained and performed a parrot attack intending to expose leaked PII in this corpus. Specifically, the attacker mimicked the defender's process by manually annotating all PII-like content in half of the released corpus, training a PII tagger on these data, and using the trained model to tag the remaining encounter notes. The attacker hypothesized that untagged identifiers would be leaked PII, discoverable by manual review. We evaluated the attacker's success using measures of leak-detection rate and accuracy. RESULTS: The attacker correctly hypothesized that 211 (68%) of 310 actual PII leaks in the corpus were leaks, and wrongly hypothesized that 191 resynthesized PII instances were also leaks. One-third of actual leaks remained undetected. DISCUSSION AND CONCLUSION: A malicious parrot attack to reveal leaked PII in clinical text deidentified by machine-learned HIPS resynthesis can attenuate but not eliminate the protective effect of HIPS deidentification.
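The attack's leak-detection measures follow directly from the counts this abstract reports (the abstract states the 68% recall; the precision below is derived from the same counts rather than quoted):

```python
# 211 of 310 actual PII leaks were correctly flagged by the attacker;
# 191 resynthesized (fake) PII instances were wrongly flagged as leaks.
true_positives = 211
false_negatives = 310 - 211
false_positives = 191

recall = true_positives / (true_positives + false_negatives)     # ≈ 0.68
precision = true_positives / (true_positives + false_positives)  # ≈ 0.52
```

So roughly half of what the attacker flags as leaked PII is actually a synthetic surrogate, which is the protective effect HIPS is designed to provide.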


Subject(s)
Computer Security; Confidentiality; Data Anonymization; Electronic Health Records; Machine Learning; Personal Information; Ambulatory Care Facilities; Delivery of Health Care; Humans; Washington
10.
J Am Med Inform Assoc; 14(5): 564-73, 2007.
Article in English | MEDLINE | ID: mdl-17600096

ABSTRACT

OBJECTIVE: This paper describes a successful approach to de-identification that was developed to participate in a recent AMIA-sponsored challenge evaluation. METHOD: Our approach focused on rapid adaptation of existing toolkits for named entity recognition using two existing toolkits, Carafe and LingPipe. RESULTS: The "out of the box" Carafe system achieved a very good score (phrase F-measure of 0.9664) with only four hours of work to adapt it to the de-identification task. With further tuning, we were able to reduce the token-level error term by over 36% through task-specific feature engineering and the introduction of a lexicon, achieving a phrase F-measure of 0.9736. CONCLUSIONS: We were able to achieve good performance on the de-identification task by the rapid retargeting of existing toolkits. For the Carafe system, we developed a method for tuning the balance of recall vs. precision, as well as a confidence score that correlated well with the measured F-score.
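One generic way to tune the recall/precision balance mentioned in the conclusions (not necessarily Carafe's actual mechanism) is to threshold a per-phrase confidence score: lowering the threshold redacts more candidates and favors recall, raising it favors precision. A hedged sketch with invented candidates:

```python
def tag_phrases(scored_phrases, threshold):
    """Keep candidate PHI phrases whose tagger confidence clears threshold.

    scored_phrases: list of (phrase, confidence) pairs from a tagger.
    A lower threshold boosts recall (more phrases redacted); a higher
    one boosts precision (only confident phrases redacted).
    """
    return [phrase for phrase, conf in scored_phrases if conf >= threshold]

# Hypothetical tagger output for one note.
candidates = [("John Smith", 0.97), ("03/14/2006", 0.88), ("Boston", 0.41)]
high_recall = tag_phrases(candidates, 0.3)     # all three redacted
high_precision = tag_phrases(candidates, 0.8)  # only the confident two
```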


Subject(s)
Confidentiality; Medical Records Systems, Computerized; Natural Language Processing; Evaluation Studies as Topic; Humans
11.
Methods Inf Med; 55(4): 356-64, 2016 Aug 05.
Article in English | MEDLINE | ID: mdl-27405787

ABSTRACT

BACKGROUND: Clinical text contains valuable information but must be de-identified before it can be used for secondary purposes. Accurate annotation of personally identifiable information (PII) is essential to the development of automated de-identification systems and to manual redaction of PII. Yet the accuracy of annotations may vary considerably across individual annotators and annotation is costly. As such, the marginal benefit of incorporating additional annotators has not been well characterized. OBJECTIVES: This study models the costs and benefits of incorporating increasing numbers of independent human annotators to identify the instances of PII in a corpus. We used a corpus with gold standard annotations to evaluate the performance of teams of annotators of increasing size. METHODS: Four annotators independently identified PII in a 100-document corpus consisting of randomly selected clinical notes from Family Practice clinics in a large integrated health care system. These annotations were pooled and validated to generate a gold standard corpus for evaluation. RESULTS: Recall rates for all PII types ranged from 0.90 to 0.98 for individual annotators to 0.998 to 1.0 for teams of three, when measured against the gold standard. Median cost per PII instance discovered during corpus annotation ranged from $0.71 for an individual annotator to $377 for annotations discovered only by a fourth annotator. CONCLUSIONS: Incorporating a second annotator into a PII annotation process reduces unredacted PII and improves the quality of annotations to 0.99 recall, yielding clear benefit at reasonable cost; the cost advantages of annotation teams larger than two diminish rapidly.
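If annotators are assumed to miss PII instances independently, pooled team recall has a simple closed form: a team of k annotators, each with recall r, misses an instance only when all k miss it. With r = 0.90 and k = 3 this gives 0.999, in line with the 0.998-1.0 range reported above; note that independence is an assumption, not something the study claims.

```python
def team_recall(individual_recall, team_size):
    """Recall of pooled annotations from independent annotators: a PII
    instance is missed only if every annotator misses it."""
    return 1 - (1 - individual_recall) ** team_size

print(round(team_recall(0.90, 3), 3))  # 0.999
```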


Subject(s)
Cost-Benefit Analysis/economics; Data Mining/economics; Patient Identification Systems/economics; Electronic Health Records; Humans
12.
Article in English | MEDLINE | ID: mdl-25797061

ABSTRACT

Motivated by the high cost of human curation of biological databases, there is an increasing interest in using computational approaches to assist human curators and accelerate the manual curation process. Towards the goal of cataloging drug indications from FDA drug labels, we recently developed LabeledIn, a human-curated drug indication resource for 250 clinical drugs. Its development required over 40 h of human effort across 20 weeks, despite using well-defined annotation guidelines. In this study, we aim to investigate the feasibility of scaling drug indication annotation through a crowdsourcing technique where an unknown network of workers can be recruited through the technical environment of Amazon Mechanical Turk (MTurk). To translate the expert-curation task of cataloging indications into human intelligence tasks (HITs) suitable for average workers on MTurk, we first simplify the complex task such that each HIT only involves a worker making a binary judgment of whether a highlighted disease, in context of a given drug label, is an indication. In addition, this study is novel in the crowdsourcing interface design where the annotation guidelines are encoded into user options. For evaluation, we assess the ability of our proposed method to achieve high-quality annotations in a time-efficient and cost-effective manner. We posted over 3000 HITs drawn from 706 drug labels on MTurk. Within 8 h of posting, we collected 18,775 judgments from 74 workers, and achieved an aggregated accuracy of 96% on 450 control HITs (where gold-standard answers are known), at a cost of $1.75 per drug label. On the basis of these results, we conclude that our crowdsourcing approach not only results in significant cost and time savings, but also leads to accuracy comparable to that of domain experts.
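Aggregating redundant worker judgments and scoring them against control HITs, as described above, can be sketched with simple majority voting (the paper's actual aggregation scheme may be more sophisticated; the HIT data below is invented):

```python
from collections import Counter

def majority_vote(judgments):
    """Aggregate binary worker judgments for one HIT by majority vote."""
    return Counter(judgments).most_common(1)[0][0]

def accuracy_on_controls(hits, gold):
    """Fraction of control HITs whose aggregated answer matches gold."""
    correct = sum(
        majority_vote(hits[hit_id]) == answer for hit_id, answer in gold.items()
    )
    return correct / len(gold)

# Toy example: three workers per HIT, two control HITs with known answers.
hits = {"h1": ["yes", "yes", "no"], "h2": ["no", "no", "no"]}
gold = {"h1": "yes", "h2": "no"}
print(accuracy_on_controls(hits, gold))  # 1.0
```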


Subject(s)
Crowdsourcing; Data Curation/methods; Databases, Pharmaceutical; Pharmaceutical Preparations; Drug Labeling; Humans; Pharmaceutical Preparations/chemistry; Pharmaceutical Preparations/classification
13.
Int J Med Inform; 83(10): 750-67, 2014 Oct.
Article in English | MEDLINE | ID: mdl-25106934

ABSTRACT

PURPOSE: Electronic health records contain a substantial quantity of clinical narrative, which is increasingly reused for research purposes. To share data on a large scale and respect privacy, it is critical to remove patient identifiers. De-identification tools based on machine learning have been proposed; however, model training is usually based on either a random group of documents or a pre-existing document type designation (e.g., discharge summary). This work investigates if inherent features, such as the writing complexity, can identify document subsets to enhance de-identification performance. METHODS: We applied an unsupervised clustering method to group two corpora based on writing complexity measures: a collection of over 4500 documents of varying document types (e.g., discharge summaries, history and physical reports, and radiology reports) from Vanderbilt University Medical Center (VUMC) and the publicly available i2b2 corpus of 889 discharge summaries. We compare the performance (via recall, precision, and F-measure) of de-identification models trained on such clusters with models trained on documents grouped randomly or by VUMC document type. RESULTS: For the Vanderbilt dataset, it was observed that training and testing de-identification models on the same stylometric cluster (with the average F-measure of 0.917) tended to outperform models based on clusters of random documents (with an average F-measure of 0.881). It was further observed that increasing the size of a training subset sampled from a specific cluster could yield improved results (e.g., for subsets from a certain stylometric cluster, the F-measure rose from 0.743 to 0.841 when training size increased from 10 to 50 documents, and the F-measure reached 0.901 when the size of the training subset reached 200 documents). For the i2b2 dataset, training and testing on the same clusters based on complexity measures (average F-score 0.966) did not significantly surpass randomly selected clusters (average F-score 0.965). CONCLUSIONS: Our findings illustrate that, in environments consisting of a variety of clinical documentation, de-identification models trained on clusters derived from writing complexity measures outperform models trained on random groups and, in many instances, models trained on document types.


Subject(s)
Electronic Health Records; Narration; Writing; Cluster Analysis
14.
Article in English | MEDLINE | ID: mdl-25246425

ABSTRACT

BACKGROUND: This article describes capture of biological information using a hybrid approach that combines natural language processing to extract biological entities and crowdsourcing with annotators recruited via Amazon Mechanical Turk to judge correctness of candidate biological relations. These techniques were applied to extract gene-mutation relations from biomedical abstracts with the goal of supporting production scale capture of gene-mutation-disease findings as an open source resource for personalized medicine. RESULTS: The hybrid system could be configured to provide good performance for gene-mutation extraction (precision ∼82%; recall ∼70% against an expert-generated gold standard) at a cost of $0.76 per abstract. This demonstrates that crowd labor platforms such as Amazon Mechanical Turk can be used to recruit quality annotators, even in an application requiring subject matter expertise; aggregated Turker judgments for gene-mutation relations exceeded 90% accuracy. Over half of the precision errors were due to mismatches against the gold standard hidden from annotator view (e.g., incorrect EntrezGene identifier or incorrect mutation position extracted), or incomplete task instructions (e.g., the need to exclude nonhuman mutations). CONCLUSIONS: The hybrid curation model provides a readily scalable cost-effective approach to curation, particularly if coupled with expert human review to filter precision errors. We plan to generalize the framework and make it available as open source software. DATABASE URL: http://www.mitre.org/publications/technical-papers/hybrid-curation-of-gene-mutation-relations-combining-automated.


Subject(s)
Crowdsourcing/methods; Data Curation/methods; Genetic Predisposition to Disease; Information Storage and Retrieval/methods; Mutation/genetics; Natural Language Processing; Computational Biology/methods; Crowdsourcing/economics; Data Curation/economics; Databases, Genetic; Genomics; Humans
15.
J Am Med Inform Assoc; 20(2): 342-8, 2013.
Article in English | MEDLINE | ID: mdl-22771529

ABSTRACT

OBJECTIVE: Secondary use of clinical text is impeded by a lack of highly effective, low-cost de-identification methods. Both manual and automated methods for removing protected health information are known to leave behind residual identifiers. The authors propose a novel approach for addressing the residual identifier problem based on the theory of Hiding In Plain Sight (HIPS). MATERIALS AND METHODS: HIPS relies on obfuscation to conceal residual identifiers. According to this theory, replacing the detected identifiers with realistic but synthetic surrogates should collectively render the few 'leaked' identifiers difficult to distinguish from the synthetic surrogates. The authors conducted a pilot study to test this theory on clinical narrative, de-identified by an automated system. Test corpora included 31 oncology and 50 family practice progress notes read by two trained chart abstractors and an informaticist. RESULTS: Experimental results suggest approximately 90% of residual identifiers can be effectively concealed by the HIPS approach in text containing average and high densities of personal identifying information. DISCUSSION: This pilot test suggests HIPS is feasible, but requires further evaluation. The results need to be replicated on larger corpora of diverse origin under a range of detection scenarios. Error analyses also suggest areas where surrogate generation techniques can be refined to improve efficacy. CONCLUSIONS: If these results generalize to existing high-performing de-identification systems with recall rates of 94-98%, HIPS could increase the effective de-identification rates of these systems to levels above 99% without further advancements in system recall. Additional and more rigorous assessment of the HIPS approach is warranted.
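A hedged sketch of the HIPS resynthesis step this abstract describes: identifiers tagged by a de-identification system are swapped for realistic surrogates, so any identifier the tagger misses blends in with the synthetic ones. The surrogate pools and tag names below are invented for illustration; real surrogate generation is considerably more careful (e.g., shifting dates consistently).

```python
import random

SURROGATES = {
    "NAME": ["Maria Alvarez", "James Porter", "Li Wen"],
    "DATE": ["04/12/2009", "11/30/2011"],
}

def resynthesize(tokens, seed=0):
    """Replace tagged PII with random surrogates.

    tokens: list of (text, tag) pairs; tag is None for non-PII text.
    """
    rng = random.Random(seed)
    out = []
    for text, tag in tokens:
        out.append(rng.choice(SURROGATES[tag]) if tag in SURROGATES else text)
    return " ".join(out)

note = [("Seen", None), ("John Smith", "NAME"), ("on", None),
        ("01/02/2003", "DATE"), ("for", None), ("follow-up.", None)]
print(resynthesize(note))
```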


Subject(s)
Computer Security , Confidentiality , Electronic Health Records , Information Dissemination , Natural Language Processing , Biomedical Research/statistics & numerical data , Data Collection , Humans , Pilot Projects , United States
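The Hiding In Plain Sight strategy described in the abstract above can be sketched in a few lines. This is a minimal illustration under assumed inputs, not the authors' implementation: the surrogate pools, the `detected_spans` format, and the `hips_resynthesize` name are all invented for the example.

```python
import random

# Toy surrogate pools; a real system would draw from large, realistic lists.
SURROGATE_NAMES = ["Alice Moore", "John Carter", "Maria Lopez"]
SURROGATE_DATES = ["03/14/2011", "07/02/2012"]

def hips_resynthesize(text, detected_spans, seed=0):
    """Replace each detected identifier span (start, end, type) with a
    realistic synthetic surrogate. Identifiers the detector missed stay
    in the text but become hard to distinguish from the surrogates."""
    rng = random.Random(seed)
    out, cursor = [], 0
    for start, end, phi_type in sorted(detected_spans):
        out.append(text[cursor:start])
        pool = SURROGATE_NAMES if phi_type == "NAME" else SURROGATE_DATES
        out.append(rng.choice(pool))
        cursor = end
    out.append(text[cursor:])
    return "".join(out)

note = "Seen by Dr. Smith on 01/05/2010. Dr. Jones to follow up."
# Suppose the detector found "Smith" and the date but missed "Jones":
spans = [(12, 17, "NAME"), (21, 31, "DATE")]
print(hips_resynthesize(note, spans))
```

The missed identifier ("Jones") survives in the output, but sits among synthetic names, which is exactly the concealment effect the pilot study measured.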
16.
Int J Med Inform ; 82(9): 821-31, 2013 Sep.
Article in English | MEDLINE | ID: mdl-23643147

ABSTRACT

PURPOSE: We describe an experiment to build a de-identification system for clinical records using the open source MITRE Identification Scrubber Toolkit (MIST). We quantify the human annotation effort needed to produce a system that de-identifies at high accuracy. METHODS: Using two types of clinical records (history and physical notes, and social work notes), we iteratively built statistical de-identification models by annotating 10 notes, training a model, applying the model to another 10 notes, correcting the model's output, and training from the resulting larger set of annotated notes. This was repeated for 20 rounds of 10 notes each, then an additional 6 rounds of 20 notes each, and a final round of 40 notes. At each stage, we measured precision, recall, and F-score, and compared these to the amount of annotation time needed to complete the round. RESULTS: After the initial 10-note round (33 min of annotation time) we achieved an F-score of 0.89. After just over 8 h of annotation time (round 21) we achieved an F-score of 0.95. The number of annotation actions needed, as well as the time needed, decreased in later rounds as model performance improved. Accuracy on history and physical notes exceeded that on social work notes, suggesting that the wider variety of contexts for protected health information (PHI) in social work notes is more difficult to model. CONCLUSIONS: It is possible, with modest effort, to build a functioning de-identification system de novo using the MIST framework. The resulting system achieved performance comparable to other high-performing de-identification systems.


Subject(s)
Computer Security , Confidentiality , Electronic Health Records/economics , Electronic Health Records/standards , Humans , Information Dissemination , Software
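The annotate-train-correct protocol from the abstract above can be outlined as a bootstrapping loop. This is a structural sketch only: `annotate`, `train`, and `apply_model` stand in for the human annotator and a trainer/tagger, and are not MIST's actual API; only the round sizes come from the study.

```python
def f_score(tp, fp, fn):
    """Token-level precision, recall, and F1 from counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# Round sizes from the study: 20 rounds of 10 notes, 6 of 20, and 1 of 40.
round_sizes = [10] * 20 + [20] * 6 + [40]

def bootstrap_deidentifier(notes, annotate, train, apply_model):
    """Iterative annotate-train-correct loop: pre-tag each batch with the
    current model, have an annotator correct it, retrain on all notes so far."""
    annotated, model, cursor = [], None, 0
    for size in round_sizes:
        batch = notes[cursor:cursor + size]
        cursor += size
        pre_tagged = [apply_model(model, n) for n in batch] if model else batch
        annotated.extend(annotate(n) for n in pre_tagged)  # human correction
        model = train(annotated)
    return model
```

Pre-tagging each new batch with the previous round's model is what drives the reported drop in annotation actions over time: the annotator corrects output instead of labeling from scratch.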
17.
J Am Med Inform Assoc ; 18(5): 563-7, 2011.
Article in English | MEDLINE | ID: mdl-21515542

ABSTRACT

OBJECTIVE: To describe a system for determining the assertion status of medical problems mentioned in clinical reports, which was entered in the 2010 i2b2/VA community evaluation 'Challenges in natural language processing for clinical data' for the task of classifying assertions associated with problem concepts extracted from patient records. MATERIALS AND METHODS: A combination of machine learning (conditional random field and maximum entropy) and rule-based (pattern matching) techniques was used to detect negation, speculation, and hypothetical and conditional information, as well as information associated with persons other than the patient. RESULTS: The best submission obtained an overall micro-averaged F-score of 0.9343. CONCLUSIONS: Using semantic attributes of concepts and information about document structure as features for statistical classification of assertions is a good way to leverage rule-based and statistical techniques. In this task, the choice of features may be more important than the choice of classifier algorithm.


Subject(s)
Data Mining , Decision Support Systems, Clinical , Electronic Health Records , Natural Language Processing , Pattern Recognition, Automated , Cues , Data Mining/classification , Decision Support Systems, Clinical/classification , Electronic Health Records/classification , Humans , Pattern Recognition, Automated/classification , Semantics , Uncertainty
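A minimal rule-based slice of the hybrid assertion approach described above might look as follows. The cue lexicons and the `assert_status` function are toy assumptions for illustration; the actual system combined conditional random field and maximum entropy classifiers with far richer patterns and features.

```python
import re

# Toy cue lexicons; the real system's patterns were much richer.
NOT_PATIENT_CUES = r"\b(mother|father|family history of)\b"
NEGATION_CUES = r"\b(no|denies|without|negative for)\b"
SPECULATION_CUES = r"\b(possible|probable|suspected|may have)\b"
CONDITIONAL_CUES = r"\b(if|return for)\b"

def assert_status(sentence, concept):
    """Classify the assertion status of a problem concept from cues
    in the text preceding its mention."""
    idx = sentence.lower().find(concept.lower())
    window = sentence[:idx].lower() if idx >= 0 else sentence.lower()
    if re.search(NOT_PATIENT_CUES, window):
        return "associated_with_someone_else"
    if re.search(NEGATION_CUES, window):
        return "absent"
    if re.search(SPECULATION_CUES, window):
        return "possible"
    if re.search(CONDITIONAL_CUES, window):
        return "conditional"
    return "present"

print(assert_status("Patient denies chest pain.", "chest pain"))  # absent
print(assert_status("Family history of diabetes.", "diabetes"))
```

Checking the experiencer cue first matters: "mother denies chest pain" should attribute the problem to someone else, not merely mark it absent, which echoes the abstract's point that feature and rule design can matter more than the classifier choice.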
18.
J Am Med Inform Assoc ; 17(2): 159-68, 2010.
Article in English | MEDLINE | ID: mdl-20190058

ABSTRACT

OBJECTIVE: De-identified medical records are critical to biomedical research. Text de-identification software exists, including "resynthesis" components that replace real identifiers with synthetic identifiers. The goal of this research is to evaluate the effectiveness of resynthesis and to examine possible bias it introduces into de-identification software. DESIGN: We evaluated the open-source MITRE Identification Scrubber Toolkit, which includes a resynthesis capability, with clinical text from Vanderbilt University Medical Center patient records. We investigated four record classes from over 500 patients' files, including laboratory reports, medication orders, discharge summaries and clinical notes. We trained and tested the de-identification tool on real and resynthesized records. MEASUREMENTS: We measured performance in terms of precision, recall, F-measure and accuracy for the detection of protected health identifiers as designated by the HIPAA Safe Harbor Rule. RESULTS: The de-identification tool was trained and tested on a collection of real and resynthesized Vanderbilt records. Results for training and testing on the real records were 0.990 accuracy and 0.960 F-measure. The results improved when trained and tested on resynthesized records, with 0.998 accuracy and 0.980 F-measure, but deteriorated moderately when trained on real records and tested on resynthesized records, with 0.989 accuracy and 0.862 F-measure. Moreover, the results declined significantly when trained on resynthesized records and tested on real records, with 0.942 accuracy and 0.728 F-measure. CONCLUSION: The de-identification tool achieves high accuracy when training and test sets are homogeneous (i.e., both real or both resynthesized records). The resynthesis component regularizes the data to make them less "realistic," resulting in a loss of performance, particularly when training on resynthesized data and testing on real data.


Subject(s)
Artificial Intelligence , Computer Security , Confidentiality , Electronic Health Records , Software , Humans , Information Storage and Retrieval , United States
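The word-level measurements reported in the abstracts above (precision, recall, F-measure, and accuracy for PHI detection) reduce to counting agreement between gold and predicted label sequences. A minimal sketch, with a single invented `PHI` label standing in for the full HIPAA Safe Harbor categories:

```python
def phi_metrics(gold, pred, phi_label="PHI"):
    """Word-level precision, recall, F-measure, and accuracy for PHI
    detection, given parallel gold and predicted label sequences."""
    tp = sum(g == p == phi_label for g, p in zip(gold, pred))
    fp = sum(g != phi_label and p == phi_label for g, p in zip(gold, pred))
    fn = sum(g == phi_label and p != phi_label for g, p in zip(gold, pred))
    correct = sum(g == p for g, p in zip(gold, pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return {"precision": precision, "recall": recall,
            "f_measure": f, "accuracy": correct / len(gold)}

gold = ["O", "PHI", "PHI", "O", "O", "PHI"]
pred = ["O", "PHI", "O",   "O", "O", "PHI"]
print(phi_metrics(gold, pred))
```

Note why the studies report F-measure alongside accuracy: most tokens are non-PHI, so accuracy stays high even when many identifiers are missed, while recall and F-measure expose the misses.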
19.
Int J Med Inform ; 79(12): 849-59, 2010 Dec.
Article in English | MEDLINE | ID: mdl-20951082

ABSTRACT

PURPOSE: Medical records must often be stripped of patient identifiers, or de-identified, before being shared. De-identification by humans is time-consuming, and existing software is limited in its generality. The open source MITRE Identification Scrubber Toolkit (MIST) provides an environment to support rapid tailoring of automated de-identification to different document types, using automatically learned classifiers to de-identify and protect sensitive information. METHODS: MIST was evaluated with four classes of patient records from the Vanderbilt University Medical Center: discharge summaries, laboratory reports, letters, and order summaries. We trained and tested MIST on each class of record separately, as well as on pooled sets of records. We measured precision, recall, F-measure and accuracy at the word level for the detection of patient identifiers as designated by the HIPAA Safe Harbor Rule. RESULTS: MIST was applied to medical records that differed in the amounts and types of protected health information (PHI): lab reports contained only two types of PHI (dates, names) compared to discharge summaries, which were much richer. Performance of the de-identification tool depended on record class; F-measure results were 0.996 for order summaries, 0.996 for discharge summaries, 0.943 for letters and 0.934 for laboratory reports. Experiments suggest the tool requires several hundred training exemplars to reach an F-measure of at least 0.9. CONCLUSIONS: The MIST toolkit makes possible the rapid tailoring of automated de-identification to particular document types and supports the transition of the de-identification software to medical end users, avoiding the need for developers to have access to original medical records. We are making the MIST toolkit available under an open source license to encourage its application to diverse data sets at multiple institutions.


Subject(s)
Electronic Health Records , Medical Record Linkage/standards , Patient Identification Systems , Software , Algorithms , Confidentiality , Data Collection , Humans , Medical Record Linkage/methods