1.
J Biomed Inform ; 145: 104478, 2023 09.
Article in English | MEDLINE | ID: mdl-37625508

ABSTRACT

Obtaining text datasets with semantic annotations is an effortful process, yet crucial for supervised training in natural language processing (NLP). In general, developing and applying new NLP pipelines for domain-specific tasks often requires custom-designed datasets so that the tasks can be addressed in a supervised machine learning fashion. When operating in non-English languages for medical data processing, this exposes several minor and major interconnected problems, such as the lack of task-matching datasets as well as task-specific pre-trained models. In our work, we propose leveraging pre-trained large language models for training data acquisition in order to obtain sufficiently large datasets for training smaller and more efficient models for use-case-specific tasks. To demonstrate the effectiveness of our approach, we create a custom dataset that we use to train a medical NER model for German texts, GPTNERMED, yet our method remains language-independent in principle. Our obtained dataset as well as our pre-trained models are publicly available at https://github.com/frankkramer-lab/GPTNERMED.


Subject(s)
Language , Natural Language Processing , Semantics , Records , Supervised Machine Learning
2.
J Biomed Inform ; 147: 104513, 2023 11.
Article in English | MEDLINE | ID: mdl-37838290

ABSTRACT

We present a statistical model, GERNERMED++, for German medical natural language processing trained for named entity recognition (NER) as an open, publicly available model. We demonstrate the effectiveness of combining multiple techniques in order to achieve strong results in entity recognition performance by means of transfer learning on pre-trained deep language models (LM), word alignment and neural machine translation, outperforming a pre-existing baseline model on several datasets. Due to the scarcity of open, public medical entity recognition models for German texts, this work offers benefits to the German research community on medical NLP as a baseline model. The work serves as a refined successor to our first GERNERMED model. Similar to our previous work, our trained model is publicly available to other researchers. The sample code and the statistical model are available at: https://github.com/frankkramer-lab/GERNERMED-pp.


Subject(s)
Language , Semantics , Machine Learning , Natural Language Processing , Learning
3.
JMIR Form Res ; 7: e39077, 2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36853741

ABSTRACT

BACKGROUND: Data mining in the field of medical data analysis often needs to rely solely on the processing of unstructured data to retrieve relevant data. For German natural language processing, few open medical neural named entity recognition (NER) models have been published before this work. A major issue can be attributed to the lack of German training data. OBJECTIVE: We developed a synthetic data set and a novel German medical NER model for public access to demonstrate the feasibility of our approach. In order to bypass legal restrictions due to potential data leaks through model analysis, we did not make use of internal, proprietary data sets, which is a frequent veto factor for data set publication. METHODS: The underlying German data set was retrieved by translation and word alignment of a public English data set. The data set served as a foundation for model training and evaluation. For demonstration purposes, our NER model follows a simple network architecture that is designed for low computational requirements. RESULTS: The obtained data set consisted of 8599 sentences including 30,233 annotations. The model achieved a class frequency-averaged F1 score of 0.82 on the test set after training across 7 different NER types. Artifacts in the synthesized data set with regard to translation and alignment induced by the proposed method were exposed. The annotation performance was evaluated on an external data set and measured in comparison with an existing baseline model that has been trained on a dedicated German data set in a traditional fashion. We discussed the drop in annotation performance on an external data set for our simple NER model. Our model is publicly available. CONCLUSIONS: We demonstrated the feasibility of obtaining a data set and training a German medical NER model by the exclusive use of public training data through our suggested method. The discussion on the limitations of our approach includes ways to further mitigate remaining problems in future work.
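
The abstract reports a "class frequency-averaged F1 score" without spelling out the computation. The following is a minimal sketch of one plausible reading: per-class F1 scores weighted by how often each class occurs in the gold labels. The token-level evaluation, the function names, and the label names are assumptions for illustration, not the authors' evaluation code.

```python
from collections import Counter


def per_class_f1(gold, pred, label):
    """F1 score for a single label, treating it as the positive class."""
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def frequency_weighted_f1(gold, pred):
    """Average the per-class F1 scores, weighted by gold-label frequency."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    support = Counter(gold)
    total = sum(support.values())
    return sum(
        per_class_f1(gold, pred, label) * count / total
        for label, count in support.items()
    )


# Hypothetical labels, chosen only to illustrate the call.
gold = ["Drug", "Drug", "Dosage", "Drug"]
pred = ["Drug", "Dosage", "Dosage", "Drug"]
score = frequency_weighted_f1(gold, pred)
```

Weighting by class frequency means that frequent entity types dominate the score, so a weak result on a rare type can be hidden by strong results on common ones.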

4.
AMIA Annu Symp Proc ; 2023: 351-358, 2023.
Article in English | MEDLINE | ID: mdl-38222405

ABSTRACT

The evaluation of clinical questionnaires is an important part of gaining knowledge in empirical research. The electronically captured responses are encoded in a standard format such as HL7 FHIR® that facilitates data exchange and systems interoperability. However, without appropriate tools, this also complicates access to the information needed to explore and interpret the results. In this work, we present the design of a web-based graphical exploration tool for categorical questionnaire response data that can interact with FHIR-conformant HTTP endpoints. The web app provides non-technical users with simplified, direct visual access to highly structured FHIR questionnaire response data and preserves applicability to arbitrary data exploration tasks. We describe the abstract feature design with the derived technical implementation to allow a universal, user-configurable data subselection mechanism to generate conditional one- and two-dimensional charts. The applicability of our developed prototype is demonstrated on synthetic FHIR data with the source code available at https://github.com/frankkramer-lab/FHIR-QR-Explorer.
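
To make the idea of charting categorical questionnaire responses concrete, here is a minimal sketch that tallies coded answers per question across FHIR `QuestionnaireResponse` resources, which is the kind of count a one-dimensional chart would display. It is not taken from FHIR-QR-Explorer; the function names and sample data are hypothetical, and it assumes the resources are already parsed into Python dictionaries and that answers are given as `valueCoding`.

```python
from collections import Counter, defaultdict


def iter_items(items):
    """Yield every item, including items nested under items or answers."""
    for item in items or []:
        yield item
        yield from iter_items(item.get("item"))
        for answer in item.get("answer", []):
            yield from iter_items(answer.get("item"))


def tally_answers(responses):
    """Count coded answers per question linkId across all responses."""
    counts = defaultdict(Counter)
    for response in responses:
        for item in iter_items(response.get("item")):
            for answer in item.get("answer", []):
                coding = answer.get("valueCoding")
                if coding and "code" in coding:
                    counts[item["linkId"]][coding["code"]] += 1
    return counts


# Synthetic example resources (hypothetical linkIds and codes).
responses = [
    {"resourceType": "QuestionnaireResponse",
     "item": [{"linkId": "q1", "answer": [{"valueCoding": {"code": "yes"}}]}]},
    {"resourceType": "QuestionnaireResponse",
     "item": [{"linkId": "q1", "answer": [{"valueCoding": {"code": "no"}}]},
              {"linkId": "group", "item": [
                  {"linkId": "q2", "answer": [{"valueCoding": {"code": "yes"}}]}]}]},
]
counts = tally_answers(responses)
```

A two-dimensional chart would follow the same pattern, counting pairs of answer codes for two selected questions within each response instead of single codes.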


Subject(s)
Electronic Health Records , Health Level Seven , Humans , Surveys and Questionnaires , Internet
5.
Stud Health Technol Inform ; 290: 912-916, 2022 Jun 06.
Article in English | MEDLINE | ID: mdl-35673151

ABSTRACT

We present a perspective on platforms for code submission and automated evaluation in the context of university teaching. Due to the COVID-19 pandemic, such platforms have become an essential asset for remote courses and a reasonable standard for structured code submission given the increasing numbers of students in computer science. Utilizing automated code evaluation techniques has notable positive impacts for both students and teachers in terms of quality and scalability. We identified relevant technical and non-technical requirements for such platforms in terms of practical applicability and secure code submission environments. Furthermore, a survey among students was conducted to obtain empirical data on general perception. We conclude that submission and automated evaluation involve continuous maintenance yet lower the required workload for teachers and provide better evaluation transparency for students.


Subject(s)
COVID-19 , Humans , Pandemics , Surveys and Questionnaires , Teaching , Universities
6.
PLOS Digit Health ; 1(8): e0000086, 2022 Aug.
Article in English | MEDLINE | ID: mdl-36812581

ABSTRACT

In the context of clinical trials and medical research, medical text mining can provide broader insights for various research scenarios by tapping additional text data sources and extracting relevant information that is often exclusively present in unstructured fashion. Although various works for data like electronic health reports are available for English texts, only limited work on tools for non-English text resources has been published that offers immediate practicality in terms of flexibility and initial setup. We introduce DrNote, an open source text annotation service for medical text processing. Our work provides an entire annotation pipeline with its focus on a fast yet effective and easy-to-use software implementation. Further, the software allows its users to define a custom annotation scope by filtering only for relevant entities that should be included in its knowledge base. The approach is based on OpenTapioca and combines the publicly available datasets from WikiData and Wikipedia, and thus performs entity linking tasks. In contrast to other related work, our service can easily be built upon any language-specific Wikipedia dataset in order to be trained on a specific target language. We provide a public demo instance of our DrNote annotation service at https://drnote.misit-augsburg.de/.

7.
Front Neurol ; 13: 663200, 2022.
Article in English | MEDLINE | ID: mdl-35645963

ABSTRACT

Background: In-vivo MR-based high-resolution volumetric quantification methods of the endolymphatic hydrops (ELH) are highly dependent on a reliable segmentation of the inner ear's total fluid space (TFS). This study aimed to develop a novel open-source inner ear TFS segmentation approach using a dedicated deep learning (DL) model. Methods: The model was based on a V-Net architecture (IE-Vnet) and a multivariate (MR scans: T1, T2, FLAIR, SPACE) training dataset (D1, 179 consecutive patients with peripheral vestibulocochlear syndromes). Ground-truth TFS masks were generated in a semi-manual, atlas-assisted approach. IE-Vnet model segmentation performance, generalizability, and robustness to domain shift were evaluated on four heterogenous test datasets (D2-D5, n = 4 × 20 ears). Results: The IE-Vnet model predicted TFS masks with consistently high congruence to the ground-truth in all test datasets (Dice overlap coefficient: 0.9 ± 0.02, Hausdorff maximum surface distance: 0.93 ± 0.71 mm, mean surface distance: 0.022 ± 0.005 mm) without significant difference concerning side (two-sided Wilcoxon signed-rank test, p>0.05), or dataset (Kruskal-Wallis test, p>0.05; post-hoc Mann-Whitney U, FDR-corrected, all p>0.2). Prediction took 0.2 s, and was 2,000 times faster than a state-of-the-art atlas-based segmentation method. Conclusion: IE-Vnet TFS segmentation demonstrated high accuracy, robustness toward domain shift, and rapid prediction times. Its output works seamlessly with a previously published open-source pipeline for automatic ELS segmentation. IE-Vnet could serve as a core tool for high-volume trans-institutional studies of the inner ear. Code and pre-trained models are available free and open-source under https://github.com/pydsgz/IEVNet.
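
The Dice overlap coefficient reported above measures agreement between a predicted and a ground-truth mask as twice the number of shared foreground voxels divided by the total number of foreground voxels in both masks. A minimal sketch follows, assuming the masks are flattened into equal-length sequences of 0/1 values; the function name and the convention of returning 1.0 for two empty masks are illustrative choices, not taken from the IE-Vnet code.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two binary masks given as flat 0/1 sequences."""
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    size_sum = sum(a) + sum(b)
    if size_sum == 0:
        # Both masks are empty; treated here as perfect agreement.
        return 1.0
    return 2 * intersection / size_sum


# Toy masks of four voxels each.
prediction = [1, 1, 0, 0]
ground_truth = [1, 0, 0, 0]
overlap = dice_coefficient(prediction, ground_truth)
```

Dice is insensitive to where the disagreement lies, which is why surface-distance measures such as the Hausdorff distance are usually reported alongside it, as the abstract does.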

8.
IEEE J Biomed Health Inform ; 23(3): 969-977, 2019 05.
Article in English | MEDLINE | ID: mdl-30530377

ABSTRACT

BACKGROUND: Deep learning has recently been applied to a multitude of computer vision and medical image analysis problems. Although recent research efforts have improved the state of the art, most of the methods cannot be easily accessed, compared or used by other researchers or clinicians. Even if developers publish their code and pre-trained models on the internet, integration in stand-alone applications and existing workflows is often not straightforward, especially for clinical research partners. In this paper, we propose an open-source framework to provide AI-enabled medical image analysis through the network. METHODS: TOMAAT provides a cloud environment for general medical image analysis, composed of three basic components: (i) an announcement service, maintaining a public registry of (ii) multiple distributed server nodes offering various medical image analysis solutions, and (iii) client software offering simple interfaces for users. Deployment is realized through HTTP-based communication, along with an API and wrappers for common image manipulations during pre- and post-processing. RESULTS: We demonstrate the utility and versatility of TOMAAT on several hallmark medical image analysis tasks: segmentation, diffeomorphic deformable atlas registration, landmark localization, and workflow integration. Through TOMAAT, the high hardware demands, setup and model complexity of demonstrated approaches are transparent to users, who are provided with simple client interfaces. We present example clients in 3D Slicer, in the web browser, on iOS devices and in a commercially available, certified medical image analysis suite. CONCLUSION: TOMAAT enables deployment of state-of-the-art image segmentation in the cloud, fostering interaction among deep learning researchers and medical collaborators in the clinic. Currently, a public announcement service is hosted by the authors, and several ready-to-use services are registered and listed at http://tomaat.cloud.


Subject(s)
Cloud Computing , Deep Learning , Diagnostic Imaging , Algorithms , Humans , Image Interpretation, Computer-Assisted
9.
J Neurol ; 266(Suppl 1): 108-117, 2019 Sep.
Article in English | MEDLINE | ID: mdl-31286203

ABSTRACT

We perform classification, ranking and mapping of body sway parameters from static posturography data of patients using recent machine-learning and data-mining techniques. Body sway is measured in 293 individuals with the clinical diagnoses of acute unilateral vestibulopathy (AVS, n = 49), distal sensory polyneuropathy (PNP, n = 12), anterior lobe cerebellar atrophy (CA, n = 48), downbeat nystagmus syndrome (DN, n = 16), primary orthostatic tremor (OT, n = 25), Parkinson's disease (PD, n = 27), phobic postural vertigo (PPV, n = 59) and healthy controls (HC, n = 57). We classify disorders and rank sway features using supervised machine learning. We compute a continuous, human-interpretable 2D map of stance disorders using t-distributed stochastic neighbor embedding (t-SNE). Classification of eight diagnoses yielded 82.7% accuracy [95% CI (80.9%, 84.5%)]. Five (CA, PPV, AVS, HC, OT) were classified with a mean sensitivity and specificity of 88.4% and 97.1%, while three (PD, PNP, and DN) achieved a mean sensitivity of 53.7%. The most discriminative stance condition was ranked as "standing on foam-rubber, eyes closed". Mapping of sway path features into 2D space revealed clear clusters among CA, PPV, AVS, HC and OT subjects. We confirm previous claims that machine learning can aid in classification of clinical sway patterns measured with static posturography. Given a standardized, long-term acquisition of quantitative patient databases, modern machine learning and data analysis techniques help in visualizing, understanding and utilizing high-dimensional sensor data from clinical routine.
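
Per-diagnosis sensitivity and specificity in a multi-class setting are commonly computed one-vs-rest: one diagnosis is treated as positive and all others as negative. The sketch below illustrates that calculation under this assumption; the abstract does not describe the exact procedure, and the function name and toy labels are hypothetical.

```python
def sensitivity_specificity(gold, pred, positive):
    """One-vs-rest sensitivity and specificity for a single class.

    Returns None for a measure whose denominator is zero.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    pairs = list(zip(gold, pred))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    tn = sum(1 for g, p in pairs if g != positive and p != positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else None
    specificity = tn / (tn + fp) if (tn + fp) else None
    return sensitivity, specificity


# Toy labels reusing the abstract's diagnosis abbreviations.
gold = ["CA", "CA", "HC", "HC", "PD"]
pred = ["CA", "HC", "HC", "HC", "CA"]
sens, spec = sensitivity_specificity(gold, pred, "CA")
```

With eight classes, the negatives for any one diagnosis far outnumber its positives, so specificity tends to be high even when sensitivity is modest, which matches the pattern the abstract reports for the smaller groups.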


Subject(s)
Data Mining/methods , Diagnosis, Computer-Assisted/methods , Machine Learning , Nervous System Diseases/diagnosis , Postural Balance/physiology , Adult , Cohort Studies , Female , Humans , Male , Nervous System Diseases/physiopathology