Search | BVS - MINISTÉRIO DA SAÚDE

A general text mining method to extract echocardiography measurement results from echocardiography documents.

Szekér, Szabolcs; Fogarassy, György; Vathy-Fogarassy, Ágnes.

Artif Intell Med ; 143: 102584, 2023 Sep.

Article in English | MEDLINE | ID: mdl-37673570

ABSTRACT

BACKGROUND: In everyday medical practice, the results of cardiac ultrasound examinations are generally recorded as unstructured text, from which extracting relevant information is an important and challenging task. This paper presents a generally applicable, language- and corpus-independent text mining method for extracting and structuring numerical measurement results and their descriptions from echocardiography reports. METHOD: The developed method is based on generally applicable text mining preprocessing steps. It automatically identifies and standardizes the descriptions of the cardiac ultrasound measures, and it stores the standardized descriptions together with their measurement results in a structured form for later use. The method does not contain any regular expression-based search and does not rely on information about the structure of the document. RESULTS: The method was tested on a document set containing more than 20,000 echocardiographic reports by examining how well it extracts 12 echocardiography parameters considered important by experts. The method extracted and structured these parameters with good sensitivity (lowest value: 0.775, highest value: 1.0, average: 0.904) and excellent specificity (1.0 in all cases). The F1 score ranged between 0.873 and 1.0, with an average of 0.948. CONCLUSION: The case study shows that the proposed method can extract measurement results from echocardiography documents with high confidence without performing a direct search or having detailed information about data recording habits. Furthermore, it effectively handles spelling errors, abbreviations and the highly varied terminology used in descriptions. As it does not rely on any information about the structure or language of the documents or on data recording habits, it can be applied to any free-text medical document.
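The lowest reported F1 score is consistent with the lowest reported sensitivity. The sketch below is not taken from the paper: it assumes that a specificity of 1.0 means there were no false positives, so precision is also 1.0, and it uses the standard F1 definition. The function name `f1_score` is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: specificity 1.0 implies no false positives, hence precision 1.0.
# Recall is the lowest sensitivity reported in the abstract.
lowest_f1 = f1_score(precision=1.0, recall=0.775)
print(round(lowest_f1, 3))  # 0.873
```

Applying the same formula to the average sensitivity (0.904) gives roughly 0.950 rather than the reported 0.948, which is what one would expect if the reported figure is a mean of per-parameter F1 scores.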

Subjects

Data Mining, Echocardiography

Weighted nearest neighbours-based control group selection method for observational studies.

Szekér, Szabolcs; Vathy-Fogarassy, Ágnes.

PLoS One ; 15(7): e0236531, 2020.

Article in English | MEDLINE | ID: mdl-32701991

ABSTRACT

Although propensity score matching is the most widely used balancing method in observational studies, it has received much criticism. Its main drawback is that the individuals of the case and control groups are paired in the compressed one-dimensional space of propensity scores. This paper proposes a novel multivariate weighted k-nearest neighbours-based control group selection method that eliminates this disadvantage. The proposed method pairs the elements of the case and control groups in the original vector space of the covariates, and the dissimilarity of two individuals is calculated as the weighted distance between them. The weight factors are calculated from a logistic regression model fitted on the status of treatment assignment. The efficiency of the proposed method was evaluated by Monte Carlo simulations on different datasets. Experimental results show that the proposed Weighted Nearest Neighbours Control Group Selection with Error Minimization method is able to select a more balanced control group than the most widely applied greedy form of propensity score matching, especially for individuals characterized by few descriptive features.
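The idea of pairing individuals in the covariate space with weighted distances can be sketched as follows. This is a simplified greedy illustration, not the authors' Weighted Nearest Neighbours Control Group Selection with Error Minimization algorithm. The weights are passed in as given; in the paper they are derived from a logistic regression model, which is not reproduced here. The weighted Euclidean distance and the names `weighted_distance` and `select_controls` are assumptions made for the example.

```python
import math


def weighted_distance(a, b, weights):
    """Weighted Euclidean distance between two covariate vectors."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b)))


def select_controls(cases, candidates, weights):
    """Greedily pair each case with its nearest unused control candidate.

    Returns a dict mapping case index -> candidate index.
    """
    available = set(range(len(candidates)))
    matches = {}
    for i, case in enumerate(cases):
        if not available:
            break  # more cases than candidates: remaining cases stay unmatched
        best = min(
            sorted(available),  # sorted for deterministic tie-breaking
            key=lambda j: weighted_distance(case, candidates[j], weights),
        )
        matches[i] = best
        available.remove(best)
    return matches


# Hypothetical data: two covariates per individual.
cases = [(1.0, 0.0), (0.0, 5.0)]
candidates = [(1.0, 10.0), (0.0, 0.0), (0.0, 5.0)]

# Hypothetical weights: the first covariate dominates the pairing.
print(select_controls(cases, candidates, weights=(10.0, 0.01)))
# Equal weights give a different pairing for the first case.
print(select_controls(cases, candidates, weights=(1.0, 1.0)))
```

Changing the weights changes which candidate counts as nearest, which is the lever the weighted approach adds over an unweighted nearest-neighbour search.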

Subjects

Algorithms, Humans, Monte Carlo Method, Observational Studies as Topic, Propensity Score

The Efficiency of Different Distance Metrics for Keyword-Based Search in Medical Documents: A Short Case Study.

Vathy-Fogarassy, Ágnes; Szekér, Szabolcs; Szolár, Balázs; Fogarassy, György.

Stud Health Technol Inform ; 271: 232-239, 2020 Jun 23.

Article in English | MEDLINE | ID: mdl-32578568

ABSTRACT

BACKGROUND: Processing free-text medical documents involves many difficulties arising from the typographical errors, synonyms, and abbreviations that occur in the texts. METHODS: In this study, the applicability of the most common string similarity measures was analyzed and compared for keyword-based medical text search. RESULTS: The usefulness of the similarity measures was studied on a set of medical documents containing more than 20,000 echocardiography reports. Experimental results showed that the Jaro-Winkler dissimilarity measure is the most capable measure for exploring the content of the medical texts.
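For readers unfamiliar with the measure, the following is a minimal sketch of the textbook Jaro-Winkler similarity and the corresponding dissimilarity (one minus the similarity). It is not the implementation used in the study. The prefix scaling factor `p=0.1` and the maximum prefix length of 4 are the commonly used defaults and are assumptions here.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters match only if they are equal and not too far apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * p * (1 - base)


def jaro_winkler_dissimilarity(s1: str, s2: str) -> float:
    """Dissimilarity form: 0 for identical strings, larger for less similar ones."""
    return 1.0 - jaro_winkler(s1, s2)


# Classic textbook pair with one transposition.
print(round(jaro_winkler("MARTHA", "MARHTA"), 4))  # 0.9611
```

In a keyword search, a token would count as a hit when its dissimilarity to the keyword falls below a chosen threshold; the threshold itself would have to be tuned on the document set.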

Subjects

Benchmarking, Documentation

Application of Named Entity Recognition Methods to Extract Information from Echocardiography Reports.

Szekér, Szabolcs; Fogarassy, György; Machalik, Károly; Vathy-Fogarassy, Ágnes.

Stud Health Technol Inform ; 260: 41-48, 2019.

Article in English | MEDLINE | ID: mdl-31118317

ABSTRACT

As there is no consensus about how to store the results of echocardiography examinations, extracting information from them is a non-trivial task. Successful named entity recognition (NER) is key to accessing the stored information, and the identification process has been recognized as a bottleneck in text mining. Our goal was to develop and compare NER methods capable of performing this task. Our practical results show that the text mining-based NER method performs at a similar level to the regular expression-based NER method in finding and identifying terms. The paper highlights the advantages and disadvantages of both methods.
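To give a sense of what a regular expression-based approach looks like, here is a minimal sketch that pulls label, value and unit triples out of a report line. The pattern, the unit list and the sample text are hypothetical and are not the expressions used in the paper; real reports would need far more variants.

```python
import re

# Hypothetical pattern: a label, an optional ':' or '=', a number, and a unit.
MEASUREMENT = re.compile(
    r"(?P<label>[A-Za-z]+)\s*[:=]?\s*"
    r"(?P<value>\d+(?:[.,]\d+)?)\s*"
    r"(?P<unit>%|mm|cm|ml)"
)


def extract_measurements(text: str):
    """Return (label, value, unit) tuples found in the text."""
    results = []
    for match in MEASUREMENT.finditer(text):
        # Accept both decimal point and decimal comma.
        value = float(match.group("value").replace(",", "."))
        results.append((match.group("label"), value, match.group("unit")))
    return results


# Hypothetical report fragment.
print(extract_measurements("LVEF: 55 %, LA 42 mm"))
```

A pattern like this is easy to read but brittle: every new abbreviation, spelling variant or unit has to be added by hand, which is the kind of trade-off a comparison with a text mining-based method would surface.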

Subjects

Data Mining, Echocardiography

The Effect of Latent Binary Variables on the Uncertainty of the Prediction of a Dichotomous Outcome Using Logistic Regression Based Propensity Score Matching.

Szekér, Szabolcs; Vathy-Fogarassy, Ágnes.

Stud Health Technol Inform ; 248: 1-8, 2018.

Article in English | MEDLINE | ID: mdl-29726412

ABSTRACT

Logistic regression-based propensity score matching is a widely used method in case-control studies for selecting the individuals of the control group. This method creates a suitable control group if all factors affecting the output variable are known. However, if there are also relevant latent variables that are not taken into account in the calculations, the quality of the control group is uncertain. In this paper, we present a statistics-based study in which we try to determine the relationship between the accuracy of the logistic regression model and the uncertainty of the dependent variable of the control group defined by propensity score matching. Our analyses show that there is a linear correlation between the fit of the logistic regression model and the uncertainty of the output variable. In certain cases, a latent binary explanatory variable can result in a relative error of up to 70% in the prediction of the outcome variable. This phenomenon draws analysts' attention to an important point that must be taken into account when drawing conclusions.
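The matching step itself can be sketched as a greedy one-to-one pairing on propensity scores. This is a generic illustration under stated assumptions, not the procedure from the paper: the scores are taken as already estimated (for example by a logistic regression model, which is not shown), no caliper is applied, and the name `greedy_propensity_match` is hypothetical.

```python
def greedy_propensity_match(case_scores, control_scores):
    """Pair each case with the unused control whose score is closest.

    Returns a list of (case_index, control_index) pairs.
    """
    available = dict(enumerate(control_scores))
    pairs = []
    for i, score in enumerate(case_scores):
        if not available:
            break  # no controls left to assign
        # Ties are broken by the lower control index.
        j = min(available, key=lambda idx: (abs(available[idx] - score), idx))
        pairs.append((i, j))
        del available[j]
    return pairs


# Hypothetical propensity scores.
print(greedy_propensity_match([0.8, 0.3], [0.25, 0.75, 0.9]))
```

Because the pairing sees only the one-dimensional score, any variable that was left out of the score model cannot influence which control is chosen, which is why a latent variable can leave the matched groups unbalanced.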

Subjects

Propensity Score, Uncertainty, Humans, Logistic Models, Monte Carlo Method

Comparison of Control Group Generating Methods.

Szekér, Szabolcs; Fogarassy, György; Vathy-Fogarassy, Ágnes.

Stud Health Technol Inform ; 236: 311-318, 2017.

Article in English | MEDLINE | ID: mdl-28508812

ABSTRACT

Retrospective studies suffer from drawbacks such as selection bias. As the selection of the control group has a significant impact on the evaluation of the results, it is very important to find a method that generates the most appropriate control group. In this paper, we propose two nearest neighbors-based control group selection methods that aim to achieve good matching between the individuals of the case and control groups. The effectiveness of the proposed methods is evaluated by runtime and accuracy tests, and the results are compared to the classical stratified sampling method.

Subjects

Control Groups, Humans, Retrospective Studies
