Results 1 - 8 of 8
1.
BMC Med Inform Decis Mak ; 23(1): 196, 2023 Sep 28.
Article in English | MEDLINE | ID: mdl-37770866

ABSTRACT

BACKGROUND: Fraud, Waste, and Abuse (FWA) in medical claims have a negative impact on the quality and cost of healthcare. A major component of FWA in claims is procedure code overutilization, where one or more prescribed procedures may not be relevant to a given diagnosis and patient profile, resulting in unnecessary and unwarranted treatments and medical payments. This study aims to identify such unwarranted procedures from millions of healthcare claims. In the absence of labeled examples of unwarranted procedures, the study focused on the application of unsupervised machine learning techniques. METHODS: Experiments were conducted with deep autoencoders to find claims containing anomalous procedure codes indicative of FWA, and were compared against a baseline density-based clustering model. Diagnoses, procedures, and demographic data associated with healthcare claims were used as features for the models. A dataset of one hundred thousand claims sampled from a larger claims database was used to initially train and tune the models, followed by experiments on a dataset of thirty-three million claims. Experimental results show that the autoencoder model, when trained with a novel feature-weighted loss function, outperforms the density-based clustering approach in finding potential outlier procedure codes. RESULTS: Given the unsupervised nature of our experiments, model performance was evaluated using a synthetic outlier test dataset and a manually annotated outlier test dataset. Precision, recall and F1-scores on the synthetic outlier test dataset for the autoencoder model trained on one hundred thousand claims were 0.87, 1.0 and 0.93, respectively, while the results for these metrics on the manually annotated outlier test dataset were 0.36, 0.86 and 0.51, respectively.
The model performance on the manually annotated outlier test dataset improved further when trained on the larger thirty-three million claims dataset with precision, recall and F1-scores of 0.48, 0.90 and 0.63, respectively. CONCLUSIONS: This study demonstrates the feasibility of leveraging unsupervised, deep-learning methods to identify potential procedure overutilization from healthcare claims.
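The core idea of this abstract, scoring claims by feature-weighted reconstruction error, can be sketched as follows. This is not the paper's implementation: the deep autoencoder is replaced here by a rank-2 linear (PCA) encoder/decoder for brevity, the data are synthetic, and the feature weights are purely illustrative.

```python
import numpy as np

def weighted_reconstruction_error(x, x_hat, w):
    # Feature-weighted squared reconstruction error used as an anomaly score;
    # up-weighted features contribute more to the score.
    return np.sum(w * (x - x_hat) ** 2, axis=-1)

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))        # toy stand-in for claim feature vectors
X[-1] += 6.0                         # inject one anomalous "claim"

# Train a rank-2 linear encoder/decoder (PCA) on the presumed-normal claims;
# this stands in for the deep autoencoder described in the abstract.
X_train = X[:-1]
mu = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mu, full_matrices=False)
V = Vt[:2].T                         # 2-dimensional bottleneck
X_hat = (X - mu) @ V @ V.T + mu      # encode then decode every claim

w = np.ones(8)
w[:4] = 3.0                          # hypothetical up-weighting of procedure-code features
scores = weighted_reconstruction_error(X, X_hat, w)
print(int(np.argmax(scores)))        # the injected outlier scores highest
```

A poorly reconstructed claim (large weighted error) is flagged for review, mirroring how the paper surfaces potential outlier procedure codes.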


Subject(s)
Deep Learning , Humans , Unsupervised Machine Learning , Delivery of Health Care , Databases, Factual , Fraud
2.
Sci Rep ; 13(1): 10479, 2023 Jun 28.
Article in English | MEDLINE | ID: mdl-37380704

ABSTRACT

The hospital readmission rate is reportedly high and has placed a huge financial burden on health care systems in many countries. It is viewed as an important indicator of health care providers' quality of care. We examine the use of machine learning-based survival analysis to assess quality of care risk in hospital readmissions. This study applies various survival models to explore the risk of hospital readmissions given patient demographics and their respective hospital discharges extracted from a health care claims dataset. We explore advanced feature representation techniques such as BioBERT and Node2Vec to encode high-dimensional diagnosis code features. To our knowledge, this study is the first to apply deep-learning based survival-analysis models for predicting hospital readmission risk agnostic of specific medical conditions and a fixed window for readmission. We found that modeling the time from discharge date to readmission date as a Weibull distribution, as in the SparseDeepWeiSurv model, yields the best discriminative power and calibration. In addition, embedding representations of the diagnosis codes do not contribute to improvement in model performance. We find that each model's performance depends on the time point at which it is evaluated. This time dependency may necessitate a different choice of model for quality of care issue detection at different points in time. We show the effectiveness of deep-learning based survival-analysis models in estimating the quality of care risk in hospital readmissions.
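The Weibull time-to-event assumption mentioned above can be illustrated with its survival function. This is a generic sketch, not the SparseDeepWeiSurv model: the scale and shape values below are hypothetical placeholders for the per-patient parameters such a model would predict.

```python
import numpy as np

def weibull_survival(t, scale, shape):
    # S(t) = exp(-(t/scale)**shape): probability that readmission has
    # NOT occurred by day t under a Weibull time-to-event model.
    return np.exp(-(np.asarray(t, dtype=float) / scale) ** shape)

# Hypothetical parameters for one patient; a survival model would
# predict these from demographics and discharge features.
days = np.array([7, 30, 90])
S = weibull_survival(days, scale=120.0, shape=1.3)
```

Survival starts at 1.0 at discharge and decays monotonically, so evaluating a model at different time points (7, 30, or 90 days) can rank patients differently, which is the time dependency the abstract reports.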


Subject(s)
Health Facilities , Patient Readmission , Humans , Calibration , Health Personnel , Quality of Health Care
3.
BMC Med Inform Decis Mak ; 17(1): 49, 2017 Apr 21.
Article in English | MEDLINE | ID: mdl-28431582

ABSTRACT

BACKGROUND: It is becoming increasingly common for individuals and organizations to use social media platforms such as Facebook. These are being used for a wide variety of purposes, including disseminating, discussing and seeking health related information. U.S. Federal health agencies are leveraging these platforms to 'engage' social media users to read, spread, promote and encourage health related discussions. However, different agencies and their communications get varying levels of engagement. In this study we use statistical models to identify factors that associate with engagement. METHODS: We analyze over 45,000 Facebook posts from 72 Facebook accounts belonging to 24 health agencies. Account usage, user activity, sentiment and content of these posts are studied. We use the hurdle regression model to identify factors associated with the level of engagement and the Cox proportional hazards model to identify factors associated with duration of engagement. RESULTS: In our analysis we find that agencies and accounts vary widely in their usage of social media and the activity they generate. Statistical analysis shows, for instance, that Facebook posts with more visual cues, such as photos or videos, or those that express positive sentiment generate more engagement. We further find that posts on certain topics, such as occupations or organizations, negatively affect the duration of engagement. CONCLUSIONS: We present the first comprehensive analysis of engagement with U.S. Federal health agencies on Facebook. In addition, we briefly compare and contrast findings from this study with our earlier study of similar focus on Twitter to show the robustness of our methods.


Subject(s)
Information Dissemination , Information Seeking Behavior , Social Media , Social Networking , United States Dept. of Health and Human Services , Communication , Humans , Models, Statistical , Patient Acceptance of Health Care , United States
4.
PLoS One ; 9(11): e112235, 2014.
Article in English | MEDLINE | ID: mdl-25379727

ABSTRACT

OBJECTIVE: To investigate factors associated with engagement of U.S. Federal Health Agencies via Twitter. Our specific goals are to study factors related to a) the number of retweets, b) the time between the agency tweet and first retweet and c) the time between the agency tweet and last retweet. METHODS: We collect 164,104 tweets from 25 Federal Health Agencies and their 130 accounts. We use negative binomial hurdle regression models and Cox proportional hazards models to explore the influence of 26 factors on agency engagement. Account features include network centrality, tweet count, and numbers of friends, followers, and favorites. Tweet features include age, the use of hashtags, user-mentions, URLs, sentiment measured using SentiStrength, and tweet content represented by fifteen semantic groups. RESULTS: A third of the tweets (53,556) had zero retweets. Less than 1% (613) had more than 100 retweets (mean = 284). The hurdle analysis shows that hashtags, URLs and user-mentions are positively associated with retweets; sentiment has no association with retweets; and tweet count has a negative association with retweets. Almost all semantic groups, except for geographic areas, occupations and organizations, are positively associated with retweeting. The survival analyses indicate that engagement is positively associated with tweet age and the follower count. CONCLUSIONS: Some of the factors associated with higher levels of Twitter engagement cannot be changed by the agencies, but others can be modified (e.g., use of hashtags, URLs). Our findings provide the background for future controlled experiments to increase public health engagement via Twitter.
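The hurdle model used in this abstract (and the preceding Facebook study) is a two-part model, which can be sketched as below. This is a simplified illustration, not the paper's fitted model: the feature vector and coefficients are hypothetical, and the zero-truncation of the negative binomial count part is omitted for brevity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hurdle_expected_retweets(x, beta_hurdle, beta_count):
    # Part 1: logistic model for clearing the hurdle (at least one retweet).
    p_pos = sigmoid(x @ beta_hurdle)
    # Part 2: log-linear mean for the positive counts (the paper uses a
    # zero-truncated negative binomial; the truncation is omitted here).
    mu_pos = np.exp(x @ beta_count)
    # Expected retweets = P(any retweet) * E[retweets | at least one].
    return p_pos * mu_pos

# Hypothetical features: [intercept, has_hashtag, has_url, sentiment]
x = np.array([1.0, 1.0, 1.0, 0.2])
beta_hurdle = np.array([-1.0, 0.8, 0.6, 0.0])   # illustrative coefficients
beta_count = np.array([0.5, 0.4, 0.3, 0.0])
expected = hurdle_expected_retweets(x, beta_hurdle, beta_count)
```

Splitting the zeros from the positive counts is what lets the model handle a dataset where a third of tweets get no retweets at all while a few exceed 100.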


Asunto(s)
Medios de Comunicación Sociales , United States Dept. of Health and Human Services , Humanos , Difusión de la Información , Modelos de Riesgos Proporcionales , Análisis de Regresión , Medios de Comunicación Sociales/estadística & datos numéricos , Estados Unidos , United States Dept. of Health and Human Services/estadística & datos numéricos
5.
J Biomed Inform ; 46(5): 805-13, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23770150

ABSTRACT

Previous research on standardization of eligibility criteria and its feasibility has traditionally been conducted on clinical trial protocols from ClinicalTrials.gov (CT). The portability and use of such standardization for full-text industry-standard protocols has not been studied in depth. Towards this end, in this study we first compare the representation characteristics and textual complexity of a set of Pfizer's internal full-text protocols to their corresponding entries in CT. Next, we identify clusters of similar criteria sentences from both full-text and CT protocols and outline methods for standardized representation of eligibility criteria. We also study the distribution of eligibility criteria in full-text and CT protocols with respect to pre-defined semantic classes used for eligibility criteria classification. We find that, in comparison to full-text protocols, CT protocols are not only more condensed but also convey less information. We also find no correlation between the variations in word counts of the ClinicalTrials.gov and full-text protocols. While we identify 65 and 103 clusters of inclusion and exclusion criteria from full-text protocols, our methods found only 36 and 63 corresponding clusters from CT protocols. For both the full-text and CT protocols we are able to identify 'templates' for standardized representations, with full-text standardization being the more challenging of the two. In our exploration of the semantic class distributions we find that the majority of the inclusion criteria from both full-text and CT protocols belong to the semantic class "Diagnostic and Lab Results", while "Disease, Sign or Symptom" forms the majority for exclusion criteria. Overall, we show that developing a template set of eligibility criteria for clinical trials, specifically in their full-text form, is feasible and could lead to more efficient clinical trial protocol design.


Subject(s)
Clinical Protocols , Clinical Trials as Topic , Delivery of Health Care/organization & administration , Eligibility Determination , Female , Humans , Male
6.
BMC Bioinformatics ; 12 Suppl 8: S4, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151968

ABSTRACT

BACKGROUND: The BioCreative challenge evaluation is a community-wide effort for evaluating text mining and information extraction systems applied to the biological domain. The biocurator community, as an active user of biomedical literature, provides a diverse and engaged end user group for text mining tools. Earlier BioCreative challenges involved many text mining teams in developing basic capabilities relevant to biological curation, but they did not address the issues of system usage, insertion into the workflow and adoption by curators. Thus in BioCreative III (BC-III), the InterActive Task (IAT) was introduced to address the utility and usability of text mining tools for real-life biocuration tasks. To support the aims of the IAT in BC-III, involvement of both developers and end users was solicited, and the development of a user interface to address the tasks interactively was requested. RESULTS: A User Advisory Group (UAG) actively participated in the IAT design and assessment. The task focused on gene normalization (identifying gene mentions in the article and linking these genes to standard database identifiers), gene ranking based on the overall importance of each gene mentioned in the article, and gene-oriented document retrieval (identifying full text papers relevant to a selected gene). Six systems participated and all processed and displayed the same set of articles. The articles were selected based on content known to be problematic for curation, such as ambiguity of gene names, coverage of multiple genes and species, or introduction of a new gene name. Members of the UAG curated three articles for training and assessment purposes, and each member was assigned a system to review. A questionnaire related to the interface usability and task performance (as measured by precision and recall) was answered after systems were used to curate articles. 
Although the limited number of articles analyzed and users involved in the IAT experiment precluded rigorous quantitative analysis of the results, a qualitative analysis provided valuable insight into some of the problems encountered by users when using the systems. The overall assessment indicates that the system usability features appealed to most users, but the system performance was suboptimal (mainly due to low accuracy in gene normalization). Some of the issues included failure of species identification and gene name ambiguity in the gene normalization task, leading to an extensive list of gene identifiers to review which, in some cases, did not contain the relevant genes. The document retrieval suffered from the same shortfalls. The UAG favored achieving high performance (measured by precision and recall), but strongly recommended the addition of features that facilitate the identification of the correct gene and its identifier, such as contextual information to assist in disambiguation. DISCUSSION: The IAT was an informative exercise that advanced the dialog between curators and developers and increased the appreciation of challenges faced by each group. A major conclusion was that the intended users should be actively involved in every phase of software development, and this will be strongly encouraged in future tasks. The IAT provides the first steps toward the definition of metrics and functional requirements that are necessary for designing a formal evaluation of interactive curation systems in the BioCreative IV challenge.


Subject(s)
Data Mining/methods , Genes , Animals , Computational Biology/methods , Periodicals as Topic , Plants/genetics , Plants/metabolism
7.
BMC Bioinformatics ; 12 Suppl 8: S2, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22151901

ABSTRACT

BACKGROUND: We report the Gene Normalization (GN) challenge in BioCreative III where participating teams were asked to return a ranked list of identifiers of the genes detected in full-text articles. For training, 32 fully and 500 partially annotated articles were prepared. A total of 507 articles were selected as the test set. Due to the high annotation cost, it was not feasible to obtain gold-standard human annotations for all test articles. Instead, we developed an Expectation Maximization (EM) algorithm approach for choosing a small number of test articles for manual annotation that were most capable of differentiating team performance. Moreover, the same algorithm was subsequently used for inferring ground truth based solely on team submissions. We report team performance on both gold standard and inferred ground truth using a newly proposed metric called Threshold Average Precision (TAP-k). RESULTS: We received a total of 37 runs from 14 different teams for the task. When evaluated using the gold-standard annotations of the 50 articles, the highest TAP-k scores were 0.3297 (k=5), 0.3538 (k=10), and 0.3535 (k=20), respectively. Higher TAP-k scores of 0.4916 (k=5, 10, 20) were observed when evaluated using the inferred ground truth over the full test set. When combining team results using machine learning, the best composite system achieved TAP-k scores of 0.3707 (k=5), 0.4311 (k=10), and 0.4477 (k=20) on the gold standard, representing improvements of 12.4%, 21.8%, and 26.6% over the best team results, respectively. CONCLUSIONS: By using full text and being species non-specific, the GN task in BioCreative III has moved closer to a real literature curation task than similar tasks in the past and presents additional challenges for the text mining community, as revealed in the overall team results. 
By evaluating teams using the gold standard, we show that the EM algorithm allows team submissions to be differentiated while keeping the manual annotation effort feasible. Using the inferred ground truth we show measures of comparative performance between teams. Finally, by comparing team rankings on gold standard vs. inferred ground truth, we further demonstrate that the inferred ground truth is as effective as the gold standard for detecting good team performance.
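The ranked-retrieval evaluation described above rests on average precision over a ranked list of gene identifiers, which can be sketched as follows. Note this is plain average precision, not the paper's TAP-k: TAP-k additionally truncates the ranking at the k-th error and adds a terminal precision term, both of which this simplified version omits.

```python
def average_precision(ranked_relevance):
    # Mean of the precision values observed at each relevant rank in a
    # ranked list (1 = correct identifier, 0 = error).
    hits, total = 0, 0.0
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            total += hits / rank      # precision at this relevant rank
    return total / hits if hits else 0.0

# A hypothetical ranked gene-identifier list returned for one article
ap = average_precision([1, 0, 1, 1, 0])
```

Rewarding correct identifiers near the top of the list is what makes such metrics suitable for comparing the ranked gene lists submitted by the 14 teams.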


Asunto(s)
Algoritmos , Minería de Datos/métodos , Genes , Animales , Minería de Datos/normas , Humanos , National Library of Medicine (U.S.) , Publicaciones Periódicas como Asunto , Estados Unidos
8.
Bioinformatics ; 27(13): i120-8, 2011 Jul 01.
Article in English | MEDLINE | ID: mdl-21685060

ABSTRACT

MOTIVATION: Previous research in the biomedical text-mining domain has historically been limited to titles, abstracts and metadata available in MEDLINE records. Recent research initiatives such as TREC Genomics and BioCreAtIvE strongly point to the merits of moving beyond abstracts and into the realm of full texts. Full texts are, however, more expensive to process not only in terms of resources needed but also in terms of accuracy. Since full texts contain embellishments that elaborate, contextualize, contrast, supplement, etc., there is greater risk for false positives. Motivated by this, we explore an approach that offers a compromise between the extremes of abstracts and full texts. Specifically, we create reduced versions of full text documents that contain only important portions. In the long-term, our goal is to explore the use of such summaries for functions such as document retrieval and information extraction. Here, we focus on designing summarization strategies. In particular, we explore the use of MeSH terms, manually assigned to documents by trained annotators, as clues to select important text segments from the full text documents. RESULTS: Our experiments confirm the ability of our approach to pick the important text portions. Using the ROUGE measures for evaluation, we were able to achieve maximum ROUGE-1, ROUGE-2 and ROUGE-SU4 F-scores of 0.4150, 0.1435 and 0.1782, respectively, for our MeSH term-based method versus the maximum baseline scores of 0.3815, 0.1353 and 0.1428, respectively. Using a MeSH profile-based strategy, we were able to achieve maximum ROUGE F-scores of 0.4320, 0.1497 and 0.1887, respectively. Human evaluation of the baselines and our proposed strategies further corroborates the ability of our method to select important sentences from the full texts. CONTACT: sanmitra-bhattacharya@uiowa.edu; padmini-srinivasan@uiowa.edu.
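The ROUGE-1 and ROUGE-2 scores reported above measure n-gram overlap between a generated summary and a reference. A minimal sketch of that computation is below; real ROUGE adds options such as stemming and stopword removal, and the example sentences are invented, so the number it produces is illustrative only.

```python
from collections import Counter

def rouge_n_f1(candidate, reference, n=1):
    # F-score over n-gram overlap between a candidate summary and a
    # reference summary, a simplified form of the ROUGE-N family.
    def ngrams(text):
        toks = text.lower().split()
        return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())      # clipped n-gram matches
    total = sum(cand.values()) + sum(ref.values())
    # 2*overlap/total is the harmonic mean of precision and recall.
    return 2 * overlap / total if total else 0.0

score = rouge_n_f1("the cat sat on the mat", "the cat lay on the mat")
```

Scoring MeSH-guided extracts against reference summaries in this way is how strategies like those in the abstract can be compared against baselines.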


Subject(s)
Information Storage and Retrieval , Medical Subject Headings , MEDLINE , United States