1.
J Neurooncol ; 166(3): 569-574, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38286976

ABSTRACT

PURPOSE: Cancer is an independent risk factor for the development of venous thromboembolism (VTE). Patients with high-grade glioma (HGG), including glioblastoma (GBM), are at particularly high risk of VTE, with an incidence of up to 20-30% per year. Patients are often placed on anticoagulation if they are found to have VTE. However, patients with primary brain tumors such as HGG are at increased risk for intracerebral hemorrhage (ICH) even without the administration of anticoagulation. The combination of risk factors for ICH with anticoagulation and HGG complicates decision-making. Currently it is not known which of the direct oral anticoagulants (DOACs) are safest for patients with HGG in terms of adverse bleeding-related outcomes such as ICH. Furthermore, the clinical and molecular determinants of bleeding-related adverse outcomes in HGG are not fully characterized. METHODS: In this retrospective study, we identified and gathered data on 75 consecutive patients with pathology-confirmed HGG with hospital encounters at two academic medical center hospitals in Austin between July 1, 2017 and June 30, 2022. We compared clinical and treatment-related factors among cohorts who had received various forms of anticoagulation or no anticoagulation. RESULTS: Patients on rivaroxaban (3/7 (43%)) had significantly more bleeding-related adverse events than those on apixaban (0/12 (0%)) or enoxaparin (0/5 (0%); p = 0.022), even though the groups were similar in characteristics including total time on the respective anticoagulation. Patients on anticoagulation vs those never on anticoagulation did not differ in terms of their studied demographic and clinical characteristics. Intriguingly, logistic regression analysis revealed that patients with astrocytoma, isocitrate dehydrogenase (IDH)-mutant, grade 4 had a significant association with more adverse bleeding-related events even when controlling for other relevant factors (odds ratio compared to reference GBM: 49.4, 95% CI: 2.8, 2084.7; p = 0.013). CONCLUSION: In this study we found that the use of rivaroxaban was associated with more bleeding-related events compared to apixaban and enoxaparin in patients with high-grade glioma. We also found that the diagnosis of astrocytoma, IDH-mutant, grade 4 was associated with more bleeding events. However, this is based on a small study and there is a need for larger studies to further evaluate these results.


Subjects
Astrocytoma; Glioma; Venous Thromboembolism; Humans; Anticoagulants/adverse effects; Rivaroxaban/adverse effects; Enoxaparin/adverse effects; Retrospective Studies; Venous Thromboembolism/drug therapy; Venous Thromboembolism/epidemiology; Hemorrhage/chemically induced; Hemorrhage/epidemiology; Glioma/complications; Glioma/drug therapy; Astrocytoma/complications
2.
J Neurooncol ; 167(1): 181-188, 2024 Mar.
Article in English | MEDLINE | ID: mdl-38372903

ABSTRACT

PURPOSE: Bevacizumab has evolved as an integral treatment option for patients with high-grade gliomas. Little is known about clinical risk factors that predispose patients with high-grade gliomas receiving bevacizumab to VTE or ICH. We sought to characterize the clinical risk factors associated with risk of either event. METHODS: In this multi-institutional retrospective study, we first evaluated patients with high-grade gliomas who were treated with bevacizumab at University of Texas MD Anderson Cancer Center from 2015-2021. We compared clinical and treatment-related factors among three cohorts: those who developed VTE, ICH, or neither. We further compared survival outcomes of these patients from the time of bevacizumab initiation. Then, to further confirm our results in a non-cancer center hospital setting, we evaluated patients from two Ascension Seton Hospitals in Austin, Texas, which are affiliated with Dell Medical School at the University of Texas at Austin, from 2017-2022. RESULTS: We found that the presence of cerebral macrobleeding, defined as a magnetic susceptibility of > 1 cm³ on magnetic resonance imaging, was highly associated with risk of developing ICH after initiation of bevacizumab. Development of ICH was significantly associated with poorer survival outcomes. We did not find a statistically significant effect of VTE on survival after bevacizumab initiation. CONCLUSION: In order to stratify the risk for developing ICH before the initiation of bevacizumab, we recommend assessing for the presence of cerebral macrobleeding, as it is associated with ICH development.


Subjects
Brain Neoplasms; Glioma; Venous Thromboembolism; Humans; Bevacizumab/adverse effects; Venous Thromboembolism/chemically induced; Retrospective Studies; Glioma/complications; Glioma/drug therapy; Risk Factors; Brain Neoplasms/pathology
3.
J Trauma Stress ; 37(4): 606-616, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38565718

ABSTRACT

Divergent conceptualization of posttraumatic stress disorder (PTSD) within the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5) and International Statistical Classification of Diseases and Related Health Problems (11th ed.; ICD-11) significantly confounds both research and practice. Using a diverse sample of trauma-exposed youth (N = 1,542, age range: 8-20 years), we compared these two diagnostic approaches along with an expanded version of the ICD-11 PTSD criteria that included three additional reexperiencing symptoms (ICD-11+). Within the sample, PTSD was more prevalent using the DSM-5 criteria (25.7%) compared to the ICD-11 criteria (16.0%), with moderate agreement between these diagnostic systems, κ = .57. The inclusion of additional reexperiencing symptoms (i.e., ICD-11+) reduced this discrepancy in prevalence (24.7%) and increased concordance with DSM-5 criteria, κ = .73. All three PTSD classification systems exhibited similar comorbidity rates with major depressive episode (MDE) or generalized anxiety disorder (GAD; 78.0%-83.6%). Most youths who met the DSM-5 PTSD criteria also met the criteria for ICD-11 PTSD, MDE, or GAD (88.4%), and this proportion increased when applying the ICD-11+ criteria (95.5%). Symptom-level analyses identified reexperiencing/intrusions and negative alterations in cognition and mood symptoms as primary sources of discrepancy between the DSM-5 and ICD-11 PTSD diagnostic systems. Overall, these results challenge assertions that nonspecific distress and diagnostically overlapping symptoms within DSM-5 PTSD inflate comorbidity with depressive and anxiety disorders. Further, they support the argument that the DSM-5 PTSD criteria can be refined and simplified without reducing the overall prevalence of psychiatric diagnoses in youth.
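
The agreement statistic quoted above is Cohen's kappa, which compares observed agreement between two diagnostic systems with the agreement expected by chance from their prevalences. The sketch below shows the calculation for a 2 x 2 table. The cell counts are hypothetical: they were chosen only so that the marginals approximate the reported prevalences (25.7% and 16.0% of 1,542) and are not the study's data.

```python
def cohen_kappa(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa for two binary classifications of the same subjects."""
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n      # raw agreement
    p_a = (both_pos + only_a) / n             # prevalence under system A
    p_b = (both_pos + only_b) / n             # prevalence under system B
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # chance agreement
    return (observed - expected) / (1 - expected)


# HYPOTHETICAL counts (not from the paper): A = DSM-5, B = ICD-11.
# 210 + 186 = 396 positives under A (about 25.7% of 1,542),
# 210 + 37 = 247 positives under B (about 16.0% of 1,542).
kappa = cohen_kappa(both_pos=210, only_a=186, only_b=37, both_neg=1109)
print(round(kappa, 2))
```

With these illustrative counts the raw agreement is about 86%, yet kappa is only moderate, because most of that agreement is already expected by chance when both systems classify the majority of youths as negative.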


Subjects
Diagnostic and Statistical Manual of Mental Disorders; International Classification of Diseases; Stress Disorders, Post-Traumatic; Humans; Stress Disorders, Post-Traumatic/diagnosis; Stress Disorders, Post-Traumatic/epidemiology; Adolescent; Female; Male; Child; Young Adult; Prevalence; Psychiatric Status Rating Scales/standards
4.
J Interprof Care ; 37(2): 254-261, 2023.
Article in English | MEDLINE | ID: mdl-36739557

ABSTRACT

The need for blueprints to design specialty care interprofessional collaboration (IPC) models is urgent, given the expanding aging population and current challenges in dementia diagnosis and treatment. We describe key steps in creating an interprofessional outpatient dementia specialty clinic, efforts to sustain the model, and evaluation of interprofessional effectiveness and clinician satisfaction. The conception for the Comprehensive Memory Center was informed by qualitative research methodologies including focus groups, interviews, and literature reviews. Quantitative evaluation included satisfaction surveys and team effectiveness measures. The IPC model diverges from typical dementia practices through its interprofessional team, visit structure, approach to decision-making, in-house services, and community collaborations. Team retreats and workshops helped build clinician knowledge of interprofessional values and practices to sustain the IPC model. In the first 3.5 years, we served nearly 750 patients and their caregivers. Team evaluation results revealed that increased access to consultation and sharing the workload and emotional burden were beneficial. The majority of team members preferred the IPC model to traditional models of clinical care.


Subjects
Dementia; Interprofessional Relations; Humans; Aged; Concept Formation; Focus Groups; Dementia/diagnosis; Dementia/therapy; Patient-Centered Care; Cooperative Behavior; Patient Care Team
5.
Headache ; 62(2): 141-158, 2022 02.
Article in English | MEDLINE | ID: mdl-35156215

ABSTRACT

OBJECTIVE: To quantify and compare healthcare utilization and costs for patients with chronic migraine (CM), episodic migraine (EM), and tension-type headache (TTH) enrolled in US commercial health plans. METHODS: This retrospective cohort study used the Optum Clinformatics® Data Mart database from January 2015 to December 2018. Adult patients with a diagnosis of EM, CM or TTH and at least 12 months of continuous enrollment before and after diagnosis were included. Inverse probability of treatment weighting was used to adjust for baseline differences among the three groups. Patient demographic and clinical characteristics at baseline, and healthcare utilization and costs during follow-up, were described and compared between the three groups. RESULTS: A total of 45,849 patients were included: 8955 with CM, 31,961 with EM, and 4933 with TTH. The total all-cause annual direct medical costs of patients with CM ($17,878) were 1.38 times higher (95% CI: 1.31-1.44) than those with EM ($12,986), and 2.26 times higher (95% CI: 2.08-2.47) than those with TTH ($7902). The annual migraine/TTH-related costs of patients with CM ($1869) were 4.19 times higher (95% CI: 3.92-4.48) than those with EM ($446), and 11.90 times (95% CI: 10.59-13.52) higher than those with TTH ($157). In the adjusted analyses, for all service categories (emergency department, inpatient, outpatient, and prescriptions), the expected costs in the migraine groups were higher than in the TTH group (all p < 0.001), while controlling for covariates. Main findings were consistent in both weighted and unweighted samples, and with both unadjusted and adjusted analyses. CONCLUSION: This study provides an updated assessment of healthcare utilization and expenditures for adult patients with primary headache disorders. Compared to TTH, migraine is associated with higher resource use and direct medical costs, especially for those with a chronic condition. Future studies are needed to understand the indirect medical costs (productivity loss) and humanistic burden (quality of life) between migraine and TTH.
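
The cost ratios in this abstract can be cross-checked against the dollar figures it reports. The sketch below simply divides the reported group means. It is an arithmetic check only: the published ratios and confidence intervals presumably come from the weighted analyses, which cannot be reproduced from the abstract. The dictionary and function names are illustrative.

```python
# Annual costs in USD as reported in the abstract.
all_cause = {"CM": 17878, "EM": 12986, "TTH": 7902}
headache_related = {"CM": 1869, "EM": 446, "TTH": 157}
cohort_sizes = {"CM": 8955, "EM": 31961, "TTH": 4933}


def cost_ratio(costs, numerator, denominator):
    """Ratio of two group means, rounded to two decimals as in the abstract."""
    return round(costs[numerator] / costs[denominator], 2)


ratios = {
    "all-cause CM vs EM": cost_ratio(all_cause, "CM", "EM"),
    "all-cause CM vs TTH": cost_ratio(all_cause, "CM", "TTH"),
    "related CM vs EM": cost_ratio(headache_related, "CM", "EM"),
    "related CM vs TTH": cost_ratio(headache_related, "CM", "TTH"),
}
total_patients = sum(cohort_sizes.values())

for label, value in ratios.items():
    print(f"{label}: {value:.2f}")
print("total patients:", total_patients)
```

The four ratios come out as 1.38, 2.26, 4.19, and 11.90, and the three cohorts sum to 45,849, so the abstract's figures are internally consistent.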


Subjects
Administrative Claims, Healthcare/statistics & numerical data; Health Expenditures/statistics & numerical data; Insurance, Health/statistics & numerical data; Migraine Disorders/therapy; Patient Acceptance of Health Care/statistics & numerical data; Tension-Type Headache/therapy; Adult; Chronic Disease; Emergency Service, Hospital; Female; Humans; Male; Retrospective Studies
6.
J Biomed Inform ; 136: 104241, 2022 12.
Article in English | MEDLINE | ID: mdl-36375772

ABSTRACT

OBJECTIVE: To describe methods to approach application of data standards to integrate social determinants of health (SDoH) into EHRs through evaluation of a case of clinical decision support for pediatric asthma. MATERIALS AND METHODS: We identified a list of environmental factors important for managing pediatric asthma. We identified and integrated data from local outdoor air quality monitors with elements available from the clinic's EHR and self-reported indoor air quality questionnaire data. We assessed existing SDoH frameworks, assessment tools, and terminologies to identify representative data standards for these environmental SDoH measures. RESULTS: We found many-to-many relationships between the multiple framework domains, the environmental exposure measures collected, and existing standards. The majority of concepts did not accurately align with environmental exposure measurements. We propose an ontology-driven information framework methodology to apply standards for SDoH measurements to support measuring, managing, and computing SDoH data. DISCUSSION: To support methods of integrating SDoH data in the EHR via an ontology-driven information framework, a common SDoH ecosystem should be developed descriptively and prescriptively integrating framework domains, assessment tools, and standard ontologies to support future data sharing, aggregation, and interoperability. A hierarchical object-oriented information model should be adopted to manage SDoH to extend beyond patient-centered orientation of EHRs to orient to households and communities. CONCLUSION: SDoH data pose unique challenges and opportunities in collecting, measuring, and managing health information. Future work is needed to define data standards for implementing SDoH in a hierarchical, object-oriented information model representing multiple units of orientation including individuals, households, and communities.


Subjects
Asthma; Decision Support Systems, Clinical; Humans; Child; Social Determinants of Health; Ecosystem; Surveys and Questionnaires; Asthma/diagnosis; Asthma/therapy
7.
J Biomed Inform ; 132: 104139, 2022 08.
Article in English | MEDLINE | ID: mdl-35811026

ABSTRACT

Accurate identification of the presence, absence or possibility of relevant entities in clinical notes is important for healthcare professionals to quickly understand crucial clinical information. This introduces the task of assertion classification - to correctly identify the assertion status of an entity in the unstructured clinical notes. Recent rule-based and machine-learning approaches suffer from labor-intensive pattern engineering and severe class bias toward majority classes. To solve this problem, in this study, we propose a prompt-based learning approach, which treats the assertion classification task as a masked language auto-completion problem. We evaluated the model on six datasets. Our prompt-based method achieved a micro-averaged F-1 of 0.954 on the i2b2 2010 assertion dataset, with ∼1.8% improvements over previous works. In particular, our model showed excellence in detecting classes with few instances (few-shot). Evaluations on five external datasets showcase the outstanding generalizability of the prompt-based method to unseen data. To examine the rationality of our model, we further introduced two rationale faithfulness metrics: comprehensiveness and sufficiency. The results reveal that compared to the "pre-train, fine-tune" procedure, our prompt-based model has a stronger capability of identifying the comprehensive (∼63.93%) and sufficient (∼11.75%) linguistic features from free text. We further evaluated the model-agnostic explanations using LIME. The results imply a better rationale agreement between our model and human beings (∼71.93% in average F-1), which demonstrates the superior trustworthiness of our model.
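
The headline metric above is a micro-averaged F1 score. As a minimal sketch, micro-averaging pools true positives, false positives, and false negatives over all classes before computing precision and recall. The labels below are a made-up toy example using the presence/absence/possibility statuses the abstract mentions; they are not drawn from any of the six datasets.

```python
def micro_f1(gold, predicted):
    """Micro-averaged F1 over all classes for single-label predictions."""
    classes = set(gold) | set(predicted)
    tp = fp = fn = 0
    for c in classes:
        for g, p in zip(gold, predicted):
            if p == c and g == c:
                tp += 1   # correctly predicted class c
            elif p == c and g != c:
                fp += 1   # predicted c, but gold is another class
            elif p != c and g == c:
                fn += 1   # gold is c, but another class was predicted
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy labels, purely illustrative.
gold = ["present", "absent", "possible", "present", "absent"]
pred = ["present", "absent", "present", "present", "possible"]

score = micro_f1(gold, pred)
accuracy = sum(g == p for g, p in zip(gold, pred)) / len(gold)
print(score, accuracy)
```

When every entity receives exactly one label, micro-averaged F1 equals plain accuracy, so it is dominated by the majority classes. That is why performance on rare classes, which the abstract highlights, has to be inspected per class or through a macro average.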


Subjects
Electronic Health Records; Natural Language Processing; Humans; Linguistics; Machine Learning
8.
PLoS One ; 19(3): e0294892, 2024.
Article in English | MEDLINE | ID: mdl-38512832

ABSTRACT

BACKGROUND: Dexamethasone was approved for use in hospitalized COVID-19 patients early in the pandemic based on the RECOVERY trial, but evidence is still needed to support its real-world effectiveness in heterogeneous populations of patients with a wide range of comorbidities. METHODS: COVID-19 inpatients represented within the National COVID Cohort Collaborative (N3C) Data Enclave, prior to vaccine availability, were studied. Primary outcome was in-hospital death; secondary outcome was combined in-hospital death and severe outcome defined by use of ECMO or mechanical ventilation. Missing data were imputed with single imputation. Dexamethasone-treated patients were propensity score (PS) matched to non-dexamethasone-treated controls, stratified by remdesivir treatment and based on demographics, baseline laboratory values, comorbidities, and amount of missing data before imputation. Treatment benefit was quantified using logistic regression. Further sensitivity analyses were performed using clinical adjusters in matched groups and in strata defined by quartiles of PS. RESULTS: Dexamethasone treatment was associated with reduced risk of in-hospital mortality for n = 1,263 treated, matched 1:3 to untreated, patients not receiving remdesivir (OR = 0.77, 95% CI: 0.62 to 0.95, p = 0.017), and for n = 804 treated, matched 1:1 to untreated, patients receiving remdesivir (OR = 0.74, 95% CI: 0.53 to 1.02, p = 0.054). Treatment showed secondary outcome benefit. In sensitivity analyses, treatment effect generally remained similar with some heterogeneity of benefit across quartiles of PS, possibly reflecting concentration of benefit among the more severely affected. CONCLUSIONS: We add evidence that dexamethasone provides benefit with respect to mortality and severe outcomes in a diverse, national hospitalized sample, prior to vaccine availability.


Subjects
COVID-19; Vaccines; Humans; United States/epidemiology; Pandemics; Hospital Mortality; COVID-19/epidemiology; COVID-19 Drug Treatment; Inpatients; Dexamethasone/therapeutic use
9.
J Endocr Soc ; 8(8): bvae096, 2024 Jul 01.
Article in English | MEDLINE | ID: mdl-38988672

ABSTRACT

Context: Primary hyperparathyroidism (PHPT) increases the risk of bone loss, debilitating fractures, kidney stones, impaired renal function, and neurocognitive symptoms. Studies describing the natural history of PHPT have been limited to small samples, single institutions, or specific populations. Objective: We assessed the natural history of PHPT through a large, diverse national cohort from an electronic health record dataset representing more than 100 million patients. Methods: The TriNetX database was queried for adult patients with PHPT. We extracted demographics, comorbidities, and longitudinal biochemistries. Primary outcomes included major osteoporotic fracture (MOF) and chronic kidney disease (CKD). Outcomes were stratified by treatment strategy (surgical parathyroidectomy [PTX] vs nonsurgical) and age. Results: Among 50 958 patients with PHPT, 26.5% were treated surgically at a median of 0.3 years postdiagnosis. At diagnosis, median age was 65 years, 74.0% were female, and median calcium level was 10.9 mg/dL. Black and older patients underwent PTX less frequently than White and younger patients. MOF 10-year incidence was 5.20% (PTX) and 7.91% (nonsurgical), with median 1.7-year delay with PTX compared to nonsurgical. PTX-associated MOF absolute risk reduction was 0.83% (age < 65 years) and 3.33% (age ≥ 65 years). CKD 10-year incidence was 21.2% (PTX) and 33.6% (nonsurgical), with median 1.9-year delay with PTX. PTX-associated CKD absolute risk reduction was 12.2% (age < 65 years) and 9.5% (age ≥ 65 years). Conclusion: We report 1 of the largest, representative, population-based natural histories of PHPT with different management strategies. A minority of patients underwent PTX, especially in older age. Patients managed surgically had lower incidence of fracture and CKD, and older patients experienced differential benefit.
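
The 10-year incidences reported above allow a simple crude comparison between management strategies. The sketch below computes the unadjusted difference in cumulative incidence (nonsurgical minus PTX) in percentage points. These crude differences are our own arithmetic and need not match the age-stratified absolute risk reductions quoted in the abstract, since the surgical and nonsurgical groups differ in age and other characteristics.

```python
# 10-year cumulative incidence (%) as reported in the abstract.
incidence = {
    "MOF": {"PTX": 5.20, "nonsurgical": 7.91},
    "CKD": {"PTX": 21.2, "nonsurgical": 33.6},
}


def crude_risk_difference(outcome):
    """Unadjusted difference in 10-year incidence, in percentage points."""
    groups = incidence[outcome]
    return round(groups["nonsurgical"] - groups["PTX"], 2)


mof_diff = crude_risk_difference("MOF")
ckd_diff = crude_risk_difference("CKD")
print(f"MOF: {mof_diff:.2f} percentage points")
print(f"CKD: {ckd_diff:.2f} percentage points")
```

The crude MOF difference (2.71 points) falls between the two age-specific reductions, while the crude CKD difference (12.4 points) exceeds both of them, a reminder that pooled and stratified estimates from observational data can diverge.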

10.
JMIR AI ; 3: e52095, 2024 May 16.
Article in English | MEDLINE | ID: mdl-38875593

ABSTRACT

BACKGROUND: Large language models (LLMs) have the potential to support promising new applications in health informatics. However, practical data on sample size considerations for fine-tuning LLMs to perform specific tasks in biomedical and health policy contexts are lacking. OBJECTIVE: This study aims to evaluate sample size and sample selection techniques for fine-tuning LLMs to support improved named entity recognition (NER) for a custom data set of conflicts of interest disclosure statements. METHODS: A random sample of 200 disclosure statements was prepared for annotation. All "PERSON" and "ORG" entities were identified by each of the 2 raters, and once appropriate agreement was established, the annotators independently annotated an additional 290 disclosure statements. From the 490 annotated documents, 2500 stratified random samples in different size ranges were drawn. The 2500 training set subsamples were used to fine-tune a selection of language models across 2 model architectures (Bidirectional Encoder Representations from Transformers [BERT] and Generative Pre-trained Transformer [GPT]) for improved NER, and multiple regression was used to assess the relationship between sample size (sentences), entity density (entities per sentence [EPS]), and trained model performance (F1-score). Additionally, single-predictor threshold regression models were used to evaluate the possibility of diminishing marginal returns from increased sample size or entity density. RESULTS: Fine-tuned models ranged in topline NER performance from F1-score=0.79 to F1-score=0.96 across architectures. Two-predictor multiple linear regression models were statistically significant with multiple R² ranging from 0.6057 to 0.7896 (all P<.001). EPS and the number of sentences were significant predictors of F1-scores in all cases (P<.001), except for the GPT-2_large model, where EPS was not a significant predictor (P=.184). Model thresholds indicate points of diminishing marginal return from increased training data set sample size measured by the number of sentences, with point estimates ranging from 439 sentences for RoBERTa_large to 527 sentences for GPT-2_large. Likewise, the threshold regression models indicate a diminishing marginal return for EPS with point estimates between 1.36 and 1.38. CONCLUSIONS: Relatively modest sample sizes can be used to fine-tune LLMs for NER tasks applied to biomedical text, and training data entity density should representatively approximate entity density in production data. Training data quality and a model architecture's intended use (text generation vs text processing or classification) may be as important as, or more important than, training data volume and model parameter size.

11.
World Med Health Policy ; 16(3): 489-505, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39430118

ABSTRACT

Considerable efforts have been devoted to addressing the problem of conflicts of interest (COI) in health research, policy, education, and practice. An overwhelming body of evidence demonstrates that conflicts associate with deleterious outcomes for the biomedical research enterprise. Nevertheless, little has changed for research, specifically, since the Institute of Medicine's landmark Conflicts of Interest in Medical Research, Practice, and Education was published over a decade ago. In this article, we draw on interdisciplinary research on manufactured controversies in science-policy deliberation to argue that the development of meaningful COI policy has been stymied through argumentative "wedges" designed to delay consensus and policy formation. Argumentative wedges disrupt policy formation by mischaracterizing the evidence base, continuously redefining the terms of the debate and/or recommending overly narrow criteria for who should be allowed to participate in policy deliberation. In this article, we argue researchers and policymakers interested in better addressing the harmful effects of COI can improve their efforts through strategic efforts designed to disrupt the wedges of manufactured controversy. Additionally, we argue that efforts to address COI can be further enhanced through embracing a broader framework for COI inquiry. Specifically, we argue that aggregate approaches to COI can help to disrupt these wedges and provide a strong foundation for future policy.

12.
Psychiatry Res ; 334: 115772, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38442477

ABSTRACT

This investigation, conducted within the Texas Childhood Trauma Research Network, investigated the prospective relationships between resiliency and emergent internalizing symptoms among trauma-exposed youth. The cohort encompassed 1262 youth, aged 8-20, from twelve health-related institutions across Texas, who completed assessments at baseline and one- and six-month follow-ups for resiliency, symptoms of depression, generalized anxiety, posttraumatic stress disorder (PTSD), and other demographic and clinical characteristics. At baseline, greater resilience was positively associated with older age, male (vs female) sex assigned at birth, and history of mental health treatment. Unadjusted for covariates, higher baseline resilience was associated with greater prospective depression and PTSD symptoms but not anxiety symptoms. Upon adjusting for demographic and clinical factors, higher baseline resilience was no longer associated with depression, PTSD, or anxiety symptoms. Our analyses demonstrate that the predictive value of resilience on psychopathology is relatively small compared to more readily observable clinical and demographic factors. These data suggest a relatively minor prospective role of resilience in protecting against internalizing symptoms among trauma-exposed youth and highlight the importance of controlling for relevant youth characteristics when investigating a protective effect of resilience on internalizing symptoms.


Subjects
Resilience, Psychological; Stress Disorders, Post-Traumatic; Infant, Newborn; Child; Adolescent; Female; Male; Humans; Depression/etiology; Anxiety Disorders; Anxiety/etiology
13.
NPJ Digit Med ; 7(1): 190, 2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39043988

ABSTRACT

Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges, an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V performs comparably to human physicians regarding multi-choice accuracy (81.6% vs. 77.8%). GPT-4V also performs well in cases where physicians incorrectly answer, with over 78% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominent in image comprehension (27.2%). Regardless of GPT-4V's high accuracy in multi-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows.

14.
ArXiv ; 2024 Aug 31.
Article in English | MEDLINE | ID: mdl-38410646

ABSTRACT

Recent studies indicate that Generative Pre-trained Transformer 4 with Vision (GPT-4V) outperforms human physicians in medical challenge tasks. However, these evaluations primarily focused on the accuracy of multi-choice questions alone. Our study extends the current scope by conducting a comprehensive analysis of GPT-4V's rationales of image comprehension, recall of medical knowledge, and step-by-step multimodal reasoning when solving New England Journal of Medicine (NEJM) Image Challenges, an imaging quiz designed to test the knowledge and diagnostic capabilities of medical professionals. Evaluation results confirmed that GPT-4V performs comparably to human physicians regarding multi-choice accuracy (81.6% vs. 77.8%). GPT-4V also performs well in cases where physicians incorrectly answer, with over 78% accuracy. However, we discovered that GPT-4V frequently presents flawed rationales in cases where it makes the correct final choices (35.5%), most prominent in image comprehension (27.2%). Regardless of GPT-4V's high accuracy in multi-choice questions, our findings emphasize the necessity for further in-depth evaluations of its rationales before integrating such multimodal AI models into clinical workflows.

15.
Proc Conf Assoc Comput Linguist Meet ; 2023: 12532-12555, 2023 Jul.
Article in English | MEDLINE | ID: mdl-37701928

ABSTRACT

A human decision-maker benefits the most from an AI assistant that corrects for their biases. For problems such as generating interpretation of a radiology report given findings, a system predicting only highly likely outcomes may be less useful, where such outcomes are already obvious to the user. To alleviate biases in human decision-making, it is worth considering a broad differential diagnosis, going beyond the most likely options. We introduce a new task, "less likely brainstorming," that asks a model to generate outputs that humans think are relevant but less likely to happen. We explore the task in two settings: a brain MRI interpretation generation setting and an everyday commonsense reasoning setting. We found that a baseline approach of training with less likely hypotheses as targets generates outputs that humans evaluate as either likely or irrelevant nearly half of the time; standard MLE training is not effective. To tackle this problem, we propose a controlled text generation method that uses a novel contrastive learning strategy to encourage models to differentiate between generating likely and less likely outputs according to humans. We compare our method with several state-of-the-art controlled text generation models via automatic and human evaluations and show that our models' capability of generating less likely outputs is improved.

16.
IEEE Winter Conf Appl Comput Vis ; 2023: 4976-4985, 2023 Jan.
Article in English | MEDLINE | ID: mdl-37051561

ABSTRACT

Deep neural networks (DNNs) have rapidly become a de facto choice for medical image understanding tasks. However, DNNs are notoriously fragile to class imbalance in image classification. We further point out that such imbalance fragility can be amplified when it comes to more sophisticated tasks such as pathology localization, as imbalances in such problems can have highly complex and often implicit forms of presence. For example, different pathologies can have different sizes or colors (w.r.t. the background), different underlying demographic distributions, and in general different difficulty levels to recognize, even in a meticulously curated balanced distribution of training data. In this paper, we propose to use pruning to automatically and adaptively identify hard-to-learn (HTL) training samples, and improve pathology localization by attending to them explicitly during training in supervised, semi-supervised, and weakly-supervised settings. Our main inspiration is drawn from the recent finding that deep classification models have difficult-to-memorize samples and those may be effectively exposed through network pruning [15], and we extend such observation beyond classification for the first time. We also present an interesting demographic analysis which illustrates HTLs' ability to capture complex demographic imbalances. Our extensive experiments on the Skin Lesion Localization task in multiple training settings show that paying additional attention to HTLs significantly improves localization performance, by ~2-3%.

17.
AMIA Jt Summits Transl Sci Proc ; 2023: 477-486, 2023.
Article in English | MEDLINE | ID: mdl-37350891

ABSTRACT

This paper applies eXplainable Artificial Intelligence (XAI) methods to investigate the socioeconomic disparities in COVID-19 patient mortality. An Extreme Gradient Boosting (XGBoost) prediction model is built based on a de-identified Austin area hospital dataset to predict the mortality of COVID-19 patients. We apply two XAI methods, Shapley Additive exPlanations (SHAP) and Locally Interpretable Model Agnostic Explanations (LIME), to compare the global and local interpretation of feature importance. This paper demonstrates the advantages of using XAI, which shows feature importance and decisive capability. Furthermore, we use the XAI methods to cross-validate their interpretations for individual patients. The XAI models reveal that Medicare financial class, older age, and gender have a high impact on the mortality prediction. We find that LIME's local interpretation does not show significant differences in feature importance compared to SHAP, which suggests pattern confirmation. This paper demonstrates the importance of XAI methods in cross-validation of feature attributions.
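
The Shapley attribution idea that SHAP builds on can be illustrated with a minimal sketch. This is not the authors' pipeline and does not use the SHAP or XGBoost libraries: it computes exact Shapley values for a tiny hypothetical scoring function. The function `toy_risk`, its feature names, and its weights are invented for illustration and are not results from the paper.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        # Average the marginal contribution of f over all coalitions of the others.
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value_fn(s | {f}) - value_fn(s))
        phi[f] = total
    return phi


def toy_risk(present):
    """Hypothetical risk score; the weights are illustrative assumptions only."""
    score = 0.0
    if "older_age" in present:
        score += 0.30
    if "medicare" in present:
        score += 0.20
    if "gender" in present:
        score += 0.05
    # An interaction term, split evenly between the two features by Shapley.
    if "older_age" in present and "medicare" in present:
        score += 0.10
    return score


phi = shapley_values(toy_risk, ["older_age", "medicare", "gender"])
for name, value in phi.items():
    print(f"{name}: {value:.3f}")
```

The attributions sum to the difference between the score with all features present and the score with none, which is the additivity property that makes per-patient explanations comparable across methods.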

18.
AJOB Empir Bioeth ; 14(2): 91-98, 2023.
Article in English | MEDLINE | ID: mdl-36576202

ABSTRACT

INTRODUCTION: Financial conflicts of interest (fCOI) present well-documented risks to the integrity of biomedical research. However, few studies differentiate among fCOI types in their analyses, and those that do tend to use preexisting taxonomies for fCOI identification. Research on fCOI would benefit from an empirically derived taxonomy of self-reported fCOI and data on fCOI type and payor prevalence. METHODS: We conducted a content analysis of 6,165 individual self-reported relationships from COI statements distributed across 378 articles indexed in PubMed. Two coders used an iterative coding process to identify and classify individual fCOI types and payors. Inter-rater reliability was κ = 0.935 for fCOI type and κ = 0.884 for payor identification. RESULTS: Our analysis identified 21 fCOI types, 9 of which occurred at prevalences greater than 1%. These included research funding (24.8%), speaking fees (20.8%), consulting fees (18.8%), advisory relationships (11%), industry employment (7.6%), unspecified fees (4.8%), travel fees (3.2%), stock holdings (3.1%), and patent ownership (1%). Reported fCOI were held with 1,077 unique payors, 22 of which were present in more than 1% of financial relationships. The ten most common payors included Pfizer (4%), Novartis (3.9%), MSD (3.8%), Bristol Myers Squibb (3.2%), AstraZeneca (3.1%), GSK (3%), Boehringer Ingelheim (2.9%), Roche (2.8%), Eli Lilly (2.5%), and AbbVie (2.4%). CONCLUSIONS: These results provide novel multi-domain prevalence data on self-reported fCOI and payors in biomedical research. As such, they have the potential to catalyze future research that can assess the differential effects of various types of fCOI. Specifically, the data suggest that comparative analyses of the effects of different fCOI types are needed and that special attention should be paid to the diversity of payor types for research relationships.


Subjects
Biomedical Research , Humans , Self Report , Reproducibility of Results , Conflict of Interest , Industry
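
The inter-rater reliability figures in the abstract above (κ = 0.935 and κ = 0.884 for two coders) can be made concrete with a small sketch. It assumes the reported κ is Cohen's kappa, which the abstract does not state, and it uses made-up labels rather than the study's data; `cohens_kappa` is a hypothetical helper name.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each coder's marginal label frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Illustrative labels only, not data from the study.
coder_1 = ["research", "research", "speaking", "consulting"]
coder_2 = ["research", "speaking", "speaking", "consulting"]
print(round(cohens_kappa(coder_1, coder_2), 3))
```

Kappa discounts the agreement that two coders would reach by chance given how often each uses every label, so it is stricter than raw percent agreement.
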
19.
J Psychiatr Res ; 167: 1-9, 2023 Sep 25.
Article in English | MEDLINE | ID: mdl-37778242

ABSTRACT

OBJECTIVE: Previous work investigating the impact of childhood trauma on substance use and co-occurring psychiatric disorders has primarily been conducted in adults or on specific trauma types. This limits understanding of trauma's impact in childhood and of how different types of trauma play a role. We sought to characterize substance use in a sample of trauma-exposed youth in the context of psychiatric comorbidities. METHOD: 1152 youth from the Texas Childhood Trauma Research Network (TX-CTRN) who were exposed to at least one trauma meeting DSM-5 Criterion A were assessed for current substance use and psychiatric diagnoses. Latent class analysis was used to identify patterns of substance use. To characterize these patterns, we examined whether demographics, number of trauma types experienced, or childhood psychiatric disorders predicted class membership. RESULTS: We identified four primary patterns of substance use: non-use (66.1%), predominantly alcohol use (19.7%), predominantly cannabis use (4.5%), and polysubstance use (9.7%). Compared to the non-users, polysubstance users tended to be older, to be Non-Hispanic White, and to have experienced more types of trauma. They were also more likely to have fulfilled diagnostic criteria for suicidality and ADHD. Comparisons among the substance-using classes were more nuanced. CONCLUSION: The findings highlight the need for universal assessments of trauma, substance misuse, and mental health symptoms in youth, as the presence or absence of their co-occurrence has implications for treatment.

20.
NPJ Digit Med ; 6(1): 158, 2023 Aug 24.
Article in English | MEDLINE | ID: mdl-37620423

ABSTRACT

Recent advances in large language models (LLMs) have demonstrated remarkable successes in zero- and few-shot performance on various downstream tasks, paving the way for applications in high-stakes domains. In this study, we systematically examine the capabilities and limitations of LLMs, specifically GPT-3.5 and ChatGPT, in performing zero-shot medical evidence summarization across six clinical domains. We conduct both automatic and human evaluations, covering several dimensions of summary quality. Our study demonstrates that automatic metrics often do not strongly correlate with the quality of summaries. Furthermore, informed by our human evaluations, we define a terminology of error types for medical evidence summarization. Our findings reveal that LLMs could be susceptible to generating factually inconsistent summaries and making overly convincing or uncertain statements, leading to potential harm due to misinformation. Moreover, we find that models struggle to identify the salient information and are more error-prone when summarizing over longer textual contexts.
