1.
Cureus; 16(4): e57457, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38699087

ABSTRACT

BACKGROUND: The integrity of medical research reporting in online news publications is crucial for informed healthcare decisions and public health discourse. However, omissions, lack of transparency, and the rapid spread of misinformation on digital and social media platforms can lead to an incomplete or inaccurate understanding of research findings. This study aims to analyze the fidelity of online news in reporting medical research findings, focusing on conflicts of interest, study limitations, statistical data, and research conclusions. METHODS: Fifty randomized controlled trials published in major medical journals and their corresponding news reports were evaluated for the inclusion of conflicts of interest, study limitations, and inferential statistics in the news reports. The alignment of the news reports' conclusions with those of the trials was also evaluated. A binomial test with a Bonferroni correction was used to assess the inclusion rate of these variables against a 90% threshold. RESULTS: Conflicts of interest were reported in 10 (20%) of the news reports, study limitations in 14 (28%), and inferential statistics in 19 (38%). These rates were significantly lower than the 90% threshold (p<0.001). Research conclusions aligned in 43 (86%) cases, which was not significantly different from 90% (p=0.230). Misaligned conclusions resulted from overstating claims. CONCLUSION: Significant gaps exist in the reporting of critical contextual information in medical news articles. Adopting a structured reporting format could enhance the quality and transparency of medical research communication. Collaboration among journalists, news organizations, and medical researchers is crucial for establishing and promoting best practices, fostering informed public discourse, and improving health outcomes.
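As a rough illustration of the analysis described in this abstract, the following sketch tests each reported inclusion rate against the 90% threshold with a Bonferroni correction; the counts come from the abstract, but the one-sided alternative and the scipy implementation are assumptions rather than the authors' exact procedure.

```python
# Sketch: one-sided binomial tests of each reported inclusion rate against a 90%
# threshold, with a Bonferroni correction for the four comparisons. Counts are
# taken from the abstract; the one-sided alternative is an assumption.
from scipy.stats import binomtest

n_reports = 50
outcomes = {
    "conflicts of interest": 10,
    "study limitations": 14,
    "inferential statistics": 19,
    "aligned conclusions": 43,
}
alpha = 0.05 / len(outcomes)  # Bonferroni-adjusted significance level

for name, count in outcomes.items():
    result = binomtest(count, n_reports, p=0.90, alternative="less")
    verdict = "significantly below 90%" if result.pvalue < alpha else "not significantly below 90%"
    print(f"{name}: {count}/{n_reports}, p = {result.pvalue:.4f} ({verdict})")
```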

2.
PLoS One; 19(4): e0301854, 2024.
Article in English | MEDLINE | ID: mdl-38626142

ABSTRACT

BACKGROUND: ChatGPT-4 is a large language model with promising healthcare applications. However, its ability to analyze complex clinical data and provide consistent results is poorly understood. This study evaluated ChatGPT-4's risk stratification of simulated patients with acute nontraumatic chest pain against validated tools. METHODS: Three datasets of simulated case studies were created: one based on the TIMI score variables, another on the HEART score variables, and a third comprising 44 randomized variables related to non-traumatic chest pain presentations. ChatGPT-4 independently scored each dataset five times. Its risk scores were compared to calculated TIMI and HEART scores, and its assessments of the 44-variable dataset were evaluated for internal consistency across the five runs. RESULTS: ChatGPT-4 showed a high correlation with TIMI and HEART scores (r = 0.898 and 0.928, respectively), but the distribution of individual risk assessments was broad; ChatGPT-4 assigned a different risk 45-48% of the time for a fixed TIMI or HEART score. On the 44-variable dataset, a majority of the five ChatGPT-4 runs agreed on a diagnosis category only 56% of the time, and their risk scores were poorly correlated (r = 0.605). CONCLUSION: While ChatGPT-4 correlates closely with established risk stratification tools in terms of mean scores, its inconsistency when presented with identical patient data on separate occasions raises concerns about its reliability. The findings suggest that while large language models like ChatGPT-4 hold promise for healthcare applications, further refinement and customization are necessary, particularly for the clinical risk assessment of atraumatic chest pain patients.
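A minimal sketch of the score comparison described above; the paired values are hypothetical placeholders rather than the study's simulated cases, and scipy's Pearson correlation is assumed as the comparison metric.

```python
# Sketch: Pearson correlation between calculated risk scores and model-assigned
# risk scores. The paired values below are hypothetical placeholders, not the
# study's simulated cases.
import numpy as np
from scipy.stats import pearsonr

calculated_timi = np.array([0, 1, 2, 3, 4, 5, 6, 7])  # calculated TIMI scores
chatgpt_scores = np.array([0, 1, 1, 3, 5, 4, 6, 7])   # hypothetical ChatGPT-4 ratings

r, p = pearsonr(calculated_timi, chatgpt_scores)
print(f"Pearson r = {r:.3f}, p = {p:.3f}")

# Run-to-run consistency could be summarized as the share of cases receiving the
# same rating across five independent runs of the same prompt.
```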


Subjects
Chest Pain, Humans, Reproducibility of Results, Prospective Studies, Chest Pain/diagnosis, Risk Assessment/methods, Risk Factors
3.
Acad Med; 99(3): 240, 2024 Mar 01.
Article in English | MEDLINE | ID: mdl-38060379
4.
Cureus; 15(12): e50729, 2023 Dec.
Article in English | MEDLINE | ID: mdl-38111813

ABSTRACT

Background Generative artificial intelligence (AI) models, exemplified by systems such as ChatGPT, Bard, and Anthropic's Claude, are currently under intense investigation for their potential to address existing gaps in mental health support. One implementation of these large language models involves the development of mental health-focused conversational agents, which utilize pre-structured prompts to facilitate user interaction without requiring specialized knowledge in prompt engineering. However, uncertainties persist regarding the safety and efficacy of these agents in recognizing severe depression and suicidal tendencies. Given the well-established correlation between the severity of depression and the risk of suicide, improperly calibrated conversational agents may inadequately identify and respond to crises. Consequently, it is crucial to investigate whether publicly accessible repositories of mental health-focused conversational agents can consistently and safely address crisis scenarios before considering their adoption in clinical settings. This study assesses the safety of publicly available ChatGPT-3.5 conversational agents by evaluating their responses to a patient simulation indicating worsening depression and suicidality. Methodology This study evaluated ChatGPT-3.5 conversational agents designed for mental health counseling and hosted on a publicly available repository. Each conversational agent was evaluated twice with a highly structured patient simulation. The first simulation indicated escalating suicide risk based on the Patient Health Questionnaire (PHQ-9). In the second simulation, the escalating risk was presented in a more generalized manner, not tied to an existing risk scale, to assess the agent's broader ability to recognize suicidality. Each simulation recorded the exact point at which the conversational agent recommended human support. The simulation then continued until the conversational agent shut down completely, insisting on human intervention. Results All 25 agents available on the public repository FlowGPT.com were evaluated. The point at which the conversational agents referred to a human occurred around the mid-point of the simulation, and definitive shutdown predominantly occurred only at the highest risk levels. For the PHQ-9 simulation, the average initial referral and shutdown aligned with PHQ-9 scores of 12 (moderate depression) and 25 (severe depression), respectively. Few agents included crisis resources; only two referenced suicide hotlines. Despite insisting on human intervention, 22 out of 25 agents would eventually resume the dialogue if the simulation reverted to a lower risk level. Conclusions Current generative AI-based conversational agents are slow to escalate mental health risk scenarios, postponing referral to a human until potentially dangerous levels. More rigorous testing and oversight of conversational agents are needed before deployment in mental healthcare settings. Additionally, further investigation should explore whether sustained engagement worsens outcomes and whether enhanced accessibility outweighs the risks of improper escalation. Advancing AI safety in mental health remains imperative as these technologies continue to evolve rapidly.
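The escalation protocol described above could be scripted roughly as follows; query_agent and the keyword-based referral/shutdown detection are invented placeholders, not the study's actual instrument or a real ChatGPT-3.5 call.

```python
# Sketch: step a simulated patient through escalating PHQ-9 severity and record
# when an agent first recommends human support and when it refuses to continue.
# query_agent() is a toy stand-in, not a real conversational-agent API.
def query_agent(message: str, phq9_score: int) -> str:
    """Toy placeholder: a real implementation would call a conversational agent."""
    if phq9_score >= 25:
        return "I cannot continue; please contact a crisis line or emergency services."
    if phq9_score >= 12:
        return "I'm concerned about you; please seek help from a professional."
    return "Thank you for sharing. Can you tell me more?"

def run_simulation(messages_by_phq9: dict[int, str]) -> tuple[int | None, int | None]:
    first_referral, shutdown = None, None
    for score in sorted(messages_by_phq9):
        reply = query_agent(messages_by_phq9[score], score).lower()
        if first_referral is None and "seek help" in reply:
            first_referral = score   # first recommendation of human support
        if "cannot continue" in reply:
            shutdown = score         # agent insists on human intervention
            break
    return first_referral, shutdown

messages = {s: f"Simulated patient message at PHQ-9 severity {s}" for s in range(0, 28, 3)}
print(run_simulation(messages))      # (12, 27) with this toy agent
```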

5.
Cureus; 15(8): e44397, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37791215

ABSTRACT

Statistical significance is widely used to evaluate research findings but has limitations related to reproducibility. Measures of statistical fragility aim to quantify robustness against violations of assumptions. However, existing indices such as the unit fragility index and the fragility quotient are limited by their dependence on sample size and on single-unit changes. The Robustness Index (RI) is proposed to overcome these limitations and quantify fragility independently of the research study's sample size. The RI measures how altering the sample size affects significance. For insignificant findings, the sample size is multiplied until significance is reached; the multiplicand is the RI. For significant findings, the sample size is divided until insignificance is reached; the divisor is the RI. Thus, higher RIs indicate greater robustness for both significant and insignificant findings. The RI provides a simple, interpretable metric of fragility. It facilitates comparisons across studies and can potentially increase trust in biomedical research.
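A minimal sketch of the RI procedure for a two-group comparison, assuming a 2x2 table, Fisher's exact test, a 10% scaling step, and rounded cell counts; none of these specifics are stated in the abstract.

```python
# Sketch: Robustness Index (RI) for a 2x2 table, scaling the sample size until the
# Fisher's exact p-value crosses 0.05. The choice of test, the 10% scaling step,
# and the rounding of cell counts are assumptions, not the abstract's specification.
from scipy.stats import fisher_exact

def p_value(table):
    return fisher_exact(table)[1]

def robustness_index(table, alpha=0.05, step=1.1, max_factor=1000.0):
    base_p = p_value(table)
    factor = 1.0
    while factor < max_factor:
        factor *= step
        scale = factor if base_p >= alpha else 1.0 / factor  # grow insignificant tables, shrink significant ones
        scaled = [[max(round(cell * scale), 0) for cell in row] for row in table]
        if (base_p >= alpha) != (p_value(scaled) >= alpha):  # significance flipped
            return factor                                    # multiplicand or divisor = RI
    return float("inf")

example = [[20, 30], [35, 15]]   # hypothetical treatment/control event counts
print(f"RI = {robustness_index(example):.2f}")
```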

6.
Cureus; 15(10): e47741, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37899890

ABSTRACT

Background In biostatistics, assessing the fragility of research findings is crucial for understanding their clinical significance. This study focuses on the fragility index, unit fragility index, and relative risk index as measures to evaluate statistical fragility. The fragility indices assess the susceptibility of p-values to change significance with minor alterations in outcomes within a 2x2 contingency table. In contrast, the relative risk index quantifies the deviation of observed findings from therapeutic equivalence, the point at which the relative risk equals 1. While the fragility indices have intuitive appeal and have been widely applied, their behavior across a wide range of contingency tables has not been rigorously evaluated. Methods Using a Python software program, a simulation approach was employed to generate random 2x2 contingency tables. All tables under consideration exhibited p-values < 0.05 according to Fisher's exact test. Subsequently, the fragility indices and the relative risk index were calculated. To account for sample size variations, the indices were divided by the sample size to give fragility and risk quotients. A correlation matrix assessed the collinearity between each metric and the p-value. Results The analysis included 2,000 contingency tables with cell counts ranging from 20 to 480. Notably, the formulas for calculating the fragility indices encountered limitations when cell counts approached zero or duplicate cell counts hindered standardized application. The correlation coefficients with p-values were as follows: unit fragility index (-0.806), fragility index (-0.802), fragility quotient (-0.715), unit fragility quotient (-0.695), relative risk index (-0.403), and risk quotient (-0.261). Conclusion The fragility indices and fragility quotients demonstrated a strong correlation with p-values below 0.05, while the relative risk index and relative risk quotient exhibited a weak association with p-values below this threshold. This implies that the fragility indices offer limited additional information beyond the p-value alone. In contrast, the relative risk index and risk quotient exhibit independence from the p-value, indicating that they may provide important additional information about statistical fragility by evaluating the divergence of observed results from therapeutic equivalence, irrespective of the p-value-based statistical significance.
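The simulation described above can be sketched along these lines; the cell-count range is taken from the abstract, but the fragility-index convention and the other details are simplified assumptions rather than the authors' actual Python program.

```python
# Sketch: generate random 2x2 tables that are significant by Fisher's exact test
# (p < 0.05) and compute a simplified fragility index by flipping one outcome at a
# time until significance is lost. The flipping convention is one of several in use.
import random
from scipy.stats import fisher_exact

def fragility_index(a, b, c, d, alpha=0.05, max_flips=1000):
    """Flip outcomes in the first group, one at a time, toward the second group's
    event rate until the Fisher's exact p-value reaches alpha."""
    flips = 0
    while fisher_exact([[a, b], [c, d]])[1] < alpha and flips < max_flips:
        if a / (a + b) > c / (c + d):
            a, b = a - 1, b + 1          # convert one event to a non-event
        else:
            a, b = a + 1, b - 1          # convert one non-event to an event
        flips += 1
    return flips

random.seed(0)
tables = []
while len(tables) < 50:                  # the study used 2,000 tables
    a, b, c, d = (random.randint(20, 480) for _ in range(4))
    if fisher_exact([[a, b], [c, d]])[1] < 0.05:
        tables.append((a, b, c, d))

indices = sorted(fragility_index(*t) for t in tables)
print(f"median fragility index: {indices[len(indices) // 2]}")
```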

7.
Cureus; 15(10): e46975, 2023 Oct.
Article in English | MEDLINE | ID: mdl-37841988

ABSTRACT

BACKGROUND: Homelessness persists as a critical global issue despite myriad interventions. This study analyzed state-level differences in homelessness rates across the United States to identify influential societal factors to help guide resource prioritization. METHODS: Homelessness rates for 50 states and Washington, DC, were compared using the most recent data from 2020 to 2023. Twenty-five variables representing potential socioeconomic and health contributors were examined. The correlation between these variables and the homelessness rate was calculated. Decision trees and regression models were also utilized to identify the most significant factors contributing to homelessness. RESULTS: Homelessness rates were strongly correlated with the cost of living index (COLI), housing costs, transportation costs, grocery costs, and the cigarette excise tax rate (all: P < 0.001). An inverse relationship was observed between opioid prescription rates and homelessness, with increased opioid prescribing associated with decreased homelessness (P < 0.001). Due to collinearity, the combined cost of living index was used for modeling instead of its individual components. Decision tree and regression models identified the cost of living index as the strongest contributor to homelessness, with unemployment, taxes, binge drinking rates, and opioid prescription rates emerging as important factors. CONCLUSION: This state-level analysis revealed the cost of living index as the primary driver of homelessness rates. Unemployment, poverty, and binge drinking were also contributing factors. An unexpected negative correlation was found between opioid prescription rates and homelessness. These findings can help guide resource allocation to address homelessness through targeted interventions.
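A rough sketch of the modeling approach described above, using a synthetic DataFrame with invented column names in place of the study's 25 state-level variables; the correlation and decision-tree steps are illustrative only.

```python
# Sketch: correlate candidate predictors with state homelessness rates and fit a
# shallow decision tree to rank their importance. The synthetic data and column
# names below are placeholders for the study's state-level variables.
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n_states = 51
df = pd.DataFrame({
    "cost_of_living_index": rng.normal(100, 15, n_states),
    "unemployment_rate": rng.normal(4.0, 1.0, n_states),
    "binge_drinking_rate": rng.normal(17.0, 3.0, n_states),
    "opioid_rx_rate": rng.normal(50.0, 15.0, n_states),
})
# Synthetic outcome loosely mimicking the reported direction of associations
df["homelessness_rate"] = (0.3 * df["cost_of_living_index"]
                           + 2.0 * df["unemployment_rate"]
                           - 0.1 * df["opioid_rx_rate"]
                           + rng.normal(0, 5, n_states))

print(df.corr()["homelessness_rate"].drop("homelessness_rate"))   # pairwise correlations

predictors = df.columns.drop("homelessness_rate")
tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(df[predictors], df["homelessness_rate"])
for name, importance in sorted(zip(predictors, tree.feature_importances_), key=lambda t: -t[1]):
    print(f"{name}: {importance:.2f}")
```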

8.
PeerJ; 11: e15090, 2023.
Article in English | MEDLINE | ID: mdl-36945358

ABSTRACT

Introduction: Patients with suspected thoracic pathology frequently undergo imaging with conventional radiography or chest x-rays (CXR) and computed tomography (CT). A CXR includes one or two planar views, compared with the three-dimensional images generated by chest CT. CXR imaging has the advantage of lower costs and lower radiation exposure at the expense of lower diagnostic accuracy, especially in patients with large body habitus. Objectives: To determine whether CXR imaging could achieve acceptable diagnostic accuracy in patients with a low body mass index (BMI). Methods: This retrospective study evaluated 50 patients (age 63 ± 12 years, 92% male, BMI 31.7 ± 7.9) presenting with acute, nontraumatic cardiopulmonary complaints who underwent CXR followed by CT within 1 day. Diagnostic accuracy was determined by comparing scan interpretation with the final clinical diagnosis of the referring clinician. Results: CT results were significantly correlated with CXR results (r = 0.284, p = 0.046). Correcting for BMI did not improve this correlation (r = 0.285, p = 0.047). Correcting for BMI and age also did not improve the correlation (r = 0.283, p = 0.052), nor did correcting for BMI, age, and sex (r = 0.270, p = 0.067). Correcting for height alone slightly improved the correlation (r = 0.290, p = 0.043), as did correcting for weight alone (r = 0.288, p = 0.045). CT accuracy was 92% (SE = 0.039) vs. 60% for CXR (SE = 0.070, p < 0.01). Conclusion: Accounting for patient body habitus as determined by BMI, height, or weight did not improve the correlation between CXR accuracy and chest CT accuracy. CXR is significantly less accurate than CT even in patients with a low BMI.
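The BMI-adjusted correlations reported above can be approximated with a residual-based partial correlation, sketched below with randomly generated placeholder data rather than the study's patient records.

```python
# Sketch: partial correlation between CXR and CT correctness, controlling for BMI,
# via the residuals of linear regressions on BMI. The arrays are randomly generated
# placeholders, not the study's patient data.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
bmi = rng.normal(31.7, 7.9, 50)                     # BMI drawn around the cohort mean and SD
cxr_correct = rng.integers(0, 2, 50).astype(float)  # 1 = CXR matched the final diagnosis
ct_correct = rng.integers(0, 2, 50).astype(float)   # 1 = CT matched the final diagnosis

def residuals(y, x):
    slope, intercept = np.polyfit(x, y, 1)          # regress y on x
    return y - (slope * x + intercept)

r, p = pearsonr(residuals(cxr_correct, bmi), residuals(ct_correct, bmi))
print(f"BMI-adjusted correlation: r = {r:.3f}, p = {p:.3f}")
```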


Subjects
Thoracic Radiography, X-Ray Computed Tomography, Humans, Male, Middle Aged, Aged, Female, Body Mass Index, Retrospective Studies, X-Rays, Thoracic Radiography/methods, X-Ray Computed Tomography/methods
9.
Acad Med; 95(6): 819, 2020 Jun.
Article in English | MEDLINE | ID: mdl-32452851
12.
F1000Res; 8: 1193, 2019.
Article in English | MEDLINE | ID: mdl-38435121

ABSTRACT

Healthcare providers experience moral injury when their internal ethics are violated. The routine and direct exposure to ethical violations makes clinicians vulnerable to harm. The fundamental ethics in health care typically fall into the four broad categories of patient autonomy, beneficence, nonmaleficence, and social justice. Patients have a moral right to determine their own goals of medical care, that is, they have autonomy. When this principle is violated, moral injury occurs. Beneficence is the desire to help people, so when the delivery of proper medical care is obstructed for any reason, moral injury is the result. Nonmaleficence, meaning do no harm, has been a primary principle of medical ethics throughout recorded history. Yet today, even the most advanced and safest medical treatments are associated with unavoidable, harmful side effects. When an inevitable side effect occurs, the patient is harmed, and the clinician is also at risk of moral injury. Social injustice results when patients experience suboptimal treatment due to their race, gender, religion, or other demographic variables. While minor ethical dilemmas and violations routinely occur in medical care and cannot be eliminated, clinicians can decrease the prevalence of a significant moral injury by advocating for the ethical treatment of patients, not only at the bedside but also by addressing the ethics of political influence, governmental mandates, and administrative burdens on the delivery of optimal medical care. Although clinicians can strengthen their resistance to moral injury by deepening their own spiritual foundation, that is not enough. Improvements in the ethics of the entire healthcare system are necessary to improve medical care and decrease moral injury.


Subjects
Bioethics, Post-Traumatic Stress Disorders, Humans, Moral Principles, Government, Health Facilities
15.
World J Methodol; 7(4): 112-116, 2017 Dec 26.
Article in English | MEDLINE | ID: mdl-29354483

ABSTRACT

A statistically significant research finding should not be defined as a P-value of 0.05 or less, because this definition does not take into account study power. Statistical significance was originally defined by Fisher RA as a P-value of 0.05 or less. According to Fisher, any finding that is likely to occur by random variation no more than 1 in 20 times is considered significant. Neyman J and Pearson ES subsequently argued that Fisher's definition was incomplete. They proposed that statistical significance could only be determined by analyzing the chance of incorrectly considering a study finding to be significant (a Type I error) or incorrectly considering it to be insignificant (a Type II error). Their definition of statistical significance is also incomplete because the error rates are considered separately, not together. A better definition of statistical significance is the positive predictive value of a P-value, which is equal to the power divided by the sum of the power and the P-value. This definition is more complete and relevant than Fisher's or Neyman-Pearson's definitions because it combines Fisher's P-value with the study power emphasized by Neyman and Pearson. Using this definition, a statistically significant finding requires a P-value of 0.05 or less when the power is at least 95%, and a P-value of 0.032 or less when the power is 60%. To achieve statistical significance, P-values must be adjusted downward as the study power decreases.
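A short worked example of this definition, PPV = power / (power + P), with significance taken to require a PPV of at least 0.95 (an assumption implied by the thresholds stated above).

```python
# Worked example: the positive predictive value (PPV) of a P-value, defined above
# as power / (power + P), and the largest P-value that keeps PPV at or above 0.95
# (the bar implied by the thresholds stated in the abstract).
def ppv(power: float, p: float) -> float:
    return power / (power + p)

def max_p_for_significance(power: float, min_ppv: float = 0.95) -> float:
    """Largest P-value for which power / (power + P) >= min_ppv."""
    return power * (1 - min_ppv) / min_ppv

print(round(ppv(0.95, 0.05), 2))                # 0.95: P = 0.05 suffices at 95% power
print(round(max_p_for_significance(0.95), 3))   # 0.050
print(round(max_p_for_significance(0.60), 3))   # 0.032: required cutoff at 60% power
```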
