Results 1 - 20 of 27
1.
J Med Internet Res ; 26: e51640, 2024 Feb 06.
Article in English | MEDLINE | ID: mdl-38319694

ABSTRACT

BACKGROUND: The outbreak of SARS-CoV-2 in 2019 has necessitated the rapid and accurate detection of COVID-19 to manage patients effectively and implement public health measures. Artificial intelligence (AI) models analyzing cough sounds have emerged as promising tools for large-scale screening and early identification of potential cases. OBJECTIVE: This study aimed to investigate the efficacy of using cough sounds as a diagnostic tool for COVID-19, considering the unique acoustic features that differentiate positive and negative cases. We investigated whether an AI model trained on cough sound recordings from specific periods, especially the early stages of the COVID-19 pandemic, was applicable to the ongoing situation with persistent variants. METHODS: We used cough sound recordings from 3 data sets (Cambridge, Coswara, and Virufy) representing different stages of the pandemic and variants. Our AI model was trained using the Cambridge data set with subsequent evaluation against all data sets. The performance was analyzed based on the area under the receiver operating characteristic curve (AUC) across different data measurement periods and COVID-19 variants. RESULTS: The AI model demonstrated a high AUC when tested with the Cambridge data set, indicative of its initial effectiveness. However, the performance varied significantly with other data sets, particularly in detecting later variants such as Delta and Omicron, with a marked decline in AUC observed for the latter. These results highlight the challenges in maintaining the efficacy of AI models against the backdrop of an evolving virus. CONCLUSIONS: While AI models analyzing cough sounds offer a promising noninvasive and rapid screening method for COVID-19, their effectiveness is challenged by the emergence of new virus variants. Ongoing research and adaptations in AI methodologies are crucial to address these limitations. The adaptability of AI models to evolve with the virus underscores their potential as a foundational technology for not only the current pandemic but also future outbreaks, contributing to a more agile and resilient global health infrastructure.


Subject(s)
COVID-19 , Humans , COVID-19/diagnosis , SARS-CoV-2 , Artificial Intelligence , COVID-19 Testing , Pandemics , Cough/diagnosis
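
To make the cross-data-set evaluation in this record concrete, the following is a minimal Python sketch that trains a classifier on one cough-sound data set and reports the area under the ROC curve on others. The loader, feature representation, and data set names are placeholders, not the authors' pipeline.

```python
# Minimal sketch: train on one cough-sound dataset, evaluate AUC on others.
# The feature extraction and dataset loaders are placeholders, not the
# study's audio pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def load_features(name, n=200, d=64):
    """Placeholder loader: returns (features, labels) for a named dataset."""
    X = rng.normal(size=(n, d))
    y = rng.integers(0, 2, size=n)
    return X, y

X_train, y_train = load_features("cambridge_train")
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# AUC on held-out sets representing different periods / variants.
for name in ["cambridge_test", "coswara", "virufy"]:
    X, y = load_features(name)
    auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```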
2.
J Med Internet Res ; 26: e53396, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38967964

ABSTRACT

BACKGROUND: In the realm of in vitro fertilization (IVF), artificial intelligence (AI) models serve as invaluable tools for clinicians, offering predictive insights into ovarian stimulation outcomes. Predicting and understanding a patient's response to ovarian stimulation can help in personalizing doses of drugs, preventing adverse outcomes (eg, hyperstimulation), and improving the likelihood of successful fertilization and pregnancy. Given the pivotal role of accurate predictions in IVF procedures, it becomes important to investigate the landscape of AI models that are being used to predict the outcomes of ovarian stimulation. OBJECTIVE: The objective of this review is to comprehensively examine the literature to explore the characteristics of AI models used for predicting ovarian stimulation outcomes in the context of IVF. METHODS: A total of 6 electronic databases were searched for peer-reviewed literature published before August 2023, using the concepts of IVF and AI, along with their related terms. Records were independently screened by 2 reviewers against the eligibility criteria. The extracted data were then consolidated and presented through narrative synthesis. RESULTS: Upon reviewing 1348 articles, 30 met the predetermined inclusion criteria. The literature primarily focused on the number of oocytes retrieved as the main predicted outcome. Microscopy images stood out as the primary ground truth reference. The reviewed studies also highlighted that the most frequently adopted stimulation protocol was the gonadotropin-releasing hormone (GnRH) antagonist. In terms of using trigger medication, human chorionic gonadotropin (hCG) was the most commonly selected option. Among the machine learning techniques, the favored choice was the support vector machine. As for the validation of AI algorithms, the hold-out cross-validation method was the most prevalent. The area under the curve was highlighted as the primary evaluation metric. The literature exhibited a wide variation in the number of features used for AI algorithm development, ranging from 2 to 28,054 features. Data were mostly sourced from patient demographics, followed by laboratory data, specifically hormonal levels. Notably, the vast majority of studies were restricted to a single infertility clinic and exclusively relied on nonpublic data sets. CONCLUSIONS: These insights highlight an urgent need to diversify data sources and explore varied AI techniques for improved prediction accuracy and generalizability of AI models for the prediction of ovarian stimulation outcomes. Future research should prioritize multiclinic collaborations and consider leveraging public data sets, aiming for more precise AI-driven predictions that ultimately boost patient care and IVF success rates.


Subject(s)
Artificial Intelligence , Fertilization in Vitro , Ovulation Induction , Humans , Ovulation Induction/methods , Fertilization in Vitro/methods , Female , Pregnancy
3.
Skin Res Technol ; 29(8): e13414, 2023 Aug.
Article in English | MEDLINE | ID: mdl-37632180

ABSTRACT

BACKGROUND: Appropriate skin treatment and care warrant an accurate prediction of skin moisture. However, current diagnostic tools are costly and time-consuming. Stratum corneum moisture content has been measured with moisture content meters or from a near-infrared image. OBJECTIVE: Here, we establish an artificial intelligence (AI) alternative to conventional skin moisture content measurements. METHODS: Skin feature factors positively or negatively correlated with the skin moisture content were created and selected using PolynomialFeatures(3) from scikit-learn. Then, an integrated AI model using, as inputs, a visible-light skin image and the skin feature factors was trained with 914 skin images, the corresponding skin feature factors, and the corresponding skin moisture contents. RESULTS: A regression-type AI model using only a visible-light skin image was not sufficiently accurate. To improve the accuracy of skin moisture content prediction, we searched for new features through feature engineering ("creation of new factors") correlated with the moisture content from various combinations of the existing skin features, and found that factors created by combining the brown spot count, the pore count, and/or the visually assessed skin roughness yield significant correlation coefficients. Then, an integrated AI deep-learning model using a visible-light skin image and these factors resulted in significantly improved skin moisture content prediction. CONCLUSION: Skin moisture content interacts with the brown spot count, the pore count, and/or the visually assessed skin roughness, so better inference of stratum corneum moisture content can be provided using a common visible-light skin photo and skin feature factors.


Subject(s)
Artificial Intelligence , Skin , Humans , Skin/diagnostic imaging , Epidermis , Cutaneous Administration , Light
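
The feature-engineering step named in this record (PolynomialFeatures(3) from scikit-learn) can be sketched as follows: raw skin factors are expanded into polynomial combinations and ranked by correlation with moisture content. The column names and data are illustrative, not the study's measurements.

```python
# Minimal sketch of the feature-engineering step: expand raw skin features
# (brown-spot count, pore count, visual roughness) with PolynomialFeatures(3)
# and keep the combinations most correlated with moisture content.
import numpy as np
import pandas as pd
from sklearn.preprocessing import PolynomialFeatures

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "brown_spot_count": rng.poisson(5, 300),
    "pore_count": rng.poisson(40, 300),
    "visual_roughness": rng.uniform(1, 5, 300),
})
moisture = rng.uniform(20, 60, 300)  # stand-in for measured moisture content

poly = PolynomialFeatures(degree=3, include_bias=False)
expanded = poly.fit_transform(df)
names = poly.get_feature_names_out(df.columns)

# Rank engineered factors by absolute Pearson correlation with moisture.
corrs = [np.corrcoef(expanded[:, i], moisture)[0, 1] for i in range(expanded.shape[1])]
ranked = sorted(zip(names, corrs), key=lambda t: -abs(t[1]))
for name, r in ranked[:5]:
    print(f"{name}: r = {r:+.3f}")
```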
4.
J Med Internet Res ; 25: e42717, 2023 02 16.
Article in English | MEDLINE | ID: mdl-36795468

ABSTRACT

BACKGROUND: An artificial intelligence (AI) model using chest radiography (CXR) may provide good performance in making prognoses for COVID-19. OBJECTIVE: We aimed to develop and validate a prediction model using CXR based on an AI model and clinical variables to predict clinical outcomes in patients with COVID-19. METHODS: This retrospective longitudinal study included patients hospitalized for COVID-19 at multiple COVID-19 medical centers between February 2020 and October 2020. Patients at Boramae Medical Center were randomly classified into training, validation, and internal testing sets (at a ratio of 8:1:1, respectively). An AI model using initial CXR images as input, a logistic regression model using clinical information, and a combined model using the output of the AI model (as CXR score) and clinical information were developed and trained to predict hospital length of stay (LOS) ≤2 weeks, need for oxygen supplementation, and acute respiratory distress syndrome (ARDS). The models were externally validated in the Korean Imaging Cohort of COVID-19 data set for discrimination and calibration. RESULTS: The AI model using CXR and the logistic regression model using clinical variables were suboptimal to predict hospital LOS ≤2 weeks or the need for oxygen supplementation but performed acceptably in the prediction of ARDS (AI model area under the curve [AUC] 0.782, 95% CI 0.720-0.845; logistic regression model AUC 0.878, 95% CI 0.838-0.919). The combined model performed better in predicting the need for oxygen supplementation (AUC 0.704, 95% CI 0.646-0.762) and ARDS (AUC 0.890, 95% CI 0.853-0.928) compared to the CXR score alone. Both the AI and combined models showed good calibration for predicting ARDS (P=.079 and P=.859). CONCLUSIONS: The combined prediction model, comprising the CXR score and clinical information, was externally validated as having acceptable performance in predicting severe illness and excellent performance in predicting ARDS in patients with COVID-19.


Subject(s)
COVID-19 , Deep Learning , Respiratory Distress Syndrome , Humans , Artificial Intelligence , COVID-19/diagnostic imaging , Longitudinal Studies , Retrospective Studies , Radiography , Oxygen , Prognosis
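
A minimal sketch of the "combined model" idea in this record: the imaging model's output (the CXR score) is entered alongside clinical variables into a logistic regression that predicts an outcome such as ARDS. The variables, data, and labels below are synthetic stand-ins, not the study cohort.

```python
# Minimal sketch of a combined model: CXR score from an imaging AI model plus
# clinical variables, fed to a logistic regression. Data are synthetic.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
cxr_score = rng.uniform(0, 1, n)              # output of the imaging AI model
age = rng.normal(60, 15, n)                   # illustrative clinical variable
spo2 = rng.normal(94, 4, n)                   # illustrative clinical variable
X = np.column_stack([cxr_score, age, spo2])
y = (rng.uniform(0, 1, n) < 0.2).astype(int)  # stand-in ARDS labels

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
combined = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, combined.predict_proba(X_te)[:, 1])
print("Combined model AUC:", round(auc, 3))
```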
5.
J Med Internet Res ; 25: e47612, 2023 07 10.
Article in English | MEDLINE | ID: mdl-37428525

ABSTRACT

BACKGROUND: Respiratory distress syndrome (RDS) is a disease that commonly affects premature infants whose lungs are not fully developed. RDS results from a lack of surfactant in the lungs. The more premature the infant, the greater the likelihood of RDS. However, even though not all premature infants have RDS, preemptive treatment with artificial pulmonary surfactant is administered in most cases. OBJECTIVE: We aimed to develop an artificial intelligence model to predict RDS in premature infants to avoid unnecessary treatment. METHODS: In this study, 13,087 very low birth weight infants (newborns weighing less than 1500 g) were assessed in 76 hospitals of the Korean Neonatal Network. To predict RDS in very low birth weight infants, we used basic infant information, maternity history, pregnancy/birth process, family history, resuscitation procedure, and test results at birth such as blood gas analysis and Apgar score. The prediction performances of 7 different machine learning models were compared, and a 5-layer deep neural network was proposed to enhance the prediction performance from the selected features. An ensemble approach combining multiple models from the 5-fold cross-validation was subsequently developed. RESULTS: Our proposed ensemble 5-layer deep neural network using the top 20 features provided high sensitivity (83.03%), specificity (87.50%), accuracy (84.07%), balanced accuracy (85.26%), and area under the curve (0.9187). Based on the model we developed, a public web application was deployed to provide easy access to RDS prediction for premature infants. CONCLUSIONS: Our artificial intelligence model may be useful for preparations for neonatal resuscitation, particularly in cases involving the delivery of very low birth weight infants, as it can aid in predicting the likelihood of RDS and inform decisions regarding the administration of surfactant.


Subject(s)
Pulmonary Surfactants , Respiratory Distress Syndrome of the Newborn , Female , Humans , Newborn Infant , Pregnancy , Artificial Intelligence , Very Low Birth Weight Infant , Prospective Studies , Pulmonary Surfactants/therapeutic use , Republic of Korea , Respiratory Distress Syndrome of the Newborn/diagnosis , Respiratory Distress Syndrome of the Newborn/drug therapy , Resuscitation , Surface-Active Agents , Machine Learning
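
A minimal sketch of the ensemble idea in this record: one network is trained per fold of a 5-fold cross-validation and their predicted probabilities are averaged at inference time. The 5-hidden-layer MLP, the synthetic "top 20 features", and the labels are stand-ins, not the study's model or cohort.

```python
# Minimal sketch: per-fold networks from 5-fold CV averaged into an ensemble.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 20))      # stand-in for the top-20 selected features
y = rng.integers(0, 2, size=1000)    # stand-in RDS / no-RDS labels
X_new = rng.normal(size=(50, 20))    # unseen infants to score

members = []
for tr_idx, _ in StratifiedKFold(n_splits=5, shuffle=True, random_state=0).split(X, y):
    net = MLPClassifier(hidden_layer_sizes=(64, 64, 64, 64, 64),
                        max_iter=300, random_state=0)
    members.append(net.fit(X[tr_idx], y[tr_idx]))

# Ensemble prediction = mean probability across the five fold models.
proba = np.mean([m.predict_proba(X_new)[:, 1] for m in members], axis=0)
print("Ensembled RDS probabilities (first 5):", np.round(proba[:5], 3))
```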
6.
J Med Internet Res ; 25: e47621, 2023 09 15.
Article in English | MEDLINE | ID: mdl-37713254

ABSTRACT

BACKGROUND: Artificial intelligence (AI) has gained tremendous popularity recently, especially the use of natural language processing (NLP). ChatGPT is a state-of-the-art chatbot capable of creating natural conversations using NLP. The use of AI in medicine can have a tremendous impact on health care delivery. Although some studies have evaluated ChatGPT's accuracy in self-diagnosis, there is no research regarding its precision and the degree to which it recommends medical consultations. OBJECTIVE: The aim of this study was to evaluate ChatGPT's ability to accurately and precisely self-diagnose common orthopedic diseases, as well as the degree of recommendation it provides for medical consultations. METHODS: Over a 5-day course, each of the study authors submitted the same questions to ChatGPT. The conditions evaluated were carpal tunnel syndrome (CTS), cervical myelopathy (CM), lumbar spinal stenosis (LSS), knee osteoarthritis (KOA), and hip osteoarthritis (HOA). Answers were categorized as either correct, partially correct, incorrect, or a differential diagnosis. The percentage of correct answers and reproducibility were calculated. Reproducibility between days and between raters was calculated using the Fleiss κ coefficient. Answers that recommended that the patient seek medical attention were recategorized according to the strength of the recommendation as defined by the study. RESULTS: The ratios of correct answers were 25/25, 1/25, 24/25, 16/25, and 17/25 for CTS, CM, LSS, KOA, and HOA, respectively. The ratios of incorrect answers were 23/25 for CM and 0/25 for all other conditions. The reproducibility between days was 1.0, 0.15, 0.7, 0.6, and 0.6 for CTS, CM, LSS, KOA, and HOA, respectively. The reproducibility between raters was 1.0, 0.1, 0.64, -0.12, and 0.04 for CTS, CM, LSS, KOA, and HOA, respectively. Among the answers recommending medical attention, the phrases "essential," "recommended," "best," and "important" were used. Specifically, "essential" occurred in 4 out of 125, "recommended" in 12 out of 125, "best" in 6 out of 125, and "important" in 94 out of 125 answers. Additionally, 7 out of the 125 answers did not include a recommendation to seek medical attention. CONCLUSIONS: The accuracy and reproducibility of ChatGPT in self-diagnosing five common orthopedic conditions were inconsistent. The accuracy could potentially be improved by adding symptoms that could easily identify a specific location. Only a few answers were accompanied by a strong recommendation to seek medical attention according to our study standards. Although ChatGPT could serve as a potential first step in accessing care, we found variability in accurate self-diagnosis. Given the risk of harm with self-diagnosis without medical follow-up, it would be prudent for an NLP-based chatbot to include clear language alerting patients to seek expert medical opinions. We hope to shed further light on the use of AI in a future clinical study.


Subject(s)
Musculoskeletal Diseases , Knee Osteoarthritis , Spinal Cord Diseases , Humans , Artificial Intelligence , Reproducibility of Results , Natural Language Processing , Communication
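
A minimal sketch of the reproducibility analysis in this record: Fleiss' κ computed over repeated ChatGPT answers coded as correct, partially correct, incorrect, or differential diagnosis. The response matrix is invented for illustration.

```python
# Minimal sketch: Fleiss' kappa over repeated answers per question.
# Category codes: 0 = correct, 1 = partially correct, 2 = incorrect,
# 3 = differential diagnosis. The matrix below is made up.
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# rows = questions (subjects), columns = repeated submissions (raters)
answers = np.array([
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 2],
    [3, 3, 3, 1, 3],
    [0, 0, 0, 0, 0],
    [2, 2, 1, 2, 2],
])

table, _ = aggregate_raters(answers)  # per-question counts of each category
print("Fleiss' kappa:", round(fleiss_kappa(table, method="fleiss"), 3))
```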
7.
Sensors (Basel) ; 23(3)2023 Jan 22.
Article in English | MEDLINE | ID: mdl-36772319

ABSTRACT

Artificial intelligence (AI) models are being produced and used to solve a variety of current and future business and technical problems. Therefore, AI model engineering processes, platforms, and products are acquiring special significance across industry verticals. To achieve deeper automation, a large number of data features is used when generating highly promising and productive AI models, and hence the resulting models are bulky. Such heavyweight models consume a lot of computation, storage, networking, and energy resources. On the other hand, AI models are increasingly being deployed in IoT devices to ensure real-time knowledge discovery and dissemination. Real-time insights are of paramount importance in producing and releasing real-time and intelligent services and applications. Thus, edge intelligence through on-device data processing has laid down a stimulating foundation for real-time intelligent enterprises and environments. With these emerging requirements, the focus has turned towards unearthing competent and cognitive techniques for maximally compressing huge AI models without sacrificing performance. AI researchers have therefore come up with a number of powerful optimization techniques and tools to optimize AI models. This paper digs deep into and describes the various kinds of model optimization at different levels and layers. Having surveyed these optimization methods, this work highlights the importance of an enabling AI model optimization framework.

8.
Entropy (Basel) ; 24(6)2022 Jun 08.
Article in English | MEDLINE | ID: mdl-35741522

ABSTRACT

Peer-to-peer lending (P2P lending) has proliferated in recent years thanks to Fintech and big data advancements. However, P2P lending platforms are not tightly governed by relevant laws yet, as their development speed has far exceeded that of regulations. Therefore, P2P lending operations are still subject to risks. This paper proposes prediction models to mitigate the risks of default and asymmetric information on P2P lending platforms. Specifically, we designed sophisticated procedures to pre-process mass data extracted from Lending Club in 2018 Q3-2019 Q2. After that, three statistical models, namely, Logistic Regression, Bayesian Classifier, and Linear Discriminant Analysis (LDA), and five AI models, namely, Decision Tree, Random Forest, LightGBM, Artificial Neural Network (ANN), and Convolutional Neural Network (CNN), were utilized for data analysis. The loan statuses of Lending Club's customers were rationally classified. To evaluate the models, we adopted the confusion matrix series of metrics, AUC-ROC curve, Kolmogorov-Smirnov chart (KS), and Student's t-test. Empirical studies show that LightGBM produces the best performance and is 2.91% more accurate than the other models, resulting in a revenue improvement of nearly USD 24 million for Lending Club. Student's t-test proves that the differences between models are statistically significant.
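
A minimal sketch of the default-prediction setup in this record: a LightGBM classifier on tabular loan features, scored with AUC and the Kolmogorov-Smirnov (KS) statistic. The features and default labels are synthetic, not Lending Club fields.

```python
# Minimal sketch: LightGBM default prediction scored with AUC and KS.
import numpy as np
from lightgbm import LGBMClassifier
from scipy.stats import ks_2samp
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 15))                  # pre-processed loan features
y = (rng.uniform(size=5000) < 0.2).astype(int)   # 1 = default

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = LGBMClassifier(n_estimators=200).fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

# KS = maximum separation between score distributions of defaulters and non-defaulters.
ks = ks_2samp(scores[y_te == 1], scores[y_te == 0]).statistic
print("AUC:", round(roc_auc_score(y_te, scores), 3), "KS:", round(ks, 3))
```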

10.
Ophthalmol Glaucoma ; 7(1): 8-15, 2024.
Article in English | MEDLINE | ID: mdl-37437884

ABSTRACT

PURPOSE: To assess the performance and generalizability of a convolutional neural network (CNN) model for objective and high-throughput identification of primary angle-closure disease (PACD) as well as PACD stage differentiation on anterior segment swept-source OCT (AS-OCT). DESIGN: Cross-sectional. PARTICIPANTS: Patients from 3 different eye centers across China and Singapore were recruited for this study. Eight hundred forty-one eyes from the 2 Chinese centers were divided into 170 control eyes, 488 PACS, and 183 PAC + PACG eyes. An additional 300 eyes were recruited from Singapore National Eye Center as a testing data set, divided into 100 control eyes, 100 PACS, and 100 PAC + PACG eyes. METHODS: Each participant underwent standardized ophthalmic examination and was classified by the presiding physician as either control, primary angle-closure suspect (PACS), primary angle closure (PAC), or primary angle-closure glaucoma (PACG). Deep learning was used to train 3 different CNN classifiers: classifier 1 aimed to separate control versus PACS versus PAC + PACG; classifier 2 aimed to separate control versus PACD; and classifier 3 aimed to separate PACS versus PAC + PACG. All classifiers were evaluated on independent validation sets from the same region (China) and further tested using data from a different country (Singapore). MAIN OUTCOME MEASURES: Area under the receiver operating characteristic curve (AUC), precision, and recall. RESULTS: Classifier 1 achieved an AUC of 0.96 on the validation set from the same region, but dropped to an AUC of 0.84 on the test set from a different country. Classifier 2 achieved the most generalizable performance with an AUC of 0.96 on the validation set and 0.95 on the test set. Classifier 3 showed the poorest performance, with an AUC of 0.83 and 0.64 on test and validation data sets, respectively. CONCLUSIONS: Convolutional neural network classifiers can effectively distinguish PACD from controls on AS-OCT with good generalizability across different patient cohorts. However, their performance is moderate when trying to distinguish PACS from PAC + PACG. FINANCIAL DISCLOSURES: The authors have no proprietary or commercial interest in any materials discussed in this article.


Subject(s)
Deep Learning , Angle-Closure Glaucoma , Humans , Intraocular Pressure , Optical Coherence Tomography/methods , Cross-Sectional Studies , Angle-Closure Glaucoma/diagnosis
11.
PeerJ Comput Sci ; 10: e2067, 2024.
Article in English | MEDLINE | ID: mdl-38855196

ABSTRACT

Accurate prediction of electricity generation from diverse renewable energy sources (RES) plays a pivotal role in optimizing power schedules within RES, contributing to the collective effort to combat climate change. Prior research often focused on individual energy sources in isolation, neglecting intricate interactions among multiple sources; this limitation frequently leads to inaccurate estimations of total power generation. In this study, we introduce a hybrid architecture designed to address these challenges, incorporating advanced artificial intelligence (AI) techniques. The hybrid model seamlessly integrates a gated recurrent unit (GRU) and a ResNext model, and it is tuned with the modified Jaya algorithm (MJA) to capture localized correlations among different energy sources. Leveraging its nonlinear time-series properties, the model integrates meteorological conditions and specific energy source data. Additionally, principal component analysis (PCA) is employed to extract linear time-series data characteristics for each energy source. Application of the proposed AI-infused approach to a renewable energy system demonstrates its effectiveness and feasibility in the context of climate change mitigation. Results reveal the superior accuracy of the hybrid framework compared to more complex models such as decision trees and ResNet. Specifically, the proposed method achieved the lowest error rates, with a normalized RMSE of 6.51 and a normalized MAPE of 4.34 for solar photovoltaic (PV) generation, highlighting its precision in terms of absolute errors. A detailed sensitivity analysis is carried out to evaluate the influence of every element in the hybrid framework, emphasizing the importance of energy correlation patterns. Comparative assessments underscore the increased accuracy and stability of the suggested AI-infused framework when compared to other methods.
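
A minimal sketch of one branch of the hybrid described in this record: PCA-compressed inputs feeding a GRU that forecasts next-step generation. The ResNext branch and the modified Jaya tuning are omitted, and the data shapes are illustrative.

```python
# Minimal sketch: PCA-compressed sequences feeding a GRU forecaster (PyTorch).
import numpy as np
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
raw = rng.normal(size=(500, 24, 12))             # (samples, timesteps, raw features)
target = rng.normal(size=(500, 1)).astype(np.float32)

# Linear time-series characteristics via PCA, fitted on flattened timesteps.
pca = PCA(n_components=6).fit(raw.reshape(-1, 12))
seq = pca.transform(raw.reshape(-1, 12)).reshape(500, 24, 6).astype(np.float32)

class GRUForecaster(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):
        _, h = self.gru(x)       # h: (1, batch, hidden), the last hidden state
        return self.head(h[-1])

model = GRUForecaster(n_features=6)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for _ in range(5):               # a few illustrative epochs
    opt.zero_grad()
    loss = loss_fn(model(torch.from_numpy(seq)), torch.from_numpy(target))
    loss.backward()
    opt.step()
print("Training MSE:", float(loss))
```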

12.
Environ Pollut ; 349: 123974, 2024 May 15.
Article in English | MEDLINE | ID: mdl-38615837

ABSTRACT

PM2.5 concentrations are higher during rush hours at background stations compared to the average concentration across these stations. Few studies have investigated PM2.5 concentration and its spatial distribution during rush hours using machine learning models. This study employs a geospatial-artificial intelligence (Geo-AI) prediction model to estimate the spatial and temporal variations of PM2.5 concentrations during morning and dusk rush hours in Taiwan. Mean hourly PM2.5 measurements were collected from 2006 to 2020, and aggregated into morning (7 a.m.-9 a.m.) and dusk (4 p.m.-6 p.m.) rush-hour mean concentrations. The Geo-AI prediction model was generated by integrating kriging interpolation, land-use regression, machine learning, and a stacking ensemble approach. A forward stepwise variable selection method based on the SHapley Additive exPlanations (SHAP) index was used to identify the most influential variables. The Geo-AI models for morning and dusk rush hours achieved accuracy scores of 0.95 and 0.93, respectively, and these results were validated, indicating robust model performance. Spatially, PM2.5 concentrations were higher in southwestern Taiwan for morning rush hours, and in suburban areas for dusk rush hours. Key predictors included kriged PM2.5 values, SO2 concentrations, forest density, and the distance to incinerators for both morning and dusk rush hours. These PM2.5 estimates for morning and dusk rush hours can support the development of alternative commuting routes with lower concentrations.


Subject(s)
Air Pollutants , Air Pollution , Artificial Intelligence , Environmental Monitoring , Particulate Matter , Taiwan , Particulate Matter/analysis , Air Pollutants/analysis , Environmental Monitoring/methods , Air Pollution/statistics & numerical data , Transportation
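
A minimal sketch of the stacking-ensemble idea in this record: a kriged PM2.5 value and land-use style predictors feed base learners whose outputs a meta-learner combines. Kriging itself, the SHAP-based forward selection, and the full Geo-AI pipeline are omitted; the variables are synthetic.

```python
# Minimal sketch: stacking ensemble for rush-hour PM2.5 with a kriged value
# among the predictors. Variable names and data are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor, StackingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000
X = np.column_stack([
    rng.normal(25, 8, n),    # kriged PM2.5 at the site
    rng.normal(3, 1, n),     # SO2 concentration
    rng.uniform(0, 1, n),    # forest density
    rng.uniform(0, 20, n),   # distance to nearest incinerator (km)
])
y = 0.8 * X[:, 0] + rng.normal(0, 3, n)   # synthetic rush-hour PM2.5

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
                ("gb", GradientBoostingRegressor(random_state=0))],
    final_estimator=Ridge(),
)
print("CV R^2:", cross_val_score(stack, X, y, cv=3, scoring="r2").mean().round(3))
```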
13.
Curr Med Imaging ; 20: 1-18, 2024.
Article in English | MEDLINE | ID: mdl-38389356

ABSTRACT

BACKGROUND: Glaucoma is a significant cause of irreversible blindness worldwide, with symptoms often going undetected until the patient's visual field starts shrinking. OBJECTIVE: To develop an AI-based glaucoma detection method to reduce glaucoma-related blindness and offer a more precise diagnosis. METHODS: This study discusses various methods and technologies, including Heidelberg Retinal Tomography (HRT), Optical Coherence Tomography (OCT), and Fundus Photography, for obtaining relevant information about the presence of glaucoma in a patient, and the use of Support Vector Machines (SVMs) and Convolutional Neural Networks (CNNs) for glaucoma detection. Existing methods have many limitations, such as asymptomatic progression, reliance on subjective feedback, the need for multiple tests, late detection, limited availability of preventive tests, and the influence of external factors. RESULTS: Findings reveal promising outcomes in terms of glaucoma detection accuracy, particularly in the analysis of the RIM-ONE-r3 dataset. By scrutinizing 20 images from the Healthy, Glaucoma, and Suspects categories through fundus image recognition, our developed AI model consistently achieved high diagnostic accuracy rates. CONCLUSION: Our study suggests that further enhancements in glaucoma detection accuracy are attainable by augmenting the dataset with additional labeled images. We emphasize the significance of considering various application parameters when discussing the integration of computer-aided decision/management systems into healthcare frameworks.


Subject(s)
Deep Learning , Glaucoma , Humans , Glaucoma/diagnostic imaging , Fundus Oculi , Neural Networks (Computer) , Blindness
14.
JMIR AI ; 3: e48295, 2024 Mar 12.
Article in English | MEDLINE | ID: mdl-38875582

ABSTRACT

BACKGROUND: Identification and referral of at-risk patients from primary care practitioners (PCPs) to eye care professionals remain a challenge. Approximately 1.9 million Americans suffer from vision loss as a result of undiagnosed or untreated ophthalmic conditions. In ophthalmology, artificial intelligence (AI) is used to predict glaucoma progression, recognize diabetic retinopathy (DR), and classify ocular tumors; however, AI has not yet been used to triage primary care patients for ophthalmology referral. OBJECTIVE: This study aimed to build and compare machine learning (ML) methods, applicable to electronic health records (EHRs) of PCPs, capable of triaging patients for referral to eye care specialists. METHODS: Accessing the Optum deidentified EHR data set, 743,039 patients with 5 leading vision conditions (age-related macular degeneration [AMD], visually significant cataract, DR, glaucoma, or ocular surface disease [OSD]) were exact-matched on age and gender to 743,039 controls without eye conditions. Between 142 and 182 non-ophthalmic parameters per patient were input into 5 ML methods: generalized linear model, L1-regularized logistic regression, random forest, Extreme Gradient Boosting (XGBoost), and J48 decision tree. Model performance was compared for each pathology to select the most predictive algorithm. The area under the curve (AUC) was assessed for all algorithms for each outcome. RESULTS: XGBoost demonstrated the best performance, showing, respectively, a prediction accuracy and an AUC of 78.6% (95% CI 78.3%-78.9%) and 0.878 for visually significant cataract, 77.4% (95% CI 76.7%-78.1%) and 0.858 for exudative AMD, 79.2% (95% CI 78.8%-79.6%) and 0.879 for nonexudative AMD, 72.2% (95% CI 69.9%-74.5%) and 0.803 for OSD requiring medication, 70.8% (95% CI 70.5%-71.1%) and 0.785 for glaucoma, 85.0% (95% CI 84.2%-85.8%) and 0.924 for type 1 nonproliferative diabetic retinopathy (NPDR), 82.2% (95% CI 80.4%-84.0%) and 0.911 for type 1 proliferative diabetic retinopathy (PDR), 81.3% (95% CI 81.0%-81.6%) and 0.891 for type 2 NPDR, and 82.1% (95% CI 81.3%-82.9%) and 0.900 for type 2 PDR. CONCLUSIONS: The 5 ML methods deployed were able to successfully identify patients with elevated odds ratios (ORs), thus capable of patient triage, for ocular pathology ranging from 2.4 (95% CI 2.4-2.5) for glaucoma to 5.7 (95% CI 5.0-6.4) for type 1 NPDR, with an average OR of 3.9. The application of these models could enable PCPs to better identify and triage patients at risk for treatable ophthalmic pathology. Early identification of patients with unrecognized sight-threatening conditions may lead to earlier treatment and a reduced economic burden. More importantly, such triage may improve patients' lives.
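
A minimal sketch, under stated assumptions, of the XGBoost-based triage classifier described above, reporting AUC and an odds ratio derived from the predicted-versus-actual 2x2 table. The feature matrix and labels are synthetic stand-ins for the Optum EHR parameters.

```python
# Minimal sketch: XGBoost triage classifier with AUC and an odds ratio from
# the confusion matrix. Data are synthetic, not the Optum EHR data set.
import numpy as np
from xgboost import XGBClassifier
from sklearn.metrics import roc_auc_score, confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(4000, 150))                 # ~150 non-ophthalmic parameters
y = (rng.uniform(size=4000) < 0.5).astype(int)   # case/control (exact-matched)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = XGBClassifier(n_estimators=200).fit(X_tr, y_tr)

proba = clf.predict_proba(X_te)[:, 1]
tn, fp, fn, tp = confusion_matrix(y_te, (proba >= 0.5).astype(int)).ravel()
odds_ratio = (tp * tn) / (fp * fn)               # OR of disease given a positive triage call
print("AUC:", round(roc_auc_score(y_te, proba), 3), "OR:", round(odds_ratio, 2))
```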

15.
J Clin Med ; 13(13)2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38999416

ABSTRACT

Background: Chest radiography is the standard method for detecting rib fractures. Our study aims to develop an artificial intelligence (AI) model that, with only a relatively small amount of training data, can identify rib fractures on chest radiographs and accurately mark their precise locations, thereby achieving a diagnostic accuracy comparable to that of medical professionals. Methods: For this retrospective study, we developed an AI model using 540 chest radiographs (270 normal and 270 with rib fractures) labeled for use with Detectron2, which incorporates a Faster region-based convolutional neural network (R-CNN) enhanced with a feature pyramid network (FPN). The model's ability to classify radiographs and detect rib fractures was assessed. Furthermore, we compared the model's performance to that of 12 physicians, including six board-certified anesthesiologists and six residents, through an observer performance test. Results: Regarding the radiographic classification performance of the AI model, the sensitivity, specificity, and area under the receiver operating characteristic curve (AUROC) were 0.87, 0.83, and 0.89, respectively. In terms of rib fracture detection performance, the sensitivity, false-positive rate, and jackknife alternative free-response receiver operating characteristic (JAFROC) figure of merit (FOM) were 0.62, 0.3, and 0.76, respectively. In the observer performance test, the AI model showed no statistically significant difference from 11 of 12 physicians for classification and 10 of 12 physicians for fracture detection. Conclusions: We developed an AI model trained on a limited dataset that demonstrated a rib fracture classification and detection performance comparable to that of an experienced physician.
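
A minimal sketch of a standard Detectron2 setup for a single-class Faster R-CNN + FPN detector, as a rough analogue of the pipeline described in this record. The dataset name rib_fx_train is hypothetical and would have to be registered with Detectron2 beforehand, and the solver settings are illustrative rather than the authors' configuration.

```python
# Minimal sketch: Detectron2 Faster R-CNN + FPN configured for one class
# (rib fracture). "rib_fx_train" is a hypothetical registered dataset name.
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultTrainer

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_R_50_FPN_3x.yaml")
cfg.DATASETS.TRAIN = ("rib_fx_train",)   # hypothetical registered dataset
cfg.DATASETS.TEST = ()
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 1      # one foreground class: rib fracture
cfg.SOLVER.IMS_PER_BATCH = 4             # illustrative solver settings
cfg.SOLVER.MAX_ITER = 3000

trainer = DefaultTrainer(cfg)
trainer.resume_or_load(resume=False)
trainer.train()
```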

16.
JMIR Med Educ ; 10: e51523, 2024 Feb 21.
Article in English | MEDLINE | ID: mdl-38381486

ABSTRACT

BACKGROUND: Large language models (LLMs) have revolutionized natural language processing with their ability to generate human-like text through extensive training on large data sets. These models, including Generative Pre-trained Transformers (GPT)-3.5 (OpenAI), GPT-4 (OpenAI), and Bard (Google LLC), find applications beyond natural language processing, attracting interest from academia and industry. Students are actively leveraging LLMs to enhance learning experiences and prepare for high-stakes exams, such as the National Eligibility cum Entrance Test (NEET) in India. OBJECTIVE: This comparative analysis aims to evaluate the performance of GPT-3.5, GPT-4, and Bard in answering NEET-2023 questions. METHODS: In this paper, we evaluated the performance of the 3 mainstream LLMs, namely GPT-3.5, GPT-4, and Google Bard, in answering questions related to the NEET-2023 exam. The questions of the NEET were provided to these artificial intelligence models, and the responses were recorded and compared against the correct answers from the official answer key. Consensus was used to evaluate the performance of all 3 models. RESULTS: It was evident that GPT-4 passed the entrance test with flying colors (300/700, 42.9%), showcasing exceptional performance. On the other hand, GPT-3.5 managed to meet the qualifying criteria, but with a substantially lower score (145/700, 20.7%). However, Bard (115/700, 16.4%) failed to meet the qualifying criteria and did not pass the test. GPT-4 demonstrated consistent superiority over Bard and GPT-3.5 in all 3 subjects. Specifically, GPT-4 achieved accuracy rates of 73% (29/40) in physics, 44% (16/36) in chemistry, and 51% (50/99) in biology. Conversely, GPT-3.5 attained an accuracy rate of 45% (18/40) in physics, 33% (13/26) in chemistry, and 34% (34/99) in biology. The accuracy consensus metric showed that the matching responses between GPT-4 and Bard, as well as GPT-4 and GPT-3.5, had higher incidences of being correct, at 0.56 and 0.57, respectively, compared to the matching responses between Bard and GPT-3.5, which stood at 0.42. When all 3 models were considered together, their matching responses reached the highest accuracy consensus of 0.59. CONCLUSIONS: The study's findings provide valuable insights into the performance of GPT-3.5, GPT-4, and Bard in answering NEET-2023 questions. GPT-4 emerged as the most accurate model, highlighting its potential for educational applications. Cross-checking responses across models may result in confusion as the compared models (as duos or a trio) tend to agree on only a little over half of the correct responses. Using GPT-4 as one of the compared models will result in higher accuracy consensus. The results underscore the suitability of LLMs for high-stakes exams and their positive impact on education. Additionally, the study establishes a benchmark for evaluating and enhancing LLMs' performance in educational tasks, promoting responsible and informed use of these models in diverse learning environments.


Subject(s)
Artificial Intelligence , Benchmarking , Humans , Educational Status , Confusion , India
17.
Pathology ; 55(3): 342-349, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36641379

ABSTRACT

We trained an artificial intelligence (AI) algorithm to identify basal cell carcinoma (BCC), and to distinguish BCC from histological mimics. A total of 1061 glass slides were collected: 616 containing BCC and 445 without BCC. BCC slides were collected prospectively, reflecting the range of specimen types and morphological variety encountered in routine pathology practice. Benign and malignant histological mimics of BCC were selected prospectively and retrospectively, including cases considered diagnostically challenging for pathologists. Glass slides were digitally scanned to create a whole slide image (WSI), which was divided into patches representing a tissue area of 65,535 µm². Pathologists annotated the data, yielding 87,205 patches labelled BCC present and 1,688,697 patches labelled BCC absent. The COMPASS model (COntext-aware Multi-scale tool for Pathologists Assessing SlideS), based on convolutional neural networks, was trained to provide a probability of BCC being present at the patch level and the slide level. The test set comprised 246 slides, 147 of which contained BCC. The COMPASS AI model demonstrated high accuracy, classifying WSIs as containing BCC with a sensitivity of 98.0% and a specificity of 97.0%, representing 240 WSIs classified correctly, three false positives, and three false negatives. Using BCC as a proof of concept, we demonstrate how AI can account for morphological variation within an entity, and accurately distinguish it from histologically similar entities. Our study highlights the potential for AI in routine pathology practice.


Subject(s)
Basal Cell Carcinoma , Skin Neoplasms , Humans , Artificial Intelligence , Retrospective Studies , Basal Cell Carcinoma/diagnosis , Algorithms , Skin Neoplasms/diagnosis , Skin Neoplasms/pathology
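
A minimal sketch of the patch-to-slide aggregation step implied by this record: per-patch BCC probabilities are reduced to one slide-level score and thresholded, then sensitivity and specificity are computed over the slides. The max-over-patches rule, the threshold, and the simulated probabilities are assumptions for illustration, not the COMPASS model's actual aggregation.

```python
# Minimal sketch: aggregate patch-level BCC probabilities to slide level and
# score sensitivity/specificity. All numbers are simulated.
import numpy as np

rng = np.random.default_rng(0)

def slide_probability(patch_probs):
    """Aggregate patch-level probabilities into a slide-level score (max rule)."""
    return float(np.max(patch_probs))

# Simulated test set: 147 BCC slides and 99 BCC-free slides, 500 patches each.
bcc_slides = [rng.uniform(0.0, 1.0, 500) for _ in range(147)]
benign_slides = [rng.uniform(0.0, 0.7, 500) for _ in range(99)]

y_true = np.array([1] * len(bcc_slides) + [0] * len(benign_slides))
scores = np.array([slide_probability(p) for p in bcc_slides + benign_slides])
y_pred = (scores >= 0.9).astype(int)

tp = int(np.sum((y_true == 1) & (y_pred == 1)))
fn = int(np.sum((y_true == 1) & (y_pred == 0)))
tn = int(np.sum((y_true == 0) & (y_pred == 0)))
fp = int(np.sum((y_true == 0) & (y_pred == 1)))
print("Sensitivity:", round(tp / (tp + fn), 3), "Specificity:", round(tn / (tn + fp), 3))
```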
18.
Healthcare (Basel) ; 11(10)2023 May 10.
Article in English | MEDLINE | ID: mdl-37239653

ABSTRACT

Convolutional neural networks (CNNs) have shown promise in accurately diagnosing coronavirus disease 2019 (COVID-19) and bacterial pneumonia using chest X-ray images. However, determining the optimal feature extraction approach is challenging. This study investigates the use of fusion-extracted features by deep networks to improve the accuracy of COVID-19 and bacterial pneumonia classification with chest X-ray radiography. A Fusion CNN method was developed using five different deep learning models after transferred learning to extract image features (Fusion CNN). The combined features were used to build a support vector machine (SVM) classifier with a RBF kernel. The performance of the model was evaluated using accuracy, Kappa values, recall rate, and precision scores. The Fusion CNN model achieved an accuracy and Kappa value of 0.994 and 0.991, with precision scores for normal, COVID-19, and bacterial groups of 0.991, 0.998, and 0.994, respectively. The results indicate that the Fusion CNN models with the SVM classifier provided reliable and accurate classification performance, with Kappa values no less than 0.990. Using a Fusion CNN approach could be a possible solution to enhance accuracy further. Therefore, the study demonstrates the potential of deep learning and fusion-extracted features for accurate COVID-19 and bacterial pneumonia classification with chest X-ray radiography.
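
A minimal sketch of the fusion idea in this record: features extracted by several pretrained CNNs are concatenated per image and classified with an RBF-kernel SVM. The random matrices stand in for real CNN embeddings, and the class labels (normal, COVID-19, bacterial) are synthetic.

```python
# Minimal sketch: concatenate per-model CNN features and classify with an
# RBF-kernel SVM. Embeddings and labels are random stand-ins.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, cohen_kappa_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 900
# Pretend embeddings from five backbones (dimension truncated for brevity).
per_model_features = [rng.normal(size=(n, 128)) for _ in range(5)]
X = np.hstack(per_model_features)        # fused feature vector per image
y = rng.integers(0, 3, size=n)           # 0 = normal, 1 = COVID-19, 2 = bacterial

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
svm = SVC(kernel="rbf").fit(X_tr, y_tr)
pred = svm.predict(X_te)
print("Accuracy:", round(accuracy_score(y_te, pred), 3),
      "Kappa:", round(cohen_kappa_score(y_te, pred), 3))
```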

19.
Comput Biol Med ; 159: 106901, 2023 06.
Article in English | MEDLINE | ID: mdl-37068317

ABSTRACT

BACKGROUND AND PURPOSE: A medical AI system's generalizability describes the continuity of its performance acquired from varying geographic, historical, and methodologic settings. Previous literature on this topic has mostly focused on "how" to achieve high generalizability (e.g., via larger datasets, transfer learning, data augmentation, model regularization schemes), with limited success. Instead, we aim to understand "when" the generalizability is achieved: Our study presents a medical AI system that could estimate its generalizability status for unseen data on-the-fly. MATERIALS AND METHODS: We introduce a latent space mapping (LSM) approach utilizing Fréchet distance loss to force the underlying training data distribution into a multivariate normal distribution. During the deployment, a given test data's LSM distribution is processed to detect its deviation from the forced distribution; hence, the AI system could predict its generalizability status for any previously unseen data set. If low model generalizability is detected, then the user is informed by a warning message integrated into a sample deployment workflow. While the approach is applicable to most classification deep neural networks (DNNs), we demonstrate its application to a brain metastases (BM) detector for T1-weighted contrast-enhanced (T1c) 3D MRI. The BM detection model was trained using 175 T1c studies acquired internally (from the authors' institution) and tested using (1) 42 internally acquired exams and (2) 72 externally acquired exams from the publicly distributed Brain Mets dataset provided by the Stanford University School of Medicine. Generalizability scores, false positive (FP) rates, and sensitivities of the BM detector were computed for the test datasets. RESULTS AND CONCLUSION: The model predicted its generalizability to be low for 31% of the testing data (i.e., two of the internally and 33 of the externally acquired exams), where it produced (1) ∼13.5 false positives (FPs) at 76.1% BM detection sensitivity for the low-generalizability group and (2) ∼10.5 FPs at 89.2% BM detection sensitivity for the high-generalizability group. These results suggest that the proposed formulation enables a model to predict its generalizability for unseen data.


Subject(s)
Brain Neoplasms , Computer-Assisted Diagnosis , Humans , Computer-Assisted Diagnosis/methods , Magnetic Resonance Imaging/methods , Neural Networks (Computer) , Brain Neoplasms/diagnostic imaging , Brain Neoplasms/secondary
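
A minimal sketch of the core computation behind the approach in this record: the Fréchet distance between Gaussians fitted to training and test latent vectors, with a large distance flagging low expected generalizability. The latent vectors and the threshold are illustrative assumptions, not the paper's latent space mapping or cut-off.

```python
# Minimal sketch: Fréchet distance between Gaussians fitted to latent vectors,
# used as an out-of-distribution / low-generalizability flag.
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(x, y):
    """Fréchet distance between Gaussians fitted to two sets of latent vectors."""
    mu1, mu2 = x.mean(axis=0), y.mean(axis=0)
    s1, s2 = np.cov(x, rowvar=False), np.cov(y, rowvar=False)
    covmean = sqrtm(s1 @ s2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2 * covmean))

rng = np.random.default_rng(0)
train_latents = rng.normal(0, 1, size=(500, 16))    # mapped toward N(0, I) during training
test_latents = rng.normal(0.8, 1.3, size=(80, 16))  # shifted external exams

d = frechet_distance(train_latents, test_latents)
THRESHOLD = 5.0                                      # hypothetical cut-off
print(f"Frechet distance = {d:.2f};",
      "low generalizability" if d > THRESHOLD else "in-distribution")
```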
20.
JMIR AI ; 2: e42313, 2023.
Article in English | MEDLINE | ID: mdl-37457747

ABSTRACT

Background: Despite immense progress in artificial intelligence (AI) models, there has been limited deployment in health care environments. The gap between potential and actual AI applications is likely due to the lack of translatability between controlled research environments (where these models are developed) and clinical environments for which the AI tools are ultimately intended. Objective: We previously developed the Translational Evaluation of Healthcare AI (TEHAI) framework to assess the translational value of AI models and to support successful transition to health care environments. In this study, we applied the TEHAI framework to the COVID-19 literature in order to assess how well translational topics are covered. Methods: A systematic literature search for COVID-19 AI studies published between December 2019 and December 2020 resulted in 3830 records. A subset of 102 (2.7%) papers that passed the inclusion criteria was sampled for full review. The papers were assessed for translational value and descriptive data collected by 9 reviewers (each study was assessed by 2 reviewers). Evaluation scores and extracted data were compared by a third reviewer for resolution of discrepancies. The review process was conducted on the Covidence software platform. Results: We observed a significant trend for studies to attain high scores for technical capability but low scores for the areas essential for clinical translatability. Specific questions regarding external model validation, safety, nonmaleficence, and service adoption received failed scores in most studies. Conclusions: Using TEHAI, we identified notable gaps in how well translational topics of AI models are covered in the COVID-19 clinical sphere. These gaps in areas crucial for clinical translatability could, and should, be considered already at the model development stage to increase translatability into real COVID-19 health care environments.
