1.
Am J Pathol ; 194(2): 253-263, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38029922

ABSTRACT

Obese patients with breast cancer have worse outcomes than their normal weight counterparts, with a 50% to 80% increased rate of axillary nodal metastasis. Recent studies suggest a link between increased lymph node adipose tissue and breast cancer nodal metastasis. Further investigation into potential mechanisms underlying this link may reveal potential prognostic utility of fat-enlarged lymph nodes in patients with breast cancer. This study used a deep learning model to identify morphologic differences in nonmetastatic axillary nodes between obese, node-positive, and node-negative patients with breast cancer. The model was developed using nested cross-validation on 180 cases and achieved an area under the receiver operator characteristic curve of 0.67 in differentiating patients using hematoxylin and eosin-stained whole slide images. The morphologic analysis of the predictive regions showed an increased average adipocyte size (P = 0.004), increased white space between lymphocytes (P < 0.0001), and increased red blood cells (P < 0.001) in nonmetastatic lymph nodes of node-positive patients. Preliminary immunohistochemistry analysis on a subset of 30 patients showed a trend of decreased CD3 expression and increased leptin expression in fat-replaced axillary lymph nodes of obese, node-positive patients. These findings suggest a novel direction to further investigate the interaction between lymph node adiposity, lymphatic dysfunction, and breast cancer nodal metastases, highlighting a possible prognostic tool for obese patients with breast cancer.


Subject(s)
Breast Neoplasms , Humans , Female , Breast Neoplasms/complications , Breast Neoplasms/pathology , Lymphatic Metastasis/pathology , Neoplasm Staging , Lymph Nodes/pathology , Obesity/complications , Obesity/pathology
2.
Am J Pathol ; 2024 Jun 13.
Article in English | MEDLINE | ID: mdl-38879079

ABSTRACT

Endometrial cancer is the fourth most common cancer in women in the United States; the lifetime risk for developing this disease is approximately 2.8%. Precise histologic evaluation and molecular classification of endometrial cancer are important for effective patient management and determining the best treatment modalities. This study introduces EndoNet, which uses convolutional neural networks for extracting histologic features and a vision transformer for aggregating these features and classifying slides based on their visual characteristics into high- and low-grade cases. The model was trained on 929 digitized hematoxylin and eosin-stained whole-slide images of endometrial cancer from hysterectomy cases at Dartmouth-Health. It classifies these slides into low-grade (endometrioid grades 1 and 2) and high-grade (endometrioid carcinoma International Federation of Gynecology and Obstetrics grade 3, uterine serous carcinoma, or carcinosarcoma) categories. EndoNet was evaluated on an internal test set of 110 patients and an external test set of 100 patients from The Cancer Genome Atlas public database. The model achieved a weighted average F1 score of 0.91 (95% CI, 0.86 to 0.95) and an area under the curve of 0.95 (95% CI, 0.89 to 0.99) on the internal test, and 0.86 (95% CI, 0.80 to 0.94) for F1 score and 0.86 (95% CI, 0.75 to 0.93) for area under the curve on the external test. Pending further validation, EndoNet has the potential to support pathologists without the need of manual annotations in classifying the grades of gynecologic pathology tumors.
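The weighted average F1 score reported above can be illustrated with a minimal sketch. It assumes the usual definition (per-class F1 averaged with weights proportional to each class's number of true cases); the abstract does not spell out its exact computation, and the function name `weighted_f1` and the six labels below are hypothetical, not data from the study.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)          # number of true cases per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1      # weight by class frequency
    return score


# Hypothetical grades for six slides (illustration only).
y_true = ["low", "low", "low", "low", "high", "high"]
y_pred = ["low", "low", "low", "high", "high", "high"]
print(round(weighted_f1(y_true, y_pred), 3))
```

Weighting by support keeps the majority class from being under-represented in the summary, which matters when low-grade and high-grade cases are not equally common.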

3.
Am J Pathol ; 193(3): 332-340, 2023 03.
Article in English | MEDLINE | ID: mdl-36563748

ABSTRACT

Colorectal cancer (CRC) is one of the most common types of cancer among men and women. The grading of dysplasia and the detection of adenocarcinoma are important clinical tasks in the diagnosis of CRC and shape the patients' follow-up plans. This study evaluated the feasibility of deep learning models for the classification of colorectal lesions into four classes: benign, low-grade dysplasia, high-grade dysplasia, and adenocarcinoma. To this end, a deep neural network was developed on a training set of 655 whole slide images of digitized colorectal resection slides from a tertiary medical institution; and the network was evaluated on an internal test set of 234 slides, as well as on an external test set of 606 adenocarcinoma slides from The Cancer Genome Atlas database. The model achieved an overall accuracy, sensitivity, and specificity of 95.5%, 91.0%, and 97.1%, respectively, on the internal test set, and an accuracy and sensitivity of 98.5% for adenocarcinoma detection task on the external test set. Results suggest that such deep learning models can potentially assist pathologists in grading colorectal dysplasia, detecting adenocarcinoma, prescreening, and prioritizing the reviewing of suspicious cases to improve the turnaround time for patients with a high risk of CRC. Furthermore, the high sensitivity on the external test set suggests the model's generalizability in detecting colorectal adenocarcinoma on whole slide images across different institutions.


Subject(s)
Adenocarcinoma , Colorectal Neoplasms , Deep Learning , Male , Humans , Female , Neural Networks, Computer , Adenocarcinoma/diagnosis , Adenocarcinoma/pathology , Pathologists , Hyperplasia , Colorectal Neoplasms/diagnosis
4.
J Med Internet Res ; 25: e45556, 2023 06 13.
Article in English | MEDLINE | ID: mdl-37310787

ABSTRACT

BACKGROUND: Multiple digital data sources can capture moment-to-moment information to advance a robust understanding of opioid use disorder (OUD) behavior, ultimately creating a digital phenotype for each patient. This information can lead to individualized interventions to improve treatment for OUD. OBJECTIVE: The aim is to examine patient engagement with multiple digital phenotyping methods among patients receiving buprenorphine medication for OUD. METHODS: The study enrolled 65 patients receiving buprenorphine for OUD between June 2020 and January 2021 from 4 addiction medicine programs in an integrated health care delivery system in Northern California. Ecological momentary assessment (EMA), sensor data, and social media data were collected by smartphone, smartwatch, and social media platforms over a 12-week period. Primary engagement outcomes were meeting measures of minimum phone carry (≥8 hours per day) and watch wear (≥18 hours per day) criteria, EMA response rates, social media consent rate, and data sparsity. Descriptive analyses, bivariate, and trend tests were performed. RESULTS: The participants' average age was 37 years, 47% of them were female, and 71% of them were White. On average, participants met phone carrying criteria on 94% of study days, met watch wearing criteria on 74% of days, and wore the watch to sleep on 77% of days. The mean EMA response rate was 70%, declining from 83% to 56% from week 1 to week 12. Among participants with social media accounts, 88% of them consented to providing data; of them, 55% of Facebook, 54% of Instagram, and 57% of Twitter participants provided data. The amount of social media data available varied widely across participants. No differences by age, sex, race, or ethnicity were observed for any outcomes. CONCLUSIONS: To our knowledge, this is the first study to capture these 3 digital data sources in this clinical population. 
Our findings demonstrate that patients receiving buprenorphine treatment for OUD had generally high engagement with multiple digital phenotyping data sources, but this was more limited for the social media data. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): RR2-10.3389/fpsyt.2022.871916.


Subject(s)
Buprenorphine , Opioid-Related Disorders , Female , Humans , Male , Patient Participation , Buprenorphine/therapeutic use , Ecological Momentary Assessment , Ethnicity , Opioid-Related Disorders/drug therapy
5.
Breast Cancer Res Treat ; 189(1): 257-267, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34081259

ABSTRACT

PURPOSE: Obesity associated fat infiltration of organ systems is accompanied by organ dysfunction and poor cancer outcomes. Obese women demonstrate variable degrees of fat infiltration of axillary lymph nodes (LNs), and they are at increased risk for node-positive breast cancer. However, the relationship between enlarged axillary nodes and axillary metastases has not been investigated. The purpose of this study is to evaluate the association between axillary metastases and fat-enlarged axillary nodes visualized on mammograms and breast MRI in obese women with a diagnosis of invasive breast cancer. METHODS: This retrospective case-control study included 431 patients with histologically confirmed invasive breast cancer. The primary analysis of this study included 306 patients with pre-treatment and pre-operative breast MRI and body mass index (BMI) > 30 (201 node-positive cases and 105 randomly selected node-negative controls) diagnosed with invasive breast cancer between April 1, 2011, and March 1, 2020. The largest visible LN was measured in the axilla contralateral to the known breast cancer on breast MRI. Multivariate logistic regression models were used to assess the association between node-positive status and LN size adjusting for age, BMI, tumor size, tumor grade, tumor subtype, and lymphovascular invasion. RESULTS: A strong likelihood of node-positive breast cancer was observed among obese women with fat-expanded lymph nodes (adjusted OR for the 4th vs. 1st quartile for contralateral LN size on MRI: 9.70; 95% CI 4.26, 23.50; p < 0.001). The receiver operating characteristic curve for size of fat-enlarged nodes in the contralateral axilla identified on breast MRI had an area under the curve of 0.72 for predicting axillary metastasis, and this increased to 0.77 when combined with patient and tumor characteristics. 
CONCLUSION: Fat expansion of axillary lymph nodes was associated with a high likelihood of axillary metastases in obese women with invasive breast cancer independent of BMI and tumor characteristics.


Subject(s)
Breast Neoplasms , Axilla , Breast Neoplasms/complications , Breast Neoplasms/epidemiology , Breast Neoplasms/surgery , Case-Control Studies , Female , Humans , Lymph Nodes/diagnostic imaging , Obesity/complications , Retrospective Studies , Sentinel Lymph Node Biopsy
6.
J Med Internet Res ; 23(9): e27314, 2021 09 15.
Article in English | MEDLINE | ID: mdl-34524095

ABSTRACT

BACKGROUND: Many social media studies have explored the ability of thematic structures, such as hashtags and subreddits, to identify information related to a wide variety of mental health disorders. However, studies and models trained on specific themed communities are often difficult to apply to different social media platforms and related outcomes. A deep learning framework using thematic structures from Reddit and Twitter can have distinct advantages for studying alcohol abuse, particularly among the youth in the United States. OBJECTIVE: This study proposes a new deep learning pipeline that uses thematic structures to identify alcohol-related content across different platforms. We apply our method on Twitter to determine the association of the prevalence of alcohol-related tweets with alcohol-related outcomes reported from the National Institute on Alcohol Abuse and Alcoholism, Centers for Disease Control and Prevention Behavioral Risk Factor Surveillance System, county health rankings, and the North American Industry Classification System. METHODS: The Bidirectional Encoder Representations From Transformers neural network learned to classify 1,302,524 Reddit posts as belonging to either alcohol-related or control subreddits. The trained model identified 24 alcohol-related hashtags from an unlabeled data set of 843,769 random tweets. Querying alcohol-related hashtags identified 25,558,846 alcohol-related tweets, including 790,544 location-specific (geotagged) tweets. We calculated the correlation between the prevalence of alcohol-related tweets and alcohol-related outcomes, controlling for confounding effects of age, sex, income, education, and self-reported race, as recorded by the 2013-2018 American Community Survey. 
RESULTS: Significant associations were observed: between alcohol-hashtagged tweets and alcohol consumption (P=.01) and heavy drinking (P=.005) but not binge drinking (P=.37), self-reported at the metropolitan-micropolitan statistical area level; between alcohol-hashtagged tweets and self-reported excessive drinking behavior (P=.03) but not motor vehicle fatalities involving alcohol (P=.21); between alcohol-hashtagged tweets and the number of breweries (P<.001), wineries (P<.001), and beer, wine, and liquor stores (P<.001) but not drinking places (P=.23), per capita at the US county and county-equivalent level; and between alcohol-hashtagged tweets and all gallons of ethanol consumed (P<.001), as well as ethanol consumed from wine (P<.001) and liquor (P=.01) sources but not beer (P=.63), at the US state level. CONCLUSIONS: Here, we present a novel natural language processing pipeline developed using Reddit's alcohol-related subreddits that identify highly specific alcohol-related Twitter hashtags. The prevalence of identified hashtags contains interpretable information about alcohol consumption at both coarse (eg, US state) and fine-grained (eg, metropolitan-micropolitan statistical area level and county) geographical designations. This approach can expand research and deep learning interventions on alcohol abuse and other behavioral health outcomes.


Subject(s)
Deep Learning , Social Media , Adolescent , Alcohol Drinking/epidemiology , Ethanol , Humans , Natural Language Processing , United States/epidemiology
7.
J Med Internet Res ; 23(10): e25512, 2021 10 22.
Article in English | MEDLINE | ID: mdl-34677131

ABSTRACT

BACKGROUND: Providing digital recordings of clinic visits to patients has emerged as a strategy to promote patient and family engagement in care. With advances in natural language processing, an opportunity exists to maximize the value of visit recordings for patients by automatically tagging key visit information (eg, medications, tests, and imaging) and linkages to trustworthy web-based resources curated in an audio-based personal health library. OBJECTIVE: This study aims to report on the user-centered development of HealthPAL, an audio personal health library. METHODS: Our user-centered design and usability evaluation approach incorporated iterative rounds of video-recorded sessions from 2016 to 2019. We recruited participants from a range of community settings to represent older patient and caregiver perspectives. In the first round, we used paper prototypes and focused on feature envisionment. We moved to low-fidelity and high-fidelity versions of the HealthPAL in later rounds, which focused on functionality and use; all sessions included a debriefing interview. Participants listened to a deidentified, standardized primary care visit recording before completing a series of tasks (eg, finding where a medication was discussed in the recording). In the final round, we recorded the patients' primary care clinic visits for use in the session. Findings from each round informed the agile software development process. Task completion and critical incidents were recorded in each round, and the System Usability Scale was completed by participants using the digital prototype in later rounds. RESULTS: We completed 5 rounds of usability sessions with 40 participants, of whom 25 (63%) were women with a median age of 68 years (range 23-89). 
Feedback from sessions resulted in color-coding and highlighting of information tags, a more prominent play button, clearer structure to move between one's own recordings and others' recordings, the ability to filter recording content by the topic discussed and descriptions, 10-second forward and rewind controls, and a help link and search bar. Perceived usability increased over the rounds, with a median System Usability Scale of 78.2 (range 20-100) in the final round. Participants were overwhelmingly positive about the concept of accessing a curated audio recording of a clinic visit. Some participants reported concerns about privacy and the computer-based skills necessary to access recordings. CONCLUSIONS: To our knowledge, HealthPAL is the first patient-centered app designed to allow patients and their caregivers to access easy-to-navigate recordings of clinic visits, with key concepts tagged and hyperlinks to further information provided. The HealthPAL user interface has been rigorously co-designed with older adult patients and their caregivers and is now ready for further field testing. The successful development and use of HealthPAL may help improve the ability of patients to manage their own care, especially older adult patients who have to navigate complex treatment plans.


Subject(s)
Caregivers , User-Centered Design , Adult , Aged , Aged, 80 and over , Ambulatory Care , Female , Humans , Middle Aged , Primary Health Care , Young Adult
8.
J Biomed Inform ; 111: 103581, 2020 11.
Article in English | MEDLINE | ID: mdl-33010425

ABSTRACT

OBJECTIVE: Currently, a major limitation for natural language processing (NLP) analyses in clinical applications is that concepts are not effectively referenced in various forms across different texts. This paper introduces Multi-Ontology Refined Embeddings (MORE), a novel hybrid framework that incorporates domain knowledge from multiple ontologies into a distributional semantic model, learned from a corpus of clinical text. MATERIALS AND METHODS: We use the RadCore and MIMIC-III free-text datasets for the corpus-based component of MORE. For the ontology-based part, we use the Medical Subject Headings (MeSH) ontology and three state-of-the-art ontology-based similarity measures. In our approach, we propose a new learning objective, modified from the sigmoid cross-entropy objective function. RESULTS AND DISCUSSION: We used two established datasets of semantic similarities among biomedical concept pairs to evaluate the quality of the generated word embeddings. On the first dataset with 29 concept pairs, with similarity scores established by physicians and medical coders, MORE's similarity scores have the highest combined correlation (0.633), which is 5.0% higher than that of the baseline model, and 12.4% higher than that of the best ontology-based similarity measure. On the second dataset with 449 concept pairs, MORE's similarity scores have a correlation of 0.481, based on the average of four medical residents' similarity ratings, and that outperforms the skip-gram model by 8.1%, and the best ontology measure by 6.9%. Furthermore, MORE outperforms three pre-trained transformer-based word embedding models (i.e., BERT, ClinicalBERT, and BioBERT) on both datasets. CONCLUSION: MORE incorporates knowledge from several biomedical ontologies into an existing corpus-based distributional semantics model, improving both the accuracy of the learned word embeddings and the extensibility of the model to a broader range of biomedical concepts. 
MORE allows for more accurate clustering of concepts across a wide range of applications, such as analyzing patient health records to identify subjects with similar pathologies, or integrating heterogeneous clinical data to improve interoperability between hospitals.
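The evaluation described above (embedding similarity scores correlated with human similarity ratings) can be sketched as follows. This is a minimal illustration, not the MORE implementation: it assumes cosine similarity between word vectors and a Pearson correlation, neither of which the abstract specifies, and the three-dimensional vectors and ratings are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


# Hypothetical toy embeddings and human ratings (illustration only).
embeddings = {
    "renal failure": [0.9, 0.1, 0.2],
    "kidney failure": [0.8, 0.2, 0.1],
    "appendicitis": [0.1, 0.9, 0.3],
    "osteoporosis": [0.2, 0.3, 0.9],
}
pairs = [
    ("renal failure", "kidney failure"),
    ("renal failure", "appendicitis"),
    ("appendicitis", "osteoporosis"),
]
human_ratings = [4.0, 1.5, 1.0]

model_scores = [cosine(embeddings[a], embeddings[b]) for a, b in pairs]
print(pearson(model_scores, human_ratings))
```

A higher correlation means the embedding space orders concept pairs more like the human raters do, which is the sense in which one model "outperforms" another on these benchmarks.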


Subject(s)
Biological Ontologies , Natural Language Processing , Semantics , Cluster Analysis , Humans , Medical Subject Headings
9.
J Med Internet Res ; 22(9): e21916, 2020 09 16.
Article in English | MEDLINE | ID: mdl-32936081

ABSTRACT

BACKGROUND: Technology-based computational strategies that leverage social network site (SNS) data to detect substance use are promising screening tools but rely on the presence of sufficient data to detect risk if it is present. A better understanding of the association between substance use and SNS participation may inform the utility of these technology-based screening tools. OBJECTIVE: This paper aims to examine associations between substance use and Instagram posts and to test whether such associations differ as a function of age, gender, and race/ethnicity. METHODS: Participants with an Instagram account were recruited primarily via Clickworker (N=3117). With participant permission and Instagram's approval, participants' Instagram photo posts were downloaded with an application program interface. Participants' past-year substance use was measured with an adapted version of the National Institute on Drug Abuse Quick Screen. At-risk drinking was defined as at least one past-year instance having "had more than a few alcoholic drinks a day," drug use was defined as any use of nonprescription drugs, and prescription drug use was defined as any nonmedical use of prescription medications. We used logistic regression to examine the associations between substance use and any Instagram posts and negative binomial regression to examine the associations between substance use and number of Instagram posts. We examined whether age (18-25, 26-38, 39+ years), gender, and race/ethnicity moderated associations in both logistic and negative binomial models. All differences noted were significant at the .05 level. RESULTS: Compared with no at-risk drinking, any at-risk drinking was associated with both a higher likelihood of any Instagram posts and a higher number of posts, except among Hispanic/Latino individuals, in whom at-risk drinking was associated with a similar number of posts. 
Compared with no drug use, any drug use was associated with a higher likelihood of any posts but was associated with a similar number of posts. Compared with no prescription drug use, any prescription drug use was associated with a similar likelihood of any posts and was associated with a lower number of posts only among those aged 39 years and older. Of note, main effects showed that being female compared with being male and being Hispanic/Latino compared with being White were significantly associated with both a greater likelihood of any posts and a greater number of posts. CONCLUSIONS: Researchers developing computational substance use risk detection models using Instagram or other SNS data may wish to consider our findings showing that at-risk drinking and drug use were positively associated with Instagram participation, while prescription drug use was negatively associated with Instagram participation for middle- and older-aged adults. As more is learned about SNS behaviors among those who use substances, researchers may be better positioned to successfully design and interpret innovative risk detection approaches.


Subject(s)
Health Behavior/physiology , Social Media/statistics & numerical data , Social Networking , Substance-Related Disorders/epidemiology , Adolescent , Adult , Cross-Sectional Studies , Female , Humans , Male , Young Adult
10.
J Biomed Inform ; 93: 103169, 2019 05.
Article in English | MEDLINE | ID: mdl-30959206

ABSTRACT

Radiologists are expected to expediently communicate critical and unexpected findings to referring clinicians to prevent delayed diagnosis and treatment of patients. However, competing demands such as heavy workload along with lack of administrative support resulted in communication failures that accounted for 7% of the malpractice payments made from 2004 to 2008 in the United States. To address this problem, we have developed a novel machine learning method that can automatically and accurately identify cases that require prompt communication to referring physicians based on analyzing the associated radiology reports. This semi-supervised learning approach requires a minimal amount of manual annotations and was trained on a large multi-institutional radiology report repository from three major external healthcare organizations. To test our approach, we created a corpus of 480 radiology reports from our own institution and double-annotated cases that required prompt communication by two radiologists. Our evaluation on the test corpus achieved an F-score of 74.5% and recall of 90.0% in identifying cases for prompt communication. The implementation of the proposed approach as part of an online decision support system can assist radiologists in identifying radiological cases for prompt communication to referring physicians to avoid or minimize potential harm to patients.
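The abstract reports an F-score and a recall but not the precision. Assuming the F-score is the balanced F1 (the harmonic mean of precision and recall, which the abstract does not state explicitly), the implied precision follows from F1 = 2PR / (P + R), rearranged to P = F1 · R / (2R − F1). The function name below is hypothetical.

```python
def precision_from_f1(f1, recall):
    """Solve F1 = 2PR / (P + R) for the precision P."""
    denom = 2 * recall - f1
    if denom <= 0:
        raise ValueError("no valid precision exists for this F1/recall pair")
    return f1 * recall / denom


# Values reported in the abstract: F-score 74.5%, recall 90.0%.
precision = precision_from_f1(0.745, 0.900)
print(f"{precision:.3f}")
```

Under that assumption the implied precision is roughly 64%, that is, the method favors catching cases that need prompt communication at the cost of some false alarms.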


Subject(s)
Communication , Machine Learning , Radiologists , Referral and Consultation , Cluster Analysis , Humans
11.
BMC Med Imaging ; 19(1): 21, 2019 02 28.
Article in English | MEDLINE | ID: mdl-30819133

ABSTRACT

BACKGROUND: Computer-aided diagnosis of skin lesions is a growing area of research, but its application to nonmelanoma skin cancer (NMSC) is relatively under-studied. The purpose of this review is to synthesize the research that has been conducted on automated detection of NMSC using digital images and to assess the quality of evidence for the diagnostic accuracy of these technologies. METHODS: Eight databases (PubMed, Google Scholar, Embase, IEEE Xplore, Web of Science, SpringerLink, ScienceDirect, and the ACM Digital Library) were searched to identify diagnostic studies of NMSC using image-based machine learning models. Two reviewers independently screened eligible articles. The level of evidence of each study was evaluated using a five tier rating system, and the applicability and risk of bias of each study was assessed using the Quality Assessment of Diagnostic Accuracy Studies tool. RESULTS: Thirty-nine studies were reviewed. Twenty-four models were designed to detect basal cell carcinoma, two were designed to detect squamous cell carcinoma, and thirteen were designed to detect both. All studies were conducted in silico. The overall diagnostic accuracy of the classifiers, defined as concordance with histopathologic diagnosis, was high, with reported accuracies ranging from 72 to 100% and areas under the receiver operating characteristic curve ranging from 0.832 to 1. Most studies had substantial methodological limitations, but several were robustly designed and presented a high level of evidence. CONCLUSION: Most studies of image-based NMSC classifiers report performance greater than or equal to the reported diagnostic accuracy of the average dermatologist, but relatively few studies have presented a high level of evidence. Clinical studies are needed to assess whether these technologies can feasibly be implemented as a real-time aid for clinical diagnosis of NMSC.


Subject(s)
Carcinoma, Basal Cell/diagnosis , Carcinoma, Squamous Cell/diagnosis , Diagnosis, Computer-Assisted/methods , Skin Neoplasms/diagnosis , Area Under Curve , Humans , Image Interpretation, Computer-Assisted/methods , Machine Learning , Sensitivity and Specificity
12.
BMC Med Inform Decis Mak ; 19(1): 141, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31340796

ABSTRACT

BACKGROUND: Usage of structured fields in Electronic Health Records (EHRs) to ascertain smoking history is important but fails in capturing the nuances of smoking behaviors. Knowledge of smoking behaviors, such as pack year history and most recent cessation date, allows care providers to select the best care plan for patients at risk of smoking attributable diseases. METHODS: We developed and evaluated a health informatics pipeline for identifying complete smoking history from clinical notes in EHRs. We utilized 758 patient-visit notes (from visits between 03/28/2016 and 04/04/2016) from our local EHR in addition to a public dataset of 502 clinical notes from the 2006 i2b2 Challenge to assess the performance of this pipeline. We used a machine-learning classifier to extract smoking status and a comprehensive set of text processing regular expressions to extract pack years and cessation date information from these clinical notes. RESULTS: We identified smoking status with an F1 score of 0.90 on both the i2b2 and local data sets. Regular expression identification of pack year history in the local test set was 91.7% sensitive and 95.2% specific, but due to variable context the pack year extraction was incomplete in 25% of cases, extracting packs per day or years smoked only. Regular expression identification of cessation date was 63.2% sensitive and 94.6% specific. CONCLUSIONS: Our work indicates that the development of an EHR-based Smokers' Registry containing information relating to smoking behaviors, not just status, from free-text clinical notes using an informatics pipeline is feasible. This pipeline is capable of functioning in external EHRs, reducing the amount of time and money needed at the institute-level to create a Smokers' Registry for improved identification of patient risk and eligibility for preventative and early detection services.
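The regular-expression step for pack-year history can be sketched as below. The patterns, names, and example sentences are hypothetical and far simpler than the "comprehensive set" the abstract mentions; the sketch only shows the general idea, including the "incomplete" outcome in which a note yields packs per day or years smoked but not both.

```python
import re

# Hypothetical patterns, not the ones used in the study.
PACK_YEARS = re.compile(r"(\d+(?:\.\d+)?)\s*-?\s*pack[\s-]?years?\b", re.IGNORECASE)
PACKS_PER_DAY = re.compile(
    r"(\d+(?:\.\d+)?)\s*(?:packs?\s*(?:per|a|/)\s*day|ppd)\b", re.IGNORECASE
)
YEARS_SMOKED = re.compile(r"\b(?:for|x)\s*(\d+)\s*(?:years?|yrs?)\b", re.IGNORECASE)


def extract_pack_years(note):
    """Return (pack_years, status) for one free-text clinical note."""
    explicit = PACK_YEARS.search(note)
    if explicit:
        return float(explicit.group(1)), "explicit"
    ppd = PACKS_PER_DAY.search(note)
    years = YEARS_SMOKED.search(note)
    if ppd and years:
        # Pack years = packs smoked per day x years smoked.
        return float(ppd.group(1)) * float(years.group(1)), "derived"
    return None, "incomplete"


print(extract_pack_years("Former smoker, 30 pack-year history."))
print(extract_pack_years("Smoked 1.5 packs per day for 20 years."))
print(extract_pack_years("Smokes 1 ppd."))
```

Returning a status alongside the value makes partial extractions visible to downstream code instead of silently treating them as missing.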


Subject(s)
Algorithms , Cigarette Smoking/epidemiology , Electronic Health Records , Registries , Datasets as Topic , Humans , Machine Learning , Medical Informatics , Natural Language Processing
13.
BMC Med Inform Decis Mak ; 19(1): 143, 2019 07 25.
Article in English | MEDLINE | ID: mdl-31345210

ABSTRACT

BACKGROUND: Approximately 20% of deaths in the US each year are attributable to smoking, yet current practices in the recording of this health risk in electronic health records (EHRs) have not led to discernable changes in health outcomes. Several groups have developed algorithms for extracting smoking behaviors from clinical notes, but none of these approaches were assessed with external data to report on anticipated clinical performance. METHODS: Previously, we developed an informatics pipeline that extracts smoking status, pack year history, and cessation date from clinical notes. Here we report on the clinical implementation performance of our pipeline using 1,504 clinical notes matched to an external questionnaire. RESULTS: We found that 73% of available notes contained no smoking behavior information. The weighted Cohen's kappa between the external questionnaire and EHR smoking status was 0.62 (95% CI 0.56-0.69) for the clinical notes we were able to extract information from. The correlation between pack years reported by our pipeline and the external questionnaire was 0.39 on the 81 notes for which this information was present in both. We also assessed for lung cancer screening eligibility using notes from individuals identified as never smokers or smokers with pack year history extracted by our pipeline (n = 196). We found a positive predictive value of 85.4%, a negative predictive value of 83.8%, sensitivity of 63.1%, and specificity of 94.7%. CONCLUSIONS: We have demonstrated that our pipeline can extract smoking behaviors from unannotated EHR notes when the information is present. This information is reliable enough to identify patients most likely to be eligible for smoking related services. Ensuring capture of smoking information during clinical encounters should continue to be a high priority.
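The four screening-eligibility metrics above all derive from one 2x2 confusion matrix. The sketch below shows the standard definitions. The counts are illustrative and are not taken from the paper; they were chosen only because they sum to n = 196 and reproduce the reported percentages to one decimal place.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2x2 diagnostic metrics, returned as percentages."""
    return {
        "ppv": 100 * tp / (tp + fp),          # positive predictive value
        "npv": 100 * tn / (tn + fn),          # negative predictive value
        "sensitivity": 100 * tp / (tp + fn),
        "specificity": 100 * tn / (tn + fp),
    }


# Illustrative counts (an assumption, not reported in the abstract).
counts = {"tp": 41, "fp": 7, "fn": 24, "tn": 124}
metrics = diagnostic_metrics(**counts)
for name, value in metrics.items():
    print(f"{name}: {value:.1f}%")
```

The combination of high specificity and moderate sensitivity means a patient flagged as eligible is usually truly eligible, while a sizable share of eligible patients is missed.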


Subject(s)
Algorithms , Cigarette Smoking , Electronic Health Records , Information Storage and Retrieval/methods , Natural Language Processing , Adult , Early Detection of Cancer , Humans , Lung Neoplasms/diagnosis , Medical Records Systems, Computerized , Registries , Surveys and Questionnaires
14.
J Med Internet Res ; 20(12): e11817, 2018 12 06.
Article in English | MEDLINE | ID: mdl-30522991

ABSTRACT

BACKGROUND: The content produced by individuals on various social media platforms has been successfully used to identify mental illness, including depression. However, most of the previous work in this area has focused on user-generated content, that is, content created by the individual, such as an individual's posts and pictures. In this study, we explored the predictive capability of community-generated content, that is, the data generated by a community of friends or followers, rather than by a sole individual, to identify depression among social media users. OBJECTIVE: The objective of this research was to evaluate the utility of community-generated content on social media, such as comments on an individual's posts, to predict depression as defined by the clinically validated Patient Health Questionnaire-8 (PHQ-8) assessment questionnaire. We hypothesized that the results of this research may provide new insights into the next generation of population-level mental illness risk assessment and intervention delivery. METHODS: We created a Web-based survey on a crowdsourcing platform through which participants granted access to their Instagram profiles as well as provided their responses to PHQ-8 as a reference standard for depression status. After data quality assurance and postprocessing, the study analyzed the data of 749 participants. To build our predictive model, linguistic features were extracted from Instagram post captions and comments, including multiple sentiment scores, emoji sentiment analysis results, and meta-variables such as the number of likes and average comment length. In this study, 10.4% (78/749) of the data were held out as a test set. The remaining 89.6% (671/749) of the data were used to train an elastic-net regularized linear regression model to predict PHQ-8 scores.
We compared different versions of this model (ie, a model trained on only user-generated data, a model trained on only community-generated data, and a model trained on the combination of both types of data) on a test set to explore the utility of community-generated data in our predictive analysis. RESULTS: The 2 models, the first trained on only community-generated data (area under curve [AUC]=0.71) and the second trained on a combination of user-generated and community-generated data (AUC=0.72), had statistically significant performances for predicting depression based on the Mann-Whitney U test (P=.03 and P=.02, respectively). The model trained on only user-generated data (AUC=0.63; P=.11) did not achieve statistically significant results. The coefficients of the models revealed that our combined data classifier effectively amalgamated both user-generated and community-generated data and that the 2 feature sets were complementary and contained nonoverlapping information in our predictive analysis. CONCLUSIONS: The results presented in this study indicate that leveraging community-generated data from social media, in addition to user-generated data, can be informative for predicting depression among social media users.
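The AUC values and Mann-Whitney U tests reported above are closely related: the AUC equals the U statistic of the positive class divided by the number of positive-negative pairs. The sketch below illustrates that identity on made-up scores. The function name and the example data are hypothetical, and the code does not reproduce the study's elastic-net model or its P values.

```python
def auc_from_mann_whitney(pos_scores, neg_scores):
    """AUC computed as U / (n_pos * n_neg).

    U counts the (positive, negative) pairs in which the positive case
    receives the higher score; ties contribute one half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("both groups must be non-empty")
    u = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                u += 1.0
            elif p == q:
                u += 0.5
    return u / (len(pos_scores) * len(neg_scores))


# Hypothetical model outputs for participants with and without depression.
depressed = [0.9, 0.8]
not_depressed = [0.1, 0.85]
print(auc_from_mann_whitney(depressed, not_depressed))  # 0.75
```

An AUC of 0.5 corresponds to a U statistic at its null expectation, which is why a Mann-Whitney U test can be used to ask whether an AUC differs from chance.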


Subject(s)
Depression/diagnosis , Social Media/standards , Depression/psychology , Humans , Machine Learning
15.
J Digit Imaging ; 31(1): 84-90, 2018 02.
Article in English | MEDLINE | ID: mdl-28808792

ABSTRACT

Electronic medical record (EMR) systems provide easy access to radiology reports and offer great potential to support quality improvement efforts and clinical research. Harnessing the full potential of the EMR requires scalable approaches such as natural language processing (NLP) to convert text into variables used for evaluation or analysis. Our goal was to determine the feasibility of using NLP to identify patients with Type 1 Modic endplate changes using clinical reports of magnetic resonance (MR) imaging examinations of the spine. Identifying patients with Type 1 Modic change who may be eligible for clinical trials is important as these findings may be important targets for intervention. Four annotators identified all reports that contained Type 1 Modic change, using N = 458 randomly selected lumbar spine MR reports. We then implemented a rule-based NLP algorithm in Java using regular expressions. The prevalence of Type 1 Modic change in the annotated dataset was 10%. Results were recall (sensitivity) 35/50 = 0.70 (95% confidence interval (C.I.) 0.52-0.82), specificity 404/408 = 0.99 (0.97-1.0), precision (positive predictive value) 35/39 = 0.90 (0.75-0.97), negative predictive value 404/419 = 0.96 (0.94-0.98), and F1-score 0.79 (0.43-1.0). Our evaluation shows the efficacy of a rule-based NLP approach for identifying patients with Type 1 Modic change if the emphasis is on identifying only relevant cases with low concern regarding false negatives. As expected, our results show that specificity is higher than recall. This is due to the inherent difficulty of eliciting all possible keywords given the enormous variability of lumbar spine reporting, which decreases recall, while the availability of good negation algorithms improves specificity.
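The fractions reported above imply a single confusion matrix (35 true positives, 4 false positives, 15 false negatives, 404 true negatives, 458 reports in total). As a worked check, the short sketch below recomputes each metric from those counts. The function name `diagnostic_metrics` is hypothetical, and the confidence intervals are not reproduced because the abstract does not state how they were derived.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard classification metrics from confusion-matrix counts."""
    recall = tp / (tp + fn)          # sensitivity
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)       # positive predictive value
    npv = tn / (tn + fn)             # negative predictive value
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "recall": recall,
        "specificity": specificity,
        "precision": precision,
        "npv": npv,
        "f1": f1,
    }


# Counts implied by the fractions in the abstract:
# 35/50, 404/408, 35/39, and 404/419.
metrics = diagnostic_metrics(tp=35, fp=4, fn=15, tn=404)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Rounded to two decimals, the values match the reported 0.70, 0.99, 0.90, 0.96, and 0.79.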


Subject(s)
Low Back Pain/pathology , Lumbar Vertebrae/diagnostic imaging , Lumbar Vertebrae/pathology , Magnetic Resonance Imaging/methods , Natural Language Processing , Research Report , Humans , Prospective Studies , Radiology , Reproducibility of Results , Sensitivity and Specificity
16.
AJR Am J Roentgenol ; 208(4): 750-753, 2017 Apr.
Article in English | MEDLINE | ID: mdl-28140627

ABSTRACT

OBJECTIVE: The purpose of this study is to evaluate the performance of a natural language processing (NLP) system in classifying a database of free-text knee MRI reports at two separate academic radiology practices. MATERIALS AND METHODS: An NLP system that uses terms and patterns in manually classified narrative knee MRI reports was constructed. The NLP system was trained and tested on expert-classified knee MRI reports from two major health care organizations. Radiology reports were modeled in the training set as vectors, and a support vector machine framework was used to train the classifier. A separate test set from each organization was used to evaluate the performance of the system. We evaluated the performance of the system both within and across organizations. Standard evaluation metrics, such as accuracy, precision, recall, and F1 score (i.e., the harmonic mean of precision and recall), and their respective 95% CIs were used to measure the efficacy of our classification system. RESULTS: The accuracy for radiology reports that belonged to the model's clinically significant concept classes after training on data from the same institution was good, yielding an F1 score greater than 90% (95% CI, 84.6-97.3%). Performance of the classifier on cross-institutional application without institution-specific training data yielded F1 scores of 77.6% (95% CI, 69.5-85.7%) and 90.2% (95% CI, 84.5-95.9%) at the two organizations studied. CONCLUSION: The results show excellent accuracy by the NLP machine learning classifier in classifying free-text knee MRI reports, supporting the institution-independent reproducibility of knee MRI report classification. Furthermore, the machine learning classifier performed well on free-text knee MRI reports from another institution. These data support the feasibility of multiinstitutional classification of radiologic imaging text reports with a single machine learning classifier without requiring institution-specific training data.


Subject(s)
Academic Medical Centers/statistics & numerical data , Knee/diagnostic imaging , Machine Learning , Magnetic Resonance Imaging/methods , Magnetic Resonance Imaging/statistics & numerical data , Radiology Information Systems/statistics & numerical data , California , Data Mining/methods , Humans , Image Enhancement/methods , Natural Language Processing , North Carolina , Pattern Recognition, Automated/methods , Radiology Department, Hospital , Radiology Information Systems/organization & administration , Reproducibility of Results , Sensitivity and Specificity , Support Vector Machine , Workload/statistics & numerical data
17.
J Digit Imaging ; 30(3): 314-322, 2017 Jun.
Article in English | MEDLINE | ID: mdl-28050714

ABSTRACT

We built a natural language processing (NLP) method to automatically extract clinical findings in radiology reports and characterize their level of change and significance according to a radiology-specific information model. We utilized a combination of machine learning and rule-based approaches for this purpose. Our method is unique in capturing different features and levels of abstraction at surface, entity, and discourse levels in text analysis. This combination has enabled us to recognize the underlying semantics of radiology report narratives for this task. We evaluated our method on radiology reports from four major healthcare organizations. Our evaluation showed the efficacy of our method in highlighting important changes (accuracy 99.2%, precision 96.3%, recall 93.5%, and F1 score 94.7%) and identifying significant observations (accuracy 75.8%, precision 75.2%, recall 75.7%, and F1 score 75.3%) to characterize radiology reports. This method can help clinicians quickly understand the key observations in radiology reports and facilitate clinical decision support, review prioritization, and disease surveillance.


Subject(s)
Medical Records , Natural Language Processing , Radiology , Clinical Decision-Making , Humans , Machine Learning , Radiography , Research Report , Semantics
18.
J Digit Imaging ; 29(1): 59-62, 2016 Feb.
Article in English | MEDLINE | ID: mdl-26353748

ABSTRACT

Radiology report narrative contains a large amount of information about the patient's health and the radiologist's interpretation of medical findings. Most of this critical information is entered in free text format, even when structured radiology report templates are used. The radiology report narrative varies in use of terminology and language among different radiologists and organizations. The free text format and the subtlety and variations of natural language hinder the extraction of reusable information from radiology reports for decision support, quality improvement, and biomedical research. Therefore, as the first step to organize and extract the information content in a large multi-institutional free text radiology report repository, we have designed and developed an unsupervised machine learning approach to capture the main concepts in a radiology report repository and partition the reports based on their main foci. In this approach, radiology reports are modeled in a vector space and compared to each other through a cosine similarity measure. This similarity is used to cluster radiology reports and identify the repository's underlying topics. We applied our approach to a repository of 1,899,482 radiology reports from three major healthcare organizations. Our method identified 19 major radiology report topics in the repository and clustered the reports according to these topics. Our results were verified by a domain expert radiologist and successfully explain the repository's primary topics and extract the corresponding reports. The results of our system provide a target-based corpus and framework for information extraction and retrieval systems for radiology reports.
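The vector-space comparison described above can be illustrated with a minimal sketch. Raw term counts and whitespace tokenization are assumptions made for brevity; the abstract does not state the term weighting or the clustering algorithm that was used, and the function names and example report snippets are hypothetical.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words vector: lowercase tokens mapped to raw counts."""
    return Counter(text.lower().split())


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse term-count vectors."""
    shared_terms = u.keys() & v.keys()
    dot = sum(u[t] * v[t] for t in shared_terms)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty report is treated as similar to nothing
    return dot / (norm_u * norm_v)


# Hypothetical report snippets, for illustration only.
chest_a = term_vector("no focal consolidation in the lungs")
chest_b = term_vector("lungs are clear with no consolidation")
knee = term_vector("medial meniscus tear with joint effusion")

similar = cosine_similarity(chest_a, chest_b)
dissimilar = cosine_similarity(chest_a, knee)
```

Reports about the same anatomy share vocabulary and score higher than unrelated reports, which is the signal a clustering step can then group into topics.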


Subject(s)
Algorithms , Machine Learning , Models, Theoretical , Natural Language Processing , Radiology Information Systems , Cluster Analysis , Humans
19.
Med Educ Online ; 29(1): 2315684, 2024 Dec 31.
Article in English | MEDLINE | ID: mdl-38351737

ABSTRACT

Artificial intelligence (AI) is rapidly being introduced into the clinical workflow of many specialties. Despite the need to train physicians who understand the utility and implications of AI and mitigate a growing skills gap, no established consensus exists on how to best introduce AI concepts to medical students during preclinical training. This study examined the effectiveness of a pilot Digital Health Scholars (DHS) non-credit enrichment elective that paralleled the Dartmouth Geisel School of Medicine's first-year preclinical curriculum with a focus on introducing AI algorithms and their applications in the concurrently occurring systems-blocks. From September 2022 to March 2023, ten self-selected first-year students enrolled in the elective curriculum run in parallel with four existing curricular blocks (Immunology, Hematology, Cardiology, and Pulmonology). Each DHS block consisted of a journal club, a live-coding demonstration, and an integration session led by a researcher in that field. Students' confidence in explaining the content objectives (high-level knowledge, implications, and limitations of AI) was measured before and after each block and compared using Mann-Whitney U tests. Students reported significant increases in confidence in describing the content objectives after all four blocks (Immunology: U = 4.5, p = 0.030; Hematology: U = 1.0, p = 0.009; Cardiology: U = 4.0, p = 0.019; Pulmonology: U = 4.0, p = 0.030) as well as an average overall satisfaction level of 4.29/5 in rating the curriculum content. Our study demonstrates that a digital health enrichment elective that runs in parallel to an institution's preclinical curriculum and embeds AI concepts into relevant clinical topics can enhance students' confidence in describing the content objectives that pertain to high-level algorithmic understanding, implications, and limitations of the studied models. 
Building on this elective curricular design, further studies with a larger enrollment can help determine the most effective approach in preparing future physicians for the AI-enhanced clinical workflow.


Subject(s)
Artificial Intelligence , Students, Medical , Humans , Pilot Projects , Curriculum , Delivery of Health Care
20.
J Pathol Inform ; 14: 100320, 2023.
Article in English | MEDLINE | ID: mdl-37457594

ABSTRACT

Deep learning has been effective for histology image analysis in digital pathology. However, many current deep learning approaches require large, strongly or weakly labeled images and regions of interest, which can be time-consuming and resource-intensive to obtain. To address this challenge, we present HistoPerm, a view generation method for representation learning using joint embedding architectures that enhances representation learning for histology images. HistoPerm permutes augmented views of patches extracted from whole-slide histology images to improve classification performance. We evaluated the effectiveness of HistoPerm on 2 histology image datasets for Celiac disease and Renal Cell Carcinoma, using 3 widely used joint embedding architecture-based representation learning methods: BYOL, SimCLR, and VICReg. Our results show that HistoPerm consistently improves patch- and slide-level classification performance in terms of accuracy, F1-score, and AUC. Specifically, for patch-level classification accuracy on the Celiac disease dataset, HistoPerm boosts BYOL and VICReg by 8% and SimCLR by 3%. On the Renal Cell Carcinoma dataset, patch-level classification accuracy is increased by 2% for BYOL and VICReg, and by 1% for SimCLR. In addition, on the Celiac disease dataset, models with HistoPerm outperform the fully supervised baseline model by 6%, 5%, and 2% for BYOL, SimCLR, and VICReg, respectively. For the Renal Cell Carcinoma dataset, HistoPerm lowers the classification accuracy gap for the models up to 10% relative to the fully supervised baseline. These findings suggest that HistoPerm can be a valuable tool for improving representation learning of histopathology features when access to labeled data is limited and can lead to whole-slide classification results that are comparable or superior to fully supervised methods.
