Results 1 - 20 of 39

1.
Front Nutr ; 11: 1231070, 2024.
Article in English | MEDLINE | ID: mdl-38899323

ABSTRACT

Although diets influence health and the environment, measuring and changing nutrition is challenging. Traditional measurement methods face challenges, and designing and conducting behavior-changing interventions is conceptually and logistically complicated. Situated local communities such as university campuses offer unique opportunities to shape the nutritional environment and promote health and sustainability. The present study investigates how passively sensed food purchase logs typically collected as part of regular business operations can be used to monitor and measure on-campus food consumption and understand food choice determinants. First, based on 38 million sales logs collected on a large university campus over eight years, we perform statistical analyses to quantify spatio-temporal determinants of food choice and characterize harmful patterns in dietary behaviors, in a case study of food purchasing on the EPFL campus. We identify spatial proximity, food item pairing, and academic schedules (yearly and daily) as important determinants driving on-campus food choice. The case studies demonstrate the potential of food sales logs for measuring nutrition and highlight the breadth and depth of future possibilities to study individual food-choice determinants. We describe how these insights provide an opportunity for stakeholders, such as campus offices responsible for managing food services, to shape the nutritional environment and improve health and sustainability by designing policies and behavioral interventions. Finally, based on the insights derived through the case study of food purchases on the EPFL campus, we identify five future opportunities and offer a call to action for the nutrition research community to contribute to ensuring the health and sustainability of on-campus populations, the very communities to which many researchers belong.

2.
Res Sq ; 2024 Apr 26.
Article in English | MEDLINE | ID: mdl-38746169

ABSTRACT

The majority of proteins must form higher-order assemblies to perform their biological functions. Despite the importance of protein quaternary structure, there are few machine learning models that can accurately and rapidly predict the symmetry of assemblies involving multiple copies of the same protein chain. Here, we address this gap by training several classes of protein foundation models, including ESM-MSA, ESM2, and RoseTTAFold2, to predict homo-oligomer symmetry. Our best model, Seq2Symm, which uses ESM2, outperforms existing template-based and deep learning methods. It achieves an average PR-AUC of 0.48 and 0.44 across homo-oligomer symmetries on two different held-out test sets, compared to 0.32 and 0.23 for the template-based method. Because Seq2Symm can rapidly predict homo-oligomer symmetries using a single sequence as input (~80,000 proteins/hour), we have applied it to 5 entire proteomes and ~3.5 million unlabeled protein sequences to identify patterns in protein assembly complexity across biological kingdoms and species.

5.
medRxiv ; 2024 Mar 14.
Article in English | MEDLINE | ID: mdl-38559045

ABSTRACT

Importance: Diagnostic errors are common and cause significant morbidity. Large language models (LLMs) have shown promise in their performance on both multiple-choice and open-ended medical reasoning examinations, but it remains unknown whether the use of such tools improves diagnostic reasoning. Objective: To assess the impact of the GPT-4 LLM on physicians' diagnostic reasoning compared to conventional resources. Design: Multi-center, randomized clinical vignette study. Setting: The study was conducted using remote video conferencing with physicians across the country and in-person participation across multiple academic medical institutions. Participants: Resident and attending physicians with training in family medicine, internal medicine, or emergency medicine. Interventions: Participants were randomized to access GPT-4 in addition to conventional diagnostic resources or to just conventional resources. They were allocated 60 minutes to review up to six clinical vignettes adapted from established diagnostic reasoning exams. Main Outcomes and Measures: The primary outcome was diagnostic performance based on differential diagnosis accuracy, appropriateness of supporting and opposing factors, and next diagnostic evaluation steps. Secondary outcomes included time spent per case and final diagnosis. Results: 50 physicians (26 attendings, 24 residents) participated, with an average of 5.2 cases completed per participant. The median diagnostic reasoning score per case was 76.3 percent (IQR 65.8 to 86.8) for the GPT-4 group and 73.7 percent (IQR 63.2 to 84.2) for the conventional resources group, with an adjusted difference of 1.6 percentage points (95% CI -4.4 to 7.6; p=0.60). The median time spent on cases for the GPT-4 group was 519 seconds (IQR 371 to 668 seconds), compared to 565 seconds (IQR 456 to 788 seconds) for the conventional resources group, with a time difference of -82 seconds (95% CI -195 to 31; p=0.20). GPT-4 alone scored 15.5 percentage points (95% CI 1.5 to 29, p=0.03) higher than the conventional resources group. Conclusions and Relevance: In a clinical vignette-based study, the availability of GPT-4 to physicians as a diagnostic aid did not significantly improve clinical reasoning compared to conventional resources, although it may improve components of clinical reasoning such as efficiency. GPT-4 alone demonstrated higher performance than both physician groups, suggesting opportunities for further improvement in physician-AI collaboration in clinical practice.

6.
Proc Natl Acad Sci U S A ; 119(45): e2211715119, 2022 11 08.
Article in English | MEDLINE | ID: mdl-36322749

ABSTRACT

Lifelong experiences and learned knowledge lead to shared expectations about how common situations tend to unfold. Such knowledge of narrative event flow enables people to weave together a story. However, comparable computational tools to evaluate the flow of events in narratives are limited. We quantify the differences between autobiographical and imagined stories by introducing sequentiality, a measure of narrative flow of events, drawing probabilistic inferences from a cutting-edge large language model (GPT-3). Sequentiality captures the flow of a narrative by comparing the probability of a sentence with and without its preceding story context. We applied our measure to study thousands of diary-like stories, collected from crowdworkers, about either a recent remembered experience or an imagined story on the same topic. The results show that imagined stories have higher sequentiality than autobiographical stories and that the sequentiality of autobiographical stories increases when the memories are retold several months later. In pursuit of deeper understandings of how sequentiality measures the flow of narratives, we explore proportions of major and minor events in story sentences, as annotated by crowdworkers. We find that lower sequentiality is associated with higher proportions of major events. The methods and results highlight opportunities to use cutting-edge computational analyses, such as sequentiality, on large corpora of matched imagined and autobiographical stories to investigate the influences of memory and reasoning on language generation processes.
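
As a rough illustration of the idea in this abstract, the sketch below scores a sentence by how much more probable it becomes once the preceding story context is supplied. The length-normalized log-probability difference, the function names, and the example numbers are assumptions made for illustration; the abstract does not state the exact formula, and in practice the log-probabilities would be obtained from a language model.

```python
from typing import Sequence, Tuple


def sentence_sequentiality(logp_with_context: float,
                           logp_without_context: float,
                           n_tokens: int) -> float:
    """Per-token gain in log-probability from conditioning on prior sentences.

    Assumed formalization (not taken from the paper): positive values mean the
    sentence is more predictable given the preceding story context.
    """
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return (logp_with_context - logp_without_context) / n_tokens


def story_sequentiality(sentences: Sequence[Tuple[float, float, int]]) -> float:
    """Average the per-sentence scores over a whole story."""
    if not sentences:
        raise ValueError("story must contain at least one sentence")
    scores = [sentence_sequentiality(w, wo, n) for w, wo, n in sentences]
    return sum(scores) / len(scores)


# Hypothetical (log p with context, log p without context, token count) triples.
example_story = [(-10.0, -16.0, 4), (-8.0, -8.0, 2)]
print(story_sequentiality(example_story))
```

Under this reading, a story whose sentences follow predictably from one another scores higher than one whose sentences gain little from their context.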


Subject(s)
Mental Recall , Narration , Humans , Comprehension , Language , Learning
7.
Nat Commun ; 13(1): 7094, 2022 11 19.
Article in English | MEDLINE | ID: mdl-36402817

ABSTRACT

The COVID-19 pandemic has stimulated important changes in online information access as digital engagement became necessary to meet the demand for health, economic, and educational resources. Our analysis of 55 billion everyday web search interactions during the pandemic across 25,150 US ZIP codes reveals that the extent to which different communities of internet users enlist digital resources varies based on socioeconomic and environmental factors. For example, we find that ZIP codes with lower income intensified their access to health information to a smaller extent than ZIP codes with higher income. We show that ZIP codes with higher proportions of Black or Hispanic residents intensified their access to unemployment resources to a greater extent, while revealing patterns of unemployment site visits unseen by the claims data. Such differences frame important questions on the relationship between differential information search behaviors and the downstream real-world implications on more and less advantaged populations.


Subject(s)
COVID-19 , Humans , COVID-19/epidemiology , Pandemics , Access to Information , Income
8.
J Am Med Inform Assoc ; 29(3): 415-423, 2022 01 29.
Article in English | MEDLINE | ID: mdl-34918101

ABSTRACT

OBJECTIVE: To analyze gender bias in clinical trials, to design an algorithm that mitigates the effects of biases of gender representation on natural language processing (NLP) systems trained on text drawn from clinical trials, and to evaluate its performance. MATERIALS AND METHODS: We analyze gender bias in clinical trials described by 16 772 PubMed abstracts (2008-2018). We present a method to augment word embeddings, the core building block of NLP-centric representations, by weighting abstracts by the number of women participants in the trial. We evaluate the performance of the resulting gender-sensitive embeddings on several clinical prediction tasks: comorbidity classification, hospital length of stay prediction, and intensive care unit (ICU) readmission prediction. RESULTS: For female patients, the gender-sensitive model's area under the receiver operating characteristic curve (AUROC) is 0.86 versus the baseline of 0.81 for comorbidity classification, mean absolute error 4.59 versus the baseline of 4.66 for length of stay prediction, and AUROC 0.69 versus 0.67 for ICU readmission. All results are statistically significant. DISCUSSION: Women have been underrepresented in clinical trials. Thus, using the broad clinical trials literature as training data for statistical language models could result in biased models, with deficits in knowledge about women. The method presented enables gender-sensitive use of publications as training data for word embeddings. In experiments, the gender-sensitive embeddings show better performance than baseline embeddings for the clinical tasks studied. The results highlight opportunities for recognizing and addressing gender and other representational biases in the clinical trials literature. CONCLUSION: Addressing representational biases in data for training NLP embeddings can lead to better results on downstream tasks for underrepresented populations.


Subject(s)
Delivery of Health Care , Natural Language Processing , Sexism , Clinical Trials as Topic , Female , Humans , Male , PubMed
10.
Bioinformatics ; 36(22-23): 5269-5274, 2021 Apr 01.
Article in English | MEDLINE | ID: mdl-33325496

ABSTRACT

SUMMARY: How do nuances of scientists' attention influence what they discover? We pursue an understanding of the influences of patterns of attention on discovery with a case study about confirmations of protein-protein interactions over time. We find that modeling and accounting for attention can help us to recognize and interpret biases in large-scale and widely used databases of confirmed interactions and to better understand missing data and unknowns. Additionally, we present an analysis of how awareness of patterns of attention and use of debiasing techniques can foster earlier discoveries. AVAILABILITY AND IMPLEMENTATION: The data is freely available at https://github.com/urielsinger/PPI-unbias.

11.
J Biomed Inform ; 107: 103425, 2020 07.
Article in English | MEDLINE | ID: mdl-32348850

ABSTRACT

Medical error is a leading cause of patient death in the United States. Among the different types of medical errors, harm to patients caused by doctors missing early signs of deterioration is especially challenging to address due to the heterogeneity of patients' physiological patterns. In this study, we implemented risk prediction models using the gradient boosted tree method to derive risk estimates for acute onset diseases in the near future. The prediction model uses physiological variables as input signals and the time of the administration of outcome-related interventions and discharge diagnoses as labels. We examine four categories of acute onset illness: acute heart failure (AHF), acute lung injury (ALI), acute kidney injury (AKI), and acute liver failure (ALF). To develop and test the model, we consider data from two sources: 23,578 admissions to the Intensive Care Unit (ICU) from the MIMIC-3 dataset (Beth-Israel Hospital) and 16,612 ICU admissions at hospitals affiliated with our institution (University of Washington Medical Center and Harborview Medical Center, the UW-CDR dataset). We systematically identify outcome-related interventions for each acute organ failure, then use them, along with discharge diagnoses, to label proxy events to train gradient boosted trees. The trained models achieve their highest F1 score, 0.6018, when predicting the need for life-saving interventions for ALI within the next 24 h in the MIMIC-3 dataset, with a median F1 score of 0.3850 across all acute organ failures in both datasets. The approach also achieves its highest F1 score, 0.6301, when classifying a patient's ALI status at the time of discharge from the MIMIC-3 dataset, with a median F1 score of 0.4307 in both datasets. This study shows the potential for using the time of outcome-related intervention administrations and discharge diagnoses as labels to train supervised machine learning models that predict the risk of acute onset illnesses.
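
For readers less familiar with the F1 metric reported in this abstract, the following minimal sketch shows how it is computed from a confusion matrix. The function name and the example counts are hypothetical and are not taken from the study.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, written in terms of raw counts."""
    denominator = 2 * true_positives + false_positives + false_negatives
    if denominator == 0:
        # No positive labels and no positive predictions: define F1 as 0.
        return 0.0
    return 2 * true_positives / denominator


# Hypothetical counts: 6 correct alerts, 2 false alarms, 2 missed events.
print(f1_score(6, 2, 2))
```

Because F1 ignores true negatives, it is a common choice when the events of interest are rare, as acute organ failures are among ICU admissions.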


Subject(s)
Acute Kidney Injury , Machine Learning , Acute Kidney Injury/diagnosis , Hospitalization , Humans , Intensive Care Units
12.
NPJ Digit Med ; 2: 93, 2019.
Article in English | MEDLINE | ID: mdl-31583281

ABSTRACT

Tremors are a common movement disorder with a spectrum of benign and pathological causes, including neurodegenerative disease, alcohol withdrawal, and physical overexertion. Studies of tremors in clinical practice are limited in size and scope and depend on explicit tracking of tremor characteristics by clinicians. Data drawn from small numbers of patients observed in short-duration sessions pose challenges for understanding the nature and distribution of tremors over a large population. Methods are presented to estimate hand tremors based on anonymized computer mouse cursor movement data collected from millions of users of a web search engine. To determine the feasibility of using this signal for the estimation of the prevalence of tremors over a large population, the characteristics of tremor-like movements are computed and compared against user data that can be interpreted as self-reports, the findings of published clinical studies, and a target selection study where participants self-report hand tremors and known causes. The results demonstrate significant alignment between estimated tremors and both self-reports and clinical findings. Those with cursor tremor events are more likely to report tremor-related search interests. Variations in cursor tremor quantity and cursor tremor frequency with demographics mirror those from clinical studies. Distributions of cursor tremor frequencies vary as expected for different medical conditions. Overall, the study finds evidence for the validity of harnessing anonymized mouse cursor motion as a population-scale tremor sensor for epidemiologic studies. Feasible future applications include opt-in services for screening and for monitoring the progression of illness.

13.
Proc Natl Acad Sci U S A ; 115(32): 8099-8103, 2018 08 07.
Article in English | MEDLINE | ID: mdl-30038026

ABSTRACT

The problem of maintaining a local cache of n constantly changing pages arises in multiple mechanisms such as web crawlers and proxy servers. In these, the resources for polling pages for possible updates are typically limited. The goal is to devise a polling and fetching policy that maximizes the utility of served pages that are up to date. Cho and Garcia-Molina [(2003) ACM Trans Database Syst 28:390-426] formulated this as an optimization problem, which can be solved numerically for small values of n, but appears intractable in general. Here, we show that the optimal randomized policy can be found exactly in [Formula: see text] operations. Moreover, using the optimal probabilities to define in linear time a deterministic schedule yields a tractable policy that in experiments attains 99% of the optimum.

14.
NPJ Digit Med ; 1: 20173, 2018.
Article in English | MEDLINE | ID: mdl-31304347

ABSTRACT

Impaired psychomotor performance severely increases the risk of fatal and non-fatal car accidents. However, we currently lack methods to continuously and non-intrusively monitor psychomotor performance. We show we can estimate psychomotor function at population scale from 16 billion observations of typing speeds during the input of web search queries. We show that these estimates exhibit diurnal variation with a substantial increase during typical sleep times, matching published accident risk rates. Further, we show that psychomotor impairment, as measured by keystroke timing, predicts motor vehicle fatality risk on a population level (Spearman ρ = 0.61; p ≪ 10^-10). The methods and results highlight a promising direction of harnessing ambient streams of data, such as patterns of interactions with devices, as large-scale sensors to continuously and non-intrusively monitor human psychomotor performance at population scale.
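
The Spearman ρ quoted in this abstract is a rank correlation. The sketch below computes it in plain Python as the Pearson correlation of average ranks; the function names and the toy input series are hypothetical and do not reproduce the study's data.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical hourly series: typing slowdown versus fatality risk.
slowdown = [0.1, 0.4, 0.2, 0.9, 0.7]
risk = [1.0, 2.5, 2.0, 6.0, 3.0]
print(spearman_rho(slowdown, risk))
```

Working on ranks rather than raw values makes the statistic sensitive to any monotonic relationship, not only a linear one.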

15.
NPJ Digit Med ; 1: 8, 2018.
Article in English | MEDLINE | ID: mdl-31304293

ABSTRACT

Neurodegenerative disorders, such as Parkinson's disease (PD) and Alzheimer's disease (AD), are important public health problems warranting early detection. We trained machine-learned classifiers on the longitudinal search logs of 31,321,773 search engine users to automatically detect neurodegenerative disorders. Several digital phenotypes with high discriminatory weights for detecting these disorders are identified. Classifier sensitivities for PD detection are 94.2/83.1/42.0/34.6% at false positive rates (FPRs) of 20/10/1/0.1%, respectively. Preliminary analysis shows similar performance for AD detection. Subject to further refinement of accuracy and reproducibility, these findings show the promise of web search digital phenotypes as adjunctive screening tools for neurodegenerative disorders.
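
Reporting sensitivity at fixed false positive rates, as this abstract does, amounts to choosing a score threshold that admits only the allowed share of negatives and then counting the positives above it. The sketch below shows one simple way to do that; the function name, the strict-inequality tie handling, and the toy scores are assumptions for illustration only.

```python
import math
from typing import Sequence, Tuple


def sensitivity_at_fpr(scores: Sequence[float],
                       labels: Sequence[int],
                       max_fpr: float) -> Tuple[float, float]:
    """Return (sensitivity, achieved FPR) at the loosest threshold within max_fpr."""
    if not 0.0 <= max_fpr <= 1.0:
        raise ValueError("max_fpr must lie in [0, 1]")
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = sorted((s for s, y in zip(scores, labels) if y == 0), reverse=True)
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    # Number of negatives we may flag without exceeding the target rate.
    allowed = math.floor(max_fpr * len(negatives) + 1e-9)
    threshold = -math.inf if allowed >= len(negatives) else negatives[allowed]
    # Flag a case only when its score is strictly above the threshold.
    tpr = sum(s > threshold for s in positives) / len(positives)
    fpr = sum(s > threshold for s in negatives) / len(negatives)
    return tpr, fpr


# Hypothetical classifier scores; label 1 marks a true case.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
print(sensitivity_at_fpr(toy_scores, toy_labels, 0.25))
```

Tightening the allowed false positive rate raises the threshold, which is why the reported sensitivities fall as the FPR drops from 20% to 0.1%.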

16.
J Biomed Inform ; 76: 41-49, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29081385

ABSTRACT

OBJECTIVE: Improving mechanisms to detect adverse drug reactions (ADRs) is key to strengthening post-marketing drug safety surveillance. Signal detection is presently unimodal, relying on a single information source. Multimodal signal detection is based on jointly analyzing multiple information sources. Building on and expanding the work done in prior studies, the aim of the article is to further research on multimodal signal detection, explore its potential benefits, and propose methods for its construction and evaluation. MATERIAL AND METHODS: Four data sources are investigated: the FDA's adverse event reporting system, insurance claims, the MEDLINE citation database, and the logs of major Web search engines. Published methods are used to generate and combine signals from each data source. Two distinct reference benchmarks, corresponding to well-established and recently labeled ADRs respectively, are used to evaluate the performance of multimodal signal detection in terms of area under the ROC curve (AUC) and lead-time-to-detection, with the latter relative to labeling revision dates. RESULTS: Limited to our reference benchmarks, multimodal signal detection provides AUC improvements ranging from 0.04 to 0.09 based on a widely used evaluation benchmark, and a comparative added lead-time of 7-22 months relative to labeling revision dates from a time-indexed benchmark. CONCLUSIONS: The results support the notion that utilizing and jointly analyzing multiple data sources may lead to improved signal detection. Given certain data and benchmark limitations, the early stage of development, and the complexity of ADRs, it is currently not possible to make definitive statements about the ultimate utility of the concept. Continued development of multimodal signal detection requires a deeper understanding of the data sources used, additional benchmarks, and further research on methods to generate and synthesize signals.


Subject(s)
Adverse Drug Reaction Reporting Systems , Databases, Factual , Humans , United States , United States Food and Drug Administration
17.
Science ; 357(6346): 7, 2017 Jul 07.
Article in English | MEDLINE | ID: mdl-28684472
19.
JAMA Oncol ; 3(3): 398-401, 2017 Mar 01.
Article in English | MEDLINE | ID: mdl-27832243

ABSTRACT

IMPORTANCE: A statistical model that predicts the appearance of strong evidence of a lung carcinoma diagnosis via analysis of large-scale anonymized logs of web search queries from millions of people across the United States. OBJECTIVE: To evaluate the feasibility of screening patients at risk of lung carcinoma via analysis of signals from online search activity. DESIGN, SETTING, AND PARTICIPANTS: We identified people who issue special queries that provide strong evidence of a recent diagnosis of lung carcinoma. We then considered patterns of symptoms expressed as searches about concerning symptoms over several months prior to the appearance of the landmark web queries. We built statistical classifiers that predict the future appearance of landmark queries based on the search log signals. This was a retrospective log analysis of the online activity of millions of web searchers seeking health-related information online. Of web searchers who queried for symptoms related to lung carcinoma, some (n = 5443 of 4 813 985) later issued queries that provide strong evidence of recent clinical diagnosis of lung carcinoma and are regarded as positive cases in our analysis. Additional evidence on the reliability of these queries as representing clinical diagnoses is based on the significant increase in follow-on searches for treatments and medications for these searchers and on the correlation between lung carcinoma incidence rates and our log-based statistics. The remaining symptom searchers (n = 4 808 542) are regarded as negative cases. MAIN OUTCOMES AND MEASURES: Performance of the statistical model for early detection from online search behavior, for different lead times, different sets of signals, and different cohorts of searchers stratified by potential risk. RESULTS: The statistical classifier predicting the future appearance of landmark web queries based on search log signals identified searchers who later input queries consistent with a lung carcinoma diagnosis, with a true-positive rate ranging from 3% to 57% for false-positive rates ranging from 0.00001 to 0.001, respectively. The methods can be used to identify people at highest risk up to a year in advance of the inferred diagnosis time. The 5 factors associated with the highest relative risk (RR) were evidence of family history (RR = 7.548; 95% CI, 3.937-14.470), age (RR = 3.558; 95% CI, 3.357-3.772), radon (RR = 2.529; 95% CI, 1.137-5.624), primary location (RR = 2.463; 95% CI, 1.364-4.446), and occupation (RR = 1.969; 95% CI, 1.143-3.391). Evidence of smoking (RR = 1.646; 95% CI, 1.032-2.260) was important but not top-ranked, which was due to the difficulty of identifying smoking history from search terms. CONCLUSIONS AND RELEVANCE: Pattern recognition based on data drawn from large-scale web search queries holds opportunity for identifying risk factors and frames new directions with early detection of lung carcinoma.
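
To make the relative risk figures in this abstract concrete, the sketch below computes an RR and an approximate 95% confidence interval from a 2x2 table using the common log-scale normal approximation. The abstract does not say how its intervals were derived, so the method, the function name, and the counts are all assumptions for illustration.

```python
import math
from typing import Tuple


def relative_risk_ci(exposed_cases: int, exposed_total: int,
                     unexposed_cases: int, unexposed_total: int,
                     z: float = 1.96) -> Tuple[float, float, float]:
    """Return (RR, lower, upper) using a normal approximation on log(RR)."""
    if min(exposed_cases, unexposed_cases) <= 0:
        raise ValueError("both groups need at least one case")
    if exposed_cases > exposed_total or unexposed_cases > unexposed_total:
        raise ValueError("cases cannot exceed group totals")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    rr = risk_exposed / risk_unexposed
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(1 / exposed_cases - 1 / exposed_total
                          + 1 / unexposed_cases - 1 / unexposed_total)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts: 20 of 100 exposed searchers versus 10 of 100 unexposed.
print(relative_risk_ci(20, 100, 10, 100))
```

An interval built this way is symmetric around the RR on the log scale, so the ratio of the upper bound to the RR equals the ratio of the RR to the lower bound.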


Subject(s)
Carcinoma, Non-Small-Cell Lung/diagnosis , Early Detection of Cancer/methods , Internet/statistics & numerical data , Lung Neoplasms/diagnosis , Small Cell Lung Carcinoma/diagnosis , Feasibility Studies , Humans , Information Seeking Behavior , Mass Screening/methods , Models, Statistical , Risk Factors , Search Engine
20.
J Med Internet Res ; 18(12): e315, 2016 12 06.
Article in English | MEDLINE | ID: mdl-27923778

ABSTRACT

BACKGROUND: Physical activity helps people maintain a healthy weight and reduces the risk for several chronic diseases. Although this knowledge is widely recognized, adults and children in many countries around the world do not get recommended amounts of physical activity. Although many interventions are found to be ineffective at increasing physical activity or reaching inactive populations, there have been anecdotal reports of increased physical activity due to novel mobile games that embed game play in the physical world. The most recent and salient example of such a game is Pokémon Go, which has reportedly reached tens of millions of users in the United States and worldwide. OBJECTIVE: The objective of this study was to quantify the impact of Pokémon Go on physical activity. METHODS: We study the effect of Pokémon Go on physical activity through a combination of signals from large-scale corpora of wearable sensor data and search engine logs for 32,000 Microsoft Band users over a period of 3 months. Pokémon Go players are identified through search engine queries and physical activity is measured through accelerometers. RESULTS: We find that Pokémon Go leads to significant increases in physical activity over a period of 30 days, with particularly engaged users (ie, those making multiple search queries for details about game usage) increasing their activity by 1473 steps a day on average, a more than 25% increase compared with their prior activity level (P<.001). In the short time span of the study, we estimate that Pokémon Go has added a total of 144 billion steps to US physical activity. Furthermore, Pokémon Go has been able to increase physical activity across men and women of all ages, weight status, and prior activity levels showing this form of game leads to increases in physical activity with significant implications for public health. In particular, we find that Pokémon Go is able to reach low activity populations, whereas all 4 leading mobile health apps studied in this work largely draw from an already very active population. CONCLUSIONS: Mobile apps combining game play with physical activity lead to substantial short-term activity increases and, in contrast to many existing interventions and mobile health apps, have the potential to reach activity-poor populations. Future studies are needed to investigate potential long-term effects of these applications.


Subject(s)
Exercise , Mobile Applications/statistics & numerical data , Telemedicine/statistics & numerical data , Video Games/statistics & numerical data , Adolescent , Adult , Child , Female , Humans , Male , Young Adult