Results 1 - 20 of 142
1.
J Am Chem Soc ; 145(32): 17656-17664, 2023 08 16.
Article in English | MEDLINE | ID: mdl-37530568

ABSTRACT

The study of non-natural biocatalytic transformations relies heavily on empirical methods, such as directed evolution, for identifying improved variants. Although exceptionally effective, this approach provides limited insight into the molecular mechanisms behind the transformations and necessitates multiple protein engineering campaigns for new reactants. To address this limitation, we disclose a strategy to explore the biocatalytic reaction space and garner insight into the molecular mechanisms driving enzymatic transformations. Specifically, we explored the selectivity of an "ene"-reductase, GluER-T36A, to create a data-driven toolset that explores reaction space and rationalizes the observed and predicted selectivities of substrate/mutant combinations. The resultant statistical models related structural features of the enzyme and substrate to selectivity and were used to effectively predict selectivity in reactions with out-of-sample substrates and mutants. Our approach provided a deeper understanding of enantioinduction by GluER-T36A and holds the potential to enhance the virtual screening of enzyme mutants.


Subjects
Data Science , Data Science/methods , Biocatalysis , Stereoisomerism , Substrate Specificity , Ligands , Mutation , Molecular Models
2.
BMC Res Notes ; 16(1): 98, 2023 Jun 06.
Article in English | MEDLINE | ID: mdl-37280717

ABSTRACT

OBJECTIVE: Survival models are used extensively in biomedical sciences, where they allow the investigation of the effect of exposures on health outcomes. It is desirable to use diverse data sets in survival analyses, because this offers increased statistical power and generalisability of results. However, there are often challenges with bringing data together in one location or following an analysis plan and sharing results. DataSHIELD is an analysis platform that helps users to overcome these ethical, governance and process difficulties. It allows users to analyse data remotely, using functions that are built to restrict access to the detailed data items (federated analysis). Previous works have provided survival modelling functionality in DataSHIELD (dsSurvival package), but there is a requirement to provide functions that offer privacy enhancing survival curves that retain useful information. RESULTS: We introduce an enhanced version of the dsSurvival package which offers privacy enhancing survival curves for DataSHIELD. Different methods for enhancing privacy were evaluated for their effectiveness in enhancing privacy while maintaining utility. We demonstrated how our selected method could enhance privacy in different scenarios using real survival data. The details of how DataSHIELD can be used to generate survival curves can be found in the associated tutorial.


Subjects
Data Science , Statistical Models , Privacy , Survival Analysis , Confidentiality , Data Science/methods , Data Anonymization , Data Analysis , Research Ethics
5.
PLoS One ; 17(3): e0264713, 2022.
Article in English | MEDLINE | ID: mdl-35298483

ABSTRACT

In most big cities, public transport vehicles are enclosed and crowded spaces. Therefore, they are considered one of the most important triggers of COVID-19 spread. Most of the existing research on the mobility of people and COVID-19 spread focuses on investigating highly frequented paths by analyzing data collected from mobile devices, mainly geo-positioning records. In contrast, this paper tackles the problem by studying mass mobility. The relations between daily mobility on public transport (subway or metro) in three big cities and mortality due to COVID-19 are investigated. Data collected for these purposes come from official sources, such as the web pages of the cities' local governments. To provide a systematic framework, we applied the IBM Foundational Methodology for Data Science to the epidemiological domain of this paper. Our analysis consists of moving averages with a moving window of seven days so as to avoid bias due to weekly tendencies. Among the main findings of this work are: a) New York City and Madrid show similar distributions of the studied variables, which resemble a Gaussian bell curve, in contrast to Mexico City, and b) non-pharmaceutical interventions do not bring immediate results, and reductions in the number of deaths due to COVID-19 are observed after a certain number of days. This paper yields partial evidence for assessing the effectiveness of public policies in mitigating the COVID-19 pandemic.
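
The seven-day moving average mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say whether the window is trailing or centered, so a trailing window is assumed here; the function name `moving_average` and the daily counts are hypothetical and are not taken from the paper.

```python
from collections import deque


def moving_average(values, window=7):
    """Trailing moving average; positions without a full window are None."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    buf = deque()
    total = 0.0
    for v in values:
        buf.append(v)
        total += v
        if len(buf) > window:
            # Drop the oldest value so the buffer never exceeds the window.
            total -= buf.popleft()
        out.append(total / window if len(buf) == window else None)
    return out


# Hypothetical daily death counts (illustrative only, not data from the study).
daily_deaths = [10, 12, 9, 14, 20, 5, 4, 11, 13, 10, 15, 21, 6, 5]
smoothed = moving_average(daily_deaths)
```

With a seven-day window, every averaged value covers each weekday exactly once, which is presumably why this window length dampens weekly reporting patterns.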


Subjects
COVID-19/mortality , Transportation , Adult , COVID-19/epidemiology , Cities/epidemiology , Cities/statistics & numerical data , Data Science/methods , Epidemiological Models , Humans , Mexico/epidemiology , New York City/epidemiology , Spain/epidemiology , Transportation/methods , Transportation/statistics & numerical data
6.
J Biosci ; 47, 2022.
Article in English | MEDLINE | ID: mdl-35092414

ABSTRACT

Cooking forms the core of our cultural identity other than being the basis of nutrition and health. The increasing availability of culinary data and the advent of computational methods for their scrutiny are dramatically changing the artistic outlook towards gastronomy. Starting with a seemingly simple question, 'Why do we eat what we eat?', data-driven research conducted in our lab has led to interesting explorations of traditional recipes, their flavor composition, and health associations. Our investigations have revealed 'culinary fingerprints' of regional cuisines across the world. Application of data-driven strategies for investigating the gastronomic data has opened up exciting avenues, giving rise to an all-new field of 'computational gastronomy'. This emerging interdisciplinary science asks questions of culinary origin to seek their answers via the compilation of culinary data and their analysis using methods of complex systems, statistics, computer science, and artificial intelligence. Along with complementary experimental studies, these endeavors have the potential to transform the food landscape by effectively leveraging data-driven food innovations for better health and nutrition.


Subjects
Cooking , Data Science/methods , Food , Nutritional Physiological Phenomena , Cooking/methods , Factual Databases , Flavoring Agents , Humans , Taste
7.
Cell ; 185(1): 1-3, 2022 01 06.
Article in English | MEDLINE | ID: mdl-34995512

ABSTRACT

Psychiatric disease is one of the greatest health challenges of our time. The pipeline for conceptually novel therapeutics remains low, in part because uncovering the biological mechanisms of psychiatric disease has been difficult. We asked experts researching different aspects of psychiatric disease: what do you see as the major urgent questions that need to be addressed? Where are the next frontiers, and what are the current hurdles to understanding the biological basis of psychiatric disease?


Subjects
Antidepressive Agents/therapeutic use , Data Science/methods , Depression/drug therapy , Depression/metabolism , Depressive Disorder/drug therapy , Depressive Disorder/metabolism , Genomics/methods , Precision Medicine/methods , Translational Biomedical Research/methods , Animals , Depression/genetics , Depressive Disorder/genetics , Humans , Neurons/metabolism , Prefrontal Cortex/metabolism , Treatment Outcome
9.
Viruses ; 13(11), 2021 10 21.
Article in English | MEDLINE | ID: mdl-34834927

ABSTRACT

Bacteriophages are viruses that infect bacteria and are present in niches where bacteria thrive. In recent years, the suggested application areas of lytic bacteriophage have been expanded to include therapy, biocontrol, detection, sanitation, and remediation. However, phage application is constrained by the phage's host range, that is, the range of bacterial hosts sensitive to the phage and the degree of infection. Even though phage isolation and enrichment techniques are straightforward protocols, the correlation between the enrichment technique and host range profile has not been evaluated. Agar-based methods such as spotting assay and efficiency of plaquing (EOP) are the most commonly used methods to determine the phage host range. These methods, aside from being labor intensive, can lead to subjective and incomplete results as they rely on qualitative observations of the lysis/plaques, do not reflect the lytic activity in liquid culture, and can overestimate the host range. In this study, phages against three bacterial genera were isolated using three different enrichment methods. Host range profiles of the isolated phages were quantitatively determined using a high throughput turbidimetric protocol and the data were analyzed with an accessible analytic tool "PHIDA". Using this tool, the host ranges of 9 Listeria, 14 Salmonella, and 20 Pseudomonas phages isolated with different enrichment methods were quantitatively compared. A high variability in the host range index (HRi) ranging from 0.86-0.63, 0.07-0.24, and 0.00-0.67 for Listeria, Salmonella, and Pseudomonas phages, respectively, was observed. Overall, no direct correlation was found between the phage host range breadth and the enrichment method in any of the three target bacterial genera. The high throughput method and analytics tool developed in this study can be easily adapted to any phage study and can provide a consensus for phage host range determination.


Subjects
Bacteriophages/isolation & purification , Bacteriophages/physiology , Data Science/methods , High-Throughput Screening Assays/methods , Host Specificity , Listeria/virology , Pseudomonas/virology , Salmonella/virology , Software
11.
Nat Commun ; 12(1): 5757, 2021 10 01.
Article in English | MEDLINE | ID: mdl-34599181

ABSTRACT

The large amount of biomedical data derived from wearable sensors, electronic health records, and molecular profiling (e.g., genomics data) is rapidly transforming our healthcare systems. The increasing scale and scope of biomedical data not only generates enormous opportunities for improving health outcomes but also raises new challenges ranging from data acquisition and storage to data analysis and utilization. To meet these challenges, we developed the Personal Health Dashboard (PHD), which utilizes state-of-the-art security and scalability technologies to provide an end-to-end solution for big biomedical data analytics. The PHD platform is an open-source software framework that can be easily configured and deployed to any big data health project to store, organize, and process complex biomedical data sets, support real-time data analysis at both the individual level and the cohort level, and ensure participant privacy at every step. In addition to presenting the system, we illustrate the use of the PHD framework for large-scale applications in emerging multi-omics disease studies, such as collection and visualization of diverse data types (wearable, clinical, omics) at a personal level, investigation of insulin resistance, and an infrastructure for the detection of presymptomatic COVID-19.


Subjects
Data Science/methods , Computerized Medical Records Systems , Big Data , Computer Security , Data Analysis , Health Information Interoperability , Humans , Information Storage and Retrieval , Software
12.
PLoS Biol ; 19(9): e3001398, 2021 09.
Article in English | MEDLINE | ID: mdl-34555021

ABSTRACT

Hypothesis generation in observational, biomedical data science often starts with computing an association or identifying the statistical relationship between a dependent and an independent variable. However, the outcome of this process depends fundamentally on modeling strategy, with differing strategies generating what can be called "vibration of effects" (VoE). VoE is defined by variation in associations that often lead to contradictory results. Here, we present a computational tool capable of modeling VoE in biomedical data by fitting millions of different models and comparing their output. We execute a VoE analysis on a series of widely reported associations (e.g., carrot intake associated with eyesight) with an extended additional focus on lifestyle exposures (e.g., physical activity) and components of the Framingham Risk Score for cardiovascular health (e.g., blood pressure). We leveraged our tool for potential confounder identification, investigating what adjusting variables are responsible for conflicting models. We propose modeling VoE as a critical step in navigating discovery in observational data, discerning robust associations, and cataloging adjusting variables that impact model output.
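
As a rough illustration of the "vibration of effects" idea described in this abstract, the sketch below refits a linear model once per subset of adjusting variables and records how the exposure coefficient moves. It is a toy under stated assumptions, not the authors' tool: the helper names (`ols_coefs`, `vibration_of_effects`), the ordinary-least-squares model, and the synthetic data are all hypothetical, and the published tool fits far more models than this.

```python
from itertools import combinations
import random


def ols_coefs(X, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    for col in range(k):
        # Partial pivoting for numerical stability.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def vibration_of_effects(exposure, outcome, adjusters):
    """Return the exposure coefficient for every subset of adjusting variables."""
    names = sorted(adjusters)
    results = {}
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            # Design matrix: intercept, exposure, then the chosen adjusters.
            X = [[1.0, exposure[i]] + [adjusters[n][i] for n in subset]
                 for i in range(len(outcome))]
            results[subset] = ols_coefs(X, outcome)[1]
    return results


# Synthetic data (assumption): z confounds x and y, w is irrelevant.
rng = random.Random(0)
n = 500
z = [rng.gauss(0, 1) for _ in range(n)]
w = [rng.gauss(0, 1) for _ in range(n)]
x = [zi + rng.gauss(0, 1) for zi in z]
y = [0.5 * xi + 2.0 * zi + rng.gauss(0, 1) for xi, zi in zip(x, z)]

effects = vibration_of_effects(x, y, {"z": z, "w": w})
spread = max(effects.values()) - min(effects.values())
```

In this toy setup, the estimates that omit `z` should sit well above those that include it, so inspecting which subsets produce the extremes points to `z` as the variable driving the disagreement.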


Subjects
Data Science/methods , Statistical Models , Observational Studies as Topic/statistics & numerical data , Epidemiologic Methods , Humans
13.
Crit Care Med ; 49(12): e1196-e1205, 2021 12 01.
Article in English | MEDLINE | ID: mdl-34259450

ABSTRACT

OBJECTIVES: To train a model to predict vasopressor use in ICU patients with sepsis and optimize external performance across hospital systems using domain adaptation, a transfer learning approach. DESIGN: Observational cohort study. SETTING: Two academic medical centers from January 2014 to June 2017. PATIENTS: Data were analyzed from 14,512 patients (9,423 at the development site and 5,089 at the validation site) who were admitted to an ICU and met the Centers for Medicare and Medicaid Services definition of severe sepsis either before or during the ICU stay. Patients were excluded if they never developed sepsis, if the ICU length of stay was less than 8 hours or more than 20 days, or if they developed shock within the first 4 hours of ICU admission. MEASUREMENTS AND MAIN RESULTS: Forty retrospectively collected features from the electronic medical records of adult ICU patients at the development site (four hospitals) were used as inputs for a neural network Weibull-Cox survival model to derive a prediction tool for future need of vasopressors. Domain adaptation updated parameters to optimize model performance in the validation site (two hospitals), a different healthcare system over 2,000 miles away. The cohorts at both sites were randomly split into training and testing sets (80% and 20%, respectively). When applied to the test set in the development site, the model predicted vasopressor use 4-24 hours in advance with an area under the receiver operating characteristic curve, specificity, and positive predictive value ranging from 0.80 to 0.81, 56.2% to 61.8%, and 5.6% to 12.1%, respectively. Domain adaptation improved performance of the model to predict vasopressor use within 4 hours at the validation site (area under the receiver operating characteristic curve 0.81 [CI, 0.80-0.81] from 0.77 [CI, 0.76-0.77], p < 0.01; specificity 59.7% [CI, 58.9-62.5%] from 49.9% [CI, 49.5-50.7%], p < 0.01; positive predictive value 8.9% [CI, 8.5-9.4%] from 7.3% [CI, 7.1-7.4%], p < 0.01).
CONCLUSIONS: Domain adaptation improved performance of a model predicting sepsis-associated vasopressor use during external validation.
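
The three evaluation measures reported in this abstract (area under the ROC curve, specificity, and positive predictive value) can be computed from labels and model scores as in the sketch below. It uses the standard textbook definitions, with AUROC computed by pairwise comparison of positive and negative scores; the function names, the 0.5 threshold, and the six example patients are hypothetical and unrelated to the study's data or code.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels and binary predictions."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == 1 and t == 1:
            tp += 1
        elif p == 1:
            fp += 1
        elif t == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def specificity(y_true, y_pred):
    """Share of true negatives among all actual negatives."""
    _, fp, tn, _ = confusion_counts(y_true, y_pred)
    return tn / (tn + fp)


def positive_predictive_value(y_true, y_pred):
    """Share of true positives among all positive predictions."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = vasopressor needed) and model scores.
y_true = [0, 0, 0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

spec = specificity(y_true, y_pred)
ppv = positive_predictive_value(y_true, y_pred)
auc = auroc(y_true, scores)
```

Positive predictive value depends on how common the outcome is, which is one reason a model can show an AUROC near 0.8 and still have a single-digit PPV when the event is rare.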


Subjects
Patient Acceptance of Health Care/statistics & numerical data , Sepsis/drug therapy , Vasoconstrictor Agents/administration & dosage , Cohort Studies , Data Science/methods , Humans , Intensive Care Units/organization & administration , Intensive Care Units/statistics & numerical data , Software Design , Vasoconstrictor Agents/therapeutic use
15.
Nature ; 595(7866): 181-188, 2021 07.
Article in English | MEDLINE | ID: mdl-34194044

ABSTRACT

Computational social science is more than just large repositories of digital data and the computational methods needed to construct and analyse them. It also represents a convergence of different fields with different ways of thinking about and doing science. The goal of this Perspective is to provide some clarity around how these approaches differ from one another and to propose how they might be productively integrated. Towards this end, we make two contributions. The first is a schema for thinking about research activities along two dimensions (the extent to which work is explanatory, focusing on identifying and estimating causal effects, and the degree of consideration given to testing predictions of outcomes) and how these two priorities can complement, rather than compete with, one another. Our second contribution is to advocate that computational social scientists devote more attention to combining prediction and explanation, which we call integrative modelling, and to outline some practical suggestions for realizing this goal.


Subjects
Computer Simulation , Data Science/methods , Forecasting/methods , Theoretical Models , Social Sciences/methods , Goals , Humans
16.
J Acquir Immune Defic Syndr ; 87(Suppl 1): S28-S35, 2021 08 01.
Article in English | MEDLINE | ID: mdl-34166310

ABSTRACT

BACKGROUND AND SETTING: Electronic data capture facilitates timely use of data. Population-based HIV impact assessments (PHIAs) were led by host governments, with funding from the President's Emergency Plan for AIDS Relief, technical assistance from the Centers for Disease Control, and implementation support from ICAP at Columbia University. We described data architectures, code-based processes, and resulting data volume and quality for 14 national PHIA surveys with concurrent timelines and varied country-level data governance (2015-2020). METHODS: PHIA project data were collected through tablets, point-of-care and laboratory testing instruments, and inventory management systems, using open-source software, vendor solutions, and custom-built software. Data were securely uploaded to the PHIA data warehouse daily or weekly and then used to populate survey-monitoring dashboards and return timely laboratory-based test results on an ongoing basis. Automated data processing allowed timely reporting of survey results. RESULTS: Fourteen data architectures were successfully established, and data from more than 450,000 participants in 30,000 files across 13 countries with completed PHIAs, and blood draws producing approximately 6000 aliquots each week per country, were securely collected, transmitted, and processed by 17 full-time equivalent staff. More than 25,600 viral load results were returned to clinics of participants' choice. Data cleaning was not needed for 98.5% of household and 99.2% of individual questionnaires. CONCLUSION: The PHIA data architecture permitted secure, simultaneous collection and transmission of high-quality interview and biomarker data across multiple countries, quick turnaround time of laboratory-based biomarker results, and rapid dissemination of survey outcomes to guide President's Emergency Plan for AIDS Relief epidemic control.


Subjects
Data Science/methods , HIV Infections/epidemiology , HIV-1 , Health Surveys , Anti-HIV Agents/therapeutic use , Developing Countries , Epidemiological Monitoring , Humans , International Cooperation , Specimen Handling , Viral Load
17.
Fam Syst Health ; 39(1): 66-76, 2021 Mar.
Article in English | MEDLINE | ID: mdl-34014731

ABSTRACT

INTRODUCTION: Transforming administrative health care data into meaningful metrics has been critical to the implementation of the Department of Defense's Primary Care Behavioral Health (PCBH) program. METHODS: Data from clinical encounters with PCBH providers are used to develop metrics of program performance collaboratively. Metrics focus on describing the PCBH program and patients, provider fidelity to the model, and provider performance. These metrics form two key deliverables: a monitoring dashboard for program managers and a training dashboard for expert trainers conducting site visits. RESULTS: Behavioral health consultants (BHCs) conducted nearly 200,000 encounters with more than 100,000 unique patients in fiscal year 2019 at more than 170 locations in 6 countries and 37 states. Administrative data derived from these encounters were used to create a variety of metrics that describe practice and performance at both the provider and program levels. These metrics are delivered through a variety of analytic products to stakeholders who use that information to make data-driven decisions about program direction and provider training. DISCUSSION: We discuss examples of program management decisions and expert trainer actions based on these dashboards, highlighting the benefits of continued collaboration between analysts and program managers. Specifically, excerpts from several dashboards illustrate how penetration and productivity metrics yield specific, tailored action plans to improve care delivery and provider performance. (PsycInfo Database Record (c) 2021 APA, all rights reserved).


Subjects
Data Science/methods , Delivery of Health Care/methods , Mental Health Services/statistics & numerical data , Adolescent , Adult , Aged , Child , Preschool Child , Data Science/statistics & numerical data , Delivery of Health Care/statistics & numerical data , Integrated Delivery of Health Care/methods , Integrated Delivery of Health Care/statistics & numerical data , Female , Humans , Infant , Informatics/instrumentation , Informatics/methods , Male , Middle Aged , Primary Health Care/methods , Primary Health Care/statistics & numerical data , United States , United States Department of Defense
18.
Nurs Philos ; 22(3): e12347, 2021 Jul.
Article in English | MEDLINE | ID: mdl-33979474

ABSTRACT

In this paper, we argue that 'informed' consent in Big Data genomic biobanking is frequently less than optimally informative. This is due to the particular features of genomic biobanking research which render it ethically problematic. We discuss these features together with details of consent models aimed to address them. Using insights from consent theory, we provide a detailed analysis of the essential components of informed consent which includes recommendations to improve consent performance. In addition, using insights from philosophy of mind and language and from psycholinguistics, we support our analyses by identifying the nature and function of concepts (ideas) operational in human cognition and language together with an implicit coding/decoding model of human communication. We identify this model as the source of patients'/participants' poor understanding. We suggest an alternative, explicit model of human communication, namely, that of relevance-theoretic inference, which obviates the limitations of the code model. We suggest practical strategies to assist health service professionals to ensure that the specific information they provide concerning the proposed treatment or research is used to inform participants' decision to consent. We do not prescribe a standard, formal approach to decision-making where boxes are ticked; rather, we aim to focus attention towards the sorts of considerations and questions that might usefully be borne in mind in any consent situation. We hope that our theorising will be of real practical benefit to nurses and midwives working on the clinical and research front-line of genomic science.


Subjects
Data Science/methods , Genomics/ethics , Informed Consent/ethics , Data Science/standards , Genomics/trends , Humans , Informed Consent/standards , Patient Participation/psychology
19.
Sci Rep ; 11(1): 10209, 2021 05 13.
Article in English | MEDLINE | ID: mdl-33986378

ABSTRACT

Competition for resources is a key question in the study of our early human evolution. From the first hominin groups, carnivores have played a fundamental role in the ecosystem. From this perspective, understanding the trophic pressure between hominins and carnivores can provide valuable insights into the context in which humans survived, interacted with their surroundings, and consequently evolved. While numerous techniques already exist for the detection of carnivore activity in archaeological and palaeontological sites, many of these techniques present important limitations. The present study builds on a number of advanced data science techniques to confront these issues, defining methods for the identification of the precise agents involved in carcass consumption and manipulation. For the purpose of this study, a large sample of 620 carnivore tooth pits is presented, including samples from bears, hyenas, jaguars, leopards, lions, wolves, foxes and African wild dogs. Using 3D modelling, geometric morphometrics, robust data modelling, and artificial intelligence algorithms, the present study obtains between 88 and 98% accuracy, with balanced overall evaluation metrics across all datasets. From this perspective, and when combined with other sources of taphonomic evidence, these results show that advanced data science techniques can be considered a valuable addition to the taphonomist's toolkit for the identification of precise carnivore agents via tooth pit morphology.


Subjects
Data Science/methods , Paleontology/methods , Tooth/anatomy & histology , Animals , Archaeology/methods , Artificial Intelligence , Bone and Bones/anatomy & histology , Carnivora , Computational Biology/methods , Fossils , Hominidae , Humans , Statistical Models
20.
PLoS One ; 16(5): e0252147, 2021.
Article in English | MEDLINE | ID: mdl-34019581

ABSTRACT

BACKGROUND: The WHO announced the epidemic of SARS-CoV-2 as a public health emergency of international concern on 30 January 2020. To date, it has spread to more than 200 countries and has been declared a global pandemic. For appropriate preparedness, containment, and mitigation response, the stakeholders and policymakers require prior guidance on the propagation of SARS-CoV-2. METHODOLOGY: This study aims to provide such guidance by forecasting the cumulative COVID-19 cases up to 4 weeks ahead for 187 countries, using four data-driven methodologies: autoregressive integrated moving average (ARIMA), exponential smoothing model (ETS), and random walk forecasts (RWF) with and without drift. For these forecasts, we evaluate the accuracy and systematic errors using the Mean Absolute Percentage Error (MAPE) and Mean Absolute Error (MAE), respectively. FINDINGS: The results show that the ARIMA and ETS methods outperform the other two forecasting methods. Additionally, using these forecasts, we generate heat maps to provide a pictorial representation of the countries at risk of having an increase in the cases in the coming 4 weeks of February 2021. CONCLUSION: Due to limited data availability during the ongoing pandemic, less data-hungry short-term forecasting models, like ARIMA and ETS, can help in anticipating the future outbreaks of SARS-CoV-2.
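
Two of the simpler ingredients named in this abstract, random walk forecasts with and without drift and the MAE and MAPE error measures, can be sketched as follows. The drift is taken as the average historical step, which is the usual textbook formulation; the function names and the small case series are hypothetical, and the ARIMA and ETS models from the study are not reproduced here.

```python
def random_walk_forecast(series, horizon, drift=False):
    """Forecast `horizon` steps ahead from the last observation.

    Without drift, every forecast equals the last value. With drift, the
    average historical step (last - first) / (n - 1) is added per step.
    """
    last = series[-1]
    slope = (series[-1] - series[0]) / (len(series) - 1) if drift else 0.0
    return [last + h * slope for h in range(1, horizon + 1)]


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be non-zero."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical weekly cumulative case counts (illustrative only).
history = [100, 120, 140, 160]
actual_next = [180, 200]

naive = random_walk_forecast(history, horizon=2)
with_drift = random_walk_forecast(history, horizon=2, drift=True)

naive_mae = mae(actual_next, naive)
naive_mape = mape(actual_next, naive)
drift_mae = mae(actual_next, with_drift)
```

Because cumulative case counts never decrease, the no-drift forecast tends to underestimate them, so the drift variant is the more natural baseline for this kind of series.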


Subjects
COVID-19/epidemiology , Data Science/methods , Statistical Models , Data Science/standards , Humans , Practice Guidelines as Topic , Software/standards