Results 1 - 20 of 3,211
1.
Cell ; 184(6): 1415-1419, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33740447

ABSTRACT

Precision medicine promises improved health by accounting for individual variability in genes, environment, and lifestyle. Precision medicine will continue to transform healthcare in the coming decade as it expands in key areas: huge cohorts, artificial intelligence (AI), routine clinical genomics, phenomics and environment, and returning value across diverse populations.


Subjects
Delivery of Health Care , Precision Medicine , Artificial Intelligence , Big Data , Biomedical Research , Cultural Diversity , Electronic Health Records , Humans , Phenomics
2.
Cell ; 180(1): 9-14, 2020 01 09.
Article in English | MEDLINE | ID: mdl-31951522

ABSTRACT

This commentary introduces a new clinical trial construct, the Master Observational Trial (MOT), which hybridizes the power of molecularly based master interventional protocols with the breadth of real-world data. The MOT provides a clinical venue to allow molecular medicine to rapidly advance, answers questions that traditional interventional trials generally do not address, and seamlessly integrates with interventional trials in both diagnostic and therapeutic arenas. The result is a more comprehensive data collection ecosystem in precision medicine.


Subjects
Observational Studies as Topic/methods , Precision Medicine/methods , Research Design/standards , Big Data , Clinical Trial Protocols as Topic , Humans , Molecular Targeted Therapy/methods , Molecular Targeted Therapy/trends , Observational Studies as Topic/standards
3.
Cell ; 176(3): 649-662.e20, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661755

ABSTRACT

The body-wide human microbiome plays a role in health, but its full diversity remains uncharacterized, particularly outside of the gut and in international populations. We leveraged 9,428 metagenomes to reconstruct 154,723 microbial genomes (45% of high quality) spanning body sites, ages, countries, and lifestyles. We recapitulated 4,930 species-level genome bins (SGBs), 77% without genomes in public repositories (unknown SGBs [uSGBs]). uSGBs are prevalent (in 93% of well-assembled samples), expand underrepresented phyla, and are enriched in non-Westernized populations (40% of the total SGBs). We annotated 2.85 M genes in SGBs, many associated with conditions including infant development (94,000) or Westernization (106,000). SGBs and uSGBs permit deeper microbiome analyses and increase the average mappability of metagenomic reads from 67.76% to 87.51% in the gut (median 94.26%) and 65.14% to 82.34% in the mouth. We thus identify thousands of microbial genomes from yet-to-be-named species, expand the pangenomes of human-associated microbes, and allow better exploitation of metagenomic technologies.


Subjects
Metagenome/genetics , Metagenomics/methods , Microbiota/genetics , Big Data , Genetic Variation/genetics , Geography , Humans , Life Style , Phylogeny , Sequence Analysis, DNA/methods
4.
Annu Rev Neurosci ; 43: 441-464, 2020 07 08.
Article in English | MEDLINE | ID: mdl-32283996

ABSTRACT

As acquiring bigger data becomes easier in experimental brain science, computational and statistical brain science must achieve similar advances to fully capitalize on these data. Tackling these problems will benefit from a more explicit and concerted effort to work together. Specifically, brain science can be further democratized by harnessing the power of community-driven tools, which both are built by and benefit from many different people with different backgrounds and expertise. This perspective can be applied across modalities and scales and enables collaborations across previously siloed communities.


Subjects
Big Data , Brain/physiology , Computational Biology , Nerve Net/physiology , Animals , Computational Biology/methods , Databases, Genetic , Gene Expression/physiology , Humans
5.
Nat Methods ; 21(9): 1597-1602, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39174710

ABSTRACT

Over the last decade, biology has begun utilizing 'big data' approaches, resulting in large, comprehensive atlases in modalities ranging from transcriptomics to neural connectomics. However, these approaches must be complemented and integrated with 'small data' approaches to efficiently utilize data from individual labs. Integration of smaller datasets with major reference atlases is critical to provide context to individual experiments, and approaches toward integration of large and small data have been a major focus in many fields in recent years. Here we discuss progress in integration of small data with consortium-sized atlases across multiple modalities, and its potential applications. We then examine promising future directions for utilizing the power of small data to maximize the information garnered from small-scale experiments. We envision that, in the near future, international consortia comprising many laboratories will work together to collaboratively build reference atlases and foundation models using small data methods.


Subjects
Genomics , Humans , Genomics/methods , Big Data , Animals , Connectome/methods , Computational Biology/methods
6.
Nature ; 600(7890): 695-700, 2021 12.
Article in English | MEDLINE | ID: mdl-34880504

ABSTRACT

Surveys are a crucial tool for understanding public opinion and behaviour, and their accuracy depends on maintaining statistical representativeness of their target populations by minimizing biases from all sources. Increasing data size shrinks confidence intervals but magnifies the effect of survey bias: an instance of the Big Data Paradox [1]. Here we demonstrate this paradox in estimates of first-dose COVID-19 vaccine uptake in US adults from 9 January to 19 May 2021 from two large surveys: Delphi-Facebook [2,3] (about 250,000 responses per week) and Census Household Pulse [4] (about 75,000 every two weeks). In May 2021, Delphi-Facebook overestimated uptake by 17 percentage points (14-20 percentage points with 5% benchmark imprecision) and Census Household Pulse by 14 (11-17 percentage points with 5% benchmark imprecision), compared to a retroactively updated benchmark the Centers for Disease Control and Prevention published on 26 May 2021. Moreover, their large sample sizes led to minuscule margins of error on the incorrect estimates. By contrast, an Axios-Ipsos online panel [5] with about 1,000 responses per week following survey research best practices [6] provided reliable estimates and uncertainty quantification. We decompose observed error using a recent analytic framework [1] to explain the inaccuracy in the three surveys. We then analyse the implications for vaccine hesitancy and willingness. We show how a survey of 250,000 respondents can produce an estimate of the population mean that is no more accurate than an estimate from a simple random sample of size 10. Our central message is that data quality matters more than data quantity, and that compensating the former with the latter is a mathematically provable losing proposition.


Subjects
COVID-19 Vaccines/administration & dosage , Health Care Surveys , Vaccination/statistics & numerical data , Benchmarking , Bias , Big Data , COVID-19/epidemiology , COVID-19/prevention & control , Centers for Disease Control and Prevention, U.S. , Datasets as Topic/standards , Female , Health Care Surveys/standards , Humans , Male , Research Design , Sample Size , Social Media , United States/epidemiology , Vaccination Hesitancy/statistics & numerical data
7.
Proc Natl Acad Sci U S A ; 121(39): e2402387121, 2024 Sep 24.
Article in English | MEDLINE | ID: mdl-39288180

ABSTRACT

New data sources and AI methods for extracting information are increasingly abundant and relevant to decision-making across societal applications. A notable example is street view imagery, available in over 100 countries, and purported to inform built environment interventions (e.g., adding sidewalks) for community health outcomes. However, biases can arise when decision-making does not account for data robustness or relies on spurious correlations. To investigate this risk, we analyzed 2.02 million Google Street View (GSV) images alongside health, demographic, and socioeconomic data from New York City. Findings demonstrate robustness challenges; built environment characteristics inferred from GSV labels at the intracity level often do not align with ground truth. Moreover, as average individual-level behavior of physical inactivity significantly mediates the impact of built environment features by census tract, intervention on features measured by GSV would be misestimated without proper model specification and consideration of this mediation mechanism. Using a causal framework accounting for these mediators, we determined that intervening by improving 10% of samples in the two lowest tertiles of physical inactivity would lead to a 4.17 (95% CI 3.84-4.55) or 17.2 (95% CI 14.4-21.3) times greater decrease in the prevalence of obesity or diabetes, respectively, compared to the same proportional intervention on the number of crosswalks by census tract. This study highlights critical issues of robustness and model specification in using emergent data sources, showing the data may not measure what is intended, and ignoring mediators can result in biased intervention effect estimates.


Subjects
Big Data , Decision Making , Public Health , Humans , New York City , Built Environment , Male , Female
8.
Brief Bioinform ; 25(Supplement_1)2024 Jul 23.
Article in English | MEDLINE | ID: mdl-39376084

ABSTRACT

Biomedical data are growing exponentially in both volume and levels of complexity, due to the rapid advancement of technologies and research methodologies. Analyzing these large datasets, referred to collectively as "big data," has become an integral component of research that guides experimentation-driven discovery and a new engine of discovery itself as it uncovers previously unknown connections through mining of existing data. To fully realize the potential of big data, biomedical researchers need access to high-performance-computing (HPC) resources. However, supporting on-premises infrastructure that keeps up with these consistently expanding research needs presents persistent financial and staffing challenges, even for well-resourced institutions. For other institutions, including primarily undergraduate institutions and minority serving institutions, that educate a large portion of the future workforce in the USA, this challenge presents an insurmountable barrier. Therefore, new approaches are needed to provide broad and equitable access to HPC resources to biomedical researchers and students who will advance biomedical research in the future.


Subjects
Biomedical Research , Cloud Computing , Humans , Big Data , Computational Biology/methods , Computational Biology/education , Software , United States
9.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38711370

ABSTRACT

Across many scientific disciplines, the development of computational models and algorithms for generating artificial or synthetic data is gaining momentum. In biology, there is a great opportunity to explore this further as more and more big data at multi-omics level are generated recently. In this opinion, we discuss the latest trends in biological applications based on process-driven and data-driven aspects. Moving ahead, we believe these methodologies can help shape novel multi-omics-scale cellular inferences.


Subjects
Algorithms , Computational Biology , Computational Biology/methods , Genomics/methods , Humans , Big Data , Proteomics/methods , Multiomics
10.
PLoS Biol ; 21(9): e3002306, 2023 09.
Article in English | MEDLINE | ID: mdl-37751414

ABSTRACT

Over the past 20 years, neuroscience has been propelled forward by theory-driven experimentation. We consider the future outlook for the field in the age of big neural data and powerful artificial intelligence models.


Subjects
Artificial Intelligence , Neurosciences , Big Data , Empirical Research , Research Design
11.
PLoS Biol ; 21(4): e3002116, 2023 04.
Article in English | MEDLINE | ID: mdl-37099620

ABSTRACT

Since its inception, synthetic biology has overcome many technical barriers but is at a crossroads for high-precision biological design. Devising ways to fully utilize big biological data may be the key to achieving greater heights in synthetic biology.


Subjects
Big Data , Synthetic Biology
12.
Nucleic Acids Res ; 52(D1): D18-D32, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38018256

ABSTRACT

The National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support the global academic and industrial communities. With the rapid accumulation of multi-omics data at an unprecedented pace, CNCB-NGDC continuously expands and updates core database resources through big data archiving, integrative analysis and value-added curation. Importantly, NGDC collaborates closely with major international databases and initiatives to ensure seamless data exchange and interoperability. Over the past year, significant efforts have been dedicated to integrating diverse omics data, synthesizing expanding knowledge, developing new resources, and upgrading major existing resources. Particularly, several database resources are newly developed for the biodiversity of protists (P10K), bacteria (NTM-DB, MPA) as well as plant (PPGR, SoyOmics, PlantPan) and disease/trait association (CROST, HervD Atlas, HALL, MACdb, BioKA, RePoS, PGG.SV, NAFLDkb). All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.


Subjects
Computational Biology , Databases, Genetic , Genomics , Big Data , China , Databases, Genetic/trends , Eukaryota , Internet
13.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38273677

ABSTRACT

MOTIVATION: Given the widespread use of the variant call format (VCF/BCF) coupled with continuous surge in big data, there remains a perpetual demand for fast and flexible methods to manipulate these comprehensive formats across various programming languages. RESULTS: This work presents vcfpp, a C++ API of HTSlib in a single file, providing an intuitive interface to manipulate VCF/BCF files rapidly and safely, in addition to being portable. Moreover, this work introduces the vcfppR package to demonstrate the development of a high-performance R package with vcfpp, allowing for rapid and straightforward variants analyses. AVAILABILITY AND IMPLEMENTATION: vcfpp is available from https://github.com/Zilong-Li/vcfpp under MIT license. vcfppR is available from https://cran.r-project.org/web/packages/vcfppR.


Subjects
Programming Languages , Software , Big Data
14.
Annu Rev Biomed Eng ; 26(1): 529-560, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38594947

ABSTRACT

Despite the remarkable advances in cancer diagnosis, treatment, and management over the past decade, malignant tumors remain a major public health problem. Further progress in combating cancer may be enabled by personalizing the delivery of therapies according to the predicted response for each individual patient. The design of personalized therapies requires the integration of patient-specific information with an appropriate mathematical model of tumor response. A fundamental barrier to realizing this paradigm is the current lack of a rigorous yet practical mathematical theory of tumor initiation, development, invasion, and response to therapy. We begin this review with an overview of different approaches to modeling tumor growth and treatment, including mechanistic as well as data-driven models based on big data and artificial intelligence. We then present illustrative examples of mathematical models manifesting their utility and discuss the limitations of stand-alone mechanistic and data-driven models. We then discuss the potential of mechanistic models for not only predicting but also optimizing response to therapy on a patient-specific basis. We describe current efforts and future possibilities to integrate mechanistic and data-driven models. We conclude by proposing five fundamental challenges that must be addressed to fully realize personalized care for cancer patients driven by computational models.


Subjects
Artificial Intelligence , Big Data , Neoplasms , Precision Medicine , Humans , Neoplasms/therapy , Precision Medicine/methods , Computer Simulation , Models, Biological , Patient-Specific Modeling
18.
Nature ; 566(7743): 195-204, 2019 02.
Article in English | MEDLINE | ID: mdl-30760912

ABSTRACT

Machine learning approaches are increasingly used to extract patterns and insights from the ever-increasing stream of geospatial data, but current approaches may not be optimal when system behaviour is dominated by spatial or temporal context. Here, rather than amending classical machine learning, we argue that these contextual cues should be used as part of deep learning (an approach that is able to extract spatio-temporal features automatically) to gain further process understanding of Earth system science problems, improving the predictive ability of seasonal forecasting and modelling of long-range spatial connections across multiple timescales, for example. The next step will be a hybrid modelling approach, coupling physical process models with the versatility of data-driven machine learning.


Subjects
Big Data , Computer Simulation , Deep Learning , Earth Sciences/methods , Forecasting/methods , Pattern Recognition, Automated/methods , Facial Recognition , Female , Geographic Mapping , Humans , Knowledge , Regression, Psychology , Reproducibility of Results , Seasons , Spatio-Temporal Analysis , Time Factors , Translating , Uncertainty , Weather
19.
Am J Epidemiol ; 193(9): 1211-1214, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-38751306

ABSTRACT

Many examples of the use of real-world data in the area of pharmacoepidemiology include "big data," such as insurance claims, medical records, or hospital discharge databases. However, "big" is not always better, particularly when studying outcomes with narrow windows of etiologic relevance. Birth defects are such an outcome, for which specificity of exposure timing is critical. Studies with primary data collection can be designed to query details about the timing of medication use, as well as type, dose, frequency, duration, and indication, that can better characterize the "real world." Because birth defects are rare, etiologic studies are typically case-control in design, like the National Birth Defects Prevention Study, Birth Defects Study to Evaluate Pregnancy Exposures, and Slone Birth Defects Study. Recall bias can be a concern, but the ability to collect detailed information about both prescription and over-the-counter medication use and other exposures such as diet, family history, and sociodemographic factors is a distinct advantage over claims and medical record data sources. Case-control studies with primary data collection are essential to advancing the pharmacoepidemiology of birth defects. This article is part of a Special Collection on Pharmacoepidemiology.


Subjects
Congenital Abnormalities , Pharmacoepidemiology , Humans , Pregnancy , Female , Pharmacoepidemiology/methods , Congenital Abnormalities/epidemiology , Big Data , Abnormalities, Drug-Induced/epidemiology , Data Collection/methods , Case-Control Studies
20.
Oncologist ; 29(5): 415-421, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38330451

ABSTRACT

PURPOSE: Immune checkpoint inhibitors (ICIs) have significantly improved the survival of patients with cancer and provided long-term durable benefit. However, ICI-treated patients develop a range of toxicities known as immune-related adverse events (irAEs), which could compromise clinical benefits from these treatments. As the incidence and spectrum of irAEs differs across cancer types and ICI agents, it is imperative to characterize the incidence and spectrum of irAEs in a pan-cancer cohort to aid clinical management. DESIGN: We queried >400 000 trials registered at ClinicalTrials.gov and retrieved a comprehensive pan-cancer database of 71 087 ICI-treated participants from 19 cancer types and 7 ICI agents. We performed data harmonization and cleaning of these trial results into 293 harmonized adverse event categories using Medical Dictionary for Regulatory Activities. RESULTS: We developed irAExplorer (https://irae.tanlab.org), an interactive database that focuses on adverse events in patients administered with ICIs from big data mining. irAExplorer encompasses 71 087 distinct clinical trial participants from 343 clinical trials across 19 cancer types with well-annotated ICI treatment regimens and harmonized adverse event categories. We demonstrated a few of the irAE analyses through irAExplorer and highlighted some associations between treatment- or cancer-specific irAEs. CONCLUSION: The irAExplorer is a user-friendly resource that offers exploration, validation, and discovery of treatment- or cancer-specific irAEs across pan-cancer cohorts. We envision that irAExplorer can serve as a valuable resource to cross-validate users' internal datasets to increase the robustness of their findings.


Subjects
Clinical Trials as Topic , Data Mining , Drug-Related Side Effects and Adverse Reactions , Immune Checkpoint Inhibitors , Neoplasms , Humans , Immune Checkpoint Inhibitors/adverse effects , Immune Checkpoint Inhibitors/therapeutic use , Neoplasms/drug therapy , Drug-Related Side Effects and Adverse Reactions/epidemiology , Big Data , Databases, Factual/statistics & numerical data