Results 1 - 20 of 3,187
1.
Cell ; 184(6): 1415-1419, 2021 03 18.
Article in English | MEDLINE | ID: mdl-33740447

ABSTRACT

Precision medicine promises improved health by accounting for individual variability in genes, environment, and lifestyle. Precision medicine will continue to transform healthcare in the coming decade as it expands in key areas: huge cohorts, artificial intelligence (AI), routine clinical genomics, phenomics and environment, and returning value across diverse populations.


Subject(s)
Delivery of Health Care , Precision Medicine , Artificial Intelligence , Big Data , Biomedical Research , Cultural Diversity , Electronic Health Records , Humans , Phenomics
2.
Cell ; 180(1): 9-14, 2020 01 09.
Article in English | MEDLINE | ID: mdl-31951522

ABSTRACT

This commentary introduces a new clinical trial construct, the Master Observational Trial (MOT), which hybridizes the power of molecularly based master interventional protocols with the breadth of real-world data. The MOT provides a clinical venue to allow molecular medicine to rapidly advance, answers questions that traditional interventional trials generally do not address, and seamlessly integrates with interventional trials in both diagnostic and therapeutic arenas. The result is a more comprehensive data collection ecosystem in precision medicine.


Subject(s)
Observational Studies as Topic/methods , Precision Medicine/methods , Research Design/standards , Big Data , Clinical Trial Protocols as Topic , Humans , Molecular Targeted Therapy/methods , Molecular Targeted Therapy/trends , Observational Studies as Topic/standards
3.
Cell ; 176(3): 649-662.e20, 2019 01 24.
Article in English | MEDLINE | ID: mdl-30661755

ABSTRACT

The body-wide human microbiome plays a role in health, but its full diversity remains uncharacterized, particularly outside of the gut and in international populations. We leveraged 9,428 metagenomes to reconstruct 154,723 microbial genomes (45% of high quality) spanning body sites, ages, countries, and lifestyles. We recapitulated 4,930 species-level genome bins (SGBs), 77% without genomes in public repositories (unknown SGBs [uSGBs]). uSGBs are prevalent (in 93% of well-assembled samples), expand underrepresented phyla, and are enriched in non-Westernized populations (40% of the total SGBs). We annotated 2.85 M genes in SGBs, many associated with conditions including infant development (94,000) or Westernization (106,000). SGBs and uSGBs permit deeper microbiome analyses and increase the average mappability of metagenomic reads from 67.76% to 87.51% in the gut (median 94.26%) and 65.14% to 82.34% in the mouth. We thus identify thousands of microbial genomes from yet-to-be-named species, expand the pangenomes of human-associated microbes, and allow better exploitation of metagenomic technologies.


Subject(s)
Metagenome/genetics , Metagenomics/methods , Microbiota/genetics , Big Data , Genetic Variation/genetics , Geography , Humans , Life Style , Phylogeny , Sequence Analysis, DNA/methods
4.
Annu Rev Neurosci ; 43: 441-464, 2020 07 08.
Article in English | MEDLINE | ID: mdl-32283996

ABSTRACT

As acquiring bigger data becomes easier in experimental brain science, computational and statistical brain science must achieve similar advances to fully capitalize on these data. Tackling these problems will benefit from a more explicit and concerted effort to work together. Specifically, brain science can be further democratized by harnessing the power of community-driven tools, which both are built by and benefit from many different people with different backgrounds and expertise. This perspective can be applied across modalities and scales and enables collaborations across previously siloed communities.


Subject(s)
Big Data , Brain/physiology , Computational Biology , Nerve Net/physiology , Animals , Computational Biology/methods , Databases, Genetic , Gene Expression/physiology , Humans
5.
Nature ; 600(7890): 695-700, 2021 12.
Article in English | MEDLINE | ID: mdl-34880504

ABSTRACT

Surveys are a crucial tool for understanding public opinion and behaviour, and their accuracy depends on maintaining statistical representativeness of their target populations by minimizing biases from all sources. Increasing data size shrinks confidence intervals but magnifies the effect of survey bias: an instance of the Big Data Paradox [1]. Here we demonstrate this paradox in estimates of first-dose COVID-19 vaccine uptake in US adults from 9 January to 19 May 2021 from two large surveys: Delphi-Facebook [2,3] (about 250,000 responses per week) and Census Household Pulse [4] (about 75,000 every two weeks). In May 2021, Delphi-Facebook overestimated uptake by 17 percentage points (14-20 percentage points with 5% benchmark imprecision) and Census Household Pulse by 14 (11-17 percentage points with 5% benchmark imprecision), compared to a retroactively updated benchmark the Centers for Disease Control and Prevention published on 26 May 2021. Moreover, their large sample sizes led to minuscule margins of error on the incorrect estimates. By contrast, an Axios-Ipsos online panel [5] with about 1,000 responses per week following survey research best practices [6] provided reliable estimates and uncertainty quantification. We decompose observed error using a recent analytic framework [1] to explain the inaccuracy in the three surveys. We then analyse the implications for vaccine hesitancy and willingness. We show how a survey of 250,000 respondents can produce an estimate of the population mean that is no more accurate than an estimate from a simple random sample of size 10. Our central message is that data quality matters more than data quantity, and that compensating the former with the latter is a mathematically provable losing proposition.
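The claim that 250,000 responses can be worth about as much as a simple random sample of size 10 can be illustrated with a short calculation. The sketch below uses the error identity usually attributed to Meng (2018), which is assumed here to be the analytic framework the abstract cites: the estimation error equals the data defect correlation times sqrt((1 - f) / f) times the population standard deviation, where f is the sampling fraction. The population size (255 million US adults), the correlation value (0.01), and the standard deviation (0.5) are illustrative assumptions, not figures taken from the abstract.

```python
import math


def effective_sample_size(n: int, population: int, rho: float) -> float:
    """Approximate size of a simple random sample with the same mean squared
    error as a biased sample of size n, given the data defect correlation rho
    (correlation between responding and the outcome). Assumed formula:
    n_eff ~= f / (1 - f) / rho**2, with sampling fraction f = n / population.
    """
    if not 0 < n < population:
        raise ValueError("n must be between 0 and the population size")
    if rho == 0:
        raise ValueError("rho must be non-zero (rho == 0 means no selection bias)")
    f = n / population
    return (f / (1 - f)) / rho ** 2


def estimation_error(n: int, population: int, rho: float, sigma: float) -> float:
    """Error of the sample mean under the same identity:
    error = rho * sqrt((1 - f) / f) * sigma.
    """
    f = n / population
    return rho * math.sqrt((1 - f) / f) * sigma


# Illustrative inputs only (assumptions, not values reported in the abstract).
N_ADULTS = 255_000_000   # assumed size of the US adult population
N_SURVEY = 250_000       # weekly responses, as stated in the abstract
RHO = 0.01               # hypothetical data defect correlation
SIGMA = 0.5              # largest possible std. dev. of a yes/no outcome

n_eff = effective_sample_size(N_SURVEY, N_ADULTS, RHO)
error = estimation_error(N_SURVEY, N_ADULTS, RHO, SIGMA)
print(f"effective sample size: {n_eff:.1f}")
print(f"error in the estimated proportion: {error:.3f}")
```

Under these assumed inputs the effective sample size comes out at roughly ten, which shows how a correlation as small as 0.01 between responding and the outcome can cancel almost all of the benefit of a very large sample.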


Subject(s)
COVID-19 Vaccines/administration & dosage , Health Care Surveys , Vaccination/statistics & numerical data , Benchmarking , Bias , Big Data , COVID-19/epidemiology , COVID-19/prevention & control , Centers for Disease Control and Prevention, U.S. , Datasets as Topic/standards , Female , Health Care Surveys/standards , Humans , Male , Research Design , Sample Size , Social Media , United States/epidemiology , Vaccination Hesitancy/statistics & numerical data
6.
Brief Bioinform ; 25(3)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38711370

ABSTRACT

Across many scientific disciplines, the development of computational models and algorithms for generating artificial or synthetic data is gaining momentum. In biology, there is a great opportunity to explore this further as more and more big data at the multi-omics level have recently been generated. In this opinion, we discuss the latest trends in biological applications based on process-driven and data-driven aspects. Moving ahead, we believe these methodologies can help shape novel multi-omics-scale cellular inferences.


Subject(s)
Algorithms , Computational Biology , Computational Biology/methods , Genomics/methods , Humans , Big Data , Proteomics/methods , Multiomics
7.
PLoS Biol ; 21(9): e3002306, 2023 09.
Article in English | MEDLINE | ID: mdl-37751414

ABSTRACT

Over the past 20 years, neuroscience has been propelled forward by theory-driven experimentation. We consider the future outlook for the field in the age of big neural data and powerful artificial intelligence models.


Subject(s)
Artificial Intelligence , Neurosciences , Big Data , Empirical Research , Research Design
8.
PLoS Biol ; 21(4): e3002116, 2023 04.
Article in English | MEDLINE | ID: mdl-37099620

ABSTRACT

Since its inception, synthetic biology has overcome many technical barriers but is at a crossroads for high-precision biological design. Devising ways to fully utilize big biological data may be the key to achieving greater heights in synthetic biology.


Subject(s)
Big Data , Synthetic Biology
9.
Nucleic Acids Res ; 52(D1): D18-D32, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-38018256

ABSTRACT

The National Genomics Data Center (NGDC), which is a part of the China National Center for Bioinformation (CNCB), provides a family of database resources to support the global academic and industrial communities. With the rapid accumulation of multi-omics data at an unprecedented pace, CNCB-NGDC continuously expands and updates core database resources through big data archiving, integrative analysis and value-added curation. Importantly, NGDC collaborates closely with major international databases and initiatives to ensure seamless data exchange and interoperability. Over the past year, significant efforts have been dedicated to integrating diverse omics data, synthesizing expanding knowledge, developing new resources, and upgrading major existing resources. Particularly, several database resources are newly developed for the biodiversity of protists (P10K), bacteria (NTM-DB, MPA) as well as plant (PPGR, SoyOmics, PlantPan) and disease/trait association (CROST, HervD Atlas, HALL, MACdb, BioKA, RePoS, PGG.SV, NAFLDkb). All the resources and services are publicly accessible at https://ngdc.cncb.ac.cn.


Subject(s)
Computational Biology , Databases, Genetic , Genomics , Big Data , China , Databases, Genetic/trends , Eukaryota , Internet
10.
Bioinformatics ; 40(2)2024 02 01.
Article in English | MEDLINE | ID: mdl-38273677

ABSTRACT

MOTIVATION: Given the widespread use of the variant call format (VCF/BCF) coupled with continuous surge in big data, there remains a perpetual demand for fast and flexible methods to manipulate these comprehensive formats across various programming languages. RESULTS: This work presents vcfpp, a C++ API of HTSlib in a single file, providing an intuitive interface to manipulate VCF/BCF files rapidly and safely, in addition to being portable. Moreover, this work introduces the vcfppR package to demonstrate the development of a high-performance R package with vcfpp, allowing for rapid and straightforward variants analyses. AVAILABILITY AND IMPLEMENTATION: vcfpp is available from https://github.com/Zilong-Li/vcfpp under MIT license. vcfppR is available from https://cran.r-project.org/web/packages/vcfppR.
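To make concrete what manipulating a VCF record involves, here is a minimal sketch in plain Python that splits one tab-separated VCF data line into its eight fixed columns and extracts per-sample genotypes. It does not use vcfpp, vcfppR, or HTSlib and does not reflect their APIs; the names `VcfRecord` and `parse_vcf_line` are hypothetical, and the sketch ignores headers, BCF, and most edge cases of the format.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class VcfRecord:
    """One VCF data line (hypothetical container, not a vcfpp type)."""
    chrom: str
    pos: int
    id: str
    ref: str
    alts: List[str]
    qual: Optional[float]
    filters: List[str]
    info: Dict[str, Union[str, bool]]
    genotypes: Dict[str, str]


def parse_vcf_line(line: str, samples: List[str]) -> VcfRecord:
    """Parse a single tab-separated VCF data line.

    `samples` lists the sample names in the order of the header's sample columns.
    """
    fields = line.rstrip("\n").split("\t")
    chrom, pos, vid, ref, alt, qual, flt, info = fields[:8]

    # INFO is a semicolon-separated list of KEY=VALUE pairs or bare flags.
    info_map: Dict[str, Union[str, bool]] = {}
    if info != ".":
        for item in info.split(";"):
            key, sep, value = item.partition("=")
            info_map[key] = value if sep else True

    # Column 9 (FORMAT) names the colon-separated subfields of each sample column.
    genotypes: Dict[str, str] = {}
    if len(fields) > 9:
        fmt = fields[8].split(":")
        if "GT" in fmt:
            gt_index = fmt.index("GT")
            for name, column in zip(samples, fields[9:]):
                genotypes[name] = column.split(":")[gt_index]

    return VcfRecord(
        chrom=chrom,
        pos=int(pos),
        id=vid,
        ref=ref,
        alts=[] if alt == "." else alt.split(","),
        qual=None if qual == "." else float(qual),
        filters=[] if flt == "." else flt.split(";"),
        info=info_map,
        genotypes=genotypes,
    )


# Made-up example line for illustration only.
example = "chr1\t12345\trs1\tA\tG,T\t50\tPASS\tDP=10;DB\tGT:DP\t0/1:5\t1|2:7"
record = parse_vcf_line(example, ["s1", "s2"])
print(record.chrom, record.pos, record.alts, record.genotypes)
```

Parsing text line by line like this is simple but slow on large files, which is the kind of cost a compiled library built on HTSlib is meant to avoid.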


Subject(s)
Programming Languages , Software , Big Data
11.
Annu Rev Biomed Eng ; 26(1): 529-560, 2024 Jul.
Article in English | MEDLINE | ID: mdl-38594947

ABSTRACT

Despite the remarkable advances in cancer diagnosis, treatment, and management over the past decade, malignant tumors remain a major public health problem. Further progress in combating cancer may be enabled by personalizing the delivery of therapies according to the predicted response for each individual patient. The design of personalized therapies requires the integration of patient-specific information with an appropriate mathematical model of tumor response. A fundamental barrier to realizing this paradigm is the current lack of a rigorous yet practical mathematical theory of tumor initiation, development, invasion, and response to therapy. We begin this review with an overview of different approaches to modeling tumor growth and treatment, including mechanistic as well as data-driven models based on big data and artificial intelligence. We then present illustrative examples of mathematical models manifesting their utility and discuss the limitations of stand-alone mechanistic and data-driven models. We then discuss the potential of mechanistic models for not only predicting but also optimizing response to therapy on a patient-specific basis. We describe current efforts and future possibilities to integrate mechanistic and data-driven models. We conclude by proposing five fundamental challenges that must be addressed to fully realize personalized care for cancer patients driven by computational models.


Subject(s)
Artificial Intelligence , Big Data , Neoplasms , Precision Medicine , Humans , Neoplasms/therapy , Precision Medicine/methods , Computer Simulation , Models, Biological , Patient-Specific Modeling
14.
Nature ; 566(7743): 195-204, 2019 02.
Article in English | MEDLINE | ID: mdl-30760912

ABSTRACT

Machine learning approaches are increasingly used to extract patterns and insights from the ever-increasing stream of geospatial data, but current approaches may not be optimal when system behaviour is dominated by spatial or temporal context. Here, rather than amending classical machine learning, we argue that these contextual cues should be used as part of deep learning (an approach that is able to extract spatio-temporal features automatically) to gain further process understanding of Earth system science problems, improving the predictive ability of seasonal forecasting and modelling of long-range spatial connections across multiple timescales, for example. The next step will be a hybrid modelling approach, coupling physical process models with the versatility of data-driven machine learning.


Subject(s)
Big Data , Computer Simulation , Deep Learning , Earth Sciences/methods , Forecasting/methods , Pattern Recognition, Automated/methods , Facial Recognition , Female , Geographic Mapping , Humans , Knowledge , Regression, Psychology , Reproducibility of Results , Seasons , Spatio-Temporal Analysis , Time Factors , Translating , Uncertainty , Weather
15.
Oncologist ; 29(5): 415-421, 2024 May 03.
Article in English | MEDLINE | ID: mdl-38330451

ABSTRACT

PURPOSE: Immune checkpoint inhibitors (ICIs) have significantly improved the survival of patients with cancer and provided long-term durable benefit. However, ICI-treated patients develop a range of toxicities known as immune-related adverse events (irAEs), which could compromise clinical benefits from these treatments. As the incidence and spectrum of irAEs differ across cancer types and ICI agents, it is imperative to characterize the incidence and spectrum of irAEs in a pan-cancer cohort to aid clinical management. DESIGN: We queried >400 000 trials registered at ClinicalTrials.gov and retrieved a comprehensive pan-cancer database of 71 087 ICI-treated participants from 19 cancer types and 7 ICI agents. We performed data harmonization and cleaning of these trial results into 293 harmonized adverse event categories using Medical Dictionary for Regulatory Activities. RESULTS: We developed irAExplorer (https://irae.tanlab.org), an interactive database, built through big data mining, that focuses on adverse events in patients treated with ICIs. irAExplorer encompasses 71 087 distinct clinical trial participants from 343 clinical trials across 19 cancer types with well-annotated ICI treatment regimens and harmonized adverse event categories. We demonstrated a few of the irAE analyses through irAExplorer and highlighted some associations between treatment- or cancer-specific irAEs. CONCLUSION: The irAExplorer is a user-friendly resource that offers exploration, validation, and discovery of treatment- or cancer-specific irAEs across pan-cancer cohorts. We envision that irAExplorer can serve as a valuable resource to cross-validate users' internal datasets to increase the robustness of their findings.


Subject(s)
Clinical Trials as Topic , Data Mining , Drug-Related Side Effects and Adverse Reactions , Immune Checkpoint Inhibitors , Neoplasms , Humans , Immune Checkpoint Inhibitors/adverse effects , Immune Checkpoint Inhibitors/therapeutic use , Neoplasms/drug therapy , Drug-Related Side Effects and Adverse Reactions/epidemiology , Big Data , Databases, Factual/statistics & numerical data
16.
J Gene Med ; 26(1): e3629, 2024 Jan.
Article in English | MEDLINE | ID: mdl-37940369

ABSTRACT

In recent years, the idea of "cancer big data" has emerged as a result of the significant expansion of various fields such as clinical research, genomics, proteomics and public health records. Advances in omics technologies are making a significant contribution to cancer big data in biomedicine and disease diagnosis. The increasing availability of extensive cancer big data has set the stage for the development of multimodal artificial intelligence (AI) frameworks. These frameworks aim to analyze high-dimensional multi-omics data, extracting meaningful information that is challenging to obtain manually. Although interpretability and data quality remain critical challenges, these methods hold great promise for advancing our understanding of cancer biology and improving patient care and clinical outcomes. Here, we provide an overview of cancer big data and explore the applications of both traditional machine learning and deep learning approaches in cancer genomic and proteomic studies. We briefly discuss the challenges and potential of AI techniques in the integrated analysis of omics data, as well as the future direction of personalized treatment options in cancer.


Subject(s)
Artificial Intelligence , Neoplasms , Humans , Proteomics/methods , Big Data , Genomics/methods , Machine Learning , Neoplasms/diagnosis , Neoplasms/genetics , Neoplasms/therapy
17.
Radiology ; 310(2): e232030, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38411520

ABSTRACT

According to the World Health Organization, climate change is the single biggest health threat facing humanity. The global health care system, including medical imaging, must manage the health effects of climate change while at the same time addressing the large amount of greenhouse gas (GHG) emissions generated in the delivery of care. Data centers and computational efforts are increasingly large contributors to GHG emissions in radiology. This is due to the explosive increase in big data and artificial intelligence (AI) applications that have resulted in large energy requirements for developing and deploying AI models. However, AI also has the potential to improve environmental sustainability in medical imaging. For example, use of AI can shorten MRI scan times with accelerated acquisition times, improve the scheduling efficiency of scanners, and optimize the use of decision-support tools to reduce low-value imaging. The purpose of this Radiology in Focus article is to discuss this duality at the intersection of environmental sustainability and AI in radiology. Further discussed are strategies and opportunities to decrease AI-related emissions and to leverage AI to improve sustainability in radiology, with a focus on health equity. Co-benefits of these strategies are explored, including lower cost and improved patient outcomes. Finally, knowledge gaps and areas for future research are highlighted.


Subject(s)
Artificial Intelligence , Radiology , Humans , Radiography , Big Data , Climate Change
18.
Bioinformatics ; 39(9)2023 09 02.
Article in English | MEDLINE | ID: mdl-37725353

ABSTRACT

MOTIVATION: Living in a Big Data era in biomedicine, there is an unmet need to systematically assess experimental observations in the context of available information. This assessment would offer a means for a comprehensive and robust validation of biomedical data results and provide an initial estimate of the potential novelty of the findings. RESULTS: Here we present BQsupports, a web-based tool built upon the Bioteque biomedical descriptors that systematically analyzes and quantifies the current support to a given set of observations. The tool relies on over 1000 distinct types of biomedical descriptors, covering over 11 different biological and chemical entities, including genes, cell lines, diseases, and small molecules. By exploring hundreds of descriptors, BQsupports provides support scores for each observation across a wide variety of biomedical contexts. These scores are then aggregated to summarize the biomedical support of the assessed dataset as a whole. Finally, BQsupports also suggests predictive features of the given dataset, which can be exploited in downstream machine learning applications. AVAILABILITY AND IMPLEMENTATION: The web application and underlying data are available online (https://bqsupports.irbbarcelona.org).


Subject(s)
Machine Learning , Software , Big Data
19.
J Transl Med ; 22(1): 128, 2024 02 02.
Article in English | MEDLINE | ID: mdl-38308276

ABSTRACT

BACKGROUND: DNMT3L is a crucial DNA methylation regulatory factor, yet its function and mechanism in hepatocellular carcinoma (HCC) remain poorly understood. Bioinformatics-based big data analysis has increasingly gained significance in cancer research. Therefore, this study aims to elucidate the role of DNMT3L in HCC by integrating big data analysis with experimental validation. METHODS: Dozens of HCC datasets were collected to analyze the expression of DNMT3L and its relationship with prognostic indicators, and were used for molecular regulatory relationship evaluation. The effects of DNMT3L on the malignant phenotypes of hepatoma cells were confirmed in vitro and in vivo. The regulatory mechanisms of DNMT3L were explored through MSP, western blot, and dual-luciferase assays. RESULTS: DNMT3L was found to be downregulated in HCC tissues and associated with better prognosis. Overexpression of DNMT3L inhibits cell proliferation and metastasis. Additionally, CDO1 was identified as a target gene of DNMT3L and also exhibits anti-cancer effects. DNMT3L upregulates CDO1 expression by competitively inhibiting DNMT3A-mediated methylation of CDO1 promoter. CONCLUSIONS: Our study revealed the role and epi-transcriptomic regulatory mechanism of DNMT3L in HCC, and underscored the essential role and applicability of big data analysis in elucidating complex biological processes.


Subject(s)
Carcinoma, Hepatocellular , Liver Neoplasms , Humans , Big Data , Carcinoma, Hepatocellular/genetics , Cell Line, Tumor , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA (Cytosine-5-)-Methyltransferases/metabolism , DNA Methylation/genetics , Liver Neoplasms/genetics , Promoter Regions, Genetic/genetics
20.
Glob Chang Biol ; 30(1): e17116, 2024 Jan.
Article in English | MEDLINE | ID: mdl-38273575

ABSTRACT

The scientific community has entered an era of big data. However, with big data comes big responsibilities, and best practices for how data are contributed to databases have not kept pace with the collection, aggregation, and analysis of big data. Here, we rigorously assess the quantity of data for specific leaf area (SLA) available within the largest and most frequently used global plant trait database, the TRY Plant Trait Database, exploring how much of the data were applicable (i.e., original, representative, logical, and comparable) and traceable (i.e., published, cited, and consistent). Over three-quarters of the SLA data in TRY lacked either applicability or traceability, leaving only 22.9% of the original data usable compared with the 64.9% typically deemed usable by standard data cleaning protocols. The remaining usable data differed markedly from the original for many species, which led to altered interpretation of ecological analyses. Though the data we consider here make up only 4.5% of SLA data within TRY, similar issues of applicability and traceability likely apply to SLA data for other species as well as other commonly measured, uploaded, and downloaded plant traits. We end with suggested steps forward for global ecological databases, including suggestions for both uploaders to and curators of databases with the hope that, through addressing the issues raised here, we can increase data quality and integrity within the ecological community.


Subject(s)
Plant Leaves , Plants , Big Data , Databases, Factual , Phenotype