Results 1 - 7 of 7
1.
Nature ; 600(7890): 695-700, 2021 12.
Article in English | MEDLINE | ID: mdl-34880504

ABSTRACT

Surveys are a crucial tool for understanding public opinion and behaviour, and their accuracy depends on maintaining statistical representativeness of their target populations by minimizing biases from all sources. Increasing data size shrinks confidence intervals but magnifies the effect of survey bias: an instance of the Big Data Paradox [1]. Here we demonstrate this paradox in estimates of first-dose COVID-19 vaccine uptake in US adults from 9 January to 19 May 2021 from two large surveys: Delphi-Facebook [2,3] (about 250,000 responses per week) and Census Household Pulse [4] (about 75,000 every two weeks). In May 2021, Delphi-Facebook overestimated uptake by 17 percentage points (14-20 percentage points with 5% benchmark imprecision) and Census Household Pulse by 14 (11-17 percentage points with 5% benchmark imprecision), compared to a retroactively updated benchmark that the Centers for Disease Control and Prevention published on 26 May 2021. Moreover, their large sample sizes led to minuscule margins of error on the incorrect estimates. By contrast, an Axios-Ipsos online panel [5] with about 1,000 responses per week following survey research best practices [6] provided reliable estimates and uncertainty quantification. We decompose observed error using a recent analytic framework [1] to explain the inaccuracy in the three surveys. We then analyse the implications for vaccine hesitancy and willingness. We show how a survey of 250,000 respondents can produce an estimate of the population mean that is no more accurate than an estimate from a simple random sample of size 10. Our central message is that data quality matters more than data quantity, and that compensating the former with the latter is a mathematically provable losing proposition.
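The effect described in this abstract can be illustrated with a small simulation. The sketch below is not the authors' analysis: the population size, true uptake rate, and response probabilities are made-up assumptions, and the function name `simulate_big_data_paradox` is hypothetical. It compares a large sample in which vaccinated people respond slightly more often with a small simple random sample, and reports the simple-random-sample size that would give the same squared error as the large sample.

```python
import math
import random


def simulate_big_data_paradox(pop_size=200_000, true_rate=0.6,
                              p_resp_yes=0.30, p_resp_no=0.20,
                              srs_size=1_000, seed=0):
    """Compare a large self-selected sample with a small random sample.

    All numeric defaults are illustrative assumptions, not survey data.
    """
    rng = random.Random(seed)

    # Binary outcome for every member of a finite population (1 = vaccinated).
    population = [1 if rng.random() < true_rate else 0 for _ in range(pop_size)]
    pop_mean = sum(population) / pop_size

    # Large sample with selection bias: response probability depends on outcome.
    big = [y for y in population
           if rng.random() < (p_resp_yes if y else p_resp_no)]
    big_mean = sum(big) / len(big)

    # Small simple random sample drawn without replacement.
    srs = rng.sample(population, srs_size)
    srs_mean = sum(srs) / srs_size

    # Naive 95% margin of error that treats the big sample as if it were random.
    big_naive_moe = 1.96 * math.sqrt(big_mean * (1 - big_mean) / len(big))

    # Size of a simple random sample whose variance equals the big sample's
    # squared error (finite-population correction ignored for simplicity).
    big_error = big_mean - pop_mean
    effective_n = pop_mean * (1 - pop_mean) / big_error ** 2

    return {
        "big_n": len(big),
        "big_error": big_error,
        "big_naive_moe": big_naive_moe,
        "srs_error": srs_mean - pop_mean,
        "effective_n": effective_n,
    }


if __name__ == "__main__":
    for name, value in simulate_big_data_paradox().items():
        print(f"{name}: {value:.4f}")
```

Under these assumed response rates the naive margin of error of the large sample is far smaller than its actual error, which is the pattern the abstract reports for the two large surveys.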


Subject(s)
COVID-19 Vaccines/administration & dosage , Health Care Surveys , Vaccination/statistics & numerical data , Benchmarking , Bias , Big Data , COVID-19/epidemiology , COVID-19/prevention & control , Centers for Disease Control and Prevention, U.S. , Datasets as Topic/standards , Female , Health Care Surveys/standards , Humans , Male , Research Design , Sample Size , Social Media , United States/epidemiology , Vaccination Hesitancy/statistics & numerical data
2.
Patterns (N Y) ; 2(1): 100156, 2021 Jan 08.
Article in English | MEDLINE | ID: mdl-33511362

ABSTRACT

Digital technology is having a major impact on many areas of society, and there is equal opportunity for impact on science. This is particularly true in the environmental sciences as we seek to understand the complexities of the natural environment under climate change. This perspective presents the outcomes of a summit in this area, a unique cross-disciplinary gathering bringing together environmental scientists, data scientists, computer scientists, social scientists, and representatives of the creative arts. The key output of this workshop is an agreed vision in the form of a framework and associated roadmap, captured in the Windermere Accord. This accord envisions a new kind of environmental science underpinned by unprecedented amounts of data, with technological advances leading to breakthroughs in taming uncertainty and complexity, and also supporting openness, transparency, and reproducibility in science. The perspective also includes a call to build an international community working in this important area.

3.
Int J Biostat ; 17(2): 331-348, 2020 12 02.
Article in English | MEDLINE | ID: mdl-34826372

ABSTRACT

We propose a nonparametric test of independence, termed optHSIC, between a covariate and a right-censored lifetime. Because the presence of censoring creates a challenge in applying the standard permutation-based testing approaches, we use optimal transport to transform the censored dataset into an uncensored one, while preserving the relevant dependencies. We then apply a permutation test using the kernel-based dependence measure as a statistic to the transformed dataset. The type 1 error is proven to be correct in the case where censoring is independent of the covariate. Experiments indicate that optHSIC has power against a much wider class of alternatives than Cox proportional hazards regression and that it has the correct type 1 control even in the challenging cases where censoring strongly depends on the covariate.
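The permutation step mentioned in this abstract can be sketched for fully observed (uncensored) data. The code below is only that step: it computes a biased HSIC statistic with Gaussian kernels and a permutation p-value. It does not implement the optimal-transport transformation of censored data, so it is not optHSIC itself. The function names, the fixed bandwidth, and the number of permutations are assumptions chosen for illustration.

```python
import math
import random


def gaussian_gram(values, bandwidth):
    """Gram matrix of a Gaussian kernel on scalar observations."""
    return [[math.exp(-((a - b) ** 2) / (2 * bandwidth ** 2)) for b in values]
            for a in values]


def center(gram):
    """Double-center a symmetric Gram matrix (H K H with H = I - 11^T / n)."""
    n = len(gram)
    row_means = [sum(row) / n for row in gram]
    total_mean = sum(row_means) / n
    return [[gram[i][j] - row_means[i] - row_means[j] + total_mean
             for j in range(n)] for i in range(n)]


def hsic_permutation_test(x, y, n_perm=200, bandwidth=1.0, seed=0):
    """Return (biased HSIC statistic, permutation p-value) for paired x, y."""
    rng = random.Random(seed)
    n = len(x)
    k_centered = center(gaussian_gram(x, bandwidth))
    l_gram = gaussian_gram(y, bandwidth)

    def statistic(perm):
        # trace(K H L H) / n^2, with the y observations reordered by perm.
        return sum(k_centered[i][j] * l_gram[perm[i]][perm[j]]
                   for i in range(n) for j in range(n)) / n ** 2

    identity = list(range(n))
    observed = statistic(identity)

    # Shuffling y breaks any dependence on x, giving draws from the null.
    exceed = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        if statistic(perm) >= observed:
            exceed += 1

    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value
```

A typical call would pass a covariate list and a lifetime list of equal length, for example `hsic_permutation_test(x, y, n_perm=500)`. A kernel statistic is used because it can pick up nonlinear dependence that a correlation coefficient would miss.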

4.
Sci Adv ; 5(11): eaau4996, 2019 11.
Article in English | MEDLINE | ID: mdl-31807692

ABSTRACT

Identifying causal relationships and quantifying their strength from observational time series data are key problems in disciplines dealing with complex dynamical systems such as the Earth system or the human body. Data-driven causal inference in such systems is challenging since datasets are often high dimensional and nonlinear with limited sample sizes. Here, we introduce a novel method that flexibly combines linear or nonlinear conditional independence tests with a causal discovery algorithm to estimate causal networks from large-scale time series datasets. We validate the method on time series of well-understood physical mechanisms in the climate system and the human heart and using large-scale synthetic datasets mimicking the typical properties of real-world data. The experiments demonstrate that our method outperforms state-of-the-art techniques in detection power, which opens up entirely new possibilities to discover and quantify causal networks from time series across a range of research fields.
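As a rough illustration of what a linear conditional independence test on time series looks like, the sketch below computes a lag-1 partial correlation: the association between a candidate driver at time t-1 and a target at time t, after conditioning on the target's own past. This is a single building block under strong assumptions (linearity, one lag, one conditioning variable), not the causal discovery algorithm introduced in the paper. The helper names and the simulated coefficients are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def lagged_partial_corr(source, target):
    """Partial correlation of source[t-1] and target[t] given target[t-1]."""
    src_past = source[:-1]
    tgt_now = target[1:]
    tgt_past = target[:-1]
    r_xy = pearson(src_past, tgt_now)
    r_xz = pearson(src_past, tgt_past)
    r_yz = pearson(tgt_now, tgt_past)
    # First-order partial correlation formula.
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate_lagged_pair(n=2000, seed=0):
    """Toy system in which x drives y at lag 1 and y never drives x."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n - 1):
        x_next = 0.5 * x[-1] + rng.gauss(0, 1)
        y_next = 0.5 * y[-1] + 0.8 * x[-1] + rng.gauss(0, 1)
        x.append(x_next)
        y.append(y_next)
    return x, y


if __name__ == "__main__":
    x, y = simulate_lagged_pair()
    print("x -> y:", round(lagged_partial_corr(x, y), 3))
    print("y -> x:", round(lagged_partial_corr(y, x), 3))
```

In the toy system the x to y link should show a clearly nonzero partial correlation while the reverse direction stays near zero. A full causal discovery method would repeat such tests over many variables, lags, and conditioning sets.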

5.
Spat Stat ; 28: 59-78, 2018 Dec.
Article in English | MEDLINE | ID: mdl-31008043

ABSTRACT

The use of covariance kernels is ubiquitous in the field of spatial statistics. Kernels allow data to be mapped into high-dimensional feature spaces and can thus extend simple linear additive methods to nonlinear methods with higher order interactions. However, until recently, there has been a strong reliance on a limited class of stationary kernels such as the Matérn or squared exponential, limiting the expressiveness of these modelling approaches. Recent machine learning research has focused on spectral representations to model arbitrary stationary kernels and introduced more general representations that include classes of nonstationary kernels. In this paper, we exploit the connections between Fourier feature representations, Gaussian processes and neural networks to generalise previous approaches and develop a simple and efficient framework to learn arbitrarily complex nonstationary kernel functions directly from the data, while taking care to avoid overfitting using state-of-the-art methods from deep learning. We highlight the very broad array of kernel classes that could be created within this framework. We apply this to a time series dataset and a remote sensing problem involving land surface temperature in Eastern Africa. We show that without increasing the computational or storage complexity, nonstationary kernels can be used to improve generalisation performance and provide more interpretable results.
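The Fourier feature representation this abstract builds on can be shown in its simplest stationary form. The sketch below approximates a one-dimensional squared exponential kernel with random Fourier features: frequencies are drawn from the kernel's spectral density (a Gaussian) and the kernel value is recovered as an inner product of cosine features. It covers only this stationary baseline, not the nonstationary framework proposed in the paper. The names `se_kernel` and `make_rff` and the default feature count are assumptions.

```python
import math
import random


def se_kernel(x, y, lengthscale=1.0):
    """Exact squared exponential kernel on scalars."""
    return math.exp(-((x - y) ** 2) / (2 * lengthscale ** 2))


def make_rff(num_features=4000, lengthscale=1.0, seed=0):
    """Return a feature map phi with phi(x) . phi(y) approximating se_kernel."""
    rng = random.Random(seed)
    # The spectral density of the squared exponential kernel is Gaussian
    # with standard deviation 1 / lengthscale.
    freqs = [rng.gauss(0.0, 1.0 / lengthscale) for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(w * x + b) for w, b in zip(freqs, phases)]

    return features


if __name__ == "__main__":
    phi = make_rff()
    for a, b in [(0.0, 0.5), (-1.0, 1.0)]:
        approx = sum(u * v for u, v in zip(phi(a), phi(b)))
        print(a, b, round(approx, 3), round(se_kernel(a, b), 3))
```

Because the features are explicit, a linear model on `phi(x)` behaves like a kernel method with cost that grows with the number of features instead of the number of data points.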

6.
Elife ; 4, 2015 Jan 23.
Article in English | MEDLINE | ID: mdl-25615722

ABSTRACT

Electrophysiological data disclose rich dynamics in patterns of neural activity evoked by sensory objects. Retrieving objects from memory reinstates components of this activity. In humans, the temporal structure of this retrieved activity remains largely unexplored, and here we address this gap using the spatiotemporal precision of magnetoencephalography (MEG). In a sensory preconditioning paradigm, 'indirect' objects were paired with 'direct' objects to form associative links, and the latter were then paired with rewards. Using multivariate analysis methods we examined the short-time evolution of neural representations of indirect objects retrieved during reward-learning about direct objects. We found two components of the evoked representation of the indirect stimulus, 200 ms apart. The strength of retrieval of one, but not the other, representational component correlated with generalization of reward learning from direct to indirect stimuli. We suggest the temporal structure within retrieved neural representations may be key to their function.


Subject(s)
Association Learning/physiology , Memory/physiology , Adolescent , Adult , Behavior , Evoked Potentials/physiology , Female , Humans , Magnetoencephalography , Male , Multivariate Analysis , Photic Stimulation , Task Performance and Analysis , Time Factors , Young Adult