ABSTRACT
Randomization-based inference is a useful alternative to traditional population model-based methods. In trials with missing data, multiple imputation is often used. We describe how to construct a randomization test in clinical trials where multiple imputation is used for handling missing data. We illustrate the proposed methodology using Fisher's combining function applied to individual scores in two post-traumatic stress disorder trials.
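As a rough illustration of the idea in this abstract, the sketch below runs a randomization test on several scores per subject and combines the per-score p-values with Fisher's combining function, -2 Σ log p. It is a minimal sketch under stated assumptions, not the authors' procedure: it assumes complete data (the multiple-imputation step is omitted), a simple re-shuffling of labels with fixed group sizes in place of the trial's actual randomization procedure, and an absolute difference in means as the per-score statistic. The function name and parameters are hypothetical.

```python
import math
import random


def fisher_combined_randomization_test(scores, treated, n_rand=499, seed=0):
    """Randomization test combining several outcomes with Fisher's function.

    scores  : list of per-subject lists, one value per outcome (complete data assumed)
    treated : list of 0/1 treatment labels, same length as scores
    Returns (observed combined statistic, global randomization p-value).
    """
    rng = random.Random(seed)
    n_out = len(scores[0])

    def stats(labels):
        # Absolute difference in means (treated vs. control) for each outcome
        out = []
        for k in range(n_out):
            t = [row[k] for row, g in zip(scores, labels) if g == 1]
            c = [row[k] for row, g in zip(scores, labels) if g == 0]
            out.append(abs(sum(t) / len(t) - sum(c) / len(c)))
        return out

    # Row 0 holds the observed allocation; the rest are re-randomizations.
    # Assumption: shuffling labels mimics the trial's randomization procedure.
    allocations = [list(treated)]
    for _ in range(n_rand):
        perm = list(treated)
        rng.shuffle(perm)
        allocations.append(perm)
    table = [stats(a) for a in allocations]
    total = len(table)

    def combined(row):
        # Per-outcome p-value of this row within the randomization distribution,
        # then Fisher's combining function: -2 * sum(log p)
        value = 0.0
        for k in range(n_out):
            p = sum(1 for other in table if other[k] >= row[k]) / total
            value += -2.0 * math.log(p)
        return value

    combined_all = [combined(row) for row in table]
    p_global = sum(1 for v in combined_all if v >= combined_all[0]) / total
    return combined_all[0], p_global
```

Because the same set of re-randomizations is reused for every score, the dependence between scores is carried into the reference distribution of the combined statistic.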
Subjects
Data Interpretation, Statistical; Humans; Random Allocation
Subjects
Coronavirus Infections/epidemiology; Models, Theoretical; Pneumonia, Viral/epidemiology; Social Sciences; Uncertainty; Bias; COVID-19; Cost-Benefit Analysis; Health Policy; Humans; Models, Biological; Pandemics/statistics & numerical data; Politics; Public Health/methods; Public Health/standards; Reproducibility of Results
ABSTRACT
Science is now at the centre of several storms. The best known is the finding that many scientific results are not reproducible, a problem that stretches from medicine (clinical and preclinical studies) to behavioural research (priming studies). Although the misuse of statistics is reported to be an evident cause of the reproducibility crisis, its deeper causes lie elsewhere, particularly in the passage from a regime of little science, regulated by small communities of researchers, to the current big science, characterized by a hypertrophic production of millions of research papers and by the imperative "publish or perish", in a setting dominated by the market. While spirited debates (on vaccines, climate change, GMOs) unfold in society, scientific articles that are bought or withdrawn signal a deep crisis not only of science but also of expert thought. Against this background, statistics is the main defendant, charged with using methods that experts themselves cannot explain in an understandable way (p-test). Is there an escape? Yes, there is. Researchers can either court power and defend the status quo, or contribute to a deep process of reformation, rejecting both a vision of science as a religion and the idea that the problem is the lay public's poor scientific knowledge.
Subjects
Statistics as Topic/standards; Reproducibility of Results
ABSTRACT
The recent elevated rate of large earthquakes has fueled concern that the underlying global rate of earthquake activity has increased, which would have important implications for assessments of seismic hazard and our understanding of how faults interact. We examine the timing of large (magnitude M≥7) earthquakes from 1900 to the present, after removing local clustering related to aftershocks. The global rate of M≥8 earthquakes has been at a record high roughly since 2004, but rates have been almost as high before, and the rate of smaller earthquakes is close to its historical average. Some features of the global catalog are improbable in retrospect, but so are some features of most random sequences, if the features are selected after looking at the data. For a variety of magnitude cutoffs and three statistical tests, the global catalog, with local clusters removed, is not distinguishable from a homogeneous Poisson process. Moreover, no plausible physical mechanism predicts real changes in the underlying global rate of large events. Together these facts suggest that the global risk of large earthquakes is no higher today than it has been in the past.
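The abstract does not name its three tests, so the following is only a generic sketch of how a catalog might be compared with a homogeneous Poisson process. It assumes the catalog has already been declustered and binned into equal-length intervals (for example, events per year), and it uses a Monte Carlo dispersion test: conditional on the total number of events, a homogeneous Poisson process scatters events uniformly over the bins. The function name and the choice of statistic are assumptions made for illustration.

```python
import random


def poisson_dispersion_test(counts, n_sim=2000, seed=0):
    """Monte Carlo dispersion test against a homogeneous Poisson process.

    counts : events per equal-length time bin (at least one event overall)
    Returns (observed chi-square dispersion statistic, Monte Carlo p-value).
    """
    rng = random.Random(seed)
    k = len(counts)
    n = sum(counts)
    mean = n / k

    def chi2(c):
        # Pearson chi-square statistic against equal expected counts
        return sum((x - mean) ** 2 for x in c) / mean

    observed = chi2(counts)
    exceed = 0
    for _ in range(n_sim):
        # Under the null, each of the n events falls in a uniformly chosen bin
        sim = [0] * k
        for _ in range(n):
            sim[rng.randrange(k)] += 1
        if chi2(sim) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive
    return observed, (exceed + 1) / (n_sim + 1)
```

A large p-value means the binned counts are no more clustered than a constant-rate process would produce. As the abstract cautions, a test chosen after looking at the data will overstate the evidence for a rate change.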
Subjects
Earthquakes; Internationality; Cluster Analysis; Monte Carlo Method; Poisson Distribution; Risk Factors; Time Factors
ABSTRACT
Darwin's classic image of an "entangled bank" of interdependencies among species has long suggested that it is difficult to predict how the loss of one species affects the abundance of others. We show that for dynamical models of realistically structured ecological networks in which pair-wise consumer-resource interactions allometrically scale to the (3/4) power, as suggested by metabolic theory, the effect of losing one species on another can be predicted well by simple functions of variables easily observed in nature. By systematically removing individual species from 600 networks ranging from 10 to 30 species, we analyzed how the strength of 254,032 possible pair-wise species interactions depended on 90 stochastically varied species, link, and network attributes. We found that the interaction strength between a pair of species is predicted well by simple functions of the two species' biomasses and the body mass of the species removed. On average, prediction accuracy increases with network size, suggesting that greater web complexity simplifies predicting interaction strengths. Applied to field data, our model successfully predicts interactions dominated by trophic effects and illuminates the sign and magnitude of important nontrophic interactions.
Subjects
Ecology; Food Chain; Population Dynamics; Animals; Biomass; Body Size; Extinction, Biological; Feeding Behavior; Models, Theoretical
ABSTRACT
[This corrects the article DOI: 10.1371/journal.pone.0202450.].
ABSTRACT
In order to evaluate mortality predictions based on boosted trees, this retrospective study uses electronic medical record data from three academic health centers for inpatients 18 years or older with at least one observation of each vital sign. Predictions were made 12, 24, and 48 hours before death. Models fit to training data from each institution were evaluated using hold-out test data from the same institution, and from the other institutions. Gradient-boosted trees (GBT) were compared to regularized logistic regression (LR) predictions, support vector machine (SVM) predictions, quick Sepsis-Related Organ Failure Assessment (qSOFA), and Modified Early Warning Score (MEWS) using area under the receiver operating characteristic curve (AUROC). For training and testing GBT on data from the same institution, the average AUROCs were 0.96, 0.95, and 0.94 across institutional test sets for 12-, 24-, and 48-hour predictions, respectively. When trained and tested on data from different hospitals, GBT AUROCs achieved up to 0.98, 0.96, and 0.96, for 12-, 24-, and 48-hour predictions, respectively. Average AUROC for 48-hour predictions for LR, SVM, MEWS, and qSOFA were 0.85, 0.79, 0.86 and 0.82, respectively. GBT predictions may help identify patients who would benefit from increased clinical care.
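The comparison metric used in this abstract, AUROC, equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. A minimal sketch of that pairwise definition follows; the function name is hypothetical, and the quadratic-time loop is for illustration only rather than for data at the scale of the study.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels : list of 0/1 outcomes (1 = event, e.g. death)
    scores : list of predicted risks, same length as labels
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Each positive/negative pair scores 1 if ranked correctly, 0.5 if tied
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to chance ranking and 1.0 to perfect separation, which is the scale on which the reported figures such as 0.94 and 0.85 should be read.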
Subjects
Machine Learning; Sepsis; Algorithms; Hospital Mortality; Humans; Retrospective Studies
ABSTRACT
SIGNIFICANCE: Foraged leafy greens are consumed around the globe, including in urban areas, and may play a larger role when food is scarce or expensive. It is thus important to assess the safety and nutritional value of wild greens foraged in urban environments. METHODS: Field observations, soil tests, and nutritional and toxicology tests on plant tissue were conducted for three sites, each roughly 9 square blocks, in disadvantaged neighborhoods in the East San Francisco Bay Area in 2014-2015. The sites included mixed-use areas and areas with high vehicle traffic. RESULTS: Edible wild greens were abundant, even during record droughts. Soil at some survey sites had elevated concentrations of lead and cadmium, but tissue tests suggest that rinsed greens of the tested species are safe to eat. Daily consumption of standard servings comprises less than the EPA reference doses of lead, cadmium, and other heavy metals. Pesticides, glyphosate, and PCBs were below detection limits. The nutrient density of 6 abundant species compared favorably to that of the most nutritious domesticated leafy greens. CONCLUSIONS: Wild edible greens harvested in industrial, mixed-use, and high-traffic urban areas in the San Francisco East Bay area are abundant and highly nutritious. Even when grown in soils with elevated levels of heavy metals, tested species were safe to eat after rinsing in tap water. This does not mean that all edible greens growing in contaminated soil are safe to eat: tests on more species, in more locations, and over a broader range of soil chemistry are needed to determine what is generally safe and what is not. But it does suggest that wild greens could contribute to nutrition, food security, and sustainability in urban ecosystems. Current laws, regulations, and public-health guidance that forbid or discourage foraging on public lands, including urban areas, should be revisited.
Subjects
Ecosystem; Food Analysis; Food Contamination/analysis; Nutritive Value; Vegetables/chemistry; Humans; San Francisco
ABSTRACT
The scientific community is increasingly concerned with the proportion of published "discoveries" that are not replicated in subsequent studies. The field of rodent behavioral phenotyping was one of the first to raise this concern, and to relate it to other methodological issues: the complex interaction between genotype and environment; the definitions of behavioral constructs; and the use of laboratory mice and rats as model species for investigating human health and disease mechanisms. In January 2015, researchers from various disciplines gathered at Tel Aviv University to discuss these issues. The general consensus was that the issue is prevalent and of concern, and should be addressed at the statistical, methodological and policy levels, but is not so severe as to call into question the validity and the usefulness of model organisms as a whole. Well-organized community efforts, coupled with improved data and metadata sharing, have a key role in identifying specific problems and promoting effective solutions. Replicability is closely related to validity, may affect generalizability and translation of findings, and has important ethical implications.
Subjects
Animal Experimentation/standards; Behavior, Animal; Research/standards; Animals; Information Dissemination; Models, Animal; Phenotype; Reproducibility of Results; Research Design; Rodentia
ABSTRACT
Statistical models often use observational data to predict phenomena; however, interpreting model terms to understand their influence can be problematic. This issue poses a challenge in species conservation where setting priorities requires estimating influences of potential stressors using observational data. We present a novel approach for inferring influence of a rare stressor on a rare species by blending predictive models with nonparametric permutation tests. We illustrate the approach with two case studies involving rare amphibians in Yosemite National Park, USA. The endangered frog, Rana sierrae, is known to be negatively impacted by non-native fish, while the threatened toad, Anaxyrus canorus, is potentially affected by packstock. Both stressors and amphibians are rare, occurring in ~10% of potential habitat patches. We first predict amphibian occupancy with a statistical model that includes all predictors but the stressor to stratify potential habitat by predicted suitability. A stratified permutation test then evaluates the association between stressor and amphibian, all else equal. Our approach confirms the known negative relationship between fish and R. sierrae, but finds no evidence of a negative relationship between current packstock use and A. canorus breeding. Our statistical approach has potential broad application for deriving understanding (not just prediction) from observational data.
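The second step described in this abstract, a permutation test stratified by predicted habitat suitability, could be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each habitat patch has already been assigned to a suitability stratum by a separate occupancy model, takes the number of patches where stressor and species co-occur as the test statistic, and tests one-sidedly for fewer co-occurrences than expected. All names are hypothetical.

```python
import random


def stratified_permutation_test(strata, stressor, occupied, n_perm=2000, seed=0):
    """One-sided stratified permutation test for a negative association.

    strata   : stratum label per patch (e.g. binned predicted suitability)
    stressor : 0/1 per patch, stressor present
    occupied : 0/1 per patch, species present
    Returns (observed co-occurrence count, permutation p-value).
    """
    rng = random.Random(seed)

    # Group patch indices by stratum so labels are only shuffled within strata
    groups = {}
    for i, s in enumerate(strata):
        groups.setdefault(s, []).append(i)

    def statistic(labels):
        # Number of patches where stressor and species co-occur
        return sum(l * o for l, o in zip(labels, occupied))

    observed = statistic(stressor)
    labels = list(stressor)
    as_extreme = 0
    for _ in range(n_perm):
        for idx in groups.values():
            vals = [labels[i] for i in idx]
            rng.shuffle(vals)
            for i, v in zip(idx, vals):
                labels[i] = v
        # Fewer co-occurrences than observed counts as evidence of a negative effect
        if statistic(labels) <= observed:
            as_extreme += 1
    return observed, (as_extreme + 1) / (n_perm + 1)
```

Shuffling only within strata holds predicted suitability fixed, so a small p-value points to the stressor rather than to habitat differences captured by the occupancy model.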
Subjects
Biodiversity; Ecosystem; Parks, Recreational; Animals; California; Geography
ABSTRACT
A group of 29 elderly subjects between 60.0 and 83.7 years of age at the beginning of the study, and whose hearing loss was not greater than moderate, was tested twice, an average of 5.27 years apart. The tests measured pure-tone thresholds, word recognition in quiet, and understanding of speech with various types of distortion (low-pass filtering, time compression) or interference (single speaker, babble noise, reverberation). Performance declined consistently and significantly between the two testing phases. In addition, the variability of speech understanding measures increased significantly between testing phases, though the variability of audiometric measurements did not. A right-ear superiority was observed but this lateral asymmetry did not increase between testing phases. Comparison of the elderly subjects with a group of young subjects with normal hearing shows that the decline of speech understanding measures accelerated significantly relative to the decline in audiometric measures in the seventh to ninth decades of life. On the assumption that speech understanding depends linearly on age and audiometric variables, there is evidence that this linear relationship changes with age, suggesting that not only the accuracy but also the nature of speech understanding evolves with age.