Search | Virtual Health Library (Biblioteca Virtual em Saúde)

1.

Novel community data in ecology - properties and prospects.

Hartig, Florian; Abrego, Nerea; Bush, Alex; Chase, Jonathan M; Guillera-Arroita, Gurutzeta; Leibold, Mathew A; Ovaskainen, Otso; Pellissier, Loïc; Pichler, Maximilian; Poggiato, Giovanni; Pollock, Laura; Si-Moussi, Sara; Thuiller, Wilfried; Viana, Duarte S; Warton, David I; Zurell, Damaris; Yu, Douglas W.

Trends Ecol Evol ; 39(3): 280-293, 2024 03.

Article in English | MEDLINE | ID: mdl-37949795

ABSTRACT

New technologies for monitoring biodiversity such as environmental (e)DNA, passive acoustic monitoring, and optical sensors promise to generate automated spatiotemporal community observations at unprecedented scales and resolutions. Here, we introduce 'novel community data' as an umbrella term for these data. We review the emerging field around novel community data, focusing on new ecological questions that could be addressed; the analytical tools available or needed to make best use of these data; and the potential implications of these developments for policy and conservation. We conclude that novel community data offer many opportunities to advance our understanding of fundamental ecological processes, including community assembly, biotic interactions, micro- and macroevolution, and overall ecosystem functioning.

Subjects

Biodiversity, Ecosystem, DNA, Policy

2.

Human lower leg muscles grow asynchronously.

Chow, Brian V Y; Morgan, Catherine; Rae, Caroline; Warton, David I; Novak, Iona; Davies, Suzanne; Lancaster, Ann; Popovic, Gordana C; Rizzo, Rodrigo R N; Rizzo, Claudia Y; Kyriagis, Maria; Herbert, Robert D; Bolsterlee, Bart.

J Anat ; 244(3): 476-485, 2024 03.

Article in English | MEDLINE | ID: mdl-37917014

ABSTRACT

Muscle volume must increase substantially during childhood growth to generate the power required to propel the growing body. One unresolved but fundamental question about childhood muscle growth is whether muscles grow at equal rates; that is, if muscles grow in synchrony with each other. In this study, we used magnetic resonance imaging (MRI) and advances in artificial intelligence methods (deep learning) for medical image segmentation to investigate whether human lower leg muscles grow in synchrony. Muscle volumes were measured in 10 lower leg muscles in 208 typically developing children (eight infants aged less than 3 months and 200 children aged 5 to 15 years). We tested the hypothesis that human lower leg muscles grow synchronously by investigating whether the volume of individual lower leg muscles, expressed as a proportion of total lower leg muscle volume, remains constant with age. There were substantial age-related changes in the relative volume of most muscles in both boys and girls (p < 0.001). This was most evident between birth and five years of age but was still evident after five years. The medial gastrocnemius and soleus muscles, the largest muscles in infancy, grew faster than other muscles in the first five years. The findings demonstrate that muscles in the human lower leg grow asynchronously. This finding may assist early detection of atypical growth and allow targeted muscle-specific interventions to improve the quality of life, particularly for children with neuromotor conditions such as cerebral palsy.

Subjects

Artificial Intelligence, Leg, Male, Child, Female, Humans, Preschool Child, Quality of Life, Skeletal Muscle/pathology, Lower Extremity, Magnetic Resonance Imaging/methods

3.

A general algorithm for error-in-variables regression modelling using Monte Carlo expectation maximization.

Stoklosa, Jakub; Hwang, Wen-Han; Warton, David I.

PLoS One ; 18(4): e0283798, 2023.

Article in English | MEDLINE | ID: mdl-37011065

ABSTRACT

In regression modelling, measurement error models are often needed to correct for uncertainty arising from measurements of covariates/predictor variables. The literature on measurement error (or errors-in-variables) modelling is plentiful; however, general algorithms and software for maximum likelihood estimation of models with measurement error are not as readily available in a form that applied researchers can use without relatively advanced statistical expertise. In this study, we develop a novel algorithm for measurement error modelling, which could in principle take any regression model fitted by maximum likelihood, or penalised likelihood, and extend it to account for uncertainty in covariates. This is achieved by exploiting an interesting property of the Monte Carlo Expectation-Maximization (MCEM) algorithm, namely that it can be expressed as an iteratively reweighted maximisation of complete data likelihoods (formed by imputing the missing values). Thus we can take any regression model for which we have an algorithm for (penalised) likelihood estimation when covariates are error-free, nest it within our proposed iteratively reweighted MCEM algorithm, and thus account for uncertainty in covariates. The approach is demonstrated on examples involving generalized linear models, point process models, generalized additive models and capture-recapture models. Because the proposed method uses maximum (penalised) likelihood, it inherits advantageous optimality and inferential properties, as illustrated by simulation. We also study the robustness of the model to some violations of predictor distributional assumptions. Software is provided as the refitME package for R, whose key function behaves like a refit() function, taking a fitted regression model object and re-fitting with a pre-specified amount of measurement error.

Subjects

Algorithms, Motivation, Likelihood Functions, Linear Models, Computer Simulation, Monte Carlo Method, Statistical Models

4.

Leaf economics fundamentals explained by optimality principles.

Wang, Han; Prentice, I Colin; Wright, Ian J; Warton, David I; Qiao, Shengchao; Xu, Xiangtao; Zhou, Jian; Kikuzawa, Kihachiro; Stenseth, Nils Chr.

Sci Adv ; 9(3): eadd5667, 2023 Jan 18.

Article in English | MEDLINE | ID: mdl-36652527

ABSTRACT

The life span of leaves increases with their mass per unit area (LMA). It is unclear why. Here, we show that this empirical generalization (the foundation of the worldwide leaf economics spectrum) is a consequence of natural selection, maximizing average net carbon gain over the leaf life cycle. Analyzing two large leaf trait datasets, we show that evergreen and deciduous species with diverse construction costs (assumed proportional to LMA) are selected by light, temperature, and growing-season length in different, but predictable, ways. We quantitatively explain the observed divergent latitudinal trends in evergreen and deciduous LMA and show how local distributions of LMA arise by selection under different environmental conditions acting on the species pool. These results illustrate how optimality principles can underpin a new theory for plant geography and terrestrial carbon dynamics.

5.

Generalized Matrix Factorization: efficient algorithms for fitting generalized linear latent variable models to large data arrays.

Kidzinski, Lukasz; Hui, Francis K C; Warton, David I; Hastie, Trevor.

J Mach Learn Res ; 23, 2022 Nov.

Article in English | MEDLINE | ID: mdl-37102181

ABSTRACT

Unmeasured or latent variables are often the cause of correlations between multivariate measurements, which are studied in a variety of fields such as psychology, ecology, and medicine. For Gaussian measurements, there are classical tools such as factor analysis or principal component analysis with a well-established theory and fast algorithms. Generalized Linear Latent Variable models (GLLVMs) generalize such factor models to non-Gaussian responses. However, current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets with thousands of observational units or responses. In this article, we propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood and then using a Newton method and Fisher scoring to learn the model parameters. Computationally, our method is noticeably faster and more stable, enabling GLLVM fits to much larger matrices than previously possible. We apply our method on a dataset of 48,000 observational units with over 2,000 observed species in each unit and find that most of the variability can be explained with a handful of factors. We publish an easy-to-use implementation of our proposed fitting algorithm.

6.

Selecting the model for multiple imputation of missing data: Just use an IC!

Noghrehchi, Firouzeh; Stoklosa, Jakub; Penev, Spiridon; Warton, David I.

Stat Med ; 40(10): 2467-2497, 2021 05 10.

Article in English | MEDLINE | ID: mdl-33629367

ABSTRACT

Multiple imputation and maximum likelihood estimation (via the expectation-maximization algorithm) are two well-known methods readily used for analyzing data with missing values. While these two methods are often considered as being distinct from one another, multiple imputation (when using improper imputation) is actually equivalent to a stochastic expectation-maximization approximation to the likelihood. In this article, we exploit this key result to show that familiar likelihood-based approaches to model selection, such as Akaike's information criterion (AIC) and the Bayesian information criterion (BIC), can be used to choose the imputation model that best fits the observed data. Poor choice of imputation model is known to bias inference, and while sensitivity analysis has often been used to explore the implications of different imputation models, we show that the data can be used to choose an appropriate imputation model via conventional model selection tools. We show that BIC can be consistent for selecting the correct imputation model in the presence of missing data. We verify these results empirically through simulation studies, and demonstrate their practicality on two classical missing data examples. An interesting result we saw in simulations was that not only can parameter estimates be biased by misspecifying the imputation model, but also by overfitting the imputation model. This emphasizes the importance of using model selection not just to choose the appropriate type of imputation model, but also to decide on the appropriate level of imputation model complexity.

Subjects

Algorithms, Bayes Theorem, Bias, Computer Simulation, Humans, Likelihood Functions

7.

Modeling recreational fishing intensity in a complex urbanised estuary.

Griffin, Kingsley J; Hedge, Luke H; Warton, David I; Astles, Karen L; Johnston, Emma L.

J Environ Manage ; 279: 111529, 2021 Feb 01.

Article in English | MEDLINE | ID: mdl-33246754

ABSTRACT

Urbanised estuaries, ports and harbours are often utilised for recreational purposes, notably recreational angling. Yet there has been little quantitative assessment of the footprint and intensity of these activities at scales suitable for spatial management. Urban and industrialised estuaries have previously been considered as having low conservation value, perhaps due to issues with contamination and disturbance. Studies in recent decades have demonstrated that many of these systems are still highly biodiverse and of high value to local residents. As a response, urbanised estuaries are now being considered by coastal spatial management initiatives, where assessments of recreational use in these areas can help avoid 'user-environmental' and 'user-user' conflict. The models of these activities need to be developed at a scale relevant to governments and regulatory authorities, but the few human-use models that do exist integrate fishing intensity to a regional or even continental scale, which is too large to capture the fine-scale variation inherent in complex urban fisheries. Species Distribution Modeling (SDM) is a tool commonly used to assess drivers of species range, but it can also be applied to models of recreational fishing in complex environments, at a scale relevant to regulatory bodies. Using point-data from 573 visual surveys with recently developed Poisson point process models, we examine the recreational fishery in Australia's busiest estuarine port, Sydney Harbour. We demonstrate the utility of these models for understanding the distribution of boat and shore-based fishers, and the effects of a range of temporally static (geographical) and dynamic (weather) predictors on these distributions.

Subjects

Conservation of Natural Resources, Estuaries, Biodiversity, Fisheries, Humans, Recreation

8.

Efficient estimation of generalized linear latent variable models.

Niku, Jenni; Brooks, Wesley; Herliansyah, Riki; Hui, Francis K C; Taskinen, Sara; Warton, David I.

PLoS One ; 14(5): e0216129, 2019.

Article in English | MEDLINE | ID: mdl-31042745

ABSTRACT

Generalized linear latent variable models (GLLVM) are popular tools for modeling multivariate, correlated responses. Such data are often encountered, for instance, in ecological studies, where presence-absences, counts, or biomass of interacting species are collected from a set of sites. Until very recently, the main challenge in fitting GLLVMs has been the lack of computationally efficient estimation methods. For likelihood based estimation, several closed form approximations for the marginal likelihood of GLLVMs have been proposed, but their efficient implementations have been lacking in the literature. To fill this gap, we show in this paper how to obtain computationally convenient estimation algorithms based on a combination of either the Laplace approximation method or variational approximation method, and automatic optimization techniques implemented in R software. An extensive set of simulation studies is used to assess the performances of different methods, from which it is shown that the variational approximation method used in conjunction with automatic optimization offers a powerful tool for estimation.

Subjects

Linear Models, Multivariate Analysis, Algorithms, Computer Simulation, Statistical Data Interpretation, Likelihood Functions, Software

9.

Order selection and sparsity in latent variable models via the ordered factor LASSO.

Hui, Francis K C; Tanaka, Emi; Warton, David I.

Biometrics ; 74(4): 1311-1319, 2018 12.

Article in English | MEDLINE | ID: mdl-29750847

ABSTRACT

Generalized linear latent variable models (GLLVMs) offer a general framework for flexibly analyzing data involving multiple responses. When fitting such models, two of the major challenges are selecting the order, that is, the number of factors, and an appropriate structure for the loading matrix, typically a sparse structure. Motivated by the application of GLLVMs to study marine species assemblages in the Southern Ocean, we propose the Ordered Factor LASSO or OFAL penalty for order selection and achieving sparsity in GLLVMs. The OFAL penalty is the first penalty developed specifically for order selection in latent variable models, and achieves this by using a hierarchically structured group LASSO type penalty to shrink entire columns of the loading matrix to zero, while ensuring that non-zero loadings are concentrated on the lower-order factors. Simultaneously, individual element sparsity is achieved through the use of an adaptive LASSO. In conjunction with using an information criterion which promotes aggressive shrinkage, simulation shows that the OFAL penalty performs strongly compared with standard methods and penalties for order selection, achieving sparsity, and prediction in GLLVMs. Applying the OFAL penalty to the Southern Ocean marine species dataset suggests the available environmental predictors explain roughly half of the total covariation between species, thus leading to a smaller number of latent variables and increased sparsity in the loading matrix compared to a model without any covariates.

Subjects

Biometry/methods, Factor Analysis, Animals, Aquatic Organisms, Computer Simulation/statistics & numerical data, Likelihood Functions, Oceans and Seas

10.

Why you cannot transform your way out of trouble for small counts.

Warton, David I.

Biometrics ; 74(1): 362-368, 2018 03.

Article in English | MEDLINE | ID: mdl-28504821

ABSTRACT

While data transformation is a common strategy to satisfy linear modeling assumptions, a theoretical result is used to show that transformation cannot reasonably be expected to stabilize variances for small counts. Under broad assumptions, as counts get smaller, it is shown that the variance becomes proportional to the mean under monotonic transformations g(·) that satisfy g(0)=0, excepting a few pathological cases. A suggested rule-of-thumb is that if many predicted counts are less than one then data transformation cannot reasonably be expected to stabilize variances, even for a well-chosen transformation. This result has clear implications for the analysis of counts as often implemented in the applied sciences, but particularly for multivariate analysis in ecology. Multivariate discrete data are often collected in ecology, typically with a large proportion of zeros, and it is currently widespread to use methods of analysis that do not account for differences in variance across observations nor across responses. Simulations demonstrate that failure to account for the mean-variance relationship can have particularly severe consequences in this context, and also in the univariate context if the sampling design is unbalanced.

Subjects

Statistical Data Interpretation, Statistical Models, Ecology, Linear Models, Multivariate Analysis
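
The small-count result summarised in the abstract of entry 10 can be checked numerically. The sketch below is an illustration only, under assumptions that are not taken from the paper: counts are Poisson, and the transformation is g(y) = log(1 + y), which satisfies g(0) = 0. It computes Var[g(Y)] exactly by summing over the Poisson probabilities, then reports the ratio of that variance to the mean. As the mean shrinks, the ratio settles near g(1)^2 = (log 2)^2, so the variance is roughly proportional to the mean rather than stabilised.

```python
import math


def poisson_pmf(k, mu):
    """P(Y = k) for Y ~ Poisson(mu), evaluated on the log scale for stability."""
    return math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))


def transformed_variance(g, mu, k_max=200):
    """Exact Var[g(Y)] for Y ~ Poisson(mu), truncating the sum at k_max."""
    probs = [poisson_pmf(k, mu) for k in range(k_max + 1)]
    mean_g = sum(p * g(k) for k, p in enumerate(probs))
    return sum(p * (g(k) - mean_g) ** 2 for k, p in enumerate(probs))


def log1p_transform(y):
    """Assumed example transformation with g(0) = 0."""
    return math.log(1.0 + y)


def variance_to_mean_ratios(means):
    """Return (mean, Var[g(Y)] / mean) pairs for each Poisson mean."""
    return [(mu, transformed_variance(log1p_transform, mu) / mu) for mu in means]


if __name__ == "__main__":
    print("limit (log 2)^2 =", round(math.log(2.0) ** 2, 4))
    for mu, ratio in variance_to_mean_ratios([10.0, 1.0, 0.1, 0.01, 0.001]):
        print(f"mean={mu:<6} Var[g(Y)]/mean={ratio:.4f}")
```

The Poisson assumption and the choice of log(1 + y) are for illustration; the abstract states the result for a broad class of monotonic transformations.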

11.

The PIT-trap - A "model-free" bootstrap procedure for inference about regression models with discrete, multivariate responses.

Warton, David I; Thibaut, Loïc; Wang, Yi Alice.

PLoS One ; 12(7): e0181790, 2017.

Article in English | MEDLINE | ID: mdl-28738071

ABSTRACT

Bootstrap methods are widely used in statistics, and bootstrapping of residuals can be especially useful in the regression context. However, difficulties are encountered extending residual resampling to regression settings where residuals are not identically distributed (thus not amenable to bootstrapping); common examples include logistic or Poisson regression and generalizations to handle clustered or multivariate data, such as generalised estimating equations. We propose a bootstrap method based on probability integral transform (PIT-) residuals, which we call the PIT-trap, which assumes data come from some marginal distribution F of known parametric form. This method can be understood as a type of "model-free bootstrap", adapted to the problem of discrete and highly multivariate data. PIT-residuals have the key property that they are (asymptotically) pivotal. The PIT-trap thus inherits the key property, not afforded by any other residual resampling approach, that the marginal distribution of data can be preserved under PIT-trapping. This in turn enables the derivation of some standard bootstrap properties, including second-order correctness of pivotal PIT-trap test statistics. In multivariate data, bootstrapping rows of PIT-residuals affords the property that it preserves correlation in data without the need for it to be modelled, a key point of difference as compared to a parametric bootstrap. The proposed method is illustrated on an example involving multivariate abundance data in ecology, and demonstrated via simulation to have improved properties as compared to competing resampling methods.

Subjects

Statistical Models, Multivariate Analysis, Statistical Distributions, Computer Simulation, Ecology/methods, Probability, Research Design
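
The row-resampling idea in the abstract of entry 11 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the marginal distribution F is taken to be Poisson with already-fitted means `mu`, and PIT-residuals for discrete counts are drawn uniformly between F(y - 1) and F(y), a common randomised construction that the abstract itself does not spell out. All function names are hypothetical. Whole rows of residuals are resampled together, so any correlation between responses within a row is carried along without being modelled.

```python
import math
import random


def poisson_cdf(k, mu):
    """P(Y <= k) for Y ~ Poisson(mu); zero for k < 0."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for j in range(1, k + 1):
        term *= mu / j
        total += term
    return min(total, 1.0)


def poisson_quantile(u, mu, k_cap=10_000):
    """Smallest k with P(Y <= k) >= u (k_cap guards against u at or near 1)."""
    k = 0
    term = math.exp(-mu)
    total = term
    while total < u and k < k_cap:
        k += 1
        term *= mu / k
        total += term
    return k


def pit_residuals(y, mu, rng):
    """Randomised PIT residuals: uniform draws between F(y - 1) and F(y)."""
    residuals = []
    for y_row, mu_row in zip(y, mu):
        row = []
        for y_ij, mu_ij in zip(y_row, mu_row):
            lower = poisson_cdf(y_ij - 1, mu_ij)
            upper = poisson_cdf(y_ij, mu_ij)
            row.append(lower + rng.random() * (upper - lower))
        residuals.append(row)
    return residuals


def pit_trap_resample(y, mu, rng):
    """One bootstrap data set: resample residual rows, then invert F at each mean."""
    u = pit_residuals(y, mu, rng)
    n_rows = len(y)
    picked = [rng.randrange(n_rows) for _ in range(n_rows)]
    return [
        [poisson_quantile(u[source][j], mu_ij) for j, mu_ij in enumerate(mu[i])]
        for i, source in enumerate(picked)
    ]


if __name__ == "__main__":
    rng = random.Random(1)
    counts = [[0, 3, 1], [2, 0, 5], [1, 1, 0], [4, 2, 2]]      # made-up data
    means = [[0.5, 2.0, 1.0], [1.5, 0.8, 4.0], [1.0, 1.2, 0.4], [3.0, 2.5, 2.0]]
    print(pit_trap_resample(counts, means, rng))
```

In practice the means would come from a fitted regression model and the model would be refitted to each resampled data set; both steps are omitted here.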

12.

Extending Joint Models in Community Ecology: A Response to Beissinger et al.

Warton, David I; Blanchet, F Guillaume; O'Hara, Robert; Ovaskainen, Otso; Taskinen, Sara; Walker, Steven C; Hui, Francis K C.

Trends Ecol Evol ; 31(10): 737-738, 2016 10.

Article in English | MEDLINE | ID: mdl-27515225

Subjects

Ecology

13.

So Many Variables: Joint Modeling in Community Ecology.

Warton, David I; Blanchet, F Guillaume; O'Hara, Robert B; Ovaskainen, Otso; Taskinen, Sara; Walker, Steven C; Hui, Francis K C.

Trends Ecol Evol ; 30(12): 766-779, 2015 Dec.

Article in English | MEDLINE | ID: mdl-26519235

ABSTRACT

Technological advances have enabled a new class of multivariate models for ecology, with the potential now to specify a statistical model for abundances jointly across many taxa, to simultaneously explore interactions across taxa and the response of abundance to environmental variables. Joint models can be used for several purposes of interest to ecologists, including estimating patterns of residual correlation across taxa, ordination, multivariate inference about environmental effects and environment-by-trait interactions, accounting for missing predictors, and improving predictions in situations where one can leverage knowledge of some species to predict others. We demonstrate this by example and discuss recent computation tools and future directions.

Subjects

Biota, Statistical Models, Ecosystem, Linear Models

14.

Fast forward selection for generalized estimating equations with a large number of predictor variables.

Stoklosa, Jakub; Gibb, Heloise; Warton, David I.

Biometrics ; 70(1): 110-20, 2014 Mar.

Article in English | MEDLINE | ID: mdl-24350717

ABSTRACT

We propose a new variable selection criterion designed for use with forward selection algorithms: the score information criterion (SIC). The proposed criterion is based on score statistics which incorporate correlated response data. The main advantage of the SIC is that it is much faster to compute than existing model selection criteria when the number of predictor variables added to a model is large; this is because SIC can be computed for all candidate models without actually fitting them. A second advantage is that it incorporates the correlation between variables into its quasi-likelihood, leading to more desirable properties than competing selection criteria. Consistency and prediction properties are shown for the SIC. We conduct simulation studies to evaluate the selection and prediction performances, and compare these, as well as computational times, with some well-known variable selection criteria. We apply the SIC to a real data set collected on arthropods by considering variable selection on a large number of interaction terms consisting of species traits and environmental covariates.

Subjects

Algorithms, Statistical Data Interpretation, Likelihood Functions, Longitudinal Studies/methods, Statistical Models, Animals, Arthropods/growth & development, Australia, Computer Simulation, Ecosystem

15.

Model-based control of observer bias for the analysis of presence-only data in ecology.

Warton, David I; Renner, Ian W; Ramp, Daniel.

PLoS One ; 8(11): e79168, 2013.

Article in English | MEDLINE | ID: mdl-24260167

ABSTRACT

Presence-only data, where information is available concerning species presence but not species absence, are subject to bias due to observers being more likely to visit and record sightings at some locations than others (hereafter "observer bias"). In this paper, we describe and evaluate a model-based approach to accounting for observer bias directly, by modelling presence locations as a function of known observer bias variables (such as accessibility variables) in addition to environmental variables, then conditioning on a common level of bias to make predictions of species occurrence free of such observer bias. We implement this idea using point process models with a LASSO penalty, a new presence-only method related to maximum entropy modelling, that implicitly addresses the "pseudo-absence problem" of where to locate pseudo-absences (and how many). The proposed method of bias-correction is evaluated using systematically collected presence/absence data for 62 plant species endemic to the Blue Mountains near Sydney, Australia. It is shown that modelling and controlling for observer bias significantly improves the accuracy of predictions made using presence-only data, and usually improves predictions as compared to pseudo-absence or "inventory" methods of bias correction based on absences from non-target species. Future research will consider the potential for improving the proposed bias-correction approach by estimating the observer bias simultaneously across multiple species.

Subjects

Ecosystem, Biological Models, Plant Physiological Phenomena, Plants, Australia, Observer Variation

16.

To mix or not to mix: comparing the predictive performance of mixture models vs. separate species distribution models.

Hui, Francis K C; Warton, David I; Foster, Scott D; Dunstan, Piers K.

Ecology ; 94(9): 1913-9, 2013 Sep.

Article in English | MEDLINE | ID: mdl-24279262

ABSTRACT

Species distribution models (SDMs) are an important tool for studying the patterns of species across environmental and geographic space. For community data, a common approach involves fitting an SDM to each species separately, although the large number of models makes interpretation difficult and fails to exploit any similarities between individual species responses. A recently proposed alternative that can potentially overcome these difficulties is species archetype models (SAMs), a model-based approach that clusters species based on their environmental response. In this paper, we compare the predictive performance of SAMs against separate SDMs using a number of multi-species data sets. Results show that SAMs improve model accuracy and discriminatory capacity compared to separate SDMs. This is achieved by borrowing strength from common species having higher information content. Moreover, the improvement increases as the species become rarer.

Subjects

Biological Models, Animals, Computer Simulation, Demography, Species Specificity, Temperature

17.

Are introduced species better dispersers than native species? A global comparative study of seed dispersal distance.

Flores-Moreno, Habacuc; Thomson, Fiona J; Warton, David I; Moles, Angela T.

PLoS One ; 8(6): e68541, 2013.

Article in English | MEDLINE | ID: mdl-23818991

ABSTRACT

We provide the first global test of the idea that introduced species have greater seed dispersal distances than do native species, using data for 51 introduced and 360 native species from the global literature. Counter to our expectations, there was no significant difference in mean or maximum dispersal distance between introduced and native species. Next, we asked whether differences in dispersal distance might have been obscured by differences in seed mass, plant height and dispersal syndrome, all traits that affect dispersal distance and which can differ between native and introduced species. When we included all three variables in the model, there was no clear difference in dispersal distance between introduced and native species. These results remained consistent when we performed analyses including a random effect for site. Analyses also showed that the lack of a significant difference in dispersal distance was not due to differences in biome, taxonomic composition, growth form, nitrogen fixation, our inclusion of non-invasive introduced species, or our exclusion of species with human-assisted dispersal. Thus, if introduced species do have higher spread rates, it seems likely that these are driven by differences in post-dispersal processes such as germination, seedling survival, and survival to reproduction.

Subjects

Introduced Species, Seed Dispersal/physiology, Seedlings/physiology, Seeds/physiology, Animals, Ecosystem, Germination/physiology, Magnoliopsida/classification, Magnoliopsida/growth & development, Magnoliopsida/physiology, Biological Models, Population Dynamics, Seedlings/growth & development, Seeds/growth & development

18.

Equivalence of MAXENT and Poisson point process models for species distribution modeling in ecology.

Renner, Ian W; Warton, David I.

Biometrics ; 69(1): 274-81, 2013 Mar.

Article in English | MEDLINE | ID: mdl-23379623

ABSTRACT

Modeling the spatial distribution of a species is a fundamental problem in ecology. A number of modeling methods have been developed, an extremely popular one being MAXENT, a maximum entropy modeling approach. In this article, we show that MAXENT is equivalent to a Poisson regression model and hence is related to a Poisson point process model, differing only in the intercept term, which is scale-dependent in MAXENT. We illustrate a number of improvements to MAXENT that follow from these relations. In particular, a point process model approach facilitates methods for choosing the appropriate spatial resolution, assessing model adequacy, and choosing the LASSO penalty parameter, all currently unavailable to MAXENT. The equivalence result represents a significant step in the unification of the species distribution modeling literature.

Subjects

Ecology/methods, Ecosystem, Biological Models, Statistical Models, Animals, Eucalyptus/growth & development, New South Wales, Software
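
The equivalence stated in the abstract of entry 18 can be illustrated on a toy problem. The sketch below makes several simplifying assumptions that are not from the paper: a single covariate, presence counts per grid cell, no LASSO penalty, and made-up data. The MAXENT-style fit maximises the sum of n_i log p_i with p_i proportional to exp(b x_i); the Poisson regression fits n_i ~ Poisson(exp(a + b x_i)). Both are solved by Newton's method. Under these assumptions the two slopes coincide, and the Poisson intercept equals log(N) minus the log of the MAXENT normalising constant, so the fits differ only in the intercept.

```python
import math


def fit_maxent_slope(x, counts, iterations=50):
    """Slope b maximising sum_i n_i * log p_i, p_i = exp(b*x_i) / sum_j exp(b*x_j)."""
    total = sum(counts)
    sum_nx = sum(n * xi for n, xi in zip(counts, x))
    b = 0.0
    for _ in range(iterations):
        weights = [math.exp(b * xi) for xi in x]
        z = sum(weights)
        mean_x = sum(w * xi for w, xi in zip(weights, x)) / z
        var_x = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x)) / z
        gradient = sum_nx - total * mean_x
        b += gradient / (total * var_x)  # Newton step for a concave objective
    return b


def fit_poisson_regression(x, counts, iterations=50):
    """Intercept a and slope b for n_i ~ Poisson(exp(a + b*x_i)), by Newton's method."""
    a = math.log(sum(counts) / len(counts))
    b = 0.0
    for _ in range(iterations):
        mu = [math.exp(a + b * xi) for xi in x]
        s0 = sum(n - m for n, m in zip(counts, mu))
        s1 = sum(xi * (n - m) for xi, n, m in zip(x, counts, mu))
        i00 = sum(mu)
        i01 = sum(xi * m for xi, m in zip(x, mu))
        i11 = sum(xi * xi * m for xi, m in zip(x, mu))
        det = i00 * i11 - i01 * i01
        a += (i11 * s0 - i01 * s1) / det  # solve the 2x2 Newton system directly
        b += (i00 * s1 - i01 * s0) / det
    return a, b


def implied_intercept(x, counts, b):
    """Poisson intercept implied by a MAXENT slope: log N - log sum_j exp(b*x_j)."""
    return math.log(sum(counts)) - math.log(sum(math.exp(b * xi) for xi in x))


# Made-up example: covariate value and presence count for six grid cells.
X = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
COUNTS = [0, 1, 1, 3, 4, 9]

if __name__ == "__main__":
    b_maxent = fit_maxent_slope(X, COUNTS)
    a_pois, b_pois = fit_poisson_regression(X, COUNTS)
    print("MAXENT slope:     ", b_maxent)
    print("Poisson slope:    ", b_pois)
    print("Poisson intercept:", a_pois)
    print("implied intercept:", implied_intercept(X, COUNTS, b_maxent))
```

Changing the number of grid cells changes the normalising constant and hence the implied intercept, which is one way to read the abstract's remark that the MAXENT intercept is scale-dependent.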

19.

Robust estimation and inference for bivariate line-fitting in allometry.

Taskinen, Sara; Warton, David I.

Biom J ; 53(4): 652-72, 2011 Jul.

Article in English | MEDLINE | ID: mdl-21681982

ABSTRACT

In allometry, bivariate techniques related to principal component analysis are often used in place of linear regression, and primary interest is in making inferences about the slope. We demonstrate that the current inferential methods are not robust to bivariate contamination, and consider four robust alternatives to the current methods: a novel sandwich estimator approach, using robust covariance matrices derived via an influence function approach, Huber's M-estimator and the fast-and-robust bootstrap. Simulations demonstrate that Huber's M-estimators are highly efficient and robust against bivariate contamination, and when combined with the fast-and-robust bootstrap, we can make accurate inferences even from small samples.

Subjects

Biostatistics/methods, Analysis of Variance, Body Size, Probability

20.

The arcsine is asinine: the analysis of proportions in ecology.

Warton, David I; Hui, Francis K C.

Ecology ; 92(1): 3-10, 2011 Jan.

Article in English | MEDLINE | ID: mdl-21560670

ABSTRACT

The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.

Subjects

Ecology/methods, Ecosystem, Biological Models, Statistics as Topic, Computer Simulation
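
The contrast between the two transformations in the abstract of entry 20 can be made concrete with a short Python sketch. It is an illustration of one way predictions on the arcsine scale can become nonsensical, which is an assumption about what the abstract refers to rather than a reproduction of the paper's examples. The arcsine square root maps proportions onto the interval from 0 to pi/2, so a linear model can predict values outside that interval; back-transforming such a value with sin^2 gives a proportion that moves in the wrong direction. The logit maps proportions onto the whole real line, so every prediction back-transforms monotonically to a value strictly between 0 and 1.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def inverse_arcsine_sqrt(z):
    """Back-transform; only monotonic for z between 0 and pi/2."""
    return math.sin(z) ** 2


def logit(p):
    """Logit transform of a proportion p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def inverse_logit(z):
    """Back-transform from the logit scale, written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


if __name__ == "__main__":
    # Hypothetical linear-model predictions on each transformed scale.
    for z in (1.2, math.pi / 2, 1.8):
        print(f"arcsine scale {z:.3f} -> proportion {inverse_arcsine_sqrt(z):.4f}")
    for z in (-5.0, 0.0, 5.0):
        print(f"logit scale   {z:+.1f}  -> proportion {inverse_logit(z):.4f}")
```

The logit is undefined at exactly 0 or 1, so observed proportions at those boundaries need separate handling, which this sketch does not attempt.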
