Results 1 - 8 of 8
1.
Biometrics ; 80(1), 2024 Jan 29.
Article in English | MEDLINE | ID: mdl-38465987

ABSTRACT

High-dimensional data sets are often available in genome-enabled predictions. Such data sets include nonlinear relationships with complex dependence structures. For such situations, vine copula-based (quantile) regression is an important tool. However, the current vine copula-based regression approaches do not scale up to high and ultra-high dimensions. To perform high-dimensional sparse vine copula-based regression, we propose two methods. First, we show their superiority regarding computational complexity over the existing methods. Second, we define relevant, irrelevant, and redundant explanatory variables for quantile regression. Then, we show our methods' power in selecting relevant variables and their prediction accuracy in high-dimensional sparse data sets via simulation studies. Next, we apply the proposed methods to high-dimensional real data, aiming at the genomic prediction of maize traits. Some data processing and feature extraction steps for the real data are further discussed. Finally, we show the advantage of our methods over linear models and quantile regression forests in simulation studies and real data applications.
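
To make the idea of copula-based quantile regression concrete, the following is a minimal sketch for the simplest possible case: one predictor and one Gaussian pair copula. It does not reproduce the two methods proposed in the abstract. The function name and the choice of the Gaussian family are assumptions made for illustration; the inputs are assumed to be already transformed to the uniform (copula) scale.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, used by the Gaussian copula


def gaussian_copula_conditional_quantile(u: float, alpha: float, rho: float) -> float:
    """Return the alpha-quantile of V given U = u under a Gaussian pair copula.

    u and alpha lie strictly in (0, 1); rho is the copula correlation in (-1, 1).
    The result is on the uniform scale and still has to be mapped back through
    the inverse marginal distribution of the response.
    """
    z_u = _STD_NORMAL.inv_cdf(u)
    z_alpha = _STD_NORMAL.inv_cdf(alpha)
    # Inverse of the Gaussian copula's conditional distribution (h-function).
    return _STD_NORMAL.cdf(rho * z_u + (1.0 - rho**2) ** 0.5 * z_alpha)
```

With `rho = 0` the predictor carries no information and the function returns `alpha` itself; with positive `rho`, a larger `u` shifts every conditional quantile upward. A vine-based regression chains many such conditional distributions, one per pair copula.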


Subjects
Genome , Genomics , Genomics/methods , Computer Simulation , Linear Models , Phenotype
2.
Sensors (Basel) ; 20(16), 2020 Aug 18.
Article in English | MEDLINE | ID: mdl-32824713

ABSTRACT

Wind has a significant influence on operational flight safety. To quantify the influence of the wind characteristics, a wind series generator is required in simulations. This paper presents a method to model the stochastic wind based on operational flight data using the Karhunen-Loève expansion. The proposed wind model allows us to generate new realizations of wind series that follow the original statistical characteristics. To improve the accuracy of this wind model, a vine copula is used to capture the high-dimensional dependence among the random variables in the expansion. In addition, the proposed stochastic model based on the Karhunen-Loève expansion is compared with the well-known von Karman turbulence model, which is based on the spectral representation. Modeling results for turbulence data validate that the Karhunen-Loève expansion and the spectral representation coincide in the stationary case. Furthermore, construction results for the non-stationary wind process from operational flights show that the statistical characteristics of the generated wind series match those of the raw data well. The proposed stochastic wind model allows us to integrate the new wind series into Monte Carlo simulation for quantitative assessments.
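
As an illustration of how a truncated Karhunen-Loève expansion generates new realizations, here is a minimal sketch under strong simplifying assumptions. In the paper the eigenpairs would be estimated from flight data and the dependence among the random coefficients modeled with a vine copula; the sketch instead uses standard Brownian motion on [0, 1], whose eigenpairs are known in closed form, and independent standard normal coefficients. All function names are hypothetical.

```python
import math
import random


def kl_eigenpair(k: int, t: float) -> tuple[float, float]:
    """Eigenvalue and eigenfunction value at t for Brownian motion on [0, 1]."""
    freq = (k - 0.5) * math.pi
    return 1.0 / freq**2, math.sqrt(2.0) * math.sin(freq * t)


def sample_path(times: list[float], n_terms: int, rng: random.Random) -> list[float]:
    """Draw one realization from the expansion truncated after n_terms modes."""
    # Assumption: independent N(0, 1) coefficients (valid for a Gaussian process).
    xi = [rng.gauss(0.0, 1.0) for _ in range(n_terms)]
    path = []
    for t in times:
        value = 0.0
        for k, coeff in enumerate(xi, start=1):
            eigenvalue, phi = kl_eigenpair(k, t)
            value += math.sqrt(eigenvalue) * coeff * phi
        path.append(value)
    return path


def truncated_variance(t: float, n_terms: int) -> float:
    """Variance at time t captured by the first n_terms modes."""
    total = 0.0
    for k in range(1, n_terms + 1):
        eigenvalue, phi = kl_eigenpair(k, t)
        total += eigenvalue * phi**2
    return total
```

Calling `sample_path([i / 100 for i in range(101)], 50, random.Random(0))` yields one synthetic series; `truncated_variance(1.0, n)` approaches the true variance 1 as `n` grows, which is one way to choose the truncation order.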

3.
Biometrics ; 75(2): 439-451, 2019 Jun.
Article in English | MEDLINE | ID: mdl-30549012

ABSTRACT

In many time-to-event studies, the event of interest is recurrent. Here, the data for each sample unit correspond to a series of gap times between subsequent events. Given a limited follow-up period, the last gap time might be right-censored. In contrast to classical analysis, gap times and censoring times cannot be assumed independent, i.e., the sequential nature of the data induces dependent censoring. Also, the number of recurrences typically varies among sample units, leading to unbalanced data. To model the association pattern between gap times, so far only parametric margins combined with the restrictive class of Archimedean copulas have been considered. Here, taking the specific data features into account, we extend existing work in several directions: we allow for nonparametric margins and consider the flexible class of D-vine copulas. A global and a sequential (one- and two-stage) likelihood approach are suggested. We discuss the computational efficiency of each estimation strategy. Extensive simulations show good finite sample performance of the proposed methodology. It is used to analyze the association of recurrent asthma attacks in children. The analysis reveals that a D-vine copula yields relevant insights into how dependence changes in strength and type over time.
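
The data structure described above can be sketched as follows. This is only an illustration of how gap times and the final censored gap arise from event times and a fixed follow-up period; the function name and the return format are hypothetical and not taken from the paper.

```python
def gap_times_with_censoring(event_times, follow_up):
    """Convert event times into gap times plus a final right-censored gap.

    Returns a list of (gap, observed) pairs. Events after follow_up are
    ignored; the last pair is the censored time from the last observed
    event to the end of follow-up.
    """
    gaps = []
    previous = 0.0
    for time in sorted(event_times):
        if time > follow_up:
            break
        gaps.append((time - previous, True))
        previous = time
    # The censoring time of the last gap equals follow_up minus the sum of
    # the earlier gaps, so it depends on them: induced dependent censoring.
    gaps.append((follow_up - previous, False))
    return gaps
```

For example, `gap_times_with_censoring([2.0, 5.0, 9.0], 10.0)` gives three observed gaps and one censored gap, whereas a unit with a single event in the same follow-up window contributes only two entries, which is the unbalancedness mentioned in the abstract.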


Subjects
Statistical Data Interpretation , Likelihood Functions , Statistical Models , Asthma/pathology , Child , Computer Simulation , Humans , Recurrence , Time Factors
4.
Biometrics ; 74(3): 997-1005, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29569339

ABSTRACT

We propose a model for unbalanced longitudinal data, where the univariate margins can be selected arbitrarily and the dependence structure is described with the help of a D-vine copula. We show that our approach is an extremely flexible extension of the widely used linear mixed model if the correlation is homogeneous over the considered individuals. As an alternative to joint maximum likelihood, a sequential estimation approach for the D-vine copula is provided and validated in a simulation study. The model can handle missing values without being forced to discard data. Since conditional distributions are known analytically, we can easily make predictions for future events. For model selection, we adjust the Bayesian information criterion to our situation. In an application to heart surgery data, our model clearly outperforms competing linear mixed models.


Subjects
Linear Models , Statistical Models , Nonparametric Statistics , Cardiac Surgical Procedures , Computer Simulation , Humans , Likelihood Functions , Research Design
5.
Biometrics ; 71(2): 323-32, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25660495

ABSTRACT

We introduce an extension of R-vine copula models to allow for spatial dependencies and model-based prediction at unobserved locations. The proposed spatial R-vine model combines the flexibility of vine copulas with the classical geostatistical idea of modeling spatial dependencies using the distances between the variable locations. In particular, the model is able to capture non-Gaussian spatial dependencies. To develop and illustrate our approach, we consider daily mean temperature data observed at 54 monitoring stations in Germany. We identify relationships between the vine copula parameters and the station distances and exploit these to reduce the huge number of parameters needed to parametrize a 54-dimensional R-vine model fitted to the data. The new distance-based model parametrization results in a distinct reduction in the number of parameters and makes parameter estimation and prediction at unobserved locations feasible. The prediction capabilities are validated using adequate scoring techniques, showing a better performance of the spatial R-vine copula model compared to a Gaussian spatial model.
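
To give a sense of the scale of the parametrization problem, the short sketch below counts the pair copulas in a regular vine of a given dimension. It relies only on the standard fact that tree i of a d-dimensional regular vine has d - i edges; the function name is hypothetical, and the number of parameters per pair copula depends on the families chosen, which the abstract does not specify.

```python
def pair_copulas_per_tree(dimension: int) -> list[int]:
    """Number of pair copulas (edges) in each tree of a regular vine."""
    # Tree i (i = 1, ..., d - 1) of a d-dimensional regular vine has d - i edges.
    return [dimension - tree for tree in range(1, dimension)]


counts = pair_copulas_per_tree(54)
total_pair_copulas = sum(counts)  # equals d * (d - 1) / 2
```

For the 54 stations this gives 53 trees and 1431 pair copulas, so even one-parameter families would require 1431 parameters without the distance-based reduction.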


Subjects
Climate , Statistical Models , Temperature , Biometry , Germany , Humans , Likelihood Functions , Normal Distribution , Time Factors
6.
Biostatistics ; 11(1): 127-38, 2010 Jan.
Article in English | MEDLINE | ID: mdl-19948742

ABSTRACT

Longitudinal data with binary and ordinal outcomes routinely appear in medical applications. Existing methods are typically designed to deal with short measurement series. In contrast, modern longitudinal data can result in large numbers of subject-specific serial observations. In this framework, we consider multivariate probit models with random effects to capture heterogeneity and autoregressive terms for describing the serial dependence. Since likelihood inference for the proposed class of models is computationally burdensome because of high-dimensional intractable integrals, a pseudolikelihood approach is followed. The methodology is motivated by the analysis of a large longitudinal study on the determinants of migraine severity.


Subjects
Biostatistics/methods , Longitudinal Studies , Statistical Models , Algorithms , Analgesics/therapeutic use , Atmospheric Pressure , Computer Simulation , Educational Status , Humans , Humidity , Likelihood Functions , Medical Records , Migraine Disorders/diagnosis , Migraine Disorders/drug therapy , Migraine Disorders/epidemiology , Multivariate Analysis , Probability , Statistical Distributions , Surveys and Questionnaires , Temperature , Weather
7.
Stat Appl Genet Mol Biol ; 9: Article26, 2010.
Article in English | MEDLINE | ID: mdl-20597852

ABSTRACT

We consider the problem of locating multiple interacting quantitative trait loci (QTL) influencing traits measured in counts. In many applications the distribution of the count variable has a spike at zero. Zero-inflated generalized Poisson regression (ZIGPR) allows for an additional probability mass at zero and hence an improvement in the detection of significant loci. Classical model selection criteria often overestimate the number of QTL. Therefore, modified versions of the Bayesian Information Criterion (mBIC and EBIC) have been used successfully for QTL mapping. We apply these criteria based on ZIGPR as well as simpler models. An extensive simulation study shows their good power in detecting QTL while controlling the false discovery rate. We illustrate how the inability of the Poisson distribution to account for over-dispersion leads to an overestimation of the number of QTL, which strongly discourages its application for identifying factors influencing count data. The proposed method is used to analyze the mouse gallstone data of Lyons et al. (2003). Our results suggest the existence of a novel QTL on chromosome 4 interacting with another QTL previously identified on chromosome 5. We provide the corresponding code in R.
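
The role of the extra probability mass at zero can be seen in a minimal sketch of a zero-inflated generalized Poisson probability mass function. The sketch assumes Consul's (theta, lambda) parametrization of the generalized Poisson distribution, which may differ from the parametrization used in the paper, and it is written in Python rather than the authors' R code. Function and parameter names are hypothetical.

```python
import math


def generalized_poisson_pmf(y: int, theta: float, lam: float) -> float:
    """Generalized Poisson pmf, assuming theta > 0 and 0 <= lam < 1."""
    # Computed on the log scale to avoid overflow for large counts.
    log_pmf = (
        math.log(theta)
        + (y - 1) * math.log(theta + lam * y)
        - theta
        - lam * y
        - math.lgamma(y + 1)
    )
    return math.exp(log_pmf)


def zigp_pmf(y: int, theta: float, lam: float, omega: float) -> float:
    """Zero-inflated version with extra probability mass omega at zero."""
    base = (1.0 - omega) * generalized_poisson_pmf(y, theta, lam)
    return omega + base if y == 0 else base
```

Setting `lam = 0` and `omega = 0` recovers the ordinary Poisson distribution. A positive `lam` makes the variance exceed the mean (over-dispersion), and a positive `omega` adds the spike at zero.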


Subjects
Quantitative Trait Loci , Animals , Bayes Theorem , Chromosomes , Gallstones , Humans , Mice , Poisson Distribution , Probability
8.
Biometrics ; 65(4): 1254-61, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19432783

ABSTRACT

We discuss tools for the evaluation of probabilistic forecasts and the critique of statistical models for count data. Our proposals include a nonrandomized version of the probability integral transform, marginal calibration diagrams, and proper scoring rules, such as the predictive deviance. In case studies, we critique count regression models for patent data, and assess the predictive performance of Bayesian age-period-cohort models for larynx cancer counts in Germany. The toolbox applies in Bayesian or classical and parametric or nonparametric settings and to any type of ordered discrete outcomes.
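
Two of the tools named above can be sketched for the simplest case of Poisson predictive distributions: the nonrandomized probability integral transform (PIT) histogram and a proper scoring rule. The sketch assumes that each forecast is a Poisson distribution with a known mean and uses the logarithmic score as the example scoring rule; all function names are hypothetical and this is not the authors' code.

```python
import math


def poisson_cdf(x: int, mean: float) -> float:
    """P(X <= x) for a Poisson forecast; zero for x < 0."""
    if x < 0:
        return 0.0
    return sum(
        math.exp(k * math.log(mean) - mean - math.lgamma(k + 1)) for k in range(x + 1)
    )


def conditional_pit_cdf(u: float, x: int, mean: float) -> float:
    """Nonrandomized PIT: linear in u between P(X <= x - 1) and P(X <= x)."""
    lower, upper = poisson_cdf(x - 1, mean), poisson_cdf(x, mean)
    if u <= lower:
        return 0.0
    if u >= upper:
        return 1.0
    return (u - lower) / (upper - lower)


def pit_histogram(observations, means, n_bins: int = 10):
    """Bin probabilities of the mean PIT; roughly equal if calibrated."""

    def mean_pit(u):
        pairs = zip(observations, means)
        return sum(conditional_pit_cdf(u, x, m) for x, m in pairs) / len(observations)

    return [mean_pit((j + 1) / n_bins) - mean_pit(j / n_bins) for j in range(n_bins)]


def log_score(x: int, mean: float) -> float:
    """Logarithmic score, -log p(x); smaller values are better."""
    return -(x * math.log(mean) - mean - math.lgamma(x + 1))
```

The bin probabilities returned by `pit_histogram` sum to one. A U-shaped histogram suggests a predictive distribution that is too narrow, and a hump-shaped one suggests that it is too wide. Averaging `log_score` over the observations gives a single number for comparing competing forecasts.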


Subjects
Biometry/methods , Statistical Models , Bayes Theorem , Cohort Studies , Germany/epidemiology , Humans , Laryngeal Neoplasms/epidemiology , Regression Analysis , Nonparametric Statistics