Results 1 - 20 of 27
1.
Stat Med ; 42(20): 3732-3744, 2023 09 10.
Article in English | MEDLINE | ID: mdl-37312237

ABSTRACT

In clinical and epidemiological research doubly truncated data often appear. This is the case, for instance, when the data registry is formed by interval sampling. Double truncation generally induces a sampling bias on the target variable, so proper corrections of ordinary estimation and inference procedures must be used. Unfortunately, the nonparametric maximum likelihood estimator of a doubly truncated distribution has several drawbacks, like potential nonexistence and nonuniqueness issues, or large estimation variance. Interestingly, no correction for double truncation is needed when the sampling bias is ignorable, which may occur with interval sampling and other sampling designs. In such a case the ordinary empirical distribution function is a consistent and fully efficient estimator that generally brings remarkable variance improvements compared to the nonparametric maximum likelihood estimator. Thus, identification of such situations is critical for the simple and efficient estimation of the target distribution. In this article, we introduce for the first time formal testing procedures for the null hypothesis of ignorable sampling bias with doubly truncated data. The asymptotic properties of the proposed test statistic are investigated. A bootstrap algorithm to approximate the null distribution of the test in practice is introduced. The finite sample performance of the method is studied in simulated scenarios. Finally, applications to data on onset for childhood cancer and Parkinson's disease are given. Variance improvements in estimation are discussed and illustrated.


Subjects
Algorithms , Research Design , Humans , Child , Selection Bias , Likelihood Functions , Computer Simulation , Bias
2.
Biom J ; 62(3): 852-867, 2020 05.
Article in English | MEDLINE | ID: mdl-31919875

ABSTRACT

Registry data typically report incident cases within a certain calendar time interval. Such interval sampling induces double truncation on the incidence times, which may result in an observational bias. In this paper, we introduce nonparametric estimation for the cumulative incidences of competing risks when the incidence time is doubly truncated. Two different estimators are proposed depending on whether the truncation limits are independent of the competing events or not. The asymptotic properties of the estimators are established, and their finite sample performance is investigated through simulations. For illustration purposes, the estimators are applied to childhood cancer registry data, where the target population is peculiarly defined conditional on future cancer development. Then, in our application, the cumulative incidences inform on the distribution by age of the different types of cancer.


Subjects
Biometry/methods , Statistics, Nonparametric , Adult , Age Distribution , Aged , Female , Humans , Incidence , Male , Middle Aged , Neoplasms/epidemiology , Risk , Sample Size
3.
Biom J ; 61(2): 424-441, 2019 03.
Article in English | MEDLINE | ID: mdl-30589104

ABSTRACT

Next-generation sequencing (NGS) experiments are often performed in biomedical research nowadays, leading to methodological challenges related to the high-dimensional and complex nature of the recorded data. In this work we review some of the issues that arise in disorder detection from NGS experiments, that is, when the focus is the detection of deletion and duplication disorders for homozygosity and heterozygosity in DNA sequencing. A statistical model to cope with guanine/cytosine bias and phasing and prephasing phenomena at base level is proposed, and a goodness-of-fit procedure for disorder detection is derived. The method combines the proper evaluation of local p-values (one for each DNA base) with suitable corrections for multiple comparisons and the discrete nature of the p-values. A global test for the detection of disorders in the whole DNA region is proposed too. The performance of the introduced procedures is investigated through simulations. A real data illustration is provided.


Subjects
Biostatistics/methods , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Heterozygote , Homozygote , Models, Statistical , Monte Carlo Method
4.
Biometrics ; 74(4): 1203-1212, 2018 12.
Article in English | MEDLINE | ID: mdl-29603718

ABSTRACT

Nonparametric estimation of the transition probability matrix of a progressive multi-state model is considered under cross-sectional sampling. Two different estimators adapted to possibly right-censored and left-truncated data are proposed. The estimators require full retrospective information before the truncation time, which, when exploited, increases efficiency. They are obtained as differences between two survival functions constructed for sub-samples of subjects occupying specific states at a certain time point. Both estimators correct the oversampling of relatively large survival times by using the left-truncation times associated with the cross-sectional observation. Asymptotic results are established, and finite sample performance is investigated through simulations. One of the proposed estimators performs better when there is no censoring, while the second one is strongly recommended with censored data. The new estimators are applied to data on patients in intensive care units (ICUs).


Subjects
Biometry/methods , Statistics as Topic/methods , Acute Disease/mortality , Acute Disease/therapy , Computer Simulation , Cross-Sectional Studies , Humans , Intensive Care Units , Time Factors
5.
Biometrics ; 74(2): 481-487, 2018 06.
Article in English | MEDLINE | ID: mdl-28886206

ABSTRACT

Doubly truncated data arise when event times are observed only if they fall within subject-specific, possibly random, intervals. While non-parametric methods for survivor function estimation using doubly truncated data have been intensively studied, only a few methods for fitting regression models have been suggested, and only for a limited number of covariates. In this article, we present a method to fit the Cox regression model to doubly truncated data with multiple discrete and continuous covariates, and describe how to implement it using existing software. The approach is used to study the association between candidate single nucleotide polymorphisms and age of onset of Parkinson's disease.


Subjects
Biometry/methods , Parkinson Disease/genetics , Proportional Hazards Models , Age of Onset , Humans , Polymorphism, Single Nucleotide , Probability , Regression Analysis , Software
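
The double truncation described in the record above (event times observed only when they fall within subject-specific intervals) can be illustrated with a small simulation. This is a minimal sketch and not the method of the cited article: the exponential event times, the uniform left truncation limits, the fixed window width, and the function name are all assumptions chosen for illustration.

```python
import random

def simulate_double_truncation(n, window=1.0, seed=0):
    """Draw n event times; keep only those inside their own interval [u, u + window]."""
    rng = random.Random(seed)
    population, observed = [], []
    for _ in range(n):
        t = rng.expovariate(1.0)   # hypothetical event time (mean 1)
        u = rng.uniform(0.0, 2.0)  # hypothetical left truncation limit
        v = u + window             # right truncation limit
        population.append(t)
        if u <= t <= v:            # the subject enters the sample only in this case
            observed.append((u, t, v))
    return population, observed

population, observed = simulate_double_truncation(20000)
population_mean = sum(population) / len(population)
observed_mean = sum(t for _, t, _ in observed) / len(observed)
print(f"kept {len(observed)} of {len(population)} subjects")
print(f"population mean: {population_mean:.3f}, observed mean: {observed_mean:.3f}")
```

Comparing the two printed means gives a rough sense of how much the naive sample mean is affected by the sampling mechanism under these assumed distributions.
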
6.
Stat Med ; 36(12): 1964-1976, 2017 05 30.
Article in English | MEDLINE | ID: mdl-28238225

ABSTRACT

In this work, we present direct regression analysis for the transition probabilities in the possibly non-Markov progressive illness-death model. The method is based on binomial regression, where the response is the indicator of the occupancy for the given state along time. Randomly weighted score equations that are able to remove the bias due to censoring are introduced. By solving these equations, one can estimate the possibly time-varying regression coefficients, which have an immediate interpretation as covariate effects on the transition probabilities. The performance of the proposed estimator is investigated through simulations. We apply the method to data from the Registry of Systemic Lupus Erythematosus RELESSER, a multicenter registry created by the Spanish Society of Rheumatology. Specifically, we investigate the effect of age at Lupus diagnosis, sex, and ethnicity on the probability of damage and death along time. Copyright © 2017 John Wiley & Sons, Ltd.


Subjects
Disease Progression , Models, Statistical , Mortality , Regression Analysis , Age Factors , Bias , Female , Humans , Lupus Erythematosus, Systemic/mortality , Lupus Erythematosus, Systemic/pathology , Male , Middle Aged , Probability , Registries , Risk Assessment , Sex Factors , Survival Analysis
7.
Rheumatology (Oxford) ; 55(7): 1243-50, 2016 07.
Article in English | MEDLINE | ID: mdl-27018057

ABSTRACT

OBJECTIVES: To identify patterns (clusters) of damage manifestations within a large cohort of SLE patients and evaluate the potential association of these clusters with a higher risk of mortality. METHODS: This is a multicentre, descriptive, cross-sectional study of a cohort of 3656 SLE patients from the Spanish Society of Rheumatology Lupus Registry. Organ damage was ascertained using the Systemic Lupus International Collaborating Clinics Damage Index. Using cluster analysis, groups of patients with similar patterns of damage manifestations were identified. Then, overall clusters were compared as well as the subgroup of patients within every cluster with disease duration shorter than 5 years. RESULTS: Three damage clusters were identified. Cluster 1 (80.6% of patients) presented a lower amount of individuals with damage (23.2 vs 100% in clusters 2 and 3, P < 0.001). Cluster 2 (11.4% of patients) was characterized by musculoskeletal damage in all patients. Cluster 3 (8.0% of patients) was the only group with cardiovascular damage, and this was present in all patients. The overall mortality rate of patients in clusters 2 and 3 was higher than that in cluster 1 (P < 0.001 for both comparisons) and in patients with disease duration shorter than 5 years as well. CONCLUSION: In a large cohort of SLE patients, cardiovascular and musculoskeletal damage manifestations were the two dominant forms of damage to sort patients into clinically meaningful clusters. Both in early and late stages of the disease, there was a significant association of these clusters with an increased risk of mortality. Physicians should pay special attention to the early prevention of damage in these two systems.


Subjects
Cardiovascular Diseases/mortality , Lupus Erythematosus, Systemic/complications , Lupus Erythematosus, Systemic/mortality , Musculoskeletal Diseases/mortality , Severity of Illness Index , Adult , Cardiovascular Diseases/etiology , Cluster Analysis , Cross-Sectional Studies , Female , Humans , Lupus Erythematosus, Systemic/pathology , Male , Middle Aged , Musculoskeletal Diseases/etiology , Registries , Spain , Time Factors
8.
Biometrics ; 71(2): 364-75, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25735883

ABSTRACT

Multi-state models are often used for modeling complex event history data. In these models the estimation of the transition probabilities is of particular interest, since they allow for long-term predictions of the process. These quantities have been traditionally estimated by the Aalen-Johansen estimator, which is consistent if the process is Markov. Several non-Markov estimators have been proposed in the recent literature, and their superiority with respect to the Aalen-Johansen estimator has been proved in situations in which the Markov condition is strongly violated. However, the existing estimators have the drawback of requiring that the support of the censoring distribution contains the support of the lifetime distribution, which is not often the case. In this article, we propose two new methods for estimating the transition probabilities in the progressive illness-death model. Some asymptotic results are derived. The proposed estimators are consistent regardless of the Markov condition and the referred assumption about the censoring support. We explore the finite sample behavior of the estimators through simulations. The main conclusion of this piece of research is that the proposed estimators are much more efficient than the existing non-Markov estimators in most cases. An application to a clinical trial on colon cancer is included. Extensions to progressive processes beyond the three-state illness-death model are discussed.


Subjects
Statistics, Nonparametric , Survival Analysis , Algorithms , Biometry , Colonic Neoplasms/mortality , Colonic Neoplasms/surgery , Computer Simulation , Humans , Kaplan-Meier Estimate , Markov Chains , Models, Statistical , Probability , Stochastic Processes
9.
Biom J ; 57(1): 108-22, 2015 Jan.
Article in English | MEDLINE | ID: mdl-25323102

ABSTRACT

In the field of multiple comparison procedures, adjusted p-values are an important tool to evaluate the significance of a test statistic while taking the multiplicity into account. In this paper, we introduce adjusted p-values for the recently proposed Sequential Goodness-of-Fit (SGoF) multiple test procedure by letting the level of the test vary on the unit interval. This extends previous research on the SGoF method, which is a method of high interest when one aims to increase the statistical power in a multiple testing scenario. The adjusted p-value is the smallest level at which the SGoF procedure would still reject the given null hypothesis, while controlling for the multiplicity of tests. The main properties of the adjusted p-values are investigated. In particular, we show that they are a subset of the original p-values, being equal to 1 for p-values above a certain threshold. These are very useful properties from a numerical viewpoint, since they allow for a simplified method to compute the adjusted p-values. We introduce a modification of the SGoF method, termed majorant version, which rejects the null hypotheses with adjusted p-values below the level. This modification rejects more null hypotheses as the level increases, something which is not in general the case for the original SGoF. Adjusted p-values for the conservative version of the SGoF procedure, which estimates the variance without assuming that all the null hypotheses are true, are also included. The situation with ties among the p-values is discussed too. Several real data applications are investigated to illustrate the practical usage of adjusted p-values, ranging from a small to a large number of tests.


Subjects
Biometry/methods , Animals , Child , Environmental Exposure/adverse effects , Gene Expression Profiling , Humans , Lead/adverse effects , Myocardial Infarction/therapy , Mytilus edulis/genetics , Neuropsychology , Normal Distribution
11.
Stat Appl Genet Mol Biol ; 11(3): Article 14, 2012.
Article in English | MEDLINE | ID: mdl-22611594

ABSTRACT

In this paper a correction of the SGoF multitesting method for dependent tests is introduced. The correction is based on the beta-binomial model, and therefore the new method is called Beta-Binomial SGoF (or BB-SGoF). The main properties of the new method are established, and its practical implementation is discussed. BB-SGoF is illustrated through the analysis of two different real data sets on gene/protein expression levels. The performance of the method is investigated through simulations too. One of the main conclusions of the paper is that the SGoF strategy may have much power even in the presence of possible dependences among the tests.


Subjects
Gene Expression Profiling/methods , Models, Statistical , Algorithms , Animals , Computer Simulation , Female , Humans , Male
12.
Biom J ; 55(1): 52-67, 2013 Jan.
Article in English | MEDLINE | ID: mdl-23225621

ABSTRACT

In this paper, we introduce a new estimator of a percentile residual life function with censored data under a monotonicity constraint. Specifically, it is assumed that the percentile residual life is a decreasing function. This assumption is useful when estimating the percentile residual life of units, which degenerate with age. We establish a law of the iterated logarithm for the proposed estimator, and its n-equivalence to the unrestricted estimator. The asymptotic normal distribution of the estimator and its strong approximation to a Gaussian process are also established. We investigate the finite sample performance of the monotone estimator in an extensive simulation study. Finally, data from a clinical trial in primary biliary cirrhosis of the liver are analyzed with the proposed methods. One of the conclusions of our work is that the restricted estimator may be much more efficient than the unrestricted one.


Subjects
Biometry/methods , Survival Analysis , Clinical Trials as Topic , Humans , Liver Cirrhosis, Biliary/epidemiology , Normal Distribution , Stochastic Processes
13.
Stat Med ; 31(30): 4416-27, 2012 Dec 30.
Article in English | MEDLINE | ID: mdl-22975898

ABSTRACT

Multistate models are useful tools for modeling disease progression when survival is the main outcome, but several intermediate events of interest are observed during the follow-up time. The illness-death model is a special multistate model with important applications in the biomedical literature. It provides a suitable representation of the individual's history when a unique intermediate event can be experienced before the main event of interest. Nonparametric estimation of transition probabilities in this and other multistate models is usually performed through the Aalen-Johansen estimator under a Markov assumption. The Markov assumption claims that given the present state, the future evolution of the illness is independent of the states previously visited and the transition times among them. However, this assumption fails in some applications, leading to inconsistent estimates. In this paper, we provide a new approach for testing Markovianity in the illness-death model. The new method is based on measuring the future-past association along time. This results in a detailed inspection of the process, which often reveals a non-Markovian behavior with different trends in the association measure. A test of significance for zero future-past association at each time point is introduced, and a significance trace is proposed accordingly. Besides, we propose a global test for Markovianity based on a supremum-type test statistic. The finite sample performance of the test is investigated through simulations. We illustrate the new method through the analysis of two biomedical data sets.


Subjects
Biometry/methods , Disease Progression , Markov Chains , Monte Carlo Method , Survival Analysis , Bone Marrow Transplantation , Computer Simulation , Humans , Leukemia, Myeloid, Acute/mortality , Leukemia, Myeloid, Acute/therapy , Models, Biological , Multicenter Studies as Topic/statistics & numerical data , Statistics, Nonparametric , Treatment Outcome
14.
Biom J ; 54(2): 163-80, 2012 Mar.
Article in English | MEDLINE | ID: mdl-22522376

ABSTRACT

The three-state progressive model is a special multi-state model with important applications in Survival Analysis. It provides a suitable representation of the individual's history when an intermediate event (with a possible influence on the survival prognosis) is experienced before the main event of interest. Estimation of transition probabilities in this and other multi-state models is usually performed through the Aalen-Johansen estimator. However, Aalen-Johansen may be biased when the underlying process is not Markov. In this paper, we provide a new approach for testing Markovianity in the three-state progressive model. The new method is based on measuring the future-past association along time. This results in a deep inspection of the process that often reveals a non-Markovian behaviour with different trends in the association measure. A test of significance for zero future-past association at each time point is introduced, and a significance trace is proposed accordingly. The finite sample performance of the test is investigated through simulations. We illustrate the new method through real data analysis.


Subjects
Disease Progression , Markov Chains , Models, Statistical , Humans , Survival Analysis , Time Factors
15.
Comput Methods Programs Biomed ; 217: 106694, 2022 Apr.
Article in English | MEDLINE | ID: mdl-35278813

ABSTRACT

BACKGROUND AND OBJECTIVE: Nowadays the "low sample size, large dimension" scenario is often encountered in genetics and in the omic sciences, where the microarray data is typically formed by a large number of possibly dependent small samples. Standard methods to solve the k-sample problem in such a setting are of limited applicability due to lack of theoretical validation for large k, lengthy computational times, missing software solutions, or inability to deal with statistical dependence among the samples. This paper presents the R package Equalden.HD to overcome the referred limitations. METHODS: The package implements several tests for the null hypothesis that a large number of samples follow a common density. These methods are particularly well suited to the "low sample size, large dimension" setting. The implemented procedures allow for dependent samples. For each method Equalden.HD reports, among other things, the standardized value of the test statistic and the corresponding p-value. The package also includes two high-dimensional genetic data sets, Hedenfalk and Rat, which are used in this paper for illustration purposes. RESULTS: The usage of Equalden.HD has been illustrated through the analysis of Hedenfalk and Rat genetic data. Statistical dependence among the samples was found for both genetic data sets. The application of an appropriate k-sample test within Equalden.HD rejected the null hypothesis of inter-samples homogeneity. The methods were used to test for the within groups homogeneity in cluster analysis too, which is usually performed when the k samples are found to be significantly different. Equalden.HD helped to identify the individuals which are responsible for the lack of homogeneity of the samples. The limitations of the standard Kruskal-Wallis test for the identification of homogeneous clusters have been highlighted. 
CONCLUSIONS: The methods implemented by Equalden.HD are the only omnibus nonparametric k-sample tests that have been validated as k grows. Furthermore, the package provides suitable corrections for possibly dependent samples, which is another distinctive feature. Thus, the package opens new doors for the statistical analysis of omic data. Limitations of standard methods (e.g. Anderson-Darling and Kruskal-Wallis) and existing software solutions in the setting with a large k have been emphasized.


Subjects
Software , Animals , Cluster Analysis , Rats , Sample Size
16.
Biom J ; 53(1): 113-27, 2011 Feb.
Article in English | MEDLINE | ID: mdl-21259312

ABSTRACT

Let (T(1), T(2)) be gap times corresponding to two consecutive events, which are observed subject to random right-censoring. In this paper, a semiparametric estimator of the bivariate distribution function of (T(1), T(2)) and, more generally, of a functional E [φ(T(1),T(2))] is proposed. We assume that the probability of censoring for T(2) given the (possibly censored) gap times belongs to a parametric family of binary regression curves. We investigate the conditions under which the introduced estimator is consistent. We explore the finite sample behavior of the estimator and of its bootstrap standard error through simulations. The main conclusion of this paper is that the semiparametric estimator may be much more efficient than purely nonparametric methods. Real data illustration is included.


Subjects
Biometry/methods , Data Interpretation, Statistical , Urinary Bladder Neoplasms/epidemiology , Algorithms , Computer Simulation , Humans , Logistic Models , Probability , Time Factors
17.
BMJ Evid Based Med ; 26(3): 121-126, 2021 Jun.
Article in English | MEDLINE | ID: mdl-31988195

ABSTRACT

When analysing and presenting results of randomised clinical trials, trialists rarely report if or how underlying statistical assumptions were validated. To avoid data-driven biased trial results, it should be common practice to prospectively describe the assessments of underlying assumptions. In existing literature, there is no consensus on how trialists should assess and report underlying assumptions for the analyses of randomised clinical trials. With this study, we developed suggestions on how to test and validate underlying assumptions behind logistic regression, linear regression, and Cox regression when analysing results of randomised clinical trials. Two investigators compiled an initial draft based on a review of the literature. Experienced statisticians and trialists from eight different research centres and trial units then participated in an anonymised consensus process, where we reached agreement on the suggestions presented in this paper. This paper provides detailed suggestions on 1) which underlying statistical assumptions behind logistic regression, multiple linear regression and Cox regression each should be assessed; 2) how these underlying assumptions may be assessed; and 3) what to do if these assumptions are violated. We believe that the validity of randomised clinical trial results will increase if our recommendations for assessing and dealing with violations of the underlying statistical assumptions are followed.


Subjects
Research Design , Humans , Randomized Controlled Trials as Topic
18.
Stat Med ; 29(30): 3147-59, 2010 Dec 30.
Article in English | MEDLINE | ID: mdl-21170909

ABSTRACT

Doubly truncated data are often encountered in the analysis of survival times, when the sample reduces to those individuals with terminating event falling on a given observational window. In this paper we assume that some information about the bivariate distribution function (df) of the truncation times is available. More specifically, we represent this information by means of a parametric model for the joint df of the truncation times. Under this assumption, a new semiparametric estimator of the lifetime df is derived. We obtain asymptotic results for the new estimator, and we show in simulations that it may be more efficient than the Efron-Petrosian nonparametric maximum likelihood estimator. Data on the age at diagnosis of childhood cancer in North Portugal are analyzed with the new method.


Subjects
Data Interpretation, Statistical , Models, Statistical , Survival Analysis , Age Factors , Child , Computer Simulation , Humans , Neuroblastoma/diagnosis , Neuroblastoma/mortality , Portugal
19.
BMC Bioinformatics ; 10: 209, 2009 Jul 08.
Article in English | MEDLINE | ID: mdl-19586526

ABSTRACT

BACKGROUND: The detection of true significant cases under multiple testing is becoming a fundamental issue when analyzing high-dimensional biological data. Unfortunately, known multitest adjustments reduce their statistical power as the number of tests increases. We propose a new multitest adjustment, based on a sequential goodness of fit metatest (SGoF), which increases its statistical power with the number of tests. The method is compared with Bonferroni and FDR-based alternatives by simulating a multitest context via two different kinds of tests: 1) one-sample t-test, and 2) homogeneity G-test. RESULTS: It is shown that SGoF behaves especially well with small sample sizes when 1) the alternative hypothesis is weakly to moderately deviated from the null model, 2) there are widespread effects through the family of tests, and 3) the number of tests is large. CONCLUSION: Therefore, SGoF should become an important tool for multitest adjustment when working with high-dimensional biological data.


Subjects
Computer Simulation , Models, Statistical , Sample Size
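
The idea of a goodness-of-fit metatest over a family of p-values, mentioned in the record above, can be sketched as follows. This is a simplified illustration and not the published SGoF algorithm: it only compares the number of p-values at or below a threshold with its Binomial(n, gamma) distribution under the hypothesis that every null is true. The function names, the default `gamma=0.05`, and the example p-values are assumptions made for this sketch.

```python
from math import comb

def binomial_upper_tail(n, k, prob):
    """Return P(X >= k) for X ~ Binomial(n, prob)."""
    return sum(comb(n, j) * prob**j * (1 - prob)**(n - j) for j in range(k, n + 1))

def metatest(pvalues, gamma=0.05):
    """Count p-values <= gamma and test that count against Binomial(n, gamma)."""
    n = len(pvalues)
    n_small = sum(1 for p in pvalues if p <= gamma)
    return n_small, binomial_upper_tail(n, n_small, gamma)

# Hypothetical family of ten p-values, four of them at or below 0.05.
example_pvalues = [0.001, 0.004, 0.02, 0.03, 0.2, 0.4, 0.5, 0.6, 0.8, 0.9]
n_small, meta_p = metatest(example_pvalues)
print(f"{n_small} p-values <= 0.05, metatest p-value = {meta_p:.5f}")
```

A small metatest p-value suggests more small p-values than chance alone would produce; how many individual hypotheses to reject from that excess is the part specified by the cited method and is not reproduced here.
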
20.
Stat Methods Med Res ; 18(2): 195-222, 2009 Apr.
Article in English | MEDLINE | ID: mdl-18562394

ABSTRACT

The experience of a patient in a survival study may be modelled as a process with two states and one possible transition from an "alive" state to a "dead" state. In some studies, however, the "alive" state may be partitioned into two or more intermediate (transient) states, each of which corresponds to a particular stage of the illness. In such studies, multi-state models can be used to model the movement of patients among the various states. In these models, issues of interest include the estimation of progression rates, assessing the effects of individual risk factors, survival rates or prognostic forecasting. In this article, we review modelling approaches for multi-state models, and we focus on the estimation of quantities such as the transition probabilities and survival probabilities. Differences between these approaches are discussed, focussing on possible advantages and disadvantages for each method. We also review the existing software currently available to fit the various models and present new software developed in the form of an R library to analyse such models. Different approaches and software are illustrated using data from the Stanford heart transplant study and data from a study on breast cancer conducted in Galicia, Spain.


Subjects
Models, Statistical , Biometry , Breast Neoplasms/mortality , Female , Heart Transplantation/mortality , Humans , Longitudinal Studies , Markov Chains , Multivariate Analysis , Neoplasm Recurrence, Local/mortality , Proportional Hazards Models , Regression Analysis , Software , Stochastic Processes , Time Factors