Results 1 - 20 of 105
1.
Bioinformatics ; 40(1)2024 01 02.
Article in English | MEDLINE | ID: mdl-38134422

ABSTRACT

SUMMARY: The SOHPIE R package implements a novel functionality for "multivariable" differential co-abundance network (DN, hereafter) analyses of microbiome data. It incorporates a regression approach that adjusts for additional covariates for DN analyses. This distinguishes it from previous prominent approaches in DN analyses such as MDiNE and NetCoMi, which do not feature a covariate adjustment when finding taxa that are differentially connected (DC, hereafter) between individuals with different clinical and phenotypic characteristics. AVAILABILITY AND IMPLEMENTATION: SOHPIE with a vignette is available on the CRAN repository https://CRAN.R-project.org/package=SOHPIE and published under the General Public License (GPL) version 3.


Subject(s)
Microbiota , Software , Humans
2.
BMC Bioinformatics ; 25(1): 117, 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38500042

ABSTRACT

BACKGROUND: A recent breakthrough in differential network (DN) analysis of microbiome data has been realized with the advent of next-generation sequencing technologies. The DN analysis disentangles the microbial co-abundance among taxa by comparing the network properties between two or more graphs under different biological conditions. However, the existing methods to the DN analysis for microbiome data do not adjust for other clinical differences between subjects. RESULTS: We propose a Statistical Approach via Pseudo-value Information and Estimation for Differential Network Analysis (SOHPIE-DNA) that incorporates additional covariates such as continuous age and categorical BMI. SOHPIE-DNA is a regression technique adopting jackknife pseudo-values that can be implemented readily for the analysis. We demonstrate through simulations that SOHPIE-DNA consistently reaches higher recall and F1-score, while maintaining similar precision and accuracy to existing methods (NetCoMi and MDiNE). Lastly, we apply SOHPIE-DNA on two real datasets from the American Gut Project and the Diet Exchange Study to showcase the utility. The analysis of the Diet Exchange Study is to showcase that SOHPIE-DNA can also be used to incorporate the temporal change of connectivity of taxa with the inclusion of additional covariates. As a result, our method has found taxa that are related to the prevention of intestinal inflammation and severity of fatigue in advanced metastatic cancer patients. CONCLUSION: SOHPIE-DNA is the first attempt of introducing the regression framework for the DN analysis in microbiome data. This enables the prediction of characteristics of a connectivity of a network with the presence of additional covariate information in the regression. The R package with a vignette of our methodology is available through the CRAN repository ( https://CRAN.R-project.org/package=SOHPIE ), named SOHPIE (pronounced as Sofie). 
The source code and user manual can be found at https://github.com/sjahnn/SOHPIE-DNA .
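
Illustrative sketch (not taken from the SOHPIE package): the jackknife pseudo-values mentioned in the abstract can be computed for any estimator as n * theta_hat - (n - 1) * theta_hat_without_i. The Python below uses the sample mean as a stand-in statistic, and all function names are hypothetical. In SOHPIE-DNA the statistic would presumably be a network connectivity measure, and the resulting pseudo-values would then serve as the response in a regression on covariates.

```python
def jackknife_pseudo_values(data, statistic):
    """Return one jackknife pseudo-value per observation.

    pseudo_i = n * statistic(all data) - (n - 1) * statistic(data without i)
    """
    n = len(data)
    full_estimate = statistic(data)
    pseudo = []
    for i in range(n):
        leave_one_out = data[:i] + data[i + 1:]  # drop observation i
        pseudo.append(n * full_estimate - (n - 1) * statistic(leave_one_out))
    return pseudo


def mean(xs):
    # Stand-in statistic chosen only for illustration (an assumption).
    return sum(xs) / len(xs)


values = [2.0, 4.0, 9.0]
pseudo = jackknife_pseudo_values(values, mean)
```

For the sample mean the pseudo-values reproduce the original observations, which makes this a convenient sanity check before swapping in a more complex statistic.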


Subject(s)
Microbiota , Humans , Microbiota/genetics , Software , Regression Analysis , DNA
3.
Stat Med ; 2024 Jun 16.
Article in English | MEDLINE | ID: mdl-38880963

ABSTRACT

In cancer and other medical studies, time-to-event (eg, death) data are common. One major task to analyze time-to-event (or survival) data is usually to compare two medical interventions (eg, a treatment and a control) regarding their effect on patients' hazard to have the event in concern. In such cases, we need to compare two hazard curves of the two related patient groups. In practice, a medical treatment often has a time-lag effect, that is, the treatment effect can only be observed after a time period since the treatment is applied. In such cases, the two hazard curves would be similar in an initial time period, and the traditional testing procedures, such as the log-rank test, would be ineffective in detecting the treatment effect because the similarity between the two hazard curves in the initial time period would attenuate the difference between the two hazard curves that is reflected in the related testing statistics. In this paper, we suggest a new method for comparing two hazard curves when there is a potential treatment time-lag effect based on a weighted log-rank test with a flexible weighting scheme. The new method is shown to be more effective than some representative existing methods in various cases when a treatment time-lag effect is present.
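
Illustrative sketch (not the authors' method): a generic weighted log-rank statistic has the form Z = sum_j w_j (d1_j - e1_j) / sqrt(sum_j w_j^2 v_j), where the sum runs over distinct event times. The Python below is a minimal version under that standard formula. The function name, the argument layout, and the step-shaped weight in the usage note are assumptions made for illustration; the abstract does not specify the paper's weighting scheme.

```python
import math


def weighted_logrank(times, events, groups, weight=lambda t: 1.0):
    """Standardized weighted log-rank statistic for two groups.

    times  : follow-up times
    events : 1 if the event was observed, 0 if censored
    groups : 0 or 1 (group membership)
    weight : function of time; the constant 1.0 gives the ordinary log-rank test
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    numerator = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for s in times if s >= t)  # at risk, both groups
        n1 = sum(1 for s, g in zip(times, groups) if s >= t and g == 1)
        d = sum(1 for s, e in zip(times, events) if s == t and e == 1)
        d1 = sum(1 for s, e, g in zip(times, events, groups)
                 if s == t and e == 1 and g == 1)
        w = weight(t)
        numerator += w * (d1 - d * n1 / n)  # observed minus expected in group 1
        if n > 1:
            # hypergeometric variance of d1 at this event time
            variance += w * w * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0.0:
        raise ValueError("variance is zero; statistic is undefined")
    return numerator / math.sqrt(variance)
```

A weight such as `lambda t: 0.0 if t < 3 else 1.0` (a hypothetical choice) ignores the early period in which the two hazard curves are expected to coincide, which is the intuition behind down-weighting early event times when a treatment time-lag is suspected.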

4.
Stat Med ; 43(13): 2527-2546, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38618705

ABSTRACT

Urban environments, characterized by bustling mass transit systems and high population density, host a complex web of microorganisms that impact microbial interactions. These urban microbiomes, influenced by diverse demographics and constant human movement, are vital for understanding microbial dynamics. We explore urban metagenomics, utilizing an extensive dataset from the Metagenomics & Metadesign of Subways & Urban Biomes (MetaSUB) consortium, and investigate antimicrobial resistance (AMR) patterns. In this pioneering research, we delve into the role of bacteriophages, or "phages"-viruses that prey on bacteria and can facilitate the exchange of antibiotic resistance genes (ARGs) through mechanisms like horizontal gene transfer (HGT). Despite their potential significance, existing literature lacks a consensus on their significance in ARG dissemination. We argue that they are an important consideration. We uncover that environmental variables, such as those on climate, demographics, and landscape, can obscure phage-resistome relationships. We adjust for these potential confounders and clarify these relationships across specific and overall antibiotic classes with precision, identifying several key phages. Leveraging machine learning tools and validating findings through clinical literature, we uncover novel associations, adding valuable insights to our comprehension of AMR development.


Subject(s)
Bacteriophages , Bacteriophages/genetics , Humans , Least-Squares Analysis , Metagenomics/methods , Drug Resistance, Bacterial/genetics , Gene Transfer, Horizontal , Drug Resistance, Microbial/genetics , Confounding Factors, Epidemiologic , Anti-Bacterial Agents/pharmacology , Anti-Bacterial Agents/therapeutic use , Microbiota/drug effects
5.
BMC Bioinformatics ; 24(1): 8, 2023 Jan 09.
Article in English | MEDLINE | ID: mdl-36624383

ABSTRACT

BACKGROUND: The differential network (DN) analysis identifies changes in measures of association among genes under two or more experimental conditions. In this article, we introduce a pseudo-value regression approach for network analysis (PRANA). This is a novel method of differential network analysis that also adjusts for additional clinical covariates. We start from mutual information criteria, followed by pseudo-value calculations, which are then entered into a robust regression model. RESULTS: This article assesses the model performances of PRANA in a multivariable setting, followed by a comparison to dnapath and DINGO in both univariable and multivariable settings through a variety of simulations. Performance in terms of precision, recall, and F1 score of differentially connected (DC) genes is assessed. By and large, PRANA outperformed dnapath and DINGO, neither of which is equipped to adjust for available covariates such as patient age. Lastly, we employ PRANA in a real data application from the Gene Expression Omnibus database to identify DC genes that are associated with chronic obstructive pulmonary disease to demonstrate its utility. CONCLUSION: To the best of our knowledge, this is the first attempt at utilizing regression modeling for DN analysis by collective gene expression levels between two or more groups with the inclusion of additional clinical covariates. By and large, adjusting for available covariates improves the accuracy of a DN analysis.


Subject(s)
Gene Expression Profiling , Gene Regulatory Networks , Humans , Gene Expression Profiling/methods
6.
BMC Genomics ; 24(1): 687, 2023 Nov 16.
Article in English | MEDLINE | ID: mdl-37974076

ABSTRACT

BACKGROUND: Advances in sequencing technology and cost reduction have enabled an emergence of various statistical methods used in RNA-sequencing data, including the differential co-expression network analysis (or differential network analysis). A key benefit of this method is that it takes into consideration the interactions between or among genes and does not require established knowledge of biological pathways. As of now, no existing software can incorporate covariates that should be adjusted for if they are confounding factors while performing the differential network analysis. RESULTS: We develop an R package, PRANA, in which a user can easily include multiple covariates. The main R function in this package leverages a novel pseudo-value regression approach for a differential network analysis in RNA-sequencing data. This software also includes complementary R functions for extracting adjusted p-values and coefficient estimates of all variables or a specific variable for each gene, as well as for identifying the names of genes that are differentially connected (DC, hereafter) between subjects under biologically different conditions from the output. CONCLUSION: Herewith, we demonstrate the application of this package to a real data set on chronic obstructive pulmonary disease. PRANA is available through the CRAN repositories under the GPL-3 license: https://cran.r-project.org/web/packages/PRANA/index.html .


Subject(s)
RNA , Software , Humans , Base Sequence , Sequence Analysis, RNA
7.
Stat Med ; 42(13): 2162-2178, 2023 06 15.
Article in English | MEDLINE | ID: mdl-36973919

ABSTRACT

Informative cluster size (ICS) arises in situations with clustered data where a latent relationship exists between the number of participants in a cluster and the outcome measures. Although this phenomenon has been sporadically reported in the statistical literature for nearly two decades now, further exploration is needed in certain statistical methodologies to avoid potentially misleading inferences. For inference about population quantities without covariates, inverse cluster size reweightings are often employed to adjust for ICS. Further, to study the effect of covariates on disease progression described by a multistate model, the pseudo-value regression technique has gained popularity in time-to-event data analysis. We seek to answer the question: "How to apply pseudo-value regression to clustered time-to-event data when cluster size is informative?" ICS adjustment by the reweighting method can be performed in two steps; estimation of marginal functions of the multistate model and fitting the estimating equations based on pseudo-value responses, leading to four possible strategies. We present theoretical arguments and thorough simulation experiments to ascertain the correct strategy for adjusting for ICS. A further extension of our methodology is implemented to include informativeness induced by the intracluster group size. We demonstrate the methods in two real-world applications: (i) to determine predictors of tooth survival in a periodontal study and (ii) to identify indicators of ambulatory recovery in spinal cord injury patients who participated in locomotor-training rehabilitation.
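
Illustrative sketch (not from the paper): the inverse cluster size reweighting mentioned in the abstract gives each observation a weight of 1/n_i, where n_i is the size of its cluster, so that every cluster contributes equally regardless of size. The Python below shows this for a simple marginal mean. The function name and the toy data are assumptions, and the paper's actual application to pseudo-value regression in multistate models is considerably more involved.

```python
def cluster_weighted_mean(clusters):
    """Mean in which each cluster, not each observation, has equal total weight.

    clusters: list of lists, one inner list of outcomes per cluster.
    """
    total = 0.0
    weight_sum = 0.0
    for cluster in clusters:
        w = 1.0 / len(cluster)  # inverse cluster size weight
        for y in cluster:
            total += w * y
            weight_sum += w
    return total / weight_sum


# Toy data (hypothetical): a large cluster with low outcomes, a small one with a high outcome.
clusters = [[1.0, 1.0, 1.0, 1.0], [5.0]]
naive = sum(sum(c) for c in clusters) / sum(len(c) for c in clusters)
weighted = cluster_weighted_mean(clusters)
```

Here the naive mean is pulled toward the larger cluster, whereas the reweighted mean equals the average of the two cluster means.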


Subject(s)
Models, Statistical , Tooth , Humans , Cluster Analysis , Computer Simulation , Regression Analysis
8.
Stat Med ; 2022 Dec 27.
Article in English | MEDLINE | ID: mdl-36574753

ABSTRACT

We propose a Bayesian hurdle mixed-effects model to analyze longitudinal ordinal data under a complex multilevel structure. This research was motivated by the dataset gathered from the Iowa Fluoride Study (IFS) in order to establish the relationships between fluorosis status and potential risk/protective factors. Dental fluorosis is characterized by spots on tooth enamel and is due to ingestion of excessive fluoride intake during enamel formation. Observations are collected from multiple surface zones on each tooth and on all available teeth of children from the studied cohort, which are longitudinally observed at ages 9, 13, and 17. The data not only exhibit a complex hierarchical structure, but also have a large proportion of zero values that are likely to follow different statistical patterns from non-zero categories. Therefore, we develop a hurdle model to consider the zero category separately, while a proportional odds model is used for the positive categories. The estimated parameters are obtained from a Gibbs sampler implemented by the OpenBUGS software. Our model is compared with two popular methods for ordinal data: the proportional odds model and the partial proportional odds model. We perform a comprehensive analysis of the IFS data and evaluate the accuracy and effectiveness of our methodology through simulation studies. Our discoveries provide novel insights to statisticians and dental practitioners about the associations between patient and clinical characteristics and dental fluorosis.

9.
Stat Med ; 40(6): 1336-1356, 2021 03 15.
Article in English | MEDLINE | ID: mdl-33368533

ABSTRACT

Dental caries (i.e., cavities) is one of the most common chronic childhood diseases and may continue to progress throughout a person's lifetime. The Iowa Fluoride Study (IFS) was designed to investigate the effects of various fluoride, dietary and nondietary factors on the progression of dental caries among a cohort of Iowa school children. We develop a mixed effects model to perform a comprehensive analysis of the longitudinal clustered data of IFS at ages 5, 9, 13, and 17. We combine a Bayesian hurdle framework with the Conway-Maxwell-Poisson regression model, which can account for both excessive zeros and various levels of dispersion. A hierarchical shrinkage prior distribution is used to share the temporal information for predictors in the fixed-effects model. The dependence among teeth of each individual child is modeled through a sparse covariance structure of the random effects across time. Moreover, we obtain the parameter estimates and credible intervals from a Gibbs sampler. Simulation studies are conducted to assess the accuracy and effectiveness of our statistical methodology. The results of this article provide novel tools to statistical practitioners and offer fresh insights to dental researchers on effects of various risk and protective factors on caries progression.


Subject(s)
Dental Caries , Adolescent , Bayes Theorem , Child , Child, Preschool , Computer Simulation , Dental Caries/epidemiology , Humans , Iowa/epidemiology , Poisson Distribution
10.
Stat Med ; 40(21): 4582-4596, 2021 09 20.
Article in English | MEDLINE | ID: mdl-34057216

ABSTRACT

Repeated measures are often collected in longitudinal follow-up from clinical trials and observational studies. In many situations, these measures are adherent to some specific event and are only available when it occurs; an example is serum creatinine from laboratory tests for hospitalized acute kidney injuries. The frequency of event recurrences is potentially correlated with overall health condition and hence may influence the distribution of the outcome measure of interest, leading to informative cluster size. In particular, there may be a large portion of subjects without any events, thus no longitudinal measures are available, which may be due to insusceptibility to such events or censoring before any events, and this zero-inflation nature of the data needs to be taken into account. On the other hand, there often exists a terminal event that may be correlated with the recurrent events. Previous work in this area suffered from the limitation that not all these issues were handled simultaneously. To address this deficiency, we propose a novel joint modeling approach for longitudinal data adjusting for zero-inflated and informative cluster size as well as a terminal event. A three-stage semiparametric likelihood-based approach is applied for parameter estimation and inference. Extensive simulations are conducted to evaluate the performance of our proposal. Finally, we utilize the Assessment, Serial Evaluation, and Subsequent Sequelae of Acute Kidney Injury (ASSESS-AKI) study for illustration.


Subject(s)
Models, Statistical , Research Design , Humans , Likelihood Functions , Longitudinal Studies , Recurrence
11.
Stat Med ; 40(28): 6410-6420, 2021 12 10.
Article in English | MEDLINE | ID: mdl-34496070

ABSTRACT

In studies following selective sampling protocols for secondary outcomes, conventional analyses regarding their appearance could provide misguided information. In the large type 1 diabetes prevention and prediction (DIPP) cohort study monitoring type 1 diabetes-associated autoantibodies, we propose to model their appearance via a multivariate frailty model, which incorporates a correlation component that is important for unbiased estimation of the baseline hazards under the selective sampling mechanism. As further advantages, the frailty model allows for systematic evaluation of the association and the differences in regression parameters among the autoantibodies. We demonstrate the properties of the model by a simulation study and the analysis of the autoantibodies and their association with background factors in the DIPP study, in which we found that high genetic risk is associated with the appearance of all the autoantibodies, whereas the association with sex and urban municipality was evident for IA-2A and IAA autoantibodies.


Subject(s)
Diabetes Mellitus, Type 1 , Frailty , Autoantibodies/analysis , Cohort Studies , Humans , Risk Factors
12.
Brain Inj ; 35(8): 922-933, 2021 07 03.
Article in English | MEDLINE | ID: mdl-34053386

ABSTRACT

OBJECTIVE: Disrupted sleep is common following combat deployment. Contributors to risk include posttraumatic stress disorder (PTSD) and mild traumatic brain injury (mTBI); however, the mechanisms linking PTSD, mTBI, and sleep are unclear. Both PTSD and mTBI affect frontolimbic white matter tracts, such as the uncinate fasciculus. The current study examined the relationship between PTSD symptom presentation, lateralized uncinate fasciculus integrity, and sleep quality. METHOD: Participants include 42 combat veterans with and without PTSD and mTBI. Freesurfer and Tracula were used to establish specific white matter ROI integrity via 3-T MRI. The Pittsburgh Sleep Quality Index and PTSD Checklist were used to assess sleep quality and PTSD symptoms. RESULTS: Decreased fractional anisotropy in the right uncinate fasciculus (β = -1.11, SE = 0.47, p < .05) and increased hyperarousal symptom severity (β = 3.50, SE = 0.86, p < .001) were associated with poorer sleep quality. CONCLUSION: Both right uncinate integrity and hyperarousal symptom severity are associated with sleep quality in combat veterans. The right uncinate is a key regulator of limbic behavior and sympathetic nervous system reactivity, a core component of hyperarousal. Damage to this pathway may be one mechanism by which mTBI and/or PTSD could create vulnerability for sleep problems following combat deployment.


Subject(s)
Stress Disorders, Post-Traumatic , Veterans , White Matter , Arousal , Humans , Sleep , Stress Disorders, Post-Traumatic/diagnostic imaging , White Matter/diagnostic imaging
13.
J Stat Softw ; 98(12)2021 Jul.
Article in English | MEDLINE | ID: mdl-34321962

ABSTRACT

Gene expression data provide an abundant resource for inferring connections in gene regulatory networks. While methodologies developed for this task have shown success, a challenge remains in comparing the performance among methods. Gold-standard datasets are scarce and limited in use. And while tools for simulating expression data are available, they are not designed to resemble the data obtained from RNA-seq experiments. SeqNet is an R package that provides tools for generating a rich variety of gene network structures and simulating RNA-seq data from them. This produces in silico RNA-seq data for benchmarking and assessing gene network inference methods. The package is available on CRAN and on GitHub at https://github.com/tgrimes/SeqNet.

14.
Biom J ; 63(4): 761-786, 2021 04.
Article in English | MEDLINE | ID: mdl-33393147

ABSTRACT

Biological and medical researchers often collect count data in clusters at multiple time points. The data can exhibit excessive zeros and a wide range of dispersion levels. In particular, our research was motivated by a dental dataset with such complex data features: the Iowa Fluoride Study (IFS). The study was designed to investigate the effects of various dietary and nondietary factors on the caries development of a cohort of Iowa school children at the ages of 5, 9, and 13. To analyze the multiyear IFS data, we propose a novel longitudinal method of a generalized estimating equations based marginal regression model. We use a zero-inflated model with a Conway-Maxwell-Poisson (CMP) distribution, which has the flexibility to account for all levels of dispersion. The parameters of interest are estimated through a modified expectation-solution algorithm to account for the clustered and temporal correlation structure. We fit the proposed zero-inflated CMP model and perform a comprehensive secondary analysis of the IFS dataset. It resulted in a number of notable conclusions that also make clinical sense. Additionally, we demonstrated the superiority of this modeling approach over two other popular competing models: the zero-inflated Poisson and negative binomial models. In the simulation studies, we further evaluate the performance of our point estimators, the variance estimators, and that of the large sample confidence intervals for the parameters of interest. It is also demonstrated that our longitudinal CMP model can correctly identify the time-varying dispersion patterns.
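
Illustrative sketch (not the authors' estimation algorithm): the Conway-Maxwell-Poisson distribution has probability mass proportional to lam^y / (y!)^nu, where nu < 1 gives overdispersion, nu > 1 gives underdispersion, and nu = 1 recovers the Poisson distribution. A zero-inflated version mixes in an extra point mass at zero with probability pi. The Python below evaluates both mass functions; the function names and the truncation of the normalizing series at a fixed number of terms are assumptions made for illustration.

```python
import math


def cmp_pmf(y, lam, nu, terms=200):
    """Conway-Maxwell-Poisson pmf, with the normalizing constant
    approximated by the first `terms` terms of its series.
    The truncation is an assumption: it is adequate for moderate lam
    and nu not too close to 0, but not in general."""
    log_terms = [j * math.log(lam) - nu * math.lgamma(j + 1) for j in range(terms)]
    m = max(log_terms)
    log_z = m + math.log(sum(math.exp(t - m) for t in log_terms))  # log-sum-exp
    return math.exp(y * math.log(lam) - nu * math.lgamma(y + 1) - log_z)


def zero_inflated_cmp_pmf(y, pi, lam, nu):
    """Mixture: with probability pi a structural zero, otherwise a CMP count."""
    base = (1.0 - pi) * cmp_pmf(y, lam, nu)
    return pi + base if y == 0 else base
```

Setting `nu=1.0` is a quick check against the ordinary Poisson mass function, and summing `zero_inflated_cmp_pmf` over a long range of counts should give a value very close to 1.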


Subject(s)
Fluorides , Models, Statistical , Child , Computer Simulation , Humans , Iowa , Poisson Distribution
15.
J Biopharm Stat ; 30(3): 462-480, 2020 05 03.
Article in English | MEDLINE | ID: mdl-31691633

ABSTRACT

In this work, we propose a novel method for individualized treatment selection when the treatment response is multivariate. For the K treatment (K ≥2) scenario we compare quantities that are suitable indexes based on outcome variables for each treatment conditional on patient-specific scores constructed from collected covariate measurements. Our method covers any number of treatments and outcome variables, and it can be applied for a broad set of models. The proposed method uses a rank aggregation technique to estimate an ordering of treatments based on ranked lists of treatment performance measures such as smooth conditional means and conditional probability of a response for one treatment dominating others. The method has the flexibility to incorporate patient and clinician preferences to the optimal treatment decision on an individual case basis. A simulation study demonstrates the performance of the proposed method in finite samples. We also present data analyses using HIV and Diabetes clinical trials data to show the applicability of the proposed procedure for real data.


Subject(s)
Antiviral Agents/therapeutic use , Computer Simulation/statistics & numerical data , HIV Infections/drug therapy , Outcome Assessment, Health Care/statistics & numerical data , Precision Medicine/statistics & numerical data , Randomized Controlled Trials as Topic/statistics & numerical data , Double-Blind Method , HIV Infections/epidemiology , Humans , Multivariate Analysis , Outcome Assessment, Health Care/methods , Precision Medicine/methods , Randomized Controlled Trials as Topic/methods , Treatment Outcome
16.
Stat Med ; 38(28): 5391-5412, 2019 12 10.
Article in English | MEDLINE | ID: mdl-31637762

ABSTRACT

In this work, we propose a semiparametric method for estimating the optimal treatment for a given patient based on individual covariate information for that patient when data from a crossover design are available. Here, we assume there are carry-over effects for patients switching from one treatment to another. For the K treatment (K ≥ 2) scenario, we show that nonparametric estimation of carry-over effects can have the undesirable property that comparison of treatment means can only be done using independent outcome measurements from different groups of patients rather than using available joint measurements for each patient. To overcome this barrier, we compare probabilities of outcome variable of each treatment dominating outcome variables for all other treatments conditional on patient-specific scores constructed from patient covariates. We suggest single-index models as appropriate models connecting outcome variables to covariates and our empirical investigations show that frequencies of correct treatment assignments are highly accurate. The proposed method is also rather robust against departures from a single-index model structure. We also conduct a real data analysis to show the applicability of the proposed procedure.


Subject(s)
Clinical Trials as Topic/statistics & numerical data , Cross-Over Studies , Models, Statistical , Precision Medicine/statistics & numerical data , Biostatistics , Computer Simulation , Diabetes Mellitus, Type 2/blood , Diabetes Mellitus, Type 2/diet therapy , Glycated Hemoglobin/metabolism , Humans , Probability , Randomized Controlled Trials as Topic/statistics & numerical data , Treatment Outcome
17.
Stat Med ; 38(15): 2828-2846, 2019 07 10.
Article in English | MEDLINE | ID: mdl-30941812

ABSTRACT

In observational studies, generalized propensity score (GPS)-based statistical methods, such as inverse probability weighting (IPW) and doubly robust (DR) method, have been proposed to estimate the average treatment effect (ATE) among multiple treatment groups. In this article, we investigate the GPS-based statistical methods to estimate treatment effects from two aspects. The first aspect of our investigation is to obtain an optimal GPS estimation method among four competing GPS estimation methods by using a rank aggregation approach. We further examine whether the optimal GPS-based IPW and DR methods would improve the performance for estimating ATE. It is well known that the DR method is consistent if either the GPS or the outcome models are correctly specified. The second aspect of our investigation is to examine whether the DR method could be improved if we ensemble outcome models. To that end, bootstrap method and rank aggregation method are used to obtain the ensemble optimal outcome model from several competing outcome models, and the resulting outcome model is incorporated into the DR method, resulting in an ensemble DR (enDR) method. Extensive simulation results indicate that the enDR method provides the best performance in estimating the ATE regardless of the method used for estimating GPS. We illustrate our methods using the MarketScan healthcare insurance claims database to examine the treatment effects among three different bones and substitutes used for spinal fusion surgeries. We draw conclusions based on the estimates from the enDR method coupled with the optimal GPS estimation method.
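
Illustrative sketch (not the enDR method from the paper): with multiple treatment groups, a normalized inverse probability weighting estimator weights each subject by the reciprocal of the generalized propensity score of the treatment actually received, then takes a weighted mean of outcomes within each treatment. The Python below assumes the propensity scores are already estimated and supplied as inputs; the function name and the toy data are hypothetical.

```python
def ipw_treatment_means(outcomes, treatments, propensities):
    """Normalized (Hajek-type) IPW estimate of the mean potential outcome
    under each treatment.

    propensities[i] is the estimated probability that subject i receives
    the treatment that subject i actually received.
    """
    numerators = {}
    denominators = {}
    for y, t, e in zip(outcomes, treatments, propensities):
        numerators[t] = numerators.get(t, 0.0) + y / e
        denominators[t] = denominators.get(t, 0.0) + 1.0 / e
    return {t: numerators[t] / denominators[t] for t in numerators}


# Toy data (hypothetical): two treatments, propensities treated as known.
means = ipw_treatment_means(
    outcomes=[1.0, 3.0, 10.0, 20.0],
    treatments=["a", "a", "b", "b"],
    propensities=[0.25, 0.75, 0.5, 0.5],
)
ate_b_vs_a = means["b"] - means["a"]  # pairwise average treatment effect
```

A doubly robust estimator would add an outcome-model correction term to each weighted mean; that step is omitted here to keep the sketch short.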


Subject(s)
Observational Studies as Topic/methods , Propensity Score , Treatment Outcome , Causality , Computer Simulation , Humans
18.
Stat Med ; 37(30): 4807-4822, 2018 12 30.
Article in English | MEDLINE | ID: mdl-30232808

ABSTRACT

There have been numerous attempts to extend the Wilcoxon rank-sum test to clustered data. Recently, one such rank-sum test (Dutta & Datta, 2016, Biometrics 72, 432-440) was developed to compare the group-specific marginal distributions of outcomes in clustered data where the conditional distributions of outcomes depend on the number of observations from that group in a given cluster, a phenomenon referred to as informative intra-cluster group (ICG) size. However, comparison of group-specific marginal distributions may not be sufficient in the presence of some potentially useful covariates that are observed in the study. In addition, not accounting for the effect of these covariates can lead to biased and misleading inference for the group comparisons. Thus, the purpose of this article is twofold. First, we develop a method to estimate the covariate effects using rank-based weighted estimating equations that are appropriate when the ICG size is informative. Second, we construct an aligned rank-sum test based on the covariate adjusted outcomes. Asymptotic distributions of the R-estimators and the test statistic are provided. Through simulation studies, we show the importance of selecting proper weights in constructing the estimating equations when informativeness is present through the cluster or ICG sizes. We also demonstrate the superiority and the robustness of our method in comparison to regular parametric linear mixed models in clustered data. We apply our method to analyze different real-life data sets, including a data set on birthweights of rat pups in different litters and a dental data set on tooth attachment loss.


Subject(s)
Cluster Analysis , Sample Size , Statistics, Nonparametric , Aged , Animals , Birth Weight , Data Interpretation, Statistical , Humans , Linear Models , Models, Statistical , Periodontal Attachment Loss/epidemiology , Rats
19.
Stat Med ; 37(27): 4071-4082, 2018 11 30.
Article in English | MEDLINE | ID: mdl-30003565

ABSTRACT

The log rank test is a popular nonparametric test for comparing survival distributions among groups. When data are organized in clusters of potentially correlated observations, adjustments can be made to account for within-cluster dependencies among observations, eg, tests derived from frailty models. Tests for clustered data can be further biased when the number of observations within each cluster and the distribution of groups within cluster are correlated with survival times, phenomena known as informative cluster size and informative within-cluster group size. In this manuscript, we develop a log rank test for clustered data that adjusts for the potentially biasing effect of informative cluster size and within-cluster group size. We provide the results of a simulation study demonstrating that our proposed test remains unbiased under cluster-based informativeness, while other candidate tests not accounting for the clustering structure do not properly maintain size. Furthermore, our test exhibits power advantages under scenarios in which traditional tests are appropriate. We demonstrate an application of our test by comparing time to functional progression between groups defined by initial functional status in a spinal cord injury data set.


Subject(s)
Cluster Analysis , Statistics, Nonparametric , Data Interpretation, Statistical , Humans , Models, Statistical , Risk Factors , Sample Size , Survival Analysis
20.
Stat Med ; 37(23): 3357-3372, 2018 10 15.
Article in English | MEDLINE | ID: mdl-29923344

ABSTRACT

Multisample U-statistics encompass a wide class of test statistics that allow the comparison of 2 or more distributions. U-statistics are especially powerful because they can be applied to both numeric and nonnumeric data, eg, ordinal and categorical data where a pairwise similarity or distance-like measure between categories is available. However, when comparing the distribution of a variable across 2 or more groups, observed differences may be due to confounding covariates. For example, in a case-control study, the distribution of exposure in cases may differ from that in controls entirely because of variables that are related to both exposure and case status and are distributed differently among case and control participants. We propose to use individually reweighted data (ie, using the stratification score for retrospective data or the propensity score for prospective data) to construct adjusted U-statistics that can test the equality of distributions across 2 (or more) groups in the presence of confounding covariates. Asymptotic normality of our adjusted U-statistics is established and a closed form expression of their asymptotic variance is presented. The utility of our approach is demonstrated through simulation studies, as well as in an analysis of data from a case-control study conducted among African-Americans, comparing whether the similarity in haplotypes (ie, sets of adjacent genetic loci inherited from the same parent) occurring in a case and a control participant differs from the similarity in haplotypes occurring in 2 control participants.


Subject(s)
Models, Statistical , Black or African American/genetics , Analysis of Variance , Biostatistics , Case-Control Studies , Catechol O-Methyltransferase/genetics , Computer Simulation , Haplotypes , Humans , Propensity Score , Prospective Studies , Retrospective Studies , Schizophrenia/genetics