Results 1 - 20 of 31
1.
Biometrika ; 109(3): 817-835, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36105175

ABSTRACT

Factorization models express a statistical object of interest in terms of a collection of simpler objects. For example, a matrix or tensor can be expressed as a sum of rank-one components. However, in practice, it can be challenging to infer the relative impact of the different components as well as the number of components. A popular idea is to include infinitely many components having impact decreasing with the component index. This article is motivated by two limitations of existing methods: (1) lack of careful consideration of the within component sparsity structure; and (2) no accommodation for grouped variables and other non-exchangeable structures. We propose a general class of infinite factorization models that address these limitations. Theoretical support is provided, practical gains are shown in simulation studies, and an ecology application focusing on modelling bird species occurrence is discussed.

2.
Biometrika ; 108(2): 269-282, 2021 Jun.
Article in English | MEDLINE | ID: mdl-35747172

ABSTRACT

Posterior computation for high-dimensional data with many parameters can be challenging. This article focuses on a new method for approximating posterior distributions of a low- to moderate-dimensional parameter in the presence of a high-dimensional or otherwise computationally challenging nuisance parameter. The focus is on regression models and the key idea is to separate the likelihood into two components through a rotation. One component involves only the nuisance parameters, which can then be integrated out using a novel type of Gaussian approximation. We provide theory on approximation accuracy that holds for a broad class of forms of the nuisance component and priors. Applying our method to simulated and real data sets shows that it can outperform state-of-the-art posterior approximation approaches.

3.
Biometrika ; 105(2): 431-446, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29880978

ABSTRACT

There has been substantial recent interest in record linkage, where one attempts to group the records pertaining to the same entities from one or more large databases that lack unique identifiers. This can be viewed as a type of microclustering, with few observations per cluster and a very large number of clusters. We show that the problem is fundamentally hard from a theoretical perspective and, even in idealized cases, accurate entity resolution is effectively impossible unless the number of entities is small relative to the number of records and/or the separation between records from different entities is extremely large. These results suggest conservatism in interpretation of the results of record linkage, support collection of additional data to more accurately disambiguate the entities, and motivate a focus on coarser inference. For example, results from a simulation study suggest that sometimes one may obtain accurate results for population size estimation even when fine-scale entity resolution is inaccurate.

4.
Biometrika ; 104(4): 939-952, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29422695

ABSTRACT

We consider shape-restricted nonparametric regression on a closed set [Formula: see text], where it is reasonable to assume the function has no more than H local extrema interior to [Formula: see text]. Following a Bayesian approach, we develop a nonparametric prior over a novel class of local extremum splines. This approach is shown to be consistent when modeling any continuously differentiable function within the class considered, and is used to develop methods for testing hypotheses on the shape of the curve. Sampling algorithms are developed, and the method is applied in simulation studies and data examples where the shape of the curve is of interest.

5.
Stat Probab Lett ; 113: 41-48, 2016 Jun.
Article in English | MEDLINE | ID: mdl-31427835

ABSTRACT

In population studies, it is standard to sample data via designs in which the population is divided into strata, with the different strata assigned different probabilities of inclusion. Although there have been some proposals for including sample survey weights into Bayesian analyses, existing methods require complex models or ignore the stratified design underlying the survey weights. We propose a simple approach based on modeling the distribution of the selected sample as a mixture, with the mixture weights appropriately adjusted, while accounting for uncertainty in the adjustment. We focus for simplicity on Dirichlet process mixtures but the proposed approach can be applied more broadly. We sketch a simple Markov chain Monte Carlo algorithm for computation, and assess the approach via simulations and an application.

6.
Bioinformatics ; 31(24): 3890-6, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26323717

ABSTRACT

MOTIVATION: Both single marker and simultaneous analysis face challenges in GWAS due to the large number of markers genotyped for a small number of subjects. This large p small n problem is particularly challenging when the trait under investigation has low heritability. METHOD: In this article, we propose a two-stage approach that is a hybrid method of single and simultaneous analysis designed to improve genomic prediction of complex traits. In the first stage, we use a Bayesian independent screening method to select the most promising SNPs. In the second stage, we rely on a hierarchical model to analyze the joint impact of the selected markers. The model is designed to take into account familial dependence in the different subjects, while using local-global shrinkage priors on the marker effects. RESULTS: We evaluate the performance in simulation studies, and consider an application to animal breeding data. The illustrative data analysis reveals an encouraging result in terms of prediction performance and computational cost.


Subject(s)
Genome-Wide Association Study/methods; Polymorphism, Single Nucleotide; Animals; Bayes Theorem; Breeding; Cattle; Genomics/methods; Genotype; Models, Genetic

7.
Biometrika ; 98(2): 291-306, 2011 Jun.
Article in English | MEDLINE | ID: mdl-23049129

ABSTRACT

We focus on sparse modelling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk towards zero as the column index increases. We use our prior on a parameter-expanded loading matrix to avoid the order dependence typical in factor analysis models and develop an efficient Gibbs sampler that scales well as data dimensionality increases. The gain in efficiency is achieved by the joint conjugacy property of the proposed prior, which allows block updating of the loadings matrix. We propose an adaptive Gibbs sampler for automatically truncating the infinite loading matrix through selection of the number of important factors. Theoretical results are provided on the support of the prior and truncation approximation bounds. A fast algorithm is proposed to produce approximate Bayes estimates. Latent factor regression methods are developed for prediction and variable selection in applications with high-dimensional correlated predictors. Operating characteristics are assessed through simulation studies, and the approach is applied to predict survival times from gene expression data.
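The abstract describes the shrinkage prior only in words. As a rough illustration, the sketch below assumes one commonly cited form of a multiplicative gamma process: each column h has a global precision tau_h formed as a cumulative product of gamma variables, and each loading has its own local gamma precision. The function name, the hyperparameters `a1`, `a2`, `nu`, and the finite truncation at `H` columns are all assumptions made for illustration and are not taken from the listing.

```python
import random


def sample_mgp_loadings(p, H, a1=2.0, a2=3.0, nu=3.0, seed=0):
    """Draw a p x H loadings matrix from an assumed multiplicative gamma process prior.

    Hypothetical helper for illustration only; hyperparameter values are arbitrary.
    """
    rng = random.Random(seed)
    # delta_1 ~ Gamma(a1, 1); delta_l ~ Gamma(a2, 1) for l >= 2 (shape, scale).
    deltas = [rng.gammavariate(a1, 1.0)]
    deltas += [rng.gammavariate(a2, 1.0) for _ in range(H - 1)]
    # tau_h is the cumulative product delta_1 * ... * delta_h (column precision).
    taus = []
    running = 1.0
    for d in deltas:
        running *= d
        taus.append(running)
    loadings = []
    for _ in range(p):
        row = []
        for h in range(H):
            # Local precision phi ~ Gamma(shape nu/2, rate nu/2), i.e. scale 2/nu.
            phi = rng.gammavariate(nu / 2.0, 2.0 / nu)
            sd = (phi * taus[h]) ** -0.5
            row.append(rng.gauss(0.0, sd))
        loadings.append(row)
    return loadings, taus


loadings, taus = sample_mgp_loadings(p=5, H=8)
```

When `a2` exceeds 1, the column precisions tend to grow with the column index, so later columns are typically shrunk more strongly towards zero. That is the behaviour the abstract attributes to the prior.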

8.
Biometrika ; 98(1): 35-48, 2011 Mar.
Article in English | MEDLINE | ID: mdl-23956461

ABSTRACT

We consider geostatistical models that allow the locations at which data are collected to be informative about the outcomes. A Bayesian approach is proposed, which models the locations using a log Gaussian Cox process, while modelling the outcomes conditionally on the locations as Gaussian with a Gaussian process spatial random effect and adjustment for the location intensity process. We prove posterior propriety under an improper prior on the parameter controlling the degree of informative sampling, demonstrating that the data are informative. In addition, we show that the density of the locations and mean function of the outcome process can be estimated consistently under mild assumptions. The methods show significant evidence of informative sampling when applied to ozone data over Eastern U.S.A.

9.
Biometrics ; 60(3): 676-83, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15339290

ABSTRACT

In studying rates of occurrence and progression of lesions (or tumors), it is typically not possible to obtain exact onset times for each lesion. Instead, data consist of the number of lesions that reach a detectable size between screening examinations, along with measures of the size/severity of individual lesions at each exam time. This interval-censored data structure makes it difficult to properly adjust for the onset time distribution in assessing covariate effects on rates of lesion progression. This article proposes a joint model for the multiple lesion onset and progression process, motivated by cross-sectional data from a study of uterine leiomyoma tumors. By using a joint model, one can potentially obtain more precise inferences on rates of onset, while also performing onset time-adjusted inferences on lesion severity. Following a Bayesian approach, we propose a data augmentation Markov chain Monte Carlo algorithm for posterior computation.


Subject(s)
Bayes Theorem; Neoplasms/etiology; Neoplasms/pathology; Adult; Algorithms; Biometry; Cross-Sectional Studies; Female; Humans; Leiomyomatosis/etiology; Leiomyomatosis/pathology; Markov Chains; Middle Aged; Models, Statistical; Monte Carlo Method; Stochastic Processes; Time Factors; Uterine Neoplasms/etiology; Uterine Neoplasms/pathology

10.
JAMA ; 286(14): 1759-61, 2001 Oct 10.
Article in English | MEDLINE | ID: mdl-11594902

ABSTRACT

CONTEXT: Pregnancy test kits routinely recommend testing "as early as the first day of the missed period." However, a pregnancy cannot be detected before the blastocyst implants. Due to natural variability in the timing of ovulation, implantation does not necessarily occur before the expected onset of next menses. OBJECTIVE: To estimate the maximum screening sensitivity of pregnancy tests when used on the first day of the expected period, taking into account the natural variability of ovulation and implantation. DESIGN AND SETTING: Community-based prospective cohort study conducted in North Carolina between 1982 and 1986. PARTICIPANTS: Two hundred twenty-one healthy women 21 to 42 years of age who were planning to conceive. MAIN OUTCOME MEASURES: Day of implantation, defined by the serial assay of first morning urine samples using an extremely sensitive immunoradiometric assay for human chorionic gonadotropin (hCG), relative to the first day of the missed period, defined as the day on which women expected their next menses to begin, based on self-reported usual cycle length. RESULTS: Data were available for 136 clinical pregnancies conceived during the study, 14 (10%) of which had not yet implanted by the first day of the missed period. The highest possible screening sensitivity for an hCG-based pregnancy test therefore is estimated to be 90% (95% confidence interval [CI], 84%-94%) on the first day of the missed period. By 1 week after the first day of the missed period, the highest possible screening sensitivity is estimated to be 97% (95% CI, 94%-99%). CONCLUSIONS: In this study, using an extremely sensitive assay for hCG, 10% of clinical pregnancies were undetectable on the first day of missed menses. In practice, an even larger percentage of clinical pregnancies may be undetected by current test kits on this day, given their reported assay properties and other practical limitations.
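The headline figure follows from the counts reported in RESULTS: 136 clinical pregnancies, of which 14 had not yet implanted by the first day of the missed period. The sketch below redoes that arithmetic and attaches a Wilson score interval. The abstract does not say which interval method the authors used, so the Wilson interval is an assumption and its bounds need not match the published 84%-94% exactly.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (assumed method, see lead-in)."""
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


pregnancies = 136          # clinical pregnancies with data (from the abstract)
not_yet_implanted = 14     # not implanted by first day of missed period (from the abstract)
detectable = pregnancies - not_yet_implanted
sensitivity = detectable / pregnancies   # maximum possible screening sensitivity
low, high = wilson_interval(detectable, pregnancies)
print(f"{sensitivity:.3f} ({low:.3f}, {high:.3f})")
```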


Subject(s)
Chorionic Gonadotropin/urine; Menstrual Cycle; Pregnancy Tests; Adult; Embryo Implantation; Female; Humans; Menstruation; Ovulation; Pregnancy; Prospective Studies; Reagent Kits, Diagnostic; Self Care; Sensitivity and Specificity

11.
Hum Reprod ; 16(11): 2278-82, 2001 Nov.
Article in English | MEDLINE | ID: mdl-11679504

ABSTRACT

BACKGROUND: The TwoDay Algorithm is a simple method for identifying the fertile window. It classifies a day as fertile if cervical secretions are present on that day or were present on the day before. This approach may be an effective alternative to the ovulation and symptothermal methods for populations and programmes that find current natural family planning methods difficult to implement. METHODS: We used data on secretions from a large multinational European fecundability study to assess the relationship between the days predicted to be potentially fertile by the TwoDay Algorithm and the day-specific probabilities of pregnancy based on intercourse patterns in 434 conception cycles from the study. RESULTS: The days around ovulation that had the highest fecundability were the days most likely to be classified as fertile by the TwoDay Algorithm. In addition, intercourse on a particular day in the fertile interval was twice as likely to result in a pregnancy if cervical secretions were present on that day or the day before. CONCLUSIONS: The TwoDay Algorithm is effective, both in identifying the fertile days of the cycle and in predicting days within the fertile interval that have a high pregnancy rate. Our data provide the first direct evidence that cervical secretions are associated with higher fecundability within the fertile window.
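The classification rule stated in BACKGROUND is simple enough to write down directly. The following is a minimal sketch; the function name and the input format (one boolean per cycle day) are hypothetical, and treating the day before the first recorded day as having no secretions is an assumption.

```python
def twoday_fertile_days(secretions):
    """Classify each cycle day as fertile under the TwoDay rule.

    secretions: list of booleans, one per consecutive cycle day,
    True if cervical secretions were noted on that day.
    """
    fertile = []
    for day, today in enumerate(secretions):
        # Assumption: with no record for the previous day, treat it as "no secretions".
        yesterday = secretions[day - 1] if day > 0 else False
        fertile.append(today or yesterday)
    return fertile


observed = [False, True, False, False, True, True, False]
print(twoday_fertile_days(observed))
# → [False, True, True, False, True, True, True]
```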


Subject(s)
Algorithms; Cervix Uteri/metabolism; Fertility; Body Temperature; Cervix Mucus/metabolism; Cohort Studies; Coitus; Female; Humans; Ovulation Detection; Pregnancy; Probability; Prospective Studies

12.
Toxicol Lett ; 122(1): 33-44, 2001 May 31.
Article in English | MEDLINE | ID: mdl-11397555

ABSTRACT

The Tg.AC mouse carrying the v-Ha-ras structural gene is a useful model for the study of chemical carcinogens, especially those acting via non-genotoxic mechanisms. This study evaluated the efficacy of the non-toxic, water-soluble antioxidant from spinach, natural antioxidant (NAO), in reducing skin papilloma induction in female hemizygous Tg.AC mice treated dermally five times over 2.5 weeks with 2.5 µg 12-O-tetradecanoylphorbol-13-acetate (TPA). The TPA-only group was considered as a control; the other two groups received, additionally, NAO topically (2 mg) or orally (100 mg/kg), 5 days/week for 5 weeks. Papilloma counts made macroscopically during the clinical observations showed a significant decrease in multiplicity (P<0.01) in the NAO topically treated group. According to histological criteria, papilloma multiplicity was lower in both topical-NAO and oral-NAO groups, but significantly so only in the oral-NAO mice (P<0.01). The beneficial effect of NAO in the Tg.AC mouse is reported.


Subject(s)
Antioxidants/pharmacology; Papilloma/prevention & control; Skin Neoplasms/prevention & control; Administration, Cutaneous; Administration, Oral; Animals; Body Weight/drug effects; Carcinogens/adverse effects; Disease Models, Animal; Female; Genes, ras/genetics; Genotype; Mice; Mice, Transgenic; Papilloma/chemically induced; Papilloma/pathology; Plant Extracts/pharmacology; Skin Neoplasms/chemically induced; Skin Neoplasms/pathology; Spinacia oleracea/chemistry; Survival Analysis; Tetradecanoylphorbol Acetate/adverse effects

13.
Biometrics ; 57(2): 396-403, 2001 Jun.
Article in English | MEDLINE | ID: mdl-11414562

ABSTRACT

In some cross-sectional studies of chronic disease, data consist of the age at examination, whether the disease was present at the exam, and recall of the age at first diagnosis. This article describes a flexible parametric approach for combining current status and age at first diagnosis data. We assume that the log odds of onset by a given age and of detection by a given age conditional on onset by that age are nondecreasing functions of time plus linear combinations of covariates. Piecewise linear models are used to characterize changes across time in the baseline odds. Methods are described for accommodating informatively missing current status data and inferences based on the age-specific incidence of disease prior to a landmark event (e.g., puberty, menopause). Our formulation enables straightforward maximum likelihood estimation without requiring restrictive parametric or Markov assumptions. The methods are applied to data from a study of uterine fibroids.


Subject(s)
Models, Statistical; Adult; Age Factors; Chronic Disease; Cross-Sectional Studies; Disease Progression; Female; Humans; Leiomyoma/diagnosis; Leiomyoma/physiopathology; Middle Aged; Odds Ratio; Premenopause; Probability; Time Factors; Uterine Neoplasms/diagnosis; Uterine Neoplasms/physiopathology

14.
Am J Epidemiol ; 153(12): 1222-6, 2001 Jun 15.
Article in English | MEDLINE | ID: mdl-11415958

ABSTRACT

In the past decade, there have been enormous advances in the use of Bayesian methodology for analysis of epidemiologic data, and there are now many practical advantages to the Bayesian approach. Bayesian models can easily accommodate unobserved variables such as an individual's true disease status in the presence of diagnostic error. The use of prior probability distributions represents a powerful mechanism for incorporating information from previous studies and for controlling confounding. Posterior probabilities can be used as easily interpretable alternatives to p values. Recent developments in Markov chain Monte Carlo methodology facilitate the implementation of Bayesian analyses of complex data sets containing missing observations and multidimensional outcomes. Tools are now available that allow epidemiologists to take advantage of this powerful approach to assessment of exposure-disease relations.


Subject(s)
Bayes Theorem; Humans

15.
J Infect Dis ; 184(2): 127-35, 2001 Jul 15.
Article in English | MEDLINE | ID: mdl-11424008

ABSTRACT

Many human immunodeficiency virus (HIV)-infected persons receive prolonged treatment with DNA-reactive antiretroviral drugs. A prospective study was conducted of 26 HIV-infected men who provided samples before treatment and at multiple times after beginning treatment, to investigate effects of antiretrovirals on lymphocyte and sperm chromosomes and semen quality. Several antiretroviral regimens, all including a nucleoside component, were used. Lymphocyte metaphase analysis and sperm fluorescence in situ hybridization were used for cytogenetic studies. Semen analyses included conventional parameters (volume, concentration, viability, motility, and morphology). No significant effects on cytogenetic parameters, semen volume, or sperm concentration were detected. However, there were significant improvements in sperm motility for men with study entry CD4 cell counts >200 cells/mm³, sperm morphology for men with entry CD4 cell counts ≤200 cells/mm³, and the percentage of viable sperm in both groups. These findings suggest that nucleoside-containing antiretrovirals administered via recommended protocols do not induce chromosomal changes in lymphocytes or sperm but may produce improvements in semen quality.


Subject(s)
Anti-HIV Agents/adverse effects; Chromosome Breakage; Chromosomes/drug effects; HIV Infections/drug therapy; HIV Infections/immunology; Lymphocytes/drug effects; Metaphase/drug effects; Reverse Transcriptase Inhibitors/adverse effects; Spermatozoa/drug effects; Adult; Aneuploidy; Anti-HIV Agents/therapeutic use; CD4 Lymphocyte Count; Diploidy; Drug Therapy, Combination; Humans; In Situ Hybridization, Fluorescence; Longitudinal Studies; Lymphocytes/metabolism; Lymphocytes/pathology; Male; Middle Aged; Reverse Transcriptase Inhibitors/therapeutic use

16.
Contraception ; 63(4): 211-5, 2001 Apr.
Article in English | MEDLINE | ID: mdl-11376648

ABSTRACT

Emergency post-coital contraceptives effectively reduce the risk of pregnancy, but their degree of efficacy remains uncertain. Measurement of efficacy depends on the pregnancy rate without treatment, which cannot be measured directly. We provide indirect estimates of such pregnancy rates, using data from a prospective study of 221 women who were attempting to conceive. We previously estimated the probability of pregnancy with an act of intercourse relative to ovulation. In this article, we extend these data to estimate the probability of pregnancy relative to intercourse on a given cycle day (counting from onset of previous menses). In assessing the efficacy of post-coital contraceptives, other approaches have not incorporated accurate information on the variability of ovulation. We find that the possibility of late ovulation produces a persistent risk of pregnancy even into the sixth week of the cycle. Post-coital contraceptives may be indicated even when intercourse has occurred late in the cycle.
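One plausible reading of the step from "relative to ovulation" to "relative to cycle day" is an average over the distribution of ovulation days. The sketch below shows that idea under stated assumptions: the function name is hypothetical, the averaging formula is an interpretation rather than the authors' stated method, and the toy numbers are invented purely to show the mechanics.

```python
def pregnancy_prob_by_cycle_day(cycle_day, ovulation_dist, prob_by_offset):
    """Average ovulation-relative pregnancy probabilities over ovulation timing.

    ovulation_dist: dict mapping ovulation cycle day -> probability (sums to 1).
    prob_by_offset: dict mapping (intercourse day - ovulation day) -> probability
                    of pregnancy; offsets not listed are treated as 0.
    """
    total = 0.0
    for ovulation_day, weight in ovulation_dist.items():
        offset = cycle_day - ovulation_day
        total += weight * prob_by_offset.get(offset, 0.0)
    return total


# Toy inputs (hypothetical, not study data): half of cycles ovulate late.
toy_ovulation = {14: 0.5, 21: 0.5}
toy_offsets = {-1: 0.2, 0: 0.1}
late_risk = pregnancy_prob_by_cycle_day(20, toy_ovulation, toy_offsets)
```

With these toy inputs the risk on cycle day 20 is nonzero only because some cycles ovulate on day 21. This mirrors the abstract's point that late ovulation keeps some risk of pregnancy late in the cycle.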


Subject(s)
Coitus; Contraceptives, Postcoital; Female; Humans; Menstrual Cycle; Ovulation; Pregnancy; Probability; Prospective Studies; Time Factors

17.
Stat Med ; 20(6): 965-78, 2001 Mar 30.
Article in English | MEDLINE | ID: mdl-11252016

ABSTRACT

In modelling human fertility one ideally accounts for timing of intercourse relative to ovulation. Measurement error in identifying the day of ovulation can bias estimates of fecundability parameters and attenuate estimates of covariate effects. In the absence of a single perfect marker of ovulation, several error prone markers are sometimes obtained. In this paper we propose a semi-parametric mixture model that uses multiple independent markers of ovulation to account for measurement error. The model assigns each method of assessing ovulation a distinct non-parametric error distribution, and corrects bias in estimates of day-specific fecundability. We use a Monte Carlo EM algorithm for joint estimation of (i) the error distribution for the markers, (ii) the error-corrected fertility parameters, and (iii) the couple-specific random effects. We apply the methods to data from a North Carolina fertility study to assess the magnitude of error in measures of ovulation based on urinary luteinizing hormone and metabolites of ovarian hormones, and estimate the corrected day-specific probabilities of clinical pregnancy. Published in 2001 by John Wiley & Sons, Ltd.


Subject(s)
Fertility/physiology; Models, Biological; Ovulation Detection/methods; Ovulation/physiology; Algorithms; Biomarkers; Corpus Luteum/physiology; Estrogens/urine; Female; Humans; Likelihood Functions; Luteinizing Hormone/urine; Male; North Carolina; Pregnancy; Progesterone/urine

18.
Biometrics ; 57(1): 302-8, 2001 Mar.
Article in English | MEDLINE | ID: mdl-11252614

ABSTRACT

This article describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censoring process, and we account for dependency between these latent variables through a hierarchical model. A linear model is used to relate covariates and latent variables to the primary outcomes for each subunit. A generalized linear model accounts for covariate and latent variable effects on the probability of censoring for subunits within each cluster. The model accounts for correlation within clusters and within subunits through a flexible factor analytic framework that allows multiple latent variables and covariate effects on the latent variables. The structure of the model facilitates implementation of Markov chain Monte Carlo methods for posterior estimation. Data from a spermatotoxicity study are analyzed to illustrate the proposed approach.


Subject(s)
Biometry; Models, Statistical; Animals; Cluster Analysis; Data Interpretation, Statistical; In Vitro Techniques; Male; Models, Biological; Multivariate Analysis; Rats; Sperm Motility/drug effects

19.
Biometrics ; 57(4): 1067-73, 2001 Dec.
Article in English | MEDLINE | ID: mdl-11764245

ABSTRACT

Time to pregnancy studies that identify ovulation days and collect daily intercourse data can be used to estimate the day-specific probabilities of conception given intercourse on a single day relative to ovulation. In this article, a Bayesian semiparametric model is described for flexibly characterizing covariate effects and heterogeneity among couples in daily fecundability. The proposed model is characterized by the timing of the most fertile day of the cycle relative to ovulation, by the probability of conception due to intercourse on the most fertile day, and by the ratios of the daily conception probabilities for other days of the cycle relative to this peak probability. The ratios are assumed to be increasing in time to the peak and decreasing thereafter. Generalized linear mixed models are used to incorporate covariate and couple-specific effects on the peak probability and on the day-specific ratios. A Markov chain Monte Carlo algorithm is described for posterior estimation, and the methods are illustrated through application to caffeine data from a North Carolina pregnancy study.
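The parameterization described above (a peak probability plus ratios that rise to the peak and fall afterwards) can be sketched in a few lines. The helper names are hypothetical and the numbers are illustrative only. Combining days by treating each act of intercourse as an independent chance of conception is a common assumption in day-specific fecundability models; the abstract does not state that this paper uses it.

```python
def day_specific_probabilities(peak_prob, ratios):
    """Daily conception probabilities as peak probability times day-specific ratios."""
    return [peak_prob * r for r in ratios]


def is_unimodal(ratios):
    """Check that ratios are nondecreasing up to the peak and nonincreasing after it."""
    peak = ratios.index(max(ratios))
    rising = all(a <= b for a, b in zip(ratios[:peak], ratios[1:peak + 1]))
    falling = all(a >= b for a, b in zip(ratios[peak:], ratios[peak + 1:]))
    return rising and falling


def cycle_conception_probability(day_probs, intercourse):
    """Probability of conception in a cycle, assuming independent daily contributions."""
    no_conception = 1.0
    for p, had_intercourse in zip(day_probs, intercourse):
        if had_intercourse:
            no_conception *= 1.0 - p
    return 1.0 - no_conception


ratios = [0.2, 0.5, 1.0, 0.4]              # illustrative values; the peak day has ratio 1
probs = day_specific_probabilities(0.3, ratios)
cycle_prob = cycle_conception_probability(probs, [False, True, True, False])
```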


Subject(s)
Bayes Theorem; Fertility; Menstrual Cycle/physiology; Algorithms; Biometry; Caffeine/pharmacology; Female; Fertility/drug effects; Humans; Markov Chains; Models, Biological; Monte Carlo Method; Pregnancy; Time Factors

20.
Biostatistics ; 2(2): 131-45, 2001 Jun.
Article in English | MEDLINE | ID: mdl-12933545

ABSTRACT

Models of human fertility that incorporate information on timing of intercourse have assumed that a single ovum is released each menstrual cycle. These models are misspecified if two or more viable ova are sometimes released in a single cycle, which is known to occur in dizygotic twin pregnancies. In this paper, we propose a model for multiple ovulation in humans. We assume that the unobservable number of viable ova in each cycle follows a multinomial distribution. Successful fertilization of each ovum depends on the ability of the cycle to support a pregnancy and on the aggregate of a set of unobservable Bernoulli trials representing the fertilizing effects of intercourse on various days. Our model accommodates general covariate effects, allows for heterogeneity among couples, and accounts for a sterile subpopulation of couples. Information on early detection of pregnancy can be incorporated to estimate the probability of embryo loss. We outline a Markov chain Monte Carlo algorithm for estimation of the posterior distributions of the parameters. The methods are applied to data from a North Carolina pregnancy study, and applications to studies of assisted reproduction are described.
