1.
Biometrika ; 109(3): 817-835, 2022 Sep.
Article in English | MEDLINE | ID: mdl-36105175

ABSTRACT

Factorization models express a statistical object of interest in terms of a collection of simpler objects. For example, a matrix or tensor can be expressed as a sum of rank-one components. However, in practice, it can be challenging to infer the relative impact of the different components as well as the number of components. A popular idea is to include infinitely many components having impact decreasing with the component index. This article is motivated by two limitations of existing methods: (1) lack of careful consideration of the within component sparsity structure; and (2) no accommodation for grouped variables and other non-exchangeable structures. We propose a general class of infinite factorization models that address these limitations. Theoretical support is provided, practical gains are shown in simulation studies, and an ecology application focusing on modelling bird species occurrence is discussed.

2.
Biometrika ; 108(2): 269-282, 2021 Jun.
Article in English | MEDLINE | ID: mdl-35747172

ABSTRACT

Posterior computation for high-dimensional data with many parameters can be challenging. This article focuses on a new method for approximating posterior distributions of a low- to moderate-dimensional parameter in the presence of a high-dimensional or otherwise computationally challenging nuisance parameter. The focus is on regression models and the key idea is to separate the likelihood into two components through a rotation. One component involves only the nuisance parameters, which can then be integrated out using a novel type of Gaussian approximation. We provide theory on approximation accuracy that holds for a broad class of forms of the nuisance component and priors. Applying our method to simulated and real data sets shows that it can outperform state-of-the-art posterior approximation approaches.

3.
Biometrika ; 105(2): 431-446, 2018 Jun.
Article in English | MEDLINE | ID: mdl-29880978

ABSTRACT

There has been substantial recent interest in record linkage, where one attempts to group the records pertaining to the same entities from one or more large databases that lack unique identifiers. This can be viewed as a type of microclustering, with few observations per cluster and a very large number of clusters. We show that the problem is fundamentally hard from a theoretical perspective and, even in idealized cases, accurate entity resolution is effectively impossible unless the number of entities is small relative to the number of records and/or the separation between records from different entities is extremely large. These results suggest conservatism in interpretation of the results of record linkage, support collection of additional data to more accurately disambiguate the entities, and motivate a focus on coarser inference. For example, results from a simulation study suggest that sometimes one may obtain accurate results for population size estimation even when fine-scale entity resolution is inaccurate.

4.
Biometrika ; 104(4): 939-952, 2017 Dec.
Article in English | MEDLINE | ID: mdl-29422695

ABSTRACT

We consider shape restricted nonparametric regression on a closed set [Formula: see text], where it is reasonable to assume the function has no more than H local extrema interior to [Formula: see text]. Following a Bayesian approach we develop a nonparametric prior over a novel class of local extremum splines. This approach is shown to be consistent when modeling any continuously differentiable function within the class considered, and is used to develop methods for testing hypotheses on the shape of the curve. Sampling algorithms are developed, and the method is applied in simulation studies and data examples where the shape of the curve is of interest.

5.
Stat Probab Lett ; 113: 41-48, 2016 Jun.
Article in English | MEDLINE | ID: mdl-31427835

ABSTRACT

In population studies, it is standard to sample data via designs in which the population is divided into strata, with the different strata assigned different probabilities of inclusion. Although there have been some proposals for including sample survey weights into Bayesian analyses, existing methods require complex models or ignore the stratified design underlying the survey weights. We propose a simple approach based on modeling the distribution of the selected sample as a mixture, with the mixture weights appropriately adjusted, while accounting for uncertainty in the adjustment. We focus for simplicity on Dirichlet process mixtures but the proposed approach can be applied more broadly. We sketch a simple Markov chain Monte Carlo algorithm for computation, and assess the approach via simulations and an application.

6.
Bioinformatics ; 31(24): 3890-6, 2015 Dec 15.
Article in English | MEDLINE | ID: mdl-26323717

ABSTRACT

MOTIVATION: Both single marker and simultaneous analysis face challenges in GWAS due to the large number of markers genotyped for a small number of subjects. This large p small n problem is particularly challenging when the trait under investigation has low heritability. METHOD: In this article, we propose a two-stage approach that is a hybrid method of single and simultaneous analysis designed to improve genomic prediction of complex traits. In the first stage, we use a Bayesian independent screening method to select the most promising SNPs. In the second stage, we rely on a hierarchical model to analyze the joint impact of the selected markers. The model is designed to take into account familial dependence in the different subjects, while using local-global shrinkage priors on the marker effects. RESULTS: We evaluate the performance in simulation studies, and consider an application to animal breeding data. The illustrative data analysis reveals an encouraging result in terms of prediction performance and computational cost.


Subjects
Genome-Wide Association Study/methods; Polymorphism, Single Nucleotide; Animals; Bayes Theorem; Breeding; Cattle; Genomics/methods; Genotype; Models, Genetic
7.
Biometrika ; 98(2): 291-306, 2011 Jun.
Article in English | MEDLINE | ID: mdl-23049129

ABSTRACT

We focus on sparse modelling of high-dimensional covariance matrices using Bayesian latent factor models. We propose a multiplicative gamma process shrinkage prior on the factor loadings which allows introduction of infinitely many factors, with the loadings increasingly shrunk towards zero as the column index increases. We use our prior on a parameter-expanded loading matrix to avoid the order dependence typical in factor analysis models and develop an efficient Gibbs sampler that scales well as data dimensionality increases. The gain in efficiency is achieved by the joint conjugacy property of the proposed prior, which allows block updating of the loadings matrix. We propose an adaptive Gibbs sampler for automatically truncating the infinite loading matrix through selection of the number of important factors. Theoretical results are provided on the support of the prior and truncation approximation bounds. A fast algorithm is proposed to produce approximate Bayes estimates. Latent factor regression methods are developed for prediction and variable selection in applications with high-dimensional correlated predictors. Operating characteristics are assessed through simulation studies, and the approach is applied to predict survival times from gene expression data.
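The shrinkage behaviour described in this abstract can be illustrated with a minimal sketch. It assumes a commonly used parameterization of a multiplicative gamma process prior, truncated at a finite number of columns: column precisions are cumulative products of gamma variables, and each loading also has its own local precision. The function name `sample_mgp_loadings` and the hyperparameter names and default values (`a1`, `a2`, `nu`) are illustrative assumptions, not taken from the abstract, and the sketch only draws from the prior; it is not the Gibbs sampler the abstract mentions.

```python
import random


def sample_mgp_loadings(p, H, a1=2.0, a2=3.0, nu=3.0, seed=0):
    """Draw a p x H loadings matrix from a truncated multiplicative
    gamma process prior (assumed parameterization, for illustration)."""
    rng = random.Random(seed)

    # delta_1 ~ Gamma(shape=a1, scale=1); delta_l ~ Gamma(shape=a2, scale=1) for l >= 2
    deltas = [rng.gammavariate(a1 if h == 0 else a2, 1.0) for h in range(H)]

    # tau_h is the cumulative product delta_1 * ... * delta_h:
    # the global precision applied to every loading in column h
    taus = []
    running = 1.0
    for d in deltas:
        running *= d
        taus.append(running)

    loadings = []
    for _ in range(p):
        row = []
        for h in range(H):
            # Local precision phi ~ Gamma(shape=nu/2, rate=nu/2), i.e. scale=2/nu
            phi = rng.gammavariate(nu / 2.0, 2.0 / nu)
            # Loading ~ Normal(0, 1 / (phi * tau_h))
            row.append(rng.gauss(0.0, (phi * taus[h]) ** -0.5))
        loadings.append(row)
    return loadings, taus


loadings, taus = sample_mgp_loadings(p=5, H=4)
```

Under these assumptions, choosing `a2` greater than 1 makes the column precisions grow in expectation with the column index, which is one way to obtain loadings that are increasingly shrunk towards zero in later columns.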

8.
Biometrika ; 98(1): 35-48, 2011 Mar.
Article in English | MEDLINE | ID: mdl-23956461

ABSTRACT

We consider geostatistical models that allow the locations at which data are collected to be informative about the outcomes. A Bayesian approach is proposed, which models the locations using a log Gaussian Cox process, while modelling the outcomes conditionally on the locations as Gaussian with a Gaussian process spatial random effect and adjustment for the location intensity process. We prove posterior propriety under an improper prior on the parameter controlling the degree of informative sampling, demonstrating that the data are informative. In addition, we show that the density of the locations and mean function of the outcome process can be estimated consistently under mild assumptions. The methods show significant evidence of informative sampling when applied to ozone data over Eastern U.S.A.

9.
Biometrics ; 60(3): 676-83, 2004 Sep.
Article in English | MEDLINE | ID: mdl-15339290

ABSTRACT

In studying rates of occurrence and progression of lesions (or tumors), it is typically not possible to obtain exact onset times for each lesion. Instead, data consist of the number of lesions that reach a detectable size between screening examinations, along with measures of the size/severity of individual lesions at each exam time. This interval-censored data structure makes it difficult to properly adjust for the onset time distribution in assessing covariate effects on rates of lesion progression. This article proposes a joint model for the multiple lesion onset and progression process, motivated by cross-sectional data from a study of uterine leiomyoma tumors. By using a joint model, one can potentially obtain more precise inferences on rates of onset, while also performing onset time-adjusted inferences on lesion severity. Following a Bayesian approach, we propose a data augmentation Markov chain Monte Carlo algorithm for posterior computation.


Subjects
Bayes Theorem; Neoplasms/etiology; Neoplasms/pathology; Adult; Algorithms; Biometry; Cross-Sectional Studies; Female; Humans; Leiomyomatosis/etiology; Leiomyomatosis/pathology; Markov Chains; Middle Aged; Models, Statistical; Monte Carlo Method; Stochastic Processes; Time Factors; Uterine Neoplasms/etiology; Uterine Neoplasms/pathology
10.
Hum Reprod ; 16(11): 2278-82, 2001 Nov.
Article in English | MEDLINE | ID: mdl-11679504

ABSTRACT

BACKGROUND: The TwoDay Algorithm is a simple method for identifying the fertile window. It classifies a day as fertile if cervical secretions are present on that day or were present on the day before. This approach may be an effective alternative to the ovulation and symptothermal methods for populations and programmes that find current natural family planning methods difficult to implement. METHODS: We used data on secretions from a large multinational European fecundability study to assess the relationship between the days predicted to be potentially fertile by the TwoDay Algorithm and the day-specific probabilities of pregnancy based on intercourse patterns in 434 conception cycles from the study. RESULTS: The days around ovulation that had the highest fecundability were the days most likely to be classified as fertile by the TwoDay Algorithm. In addition, intercourse on a particular day in the fertile interval was twice as likely to result in a pregnancy if cervical secretions were present on that day or the day before. CONCLUSIONS: The TwoDay Algorithm is effective, both in identifying the fertile days of the cycle and in predicting days within the fertile interval that have a high pregnancy rate. Our data provide the first direct evidence that cervical secretions are associated with higher fecundability within the fertile window.
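The classification rule stated in the BACKGROUND sentence can be written as a short sketch. The function name `two_day_fertile` and the boolean list input are hypothetical choices for illustration, and the sketch assumes that the day before the first recorded day had no secretions.

```python
def two_day_fertile(secretions):
    """Classify each day as fertile if secretions are present on that day
    or were present on the day before.

    `secretions` is a list of booleans, one per consecutive cycle day.
    Assumption: the day before the first recorded day counts as "absent".
    """
    fertile = []
    for day, present in enumerate(secretions):
        present_yesterday = day > 0 and secretions[day - 1]
        fertile.append(bool(present or present_yesterday))
    return fertile


# Secretions observed only on the second day: that day and the next are flagged.
print(two_day_fertile([False, True, False, False]))
# prints [False, True, True, False]
```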


Subjects
Algorithms; Cervix Uteri/metabolism; Fertility; Body Temperature; Cervix Mucus/metabolism; Cohort Studies; Coitus; Female; Humans; Ovulation Detection; Pregnancy; Probability; Prospective Studies
11.
Toxicol Lett ; 122(1): 33-44, 2001 May 31.
Article in English | MEDLINE | ID: mdl-11397555

ABSTRACT

The Tg.AC mouse carrying the v-Ha-ras structural gene is a useful model for the study of chemical carcinogens, especially those acting via non-genotoxic mechanisms. This study evaluated the efficacy of the non-toxic, water-soluble antioxidant from spinach, natural antioxidant (NAO), in reducing skin papilloma induction in female hemizygous Tg.AC mice treated dermally five times over 2.5 weeks with 2.5 µg 12-O-tetradecanoylphorbol-13-acetate (TPA). The TPA-only group was considered as a control; the other two groups received, additionally, NAO topically (2 mg) or orally (100 mg/kg), 5 days/week for 5 weeks. Papilloma counts made macroscopically during the clinical observations showed a significant decrease in multiplicity (P<0.01) in the NAO topically treated group. According to histological criteria, papilloma multiplicity was lower in both topical-NAO and oral-NAO groups, but significantly so only in the oral-NAO mice (P<0.01). The beneficial effect of NAO in the Tg.AC mouse is reported.


Subjects
Antioxidants/pharmacology; Papilloma/prevention & control; Skin Neoplasms/prevention & control; Administration, Cutaneous; Administration, Oral; Animals; Body Weight/drug effects; Carcinogens/adverse effects; Disease Models, Animal; Female; Genes, ras/genetics; Genotype; Mice; Mice, Transgenic; Papilloma/chemically induced; Papilloma/pathology; Plant Extracts/pharmacology; Skin Neoplasms/chemically induced; Skin Neoplasms/pathology; Spinacia oleracea/chemistry; Survival Analysis; Tetradecanoylphorbol Acetate/adverse effects
12.
Biometrics ; 57(2): 396-403, 2001 Jun.
Article in English | MEDLINE | ID: mdl-11414562

ABSTRACT

In some cross-sectional studies of chronic disease, data consist of the age at examination, whether the disease was present at the exam, and recall of the age at first diagnosis. This article describes a flexible parametric approach for combining current status and age at first diagnosis data. We assume that the log odds of onset by a given age and of detection by a given age conditional on onset by that age are nondecreasing functions of time plus linear combinations of covariates. Piecewise linear models are used to characterize changes across time in the baseline odds. Methods are described for accommodating informatively missing current status data and inferences based on the age-specific incidence of disease prior to a landmark event (e.g., puberty, menopause). Our formulation enables straightforward maximum likelihood estimation without requiring restrictive parametric or Markov assumptions. The methods are applied to data from a study of uterine fibroids.


Subjects
Models, Statistical; Adult; Age Factors; Chronic Disease; Cross-Sectional Studies; Disease Progression; Female; Humans; Leiomyoma/diagnosis; Leiomyoma/physiopathology; Middle Aged; Odds Ratio; Premenopause; Probability; Time Factors; Uterine Neoplasms/diagnosis; Uterine Neoplasms/physiopathology
13.
Am J Epidemiol ; 153(12): 1222-6, 2001 Jun 15.
Article in English | MEDLINE | ID: mdl-11415958

ABSTRACT

In the past decade, there have been enormous advances in the use of Bayesian methodology for analysis of epidemiologic data, and there are now many practical advantages to the Bayesian approach. Bayesian models can easily accommodate unobserved variables such as an individual's true disease status in the presence of diagnostic error. The use of prior probability distributions represents a powerful mechanism for incorporating information from previous studies and for controlling confounding. Posterior probabilities can be used as easily interpretable alternatives to p values. Recent developments in Markov chain Monte Carlo methodology facilitate the implementation of Bayesian analyses of complex data sets containing missing observations and multidimensional outcomes. Tools are now available that allow epidemiologists to take advantage of this powerful approach to assessment of exposure-disease relations.


Subjects
Bayes Theorem; Humans
14.
J Infect Dis ; 184(2): 127-35, 2001 Jul 15.
Article in English | MEDLINE | ID: mdl-11424008

ABSTRACT

Many human immunodeficiency virus (HIV)-infected persons receive prolonged treatment with DNA-reactive antiretroviral drugs. A prospective study was conducted of 26 HIV-infected men who provided samples before treatment and at multiple times after beginning treatment, to investigate effects of antiretrovirals on lymphocyte and sperm chromosomes and semen quality. Several antiretroviral regimens, all including a nucleoside component, were used. Lymphocyte metaphase analysis and sperm fluorescence in situ hybridization were used for cytogenetic studies. Semen analyses included conventional parameters (volume, concentration, viability, motility, and morphology). No significant effects on cytogenetic parameters, semen volume, or sperm concentration were detected. However, there were significant improvements in sperm motility for men with study entry CD4 cell counts >200 cells/mm³, sperm morphology for men with entry CD4 cell counts ≤200 cells/mm³, and the percentage of viable sperm in both groups. These findings suggest that nucleoside-containing antiretrovirals administered via recommended protocols do not induce chromosomal changes in lymphocytes or sperm but may produce improvements in semen quality.


Subjects
Anti-HIV Agents/adverse effects; Chromosome Breakage; Chromosomes/drug effects; HIV Infections/drug therapy; HIV Infections/immunology; Lymphocytes/drug effects; Metaphase/drug effects; Reverse Transcriptase Inhibitors/adverse effects; Spermatozoa/drug effects; Adult; Aneuploidy; Anti-HIV Agents/therapeutic use; CD4 Lymphocyte Count; Diploidy; Drug Therapy, Combination; Humans; In Situ Hybridization, Fluorescence; Longitudinal Studies; Lymphocytes/metabolism; Lymphocytes/pathology; Male; Middle Aged; Reverse Transcriptase Inhibitors/therapeutic use
15.
Contraception ; 63(4): 211-5, 2001 Apr.
Article in English | MEDLINE | ID: mdl-11376648

ABSTRACT

Emergency post-coital contraceptives effectively reduce the risk of pregnancy, but their degree of efficacy remains uncertain. Measurement of efficacy depends on the pregnancy rate without treatment, which cannot be measured directly. We provide indirect estimates of such pregnancy rates, using data from a prospective study of 221 women who were attempting to conceive. We previously estimated the probability of pregnancy with an act of intercourse relative to ovulation. In this article, we extend these data to estimate the probability of pregnancy relative to intercourse on a given cycle day (counting from onset of previous menses). In assessing the efficacy of post-coital contraceptives, other approaches have not incorporated accurate information on the variability of ovulation. We find that the possibility of late ovulation produces a persistent risk of pregnancy even into the sixth week of the cycle. Post-coital contraceptives may be indicated even when intercourse has occurred late in the cycle.


Subjects
Coitus; Contraceptives, Postcoital; Female; Humans; Menstrual Cycle; Ovulation; Pregnancy; Probability; Prospective Studies; Time Factors
16.
Stat Med ; 20(6): 965-78, 2001 Mar 30.
Article in English | MEDLINE | ID: mdl-11252016

ABSTRACT

In modelling human fertility one ideally accounts for timing of intercourse relative to ovulation. Measurement error in identifying the day of ovulation can bias estimates of fecundability parameters and attenuate estimates of covariate effects. In the absence of a single perfect marker of ovulation, several error prone markers are sometimes obtained. In this paper we propose a semi-parametric mixture model that uses multiple independent markers of ovulation to account for measurement error. The model assigns each method of assessing ovulation a distinct non-parametric error distribution, and corrects bias in estimates of day-specific fecundability. We use a Monte Carlo EM algorithm for joint estimation of (i) the error distribution for the markers, (ii) the error-corrected fertility parameters, and (iii) the couple-specific random effects. We apply the methods to data from a North Carolina fertility study to assess the magnitude of error in measures of ovulation based on urinary luteinizing hormone and metabolites of ovarian hormones, and estimate the corrected day-specific probabilities of clinical pregnancy. Published in 2001 by John Wiley & Sons, Ltd.


Subjects
Fertility/physiology; Models, Biological; Ovulation Detection/methods; Ovulation/physiology; Algorithms; Biomarkers; Corpus Luteum/physiology; Estrogens/urine; Female; Humans; Likelihood Functions; Luteinizing Hormone/urine; Male; North Carolina; Pregnancy; Progesterone/urine
17.
Biometrics ; 57(1): 302-8, 2001 Mar.
Article in English | MEDLINE | ID: mdl-11252614

ABSTRACT

This article describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censoring process, and we account for dependency between these latent variables through a hierarchical model. A linear model is used to relate covariates and latent variables to the primary outcomes for each subunit. A generalized linear model accounts for covariate and latent variable effects on the probability of censoring for subunits within each cluster. The model accounts for correlation within clusters and within subunits through a flexible factor analytic framework that allows multiple latent variables and covariate effects on the latent variables. The structure of the model facilitates implementation of Markov chain Monte Carlo methods for posterior estimation. Data from a spermatotoxicity study are analyzed to illustrate the proposed approach.


Subjects
Biometry; Models, Statistical; Animals; Cluster Analysis; Data Interpretation, Statistical; In Vitro Techniques; Male; Models, Biological; Multivariate Analysis; Rats; Sperm Motility/drug effects
18.
Biometrics ; 57(4): 1067-73, 2001 Dec.
Article in English | MEDLINE | ID: mdl-11764245

ABSTRACT

Time to pregnancy studies that identify ovulation days and collect daily intercourse data can be used to estimate the day-specific probabilities of conception given intercourse on a single day relative to ovulation. In this article, a Bayesian semiparametric model is described for flexibly characterizing covariate effects and heterogeneity among couples in daily fecundability. The proposed model is characterized by the timing of the most fertile day of the cycle relative to ovulation, by the probability of conception due to intercourse on the most fertile day, and by the ratios of the daily conception probabilities for other days of the cycle relative to this peak probability. The ratios are assumed to be increasing in time to the peak and decreasing thereafter. Generalized linear mixed models are used to incorporate covariate and couple-specific effects on the peak probability and on the day-specific ratios. A Markov chain Monte Carlo algorithm is described for posterior estimation, and the methods are illustrated through application to caffeine data from a North Carolina pregnancy study.


Subjects
Bayes Theorem; Fertility; Menstrual Cycle/physiology; Algorithms; Biometry; Caffeine/pharmacology; Female; Fertility/drug effects; Humans; Markov Chains; Models, Biological; Monte Carlo Method; Pregnancy; Time Factors
19.
Biostatistics ; 2(2): 131-45, 2001 Jun.
Article in English | MEDLINE | ID: mdl-12933545

ABSTRACT

Models of human fertility that incorporate information on timing of intercourse have assumed that a single ovum is released each menstrual cycle. These models are misspecified if two or more viable ova are sometimes released in a single cycle, which is known to occur in dizygotic twin pregnancies. In this paper, we propose a model for multiple ovulation in humans. We assume that the unobservable number of viable ova in each cycle follows a multinomial distribution. Successful fertilization of each ovum depends on the ability of the cycle to support a pregnancy and on the aggregate of a set of unobservable Bernoulli trials representing the fertilizing effects of intercourse on various days. Our model accommodates general covariate effects, allows for heterogeneity among couples, and accounts for a sterile subpopulation of couples. Information on early detection of pregnancy can be incorporated to estimate the probability of embryo loss. We outline a Markov chain Monte Carlo algorithm for estimation of the posterior distributions of the parameters. The methods are applied to data from a North Carolina pregnancy study, and applications to studies of assisted reproduction are described.

20.
Biometrics ; 56(4): 1068-75, 2000 Dec.
Article in English | MEDLINE | ID: mdl-11129462

ABSTRACT

In some types of cancer chemoprevention experiments and short-term carcinogenicity bioassays, the data consist of the number of observed tumors per animal and the times at which these tumors were first detected. In such studies, there is interest in distinguishing between treatment effects on the number of tumors induced by a known carcinogen and treatment effects on the tumor growth rate. Since animals may die before all induced tumors reach a detectable size, separation of these effects can be difficult. This paper describes a flexible parametric model for data of this type. Under our model, the tumor detection times are realizations of a delayed Poisson process that is characterized by the age-specific tumor induction rate and a random latency interval between tumor induction and detection. The model accommodates distinct treatment and animal-specific effects on the number of induced tumors (multiplicity) and the time to tumor detection (growth rate). A Gibbs sampler is developed for estimation of the posterior distributions of the parameters. The methods are illustrated through application to data from a breast cancer chemoprevention experiment.
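The data-generating idea in this abstract, tumor induction followed by a random latency before detection, can be illustrated with a simplified simulation. This is a sketch under stated assumptions rather than the model in the paper: it uses a constant induction rate instead of an age-specific one and an exponential latency distribution, and the function name `simulate_detections` and its parameters are hypothetical.

```python
import random


def simulate_detections(rate, mean_latency, end_time, rng):
    """Simulate one animal under a simplified delayed Poisson process.

    Assumptions (for illustration only): tumors are induced by a homogeneous
    Poisson process with constant `rate`, each tumor becomes detectable after
    an independent exponential latency with mean `mean_latency`, and the
    animal is observed until `end_time` (death or sacrifice).
    """
    # Induction times: cumulative sums of exponential inter-arrival gaps
    induced = []
    t = rng.expovariate(rate)
    while t < end_time:
        induced.append(t)
        t += rng.expovariate(rate)

    # Detection time = induction time + latency; tumors that would be
    # detected after end_time are never observed
    detected = sorted(
        d
        for d in (t0 + rng.expovariate(1.0 / mean_latency) for t0 in induced)
        if d <= end_time
    )
    return induced, detected


rng = random.Random(1)
induced, detected = simulate_detections(rate=0.2, mean_latency=5.0, end_time=40.0, rng=rng)
```

Comparing `len(induced)` with `len(detected)` shows why separating multiplicity from growth rate is hard: some induced tumors never reach detection before the end of observation.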


Subjects
Anticarcinogenic Agents/therapeutic use; Drug Screening Assays, Antitumor/methods; Neoplasms, Experimental/pathology; Neoplasms, Experimental/prevention & control; Vitamin A/analogs & derivatives; 9,10-Dimethyl-1,2-benzanthracene; Animals; Biometry/methods; Canthaxanthin/therapeutic use; Diterpenes; Female; Mammary Neoplasms, Experimental/chemically induced; Mammary Neoplasms, Experimental/pathology; Mammary Neoplasms, Experimental/prevention & control; Models, Statistical; Rats; Rats, Sprague-Dawley; Retinyl Esters; Vitamin A/therapeutic use