Results 1 - 20 of 26
1.
Entropy (Basel) ; 26(3), 2024 Mar 11.
Article in English | MEDLINE | ID: mdl-38539760

ABSTRACT

We commonly encounter the problem of identifying an optimally weight-adjusted version of the empirical distribution of observed data, adhering to predefined constraints on the weights. Such constraints often manifest as restrictions on the moments, tail behavior, shapes, number of modes, etc., of the resulting weight-adjusted empirical distribution. In this article, we substantially enhance the flexibility of such a methodology by introducing a nonparametrically imbued distributional constraint on the weights and developing a general framework leveraging the maximum entropy principle and tools from optimal transport. The key idea is to ensure that the maximum entropy weight-adjusted empirical distribution of the observed data is close to a pre-specified probability distribution in terms of the optimal transport metric, while allowing for subtle departures. The proposed scheme for the re-weighting of observations subject to constraints is reminiscent of the empirical likelihood and related ideas, but offers greater flexibility in applications where parametric distribution-guided constraints arise naturally. The versatility of the proposed framework is demonstrated in the context of three disparate applications where data re-weighting is warranted to satisfy side constraints on the optimization problem at the heart of the statistical task, namely, portfolio allocation, semi-parametric inference for complex surveys, and ensuring algorithmic fairness in machine learning algorithms.
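To make the re-weighting idea concrete, here is a minimal sketch of maximum-entropy weights under a single moment constraint. It is a simplified stand-in, not the framework of the abstract: the optimal transport constraint is replaced by a fixed target mean, and the function name `maxent_weights` and the bisection bounds are assumptions chosen for illustration.

```python
import math

def maxent_weights(x, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum-entropy weights on the points x with sum(w * x) == target_mean.

    Assumption: one mean constraint only. The solution then has the
    exponential-tilting form w_i proportional to exp(lam * x_i), and the
    tilted mean increases with lam, so bisection on lam is enough.
    """
    if not (min(x) < target_mean < max(x)):
        raise ValueError("target_mean must lie strictly inside the data range")

    def tilted(lam):
        # Subtract the largest exponent before exponentiating for stability.
        shift = max(lam * xi for xi in x)
        e = [math.exp(lam * xi - shift) for xi in x]
        total = sum(e)
        w = [ei / total for ei in e]
        return w, sum(wi * xi for wi, xi in zip(w, x))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        _, mean = tilted(mid)
        if mean < target_mean:
            lo = mid
        else:
            hi = mid
    return tilted(0.5 * (lo + hi))[0]

weights = maxent_weights([1.0, 2.0, 3.0, 4.0], target_mean=3.0)
```

When the target equals the sample mean, the weights stay uniform, which is the unconstrained maximum-entropy solution.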

2.
Entropy (Basel) ; 26(1), 2024 Jan 11.
Article in English | MEDLINE | ID: mdl-38248188

ABSTRACT

The rise of machine learning-driven decision-making has sparked a growing emphasis on algorithmic fairness. Within the realm of clustering, the notion of balance is utilized as a criterion for attaining fairness, which characterizes a clustering mechanism as fair when the resulting clusters maintain a consistent proportion of observations representing individuals from distinct groups delineated by protected attributes. Building on this idea, the literature has rapidly incorporated a myriad of extensions, devising fair versions of the existing frequentist clustering algorithms, e.g., k-means, k-medoids, etc., that aim at minimizing specific loss functions. These approaches lack uncertainty quantification associated with the optimal clustering configuration and only provide clustering boundaries without quantifying the probabilities associated with each observation belonging to the different clusters. In this article, we intend to offer a novel probabilistic formulation of the fair clustering problem that facilitates valid uncertainty quantification even under mild model misspecifications, without incurring substantial computational overhead. Mixture model-based fair clustering frameworks facilitate automatic uncertainty quantification, but tend to showcase brittleness under model misspecification and involve significant computational challenges. To circumvent such issues, we propose a generalized Bayesian fair clustering framework that inherently enjoys decision-theoretic interpretation. Moreover, we devise efficient computational algorithms that crucially leverage techniques from the existing literature on optimal transport and clustering based on loss functions. The gain from the proposed technology is showcased via numerical experiments and real data examples.

3.
J Neurochem ; 165(6): 874-891, 2023 06.
Article in English | MEDLINE | ID: mdl-36945903

ABSTRACT

P2X receptors (P2X1-7) are trimeric ion channels activated by extracellular ATP. Each P2X subunit contains two transmembrane helices (TM1 and TM2). We substituted all residues in TM1 of rat P2X7 with alanine or leucine one by one, expressed mutants in HEK293T cells, and examined the pore permeability by recording both membrane currents and fluorescent dye uptake in response to agonist application. Alanine substitution of G27, K30, H34, Y40, F43, L45, M46, and D48 inhibited agonist-stimulated membrane current and dye uptake, and all but one substitution, D48A, prevented surface expression. Mutation V41A partially reduced both membrane current and dye uptake, while W31A and A44L showed reduced dye uptake not accompanied by reduced membrane current. Mutations T28A, I29A, and L33A showed small changes in agonist sensitivity, but they had no or small impact on dye uptake function. Replacing charged residues with residues of the same charge (K30R, H34K, and D48E) rescued receptor function, while replacement with residues of opposite charge inhibited (K30E and H34E) or potentiated (D48K) receptor function. Prolonged stimulation with agonist induced current facilitation and a leftward shift in the dose-response curve in the P2X7 wild-type and most functional mutants, but sensitization was absent in the W31A, L33A, and A44L mutants. Detailed analysis of the decay of responses revealed two kinetically distinct mechanisms of P2X7 deactivation: fast represents agonist unbinding, and slow might represent resetting of the receptor to the resting closed state. These results indicate that conserved and receptor-specific TM1 residues control surface expression of the P2X7 protein, non-polar residues control receptor sensitization, and D48 regulates intrinsic channel properties.


Subjects
Ion Channels, Purinergic P2X7 Receptors, Rats, Humans, Animals, HEK293 Cells, Biological Transport, Mutation/genetics, Protein Domains, Ion Channels/metabolism, Purinergic P2X7 Receptors/genetics, Purinergic P2X7 Receptors/metabolism, Adenosine Triphosphate/pharmacology, Adenosine Triphosphate/metabolism
4.
J Math Psychol ; 101, 2021 Apr.
Article in English | MEDLINE | ID: mdl-35496657

ABSTRACT

We describe a modified sequential probability ratio test that can be used to reduce the average sample size required to perform statistical hypothesis tests at specified levels of significance and power. Examples are provided for z tests, t tests, and tests of binomial success probabilities. A description of a software package to implement the test designs is provided. We compare the sample sizes required in fixed design tests conducted at 5% significance levels to the average sample sizes required in sequential tests conducted at 0.5% significance levels, and we find that the two sample sizes are approximately equal.
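For orientation, the following sketch implements Wald's classic sequential probability ratio test for a Bernoulli success probability. It is not the modified test described in the abstract, whose details are not given here; the function name `sprt_bernoulli` and the default error rates are assumptions for illustration.

```python
import math

def sprt_bernoulli(observations, p0, p1, alpha=0.05, beta=0.2):
    """Classic Wald SPRT of H0: p = p0 against H1: p = p1 (assumes p1 > p0).

    Returns (decision, number of observations used). Thresholds follow
    Wald's approximations: upper = log((1 - beta) / alpha),
    lower = log(beta / (1 - alpha)).
    """
    upper = math.log((1.0 - beta) / alpha)
    lower = math.log(beta / (1.0 - alpha))
    llr = 0.0  # running log-likelihood ratio of H1 versus H0
    n = 0
    for n, x in enumerate(observations, start=1):
        if x:
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1.0 - p1) / (1.0 - p0))
        if llr >= upper:
            return "reject H0", n
        if llr <= lower:
            return "accept H0", n
    return "continue", n  # data exhausted before either boundary was hit

decision, n_used = sprt_bernoulli([1, 1, 0, 1, 1, 1, 1, 1], p0=0.5, p1=0.8)
```

The test stops as soon as the accumulated evidence crosses either boundary, which is how sequential designs can use fewer observations on average than a fixed-sample test.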

5.
Entropy (Basel) ; 22(11), 2020 Nov 06.
Article in English | MEDLINE | ID: mdl-33287031

ABSTRACT

Variational algorithms have gained prominence over the past two decades as a scalable computational environment for Bayesian inference. In this article, we explore tools from the dynamical systems literature to study the convergence of coordinate ascent algorithms for mean field variational inference. Focusing on the Ising model defined on two nodes, we fully characterize the dynamics of the sequential coordinate ascent algorithm and its parallel version. We observe that in the regime where the objective function is convex, both algorithms are stable and exhibit convergence to the unique fixed point. Our analyses reveal interesting discordances between these two versions of the algorithm in the region where the objective function is non-convex. In fact, the parallel version exhibits a periodic oscillatory behavior which is absent in the sequential version. Drawing intuition from the Markov chain Monte Carlo literature, we empirically show that a parameter expansion of the Ising model, popularly called the Edwards-Sokal coupling, leads to an enlargement of the regime of convergence to the global optima.
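The contrast between sequential and parallel updates can be seen in a toy version of the two-node problem. The sketch below assumes spins in {-1, +1} and a model proportional to exp(beta * x1 * x2 + field * (x1 + x2)), for which the mean-field update of one node's mean is tanh(beta * other_mean + field). This parameterization, the function name `cavi_two_node`, and the chosen coupling values are illustrative assumptions, not taken from the article.

```python
import math

def cavi_two_node(beta, field=0.0, m_init=(0.5, 0.5), n_iter=200, parallel=False):
    """Mean-field coordinate updates for a two-node Ising model (toy sketch).

    m1, m2 are the variational means of the two spins. The sequential
    version uses the freshly updated m1 when updating m2; the parallel
    version updates both from the previous iterate.
    """
    m1, m2 = m_init
    path = [(m1, m2)]
    for _ in range(n_iter):
        if parallel:
            m1, m2 = math.tanh(beta * m2 + field), math.tanh(beta * m1 + field)
        else:
            m1 = math.tanh(beta * m2 + field)
            m2 = math.tanh(beta * m1 + field)
        path.append((m1, m2))
    return path

# Strong coupling in this toy setup: sequential settles, parallel keeps flipping.
seq_path = cavi_two_node(beta=-2.0, parallel=False)
par_path = cavi_two_node(beta=-2.0, parallel=True)
```

With the weaker coupling `beta=0.5`, both versions contract to the same fixed point at zero in this sketch, which mirrors the stable regime mentioned in the abstract.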

6.
Int J Mol Sci ; 21(22), 2020 Nov 10.
Article in English | MEDLINE | ID: mdl-33182845

ABSTRACT

Activation of the P2X7 receptor results in the opening of a large pore that plays a role in immune responses, apoptosis, and many other physiological and pathological processes. Here, we investigated the role of conserved and unique residues in the extracellular vestibule connecting the agonist-binding domain with the transmembrane domain of rat P2X7 receptor. We found that all residues that are conserved among the P2X receptor subtypes respond to alanine mutagenesis with an inhibition (Y51, Q52, and G323) or a significant decrease (K49, G326, K327, and F328) of 2',3'-O-(benzoyl-4-benzoyl)-ATP (BzATP)-induced current and permeability to ethidium bromide, while the nonconserved residue (F322), which is also present in P2X4 receptor, responds with a 10-fold higher sensitivity to BzATP, much slower deactivation kinetics, and a higher propensity to form the large dye-permeable pore. We examined the membrane expression of conserved mutants and found that Y51, Q52, G323, and F328 play a role in the trafficking of the receptor to the plasma membrane, while K49 controls receptor responsiveness to agonists. Finally, we studied the importance of the physicochemical properties of these residues and observed that the K49R, F322Y, F322W, and F322L mutants significantly reversed the receptor function, indicating that positively charged and large hydrophobic residues are important at positions 49 and 322, respectively. These results show that clusters of conserved residues above the transmembrane domain 1 (K49-Y51-Q52) and transmembrane domain 2 (G326-K327-F328) are important for receptor structure, membrane expression, and channel gating and that the nonconserved residue (F322) at the top of the extracellular vestibule is involved in hydrophobic inter-subunit interaction which stabilizes the closed state of the P2X7 receptor channel.


Subjects
Purinergic P2X7 Receptors/genetics, Purinergic P2X7 Receptors/metabolism, Amino Acid Sequence, Amino Acid Substitution, Animals, Bacterial Proteins/chemistry, Bacterial Proteins/genetics, Bacterial Proteins/metabolism, Conserved Sequence, HEK293 Cells, Humans, Ion Channel Gating, Kinetics, Luminescent Proteins/chemistry, Luminescent Proteins/genetics, Luminescent Proteins/metabolism, Molecular Models, Site-Directed Mutagenesis, Mutant Proteins/chemistry, Mutant Proteins/genetics, Mutant Proteins/metabolism, Protein Domains, Protein Interaction Domains and Motifs, Rats, Purinergic P2X7 Receptors/chemistry, Recombinant Fusion Proteins/chemistry, Recombinant Fusion Proteins/genetics, Recombinant Fusion Proteins/metabolism, Static Electricity
7.
Biometrika ; 107(1): 205-221, 2020 Mar.
Article in English | MEDLINE | ID: mdl-33100350

ABSTRACT

We develop a Bayesian methodology aimed at simultaneously estimating low-rank and row-sparse matrices in a high-dimensional multiple-response linear regression model. We consider a carefully devised shrinkage prior on the matrix of regression coefficients which obviates the need to specify a prior on the rank, and shrinks the regression matrix towards low-rank and row-sparse structures. We provide theoretical support to the proposed methodology by proving minimax optimality of the posterior mean under the prediction risk in ultra-high dimensional settings where the number of predictors can grow sub-exponentially relative to the sample size. A one-step post-processing scheme induced by group lasso penalties on the rows of the estimated coefficient matrix is proposed for variable selection, with default choices of tuning parameters. We additionally provide an estimate of the rank using a novel optimization function achieving dimension reduction in the covariate space. We exhibit the performance of the proposed methodology in an extensive simulation study and a real data example.

8.
Biometrics ; 76(1): 316-325, 2020 03.
Article in English | MEDLINE | ID: mdl-31393003

ABSTRACT

Accurate prognostic prediction using molecular information is a challenging area of research, which is essential to develop precision medicine. In this paper, we develop translational models to identify major actionable proteins that are associated with clinical outcomes, like the survival time of patients. There are considerable statistical and computational challenges due to the large dimension of the problems. Furthermore, data are available for different tumor types; hence data integration for various tumors is desirable. Having censored survival outcomes adds one more level of complexity to the inferential procedure. We develop Bayesian hierarchical survival models, which accommodate all the challenges mentioned here. We use the hierarchical Bayesian accelerated failure time model for survival regression. Furthermore, we assume a sparse horseshoe prior distribution for the regression coefficients to identify the major proteomic drivers. We borrow strength across tumor groups by introducing a correlation structure among the prior distributions. The proposed methods have been used to analyze data from the recently curated "The Cancer Proteome Atlas" (TCPA), which contains reverse-phase protein arrays-based high-quality protein expression data as well as detailed clinical annotation, including survival times. Our simulation and the TCPA data analysis illustrate the efficacy of the proposed integrative model, which links different tumors with the correlated prior structures.


Subjects
Biometry/methods, Neoplasms/metabolism, Neoplasms/mortality, Proteome/metabolism, Proteomics/statistics & numerical data, Bayes Theorem, Computer Simulation, Statistical Data Interpretation, Humans, Kidney Neoplasms/metabolism, Kidney Neoplasms/mortality, Markov Chains, Statistical Models, Monte Carlo Method, Prognosis, Protein Array Analysis/statistics & numerical data, Survival Analysis
9.
Med J Armed Forces India ; 75(3): 274-281, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31388229

ABSTRACT

BACKGROUND: Increased pulmonary ventilation helps lowlanders and natives to maintain arterial oxygenation at high altitudes. Natives of Ladakh have been shown to have similar ventilatory parameters as Tibetans at 3300 m. But there is limited literature comparing these parameters in Ladakhi natives with acclimatized lowland sojourners. METHODS: End-tidal carbon dioxide partial pressure (EtCO2), blood oxygen saturation (SpO2) and hemoglobin concentration (Hb) were measured in 276 participants, 126 native highlanders (NHL - 40 females, 86 males) and 150 acclimatized lowlanders (ALL - 60 females, 90 males). RESULTS: EtCO2 was greater in the NHL than in the ALL (33.8 ± 3.3 vs 31 ± 2.5 mmHg), although SpO2 was lower (90.9 ± 2.4 vs 91.7 ± 2.3%). When grouped by sex, NHL males had significantly greater EtCO2 than NHL females, ALL males and ALL females. Hb and calculated arterial oxygen content were similar in Ladakhis and acclimatized lowlanders, although greater in males compared to females. Systemic blood pressure, heart rate and the proportion of hypertensives were significantly greater in the ALL. CONCLUSION: Native Ladakhis have a significantly greater resting EtCO2 (especially in males) and lower SpO2 than acclimatized lowlanders. Blood Hb concentration and oxygen content are, however, similar in natives and acclimatized lowlanders of the same sex.

10.
Stat Sin ; 28(2): 1053-1078, 2018 Apr.
Article in English | MEDLINE | ID: mdl-29643721

ABSTRACT

Bayesian model selection procedures based on nonlocal alternative prior densities are extended to ultrahigh dimensional settings and compared to other variable selection procedures using precision-recall curves. Variable selection procedures included in these comparisons include methods based on g-priors, reciprocal lasso, adaptive lasso, SCAD, and minimax concave penalty criteria. The use of precision-recall curves eliminates the sensitivity of our conclusions to the choice of tuning parameters. We find that Bayesian variable selection procedures based on nonlocal priors are competitive with all other procedures in a range of simulation scenarios, and we subsequently explain this favorable performance through a theoretical examination of their consistency properties. When certain regularity conditions apply, we demonstrate that the nonlocal procedures are consistent for linear models even when the number of covariates p increases sub-exponentially with the sample size n. A model selection procedure based on Zellner's g-prior is also found to be competitive with penalized likelihood methods in identifying the true model, but the posterior distribution on the model space induced by this method is much more dispersed than the posterior distribution induced on the model space by the nonlocal prior methods. We investigate the asymptotic form of the marginal likelihood based on the nonlocal priors and show that it attains a unique term that cannot be derived from the other Bayesian model selection procedures. We also propose a scalable and efficient algorithm called Simplified Shotgun Stochastic Search with Screening (S5) to explore the enormous model space, and we show that S5 dramatically reduces the computing time without losing the capacity to search the interesting region in the model space, at least in the simulation settings considered. The S5 algorithm is available in an R package BayesS5 on CRAN.

11.
Ann Stat ; 45(1): 1-38, 2017.
Article in English | MEDLINE | ID: mdl-29332971

ABSTRACT

Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions.

13.
Biometrika ; 103(4): 985-991, 2016 12.
Article in English | MEDLINE | ID: mdl-28435166

ABSTRACT

We propose an efficient way to sample from a class of structured multivariate Gaussian distributions. The proposed algorithm only requires matrix multiplications and linear system solutions. Its computational complexity grows linearly with the dimension, unlike existing algorithms that rely on Cholesky factorizations with cubic complexity. The algorithm is broadly applicable in settings where Gaussian scale mixture priors are used on high-dimensional parameters. Its effectiveness is illustrated through a high-dimensional regression problem with a horseshoe prior on the regression coefficients. Other potential applications are outlined.

14.
Biometrics ; 72(1): 184-92, 2016 Mar.
Article in English | MEDLINE | ID: mdl-26394204

ABSTRACT

It is common in biomedical research to run case-control studies involving high-dimensional predictors, with the main goal being detection of the sparse subset of predictors having a significant association with disease. Usual analyses rely on independent screening, considering each predictor one at a time, or in some cases on logistic regression assuming no interactions. We propose a fundamentally different approach based on a nonparametric Bayesian low rank tensor factorization model for the retrospective likelihood. Our model allows a very flexible structure in characterizing the distribution of multivariate variables as unknown and without any linear assumptions as in logistic regression. Predictors are excluded only if they have no impact on disease risk, either directly or through interactions with other predictors. Hence, we obtain an omnibus approach for screening for important predictors. Computation relies on an efficient Gibbs sampler. The methods are shown to have high power and low false discovery rates in simulation studies, and we consider an application to an epidemiology study of birth defects.


Subjects
Bayes Theorem, Case-Control Studies, Congenital Abnormalities/epidemiology, Statistical Models, Nonparametric Statistics, Computer Simulation, Statistical Data Interpretation, Humans, Incidence, Newborn Infant, Reproducibility of Results, Risk Assessment/methods, Sample Size, Sensitivity and Specificity
15.
J Neurochem ; 133(6): 815-27, 2015 Jun.
Article in English | MEDLINE | ID: mdl-25712548

ABSTRACT

In the sustained presence of agonist, the opening of P2X7R channel is followed by pore dilatation, which causes an increase in its permeability to larger organic cations, accompanied by receptor sensitization. To explore the molecular mechanisms by which the conductivity and sensitivity are increased, we analyzed the electrophysiological properties and YO-PRO-1 uptake of selected alanine mutants in the first and second transmembrane domains of the rat P2X7R. Substitution of residues Y40, F43, G338, and D352 with alanine reduced membrane trafficking, and the D352A was practically non-functional. The Y40A and F43A mutants that were expressed in the membrane lacked pore dilation ability. Moreover, the Y40A and Y40F displayed desensitization, whereas the Y40W partially recovered receptor function. The G338A/S mutations favored the open state of the channel and displayed instantaneous permeability to larger organic cations. The G338P was non-functional. The L341A and G345A displayed normal trafficking, current amplitude, and sensitization, but both mutations resulted in a decreased pore formation and dye uptake. These results showed that the increase in P2X7R conductivity and sensitivity is critically dependent on residues Y40 and F43 in the TM1 domain and that the region located at the intersection of TM2 helices controls the rate of large pore opening. We investigated the mechanism of the proapoptotic receptor P2X7R's large pore opening and its sensitization. We found that aromatic residues in the upper part of the first transmembrane domain (TM1) are critical for both the P2X7R channel pore opening and receptor sensitization, and residues located at or below the intersection of the second transmembrane domains (TM2) control the rate of pore opening. These findings identify new residues involved in pore formation of P2X7R.


Subjects
Purinergic P2X7 Receptors/chemistry, Purinergic P2X7 Receptors/metabolism, Amino Acid Sequence, Animals, HEK293 Cells, Humans, Molecular Sequence Data, Site-Directed Mutagenesis, Mutation, Patch-Clamp Techniques, Tertiary Protein Structure, Protein Transport, Rats, Transfection
16.
J Am Stat Assoc ; 110(512): 1562-1576, 2015.
Article in English | MEDLINE | ID: mdl-31210707

ABSTRACT

It has become routine to collect data that are structured as multiway arrays (tensors). There is an enormous literature on low rank and sparse matrix factorizations, but limited consideration of extensions to the tensor case in statistics. The most common low rank tensor factorization relies on parallel factor analysis (PARAFAC), which expresses a rank k tensor as a sum of rank one tensors. When observations are only available for a tiny subset of the cells of a big tensor, the low rank assumption is not sufficient and PARAFAC has poor performance. We induce an additional layer of dimension reduction by allowing the effective rank to vary across dimensions of the table. For concreteness, we focus on a contingency table application. Taking a Bayesian approach, we place priors on terms in the factorization and develop an efficient Gibbs sampler for posterior computation. Theory is provided showing posterior concentration rates in high-dimensional settings, and the methods are shown to have excellent performance in simulations and several real data applications.
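The PARAFAC representation mentioned above, a rank k tensor written as a sum of k rank-one tensors, can be sketched for a three-way contingency table as follows. The function name `parafac_tensor` and the example weights and factor vectors are hypothetical; the extra layer of dimension reduction and the priors proposed in the article are not reproduced.

```python
def parafac_tensor(weights, a, b, c):
    """Build a 3-way tensor as a weighted sum of rank-one terms.

    Entry (i, j, k) equals sum over h of weights[h] * a[h][i] * b[h][j] * c[h][k].
    If the weights and every factor vector are probability vectors, the
    result is a valid joint probability table for three categorical variables.
    """
    rank = len(weights)
    n_i, n_j, n_k = len(a[0]), len(b[0]), len(c[0])
    return [
        [
            [
                sum(weights[h] * a[h][i] * b[h][j] * c[h][k] for h in range(rank))
                for k in range(n_k)
            ]
            for j in range(n_j)
        ]
        for i in range(n_i)
    ]

# Hypothetical rank-2 example for a 2 x 2 x 2 table.
weights = [0.6, 0.4]
a = [[0.9, 0.1], [0.2, 0.8]]
b = [[0.5, 0.5], [0.3, 0.7]]
c = [[1.0, 0.0], [0.0, 1.0]]
table = parafac_tensor(weights, a, b, c)
```

Each rank-one term factorizes across the three variables, so the number of free parameters grows with the rank rather than with the number of cells in the table.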

17.
J Am Stat Assoc ; 110(512): 1479-1490, 2015 Dec 01.
Article in English | MEDLINE | ID: mdl-27019543

ABSTRACT

Penalized regression methods, such as L1 regularization, are routinely used in high-dimensional applications, and there is a rich literature on optimality properties under sparsity assumptions. In the Bayesian paradigm, sparsity is routinely induced through two-component mixture priors having a probability mass at zero, but such priors encounter daunting computational problems in high dimensions. This has motivated continuous shrinkage priors, which can be expressed as global-local scale mixtures of Gaussians, facilitating computation. In contrast to the frequentist literature, little is known about the properties of such priors and the convergence and concentration of the corresponding posterior distribution. In this article, we propose a new class of Dirichlet-Laplace priors, which possess optimal posterior concentration and lead to efficient posterior computation. Finite sample performance of Dirichlet-Laplace priors relative to alternatives is assessed in simulated and real data examples.

18.
Ann Stat ; 42(1): 352-381, 2014 Feb 01.
Article in English | MEDLINE | ID: mdl-25288827

ABSTRACT

In nonparametric regression problems involving multiple predictors, there is typically interest in estimating an anisotropic multivariate regression surface in the important predictors while discarding the unimportant ones. Our focus is on defining a Bayesian procedure that leads to the minimax optimal rate of posterior contraction (up to a log factor) adapting to the unknown dimension and anisotropic smoothness of the true surface. We propose such an approach based on a Gaussian process prior with dimension-specific scalings, which are assigned carefully-chosen hyperpriors. We additionally show that using a homogeneous Gaussian process with a single bandwidth leads to a sub-optimal rate in anisotropic cases.

19.
J Neurosci ; 33(18): 8035-44, 2013 May 01.
Article in English | MEDLINE | ID: mdl-23637193

ABSTRACT

The hypothalamic suprachiasmatic nuclei (SCN), the circadian master clock in mammals, release ATP in a rhythm, but the role of extracellular ATP in the SCN is still unknown. In this study, we examined the expression and function of ATP-gated P2X receptors (P2XRs) in the SCN neurons of slices isolated from the brain of 16- to 20-day-old rats. Quantitative RT-PCR showed that the SCN contains mRNA for P2X1-7 receptors and several G-protein-coupled P2Y receptors. Among the P2XR subunits, the P2X2 > P2X7 > P2X4 mRNAs were the most abundant. Whole-cell patch-clamp recordings from SCN neurons revealed that extracellular ATP application increased the frequency of spontaneous GABAergic IPSCs without changes in their amplitudes. The effect of ATP appears to be mediated by presynaptic P2X2Rs because ATPγS and 2MeS-ATP mimic, while the P2XR antagonist PPADS blocks, the observed enhancement of the frequency of GABA currents. There were significant differences between two SCN regions in that the effect of ATP was higher in the ventrolateral subdivision, which is densely innervated from outside the SCN. Little evidence was found for the presence of P2XR channels in somata of SCN neurons as P2X2R immunoreactivity colocalized with synapsin and ATP-induced current was observed in only 7% of cells. In fura-2 AM-loaded slices, BzATP as well as ADP stimulated intracellular Ca2+ increase, indicating that the SCN cells express functional P2X7 and P2Y receptors. Our data suggest that ATP activates presynaptic P2X2Rs to regulate inhibitory synaptic transmission within the SCN and that this effect varies between regions.


Subjects
Adenosine Triphosphate/pharmacology, Neural Inhibition/drug effects, Neurons/drug effects, Suprachiasmatic Nucleus/cytology, Synaptic Transmission/drug effects, Animals, Newborn Animals, Biophysical Phenomena/drug effects, Calcium/metabolism, Cultured Cells, Drug Dose-Response Relationship, Excitatory Amino Acid Antagonists/pharmacology, Gene Expression Regulation/drug effects, In Vitro Techniques, Male, Patch-Clamp Techniques, Platelet Aggregation Inhibitors/pharmacology, Purinergic Agents/pharmacology, Messenger RNA/metabolism, Rats, Wistar Rats, Purinergic P2X Receptors/genetics, Purinergic P2X Receptors/metabolism, Sodium Channel Blockers/pharmacology, Synaptic Potentials/drug effects, Tetrodotoxin/pharmacology, gamma-Aminobutyric Acid/pharmacology
20.
J Am Stat Assoc ; 107(497): 362-377, 2012 Mar 01.
Article in English | MEDLINE | ID: mdl-23908561

ABSTRACT

Gaussian latent factor models are routinely used for modeling of dependence in continuous, binary, and ordered categorical data. For unordered categorical variables, Gaussian latent factor models lead to challenging computation and complex modeling structures. As an alternative, we propose a novel class of simplex factor models. In the single-factor case, the model treats the different categorical outcomes as independent with unknown marginals. The model can characterize flexible dependence structures parsimoniously with few factors, and as factors are added, any multivariate categorical data distribution can be accurately approximated. Using a Bayesian approach for computation and inferences, a Markov chain Monte Carlo (MCMC) algorithm is proposed that scales well with increasing dimension, with the number of factors treated as unknown. We develop an efficient proposal for updating the base probability vector in hierarchical Dirichlet models. Theoretical properties are described, and we evaluate the approach through simulation examples. Applications are described for modeling dependence in nucleotide sequences and prediction from high-dimensional categorical features.
