Results 1 - 20 of 9,235
1.
Nature ; 628(8009): 771-775, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38632399

ABSTRACT

Quantitative detection of various molecules at very low concentrations in complex mixtures has been the main objective in many fields of science and engineering, from the detection of cancer-causing mutagens and early disease markers to environmental pollutants and bioterror agents[1-5]. Moreover, technologies that can detect these analytes without external labels or modifications are extremely valuable and often preferred[6]. In this regard, surface-enhanced Raman spectroscopy can detect molecular species in complex mixtures on the basis only of their intrinsic and unique vibrational signatures[7]. However, the development of surface-enhanced Raman spectroscopy for this purpose has been challenging so far because of uncontrollable signal heterogeneity and poor reproducibility at low analyte concentrations[8]. Here, as a proof of concept, we show that, using digital (nano)colloid-enhanced Raman spectroscopy, reproducible quantification of a broad range of target molecules at very low concentrations can be routinely achieved with single-molecule counting, limited only by the Poisson noise of the measurement process. As metallic colloidal nanoparticles that enhance these vibrational signatures, including hydroxylamine-reduced-silver colloids, can be fabricated at large scale under routine conditions, we anticipate that digital (nano)colloid-enhanced Raman spectroscopy will become the technology of choice for the reliable and ultrasensitive detection of various analytes, including those of great importance for human health.
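
The quantification limit mentioned above is the shot-noise floor of digital counting: if single-molecule events are detected independently, the count N is Poisson distributed and its relative uncertainty is about 1/sqrt(N). A minimal Python sketch with illustrative numbers of my own, not data from the paper:

# Illustrative sketch (not from the paper): the relative uncertainty of a
# digital single-molecule count is set by Poisson (shot) noise, about 1/sqrt(N).
import numpy as np

rng = np.random.default_rng(0)
mean_events = 25.0                               # assumed mean events per counting experiment
counts = rng.poisson(mean_events, size=10_000)   # many repeated digital counting experiments
print("empirical relative sd:", round(counts.std() / counts.mean(), 3))
print("Poisson-limited prediction 1/sqrt(N):", round(1 / np.sqrt(mean_events), 3))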


Subjects
Colloids, Single Molecule Imaging, Raman Spectrum Analysis, Colloids/chemistry, Hydroxylamine/chemistry, Metal Nanoparticles/chemistry, Poisson Distribution, Proof of Concept Study, Reproducibility of Results, Silver/chemistry, Single Molecule Imaging/methods, Single Molecule Imaging/standards, Raman Spectrum Analysis/methods, Raman Spectrum Analysis/standards, Vibration
2.
Annu Rev Biochem ; 83: 813-41, 2014.
Article in English | MEDLINE | ID: mdl-24606136

ABSTRACT

Ions surround nucleic acids in what is referred to as an ion atmosphere. As a result, the folding and dynamics of RNA and DNA and their complexes with proteins and with each other cannot be understood without a reasonably sophisticated appreciation of these ions' electrostatic interactions. However, the underlying behavior of the ion atmosphere follows physical rules that are distinct from the rules of site binding that biochemists are most familiar and comfortable with. The main goal of this review is to familiarize nucleic acid experimentalists with the physical concepts that underlie nucleic acid-ion interactions. Throughout, we provide practical strategies for interpreting and analyzing nucleic acid experiments that avoid pitfalls from oversimplified or incorrect models. We briefly review the status of theories that predict or simulate nucleic acid-ion interactions and experiments that test these theories. Finally, we describe opportunities for going beyond phenomenological fits to a next-generation, truly predictive understanding of nucleic acid-ion interactions.


Subjects
Ions/chemistry, Nucleic Acids/chemistry, Algorithms, Binding Sites, Cations, X-Ray Crystallography, DNA/chemistry, Magnesium/chemistry, Metals/chemistry, Theoretical Models, Nucleic Acid Conformation, Poisson Distribution, RNA/chemistry, Software, Static Electricity, Thermodynamics
3.
Nature ; 613(7942): 130-137, 2023 01.
Article in English | MEDLINE | ID: mdl-36517599

ABSTRACT

The World Health Organization has a mandate to compile and disseminate statistics on mortality, and we have been tracking the progression of the COVID-19 pandemic since the beginning of 2020[1]. Reported statistics on COVID-19 mortality are problematic for many countries owing to variations in testing access, differential diagnostic capacity and inconsistent certification of COVID-19 as cause of death. Beyond what is directly attributable to it, the pandemic has caused extensive collateral damage that has led to losses of lives and livelihoods. Here we report a comprehensive and consistent measurement of the impact of the COVID-19 pandemic by estimating excess deaths, by month, for 2020 and 2021. We predict the pandemic period all-cause deaths in locations lacking complete reported data using an overdispersed Poisson count framework that applies Bayesian inference techniques to quantify uncertainty. We estimate 14.83 million excess deaths globally, 2.74 times more deaths than the 5.42 million reported as due to COVID-19 for the period. There are wide variations in the excess death estimates across the six World Health Organization regions. We describe the data and methods used to generate these estimates and highlight the need for better reporting where gaps persist. We discuss various summary measures, and the hazards of ranking countries' epidemic responses.
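
As a hedged sketch of the general excess-mortality recipe (not the WHO model itself, which is a far richer overdispersed Poisson framework with Bayesian uncertainty quantification), one can fit a quasi-Poisson baseline to pre-pandemic monthly deaths, project expected deaths for 2020-2021, and take excess = observed - expected. The data and design below are synthetic placeholders:

# Hedged sketch with synthetic data; not the WHO analysis.
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(1)
months = pd.date_range("2015-01-01", "2021-12-01", freq="MS")
df = pd.DataFrame({"t": np.arange(len(months)), "month": months.month}, index=months)
df["deaths"] = rng.poisson(np.exp(7 + 0.001 * df["t"] + 0.1 * np.cos(2 * np.pi * df["month"] / 12)))
df.loc["2020-03":, "deaths"] += rng.poisson(300, size=len(df.loc["2020-03":]))   # pandemic shock

def design(d):                                   # linear trend + month-of-year seasonality
    return sm.add_constant(pd.get_dummies(d["month"], drop_first=True, dtype=float).assign(t=d["t"]))

fit = sm.GLM(df.loc[:"2019-12", "deaths"], design(df.loc[:"2019-12"]),
             family=sm.families.Poisson()).fit(scale="X2")    # quasi-Poisson: overdispersion via scale
df["expected"] = np.asarray(fit.predict(design(df)))
excess = df.loc["2020":, "deaths"] - df.loc["2020":, "expected"]
print("total excess deaths 2020-2021 (synthetic):", int(excess.sum()))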


Assuntos
COVID-19 , Pandemias , Organização Mundial da Saúde , Humanos , Teorema de Bayes , COVID-19/mortalidade , Pandemias/estatística & dados numéricos , Incerteza , Distribuição de Poisson
4.
Nature ; 581(7809): 452-458, 2020 05.
Article in English | MEDLINE | ID: mdl-32461655

ABSTRACT

The acceleration of DNA sequencing in samples from patients and population studies has resulted in extensive catalogues of human genetic variation, but the interpretation of rare genetic variants remains problematic. A notable example of this challenge is the existence of disruptive variants in dosage-sensitive disease genes, even in apparently healthy individuals. Here, by manual curation of putative loss-of-function (pLoF) variants in haploinsufficient disease genes in the Genome Aggregation Database (gnomAD)[1], we show that one explanation for this paradox involves alternative splicing of mRNA, which allows exons of a gene to be expressed at varying levels across different cell types. Currently, no existing annotation tool systematically incorporates information about exon expression into the interpretation of variants. We develop a transcript-level annotation metric known as the 'proportion expressed across transcripts', which quantifies isoform expression for variants. We calculate this metric using 11,706 tissue samples from the Genotype-Tissue Expression (GTEx) project[2] and show that it can differentiate between weakly and highly evolutionarily conserved exons, a proxy for functional importance. We demonstrate that expression-based annotation selectively filters 22.8% of falsely annotated pLoF variants found in haploinsufficient disease genes in gnomAD, while removing less than 4% of high-confidence pathogenic variants in the same genes. Finally, we apply our expression filter to the analysis of de novo variants in patients with autism spectrum disorder and intellectual disability or developmental disorders to show that pLoF variants in weakly expressed regions have similar effect sizes to those of synonymous variants, whereas pLoF variants in highly expressed exons are most strongly enriched among cases. Our annotation is fast, flexible and generalizable, making it possible for any variant file to be annotated with any isoform expression dataset, and will be valuable for the genetic diagnosis of rare diseases, the analysis of rare variant burden in complex disorders, and the curation and prioritization of variants in recall-by-genotype studies.


Subjects
Disease/genetics, Haploinsufficiency/genetics, Loss of Function Mutation/genetics, Molecular Sequence Annotation, Genetic Transcription, Transcriptome/genetics, Autism Spectrum Disorder/genetics, Datasets as Topic, Developmental Disabilities/genetics, Exons/genetics, Female, Genotype, Humans, Intellectual Disability/genetics, Male, Molecular Sequence Annotation/standards, Poisson Distribution, Messenger RNA/analysis, Messenger RNA/genetics, Rare Diseases/diagnosis, Rare Diseases/genetics, Reproducibility of Results, Exome Sequencing
5.
Mol Biol Evol ; 41(6)2024 Jun 01.
Article in English | MEDLINE | ID: mdl-38693911

ABSTRACT

Modeling the rate at which adaptive phenotypes appear in a population is a key to predicting evolutionary processes. Given random mutations, should this rate be modeled by a simple Poisson process, or is a more complex dynamics needed? Here we use analytic calculations and simulations of evolving populations on explicit genotype-phenotype maps to show that the introduction of novel phenotypes can be "bursty" or overdispersed. In other words, a novel phenotype either appears multiple times in quick succession or not at all for many generations. These bursts are fundamentally caused by statistical fluctuations and other structure in the map from genotypes to phenotypes. Their strength depends on population parameters, being highest for "monomorphic" populations with low mutation rates. They can also be enhanced by additional inhomogeneities in the mapping from genotypes to phenotypes. We mainly investigate the effect of bursts using the well-studied genotype-phenotype map for RNA secondary structure, but find similar behavior in a lattice protein model and in Richard Dawkins's biomorphs model of morphological development. Bursts can profoundly affect adaptive dynamics. Most notably, they imply that fitness differences play a smaller role in determining which phenotype fixes than would be the case for a Poisson process without bursts.
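
A toy illustration of the burstiness described above (my own sketch, not the paper's genotype-phenotype-map simulations): when the per-generation chance of producing a novel phenotype fluctuates, appearance counts become overdispersed (variance greater than mean) relative to a Poisson process with the same mean.

# Toy sketch: a fluctuating introduction rate yields bursty, overdispersed counts.
import numpy as np

rng = np.random.default_rng(2)
gens = 100_000
primed = rng.random(gens) < 0.1                  # assumed: 10% of generations are "primed"
rate = np.where(primed, 2.0, 0.02)               # novel phenotypes appear mostly when primed
bursty = rng.poisson(rate)                       # doubly stochastic (overdispersed) counts
plain = rng.poisson(rate.mean(), size=gens)      # Poisson with the same mean, for comparison
for name, c in [("bursty", bursty), ("plain Poisson", plain)]:
    print(name, "mean:", round(c.mean(), 3), "variance/mean:", round(c.var() / c.mean(), 2))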


Assuntos
Modelos Genéticos , Fenótipo , Genótipo , Simulação por Computador , Adaptação Fisiológica/genética , Evolução Molecular , Mutação , Evolução Biológica , Distribuição de Poisson , RNA/genética , Adaptação Biológica/genética
6.
Brief Bioinform ; 24(5)2023 09 20.
Article in English | MEDLINE | ID: mdl-37507115

ABSTRACT

Single cell RNA-sequencing (scRNA-seq) technology has significantly advanced the understanding of transcriptomic signatures. Although various statistical models have been used to describe the distribution of gene expression across cells, a comprehensive assessment of the different models is missing. Moreover, the growing number of features associated with scRNA-seq datasets creates new challenges for analytical accuracy and computing speed. Here, we developed a Python-based package (TensorZINB) to solve the zero-inflated negative binomial (ZINB) model using the TensorFlow deep learning framework. We used a sequential initialization method to solve the numerical stability issues associated with hurdle and zero-inflated models. A recursive feature selection protocol was used to optimize feature selections for data processing and downstream differentially expressed gene (DEG) analysis. We proposed a class of hybrid models combining nested models to further improve the model's performance. Additionally, we developed a new method to convert a continuous distribution to its equivalent discrete form, so that statistical models can be fairly compared. Finally, we showed that the proposed TensorFlow algorithm (TensorZINB) was numerically stable and that its computing speed and performance were superior to those of existing ZINB solvers. Moreover, we implemented seven hurdle and zero-inflated statistical models in Python and systematically assessed their performance using a real scRNA-seq dataset. We demonstrated that the ZINB model achieved the lowest Akaike information criterion compared with other models tested. Taken together, TensorZINB was accurate, efficient and scalable for the implementation of ZINB and for large-scale scRNA-seq data analysis with DEG identification.
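
TensorZINB itself is not reproduced here; purely as a point of reference, a ZINB fit and an AIC comparison against a plain negative binomial can be sketched with statsmodels on synthetic counts (all names and parameters below are my own):

# Reference sketch using statsmodels, not the TensorZINB package: compare AIC of
# a zero-inflated negative binomial and a plain negative binomial on synthetic,
# zero-inflated counts, mirroring the model comparison described in the abstract.
import numpy as np
import statsmodels.api as sm
from statsmodels.discrete.count_model import ZeroInflatedNegativeBinomialP

rng = np.random.default_rng(3)
n = 2000
x = rng.normal(size=n)
mu = np.exp(0.5 + 0.8 * x)
nb_counts = rng.negative_binomial(2.0, 2.0 / (2.0 + mu))      # NB draws with mean mu
counts = np.where(rng.random(n) < 0.3, 0, nb_counts)          # add 30% structural zeros
X = sm.add_constant(x)

zinb = ZeroInflatedNegativeBinomialP(counts, X, exog_infl=np.ones((n, 1))).fit(disp=False, maxiter=200)
nb = sm.NegativeBinomial(counts, X).fit(disp=False, maxiter=200)
print("AIC  ZINB:", round(zinb.aic, 1), "  NB:", round(nb.aic, 1))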


Assuntos
Perfilação da Expressão Gênica , Modelos Estatísticos , Distribuição de Poisson , Perfilação da Expressão Gênica/métodos , RNA , Análise de Sequência de RNA/métodos
7.
PLoS Comput Biol ; 20(2): e1011856, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38330050

ABSTRACT

Outbreaks of emerging and zoonotic infections represent a substantial threat to human health and well-being. These outbreaks tend to be characterised by highly stochastic transmission dynamics with intense variation in transmission potential between cases. The negative binomial distribution is commonly used as a model for transmission in the early stages of an epidemic as it has a natural interpretation as the convolution of a Poisson contact process and a gamma-distributed infectivity. In this study we expand upon the negative binomial model by introducing a beta-Poisson mixture model in which infectious individuals make contacts at the points of a Poisson process and then transmit infection along these contacts with a beta-distributed probability. We show that the negative binomial distribution is a limit case of this model, as is the zero-inflated Poisson distribution obtained by combining a Poisson-distributed contact process with an additional failure probability. We assess the beta-Poisson model's applicability by fitting it to secondary case distributions (the distribution of the number of subsequent cases generated by a single case) estimated from outbreaks covering a range of pathogens and geographical settings. We find that while the beta-Poisson mixture can achieve a closer fit to data than the negative binomial distribution, it is consistently outperformed by the negative binomial in terms of Akaike Information Criterion, making it a suboptimal choice on parsimonious grounds. The beta-Poisson performs similarly to the negative binomial model in its ability to capture features of the secondary case distribution such as overdispersion, prevalence of superspreaders, and the probability of a case generating zero subsequent cases. Despite this possible shortcoming, the beta-Poisson distribution may still be of interest in the context of intervention modelling since its structure allows for the simulation of measures which change contact structures while leaving individual-level infectivity unchanged, and vice versa.
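
To make the construction concrete, here is a short simulation of the beta-Poisson secondary-case distribution described above (illustrative parameters of my own, not fitted values from the paper): each case draws a Poisson number of contacts and a beta-distributed transmission probability, and secondary cases are binomial given both.

# Simulation sketch of the beta-Poisson mixture: Poisson contacts, beta-distributed
# per-case transmission probability, binomial transmission along contacts.
import numpy as np

rng = np.random.default_rng(4)
cases = 100_000
lam, a, b = 10.0, 0.4, 1.6                       # assumed contact rate and beta shape parameters
contacts = rng.poisson(lam, size=cases)
p = rng.beta(a, b, size=cases)                   # infectivity varies between cases
secondary = rng.binomial(contacts, p)

R = secondary.mean()                             # roughly lam * a / (a + b)
print("mean R:", round(R, 2), " variance/mean:", round(secondary.var() / R, 2),
      " P(no onward transmission):", round((secondary == 0).mean(), 2))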


Assuntos
Surtos de Doenças , Modelos Estatísticos , Humanos , Simulação por Computador , Distribuição de Poisson , Distribuição Binomial
8.
PLoS Comput Biol ; 20(8): e1012324, 2024 Aug.
Article in English | MEDLINE | ID: mdl-39106282

ABSTRACT

To understand the transmissibility and spread of infectious diseases, epidemiologists turn to estimates of the instantaneous reproduction number. While many estimation approaches exist, their utility may be limited. Challenges of surveillance data collection, model assumptions that are unverifiable with data alone, and computationally inefficient frameworks are critical limitations for many existing approaches. We propose a discrete spline-based approach that solves a convex optimization problem (Poisson trend filtering) using the proximal Newton method. It produces a locally adaptive estimator for instantaneous reproduction number estimation with heterogeneous smoothness. Our methodology remains accurate even under some process misspecifications and is computationally efficient, even for large-scale data. The implementation is easily accessible in a lightweight R package rtestim.
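
The paper's implementation is the R package rtestim; as an assumption-laden sketch of the underlying convex problem only, Poisson trend filtering on a generic count series can be written with cvxpy (a generic solver stands in for the proximal Newton method, and the delay-distribution weighting needed for true reproduction-number estimation is omitted):

# Sketch of the Poisson trend filtering objective: negative Poisson log-likelihood
# plus an L1 penalty on second differences of the log rate (piecewise-linear fits).
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(5)
n = 120
true_log_rate = np.concatenate([np.linspace(1, 3, 60), np.linspace(3, 2, 60)])
y = rng.poisson(np.exp(true_log_rate))

theta = cp.Variable(n)                            # log incidence rate
D2 = np.diff(np.eye(n), n=2, axis=0)              # second-difference operator
lam = 20.0                                        # assumed smoothing level
loss = cp.sum(cp.exp(theta) - cp.multiply(y, theta))    # Poisson negative log-likelihood (up to constants)
cp.Problem(cp.Minimize(loss + lam * cp.norm1(D2 @ theta))).solve()
print("max abs error in fitted log rate:", round(float(np.max(np.abs(theta.value - true_log_rate))), 2))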


Assuntos
Algoritmos , Número Básico de Reprodução , Humanos , Biologia Computacional/métodos , Doenças Transmissíveis/epidemiologia , Simulação por Computador , Software , Modelos Epidemiológicos , Distribuição de Poisson , Modelos Estatísticos
9.
Biophys J ; 123(17): 2807-2814, 2024 Sep 03.
Article in English | MEDLINE | ID: mdl-38356263

ABSTRACT

Electrostatics is of paramount importance to chemistry, physics, biology, and medicine. The Poisson-Boltzmann (PB) theory is a primary model for electrostatic analysis. However, it is highly challenging to compute accurate PB electrostatic solvation free energies for macromolecules due to the nonlinearity, dielectric jumps, charge singularity, and geometric complexity associated with the PB equation. The present work introduces a PB-based machine learning (PBML) model for biomolecular electrostatic analysis. Trained with the second-order accurate MIBPB solver, the proposed PBML model is found to be more accurate and faster than several eminent PB solvers in electrostatic analysis. The proposed PBML model can provide highly accurate PB electrostatic solvation free energy of new biomolecules or new conformations generated by molecular dynamics with much reduced computational cost.
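
The PBML model itself is not reproduced here; to fix ideas about what a PB solver computes, a deliberately minimal 1D linearized (Debye-Huckel) example can be solved by finite differences and checked against the analytic screened potential (a toy of my own, far simpler than the 3D interface problem handled by MIBPB):

# Toy 1D linearized Poisson-Boltzmann problem: phi'' = kappa^2 * phi with
# phi(0) = phi0 and phi(L) = 0, whose solution is close to phi0 * exp(-kappa * x).
import numpy as np

kappa, phi0, L, n = 1.0, 1.0, 10.0, 401
x = np.linspace(0.0, L, n)
h = x[1] - x[0]

A = np.zeros((n, n))
b = np.zeros(n)
A[0, 0] = A[-1, -1] = 1.0                         # Dirichlet boundary conditions
b[0] = phi0
for i in range(1, n - 1):                         # interior finite-difference rows
    A[i, i - 1] = A[i, i + 1] = 1.0 / h**2
    A[i, i] = -2.0 / h**2 - kappa**2

phi = np.linalg.solve(A, b)
print("max abs error vs analytic exp(-kappa*x):", float(np.abs(phi - phi0 * np.exp(-kappa * x)).max()))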


Assuntos
Aprendizado de Máquina , Eletricidade Estática , Simulação de Dinâmica Molecular , Distribuição de Poisson , Termodinâmica
10.
BMC Bioinformatics ; 25(1): 168, 2024 Apr 27.
Article in English | MEDLINE | ID: mdl-38678218

ABSTRACT

This study investigates the impact of spatio-temporal correlation using four spatio-temporal models: Spatio-Temporal Poisson Linear Trend Model (SPLTM), Poisson Temporal Model (TMS), Spatio-Temporal Poisson Anova Model (SPAM), and Spatio-Temporal Poisson Separable Model (STSM) concerning food security and nutrition in Africa. Evaluating model goodness of fit using the Watanabe Akaike Information Criterion (WAIC) and assessing bias through root mean square error and mean absolute error values revealed a consistent monotonic pattern. SPLTM consistently demonstrates a propensity for overestimating food security, while TMS exhibits a diverse bias profile, shifting between overestimation and underestimation based on varying correlation settings. SPAM emerges as a beacon of reliability, showcasing minimal bias and WAIC across diverse scenarios, while STSM consistently underestimates food security, particularly in regions marked by low to moderate spatio-temporal correlation. SPAM consistently outperforms other models, making it a top choice for modeling food security and nutrition dynamics in Africa. This research highlights the impact of spatial and temporal correlations on food security and nutrition patterns and provides guidance for model selection and refinement. Researchers are encouraged to meticulously evaluate the biases and goodness of fit characteristics of models, ensuring their alignment with the specific attributes of their data and research goals. This knowledge empowers researchers to select models that offer reliability and consistency, enhancing the applicability of their findings.


Assuntos
Segurança Alimentar , África , Segurança Alimentar/métodos , Análise Espaço-Temporal , Humanos , Simulação por Computador , Distribuição de Poisson
11.
N Engl J Med ; 385(15): 1393-1400, 2021 10 07.
Article in English | MEDLINE | ID: mdl-34525275

ABSTRACT

BACKGROUND: On July 30, 2021, the administration of a third (booster) dose of the BNT162b2 messenger RNA vaccine (Pfizer-BioNTech) was approved in Israel for persons who were 60 years of age or older and who had received a second dose of vaccine at least 5 months earlier. Data are needed regarding the effect of the booster dose on the rate of confirmed coronavirus disease 2019 (Covid-19) and the rate of severe illness. METHODS: We extracted data for the period from July 30 through August 31, 2021, from the Israeli Ministry of Health database regarding 1,137,804 persons who were 60 years of age or older and had been fully vaccinated (i.e., had received two doses of BNT162b2) at least 5 months earlier. In the primary analysis, we compared the rate of confirmed Covid-19 and the rate of severe illness between those who had received a booster injection at least 12 days earlier (booster group) and those who had not received a booster injection (nonbooster group). In a secondary analysis, we evaluated the rate of infection 4 to 6 days after the booster dose as compared with the rate at least 12 days after the booster. In all the analyses, we used Poisson regression after adjusting for possible confounding factors. RESULTS: At least 12 days after the booster dose, the rate of confirmed infection was lower in the booster group than in the nonbooster group by a factor of 11.3 (95% confidence interval [CI], 10.4 to 12.3); the rate of severe illness was lower by a factor of 19.5 (95% CI, 12.9 to 29.5). In a secondary analysis, the rate of confirmed infection at least 12 days after vaccination was lower than the rate after 4 to 6 days by a factor of 5.4 (95% CI, 4.8 to 6.1). CONCLUSIONS: In this study involving participants who were 60 years of age or older and had received two doses of the BNT162b2 vaccine at least 5 months earlier, we found that the rates of confirmed Covid-19 and severe illness were substantially lower among those who received a booster (third) dose of the BNT162b2 vaccine.
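
The adjusted rate comparisons above follow the standard Poisson-regression-with-offset construction; a hedged sketch on synthetic data (not the Israeli Ministry of Health records, and with made-up covariates) of how such a rate ratio is obtained:

# Generic sketch: Poisson regression of case counts with a log person-days offset;
# exp(coefficient) is the covariate-adjusted rate ratio for booster vs. no booster.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
n = 50_000
df = pd.DataFrame({"booster": rng.integers(0, 2, n),
                   "age_grp": rng.integers(0, 3, n),           # assumed confounder
                   "days": rng.integers(5, 30, n)})            # person-days at risk
rate = 0.002 * np.exp(0.3 * df["age_grp"]) * np.where(df["booster"] == 1, 1 / 11.3, 1.0)
df["cases"] = rng.poisson(df["days"] * rate)

fit = smf.glm("cases ~ booster + C(age_grp)", data=df,
              family=sm.families.Poisson(), offset=np.log(df["days"])).fit()
print("rate lower in booster group by a factor of about", round(1 / np.exp(fit.params["booster"]), 1))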


Assuntos
Vacinas contra COVID-19 , COVID-19/prevenção & controle , Imunização Secundária , Idoso , Idoso de 80 Anos ou mais , Vacina BNT162 , COVID-19/epidemiologia , Bases de Dados Factuais , Feminino , Humanos , Israel/epidemiologia , Masculino , Pessoa de Meia-Idade , Gravidade do Paciente , Distribuição de Poisson , SARS-CoV-2
12.
N Engl J Med ; 385(24): e85, 2021 12 09.
Article in English | MEDLINE | ID: mdl-34706170

ABSTRACT

BACKGROUND: In December 2020, Israel began a mass vaccination campaign against coronavirus disease 2019 (Covid-19) by administering the BNT162b2 vaccine, which led to a sharp curtailing of the outbreak. After a period with almost no cases of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection, a resurgent Covid-19 outbreak began in mid-June 2021. Possible reasons for the resurgence were reduced vaccine effectiveness against the delta (B.1.617.2) variant and waning immunity. The extent of waning immunity of the vaccine against the delta variant in Israel is unclear. METHODS: We used data on confirmed infection and severe disease collected from an Israeli national database for the period of July 11 to 31, 2021, for all Israeli residents who had been fully vaccinated before June 2021. We used a Poisson regression model to compare rates of confirmed SARS-CoV-2 infection and severe Covid-19 among persons vaccinated during different time periods, with stratification according to age group and with adjustment for possible confounding factors. RESULTS: Among persons 60 years of age or older, the rate of infection in the July 11-31 period was higher among persons who became fully vaccinated in January 2021 (when they were first eligible) than among those fully vaccinated 2 months later, in March (rate ratio, 1.6; 95% confidence interval [CI], 1.3 to 2.0). Among persons 40 to 59 years of age, the rate ratio for infection among those fully vaccinated in February (when they were first eligible), as compared with 2 months later, in April, was 1.7 (95% CI, 1.4 to 2.1). Among persons 16 to 39 years of age, the rate ratio for infection among those fully vaccinated in March (when they were first eligible), as compared with 2 months later, in May, was 1.6 (95% CI, 1.3 to 2.0). The rate ratio for severe disease among persons fully vaccinated in the month when they were first eligible, as compared with those fully vaccinated in March, was 1.8 (95% CI, 1.1 to 2.9) among persons 60 years of age or older and 2.2 (95% CI, 0.6 to 7.7) among those 40 to 59 years of age; owing to small numbers, the rate ratio could not be calculated among persons 16 to 39 years of age. CONCLUSIONS: These findings indicate that immunity against the delta variant of SARS-CoV-2 waned in all age groups a few months after receipt of the second dose of vaccine.


Assuntos
Anticorpos Neutralizantes/sangue , Vacina BNT162/imunologia , COVID-19/epidemiologia , Imunogenicidade da Vacina , SARS-CoV-2 , Eficácia de Vacinas , Adolescente , Adulto , Idoso , Anticorpos Antivirais/sangue , COVID-19/imunologia , COVID-19/prevenção & controle , Feminino , Humanos , Imunização Secundária , Israel/epidemiologia , Masculino , Pessoa de Meia-Idade , Gravidade do Paciente , Distribuição de Poisson , Análise de Regressão , Fatores Socioeconômicos , Fatores de Tempo
13.
Phys Rev Lett ; 132(22): 228401, 2024 May 31.
Article in English | MEDLINE | ID: mdl-38877921

ABSTRACT

During electrochemical signal transmission through synapses, triggered by an action potential (AP), a stochastic number of synaptic vesicles (SVs), called the "quantal content," release neurotransmitters in the synaptic cleft. It is widely accepted that the quantal content probability distribution is a binomial based on the number of ready-release SVs in the presynaptic terminal. But the latter number itself fluctuates due to its stochastic replenishment, hence the actual distribution of quantal content is unknown. We show that exact distribution of quantal content can be derived for general stochastic AP inputs in the steady state. For fixed interval AP train, we prove that the distribution is a binomial, and corroborate our predictions by comparison with electrophysiological recordings from MNTB-LSO synapses of juvenile mice. For a Poisson train, we show that the distribution is nonbinomial. Moreover, we find exact moments of the quantal content in the Poisson and other general cases, which may be used to obtain the model parameters from experiments.
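
A small simulation sketch of the mechanism described above (illustrative parameters of my own, not the paper's exact model or the MNTB-LSO recordings): the ready-release pool is replenished stochastically between fixed-interval APs, each docked vesicle releases independently, and the steady-state quantal content reflects both processes.

# Simulation sketch: stochastic refill of release sites between APs, independent
# release of docked vesicles at each AP, steady-state quantal content statistics.
import numpy as np

rng = np.random.default_rng(7)
n_sites, p_release, p_refill, n_aps = 20, 0.4, 0.6, 50_000    # assumed parameters

docked = np.ones(n_sites, dtype=bool)
quantal = np.empty(n_aps, dtype=int)
for t in range(n_aps):
    empty = ~docked
    docked[empty] = rng.random(empty.sum()) < p_refill        # replenishment between APs
    released = docked & (rng.random(n_sites) < p_release)     # independent release at the AP
    quantal[t] = released.sum()
    docked[released] = False

q = quantal[1000:]                                            # discard the initial transient
print("mean quantal content:", round(q.mean(), 2), " variance/mean:", round(q.var() / q.mean(), 2))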


Assuntos
Modelos Neurológicos , Transmissão Sináptica , Vesículas Sinápticas , Transmissão Sináptica/fisiologia , Animais , Camundongos , Vesículas Sinápticas/fisiologia , Vesículas Sinápticas/metabolismo , Potenciais de Ação/fisiologia , Processos Estocásticos , Distribuição de Poisson
14.
Neural Comput ; 36(8): 1449-1475, 2024 Jul 19.
Article in English | MEDLINE | ID: mdl-39028957

ABSTRACT

Dimension reduction on neural activity paves a way for unsupervised neural decoding by dissociating the measurement of internal neural pattern reactivation from the measurement of external variable tuning. With assumptions only on the smoothness of latent dynamics and of internal tuning curves, the Poisson gaussian-process latent variable model (P-GPLVM; Wu et al., 2017) is a powerful tool to discover the low-dimensional latent structure for high-dimensional spike trains. However, when given novel neural data, the original model lacks a method to infer their latent trajectories in the learned latent space, limiting its ability for estimating the neural reactivation. Here, we extend the P-GPLVM to enable the latent variable inference of new data constrained by previously learned smoothness and mapping information. We also describe a principled approach for the constrained latent variable inference for temporally compressed patterns of activity, such as those found in population burst events during hippocampal sharp-wave ripples, as well as metrics for assessing the validity of neural pattern reactivation and inferring the encoded experience. Applying these approaches to hippocampal ensemble recordings during active maze exploration, we replicate the result that P-GPLVM learns a latent space encoding the animal's position. We further demonstrate that this latent space can differentiate one maze context from another. By inferring the latent variables of new neural data during running, certain neural patterns are observed to reactivate, in accordance with the similarity of experiences encoded by its nearby neural trajectories in the training data manifold. Finally, reactivation of neural patterns can be estimated for neural activity during population burst events as well, allowing the identification for replay events of versatile behaviors and more general experiences. Thus, our extension of the P-GPLVM framework for unsupervised analysis of neural activity can be used to answer critical questions related to scientific discovery.


Assuntos
Hipocampo , Modelos Neurológicos , Neurônios , Animais , Distribuição Normal , Distribuição de Poisson , Neurônios/fisiologia , Hipocampo/fisiologia , Potenciais de Ação/fisiologia , Aprendizado de Máquina não Supervisionado , Ratos
15.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38682464

ABSTRACT

The current Poisson factor models often assume that the factors are unknown, which overlooks the explanatory potential of certain observable covariates. This study focuses on high dimensional settings, where the number of the count response variables and/or covariates can diverge as the sample size increases. A covariate-augmented overdispersed Poisson factor model is proposed to jointly perform a high-dimensional Poisson factor analysis and estimate a large coefficient matrix for overdispersed count data. A group of identifiability conditions is provided to theoretically guarantee computational identifiability. We incorporate the interdependence of both response variables and covariates by imposing a low-rank constraint on the large coefficient matrix. To address the computation challenges posed by nonlinearity, two high-dimensional latent matrices, and the low-rank constraint, we propose a novel variational estimation scheme that combines Laplace and Taylor approximations. We also develop a criterion based on a singular value ratio to determine the number of factors and the rank of the coefficient matrix. Comprehensive simulation studies demonstrate that the proposed method outperforms the state-of-the-art methods in estimation accuracy and computational efficiency. The practical merit of our method is demonstrated by an application to the CITE-seq dataset. A flexible implementation of our proposed method is available in the R package COAP.
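
Estimation (the variational scheme with Laplace and Taylor approximations) lives in the authors' R package COAP and is not reproduced here; a generative sketch of the model structure, with dimensions and names of my own choosing, may still help fix the notation: counts are Poisson with a log-mean combining a low-rank covariate coefficient matrix and latent factors.

# Generative sketch only: Y ~ Poisson(exp(intercept + X B' + F Lambda')), with B low rank.
import numpy as np

rng = np.random.default_rng(8)
n, p, d, q, r = 500, 100, 5, 3, 2                 # samples, count responses, covariates, factors, rank
X = rng.normal(size=(n, d))
B = rng.normal(size=(p, r)) @ rng.normal(size=(r, d)) * 0.3   # rank-r coefficient matrix
F = rng.normal(size=(n, q))                       # latent factors
Lam = rng.normal(size=(p, q)) * 0.3               # factor loadings
Y = rng.poisson(np.exp(0.5 + X @ B.T + F @ Lam.T))
print("count matrix shape:", Y.shape, " marginal variance/mean:", round(Y.var() / Y.mean(), 2))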


Assuntos
Simulação por Computador , Modelos Estatísticos , Distribuição de Poisson , Humanos , Tamanho da Amostra , Biometria/métodos , Análise Fatorial
16.
Biometrics ; 80(2)2024 Mar 27.
Article in English | MEDLINE | ID: mdl-38771658

ABSTRACT

Limitations of using the traditional Cox's hazard ratio for summarizing the magnitude of the treatment effect on time-to-event outcomes have been widely discussed, and alternative measures that do not have such limitations are gaining attention. One of the alternative methods recently proposed, in a simple 2-sample comparison setting, uses the average hazard with survival weight (AH), which can be interpreted as the general censoring-free person-time incidence rate on a given time window. In this paper, we propose a new regression analysis approach for the AH with a truncation time τ. We investigate 3 versions of AH regression analysis, assuming (1) independent censoring, (2) group-specific censoring, and (3) covariate-dependent censoring. The proposed AH regression methods are closely related to robust Poisson regression. While the new approach requires a truncation time τ to be specified explicitly, it can be more robust than Poisson regression in the presence of censoring. With the AH regression approach, one can summarize the between-group treatment difference in both absolute and relative terms, adjusting for covariates that are associated with the outcome. This property will increase the likelihood that the treatment effect magnitude is correctly interpreted. The AH regression approach can be a useful alternative to the traditional Cox's hazard ratio approach for estimating and reporting the magnitude of the treatment effect on time-to-event outcomes.
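
The AH regression estimator itself is not reproduced here; since the abstract points to its close relationship with robust Poisson regression, a hedged sketch of that familiar building block (a Poisson GLM with sandwich standard errors, on synthetic data) is shown for orientation only:

# Orientation sketch: robust Poisson regression, i.e. a Poisson GLM whose standard
# errors come from a sandwich estimator so the Poisson variance assumption is not needed.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
n = 2000
treat = rng.integers(0, 2, n)
age = rng.normal(size=n)
mu = np.exp(0.2 + 0.5 * treat + 0.3 * age)
y = rng.negative_binomial(2.0, 2.0 / (2.0 + mu))              # overdispersed outcome
X = sm.add_constant(np.column_stack([treat, age]))

fit = sm.GLM(y, X, family=sm.families.Poisson()).fit(cov_type="HC0")   # sandwich ("robust") SEs
print("log rate ratio for treatment:", round(fit.params[1], 2), "+/-", round(fit.bse[1], 3))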


Assuntos
Modelos de Riscos Proporcionais , Humanos , Análise de Regressão , Análise de Sobrevida , Simulação por Computador , Distribuição de Poisson , Biometria/métodos , Modelos Estatísticos
17.
Stat Med ; 43(11): 2096-2121, 2024 May 20.
Article in English | MEDLINE | ID: mdl-38488240

ABSTRACT

Excessive zeros in multivariate count data are often observed in scenarios of biomedicine and public health. To provide a better analysis on this type of data, we first develop a marginalized multivariate zero-inflated Poisson (MZIP) regression model to directly interpret the overall exposure effects on marginal means. Then, we define a multiple Pearson residual for our newly developed MZIP regression model by simultaneously taking heterogeneity and correlation into consideration. Furthermore, a new model averaging prediction method is introduced based on the multiple Pearson residual, and the asymptotical optimality of this model averaging prediction is proved. Simulations and two empirical applications in medicine are used to illustrate the effectiveness of the proposed method.


Assuntos
Simulação por Computador , Modelos Estatísticos , Humanos , Distribuição de Poisson , Análise Multivariada , Análise de Regressão , Interpretação Estatística de Dados
18.
Stat Med ; 43(1): 102-124, 2024 01 15.
Article in English | MEDLINE | ID: mdl-37921025

ABSTRACT

Human microbiome research has gained increasing importance due to its critical roles in comprehending human health and disease. Within the realm of microbiome research, the data generated often involves operational taxonomic unit counts, which can frequently present challenges such as over-dispersion and zero-inflation. To address dispersion-related concerns, the generalized Poisson model offers a flexible solution, effectively handling data characterized by over-dispersion, equi-dispersion, and under-dispersion. Furthermore, the realm of zero-inflated generalized Poisson models provides a strategic avenue to simultaneously tackle both over-dispersion and zero-inflation. The phenomenon of zero-inflation frequently stems from the heterogeneous nature of study populations. It emerges when specific microbial taxa fail to thrive in the microbial community of certain subjects, consequently resulting in a consistent count of zeros for these individuals. This subset of subjects represents a latent class, where their zeros originate from the genuine absence of the microbial taxa. In this paper, we introduce a novel testing methodology designed to uncover such latent classes within generalized Poisson regression models. We establish a closed-form test statistic and deduce its asymptotic distribution based on estimating equations. To assess its efficacy, we conduct an extensive array of simulation studies, and further apply the test to detect latent classes in human gut microbiome data from the Bogalusa Heart Study.
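
For orientation, the generalized Poisson distribution referred to above has probability mass function P(Y = y) = theta * (theta + lam*y)^(y-1) * exp(-(theta + lam*y)) / y!, with mean theta/(1 - lam) and variance theta/(1 - lam)^3, so lam > 0 gives over-dispersion, lam = 0 the ordinary Poisson, and lam < 0 under-dispersion. A small numerical check of those moments (my own snippet, not the paper's latent-class test statistic):

# Check the generalized Poisson moments by summing the pmf over a long grid.
import numpy as np
from scipy.special import gammaln

def gp_pmf(y, theta, lam):
    logp = np.log(theta) + (y - 1) * np.log(theta + lam * y) - (theta + lam * y) - gammaln(y + 1)
    return np.exp(logp)

theta, lam = 2.0, 0.3                              # lam > 0, i.e. an over-dispersed case
y = np.arange(0, 200)
p = gp_pmf(y, theta, lam)
mean = (p * y).sum()
var = (p * y**2).sum() - mean**2
print("mean:", round(mean, 3), "(theory:", round(theta / (1 - lam), 3), ")")
print("variance:", round(var, 3), "(theory:", round(theta / (1 - lam) ** 3, 3), ")")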


Assuntos
Microbioma Gastrointestinal , Microbiota , Humanos , Modelos Estatísticos , Simulação por Computador , Estudos Longitudinais , Distribuição de Poisson
19.
Stat Med ; 43(13): 2547-2559, 2024 Jun 15.
Article in English | MEDLINE | ID: mdl-38637330

ABSTRACT

Mediation analysis is an increasingly popular statistical method for explaining causal pathways to inform intervention. While methods have increased, there is still a dearth of robust mediation methods for count outcomes with excess zeroes. Current mediation methods addressing this issue are computationally intensive, biased, or challenging to interpret. To overcome these limitations, we propose a new mediation methodology for zero-inflated count outcomes using the marginalized zero-inflated Poisson (MZIP) model and the counterfactual approach to mediation. This novel work gives population-average mediation effects whose variance can be estimated rapidly via delta method. This methodology is extended to cases with exposure-mediator interactions. We apply this novel methodology to explore whether diabetes diagnosis can explain BMI differences in healthcare utilization and test model performance via simulations comparing the proposed MZIP method to existing zero-inflated and Poisson methods. We find that our proposed method minimizes bias and computation time compared to alternative approaches while allowing for straightforward interpretations.


Assuntos
Simulação por Computador , Análise de Mediação , Humanos , Distribuição de Poisson , Modelos Estatísticos , Índice de Massa Corporal , Diabetes Mellitus , Viés , Causalidade
20.
Stat Med ; 43(21): 4163-4177, 2024 Sep 20.
Article in English | MEDLINE | ID: mdl-39030763

ABSTRACT

Ecological momentary assessment (EMA), a data collection method commonly employed in mHealth studies, allows for repeated real-time sampling of individuals' psychological, behavioral, and contextual states. Due to the frequent measurements, data collected using EMA are useful for understanding both the temporal dynamics in individuals' states and how these states relate to adverse health events. Motivated by data from a smoking cessation study, we propose a joint model for analyzing longitudinal EMA data to determine whether certain latent psychological states are associated with repeated cigarette use. Our method consists of a longitudinal submodel (a dynamic factor model) that models changes in the time-varying latent states and a cumulative risk submodel (a Poisson regression model) that connects the latent states with the total number of events. In the motivating data, both the predictors (the underlying psychological states) and the event outcome (the number of cigarettes smoked) are partially unobservable; we account for this incomplete information in our proposed model and estimation method. We take a two-stage approach to estimation that leverages existing software and uses importance sampling-based weights to reduce potential bias. We demonstrate that these weights are effective at reducing bias in the cumulative risk submodel parameters via simulation. We apply our method to a subset of data from a smoking cessation study to assess the association between psychological state and cigarette smoking. The analysis shows that above-average intensities of negative mood are associated with increased cigarette use.


Assuntos
Avaliação Momentânea Ecológica , Modelos Estatísticos , Abandono do Hábito de Fumar , Humanos , Estudos Longitudinais , Abandono do Hábito de Fumar/psicologia , Simulação por Computador , Distribuição de Poisson , Fumar/psicologia