Search | BVS CLAP/SMR-OPAS/OMS

1.

Ornaments for efficient allele-specific expression estimation with bias correction.

Adduri, Abhinav; Kim, Seyoung.

Am J Hum Genet ; 111(8): 1770-1781, 2024 Aug 08.

Article in English | MEDLINE | ID: mdl-39047729

ABSTRACT

Allele-specific expression plays a crucial role in unraveling various biological mechanisms, including genomic imprinting and gene expression controlled by cis-regulatory variants. However, existing methods for quantification from RNA-sequencing (RNA-seq) reads do not adequately and efficiently remove various allele-specific read mapping biases, such as reference bias arising from reads containing the alternative allele that do not map to the reference transcriptome or ambiguous mapping bias caused by reads containing the reference allele that map differently from reads containing the alternative allele. We present Ornaments, a computational tool for rapid and accurate estimation of allele-specific transcript expression at unphased heterozygous loci from RNA-seq reads while correcting for allele-specific read mapping biases. Ornaments removes reference bias by mapping reads to a personalized transcriptome and ambiguous mapping bias by probabilistically assigning reads to multiple transcripts and variant loci they map to. Ornaments is a lightweight extension of kallisto, a popular tool for fast RNA-seq quantification, that improves the efficiency and accuracy of WASP, a popular tool for bias correction in allele-specific read mapping. In experiments with simulated and human lymphoblastoid cell-line RNA-seq reads with the genomes of the 1000 Genomes Project, we demonstrate that Ornaments improves the accuracy of WASP and kallisto, is nearly as efficient as kallisto, and is an order of magnitude faster than WASP per sample, with the additional cost of constructing a personalized index for multiple samples. Additionally, we show that Ornaments finds imprinted transcripts with higher sensitivity than WASP, which detects imprinted signals only at gene level.

Subjects

Alleles; Humans; Transcriptome/genetics; Genomic Imprinting; Sequence Analysis, RNA/methods; Software; Gene Expression Profiling/methods

2.

OSCAA: A two-dimensional Gaussian mixture model for copy number variation association analysis.

Yu, Xuanxuan; Luo, Xizhi; Cai, Guoshuai; Xiao, Feifei.

Genet Epidemiol ; 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38533840

ABSTRACT

Copy number variants (CNVs) are prevalent in the human genome and are found to have a profound effect on genomic organization and human diseases. Discovering disease-associated CNVs is critical for understanding the pathogenesis of diseases and aiding their diagnosis and treatment. However, traditional methods for assessing the association between CNVs and disease risks adopt a two-stage strategy conducting quantitative CNV measurements first and then testing for association, which may lead to biased association estimation and low statistical power, serving as a major barrier in routine genome-wide assessment of such variation. In this article, we developed One-Stage CNV-disease Association Analysis (OSCAA), a flexible algorithm to discover disease-associated CNVs for both quantitative and qualitative traits. OSCAA employs a two-dimensional Gaussian mixture model that is built upon the PCs from copy number intensities, accounting for technical biases in CNV detection while simultaneously testing for their effect on outcome traits. In OSCAA, CNVs are identified and their associations with disease risk are evaluated simultaneously in a single step, taking into account the uncertainty of CNV identification in the statistical model. Our simulations demonstrated that OSCAA outperformed the existing one-stage method and traditional two-stage methods by yielding a more accurate estimate of the CNV-disease association, especially for short CNVs or CNVs with weak signals. In conclusion, OSCAA is a powerful and flexible approach for CNV association testing with high sensitivity and specificity, which can be easily applied to different traits and clinical risk predictions.

3.

Exponential family measurement error models for single-cell CRISPR screens.

Barry, Timothy; Roeder, Kathryn; Katsevich, Eugene.

Biostatistics ; 2024 Apr 22.

Article in English | MEDLINE | ID: mdl-38649751

ABSTRACT

CRISPR genome engineering and single-cell RNA sequencing have accelerated biological discovery. Single-cell CRISPR screens unite these two technologies, linking genetic perturbations in individual cells to changes in gene expression and illuminating regulatory networks underlying diseases. Despite their promise, single-cell CRISPR screens present considerable statistical challenges. We demonstrate through theoretical and real data analyses that a standard method for estimation and inference in single-cell CRISPR screens, "thresholded regression," exhibits attenuation bias and a bias-variance tradeoff as a function of an intrinsic, challenging-to-select tuning parameter. To overcome these difficulties, we introduce GLM-EIV ("GLM-based errors-in-variables"), a new method for single-cell CRISPR screen analysis. GLM-EIV extends the classical errors-in-variables model to responses and noisy predictors that are exponential family-distributed and potentially impacted by the same set of confounding variables. We develop a computational infrastructure to deploy GLM-EIV across hundreds of processors on clouds (e.g., Microsoft Azure) and high-performance clusters. Leveraging this infrastructure, we apply GLM-EIV to analyze two recent, large-scale, single-cell CRISPR screen datasets, yielding several new insights.

4.

A Bayesian approach to estimating COVID-19 incidence and infection fatality rates.

Slater, Justin J; Bansal, Aiyush; Campbell, Harlan; Rosenthal, Jeffrey S; Gustafson, Paul; Brown, Patrick E.

Biostatistics ; 25(2): 354-384, 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-36881693

ABSTRACT

Naive estimates of incidence and infection fatality rates (IFR) of coronavirus disease 2019 suffer from a variety of biases, many of which relate to preferential testing. This has motivated epidemiologists from around the globe to conduct serosurveys that measure the immunity of individuals by testing for the presence of SARS-CoV-2 antibodies in the blood. These quantitative measures (titer values) are then used as a proxy for previous or current infection. However, statistical methods that use these data to their full potential have yet to be developed. Previous researchers have discretized these continuous values, discarding potentially useful information. In this article, we demonstrate how multivariate mixture models can be used in combination with post-stratification to estimate cumulative incidence and IFR in an approximate Bayesian framework without discretization. In doing so, we account for uncertainty from both the estimated number of infections and incomplete deaths data to provide estimates of IFR. This method is demonstrated using data from the Action to Beat Coronavirus serosurvey in Canada.

Subjects

COVID-19; Humans; COVID-19/epidemiology; Bayes Theorem; Incidence; SARS-CoV-2

5.

Model-based multifacet clustering with high-dimensional omics applications.

Zong, Wei; Li, Danyang; Seney, Marianne L; Mcclung, Colleen A; Tseng, George C.

Biostatistics ; 2024 Jul 13.

Article in English | MEDLINE | ID: mdl-39002144

ABSTRACT

High-dimensional omics data often contain intricate and multifaceted information, resulting in the coexistence of multiple plausible sample partitions based on different subsets of selected features. Conventional clustering methods typically yield only one clustering solution, limiting their capacity to fully capture all facets of cluster structures in high-dimensional data. To address this challenge, we propose a model-based multifacet clustering (MFClust) method based on a mixture of Gaussian mixture models, where the former mixture achieves facet assignment for gene features and the latter mixture determines cluster assignment of samples. We demonstrate superior facet and cluster assignment accuracy of MFClust through simulation studies. The proposed method is applied to three transcriptomic applications from postmortem brain and lung disease studies. The result captures multifacet clustering structures associated with critical clinical variables and provides intriguing biological insights for further hypothesis generation and discovery.

6.

A semiparametric Gaussian mixture model for chest CT-based 3D blood vessel reconstruction.

Zeng, Qianhan; Zhou, Jing; Ji, Ying; Wang, Hansheng.

Biostatistics ; 2024 Apr 19.

Article in English | MEDLINE | ID: mdl-38637995

ABSTRACT

Computed tomography (CT) has been a powerful diagnostic tool since its emergence in the 1970s. Using CT data, 3D structures of human internal organs and tissues, such as blood vessels, can be reconstructed using professional software. This 3D reconstruction is crucial for surgical operations and can serve as a vivid medical teaching example. However, traditional 3D reconstruction heavily relies on manual operations, which are time-consuming, subjective, and require substantial experience. To address this problem, we develop a novel semiparametric Gaussian mixture model tailored for the 3D reconstruction of blood vessels. This model extends the classical Gaussian mixture model by enabling nonparametric variations in the component-wise parameters of interest according to voxel positions. We develop a kernel-based expectation-maximization algorithm for estimating the model parameters, accompanied by a supporting asymptotic theory. Furthermore, we propose a novel regression method for optimal bandwidth selection. Compared to the conventional cross-validation-based (CV) method, the regression method outperforms the CV method in terms of computational and statistical efficiency. In application, this methodology facilitates the fully automated reconstruction of 3D blood vessel structures with remarkable accuracy.

7.

Covariate-guided Bayesian mixture of spline experts for the analysis of multivariate high-density longitudinal data.

Fu, Haoyi; Tang, Lu; Rosen, Ori; Hipwell, Alison E; Huppert, Theodore J; Krafty, Robert T.

Biostatistics ; 25(3): 666-680, 2024 Jul 01.

Article in English | MEDLINE | ID: mdl-38141227

ABSTRACT

With rapid development of techniques to measure brain activity and structure, statistical methods for analyzing modern brain-imaging data play an important role in the advancement of science. Imaging data that measure brain function are usually multivariate high-density longitudinal data and are heterogeneous across both imaging sources and subjects, which lead to various statistical and computational challenges. In this article, we propose a group-based method to cluster a collection of multivariate high-density longitudinal data via a Bayesian mixture of smoothing splines. Our method assumes each multivariate high-density longitudinal trajectory is a mixture of multiple components with different mixing weights. Time-independent covariates are assumed to be associated with the mixture components and are incorporated via logistic weights of a mixture-of-experts model. We formulate this approach under a fully Bayesian framework using Gibbs sampling where the number of components is selected based on a deviance information criterion. The proposed method is compared to existing methods via simulation studies and is applied to a study on functional near-infrared spectroscopy, which aims to understand infant emotional reactivity and recovery from stress. The results reveal distinct patterns of brain activity, as well as associations between these patterns and selected covariates.

Subjects

Bayes Theorem; Humans; Longitudinal Studies; Brain/physiology; Brain/diagnostic imaging; Spectroscopy, Near-Infrared/methods; Data Interpretation, Statistical; Models, Statistical; Infant; Multivariate Analysis; Biostatistics/methods

8.

Context-dependent gene regulatory network reveals regulation dynamics and cell trajectories using unspliced transcripts.

Tu, Yueh-Hua; Juan, Hsueh-Fen; Huang, Hsuan-Cheng.

Brief Bioinform ; 24(2), 2023 Mar 19.

Article in English | MEDLINE | ID: mdl-36653899

ABSTRACT

Gene regulatory networks govern complex gene expression programs in various biological phenomena, including embryonic development, cell fate decisions and oncogenesis. Single-cell techniques are increasingly being used to study gene expression, providing higher resolution than traditional approaches. However, inferring a comprehensive gene regulatory network across different cell types remains a challenge. Here, we propose to construct context-dependent gene regulatory networks (CDGRNs) from single-cell RNA sequencing data utilizing both spliced and unspliced transcript expression levels. A gene regulatory network is decomposed into subnetworks corresponding to different transcriptomic contexts. Each subnetwork comprises the consensus active regulation pairs of transcription factors and their target genes shared by a group of cells, inferred by a Gaussian mixture model. We find that the union of gene regulation pairs in all contexts is sufficient to reconstruct differentiation trajectories. Functions specific to the cell cycle, cell differentiation or tissue-specific functions are enriched throughout the developmental process in each context. Surprisingly, we also observe that the network entropy of CDGRNs decreases along differentiation trajectories, indicating directionality in differentiation. Overall, CDGRN allows us to establish the connection between gene regulation at the molecular level and cell differentiation at the macroscopic level.

Subjects

Embryonic Development; Gene Regulatory Networks; Cell Differentiation/genetics; Transcription Factors/genetics; Transcription Factors/metabolism; Gene Expression Profiling

9.

scGMAAE: Gaussian mixture adversarial autoencoders for diversification analysis of scRNA-seq data.

Wang, Hai-Yun; Zhao, Jian-Ping; Zheng, Chun-Hou; Su, Yan-Sen.

Brief Bioinform ; 24(1), 2023 Jan 19.

Article in English | MEDLINE | ID: mdl-36592058

ABSTRACT

The progress of single-cell RNA sequencing (scRNA-seq) has produced a large amount of scRNA-seq data, which are widely used in biomedical research. Noise in the raw data and the tens of thousands of genes measured make it challenging to capture the true structure and useful information in scRNA-seq data. Most existing single-cell analysis methods assume that the low-dimensional embedding of the raw data follows a Gaussian distribution or lies in a low-dimensional nonlinear space without any prior information, which greatly limits the flexibility and controllability of the model. In addition, many existing methods have a high computational cost, which makes them difficult to apply to large-scale datasets. Here, we design and develop a deep generative model named Gaussian mixture adversarial autoencoders (scGMAAE), which assumes that the low-dimensional embeddings of different cell types follow different Gaussian distributions and integrates Bayesian variational inference with adversarial training, so as to give an interpretable latent representation of complex data and discover the statistical distribution of different cell types. scGMAAE offers good controllability, interpretability and scalability; it can therefore process large-scale datasets in a short time and give competitive results. scGMAAE outperforms existing methods in several ways, including dimensionality reduction visualization, cell clustering, differential expression analysis and batch effect removal. Importantly, compared with most deep learning methods, scGMAAE requires fewer iterations to generate the best results.

Subjects

Gene Expression Profiling; Single-Cell Gene Expression Analysis; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Normal Distribution; Bayes Theorem; Single-Cell Analysis/methods; Cluster Analysis

10.

Graph deep learning enabled spatial domains identification for spatial transcriptomics.

Liu, Teng; Fang, Zhao-Yu; Li, Xin; Zhang, Li-Ning; Cao, Dong-Sheng; Yin, Ming-Zhu.

Brief Bioinform ; 24(3), 2023 May 19.

Article in English | MEDLINE | ID: mdl-37080761

ABSTRACT

Advancing spatially resolved transcriptomics (ST) technologies help biologists comprehensively understand organ function and tissue microenvironment. Accurate spatial domain identification is the foundation for delineating genome heterogeneity and cellular interaction. Motivated by this perspective, a graph deep learning (GDL) based spatial clustering approach is constructed in this paper. First, the deep graph infomax module embedded with residual gated graph convolutional neural network is leveraged to address the gene expression profiles and spatial positions in ST. Then, the Bayesian Gaussian mixture model is applied to handle the latent embeddings to generate spatial domains. Designed experiments certify that the presented method is superior to other state-of-the-art GDL-enabled techniques on multiple ST datasets. The codes and dataset used in this manuscript are summarized at https://github.com/narutoten520/SCGDL.

Subjects

Deep Learning; Transcriptome; Bayes Theorem; Gene Expression Profiling; Cell Communication

11.

MAST: Phylogenetic Inference with Mixtures Across Sites and Trees.

Wong, Thomas K F; Cherryh, Caitlin; Rodrigo, Allen G; Hahn, Matthew W; Minh, Bui Quang; Lanfear, Robert.

Syst Biol ; 73(2): 375-391, 2024 Jul 27.

Article in English | MEDLINE | ID: mdl-38421146

ABSTRACT

Hundreds or thousands of loci are now routinely used in modern phylogenomic studies. Concatenation approaches to tree inference assume that there is a single topology for the entire dataset, but different loci may have different evolutionary histories due to incomplete lineage sorting (ILS), introgression, and/or horizontal gene transfer; even single loci may not be treelike due to recombination. To overcome this shortcoming, we introduce an implementation of a multi-tree mixture model that we call mixtures across sites and trees (MAST). This model extends a prior implementation by Boussau et al. (2009) by allowing users to estimate the weight of each of a set of pre-specified bifurcating trees in a single alignment. The MAST model allows each tree to have its own weight, topology, branch lengths, substitution model, nucleotide or amino acid frequencies, and model of rate heterogeneity across sites. We implemented the MAST model in a maximum-likelihood framework in the popular phylogenetic software, IQ-TREE. Simulations show that we can accurately recover the true model parameters, including branch lengths and tree weights for a given set of tree topologies, under a wide range of biologically realistic scenarios. We also show that we can use standard statistical inference approaches to reject a single-tree model when data are simulated under multiple trees (and vice versa). We applied the MAST model to multiple primate datasets and found that it can recover the signal of ILS in the Great Apes, as well as the asymmetry in minor trees caused by introgression among several macaque species. When applied to a dataset of 4 Platyrrhine species for which standard concatenated maximum likelihood (ML) and gene tree approaches disagree, we observe that MAST gives the highest weight (i.e., the largest proportion of sites) to the tree also supported by gene tree approaches. These results suggest that the MAST model is able to analyze a concatenated alignment using ML while avoiding some of the biases that come with assuming there is only a single tree. We discuss how the MAST model can be extended in the future.

Subjects

Classification; Phylogeny; Classification/methods; Models, Genetic; Computer Simulation; Software; Animals

12.

Baldur: Bayesian Hierarchical Modeling for Label-Free Proteomics with Gamma Regressing Mean-Variance Trends.

Berg, Philip; Popescu, George.

Mol Cell Proteomics ; 22(12): 100658, 2023 Dec.

Article in English | MEDLINE | ID: mdl-37806340

ABSTRACT

Label-free proteomics is a fast-growing methodology to infer abundances in mass spectrometry proteomics. Extensive research has focused on spectral quantification and peptide identification. However, research toward modeling and understanding quantitative proteomics data is scarce. Here we propose a Bayesian hierarchical decision model (Baldur) to test for differences in means between conditions for proteins, peptides, and post-translational modifications. We developed a Bayesian regression model to characterize local mean-variance trends in data, to estimate measurement uncertainty and hyperparameters for the decision model. A key contribution is the development of a new gamma regression model that describes the mean-variance dependency as a mixture of a common and a latent trend, allowing for localized trend estimates. We then evaluate the performance of Baldur, limma-trend, and t test on six benchmark datasets: five total proteomics and one post-translational modification dataset. We find that Baldur drastically improves the decision in noisier post-translational modification data over limma-trend and t test. In addition, we see significant improvements using Baldur over the other methods in the total proteomics datasets. Finally, we analyzed Baldur's performance when increasing the number of replicates and found that the method always increases precision with sample size, while showing robust control of the false positive rate. We conclude that our model vastly improves over popular data analysis methods (limma-trend and t test) in several spike-in datasets by achieving a high true positive detection rate, while greatly reducing the false-positive rate.

Subjects

Proteins; Proteomics; Proteomics/methods; Bayes Theorem; Proteins/chemistry; Peptides/metabolism; Mass Spectrometry/methods

13.

A clustering procedure for three-way RNA sequencing data using data transformations and matrix-variate Gaussian mixture models.

Scharl, Theresa; Grün, Bettina.

BMC Bioinformatics ; 25(1): 90, 2024 Mar 01.

Article in English | MEDLINE | ID: mdl-38429687

ABSTRACT

RNA sequencing of time-course experiments results in three-way count data where the dimensions are the genes, the time points and the biological units. Clustering RNA-seq data allows groups of genes that are co-expressed over time to be extracted. After standardisation, the normalised counts of individual genes across time points and biological units have properties similar to those of compositional data. We propose the following procedure to suitably cluster three-way RNA-seq data: (1) pre-process the RNA-seq data by calculating the normalised expression profiles, (2) transform the data using the additive log ratio transform to map the composition in the D-part Aitchison simplex to a (D - 1)-dimensional Euclidean vector, (3) cluster the transformed RNA-seq data using matrix-variate Gaussian mixture models and (4) assess the quality of the overall cluster solution and of individual clusters based on cluster separation in the transformed space using density-based silhouette information and on compactness of the cluster in the original space using cluster maps as a suitable visualisation. The proposed procedure is illustrated on RNA-seq data from fission yeast and results are also compared to an analogous two-way approach after flattening out the biological units.
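
Steps (1) and (2) of the procedure above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice of the last part as the reference, the sum-to-one normalisation and the optional pseudo-count are all assumptions made here for illustration, since the abstract does not specify them.

```python
import math

def normalised_profile(counts):
    """Scale one gene's non-negative counts over D time points so they sum to 1.
    (Assumed normalisation; the abstract does not give the exact formula.)"""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must have a positive sum")
    return [c / total for c in counts]

def alr(composition, pseudo=0.0):
    """Additive log-ratio transform: map a D-part composition to D - 1
    log-ratios, using the last part as the reference (an assumed choice)."""
    parts = [p + pseudo for p in composition]
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be positive; consider a pseudo-count")
    ref = parts[-1]
    return [math.log(p / ref) for p in parts[:-1]]

def alr_inverse(coords):
    """Map D - 1 ALR coordinates back to a D-part composition summing to 1."""
    expz = [math.exp(v) for v in coords] + [1.0]
    total = sum(expz)
    return [e / total for e in expz]

# Hypothetical counts for one gene at D = 4 time points.
profile = normalised_profile([10, 20, 30, 40])
coords = alr(profile)            # 3 unconstrained Euclidean coordinates
recovered = alr_inverse(coords)  # round trip back to the simplex
```

The transformed coordinates are unconstrained real numbers, which is what makes a Gaussian mixture model a plausible choice for the clustering in step (3).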

Subjects

RNA; RNA/genetics; Sequence Analysis, RNA/methods; RNA-Seq; Base Sequence; Cluster Analysis

14.

Multiplet-Assisted Peak Alignment for ¹H NMR-Based Metabolomics.

Charris-Molina, Andrés; Burdisso, Paula; Hoijemberg, Pablo A.

J Proteome Res ; 23(1): 430-448, 2024 Jan 05.

Article in English | MEDLINE | ID: mdl-38127799

ABSTRACT

NMR-based metabolomics aims at recovering biological information by comparing spectral data from samples of biological interest and appropriate controls. Any statistical analysis performed on the data matrix relies on proper peak alignment to produce meaningful results. Over the last decades, several peak alignment algorithms have been proposed, as well as alternatives like spectral binning or strategies for annotation and quantification, the latter depending on reference databases. Most of the alignment algorithms, mainly based on segmentation of the spectra, present limitations for regions with peak overlap or cases of frequency order exchange. Here, we present our multiplet-assisted peak alignment algorithm, a new methodology that consists of aligning peaks by matching multiplet profiles of f1 traces from J-resolved spectra. A correspondence matrix with the linked f1 traces is built, and multivariate data analysis can be performed on it to obtain useful information from the data, overcoming the issues of peak overlap and frequency crossovers. Statistical total correlation spectroscopy can be applied to the matrix as well, toward a better identification of molecules of interest. The results can be queried on one-dimensional (1D) ¹H databases or can be directly coupled to our previously published Chemical Shift Multiplet Database.

Subjects

Magnetic Resonance Imaging; Metabolomics; Proton Magnetic Resonance Spectroscopy; Metabolomics/methods; Magnetic Resonance Spectroscopy/methods; Algorithms

15.

Inferring single-cell copy number profiles through cross-cell segmentation of read counts.

Liu, Furui; Shi, Fangyuan; Yu, Zhenhua.

BMC Genomics ; 25(1): 25, 2024 Jan 02.

Article in English | MEDLINE | ID: mdl-38166601

ABSTRACT

BACKGROUND: Copy number alteration (CNA) is one of the major genomic variations that frequently occur in cancers, and accurate inference of CNAs is essential for unmasking intra-tumor heterogeneity (ITH) and tumor evolutionary history. Single-cell DNA sequencing (scDNA-seq) makes it convenient to profile CNAs at single-cell resolution, and thus aids in better characterization of ITH. Although several computational methods have been proposed to decipher single-cell CNAs, their performance is limited in either breakpoint detection or copy number estimation due to the high dimensionality and noisy nature of read count data. RESULTS: By treating breakpoint detection as a process to segment a high-dimensional read count sequence, we develop a novel method called DeepCNA for cross-cell segmentation of the read count sequence and per-cell inference of CNAs. To cope with the difficulty of segmentation, an autoencoder (AE) network is employed in DeepCNA to project the original data into a low-dimensional space, where the breakpoints can be efficiently detected along each latent dimension and further merged to obtain the final breakpoints. Unlike the existing methods that manually calculate certain statistics of read counts to find breakpoints, the AE model makes it convenient to automatically learn the representations. Based on the inferred breakpoints, we employ a mixture model to predict copy numbers of segments for each cell, and leverage the expectation-maximization algorithm to efficiently estimate cell ploidy by exploring the most abundant copy number state. Benchmarking results on simulated and real data demonstrate that our method is able to accurately infer breakpoints as well as absolute copy numbers and surpasses the existing methods under different test conditions. DeepCNA can be accessed at: https://github.com/zhyu-lab/deepcna.
CONCLUSIONS: Profiling single-cell CNAs based on deep learning is becoming a new paradigm of scDNA-seq data analysis, and DeepCNA is an enhancement to the current arsenal of computational methods for investigating cancer genomics.

Subjects

DNA Copy Number Variations; Neoplasms; Humans; Algorithms; Genomics/methods; Sequence Analysis, DNA; Neoplasms/genetics

16.

Harmonization of CSF and imaging biomarkers in Alzheimer's disease: Need and practical applications for genetics studies and preclinical classification.

Timsina, Jigyasha; Ali, Muhammad; Do, Anh; Wang, Lihua; Western, Daniel; Sung, Yun Ju; Cruchaga, Carlos.

Neurobiol Dis ; 190: 106373, 2024 Jan.

Article in English | MEDLINE | ID: mdl-38072165

ABSTRACT

In Alzheimer's disease (AD) research, cerebrospinal fluid (CSF) Amyloid beta (Aβ), Tau and pTau are the most accepted and well-validated biomarkers. Several methods and platforms exist to measure those biomarkers, leading to challenges in combining data across studies. Thus, there is a need to identify methods that harmonize and standardize these values. We used a Z-score based approach to harmonize CSF and amyloid imaging data from multiple cohorts and compared GWAS results using this approach with currently accepted methods. We also used a generalized mixture model to calculate the threshold for biomarker-positivity. Based on our findings, our normalization approach performed as well as meta-analysis and did not lead to any spurious results. In terms of dichotomization, cutoffs calculated with this approach were very similar to those reported previously. These findings show that the Z-score based harmonization approach can be applied to heterogeneous platforms and provides biomarker cut-offs consistent with the classical approaches without requiring any additional data.
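
The general idea of Z-score based harmonization can be sketched in a few lines of Python. This is an illustration only, under assumptions not stated in the abstract: values are standardized within each cohort using that cohort's own mean and sample standard deviation, no log transformation or covariate adjustment is applied, and the cohort names and numbers are hypothetical.

```python
from statistics import mean, stdev

def zscore_by_cohort(values_by_cohort):
    """Standardize biomarker values within each cohort (mean 0, SD 1) so that
    measurements from different platforms share a common scale."""
    harmonised = {}
    for cohort, values in values_by_cohort.items():
        mu = mean(values)
        sd = stdev(values)  # sample SD; needs at least two values per cohort
        if sd == 0:
            raise ValueError(f"cohort {cohort!r} has zero variance")
        harmonised[cohort] = [(v - mu) / sd for v in values]
    return harmonised

# Hypothetical measurements of the same biomarker on two platforms with
# very different units and ranges.
raw = {
    "cohort_A": [200.0, 400.0, 600.0],
    "cohort_B": [2.0, 4.0, 6.0],
}
z = zscore_by_cohort(raw)
```

After standardization, both hypothetical cohorts map to the same relative scale, so their values can be pooled for a joint analysis rather than combined only at the summary level.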

Subjects

Alzheimer Disease; Humans; Alzheimer Disease/diagnostic imaging; Alzheimer Disease/genetics; Alzheimer Disease/cerebrospinal fluid; Amyloid beta-Peptides/cerebrospinal fluid; tau Proteins/genetics; tau Proteins/cerebrospinal fluid; Positron-Emission Tomography; Biomarkers/cerebrospinal fluid; Peptide Fragments/cerebrospinal fluid

17.

Exposure to metal mixtures and telomere length in Bangladeshi children.

Farzan, Shohreh F; Niu, Zhongzheng; Guo, Fangqi; Shahriar, Mohammad; Kibriya, Muhammad G; Jasmine, Farzana; Sarwar, Golam; Jackson, Brian P; Ahsan, Habibul; Argos, Maria.

Am J Epidemiol ; 2024 Jul 05.

Article in English | MEDLINE | ID: mdl-38973734

ABSTRACT

Telomere length is associated with chronic diseases and, in younger populations, may represent a biomarker of disease susceptibility. As growing evidence suggests that environmental factors, including metals, may impact telomere length, we investigated the association between 17 metals measured in toenail samples and leukocyte relative telomere length (RTL), among 472 five- to seven-year-old children enrolled in the Bangladesh Environmental Research in Children's Health (BiRCH) cohort. In single exposure linear regression models, a doubling of arsenic (As) and mercury (Hg) (µg/g) were associated with a -0.021 (95%CI: -0.032, -0.010; p=0.0005) and -0.017 (95%CI: -0.029, -0.004; p=0.006) difference in RTL, respectively. In Bayesian Kernel Machine Regression (BKMR) mixture models, the overall metal mixture was inversely associated with RTL (P-for-trend <0.001). Negative associations with RTL were observed with both log2-As and log2-Hg, while an inverted U-shaped association was observed for log2-zinc (Zn) with RTL. We found little evidence of interaction among metals. Sex-stratification identified stronger associations of the overall mixture and log2-As with RTL among females, compared to males. Our study suggests that As and Hg may independently influence RTL in mid-childhood. Further studies are needed to investigate potential long-term impacts of metal-associated telomere shortening in childhood on health outcomes in adult life.

18.

Geographic Variation, Economic Activity, and Labor Market Characteristics in Trajectories of Suicide in the United States, 2008-2020.

Keyes, Katherine M; Kandula, Sasikiran; Martinez-Ales, Gonzalo; Gimbrone, Catherine; Joseph, Victoria; Monnat, Shannon; Rutherford, Caroline; Olfson, Mark; Gould, Madelyn; Shaman, Jeffrey.

Am J Epidemiol ; 193(2): 256-266, 2024 Feb 05.

Article in English | MEDLINE | ID: mdl-37846128

ABSTRACT

Suicide rates in the United States have increased over the past 15 years, with substantial geographic variation in these increases; yet there have been few attempts to cluster counties by the magnitude of suicide rate changes according to intercept and slope or to identify the economic precursors of increases. We used vital statistics data and growth mixture models to identify clusters of counties by their magnitude of suicide growth from 2008 to 2020 and examined associations with county economic and labor indices. Our models identified 5 clusters, each differentiated by intercept and slope magnitude, with the highest-rate cluster (4% of counties) being observed mainly in sparsely populated areas in the West and Alaska, starting the time series at 25.4 suicides per 100,000 population, and exhibiting the steepest increase in slope (0.69/100,000/year). There was no cluster for which the suicide rate was stable or declining. Counties in the highest-rate cluster were more likely to have agricultural and service economies and less likely to have urban professional economies. Given the increased burden of suicide, with no clusters of counties improving over time, additional policy and prevention efforts are needed, particularly targeted at rural areas in the West.

Subjects

Suicide, Humans, United States/epidemiology, Rural Population

19.

Indicators of cure for women living after uterine and ovarian cancers: a population-based study.

Giudici, Fabiola; De Paoli, Angela; Toffolutti, Federica; Guzzinati, Stefano; Francisci, Silvia; Bucchi, Lauro; Gatta, Gemma; Demuru, Elena; Mallone, Sandra; Cin, Antonella Dal; Caldarella, Adele; Cuccaro, Francesco; Migliore, Enrica; Gambino, Maria Letizia; Ravaioli, Alessandra; Puppo, Antonella; Ferrante, Margherita; Carrozzi, Giuliano; Stracci, Fabrizio; Musolino, Antonino; Gasparotti, Cinzia; Cavallo, Rossella; Mazzucco, Walter; Vitale, Maria Francesca; Cascone, Giuseppe; Ballotari, Paola; Ferretti, Stefano; Mangone, Lucia; Rizzello, Roberto Vito; Sampietro, Giuseppe; Mian, Michael; Boschetti, Lorenza; Galasso, Rocco; Bella, Francesca; Piras, Daniela; Sessa, Alessandra; Seghini, Pietro; Fanetti, Anna Clara; Pinna, Pasquala; De Angelis, Roberta; Serraino, Diego; Dal Maso, Luigino.

Am J Epidemiol ; 2024 Apr 15.

Article in English | MEDLINE | ID: mdl-38629583

ABSTRACT

This study aims to estimate long-term survival, cancer prevalence, and several cure indicators for Italian women with gynaecological cancers. Thirty-one cancer registries, representing 47% of the Italian female population, were included. Mixture cure models were used to estimate Net Survival (NS), Cure Fraction, Time To Cure (5-year conditional NS >95%), Cure Prevalence (women who will not die of cancer), and Already Cured (living longer than Time To Cure). In 2018, 0.4% (121,704) of Italian women were alive after corpus uteri cancer, 0.2% (52,551) after cervical cancer, and 0.2% (52,153) after ovarian cancer. More than 90% of patients with uterine cancers and 83% of those with ovarian cancer will not die from their neoplasm (Cure Prevalence). Women with gynaecological cancers have a residual excess risk of death <5% from 5 years after diagnosis. The Cure Fraction was 69% for corpus uteri, 32% for ovarian, and 58% for cervical cancer patients. Time To Cure was ≤10 years for women with gynaecological cancers aged <55 years. Overall, 74% of patients with cervical cancer, 63% with corpus uteri cancer, and 55% with ovarian cancer were Already Cured. These results will contribute to improving follow-up programs for women with gynaecological cancers and to supporting efforts against discrimination towards those already cured.
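The cure indicators named in this abstract can be illustrated with a generic mixture cure model, in which net survival is S(t) = π + (1 − π)·S_u(t), where π is the Cure Fraction and S_u is the survival of uncured patients. The sketch below assumes a Weibull form for S_u and a simple grid search for Time To Cure, using the abstract's criterion of 5-year conditional net survival above 95%. The rate and shape values and the function names are hypothetical; only the 69% Cure Fraction is taken from the abstract, and the study's actual model specification is not given here.

```python
import math


def net_survival(t, cure_fraction, rate, shape=1.0):
    """Mixture cure model: cured patients never die of the cancer;
    uncured patients follow a Weibull survival curve (assumed form)."""
    uncured_survival = math.exp(-((rate * t) ** shape))
    return cure_fraction + (1.0 - cure_fraction) * uncured_survival


def conditional_net_survival(t, cure_fraction, rate, shape=1.0, window=5.0):
    """Net survival over the next `window` years, given survival to time t."""
    later = net_survival(t + window, cure_fraction, rate, shape)
    now = net_survival(t, cure_fraction, rate, shape)
    return later / now


def time_to_cure(cure_fraction, rate, shape=1.0, threshold=0.95,
                 step=0.1, horizon=50.0):
    """Smallest grid time (years since diagnosis) at which the 5-year
    conditional net survival exceeds `threshold`; None if never reached."""
    n_steps = int(round(horizon / step))
    for i in range(n_steps + 1):
        t = i * step
        if conditional_net_survival(t, cure_fraction, rate, shape) > threshold:
            return t
    return None


# Cure Fraction of 0.69 as reported for corpus uteri; the rate is a hypothetical value.
ttc = time_to_cure(cure_fraction=0.69, rate=0.3)
```

Under these assumed parameters the threshold is crossed a little over six years after diagnosis. With a Cure Fraction of zero and a constant hazard the conditional survival never changes, so the function returns `None`, which shows why Time To Cure is only defined when a cured subgroup exists.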

20.

Bayesian sample size determination in basket trials borrowing information between subsets.

Zheng, Haiyan; Grayling, Michael J; Mozgunov, Pavel; Jaki, Thomas; Wason, James M S.

Biostatistics ; 24(4): 1000-1016, 2023 Oct 18.

Article in English | MEDLINE | ID: mdl-35993875

ABSTRACT

Basket trials are increasingly used for the simultaneous evaluation of a new treatment in various patient subgroups under one overarching protocol. We propose a Bayesian approach to sample size determination in basket trials that permit borrowing of information between commensurate subsets. Specifically, we consider a randomized basket trial design where patients are randomly assigned to the new treatment or control within each trial subset ("subtrial" for short). Closed-form sample size formulae are derived to ensure that each subtrial has a specified chance of correctly deciding whether the new treatment is superior to or not better than the control by some clinically relevant difference. Given prespecified levels of pairwise (in)commensurability, the subtrial sample sizes are solved simultaneously. The proposed Bayesian approach resembles the frequentist formulation of the problem in yielding comparable sample sizes for circumstances of no borrowing. When borrowing is enabled between commensurate subtrials, a considerably smaller trial sample size is required compared to the widely implemented approach of no borrowing. We illustrate the use of our sample size formulae with two examples based on real basket trials. A comprehensive simulation study further shows that the proposed methodology can maintain the true positive and false positive rates at desired levels.

Subjects

Research Design, Humans, Sample Size, Bayes Theorem, Computer Simulation
