Búsqueda | Portal de Búsqueda de la BVS

1.

OASIS: An interpretable, finite-sample valid alternative to Pearson's X² for scientific discovery.

Baharav, Tavor Z; Tse, David; Salzman, Julia.

Proc Natl Acad Sci U S A ; 121(15): e2304671121, 2024 Apr 09.

Artículo en Inglés | MEDLINE | ID: mdl-38564640

RESUMEN

Contingency tables, data represented as counts matrices, are ubiquitous across quantitative research and data-science applications. Existing statistical tests are insufficient however, as none are simultaneously computationally efficient and statistically valid for a finite number of observations. In this work, motivated by a recent application in reference-free genomic inference [K. Chaung et al., Cell 186, 5440-5456 (2023)], we develop Optimized Adaptive Statistic for Inferring Structure (OASIS), a family of statistical tests for contingency tables. OASIS constructs a test statistic which is linear in the normalized data matrix, providing closed-form P-value bounds through classical concentration inequalities. In the process, OASIS provides a decomposition of the table, lending interpretability to its rejection of the null. We derive the asymptotic distribution of the OASIS test statistic, showing that these finite-sample bounds correctly characterize the test statistic's P-value up to a variance term. Experiments on genomic sequencing data highlight the power and interpretability of OASIS. Using OASIS, we develop a method that can detect SARS-CoV-2 and Mycobacterium tuberculosis strains de novo, which existing approaches cannot achieve. We demonstrate in simulations that OASIS is robust to overdispersion, a common feature in genomic data like single-cell RNA sequencing, where under accepted noise models OASIS provides good control of the false discovery rate, while Pearson's [Formula: see text] consistently rejects the null. Additionally, we show in simulations that OASIS is more powerful than Pearson's [Formula: see text] in certain regimes, including for some important two group alternatives, which we corroborate with approximate power calculations.

Asunto(s)

Genoma , Genómica , Mapeo Cromosómico

2.

Graphene sandwich-based biological specimen preparation for cryo-EM analysis.

Xu, Jie; Gao, Xiaoyin; Zheng, Liming; Jia, Xia; Xu, Kui; Ma, Yuwei; Wei, Xiaoding; Liu, Nan; Peng, Hailin; Wang, Hong-Wei.

Proc Natl Acad Sci U S A ; 121(5): e2309384121, 2024 Jan 30.

Artículo en Inglés | MEDLINE | ID: mdl-38252835

RESUMEN

High-quality specimen preparation plays a crucial role in cryo-electron microscopy (cryo-EM) structural analysis. In this study, we have developed a reliable and convenient technique called the graphene sandwich method for preparing cryo-EM specimens. This method involves using two layers of graphene films that enclose macromolecules on both sides, allowing for an appropriate ice thickness for cryo-EM analysis. The graphene sandwich helps to mitigate beam-induced charging effect and reduce particle motion compared to specimens prepared using the traditional method with graphene support on only one side, therefore improving the cryo-EM data quality. These advancements may open new opportunities to expand the use of graphene in the field of biological electron microscopy.

Asunto(s)

Grafito , Microscopía por Crioelectrón , Exactitud de los Datos , Movimiento (Física)

3.

Current concepts, advances, and challenges in deciphering the human microbiota with metatranscriptomics.

Ojala, Teija; Häkkinen, Aino-Elina; Kankuri, Esko; Kankainen, Matti.

Trends Genet ; 39(9): 686-702, 2023 09.

Artículo en Inglés | MEDLINE | ID: mdl-37365103

RESUMEN

Metatranscriptomics refers to the analysis of the collective microbial transcriptome of a sample. Its increased utilization for the characterization of human-associated microbial communities has enabled the discovery of many disease-state related microbial activities. Here, we review the principles of metatranscriptomics-based analysis of human-associated microbial samples. We describe strengths and weaknesses of popular sample preparation, sequencing, and bioinformatics approaches and summarize strategies for their use. We then discuss how human-associated microbial communities have recently been examined and how their characterization may change. We conclude that metatranscriptomics insights into human microbiotas under health and disease have not only expanded our knowledge on human health, but also opened avenues for rational antimicrobial drug use and disease management.

Asunto(s)

Metagenómica , Microbiota , Humanos , Microbiota/genética , Transcriptoma/genética , Secuenciación de Nucleótidos de Alto Rendimiento

4.

Polygenic risk prediction: why and when out-of-sample prediction R² can exceed SNP-based heritability.

Wang, Xiaotong; Walker, Alicia; Revez, Joana A; Ni, Guiyan; Adams, Mark J; McIntosh, Andrew M; Visscher, Peter M; Wray, Naomi R.

Am J Hum Genet ; 110(7): 1207-1215, 2023 07 06.

Artículo en Inglés | MEDLINE | ID: mdl-37379836

RESUMEN

In polygenic score (PGS) analysis, the coefficient of determination (R2) is a key statistic to evaluate efficacy. R2 is the proportion of phenotypic variance explained by the PGS, calculated in a cohort that is independent of the genome-wide association study (GWAS) that provided estimates of allelic effect sizes. The SNP-based heritability (hSNP2, the proportion of total phenotypic variances attributable to all common SNPs) is the theoretical upper limit of the out-of-sample prediction R2. However, in real data analyses R2 has been reported to exceed hSNP2, which occurs in parallel with the observation that hSNP2 estimates tend to decline as the number of cohorts being meta-analyzed increases. Here, we quantify why and when these observations are expected. Using theory and simulation, we show that if heterogeneities in cohort-specific hSNP2 exist, or if genetic correlations between cohorts are less than one, hSNP2 estimates can decrease as the number of cohorts being meta-analyzed increases. We derive conditions when the out-of-sample prediction R2 will be greater than hSNP2 and show the validity of our derivations with real data from a binary trait (major depression) and a continuous trait (educational attainment). Our research calls for a better approach to integrating information from multiple cohorts to address issues of between-cohort heterogeneity.

Asunto(s)

Estudio de Asociación del Genoma Completo , Polimorfismo de Nucleótido Simple , Humanos , Polimorfismo de Nucleótido Simple/genética , Herencia Multifactorial/genética , Fenotipo , Simulación por Computador

5.

Detecting tipping points of complex diseases by network information entropy.

Lyu, Chengshang; Chen, Lingxi; Liu, Xiaoping.

Brief Bioinform ; 25(4)2024 May 23.

Artículo en Inglés | MEDLINE | ID: mdl-38960408

RESUMEN

The progression of complex diseases often involves abrupt and non-linear changes characterized by sudden shifts that trigger critical transformations. Identifying these critical states or tipping points is crucial for understanding disease progression and developing effective interventions. To address this challenge, we have developed a model-free method named Network Information Entropy of Edges (NIEE). Leveraging dynamic network biomarkers, sample-specific networks, and information entropy theories, NIEE can detect critical states or tipping points in diverse data types, including bulk, single-sample expression data. By applying NIEE to real disease datasets, we successfully identified critical predisease stages and tipping points before disease onset. Our findings underscore NIEE's potential to enhance comprehension of complex disease development.

Asunto(s)

Entropía , Humanos , Redes Reguladoras de Genes , Biología Computacional/métodos , Progresión de la Enfermedad , Biomarcadores , Algoritmos

6.

scEVOLVE: cell-type incremental annotation without forgetting for single-cell RNA-seq data.

Zhai, Yuyao; Chen, Liang; Deng, Minghua.

Brief Bioinform ; 25(2)2024 Jan 22.

Artículo en Inglés | MEDLINE | ID: mdl-38366803

RESUMEN

The evolution in single-cell RNA sequencing (scRNA-seq) technology has opened a new avenue for researchers to inspect cellular heterogeneity with single-cell precision. One crucial aspect of this technology is cell-type annotation, which is fundamental for any subsequent analysis in single-cell data mining. Recently, the scientific community has seen a surge in the development of automatic annotation methods aimed at this task. However, these methods generally operate at a steady-state total cell-type capacity, significantly restricting the cell annotation systems'capacity for continuous knowledge acquisition. Furthermore, creating a unified scRNA-seq annotation system remains challenged by the need to progressively expand its understanding of ever-increasing cell-type concepts derived from a continuous data stream. In response to these challenges, this paper presents a novel and challenging setting for annotation, namely cell-type incremental annotation. This concept is designed to perpetually enhance cell-type knowledge, gleaned from continuously incoming data. This task encounters difficulty with data stream samples that can only be observed once, leading to catastrophic forgetting. To address this problem, we introduce our breakthrough methodology termed scEVOLVE, an incremental annotation method. This innovative approach is built upon the methodology of contrastive sample replay combined with the fundamental principle of partition confidence maximization. Specifically, we initially retain and replay sections of the old data in each subsequent training phase, then establish a unique prototypical learning objective to mitigate the cell-type imbalance problem, as an alternative to using cross-entropy. To effectively emulate a model that trains concurrently with complete data, we introduce a cell-type decorrelation strategy that efficiently scatters feature representations of each cell type uniformly. We constructed the scEVOLVE framework with simplicity and ease of integration into most deep softmax-based single-cell annotation methods. Thorough experiments conducted on a range of meticulously constructed benchmarks consistently prove that our methodology can incrementally learn numerous cell types over an extended period, outperforming other strategies that fail quickly. As far as our knowledge extends, this is the first attempt to propose and formulate an end-to-end algorithm framework to address this new, practical task. Additionally, scEVOLVE, coded in Python using the Pytorch machine-learning library, is freely accessible at https://github.com/aimeeyaoyao/scEVOLVE.

Asunto(s)

Algoritmos , Análisis de Expresión Génica de una Sola Célula , Benchmarking , Entropía , Biblioteca de Genes , Análisis de Secuencia de ARN , Perfilación de la Expresión Génica , Análisis por Conglomerados

7.

A hybrid demultiplexing strategy that improves performance and robustness of cell hashing.

Li, Lei; Sun, Jiayi; Fu, Yanbin; Changrob, Siriruk; McGrath, Joshua J C; Wilson, Patrick C.

Brief Bioinform ; 25(4)2024 May 23.

Artículo en Inglés | MEDLINE | ID: mdl-38828640

RESUMEN

Cell hashing, a nucleotide barcode-based method that allows users to pool multiple samples and demultiplex in downstream analysis, has gained widespread popularity in single-cell sequencing due to its compatibility, simplicity, and cost-effectiveness. Despite these advantages, the performance of this method remains unsatisfactory under certain circumstances, especially in experiments that have imbalanced sample sizes or use many hashtag antibodies. Here, we introduce a hybrid demultiplexing strategy that increases accuracy and cell recovery in multi-sample single-cell experiments. This approach correlates the results of cell hashing and genetic variant clustering, enabling precise and efficient cell identity determination without additional experimental costs or efforts. In addition, we developed HTOreader, a demultiplexing tool for cell hashing that improves the accuracy of cut-off calling by avoiding the dominance of negative signals in experiments with many hashtags or imbalanced sample sizes. When compared to existing methods using real-world datasets, this hybrid approach and HTOreader consistently generate reliable results with increased accuracy and cell recovery.

Asunto(s)

Análisis de la Célula Individual , Análisis de la Célula Individual/métodos , Humanos , Algoritmos , Programas Informáticos , Secuenciación de Nucleótidos de Alto Rendimiento/métodos , Biología Computacional/métodos

8.

MGCNSS: miRNA-disease association prediction with multi-layer graph convolution and distance-based negative sample selection strategy.

Tian, Zhen; Han, Chenguang; Xu, Lewen; Teng, Zhixia; Song, Wei.

Brief Bioinform ; 25(3)2024 Mar 27.

Artículo en Inglés | MEDLINE | ID: mdl-38622356

RESUMEN

Identifying disease-associated microRNAs (miRNAs) could help understand the deep mechanism of diseases, which promotes the development of new medicine. Recently, network-based approaches have been widely proposed for inferring the potential associations between miRNAs and diseases. However, these approaches ignore the importance of different relations in meta-paths when learning the embeddings of miRNAs and diseases. Besides, they pay little attention to screening out reliable negative samples which is crucial for improving the prediction accuracy. In this study, we propose a novel approach named MGCNSS with the multi-layer graph convolution and high-quality negative sample selection strategy. Specifically, MGCNSS first constructs a comprehensive heterogeneous network by integrating miRNA and disease similarity networks coupled with their known association relationships. Then, we employ the multi-layer graph convolution to automatically capture the meta-path relations with different lengths in the heterogeneous network and learn the discriminative representations of miRNAs and diseases. After that, MGCNSS establishes a highly reliable negative sample set from the unlabeled sample set with the negative distance-based sample selection strategy. Finally, we train MGCNSS under an unsupervised learning manner and predict the potential associations between miRNAs and diseases. The experimental results fully demonstrate that MGCNSS outperforms all baseline methods on both balanced and imbalanced datasets. More importantly, we conduct case studies on colon neoplasms and esophageal neoplasms, further confirming the ability of MGCNSS to detect potential candidate miRNAs. The source code is publicly available on GitHub https://github.com/15136943622/MGCNSS/tree/master.

Asunto(s)

Neoplasias del Colon , MicroARNs , Humanos , MicroARNs/genética , Algoritmos , Biología Computacional/métodos , Programas Informáticos , Neoplasias del Colon/genética

9.

Elliptic PDE learning is provably data-efficient.

Boullé, Nicolas; Halikias, Diana; Townsend, Alex.

Proc Natl Acad Sci U S A ; 120(39): e2303904120, 2023 Sep 26.

Artículo en Inglés | MEDLINE | ID: mdl-37722063

RESUMEN

Partial differential equations (PDE) learning is an emerging field that combines physics and machine learning to recover unknown physical systems from experimental data. While deep learning models traditionally require copious amounts of training data, recent PDE learning techniques achieve spectacular results with limited data availability. Still, these results are empirical. Our work provides theoretical guarantees on the number of input-output training pairs required in PDE learning. Specifically, we exploit randomized numerical linear algebra and PDE theory to derive a provably data-efficient algorithm that recovers solution operators of three-dimensional uniformly elliptic PDEs from input-output data and achieves an exponential convergence rate of the error with respect to the size of the training dataset with an exceptionally high probability of success.

10.

Warming-induced tree growth may help offset increasing disturbance across the Canadian boreal forest.

Wang, Jiejie; Taylor, Anthony R; D'Orangeville, Loïc.

Proc Natl Acad Sci U S A ; 120(2): e2212780120, 2023 01 10.

Artículo en Inglés | MEDLINE | ID: mdl-36595673

RESUMEN

Large projected increases in forest disturbance pose a major threat to future wood fiber supply and carbon sequestration in the cold-limited, Canadian boreal forest ecosystem. Given the large sensitivity of tree growth to temperature, warming-induced increases in forest productivity have the potential to reduce these threats, but research efforts to date have yielded contradictory results attributed to limited data availability, methodological biases, and regional variability in forest dynamics. Here, we apply a machine learning algorithm to an unprecedented network of over 1 million tree growth records (1958 to 2018) from 20,089 permanent sample plots distributed across both Canada and the United States, spanning a 16.5 °C climatic gradient. Fitted models were then used to project the near-term (2050 s time period) growth of the six most abundant tree species in the Canadian boreal forest. Our results reveal a large, positive effect of increasing thermal energy on tree growth for most of the target species, leading to 20.5 to 22.7% projected gains in growth with climate change under RCP 4.5 and 8.5. The magnitude of these gains, which peak in the colder and wetter regions of the boreal forest, suggests that warming-induced growth increases should no longer be considered marginal but may in fact significantly offset some of the negative impacts of projected increases in drought and wildfire on wood supply and carbon sequestration and have major implications on ecological forecasts and the global economy.

Asunto(s)

Taiga , Árboles , Canadá , Ecosistema , Bosques , Cambio Climático

11.

Rubble pile asteroids are forever.

Jourdan, Fred; Timms, Nicholas E; Nakamura, Tomoki; Rickard, William D A; Mayers, Celia; Reddy, Steven M; Saxey, David; Daly, Luke; Bland, Phil A; Eroglu, Ela; Fougerouse, Denis.

Proc Natl Acad Sci U S A ; 120(5): e2214353120, 2023 Jan 31.

Artículo en Inglés | MEDLINE | ID: mdl-36689662

RESUMEN

Rubble piles asteroids consist of reassembled fragments from shattered monolithic asteroids and are much more abundant than previously thought in the solar system. Although monolithic asteroids that are a kilometer in diameter have been predicted to have a lifespan of few 100 million years, it is currently not known how durable rubble pile asteroids are. Here, we show that rubble pile asteroids can survive ambient solar system bombardment processes for extremely long periods and potentially 10 times longer than their monolith counterparts. We studied three regolith dust particles recovered by the Hayabusa space probe from the rubble pile asteroid 25143 Itokawa using electron backscatter diffraction, time-of-flight secondary ion mass spectrometry, atom probe tomography, and 40Ar/39Ar dating techniques. Our results show that the particles have only been affected by shock pressure of ca. 5 to 15 GPa. Two particles have 40Ar/39Ar ages of 4,219 ± 35 and 4,149 ± 41 My and when combined with thermal and diffusion models; these results constrain the formation age of the rubble pile structure to ≥4.2 billion years ago. Such a long survival time for an asteroid is attributed to the shock-absorbent nature of rubble pile material and suggests that rubble piles are hard to destroy once they are created. Our results suggest that rubble piles are probably more abundant in the asteroid belt than previously thought and provide constrain to help develop mitigation strategies to prevent asteroid collisions with Earth.

Asunto(s)

Polvo , Planeta Tierra , Difusión , Electrones , Longevidad

12.

Gene-based association tests in family samples using GWAS summary statistics.

Wang, Peng; Xu, Xiao; Li, Ming; Lou, Xiang-Yang; Xu, Siqi; Wu, Baolin; Gao, Guimin; Yin, Ping; Liu, Nianjun.

Genet Epidemiol ; 48(3): 103-113, 2024 04.

Artículo en Inglés | MEDLINE | ID: mdl-38317324

RESUMEN

Genome-wide association studies (GWAS) have led to rapid growth in detecting genetic variants associated with various phenotypes. Owing to a great number of publicly accessible GWAS summary statistics, and the difficulty in obtaining individual-level genotype data, many existing gene-based association tests have been adapted to require only GWAS summary statistics rather than individual-level data. However, these association tests are restricted to unrelated individuals and thus do not apply to family samples directly. Moreover, due to its flexibility and effectiveness, the linear mixed model has been increasingly utilized in GWAS to handle correlated data, such as family samples. However, it remains unknown how to perform gene-based association tests in family samples using the GWAS summary statistics estimated from the linear mixed model. In this study, we show that, when family size is negligible compared to the total sample size, the diagonal block structure of the kinship matrix makes it possible to approximate the correlation matrix of marginal Z scores by linkage disequilibrium matrix. Based on this result, current methods utilizing summary statistics for unrelated individuals can be directly applied to family data without any modifications. Our simulation results demonstrate that this proposed strategy controls the type 1 error rate well in various situations. Finally, we exemplify the usefulness of the proposed approach with a dental caries GWAS data set.

Asunto(s)

Caries Dental , Estudio de Asociación del Genoma Completo , Humanos , Estudio de Asociación del Genoma Completo/métodos , Polimorfismo de Nucleótido Simple , Modelos Genéticos , Fenotipo

13.

Targeted volume correlative light and electron microscopy of an environmental marine microorganism.

Mocaer, Karel; Mizzon, Giulia; Gunkel, Manuel; Halavatyi, Aliaksandr; Steyer, Anna; Oorschot, Viola; Schorb, Martin; Le Kieffre, Charlotte; Yee, Daniel P; Chevalier, Fabien; Gallet, Benoit; Decelle, Johan; Schwab, Yannick; Ronchi, Paolo.

J Cell Sci ; 136(15)2023 08 01.

Artículo en Inglés | MEDLINE | ID: mdl-37455654

RESUMEN

Photosynthetic microalgae are responsible for an important fraction of CO2 fixation and O2 production on Earth. Three-dimensional (3D) ultrastructural characterization of these organisms in their natural environment can contribute to a deeper understanding of their cell biology. However, the low throughput of volume electron microscopy (vEM) methods along with the complexity and heterogeneity of environmental samples pose great technical challenges. In the present study, we used a workflow based on a specific electron microscopy sample preparation method compatible with both light and vEM imaging in order to target one cell among a complex natural community. This method revealed the 3D subcellular landscape of a photosynthetic dinoflagellate, which we identified as Ensiculifera tyrrhenica, with quantitative characterization of multiple organelles. We show that this cell contains a single convoluted chloroplast and show the arrangement of the flagellar apparatus with its associated photosensitive elements. Moreover, we observed partial chromatin unfolding, potentially associated with transcription activity in these organisms, in which chromosomes are permanently condensed. Together with providing insights in dinoflagellate biology, this proof-of-principle study illustrates an efficient tool for the targeted ultrastructural analysis of environmental microorganisms in heterogeneous mixes.

Asunto(s)

Imagenología Tridimensional , Microscopía Electrónica de Rastreo , Imagenología Tridimensional/métodos

14.

Bayesian mixed model inference for genetic association under related samples with brain network phenotype.

Tian, Xinyuan; Wang, Yiting; Wang, Selena; Zhao, Yi; Zhao, Yize.

Biostatistics ; 2024 Mar 17.

Artículo en Inglés | MEDLINE | ID: mdl-38494649

RESUMEN

Genetic association studies for brain connectivity phenotypes have gained prominence due to advances in noninvasive imaging techniques and quantitative genetics. Brain connectivity traits, characterized by network configurations and unique biological structures, present distinct challenges compared to other quantitative phenotypes. Furthermore, the presence of sample relatedness in the most imaging genetics studies limits the feasibility of adopting existing network-response modeling. In this article, we fill this gap by proposing a Bayesian network-response mixed-effect model that considers a network-variate phenotype and incorporates population structures including pedigrees and unknown sample relatedness. To accommodate the inherent topological architecture associated with the genetic contributions to the phenotype, we model the effect components via a set of effect network configurations and impose an inter-network sparsity and intra-network shrinkage to dissect the phenotypic network configurations affected by the risk genetic variant. A Markov chain Monte Carlo (MCMC) algorithm is further developed to facilitate uncertainty quantification. We evaluate the performance of our model through extensive simulations. By further applying the method to study, the genetic bases for brain structural connectivity using data from the Human Connectome Project with excessive family structures, we obtain plausible and interpretable results. Beyond brain connectivity genetic studies, our proposed model also provides a general linear mixed-effect regression framework for network-variate outcomes.

15.

SPNE: sample-perturbed network entropy for revealing critical states of complex biological systems.

Zhong, Jiayuan; Ding, Dandan; Liu, Juntan; Liu, Rui; Chen, Pei.

Brief Bioinform ; 24(2)2023 03 19.

Artículo en Inglés | MEDLINE | ID: mdl-36705581

RESUMEN

Complex biological systems do not always develop smoothly but occasionally undergo a sharp transition; i.e. there exists a critical transition or tipping point at which a drastic qualitative shift occurs. Hunting for such a critical transition is important to prevent or delay the occurrence of catastrophic consequences, such as disease deterioration. However, the identification of the critical state for complex biological systems is still a challenging problem when using high-dimensional small sample data, especially where only a certain sample is available, which often leads to the failure of most traditional statistical approaches. In this study, a novel quantitative method, sample-perturbed network entropy (SPNE), is developed based on the sample-perturbed directed network to reveal the critical state of complex biological systems at the single-sample level. Specifically, the SPNE approach effectively quantifies the perturbation effect caused by a specific sample on the directed network in terms of network entropy and thus captures the criticality of biological systems. This model-free method was applied to both bulk and single-cell expression data. Our approach was validated by successfully detecting the early warning signals of the critical states for six real datasets, including four tumor datasets from The Cancer Genome Atlas (TCGA) and two single-cell datasets of cell differentiation. In addition, the functional analyses of signaling biomarkers demonstrated the effectiveness of the analytical and computational results.

Asunto(s)

Neoplasias , Humanos , Entropía , Progresión de la Enfermedad , Biomarcadores/metabolismo , Transducción de Señal

16.

A comprehensive investigation of statistical and machine learning approaches for predicting complex human diseases on genomic variants.

Wang, Chonghao; Zhang, Jing; Veldsman, Werner Pieter; Zhou, Xin; Zhang, Lu.

Brief Bioinform ; 24(1)2023 01 19.

Artículo en Inglés | MEDLINE | ID: mdl-36585786

RESUMEN

Quantifying an individual's risk for common diseases is an important goal of precision health. The polygenic risk score (PRS), which aggregates multiple risk alleles of candidate diseases, has emerged as a standard approach for identifying high-risk individuals. Although several studies have been performed to benchmark the PRS calculation tools and assess their potential to guide future clinical applications, some issues remain to be further investigated, such as lacking (i) various simulated data with different genetic effects; (ii) evaluation of machine learning models and (iii) evaluation on multiple ancestries studies. In this study, we systematically validated and compared 13 statistical methods, 5 machine learning models and 2 ensemble models using simulated data with additive and genetic interaction models, 22 common diseases with internal training sets, 4 common diseases with external summary statistics and 3 common diseases for trans-ancestry studies in UK Biobank. The statistical methods were better in simulated data from additive models and machine learning models have edges for data that include genetic interactions. Ensemble models are generally the best choice by integrating various statistical methods. LDpred2 outperformed the other standalone tools, whereas PRS-CS, lassosum and DBSLMM showed comparable performance. We also identified that disease heritability strongly affected the predictive performance of all methods. Both the number and effect sizes of risk SNPs are important; and sample size strongly influences the performance of all methods. For the trans-ancestry studies, we found that the performance of most methods became worse when training and testing sets were from different populations.

Asunto(s)

Aprendizaje Automático , Herencia Multifactorial , Humanos , Factores de Riesgo , Genómica , Predisposición Genética a la Enfermedad , Estudio de Asociación del Genoma Completo/métodos

17.

SWEET: a single-sample network inference method for deciphering individual features in disease.

Chen, Hsin-Hua; Hsueh, Chun-Wei; Lee, Chia-Hwa; Hao, Ting-Yi; Tu, Tzu-Ying; Chang, Lan-Yun; Lee, Jih-Chin; Lin, Chun-Yu.

Brief Bioinform ; 24(2)2023 03 19.

Artículo en Inglés | MEDLINE | ID: mdl-36719112

RESUMEN

Recently, extracting inherent biological system information (e.g. cellular networks) from genome-wide expression profiles for developing personalized diagnostic and therapeutic strategies has become increasingly important. However, accurately constructing single-sample networks (SINs) to capture individual characteristics and heterogeneity in disease remains challenging. Here, we propose a sample-specific-weighted correlation network (SWEET) method to model SINs by integrating the genome-wide sample-to-sample correlation (i.e. sample weights) with the differential network between perturbed and aggregate networks. For a group of samples, the genome-wide sample weights can be assessed without prior knowledge of intrinsic subpopulations to address the network edge number bias caused by sample size differences. Compared with the state-of-the-art SIN inference methods, the SWEET SINs in 16 cancers more likely fit the scale-free property, display higher overlap with the human interactomes and perform better in identifying three types of cancer-related genes. Moreover, integrating SWEET SINs with a network proximity measure facilitates characterizing individual features and therapy in diseases, such as somatic mutation, mut-driver and essential genes. Biological experiments further validated two candidate repurposable drugs, albendazole for head and neck squamous cell carcinoma (HNSCC) and lung adenocarcinoma (LUAD) and encorafenib for HNSCC. By applying SWEET, we also identified two possible LUAD subtypes that exhibit distinct clinical features and molecular mechanisms. Overall, the SWEET method complements current SIN inference and analysis methods and presents a view of biological systems at the network level to offer numerous clues for further investigation and clinical translation in network medicine and precision medicine.

Asunto(s)

Redes Reguladoras de Genes , Neoplasias de Cabeza y Cuello , Humanos , Carcinoma de Células Escamosas de Cabeza y Cuello/genética , Oncogenes , Neoplasias de Cabeza y Cuello/genética

18.

Consensus clustering with missing labels (ccml): a consensus clustering tool for multi-omics integrative prediction in cohorts with unequal sample coverage.

Li, Chuan-Xing; Chen, Hongyan; Zounemat-Kermani, Nazanin; Adcock, Ian M; Sköld, C Magnus; Zhou, Meng; Wheelock, Åsa M.

Brief Bioinform ; 25(1)2023 11 22.

Artículo en Inglés | MEDLINE | ID: mdl-38205966

RESUMEN

Multi-omics data integration is a complex and challenging task in biomedical research. Consensus clustering, also known as meta-clustering or cluster ensembles, has become an increasingly popular downstream tool for phenotyping and endotyping using multiple omics and clinical data. However, current consensus clustering methods typically rely on ensembling clustering outputs with similar sample coverages (mathematical replicates), which may not reflect real-world data with varying sample coverages (biological replicates). To address this issue, we propose a new consensus clustering with missing labels (ccml) strategy termed ccml, an R protocol for two-step consensus clustering that can handle unequal missing labels (i.e. multiple predictive labels with different sample coverages). Initially, the regular consensus weights are adjusted (normalized) by sample coverage, then a regular consensus clustering is performed to predict the optimal final cluster. We applied the ccml method to predict molecularly distinct groups based on 9-omics integration in the Karolinska COSMIC cohort, which investigates chronic obstructive pulmonary disease, and 24-omics handprint integrative subgrouping of adult asthma patients of the U-BIOPRED cohort. We propose ccml as a downstream toolkit for multi-omics integration analysis algorithms such as Similarity Network Fusion and robust clustering of clinical data to overcome the limitations posed by missing data, which is inevitable in human cohorts consisting of multiple data modalities. The ccml tool is available in the R language (https://CRAN.R-project.org/package=ccml, https://github.com/pulmonomics-lab/ccml, or https://github.com/ZhoulabCPH/ccml).

Asunto(s)

Asma , Multiómica , Adulto , Humanos , Consenso , Análisis por Conglomerados , Algoritmos , Asma/genética

19.

µPhos: a scalable and sensitive platform for high-dimensional phosphoproteomics.

Oliinyk, Denys; Will, Andreas; Schneidmadel, Felix R; Böhme, Maximilian; Rinke, Jenny; Hochhaus, Andreas; Ernst, Thomas; Hahn, Nina; Geis, Christian; Lubeck, Markus; Raether, Oliver; Humphrey, Sean J; Meier, Florian.

Mol Syst Biol ; 20(8): 972-995, 2024 Aug.

Artículo en Inglés | MEDLINE | ID: mdl-38907068

RESUMEN

Mass spectrometry has revolutionized cell signaling research by vastly simplifying the analysis of many thousands of phosphorylation sites in the human proteome. Defining the cellular response to perturbations is crucial for further illuminating the functionality of the phosphoproteome. Here we describe µPhos ('microPhos'), an accessible phosphoproteomics platform that permits phosphopeptide enrichment from 96-well cell culture and small tissue amounts in <8 h total processing time. By greatly minimizing transfer steps and liquid volumes, we demonstrate increased sensitivity, >90% selectivity, and excellent quantitative reproducibility. Employing highly sensitive trapped ion mobility mass spectrometry, we quantify ~17,000 Class I phosphosites in a human cancer cell line using 20 µg starting material, and confidently localize ~6200 phosphosites from 1 µg. This depth covers key signaling pathways, rendering sample-limited applications and perturbation experiments with hundreds of samples viable. We employ µPhos to study drug- and time-dependent response signatures in a leukemia cell line, and by quantifying 30,000 Class I phosphosites in the mouse brain we reveal distinct spatial kinase activities in subregions of the hippocampal formation.

Asunto(s)

Fosfopéptidos , Fosfoproteínas , Proteómica , Proteómica/métodos , Humanos , Animales , Ratones , Fosfoproteínas/metabolismo , Fosforilación , Línea Celular Tumoral , Fosfopéptidos/metabolismo , Fosfopéptidos/análisis , Espectrometría de Masas/métodos , Transducción de Señal , Proteoma/metabolismo , Reproducibilidad de los Resultados , Hipocampo/metabolismo , Hipocampo/citología

20.

Direct current stimulation modulates prefrontal cell activity and behaviour without inducing seizure-like firing.

Fehring, Daniel J; Yokoo, Seiichirou; Abe, Hiroshi; Buckley, Mark J; Miyamoto, Kentaro; Jaberzadeh, Shapour; Yamamori, Tetsuo; Tanaka, Keiji; Rosa, Marcello G P; Mansouri, Farshad A.

Brain ; 2024 Aug 21.

Artículo en Inglés | MEDLINE | ID: mdl-39166526

RESUMEN

Transcranial direct current stimulation (tDCS) has garnered significant interest for its potential to enhance cognitive functions and as a therapeutic intervention in various cognitive disorders. However, the clinical application of tDCS has been hampered by significant variability in its cognitive outcomes. Furthermore, the widespread use of tDCS has raised concerns regarding its safety and efficacy, particularly due to our limited understanding of its underlying neural mechanisms at the cellular level. We still do not know 'where', 'when', and 'how' tDCS modulates information encoding by neurons, to lead to the observed changes in cognitive functions. Without elucidating these fundamental unknowns, the root causes of its outcome variability and long-term safety remain elusive, challenging the effective application of tDCS in clinical settings. Addressing this gap, our study investigates the effects of tDCS, applied over the dorsolateral prefrontal cortex (dlPFC), on cognitive abilities and individual neuron activity in macaque monkeys performing cognitive tasks. Like humans performing a Delayed Match-to-Sample task, monkeys exhibited practice-related slowing in their responses (within-session behavioural adaptation). Concurrently, there were practice-related changes in simultaneously recorded activity of prefrontal neurons (within-session neuronal adaptation). Anodal tDCS attenuated both these behavioural and neuronal adaptations when compared to sham. Furthermore, tDCS abolished the correlation between monkeys' response time and neuronal firing rate. At a single-cell level, we also found that following tDCS, neuronal firing rate was more likely to exhibit task-specific modulation than after sham stimulation. These tDCS-induced changes in both behaviour and neuronal activity persisted even after the end of tDCS stimulation. Importantly, multiple applications of tDCS did not alter burst-like firing rates of individual neurons when compared to sham stimulation. This suggests that tDCS modulates neural activity without enhancing susceptibility to epileptiform activity, confirming a potential for safe use in clinical settings. Our research contributes unprecedented insights into the 'where', 'when', and 'how' of tDCS effects on neuronal activity and cognitive functions by showing that modulation of monkeys' behaviour by the tDCS of the prefrontal cortex is accompanied by alterations in prefrontal cortical cell activity ('where') during distinct trial phases ('when'). Importantly, tDCS led to task-specific and state-dependent alterations in prefrontal cell activities ('how'). Our findings suggest a significant shift from the view that the tDCS effects are merely due to polarity-specific shifts in cortical excitability and instead, propose a more complex mechanism of action for tDCS that encompasses various aspects of cortical neuronal activity without increasing burst-like epileptiform susceptibility.

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

Asunto(s)

RESUMEN

ENVIAR RESULTADO:

SELECCIÓN DE REFERENCIAS

DETALLE DE LA BÚSQUEDA