Search | BVS (Virtual Health Library) Search Portal

1.

FilamentID reveals the composition and function of metabolic enzyme polymers during gametogenesis.

Hugener, Jannik; Xu, Jingwei; Wettstein, Rahel; Ioannidi, Lydia; Velikov, Daniel; Wollweber, Florian; Henggeler, Adrian; Matos, Joao; Pilhofer, Martin.

Cell ; 187(13): 3303-3318.e18, 2024 Jun 20.

Article in English | MEDLINE | ID: mdl-38906101

ABSTRACT

Gamete formation and subsequent offspring development often involve extended phases of suspended cellular development or even dormancy. How cells adapt to recover and resume growth remains poorly understood. Here, we visualized budding yeast cells undergoing meiosis by cryo-electron tomography (cryoET) and discovered elaborate filamentous assemblies decorating the nucleus, cytoplasm, and mitochondria. To determine filament composition, we developed a "filament identification" (FilamentID) workflow that combines multiscale cryoET/cryo-electron microscopy (cryoEM) analyses of partially lysed cells or organelles. FilamentID identified the mitochondrial filaments as being composed of the conserved aldehyde dehydrogenase Ald4 (ALDH2) and the nucleoplasmic/cytoplasmic filaments as consisting of acetyl-coenzyme A (CoA) synthetase Acs1 (ACSS2). Structural characterization further revealed the mechanism underlying polymerization and enabled us to genetically perturb filament formation. Acs1 polymerization facilitates the recovery of chronologically aged spores and, more generally, the cell cycle re-entry of starved cells. FilamentID is broadly applicable to characterize filaments of unknown identity in diverse cellular contexts.

Subject(s)

Gametogenesis, Mitochondria, Saccharomyces cerevisiae Proteins, Saccharomyces cerevisiae, Aldehyde Dehydrogenase/metabolism, Aldehyde Dehydrogenase/chemistry, Cell Nucleus/metabolism, Cell Nucleus/ultrastructure, Coenzyme A Ligases/metabolism, Cryoelectron Microscopy, Cytoplasm/metabolism, Electron Microscope Tomography, Meiosis, Mitochondria/metabolism, Mitochondria/ultrastructure, Saccharomyces cerevisiae/metabolism, Saccharomyces cerevisiae/ultrastructure, Saccharomyces cerevisiae Proteins/metabolism, Saccharomyces cerevisiae Proteins/chemistry, Fungal Spores/metabolism, Molecular Models, Quaternary Protein Structure

2.

Evaluation of polygenic scoring methods in five biobanks shows larger variation between biobanks than methods and finds benefits of ensemble learning.

Monti, Remo; Eick, Lisa; Hudjashov, Georgi; Läll, Kristi; Kanoni, Stavroula; Wolford, Brooke N; Wingfield, Benjamin; Pain, Oliver; Wharrie, Sophie; Jermy, Bradley; McMahon, Aoife; Hartonen, Tuomo; Heyne, Henrike; Mars, Nina; Lambert, Samuel; Hveem, Kristian; Inouye, Michael; van Heel, David A; Mägi, Reedik; Marttinen, Pekka; Ripatti, Samuli; Ganna, Andrea; Lippert, Christoph.

Am J Hum Genet ; 111(7): 1431-1447, 2024 Jul 11.

Article in English | MEDLINE | ID: mdl-38908374

ABSTRACT

Methods of estimating polygenic scores (PGSs) from genome-wide association studies are increasingly utilized. However, independent method evaluation is lacking, and method comparisons are often limited. Here, we evaluate polygenic scores derived via seven methods in five biobank studies (totaling about 1.2 million participants) across 16 diseases and quantitative traits, building on a reference-standardized framework. We conducted meta-analyses to quantify the effects of method choice, hyperparameter tuning, method ensembling, and the target biobank on PGS performance. We found that no single method consistently outperformed all others. PGS effect sizes were more variable between biobanks than between methods within biobanks when methods were well tuned. Differences between methods were largest for the two investigated autoimmune diseases, seropositive rheumatoid arthritis and type 1 diabetes. For most methods, cross-validation was more reliable for tuning hyperparameters than automatic tuning (without the use of target data). For a given target phenotype, elastic net models combining PGS across methods (ensemble PGS) tuned in the UK Biobank provided consistent, high, and cross-biobank transferable performance, increasing PGS effect sizes (β coefficients) by a median of 5.0% relative to LDpred2 and MegaPRS (the two best-performing single methods when tuned with cross-validation). Our interactively browsable online results and open-source workflow prspipe provide a rich resource and reference for the analysis of polygenic scoring methods across biobanks.

Subject(s)

Biological Specimen Banks, Genome-Wide Association Study, Multifactorial Inheritance, Humans, Multifactorial Inheritance/genetics, Phenotype, Type 1 Diabetes Mellitus/genetics, Single Nucleotide Polymorphism, Machine Learning
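
The ensemble PGS idea in the abstract above (an elastic net that combines per-method scores into a single predictor) can be sketched as follows. This is a minimal illustration under stated assumptions, not the prspipe implementation: the coordinate-descent solver, the names `soft_threshold` and `elastic_net`, the penalty parametrization (`alpha`, `l1_ratio`), and the toy data are all hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent for
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    X: rows are individuals, columns are PGS from different methods (assumed).
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that ignores column j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            w[j] = soft_threshold(rho, alpha * l1_ratio) / (z + alpha * (1 - l1_ratio))
    return w


# Toy data (made up): two per-method scores for four individuals.
scores = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 1.0]]
phenotype = [row[0] + 0.5 * row[1] for row in scores]

unpenalized = elastic_net(scores, phenotype, alpha=0.0)
penalized = elastic_net(scores, phenotype, alpha=0.1, l1_ratio=0.5)
print("weights without penalty:", unpenalized)
print("weights with penalty:   ", penalized)
```

With no penalty the weights match the coefficients used to generate the toy phenotype; with the penalty both weights are shrunk toward zero. In practice `alpha` and `l1_ratio` would be tuned, for example by cross-validation.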

3.

Comparative Analysis of Droplet-Based Ultra-High-Throughput Single-Cell RNA-Seq Systems.

Zhang, Xiannian; Li, Tianqi; Liu, Feng; Chen, Yaqi; Yao, Jiacheng; Li, Zeyao; Huang, Yanyi; Wang, Jianbin.

Mol Cell ; 73(1): 130-142.e5, 2019 Jan 03.

Article in English | MEDLINE | ID: mdl-30472192

ABSTRACT

Since its establishment in 2009, single-cell RNA sequencing (RNA-seq) has been a major driver behind progress in biomedical research. In developmental biology and stem cell studies, the ability to profile single cells confers particular benefits. Although most studies still focus on individual tissues or organs, the recent development of ultra-high-throughput single-cell RNA-seq has demonstrated potential power in characterizing more complex systems or even the entire body. However, although multiple ultra-high-throughput single-cell RNA-seq systems have attracted attention, no systematic comparison of these systems has been performed. Here, with the same cell line and bioinformatics pipeline, we developed directly comparable datasets for each of three widely used droplet-based ultra-high-throughput single-cell RNA-seq systems, inDrop, Drop-seq, and 10X Genomics Chromium. Although each system is capable of profiling single-cell transcriptomes, their detailed comparison revealed the distinguishing features and suitable applications for each system.

Subject(s)

Gene Expression Profiling/methods, High-Throughput Nucleotide Sequencing, Microfluidic Analytical Techniques, RNA/genetics, RNA Sequence Analysis/methods, Single-Cell Analysis/methods, Transcriptome, Laboratory Automation, Base Sequence, Cell Line, Computational Biology, Cost-Benefit Analysis, Taxonomic DNA Barcoding, Gene Expression Profiling/economics, High-Throughput Nucleotide Sequencing/economics, Humans, Microfluidic Analytical Techniques/economics, Reproducibility of Results, RNA Sequence Analysis/economics, Single-Cell Analysis/economics, Workflow

4.

Principled and interpretable alignability testing and integration of single-cell data.

Ma, Rong; Sun, Eric D; Donoho, David; Zou, James.

Proc Natl Acad Sci U S A ; 121(10): e2313719121, 2024 Mar 05.

Article in English | MEDLINE | ID: mdl-38416677

ABSTRACT

Single-cell data integration can provide a comprehensive molecular view of cells, and many algorithms have been developed to remove unwanted technical or biological variations and integrate heterogeneous single-cell datasets. Despite their wide usage, existing methods suffer from several fundamental limitations. In particular, we lack a rigorous statistical test for whether two high-dimensional single-cell datasets are alignable (and therefore should even be aligned). Moreover, popular methods can substantially distort the data during alignment, making the aligned data and downstream analysis difficult to interpret. To overcome these limitations, we present a spectral manifold alignment and inference (SMAI) framework, which enables principled and interpretable alignability testing and structure-preserving integration of single-cell data with the same type of features. SMAI provides a statistical test to robustly assess the alignability between datasets to avoid misleading inference and is justified by high-dimensional statistical theory. On a diverse range of real and simulated benchmark datasets, it outperforms commonly used alignment methods. Moreover, we show that SMAI improves various downstream analyses such as identification of differentially expressed genes and imputation of single-cell spatial transcriptomics, providing further biological insights. SMAI's interpretability also enables quantification and a deeper understanding of the sources of technical confounders in single-cell data.

Subject(s)

Algorithms, Gene Expression Profiling, Gene Expression, Single-Cell Analysis

5.

Production and characterization of monoclonal antibodies to Xenopus proteins.

Horr, Brett; Kurtz, Ryan; Pandey, Ankit; Hoffstrom, Benjamin G; Schock, Elizabeth; LaBonne, Carole; Alfandari, Dominique.

Development ; 150(4), 2023 Feb 15.

Article in English | MEDLINE | ID: mdl-36789951

ABSTRACT

Monoclonal antibodies are powerful and versatile tools that enable the study of proteins in diverse contexts. They are often utilized to assist with identification of subcellular localization and characterization of the function of target proteins of interest. However, because there can be considerable sequence diversity between orthologous proteins in Xenopus and mammals, antibodies produced against mouse or human proteins often do not recognize Xenopus counterparts. To address this issue, we refined existing mouse monoclonal antibody production protocols to generate antibodies against Xenopus proteins of interest. Here, we describe several approaches for the generation of useful mouse anti-Xenopus antibodies to multiple Xenopus proteins and their validation in various experimental approaches. These novel antibodies are now available to the research community through the Developmental Study Hybridoma Bank (DSHB).

Subject(s)

Monoclonal Antibodies, Xenopus Proteins, Animals, Mice, Hybridomas, Xenopus laevis, Xenopus Proteins/genetics

6.

CIARA: a cluster-independent algorithm for identifying markers of rare cell types from single-cell sequencing data.

Lubatti, Gabriele; Stock, Marco; Iturbide, Ane; Ruiz Tejada Segura, Mayra L; Riepl, Melina; Tyser, Richard C V; Danese, Anna; Colomé-Tatché, Maria; Theis, Fabian J; Srinivas, Shankar; Torres-Padilla, Maria-Elena; Scialdone, Antonio.

Development ; 150(11), 2023 Jun 01.

Article in English | MEDLINE | ID: mdl-37294170

ABSTRACT

A powerful feature of single-cell genomics is the possibility of identifying cell types from their molecular profiles. In particular, identifying novel rare cell types and their marker genes is a key potential of single-cell RNA sequencing. Standard clustering approaches perform well in identifying relatively abundant cell types, but tend to miss rarer cell types. Here, we have developed CIARA (Cluster Independent Algorithm for the identification of markers of RAre cell types), a cluster-independent computational tool designed to select genes that are likely to be markers of rare cell types. Genes selected by CIARA are subsequently integrated with common clustering algorithms to single out groups of rare cell types. CIARA outperforms existing methods for rare cell type detection, and we use it to find previously uncharacterized rare populations of cells in a human gastrula and among mouse embryonic stem cells treated with retinoic acid. Moreover, CIARA can be applied more generally to any type of single-cell omic data, thus allowing the identification of rare cells across multiple data modalities. We provide implementations of CIARA in user-friendly packages available in R and Python.

Subject(s)

Algorithms, Single-Cell Analysis, Animals, Humans, Mice, RNA Sequence Analysis/methods, Cluster Analysis, Single-Cell Analysis/methods, Gene Expression Profiling/methods

7.

scTOP: physics-inspired order parameters for cellular identification and visualization.

Yampolskaya, Maria; Herriges, Michael J; Ikonomou, Laertis; Kotton, Darrell N; Mehta, Pankaj.

Development ; 150(21), 2023 Nov 01.

Article in English | MEDLINE | ID: mdl-37756586

ABSTRACT

Advances in single-cell RNA sequencing provide an unprecedented window into cellular identity. The abundance of data requires new theoretical and computational frameworks to analyze the dynamics of differentiation and integrate knowledge from cell atlases. We present 'single-cell Type Order Parameters' (scTOP): a statistical, physics-inspired approach for quantifying cell identity given a reference basis of cell types. scTOP can accurately classify cells, visualize developmental trajectories and assess the fidelity of engineered cells. Importantly, scTOP does this without feature selection, statistical fitting or dimensional reduction (e.g. uniform manifold approximation and projection, principal component analysis, etc.). We illustrate the power of scTOP using human and mouse datasets. By reanalyzing mouse lung data, we characterize a transient hybrid alveolar type 1/alveolar type 2 cell population. Visualizations of lineage tracing hematopoiesis data using scTOP confirm that a single clone can give rise to multiple mature cell types. We assess the transcriptional similarity between endogenous and donor-derived cells in the context of murine pulmonary cell transplantation. Our results suggest that physics-inspired order parameters can be an important tool for understanding differentiation and characterizing engineered cells. scTOP is available as an easy-to-use Python package.

Subject(s)

Lung, Single-Cell Analysis, Animals, Humans, Mice, Cell Differentiation/genetics, Single-Cell Analysis/methods, RNA Sequence Analysis/methods

8.

ABAG-docking benchmark: a non-redundant structure benchmark dataset for antibody-antigen computational docking.

Zhao, Nan; Han, Bingqing; Zhao, Cuicui; Xu, Jinbo; Gong, Xinqi.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38385879

ABSTRACT

Accurate prediction of antibody-antigen complex structures is pivotal in drug discovery, vaccine design and disease treatment and can facilitate the development of more effective therapies and diagnostics. In this work, we first review the antibody-antigen docking (ABAG-docking) datasets. Then, we present the creation and characterization of a comprehensive benchmark dataset of antibody-antigen complexes. We categorize the dataset based on docking difficulty, interface properties and structural characteristics, to provide a diverse set of cases for rigorous evaluation. Compared with Docking Benchmark 5.5, we have added 112 cases, including 14 single-domain antibody (sdAb) cases and 98 monoclonal antibody (mAb) cases, and also increased the proportion of Difficult cases. Our dataset contains diverse cases, including human/humanized antibodies, sdAbs, rodent antibodies and other types, opening the door to better algorithm development. Furthermore, we provide details on the process of building the benchmark dataset and introduce a pipeline for periodic updates to keep it up to date. We also utilize multiple complex prediction methods including ZDOCK, ClusPro, HDOCK and AlphaFold-Multimer for testing and analyzing this dataset. This benchmark serves as a valuable resource for evaluating and advancing docking computational methods in the analysis of antibody-antigen interaction, enabling researchers to develop more accurate and effective tools for predicting and designing antibody-antigen complexes. The non-redundant ABAG-docking structure benchmark dataset is available at https://github.com/Zhaonan99/Antibody-antigen-complex-structure-benchmark-dataset.

Subject(s)

Algorithms, Benchmarking, Humans, Monoclonal Antibodies, Humanized Monoclonal Antibodies, Antigen-Antibody Complex

9.

GSScore: a novel Graphormer-based shell-like scoring method for protein-ligand docking.

Guo, Linyuan; Wang, Jianxin.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38706316

ABSTRACT

Protein-ligand interactions (PLIs) are essential for cellular activities and drug discovery. However, owing to the complexity and high cost of experimental methods, there is a great demand for computational approaches to recognize PLI patterns, such as protein-ligand docking. In recent years, more and more models based on machine learning have been developed to directly predict the root mean square deviation (RMSD) of a ligand docking pose with reference to its native binding pose. However, new scoring methods are urgently needed for more accurate RMSD prediction. We present a new deep learning-based scoring method for RMSD prediction of protein-ligand docking poses based on a Graphormer method and Shell-like graph architecture, named GSScore. To recognize near-native conformations from a set of poses, GSScore takes atoms as nodes and then represents the protein-ligand docking interface as multiple bipartite graphs within different shell ranges. Benefiting from the Graphormer and Shell-like graph architecture, GSScore can effectively capture the subtle differences between energetically favorable near-native conformations and unfavorable non-native poses without extra information. GSScore was extensively evaluated on diverse test sets including a subset of PDBBind version 2019, CASF2016 as well as DUD-E, and obtained significant improvements over existing methods in terms of RMSE, R (Pearson correlation coefficient), Spearman correlation coefficient and Docking power.

Subject(s)

Molecular Docking Simulation, Proteins, Ligands, Proteins/chemistry, Proteins/metabolism, Protein Binding, Software, Algorithms, Computational Biology/methods, Protein Conformation, Protein Databases, Deep Learning
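
The quantity that scoring methods such as the one in the abstract above are trained to predict, the RMSD of a docked ligand pose relative to its native pose, can be computed as in the sketch below. It assumes both poses list the same atoms in the same order and share the receptor's coordinate frame, so no superposition is applied; symmetry correction is ignored. The function name `rmsd` and the coordinates are made up for illustration.

```python
import math


def rmsd(pose, reference):
    """Root mean square deviation between two poses.

    Each pose is a list of (x, y, z) tuples with matching atom order.
    """
    if len(pose) != len(reference) or not pose:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared_sum = sum(
        (a - b) ** 2
        for atom_p, atom_r in zip(pose, reference)
        for a, b in zip(atom_p, atom_r)
    )
    return math.sqrt(squared_sum / len(pose))


# Toy ligand (coordinates in angstroms, invented): the docked pose is the
# native pose translated by 1 angstrom along x.
native = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(x + 1.0, y, z) for x, y, z in native]

print(rmsd(docked, native))  # every atom is displaced by 1 angstrom
```

A pose is commonly called near-native when this value falls below a cutoff such as 2 angstroms; the exact cutoff used by a given benchmark should be checked in its documentation.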

10.

A novel approach to study multi-domain motions in JAK1's activation mechanism based on energy landscape.

Sun, Shengjie; Rodriguez, Georgialina; Zhao, Gaoshu; Sanchez, Jason E; Guo, Wenhan; Du, Dan; Rodriguez Moncivais, Omar J; Hu, Dehua; Liu, Jing; Kirken, Robert Arthur; Li, Lin.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38446738

ABSTRACT

The family of Janus Kinases (JAKs) associated with the JAK-signal transducers and activators of transcription signaling pathway plays a vital role in the regulation of various cellular processes. The conformational change of JAKs is the fundamental step for activation, affecting multiple intracellular signaling pathways. However, the transitional process from inactive to active kinase is still a mystery. This study is aimed at investigating the electrostatic properties and transitional states of JAK1 during its full activation to a catalytically active enzyme. To achieve this goal, structures of the inhibited/activated full-length JAK1 were modelled, the energies of JAK1 with the Tyrosine Kinase (TK) domain at different positions were calculated, and Dijkstra's method was applied to find the energetically smoothest path. Through a comparison of the energetically smoothest paths of kinase inactivating P733L and S703I mutations, an evaluation of the reasons why these mutations lead to negative or positive regulation of JAK1 is provided. Our energy analysis suggests that activation of JAK1 is thermodynamically spontaneous, with the inhibition resulting from an energy barrier at the initial steps of activation, specifically the release of the TK domain from the inhibited Four-point-one, Ezrin, Radixin, Moesin-PK cavity. Overall, this work provides insights into the potential pathway for TK translocation and the activation mechanism of JAK1.

Subject(s)

Signal Transduction, Mutation, Protein Domains
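
The use of Dijkstra's method on an energy landscape, as described in the abstract above, can be illustrated with a small sketch. The abstract does not define "energetically smoothest", so the cost used here (the absolute energy change between neighboring grid positions) is an assumption, as are the function name `smoothest_path`, the 4-neighbor grid, and the toy energies.

```python
import heapq


def smoothest_path(energy, start, goal):
    """Dijkstra's algorithm on a 2D energy grid.

    Edge cost (assumed for illustration): absolute energy difference between
    adjacent cells, so the returned path minimizes the total energy change.
    """
    rows, cols = len(energy), len(energy[0])
    dist = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == goal:
            break
        if d > dist[node]:
            continue  # stale queue entry
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                nd = d + abs(energy[nr][nc] - energy[r][c])
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = node
                    heapq.heappush(heap, (nd, (nr, nc)))
    # Walk back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]


# Toy landscape (arbitrary units, invented): a high barrier in the middle column.
landscape = [
    [0.0, 5.0, 1.0],
    [1.0, 6.0, 0.5],
    [1.5, 1.0, 0.0],
]
cost, path = smoothest_path(landscape, start=(0, 0), goal=(2, 2))
print(cost, path)
```

On this toy grid the path skirts the barrier along the left and bottom edges. Because every edge cost is non-negative, Dijkstra's algorithm is guaranteed to find the minimum-cost path.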

11.

VISH-Pred: an ensemble of fine-tuned ESM models for protein toxicity prediction.

Mall, Raghvendra; Singh, Ankita; Patel, Chirag N; Guirimand, Gregory; Castiglione, Filippo.

Brief Bioinform ; 25(4), 2024 May 23.

Article in English | MEDLINE | ID: mdl-38842509

ABSTRACT

Peptide- and protein-based therapeutics are becoming a promising treatment regimen for myriad diseases. Toxicity of proteins is the primary hurdle for protein-based therapies. Thus, there is an urgent need for accurate in silico methods for determining toxic proteins to filter the pool of potential candidates. At the same time, it is imperative to precisely identify non-toxic proteins to expand the possibilities for protein-based biologics. To address this challenge, we proposed an ensemble framework, called VISH-Pred, comprising models built by fine-tuning ESM2 transformer models on a large, experimentally validated, curated dataset of protein and peptide toxicities. The primary steps in the VISH-Pred framework are to efficiently estimate protein toxicities taking just the protein sequence as input, employing an undersampling technique to handle the severe class imbalance in the data and learning representations from fine-tuned ESM2 protein language models which are then fed to machine learning techniques such as LightGBM and XGBoost. The VISH-Pred framework is able to correctly identify both peptides/proteins with potential toxicity and non-toxic proteins, achieving a Matthews correlation coefficient of 0.737, 0.716 and 0.322 and F1-score of 0.759, 0.696 and 0.713 on three non-redundant blind tests, respectively, outperforming other methods by over 10% on these quality metrics. Moreover, VISH-Pred achieved the best accuracy and area under the receiver operating characteristic curve scores on these independent test sets, highlighting the robustness and generalization capability of the framework. By making VISH-Pred available as an easy-to-use web server, we expect it to serve as a valuable asset for future endeavors aimed at discerning the toxicity of peptides and enabling efficient protein-based therapeutics.

Subject(s)

Proteins, Proteins/metabolism, Proteins/chemistry, Machine Learning, Protein Databases, Computational Biology/methods, Humans, Peptides/toxicity, Peptides/chemistry, Computer Simulation, Algorithms, Software
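
The two metrics reported in the abstract above, the Matthews correlation coefficient (MCC) and the F1-score, are both computed from a binary confusion matrix. The sketch below shows the standard formulas; the function name `mcc_and_f1` and the example counts are invented and are not taken from the VISH-Pred evaluation.

```python
import math


def mcc_and_f1(tp, fp, tn, fn):
    """Return (MCC, F1) for a binary confusion matrix.

    tp/fp/tn/fn: counts of true positives, false positives,
    true negatives and false negatives.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0
    return mcc, f1


# Made-up, strongly imbalanced test set: 50 toxic and 950 non-toxic proteins.
tp, fp, tn, fn = 40, 10, 940, 10
mcc, f1 = mcc_and_f1(tp, fp, tn, fn)
accuracy = (tp + tn) / (tp + fp + tn + fn)
print(f"accuracy={accuracy:.3f}  MCC={mcc:.3f}  F1={f1:.3f}")
```

On imbalanced data like this, accuracy stays high even for a weak classifier because the majority class dominates, which is one reason MCC and F1 are often reported alongside it.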

12.

Fuzzy kernel evidence Random Forest for identifying pseudouridine sites.

Chen, Mingshuai; Sun, Mingai; Su, Xi; Tiwari, Prayag; Ding, Yijie.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38622357

ABSTRACT

Pseudouridine is an RNA modification that is widely distributed in both prokaryotes and eukaryotes, and plays a critical role in numerous biological activities. Despite its importance, the precise identification of pseudouridine sites through experimental approaches poses significant challenges, requiring substantial time and resources. Therefore, there is a growing need for computational techniques that can reliably and quickly identify pseudouridine sites from vast amounts of RNA sequencing data. In this study, we propose fuzzy kernel evidence Random Forest (FKeERF) to identify pseudouridine sites. This method is called PseU-FKeERF, which demonstrates high accuracy in identifying pseudouridine sites from RNA sequencing data. The PseU-FKeERF model selected four RNA feature coding schemes with relatively good performance for feature combination, and then input them into the newly proposed FKeERF method for category prediction. FKeERF not only uses fuzzy logic to expand the original feature space, but also combines kernel methods that are easy to interpret in general for category prediction. Both cross-validation tests and independent tests on benchmark datasets have shown that PseU-FKeERF has better predictive performance than several state-of-the-art methods. This new method not only improves the accuracy of pseudouridine site identification, but also provides a useful reference for disease control and related drug development in the future.

Subject(s)

Pseudouridine, Random Forest, Pseudouridine/genetics, RNA/genetics, Base Sequence

13.

DeepSS2GO: protein function prediction from secondary structure.

Song, Fu V; Su, Jiaqi; Huang, Sixing; Zhang, Neng; Li, Kaiyue; Ni, Ming; Liao, Maofu.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38701416

ABSTRACT

Predicting protein function is crucial for understanding biological life processes, preventing diseases and developing new drug targets. In recent years, methods based on sequence, structure and biological networks for protein function annotation have been extensively researched. Although obtaining a protein's three-dimensional structure through experimental or computational methods enhances the accuracy of function prediction, the sheer volume of proteins sequenced by high-throughput technologies presents a significant challenge. To address this issue, we introduce a deep neural network model DeepSS2GO (Secondary Structure to Gene Ontology). It is a predictor incorporating secondary structure features along with primary sequence and homology information. The algorithm expertly combines the speed of sequence-based information with the accuracy of structure-based features while streamlining the redundant data in primary sequences and bypassing the time-consuming challenges of tertiary structure analysis. The results show that the prediction performance surpasses state-of-the-art algorithms. It has the ability to predict key functions by effectively utilizing secondary structure information, rather than broadly predicting general Gene Ontology terms. Additionally, DeepSS2GO predicts five times faster than advanced algorithms, making it highly applicable to massive sequencing data. The source code and trained models are available at https://github.com/orca233/DeepSS2GO.

Subject(s)

Algorithms, Computational Biology, Computer Neural Networks, Secondary Protein Structure, Proteins, Proteins/chemistry, Proteins/metabolism, Proteins/genetics, Computational Biology/methods, Protein Databases, Gene Ontology, Protein Sequence Analysis/methods, Software

14.

A comparative benchmarking and evaluation framework for heterogeneous network-based drug repositioning methods.

Li, Yinghong; Yang, Yinqi; Tong, Zhuohao; Wang, Yu; Mi, Qin; Bai, Mingze; Liang, Guizhao; Li, Bo; Shu, Kunxian.

Brief Bioinform ; 25(3), 2024 Mar 27.

Article in English | MEDLINE | ID: mdl-38647153

ABSTRACT

Computational drug repositioning, which involves identifying new indications for existing drugs, is an increasingly attractive research area due to its advantages in reducing both overall cost and development time. As a result, a growing number of computational drug repositioning methods have emerged. Heterogeneous network-based drug repositioning methods have been shown to outperform other approaches. However, there is a dearth of systematic evaluation studies of these methods, encompassing performance, scalability and usability, as well as a standardized process for evaluating new methods. Additionally, previous studies have only compared several methods, with conflicting results. In this context, we conducted a systematic benchmarking study of 28 heterogeneous network-based drug repositioning methods on 11 existing datasets. We developed a comprehensive framework to evaluate their performance, scalability and usability. Our study revealed that methods such as HGIMC, ITRPCA and BNNR exhibit the best overall performance, as they rely on matrix completion or factorization. HINGRL, MLMC, ITRPCA and HGIMC demonstrate the best performance, while NMFDR, GROBMC and SCPMF display superior scalability. For usability, HGIMC, DRHGCN and BNNR are the top performers. Building on these findings, we developed an online tool called HN-DREP (http://hn-drep.lyhbio.com/) to facilitate researchers in viewing all the detailed evaluation results and selecting the appropriate method. HN-DREP also provides an external drug repositioning prediction service for a specific disease or drug by integrating predictions from all methods. Furthermore, we have released a Snakemake workflow named HN-DRES (https://github.com/lyhbio/HN-DRES) to facilitate benchmarking and support the extension of new methods into the field.

Subject(s)

Benchmarking, Drug Repositioning, Drug Repositioning/methods, Humans, Computational Biology/methods, Software, Algorithms

15.

A kinetic model for solving a combination optimization problem in ab-initio Cryo-EM 3D reconstruction.

Liu, Jiaxuan; Lu, Yonggang; Zhu, Li.

Brief Bioinform ; 25(2), 2024 Jan 22.

Article in English | MEDLINE | ID: mdl-38261343

ABSTRACT

Cryo-Electron Microscopy (cryo-EM) is a widely used and effective method for determining the three-dimensional (3D) structure of biological molecules. For ab-initio Cryo-EM 3D reconstruction using single particle analysis (SPA), estimating the projection direction of the projection image is a crucial step. However, the existing SPA methods based on common lines are sensitive to noise. The error in common line detection will lead to a poor estimation of the projection directions and thus may greatly affect the final reconstruction results. To improve the reconstruction results, multiple candidate common lines are estimated for each pair of projection images. The key problem then becomes a combination optimization problem of selecting consistent common lines from multiple candidates. To solve the problem efficiently, a physics-inspired method based on a kinetic model is proposed in this work. More specifically, hypothetical attractive forces between each pair of candidate common lines are used to calculate a hypothetical torque exerted on each projection image in the 3D reconstruction space, and the rotation under the hypothetical torque is used to optimize the projection direction estimation of the projection image. This way, the consistent common lines along with the projection directions can be found directly without enumeration of all the combinations of the multiple candidate common lines. Compared with the traditional methods, the proposed method is shown to be able to produce more accurate 3D reconstruction results from high noise projection images. Besides the practical value, the proposed method also serves as a good reference for solving similar combinatorial optimization problems.

Subject(s)

Three-Dimensional Imaging, Cryoelectron Microscopy, Kinetics

16.

Fundamental limits in structured principal component analysis and how to reach them.

Barbier, Jean; Camilli, Francesco; Mondelli, Marco; Sáenz, Manuel.

Proc Natl Acad Sci U S A ; 120(30): e2302028120, 2023 Jul 25.

Article in English | MEDLINE | ID: mdl-37463204

ABSTRACT

How do statistical dependencies in measurement noise influence high-dimensional inference? To answer this, we study the paradigmatic spiked matrix model of principal components analysis (PCA), where a rank-one matrix is corrupted by additive noise. We go beyond the usual independence assumption on the noise entries, by drawing the noise from a low-order polynomial orthogonal matrix ensemble. The resulting noise correlations make the setting relevant for applications but analytically challenging. We provide a characterization of the Bayes optimal limits of inference in this model. If the spike is rotation invariant, we show that standard spectral PCA is optimal. However, for more general priors, both PCA and the existing approximate message-passing algorithm (AMP) fall short of achieving the information-theoretic limits, which we compute using the replica method from statistical physics. We thus propose an AMP, inspired by the theory of adaptive Thouless-Anderson-Palmer equations, which is empirically observed to saturate the conjectured theoretical limit. This AMP comes with a rigorous state evolution analysis tracking its performance. Although we focus on specific noise distributions, our methodology can be generalized to a wide class of trace matrix ensembles at the cost of more involved expressions. Finally, despite the seemingly strong assumption of rotation-invariant noise, our theory empirically predicts algorithmic performance on real data, pointing at strong universality properties.

17.

Principles of metabolome conservation in animals.

Liska, Orsolya; Boross, Gábor; Rocabert, Charles; Szappanos, Balázs; Tengölics, Roland; Papp, Balázs.

Proc Natl Acad Sci U S A ; 120(35): e2302147120, 2023 Aug 29.

Article in English | MEDLINE | ID: mdl-37603743

ABSTRACT

Metabolite levels shape cellular physiology and disease susceptibility, yet the general principles governing metabolome evolution are largely unknown. Here, we introduce a measure of conservation of individual metabolite levels among related species. By analyzing multispecies tissue metabolome datasets in phylogenetically diverse mammals and fruit flies, we show that conservation varies extensively across metabolites. Three major functional properties (metabolite abundance, essentiality, and association with human diseases) predict conservation, highlighting a striking parallel between the evolutionary forces driving metabolome and protein sequence conservation. Metabolic network simulations recapitulated these general patterns and revealed that abundant metabolites are highly conserved due to their strong coupling to key metabolic fluxes in the network. Finally, we show that biomarkers of metabolic diseases can be distinguished from other metabolites simply based on evolutionary conservation, without requiring any prior clinical knowledge. Overall, this study uncovers simple rules that govern metabolic evolution in animals and implies that most tissue metabolome differences between species are permitted, rather than favored, by natural selection. More broadly, our work paves the way toward using evolutionary information to identify biomarkers, as well as to detect pathogenic metabolome alterations in individual patients.

Subject(s)

Drosophila, Metabolome, Animals, Humans, Amino Acid Sequence, Knowledge, Mammals

18.

Autocorrelation analysis for cryo-EM with sparsity constraints: Improved sample complexity and projection-based algorithms.

Bendory, Tamir; Khoo, Yuehaw; Kileel, Joe; Mickelin, Oscar; Singer, Amit.

Proc Natl Acad Sci U S A ; 120(18): e2216507120, 2023 May 02.

Article in English | MEDLINE | ID: mdl-37094135

ABSTRACT

The number of noisy images required for molecular reconstruction in single-particle cryoelectron microscopy (cryo-EM) is governed by the autocorrelations of the observed, randomly oriented, noisy projection images. In this work, we consider the effect of imposing sparsity priors on the molecule. We use techniques from signal processing, optimization, and applied algebraic geometry to obtain theoretical and computational contributions for this challenging nonlinear inverse problem with sparsity constraints. We prove that molecular structures modeled as sums of Gaussians are uniquely determined by the second-order autocorrelation of their projection images, implying that the sample complexity is proportional to the square of the variance of the noise. This theory improves upon the nonsparse case, where the third-order autocorrelation is required for uniformly oriented particle images and the sample complexity scales with the cube of the noise variance. Furthermore, we build a computational framework to reconstruct molecular structures which are sparse in the wavelet basis. This method combines the sparse representation for the molecule with projection-based techniques used for phase retrieval in X-ray crystallography.
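
The two scaling statements in the abstract above can be written side by side. The symbols are notation introduced here for illustration: $N$ is the number of images required and $\sigma^2$ is the noise variance. Proportionality constants are omitted and the high-noise regime is assumed.

```latex
N_{\text{sparse}} \propto (\sigma^2)^2 = \sigma^4, \qquad
N_{\text{general}} \propto (\sigma^2)^3 = \sigma^6, \qquad
\frac{N_{\text{general}}}{N_{\text{sparse}}} \propto \sigma^2 .
```

Under these scalings, doubling the noise variance multiplies the required number of images by 4 in the sparse case and by 8 in the general case.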

19.

Physical mechanisms of red blood cell splenic filtration.

Moreau, Alexis; Yaya, François; Lu, Huijie; Surendranath, Anagha; Charrier, Anne; Dehapiot, Benoit; Helfer, Emmanuèle; Viallat, Annie; Peng, Zhangli.

Proc Natl Acad Sci U S A ; 120(44): e2300095120, 2023 Oct 31.

Article in English | MEDLINE | ID: mdl-37874856

ABSTRACT

The splenic interendothelial slits fulfill the essential function of continuously filtering red blood cells (RBCs) from the bloodstream to eliminate abnormal and aged cells. To date, the process by which 8 µm RBCs pass through 0.3 µm-wide slits remains enigmatic. Does the slit caliber increase during RBC passage, as sometimes suggested? Here, we elucidated the mechanisms that govern RBC retention or passage dynamics in slits by combining multiscale modeling, live imaging, and microfluidic experiments on an original device with submicron-wide, physiologically calibrated slits. We observed that healthy RBCs pass through 0.28 µm-wide rigid slits at 37 °C. To achieve this feat, they must meet two requirements. Geometrically, their surface area-to-volume ratio must be compatible with a shape consisting of two equal, tether-connected spheres. Mechanically, the cells with a low surface area-to-volume ratio (28% of RBCs in a 0.4 µm-wide slit) must locally unfold their spectrin cytoskeleton inside the slit. In contrast, activation of the mechanosensitive PIEZO1 channel is not required. The RBC transit time through the slits follows a −1 and −3 power law with in-slit pressure drop and slit width, respectively. This law is similar to that of a Newtonian fluid in a two-dimensional Poiseuille flow, showing that the dynamics of RBCs is controlled by their cytoplasmic viscosity. Altogether, our results show that filtration through submicron-wide slits is possible without further slit opening. Furthermore, our approach addresses the critical need for in vitro evaluation of splenic clearance of diseased or engineered RBCs for transfusion and drug delivery.

Subject(s)

Erythrocytes, Spleen, Erythrocytes/metabolism, Cytoskeleton, Microfluidics, Spectrin/metabolism

20.

Computing geodesic paths encoding a curvature prior for curvilinear structure tracking.

Chen, Da; Mirebeau, Jean-Marie; Shu, Minglei; Cohen, Laurent D.

Proc Natl Acad Sci U S A ; 120(33): e2218869120, 2023 Aug 15.

Article in English | MEDLINE | ID: mdl-37549251

ABSTRACT

In this paper, we introduce an efficient method for computing curves minimizing a variant of the Euler-Mumford elastica energy, with fixed endpoints and tangents at these endpoints, where the bending energy is enhanced with a user-defined and data-driven scalar-valued term referred to as the curvature prior. In order to guarantee that the globally optimal curve is extracted, the proposed method involves the numerical computation of the viscosity solution to a specific static Hamilton-Jacobi-Bellman (HJB) partial differential equation (PDE). For that purpose, we derive the explicit Hamiltonian associated with this variant model equipped with a curvature prior, discretize the resulting HJB PDE using an adaptive finite difference scheme, and solve it in a single pass using a generalized fast-marching method. In addition, we also present a practical method for estimating the curvature prior values from image data, designed for the task of accurately tracking curvilinear structure centerlines. Numerical experiments on synthetic and real-image data illustrate the advantages of the considered variant of the elastica model with a prior curvature enhancement in complex scenarios where challenging geometric structures appear.
