Results 1 - 4 of 4
1.
J Chem Inf Model ; 64(7): 2829-2838, 2024 Apr 08.
Article in English | MEDLINE | ID: mdl-37402705

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is widely used to reveal heterogeneity in cells, which has given us insights into cell-cell communication, cell differentiation, and differential gene expression. However, analyzing scRNA-seq data is a challenge due to sparsity and the large number of genes involved. Therefore, dimensionality reduction and feature selection are important for removing spurious signals and enhancing the downstream analysis. We present Correlated Clustering and Projection (CCP), a new data-domain dimensionality reduction method, for the first time. CCP projects each cluster of similar genes into a supergene defined as the accumulated pairwise nonlinear gene-gene correlations among all cells. Using 14 benchmark data sets, we demonstrate that CCP has significant advantages over classical principal component analysis (PCA) for clustering and/or classification problems with intrinsically high dimensionality. In addition, we introduce the Residue-Similarity index (RSI) as a novel metric for clustering and classification and the R-S plot as a new visualization tool. We show that the RSI correlates with accuracy without requiring the knowledge of the true labels. The R-S plot provides a unique alternative to the uniform manifold approximation and projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE) for data with a large number of cell types.


Subjects
Algorithms , Single-Cell Analysis , Single-Cell Analysis/methods , Cluster Analysis , Principal Component Analysis , Sequence Analysis, RNA/methods , Gene Expression Profiling/methods
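
The projection step described in the abstract above can be sketched as follows. This is only one plausible reading of the abstract, not the authors' implementation: the gene clusters are taken as given (the clustering step is not reproduced), and the exponential kernel with the hypothetical parameters `eta` and `kappa` is an assumed choice of nonlinear correlation. The function names `supergene` and `project` are illustrative.

```python
import math

def supergene(cells, gene_indices, eta=1.0, kappa=2.0):
    """Project one gene cluster onto a single value per cell.

    For each cell, accumulate a kernel similarity to every other cell,
    measured only on the genes listed in `gene_indices`.
    The kernel exp(-(d / eta) ** kappa) is an assumption of this sketch.
    """
    sub = [[row[g] for g in gene_indices] for row in cells]
    projected = []
    for i, xi in enumerate(sub):
        total = 0.0
        for j, xj in enumerate(sub):
            if i == j:
                continue  # skip the self-similarity term
            distance = math.dist(xi, xj)
            total += math.exp(-((distance / eta) ** kappa))
        projected.append(total)
    return projected

def project(cells, gene_clusters, eta=1.0, kappa=2.0):
    """Return a cells x clusters matrix with one supergene per gene cluster."""
    columns = [supergene(cells, cluster, eta, kappa) for cluster in gene_clusters]
    return [list(row) for row in zip(*columns)]

# Hypothetical toy data: 4 cells x 4 genes, with two assumed gene clusters.
cells = [
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 1.0, 5.0, 5.0],
    [0.0, 0.0, 5.0, 5.0],
    [0.0, 0.0, 0.0, 0.0],
]
gene_clusters = [[0, 1], [2, 3]]
reduced = project(cells, gene_clusters)
for row in reduced:
    print(row)
```

In this toy example the four genes are reduced to two supergenes per cell, so the output dimension equals the number of gene clusters rather than the number of genes.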
2.
J Chem Inf Model ; 63(16): 4995-5000, 2023 08 28.
Article in English | MEDLINE | ID: mdl-37548575

ABSTRACT

We implemented an ab initio CCS prediction workflow which incrementally refines generated structures using molecular mechanics, a deep learning potential, conformational clustering, and quantum mechanics (QM). Automating intermediate steps for a high performance computing (HPC) environment allows users to input the SMILES structure of small organic molecules and obtain a Boltzmann averaged collisional cross section (CCS) value as output. The CCS of a molecular species is a metric measured by ion mobility spectrometry (IMS) which can improve annotation of untargeted metabolomics experiments. We report only a minor drop in accuracy when we expedite the CCS calculation by replacing the QM geometry refinement step with a single-point energy calculation. Even though the workflow involves stochastic steps (i.e., conformation generation and clustering), the final CCS value was highly reproducible for multiple iterations on L-carnosine. Finally, we illustrate that the gas phase ensembles modeled for the workflow are intermediate files which can be used for the prediction of other properties such as aqueous phase nuclear magnetic resonance chemical shift prediction. The software is available at the following link: https://github.com/DasSusanta/snakemake_ccs.


Subjects
Metabolomics , Software , Metabolomics/methods , Molecular Dynamics Simulation , Magnetic Resonance Spectroscopy , Computing Methodologies
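
The Boltzmann averaging mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each conformer comes with a relative energy in kcal/mol and a per-conformer CCS value; the function name `boltzmann_average`, the energy unit, the temperature, and the sample numbers are all hypothetical and are not taken from the published workflow.

```python
import math

# Boltzmann constant in kcal/(mol*K); the energy unit is an assumption of this sketch.
K_B_KCAL_PER_MOL_K = 0.0019872041

def boltzmann_average(energies_kcal_mol, ccs_values, temperature_k=298.15):
    """Return the Boltzmann-weighted mean of per-conformer CCS values."""
    if len(energies_kcal_mol) != len(ccs_values) or not energies_kcal_mol:
        raise ValueError("need one energy per CCS value and at least one conformer")
    kt = K_B_KCAL_PER_MOL_K * temperature_k
    e_min = min(energies_kcal_mol)  # shift energies for numerical stability
    factors = [math.exp(-(e - e_min) / kt) for e in energies_kcal_mol]
    partition = sum(factors)
    return sum(f * c for f, c in zip(factors, ccs_values)) / partition

# Hypothetical conformers: relative energies (kcal/mol) and CCS values (square angstroms).
energies = [0.0, 0.5, 2.0]
ccs = [150.0, 153.0, 160.0]
average_ccs = boltzmann_average(energies, ccs)
print(f"Boltzmann-averaged CCS: {average_ccs:.2f} square angstroms")
```

Low-energy conformers dominate the weighted mean, so the result sits closer to the CCS of the most stable conformer than a plain arithmetic mean would.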
3.
J Am Soc Mass Spectrom ; 33(5): 750-759, 2022 May 04.
Article in English | MEDLINE | ID: mdl-35378036

ABSTRACT

The interpretation of ion mobility coupled to mass spectrometry (IM-MS) data to predict unknown structures is challenging and depends on accurate theoretical estimates of the molecular ion collision cross section (CCS) against a buffer gas in a low or atmospheric pressure drift chamber. The sensitivity and reliability of computational prediction of CCS values depend on accurately modeling the molecular state over accessible conformations. In this work, we developed an efficient CCS computational workflow using a machine learning model in conjunction with standard DFT methods and CCS calculations. Furthermore, we have performed Traveling Wave IM-MS (TWIMS) experiments to validate the extant experimental values and assess uncertainties in experimentally measured CCS values. The developed workflow yielded accurate structural predictions and provides unique insights into the likely preferred conformation analyzed using IM-MS experiments. The complete workflow makes the computation of CCS values tractable for a large number of conformationally flexible metabolites with complex molecular structures.


Subjects
Ion Mobility Spectrometry , Machine Learning , Ion Mobility Spectrometry/methods , Molecular Conformation , Molecular Structure , Reproducibility of Results
4.
J Chem Inf Model ; 61(4): 1647-1656, 2021 04 26.
Article in English | MEDLINE | ID: mdl-33780248

ABSTRACT

While accurately modeling the conformational ensemble is required for predicting properties of flexible molecules, the optimal method of obtaining the conformational ensemble appears as varied as their applications. Ensemble structures have been modeled by generation, refinement, and clustering of conformations with a sufficient number of samples. We present a conformational clustering algorithm intended to automate the conformational clustering step through the Louvain algorithm, which requires minimal hyperparameters and importantly no predefined number of clusters or threshold values. The conformational graphs produced by this method for O-succinyl-l-homoserine, oxidized nicotinamide adenine dinucleotide, and 200 representative metabolites each preserved the geometric/energetic correlation expected for points on the potential energy surface. Clustering based on these graphs provides partitions informed by the potential energy surface. Automating conformational clustering in a workflow with AutoGraph may mitigate human biases introduced by guess and check over hyperparameter selection while allowing flexibility to the result by not imposing predefined criteria other than optimizing the model's loss function. Associated codes are available at https://github.com/TanemuraKiyoto/AutoGraph.


Subjects
Algorithms , Cluster Analysis , Humans , Molecular Conformation , Workflow
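
The Louvain algorithm named in the abstract above is generally described as a greedy search for the partition that maximizes modularity, which is why it needs no predefined number of clusters. The sketch below only evaluates that objective for a given partition; it does not perform the search and is not AutoGraph's code. Whether AutoGraph optimizes exactly this quantity is an assumption, and the function name `modularity` and the toy graph are hypothetical.

```python
from collections import defaultdict

def modularity(edges, community_of):
    """Newman modularity Q of a partition of a weighted, undirected graph.

    edges: iterable of (u, v, weight) tuples, each edge listed once.
    community_of: dict mapping every node to its community label.
    """
    total_weight = 0.0             # m: sum of all edge weights
    degree = defaultdict(float)    # weighted degree of each node
    internal = defaultdict(float)  # weight of edges inside each community
    for u, v, w in edges:
        total_weight += w
        degree[u] += w
        degree[v] += w
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += w
    community_degree = defaultdict(float)
    for node, k in degree.items():
        community_degree[community_of[node]] += k
    two_m = 2.0 * total_weight
    # Q = sum over communities of (internal weight / m) - (community degree / 2m)^2
    return sum(
        internal[c] / total_weight - (community_degree[c] / two_m) ** 2
        for c in community_degree
    )

# Hypothetical conformational graph: two tight triangles joined by one edge.
edges = [
    (0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
    (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0),
    (2, 3, 1.0),
]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {node: "a" for node in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
print(q_split, q_merged)
```

The two-cluster partition scores higher than the single-cluster one, so a modularity-maximizing search would prefer it without being told how many clusters to look for.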