Results 1 - 6 of 6
1.
Nat Commun ; 15(1): 1197, 2024 Feb 16.
Article in English | MEDLINE | ID: mdl-38365821

ABSTRACT

Recent years have seen rapid development of descriptor generation based on representation learning of extremely diverse molecules, especially those that apply natural language processing (NLP) models to SMILES, a literal representation of molecular structure. However, little research has been done on how these models understand chemical structure. To address this black box, we investigated the relationship between the learning progress of SMILES and chemical structure using a representative NLP model, the Transformer. We show that while the Transformer learns partial structures of molecules quickly, it requires extended training to understand overall structures. Consistently, the accuracy of molecular property predictions using descriptors generated from models at different learning steps was similar from the beginning to the end of training. Furthermore, we found that the Transformer requires particularly long training to learn chirality and sometimes stagnates with low performance due to misunderstanding of enantiomers. These findings are expected to deepen the understanding of NLP models in chemistry.
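
As a minimal illustration of why chirality is a small signal in SMILES, the sketch below tokenizes two enantiomers of alanine and lists the positions where the token sequences differ. The regular expression and the helper name `tokenize` are assumptions made for this example, not the tokenizer used in the study.

```python
import re

# Illustrative SMILES token pattern (an assumption, not the paper's tokenizer):
# bracket atoms, two-letter halogens, organic-subset atoms, aromatic atoms,
# ring-closure labels, and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"(\[[^\]]+\]|Br|Cl|[BCNOSPFI]|[bcnosp]|%\d{2}|\d|[=#$/\\\-+().:~*@])"
)


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens; fail if any character is unmatched."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize {smiles!r}")
    return tokens


# The two enantiomers of alanine differ only in the chirality tag (@ vs @@).
alanine_a = tokenize("N[C@@H](C)C(=O)O")
alanine_b = tokenize("N[C@H](C)C(=O)O")

# Positions at which the two token sequences disagree.
differing = [i for i, (a, b) in enumerate(zip(alanine_a, alanine_b)) if a != b]
print(alanine_a)
print(alanine_b)
print("differing token positions:", differing)
```

Under this tokenization the two sequences differ in a single token, which gives some intuition for why a sequence model might need long training to separate enantiomers.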

2.
Radiol Phys Technol ; 16(4): 560-568, 2023 Dec.
Article in English | MEDLINE | ID: mdl-37733207

ABSTRACT

The purpose of this study was to investigate the effect of the shape of the automatic dose rate control (ADRC) on the air-kerma area product (PKA) and the entrance surface air-kerma rate ($\dot{K}_{a,e}$) in the presence of a wedge compensation filter. We compared and evaluated the variability of the X-ray output using a combination of wedge compensation filters and the ADRC. Two ADRC shapes (round and square) and three poly-methyl-methacrylate thicknesses (15, 20, and 25 cm) were used. A wedge compensation filter was inserted 2 cm at a time, up to 6 cm. When the wedge compensation filter was inserted to 6 cm for 20 cm of poly-methyl-methacrylate, the X-ray output fluctuated significantly. The PKA was reduced by 39% when the wedge compensation filter was inserted to 6 cm and by 59% when it was inserted to 4 cm under round-type for 20 cm poly-methyl-methacrylate. The shape of the ADRC affects $\dot{K}_{a,e}$ and PKA.


Subject(s)
Angiography , Methacrylates , Radiation Dosage , Phantoms, Imaging , Radiography
3.
J Cheminform ; 15(1): 45, 2023 Apr 12.
Article in English | MEDLINE | ID: mdl-37046349

ABSTRACT

Descriptor generation methods using latent representations of encoder-decoder (ED) models with SMILES as input are useful because of the continuity of descriptor and restorability to the structure. However, it is not clear how the structure is recognized in the learning progress of ED models. In this work, we created ED models of various learning progress and investigated the relationship between structural information and learning progress. We showed that compound substructures were learned early in ED models by monitoring the accuracy of downstream tasks and input-output substructure similarity using substructure-based descriptors, which suggests that existing evaluation methods based on the accuracy of downstream tasks may not be sensitive enough to evaluate the performance of ED models with SMILES as descriptor generation methods. On the other hand, we showed that structure restoration was time-consuming, and in particular, insufficient learning led to the estimation of a larger structure than the actual one. It can be inferred that determining the endpoint of the structure is a difficult task for the model. To our knowledge, this is the first study to link the learning progress of SMILES by ED model to chemical structures for a wide range of chemicals.
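
The input-output substructure similarity mentioned above can be pictured with a Tanimoto coefficient over substructure fingerprints. The sketch below is only an illustration under stated assumptions: fingerprints are modeled as sets of "on" bit indices, the bit values are hypothetical, and the abstract does not say which similarity measure or fingerprint the authors used.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical substructure bits for an input molecule and its decoded output.
# The output carries every input bit plus extra ones, mimicking a decoder that
# reconstructs a structure larger than the actual one.
input_bits = {3, 17, 42, 88, 101}
output_bits = {3, 17, 42, 88, 101, 230, 512}

score = tanimoto(input_bits, output_bits)
print(f"input-output substructure similarity: {score:.3f}")
```

In this toy case all input substructures are recovered, yet the similarity stays below 1 because of the extra bits, so a score of this kind penalizes over-long reconstructions as well as missing substructures.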

4.
NAR Genom Bioinform ; 5(1): lqad022, 2023 Mar.
Article in English | MEDLINE | ID: mdl-36915410

ABSTRACT

Transcriptomic data of cultured cells treated with a chemical are widely recognized as useful numeric information that describes the effects of the chemical. This property is due to the high coverage and low arbitrariness of the transcriptomic data as profiles of chemicals. Considering the importance of posttranslational regulation, proteomic profiles could provide insights into the unrecognized aspects of the effects of chemicals. Therefore, this study aimed to address the question of how well the proteomic profiles obtained using data-independent acquisition (DIA) with the sequential window acquisition of all theoretical mass spectra, which can achieve comprehensive and arbitrariness-free protein quantification, can describe chemical effects. We demonstrated that the proteomic data obtained using DIA-MS exhibited favorable properties as profile data, such as being able to discriminate chemicals like the transcriptomic profiles. Furthermore, we revealed a new mode of action of a natural compound, harmine, through profile data analysis using the proteomic profile data. To our knowledge, this is the first study to investigate the properties of proteomic data obtained using DIA-MS as the profiles of chemicals. Our 54 (samples) × 2831 (proteins) data matrix would be an important source for further analyses to understand the effects of chemicals in a data-driven manner.

5.
Yakugaku Zasshi ; 143(2): 127-132, 2023.
Article in Japanese | MEDLINE | ID: mdl-36724926

ABSTRACT

The effects of drugs and other low-molecular-weight compounds are complex and may be unintended by the developer. Such compounds should be avoided if the unintended effects are harmful; however, as drug repositioning shows, unintended effects are not always harmful. Therefore, a comprehensive understanding of complex drug actions is essential. Omics data can be regarded as the nonarbitrary transformation of biological information about a sample into comprehensive numerical information comprising multivariate data with a large number of variables. However, the changes are often based on a small number of elements in different dimensions (i.e., latent variables). The omics data of compound-treated samples comprehensively capture the complex effects of compounds, including their unrecognized aspects. Therefore, finding latent variables in these data is expected to contribute to the understanding of multiple effects. In particular, it can be interpreted as decomposing multiple effects into a smaller number of easily understandable effects. Although latent variable models of omics data have been used to understand the mechanisms of diseases, no approach has considered the multiple effects of compounds and their decomposition. Therefore, we propose decomposing the multiple effects of low-molecular-weight compounds without arbitrariness in order to understand them. We have been developing analytical methods for this purpose and verifying their usefulness. In particular, we focused on classical factor analysis among latent variable models and have been examining the biological validity of the estimates obtained under linear assumptions.
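
For reference, the classical linear factor analysis model referred to above is commonly written as follows. This is the textbook form rather than the authors' specific formulation, and the symbols are assumptions introduced here: $x$ is the omics profile of one sample with $p$ variables, $f$ holds $k \ll p$ latent factors, $\Lambda$ is the loading matrix, and $\Psi$ is a diagonal noise covariance.

```latex
\begin{align}
  x &= \mu + \Lambda f + \varepsilon, \\
  f &\sim \mathcal{N}(0, I_k), \qquad
  \varepsilon \sim \mathcal{N}(0, \Psi), \qquad
  \Psi = \operatorname{diag}(\psi_1, \dots, \psi_p), \\
  \operatorname{Cov}(x) &= \Lambda \Lambda^{\top} + \Psi .
\end{align}
```

Read this way, each column of $\Lambda$ would correspond to one decomposed effect, and the entries of $f$ for a compound-treated sample would indicate how strongly each effect is present.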


Subject(s)
Drug Repositioning , Molecular Weight , Factor Analysis, Statistical
6.
J Nat Prod ; 84(4): 1283-1293, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33836128

ABSTRACT

It is difficult to understand the entire effect of a natural product because such products generally have multiple effects. We propose a strategy to understand these effects effectively by decomposing them with a profile data analysis method we developed. A transcriptome profile data set was obtained from a public database and analyzed. Considering their high similarity in structure and transcriptome profile, we focused on rescinnamine and syrosingopine. Decomposed effects predicted clear differences between the compounds. Two of the decomposed effects, SREBF1 activation and HDAC inhibition, were investigated experimentally because the relationship between these effects and the compounds had not yet been reported. Analyses in vitro validated these effects, and their strength was consistent with predicted scores. Moreover, the number of outliers in decomposed effects per compound was higher in natural products than in drugs in the data set, which is consistent with the nature of the effects of natural products.


Subject(s)
Biological Products/chemistry , Data Analysis , Databases, Factual , Reserpine/analogs & derivatives , Reserpine/chemistry , Transcriptome