1.
Biomolecules ; 13(3)2023 03 08.
Article in English | MEDLINE | ID: mdl-36979433

ABSTRACT

Machine learning-based models have been widely used in the early drug-design pipeline. To validate these models, cross-validation strategies have been employed, including those that cluster molecules by chemical structure. However, poor clustering of compounds compromises such validation, especially on test molecules dissimilar to those in the training set. This study aims to find the best way to cluster the molecules screened by the National Cancer Institute (NCI)-60 project by comparing hierarchical, Taylor-Butina, and uniform manifold approximation and projection (UMAP) clustering methods. The best-performing algorithm can then be used to generate clusters for model validation strategies. This study also aims to measure the impact of removing outlier molecules prior to the clustering step. Clustering results are evaluated using three well-known clustering quality metrics. In addition, we compute an average similarity matrix to assess the quality of each cluster. The results show that clustering quality varies from method to method. The clusters obtained by the hierarchical and Taylor-Butina methods are more computationally expensive to use in cross-validation strategies, and both methods cluster the molecules poorly. In contrast, the UMAP method provides the best quality, and we therefore recommend it for analyzing this highly valuable dataset.
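
The abstract does not say how the average similarity matrix is computed. One plausible reading, sketched below as an assumption rather than as the authors' procedure, is to average pairwise Tanimoto similarities within and between clusters. Fingerprints are represented here as sets of on-bit indices, and the names `tanimoto` and `average_similarity_matrix` are hypothetical.

```python
from itertools import product


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def average_similarity_matrix(fingerprints, labels):
    """Mean Tanimoto similarity within and between clusters.

    Returns (cluster_ids, matrix), where matrix[i][j] is the mean similarity
    between members of cluster_ids[i] and cluster_ids[j]. Diagonal entries
    average over distinct pairs only (self-pairs are excluded); a singleton
    cluster gets 1.0 on the diagonal by convention (an assumption).
    """
    cluster_ids = sorted(set(labels))
    members = {
        c: [fp for fp, lab in zip(fingerprints, labels) if lab == c]
        for c in cluster_ids
    }
    matrix = []
    for ci in cluster_ids:
        row = []
        for cj in cluster_ids:
            if ci == cj:
                group = members[ci]
                sims = [
                    tanimoto(group[p], group[q])
                    for p in range(len(group))
                    for q in range(p + 1, len(group))
                ]
            else:
                sims = [tanimoto(a, b) for a, b in product(members[ci], members[cj])]
            row.append(sum(sims) / len(sims) if sims else 1.0)
        matrix.append(row)
    return cluster_ids, matrix
```

Under this reading, a good clustering shows high values on the diagonal (tight clusters) and low values off the diagonal (well-separated clusters).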


Subject(s)
Algorithms , Machine Learning , United States , National Cancer Institute (U.S.) , Cluster Analysis , Drug Design
2.
Med Biol Eng Comput ; 58(10): 2475-2495, 2020 Oct.
Article in English | MEDLINE | ID: mdl-32780256

ABSTRACT

In this paper, we propose four variants of the Markov random field model that use constrained clustering for breast mass segmentation. These variants were tested on a set of images extracted from a public database. The results show that the proposed variants, which allow additional information to be included in the clustering process in the form of constraints, produce better visual segmentation results than the original model, as well as a lower final energy, which implies better quality in the final segmentation. Specifically, the centroid initialization method used by our variants locates about 90% of the regions of interest that contain a mass, and the pairwise constraints subsequently helped us recover up to 93% of the masses. The segmentation results are also quantitatively evaluated using three supervised segmentation measures. These measures show that the mass segmentation quality of the proposed variants, considering the breast density level, is consistent with the corresponding segmentation annotated by specialized radiologists.
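
The abstract does not give the energy function. As a rough illustration of why a lower final energy can indicate a better segmentation, the sketch below evaluates a generic Potts-style Markov random field energy with penalty terms for violated pairwise constraints. The functional form, the function name `constrained_energy`, and the parameters `beta` and `penalty` are all assumptions for illustration, not the paper's model.

```python
def constrained_energy(image, labels, means, beta=1.0,
                       must_link=(), cannot_link=(), penalty=10.0):
    """Energy of a labeling under a generic Potts-style MRF (assumed form).

    image, labels: 2D lists of equal shape (intensities and class indices).
    means: per-class mean intensity, indexed by class.
    must_link / cannot_link: iterables of ((r1, c1), (r2, c2)) pixel pairs.
    """
    rows, cols = len(image), len(image[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            k = labels[r][c]
            # Data term: squared distance to the class mean.
            energy += (image[r][c] - means[k]) ** 2
            # Smoothness term: right and down neighbours, each pair counted once.
            if c + 1 < cols and labels[r][c + 1] != k:
                energy += beta
            if r + 1 < rows and labels[r + 1][c] != k:
                energy += beta
    # Constraint terms: a fixed penalty for every violated pairwise constraint.
    for (r1, c1), (r2, c2) in must_link:
        if labels[r1][c1] != labels[r2][c2]:
            energy += penalty
    for (r1, c1), (r2, c2) in cannot_link:
        if labels[r1][c1] == labels[r2][c2]:
            energy += penalty
    return energy
```

In this toy formulation, a labeling that fits the intensities, keeps neighbouring pixels consistent, and satisfies the constraints scores lower than one that does not.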


Subject(s)
Breast Neoplasms/diagnostic imaging , Mammography/methods , Markov Chains , Radiographic Image Interpretation, Computer-Assisted/methods , Algorithms , Breast Density , Cluster Analysis , Databases, Factual , Female , Humans