Results 1 - 13 of 13
1.
Comput Struct Biotechnol J ; 23: 1181-1188, 2024 Dec.
Article in English | MEDLINE | ID: mdl-38510976

ABSTRACT

Biomedical imaging techniques such as high content screening (HCS) are valuable for drug discovery, but high costs limit their use to pharmaceutical companies. To address this issue, the JUMP-CP consortium released a massive open image dataset of chemical and genetic perturbations, providing a valuable resource for deep learning research. In this work, we aim to utilize the JUMP-CP dataset to develop a universal representation model for HCS data, mainly data generated using U2OS cells and the CellPainting protocol, using supervised and self-supervised learning approaches. We propose an evaluation protocol that assesses their performance on mode of action and property prediction tasks using a popular phenotypic screening dataset. Results show that the self-supervised approach that uses data from multiple consortium partners provides a representation that is more robust to batch effects whilst simultaneously achieving performance on par with standard approaches. Together with other conclusions, this result yields recommendations on the training strategy of a representation model for HCS images.

2.
J Cheminform ; 16(1): 3, 2024 Jan 03.
Article in English | MEDLINE | ID: mdl-38173009

ABSTRACT

The prediction of molecular properties is a crucial aspect in drug discovery that can save a lot of money and time during the drug design process. The use of machine learning methods to predict molecular properties has become increasingly popular in recent years. Despite advancements in the field, several challenges remain that need to be addressed, like finding an optimal pre-training procedure to improve performance on small datasets, which are common in drug discovery. In our paper, we tackle these problems by introducing Relative Molecule Self-Attention Transformer for molecular representation learning. It is a novel architecture that uses relative self-attention and 3D molecular representation to capture the interactions between atoms and bonds that enrich the backbone model with domain-specific inductive biases. Furthermore, our two-step pretraining procedure allows us to tune only a few hyperparameter values to achieve good performance comparable with state-of-the-art models on a wide selection of downstream tasks.

3.
Neural Netw ; 168: 580-601, 2023 Nov.
Article in English | MEDLINE | ID: mdl-37837747

ABSTRACT

The problem of reducing the processing time of large deep learning models is a fundamental challenge in many real-world applications. Early exit methods strive towards this goal by attaching additional Internal Classifiers (ICs) to intermediate layers of a neural network. ICs can quickly return predictions for easy examples and, as a result, reduce the average inference time of the whole model. However, if a particular IC does not decide to return an answer early, its predictions are discarded, and its computations are effectively wasted. To solve this issue, we introduce Zero Time Waste (ZTW), a novel approach in which each IC reuses predictions returned by its predecessors by (1) adding direct connections between ICs and (2) combining previous outputs in an ensemble-like manner. We conduct extensive experiments across multiple modes, datasets, and architectures to demonstrate that ZTW achieves a significantly better accuracy vs. inference time trade-off than other early exit methods. On the ImageNet dataset, it obtains superior results over the best baseline method in 11 out of 16 cases, reaching up to 5 percentage points of improvement on low computational budgets.
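
The early-exit idea with ensemble-like reuse of earlier predictions can be illustrated with a minimal sketch. This is not the authors' implementation: the unweighted geometric mean, the confidence threshold of 0.9, and the function names are assumptions made for illustration, and the direct connections between ICs are omitted.

```python
import math


def geometric_ensemble(prob_lists):
    """Combine class-probability vectors with a normalized geometric mean.

    Assumption: all ICs get equal weight (a simplification for illustration).
    """
    n = len(prob_lists)
    num_classes = len(prob_lists[0])
    combined = []
    for c in range(num_classes):
        # Work in log space; clamp to avoid log(0).
        log_sum = sum(math.log(max(p[c], 1e-12)) for p in prob_lists)
        combined.append(math.exp(log_sum / n))
    total = sum(combined)
    return [v / total for v in combined]


def early_exit_predict(ic_outputs, threshold=0.9):
    """Return (predicted_class, exit_index).

    ic_outputs is an iterable of probability vectors, one per internal
    classifier, ordered from the shallowest to the deepest. It may be a
    generator, so deeper ICs are only evaluated if no earlier exit happens.
    """
    seen = []
    ensemble = None
    exit_index = -1
    for exit_index, probs in enumerate(ic_outputs):
        seen.append(probs)
        # Earlier predictions are reused instead of being discarded.
        ensemble = geometric_ensemble(seen)
        if max(ensemble) >= threshold:
            break
    if ensemble is None:
        raise ValueError("at least one classifier output is required")
    return ensemble.index(max(ensemble)), exit_index
```

With a confident first IC, `early_exit_predict([[0.95, 0.05], [0.5, 0.5]])` stops at index 0; with less confident ICs, the running ensemble is consulted at every exit until the threshold is met or the last classifier is reached.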


Subjects
Motivation; Neural Networks, Computer; Databases, Factual
4.
Article in English | MEDLINE | ID: mdl-37028296

ABSTRACT

Interpolating between points is a problem connected simultaneously with finding geodesics and with the study of generative models. In the case of geodesics, we search for the curves with the shortest length, while in the case of generative models, we typically apply linear interpolation in the latent space. However, this interpolation implicitly uses the fact that the Gaussian is unimodal. Thus, interpolating when the latent density is non-Gaussian is an open problem. In this article, we present a general and unified approach to interpolation, which simultaneously allows us to search for geodesics and interpolating curves in latent space in the case of arbitrary density. Our results have a strong theoretical background based on the introduced quality measure of an interpolating curve. In particular, we show that maximizing the quality measure of the curve can be equivalently understood as a search for a geodesic under a certain redefinition of the Riemannian metric on the space. We provide examples in three important cases. First, we show that our approach can be easily applied to finding geodesics on manifolds. Next, we focus our attention on finding interpolations in pretrained generative models. We show that our model works effectively in the case of arbitrary density. Moreover, we can interpolate in the subset of the space consisting of data possessing a given feature. The last case is focused on finding interpolations in the space of chemical compounds.
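
The linear-interpolation baseline mentioned in the abstract can be sketched as follows. This only shows the standard straight-line interpolation between two latent codes and the Euclidean length of the resulting polyline. It does not implement the quality measure or the redefined Riemannian metric proposed in the article, and the function names are assumptions.

```python
import math


def linear_interpolation(z0, z1, steps):
    """Return `steps` points on the straight line from z0 to z1 (inclusive)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    points = []
    for k in range(steps):
        t = k / (steps - 1)
        points.append([(1 - t) * a + t * b for a, b in zip(z0, z1)])
    return points


def curve_length(points):
    """Euclidean length of a polyline given as a list of points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))
```

For a straight line, the polyline length equals the Euclidean distance between the endpoints, which is the shortest possible curve under the flat metric; the article's approach can be read as replacing that flat metric with one that accounts for the latent density.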

5.
IEEE Trans Pattern Anal Mach Intell ; 44(11): 8508-8519, 2022 Nov.
Article in English | MEDLINE | ID: mdl-34460365

ABSTRACT

We propose OneFlow, a flow-based one-class classifier for anomaly (outlier) detection that finds a minimal volume bounding region. Contrary to density-based methods, OneFlow is constructed in such a way that its result typically does not depend on the structure of outliers. This is because, during training, the gradient of the cost function is propagated only over the points located near the decision boundary (behavior similar to the support vectors in SVM). The combination of flow models and a Bernstein quantile estimator allows OneFlow to find a parametric form of the bounding region, which can be useful in various applications, including describing shapes from 3D point clouds. Experiments show that the proposed model outperforms related methods on real-world anomaly detection problems.

6.
IEEE Trans Pattern Anal Mach Intell ; 44(12): 9995-10008, 2022 Dec.
Article in English | MEDLINE | ID: mdl-34826294

ABSTRACT

In this work, we propose a novel method for generating 3D point clouds that leverages the properties of hypernetworks. Contrary to the existing methods that learn only the representation of a 3D object, our approach simultaneously finds a representation of the object and its 3D surface. The main idea of our HyperCloud method is to build a hypernetwork that returns the weights of a particular neural network (target network) trained to map points from a prior distribution into a 3D shape. As a consequence, a particular 3D shape can be generated using point-by-point sampling from the prior distribution and transforming the sampled points with the target network. Since the hypernetwork is based on an auto-encoder architecture trained to reconstruct realistic 3D shapes, the target network weights can be considered to be a parametrization of the surface of a 3D shape, and not a standard representation of a point cloud usually returned by competitive approaches. We also show that relying on hypernetworks to build 3D point cloud representations offers an elegant and flexible framework. To that end, we further extend our method by incorporating flow-based models, which results in a novel HyperFlow approach.

7.
IEEE Trans Neural Netw Learn Syst ; 32(9): 3930-3941, 2021 Sep.
Article in English | MEDLINE | ID: mdl-32845846

ABSTRACT

We propose a semi-supervised generative model, SeGMA, which learns a joint probability distribution of data and their classes and is implemented in a typical Wasserstein autoencoder framework. We choose a mixture of Gaussians as a target distribution in latent space, which provides a natural splitting of data into clusters. To connect Gaussian components with correct classes, we use a small amount of labeled data and a Gaussian classifier induced by the target distribution. SeGMA is optimized efficiently due to the use of the Cramer-Wold distance as a maximum mean discrepancy penalty, which yields a closed-form expression for a mixture of spherical Gaussian components and, thus, obviates the need for sampling. While SeGMA preserves all properties of its semi-supervised predecessors and achieves at least as good generative performance on standard benchmark data sets, it presents additional features: 1) interpolation between any pair of points in the latent space produces realistic-looking samples; 2) combining the interpolation property with disentangling of class and style information, SeGMA is able to perform continuous style transfer from one class to another; and 3) it is possible to change the intensity of class characteristics in a data point by moving the latent representation of the data point away from specific Gaussian components.
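
A classifier induced by a mixture of spherical Gaussians can be sketched as assigning each latent point to the component with the highest weighted density. The sketch below is an illustration under stated assumptions, not the SeGMA code: the component means, the shared standard deviation `sigma`, the equal default weights, and the function names are all hypothetical.

```python
import math


def log_spherical_gaussian(z, mean, sigma):
    """Log-density of N(mean, sigma^2 * I) evaluated at z."""
    d = len(z)
    sq_dist = sum((a - b) ** 2 for a, b in zip(z, mean))
    return (-0.5 * sq_dist / sigma ** 2
            - d * math.log(sigma)
            - 0.5 * d * math.log(2 * math.pi))


def gaussian_classify(z, means, sigma=1.0, weights=None):
    """Return the index of the mixture component most likely to generate z.

    Assumption: one component per class, all sharing the same sigma.
    """
    if weights is None:
        weights = [1.0 / len(means)] * len(means)
    scores = [math.log(w) + log_spherical_gaussian(z, m, sigma)
              for w, m in zip(weights, means)]
    return scores.index(max(scores))
```

With equal weights and a shared `sigma`, this reduces to nearest-mean classification in latent space, which is why the mixture gives a natural split of the data into clusters.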

8.
J Chem Inf Model ; 60(9): 4246-4262, 2020 09 28.
Article in English | MEDLINE | ID: mdl-32865414

ABSTRACT

Docking is one of the most important steps in virtual screening pipelines, and it is an established method for examining potential interactions between ligands and receptors. However, this method is computationally expensive, and it is often among the last steps in the evaluation of compound libraries. In this work, we investigate the feasibility of training a deep neural network to predict the docking output directly from a two-dimensional compound structure. The developed protocol is orders of magnitude faster than typical docking software, and it returns ligand-receptor complexes encoded in the form of the interaction fingerprint. Its speed and efficiency unlock applications such as screening compound libraries of vast size on the basis of contact patterns or docking score (derived on the basis of predicted interaction schemes). We tested our approach on several G protein-coupled receptor targets and 4 CYP enzymes in retrospective virtual screening experiments, and a variant of graph convolutional network appeared to be most effective in emulating docking results. The method can be easily used by the community based on the code available in the Supporting Information.


Subjects
Neural Networks, Computer; Software; Ligands; Molecular Docking Simulation; Protein Binding; Receptors, G-Protein-Coupled; Retrospective Studies
9.
J Chem Inf Model ; 59(12): 4974-4992, 2019 12 23.
Article in English | MEDLINE | ID: mdl-31604014

ABSTRACT

New computational approaches for virtual screening applications are constantly being developed. However, before a particular tool is used to search for new active compounds, its effectiveness in that type of task must be examined. In this study, we conducted a detailed analysis of various aspects of the preparation of data sets for such an evaluation. We propose a protocol for fetching data from the ChEMBL database, examine various compound representations in terms of the possible bias resulting from the way they are generated, and define a new metric for comparing the structural similarity of compounds, which is in line with chemical intuition. The newly developed method is also used to evaluate various approaches for dividing the data set into training and test parts, which are also examined in detail as a possible source of bias in the results. Finally, machine learning methods are applied in cross-validation studies of data sets constructed within the paper, constituting benchmarks for the assessment of computational methods developed for virtual screening tasks. Additionally, analogous data sets for class A G protein-coupled receptors (100 targets with the highest number of records) were prepared. They are available at http://gmum.net/benchmarks/ , together with a script enabling reproduction of all results, available at https://github.com/lesniak43/ananas .


Subjects
Drug Evaluation, Preclinical/methods; Machine Learning; Receptors, G-Protein-Coupled/metabolism; Benchmarking; Ligands; User-Computer Interface
10.
Mol Divers ; 23(3): 603-613, 2019 Aug.
Article in English | MEDLINE | ID: mdl-30484023

ABSTRACT

Three-dimensional descriptors are often used to search for new biologically active compounds, in both ligand- and structure-based approaches, capturing the spatial orientation of molecules. They frequently constitute an input for machine learning-based predictions of compound activity or quantitative structure-activity relationship modeling; however, the distribution of their values and the accuracy of depicting compound orientations might have an impact on the power of the obtained predictive models. In this study, we analyzed the distribution of three-dimensional descriptors calculated for docking poses of active and inactive compounds for all aminergic G protein-coupled receptors with available crystal structures, focusing on the variation in conformations for different receptors and crystals. We demonstrated that the consistency of compound orientation in the binding site is largely not correlated with the affinity itself, but is more influenced by other factors, such as the number of rotatable bonds and the crystal structure used for docking studies. Visualizations of the descriptor distributions were prepared and made available online at http://chem.gmum.net/vischem_stability , which enables the investigation of chemical structures referring to particular data points depicted in the figures. Moreover, the performed analysis can assist in choosing a crystal structure for docking studies, helping to select conditions that provide the best discrimination between active and inactive compounds in machine learning-based experiments.


Subjects
Amines/metabolism; Molecular Docking Simulation; Receptors, G-Protein-Coupled/chemistry; Receptors, G-Protein-Coupled/metabolism; Crystallography, X-Ray; Ligands; Machine Learning; Protein Conformation
11.
Contemp Oncol (Pozn) ; 19(5): 400-9, 2015.
Article in English | MEDLINE | ID: mdl-26793026

ABSTRACT

AIM OF THE STUDY: To evaluate outcome, costs and treatment differences in rectal cancer patients between various regions in Poland. MATERIAL AND METHODS: Data from the Polish National Health Fund of all patients with rectal cancer diagnosed and treated between 2005 and 2007 were analyzed. Overall, relative 5-year survival and the percentage of patients receiving chemotherapy, radiotherapy and surgery were analyzed. The possible influence of cost of treatment per patient and mean number of rectal cancer patients per surgical oncologist was analyzed as well. RESULTS: In total 15,281 patients with rectal cancer were diagnosed and treated in Poland in 2005-2007 within the services of the National Health Fund. The overall, relative 5-year survival rate was 51.6%. Curative surgery was performed in 64.1% of patients. Radiotherapy and chemotherapy were used in 47.5% and 60.7% of patients, respectively. The mean cost of treatment of one rectal cancer patient was 32,800 PLN and there were 49.8 rectal cancer patients per specialist in surgical oncology. Important differences between regions were found in all these factors, but without a significant influence on survival. A correlation between numbers of patients per specialist in different voivodeships and survival rates was observed, as well as a correlation between percentage of surgical resection in voivodeships and survival rates (p = 0.07). CONCLUSIONS: Results of treatment of colorectal cancer in Poland improved significantly during the last decade. There are, however, important disparities between regions in terms of method of treatment, costs and outcomes.

12.
PLoS One ; 9(7): e102069, 2014.
Article in English | MEDLINE | ID: mdl-25019251

ABSTRACT

The automatic clustering of chemical compounds is an important branch of chemoinformatics. In this paper the Asymmetric Clustering Index (Aci) is proposed to assess how well an automatically created partition reflects the reference. The asymmetry allows for a distinction between the fixed reference and the numerically constructed partition. The introduced index is applied to evaluate the quality of hierarchical clustering procedures for 5-HT1A receptor ligands. We find that the most appropriate combination of parameters for the hierarchical clustering of compounds with a determined activity for this biological target is the Klekota-Roth fingerprint combined with the complete linkage function and the Buser similarity metric.
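
The two clustering ingredients named at the end of the abstract can be sketched for binary fingerprints. The sketch assumes that "Buser similarity" refers to the Baroni-Urbani and Buser coefficient, (sqrt(a*d) + a) / (sqrt(a*d) + a + b + c), and that distance is taken as one minus similarity. The function names and the handling of two all-zero fingerprints are illustrative choices, and the Asymmetric Clustering Index itself is not reproduced here.

```python
import math


def buser_similarity(x, y):
    """Baroni-Urbani and Buser coefficient for two equal-length bit vectors.

    a: bits set in both, b and c: bits set in only one, d: bits set in neither.
    """
    a = sum(1 for p, q in zip(x, y) if p and q)
    b = sum(1 for p, q in zip(x, y) if p and not q)
    c = sum(1 for p, q in zip(x, y) if not p and q)
    d = sum(1 for p, q in zip(x, y) if not p and not q)
    root = math.sqrt(a * d)
    denom = root + a + b + c
    if denom == 0:
        # Assumed convention: two all-zero fingerprints count as identical.
        return 1.0
    return (root + a) / denom


def complete_linkage_distance(cluster_a, cluster_b):
    """Complete linkage: the largest pairwise distance between two clusters."""
    return max(1.0 - buser_similarity(x, y)
               for x in cluster_a for y in cluster_b)
```

In agglomerative clustering with complete linkage, the two clusters with the smallest such distance are merged at each step, which tends to produce compact groups of mutually similar compounds.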


Subjects
Cluster Analysis; Computational Biology/methods; Ligands; Receptor, Serotonin, 5-HT1A/metabolism; Biochemical Phenomena; Humans
13.
World J Surg ; 33(3): 469-74, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19148700

ABSTRACT

PURPOSE: The prognosis for stage III melanoma patients is mixed, and there is a need for new prognostic factors to be incorporated into a revision of melanoma TNM staging. We analyzed the possible role of the timing of lymph node involvement as an important prognostic factor. METHODS: Among 249 melanoma patients who underwent ilioinguinal lymphadenectomy, a group of 185 patients with a thick (>4 mm) melanoma and full clinical data available was analyzed. The mean depth of invasion was 5.85 mm; the tumor was ulcerated in 67 cases (36.2%); and Clark V was diagnosed in 82 patients (44.3%). The median interval between primary excision and the time of lymphadenectomy was 11.1 months. RESULTS: Recurrent disease was reported in 150 of 185 patients. The first sites of recurrence were the skin in 15.7%, lymph nodes in 13.5%, and distant metastases in 28.7%; the remaining 43 patients (23.2%) had multifocal recurrences. In all, 35 patients (18.9%) were disease-free. Skip metastases (positive iliac and negative inguinal lymph nodes) were found in 26 patients (14%). Multivariate Cox analysis showed that only the time between the first surgery and lymphadenectomy and the number of involved nodes were significant predictors of survival. Relative risk of death was 5.2 times higher for patients who had simultaneously undergone lymphadenectomy (compared to lymph node dissection performed >1 year after primary excision) and about 2.7 times higher for those with more regionally advanced disease (pN3 vs. pN1). CONCLUSIONS: The long disease-free interval before the development of lymph node metastases and before node dissection is a favorable prognostic factor independent of other well-known parameters.
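
The counts and percentages reported in the abstract can be cross-checked with a few lines of arithmetic, taking 185 patients as the denominator. The single-site recurrence counts are not stated in the abstract; they are back-calculated here from the reported percentages, so they should be read as an inference rather than as source data. The helper name `percent` is illustrative.

```python
TOTAL = 185  # analyzed patients, as stated in the abstract


def percent(count, total=TOTAL, digits=1):
    """Percentage of `total` represented by `count`, rounded to `digits`."""
    return round(100 * count / total, digits)


# Counts stated in the abstract, with their reported percentages.
stated = {
    "ulcerated": (67, 36.2),
    "clark_v": (82, 44.3),
    "multifocal_recurrence": (43, 23.2),
    "disease_free": (35, 18.9),
}
consistent = {name: percent(count) == pct
              for name, (count, pct) in stated.items()}

# Back-calculated (not stated) counts for single-site first recurrences.
single_site = [round(p * TOTAL / 100) for p in (15.7, 13.5, 28.7)]
recurrences = sum(single_site) + 43  # add the multifocal recurrences
```

Under these assumptions every stated count matches its percentage, the back-calculated recurrences add up to the reported 150, and 150 recurrences plus 35 disease-free patients account for all 185 patients.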


Subjects
Lower Extremity; Lymph Nodes/pathology; Melanoma/secondary; Skin Neoplasms/pathology; Adolescent; Adult; Aged; Aged, 80 and over; Female; Humans; Longitudinal Studies; Lymph Node Excision/mortality; Lymph Nodes/surgery; Lymphatic Metastasis; Male; Melanoma/mortality; Melanoma/surgery; Middle Aged; Multivariate Analysis; Neoplasm Recurrence, Local/pathology; Neoplasm Staging; Prognosis; Skin Neoplasms/mortality; Skin Neoplasms/surgery; Survival Rate; Time Factors; Young Adult