Results 1 - 20 of 51
1.
Biomedicines ; 12(9)2024 Sep 11.
Article in English | MEDLINE | ID: mdl-39335588

ABSTRACT

Inhibitors of the serine protease furin have been widely studied as antimicrobial agents due to their ability to block the cleavage and activation of certain viral surface proteins and bacterial toxins. In this study, the antipseudomonal effects and safety profiles of the furin inhibitors MI-1851 and MI-2415 were assessed. Fluorescence quenching studies suggested no relevant binding of the compounds to human serum albumin and α1-acid glycoprotein. Both inhibitors demonstrated significant antipseudomonal activity in Madin-Darby canine kidney cells, especially compound MI-1851 at very low concentrations (0.5 µM). Using non-tumorigenic porcine IPEC-J2 cells, neither of the two furin inhibitors induced cytotoxicity (CCK-8 assay) or altered significantly the intracellular (Amplex Red assay) or extracellular (DCFH-DA assay) redox status even at a concentration of 100 µM. The same assays with MI-2415 conducted on primary human hepatocytes also resulted in no changes in cell viability and oxidative stress at up to 100 µM. Microsomal and hepatocyte-based CYP3A4 activity assays showed that both inhibitors exhibited a concentration-dependent inhibition of the isoenzyme at high concentrations. In conclusion, this study indicates a good safety profile of the furin inhibitors MI-1851 and MI-2415, suggesting their applicability as antimicrobials for further in vivo investigations, despite some inhibitory effects on CYP3A4.

2.
Sci Rep ; 14(1): 16621, 2024 07 18.
Article in English | MEDLINE | ID: mdl-39025978

ABSTRACT

Certain corona- and influenza viruses utilize type II transmembrane serine proteases for cell entry, making these enzymes potential drug targets for the treatment of viral respiratory infections. In this study, the cytotoxicity and inhibitory effects of seven matriptase/TMPRSS2 inhibitors (MI-21, MI-463, MI-472, MI-485, MI-1900, MI-1903, and MI-1904) on cytochrome P450 enzymes were evaluated using fluorometric assays. Additionally, their antiviral activity against influenza A virus subtypes H1N1 and H9N2 was assessed. The metabolic depletion rates of these inhibitors in human primary hepatocytes were determined over a 120-min period by LC-MS/MS, and PK parameters were calculated. The tested compounds, with the exception of MI-21, displayed potent inhibition of CYP3A4, while all compounds lacked inhibitory effects on CYP1A2, CYP2C9, CYP2C19, and CYP2D6. The differences between the CYP3A4 activity within the series were rationalized by ligand docking. Elucidation of PK parameters showed that inhibitors MI-463, MI-472, MI-485, MI-1900 and MI-1904 were more stable compounds than MI-21 and MI-1903. Anti-H1N1 properties of inhibitors MI-463 and MI-1900 and anti-H9N2 effects of MI-463 were shown at 20 and 50 µM after 24 h incubation with the inhibitors, suggesting that these inhibitors can be applied to block entry of these viruses by suppressing host matriptase/TMPRSS2-mediated cleavage.


Subjects
Antiviral Agents; Hepatocytes; Serine Endopeptidases; Dogs; Humans; Antiviral Agents/pharmacology; Cytochrome P-450 CYP3A/metabolism; Hepatocytes/virology; Hepatocytes/metabolism; Hepatocytes/drug effects; Influenza A Virus, H1N1 Subtype/drug effects; Molecular Docking Simulation; Serine Endopeptidases/metabolism; Animals; Madin Darby Canine Kidney Cells
3.
Sci Data ; 11(1): 540, 2024 May 25.
Article in English | MEDLINE | ID: mdl-38796485

ABSTRACT

Amongst fishes, zebrafish (Danio rerio) has gained popularity as a model system over most other species and while their value as a model is well documented, their usefulness is limited in certain fields of research such as behavior. By embracing other, less conventional experimental organisms, opportunities arise to gain broader insights into evolution and development, as well as studying behavioral aspects not available in current popular model systems. The anabantoid paradise fish (Macropodus opercularis), an "air-breather" species has a highly complex behavioral repertoire and has been the subject of many ethological investigations but lacks genomic resources. Here we report the reference genome assembly of M. opercularis using long-read sequences at 150-fold coverage. The final assembly consisted of 483,077,705 base pairs (~483 Mb) on 152 contigs. Within the assembled genome we identified and annotated 20,157 protein coding genes and assigned ~90% of them to orthogroups.


Subjects
Fishes; Genome; Animals; Fishes/genetics
4.
Article in English | MEDLINE | ID: mdl-37818738

ABSTRACT

Paradise fish (Macropodus opercularis) is an air-breathing freshwater fish species with a signature labyrinth organ capable of extracting oxygen from the air that helps these fish to survive in hypoxic environments. The appearance of this evolutionary innovation in anabantoids resulted in a rewired circulatory system, but also in the emergence of species-specific behaviors, such as territorial display, courtship and parental care in the case of the paradise fish. Early zoologists were intrigued by the structure and function of the labyrinth apparatus and a series of detailed descriptive histological studies at the beginning of the 20th century revealed the ontogenesis and function of this specialized system. A few decades later, these fish became the subject of numerous ethological studies, and detailed ethograms of their behavior were constructed. These latter studies also demonstrated a strong genetic component underlying their behavior, but due to lack of adequate molecular tools, the fine genetic dissection of the behavior was not possible at the time. The technological breakthroughs that transformed developmental biology and behavioral genetics in the past decades, however, give us now a unique opportunity to revisit these old questions. Building on the classic descriptive studies, the new methodologies will allow us to follow the development of the labyrinth apparatus at a cellular resolution, reveal the genes involved in this process and also the genetic architecture behind the complex behaviors that we can observe in this species.

5.
bioRxiv ; 2023 Aug 10.
Article in English | MEDLINE | ID: mdl-37609174

ABSTRACT

Over the decades, a small number of model species, each representative of a larger taxon, have dominated the field of biological research. Amongst fishes, zebrafish (Danio rerio) has gained popularity over most other species and while their value as a model is well documented, their usefulness is limited in certain fields of research such as behavior. By embracing other, less conventional experimental organisms, opportunities arise to gain broader insights into evolution and development, as well as studying behavioral aspects not available in current popular model systems. The anabantoid paradise fish (Macropodus opercularis), an "air-breather" species from Southeast Asia, has a highly complex behavioral repertoire and has been the subject of many ethological investigations, but lacks genomic resources. Here we report the reference genome assembly of Macropodus opercularis using long-read sequences at 150-fold coverage. The final assembly consisted of ≈483 Mb on 152 contigs. Within the assembled genome we identified and annotated 20,157 protein coding genes and assigned ≈90% of them to orthogroups. Completeness analysis showed that 98.5% of the Actinopterygii core gene set (ODB10) was present as a complete ortholog in our reference genome, with a further 1.2% being present in a fragmented form. Additionally, we cloned multiple genes important during early development and, using newly developed in situ hybridization protocols, we showed that they have conserved expression patterns.

6.
Eur J Pharm Sci ; 188: 106514, 2023 Sep 01.
Article in English | MEDLINE | ID: mdl-37402429

ABSTRACT

Gastrointestinal absorption is a key factor amongst the ADME-related (absorption, distribution, metabolism and excretion) pharmacokinetic properties; therefore, it has a major role in drug discovery and drug safety determinations. The Parallel Artificial Membrane Permeability Assay (PAMPA) can be considered the most popular and well-known screening assay for the measurement of gastrointestinal absorption. Our study provides quantitative structure-property relationship (QSPR) models based on experimental PAMPA permeability data for almost four hundred diverse molecules, which is a great extension of the applicability of the models in the chemical space. Two- and three-dimensional molecular descriptors were applied for the model building in every case. We have compared the performance of a classical partial least squares regression (PLS) model with two major machine learning algorithms: artificial neural networks (ANN) and support vector machine (SVM). Due to the applied gradient pH in the experiments, we have calculated the descriptors for the model building at pH values of 7.4 and 6.5, and compared the effect of pH on the performance of the models. After a complex validation protocol, the best model had an R² = 0.91 for the training set, and R² = 0.84 for the external test set. The developed models are capable of robust and fast prediction for new compounds, with excellent accuracy compared to the previous QSPR models.


Subjects
Algorithms; Gastrointestinal Absorption; Intestinal Absorption; Permeability; Quantitative Structure-Activity Relationship; Machine Learning
7.
J Chem Inf Model ; 62(14): 3415-3425, 2022 07 25.
Article in English | MEDLINE | ID: mdl-35834424

ABSTRACT

Molecular dynamics (MD) is a core methodology of molecular modeling and computational design for the study of the dynamics and temporal evolution of molecular systems. MD simulations have particularly benefited from the rapid increase of computational power that has characterized the past decades of computational chemical research, being the first method to be successfully migrated to the GPU infrastructure. While new-generation MD software is capable of delivering simulations on an ever-increasing scale, relatively less effort is invested in developing postprocessing methods that can keep up with the quickly expanding volumes of data that are being generated. Here, we introduce a new idea for sampling frames from large MD trajectories, based on the recently introduced framework of extended similarity indices. Our approach presents a new, linearly scaling alternative to the traditional approach of applying a clustering algorithm that usually scales as a quadratic function of the number of frames. When showcasing its usage on case studies with different system sizes and simulation lengths, we have registered speedups of up to 2 orders of magnitude, as compared to traditional clustering algorithms. The conformational diversity of the selected frames is also noticeably higher, which is a further advantage for certain applications, such as the selection of structural ensembles for ligand docking. The method is available open-source at https://github.com/ramirandaq/MultipleComparisons.


Subjects
Molecular Dynamics Simulation; Proteins; Algorithms; Cluster Analysis; Proteins/chemistry; Software
8.
Front Chem ; 10: 852893, 2022.
Article in English | MEDLINE | ID: mdl-35755260

ABSTRACT

The screening of compounds for ADME-Tox targets plays an important role in drug design. QSPR models can increase the speed of these specific tasks, although the performance of the models highly depends on several factors, such as the applied molecular descriptors. In this study, a detailed comparison of the most popular descriptor groups has been carried out for six main ADME-Tox classification targets: Ames mutagenicity, P-glycoprotein inhibition, hERG inhibition, hepatotoxicity, blood-brain-barrier permeability, and cytochrome P450 2C9 inhibition. The literature-based, medium-sized binary classification datasets (all above 1,000 molecules) were used for the model building by two common algorithms, XGBoost and the RPropMLP neural network. Five molecular representation sets were compared along with their joint applications: Morgan, Atompairs, and MACCS fingerprints, and the traditional 1D and 2D molecular descriptors, as well as 3D molecular descriptors, separately. The statistical evaluation of the model performances was based on 18 different performance parameters. Although all the developed models were close to the usual performance of QSPR models for each specific ADME-Tox target, the results clearly showed the superiority of the traditional 1D, 2D, and 3D descriptors in the case of the XGBoost algorithm. It is worth trying the classical tools in single model building because the use of 2D descriptors can produce even better models for almost every dataset than the combination of all the examined descriptor sets.

9.
J Comput Aided Mol Des ; 36(3): 157-173, 2022 03.
Article in English | MEDLINE | ID: mdl-35288838

ABSTRACT

Extended (or n-ary) similarity indices have been recently proposed to extend the comparative analysis of binary strings. Going beyond the traditional notion of pairwise comparisons, these novel indices allow comparing any number of objects at the same time. This results in a remarkable efficiency gain with respect to other approaches, since now we can compare N molecules in O(N) instead of the common quadratic O(N²) timescale. This favorable scaling has motivated the application of these indices to diversity selection, clustering, phylogenetic analysis, chemical space visualization, and post-processing of molecular dynamics simulations. However, the current formulation of the n-ary indices is limited to vectors with binary or categorical inputs. Here, we present the further generalization of this formalism so it can be applied to numerical data, i.e. to vectors with continuous components. We discuss several ways to achieve this extension and present their analytical properties. As a practical example, we apply this formalism to the problem of feature selection in QSAR and prove that the extended continuous similarity indices provide a convenient way to discern between several sets of descriptors.


Subjects
Drug Design; Quantitative Structure-Activity Relationship; Phylogeny
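
The O(N) versus O(N²) scaling mentioned in the abstract above can be illustrated with a minimal Python sketch for binary fingerprints. This is not the authors' code: it assumes an unweighted, Tanimoto-like n-ary index built from column sums, and the function names and the default coincidence threshold (N mod 2) are choices made here for illustration only.

```python
from itertools import combinations


def pairwise_tanimoto(x, y):
    """Classical Tanimoto similarity of two equal-length 0/1 lists."""
    both = sum(1 for p, q in zip(x, y) if p and q)
    either = sum(1 for p, q in zip(x, y) if p or q)
    return both / either if either else 0.0


def mean_pairwise_tanimoto(fps):
    """Average over all N(N-1)/2 pairs, so the cost grows quadratically with N."""
    pairs = list(combinations(fps, 2))
    if not pairs:
        raise ValueError("need at least two fingerprints")
    return sum(pairwise_tanimoto(x, y) for x, y in pairs) / len(pairs)


def extended_tanimoto(fps, threshold=None):
    """Unweighted n-ary Tanimoto-like index (illustrative assumption).

    Each fingerprint is visited once to build the column sums, so the
    cost grows linearly with the number of fingerprints.
    """
    n = len(fps)
    if threshold is None:
        threshold = n % 2  # assumed default coincidence threshold
    sums = [sum(col) for col in zip(*fps)]
    # Columns where "on" bits clearly dominate count as 1-similarity,
    # columns where "off" bits clearly dominate count as 0-similarity,
    # everything else counts as dissimilarity.
    one_sim = sum(1 for k in sums if 2 * k - n > threshold)
    zero_sim = sum(1 for k in sums if n - 2 * k > threshold)
    dissim = len(sums) - one_sim - zero_sim
    denom = one_sim + dissim
    return one_sim / denom if denom else 0.0
```

For two fingerprints the sketch reduces to the ordinary Tanimoto coefficient, which is a quick sanity check that the column-sum formulation is consistent with the pairwise one.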
10.
Eur J Med Chem ; 231: 114163, 2022 Mar 05.
Article in English | MEDLINE | ID: mdl-35131537

ABSTRACT

Intrinsically disordered proteins (IDPs) play important roles in disease pathologies; however, their lack of defined stable 3D structures makes traditional drug design strategies typically less effective against these targets. Based on promising results of targeted covalent inhibitors (TCIs) on challenging targets, we have developed a covalent design strategy targeting IDPs. As a model system we chose tau, an endogenous IDP of the central nervous system that is associated with severe neurodegenerative diseases via its aggregation. First, we mapped the tractability of available cysteines in tau and prioritized suitable warheads. Next, we introduced the selected vinylsulfone warhead to the non-covalent scaffolds of potential tau aggregation inhibitors. The designed covalent tau binders were synthesized and tested in aggregation models, and inhibited tau aggregation effectively. Our results revealed the usefulness of the covalent design strategy against therapeutically relevant IDP targets and provided promising candidates for the treatment of tauopathies.


Subjects
Intrinsically Disordered Proteins; Neurodegenerative Diseases; Tauopathies; Cysteine; Drug Design; Humans; Intrinsically Disordered Proteins/chemistry; Neurodegenerative Diseases/metabolism; Tauopathies/drug therapy; tau Proteins/metabolism
11.
Ecol Evol ; 12(2): e8596, 2022 Feb.
Article in English | MEDLINE | ID: mdl-35169454

ABSTRACT

Commercial fishery harvest can influence the evolution of wild fish populations. Our knowledge of selection on morphology is however limited, with most previous studies focusing on body size, age, and maturation. Within species, variation in morphology can influence locomotor ability, possibly making some individuals more vulnerable to capture by fishing gears. Additionally, selection on morphology has the potential to influence other foraging, behavioral, and life-history related traits. Here we carried out simulated fishing using two types of gears: a trawl (an active gear) and a trap (a passive gear), to assess morphological trait-based selection in relation to capture vulnerability. Using geometric morphometrics, we assessed differences in shape between high and low vulnerability fish, showing that high vulnerability individuals display shallower body shapes regardless of gear type. For trawling, low vulnerability fish displayed morphological characteristics that may be associated with higher burst-swimming, including a larger caudal region and narrower head, similar to evolutionary responses seen in fish populations responding to natural predation. Taken together, these results suggest that divergent selection can lead to phenotypic differences in harvested fish populations.

12.
ChemMedChem ; 17(2): e202100569, 2022 01 19.
Article in English | MEDLINE | ID: mdl-34632716

ABSTRACT

Maternal Embryonic Leucine-zipper Kinase (MELK) is a current oncotarget involved in a diverse range of human cancers, with the usage of MELK inhibitors being explored clinically. Here, we aimed to discover new MELK inhibitor chemotypes from our in-house compound library with a consensus-based virtual screening workflow, employing three screening concepts. After careful retrospective validation, prospective screening and in vitro enzyme inhibition testing revealed a series of [1,2,4]triazolo[1,5-b]isoquinolines as a new structural class of MELK inhibitors, with the lead compound of the series exhibiting a sub-micromolar inhibitory activity. The structure-activity relationship of the series was explored by testing further analogs based on a structure-guided selection process. Importantly, the present work marks the first disclosure of the synthesis and bioactivity of this class of compounds.


Subjects
Protein Kinase Inhibitors/pharmacology; Protein Serine-Threonine Kinases/antagonists & inhibitors; Dose-Response Relationship, Drug; Drug Evaluation, Preclinical; Humans; Molecular Docking Simulation; Molecular Structure; Protein Kinase Inhibitors/chemical synthesis; Protein Kinase Inhibitors/chemistry; Protein Serine-Threonine Kinases/metabolism; Structure-Activity Relationship
13.
Proc Natl Acad Sci U S A ; 118(51)2021 12 21.
Article in English | MEDLINE | ID: mdl-34903645

ABSTRACT

Fisheries induce one of the strongest anthropogenic selective pressures on natural populations, but the genetic effects of fishing remain unclear. Crucially, we lack knowledge of how capture-associated selection and its interaction with reductions in population density caused by fishing can potentially shift which genes are under selection. Using experimental fish reared at two densities and repeatedly harvested by simulated trawling, we show consistent phenotypic selection on growth, metabolism, and social behavior regardless of density. However, the specific genes under selection, mainly related to brain function and neurogenesis, varied with the population density. This interaction between direct fishing selection and density could fundamentally alter the genomic responses to harvest. The evolutionary consequences of fishing are therefore likely context dependent, possibly varying as exploited populations decline. These results highlight the need to consider environmental factors when predicting effects of human-induced selection and evolution.


Subjects
Fisheries; Life History Traits; Selection, Genetic; Aggression; Animals; Energy Metabolism/genetics; Female; Genetic Association Studies; Genome; Male; Phenotype; Population Density; Zebrafish
14.
Evol Appl ; 14(10): 2527-2540, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34745341

ABSTRACT

Fishing-associated selection is one of the most important human-induced evolutionary pressures for natural populations. However, it is unclear whether fishing leads to heritable phenotypic changes in the targeted populations, as the heritability and genetic correlations of traits potentially under selection have received little attention. In addition, phenotypic changes could arise from fishing-associated environmental effects, such as reductions in population density. Using fish reared at baseline and reduced group density and repeatedly harvested by simulated trawling, we show that trawling can induce direct selection on fish social behaviour. As sociability has significant heritability and is also genetically correlated with activity and exploration, trawling has the potential to induce both direct selection and indirect selection on a variety of fish behaviours, potentially leading to evolution over time. However, while trawling selection was consistent between density conditions, the heritability and genetic correlations of behaviours changed according to the population density. Fishing-associated environmental effects can thus modify the evolutionary potential of fish behaviour, revealing the need to use a more integrative approach to address the evolutionary consequences of fishing.

15.
Comput Struct Biotechnol J ; 19: 3628-3639, 2021.
Article in English | MEDLINE | ID: mdl-34257841

ABSTRACT

Quantification of similarities between protein sequences or DNA/RNA strands is a (sub-)task that is ubiquitously present in bioinformatics workflows, and is usually accomplished by pairwise comparisons of sequences, utilizing simple (e.g. percent identity) or more intricate concepts (e.g. substitution scoring matrices). Complex tasks (such as clustering) rely on a large number of pairwise comparisons under the hood, instead of a direct quantification of set similarities. Based on our recently introduced framework that enables multiple comparisons of binary molecular fingerprints (i.e., direct calculation of the similarity of fingerprint sets), here we introduce novel symmetric similarity indices for analogous calculations on sets of character sequences with more than two (t) possible items (e.g. DNA/RNA sequences with t = 4, or protein sequences with t = 20). The features of these new indices are studied in detail with analysis of variance (ANOVA), and demonstrated with three case studies of protein/DNA sequences with varying degrees of similarity (or evolutionary proximity). The Python code for the extended many-item similarity indices is publicly available at: https://github.com/ramirandaq/tn_Comparisons.
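
The conventional pairwise route that the abstract above contrasts with set-level indices can be sketched in a few lines of Python. This is only an illustration of percent identity over pre-aligned, equal-length sequences; it does not reproduce the extended many-item indices of the paper, and all function names are hypothetical.

```python
from itertools import combinations


def percent_identity(seq_a, seq_b):
    """Percent of aligned positions at which two sequences carry the same item."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        return 0.0
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def n_pairwise_comparisons(n):
    """Number of distinct pairs among n sequences."""
    return n * (n - 1) // 2


def mean_pairwise_identity(seqs):
    """Set similarity estimated indirectly, as the mean over every pair."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)
```

With 1,000 sequences this approach already needs 499,500 comparisons, which is the kind of cost a direct set-level index is meant to avoid.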

16.
J Pharm Biomed Anal ; 203: 114218, 2021 Sep 05.
Article in English | MEDLINE | ID: mdl-34166924

ABSTRACT

The capability to predict corneal permeability based on physicochemical parameters has always been a desirable objective of ophthalmic drug development. However, previous work has been limited to cases where either the diversity of compounds used was lacking or the performance of the models was poor. Our study provides extensive quantitative structure-property relationship (QSPR) models for corneal permeability predictions. The models involved in vitro corneal permeability measurements of 189 diverse compounds. Preliminary analysis of data showed that there is no significant correlation between corneal-PAMPA (Parallel Artificial Membrane Permeability Assay) permeability values and other pharmacokinetically relevant in silico drug transport parameters like Caco-2, jejunal permeability and blood-brain partition coefficient (logBB). Two different QSPR models were developed: one for corneal permeability and one for corneal membrane retention, based on experimental corneal-PAMPA permeability data. Partial least squares regression was applied for producing the models, which contained classical molecular descriptors and ECFP fingerprints in combination. A complex validation protocol (including internal and external validation) was carried out to provide robust and appropriate predictions for the permeability and membrane retention values. Both models had an overall fit of R² > 0.90, including R² values not lower than 0.85 for validation runs, and provide quick and accurate predictions of corneal permeability values for a diverse set of compounds.


Subjects
Membranes, Artificial; Quantitative Structure-Activity Relationship; Caco-2 Cells; Cell Membrane Permeability; Computer Simulation; Humans; Permeability
17.
Mol Divers ; 25(3): 1409-1424, 2021 Aug.
Article in English | MEDLINE | ID: mdl-34110577

ABSTRACT

In this review, we outline the current trends in the field of machine learning-driven classification studies related to ADME (absorption, distribution, metabolism and excretion) and toxicity endpoints from the past six years (2015-2021). The study focuses only on classification models with large datasets (i.e. more than a thousand compounds). A comprehensive literature search and meta-analysis was carried out for nine different targets: hERG-mediated cardiotoxicity, blood-brain barrier penetration, permeability glycoprotein (P-gp) substrate/inhibitor, cytochrome P450 enzyme family, acute oral toxicity, mutagenicity, carcinogenicity, respiratory toxicity and irritation/corrosion. The comparison of the best classification models was targeted to reveal the differences between machine learning algorithms and modeling types, endpoint-specific performances, dataset sizes and the different validation protocols. Based on the evaluation of the data, we can say that tree-based algorithms are (still) dominating the field, with consensus modeling being an increasing trend in drug safety predictions. Although one can already find classification models with great performances to hERG-mediated cardiotoxicity and the isoenzymes of the cytochrome P450 enzyme family, these targets are still central to ADMET-related research efforts.


Subjects
Drug Design; Machine Learning; Models, Molecular; Quantitative Structure-Activity Relationship; Algorithms; Drug-Related Side Effects and Adverse Reactions; ERG1 Potassium Channel/chemistry; ERG1 Potassium Channel/genetics; Humans; Neural Networks, Computer; Pharmacokinetics; Support Vector Machine; Tissue Distribution
18.
Foods ; 10(5)2021 May 19.
Article in English | MEDLINE | ID: mdl-34069392

ABSTRACT

Binary similarity measures have been used in several research fields, but their application in sensory data analysis is limited as of yet. Since check-all-that-apply (CATA) data consist of binary answers from the participants, binary similarity measures seem to be a natural choice for their evaluation. This work aims to define the discrimination ability of CATA participants by calculating the consensus values of 44 binary similarity measures. The proposed methodology consists of three steps: (i) calculating the binary similarity values of the assessors, sample pair-wise; (ii) clustering participants into good and poor discriminators based on their binary similarity values; (iii) performing correspondence analysis on the CATA data of the two clusters. Results of three case studies are presented, highlighting that a simple clustering based on the computed binary similarity measures results in higher quality correspondence analysis with more significant attributes, as well as better sample discrimination (even according to overall liking).
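
Step (i) of the methodology above relies on binary similarity measures computed from the usual 2×2 contingency counts. The Python sketch below shows two well-known measures (Jaccard and simple matching) applied to the CATA answers one assessor might give for two samples. Which 44 measures the study used and how they were combined into a consensus are not stated in the abstract, so the selection and the function names here are assumptions.

```python
def contingency_counts(x, y):
    """Return (a, b, c, d) for two equal-length 0/1 answer vectors.

    a: attribute checked for both samples, b: only for the first,
    c: only for the second, d: checked for neither.
    """
    if len(x) != len(y):
        raise ValueError("answer vectors must have the same length")
    a = sum(1 for p, q in zip(x, y) if p == 1 and q == 1)
    b = sum(1 for p, q in zip(x, y) if p == 1 and q == 0)
    c = sum(1 for p, q in zip(x, y) if p == 0 and q == 1)
    d = sum(1 for p, q in zip(x, y) if p == 0 and q == 0)
    return a, b, c, d


def jaccard(x, y):
    """Jaccard similarity: shared checks over all checks, ignoring d."""
    a, b, c, _ = contingency_counts(x, y)
    return a / (a + b + c) if (a + b + c) else 0.0


def simple_matching(x, y):
    """Simple matching similarity: agreements (a + d) over all attributes."""
    a, b, c, d = contingency_counts(x, y)
    total = a + b + c + d
    return (a + d) / total if total else 0.0
```

Under this reading, an assessor whose answer vectors are highly similar across different samples would be a candidate poor discriminator, although the actual clustering criterion is the authors' own.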

19.
J Cheminform ; 13(1): 33, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33892799

ABSTRACT

Despite being a central concept in cheminformatics, molecular similarity has so far been limited to the simultaneous comparison of only two molecules at a time and using one index, generally the Tanimoto coefficient. In a recent contribution we have not only introduced a complete mathematical framework for extended similarity calculations (i.e. comparisons of more than two molecules at a time) but also defined a series of novel indices. Part 1 is a detailed analysis of the effects of various parameters on the similarity values calculated by the extended formulas. Their features were revealed by sum of ranking differences and ANOVA. Here, in addition to characterizing several important aspects of the newly introduced similarity metrics, we will highlight their applicability and utility in real-life scenarios using datasets with popular molecular fingerprints. Remarkably, for large datasets, the use of extended similarity measures provides an unprecedented speed-up over "traditional" pairwise similarity matrix calculations. We also provide illustrative examples of a more direct algorithm based on the extended Tanimoto similarity to select diverse compound sets, resulting in much higher levels of diversity than traditional approaches. We discuss the inner and outer consistency of our indices, which are key in practical applications, showing whether the n-ary and binary indices rank the data in the same way. We demonstrate the use of the new n-ary similarity metrics on t-distributed stochastic neighbor embedding (t-SNE) plots of datasets of varying diversity, or corresponding to ligands of different pharmaceutical targets, which show that our indices provide a better measure of set compactness than standard binary measures. We also present a conceptual example of the applicability of our indices in agglomerative hierarchical algorithms. The Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons.
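
The diversity selection idea mentioned in the abstract above can be sketched as a greedy loop that keeps running column sums of the selected set and, at each step, adds the candidate that yields the lowest n-ary similarity. The code is a minimal Python illustration under stated assumptions, not the algorithm from the linked repository: it uses an unweighted Tanimoto-like index, a coincidence threshold of N mod 2, and hypothetical function names.

```python
def tanimoto_from_sums(sums, n):
    """Unweighted n-ary Tanimoto-like index from column sums (assumption)."""
    threshold = n % 2  # assumed coincidence threshold
    one_sim = sum(1 for k in sums if 2 * k - n > threshold)
    zero_sim = sum(1 for k in sums if n - 2 * k > threshold)
    dissim = len(sums) - one_sim - zero_sim
    denom = one_sim + dissim
    return one_sim / denom if denom else 0.0


def select_diverse(fps, n_select, seed_index=0):
    """Greedily pick indices of fingerprints that keep the set dissimilar."""
    selected = [seed_index]
    sums = list(fps[seed_index])  # running column sums of the selected set
    remaining = [i for i in range(len(fps)) if i != seed_index]
    while len(selected) < n_select and remaining:
        n = len(selected) + 1
        best_idx, best_val = None, None
        for i in remaining:
            # Updating the column sums is enough to score a candidate;
            # no pairwise comparison with the selected members is needed.
            candidate_sums = [s + b for s, b in zip(sums, fps[i])]
            value = tanimoto_from_sums(candidate_sums, n)
            if best_val is None or value < best_val:
                best_idx, best_val = i, value
        selected.append(best_idx)
        sums = [s + b for s, b in zip(sums, fps[best_idx])]
        remaining.remove(best_idx)
    return selected
```

Ties are broken by taking the first candidate in index order, a simplification chosen here; a production implementation would likely also vectorize the column-sum updates.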

20.
J Cheminform ; 13(1): 32, 2021 Apr 23.
Article in English | MEDLINE | ID: mdl-33892802

ABSTRACT

Quantification of the similarity of objects is a key concept in many areas of computational science. This includes cheminformatics, where molecular similarity is usually quantified based on binary fingerprints. While there is a wide selection of available molecular representations and similarity metrics, there were no previous efforts to extend the computational framework of similarity calculations to the simultaneous comparison of more than two objects (molecules) at the same time. The present study bridges this gap, by introducing a straightforward computational framework for comparing multiple objects at the same time and providing extended formulas for as many similarity metrics as possible. In the binary case (i.e. when comparing two molecules pairwise) these are naturally reduced to their well-known formulas. We provide a detailed analysis on the effects of various parameters on the similarity values calculated by the extended formulas. The extended similarity indices are entirely general and do not depend on the fingerprints used. Two types of variance analysis (ANOVA) help to understand the main features of the indices: (i) ANOVA of mean similarity indices; (ii) ANOVA of sum of ranking differences (SRD). Practical aspects and applications of the extended similarity indices are detailed in the accompanying paper: Miranda-Quintana et al. J Cheminform. 2021. https://doi.org/10.1186/s13321-021-00504-4 . Python code for calculating the extended similarity metrics is freely available at: https://github.com/ramirandaq/MultipleComparisons .
