Results 1 - 14 of 14
1.
Drug Discov Today ; 22(2): 377-381, 2017 02.
Article in English | MEDLINE | ID: mdl-27965161

ABSTRACT

The large costs associated with modern drug discovery mean that governments and regulatory bodies need to provide economic incentives to promote the development of orphan drugs (i.e., medicinal products that are designed to treat rare diseases that affect only small numbers of patients). Under European Union (EU) legislation, a medicine can only be authorised for treating a specific rare disease if it is not similar to other orphan drugs already authorised for that particular disease. Here, we discuss the use of 2D fingerprints to calculate the Tanimoto similarity between potential and existing orphan drugs for the same disease, and present logistic regression models correlating these computed similarities with the judgements of human experts.


Subject(s)
Orphan Drug Production , Humans , Legislation, Drug , Molecular Structure
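
As a minimal sketch of the similarity calculation named in the abstract above: the Tanimoto coefficient between two binary fingerprints is the number of shared on-bits divided by the number of distinct on-bits. The fingerprints here are hypothetical sets of bit positions, not output from any real fingerprinting package, and the handling of two empty fingerprints is an assumed convention.

```python
def tanimoto(fp_a: set, fp_b: set) -> float:
    """Tanimoto coefficient for two fingerprints given as sets of on-bit positions."""
    if not fp_a and not fp_b:
        return 0.0  # assumed convention: two empty fingerprints score 0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical on-bit positions for a candidate and an authorised orphan drug.
candidate = {1, 4, 7, 9, 12}
authorised = {1, 4, 8, 9, 15}
print(tanimoto(candidate, authorised))
```

Three bits are shared out of seven distinct bits, so the two hypothetical molecules score 3/7.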
2.
J Chem Inf Model ; 55(2): 214-21, 2015 Feb 23.
Article in English | MEDLINE | ID: mdl-25615712

ABSTRACT

This work describes a genetic algorithm for the calculation of substructural analysis weights for use in ligand-based virtual screening. The algorithm is simple in concept and effective in operation, with simulated virtual screening experiments using the MDDR and WOMBAT data sets showing it to be superior to substructural analysis weights based on a naive Bayesian classifier.


Subject(s)
Algorithms , Genetics , High-Throughput Screening Assays/methods , Area Under Curve , Bayes Theorem , Cyclooxygenase Inhibitors/chemistry , Cyclooxygenase Inhibitors/pharmacology , Databases, Chemical , Ligands , Machine Learning , Renin/antagonists & inhibitors , Structure-Activity Relationship
3.
Mol Inform ; 34(9): 598-607, 2015 09.
Article in English | MEDLINE | ID: mdl-27490711

ABSTRACT

This paper summarises work in chemoinformatics carried out in the Information School of the University of Sheffield during the period 2002-2014. Research studies are described on fingerprint-based similarity searching, data fusion, applications of reduced graphs and pharmacophore mapping, and on the School's teaching in chemoinformatics.


Subject(s)
Computational Biology , Computer Simulation , Databases, Chemical , Universities
4.
J Cheminform ; 6(1): 5, 2014 Feb 01.
Article in English | MEDLINE | ID: mdl-24485002

ABSTRACT

BACKGROUND: In the European Union, medicines are authorised for some rare disease only if they are judged to be dissimilar to authorised orphan drugs for that disease. This paper describes the use of 2D fingerprints to show the extent of the relationship between computed levels of structural similarity for pairs of molecules and expert judgments of the similarities of those pairs. The resulting relationship can be used to provide input to the assessment of new active compounds for which orphan drug authorisation is being sought. RESULTS: 143 experts provided judgments of the similarity or dissimilarity of 100 pairs of drug-like molecules from the DrugBank 3.0 database. The similarities of these pairs were also computed using BCI, Daylight, ECFC4, ECFP4, MDL and Unity 2D fingerprints. Logistic regression analyses demonstrated a strong relationship between the human and computed similarity assessments, with the resulting regression models having significant predictive power in experiments using data from submissions of orphan drug medicines to the European Medicines Agency. The BCI fingerprints performed best overall on the DrugBank dataset while the BCI, Daylight, ECFP4 and Unity fingerprints performed comparably on the European Medicines Agency dataset. CONCLUSIONS: Measures of structural similarity based on 2D fingerprints can provide a useful source of information for the assessment of orphan drug status by regulatory authorities.
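
The kind of model described above can be sketched as a one-variable logistic regression that maps a computed similarity to the probability that experts judge a pair "similar". The data below are invented for illustration and are not the study's data; plain gradient descent is used only to keep the example dependency-free, and the learning rate and epoch count are arbitrary assumptions.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(xs, ys, lr=0.5, epochs=5000):
    """Fit p(similar) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(b0 + b1 * x) - y
            g0 += error
            g1 += error * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def predict(b0, b1, x):
    """Predicted probability that a pair with similarity x is judged similar."""
    return sigmoid(b0 + b1 * x)


# Invented toy data: computed similarity, and 1 if experts judged the pair similar.
similarities = [0.10, 0.25, 0.35, 0.45, 0.55, 0.65, 0.80, 0.90]
judgements = [0, 0, 0, 1, 0, 1, 1, 1]

b0, b1 = fit_logistic(similarities, judgements)
print(predict(b0, b1, 0.2), predict(b0, b1, 0.8))
```

A positive fitted slope means that higher computed similarity raises the predicted probability of a "similar" judgement.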

5.
Bioorg Med Chem ; 20(18): 5366-71, 2012 Sep 15.
Article in English | MEDLINE | ID: mdl-22484008

ABSTRACT

Consensus clustering involves combining multiple clusterings of the same set of objects to achieve a single clustering that will, hopefully, provide a better picture of the groupings that are present in a dataset. This Letter reports the use of consensus clustering methods on sets of chemical compounds represented by 2D fingerprints. Experiments with DUD, IDAlert, MDDR and MUV data suggest that consensus methods are unlikely to result in significant improvements in clustering effectiveness as compared to the use of a single clustering method.


Subject(s)
Cluster Analysis , Databases, Pharmaceutical , Pharmaceutical Preparations/analysis , Pharmaceutical Preparations/chemistry , Molecular Structure
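
One common way to combine several clusterings is a co-association matrix, which records how often each pair of objects lands in the same cluster. The abstract above does not say which consensus methods were tested, so this is an assumed illustration, and the three clusterings are hypothetical.

```python
from itertools import combinations


def co_association(clusterings):
    """Fraction of clusterings in which each pair of objects shares a cluster.

    Each clustering is a list of cluster labels, one label per object.
    """
    n_objects = len(clusterings[0])
    counts = {pair: 0 for pair in combinations(range(n_objects), 2)}
    for labels in clusterings:
        for i, j in counts:
            if labels[i] == labels[j]:
                counts[(i, j)] += 1
    return {pair: c / len(clusterings) for pair, c in counts.items()}


# Three hypothetical clusterings of five compounds (labels are arbitrary).
runs = [
    [0, 0, 1, 1, 2],
    [0, 0, 0, 1, 1],
    [1, 1, 2, 2, 2],
]
scores = co_association(runs)
# Pairs grouped together in a majority of the clusterings.
consensus_pairs = sorted(pair for pair, s in scores.items() if s > 0.5)
print(consensus_pairs)
```

A consensus clustering could then be derived by clustering the objects on these pairwise scores.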
6.
J Cheminform ; 3(1): 29, 2011 Aug 08.
Article in English | MEDLINE | ID: mdl-21824430

ABSTRACT

BACKGROUND: Data fusion methods are widely used in virtual screening, and make the implicit assumption that the more often a molecule is retrieved in multiple similarity searches, the more likely it is to be active. This paper tests the correctness of this assumption. RESULTS: Sets of 25 searches using either the same reference structure and 25 different similarity measures (similarity fusion) or 25 different reference structures and the same similarity measure (group fusion) show that large numbers of unique molecules are retrieved by just a single search, but that the numbers of unique molecules decrease very rapidly as more searches are considered. This rapid decrease is accompanied by a rapid increase in the fraction of those retrieved molecules that are active. There is an approximately log-log relationship between the numbers of different molecules retrieved and the number of searches carried out, and a rationale for this power-law behaviour is provided. CONCLUSIONS: Using multiple searches provides a simple way of increasing the precision of a similarity search, and thus provides a justification for the use of data fusion methods in virtual screening.
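
The counting behind this analysis can be sketched as follows: for each molecule, count how many searches retrieve it near the top of the ranking, then see how the set of retrieved molecules shrinks as more searches are required to agree. The rankings and molecule identifiers are hypothetical.

```python
from collections import Counter


def retrieval_counts(ranked_lists, top_k):
    """Count in how many searches each molecule appears within the top_k ranks."""
    counts = Counter()
    for ranking in ranked_lists:
        counts.update(set(ranking[:top_k]))
    return counts


def retrieved_at_least(counts, min_searches):
    """Molecules retrieved by at least min_searches of the searches."""
    return sorted(m for m, c in counts.items() if c >= min_searches)


# Hypothetical rankings of molecule IDs from three similarity searches.
searches = [
    ["m1", "m2", "m3", "m4"],
    ["m2", "m5", "m1", "m6"],
    ["m2", "m7", "m8", "m1"],
]
counts = retrieval_counts(searches, top_k=3)
for k in (1, 2, 3):
    print(k, retrieved_at_least(counts, k))
```

In this toy case six molecules are retrieved by at least one search, two by at least two, and one by all three, which mirrors the rapid decrease the abstract describes.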

7.
Future Med Chem ; 3(4): 405-14, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21452977

ABSTRACT

BACKGROUND: It has been suggested that similarity searching using 2D fingerprints may not be suitable for scaffold hopping. METHODS: This article reports a detailed evaluation of the effectiveness of six common types of 2D fingerprints when they are used for scaffold-hopping similarity searches of the Molecular Design Limited Drug Data Report database, World of Molecular Bioactivity database and Maximum Unbiased Validation database. RESULTS: The results demonstrate that 2D fingerprints can be used for scaffold hopping, with novel scaffolds being identified in nearly every search that was carried out. The degree of enrichment depends on the structural diversity of the actives that are being sought, with the greatest enrichments often being obtained using the extended connectivity fingerprint encoding circular substructures of diameter four bonds (ECFP4). CONCLUSION: 2D fingerprints provide a simple and computationally efficient way of identifying novel chemotypes in lead-discovery programs.


Subject(s)
Artificial Intelligence , Drug Design , Databases, Factual , Pharmaceutical Preparations/chemistry , Quantitative Structure-Activity Relationship
8.
J Chem Inf Model ; 50(8): 1340-9, 2010 Aug 23.
Article in English | MEDLINE | ID: mdl-20672867

ABSTRACT

This paper discusses the weighting of two-dimensional fingerprints for similarity-based virtual screening, specifically the use of weights that assign greatest importance to the substructural fragments that occur least frequently in the database that is being screened. Virtual screening experiments using the MDL Drug Data Report and World of Molecular Bioactivity databases show that the use of such inverse frequency weighting schemes can result, in some circumstances, in marked increases in screening effectiveness when compared with the use of conventional, unweighted fingerprints. Analysis of the characteristics of the various schemes demonstrates that such weights are best used to weight the fingerprint of the reference structure in a similarity search, with the database structures' fingerprints unweighted. However, the increases in performance resulting from such weights are only observed with structurally homogeneous sets of active molecules; when the actives are diverse, the best results are obtained using conventional, unweighted fingerprints for both the reference structure and the database structures.


Subject(s)
Drug Design , Computer Simulation , Databases, Factual , Molecular Structure
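
A minimal sketch of inverse frequency weighting, assuming an IDF-style weight of log(N / n_i) for a fragment that occurs in n_i of the N database molecules. The abstract above does not give the exact weighting formulas, so this form is an assumption. Following the abstract's finding, only the reference fingerprint is weighted; database fingerprints stay binary. The fingerprints are hypothetical sets of fragment identifiers.

```python
import math


def inverse_frequency_weights(database):
    """Assumed IDF-style weight log(N / n_i) for each fragment i in the database."""
    n_molecules = len(database)
    frequency = {}
    for fingerprint in database:
        for bit in fingerprint:
            frequency[bit] = frequency.get(bit, 0) + 1
    return {bit: math.log(n_molecules / n) for bit, n in frequency.items()}


def weighted_tanimoto(reference, candidate, weights):
    """Continuous Tanimoto with a weighted reference and an unweighted candidate."""
    # Fragments never seen in the database get weight 0 (an assumption).
    ref = {bit: weights.get(bit, 0.0) for bit in reference}
    dot = sum(w for bit, w in ref.items() if bit in candidate)
    ref_sq = sum(w * w for w in ref.values())
    cand_sq = float(len(candidate))  # each on-bit in the candidate has value 1
    denom = ref_sq + cand_sq - dot
    return dot / denom if denom else 0.0


# Hypothetical database: fragment 1 is in every molecule, fragment 3 is rare.
database = [{1, 2, 3}, {1, 2, 4}, {1, 5}, {1, 2, 6}]
weights = inverse_frequency_weights(database)
reference = {1, 2, 3}
shares_rare = weighted_tanimoto(reference, {1, 3, 9}, weights)
shares_common = weighted_tanimoto(reference, {1, 2, 9}, weights)
print(shares_rare, shares_common)
```

The candidate that shares the rare fragment scores higher than the one that shares only a common fragment, which is the intended effect of the weighting.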
9.
J Comput Aided Mol Des ; 23(9): 655-68, 2009 Sep.
Article in English | MEDLINE | ID: mdl-19536456

ABSTRACT

Current systems for similarity-based virtual screening use similarity measures in which all the fragments in a fingerprint contribute equally to the calculation of structural similarity. This paper discusses the weighting of fragments on the basis of their frequencies of occurrence in molecules. Extensive experiments with sets of active molecules from the MDL Drug Data Report and the World of Molecular Bioactivity databases, using fingerprints encoding Tripos holograms, Pipeline Pilot ECFC_4 circular substructures and Sunset Molecular keys, demonstrate clearly that frequency-based screening is generally more effective than conventional, unweighted screening. The results suggest that standardising the raw occurrence frequencies by taking the square root of the frequencies will maximise the effectiveness of virtual screening. An upper-bound analysis shows the complex interactions that can take place between representations, weighting schemes and similarity coefficients when similarity measures are computed, and provides a rationalisation of the relative performance of the various weighting schemes.


Subject(s)
Drug Discovery/methods , Molecular Structure , Algorithms , Databases, Factual , Holography/methods , Pharmaceutical Preparations/chemistry , Pharmaceutical Preparations/classification , Quantitative Structure-Activity Relationship
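
The square-root standardisation mentioned above can be sketched with the Tanimoto coefficient for non-negative vectors, here applied to fragment occurrence counts before and after taking square roots. The fragment names and counts are hypothetical, and the paper's exact similarity coefficients are not reproduced.

```python
import math


def sqrt_standardise(counts):
    """Replace each raw occurrence count by its square root."""
    return {fragment: math.sqrt(c) for fragment, c in counts.items()}


def tanimoto_vectors(a, b):
    """Tanimoto coefficient for non-negative fragment vectors stored as dicts."""
    dot = sum(value * b.get(fragment, 0.0) for fragment, value in a.items())
    denom = sum(v * v for v in a.values()) + sum(v * v for v in b.values()) - dot
    return dot / denom if denom else 0.0


# Hypothetical fragment occurrence counts for a reference and a database molecule.
reference = {"C-C": 9, "C=O": 1, "C-N": 4}
candidate = {"C-C": 4, "C=O": 1}

raw = tanimoto_vectors(reference, candidate)
standardised = tanimoto_vectors(sqrt_standardise(reference), sqrt_standardise(candidate))
print(raw, standardised)
```

Taking square roots damps the influence of very frequent fragments such as the C-C count here, so the score depends less on a single dominant fragment.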
10.
J Chem Inf Model ; 49(2): 155-61, 2009 Feb.
Article in English | MEDLINE | ID: mdl-19434820

ABSTRACT

Standardization is used to ensure that the variables in a similarity calculation make an equal contribution to the computed similarity value. This paper compares the use of seven different methods that have been suggested previously for the standardization of integer-valued or real-valued data, comparing the results with unstandardized data. Sets of structures from the MDL Drug Data Report and IDAlert databases and represented by Pipeline Pilot physicochemical parameters, molecular holograms and Molconn-Z parameters are clustered using the k-means and Ward's clustering methods. The resulting classifications are evaluated in terms of the degree of clustering of active compounds selected from eleven different biological activity classes, with these classes also being used in similarity searches. It is shown that there is no consistent pattern when the various standardization methods are ranked in order of decreasing effectiveness and that there is no obvious performance benefit (when compared to unstandardized data) that is likely to be obtained from the use of any particular standardization method.


Subject(s)
Molecular Structure , Cluster Analysis
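
Two widely used standardization methods, z-scores and range scaling, are sketched below. The abstract above does not list the seven methods compared, so these two are assumed examples rather than a reproduction of the study; the descriptor values are hypothetical.

```python
import statistics


def z_score(values):
    """Centre on the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd if sd else 0.0 for v in values]


def range_scale(values):
    """Rescale linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v in values]


# Hypothetical descriptor column (e.g. molecular weight) for four structures.
mol_weight = [180.0, 250.0, 320.0, 410.0]
print(range_scale(mol_weight))
print(z_score(mol_weight))
```

Either transform puts descriptors measured on very different scales on a comparable footing before distances or similarities are computed.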
11.
Bioinformation ; 1(7): 237-41, 2006 Nov 14.
Article in English | MEDLINE | ID: mdl-17597897

ABSTRACT

Peptides are of great therapeutic potential as vaccines and drugs. Knowledge of physicochemical descriptors, including the partition coefficient logP, is useful for the development of predictive Quantitative Structure-Activity Relationships (QSARs). We have investigated the accuracy of available programs for the prediction of logP values for peptides with known experimental values obtained from the literature. Eight prediction programs were tested, of which seven programs were fragment-based methods: XLogP, LogKow, PLogP, ACDLogP, ALogP, Interactive Analysis's LogP (IALogP) and MLogP; and one program used a whole molecule approach: QikProp. The predictive accuracy of the programs was assessed using r² values, with ALogP being the most effective (r² = 0.822) and MLogP the least (r² = 0.090). We also examined three distinct types of peptide structure: blocked, unblocked, and cyclic. For each study (all peptides, blocked, unblocked and cyclic peptides) the performance of programs rated from best to worst is as follows: all peptides - ALogP, QikProp, PLogP, XLogP, IALogP, LogKow, ACDLogP, and MLogP; blocked peptides - PLogP, XLogP, ACDLogP, IALogP, LogKow, QikProp, ALogP, and MLogP; unblocked peptides - QikProp, IALogP, ALogP, ACDLogP, MLogP, XLogP, LogKow and PLogP; cyclic peptides - LogKow, ALogP, XLogP, MLogP, QikProp, ACDLogP, IALogP. In summary, all programs gave better predictions for blocked peptides, while, in general, logP values for cyclic peptides were under-predicted and those of unblocked peptides were over-predicted.
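
As a sketch of the accuracy measure, r² is computed below as the squared Pearson correlation between predicted and experimental values; the abstract does not state which definition was used, so that is an assumption. The logP values are invented for illustration.

```python
def r_squared(predicted, observed):
    """Squared Pearson correlation between predicted and observed values."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_o = sum(observed) / n
    cov = sum((p - mean_p) * (o - mean_o) for p, o in zip(predicted, observed))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    return cov * cov / (var_p * var_o)


# Invented values: every prediction is 0.5 log units too high.
experimental = [-1.0, 0.0, 1.0, 2.0]
predicted = [-0.5, 0.5, 1.5, 2.5]
print(r_squared(predicted, experimental))
```

Under this definition a program that consistently over-predicts by a fixed amount still achieves a perfect r², so systematic over- or under-prediction has to be checked separately from the correlation.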

12.
J Chem Inf Comput Sci ; 44(3): 894-902, 2004.
Article in English | MEDLINE | ID: mdl-15154754

ABSTRACT

This paper evaluates the use of the fuzzy k-means clustering method for the clustering of files of 2D chemical structures. Simulated property prediction experiments with the Starlist file of logP values demonstrate that use of the fuzzy k-means method can, in some cases, yield results that are superior to those obtained with the conventional k-means method and with Ward's clustering method. Clustering of several small sets of agrochemical compounds demonstrates the ability of the fuzzy k-means method to highlight multicluster membership and to identify outlier compounds, although the former can be difficult to interpret in some cases.
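
The multicluster membership mentioned above comes from the membership update of fuzzy k-means (also known as fuzzy c-means). The sketch below computes one object's memberships from its distances to the cluster centres using the standard update rule; the fuzzifier m = 2 and the distances are illustrative assumptions, not values from the paper.

```python
def fuzzy_memberships(distances, m=2.0):
    """Memberships of one object in each cluster, given distances to the centres.

    Standard update: u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), with m > 1.
    """
    # An object sitting exactly on a centre belongs wholly to that cluster
    # (the first such centre, if several coincide).
    for i, d in enumerate(distances):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(distances))]
    power = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_j) ** power for d_j in distances)
        for d_i in distances
    ]


# Hypothetical distances from one compound to three cluster centres.
memberships = fuzzy_memberships([1.0, 2.0, 4.0])
print(memberships)
```

The memberships sum to one, and a compound with no dominant membership is a candidate for multicluster membership or outlier status.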

13.
J Chem Inf Comput Sci ; 43(3): 819-28, 2003.
Article in English | MEDLINE | ID: mdl-12767139

ABSTRACT

We discuss the size-bias inherent in several chemical similarity coefficients when used for the similarity searching or diversity selection of compound collections. Limits to the upper bounds of 14 standard similarity coefficients are investigated, and the results are used to identify some exceptional characteristics of a few of the coefficients. An additional numerical contribution to the known size bias in the Tanimoto coefficient is identified. Graphical plots with respect to relative bit density are introduced to further assess the coefficients. Our methods reveal the asymmetries inherent in most similarity coefficients that lead to bias in selection, most notably with the Forbes and Russell-Rao coefficients. Conversely, when applied to the recently introduced Modified Tanimoto coefficient our methods provide support for the view that it is less biased toward molecular size than most. In this work we focus our discussion on fragment-based bit strings, but we demonstrate how our approach can be generalized to continuous representations.
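
Size bias can be illustrated with the standard definitions of three of the coefficients named above, written in terms of the number of bits set in each molecule (a and b), the number set in both (c) and the fingerprint length (n). This is a sketch of the general idea, not the paper's analysis, and the bit counts are hypothetical.

```python
def tanimoto(a, b, c):
    """Tanimoto: common bits over distinct bits set in either molecule."""
    return c / (a + b - c)


def russell_rao(c, n):
    """Russell-Rao: common bits over fingerprint length."""
    return c / n


def forbes(a, b, c, n):
    """Forbes: n * c / (a * b)."""
    return n * c / (a * b)


def max_tanimoto(a, b):
    """Upper bound, reached when the smaller bit set lies inside the larger."""
    return min(a, b) / max(a, b)


# Hypothetical reference with 60 bits set in a 1024-bit fingerprint.
n_bits, ref_bits = 1024, 60
for cand_bits in (20, 60, 200):
    best_overlap = min(ref_bits, cand_bits)
    print(
        cand_bits,
        max_tanimoto(ref_bits, cand_bits),
        russell_rao(best_overlap, n_bits),
        forbes(ref_bits, cand_bits, best_overlap, n_bits),
    )
```

Even with the best possible overlap, the Tanimoto bound falls as the two molecules differ in size, the Russell-Rao bound never penalises a larger candidate, and the Forbes bound is highest for the smallest candidate.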

14.
J Chem Inf Comput Sci ; 43(2): 406-11, 2003.
Article in English | MEDLINE | ID: mdl-12653502

ABSTRACT

This paper discusses the calculation of the similarities between pairs of substituents on ring systems. An R-group descriptor characterizes the distribution of some atom-based property, such as elemental type or partial atomic charge, at increasing numbers of bonds distant from the point of substitution on the parent ring. The similarity between a pair of descriptors is then calculated by a comparison of the corresponding property vectors. Experiments with the BIOSTER database demonstrate the ability of such similarity measures to discriminate between bioisosteric and nonbioisosteric functional groups.
