Results 1 - 20 of 37
1.
J Viral Hepat ; 20(12): 882-9, 2013 Dec.
Article in English | MEDLINE | ID: mdl-24304458

ABSTRACT

Human APOBEC3 (A3) cytosine deaminases are antiviral restriction factors capable of editing the genome of the hepatitis B virus (HBV). Despite the importance of the human A3 protein family for the innate immune response, little is known about its clinical relevance for hepatitis B. The aim of this study was to use ultra-deep pyrosequencing (UDPS) data to analyse G-to-A hypermutation of the complete HBV genome and to relate it to fundamental characteristics of patients with chronic hepatitis B. By analysing the viral population of 80 treatment-naïve patients (47 HBeAg-positive and 33 HBeAg-negative), we identified an unequal distribution of G-to-A hypermutations across the genome. Our data indicate that G-to-A hypermutation occurs predominantly in a region between nucleotide positions 600 and 1800, a region which is usually single-stranded in matured HBV particles. This implies that A3 likely edits HBV in the virion. Hypermutation rates for HBeAg-negative patients were more than 10-fold higher than those of HBeAg-positive patients. For HBeAg-negative patients, higher hypermutation rates were significantly associated with the degree of fibrosis. Additionally, we found that for HBeAg-positive chronic hepatitis, G-to-A hypermutation rates were significantly associated with the relative prevalence of the G1764A mutation, which is related to HBeAg seroconversion. In total, our data imply an important association of hypermutation mediated by A3 deaminases with the natural progression of chronic hepatitis B infections, both in terms of HBeAg seroconversion and disease progression towards cirrhosis.
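
As a rough illustration of the quantity discussed in this abstract, a G-to-A hypermutation rate can be tallied from a read aligned to a reference. The sketch below assumes the two strings are already aligned to equal length with '-' marking gaps. The function name and toy sequences are hypothetical and are not taken from the study's UDPS pipeline.

```python
def g_to_a_rate(reference: str, read: str) -> float:
    """Fraction of reference G positions that appear as A in the aligned read."""
    if len(reference) != len(read):
        raise ValueError("reference and read must be aligned to equal length")
    g_sites = 0   # reference G positions covered by the read
    g_to_a = 0    # of those, positions where the read shows A
    for ref_base, read_base in zip(reference.upper(), read.upper()):
        if ref_base != "G" or read_base == "-":
            continue  # only score covered G positions
        g_sites += 1
        if read_base == "A":
            g_to_a += 1
    return g_to_a / g_sites if g_sites else 0.0


# Toy, made-up sequences: 5 reference G positions, 2 of them read as A.
reference = "AGGCTGAGTG"
read = "AAGCTAAGTG"
rate = g_to_a_rate(reference, read)
```

Restricting the same count to a window such as positions 600 to 1800 would give a per-region rate, which is the kind of comparison the abstract describes.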


Subject(s)
Hepatitis B virus/genetics , Hepatitis B virus/pathogenicity , Hepatitis B, Chronic/pathology , Hepatitis B, Chronic/virology , Mutation , Adolescent , Adult , Aged , DNA, Viral , Disease Progression , Female , Hepatitis B Surface Antigens/blood , High-Throughput Nucleotide Sequencing , Humans , Liver Cirrhosis/pathology , Liver Cirrhosis/virology , Male , Middle Aged , Molecular Sequence Data , Young Adult
2.
Article in German | MEDLINE | ID: mdl-24170077

ABSTRACT

Medicine is experiencing a period of change: Extensive molecular biological data on the patient are increasingly included in diagnosis and treatment. This trend is based on the development of targeted drugs and accompanying diagnostics, which serve the purpose of providing advance evidence that the medication promises therapy success for the patient. According to this concept, drugs are often given in combination. The sizes of patient groups for which a given therapy out of many possible alternatives can be expected to be successful are quite limited. The relationship between the molecular data pertaining to a patient and their disease phenotype is complex and cannot be determined manually. Thus, computer-based bioinformatics methods play a central role in interpreting the molecular data and as an instrument for providing recommendations for the practicing physician. Bioinformatics is an essential component in basic research, in the development of new concepts for diagnosis and therapy as well as in clinical practice, in which these concepts are applied to treating patients. This article discusses the role of bioinformatics in both basic research and clinical practice. We present the example of treatment of HIV patients, for which bioinformatics-assisted therapy selection has already entered clinical practice. Such a therapy concept is also well suited to other diseases (e.g., cancer). The article concludes with remarks on the prerequisites for society as a whole for ensuring success of this concept of personalised medicine as a factor of medical progress.


Subject(s)
Computational Biology/methods , Drug Design , Molecular Diagnostic Techniques/methods , Molecular Targeted Therapy/methods , Pharmacogenetics/methods , Precision Medicine/methods , Humans
3.
HIV Med ; 12(4): 211-8, 2011 Apr.
Article in English | MEDLINE | ID: mdl-20731728

ABSTRACT

OBJECTIVES: The EuResist expert system is a novel data-driven online system for computing the probability of 8-week success for any given pair of HIV-1 genotype and combination antiretroviral therapy regimen plus optional patient information. The objective of this study was to compare the EuResist system vs. human experts (EVE) for the ability to predict response to treatment. METHODS: The EuResist system was compared with 10 HIV-1 drug resistance experts for the ability to predict 8-week response to 25 treatment cases derived from the EuResist database validation data set. All current and past patient data were made available to simulate clinical practice. The experts were asked to provide a qualitative and quantitative estimate of the probability of treatment success. RESULTS: There were 15 treatment successes and 10 treatment failures. In the classification task, the number of mislabelled cases was six for EuResist and 6-13 for the human experts [mean±standard deviation (SD) 9.1±1.9]. The accuracy of EuResist was higher than the average for the experts (0.76 vs. 0.64, respectively). The quantitative estimates computed by EuResist were significantly correlated (Pearson r=0.695, P<0.0001) with the mean quantitative estimates provided by the experts. However, the agreement among experts was only moderate (for the classification task, inter-rater κ=0.355; for the quantitative estimation, mean±SD coefficient of variation=55.9±22.4%). CONCLUSIONS: With this limited data set, the EuResist engine performed comparably to or better than human experts. The system warrants further investigation as a treatment-decision support tool in clinical practice.
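
The reported accuracies follow directly from the error counts given in the abstract. The short check below recomputes them. The variable and function names are illustrative only.

```python
TOTAL_CASES = 25          # treatment cases in the comparison
EURESIST_ERRORS = 6       # mislabelled cases reported for EuResist
EXPERT_MEAN_ERRORS = 9.1  # mean mislabelled cases across the 10 experts


def accuracy(errors: float, total: int) -> float:
    """Fraction of correctly labelled cases."""
    return (total - errors) / total


euresist_acc = accuracy(EURESIST_ERRORS, TOTAL_CASES)   # 19 / 25
expert_acc = accuracy(EXPERT_MEAN_ERRORS, TOTAL_CASES)  # 15.9 / 25
print(f"{euresist_acc:.2f} {expert_acc:.2f}")  # prints "0.76 0.64"
```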


Subject(s)
Expert Systems , HIV Infections/drug therapy , HIV-1/drug effects , Databases, Factual , Female , HIV Infections/genetics , HIV Infections/virology , HIV-1/genetics , Humans , Male , Probability , Treatment Outcome , Viral Load
4.
J Viral Hepat ; 17(3): 217-21, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19758279

ABSTRACT

The mechanisms of synergy in antiviral activity of interferon-alpha and ribavirin in treating chronic hepatitis C virus (HCV) infection are still unknown. Interferon-alpha indirectly induces cleavage of viral RNA by RNase L at UU/UA dinucleotides. There is evidence that HCV genomes with a higher number of UU/UA dinucleotides are more sensitive to interferon-alpha. As a guanosine analogue, ribavirin exerts a mutagenic effect promoting G-to-A and C-to-U transitions. This study investigates whether ribavirin-induced mutagenesis causes a higher frequency of UU/UA dinucleotides in the viral progeny sequences. Increased mutational frequencies in favour of G-to-A and C-to-U transitions during ribavirin treatment were reported by Hofmann et al. (Gastroenterology 2007;132:921-930). Overall, 937 nucleotide sequences from that publication were reanalysed for RNase L cleavage sites. These included HCV NS3 quasispecies from three patients with ribavirin monotherapy; NS5B quasispecies from patients who received ribavirin alone (n = 7) or in combination with interferon-alpha (n = 7) at baseline and during treatment; and NS5B quasispecies from a subgenomic HCV replicon system after 24, 48 and 72 h of cultivation with or without ribavirin or with levovirin. For NS3 quasispecies during ribavirin monotherapy and NS5B quasispecies from patients who received ribavirin alone or in combination with interferon-alpha, analysis of RNase L cleavage sites did not reveal changes during treatment or differences between treatment regimens. Similarly, RNase L cleavage sites from NS5B quasispecies of the HCV replicon did not differ significantly between time points or treatments. In conclusion, ribavirin-induced mutagenesis did not increase RNase L cleavage sites (UU/UA dinucleotides) within the HCV NS3 or NS5B encoding regions.
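
Counting potential RNase L cleavage sites amounts to counting UU and UA dinucleotides in a sequence. The sketch below is one simple way to do that. It assumes overlapping dinucleotides are each counted and that cDNA input (with T) should be read as RNA. The abstract does not state the study's exact counting convention, and the function name is hypothetical.

```python
def count_rnase_l_sites(sequence: str) -> int:
    """Count UU and UA dinucleotides, allowing overlaps (e.g. 'UUU' has two)."""
    rna = sequence.upper().replace("T", "U")  # accept cDNA as well as RNA
    return sum(1 for i in range(len(rna) - 1) if rna[i:i + 2] in ("UU", "UA"))


# Toy sequences, not real HCV data.
n_sites = count_rnase_l_sites("AUUUAGC")   # UU, UU, UA
n_none = count_rnase_l_sites("GGCC")
n_cdna = count_rnase_l_sites("ATTA")       # read as AUUA: UU, UA
```

Comparing such counts between baseline and on-treatment sequences is the kind of analysis the abstract reports.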


Subject(s)
Antiviral Agents/therapeutic use , Endoribonucleases/metabolism , Hepacivirus/drug effects , Hepatitis C, Chronic/drug therapy , Interferon-alpha/therapeutic use , Polyethylene Glycols/therapeutic use , Ribavirin/therapeutic use , Antiviral Agents/pharmacology , Binding Sites , Cell Line , Genome, Viral , Humans , Interferon alpha-2 , Point Mutation , Recombinant Proteins , Ribavirin/pharmacology , Selection, Genetic , Viral Nonstructural Proteins/genetics , Virus Cultivation
5.
Eur J Med Res ; 12(9): 453-62, 2007 Oct 15.
Article in English | MEDLINE | ID: mdl-17933727

ABSTRACT

HIV infects target cells by binding of its envelope gp120 protein to CD4 and a coreceptor on the cell surface. In vivo, the different HIV strains use either CCR5 or CXCR4 as coreceptor. CCR5-using strains are named R5 viruses, while CXCR4-using strains are named X4. X4 viruses usually occur in the later stages. Coreceptor usage is a marker for disease progression. Additionally, interest in coreceptors continues to rise as a consequence of the development of a new class of antiretroviral drugs, namely the coreceptor antagonists or blockers. These specific drugs block the CCR5 or the CXCR4 coreceptors. So far, the CXCR4 blockers are not approved for use in clinical practice due to their severe side effects. On the other hand, CCR5 blockers are currently in clinical practice, although they can only be administered after a baseline determination of the coreceptor usage of the predominant viral strain. Most of the coreceptor analyses in clinical cohorts have been performed with commercially available phenotypic assays. As for resistance testing of NRTIs, NNRTIs and PIs, efforts have also been made to predict the coreceptor usage from the genotype of the viruses. Different rules have been published based on the amino acid sequence of the Env-V3 region of HIV-gp120, which is known to be the major determinant of coreceptor usage. Among these, the most widely used is the 11/25 rule. Recently, bioinformatics-driven prediction systems have been developed. Three of the interpretation systems are freely available via the internet: WetCat, WebPSSM, geno2pheno[coreceptor]. All three systems focus on the Env-V3 region and take only the amino acid sequence into account. They learn from phenotypic and corresponding genotypic data. So far, two cohorts have been analyzed with such a genotypic approach and provided frequencies of R5 virus strains that are within the range of those reported with phenotypic assays. For one of the systems, geno2pheno[coreceptor], additional clinical data (e.g. CD4+ T-cell counts) or structural information can be used to improve the prediction. Such genotypic systems provide the possibility for rapid screening of patients who may be given CCR5 blockers like the recently licensed maraviroc.
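
The abstract names the 11/25 rule without spelling it out. As commonly stated, the rule predicts CXCR4 usage when a positively charged residue (arginine or lysine) occupies position 11 or 25 of the V3 loop. The sketch below applies that reading. It assumes the input is an ungapped V3 amino acid sequence already in standard numbering. The function name and the example strings are illustrative, and real sequences would need alignment first.

```python
BASIC_RESIDUES = {"R", "K"}  # arginine, lysine


def predict_coreceptor_11_25(v3: str) -> str:
    """Return 'X4' if V3 position 11 or 25 (1-based) is basic, else 'R5'."""
    if len(v3) < 25:
        raise ValueError("need at least 25 V3 residues in standard numbering")
    pos11 = v3[10].upper()  # 1-based position 11
    pos25 = v3[24].upper()  # 1-based position 25
    if pos11 in BASIC_RESIDUES or pos25 in BASIC_RESIDUES:
        return "X4"
    return "R5"


# Illustrative 35-residue V3-like string (S at position 11, E at position 25).
v3_example = "CTRPNNNTRKSIHIGPGRAFYTTGEIIGDIRQAHC"
# Hypothetical variant with arginine substituted at position 11.
v3_variant = v3_example[:10] + "R" + v3_example[11:]

call_example = predict_coreceptor_11_25(v3_example)
call_variant = predict_coreceptor_11_25(v3_variant)
```

The learned systems named above go beyond such a two-position rule by training on paired genotypic and phenotypic data.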


Subject(s)
Anti-HIV Agents/therapeutic use , Drug Resistance, Viral/genetics , HIV Infections/drug therapy , HIV-1/drug effects , Receptors, CCR5/genetics , Receptors, CXCR4/genetics , Software , Amino Acid Sequence , Anti-HIV Agents/pharmacology , CCR5 Receptor Antagonists , HIV Envelope Protein gp120/chemistry , HIV Envelope Protein gp120/genetics , HIV Envelope Protein gp120/physiology , HIV Infections/virology , HIV-1/classification , HIV-1/genetics , Humans , Molecular Sequence Data , Prognosis , Receptors, CCR5/physiology , Receptors, CXCR4/antagonists & inhibitors , Receptors, CXCR4/physiology
6.
J Viral Hepat ; 14(5): 338-49, 2007 May.
Article in English | MEDLINE | ID: mdl-17439523

ABSTRACT

Chronic hepatitis C is a major cause of liver cirrhosis leading to chronic liver failure and hepatocellular carcinoma. Different hepatitis C virus (HCV) proteins have been associated with resistance to interferon-alpha-based therapy. However, the exact mechanisms of virus-mediated interferon resistance are not completely understood. The importance of amino acid (aa) variations within the HCV nonstructural (NS)4B protein for replication efficiency and viral decline during therapy is unknown. We investigated pretreatment sera from 42 patients with known outcome of interferon-based therapy. The complete NS4B gene was amplified and sequenced. Mutational analyses of predicted conformational, functional, structural and phylogenetic properties of the deduced aa sequences were performed. The complete NS4B protein was highly conserved with a median frequency of 0.015 ± 0.009 aa exchanges (median ± SD, 4.00 ± 2.31). Especially within the predicted transmembrane domains of the NS4B protein, the mean number of aa variations was low (median frequency, 0.013 ± 0.013). Neither the number of aa variations nor specific aa exchanges were correlated with HCV RNA serum concentration at baseline. A rapid initial HCV RNA decline of ≥1.5 log10 IU/mL at week 2 of interferon-based therapy was associated with a higher frequency of nonconservative aa exchanges within the complete NS4B protein in comparison with patients with a nonrapid HCV RNA decline (median frequency, 0.011 ± 0.005 vs 0.004 ± 0.003, P = 0.006). Overall, the aa sequence of the NS4B protein was highly conserved, indicating an important role for replication in vivo. Amino acid variations with relevant changes of physicochemical properties may influence replication efficiency, associated with a rapid early virological response.


Subject(s)
Amino Acid Sequence , Antiviral Agents/therapeutic use , Hepatitis C, Chronic/drug therapy , Interferon-alpha/therapeutic use , Viral Nonstructural Proteins/genetics , Adult , Aged , Amino Acid Substitution , Amino Acids , Consensus Sequence , DNA Mutational Analysis , Genotype , Hepatitis C, Chronic/classification , Hepatitis C, Chronic/epidemiology , Hepatitis C, Chronic/genetics , Hepatitis C, Chronic/virology , Humans , Kinetics , Male , Middle Aged , Molecular Sequence Data , Nucleic Acid Amplification Techniques , Phylogeny , Protein Conformation , Protein Structure, Tertiary , RNA, Viral/blood , Sequence Analysis, DNA , Sequence Homology, Amino Acid , Treatment Outcome , Viral Nonstructural Proteins/chemistry , Viral Nonstructural Proteins/classification , White People/statistics & numerical data
7.
Bioinformatics ; 20(5): 770-6, 2004 Mar 22.
Article in English | MEDLINE | ID: mdl-14751994

ABSTRACT

MOTIVATION: We introduce a new approach to using the information contained in sequence-to-function prediction data in order to recognize protein template classes, a critical step in predicting protein structure. The data on which our method is based comprise probabilities of functional categories; for given query sequences these probabilities are obtained by a neural net that has previously been trained on a variety of functionally important features. On a training set of sequences we assess the relevance of individual functional categories for identifying a given structural family. Using a combination of the most relevant categories, the likelihood that a query sequence belongs to a specific family can be estimated. RESULTS: The performance of the method is evaluated using cross-validation. For a fixed structural family and for every sequence, a score is calculated that measures the evidence for family membership. Even for structural families of small size, family members receive significantly higher scores. For some examples, we show that the relevant functional features identified by this method are biologically meaningful. The proposed approach can be used to improve existing sequence-to-structure prediction methods. AVAILABILITY: Matlab code is available on request from the authors. The data are available at http://www.mpisb.mpg.de/~sommer/Fun2Struc/


Subject(s)
Algorithms , Artificial Intelligence , Proteins/chemistry , Proteins/metabolism , Sequence Alignment/methods , Sequence Analysis, Protein/methods , Amino Acid Sequence , Molecular Sequence Data , Pattern Recognition, Automated , Proteins/classification , Sequence Homology, Amino Acid , Software , Structure-Activity Relationship
8.
Bioinformatics ; 20(2): 268-70, 2004 Jan 22.
Article in English | MEDLINE | ID: mdl-14734319

ABSTRACT

SUMMARY: The Helmholtz Network for Bioinformatics (HNB) is a joint venture of eleven German bioinformatics research groups that offers convenient access to numerous bioinformatics resources through a single web portal. The 'Guided Solution Finder' which is available through the HNB portal helps users to locate the appropriate resources to answer their queries by employing a detailed, tree-like questionnaire. Furthermore, automated complex tool cascades ('tasks'), involving resources located on different servers, have been implemented, allowing users to perform comprehensive data analyses without the requirement of further manual intervention for data transfer and re-formatting. Currently, automated cascades for the analysis of regulatory DNA segments as well as for the prediction of protein functional properties are provided. AVAILABILITY: The HNB portal is available at http://www.hnbioinfo.de


Subject(s)
Algorithms , Computational Biology/methods , Database Management Systems , Information Storage and Retrieval/methods , Internet , Sequence Analysis, DNA/methods , Sequence Analysis, Protein/methods , User-Computer Interface , Computational Biology/organization & administration , Germany , Interinstitutional Relations , Software
9.
Bioinformatics ; 17 Suppl 1: S323-31, 2001.
Article in English | MEDLINE | ID: mdl-11473024

ABSTRACT

Microarrays measure values that are approximately proportional to the numbers of copies of different mRNA molecules in samples. Due to technical difficulties, the constant of proportionality between the measured intensities and the numbers of mRNA copies per cell is unknown and may vary for different arrays. Usually, the data are normalized (i.e., array-wise multiplied by appropriate factors) in order to compensate for this effect and to enable informative comparisons between different experiments. Centralization is a new two-step method for the computation of such normalization factors that is both biologically better motivated and more robust than standard approaches. First, for each pair of arrays the quotient of the constants of proportionality is estimated. Second, from the resulting matrix of pairwise quotients an optimally consistent scaling of the samples is computed.
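
The two steps described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the published Centralization estimator. Step 1 is approximated here by the median of gene-wise intensity ratios, and step 2 by row means of the log-quotient matrix, which is a simple least-squares choice with the geometric mean of the scales fixed to 1. All names and the toy data are hypothetical.

```python
import math
from statistics import median


def pairwise_quotient(a, b):
    """Step 1 (assumed estimator): median of gene-wise ratios a/b."""
    return median(x / y for x, y in zip(a, b) if x > 0 and y > 0)


def normalization_factors(arrays):
    """Step 2: derive one consistent scale per array from all pairwise quotients."""
    n = len(arrays)
    log_q = [
        [math.log(pairwise_quotient(arrays[i], arrays[j])) for j in range(n)]
        for i in range(n)
    ]
    # Row mean of log quotients = log scale of array i relative to the
    # geometric mean of all arrays.
    log_scale = [sum(row) / n for row in log_q]
    # Multiplying array i by exp(-log_scale[i]) removes its scale.
    return [math.exp(-s) for s in log_scale]


# Toy data: the same expression profile measured at scales 1x, 2x and 4x.
base = [10.0, 20.0, 40.0, 80.0]
arrays = [base, [2 * x for x in base], [4 * x for x in base]]
factors = normalization_factors(arrays)
normalized = [[f * x for x in arr] for f, arr in zip(factors, arrays)]
```

In this toy case the three normalized arrays coincide, because the only difference between them was the constant of proportionality.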


Subject(s)
Computational Biology , Gene Expression Profiling/statistics & numerical data , Oligonucleotide Array Sequence Analysis/statistics & numerical data , Genetic Techniques/statistics & numerical data , Models, Genetic , RNA, Messenger/genetics , RNA, Messenger/metabolism
10.
Acta Crystallogr A ; 57(Pt 4): 442-50, 2001 Jul.
Article in English | MEDLINE | ID: mdl-11418755

ABSTRACT

The ever increasing number of experimentally resolved crystal structures supports the possibility of fully empirical crystal structure prediction for small organic molecules. Empirical methods promise to be significantly more efficient than methods that attempt to solve the same problem from first principles. However, the transformation from data to empirical knowledge and further to functional algorithms is not trivial and the usefulness of the result depends strongly on the quantity and the quality of the data. In this work, a simple scoring function is parameterized to discriminate between the correct structure and a set of decoys for a large number of different molecular systems. The method is fully automatic and has the advantage that the complete scoring function is parameterized at once, leading to a self-consistent set of parameters. The obtained scoring function is tested on an independent set of crystal structures taken from the P1 and P1̄ space groups. With the trained scoring function and FlexCryst, a program for small-molecule crystal structure prediction, it is shown that approximately 73% of the 239 tested molecules in space group P1 are predicted correctly. For the more complex space group P1̄, the success rate is 26%. Comparison with force-field potentials indicates the physical content of the obtained scoring function, a result of direct importance for protein threading where such database-based potentials are being applied.

11.
J Mol Biol ; 308(2): 377-95, 2001 Apr 27.
Article in English | MEDLINE | ID: mdl-11327774

ABSTRACT

Side-chain or even backbone adjustments upon docking of different ligands to the same protein structure, a phenomenon known as induced fit, are frequently observed. Sometimes point mutations within the active site influence the ligand binding of proteins. Furthermore, for homology derived protein structures there are often ambiguities in side-chain placement and uncertainties in loop modeling which may be critical for docking applications. Nevertheless, only very few molecular docking approaches have taken into account such variations in protein structures. We present the new software tool FlexE which addresses the problem of protein structure variations during docking calculations. FlexE can dock flexible ligands into an ensemble of protein structures which represents the flexibility, point mutations, or alternative models of a protein. The FlexE approach is based on a united protein description generated from the superimposed structures of the ensemble. For varying parts of the protein, discrete alternative conformations are explicitly taken into account, which can be combinatorially joined to create new valid protein structures. FlexE was evaluated using ten protein structure ensembles containing 105 crystal structures from the PDB and one modeled structure with 60 ligands in total. For 50 ligands (83%) FlexE finds a placement with an RMSD to the crystal structure below 2.0 Å. In all cases our results are of similar quality to the best solution obtained by sequentially docking the ligands into all protein structures (cross docking). In most cases the computing time is significantly lower than the accumulated run times for the single structures. FlexE takes about five and a half minutes on average for placing one ligand into the united protein description on a common workstation. The example of the aldose reductase demonstrates the necessity of considering protein structure variations for docking calculations. We docked three potent inhibitors into four protein structures with substantial conformational changes within the active site. Using only one rigid protein structure for screening would have missed potential inhibitors whereas all inhibitors can be docked taking all protein structures into account.
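
The success criterion used here, an RMSD below 2.0 Å between the docked placement and the crystal structure, can be computed as follows. The sketch assumes both poses list the same atoms in the same order and share the protein's coordinate frame, so no superposition is applied. The function name and coordinates are made up for illustration.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy ligand: the docked pose is the crystal pose shifted by 1 Å along x.
crystal = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked = [(1.0, 0.0, 0.0), (2.5, 0.0, 0.0), (2.5, 1.5, 0.0)]

deviation = rmsd(crystal, docked)
is_success = deviation < 2.0  # threshold quoted in the abstract
```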


Subject(s)
Computer Simulation , Proteins/chemistry , Proteins/metabolism , Software , Aldehyde Reductase/antagonists & inhibitors , Aldehyde Reductase/chemistry , Aldehyde Reductase/metabolism , Algorithms , Animals , Binding Sites , Crystallography, X-Ray , Drug Design , Enzyme Inhibitors/chemistry , Enzyme Inhibitors/metabolism , Folic Acid/analogs & derivatives , Folic Acid/chemistry , Folic Acid/metabolism , Folic Acid Antagonists/chemistry , Folic Acid Antagonists/metabolism , Humans , Internet , Ligands , Methotrexate/chemistry , Methotrexate/metabolism , Models, Molecular , Pliability , Point Mutation/genetics , Protein Binding , Protein Conformation , Proteins/antagonists & inhibitors , Proteins/genetics , Tetrahydrofolate Dehydrogenase/chemistry , Tetrahydrofolate Dehydrogenase/metabolism , Time Factors
12.
J Comput Biol ; 7(3-4): 483-501, 2000.
Article in English | MEDLINE | ID: mdl-11108475

ABSTRACT

Various bioinformatics problems require optimizing several different properties simultaneously. For example, in the protein threading problem, a scoring function combines the values for different parameters of possible sequence-to-structure alignments into a single score to allow for unambiguous optimization. In this context, an essential question is how each property should be weighted. As the native structures are known for some sequences, a partial ordering on optimal alignments to other structures, e.g., derived from structural comparisons, may be used to adjust the weights. To resolve the arising interdependence of weights and computed solutions, we propose a heuristic approach: iterating the computation of solutions (here, threading alignments) given the weights and the estimation of optimal weights of the scoring function given these solutions via systematic calibration methods. For our application (i.e., threading), this iterative approach results in structurally meaningful weights that significantly improve performance on both the training and the test data sets. In addition, the optimized parameters show significant improvements on the recognition rate for a grossly enlarged comprehensive benchmark, a modified recognition protocol as well as modified alignment types (local instead of global and profiles instead of single sequences). These results show the general validity of the optimized weights for the given threading program and the associated scoring contributions.


Subject(s)
Computational Biology/methods , Algorithms , Models, Molecular , Protein Folding , Proteins/chemistry , Software
13.
Bioinformatics ; 16(9): 799-807, 2000 Sep.
Article in English | MEDLINE | ID: mdl-11108702

ABSTRACT

MOTIVATION: In order to extract protein sequences from nucleotide sequences, an important step is to recognize the points at which protein-coding regions start. These points are called translation initiation sites (TIS). RESULTS: The task of finding TIS can be modeled as a classification problem. We demonstrate the applicability of support vector machines for this task, and show how to incorporate prior biological knowledge by engineering an appropriate kernel function. With the described techniques the recognition performance can be improved by 26% over leading existing approaches. We provide evidence that existing related methods (e.g. ESTScan) could profit from advanced TIS recognition.


Subject(s)
Codon, Initiator/genetics , Computational Biology/methods , Genomics/methods , Protein Biosynthesis/genetics , Sequence Analysis, DNA/methods , Algorithms , Animals , Databases, Factual , Humans , Predictive Value of Tests , Reproducibility of Results , Sensitivity and Specificity , Vertebrates/genetics
14.
Bioinformatics ; 16(9): 825-36, 2000 Sep.
Article in English | MEDLINE | ID: mdl-11108705

ABSTRACT

MOTIVATION: A number of metabolic databases are available electronically, some with features for querying and visualizing metabolic pathways and regulatory networks. We present a unifying, systematic approach based on PETRI nets for storing, displaying, comparing, searching and simulating such nets from a number of different sources. RESULTS: Information from each data source is extracted and compiled into a PETRI net. Such PETRI nets then make it possible to investigate the (differential) content in metabolic databases, to map and integrate genomic information and functional annotations, to compare sequence and metabolic databases with respect to their functional annotations, and to define, generate and search paths and pathways in nets. We present an algorithm to systematically generate all pathways satisfying additional constraints in such PETRI nets. Finally, based on the set of valid pathways, so-called differential metabolic displays (DMDs) are introduced to exhibit specific differences between biological systems, i.e. different developmental states, disease states, or different organisms, on the level of paths and pathways. DMDs will be useful for target finding and function prediction, especially in the context of the interpretation of expression data.
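
To make the idea of generating constrained paths in such a net concrete, the sketch below enumerates paths from a source metabolite to a target metabolite in a toy net, with a maximum path length as the only constraint. It is a simplification and not the published algorithm: places are metabolites, transitions are reactions, and a transition is followed whenever the current metabolite is among its inputs, ignoring whether its other inputs are available. All names are hypothetical.

```python
def enumerate_paths(transitions, source, target, max_steps):
    """List transition sequences leading from source to target.

    transitions maps a reaction name to (input places, output places).
    """
    results = []

    def extend(place, path, visited):
        if place == target:
            results.append(list(path))
            return
        if len(path) == max_steps:
            return  # length constraint reached
        for name, (inputs, outputs) in transitions.items():
            if place not in inputs:
                continue
            for nxt in outputs:
                if nxt in visited:
                    continue  # avoid cycles
                path.append(name)
                visited.add(nxt)
                extend(nxt, path, visited)
                path.pop()
                visited.discard(nxt)

    extend(source, [], {source})
    return results


# Toy net with abstract metabolites A to D and reactions t1 to t4.
toy_net = {
    "t1": ({"A"}, {"B"}),
    "t2": ({"B"}, {"C"}),
    "t3": ({"A"}, {"C"}),
    "t4": ({"C"}, {"D"}),
}
paths_long = enumerate_paths(toy_net, "A", "D", max_steps=3)
paths_short = enumerate_paths(toy_net, "A", "D", max_steps=2)
```

Comparing the path sets obtained from two different nets, for example two organisms, is the intuition behind the differential displays mentioned in the abstract.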


Subject(s)
Algorithms , Computational Biology/methods , Data Display , Databases, Factual , Metabolism/physiology , Catalysis , Computer Simulation , Enzymes/genetics , Enzymes/metabolism , Glycolysis , Mycoplasma/metabolism , Yeasts/metabolism
15.
Article in English | MEDLINE | ID: mdl-10977101

ABSTRACT

We present a new approach for the evaluation of gene expression data. The basic idea is to generate biologically possible pathways and to score them with respect to gene expression measurements. We suggest sample scoring functions for different problem specifications. We assess the significance of the scores for the investigated pathways by comparison to a number of scores for random pathways. We show that simple scoring functions can assign statistically significant scores to biologically relevant pathways. This suggests that the combination of appropriate scoring functions with the systematic generation of pathways can be used in order to select the most interesting pathways based on gene expression measurements.
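
The comparison of a pathway's score against scores of random pathways can be illustrated with an empirical p-value. The sketch below makes several assumptions that are not from the paper: the score is the mean expression of the pathway's genes, and random gene sets of the same size stand in for random pathways. All names and data are hypothetical.

```python
import random


def pathway_score(genes, expression):
    """Assumed sample scoring function: mean expression over the pathway's genes."""
    return sum(expression[g] for g in genes) / len(genes)


def empirical_p_value(pathway, expression, n_random=1000, seed=0):
    """Share of random gene sets scoring at least as high as the pathway."""
    rng = random.Random(seed)
    observed = pathway_score(pathway, expression)
    all_genes = sorted(expression)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_genes, len(pathway))
        if pathway_score(sample, expression) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_random + 1)


# Toy expression values for 20 hypothetical genes g0..g19.
expression = {f"g{i}": float(i) for i in range(20)}
p_high = empirical_p_value(["g17", "g18", "g19"], expression)  # top-scoring set
p_low = empirical_p_value(["g0", "g1", "g2"], expression)      # lowest-scoring set
```

A small p-value indicates that few random sets reach the observed score, which is the sense in which the abstract calls a pathway score statistically significant.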


Subject(s)
Computer Simulation , Gene Expression , Models, Genetic , Models, Theoretical , Animals , Humans
16.
J Comput Aided Mol Des ; 14(3): 215-32, 2000 Mar.
Article in English | MEDLINE | ID: mdl-10756477

ABSTRACT

In drug design, often enough, no structural information on a particular receptor protein is available. However, frequently a considerable number of different ligands is known together with their measured binding affinities towards a receptor under consideration. In such a situation, a set of plausible relative superpositions of different ligands, hopefully approximating their putative binding geometry, is usually the method of choice for preparing data for the subsequent application of 3D methods that analyze the similarity or diversity of the ligands. Examples are 3D-QSAR studies, pharmacophore elucidation, and receptor modeling. An aggravating fact is that ligands are usually quite flexible and a rigorous analysis has to incorporate molecular flexibility. We review the past six years of scientific publishing on molecular superposition. Our focus lies on automatic procedures to be performed on arbitrary molecular structures. Methodical aspects are our main concern here. Accordingly, plain application studies with few methodical elements are omitted in this presentation. While this review cannot mention every contribution to this actively developing field, we intend to provide pointers to the recent literature providing important contributions to computational methods for the structural alignment of molecules. Finally we provide a perspective on how superposition methods can effectively be used for the purpose of virtual database screening. In our opinion it is the ultimate goal to detect analogues in structure databases of nontrivial size in order to narrow down the search space for subsequent experiments.


Subject(s)
Computer Simulation , Molecular Structure , Algorithms , Database Management Systems , Proteins/chemistry , Structure-Activity Relationship
17.
Brief Bioinform ; 1(3): 275-88, 2000 Sep.
Article in English | MEDLINE | ID: mdl-11465038

ABSTRACT

Along the long path from genomic data to a new drug, the knowledge of three-dimensional protein structure can be of significant help in several places. This paper points out such places, discusses the virtues of protein structure knowledge and reviews bioinformatics methods for gaining such knowledge on the protein structure.


Subject(s)
Drug Design , Proteins/chemistry , Computational Biology , Computer Simulation , Computer-Aided Design , Molecular Structure , Proteins/physiology , Sequence Alignment , Software
18.
Proteins ; 37(2): 228-41, 1999 Nov 01.
Article in English | MEDLINE | ID: mdl-10584068

ABSTRACT

We report on a test of FLEXX, a fully automatic docking tool for flexible ligands, on a highly diverse data set of 200 protein-ligand complexes from the Protein Data Bank. In total 46.5% of the complexes of the data set can be reproduced by a FLEXX docking solution at rank 1 with an rms deviation (RMSD) from the observed structure of less than 2 Å. This rate rises to 70% if one looks at the entire generated solution set. FLEXX produces reliable results for ligands with up to 15 components which can be docked in 80% of the cases with acceptable accuracy. Ligands with more than 15 components tend to generate wrong solutions more often. The average runtime of FLEXX on this test set is 93 seconds per complex on a SUN Ultra-30 workstation. In addition, we report on "cross-docking" experiments, in which several receptor structures of complexes with identical proteins have been used for docking all cocrystallized ligands of these complexes. In most cases, these experiments show that FLEXX can acceptably dock a ligand into a foreign receptor structure. Finally we report on screening runs of ligands out of a library with 556 entries against ten different proteins. In eight cases FLEXX is able to find the original inhibitor within the top 7% of the total library.


Subject(s)
Algorithms , Computational Biology , Models, Molecular , Proteins/chemistry , Databases, Factual , Ligands , Protein Binding
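
The success criterion quoted in the abstract of entry 18 (an RMSD of less than 2 Å between the docked and the observed ligand position) can be illustrated with a minimal sketch. It assumes that both poses list the same atoms in the same order and that no superposition is applied, since a docked pose is already expressed in the receptor's coordinate frame. The function names and coordinates are hypothetical and are not part of FLEXX.

```python
import math

def rmsd(pose, reference):
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(pose) != len(reference) or not pose:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared coordinate differences over all atoms and all three axes.
    squared = sum((a - b) ** 2 for p, r in zip(pose, reference) for a, b in zip(p, r))
    return math.sqrt(squared / len(pose))

def success_rate(rmsd_values, threshold=2.0):
    """Fraction of complexes whose RMSD lies strictly below the threshold (in angstroms)."""
    return sum(1 for v in rmsd_values if v < threshold) / len(rmsd_values)

# Hypothetical two-atom ligand displaced by 1 angstrom along z.
docked = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
observed = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(docked, observed))
```

Applied to a whole benchmark, `success_rate` over the rank-1 RMSD values would give the kind of percentage the abstract reports, under the assumptions stated above.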
19.
J Med Chem ; 42(21): 4422-33, 1999 Oct 21.
Article in English | MEDLINE | ID: mdl-10543886

ABSTRACT

A two-stage method for the computational prediction of the structure of protein-ligand complexes is proposed. Given an experimentally determined structure of the protein, in the first stage a large number of plausible ligand conformations is generated using the fast docking algorithm FlexX. In the second stage these conformations are minimized and reranked using a method based on a classical force field. The two-stage method is tested for 10 different protein-ligand complexes. For 9 of them experimentally determined structures are known. It turns out that the two-stage method strongly improves the predictive power as compared to that of the fast docking stage alone. The tenth case is a bona fide prediction of a complex of thrombin with a new inhibitor for which no experimentally determined structure is available so far.


Subject(s)
Drug Design , Proteins/chemistry , Algorithms , Antithrombins/chemistry , Dipeptides/chemistry , Inositol Phosphates/chemistry , Ligands , Models, Molecular , Molecular Structure , Phosphoric Monoester Hydrolases/chemistry , Piperidines/chemistry , Thrombin/antagonists & inhibitors , Thrombin/chemistry
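
The two-stage idea described in the abstract of entry 19 (generate many candidate poses quickly, then minimize and rerank them with a more expensive energy model) can be sketched as follows. This is a toy illustration only: a one-dimensional coordinate stands in for a ligand pose, a hypothetical double-well function stands in for the classical force field, and plain gradient descent stands in for the minimizer. None of these names or choices come from the paper.

```python
def minimize(x, grad, step=0.05, iters=500):
    """Plain gradient descent on a one-dimensional toy coordinate."""
    for _ in range(iters):
        x -= step * grad(x)
    return x

def two_stage(poses, energy, grad):
    """Minimize every stage-1 candidate, then rerank the results by energy (lowest first)."""
    refined = [minimize(p, grad) for p in poses]
    return sorted(refined, key=energy)

# Hypothetical toy energy with a deeper minimum near x = -1 and a shallower one near x = +1.
def energy(x):
    return (x * x - 1) ** 2 + 0.3 * x

def grad(x):
    return 4 * x * (x * x - 1) + 0.3

# Stage 1 is represented by a fixed list of candidate "poses".
ranked = two_stage([2.0, -2.0, 0.5], energy, grad)
print(ranked[0])
```

The reranking step matters because the order of the candidates after minimization can differ from the order produced by the fast first stage.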
20.
J Mol Biol ; 290(3): 757-79, 1999 Jul 16.
Article in English | MEDLINE | ID: mdl-10395828

ABSTRACT

We present the recursive dynamic programming (RDP) method for the threading approach to three-dimensional protein structure prediction. RDP is based on the divide-and-conquer paradigm and maps the protein sequence whose backbone structure is to be found (the protein target) onto the known backbone structure of a model protein (the protein template) in a stepwise fashion, a technique that is similar to computing local alignments but utilising different cost functions. We begin by mapping parts of the target onto the template that show statistically significant similarity with the template sequence. After mapping, the template structure is modified in order to account for the mapped target residues. Then significant similarities between the yet unmapped parts of the target and the modified template are searched, and the resulting segments of the target are mapped onto the template. This recursive process of identifying segments in the target to be mapped onto the template and modifying the template is continued until no significant similarities between the remaining parts of target and template are found. Those parts which are left unmapped by the procedure are interpreted as gaps. The RDP method is robust in the sense that different local alignment methods can be used, several alternatives of mapping parts of the target onto the template can be handled and compared in the process, and the cost functions can be dynamically adapted to biological needs. Our computer experiments show that the RDP procedure is efficient and effective. We can thread a typical protein sequence against a database of 887 template domains in about 12 hours even on a low-cost workstation (SUN Ultra 5). In statistical evaluations on databases of known protein structures, RDP significantly outperforms competing methods. 
RDP has been especially valuable in providing accurate alignments for modeling active sites of proteins. RDP is part of the ToPLign system (GMD Toolbox for protein alignment) and can be accessed via the WWW independently or in concert with other ToPLign tools at http://cartan.gmd.de/ToPLign.html.


Subject(s)
Protein Conformation , Proteins/chemistry , Amino Acid Sequence , Databases, Factual , Molecular Sequence Data , Protein Folding , Sequence Homology, Amino Acid
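
The divide-and-conquer scheme described in the abstract of entry 20 can be illustrated with a much simplified sketch. It is not the RDP method itself. As a stated assumption, the longest exact common substring stands in for a "statistically significant similarity", the template is not modified after each mapping step, and no threading cost functions are used. All function names are hypothetical.

```python
def longest_common_block(a, b):
    """Return (start_a, start_b, length) of the longest common substring of a and b."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common block that ended at the previous position pair.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[2]:
                    best = (i - cur[j], j - cur[j], cur[j])
        prev = cur
    return best

def recursive_map(target, template, t_off=0, m_off=0, min_len=3):
    """Map target segments onto the template; residues left unmapped are treated as gaps."""
    i, j, n = longest_common_block(target, template)
    if n < min_len:
        return []  # no sufficiently long block: this region stays unmapped
    # Recurse on the unmapped parts to the left and to the right of the mapped block.
    left = recursive_map(target[:i], template[:j], t_off, m_off, min_len)
    right = recursive_map(target[i + n:], template[j + n:],
                          t_off + i + n, m_off + j + n, min_len)
    return left + [(t_off + i, m_off + j, n)] + right

# Each tuple is (target start, template start, segment length).
print(recursive_map("ACDEFGHIKL", "ACDXXGHIKL"))
```

In this toy run the strongest block is mapped first, and the recursion then handles the remaining flanks, which mirrors the stepwise mapping the abstract describes while leaving out everything specific to structure-based scoring.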