1.
Nucleic Acids Res ; 38(Web Server issue): W719-23, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20501602

ABSTRACT

The WHAT IF molecular-modelling and drug design program is widely distributed in the world of protein structure bioinformatics. Although originally designed as an interactive application, its highly modular design and inbuilt control language have recently enabled its deployment as a collection of programmatically accessible web services. We report here a collection of WHAT IF-based protein structure bioinformatics web services: these relate to structure quality, the use of symmetry in crystal structures, structure correction and optimization, adding hydrogens and optimizing hydrogen bonds and a series of geometric calculations. The freely accessible web services are based on the industry standard WS-I profile and the EMBRACE technical guidelines, and are available via both REST and SOAP paradigms. The web services run on a dedicated computational cluster; their function and availability are monitored daily.


Subject(s)
Protein Conformation , Software , Computational Biology , Computer Graphics , Internet , Models, Molecular
2.
Bioinformatics ; 26(18): i568-74, 2010 Sep 15.
Article in English | MEDLINE | ID: mdl-20823323

ABSTRACT

MOTIVATION: In recent years, the gulf between the mass of accumulating-research data and the massive literature describing and analyzing those data has widened. The need for intelligent tools to bridge this gap, to rescue the knowledge being systematically isolated in literature and data silos, is now widely acknowledged. RESULTS: To this end, we have developed Utopia Documents, a novel PDF reader that semantically integrates visualization and data-analysis tools with published research articles. In a successful pilot with editors of the Biochemical Journal (BJ), the system has been used to transform static document features into objects that can be linked, annotated, visualized and analyzed interactively (http://www.biochemj.org/bj/424/3/). Utopia Documents is now used routinely by BJ editors to mark up article content prior to publication. Recent additions include integration of various text-mining and biodatabase plugins, demonstrating the system's ability to seamlessly integrate on-line content with PDF articles. AVAILABILITY: http://getutopia.com.


Subject(s)
Information Services , Literature , Publications , Research , Software , Internet , Periodicals as Topic , Publications/classification , Publishing
3.
Science ; 278(5338): 609-14, 1997 Oct 24.
Article in English | MEDLINE | ID: mdl-9381171

ABSTRACT

Ancient duplications and rearrangements of protein-coding segments have resulted in complex gene family relationships. Duplications can be tandem or dispersed and can involve entire coding regions or modules that correspond to folded protein domains. As a result, gene products may acquire new specificities, altered recognition properties, or modified functions. Extreme proliferation of some families within an organism, perhaps at the expense of other families, may correspond to functional innovations during evolution. The underlying processes are still at work, and the large fraction of human and other genomes consisting of transposable elements may be a manifestation of the evolutionary benefits of genomic flexibility.


Subject(s)
Multigene Family , Proteins/genetics , Amino Acid Sequence , Animals , Base Sequence , Computer Communication Networks , Databases as Topic , Evolution, Molecular , Genetic Variation , Humans , Phylogeny , Proteins/chemistry , Proteins/classification , Proteins/physiology , Repetitive Sequences, Nucleic Acid
4.
F1000Res ; 6, 2017.
Article in English | MEDLINE | ID: mdl-28781748

ABSTRACT

ELIXIR-UK is the UK node of ELIXIR, the European infrastructure for life science data. Since its foundation in 2014, ELIXIR-UK has played a leading role in training both within the UK and in the ELIXIR Training Platform, which coordinates and delivers training across all ELIXIR members. ELIXIR-UK contributes to the Training Platform's coordination and supports the development of training to address key skill gaps amongst UK scientists. As part of this work it acts as a conduit for nationally-important bioinformatics training resources to promote their activities to the ELIXIR community. ELIXIR-UK also leads ELIXIR's flagship Training Portal, TeSS, which collects information about a diverse range of training and makes it easily accessible to the community. ELIXIR-UK also works with others to provide key digital skills training, partnering with the Software Sustainability Institute to provide Software Carpentry training to the ELIXIR community and to establish the Data Carpentry initiative, and taking a lead role amongst national stakeholders to deliver the StaTS project - a coordinated effort to drive engagement with training in statistics.

5.
Nucleic Acids Res ; 31(1): 400-2, 2003 Jan 01.
Article in English | MEDLINE | ID: mdl-12520033

ABSTRACT

The PRINTS database houses a collection of protein fingerprints. These may be used to assign uncharacterised sequences to known families and hence to infer tentative functions. The September 2002 release (version 36.0) includes 1800 fingerprints, encoding approximately 11 000 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here the development of an automatic supplement, prePRINTS, designed to increase the coverage of the resource and reduce some of the manual burdens inherent in its maintenance. The databases are accessible for interrogation and searching at http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/.


Subject(s)
Amino Acid Motifs , Databases, Protein , Proteins/chemistry , Animals , Automation , Conserved Sequence , Software
6.
Nucleic Acids Res ; 30(1): 239-41, 2002 Jan 01.
Article in English | MEDLINE | ID: mdl-11752304

ABSTRACT

The PRINTS database houses a collection of protein fingerprints. These may be used to make family and tentative functional assignments for uncharacterised sequences. The September 2001 release (version 32.0) includes 1600 fingerprints, encoding approximately 10 000 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. In addition to its continued steady growth, we report here its use as a source of annotation in the InterPro resource, and the use of its relational cousin, PRINTS-S, to model relationships between families, including those beyond the reach of conventional sequence analysis approaches. The database is accessible for BLAST, fingerprint and text searches at http://www.bioinf.man.ac.uk/dbbrowser/PRINTS/.


Subject(s)
Databases, Protein , Evolution, Molecular , Proteins/genetics , Amino Acid Motifs , Animals , Information Storage and Retrieval , Internet , Proteins/physiology , Sequence Alignment
7.
Nucleic Acids Res ; 32(Database issue): D401-5, 2004 Jan 01.
Article in English | MEDLINE | ID: mdl-14681443

ABSTRACT

CADRE is a public resource for housing and analysing genomic data extracted from species of Aspergillus. It arose to enable maintenance of the complete annotated genomic sequence of Aspergillus fumigatus and to provide tools for searching, analysing and visualizing features of fungal genomes. By implementing CADRE using Ensembl, a framework is in place for storing and comparing several genomes: the resource will thus expand by including other Aspergillus genomes (such as Aspergillus nidulans) as they become available. CADRE is accessible at http://www.cadre.man.ac.uk.


Subject(s)
Aspergillus/genetics , Databases, Genetic , Genome, Fungal , Aspergillus fumigatus/genetics , Computational Biology , Genes, Fungal , Genomics , Information Storage and Retrieval , Internet , Software
8.
Nucleic Acids Res ; 29(1): 37-40, 2001 Jan 01.
Article in English | MEDLINE | ID: mdl-11125043

ABSTRACT

Signature databases are vital tools for identifying distant relationships in novel sequences and hence for inferring protein function. InterPro is an integrated documentation resource for protein families, domains and functional sites, which amalgamates the efforts of the PROSITE, PRINTS, Pfam and ProDom database projects. Each InterPro entry includes a functional description, annotation, literature references and links back to the relevant member database(s). Release 2.0 of InterPro (October 2000) contains over 3000 entries, representing families, domains, repeats and sites of post-translational modification encoded by a total of 6804 different regular expressions, profiles, fingerprints and Hidden Markov Models. Each InterPro entry lists all the matches against SWISS-PROT and TrEMBL (more than 1,000,000 hits from 462,500 proteins in SWISS-PROT and TrEMBL). The database is accessible for text- and sequence-based searches at http://www.ebi.ac.uk/interpro/. Questions can be emailed to interhelp@ebi.ac.uk.


Subject(s)
Databases, Factual , Proteins , Information Services , Internet , Protein Structure, Tertiary , Proteins/chemistry , Proteins/genetics
9.
Trends Pharmacol Sci ; 22(4): 162-5, 2001 Apr.
Article in English | MEDLINE | ID: mdl-11282406

ABSTRACT

Analysis of G-protein-coupled receptor (GPCR) subtypes has attracted considerable interest because some drugs that act on GPCRs cause therapeutic problems as a result of their failure to differentiate between subtypes. In this article, an extensive compendium of diagnostic 'fingerprints' for GPCR subtypes and their families will be described. These fingerprints offer new opportunities to investigate correlations between specific sequence motifs and ligand binding or G-protein coupling, and are likely to prove valuable both in seeking novel receptors in genome data and in the characterization of orphan receptors.


Subject(s)
Peptide Mapping , Receptors, Cell Surface/chemistry , Computational Biology , Databases, Factual , Drug Industry , Humans , Receptors, Cell Surface/drug effects , Receptors, Cell Surface/genetics , Sequence Alignment
10.
J Mol Biol ; 234(4): 1270-3, 1993 Dec 20.
Article in English | MEDLINE | ID: mdl-8263929

ABSTRACT

The NADP(+)-dependent hexameric glutamate dehydrogenase from Escherichia coli has been crystallized as the apo-enzyme and also in the presence of its substrates 2-oxoglutarate, glutamate or NADP+, using either pulsed equilibrium microdialysis, or the hanging drop method of vapour diffusion. Three non-isomorphous, but related, crystal forms have been obtained, all of which belong to the orthorhombic system and are most likely to be in space group P2(1)2(1)2(1). One crystal form is grown from ammonium sulphate, includes the apoenzyme and the binary complexes with 2-oxoglutarate or NADP+, and has cell dimensions a = 157.5 Å, b = 212.5 Å, c = 101.0 Å with a hexamer in the asymmetric unit. Crystallizations using glutamate as the precipitant produced two further crystal forms, which show significant changes in the b and c cell dimensions with respect to the apo-enzyme crystals, with parameters a = 160.0 Å, b = 217.5 Å, c = 92.4 Å and a = 160.0 Å, b = 223.0 Å, c = 92.4 Å, respectively. X-ray diffraction photographs taken with synchrotron radiation show measurable reflections to beyond 3.0 Å resolution.
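
As a rough illustration of how the three crystal forms in the abstract above compare, the cell dimensions can be turned into unit-cell volumes. This is only a sketch: it assumes the orthorhombic assignment given in the abstract (all cell angles 90 degrees, so V = a × b × c), and the form labels and the `cell_volume` helper are names invented here, not taken from the paper.

```python
# Cell dimensions in angstroms, as quoted in the abstract.
# The dictionary keys are illustrative labels, not the authors' names.
cells = {
    "ammonium sulphate form": (157.5, 212.5, 101.0),
    "glutamate form 1": (160.0, 217.5, 92.4),
    "glutamate form 2": (160.0, 223.0, 92.4),
}


def cell_volume(a: float, b: float, c: float) -> float:
    """Volume of an orthorhombic cell (all angles 90 degrees): V = a * b * c."""
    return a * b * c


volumes = {name: cell_volume(*dims) for name, dims in cells.items()}

for name, volume in volumes.items():
    print(f"{name}: {volume:,.0f} cubic angstroms")
```

On these figures the two glutamate-grown forms have somewhat smaller cells than the ammonium sulphate form, because the shorter c axis outweighs the longer b axis.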


Subject(s)
Glutamate Dehydrogenase/ultrastructure , Bacterial Proteins/ultrastructure , Crystallography, X-Ray , Escherichia coli/enzymology
11.
Protein Sci ; 2(5): 753-61, 1993 May.
Article in English | MEDLINE | ID: mdl-7684291

ABSTRACT

The lipocalins and fatty acid-binding proteins (FABPs) are two recently identified protein families that both function by binding small hydrophobic molecules. We have sought to clarify relationships within and between these two groups through an analysis of both structure and sequence. Within a similar overall folding pattern, we find large parts of the lipocalin and FABP structures to be quantitatively equivalent. The three largest structurally conserved regions within the lipocalin common core correspond to characteristic sequence motifs that we have used to determine the constitution of this family using an iterative sequence analysis procedure. This afforded a new interpretation of the family, which highlighted the difficulties of determining a comprehensive and coherent classification of the lipocalins. The first of the three conserved sequence motifs is also common to the FABPs and corresponds to a conserved structural element characteristic of both families. Similarities of structure and sequence within the two families suggest that they form part of a larger "structural superfamily"; we have christened this overall group the calycins to reflect the cup-shaped structure of its members.


Subject(s)
Carrier Proteins/chemistry , Carrier Proteins/classification , Insect Proteins , Neoplasm Proteins , Tumor Suppressor Proteins , Alpha-Globulins/chemistry , Alpha-Globulins/classification , Amino Acid Sequence , Animals , Fatty Acid-Binding Protein 7 , Fatty Acid-Binding Proteins , Humans , Invertebrate Hormones/chemistry , Invertebrate Hormones/classification , Molecular Sequence Data , Protein Conformation , Protein Structure, Secondary , Retinol-Binding Proteins/chemistry , Retinol-Binding Proteins/classification , Sequence Analysis , Sequence Homology, Amino Acid
12.
Int J Biochem Cell Biol ; 32(2): 139-55, 2000 Feb.
Article in English | MEDLINE | ID: mdl-10687950

ABSTRACT

In the wake of the numerous now-fruitful genome projects, we have witnessed a 'tsunami' of sequence data and with it the birth of the field of bioinformatics. Bioinformatics involves the application of information technology to the management and analysis of biological data. For many of us, this means that databases and their search tools have become an essential part of the research environment. However, the rate of sequence generation and the haphazard proliferation of databases have made it difficult to keep pace with developments, even for the cognoscenti. Moreover, increasing amounts of sequence information do not necessarily equate with an increase in knowledge, and in the panic to automate the route from raw data to biological insight, we may be generating and propagating innumerable errors in our precious databases. In the genome era upon us, researchers want rapid, easy-to-use, reliable tools for functional characterisation of newly determined sequences. For the pharmaceutical industry in particular, the Pandora's box of bioinformatics harbours an information-rich nugget, ripe with potential drug targets and possible new avenues for the development of therapeutic agents. This review outlines the current status of the major pattern databases now used routinely in the analysis of protein sequences. The review is divided into three main sections. In the first, commonly used terms are defined and the methods behind the databases are briefly described; in the second, the structure and content of the principal pattern databases are discussed; and in the final part, several alignment databases, which are frequently confused with pattern databases, are mentioned. For the new-comer, the array of resources, the range of methods behind them and the different tools required to search them can be confusing. 
The review therefore also briefly mentions a current international endeavour to integrate the diverse databases, which effort should facilitate sequence analysis in the future. This is particularly important for target-discovery programmes, where the challenge is to rationalise the enormous numbers of potential targets generated by sequence database searches. This problem may be addressed, at least in part, by reducing search outputs to the more focused and manageable subsets suggested by searches of integrated groups of family-specific pattern databases.


Subject(s)
Databases, Factual , Proteins/genetics , Proteins/physiology , Amino Acid Motifs , Amino Acid Sequence , Animals , Computational Biology , Humans , Pattern Recognition, Automated , Proteins/chemistry , Sequence Alignment
13.
Gene ; 98(2): 153-9, 1991 Feb 15.
Article in English | MEDLINE | ID: mdl-1849861

ABSTRACT

A multiple alignment has been constructed, containing 37 sequences from related families of membrane-bound receptors believed to share the same structural framework as rhodopsin. Sequence homology within families was high (occasionally greater than 90%), but homology between them was generally low (20% or less). Database pattern-scanning methods were therefore used to construct a set of discriminators to aid both the task of alignment and the identification of distantly related sequences showing similar rhodopsin-like transmembrane helices. The results indicate that these discriminators are uniquely able to identify each of the transmembrane helices without major cross-reaction with similar regions in unrelated integral membrane proteins. This ability engenders more accurate alignments of the sequences and facilitates structural analysis and model building of the receptors.


Subject(s)
Databases, Factual , GTP-Binding Proteins/genetics , Receptors, Cell Surface/genetics , Sequence Homology, Nucleic Acid , Amino Acid Sequence , Animals , Humans , Molecular Sequence Data , Protein Conformation , Rhodopsin/genetics
14.
Gene ; 221(1): GC57-63, 1998 Oct 09.
Article in English | MEDLINE | ID: mdl-9852962

ABSTRACT

CINEMA is a new editor for manipulating and generating multiple sequence alignments. The program provides both an interface to existing databases of alignments on the Internet and a tool for constructing and modifying alignments locally. It is written in Java, so executable code will run on most major desktop platforms without modification. The implementation is highly flexible, so the applet can be easily customised with additional functions; and the object classes are reusable, promoting rapid development of program extensions. Formerly, such extended functionality might have been provided via browser plug-ins, which have to be downloaded and installed on every client before loading data. Now, for the first time, an applet is available that allows interactive client-side processing of an alignment, which can then be stored or processed automatically on the server. The program is embedded in a comprehensive help file and is accessible both as a stand-alone tool on UCL's Bioinformatics Server, http://www.biochem.ucl.ac.uk/bsm/dbbrowser/CINEMA2.02/, and as an integral part of the PRINTS protein fingerprint database. Exploitation of such novel technologies revolutionises the way users may interact with public databases in the future: bioinformatics centres need not simply provide data, but are now able to offer the means by which information is visualised and manipulated, without the requirement for users to install software.


Subject(s)
Software , Color Perception , Computational Biology , Image Processing, Computer-Assisted , Internet , Sequence Alignment
15.
Biotechnol Annu Rev ; 8: 1-54, 2002.
Article in English | MEDLINE | ID: mdl-12436914

ABSTRACT

In silico biology has gathered momentum as, worldwide, scientists have united in a common quest to sequence, store and analyse complete genomes. This year, a pivotal achievement of this cooperative endeavour was realised in the release of a public draft of the human genome, and with it the promises to improve our understanding of diverse aspects of biology and to yield a healthier future with safe personalized medicines. Key to these goals will be the need to elucidate and characterise the genes and gene products encoded not just in the human genome, but in many genomes. These tasks are underpinned by the concepts and processes of genome and gene/protein evolution, regulation of gene expression, mechanisms of protein folding, the manifestation of protein function, and so on, all of which must be understood in the context of complex, dynamic biological systems. Our use of computers to model such concepts and systems must be placed in the context of the current limits of our understanding of them:- it is important to recognise, for example, that we don't have a common understanding either of what constitutes a gene or a protein function; we can't invariably say that a particular sequence or fold has arisen via divergent or convergent evolution; and we don't fully understand the rules of protein folding. Accepting what we can't do in silico is essential in appreciating what we can do. Without this understanding, it is easy to be misled, as notions of what particular computational approaches can achieve are sometimes rather optimistic. There are valuable lessons to be learned here from the field of Artificial Intelligence, principal among which is the realisation that capturing and representing complex knowledge is time consuming, expensive and hard. 
Thus, we argue here that if bioinformatics is to tackle biological complexity in earnest, it would be wise to absorb the experience distilled from decades of artificial intelligence research, and to approach the road ahead with caution, rigour and pragmatism.


Subject(s)
Artificial Intelligence , Computational Biology/methods , Computational Biology/trends , Databases, Genetic , Genomics/methods , Sequence Analysis/methods , DNA/chemistry , DNA/genetics , Genome , Human Genome Project , Humans , Proteins/chemistry , Proteins/genetics , Proteomics/methods , Sequence Analysis/trends , Systems Integration
19.
Protein Eng Des Sel ; 26(10): 695-704, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23840071

ABSTRACT

The inability to generate soluble, correctly folded recombinant protein is often a barrier to successful structural and functional studies. Access to affordable synthetic genes has, however, made it possible to design, make and test many more variants of a target protein to identify suitable constructs. We have used rational design and gene synthesis to create a controlled randomised library of the EphB4 receptor tyrosine kinase, with the aim of obtaining soluble, purifiable and active catalytic domain material at multi-milligram levels in Escherichia coli. Three main parameters were tested in designing the library: construct length, functional mutations and stability grafting. These variables were combined to generate a total of 9720 possible variants. The screening of 480 clones generated a 3% hit rate, with a purifiable solubility of up to 15 mg/L for some EphB4 constructs that was largely independent of construct length. Sequencing of the positive clones revealed a pair of hydrophobic core mutations that were key to obtaining soluble material. A minimal kinase domain construct containing these two mutations exhibited a +4.5°C increase in thermal stability over the wild-type protein. These approaches will be broadly applicable for solubility engineering of many different protein target classes. Atomic coordinates and structural factors have been deposited in PDB under the accession 2yn8 (EphB4 HP + staurosporine).
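
The screening figures quoted in the abstract above can be put in proportion with a little arithmetic. The sketch below uses only the three numbers stated there (9720 possible variants, 480 clones screened, a 3% hit rate); the variable names are illustrative, and treating every screened clone as a distinct variant is an assumption made here, so the sampled fraction is an upper bound.

```python
# Figures taken from the abstract; variable names are illustrative only.
library_size = 9720      # possible variants in the designed library
clones_screened = 480    # clones actually screened
hit_rate = 0.03          # reported hit rate (3%)

# Approximate number of positive clones implied by the reported hit rate.
expected_hits = clones_screened * hit_rate

# Upper bound on the share of the library sampled, assuming (hypothetically)
# that no variant was picked twice among the screened clones.
fraction_sampled = clones_screened / library_size

print(f"approximate hits: {round(expected_hits)}")
print(f"library sampled: at most {fraction_sampled:.1%}")
```

In other words, roughly 14 positive clones were found while sampling at most about 5% of the possible variants.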


Subject(s)
Catalytic Domain , Peptide Library , Protein Engineering/methods , Receptor, EphB4/chemistry , Receptor, EphB4/genetics , Humans , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Mutation , Protein Stability , Receptor, EphB4/metabolism , Solubility , Temperature