1.
Commun Biol ; 7(1): 447, 2024 Apr 11.
Article En | MEDLINE | ID: mdl-38605212

Protein evolution is constrained by structure and function, creating patterns in residue conservation that are routinely exploited to predict structure and other features. Similar constraints should affect variation across individuals, but it is only with the growth of human population sequencing that this has been tested at scale. Now, human population constraint has established applications in pathogenicity prediction, but it has not yet been explored for structural inference. Here, we map 2.4 million population variants to 5885 protein families and quantify residue-level constraint with a new Missense Enrichment Score (MES). Analysis of 61,214 structures from the PDB spanning 3661 families shows that missense depleted sites are enriched in buried residues or those involved in small-molecule or protein binding. MES is complementary to evolutionary conservation and a combined analysis allows a new classification of residues according to a conservation plane. This approach finds functional residues that are evolutionarily diverse, which can be related to specificity, as well as family-wide conserved sites that are critical for folding or function. We also find a possible contrast between lethal and non-lethal pathogenic sites, and a surprising clinical variant hot spot at a subset of missense enriched positions.
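The abstract does not state how MES is calculated. As a minimal sketch, assuming a residue-level score can be written as the log odds ratio of missense-carrying residues at one alignment column against all other columns in the family (an illustrative assumption, not the authors' stated formula), the calculation might look like this. The function name, its arguments and the 0.5 continuity correction are all hypothetical.

```python
import math


def missense_log_odds(col_variant_sites, col_sites, other_variant_sites, other_sites):
    """Log odds ratio of missense-carrying residues at one column vs. the rest.

    col_variant_sites / col_sites: residues in this alignment column that carry
    at least one missense variant, and all residues in the column.
    other_variant_sites / other_sites: the same counts over every other column.
    A 0.5 continuity correction keeps the ratio finite when a count is zero.
    """
    a = col_variant_sites + 0.5
    b = (col_sites - col_variant_sites) + 0.5
    c = other_variant_sites + 0.5
    d = (other_sites - other_variant_sites) + 0.5
    return math.log((a * d) / (b * c))


# Negative values indicate missense depletion, positive values enrichment.
depleted = missense_log_odds(1, 100, 100, 1000)
enriched = missense_log_odds(30, 100, 100, 1000)
```

Under this assumed definition, a column with far fewer variant-carrying residues than the family background scores below zero, which corresponds to the "missense depleted" sites described in the abstract.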


Proteins , Humans , Protein Domains , Proteins/metabolism , Protein Binding , Base Sequence
2.
Commun Biol ; 7(1): 320, 2024 Mar 13.
Article En | MEDLINE | ID: mdl-38480979

Fragment screening is used to identify binding sites and leads in drug discovery, but it is often unclear which binding sites are functionally important. Here, data from 37 experiments, and 1309 protein structures binding to 1601 ligands were analysed. A method to group ligands by binding sites is introduced and sites clustered according to profiles of relative solvent accessibility. This identified 293 unique ligand binding sites, grouped into four clusters (C1-4). C1 includes larger, buried, conserved, and population missense-depleted sites, enriched in known functional sites. C4 comprises smaller, accessible, divergent, missense-enriched sites, depleted in functional sites. A site in C1 is 28 times more likely to be functional than one in C4. Seventeen sites, which to the best of our knowledge are novel, in 13 proteins are identified as likely to be functionally important with examples from human tenascin and 5-aminolevulinate synthase highlighted. A multi-layer perceptron and a K-nearest neighbours model are presented to predict cluster labels for ligand binding sites with an accuracy of 96% and 100%, respectively, allowing functional classification of sites for proteins not in this set. Our findings will be of interest to those studying protein-ligand interactions and developing new drugs or function modulators.
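To illustrate the K-nearest neighbours idea mentioned in the abstract, here is a minimal sketch that assigns a cluster label to a binding site from its relative solvent accessibility (RSA) profile. The three-value profiles, the training labels and the choice of k are invented for illustration; the abstract does not describe the features or hyperparameters the authors used.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training profiles nearest to query.

    train: list of (profile, label) pairs, where profile is a tuple of floats.
    query: a profile of the same length.
    Distance is plain Euclidean distance between profiles.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical RSA profiles: low values for buried sites, high for exposed ones.
training_sites = [
    ((0.05, 0.10, 0.08), "C1"),
    ((0.10, 0.05, 0.12), "C1"),
    ((0.08, 0.09, 0.07), "C1"),
    ((0.80, 0.75, 0.90), "C4"),
    ((0.85, 0.70, 0.95), "C4"),
    ((0.78, 0.82, 0.88), "C4"),
]

buried_label = knn_predict(training_sites, (0.07, 0.08, 0.10))
exposed_label = knn_predict(training_sites, (0.82, 0.79, 0.91))
```

A site from a protein outside the training set would be labelled in the same way, by comparing its profile with those of already clustered sites.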


Drug Discovery , Proteins , Humans , Ligands , Binding Sites , Proteins/metabolism , Drug Discovery/methods
3.
Elife ; 12, 2023 10 03.
Article En | MEDLINE | ID: mdl-37787376

Eukaryotic genes are interrupted by introns that are removed from transcribed RNAs by splicing. Patterns of splicing complexity differ between species, but it is unclear how these differences arise. We used inter-species association mapping with Saccharomycotina species to correlate splicing signal phenotypes with the presence or absence of splicing factors. Here, we show that variation in 5' splice site sequence preferences correlate with the presence of the U6 snRNA N6-methyladenosine methyltransferase METTL16 and the splicing factor SNRNP27K. The greatest variation in 5' splice site sequence occurred at the +4 position and involved a preference switch between adenosine and uridine. Loss of METTL16 and SNRNP27K orthologs, or a single SNRNP27K methionine residue, was associated with a preference for +4 U. These findings are consistent with splicing analyses of mutants defective in either METTL16 or SNRNP27K orthologs and models derived from spliceosome structures, demonstrating that inter-species association mapping is a powerful orthogonal approach to molecular studies. We identified variation between species in the occurrence of two major classes of 5' splice sites, defined by distinct interaction potentials with U5 and U6 snRNAs, that correlates with intron number. We conclude that variation in concerted processes of 5' splice site selection by U6 snRNA is associated with evolutionary changes in splicing signal phenotypes.
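The +4 preference described in the abstract can be pictured as a simple tally over intron sequences. The sketch below assumes each intron is given as an RNA string starting at the first intronic nucleotide (+1), so that +4 is the fourth character; the example sequences and the function name are hypothetical.

```python
from collections import Counter


def plus4_preference(introns):
    """Return the fraction of introns carrying each nucleotide at position +4.

    introns: iterable of RNA strings whose first character is intron position +1.
    Sequences shorter than four nucleotides are ignored.
    """
    counts = Counter(seq[3] for seq in introns if len(seq) >= 4)
    total = sum(counts.values())
    return {base: n / total for base, n in counts.items()}


# Hypothetical 5' splice site sequences (positions +1 to +6).
example_introns = ["GUAAGU", "GUAUGU", "GUAUGU", "GUAUGU"]
preference = plus4_preference(example_introns)
```

Computing this profile per species, and comparing it with the presence or absence of a gene of interest, is the kind of comparison that inter-species association mapping builds on.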


RNA Splice Sites , RNA, Small Nuclear , Adenosine/metabolism , Base Sequence , Introns/genetics , RNA Precursors/metabolism , RNA Splicing , RNA, Small Nuclear/genetics , Humans
4.
Cell Signal ; 108: 110714, 2023 08.
Article En | MEDLINE | ID: mdl-37187217

Protein kinases are major regulators of cellular processes, but the roles of most kinases remain unresolved. Dictyostelid social amoebas have been useful in identifying functions for 30% of their kinases in cell migration, cytokinesis, vesicle trafficking, gene regulation and other processes but their upstream regulators and downstream effectors are mostly unknown. Comparative genomics can help to distinguish between genes involved in deeply conserved core processes and those involved in species-specific innovations, while co-expression of genes as evident from comparative transcriptomics can provide cues to the protein complement of regulatory networks. Genomes and developmental and cell-type specific transcriptomes are available for species that span the 0.5 billion years of evolution of Dictyostelia from their unicellular ancestors. In this work we analysed conservation and change in the abundance, functional domain architecture and developmental regulation of protein kinases across the 4 major taxon groups of Dictyostelia. All data are summarized in annotated phylogenetic trees of the kinase subtypes and accompanied by functional information of all kinases that were experimentally studied. We detected 393 different protein kinase domains across the five studied genomes, of which 212 were fully conserved. Conservation was highest (71%) in the previously defined AGC, CAMK, CK1, CMGC, STE and TKL groups and lowest (26%) in the "other" group of typical protein kinases. This was mostly due to species-specific single gene amplification of "other" kinases. Apart from the AFK and α-kinases, the atypical protein kinases, such as the PIKK and histidine kinases, were also almost fully conserved.
The phylogeny-wide developmental and cell-type specific expression profiles of the protein kinase genes were combined with profiles from the same transcriptomic experiments for the families of G-protein coupled receptors, small GTPases and their GEFs and GAPs, the transcription factors and for all genes that upon lesion generate a developmental defect. This dataset was subjected to hierarchical clustering to identify clusters of co-expressed genes that potentially act together in a signalling network. The work provides a valuable resource that allows researchers to identify protein kinases and other regulatory proteins that are likely to act as intermediates in a network of interest.


Dictyostelium , Dictyostelium/genetics , Phylogeny , Protein Kinases/metabolism , Genome , Transcription Factors/metabolism
5.
Elife ; 11, 2022 11 21.
Article En | MEDLINE | ID: mdl-36409063

Alternative splicing of messenger RNAs is associated with the evolution of developmentally complex eukaryotes. Splicing is mediated by the spliceosome, and docking of the pre-mRNA 5' splice site into the spliceosome active site depends upon pairing with the conserved ACAGA sequence of U6 snRNA. In some species, including humans, the central adenosine of the ACAGA box is modified by N6 methylation, but the role of this m6A modification is poorly understood. Here, we show that m6A modified U6 snRNA determines the accuracy and efficiency of splicing. We reveal that the conserved methyltransferase, FIONA1, is required for Arabidopsis U6 snRNA m6A modification. Arabidopsis fio1 mutants show disrupted patterns of splicing that can be explained by the sequence composition of 5' splice sites and cooperative roles for U5 and U6 snRNA in splice site selection. U6 snRNA m6A influences 3' splice site usage. We generalise these findings to reveal two major classes of 5' splice site in diverse eukaryotes, which display anti-correlated interaction potential with U5 snRNA loop 1 and the U6 snRNA ACAGA box. We conclude that U6 snRNA m6A modification contributes to the selection of degenerate 5' splice sites crucial to alternative splicing.


All the information necessary to build the proteins that perform the biological processes required for life is encoded in the DNA of an organism. Making these proteins requires the DNA sequence of a gene to be transcribed into a 'messenger RNA' (mRNA), which is then processed into a final, mature form. This blueprint is then translated to assemble the corresponding protein. When an mRNA is processed, segments of the sequence that do not code for protein are removed and the remaining coding sequences are joined together in the right order. An intricate molecular machine known as the spliceosome controls this mechanism by recognising the 'splice sites' where coding and non-coding sequences meet. Depending on external conditions, the spliceosome can 'pick-and-mix' the coding sequences to create different processed mRNAs (and therefore proteins) from a single gene. This alternative splicing mechanism is often used to regulate when certain biological processes take place based on environmental cues; for example, the splicing of genes which control the timing of plant flowering is sensitive to ambient temperatures. To investigate this mechanism, Parker et al. focused on Arabidopsis thaliana, a plant that blooms later when temperatures are low. This precise timing partly relies on a gene whose mRNA is efficiently spliced in the cold, resulting in an active form of its protein that blocks blooming. Parker et al. grew and screened many A. thaliana plants to find individuals that could flower early in the cold, in which splicing of this gene was disrupted. A mutant fitting these criteria was identified and subjected to further investigation, which revealed that it could not produce FIONA1. In non-mutant plants, this enzyme chemically modifies one of the components of the spliceosome, a small nuclear RNA known as U6. 
Parker et al. found that there are two types of splice site: one more likely to interact with U6 and another that preferentially interacts with another small nuclear RNA, U5. When FIONA1 is inactive (such as in the mutant identified by Parker et al.), splice sites that tend to strongly interact with U5 are selected. However, when the enzyme is active, splice sites that tend to bind with the chemically modified U6 are used instead. Further work by Parker et al. showed that these two types of splice sites ('preferring' either U5 or U6) are found in equal proportions in the genomes of many species, including humans. This suggests that Parker et al. have uncovered an essential feature of how genomes are organised and splicing is controlled.


Arabidopsis , RNA Precursors , Humans , RNA Precursors/metabolism , RNA Splice Sites , Arabidopsis/genetics , Arabidopsis/metabolism , RNA Splicing , RNA, Small Nuclear/genetics , Spliceosomes/metabolism
6.
Nat Commun ; 13(1): 3443, 2022 06 16.
Article En | MEDLINE | ID: mdl-35710760

A prerequisite to exploiting soil microbes for sustainable crop production is the identification of the plant genes shaping microbiota composition in the rhizosphere, the interface between roots and soil. Here, we use metagenomics information as an external quantitative phenotype to map the host genetic determinants of the rhizosphere microbiota in wild and domesticated genotypes of barley, the fourth most cultivated cereal globally. We identify a small number of loci with a major effect on the composition of rhizosphere communities. One of those, designated the QRMC-3HS, emerges as a major determinant of microbiota composition. We subject soil-grown sibling lines harbouring contrasting alleles at QRMC-3HS and hosting contrasting microbiotas to comparative root RNA-seq profiling. This allows us to identify three primary candidate genes, including a Nucleotide-Binding-Leucine-Rich-Repeat (NLR) gene in a region of structural variation of the barley genome. Our results provide insights into the footprint of crop improvement on the plant's capacity of shaping rhizosphere microbes.


Hordeum , Microbiota , Bacteria/genetics , Genes, Plant/genetics , Hordeum/genetics , Microbiota/genetics , Plant Roots/genetics , Rhizosphere , Soil/chemistry , Soil Microbiology
7.
Nat Commun ; 13(1): 2001, 2022 04 14.
Article En | MEDLINE | ID: mdl-35422045

The nutrient-rich tubers of the greater yam, Dioscorea alata L., provide food and income security for millions of people around the world. Despite its global importance, however, greater yam remains an orphan crop. Here, we address this resource gap by presenting a highly contiguous chromosome-scale genome assembly of D. alata combined with a dense genetic map derived from African breeding populations. The genome sequence reveals an ancient allotetraploidization in the Dioscorea lineage, followed by extensive genome-wide reorganization. Using the genomic tools, we find quantitative trait loci for resistance to anthracnose, a damaging fungal pathogen of yam, and several tuber quality traits. Genomic analysis of breeding lines reveals both extensive inbreeding as well as regions of extensive heterozygosity that may represent interspecific introgression during domestication. These tools and insights will enable yam breeders to unlock the potential of this staple crop and take full advantage of its adaptability to varied environments.


Dioscorea , Chromosomes , Dioscorea/genetics , Humans , Plant Breeding , Plant Tubers , Quantitative Trait Loci/genetics
8.
PLoS Comput Biol ; 18(3): e1009922, 2022 03.
Article En | MEDLINE | ID: mdl-35235558

SARS-CoV-2 Spike (Spike) binds to human angiotensin-converting enzyme 2 (ACE2) and the strength of this interaction could influence parameters relating to virulence. To explore whether population variants in ACE2 influence Spike binding and hence infection, we selected 10 ACE2 variants based on affinity predictions and prevalence in gnomAD and measured their affinities and kinetics for Spike receptor binding domain through surface plasmon resonance (SPR) at 37°C. We discovered variants that reduce and enhance binding, including three ACE2 variants that strongly inhibited (p.Glu37Lys, ΔΔG = -1.33 ± 0.15 kcal mol⁻¹ and p.Gly352Val, predicted ΔΔG = -1.17 kcal mol⁻¹) or abolished (p.Asp355Asn) binding. We also identified two variants with distinct population distributions that enhanced affinity for Spike. ACE2 p.Ser19Pro (ΔΔG = 0.59 ± 0.08 kcal mol⁻¹) is predominant in the gnomAD African cohort (AF = 0.003) whilst p.Lys26Arg (ΔΔG = 0.26 ± 0.09 kcal mol⁻¹) is predominant in the Ashkenazi Jewish (AF = 0.01) and European non-Finnish (AF = 0.006) cohorts. We compared ACE2 variant affinities to published SARS-CoV-2 pseudotype infectivity data and confirmed that ACE2 variants with reduced affinity for Spike can protect cells from infection. The effect of variants with enhanced Spike affinity remains unclear, but we propose a mechanism whereby these alleles could cause greater viral spreading across tissues and cell types, as is consistent with emerging understanding regarding the interplay between receptor affinity and cell-surface abundance. Finally, we compared mCSM-PPI2 ΔΔG predictions against our SPR data to assess the utility of predictions in this system. We found that predictions of decreased binding were well-correlated with experiment and could be improved by calibration, but disappointingly, predictions of highly enhanced binding were unreliable.
Recalibrated predictions for all possible ACE2 missense variants at the Spike interface were calculated and used to estimate the overall burden of ACE2 variants on Covid-19.
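To give a feel for the ΔΔG values quoted above, the sketch below converts a binding free-energy change into a fold change in the dissociation constant using the standard relation ΔG = RT ln K_D. It assumes the sign convention implied by the abstract (negative ΔΔG means weaker binding) and the stated measurement temperature of 37°C; the function name is hypothetical.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 0.0019872


def kd_fold_change(ddg_kcal_per_mol, temp_celsius=37.0):
    """Fold change in K_D (variant / wild type) for a given ΔΔG.

    Assumes negative ΔΔG means reduced affinity, so K_D rises by
    exp(-ΔΔG / RT). Values above 1 mean weaker binding, below 1 tighter.
    """
    rt = R_KCAL * (temp_celsius + 273.15)
    return math.exp(-ddg_kcal_per_mol / rt)


weaker = kd_fold_change(-1.33)   # ΔΔG reported for p.Glu37Lys
tighter = kd_fold_change(0.59)   # ΔΔG reported for p.Ser19Pro
```

Under these assumptions, a ΔΔG of -1.33 kcal mol⁻¹ corresponds to roughly an 8- to 9-fold increase in K_D, while 0.59 kcal mol⁻¹ corresponds to K_D falling to a bit under half of the wild-type value.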


Angiotensin-Converting Enzyme 2/genetics , COVID-19/genetics , Mutation, Missense , Spike Glycoprotein, Coronavirus/metabolism , Angiotensin-Converting Enzyme 2/metabolism , Genetic Predisposition to Disease , Humans , Protein Binding
9.
PLoS Comput Biol ; 17(8): e1009335, 2021 08.
Article En | MEDLINE | ID: mdl-34428215

Ankyrin protein repeats bind to a wide range of substrates and are one of the most common protein motifs in nature. Here, we collate a high-quality alignment of 7,407 ankyrin repeats and examine, for the first time, the distribution of human population variants from large-scale sequencing of healthy individuals across this family. Population variants are not randomly distributed across the genome but are constrained by gene essentiality and function. Accordingly, we interpret the population variants in context with evolutionary constraint and structural features including secondary structure, accessibility and protein-protein interactions across 383 three-dimensional structures of ankyrin repeats. We find five positions that are highly conserved across homologues and also depleted in missense variants within the human population. These positions are significantly enriched in intra-domain contacts and so likely to be key for repeat packing. In contrast, a group of evolutionarily divergent positions are found to be depleted in missense variants in human and significantly enriched in protein-protein interactions. Our analysis also suggests the domain has three, not two, surfaces, each with different patterns of enrichment in protein-substrate interactions and missense variants. Our findings will be of interest to those studying or engineering ankyrin-repeat containing proteins as well as those interpreting the significance of disease variants.


Ankyrin Repeat , Genetic Variation , Humans , Hydrophobic and Hydrophilic Interactions , Models, Molecular , Mutation, Missense , Protein Binding , Proteins/chemistry , Proteins/genetics
10.
Elife ; 10, 2021 04 27.
Article En | MEDLINE | ID: mdl-33904405

Genes involved in disease resistance are some of the fastest evolving and most diverse components of genomes. Large numbers of nucleotide-binding, leucine-rich repeat (NLR) genes are found in plant genomes and are required for disease resistance. However, NLRs can trigger autoimmunity, disrupt beneficial microbiota or reduce fitness. It is therefore crucial to understand how NLRs are controlled. Here, we show that the RNA-binding protein FPA mediates widespread premature cleavage and polyadenylation of NLR transcripts, thereby controlling their functional expression and impacting immunity. Using long-read Nanopore direct RNA sequencing, we resolved the complexity of NLR transcript processing and gene annotation. Our results uncover a co-transcriptional layer of NLR control with implications for understanding the regulatory and evolutionary dynamics of NLRs in the immune responses of plants.


Arabidopsis Proteins/metabolism , Arabidopsis/genetics , NLR Proteins/metabolism , RNA-Binding Proteins/metabolism , Transcription Termination, Genetic , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Gene Expression Regulation, Plant , Genes, Plant/genetics , Genes, Plant/physiology , RNA, Messenger/metabolism
12.
Genome Biol ; 22(1): 72, 2021 03 01.
Article En | MEDLINE | ID: mdl-33648554

Transcription of eukaryotic genomes involves complex alternative processing of RNAs. Sequencing of full-length RNAs using long reads reveals the true complexity of processing. However, the relatively high error rates of long-read sequencing technologies can reduce the accuracy of intron identification. Here we apply alignment metrics and machine-learning-derived sequence information to filter spurious splice junctions from long-read alignments and use the remaining junctions to guide realignment in a two-pass approach. This method, available in the software package 2passtools ( https://github.com/bartongroup/2passtools ), improves the accuracy of spliced alignment and transcriptome assembly for species both with and without existing high-quality annotations.
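The two-pass idea can be sketched as a filter between alignment rounds. The code below is not the 2passtools implementation or its API: it replaces the alignment metrics and machine-learning model described in the abstract with a deliberately simple rule (canonical splice motif plus minimum read support), and the `Junction` class, motif set and threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Junction:
    chrom: str
    start: int       # 0-based intron start
    end: int         # intron end (exclusive)
    read_count: int  # first-pass reads supporting this junction
    motif: str       # donor + acceptor dinucleotides, e.g. "GTAG"


# Stand-in for the learned sequence model: accept only common splice motifs.
CANONICAL_MOTIFS = {"GTAG", "GCAG", "ATAC"}


def filter_junctions(junctions, min_reads=2):
    """Keep first-pass junctions that look genuine under the toy rule."""
    return [
        j for j in junctions
        if j.motif in CANONICAL_MOTIFS and j.read_count >= min_reads
    ]


def to_bed_lines(junctions):
    """Format kept junctions as minimal BED lines to guide a second alignment pass."""
    return [f"{j.chrom}\t{j.start}\t{j.end}" for j in junctions]


first_pass = [
    Junction("chr1", 100, 200, 5, "GTAG"),
    Junction("chr1", 150, 260, 1, "GTAG"),  # too little read support
    Junction("chr1", 300, 420, 8, "CTAC"),  # non-canonical motif
]
guide = to_bed_lines(filter_junctions(first_pass))
```

In a real workflow the resulting junction list would be supplied to the aligner for the second pass, so that reads are realigned against the retained junctions only.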


Algorithms , Computational Biology/methods , Machine Learning , RNA Splice Sites , RNA-Seq , Sequence Alignment/methods , Software , Introns , Molecular Sequence Annotation , RNA Splicing , RNA-Seq/methods , Reproducibility of Results
13.
Methods Mol Biol ; 2231: 203-224, 2021.
Article En | MEDLINE | ID: mdl-33289895

In this chapter, we introduce core functionality of the Jalview interactive platform for the creation, analysis, and publication of multiple sequence alignments. A workflow is described based on Jalview's core functions: from data import to figure generation, including import of alignment reliability scores from T-Coffee and use of Jalview from the command line. The accompanying notes provide background information on the underlying methods and discuss additional options for working with Jalview to perform multiple sequence alignment, functional site analysis, and publication of alignments on the web.


Sequence Alignment/methods , Sequence Analysis, DNA/methods , Sequence Analysis, Protein/methods , Software , Internet Use , Phylogeny , Reproducibility of Results , Workflow
14.
Elife ; 9, 2020 01 14.
Article En | MEDLINE | ID: mdl-31931956

Understanding genome organization and gene regulation requires insight into RNA transcription, processing and modification. We adapted nanopore direct RNA sequencing to examine RNA from a wild-type accession of the model plant Arabidopsis thaliana and a mutant defective in mRNA methylation (m6A). Here we show that m6A can be mapped in full-length mRNAs transcriptome-wide and reveal the combinatorial diversity of cap-associated transcription start sites, splicing events, poly(A) site choice and poly(A) tail length. Loss of m6A from 3' untranslated regions is associated with decreased relative transcript abundance and defective RNA 3' end formation. A functional consequence of disrupted m6A is a lengthening of the circadian period. We conclude that nanopore direct RNA sequencing can reveal the complexity of mRNA processing and modification in full-length single molecule reads. These findings can refine Arabidopsis genome annotation. Further, applying this approach to less well-studied species could transform our understanding of what their genomes encode.


Adenosine/analogs & derivatives , Arabidopsis/genetics , RNA Processing, Post-Transcriptional , RNA, Messenger/genetics , RNA, Plant/genetics , Sequence Analysis, RNA , Adenosine/metabolism , Arabidopsis/metabolism , Gene Expression Profiling , Methylation , Nanopores , Poly A/genetics , Poly A/metabolism , RNA Caps , RNA Splicing , RNA, Messenger/chemistry , RNA, Messenger/metabolism , RNA, Plant/chemistry , RNA, Plant/metabolism , RNA, Untranslated/chemistry , RNA, Untranslated/genetics
15.
J Struct Biol ; 209(1): 107405, 2020 01 01.
Article En | MEDLINE | ID: mdl-31628985

Tetratricopeptide repeat (TPR) proteins belong to the class of α-solenoid proteins, in which repetitive units of α-helical hairpin motifs stack to form superhelical, often highly flexible structures. TPR domains occur in a wide variety of proteins, and perform key functional roles including protein folding, protein trafficking, cell cycle control and post-translational modification. Here, we look at the TPR domain of the enzyme O-linked GlcNAc-transferase (OGT), which catalyses O-GlcNAcylation of a broad range of substrate proteins. A number of single-point mutations in the TPR domain of human OGT have been associated with the disease Intellectual Disability (ID). By extended steered and equilibrium atomistic simulations, we show that the OGT-TPR domain acts as an elastic nanospring, and that each of the ID-related local mutations substantially affect the global dynamics of the TPR domain. Since the nanospring character of the OGT-TPR domain is key to its function in binding and releasing OGT substrates, these changes of its biomechanics likely lead to defective substrate interaction. We find that neutral mutations in the human population, selected by analysis of the gnomAD database, do not incur these changes. Our findings may not only help to explain the ID phenotype of the mutants, but also aid the design of TPR proteins with tailored biomechanical properties.


Intellectual Disability/genetics , N-Acetylglucosaminyltransferases/chemistry , N-Acetylglucosaminyltransferases/genetics , Point Mutation , Humans , Molecular Dynamics Simulation , N-Acetylglucosaminyltransferases/metabolism , Protein Conformation , Protein Domains , Tetratricopeptide Repeat
16.
Protein Sci ; 29(1): 277-297, 2020 01.
Article En | MEDLINE | ID: mdl-31710725

The Dundee Resource for Sequence Analysis and Structure Prediction (DRSASP; http://www.compbio.dundee.ac.uk/drsasp.html) is a collection of web services provided by the Barton Group at the University of Dundee. DRSASP's flagship services are the JPred4 webserver for secondary structure and solvent accessibility prediction and the JABAWS 2.2 webserver for multiple sequence alignment, disorder prediction, amino acid conservation calculations, and specificity-determining site prediction. DRSASP resources are available through conventional web interfaces and APIs but are also integrated into the Jalview sequence analysis workbench, which enables the composition of multitool interactive workflows. Other existing Barton Group tools are being brought under the banner of DRSASP, including NoD (Nucleolar localization sequence detector) and 14-3-3-Pred. New resources are being developed that enable the analysis of population genetic data in evolutionary and 3D structural contexts. Existing resources are actively developed to exploit new technologies and maintain parity with evolving web standards. DRSASP provides substantial computational resources for public use, and since 2016 DRSASP services have completed over 1.5 million jobs.


Computational Biology/methods , Proteins/chemistry , Sequence Analysis, Protein/methods , Protein Structure, Secondary , Sequence Alignment , Software , Web Browser
17.
Bioinformatics ; 35(18): 3372-3377, 2019 09 15.
Article En | MEDLINE | ID: mdl-30726870

MOTIVATION: RNA-seq experiments are usually carried out in three or fewer replicates. In order to work well with so few samples, differential gene expression (DGE) tools typically assume the form of the underlying gene expression distribution. In this paper, the statistical properties of gene expression from RNA-seq are investigated in the complex eukaryote, Arabidopsis thaliana, extending and generalizing the results of previous work in the simple eukaryote Saccharomyces cerevisiae. RESULTS: We show that, consistent with the results in S. cerevisiae, more gene expression measurements in A. thaliana are consistent with being drawn from an underlying negative binomial distribution than either a log-normal distribution or a normal distribution, and that the size and complexity of the A. thaliana transcriptome does not influence the false positive rate performance of nine widely used DGE tools tested here. We therefore recommend the use of DGE tools that are based on the negative binomial distribution. AVAILABILITY AND IMPLEMENTATION: The raw data for the 17 WT Arabidopsis thaliana datasets are available from the European Nucleotide Archive (E-MTAB-5446). The processed and aligned data can be visualized in context using IGB (Freese et al., 2016), or downloaded directly, using our publicly available IGB quickload server at https://compbio.lifesci.dundee.ac.uk/arabidopsisQuickload/public_quickload/ under 'RNAseq>Froussios2019'. All scripts and commands are available from GitHub at https://github.com/bartongroup/KF_arabidopsis-GRNA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
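One way to see why a negative binomial model suits replicate counts is to estimate how far their variance exceeds the mean. The sketch below uses the common parameterisation var = μ + αμ² and a method-of-moments estimate of the dispersion α; it is an illustration with made-up counts, not the goodness-of-fit testing used in the paper, and the function name is hypothetical.

```python
import statistics


def nb_dispersion(counts):
    """Method-of-moments dispersion for var = mean + alpha * mean**2.

    counts: read counts for one gene across replicates (at least two values).
    Returns 0.0 when the counts are no more variable than a Poisson model
    would allow, or when the mean is zero.
    """
    mean = statistics.mean(counts)
    var = statistics.variance(counts)  # sample variance, n - 1 denominator
    if mean == 0:
        return 0.0
    return max(0.0, (var - mean) / mean ** 2)


# Hypothetical counts for one gene across four replicates.
alpha = nb_dispersion([10, 20, 30, 40])
```

A dispersion clearly above zero means the counts are overdispersed relative to a Poisson model, which is the situation the negative binomial distribution is designed to describe.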


Arabidopsis , Binomial Distribution , RNA-Seq , Sequence Analysis, RNA , Software
18.
Bioinformatics ; 34(11): 1939-1940, 2018 06 01.
Article En | MEDLINE | ID: mdl-29390042

Summary: JABAWS 2.2 is a computational framework that simplifies the deployment of web services for Bioinformatics. In addition to the five multiple sequence alignment (MSA) algorithms in JABAWS 1.0, JABAWS 2.2 includes three additional MSA programs (Clustal Omega, MSAprobs, GLprobs), four protein disorder prediction methods (DisEMBL, IUPred, Ronn, GlobPlot), 18 measures of protein conservation as implemented in AACon, and RNA secondary structure prediction by the RNAalifold program. JABAWS 2.2 can be deployed on a variety of in-house or hosted systems. JABAWS 2.2 web services may be accessed from the Jalview multiple sequence analysis workbench (Version 2.8 and later), as well as directly via the JABAWS command line interface (CLI) client. JABAWS 2.2 can be deployed on a local virtual server as a Virtual Appliance (VA) or simply as a Web Application Archive (WAR) for private use. Improvements in JABAWS 2.2 also include simplified installation and a range of utility tools for usage statistics collection, and web services querying and monitoring. The JABAWS CLI client has been updated to support all the new services and allow integration of JABAWS 2.2 services into conventional scripts. A public JABAWS 2 server has been in production since December 2011 and served over 800 000 analyses for users worldwide. Availability and implementation: JABAWS 2.2 is made freely available under the Apache 2 license and can be obtained from: http://www.compbio.dundee.ac.uk/jabaws. Contact: g.j.barton@dundee.ac.uk.


Computational Biology/methods , Nucleic Acid Conformation , RNA/metabolism , Software , Algorithms , Internet , Models, Molecular , Proteostasis Deficiencies , RNA/chemistry , Sequence Alignment , Sequence Analysis, Protein/methods , Sequence Analysis, RNA/methods
19.
PLoS One ; 12(12): e0190461, 2017.
Article En | MEDLINE | ID: mdl-29281737

[This corrects the article DOI: 10.1371/journal.pone.0184405.].

20.
Genome Med ; 9(1): 113, 2017 Dec 18.
Article En | MEDLINE | ID: mdl-29254494

The translation of personal genomics to precision medicine depends on the accurate interpretation of the multitude of genetic variants observed for each individual. However, even when genetic variants are predicted to modify a protein, their functional implications may be unclear. Many diseases are caused by genetic variants affecting important protein features, such as enzyme active sites or interaction interfaces. The scientific community has catalogued millions of genetic variants in genomic databases and thousands of protein structures in the Protein Data Bank. Mapping mutations onto three-dimensional (3D) structures enables atomic-level analyses of protein positions that may be important for the stability or formation of interactions; these may explain the effect of mutations and in some cases even open a path for targeted drug development. To accelerate progress in the integration of these data types, we held a two-day Gene Variation to 3D (GVto3D) workshop to report on the latest advances and to discuss unmet needs. The overarching goal of the workshop was to address the question: what can be done together as a community to advance the integration of genetic variants and 3D protein structures that could not be done by a single investigator or laboratory? Here we describe the workshop outcomes, review the state of the field, and propose the development of a framework with which to promote progress in this arena. The framework will include a set of standard formats, common ontologies, a common application programming interface to enable interoperation of the resources, and a Tool Registry to make it easy to find and apply the tools to specific analysis problems. Interoperability will enable integration of diverse data sources and tools and collaborative development of variant effect prediction methods.


Genome-Wide Association Study/methods , Polymorphism, Genetic , Protein Conformation , Sequence Analysis, Protein/methods , Algorithms , Congresses as Topic , Genome-Wide Association Study/standards , Humans , Sequence Analysis, Protein/standards
...