1.
Nucleic Acids Res ; 52(W1): W140-W147, 2024 Jul 05.
Article in English | MEDLINE | ID: mdl-38769064

ABSTRACT

Genomic variation can impact normal biological function in complex ways and so understanding variant effects requires a broad range of data to be coherently assimilated. Whilst the volume of human variant data and relevant annotations has increased, the corresponding increase in the breadth of participating fields, standards and versioning means that moving between genomic, coding, protein and structure positions is increasingly complex. In turn this makes investigating variants in diverse formats and assimilating annotations from different resources challenging. ProtVar addresses these issues to facilitate the contextualization and interpretation of human missense variation with unparalleled flexibility and ease of accessibility for use by the broadest range of researchers. By precalculating all possible variants in the human proteome it offers near instantaneous mapping between all relevant data types. It also combines data and analyses from a plethora of resources to bring together genomic, protein sequence and function annotations as well as structural insights and predictions to better understand the likely effect of missense variation in humans. It is offered as an intuitive web server (https://www.ebi.ac.uk/protvar) where data can be explored and downloaded, and can be accessed programmatically via an API.


Subject(s)
Mutation, Missense , Software , Humans , Databases, Protein , Molecular Sequence Annotation , Proteome/genetics , Proteins/genetics , Proteins/chemistry , Internet , Genomics/methods
2.
Bioinformatics ; 35(22): 4854-4856, 2019 11 01.
Article in English | MEDLINE | ID: mdl-31192369

ABSTRACT

MOTIVATION: Understanding the protein structural context and patterning on proteins of genomic variants can help to separate benign from pathogenic variants and reveal molecular consequences. However, mapping genomic coordinates to protein structures is non-trivial, complicated by alternative splicing and transcript evidence. RESULTS: Here we present VarMap, a web tool for mapping a list of chromosome coordinates to canonical UniProt sequences and associated protein 3D structures, including validation checks, and annotating them with structural information. AVAILABILITY AND IMPLEMENTATION: https://www.ebi.ac.uk/thornton-srv/databases/VarMap. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
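The core coordinate arithmetic behind mapping a chromosome position to a protein residue can be illustrated with a minimal sketch. This is not VarMap's implementation: it assumes a single transcript whose coding intervals are already known, and the function name, argument layout and example coordinates are all hypothetical. Real tools must additionally choose among alternative transcripts and validate the result against the UniProt sequence.

```python
def genomic_to_protein_position(position, cds_exons, strand="+"):
    """Map a 1-based genomic position to (residue number, codon position).

    cds_exons is a list of (start, end) coding intervals, 1-based and
    inclusive, for one hypothetical transcript. Returns None when the
    position falls outside the coding sequence.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    # Walk exons in transcription order: ascending on '+', descending on '-'.
    exons = sorted(cds_exons, reverse=(strand == "-"))
    coding_bases_before = 0
    for start, end in exons:
        if start <= position <= end:
            within = position - start if strand == "+" else end - position
            cds_index = coding_bases_before + within  # 0-based index in the CDS
            return cds_index // 3 + 1, cds_index % 3 + 1
        coding_bases_before += end - start + 1
    return None


# Toy transcript with two coding exons (coordinates are made up).
toy_exons = [(100, 105), (200, 208)]
print(genomic_to_protein_position(200, toy_exons))
print(genomic_to_protein_position(150, toy_exons))
```

Keeping the exon walk separate from the division by three makes it easy to see why intronic positions return no residue and why the strand changes the direction of counting.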


Subject(s)
Genomics , Software , Amino Acid Sequence , Databases, Protein , Molecular Sequence Annotation , Proteins
3.
RNA ; 22(12): 1893-1901, 2016 12.
Article in English | MEDLINE | ID: mdl-27793980

ABSTRACT

Mineral surfaces are often proposed as the sites of critical processes in the emergence of life. Clay minerals in particular are thought to play significant roles in the origin of life including polymerizing, concentrating, organizing, and protecting biopolymers. In these scenarios, the impact of minerals on biopolymer folding is expected to influence evolutionary processes. These processes include both the initial emergence of functional structures in the presence of the mineral and the subsequent transition away from the mineral-associated niche. The initial evolution of function depends upon the number and distribution of sequences capable of functioning in the presence of the mineral, and the transition to new environments depends upon the overlap between sequences that evolve on the mineral surface and sequences that can perform the same functions in the mineral's absence. To examine these processes, we evolved self-cleaving ribozymes in vitro in the presence or absence of Na-saturated montmorillonite clay mineral particles. Starting from a shared population of random sequences, RNA populations were evolved in parallel, along separate evolutionary trajectories. Comparative sequence analysis and activity assays show that the impact of this clay mineral on functional structure selection was minimal; it neither prevented common structures from emerging, nor did it promote the emergence of new structures. This suggests that montmorillonite does not improve RNA's ability to evolve functional structures; however, it also suggests that RNAs that do evolve in contact with montmorillonite retain the same structures in mineral-free environments, potentially facilitating an evolutionary transition away from a mineral-associated niche.


Subject(s)
Minerals/chemistry , RNA, Catalytic/chemistry , Aluminum Silicates , Clay , Surface Properties
4.
Methods ; 103: 57-67, 2016 07 01.
Article in English | MEDLINE | ID: mdl-26853327

ABSTRACT

The importance of elucidating the three dimensional structures of RNA molecules is becoming increasingly clear. However, traditional protein structural techniques such as NMR and X-ray crystallography have several important drawbacks when probing long RNA molecules. Single molecule Förster resonance energy transfer (smFRET) has emerged as a useful alternative as it allows native sequences to be probed in physiological conditions and allows multiple conformations to be probed simultaneously. This review serves to describe the method of generating a three dimensional RNA structure from smFRET data from the biochemical probing of the secondary structure to the computational refinement of the final model.
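The link between a measured FRET efficiency and an inter-dye distance, which underlies the distance restraints used in such modelling, follows the standard Förster relation E = 1 / (1 + (r/R0)^6). The sketch below only illustrates that relation; the Förster radius used in the example is a placeholder value, not one taken from the review, and the function names are hypothetical.

```python
def fret_efficiency(distance, forster_radius):
    """Transfer efficiency for a dye pair separated by `distance`.

    Both arguments must use the same length unit (for example angstroms).
    """
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


def distance_from_efficiency(efficiency, forster_radius):
    """Invert the Förster relation to recover the inter-dye distance."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    return forster_radius * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)


# Placeholder Förster radius; real values depend on the dye pair and conditions.
ASSUMED_R0 = 54.0
for efficiency in (0.2, 0.5, 0.8):
    print(efficiency, distance_from_efficiency(efficiency, ASSUMED_R0))
```

Because of the sixth-power dependence, efficiencies near 0 or 1 translate into poorly constrained distances, so restraints are most informative when the separation is close to the Förster radius.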


Subject(s)
Fluorescence Resonance Energy Transfer , RNA/chemistry , Base Sequence , Fluorescence Polarization , Fluorescent Dyes/chemistry , Models, Molecular , Nucleic Acid Conformation , RNA/ultrastructure , Staining and Labeling
5.
J Mol Evol ; 77(4): 159-69, 2013 Oct.
Article in English | MEDLINE | ID: mdl-23743923

ABSTRACT

Similarities and differences between amino acids define the rates at which they substitute for one another within protein sequences and the patterns by which these sequences form protein structures. However, there exist many ways to measure similarity, whether one considers the molecular attributes of individual amino acids, the roles that they play within proteins, or some nuanced contribution of each. One popular approach to representing these relationships is to divide the 20 amino acids of the standard genetic code into groups, thereby forming a simplified amino acid alphabet. Here, we develop a method to compare or combine different simplified alphabets, and apply it to 34 simplified alphabets from the scientific literature. We use this method to show that while different suggestions vary and agree in non-intuitive ways, they combine to reveal a consensus view of amino acid similarity that is clearly rooted in physico-chemistry.
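One simple way to compare or combine simplified alphabets is to treat each as a partition of the 20 amino acids and examine residue pairs. The sketch below is not the method developed in the article: it uses the Rand index for comparison and a co-grouping count for combination, and the two example alphabets are illustrative groupings invented for this example rather than any of the 34 published ones.

```python
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def parse_alphabet(groups):
    """Map each residue to its group index, checking the partition is valid."""
    mapping = {}
    for index, group in enumerate(groups):
        for residue in group:
            if residue in mapping:
                raise ValueError(f"{residue} appears in more than one group")
            mapping[residue] = index
    if set(mapping) != set(AMINO_ACIDS):
        raise ValueError("alphabet must cover exactly the 20 standard amino acids")
    return mapping


def rand_index(alphabet_a, alphabet_b):
    """Fraction of residue pairs that both alphabets treat the same way."""
    a, b = parse_alphabet(alphabet_a), parse_alphabet(alphabet_b)
    pairs = list(combinations(AMINO_ACIDS, 2))
    agreements = sum((a[x] == a[y]) == (b[x] == b[y]) for x, y in pairs)
    return agreements / len(pairs)


def co_grouping_counts(alphabets):
    """Count, for every residue pair, how many alphabets group it together."""
    counts = {pair: 0 for pair in combinations(AMINO_ACIDS, 2)}
    for groups in alphabets:
        mapping = parse_alphabet(groups)
        for x, y in counts:
            if mapping[x] == mapping[y]:
                counts[(x, y)] += 1
    return counts


# Illustrative alphabets only (assumptions, not taken from the literature survey).
HYDROPHOBIC_POLAR = ["ACFGILMPVWY", "DEHKNQRST"]
CHARGE_BASED = ["DE", "HKR", "ACFGILMNPQSTVWY"]

print(rand_index(HYDROPHOBIC_POLAR, CHARGE_BASED))
print(co_grouping_counts([HYDROPHOBIC_POLAR, CHARGE_BASED])[("D", "E")])
```

Pairs with high co-grouping counts across many alphabets are the ones a consensus view would place together, which is the intuition behind combining independent proposals.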


Subject(s)
Amino Acids/chemistry , Amino Acids/classification , Sequence Analysis, Protein/methods , Algorithms , Genetic Code , Proteins/chemistry , Sequence Alignment
6.
J Mol Biol ; 435(2): 167892, 2023 01 30.
Article in English | MEDLINE | ID: mdl-36410474

ABSTRACT

Constrained Coding Regions (CCRs) in the human genome have been derived from DNA sequencing data of large cohorts of healthy control populations, available in the Genome Aggregation Database (gnomAD) [1]. They identify regions depleted of protein-changing variants and thus identify segments of the genome that have been constrained during human evolution. By mapping these DNA-defined regions from genomic coordinates onto the corresponding protein positions and combining this information with protein annotations, we have explored the distribution of CCRs and compared their co-occurrence with different protein functional features, previously annotated at the amino acid level in public databases. As expected, our results reveal that functional amino acids involved in interactions with DNA/RNA, protein-protein contacts and catalytic sites are the protein features most likely to be highly constrained for variation in the control population. More surprisingly, we also found that linear motifs, linear interacting peptides (LIPs), disorder-order transitions upon binding with other protein partners and liquid-liquid phase separating (LLPS) regions are also strongly associated with high constraint for variability. We also compared intraspecies constraints in the human CCRs with interspecies conservation and functional residues to explore how such CCRs may contribute to the analysis of protein variants. As has been previously observed, CCRs are only weakly correlated with conservation, suggesting that intraspecies constraints complement interspecies conservation and can provide more information to interpret variant effects.


Subject(s)
Genome, Human , Open Reading Frames , Proteins , Humans , Base Sequence , Genome, Human/genetics , Genomics , Proteins/genetics , Chromosome Mapping
7.
Nat Commun ; 14(1): 7702, 2023 Dec 06.
Article in English | MEDLINE | ID: mdl-38057330

ABSTRACT

Loss-of-function of DDX3X is a leading cause of neurodevelopmental disorders (NDD) in females. DDX3X is also a somatically mutated cancer driver gene proposed to have tumour promoting and suppressing effects. We perform saturation genome editing of DDX3X, testing in vitro the functional impact of 12,776 nucleotide variants. We identify 3432 functionally abnormal variants, in three distinct classes. We train a machine learning classifier to identify functionally abnormal variants of NDD-relevance. This classifier has at least 97% sensitivity and 99% specificity to detect variants pathogenic for NDD, substantially out-performing in silico predictors, and resolving up to 93% of variants of uncertain significance. Moreover, functionally-abnormal variants can account for almost all of the excess nonsynonymous DDX3X somatic mutations seen in DDX3X-driven cancers. Systematic maps of variant effects generated in experimentally tractable cell types have the potential to transform clinical interpretation of both germline and somatic disease-associated variation.
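The sensitivity and specificity figures quoted for a classifier of this kind are computed from counts of correct and incorrect calls against known labels. The sketch below shows only that calculation; the labels are toy values invented for illustration, not data from the study, and the function name is hypothetical.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    truth[i] is True when variant i is known to be pathogenic;
    predicted[i] is True when the classifier calls it abnormal.
    """
    if len(truth) != len(predicted):
        raise ValueError("truth and predicted must have the same length")
    true_pos = sum(t and p for t, p in zip(truth, predicted))
    false_neg = sum(t and not p for t, p in zip(truth, predicted))
    true_neg = sum(not t and not p for t, p in zip(truth, predicted))
    false_pos = sum(not t and p for t, p in zip(truth, predicted))
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = true_pos / (true_pos + false_neg)  # pathogenic variants detected
    specificity = true_neg / (true_neg + false_pos)  # benign variants correctly cleared
    return sensitivity, specificity


# Toy labels only (assumption): three pathogenic and two benign variants.
toy_truth = [True, True, True, False, False]
toy_calls = [True, True, False, False, True]
print(sensitivity_specificity(toy_truth, toy_calls))
```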


Subject(s)
Neoplasms , Neurodevelopmental Disorders , Female , Humans , Gene Editing , Virulence , Neurodevelopmental Disorders/genetics , Neoplasms/genetics , Germ Cells , Germ-Line Mutation , DEAD-box RNA Helicases/genetics
8.
Protein Sci ; 29(1): 111-119, 2020 01.
Article in English | MEDLINE | ID: mdl-31606900

ABSTRACT

VarSite is a web server mapping known disease-associated variants from UniProt and ClinVar, together with natural variants from gnomAD, onto protein 3D structures in the Protein Data Bank. The analyses are primarily image-based and provide both an overview for each human protein, as well as a report for any specific variant of interest. The information can be useful in assessing whether a given variant might be pathogenic or benign. The structural annotations for each position in the protein include protein secondary structure, interactions with ligand, metal, DNA/RNA, or other protein, and various measures of a given variant's possible impact on the protein's function. The 3D locations of the disease-associated variants can be viewed interactively via the 3dmol.js JavaScript viewer, as well as in RasMol and PyMOL. Users can search for specific variants, or sets of variants, by providing the DNA coordinates of the base change(s) of interest. Additionally, various agglomerative analyses are given, such as the mapping of disease and natural variants onto specific Pfam or CATH domains. The server is freely accessible to all at: https://www.ebi.ac.uk/thornton-srv/databases/VarSite.


Subject(s)
Databases, Genetic , Proteins/chemistry , Proteins/genetics , Cloud Computing , Computational Biology , Genetic Predisposition to Disease , Genetic Variation , Humans , Models, Molecular , Protein Conformation , User-Computer Interface
9.
Science ; 362(6419): 1161-1164, 2018 12 07.
Article in English | MEDLINE | ID: mdl-30409806

ABSTRACT

We estimated the genome-wide contribution of recessive coding variation in 6040 families from the Deciphering Developmental Disorders study. The proportion of cases attributable to recessive coding variants was 3.6% in patients of European ancestry, compared with 50% explained by de novo coding mutations. It was higher (31%) in patients with Pakistani ancestry, owing to elevated autozygosity. Half of this recessive burden is attributable to known genes. We identified two genes not previously associated with recessive developmental disorders, KDM5B and EIF3F, and functionally validated them with mouse and cellular models. Our results suggest that recessive coding variants account for a small fraction of currently undiagnosed nonconsanguineous individuals, and that the roles of noncoding variants, incomplete penetrance, and polygenic mechanisms need further exploration.


Subject(s)
Developmental Disabilities/genetics , Genes, Recessive , Genetic Code , Genetic Variation , Penetrance , Animals , Disease Models, Animal , Eukaryotic Initiation Factor-3/genetics , Europe , Genome-Wide Association Study , Humans , Jumonji Domain-Containing Histone Demethylases/genetics , Mice , Nuclear Proteins/genetics , Pakistan , Phylogeny , Repressor Proteins/genetics
10.
PLoS One ; 8(6): e64624, 2013.
Article in English | MEDLINE | ID: mdl-23762242

ABSTRACT

We have detected a concentration of boron in martian clay far in excess of that in any previously reported extra-terrestrial object. This enrichment indicates that the chemistry necessary for the formation of ribose, a key component of RNA, could have existed on Mars since the formation of early clay deposits, contemporary to the emergence of life on Earth. Given the greater similarity of Earth and Mars early in their geological history, and the extensive disruption of Earth's earliest mineralogy by plate tectonics, we suggest that the conditions for prebiotic ribose synthesis may be better understood by further Mars exploration.


Subject(s)
Aluminum Silicates/chemistry , Boron/analysis , Extraterrestrial Environment/chemistry , Mars , Clay , Earth, Planet , Exobiology , Geology , Origin of Life
11.
Structure ; 21(6): 951-62, 2013 Jun 04.
Article in English | MEDLINE | ID: mdl-23685210

ABSTRACT

HIV-1 genomic RNA has a noncoding 5' region containing sequential conserved structural motifs that control many parts of the life cycle. Very limited data exist on their three-dimensional (3D) conformation and, hence, how they work structurally. To assemble a working model, we experimentally reassessed secondary structure elements of a 240-nt region and used single-molecule distances, derived from fluorescence resonance energy transfer, between defined locations in these elements as restraints to drive folding of the secondary structure into a 3D model with an estimated resolution below 10 Å. The folded 3D model satisfying the data is consistent with short nuclear-magnetic-resonance-solved regions and reveals previously unpredicted motifs, offering insight into earlier functional assays. It is a 3D representation of this entire region, with implications for RNA dimerization and protein binding during regulatory steps. The structural information of this highly conserved region of the virus has the potential to reveal promising therapeutic targets.


Subject(s)
HIV-1/genetics , Nucleic Acid Conformation , RNA, Viral/chemistry , Virus Assembly , Fluorescence Resonance Energy Transfer , Models, Molecular