ABSTRACT
The widespread TnpB proteins of the IS200/IS605 transposon family have recently emerged as the smallest RNA-guided nucleases capable of targeted genome editing in eukaryotic cells [1,2]. Bioinformatic analysis identified TnpB proteins as the likely predecessors of Cas12 nucleases [3-5], which along with Cas9 are widely used for targeted genome manipulation. Whereas Cas12 family nucleases are well characterized both biochemically and structurally [6], the molecular mechanism of TnpB remains unknown. Here we present the cryogenic electron microscopy structures of the Deinococcus radiodurans TnpB-reRNA (right-end transposon element-derived RNA) complex in DNA-bound and -free forms. The structures reveal the basic architecture of the TnpB nuclease and the molecular mechanism for DNA target recognition and cleavage, which is supported by biochemical experiments. Collectively, these results demonstrate that TnpB represents the minimal structural and functional core of the Cas12 protein family and provide a framework for developing TnpB-based genome editing tools.
Subject(s)
CRISPR-Associated Proteins , DNA Transposable Elements , Deinococcus , Endonucleases , Gene Editing , CRISPR-Associated Proteins/chemistry , CRISPR-Associated Proteins/classification , CRISPR-Associated Proteins/metabolism , CRISPR-Associated Proteins/ultrastructure , CRISPR-Cas Systems/genetics , Cryoelectron Microscopy , Deinococcus/enzymology , Deinococcus/genetics , DNA/chemistry , DNA/genetics , DNA/metabolism , DNA/ultrastructure , DNA Transposable Elements/genetics , Endonucleases/chemistry , Endonucleases/classification , Endonucleases/metabolism , Endonucleases/ultrastructure , Molecular Evolution , Gene Editing/methods , CRISPR-Cas Systems Guide RNA
ABSTRACT
Prokaryotic toxin-antitoxin (TA) systems are composed of a toxin capable of interfering with key cellular processes and its neutralizing antidote, the antitoxin. Here, we focus on the HEPN-MNT TA system encoded in the vicinity of a subtype I-D CRISPR-Cas system in the cyanobacterium Aphanizomenon flos-aquae. We show that HEPN acts as a toxic RNase, which cleaves off 4 nt from the 3' end in a subset of tRNAs, thereby interfering with translation. Surprisingly, we find that the MNT (minimal nucleotidyltransferase) antitoxin inhibits HEPN RNase through covalent di-AMPylation (diadenylylation) of a conserved tyrosine residue, Y109, in the active site loop. Furthermore, we present crystallographic snapshots of the di-AMPylation reaction at different stages that explain the mechanism of HEPN RNase inactivation. Finally, we propose that the HEPN-MNT system functions as a cellular ATP sensor that monitors ATP homeostasis and, at low ATP levels, releases active HEPN toxin.
Subject(s)
Antitoxins/genetics , Bacterial Toxins/genetics , Ribonucleases/genetics , Toxin-Antitoxin Systems/genetics , Adenosine Monophosphate/genetics , Antidotes/chemistry , Antitoxins/metabolism , Aphanizomenon/chemistry , Aphanizomenon/genetics , CRISPR-Cas Systems/genetics , Nucleotidyltransferases/genetics , Nucleotidyltransferases/metabolism , Ribonucleases/metabolism , Tyrosine/genetics
ABSTRACT
Transposition has a key role in reshaping genomes of all living organisms [1]. Insertion sequences of the IS200/IS605 and IS607 families [2] are among the simplest mobile genetic elements and contain only the genes that are required for their transposition and its regulation. These elements encode tnpA transposase, which is essential for mobilization, and often carry an accessory tnpB gene, which is dispensable for transposition. Although the role of TnpA in transposon mobilization of IS200/IS605 is well documented, the function of TnpB has remained largely unknown. It had been suggested that TnpB has a role in the regulation of transposition, although no mechanism for this has been established [3-5]. A bioinformatic analysis indicated that TnpB might be a predecessor of the CRISPR-Cas9/Cas12 nucleases [6-8]. However, no biochemical activities have been ascribed to TnpB. Here we show that TnpB of Deinococcus radiodurans ISDra2 is an RNA-directed nuclease that is guided by an RNA, derived from the right-end element of a transposon, to cleave DNA next to the 5'-TTGAT transposon-associated motif. We also show that TnpB could be reprogrammed to cleave DNA target sites in human cells. Together, this study expands our understanding of transposition mechanisms by highlighting the role of TnpB in transposition, experimentally confirms that TnpB is a functional progenitor of CRISPR-Cas nucleases and establishes TnpB as a prototype of a new system for genome editing.
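As an illustration of what "cleave DNA next to the 5'-TTGAT transposon-associated motif" means for target selection, the following minimal sketch scans one DNA strand for the motif and reports the sequence adjacent to it. It is not the authors' pipeline: the function name `find_tam_sites`, the assumption that the candidate target lies immediately 3' of the motif on the scanned strand, and the 20 nt window are all illustrative assumptions not stated in the abstract.

```python
# Motif taken from the abstract; everything else below is an illustrative assumption.
TAM = "TTGAT"


def find_tam_sites(seq, target_len=20):
    """Return (motif_start, adjacent_sequence) for every TAM occurrence.

    Assumptions (hypothetical, not from the abstract): the candidate target
    sits immediately 3' of the motif on the scanned strand, and its length
    is `target_len` nucleotides. Only the given strand is scanned.
    """
    seq = seq.upper()
    sites = []
    start = seq.find(TAM)
    while start != -1:
        target_start = start + len(TAM)
        target = seq[target_start:target_start + target_len]
        # Skip motifs too close to the end to have a full-length window.
        if len(target) == target_len:
            sites.append((start, target))
        # Step by one so overlapping motif occurrences are not missed.
        start = seq.find(TAM, start + 1)
    return sites


if __name__ == "__main__":
    demo = "AATTGAT" + "ACGT" * 5 + "GG"
    for position, target in find_tam_sites(demo):
        print(position, target)
```

A complete search would also scan the reverse complement; it is omitted here to keep the sketch short.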
Subject(s)
DNA Transposable Elements/genetics , Deinococcus/enzymology , Deinococcus/genetics , Deoxyribonuclease I/genetics , Deoxyribonuclease I/metabolism , RNA/genetics , Base Sequence , CRISPR-Associated Proteins/metabolism , CRISPR-Cas Systems , Escherichia coli/genetics , Gene Editing , HEK293 Cells , Humans , Nucleotide Motifs
ABSTRACT
Structure-resolved protein interactions with other proteins, peptides and nucleic acids are key for understanding molecular mechanisms. The PPI3D web server enables researchers to query preprocessed and clustered structural data, analyze the results and make homology-based inferences for protein interactions. PPI3D offers three interaction exploration modes: (i) all interactions for proteins homologous to the query, (ii) interactions between two proteins or their homologs and (iii) interactions within a specific PDB entry. The server allows interactive analysis of the identified interactions in both summarized and detailed manner. This includes protein annotations, structures, the interface residues and the corresponding contact surface areas. In addition, users can make inferences about residues at the interaction interface for the query protein(s) from the sequence alignments and homology models. The weekly updated PPI3D database includes all the interaction interfaces and binding sites from PDB, clustered based on both protein sequence and structural similarity, yielding non-redundant datasets without loss of alternative interaction modes. Consequently, the PPI3D users avoid being flooded with redundant information, a typical situation for intensely studied proteins. Furthermore, PPI3D provides a possibility to download user-defined sets of interaction interfaces and analyze them locally. The PPI3D web server is available at https://bioinformatics.lt/ppi3d.
Subject(s)
Internet , Software , Binding Sites , Protein Interaction Mapping , Protein Databases , Protein Binding , Peptides/chemistry , Peptides/metabolism , Molecular Models , Proteins/chemistry , Proteins/metabolism , Nucleic Acids/chemistry , Nucleic Acids/metabolism
ABSTRACT
Argonaute (Ago) proteins are present in all three domains of life (bacteria, archaea and eukaryotes). They use small (15-30 nucleotides) oligonucleotide guides to bind complementary nucleic acid targets and are responsible for gene expression regulation, mobile genome element silencing, and defence against viruses or plasmids. According to their domain organization, Agos are divided into long and short Agos. Long Agos found in prokaryotes (long-A and long-B pAgos) and eukaryotes (eAgos) comprise four major functional domains (N, PAZ, MID and PIWI) and two structural linker domains L1 and L2. The majority (∼60%) of pAgos are short pAgos, containing only the MID and inactive PIWI domains. Here we focus on the prokaryotic Argonaute AfAgo from Archaeoglobus fulgidus DSM4304. Although phylogenetically classified as a long-B pAgo, AfAgo contains only MID and catalytically inactive PIWI domains, akin to short pAgos. We show that AfAgo forms a heterodimeric complex with a protein encoded upstream in the same operon, which is a structural equivalent of the N-L1-L2 domains of long pAgos. This complex, structurally equivalent to a long PAZ-less pAgo, outperforms standalone AfAgo in guide RNA-mediated target DNA binding. Our findings provide a missing piece to one of the first and the most studied pAgos.
Subject(s)
Archaeal Proteins , Archaeoglobus fulgidus , Argonaute Proteins , Archaeoglobus fulgidus/metabolism , Argonaute Proteins/metabolism , Bacteria/genetics , Eukaryota/genetics , Prokaryotic Cells/metabolism , Protein Domains , CRISPR-Cas Systems Guide RNA , Archaeal Proteins/metabolism
ABSTRACT
Streptococcus thermophilus (St) type III-A CRISPR-Cas system restricts MS2 RNA phage and cuts RNA in vitro. However, the CRISPR array spacers match DNA phages, raising the question: does the St CRISPR-Cas system provide immunity by erasing phage mRNA and/or by eliminating invading DNA? We show that it does both. We find that (1) base-pairing between crRNA and target RNA activates single-stranded DNA (ssDNA) degradation by StCsm; (2) ssDNase activity is confined to the HD-domain of Cas10; (3) target RNA cleavage by the Csm3 RNase suppresses Cas10 DNase activity, ensuring temporal control of DNA degradation; and (4) base-pairing between crRNA 5'-handle and target RNA 3'-flanking sequence inhibits Cas10 ssDNase to prevent self-targeting. We propose that upon phage infection, crRNA-guided StCsm binding to the emerging transcript recruits Cas10 DNase to the actively transcribed phage DNA, resulting in degradation of both the transcript and phage DNA, but not the host DNA.
Subject(s)
CRISPR-Associated Proteins/metabolism , CRISPR-Cas Systems , Bacterial DNA/metabolism , Single-Stranded DNA/metabolism , Viral DNA/metabolism , Messenger RNA/metabolism , Viral RNA/metabolism , RNA-Directed DNA Polymerase/metabolism , Streptococcus thermophilus/metabolism , CRISPR-Associated Proteins/genetics , CRISPR-Associated Proteins/immunology , CRISPR-Cas Systems/immunology , Bacterial DNA/genetics , Bacterial DNA/immunology , Single-Stranded DNA/genetics , Single-Stranded DNA/immunology , Viral DNA/genetics , Viral DNA/immunology , Escherichia coli/genetics , Escherichia coli/immunology , Escherichia coli/virology , Host-Pathogen Interactions , Molecular Models , Mutation , Nucleic Acid Conformation , Protein Conformation , RNA Cleavage , RNA Stability , Messenger RNA/genetics , Messenger RNA/immunology , Viral RNA/genetics , Viral RNA/immunology , RNA-Directed DNA Polymerase/genetics , Streptococcus thermophilus/genetics , Streptococcus thermophilus/immunology , Streptococcus thermophilus/virology , Time Factors
ABSTRACT
Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. A simple consensus score generated using the best performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier were created. Both approaches showed excellent performance, with an area under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.
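The evaluation idea described above (combine several interface scores into a consensus, then measure discrimination with the area under the ROC curve) can be sketched in a few lines. The abstract does not say how the consensus was formed, so the averaging of min-max normalized scores below is an assumption, as are the function names and the toy numbers; the AUC is computed with the standard pairwise (Mann-Whitney) identity.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve: the fraction of (positive, negative) pairs
    in which the positive item scores higher; ties count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def min_max(values):
    """Rescale values to [0, 1]; a constant column maps to 0.5."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def consensus(score_columns):
    """Hypothetical consensus: mean of min-max normalized scores per complex.
    Assumes every score is oriented so that higher means 'more physiological'."""
    normalized = [min_max(column) for column in score_columns]
    return [sum(values) / len(values) for values in zip(*normalized)]


if __name__ == "__main__":
    # Toy data: 1 = physiological dimer, 0 = non-physiological dimer.
    labels = [1, 1, 0, 0]
    score_a = [0.9, 0.4, 0.5, 0.1]
    score_b = [3.0, 8.0, 1.0, 5.0]
    print(roc_auc(score_a, labels))
    print(roc_auc(score_b, labels))
    print(roc_auc(consensus([score_a, score_b]), labels))
```

In this toy case each score misranks a different pair, so the averaged score separates the classes better than either alone; this only illustrates the mechanism and says nothing about the magnitudes reported in the abstract.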
Subject(s)
Proteins , Reproducibility of Results , Proteins/metabolism , Protein Binding
ABSTRACT
We present VoroIF-GNN (Voronoi InterFace Graph Neural Network), a novel method for assessing inter-subunit interfaces in a structural model of a protein-protein complex, relying solely on the input structure without any additional information. Given a multimeric protein structural model, we derive interface contacts from the Voronoi tessellation of atomic balls, construct a graph of those contacts, and predict the accuracy of every contact using an attention-based GNN. The contact-level predictions are then summarized to produce whole interface-level scores. VoroIF-GNN was blindly tested for its ability to estimate the accuracy of protein complexes during CASP15 and showed strong performance in selecting the best multimeric model out of many. The method implementation is freely available at https://kliment-olechnovic.github.io/voronota/expansion_js/.
Subject(s)
Computer Neural Networks , Proteins , Molecular Models , Proteins/chemistry
ABSTRACT
Proteins often function as part of permanent or transient multimeric complexes, and understanding function of these assemblies requires knowledge of their three-dimensional structures. While the ability of AlphaFold to predict structures of individual proteins with unprecedented accuracy has revolutionized structural biology, modeling structures of protein assemblies remains challenging. To address this challenge, we developed a protocol for predicting structures of protein complexes involving model sampling followed by scoring focused on the subunit-subunit interaction interface. In this protocol, we diversified AlphaFold models by varying construction and pairing of multiple sequence alignments as well as increasing the number of recycles. In cases when AlphaFold failed to assemble a full protein complex or produced unreliable results, additional diverse models were constructed by docking of monomers or subcomplexes. All the models were then scored using a newly developed method, VoroIF-jury, which relies only on structural information. Notably, VoroIF-jury is independent of AlphaFold self-assessment scores and therefore can be used to rank models originating from different structure prediction methods. We tested our protocol in CASP15 and obtained top results, significantly outperforming the standard AlphaFold-Multimer pipeline. Analysis of our results showed that the accuracy of our assembly models was capped mainly by structure sampling rather than model scoring. This observation suggests that better sampling, especially for the antibody-antigen complexes, may lead to further improvement. Our protocol is expected to be useful for modeling and/or scoring protein assemblies.
Subject(s)
Computational Biology , Proteins , Computational Biology/methods , Proteins/chemistry
ABSTRACT
SUMMARY: VoroContacts is a versatile tool for computing and analyzing contact surface areas (CSAs) and solvent accessible surface areas (SASAs) for three-dimensional (3D) structures of proteins, nucleic acids and their complexes at the atomic resolution. CSAs and SASAs are derived using Voronoi tessellation of 3D structure, represented as a collection of atomic balls. VoroContacts web server features a highly configurable query interface, which enables on-the-fly analysis of contacts for selected set of atoms and allows filtering interatomic contacts by their type, surface areas, distance between contacting atoms and sequence separation between contacting residues. The VoroContacts functionality is also implemented as part of the standalone Voronota package, enabling batch processing. AVAILABILITY AND IMPLEMENTATION: https://bioinformatics.lt/wtsam/vorocontacts. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subject(s)
Nucleic Acids , Software , Proteins/chemistry , Molecular Structure , Solvents
ABSTRACT
MOTIVATION: Effective use of evolutionary information has recently led to tremendous progress in computational prediction of three-dimensional (3D) structures of proteins and their complexes. Despite the progress, the accuracy of predicted structures tends to vary considerably from case to case. Since the utility of computational models depends on their accuracy, reliable estimates of deviation between predicted and native structures are of utmost importance. RESULTS: For the first time, we present a deep convolutional neural network (CNN) constructed on a Voronoi tessellation of 3D molecular structures. Despite the irregular data domain, our data representation allows us to efficiently introduce both convolution and pooling operations and train the network in an end-to-end fashion without precomputed descriptors. The resultant model, VoroCNN, predicts local qualities of 3D protein folds. The prediction results are competitive to state of the art and superior to the previous 3D CNN architectures built for the same task. We also discuss practical applications of VoroCNN, for example, in recognition of protein binding interfaces. AVAILABILITY AND IMPLEMENTATION: The model, data and evaluation tests are available at https://team.inria.fr/nano-d/software/vorocnn/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
Immunity against viruses and plasmids provided by CRISPR-Cas systems relies on a ribonucleoprotein effector complex that triggers the degradation of invasive nucleic acids (NA). Effector complexes of type I (Cascade) and II (Cas9-dual RNA) target foreign DNA. Intriguingly, the genetic evidence suggests that the type III-A Csm complex targets DNA, whereas biochemical data show that the type III-B Cmr complex cleaves RNA. Here we aimed to investigate NA specificity and mechanism of CRISPR interference for the Streptococcus thermophilus Csm (III-A) complex (StCsm). When expressed in Escherichia coli, two complexes of different stoichiometry copurified with 40 and 72 nt crRNA species, respectively. Both complexes targeted RNA and generated multiple cuts at 6 nt intervals. The Csm3 protein, present in multiple copies in both Csm complexes, acts as endoribonuclease. In the heterologous E. coli host, StCsm restricts MS2 RNA phage in a Csm3 nuclease-dependent manner. Thus, our results demonstrate that the type III-A StCsm complex guided by crRNA targets RNA and not DNA.
Subject(s)
Bacterial Proteins/metabolism , CRISPR-Associated Proteins/metabolism , Clustered Regularly Interspaced Short Palindromic Repeats , RNA Cleavage , Streptococcus thermophilus/genetics , Amino Acid Sequence , Bacterial Proteins/chemistry , Bacterial Proteins/genetics , Base Sequence , CRISPR-Associated Proteins/chemistry , CRISPR-Associated Proteins/genetics , Endoribonucleases/genetics , Endoribonucleases/metabolism , Molecular Sequence Data , Protein Binding , Protein Quaternary Structure , Small Angle Scattering , Streptococcus thermophilus/enzymology , X-Ray Diffraction
ABSTRACT
B-family DNA polymerases (PolBs) represent the most common replicases. PolB enzymes that require RNA (or DNA) primed templates for DNA synthesis are found in all domains of life and many DNA viruses. Despite extensive research on PolBs, their origins and evolution remain enigmatic. Massive accumulation of new genomic and metagenomic data from diverse habitats as well as availability of new structural information prompted us to conduct a comprehensive analysis of the PolB sequences, structures, domain organizations, taxonomic distribution and co-occurrence in genomes. Based on phylogenetic analysis, we identified a new, widespread group of bacterial PolBs that are more closely related to the catalytically active N-terminal half of the eukaryotic PolEpsilon (PolEpsilonN) than to Escherichia coli Pol II. In Archaea, we characterized six new groups of PolBs. Two of them show close relationships with eukaryotic PolBs, the first one with PolEpsilonN, and the second one with PolAlpha, PolDelta and PolZeta. In addition, structure comparisons suggested common origin of the catalytically inactive C-terminal half of PolEpsilon (PolEpsilonC) and PolAlpha. Finally, in certain archaeal PolBs we discovered C-terminal Zn-binding domains closely related to those of PolAlpha and PolEpsilonC. Collectively, the obtained results allowed us to propose a scenario for the evolution of eukaryotic PolBs.
Subject(s)
DNA Polymerase beta/chemistry , DNA Polymerase beta/classification , Eukaryota/enzymology , Molecular Evolution , Archaea/enzymology , Bacteria/enzymology , DNA Viruses/enzymology , Protein Databases
ABSTRACT
CRISPR-associated Rossmann Fold (CARF) and SMODS-associated and fused to various effector domains (SAVED) are key components of cyclic oligonucleotide-based antiphage signaling systems (CBASS) that sense cyclic oligonucleotides and transmit the signal to an effector inducing cell dormancy or death. Most of the CARFs are components of a CBASS built into type III CRISPR-Cas systems, where the CARF domain binds cyclic oligoA (cOA) synthesized by Cas10 polymerase-cyclase and allosterically activates the effector, typically a promiscuous ribonuclease. Additionally, this signaling pathway includes a ring nuclease, often also a CARF domain (either the sensor itself or a specialized enzyme) that cleaves cOA and mitigates dormancy or death induction. We present a comprehensive census of CARF and SAVED domains in bacteria and archaea, and their sequence- and structure-based classification. There are 10 major families of CARF domains and multiple smaller groups that differ in structural features, association with distinct effectors, and presence or absence of the ring nuclease activity. By comparative genome analysis, we predict specific functions of CARF and SAVED domains and partition the CARF domains into those with both sensor and ring nuclease functions, and sensor-only ones. Several families of ring nucleases functionally associated with sensor-only CARF domains are also predicted.
Subject(s)
Archaea/genetics , Archaeal Proteins/genetics , Bacteria/genetics , Bacterial Proteins/genetics , CRISPR-Cas Systems , Protein Domains , Archaea/enzymology , Archaeal Proteins/chemistry , Bacteria/enzymology , Bacterial Proteins/chemistry , Molecular Evolution
ABSTRACT
In recent years, CRISPR-associated (Cas) nucleases have revolutionized the genome editing field. Guided by an RNA to cleave double-stranded (ds) DNA targets near a short sequence termed a protospacer adjacent motif (PAM), Cas9 and Cas12 offer unprecedented flexibility; however, more compact versions would simplify delivery and extend application. Here, we present a collection of 10 exceptionally compact (422-603 amino acids) CRISPR-Cas12f nucleases that recognize and cleave dsDNA in a PAM-dependent manner. Categorized as class 2 type V-F, they originate from the previously identified Cas14 family and distantly related type V-U3 Cas proteins found in bacteria. Using biochemical methods, we demonstrate that a 5' T- or C-rich PAM sequence triggers dsDNA target cleavage. Based on this discovery, we evaluated whether they can protect against invading dsDNA in Escherichia coli and found that some, but not all, can. Altogether, our findings show that miniature Cas12f nucleases can protect against invading dsDNA like much larger class 2 CRISPR effectors and have the potential to be harnessed as programmable nucleases for genome editing.
Subject(s)
CRISPR-Associated Proteins/metabolism , Endodeoxyribonucleases/metabolism , DNA Cleavage , Escherichia coli/genetics , Gene Editing , Nucleotide Motifs , Plasmids/genetics
ABSTRACT
The goal of CASP experiments is to monitor the progress in the protein structure prediction field. During the 14th CASP edition we aimed to test our capabilities of predicting structures of protein complexes. Our protocol for modeling protein assemblies included both template-based modeling and free docking. Structural templates were identified using sensitive sequence-based searches. If sequence-based searches failed, we performed structure-based template searches using selected CASP server models. In the absence of reliable templates we applied free docking starting from monomers generated by CASP servers. We evaluated and ranked models of protein complexes using an improved version of our protein structure quality assessment method, VoroMQA, taking into account both interaction interface and global structure scores. If reliable templates could be identified, generally accurate models of protein assemblies were generated with the exception of an antibody-antigen interaction. The success of free docking mainly depended on the accuracy of initial subunit models and on the scoring of docking solutions. To put our overall results in perspective, we analyzed our performance in the context of other CASP groups. Although the subunits in our assembly models often were not of the top quality, these models had, overall, the best-predicted intersubunit interfaces according to several accuracy measures. We attribute our relative success primarily to the emphasis on the interaction interface when modeling and scoring.
Subject(s)
Molecular Models , Protein Conformation , Proteins , Software , Protein Structural Homology , Binding Sites , Computational Biology , Molecular Docking Simulation , Protein Interaction Domains and Motifs , Proteins/chemistry , Proteins/metabolism , Protein Sequence Analysis
ABSTRACT
Critical Assessment of Structure Prediction (CASP) is an organization aimed at advancing the state of the art in computing protein structure from sequence. In the spring of 2020, CASP launched a community project to compute the structures of the most structurally challenging proteins coded for in the SARS-CoV-2 genome. Forty-seven research groups submitted over 3000 three-dimensional models and 700 sets of accuracy estimates on 10 proteins. The resulting models were released to the public. CASP community members also worked together to provide estimates of local and global accuracy and identify structure-based domain boundaries for some proteins. Subsequently, two of these structures (ORF3a and ORF8) have been solved experimentally, allowing assessment of both model quality and the accuracy estimates. Models from the AlphaFold2 group were found to have good agreement with the experimental structures, with main chain GDT_TS accuracy scores ranging from 63 (a correct topology) to 87 (competitive with experiment).
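For readers unfamiliar with the GDT_TS scale quoted above, the score is conventionally the average, over distance cutoffs of 1, 2, 4 and 8 Å, of the percentage of residues whose modeled Cα atom lies within the cutoff of its experimental position. The sketch below is a simplification: it assumes the per-residue distances have already been measured after a single superposition (the full procedure searches for the best superposition at each cutoff), and the function name and example distances are illustrative.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-100 scale.

    `distances` holds one model-to-experiment C-alpha distance (in angstroms)
    per residue, measured after one fixed superposition. This is an
    assumption made for brevity; the standard score optimizes the
    superposition separately for each cutoff.
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    # Count residues within each cutoff, then average the four fractions.
    within = sum(1 for cutoff in cutoffs for d in distances if d <= cutoff)
    return 100.0 * within / (len(distances) * len(cutoffs))


if __name__ == "__main__":
    example = [0.5, 1.5, 3.0, 6.0, 10.0]  # made-up distances for five residues
    print(gdt_ts(example))
```

Because every residue within 1 Å is also within 2, 4 and 8 Å, well-modeled residues contribute to all four terms, which is why scores in the high 80s indicate near-experimental accuracy.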
Subject(s)
SARS-CoV-2/chemistry , Viral Proteins/chemistry , COVID-19/virology , Viral Genome , Humans , Molecular Models , Protein Conformation , Protein Domains , SARS-CoV-2/genetics , Viral Proteins/genetics , Viroporin Proteins/chemistry , Viroporin Proteins/genetics
ABSTRACT
OBJECTIVE: The study aimed to identify the genetic basis of partial gonadal dysgenesis (PGD) in a non-consanguineous family from Estonia. PATIENTS: Cousins P (proband) 1 (12 years; 46,XY) and P2 (18 years; 46,XY) presented bilateral cryptorchidism, severe penoscrotal hypospadias, low bitesticular volume and azoospermia in P2. Their distant relative, P3 (30 years; 46,XY), presented bilateral cryptorchidism and cryptozoospermia. DESIGN: Exome sequencing was targeted to P1-P3 and five unaffected family members. RESULTS: P1-P2 were identified as heterozygous carriers of NR5A1 c.991-1G > C. NR5A1 encodes the steroidogenic factor-1 essential in gonadal development and specifically expressed in adrenal, spleen, pituitary and testes. Together with a previous PGD case from Belgium (Robevska et al 2018), c.991-1G > C represents the first recurrent NR5A1 splice-site mutation identified in patients. The majority of previous reports on NR5A1 mutation carriers have not included phenotype-genotype data of the family members. Segregation analysis across three generations showed incomplete penetrance (<50%) and phenotypic variability among the carriers of NR5A1 c.991-1G > C. The variant pathogenicity was possibly modulated by rare heterozygous variants inherited from the other parent, OTX2 p.P134R (P1) or PROP1 c.301_302delAG (P2). For P3, the pedigree structure supported a distinct genetic cause. He carries a previously undescribed likely pathogenic variant SOS1 p.Y136H. SOS1, critical in Ras/MAPK signalling and foetal development, is a strong novel candidate gene for cryptorchidism. CONCLUSIONS: Detailed genetic profiling facilitates counselling and clinical management of the probands, and supports unaffected mutation carriers in the family for their reproductive decision making.
Subject(s)
46,XY Gonadal Dysgenesis , Penetrance , Steroidogenic Factor 1 , Population Biological Variation , 46,XY Gonadal Dysgenesis/genetics , Humans , Male , Mutation , Steroidogenic Factor 1/genetics , Testis
ABSTRACT
Bacterial Y-family DNA polymerases are usually classified into DinB (Pol IV), UmuC (the catalytic subunit of Pol V) and ImuB, a catalytically dead essential component of the ImuA-ImuB-DnaE2 mutasome. However, the true diversity of Y-family polymerases is unknown. Furthermore, for most of them the structures are unavailable and interactions are poorly characterized. To gain a better understanding of bacterial Y-family DNA polymerases, we performed a detailed computational study. It revealed substantial diversity, far exceeding traditional classification. We found that a large number of subfamilies feature a C-terminal extension next to the common Y-family region. Unexpectedly, in most C-terminal extensions we identified a region homologous to the N-terminal oligomerization motif of RecA. This finding implies a universal mode of interaction between Y-family members and RecA (or ImuA), in the case of Pol V strongly supported by experimental data. In gram-positive bacteria, we identified a putative Pol V counterpart composed of a Y-family polymerase, a YolD homolog and RecA. We also found ImuA-ImuB-DnaE2 variants lacking ImuA, but retaining active or inactive Y-family polymerase, a standalone ImuB C-terminal domain and/or DnaE2. In summary, our analyses revealed that, despite considerable diversity, bacterial Y-family polymerases share previously unanticipated similarities in their structural domains/motifs and interactions.
Subject(s)
DNA-Binding Proteins/genetics , DNA-Directed DNA Polymerase/genetics , Escherichia coli Proteins/genetics , Protein Conformation , Rec A Recombinases/genetics , Amino Acid Sequence/genetics , Catalytic Domain/genetics , Computational Biology , Cytoskeleton/chemistry , Cytoskeleton/genetics , DNA Polymerase III/chemistry , DNA Polymerase III/genetics , DNA-Binding Proteins/chemistry , DNA-Directed DNA Polymerase/chemistry , DNA-Directed DNA Polymerase/classification , Escherichia coli/enzymology , Escherichia coli/genetics , Escherichia coli Proteins/chemistry , Molecular Models , Rec A Recombinases/chemistry
ABSTRACT
The VoroMQA (Voronoi tessellation-based Model Quality Assessment) web server is dedicated to the estimation of protein structure quality, a common step in selecting realistic and most accurate computational models and in validating experimental structures. As an input, the VoroMQA web server accepts one or more protein structures in PDB format. Input structures may be either monomeric proteins or multimeric protein complexes. For every input structure, the server provides both global and local (per-residue) scores. Visualization of the local scores along the protein chain is enhanced by providing secondary structure assignment and information on solvent accessibility. A unique feature of the VoroMQA server is the ability to directly assess protein-protein interaction interfaces. If this type of assessment is requested, the web server provides interface quality scores, interface energy estimates, and local scores for residues involved in inter-chain interfaces. VoroMQA, the underlying method of the web server, was extensively tested in recent community-wide CASP and CAPRI experiments. During these experiments VoroMQA showed outstanding performance both in model selection and in estimation of accuracy of local structural regions. The VoroMQA web server is available at http://bioinformatics.ibt.lt/wtsam/voromqa.