Results 1 - 20 of 61
1.
Bio Protoc ; 11(14): e4100, 2021 Jul 20.
Article in English | MEDLINE | ID: mdl-34395736

ABSTRACT

Efficient precision genome engineering requires high frequency and specificity of integration at the genomic target site. Multiple design strategies for zebrafish gene targeting have previously been reported, with widely varying frequencies for germline recovery of integration alleles. The GeneWeld protocol and pGTag (plasmids for Gene Tagging) vector series provide a set of resources to streamline precision gene targeting in zebrafish. Our approach uses short homology of 24-48 bp to drive targeted integration of DNA reporter cassettes by homology-mediated end joining (HMEJ) at a CRISPR/Cas-induced DNA double-strand break. The pGTag vectors contain reporters flanked by a universal CRISPR sgRNA sequence to liberate the targeting cassette in vivo and expose homology arms for homology-driven integration. Germline transmission rates for precision-targeted integration alleles range from 22% to 100%. Our system provides a streamlined, straightforward, and cost-effective approach for high-efficiency gene targeting applications in zebrafish. Graphic abstract: GeneWeld method for CRISPR/Cas9 targeted integration.
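The short-homology idea in this abstract can be illustrated with a minimal sketch that cuts out the sequence flanking a break site. The function name `homology_arms`, the blunt-cut coordinate convention, and the toy sequence are assumptions made for illustration; the abstract gives only the 24-48 bp range, and actual oligonucleotide design for the pGTag vectors involves details not stated here.

```python
def homology_arms(genome, cut_index, arm_len=24):
    """Return (upstream_arm, downstream_arm) flanking a blunt cut.

    cut_index is the index of the first base downstream of the break
    (hypothetical convention chosen for this sketch).
    """
    if not 24 <= arm_len <= 48:
        raise ValueError("arm_len is outside the 24-48 bp range quoted in the abstract")
    if cut_index - arm_len < 0 or cut_index + arm_len > len(genome):
        raise ValueError("not enough flanking sequence for the requested arm length")
    upstream = genome[cut_index - arm_len:cut_index]
    downstream = genome[cut_index:cut_index + arm_len]
    return upstream, downstream


# Toy sequence: 30 A's followed by 30 C's, cut exactly at the boundary.
toy_genome = "A" * 30 + "C" * 30
up_arm, down_arm = homology_arms(toy_genome, cut_index=30, arm_len=24)
```

The length check is kept explicit so that a request outside the published range fails loudly instead of silently producing arms the protocol was not described for.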

2.
Elife ; 9, 2020 05 15.
Article in English | MEDLINE | ID: mdl-32412410

ABSTRACT

Efficient precision genome engineering requires high frequency and specificity of integration at the genomic target site. Here, we describe a set of resources to streamline reporter gene knock-ins in zebrafish and demonstrate the broader utility of the method in mammalian cells. Our approach uses short homology of 24-48 bp to drive targeted integration of DNA reporter cassettes by homology-mediated end joining (HMEJ) at high frequency at a double strand break in the targeted gene. Our vector series, pGTag (plasmids for Gene Tagging), contains reporters flanked by a universal CRISPR sgRNA sequence which enables in vivo liberation of the homology arms. We observed high rates of germline transmission (22-100%) for targeted knock-ins at eight zebrafish loci and efficient integration at safe harbor loci in porcine and human cells. Our system provides a straightforward and cost-effective approach for high efficiency gene targeting applications in CRISPR and TALEN compatible systems.


Subject(s)
CRISPR-Associated Proteins/genetics , CRISPR-Cas Systems , Clustered Regularly Interspaced Short Palindromic Repeats , Gene Knock-In Techniques , Genes, Reporter , Green Fluorescent Proteins/genetics , Transcription Activator-Like Effector Nucleases/genetics , Zebrafish/genetics , Animals , Animals, Genetically Modified , CRISPR-Associated Proteins/metabolism , Fibroblasts/metabolism , Gene Expression Regulation , Green Fluorescent Proteins/metabolism , Humans , K562 Cells , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/genetics , Leukemia, Myelogenous, Chronic, BCR-ABL Positive/metabolism , RNA, Guide, Kinetoplastida/genetics , RNA, Guide, Kinetoplastida/metabolism , Recombinational DNA Repair , Sequence Homology, Nucleic Acid , Sus scrofa , Transcription Activator-Like Effector Nucleases/metabolism
3.
CRISPR J ; 2(6): 417-433, 2019 12.
Article in English | MEDLINE | ID: mdl-31742435

ABSTRACT

CRISPR and CRISPR-Cas effector proteins enable the targeting of DNA double-strand breaks to defined loci based on a variable length RNA guide specific to each effector. The guide RNAs are generally similar in size and form, consisting of a ∼20 nucleotide sequence complementary to the DNA target and an RNA secondary structure recognized by the effector. However, the effector proteins vary in protospacer adjacent motif requirements, nuclease activities, and DNA binding kinetics. Recently, ErCas12a, a new member of the Cas12a family, was identified in Eubacterium rectale. Here, we report the first characterization of ErCas12a activity in zebrafish and expand on previously reported activity in human cells. Using a fluorescent reporter system, we show that CRISPR-ErCas12a elicits strand annealing mediated DNA repair more efficiently than CRISPR-Cas9. Further, using our previously reported gene targeting method that utilizes short homology, GeneWeld, we demonstrate the use of CRISPR-ErCas12a to integrate reporter alleles into the genomes of both zebrafish and human cells. Together, this work provides methods for deploying an additional CRISPR-Cas system, thus increasing the flexibility researchers have in applying genome engineering technologies.


Subject(s)
CRISPR-Cas Systems/genetics , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Gene Editing/methods , Animals , Base Sequence , CRISPR-Associated Proteins/genetics , DNA/chemistry , Gene Targeting/methods , Genetic Engineering/methods , Genome/genetics , Humans , RNA/chemistry , RNA, Guide, Kinetoplastida/chemistry , Zebrafish/genetics
4.
Nucleic Acids Res ; 47(W1): W175-W182, 2019 07 02.
Article in English | MEDLINE | ID: mdl-31127311

ABSTRACT

The discovery and development of DNA-editing nucleases (Zinc Finger Nucleases, TALENs, CRISPR/Cas systems) has given scientists the ability to precisely engineer or edit genomes as never before. Several different platforms, protocols and vectors for precision genome editing are now available, leading to the development of supporting web-based software. Here we present the Gene Sculpt Suite (GSS), which comprises three tools: (i) GTagHD, which automatically designs and generates oligonucleotides for use with the GeneWeld knock-in protocol; (ii) MEDJED, a machine learning method, which predicts the extent to which a double-stranded DNA break site will utilize the microhomology-mediated repair pathway; and (iii) MENTHU, a tool for identifying genomic locations likely to give rise to a single predominant microhomology-mediated end joining allele (PreMA) repair outcome. All tools in the GSS are freely available for download under the GPL v3.0 license and can be run locally on Windows, Mac and Linux systems capable of running R and/or Docker. The GSS is also freely available online at www.genesculpt.org.


Subject(s)
Databases, Genetic , Gene Editing , Genetic Engineering/methods , Software , Animals , CRISPR-Cas Systems/genetics , DNA Breaks, Double-Stranded , Humans , Transcription Activator-Like Effector Nucleases/genetics , Zinc Finger Nucleases/genetics
5.
Proteins ; 87(3): 198-211, 2019 03.
Article in English | MEDLINE | ID: mdl-30536635

ABSTRACT

RNA-protein interactions play essential roles in regulating gene expression. While some RNA-protein interactions are "specific", that is, the RNA-binding proteins preferentially bind to particular RNA sequence or structural motifs, others are "non-RNA specific." Deciphering the protein-RNA recognition code is essential for comprehending the functional implications of these interactions and for developing new therapies for many diseases. Because of the high cost of experimental determination of protein-RNA interfaces, there is a need for computational methods to identify RNA-binding residues in proteins. While most of the existing computational methods for predicting RNA-binding residues in RNA-binding proteins are oblivious to the characteristics of the partner RNA, there is growing interest in methods for partner-specific prediction of RNA binding sites in proteins. In this work, we assess the performance of two recently published partner-specific protein-RNA interface prediction tools, PS-PRIP, and PRIdictor, along with our own new tools. Specifically, we introduce a novel metric, RNA-specificity metric (RSM), for quantifying the RNA-specificity of the RNA binding residues predicted by such tools. Our results show that the RNA-binding residues predicted by previously published methods are oblivious to the characteristics of the putative RNA binding partner. Moreover, when evaluated using partner-agnostic metrics, RNA partner-specific methods are outperformed by the state-of-the-art partner-agnostic methods. We conjecture that either (a) the protein-RNA complexes in PDB are not representative of the protein-RNA interactions in nature, or (b) the current methods for partner-specific prediction of RNA-binding residues in proteins fail to account for the differences in RNA partner-specific versus partner-agnostic protein-RNA interactions, or both.


Subject(s)
Computational Biology , Proteins/chemistry , RNA-Binding Proteins/genetics , RNA/genetics , Amino Acid Sequence/genetics , Base Sequence/genetics , Binding Sites/genetics , Models, Molecular , Protein Binding/genetics , Protein Conformation , Proteins/genetics , RNA/chemistry , RNA-Binding Motifs/genetics , RNA-Binding Proteins/chemistry , Sequence Analysis, Protein , Software
6.
PLoS Genet ; 14(9): e1007652, 2018 09.
Article in English | MEDLINE | ID: mdl-30208061

ABSTRACT

One key problem in precision genome editing is the unpredictable plurality of sequence outcomes at the site of targeted DNA double-stranded breaks (DSBs). This is due to the typical activation of the versatile Non-homologous End Joining (NHEJ) pathway. Such unpredictability limits the utility of somatic gene editing for applications including gene therapy and functional genomics. For germline editing work, the accurate reproduction of identical alleles using NHEJ is a labor-intensive process. In this study, we propose Microhomology-mediated End Joining (MMEJ) as a viable solution for improving somatic sequence homogeneity in vivo, capable of generating a single predictable allele at high rates (56-86% of the entire mutant allele pool). Using a combined dataset from zebrafish (Danio rerio) in vivo and human HeLa cells in vitro, we identified specific contextual sequence determinants surrounding genomic DSBs for robust MMEJ pathway activation. We then applied our observation to prospectively design MMEJ-inducing sgRNAs against a variety of proof-of-principle genes and demonstrated high levels of mutant allele homogeneity. MMEJ-based DNA repair at these target loci successfully generated F0 mutant zebrafish embryos and larvae that faithfully recapitulated previously reported, recessive, loss-of-function phenotypes. We also tested the generalizability of our approach in cultured human cells. Finally, we provide a novel algorithm, MENTHU (http://genesculpt.org/menthu/), for improved and facile prediction of candidate MMEJ loci. We believe that this MMEJ-centric approach will have a broader impact on genome engineering and its applications. For example, whereas somatic mosaicism hinders efficient recreation of a knockout mutant allele at base pair resolution via the standard NHEJ-based approach, we demonstrate that F0 founders transmitted the identical MMEJ allele of interest at high rates. 
Most importantly, the ability to directly dictate the reading frame of an endogenous target will have important implications for gene therapy applications in human genetic diseases.
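As a rough illustration of what a microhomology around a double-strand break looks like in sequence terms, the sketch below enumerates short repeats that occur once on each side of a cut and reports the deletion allele each would leave behind. This is not the MENTHU algorithm or its scoring; the function name, the window size, the minimum repeat length, and the toy sequence are all assumptions.

```python
def microhomology_deletions(seq, cut, min_len=3, window=20):
    """Enumerate candidate microhomology-mediated deletions.

    The break lies between seq[cut - 1] and seq[cut]. Each result is
    (microhomology, deletion_length, repaired_sequence), where the repaired
    sequence keeps a single copy of the repeat.
    """
    left_start = max(0, cut - window)
    right_end = min(len(seq), cut + window)
    results = []
    for i in range(left_start, cut - min_len + 1):
        for j in range(cut, right_end - min_len + 1):
            # Extend the match while staying on the correct side of the break.
            k = 0
            while i + k < cut and j + k < right_end and seq[i + k] == seq[j + k]:
                k += 1
            if k < min_len:
                continue
            # Skip matches that are just a suffix of a longer match found earlier.
            if i > left_start and j > cut and seq[i - 1] == seq[j - 1]:
                continue
            repaired = seq[:i + k] + seq[j + k:]
            results.append((seq[i:i + k], j - i, repaired))
    return results


# Toy target: an ACGT repeat on each side of a cut placed at index 9.
toy_target = "TTTACGTGGGGGACGTCCC"
candidates = microhomology_deletions(toy_target, cut=9)
```

A site with one long, dominant candidate is the kind of situation the abstract describes as giving a single predictable allele, although deciding which candidate dominates needs the context-dependent rules reported in the paper.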


Subject(s)
DNA Breaks, Double-Stranded , DNA End-Joining Repair/genetics , Gene Editing/methods , Models, Genetic , Algorithms , Alleles , Animals , Feasibility Studies , Female , Genetic Diseases, Inborn/genetics , Genetic Diseases, Inborn/therapy , Genetic Therapy/methods , HeLa Cells , Humans , Male , Mutagenesis, Site-Directed , RNA, Guide, Kinetoplastida/genetics , RNA, Guide, Kinetoplastida/metabolism , Zebrafish
7.
Retrovirology ; 14(1): 40, 2017 Aug 22.
Article in English | MEDLINE | ID: mdl-28830558

ABSTRACT

BACKGROUND: Rev-like proteins are post-transcriptional regulatory proteins found in several retrovirus genera, including lentiviruses, betaretroviruses, and deltaretroviruses. These essential proteins mediate the nuclear export of incompletely spliced viral RNA, and act by tethering viral pre-mRNA to the host CRM1 nuclear export machinery. Although all Rev-like proteins are functionally homologous, they share less than 30% sequence identity. In the present study, we computationally assessed the extent of structural homology among retroviral Rev-like proteins within a phylogenetic framework. RESULTS: We undertook a comprehensive analysis of overall protein domain architecture and predicted secondary structural features for representative members of the Rev-like family of proteins. Similar patterns of α-helical domains were identified for Rev-like proteins within each genus, with the exception of deltaretroviruses, which were devoid of α-helices. Coiled-coil oligomerization motifs were also identified for most Rev-like proteins, with the notable exceptions of HIV-1, the deltaretroviruses, and some small ruminant lentiviruses. In Rev proteins of primate lentiviruses, the presence of predicted coiled-coil motifs segregated within specific primate lineages: HIV-1 descended from SIVs that lacked predicted coiled-coils in Rev whereas HIV-2 descended from SIVs that contained predicted coiled-coils in Rev. Phylogenetic ancestral reconstruction of coiled-coils for all Rev-like proteins predicted a single origin for the coiled-coil motif, followed by three losses of the predicted signal. The absence of a coiled-coil signal in HIV-1 was associated with replacement of canonical polar residues with non-canonical hydrophobic residues. However, hydrophobic residues were retained in the key 'a' and 'd' positions, and the α-helical region of HIV-1 Rev oligomerization domain could be modeled as a helical wheel with two predicted interaction interfaces. 
Moreover, the predicted interfaces mapped to the dimerization and oligomerization interfaces in HIV-1 Rev crystal structures. Helical wheel projections of other retroviral Rev-like proteins, including endogenous sequences, revealed similar interaction interfaces that could mediate oligomerization. CONCLUSIONS: Sequence-based computational analyses of Rev-like proteins, together with helical wheel projections of oligomerization domains, reveal a conserved homogeneous structural basis for oligomerization by retroviral Rev-like proteins.


Subject(s)
Gene Products, rev/chemistry , Gene Products, rev/metabolism , Models, Molecular , Retroviridae/chemistry , Retroviridae/metabolism , Amino Acid Sequence , Dimerization , Genetic Variation , Phylogeny , Protein Structure, Secondary , Retroviridae Proteins/chemistry , Retroviridae Proteins/metabolism , Sequence Homology, Amino Acid
8.
Methods Mol Biol ; 1543: 169-185, 2017.
Article in English | MEDLINE | ID: mdl-28349426

ABSTRACT

Experimental methods for identifying protein(s) bound by a specific promoter-associated RNA (paRNA) of interest can be expensive, difficult, and time-consuming. This chapter describes a general computational framework for identifying potential binding partners in RNA-protein complexes or RNA-protein interaction networks. Protocols for using three web-based tools to predict RNA-protein interaction partners are outlined. Also, tables listing additional webservers and software tools for predicting RNA-protein interactions, as well as databases that contain valuable information about known RNA-protein complexes and recognition sites for RNA-binding proteins, are provided. Although only one of the tools described, lncPro, was designed expressly to identify proteins that bind long noncoding RNAs (including paRNAs), all three approaches can be applied to predict potential binding partners for both coding and noncoding RNAs (ncRNAs).


Subject(s)
Computational Biology/methods , RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , RNA/chemistry , RNA/metabolism , Software , Binding Sites , Computer Simulation , Databases, Genetic , Protein Binding , RNA/genetics , Search Engine , Support Vector Machine , Web Browser
9.
Brief Bioinform ; 18(3): 458-466, 2017 05 01.
Article in English | MEDLINE | ID: mdl-27013645

ABSTRACT

Although many advanced and sophisticated ab initio approaches for modeling protein-protein complexes have been proposed in past decades, template-based modeling (TBM) remains the most accurate and widely used approach, given a reliable template is available. However, there are many different ways to exploit template information in the modeling process. Here, we systematically evaluate and benchmark a TBM method that uses conserved interfacial residue pairs as docking distance restraints [referred to as alpha carbon-alpha carbon (CA-CA)-guided docking]. We compare it with two other template-based protein-protein modeling approaches, including a conserved non-pairwise interfacial residue restrained docking approach [referred to as the ambiguous interaction restraint (AIR)-guided docking] and a simple superposition-based modeling approach. Our results show that, for most cases, the CA-CA-guided docking method outperforms both superposition with refinement and the AIR-guided docking method. We emphasize the superiority of the CA-CA-guided docking on cases with medium to large conformational changes, and interactions mediated through loops, tails or disordered regions. Our results also underscore the importance of a proper refinement of superimposition models to reduce steric clashes. In summary, we provide a benchmarked TBM protocol that uses conserved pairwise interface distance as restraints in generating realistic 3D protein-protein interaction models, when reliable templates are available. The described CA-CA-guided docking protocol is based on the HADDOCK platform, which allows users to incorporate additional prior knowledge of the target system to further improve the quality of the resulting models.
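The idea of turning template interface contacts into pairwise alpha-carbon distance restraints can be sketched as follows. The function name, the 10 Å cutoff used to decide which residue pairs count as interfacial, and the toy coordinates are assumptions; the abstract does not specify how conserved pairs are selected, and the HADDOCK restraint file format is not reproduced here.

```python
import math


def ca_ca_restraints(ca_chain_a, ca_chain_b, cutoff=10.0):
    """Derive CA-CA distance restraints from a template complex.

    ca_chain_a, ca_chain_b: dicts mapping residue number -> (x, y, z) of the
    CA atom in each template chain. Returns (res_a, res_b, distance) tuples
    for every pair within the (assumed) cutoff, in residue order.
    """
    restraints = []
    for res_a, xyz_a in sorted(ca_chain_a.items()):
        for res_b, xyz_b in sorted(ca_chain_b.items()):
            distance = math.dist(xyz_a, xyz_b)
            if distance <= cutoff:
                restraints.append((res_a, res_b, round(distance, 2)))
    return restraints


# Toy template: residue 1 of chain A sits 5 Å from residue 5 of chain B.
toy_chain_a = {1: (0.0, 0.0, 0.0), 2: (20.0, 0.0, 0.0)}
toy_chain_b = {5: (3.0, 4.0, 0.0)}
toy_restraints = ca_ca_restraints(toy_chain_a, toy_chain_b)
```

Because each restraint names a specific residue pair, it carries more geometric information than a list of interface residues on each partner, which is consistent with the contrast the abstract draws between pairwise and non-pairwise (ambiguous) restraints.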


Subject(s)
Proteins/metabolism , Models, Molecular , Protein Binding
10.
Methods Mol Biol ; 1484: 205-235, 2017.
Article in English | MEDLINE | ID: mdl-27787829

ABSTRACT

Identifying individual residues in the interfaces of protein-RNA complexes is important for understanding the molecular determinants of protein-RNA recognition and has many potential applications. Recent technical advances have led to several high-throughput experimental methods for identifying partners in protein-RNA complexes, but determining RNA-binding residues in proteins is still expensive and time-consuming. This chapter focuses on available computational methods for identifying which amino acids in an RNA-binding protein participate directly in contacting RNA. Step-by-step protocols for using three different web-based servers to predict RNA-binding residues are described. In addition, currently available web servers and software tools for predicting RNA-binding sites, as well as databases that contain valuable information about known protein-RNA complexes, RNA-binding motifs in proteins, and protein-binding recognition sites in RNA are provided. We emphasize sequence-based methods that can reliably identify interfacial residues without the requirement for structural information regarding either the RNA-binding protein or its RNA partner.


Subject(s)
Proteins/genetics , RNA-Binding Proteins/genetics , Software , Algorithms , Amino Acid Sequence/genetics , Binding Sites , Computational Biology , Protein Binding , Proteins/chemistry , RNA-Binding Proteins/chemistry
11.
Methods Mol Biol ; 1484: 255-264, 2017.
Article in English | MEDLINE | ID: mdl-27787831

ABSTRACT

Antibody-protein interactions play a critical role in the humoral immune response. B-cells secrete antibodies, which bind antigens (e.g., cell surface proteins of pathogens). The specific parts of antigens that are recognized by antibodies are called B-cell epitopes. These epitopes can be linear, corresponding to a contiguous amino acid sequence fragment of an antigen, or conformational, in which residues critical for recognition may not be contiguous in the primary sequence, but are in close proximity within the folded protein 3D structure. Identification of B-cell epitopes in target antigens is one of the key steps in epitope-driven subunit vaccine design, immunodiagnostic tests, and antibody production. In silico bioinformatics techniques offer a promising and cost-effective approach for identifying potential B-cell epitopes in a target vaccine candidate. In this chapter, we show how to utilize online B-cell epitope prediction tools to identify linear B-cell epitopes from the primary amino acid sequence of proteins.


Subject(s)
Computational Biology/methods , Epitope Mapping/methods , Proteins/genetics , Amino Acid Sequence/genetics , Antibodies/genetics , Antibodies/immunology , Antigens/genetics , Antigens/immunology , B-Lymphocytes/immunology , Computer Simulation , Epitopes/genetics , Epitopes/immunology , Proteins/chemistry , Proteins/immunology
12.
New Phytol ; 212(2): 444-60, 2016 Oct.
Article in English | MEDLINE | ID: mdl-27265684

ABSTRACT

Heterodera glycines, the soybean cyst nematode, delivers effector proteins into soybean roots to initiate and maintain an obligate parasitic relationship. HgGLAND18 encodes a candidate H. glycines effector and is expressed throughout the infection process. We used a combination of molecular, genetic, bioinformatic and phylogenetic analyses to determine the role of HgGLAND18 during H. glycines infection. HgGLAND18 is necessary for pathogenicity in compatible interactions with soybean. The encoded effector strongly suppresses both basal and hypersensitive cell death innate immune responses, and immunosuppression requires the presence and coordination between multiple protein domains. The N-terminal domain in HgGLAND18 contains unique sequence similarity to domains of an immunosuppressive effector of Plasmodium spp., the malaria parasites. The Plasmodium effector domains functionally complement the loss of the N-terminal domain from HgGLAND18. In-depth sequence searches and phylogenetic analyses demonstrate convergent evolution between effectors from divergent parasites of plants and animals as the cause of sequence and functional similarity.


Subject(s)
Glycine max/immunology , Glycine max/parasitology , Immunity, Innate , Plant Immunity , Plasmodium/physiology , Tylenchoidea/physiology , Virulence Factors/metabolism , Amino Acid Sequence , Animals , Genetic Complementation Test , Mutation/genetics , Plant Proteins/chemistry , Plant Roots/parasitology , Polymorphism, Genetic , Protein Domains , RNA Interference , Repetitive Sequences, Nucleic Acid/genetics , Tylenchoidea/pathogenicity , Virulence
13.
Pac Symp Biocomput ; 21: 445-455, 2016.
Article in English | MEDLINE | ID: mdl-26776208

ABSTRACT

Efforts to predict interfacial residues in protein-RNA complexes have largely focused on predicting RNA-binding residues in proteins. Computational methods for predicting protein-binding residues in RNA sequences, however, have received relatively little attention to date. Although the value of sequence motifs for classifying and annotating protein sequences is well established, sequence motifs have not been widely applied to predicting interfacial residues in macromolecular complexes. Here, we propose a novel sequence motif-based method for "partner-specific" interfacial residue prediction. Given a specific protein-RNA pair, the goal is to simultaneously predict RNA-binding residues in the protein sequence and protein-binding residues in the RNA sequence. In 5-fold cross validation experiments, our method, PS-PRIP, achieved 92% Specificity and 61% Sensitivity, with a Matthews correlation coefficient (MCC) of 0.58 in predicting RNA-binding sites in proteins. The method achieved 69% Specificity and 75% Sensitivity, but with a low MCC of 0.13 in predicting protein-binding sites in RNAs. Similar performance results were obtained when PS-PRIP was tested on two independent "blind" datasets of experimentally validated protein-RNA interactions, suggesting the method should be widely applicable and valuable for identifying potential interfacial residues in protein-RNA complexes for which structural information is not available. The PS-PRIP webserver and datasets are available at: http://pridb.gdcb.iastate.edu/PSPRIP/.
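For readers unfamiliar with the metrics quoted above, the following sketch computes sensitivity, specificity, and the Matthews correlation coefficient from residue-level confusion-matrix counts. It uses the standard textbook definitions, which is an assumption: the paper may define "Specificity" differently (some residue-prediction studies use the term for what is elsewhere called precision). The counts in the example are invented.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return (sensitivity, specificity, mcc) from confusion-matrix counts.

    Standard definitions are assumed:
      sensitivity = TP / (TP + FN)
      specificity = TN / (TN + FP)
      MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    A ratio with a zero denominator is reported as 0.0.
    """
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, specificity, mcc


# Invented counts: 8 true binding residues, 12 non-binding residues.
sens, spec, mcc = binary_metrics(tp=6, fp=2, tn=10, fn=2)
```

MCC uses all four cells of the confusion matrix, which is why a predictor can show reasonable sensitivity and specificity and still have a low MCC when the classes are strongly imbalanced.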


Subject(s)
RNA-Binding Proteins/chemistry , RNA-Binding Proteins/metabolism , RNA/chemistry , RNA/metabolism , Amino Acid Motifs , Amino Acid Sequence , Base Sequence , Binding Sites/genetics , Computational Biology/methods , Computational Biology/statistics & numerical data , Databases, Nucleic Acid/statistics & numerical data , Databases, Protein/statistics & numerical data , Escherichia coli Proteins/chemistry , Escherichia coli Proteins/genetics , Escherichia coli Proteins/metabolism , Models, Molecular , Protein Binding , RNA/genetics , RNA, Bacterial/chemistry , RNA, Bacterial/genetics , RNA, Bacterial/metabolism , RNA, Ribosomal, 16S/chemistry , RNA, Ribosomal, 16S/genetics , RNA, Ribosomal, 16S/metabolism , RNA-Binding Proteins/genetics , Ribosomal Proteins/chemistry , Ribosomal Proteins/genetics , Ribosomal Proteins/metabolism , Software
14.
FEBS Lett ; 589(23): 3516-26, 2015 Nov 30.
Article in English | MEDLINE | ID: mdl-26460190

ABSTRACT

Reliably pinpointing which specific amino acid residues form the interface(s) between a protein and its binding partner(s) is critical for understanding the structural and physicochemical determinants of protein recognition and binding affinity, and has wide applications in modeling and validating protein interactions predicted by high-throughput methods, in engineering proteins, and in prioritizing drug targets. Here, we review the basic concepts, principles and recent advances in computational approaches to the analysis and prediction of protein-protein interfaces. We point out caveats for objectively evaluating interface predictors, and discuss various applications of data-driven interface predictors for improving energy model-driven protein-protein docking. Finally, we stress the importance of exploiting binding partner information in reliably predicting interfaces and highlight recent advances in this emerging direction.


Subject(s)
Computational Biology/methods , Proteins/metabolism , Molecular Docking Simulation , Protein Binding , Proteins/chemistry , Substrate Specificity
15.
Retrovirology ; 11: 115, 2014 Dec 23.
Article in English | MEDLINE | ID: mdl-25533001

ABSTRACT

BACKGROUND: The lentiviral Rev protein mediates nuclear export of intron-containing viral RNAs that encode structural proteins or serve as the viral genome. Following translation, HIV-1 Rev localizes to the nucleus and binds its cognate sequence, termed the Rev-responsive element (RRE), in incompletely spliced viral RNA. Rev subsequently multimerizes along the viral RNA and associates with the cellular Crm1 export machinery to translocate the RNA-protein complex to the cytoplasm. Equine infectious anemia virus (EIAV) Rev is functionally homologous to HIV-1 Rev, but shares very little sequence similarity and differs in domain organization. EIAV Rev also contains a bipartite RNA binding domain comprising two short arginine-rich motifs (designated ARM-1 and ARM-2) spaced 79 residues apart in the amino acid sequence. To gain insight into the topology of the bipartite RNA binding domain, a computational approach was used to model the tertiary structure of EIAV Rev. RESULTS: The tertiary structure of EIAV Rev was modeled using several protein structure prediction and model quality assessment servers. Two types of structures were predicted: an elongated structure with an extended central alpha helix, and a globular structure with a central bundle of helices. Assessment of models on the basis of biophysical properties indicated they were of average quality. In almost all models, ARM-1 and ARM-2 were spatially separated by >15 Å, suggesting that they do not form a single RNA binding interface on the monomer. A highly conserved canonical coiled-coil motif was identified in the central region of EIAV Rev, suggesting that an RNA binding interface could be formed through dimerization of Rev and juxtaposition of ARM-1 and ARM-2. In support of this, purified Rev protein migrated as a dimer in Blue native gels, and mutation of a residue predicted to form a key coiled-coil contact disrupted dimerization and abrogated RNA binding. 
In contrast, mutation of residues outside the predicted coiled-coil interface had no effect on dimerization or RNA binding. CONCLUSIONS: Our results suggest that EIAV Rev binding to the RRE requires dimerization via a coiled-coil motif to juxtapose two RNA binding motifs, ARM-1 and ARM-2.


Subject(s)
Gene Products, rev/chemistry , Gene Products, rev/metabolism , Infectious Anemia Virus, Equine/physiology , Protein Multimerization , RNA, Viral/metabolism , Models, Molecular , Protein Binding , Protein Conformation
16.
J Genet Genomics ; 41(12): 627-47, 2014 Dec 20.
Article in English | MEDLINE | ID: mdl-25527104

ABSTRACT

The G-quadruplex (G4) elements comprise a class of nucleic acid structures formed by stacking of guanine base quartets in a quadruple helix. This G4 DNA can form within or across single-stranded DNA molecules and is mutually exclusive with duplex B-form DNA. The reversibility and structural diversity of G4s make them highly versatile genetic structures, as demonstrated by their roles in various functions including telomere metabolism, genome maintenance, immunoglobulin gene diversification, transcription, and translation. Sequence motifs capable of forming G4 DNA are typically located in telomere repeat DNA and other non-telomeric genomic loci. To investigate their potential roles in a large-genome model plant species, we computationally identified 149,988 non-telomeric G4 motifs in maize (Zea mays L., B73 AGPv2), 29% of which were in non-repetitive genomic regions. G4 motif hotspots exhibited non-random enrichment in genes at two locations on the antisense strand, one in the 5' UTR and the other at the 5' end of the first intron. Several genic G4 motifs were shown to adopt sequence-specific and potassium-dependent G4 DNA structures in vitro. The G4 motifs were prevalent in key regulatory genes associated with hypoxia (group VII ERFs), oxidative stress (DJ-1/GATase1), and energy status (AMPK/SnRK) pathways. They also showed statistical enrichment for genes in metabolic pathways that function in glycolysis, sugar degradation, inositol metabolism, and base excision repair. Collectively, the maize G4 motifs may represent conditional regulatory elements that can aid in energy status gene responses. Such a network of elements could provide a mechanistic basis for linking energy status signals to gene regulation in maize, a model genetic system and major world crop species for feed, food, and fuel.
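A computational scan for G4 motifs is often done with a regular expression. The sketch below uses a widely used pattern of four runs of at least three guanines separated by loops of 1 to 7 nucleotides; this pattern is an assumption, since the abstract does not state the motif definition used for maize. The function name and the test sequence are also illustrative.

```python
import re

# Four G-runs (>= 3 G each) separated by loops of 1-7 nt; the C-rich pattern
# captures motifs that lie on the opposite strand of the given sequence.
G4_PLUS = re.compile(r"G{3,}(?:[ACGT]{1,7}G{3,}){3}")
G4_MINUS = re.compile(r"C{3,}(?:[ACGT]{1,7}C{3,}){3}")


def find_g4_motifs(seq):
    """Return sorted (start, end, strand) tuples for non-overlapping matches."""
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in G4_PLUS.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in G4_MINUS.finditer(seq)]
    return sorted(hits)


# Four copies of a GGG run with TTA loops, padded with two bases on each side.
toy_hits = find_g4_motifs("AT" + "GGGTTAGGGTTAGGGTTAGGG" + "AT")
```

Scanning for the C-rich complement on the same string avoids building a reverse complement, at the cost of reporting coordinates on the given strand for both orientations.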


Subject(s)
DNA, Plant/genetics , G-Quadruplexes , Genes, Plant/genetics , Genome, Plant/genetics , Zea mays/genetics , 3' Untranslated Regions/genetics , Carbohydrate Metabolism/genetics , Circular Dichroism , DNA, Plant/chemistry , Energy Metabolism/genetics , Gene Expression Regulation, Plant , Metabolic Networks and Pathways/genetics , Models, Genetic , Oxygen Consumption/genetics , Zea mays/metabolism
17.
PLoS One ; 9(5): e97725, 2014.
Article in English | MEDLINE | ID: mdl-24846307

ABSTRACT

Protein-RNA interactions are central to essential cellular processes such as protein synthesis and regulation of gene expression and play roles in human infectious and genetic diseases. Reliable identification of protein-RNA interfaces is critical for understanding the structural bases and functional implications of such interactions and for developing effective approaches to rational drug design. Sequence-based computational methods offer a viable, cost-effective way to identify putative RNA-binding residues in RNA-binding proteins. Here we report two novel approaches: (i) HomPRIP, a sequence homology-based method for predicting RNA-binding sites in proteins; (ii) RNABindRPlus, a new method that combines predictions from HomPRIP with those from an optimized Support Vector Machine (SVM) classifier trained on a benchmark dataset of 198 RNA-binding proteins. Although highly reliable, HomPRIP cannot make predictions for the unaligned parts of query proteins and its coverage is limited by the availability of close sequence homologs of the query protein with experimentally determined RNA-binding sites. RNABindRPlus overcomes these limitations. We compared the performance of HomPRIP and RNABindRPlus with that of several state-of-the-art predictors on two test sets, RB44 and RB111. On a subset of proteins for which homologs with experimentally determined interfaces could be reliably identified, HomPRIP outperformed all other methods achieving an MCC of 0.63 on RB44 and 0.83 on RB111. RNABindRPlus was able to predict RNA-binding residues of all proteins in both test sets, achieving an MCC of 0.55 and 0.37, respectively, and outperforming all other methods, including those that make use of structure-derived features of proteins. More importantly, RNABindRPlus outperforms all other methods for any choice of tradeoff between precision and recall. 
An important advantage of both HomPRIP and RNABindRPlus is that they rely on readily available sequence and sequence-derived features of RNA-binding proteins. A webserver implementation of both methods is freely available at http://einstein.cs.iastate.edu/RNABindRPlus/.
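
The MCC values quoted in this abstract are Matthews correlation coefficients computed from a per-residue confusion matrix. The sketch below is illustrative only and is not the authors' evaluation code: the function names and the convention of returning 0.0 when the denominator vanishes are assumptions, and the toy labels are invented for the example.

```python
import math


def confusion_counts(predicted, actual):
    """Count TP, FP, TN, FN for two equal-length sequences of booleans.

    True means "RNA-binding residue"; the label convention is an assumption.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    return tp, fp, tn, fn


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion counts.

    Returns 0.0 when any marginal total is zero (a common convention,
    assumed here rather than taken from the paper).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy example with invented labels for six residues.
predicted = [True, True, False, False, True, False]
actual = [True, False, False, False, True, True]
mcc = matthews_corrcoef(*confusion_counts(predicted, actual))
```

MCC ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it is often preferred over accuracy when binding residues are a small minority of all residues.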


Subject(s)
Artificial Intelligence ; Models, Theoretical ; RNA-Binding Proteins/genetics ; Sequence Analysis, Protein/methods ; Sequence Analysis, RNA/methods ; Animals ; Humans
18.
Proteins ; 82(2): 250-67, 2014 Feb.
Article in English | MEDLINE | ID: mdl-23873600

ABSTRACT

Selecting near-native conformations from the immense number of conformations generated by docking programs remains a major challenge in molecular docking. We introduce DockRank, a novel approach to scoring docked conformations based on the degree to which the interface residues of the docked conformation match a set of predicted interface residues. DockRank uses interface residues predicted by the partner-specific sequence homology-based protein-protein interface predictor PS-HomPPI, which predicts the interface residues of a query protein with a specific interaction partner. We compared the performance of DockRank with several state-of-the-art docking scoring functions using Success Rate (the percentage of cases that have at least one near-native conformation among the top m conformations) and Hit Rate (the percentage of near-native conformations that are included among the top m conformations). In cases where it is possible to obtain partner-specific (PS) interface predictions from PS-HomPPI, DockRank consistently outperforms both (i) ZRank and IRAD, two state-of-the-art energy-based scoring functions (improving Success Rate by up to 4-fold); and (ii) variants of DockRank that use predicted interface residues obtained from several protein interface predictors that do not take the binding partner into account when making interface predictions (improving Success Rate by up to 39-fold). The latter result underscores the importance of using partner-specific interface residues in scoring docked conformations. We show that DockRank, when used to re-rank the conformations returned by ClusPro, improves upon the original ClusPro rankings in terms of both Success Rate and Hit Rate. DockRank is available as a server at http://einstein.cs.iastate.edu/DockRank/.
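
The two evaluation measures defined in this abstract can be written down directly. The sketch below is a minimal illustration, not the DockRank evaluation code: it assumes each docking case is represented as a list of booleans ordered from best to worst score, where True marks a near-native conformation, and it computes Hit Rate per case (whether the paper pools conformations across cases is not stated in the abstract). The function names and toy data are hypothetical.

```python
def success_rate(cases, m):
    """Percentage of cases with at least one near-native conformation
    among the top m ranked conformations."""
    if not cases:
        return 0.0
    successes = sum(1 for ranked in cases if any(ranked[:m]))
    return 100.0 * successes / len(cases)


def hit_rate(ranked, m):
    """Percentage of one case's near-native conformations that appear
    among its top m ranked conformations."""
    total_near_native = sum(ranked)
    if total_near_native == 0:
        return 0.0
    return 100.0 * sum(ranked[:m]) / total_near_native


# Invented toy rankings: True = near-native, best-scored conformation first.
cases = [
    [False, True, False, True],
    [False, False, False, True],
    [True, False, False, False],
]
top2_success = success_rate(cases, 2)
top2_hits_first_case = hit_rate(cases[0], 2)
```

Success Rate rewards a scoring function for placing any one near-native conformation near the top, while Hit Rate rewards it for placing as many of them there as possible, so the two measures can rank scoring functions differently.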


Subject(s)
Molecular Docking Simulation ; Software ; Ligands ; Protein Interaction Domains and Motifs ; Protein Structure, Quaternary ; Receptors, Cell Surface/chemistry ; Sequence Homology, Amino Acid ; Structural Homology, Protein ; Thermodynamics
19.
BMC Bioinformatics ; 13: 89, 2012 May 10.
Article in English | MEDLINE | ID: mdl-22574904

ABSTRACT

BACKGROUND: RNA molecules play diverse functional and structural roles in cells. They function as messengers for transferring genetic information from DNA to proteins, as the primary genetic material in many viruses, as catalysts (ribozymes) important for protein synthesis and RNA processing, and as essential and ubiquitous regulators of gene expression in living organisms. Many of these functions depend on precisely orchestrated interactions between RNA molecules and specific proteins in cells. Understanding the molecular mechanisms by which proteins recognize and bind RNA is essential for comprehending the functional implications of these interactions, but the recognition 'code' that mediates interactions between proteins and RNA is not yet understood. Success in deciphering this code would dramatically impact the development of new therapeutic strategies for intervening in devastating diseases such as AIDS and cancer. Because of the high cost of experimental determination of protein-RNA interfaces, there is an increasing reliance on statistical machine learning methods for training predictors of RNA-binding residues in proteins. However, because of differences in the choice of datasets, performance measures, and data representations used, it has been difficult to obtain an accurate assessment of the current state of the art in protein-RNA interface prediction.

RESULTS: We provide a review of published approaches for predicting RNA-binding residues in proteins and a systematic comparison and critical assessment of protein-RNA interface residue predictors trained using these approaches on three carefully curated non-redundant datasets. We directly compare two widely used machine learning algorithms (Naïve Bayes (NB) and Support Vector Machine (SVM)) using three different data representations in which features are encoded using either sequence- or structure-based windows. Our results show that (i) Sequence-based classifiers that use a position-specific scoring matrix (PSSM)-based representation (PSSMSeq) outperform those that use an amino acid identity based representation (IDSeq) or a smoothed PSSM (SmoPSSMSeq); (ii) Structure-based classifiers that use smoothed PSSM representation (SmoPSSMStr) outperform those that use PSSM (PSSMStr) as well as sequence identity based representation (IDStr). PSSMSeq classifiers, when tested on an independent test set of 44 proteins, achieve performance that is comparable to that of three state-of-the-art structure-based predictors (including those that exploit geometric features) in terms of Matthews Correlation Coefficient (MCC), although the structure-based methods achieve substantially higher Specificity (albeit at the expense of Sensitivity) compared to sequence-based methods. We also find that the expected performance of the classifiers on a residue level can be markedly different from that on a protein level. Our experiments show that the classifiers trained on three different non-redundant protein-RNA interface datasets achieve comparable cross-validation performance. However, we find that the results are significantly affected by differences in the distance threshold used to define interface residues.

CONCLUSIONS: Our results demonstrate that protein-RNA interface residue predictors that use a PSSM-based encoding of sequence windows outperform classifiers that use other encodings of sequence windows. While structure-based methods that exploit geometric features can yield significant increases in the Specificity of protein-RNA interface residue predictions, such increases are offset by decreases in Sensitivity. These results underscore the importance of comparing alternative methods using rigorous statistical procedures, multiple performance measures, and datasets that are constructed based on several alternative definitions of interface residues and redundancy cutoffs as well as including evaluations on independent test sets into the comparisons.
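
A PSSM-based encoding of sequence windows can be pictured as follows: each residue is described by the PSSM rows of the residues in a fixed-length window centred on it. The sketch below is one plausible reading, not the encoding used in the paper. The function name, the window length, and zero-padding at the sequence termini are all assumptions, and the toy matrix has two columns per position where a real PSSM has twenty.

```python
def pssm_window_features(pssm, window=5, pad_value=0.0):
    """Build one feature vector per residue by concatenating the PSSM rows
    that fall inside a window centred on that residue.

    pssm: list of per-position score rows, all of the same length.
    window: odd window length (hypothetical default, not from the paper).
    pad_value: score used for positions beyond either terminus (assumed).
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred on a residue")
    if not pssm:
        return []
    half = window // 2
    pad_row = [pad_value] * len(pssm[0])
    features = []
    for centre in range(len(pssm)):
        vector = []
        for pos in range(centre - half, centre + half + 1):
            # Positions outside the sequence contribute a padding row.
            row = pssm[pos] if 0 <= pos < len(pssm) else pad_row
            vector.extend(row)
        features.append(vector)
    return features


# Toy PSSM for a three-residue sequence with two score columns per position.
toy_pssm = [[1, 2], [3, 4], [5, 6]]
toy_features = pssm_window_features(toy_pssm, window=3)
```

Each resulting vector, paired with an interface or non-interface label for the central residue, is the kind of input a Naïve Bayes or SVM classifier would be trained on.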


Subject(s)
Artificial Intelligence ; RNA-Binding Proteins/chemistry ; RNA/chemistry ; Algorithms ; Amino Acids/chemistry ; Bayes Theorem ; Humans ; Position-Specific Scoring Matrices ; Protein Conformation ; RNA/metabolism ; RNA-Binding Proteins/metabolism ; Sequence Analysis, Protein ; Support Vector Machine
20.
BMC Bioinformatics ; 13: 41, 2012 Mar 18.
Article in English | MEDLINE | ID: mdl-22424103

ABSTRACT

BACKGROUND: Identification of the residues in protein-protein interaction sites has a significant impact in problems such as drug discovery. Motivated by the observation that the set of interface residues of a protein tends to be conserved even among remote structural homologs, we introduce PrISE, a family of local structural similarity-based computational methods for predicting protein-protein interface residues.

RESULTS: We present a novel representation of the surface residues of a protein in the form of structural elements. Each structural element consists of a central residue and its surface neighbors. The PrISE family of interface prediction methods uses a representation of structural elements that captures the atomic composition and accessible surface area of the residues that make up each structural element. Each member of the PrISE family identifies, for each structural element in the query protein, a collection of similar structural elements in its repository of structural elements and weights them according to their similarity with the structural element of the query protein. PrISEL relies on the similarity between structural elements (i.e. local structural similarity). PrISEG relies on the similarity between protein surfaces (i.e. general structural similarity). PrISEC combines local structural similarity and general structural similarity to predict interface residues. These predictors label the central residue of a structural element in a query protein as an interface residue if a weighted majority of the structural elements that are similar to it are interface residues, and as a non-interface residue otherwise. The results of our experiments using three representative benchmark datasets show that PrISEC outperforms PrISEL and PrISEG, and that PrISEC is highly competitive with state-of-the-art structure-based methods for predicting protein-protein interface residues. Our comparison of PrISEC with PredUs, a recently developed method for predicting interface residues of a query protein based on the known interface residues of its (global) structural homologs, shows that performance superior or comparable to that of PredUs can be obtained using only local surface structural similarity. PrISEC is available as a Web server at http://prise.cs.iastate.edu/.

CONCLUSIONS: Local surface structural similarity based methods offer a simple, efficient, and effective approach to predict protein-protein interface residues.
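
The labelling rule described in this abstract, a weighted majority vote over similar structural elements, can be sketched in a few lines. This is an illustration under stated assumptions rather than the PrISE implementation: the function name is hypothetical, similarity weights are assumed to be non-negative, and ties or an empty neighbour set are assumed to yield the non-interface label. How the similarities themselves are computed is not covered here.

```python
def weighted_majority_label(similar_elements):
    """Label a central residue from its similar structural elements.

    similar_elements: iterable of (similarity_weight, is_interface) pairs,
    one per repository element judged similar to the query element.
    Returns True (interface) only if interface elements hold a strict
    majority of the total weight; ties and empty input return False.
    """
    interface_weight = 0.0
    total_weight = 0.0
    for weight, is_interface in similar_elements:
        if weight < 0:
            raise ValueError("similarity weights are assumed to be non-negative")
        total_weight += weight
        if is_interface:
            interface_weight += weight
    return interface_weight > total_weight / 2.0


# Invented toy neighbours: one highly similar interface element outweighs
# two less similar non-interface elements.
toy_neighbours = [(0.9, True), (0.4, False), (0.3, False)]
toy_label = weighted_majority_label(toy_neighbours)
```

Weighting the vote by similarity means that a single close match can outvote several distant ones, which an unweighted majority would not allow.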


Subject(s)
Protein Interaction Domains and Motifs ; Proteins/chemistry ; Software ; Algorithms ; Models, Molecular ; Protein Conformation ; Proteins/metabolism
...