Results 1 - 20 of 66
1.
Trends Biochem Sci ; 23(3): 109-13, 1998 Mar.
Article in English | MEDLINE | ID: mdl-9581503

ABSTRACT

Site-specific DNA-protein interactions can be studied using experimental and computational methods. Experimental approaches typically analyze a protein-DNA interaction by measuring the free energy of binding under a variety of conditions. Computational methods focus on alignments of known binding sites for a protein, and, from these alignments, make estimates of the binding energy. Understanding the relationship between these two perspectives, and finding ways to improve both, is a major challenge of modern molecular biology.


Subject(s)
DNA-Binding Proteins/chemistry , DNA-Binding Proteins/metabolism , DNA/metabolism , Binding Sites , DNA/chemistry , DNA-Binding Proteins/genetics , Energy Transfer , Forecasting , Chemical Models , Protein Binding , Repressor Proteins/chemistry , Repressor Proteins/metabolism , Substrate Specificity , Genetic Transcription , Viral Proteins/chemistry , Viral Proteins/metabolism , Viral Regulatory and Accessory Proteins
2.
Nucleic Acids Res ; 29(12): 2471-8, 2001 Jun 15.
Article in English | MEDLINE | ID: mdl-11410653

ABSTRACT

Salmonella bacteriophage repressor Mnt belongs to the ribbon-helix-helix class of transcription factors. Previous SELEX results suggested that interactions of Mnt with positions 16 and 17 of the operator DNA are not independent. Using a newly developed high-throughput quantitative multiple fluorescence relative affinity (QuMFRA) assay, we directly quantified the relative equilibrium binding constants (K(ref)) of Mnt to operators carrying all the possible dinucleotide combinations at these two positions. Results show that Mnt prefers binding to C, instead of wild-type A, at position 16 when wild-type C at position 17 is changed to other bases. The measured K(ref) values of double mutants were also higher than the values predicted from single mutants, demonstrating the non-independence of these two positions. The ability to produce a large number of quantitative binding data simultaneously and the potential to scale up makes QuMFRA a valuable tool for the large-scale study of macromolecular interaction.
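
The independence test described in this abstract can be sketched in a few lines. Under an independent (additive free energy) model, the relative binding constant of a double mutant is the product of the two single-mutant relative constants, so a measured value that departs from that product points to coupling between the positions. The function names and all numbers below are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def predicted_double_mutant(k_single_a: float, k_single_b: float) -> float:
    """Relative constant expected for a double mutant if positions are independent.

    Each argument is a single-mutant binding constant relative to the
    reference operator (reference = 1.0). Independent positions contribute
    additive free energies, so the relative constants multiply.
    """
    return k_single_a * k_single_b


def coupling_energy(k_measured: float, k_predicted: float, rt: float = 0.593) -> float:
    """Free-energy gap (kcal/mol) between measured and predicted binding.

    rt is RT in kcal/mol at roughly 25 degrees C (an assumed temperature).
    A negative result means the double mutant binds more tightly than the
    independent model predicts.
    """
    return -rt * math.log(k_measured / k_predicted)


# Hypothetical illustrative values, not measurements from the study.
k_pos16, k_pos17 = 0.5, 0.1
k_pred = predicted_double_mutant(k_pos16, k_pos17)
k_meas = 0.2
ddg = coupling_energy(k_meas, k_pred)
print(f"predicted={k_pred:.3f} measured={k_meas:.3f} coupling={ddg:.2f} kcal/mol")
```

A coupling energy near zero would be consistent with independent positions; the abstract reports measured values above the prediction, which corresponds to a negative value in this sign convention.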


Subject(s)
Bacteriophage P22/genetics , DNA/metabolism , Repressor Proteins/metabolism , Viral Proteins/metabolism , Base Sequence , Binding Sites , DNA/genetics , DNA-Binding Proteins/metabolism , Fluorescence , Fluorescent Dyes/metabolism , Molecular Models , Mutation/genetics , Genetic Operator Regions/genetics , Protein Binding , Salmonella/genetics , Salmonella/virology , Substrate Specificity , Thermodynamics , Viral Regulatory and Accessory Proteins
3.
Nucleic Acids Res ; 28(24): 4938-43, 2000 Dec 15.
Article in English | MEDLINE | ID: mdl-11121485

ABSTRACT

Recent biochemical studies have indicated a number of regions in both the 16S and 23S rRNA that are exposed on the ribosomal subunit surface. In order to predict potential interactions between these regions we applied novel phylogenetically-based statistical methods to detect correlated nucleotide changes occurring between the rRNA molecules. With these methods we discovered a number of highly significant correlated changes between different sets of nucleotides in the two ribosomal subunits. The predictions with the highest correlation values belong to regions of the rRNA subunits that are in close proximity according to recent crystal structures of the entire ribosome. We also applied a new statistical method of detecting base triple interactions within these same rRNA subunit regions. This base triple statistic predicted a number of new base triples not detected by pair-wise interaction statistics within the rRNA molecules. Our results suggest that these statistical methods may enhance the ability to detect novel structural elements both within and between RNA molecules.
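
The idea of detecting correlated nucleotide changes can be illustrated with plain mutual information between two alignment columns. This is a simpler stand-in, not the phylogenetically based statistic the abstract refers to: it ignores the tree relating the sequences, and the toy columns are invented for illustration.

```python
import math
from collections import Counter


def mutual_information(col_x: str, col_y: str) -> float:
    """Mutual information (bits) between two alignment columns.

    col_x[i] and col_y[i] are the nucleotides observed in organism i.
    High values indicate that changes in one column track changes in the other.
    """
    if len(col_x) != len(col_y) or not col_x:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_x)
    freq_x = Counter(col_x)
    freq_y = Counter(col_y)
    freq_xy = Counter(zip(col_x, col_y))
    mi = 0.0
    for (a, b), count in freq_xy.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((freq_x[a] / n) * (freq_y[b] / n)))
    return mi


# Invented columns: the first pair covaries perfectly, the second pair does not.
covarying = mutual_information("GGCCAAUU", "CCGGUUAA")
unrelated = mutual_information("GGCCGGCC", "AUAUAUAU")
print(covarying, unrelated)
```

Because related organisms share history, raw mutual information tends to overstate significance, which is one reason a phylogeny-aware statistic is preferable in practice.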


Subject(s)
Phylogeny , 16S Ribosomal RNA/metabolism , 23S Ribosomal RNA/metabolism , Animals , Base Sequence , Binding Sites , Computational Biology , Databases as Topic , Archaeal Genes/genetics , Bacterial Genes/genetics , Molecular Sequence Data , 16S Ribosomal RNA/genetics , 23S Ribosomal RNA/genetics , Sequence Alignment , Statistics as Topic
4.
Nucleic Acids Res ; 29(10): 2135-44, 2001 May 15.
Article in English | MEDLINE | ID: mdl-11353083

ABSTRACT

Post-transcriptional regulation of gene expression is often accomplished by proteins binding to specific sequence motifs in mRNA molecules, to affect their translation or stability. The motifs are often composed of a combination of sequence and structural constraints such that the overall structure is preserved even though much of the primary sequence is variable. While several methods exist to discover transcriptional regulatory sites in the DNA sequences of coregulated genes, the RNA motif discovery problem is much more difficult because of covariation in the positions. We describe the combined use of two approaches for RNA structure prediction, FOLDALIGN and COVE, that together can discover and model stem-loop RNA motifs in unaligned sequences, such as UTRs from post-transcriptionally coregulated genes. We evaluate the method on two datasets, one a section of rRNA genes with randomly truncated ends so that a global alignment is not possible, and the other a hyper-variable collection of IRE-like elements that were inserted into randomized UTR sequences. In both cases the combined method identified the motifs correctly, and in the rRNA example we show that it is capable of determining the structure, which includes bulge and internal loops as well as a variable length hairpin loop. Those automated results are quantitatively evaluated and found to agree closely with structures contained in curated databases, with correlation coefficients up to 0.9. A basic server, Stem-Loop Align SearcH (SLASH), which will perform stem-loop searches in unaligned RNA sequences, is available at http://www.bioinf.au.dk/slash/.


Subject(s)
Computational Biology , Nucleic Acid Conformation , RNA/chemistry , RNA/genetics , Software , Algorithms , Base Sequence , Databases as Topic , Internet , Molecular Sequence Data , RNA/metabolism , Archaeal RNA/chemistry , Archaeal RNA/genetics , Archaeal RNA/metabolism , Ribosomal RNA/chemistry , Ribosomal RNA/genetics , Ribosomal RNA/metabolism , Nucleic Acid Regulatory Sequences/genetics , Sensitivity and Specificity , Sequence Alignment , Untranslated Regions/chemistry , Untranslated Regions/genetics , Untranslated Regions/metabolism
5.
J Mol Biol ; 223(1): 159-70, 1992 Jan 05.
Article in English | MEDLINE | ID: mdl-1731067

ABSTRACT

An Expectation Maximization algorithm for identification of DNA binding sites is presented. The approach predicts the location of binding regions while allowing variable length spacers within the sites. In addition to predicting the most likely spacer length for a set of DNA fragments, the method identifies individual sites that differ in spacer size. No alignment of DNA sequences is necessary. The method is illustrated by application to 231 Escherichia coli DNA fragments known to contain promoters with variable spacings between their consensus regions. Maximum-likelihood tests of the differences between the spacing classes indicate that the consensus regions of the spacing classes are not distinct. Further tests suggest that several positions within the spacing region may contribute to promoter specificity.


Subject(s)
DNA-Binding Proteins/metabolism , DNA/genetics , Genetic Promoter Regions , Algorithms , Base Sequence , Binding Sites , Bacterial DNA/genetics , Escherichia coli/genetics , Molecular Sequence Data , Sequence Alignment
6.
J Mol Biol ; 248(1): 1-18, 1995 Apr 21.
Article in English | MEDLINE | ID: mdl-7731036

ABSTRACT

We have developed a computer program, GeneParser, which identifies and determines the fine structure of protein genes in genomic DNA sequences. The program scores all subintervals in a sequence for content statistics indicative of introns and exons, and for sites that identify their boundaries. This information is weighted by a neural network to approximate the log-likelihood that each subinterval exactly represents an intron or exon (first, internal or last). A dynamic programming algorithm is then applied to this data to find the combination of introns and exons that maximizes the likelihood function. Using this method, we can rapidly generate ranked suboptimal solutions, each of which is the optimum solution containing a given intron-exon junction. We have tested the system on a large collection of human genes. On sequences not used in training, we achieved a correlation coefficient for exon nucleotide prediction of 0.89. For a subset of G + C-rich genes, a correlation coefficient of 0.94 was achieved. We have also quantified the robustness of the method to substitution and frame-shift errors and show how the system can be optimized for performance on sequences with known levels of sequencing errors.


Subject(s)
Base Sequence , DNA/chemistry , Genes , Proteins/genetics , Software , Base Composition , Computer Simulation , DNA/metabolism , Exons , Humans , Introns , Statistical Models , Protein Biosynthesis , Reproducibility of Results
7.
J Mol Biol ; 201(3): 517-35, 1988 Jun 05.
Article in English | MEDLINE | ID: mdl-3262167

ABSTRACT

We have identified the binding site on the bacteriophage T4 gene 32 mRNA responsible for autogenous translational regulation. We demonstrate that this site is largely unstructured and overlaps the initiation codon of gene 32 as previously predicted. Co-operative binding of gene 32 protein to this site specifically blocks the formation of 30 S-tRNA(fMet)-gene 32 mRNA ternary complexes and initiation of translation. The translational operator is bound co-operatively by gene 32 protein and this binding is facilitated by a nucleation site far upstream from the initiation codon. A similar unstructured mRNA lacking this nucleation site is also bound co-operatively, but only at concentrations of gene 32 protein higher than those needed to repress binding of ribosomes to the gene 32 mRNA. Some sequence-specific interactions may also influence this binding. Comparison of the bacteriophage T2, T4 and T6 gene 32 operator sequences leads us to propose that the nucleation site is a pseudoknot.


Subject(s)
Gene Expression Regulation , Viral Genes , Messenger RNA/genetics , Viral RNA/genetics , T-Phages/genetics , Base Sequence , Binding Sites , Molecular Sequence Data , Nucleic Acid Conformation , Genetic Operator Regions , Viral Proteins
8.
J Mol Biol ; 188(3): 415-31, 1986 Apr 05.
Article in English | MEDLINE | ID: mdl-3525846

ABSTRACT

Repressors, polymerases, ribosomes and other macromolecules bind to specific nucleic acid sequences. They can find a binding site only if the sequence has a recognizable pattern. We define a measure of the information (R sequence) in the sequence patterns at binding sites. It allows one to investigate how information is distributed across the sites and to compare one site to another. One can also calculate the amount of information (R frequency) that would be required to locate the sites, given that they occur with some frequency in the genome. Several Escherichia coli binding sites were analyzed using these two independent empirical measurements. The two amounts of information are similar for most of the sites we analyzed. In contrast, bacteriophage T7 RNA polymerase binding sites contain about twice as much information as is necessary for recognition by the T7 polymerase, suggesting that a second protein may bind at T7 promoters. The extra information can be accounted for by a strong symmetry element found at the T7 promoters. This element may be an operator. If this model is correct, these promoters and operators do not share much information. The comparisons between R sequence and R frequency suggest that the information at binding sites is just sufficient for the sites to be distinguished from the rest of the genome.
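
The two information measures named in this abstract can be sketched as follows. The sketch assumes an equiprobable four-letter background (a maximum of 2 bits per position) and omits any small-sample correction, so it is a simplified illustration rather than the paper's exact procedure. The function names, the toy sites and the genome size are all hypothetical.

```python
import math
from collections import Counter


def r_sequence(sites: list[str]) -> float:
    """Information (bits) in aligned binding sites, summed over positions.

    Each position contributes 2 - H bits, where H is the Shannon entropy of
    the observed base frequencies at that position.
    """
    if not sites or len({len(s) for s in sites}) != 1:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(sites)
    total = 0.0
    for column in zip(*sites):
        entropy = -sum((c / n) * math.log2(c / n) for c in Counter(column).values())
        total += 2.0 - entropy
    return total


def r_frequency(num_sites: int, genome_positions: int) -> float:
    """Information (bits) needed to pick num_sites out of genome_positions."""
    return -math.log2(num_sites / genome_positions)


# Toy data: three fully conserved positions and one fully variable position.
toy_sites = ["ACGT", "ACGA", "ACGC", "ACGG"]
rs = r_sequence(toy_sites)
rf = r_frequency(16, 1024)
print(f"R_sequence={rs:.2f} bits, R_frequency={rf:.2f} bits")
```

In this toy case both quantities come out to 6 bits, which mirrors the abstract's observation that the information in the sites is about what is needed to locate them.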


Subject(s)
Binding Sites , Bacterial DNA/genetics , DNA-Binding Proteins , Serine Endopeptidases , Bacterial Proteins/genetics , Base Sequence , Bacterial DNA/metabolism , DNA-Directed RNA Polymerases/genetics , Escherichia coli/genetics , Escherichia coli/metabolism , Lac Operon , Genetic Operator Regions , Operon , Repressor Proteins , Ribosomes/metabolism , Statistics as Topic , T-Phages/genetics , Tryptophan/genetics , Viral Proteins , Viral Regulatory and Accessory Proteins
9.
J Mol Biol ; 229(4): 821-6, 1993 Feb 20.
Article in English | MEDLINE | ID: mdl-8445649

ABSTRACT

The relative binding affinities of Mnt protein are determined for each possible base-pair at position 15 of the operator sequence, and for all combinations of G.C base-pairs at positions 15 and 17. The partitioning of each operator sequence is determined quantitatively with restriction enzymes. At position 15, the wild-type G.C base-pair provides the highest binding affinity but, unlike position 17, the primary distinction is between purine and pyrimidine bases on the top strand. The information content at position 15 is only about 0.16 bit. In comparison with previous measurements at position 17, it is determined that the interactions of the Mnt protein with positions 15 and 17 are independent, i.e. the specific binding energies for the two positions are additive. The relative binding affinities at position 17 are also determined in the background of a G to T mutation at position 5, the position equivalent to 17 on the other half of the symmetric operator. The relative affinities at position 17 are independent of whether position 5 is wild-type or mutant.


Subject(s)
Genetic Operator Regions , Repressor Proteins/metabolism , Viral Proteins/metabolism , Base Sequence , Viral DNA/chemical synthesis , Viral DNA/metabolism , Molecular Sequence Data , Mutagenesis , Protein Binding , Repressor Proteins/genetics , Viral Proteins/genetics , Viral Regulatory and Accessory Proteins
10.
J Mol Biol ; 271(2): 178-94, 1997 Aug 15.
Article in English | MEDLINE | ID: mdl-9268651

ABSTRACT

The Mnt protein of Salmonella phage P22 binds site-specifically to its operator. To better understand this binding we used dideoxy DNA sequencing in a quantitative manner to determine the relative binding constants, and hence the relative free energies, of wild-type Mnt protein to a substantial number of variants of its operator. These measurements were supported by experiments which used the SELEX procedure to generate a set of operators from an initially randomized population. In the Discussion we show that the present model of Mnt protein/operator binding, due to Sauer and co-workers, along with the assumption of an independent contribution of each position in the operator to the total binding, provides a reasonably accurate description of the system. We also discuss the use of information content as a measure of DNA-protein binding specificity with the Mnt protein/operator system serving as an example and show again that the assumption of independence supports the current view of this case of site-specific binding.


Subject(s)
Bacteriophage P22/metabolism , DNA/chemistry , DNA/metabolism , Oligodeoxyribonucleotides/chemistry , Oligodeoxyribonucleotides/metabolism , Repressor Proteins/metabolism , Viral Proteins/metabolism , Base Sequence , Binding Sites , Consensus Sequence , DNA-Binding Proteins/metabolism , Kinetics , Ligands , Molecular Sequence Data , Salmonella/virology , Sequence Alignment , Substrate Specificity , Genetic Templates , Thermodynamics , Viral Regulatory and Accessory Proteins
11.
J Mol Biol ; 177(4): 663-83, 1984 Aug 25.
Article in English | MEDLINE | ID: mdl-6434747

ABSTRACT

Sixteen single point mutations near the beginning of the lacZ gene have been isolated and their effect on lacZ expression has been measured. Five mutations were obtained that alter a potential stem-and-loop structure in the messenger RNA that masks the initiation codons. Formation of this stem-and-loop is a result of transcription of DNA sequences introduced during the cloning of the lac regulatory region. The mutations isolated were then moved into a background that deleted this structure. Analysis of these mutations indicated that the secondary structure inhibited lacZ expression 5.8-fold and that either single point mutations or a 9 base-pair deletion could relieve this inhibition completely. In addition, it was found that an A to C transversion in the first base following the initiation codon (in the absence of the inhibitory secondary structure) decreases lacZ expression almost twofold, whereas C to U transitions in the next two positions have negligible effects. Mutations were also obtained that either increase or decrease the length of the Shine-Dalgarno sequence. The effects of these mutations were studied in the presence or absence of the secondary structure that involves the two initiation codons. It was found that when translation initiation was inhibited by the secondary structure, increasing the length of the Shine-Dalgarno sequence increased lacZ expression 2.8-fold and decreasing the length of this sequence reduced lacZ expression 12-fold. When translation initiation was not inhibited by the secondary structure, increasing the length of the Shine-Dalgarno sequence had no effect and decreasing the length of this sequence only reduced lacZ expression sixfold. The mechanistic implications of these results are discussed. Two initiation codons are located in the beginning of the lacZ gene, 7 and 13 bases from the Shine-Dalgarno sequence. 
NH2-terminal sequence analysis indicated that the majority of the protein synthesized initiates at the first initiation codon in the wild-type lacZ gene (in agreement with results reported previously by J. L. Brown and his colleagues). Upon introduction of sequences that result in a change in the mRNA secondary structure, both initiation codons are used in almost equal amounts. Three mutations and two pseudorevertants were obtained, which are located in the first initiation codon. It was found that when the first initiation codon is changed from AUG to GUG, translation initiation is decreased tenfold at that codon. (ABSTRACT TRUNCATED AT 400 WORDS)


Subject(s)
Lac Operon , Mutation , Protein Biosynthesis , Ribosomes , Base Sequence , Binding Sites , Codon/genetics , Bacterial Genes , Nucleic Acid Conformation , Messenger RNA/genetics , beta-Galactosidase/metabolism
12.
Methods Enzymol ; 183: 211-21, 1990.
Article in English | MEDLINE | ID: mdl-2179676

ABSTRACT

Matrices can provide realistic representations of protein/DNA specificity. In many cases simple mononucleotide-based matrices are adequate representations, but more complex matrices may be needed for other cases. Unlike simple consensus sequences, matrices allow for different penalties to be assessed for different changes to a binding site, a property that is essential for accurate description of a binding site pattern. When only a collection of binding site sequences is known, the best representation for the pattern is an information content formulation, based on both thermodynamic and statistical considerations. Quantitative data on relative binding affinities may be used to determine matrices that provide a best fit to the data. Matrix representations also provide an efficient method of aligning multiple sequences to identify binding site patterns that they have in common.
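
A mononucleotide matrix of the kind described here can be sketched as a log-odds table built from aligned sites and then used to score and scan sequences. The pseudocount, the uniform background of 0.25, the function names and the example sites are all assumptions made for illustration; the chapter itself may recommend different choices.

```python
import math

BASES = "ACGT"


def build_matrix(sites, pseudocount=1.0, background=0.25):
    """Build a log2-odds weight matrix (one dict per position) from aligned sites."""
    if not sites or len({len(s) for s in sites}) != 1:
        raise ValueError("sites must be non-empty and of equal length")
    total = len(sites) + 4 * pseudocount
    matrix = []
    for column in zip(*sites):
        matrix.append({
            base: math.log2(((column.count(base) + pseudocount) / total) / background)
            for base in BASES
        })
    return matrix


def score(matrix, seq):
    """Sum the per-position weights for a sequence the same width as the matrix."""
    return sum(row[base] for row, base in zip(matrix, seq))


def best_site(matrix, sequence):
    """Return (score, start) of the highest-scoring window in a longer sequence."""
    width = len(matrix)
    return max(
        (score(matrix, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )


# Hypothetical aligned sites, invented for this example.
example_sites = ["TATAAT", "TATAAT", "TACAAT", "TATGAT"]
weights = build_matrix(example_sites)
top_score, top_pos = best_site(weights, "GGCGTATAATGCGC")
print(top_pos, round(top_score, 2))
```

Unlike a consensus string, the matrix assigns a different penalty to each substitution at each position, which is the property the abstract singles out as essential.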


Subject(s)
Base Sequence , DNA/genetics , Information Systems , Bacterial DNA/genetics , Escherichia coli/genetics , Mathematics , Genetic Promoter Regions , Research Design , Nucleic Acid Sequence Homology
13.
Methods Enzymol ; 208: 458-68, 1991.
Article in English | MEDLINE | ID: mdl-1664028

ABSTRACT

An information content analysis of protein-binding sites gives a quantitative description of the specificity of the protein, independent of the mechanism of specificity. It gives useful information about the total specificity of the protein and about the individual positions within the binding sites. Information content is consistent with both thermodynamic and statistical analyses of specificity. When applied to a collection of known binding sites, the description provided may be limited by the sample size or by unknown constraints on those sites. Experimental procedures to determine the information content can give much more reliable measures. A large number of functional sites can be obtained from a much larger pool of randomized potential sites. Quantitative assays for the activity of different sites can be easily incorporated into the analysis, thereby increasing its sensitivity. Both in vitro and in vivo experiments are amenable to information content analysis.


Subject(s)
Viral DNA/metabolism , DNA-Binding Proteins/metabolism , DNA/metabolism , Genetic Promoter Regions , T-Phages/genetics , Base Sequence , Binding Sites , DNA Restriction Enzymes/metabolism , Viral DNA/chemistry , Escherichia coli/genetics , Molecular Sequence Data , Substrate Specificity
14.
Biotechniques ; 11(6): 733-4, 736, 738, 1991 Dec.
Article in English | MEDLINE | ID: mdl-1809325

ABSTRACT

An automated kinetic assay for beta-galactosidase activity in Escherichia coli was developed to permit the measurement of many independent samples simultaneously. Bacteria are grown, lysed from without (by adsorption of a high multiplicity of bacteriophage T4) and assayed in microtiter plates with 96 wells. Absorbance data are collected and analyzed by computer. The growth and lysis procedure, apparatus and software used in this assay can be used for other spectrophotometric enzyme assays.
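
The abstract does not say how the absorbance data are analyzed. One common way to reduce a kinetic read to a rate is a least-squares slope of absorbance against time for each well, sketched below under that assumption. The function name and the readings are hypothetical.

```python
def reaction_rate(times_min, absorbances):
    """Least-squares slope of absorbance versus time (absorbance units per minute)."""
    if len(times_min) != len(absorbances) or len(times_min) < 2:
        raise ValueError("need at least two paired readings")
    n = len(times_min)
    mean_t = sum(times_min) / n
    mean_a = sum(absorbances) / n
    sxx = sum((t - mean_t) ** 2 for t in times_min)
    sxy = sum((t - mean_t) * (a - mean_a) for t, a in zip(times_min, absorbances))
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


# Invented readings for a single well of a 96-well plate.
well_times = [0.0, 1.0, 2.0, 3.0]
well_abs = [0.10, 0.15, 0.20, 0.25]
rate = reaction_rate(well_times, well_abs)
print(f"{rate:.3f} absorbance units per minute")
```

Applying the same function to every well gives one rate per sample, which is what makes a plate-based format suitable for many samples at once.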


Subject(s)
Escherichia coli/enzymology , beta-Galactosidase/analysis , Automation , Kinetics , Ultraviolet Spectrophotometry
15.
Shock ; 15(3): 165-70, 2001 Mar.
Article in English | MEDLINE | ID: mdl-11236897

ABSTRACT

The traditional approach to the study of biology employs small-scale experimentation that results in the description of a molecular sequence of known function or relevance. In the era of the genome the reverse is true, as large-scale cloning and gene sequencing come first, followed by the use of computational methods to systematically determine gene function and regulation. The overarching goal of this new approach is to translate the knowledge learned from a systematic, global analysis of genomic data into a complete understanding of biology. For investigators who study shock, the specific goal is to increase understanding of the adaptive response to injury at the level of the entire genome. This review describes our initial experience using DNA microarrays to profile stress-induced changes in gene expression. We conclude that efforts to apply genomics to the study of injury are best coordinated by multi-disciplinary groups, because of the extensive expertise required.


Subject(s)
Genomics/trends , Research/trends , Wounds and Injuries/physiopathology , Forecasting , Genetic Techniques , Fungal Genome , Genomics/methods , Humans , Multiple Organ Failure/genetics , Multiple Organ Failure/immunology , Multiple Organ Failure/pathology , Research Design , Saccharomyces cerevisiae/physiology , Spleen/immunology , Spleen/injuries , Spleen/physiopathology , Wounds and Injuries/genetics
16.
Genome Inform ; 12: 184-93, 2001.
Article in English | MEDLINE | ID: mdl-11791237

ABSTRACT

When a set of coregulated genes share a common structural RNA motif, e.g. a hairpin, most motif search approaches fail to locate the covarying but structurally conserved motif. There do exist methods that can locate structural RNA motifs, like FOLDALIGN, but the main problem with these methods is that they are computationally expensive. In FOLDALIGN, a major contribution to this is the use of a greedy algorithm to construct the multiple alignment. To ensure good quality many redundant computations must be made. However, by applying the greedy algorithm on a carefully selected subset of sequences, near full greedy quality can be obtained. The basic idea is to estimate the order in which the sequences entered a good greedy alignment. If such a ranking, found from all pairwise alignments, is in good agreement with the order of appearance in the multiple alignment, the core structural motif can be found by performing the greedy algorithm on just the top sequences in the ranking. The ranking used in this mini-greedy algorithm is found by using two complementing approaches: 1) When interpreting the FOLDALIGN score as an inner product (kernel), the sequences can be ranked according to their distance to their center of mass; 2) We construct an algorithm that attempts to find the K closest sequences in the vector space associated with the inner product, and the remaining sequences can be ranked by their minimum distance to any of the sequences, or to the center of mass in this set. The two approaches are compared and merged, and the results are discussed. We also show that structural alignments of near full greedy quality can be found in significantly reduced time, using these methods. The algorithm is being included in the SLASH (Stem-Loop Align SearcH) server available at http://www.bioinf.au.dk/slash.
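
The first ranking approach, distance to the center of mass in the space implied by an inner product, needs only the matrix of pairwise scores. For item i among n items with kernel K, the squared distance to the center is K[i][i] - (2/n) * sum_j K[i][j] + (1/n^2) * sum_jk K[j][k]. The sketch below applies that identity. It assumes the score matrix is symmetric and behaves like a valid inner product; the function name is hypothetical, and the toy kernel is built from invented points rather than from FOLDALIGN scores.

```python
def rank_by_center_distance(kernel):
    """Rank items by squared distance to the center of mass in kernel space.

    kernel[i][j] is an inner product between items i and j (assumed symmetric
    and positive semi-definite). Returns (order, sq_dist), where order lists
    item indices from closest to farthest from the center.
    """
    n = len(kernel)
    if n == 0 or any(len(row) != n for row in kernel):
        raise ValueError("kernel must be a non-empty square matrix")
    grand_mean = sum(sum(row) for row in kernel) / (n * n)
    sq_dist = [
        kernel[i][i] - 2.0 * sum(kernel[i]) / n + grand_mean
        for i in range(n)
    ]
    order = sorted(range(n), key=lambda i: sq_dist[i])
    return order, sq_dist


# Toy kernel from made-up one-dimensional points, so the answer is easy to check.
points = [0.0, 1.0, 2.0, 10.0]
toy_kernel = [[a * b for b in points] for a in points]
order, sq_dist = rank_by_center_distance(toy_kernel)
print(order)
```

The mini-greedy idea would then run the expensive alignment only on the first few entries of `order`; how many to keep is a tuning choice this sketch does not address.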


Subject(s)
Algorithms , RNA/chemistry , RNA/genetics , Base Sequence , Computational Biology , Nucleic Acid Databases , Nucleic Acid Conformation , Sequence Alignment/statistics & numerical data
20.
Bioinformatics ; 16(1): 16-23, 2000 Jan.
Article in English | MEDLINE | ID: mdl-10812473

ABSTRACT

The purpose of this article is to provide a brief history of the development and application of computer algorithms for the analysis and prediction of DNA binding sites. This problem can be conveniently divided into two subproblems. The first is, given a collection of known binding sites, develop a representation of those sites that can be used to search new sequences and reliably predict where additional binding sites occur. The second is, given a set of sequences known to contain binding sites for a common factor, but not knowing where the sites are, discover the location of the sites in each sequence and a representation for the specificity of the protein.


Subject(s)
DNA-Binding Proteins/metabolism , DNA/analysis , Binding Sites , DNA/history , DNA-Binding Proteins/history , 20th Century History , Research/history