ABSTRACT
It is generally accepted that during the domestication of food plants, selection was focused on their productivity, the ease of their technological processing into food, and resistance to pathogens and environmental stressors. In addition, the palatability of plant foods and their health benefits may also have been subject to selection by humans in the past. Nonetheless, it is unclear whether in antiquity, aside from positive selection for beneficial properties of plants, humans simultaneously selected against such detrimental properties as allergenicity. This topic is becoming increasingly relevant as the allergization of the population grows, which is a major challenge for modern medicine. That is why intensive research by breeders is already underway to create hypoallergenic forms of food plants. Accordingly, this paper analyzes albumin, globulin, and β-amylase of common wheat Triticum aestivum L. (1753), which were identified earlier as targets for attack by human class E immunoglobulins. At the genomic level, we wanted to find signs of past negative selection against the allergenicity of these three proteins (albumin, globulin, and β-amylase) during the domestication of ancestral forms of modern food plants. We focused the search on the TATA-binding protein (TBP)-binding site because it is located within a narrow region (between positions -70 and -20 relative to the corresponding transcription start sites), is the most conserved, is necessary for primary transcription initiation, and is the best-studied regulatory genomic signal in eukaryotes. Our previous studies presented our publicly available Web service Plant_SNP_TATA_Z-tester, which makes it possible to estimate the equilibrium dissociation constant (KD) of TBP complexes with plant proximal promoters (as output data) using 90 bp of their DNA sequences (as input data). In this work, by means of this bioinformatics tool, 363 gene promoter DNA sequences representing 43 plant species were analyzed.
It was found that, compared with non-food plants, food plants are characterized by significantly weaker affinity of TBP for proximal promoters of their genes homologous to the genes of common-wheat globulin, albumin, and β-amylase (food allergens) (p < 0.01, Fisher's Z-test). This evidence suggests that in the past humans carried out selective breeding to reduce the expression of food plant genes encoding these allergenic proteins.
ABSTRACT
The GeneNet system is designed for collection and analysis of the data on gene and metabolic networks, signal transduction pathways and kinetic characteristics of elementary processes. In the past 2 years, the GeneNet structure was considerably improved: (i) the current version of the database is now implemented using ORACLE9i; (ii) the capacities to describe the structure of the protein complexes and the interactions between the units are increased; (iii) two tables with kinetic constants and more detailed descriptions of certain reactions were added; and (iv) a module for kinetic modeling was supplemented. The current SRS release of the GeneNet database contains 37 graphical maps of gene networks, as well as descriptions of 1766 proteins, 1006 genes, 241 small molecules and 3254 relationships between gene network units, and 552 kinetic constants. Information distributed between 16 interlinked tables was obtained by annotating 1980 journal publications. SRS release of the GeneNet database, the graphical viewer and the modeling section are available at http://wwwmgs.bionet.nsc.ru/mgs/gnw/genenet/.
Subject(s)
Genetic Databases , Metabolism , Genetic Models , Computer Graphics , Computer Simulation , Kinetics , Multiprotein Complexes/chemistry , Signal Transduction
ABSTRACT
The GeneNet database is designed for accumulation of information on gene networks. The original technology applied in GeneNet enables description of not only a gene network's structure and the functional relationships between its components, but also metabolic and signal transduction pathways. Specialised software, GeneNet Viewer, automatically displays the graphical diagram of gene networks described in the database. The current release 3.0 of the GeneNet database contains descriptions of 25 gene networks, 945 proteins, 567 genes, 151 other substances and 1364 relationships between components of gene networks. Information distributed between 14 interlinked tables was obtained by annotating 968 scientific publications. The SRS version of the GeneNet database is freely available (http://wwwmgs.bionet.nsc.ru/mgs/systems/genenet/).
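The idea of describing a gene network as components plus typed relationships between them can be pictured with a minimal sketch. This is not the GeneNet schema: the class names, fields, and the component names geneA, proteinA, and geneB are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A directed, typed link between two named components (hypothetical model)."""
    source: str
    target: str
    effect: str  # e.g. "encodes", "activates", "represses"


@dataclass
class GeneNetwork:
    """Toy container: components keyed by name, plus a list of relationships."""
    components: dict = field(default_factory=dict)      # name -> kind
    relationships: list = field(default_factory=list)

    def add_component(self, name, kind):
        self.components[name] = kind

    def relate(self, source, target, effect):
        # Refuse links that mention a component that was never registered.
        for name in (source, target):
            if name not in self.components:
                raise KeyError(f"unknown component: {name}")
        self.relationships.append(Relationship(source, target, effect))

    def targets_of(self, source):
        # Names of all components directly affected by `source`, sorted.
        return sorted(r.target for r in self.relationships if r.source == source)


# Hypothetical three-component network.
network = GeneNetwork()
network.add_component("geneA", "gene")
network.add_component("proteinA", "protein")
network.add_component("geneB", "gene")
network.relate("geneA", "proteinA", "encodes")
network.relate("proteinA", "geneB", "activates")
```

Keeping relationships as separate records, rather than nesting them inside components, loosely mirrors the abstract's description of information spread across interlinked tables.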
Subject(s)
Genetic Databases , Metabolism/genetics , Signal Transduction/genetics , Animals , Computer Graphics , Forecasting , Genes , Humans , Information Storage and Retrieval , Internet , Proteins/genetics , Proteins/physiology , RNA/genetics , User-Computer Interface
ABSTRACT
ACTIVITY is a database on DNA/RNA site sequences with known activity magnitudes, measurement systems, sequence-activity relationships under fixed experimental conditions and procedures to adapt these relationships from one measurement system to another. This database deposits information on DNA/RNA affinities to proteins and cell nuclear extracts, cutting efficiencies, gene transcription activity, mRNA translation efficiencies, mutability and other biological activities of natural sites occurring within promoters, mRNA leaders, and other regulatory regions in pro- and eukaryotic genomes, their mutant forms and synthetic analogues. Since activity magnitudes are heavily system-dependent, the current version of ACTIVITY is supplemented by three novel sub-databases: (i) SYSTEM, measurement systems; (ii) KNOWLEDGE, sequence-activity relationships under fixed experimental conditions; and (iii) CROSS_TEST, procedures adapting a relationship from one measurement system to another. These databases are useful in molecular biology, pharmacogenetics, metabolic engineering, drug design and biotechnology. The databases can be queried using SRS and are available through the Web, http://wwwmgs.bionet.nsc.ru/systems/Activity/.
Subject(s)
DNA/genetics , Factual Databases , RNA/genetics , Binding Sites , DNA/metabolism , Gene Expression Regulation , Internet , Protein Binding , RNA/metabolism
ABSTRACT
The Transcription Regulatory Regions Database (TRRD) is an informational resource containing an integrated description of gene transcription regulation. An entry of the database corresponds to a gene and contains data on the localization and functions of the transcription regulatory regions as well as gene expression patterns. TRRD contains only experimental data, which are entered into the database by annotating scientific publications. TRRD release 6.0 comprises information on 1167 genes, 5537 transcription factor binding sites, 1714 regulatory regions, 14 locus control regions and 5335 expression patterns obtained by annotating 3898 scientific papers. This information is arranged in seven databases: TRRDGENES (general gene description), TRRDLCR (locus control regions), TRRDUNITS (regulatory regions: promoters, enhancers, silencers, etc.), TRRDSITES (transcription factor binding sites), TRRDFACTORS (transcription factors), TRRDEXP (expression patterns) and TRRDBIB (experimental publications). The Sequence Retrieval System (SRS) is used as a basic tool for navigating and searching TRRD and integrating it with external informational and software resources. The visualization tool, TRRD Viewer, presents the information in the form of maps of gene regulatory regions. An option for searching nucleotide sequences by homology using BLAST is also included. TRRD is available at http://www.bionet.nsc.ru/trrd/.
Subject(s)
Nucleic Acid Databases , Genetic Transcription , Animals , Binding Sites , Computer Graphics , DNA-Binding Proteins/metabolism , Gene Silencing , Humans , Information Storage and Retrieval , Internet , Quality Control , Nucleic Acid Regulatory Sequences , Nucleic Acid Sequence Homology , Structure-Activity Relationship , Transcription Factors/metabolism , Transcriptional Activation
ABSTRACT
The review describes several modules of the GeneExpress integrated computer system concerning the regulation of gene expression in eukaryotes. Approaches to the presentation of experimental data in databases are considered. The employment of GeneExpress in computer analysis and modeling of the organization and function of genetic systems is illustrated with examples. GeneExpress is available at http://wwwmgs.bionet.nsc.ru/mgs/gnw/.
Subject(s)
Gene Expression Regulation , Systems Integration , Animals , Genetic Databases , Molecular Evolution , Genetic Promoter Regions , Messenger RNA/genetics , Vertebrates/genetics
Subject(s)
Metagenomics/methods , Proteins/chemistry , Proteins/metabolism , Software , Amino Acid Sequence , Cellulases/chemistry , Cellulases/genetics , Cellulases/metabolism , Enzymes/chemistry , Enzymes/metabolism , Molecular Sequence Data , Open Reading Frames , Protein Conformation , Tertiary Protein Structure , Proteins/genetics , Amino Acid Sequence Homology
ABSTRACT
MOTIVATION: The rapid growth in the number of genes with known sequences calls for automated tools for their classification and analysis. It has become clear that nucleosome packaging of eukaryotic DNA is very important for gene functioning. Automated computer tools for characterizing nucleosome packaging density could be useful for studying gene regulation and for genome annotation. RESULTS: A program for constructing nucleosome formation potential profiles of eukaryotic DNA sequences was developed. Nucleosome packaging density was analyzed for different functional types of human promoters. It was found that in promoters of tissue-specific genes, the nucleosome formation potential was substantially higher than in genes expressed in many tissues, or housekeeping genes. Hence, the capability of nucleosome positioning in the promoter region may serve as a factor regulating gene expression. AVAILABILITY: The program for nucleosome site recognition is included in the GeneExpress system; section 'DNA Nucleosomal Organization', http://wwwmgs.bionet.nsc.ru/mgs/programs/recon/.
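The notion of a profile computed along a DNA sequence can be illustrated with a generic sliding-window sketch. It does not reproduce the program described above: the dinucleotide weights, the window size, and the function names are placeholders assumed for illustration, and a real nucleosome formation potential would be derived from experimental positioning data.

```python
# Placeholder dinucleotide weights (assumed values, NOT trained on any data).
WEIGHTS = {"AA": 1.0, "TT": 1.0, "AT": 0.5, "TA": 0.5, "GC": -0.5, "CG": -1.0}


def window_score(window):
    """Mean weight over all overlapping dinucleotides in one window."""
    pairs = [window[i:i + 2] for i in range(len(window) - 1)]
    return sum(WEIGHTS.get(pair, 0.0) for pair in pairs) / len(pairs)


def potential_profile(sequence, window=10, step=1):
    """Return (start, score) for each window position along the sequence."""
    last_start = len(sequence) - window
    return [
        (start, window_score(sequence[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


# Toy sequence: an A-rich half followed by a CG-rich half.
profile = potential_profile("AAAAAAAAAACGCGCGCGCG", window=10)
```

Comparing such profiles between sequence classes (for example promoters of different gene types) is then a matter of summarizing the per-window scores for each class.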
Subject(s)
DNA/genetics , Nucleosomes/genetics , Software , Algorithms , Base Composition , Computational Biology , DNA/chemistry , Eukaryotic Cells , Gene Expression , Histones/chemistry , Histones/genetics , Genetic Promoter Regions
ABSTRACT
A program for constructing nucleosome formation potential profile was applied for investigation of exons, introns, and repetitive sequences. The program is available at http://wwwmgs.bionet.nsc.ru/mgs/programs/recon/. We have demonstrated that introns and repetitive sequences exhibit higher nucleosome formation potentials than exons. This fact may be explained by functional saturation of exons with genetic code, hindering the localization of efficient nucleosome positioning sites.
Subject(s)
Alu Elements , Exons , Introns , Nucleosomes/genetics , Software , Computational Biology , Nucleic Acid Databases , Humans
ABSTRACT
MOTIVATION: The commonly accepted statistical mechanical theory has been confirmed many times by weight matrix methods that successfully recognize DNA sites binding regulatory proteins in prokaryotes. Nevertheless, a recent evaluation of weight matrix methods applied to transcription factor binding site recognition in eukaryotes unexpectedly revealed that the matrix scores correlate better with each other than with the activity of DNA sites interacting with proteins. This observation indicates that the molecular mechanisms of DNA/protein recognition are more complicated in eukaryotes than in prokaryotes. The following processes may be considered as additional events in eukaryotes: (i) competition between the proteins and the nucleosome core particle for DNA sites binding these proteins and (ii) interaction between two synergistic/antagonistic proteins recognizing a composite element made up of two DNA sites binding these proteins. That is why identifying the sequence-dependent DNA features that correlate with the affinity magnitudes of DNA sites interacting with a protein can pinpoint the molecular event limiting this protein/DNA recognition machinery. RESULTS: An approach for predicting site activity based on its primary nucleotide sequence has been developed. The approach is implemented in the computer system ACTIVITY, which contains databases on site activity and on conformational and physicochemical DNA/RNA parameters. Using the ACTIVITY system, some sites were analyzed and methods for predicting site activity were constructed. The methods developed are in good agreement with the experimental data. AVAILABILITY: The database ACTIVITY is available at http://wwwmgs.bionet.nsc.ru/systems/Activity/ and the mirror site, http://www.cbil.upenn.edu/mgs/systems/activity/.
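For readers unfamiliar with weight matrix methods, the following is a minimal sketch of a log-odds position weight matrix built from aligned sites and used to scan a sequence. It is a generic textbook construction, not the ACTIVITY method: the aligned sites, the pseudocount, the uniform background, and the function names are assumptions made for illustration, and the code expects sequences containing only A, C, G, and T.

```python
import math

# Hypothetical aligned binding sites (illustrative only, not from any database).
SITES = ["TATAAA", "TATAAA", "TATATA", "TATAAG"]


def build_log_odds_matrix(sites, pseudocount=1.0, background=0.25):
    """One dict per position: base -> log2(observed frequency / background)."""
    matrix = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2((counts[base] / total) / background) for base in "ACGT"}
        )
    return matrix


def score_window(matrix, window):
    """Sum of per-position log-odds for a window as wide as the matrix."""
    return sum(column[base] for column, base in zip(matrix, window))


def best_hit(matrix, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(matrix)
    hits = [
        (score_window(matrix, sequence[start:start + width]), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(hits)


matrix = build_log_odds_matrix(SITES)
score, position = best_hit(matrix, "GCGCTATAAAGGC")
```

A matrix score of this kind reflects only similarity to the training sites; the abstract's point is that, in eukaryotes, such scores may track measured site activity less well than expected.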
Subject(s)
Computer Systems , DNA/genetics , DNA/metabolism , Proteins/metabolism , Algorithms , Animals , Base Sequence , Binding Sites/genetics , Chemical Phenomena , Physical Chemistry , DNA/chemistry , Factual Databases , Humans , MADS Domain Proteins , MEF2 Transcription Factors , Molecular Sequence Data , Mutation , Myogenic Regulatory Factors/genetics , Myogenic Regulatory Factors/metabolism , Nucleic Acid Conformation , TATA Box
ABSTRACT
TRANSFAC, TRRD (Transcription Regulatory Region Database) and COMPEL are databases which store information about transcriptional regulation in eukaryotic cells. The three databases provide distinct views on the components involved in transcription: transcription factors and their binding sites and binding profiles (TRANSFAC), the regulatory hierarchy of whole genes (TRRD), and the structural and functional properties of composite elements (COMPEL). The quantitative and qualitative changes of all three databases and connected programs are described. The databases are accessible via WWW: http://transfac.gbf.de/TRANSFAC or http://www.bionet.nsc.ru/TRRD
Subject(s)
Factual Databases , Gene Expression Regulation , Genetic Transcription , Animals , Computer Communication Networks , Humans , Software , Transcription Factors , User-Computer Interface
ABSTRACT
Transcription Regulatory Regions Database (TRRD) has been developed for accumulation of experimental information on the structure-function features of regulatory regions of eukaryotic genes. Each entry in TRRD corresponds to a particular gene and contains a description of structure-function features of its regulatory regions (transcription factor binding sites, promoters, enhancers, silencers, etc.) and gene expression regulation patterns. The current release, TRRD 4.2.5, comprises the description of 760 genes, 3403 expression patterns, and >4600 regulatory elements including 3604 transcription factor binding sites, 600 promoters and 152 enhancers. This information was obtained through annotation of 2537 scientific publications. TRRD 4.2.5 is available through the WWW at http://wwwmgs.bionet.nsc.ru/mgs/dbases/trrd4/
Subject(s)
Factual Databases , Genetic Transcription , Genetic Enhancer Elements , Internet , Genetic Promoter Regions , Nucleic Acid Regulatory Sequences
ABSTRACT
The Transcription Regulatory Regions Database (TRRD) is a curated database designed for accumulation of experimental data on extended regulatory regions of eukaryotic genes, the regulatory elements they contain, i.e., transcription factor binding sites, promoters, enhancers, silencers, etc., and expression patterns of the genes. Release 4.1 of TRRD offers a number of significant improvements, in particular, a more detailed description of transcription factor binding sites, transcription factors per se, and gene expression patterns in a computer-readable format. In addition, the new TRRD release provides considerably more references to other molecular biological databases. TRRD 4.1 is installed under SRS and is available through the WWW at http://www.bionet.nsc.ru/trrd/
Subject(s)
Factual Databases , Nucleic Acid Regulatory Sequences/genetics , Genetic Transcription/genetics , Animals , Base Sequence , Binding Sites , Cell Line , Factual Databases/trends , Genetic Enhancer Elements/genetics , Eukaryotic Cells , Gene Expression Regulation/genetics , Glutathione Peroxidase/genetics , Information Storage and Retrieval , Internet , Mice , Organ Specificity , Genetic Promoter Regions/genetics , Response Elements/genetics , Russia , Transcription Factors/genetics , User-Computer Interface
ABSTRACT
The GeneExpress system has been designed to integrate description, analysis, and recognition of eukaryotic regulatory sequences. The system includes 5 basic units: (1) GeneNet contains an object-oriented database for accumulation of data on gene networks and signal transduction pathways and a Java-based viewer for exploring and visualizing the GeneNet information; (2) Transcription Regulation combines the database on transcription regulatory regions of eukaryotic genes (TRRD) and TRRD Viewer; (3) Transcription Factor Binding Site Recognition contains a compilation of transcription factor binding sites (TFBSC) and programs for their analysis and recognition; (4) mRNA Translation is designed for analysis of structural and contextual features of mRNA 5'UTRs and prediction of their translation efficiency; and (5) ACTIVITY is the module for analysis and site activity prediction of a given nucleotide sequence. Integration of the databases in GeneExpress is based on the Sequence Retrieval System (SRS) created at the European Bioinformatics Institute.
Subject(s)
Computer Systems , Regulator Genes , Genome , Artificial Intelligence , Binding Sites , Factual Databases , Eukaryotic Cells , Gene Expression Regulation , Protein Biosynthesis , Messenger RNA/genetics , Software , Transcription Factors/genetics , Transcription Factors/metabolism , Genetic Transcription
ABSTRACT
MOTIVATION: The goal of the work was to develop a WWW-oriented computer system providing a maximal integration of informational and software resources on the regulation of gene expression and navigation through them. Rapid growth of the variety and volume of information accumulated in the databases on regulation of gene expression necessarily requires the development of computer systems for automated discovery of the knowledge that can be further used for analysis of regulatory genomic sequences. RESULTS: The GeneExpress system developed includes the following major informational and software modules: (1) Transcription Regulation (TRRD) module, which contains the databases on transcription regulatory regions of eukaryotic genes and TRRD Viewer for data visualization; (2) Site Activity Prediction (ACTIVITY), the module for analysis of functional site activity and its prediction; (3) Site Recognition module, which comprises (a) B-DNA-VIDEO system for detecting the conformational and physicochemical properties of DNA sites significant for their recognition, (b) Consensus and Weight Matrices (ConsFrec) and (c) Transcription Factor Binding Sites Recognition (TFBSR) systems for detecting conservative contextual regions of functional sites and their recognition; (4) Gene Networks (GeneNet), which contains an object-oriented database accumulating the data on gene networks and signal transduction pathways, and the Java-based Viewer for exploration and visualization of the GeneNet information; (5) mRNA Translation (Leader mRNA), designed to analyze structural and contextual properties of mRNA 5'-untranslated regions (5'-UTRs) and predict their translation efficiency; (6) other program modules designed to study the structure-function organization of regulatory genomic sequences and regulatory proteins. AVAILABILITY: GeneExpress is available at http://wwwmgs.bionet.nsc.ru/systems/GeneExpress/ and the links to the mirror site(s) can be found at http://wwwmgs.bionet.nsc.ru/mgs/links/mirrors.html.