ABSTRACT
UNLABELLED: This paper presents a language for describing arrangements of motifs in biological sequences, and a program that uses the language to find the arrangements in motif match databases. The program does not by itself search for the constituent motifs, and is thus independent of how they are detected, which allows it to use motif match data of various origins. AVAILABILITY: The program can be tested online at http://hits.isb-sib.ch and the distribution is available from ftp://ftp.isrec.isb-sib.ch/pub/software/unix/mmsearch-1.0.tar.gz CONTACT: Thomas.Junier@isrec.unil.ch SUPPLEMENTARY INFORMATION: The full documentation about mmsearch is available from http://hits.isb-sib.ch/~tjunier/mmsearch/doc.
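To make the idea of searching precomputed motif matches concrete, here is a minimal Python sketch. It is not the mmsearch language or its implementation: the `Match` record layout, the `find_arrangements` function and the `max_gap` parameter are all assumptions introduced for illustration. It only shows how an ordered, non-overlapping arrangement of motifs can be found without rerunning any motif detection.

```python
from collections import namedtuple

# Hypothetical record layout for one precomputed motif match
# (1-based, inclusive coordinates); not the mmsearch data format.
Match = namedtuple("Match", "seq_id motif start end")


def find_arrangements(matches, motifs, max_gap):
    """Find sequences in which `motifs` occur in the given order.

    Consecutive motifs must not overlap and may be separated by at most
    `max_gap` residues. Returns a list of (seq_id, [Match, ...]) tuples.
    """
    # Group the matches by the sequence they were found on.
    by_seq = {}
    for m in matches:
        by_seq.setdefault(m.seq_id, []).append(m)

    results = []
    for seq_id, hits in by_seq.items():
        hits.sort(key=lambda m: m.start)

        def extend(chain, idx):
            # All motifs placed: record the complete arrangement.
            if idx == len(motifs):
                results.append((seq_id, list(chain)))
                return
            for h in hits:
                if h.motif != motifs[idx]:
                    continue
                if chain:
                    gap = h.start - chain[-1].end - 1
                    if gap < 0 or gap > max_gap:
                        continue
                extend(chain + [h], idx + 1)

        extend([], 0)
    return results


# Toy match data (invented for this example).
matches = [
    Match("P1", "A", 10, 20),
    Match("P1", "B", 30, 45),
    Match("P2", "B", 5, 15),
    Match("P2", "A", 40, 50),
]
hits = find_arrangements(matches, ["A", "B"], max_gap=50)
```

With this toy data only `P1` qualifies, because on `P2` motif B precedes motif A. The search reads match records only, so the matches could come from any detection method.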
Subject(s)
Programming Languages , Proteins/analysis , Software , Receptors, Cytokine/analysis , Thioredoxins/analysis
ABSTRACT
High throughput genome (HTG) and expressed sequence tag (EST) sequences are currently the most abundant nucleotide sequence classes in the public database. The large volume, high degree of fragmentation and lack of gene structure annotations prevent efficient and effective searches of HTG and EST data for protein sequence homologies by standard search methods. Here, we briefly describe three newly developed resources that should make discovery of interesting genes in these sequence classes easier in the future, especially to biologists not having access to a powerful local bioinformatics environment. trEST and trGEN are regularly regenerated databases of hypothetical protein sequences predicted from EST and HTG sequences, respectively. Hits is a web-based data retrieval and analysis system providing access to precomputed matches between protein sequences (including sequences from trEST and trGEN) and patterns and profiles from Prosite and Pfam. The three resources can be accessed via the Hits home page (http://hits.isb-sib.ch).
Subject(s)
Amino Acid Sequence , Expressed Sequence Tags , Markov Chains , Animals , Databases, Factual , Humans , Information Services , Internet , Molecular Sequence Data , Proteins/genetics , Sequence Alignment , Sequence Homology, Amino Acid
ABSTRACT
UNLABELLED: Dotlet is a program for comparing sequences by the diagonal plot method. It is designed to be platform-independent and to run in a Web browser, thus enabling the majority of researchers to use it. AVAILABILITY: The applet can be tested at http://www.isrec.isb-sib.ch/java/dotlet/Dotlet.html, and the source code is available upon request. CONTACT: Thomas.Junier@isrec.unil.ch; Marco.Pagni@isrec.unil.ch SUPPLEMENTARY: The full documentation about Dotlet is available from the above URL.
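The diagonal plot (dot plot) method can be illustrated with a short Python sketch. This is a simplified version that counts exact identities in a sliding window; it is not Dotlet's code or scoring scheme, and the names `dot_plot`, `render`, `window` and `min_matches` are assumptions made for the example.

```python
def dot_plot(seq_a, seq_b, window=3, min_matches=3):
    """Return a grid of booleans: cell (i, j) is True when the windows
    starting at seq_a[i] and seq_b[j] share at least `min_matches`
    identical positions."""
    rows = []
    for i in range(len(seq_a) - window + 1):
        row = []
        for j in range(len(seq_b) - window + 1):
            same = sum(
                1 for k in range(window) if seq_a[i + k] == seq_b[j + k]
            )
            row.append(same >= min_matches)
        rows.append(row)
    return rows


def render(plot):
    """Draw the grid as text: '*' for a dot, '.' for an empty cell."""
    return "\n".join(
        "".join("*" if cell else "." for cell in row) for row in plot
    )


# Comparing a sequence with itself gives an unbroken main diagonal.
plot = dot_plot("GATTACA", "GATTACA")
print(render(plot))
```

Similar regions show up as diagonal runs of dots, and repeats as additional parallel diagonals. Lowering `min_matches` below `window` tolerates mismatches at the cost of more background dots.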
Subject(s)
DNA/analysis , Proteins/analysis , Sequence Alignment/methods , Software , Internet
ABSTRACT
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes a description of the initiation site mapping data, exhaustive cross-references to the EMBL nucleotide sequence database, SWISS-PROT, TRANSFAC and other databases, as well as bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. WWW-based interfaces have been developed that enable the user to view EPD entries in different formats, to select and extract promoter sequences according to a variety of criteria, and to navigate to related databases exploiting different cross-references. The EPD web site also features yearly updated base frequency matrices for major eukaryotic promoter elements. EPD can be accessed at http://www.epd.isb-sib.ch
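As an illustration of how a pointer to a position in a nucleotide sequence entry can yield a promoter sequence, the Python sketch below extracts the region flanking a transcription start site. The pointer layout (a 1-based position plus a strand), the function name `promoter_region` and its parameters are assumptions for this example and do not reproduce the EPD entry format.

```python
# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def promoter_region(entry_seq, tss, strand, upstream, downstream):
    """Return `upstream` bases before the transcription start site plus
    `downstream` bases beginning at the site itself, read 5'->3' on the
    promoter's strand.

    `tss` is the 1-based position of the start site in `entry_seq`;
    `strand` is "+" or "-". Both form the hypothetical pointer.
    """
    if strand == "+":
        begin = tss - 1 - upstream
        end = tss - 1 + downstream
    else:
        # On the reverse strand, downstream runs toward lower coordinates.
        begin = tss - downstream
        end = tss + upstream
    if begin < 0 or end > len(entry_seq):
        raise ValueError("region extends beyond the sequence entry")
    region = entry_seq[begin:end]
    if strand == "-":
        # Reverse-complement to read the region on the reverse strand.
        region = region.translate(COMPLEMENT)[::-1]
    return region


entry = "ACGTTGCAAT"  # invented toy sequence entry
forward = promoter_region(entry, tss=5, strand="+", upstream=2, downstream=3)
reverse = promoter_region(entry, tss=5, strand="-", upstream=2, downstream=3)
```

Storing a pointer instead of the sequence itself means the flanking window can be chosen at extraction time, which suits building promoter subsets for comparative analysis.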
Subject(s)
Databases, Factual , Promoter Regions, Genetic , Database Management Systems , Eukaryotic Cells , Internet , User-Computer Interface
ABSTRACT
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters, for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes description of the initiation site mapping data, cross-references to other databases, and bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. Recent efforts have focused on exhaustive cross-referencing to the EMBL nucleotide sequence database, and on the improvement of the WWW-based user interfaces and data retrieval mechanisms. EPD can be accessed at http://www.epd.isb-sib.ch
Subject(s)
Databases, Factual , Eukaryotic Cells , Promoter Regions, Genetic/genetics , RNA Polymerase II/physiology , Algorithms , Animals , Base Sequence , Genes, Viral/genetics , Genome , Information Storage and Retrieval , Internet , RNA, Messenger/genetics , Sequence Homology, Nucleic Acid , Transcription, Genetic/genetics , User-Computer Interface
ABSTRACT
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of experimentally characterised eukaryotic POL II promoters. The underlying definition of a promoter is that of a transcription initiation site. All information presented in EPD results from an independent evaluation of primary experimental data shown in the biological literature. Sequences flanking transcription initiation sites are indirectly given by pointers to EMBL sequences. The annotation part of a promoter entry includes description of the promoter-defining evidence, cross-references to other databases, and bibliographic references. Being designed as a resource for comparative sequence analysis, EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets. The database is available through the World Wide Web at URL http://cmpteam4.unil.ch
Subject(s)
Databases, Factual , Promoter Regions, Genetic , Animals , Computer Communication Networks , DNA Polymerase II/metabolism , Eukaryotic Cells , Humans
ABSTRACT
SEView is a Java applet that represents known or predicted elements of a protein or nucleotide sequence. It replaces or supplements the textual format of databases or program output with an interactive, graphical representation that is easily available through a WWW browser. Independence from the source data's format is achieved through a description language and ad hoc translators, which make the system versatile and flexible.