Results 1 - 18 of 18
1.
Hum Mutat ; 40(12): 2230-2238, 2019 12.
Article in English | MEDLINE | ID: mdl-31433103

ABSTRACT

Each year diagnostic laboratories in the Netherlands profile thousands of individuals for heritable disease using next-generation sequencing (NGS). This requires pathogenicity classification of millions of DNA variants on the standard 5-tier scale. To reduce time spent on data interpretation and increase data quality and reliability, the nine Dutch labs decided to publicly share their classifications. Variant classifications of nearly 100,000 unique variants were catalogued and compared in a centralized MOLGENIS database. Variants classified by more than one center were labeled as "consensus" when classifications agreed, and shared internationally with LOVD and ClinVar. When classifications opposed (LB/B vs. LP/P), they were labeled "conflicting", while other nonconsensus observations were labeled "no consensus". We assessed our classifications using the InterVar software to compare to ACMG 2015 guidelines, showing 99.7% overall consistency with only 0.3% discrepancies. Differences in classifications between Dutch labs or between Dutch labs and ACMG were mainly present in genes with low penetrance or for late onset disorders and highlight limitations of the current 5-tier classification system. The data sharing boosted the quality of DNA diagnostics in Dutch labs, an initiative we hope will be followed internationally. Recently, a positive match with a case from outside our consortium resulted in a more definite disease diagnosis.


Subject(s)
Genetic Diseases, Inborn/diagnosis; Genetic Variation; High-Throughput Nucleotide Sequencing/methods; Information Dissemination/methods; Data Accuracy; Databases, Genetic; Genetic Diseases, Inborn/genetics; Guidelines as Topic; Humans; Laboratories; Netherlands; Sequence Analysis, DNA
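The labeling rule described in the abstract of entry 1 can be illustrated with a minimal Python sketch. The function name, the tier abbreviations as strings, and the "single lab" label are assumptions made for illustration. The abstract does not state whether B and LB together count as agreement, so this sketch treats only identical labels as agreement.

```python
from typing import Iterable

# 5-tier scale: B, LB, VUS, LP, P (string codes are an assumption of this sketch).
BENIGN = {"B", "LB"}
PATHOGENIC = {"LP", "P"}


def label_variant(classifications: Iterable[str]) -> str:
    """Label one variant from the classifications submitted by different labs."""
    calls = list(classifications)
    if len(calls) < 2:
        # Only variants classified by more than one center are compared.
        return "single lab"
    distinct = set(calls)
    if len(distinct) == 1:
        # Assumption: agreement means every lab reported the same tier.
        return "consensus"
    if distinct & BENIGN and distinct & PATHOGENIC:
        # Opposed classifications: LB/B versus LP/P.
        return "conflicting"
    return "no consensus"


if __name__ == "__main__":
    for calls in (["LB", "LB"], ["LB", "P"], ["VUS", "LP"]):
        print(calls, "->", label_variant(calls))
```

A stricter or looser notion of agreement (for example, treating B and LB as equivalent) would only change the `consensus` branch.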
2.
J Mol Diagn ; 21(2): 261-273, 2019 03.
Article in English | MEDLINE | ID: mdl-30576869

ABSTRACT

A common approach in clinical diagnostic laboratories to variant assessment from tumor molecular profiling is sequencing of genomic DNA extracted from both tumor (somatic) and normal (germline) tissue, with subsequent variant comparison to identify true somatic variants with potential impact on patient treatment or prognosis. However, challenges exist in paired tumor-normal testing, including increased cost of dual sample testing and identification of germline cancer predisposing variants. Alternatively, somatic variants can be identified by in silico tumor-only variant filtration precluding the need for matched normal testing. The barrier to tumor-only variant filtration is defining a reliable approach, with high sensitivity and specificity to identify somatic variants. In this study, we used retrospective data sets from paired tumor-normal samples tested on small (48 gene) and large (555 gene) targeted next-generation sequencing panels, to model algorithms for tumor-only variants classification. The optimal algorithm required an ordinal filtering approach using information from variant population databases (1000 Genomes Phase 3, ESP6500, ExAC), clinical mutation databases (ClinVar), and information on recurring clinically relevant somatic variants. Overall the tumor-only variant filtration strategy described in this study can define clinically relevant somatic variants from tumor-only analysis with sensitivity of 97% to 99% and specificity of 87% to 94%, and with significant potential utility for clinical laboratories implementing tumor-only molecular profiling.


Subject(s)
High-Throughput Nucleotide Sequencing/methods; Algorithms; Computational Biology/methods; Humans; Mutation/genetics; Neoplasms/genetics; Retrospective Studies
3.
Nat Biotechnol ; 23(1): 137-44, 2005 Jan.
Article in English | MEDLINE | ID: mdl-15637633

ABSTRACT

The prediction of regulatory elements is a problem where computational methods offer great hope. Over the past few years, numerous tools have become available for this task. The purpose of the current assessment is twofold: to provide some guidance to users regarding the accuracy of currently available tools in various settings, and to provide a benchmark of data sets for assessing future tools.


Subject(s)
Computational Biology/methods; Gene Expression; Transcription, Genetic; Amino Acid Motifs; Animals; Binding Sites; Databases, Protein; Drosophila; Fungal Proteins/chemistry; Humans; Internet; Mice; Reproducibility of Results; Software
4.
J Cheminform ; 10(1): 9, 2018 Mar 07.
Article in English | MEDLINE | ID: mdl-29516311

ABSTRACT

Spectrophores are novel descriptors that are calculated from the three-dimensional atomic properties of molecules. In our current implementation, the atomic properties that were used to calculate spectrophores include atomic partial charges, atomic lipophilicity indices, atomic shape deviations and atomic softness properties. This approach can easily be widened to also include additional atomic properties. Our novel methodology finds its roots in the experimental affinity fingerprinting technology developed in the 1990s by Terrapin Technologies. Here we have translated it into a purely virtual approach using artificial affinity cages and a simplified metric to calculate the interaction between these cages and the atomic properties. A typical spectrophore consists of a vector of 48 real numbers. This makes it highly suitable for the calculation of a wide range of similarity measures for use in virtual screening and for the investigation of quantitative structure-activity relationships in combination with advanced statistical approaches such as self-organizing maps, support vector machines and neural networks. In our present report we demonstrate the applicability of our novel methodology for scaffold hopping as well as virtual screening.

5.
Nucleic Acids Res ; 33(Web Server issue): W393-6, 2005 Jul 01.
Article in English | MEDLINE | ID: mdl-15980497

ABSTRACT

We present the second and improved release of the TOUCAN workbench for cis-regulatory sequence analysis. TOUCAN implements and integrates fast state-of-the-art methods and strategies in gene regulation bioinformatics, including algorithms for comparative genomics and for the detection of cis-regulatory modules. This second release of TOUCAN has become open source and thereby carries the potential to evolve rapidly. The main goal of TOUCAN is to allow a user to come to testable hypotheses regarding the regulation of a gene or of a set of co-regulated genes. TOUCAN can be launched from this location: http://www.esat.kuleuven.ac.be/~saerts/software/toucan.php.


Subject(s)
Gene Expression Regulation; Regulatory Sequences, Nucleic Acid; Sequence Analysis, DNA/methods; Software; Algorithms; Genomics; Internet; User-Computer Interface
6.
BMC Bioinformatics ; 7: 160, 2006 Mar 20.
Article in English | MEDLINE | ID: mdl-16549017

ABSTRACT

BACKGROUND: Several motif detection algorithms have been developed to discover overrepresented motifs in sets of coexpressed genes. However, in a noisy gene list, the number of genes containing the motif versus the number lacking the motif might not be sufficiently high to allow detection by classical motif detection tools. To still recover motifs which are not significantly enriched but still present, we developed a procedure in which we use phylogenetic footprinting to first delineate all potential motifs in each gene. Then we mutually compare all detected motifs and identify the ones that are shared by at least a few genes in the data set as potential candidates. RESULTS: We applied our methodology to a compiled test data set containing known regulatory motifs and to two biological data sets derived from genome wide expression studies. By executing four consecutive steps of 1) identifying conserved regions in orthologous intergenic regions, 2) aligning these conserved regions, 3) clustering the conserved regions containing similar regulatory regions followed by extraction of the regulatory motifs and 4) screening the input intergenic sequences with detected regulatory motif models, our methodology proves to be a powerful tool for detecting regulatory motifs when a low signal to noise ratio is present in the input data set. Comparing our results with two other motif detection algorithms points out the robustness of our algorithm. CONCLUSION: We developed an approach that can reliably identify multiple regulatory motifs lacking a high degree of overrepresentation in a set of coexpressed genes (motifs belonging to sparsely connected hubs in the regulatory network) by exploiting the advantages of using both coexpression and phylogenetic information.


Subject(s)
Algorithms; DNA, Bacterial/genetics; Gene Expression Regulation, Bacterial; Phylogeny; Regulatory Sequences, Nucleic Acid/genetics; Sequence Analysis, DNA; Yersinia pestis/genetics; Base Sequence; Cluster Analysis; Consensus Sequence/genetics; DNA Footprinting; Gene Expression Profiling; Molecular Sequence Data; Oligonucleotide Array Sequence Analysis; Reproducibility of Results; Sequence Analysis, DNA/methods
7.
Nucleic Acids Res ; 31(6): 1753-64, 2003 Mar 15.
Article in English | MEDLINE | ID: mdl-12626717

ABSTRACT

TOUCAN is a Java application for the rapid discovery of significant cis-regulatory elements from sets of coexpressed or coregulated genes. Biologists can automatically (i) retrieve genes and intergenic regions, (ii) identify putative regulatory regions, (iii) score sequences for known transcription factor binding sites, (iv) identify candidate motifs for unknown binding sites, and (v) detect those statistically over-represented sites that are characteristic for a gene set. Genes or intergenic regions are retrieved from Ensembl or EMBL, together with orthologs and supporting information. Orthologs are aligned and syntenic regions are selected as candidate regulatory regions. Putative sites for known transcription factors are detected using our MotifScanner, which scores position weight matrices using a probabilistic model. New motifs are detected using our MotifSampler based on Gibbs sampling. Binding sites characteristic for a gene set (and thus statistically over-represented with respect to a reference sequence set) are found using a binomial test. We have validated TOUCAN by analyzing muscle-specific genes, liver-specific genes and E2F target genes; we have easily detected many known binding sites within intergenic DNA and identified new biologically plausible sites for known and unknown transcription factors. Software available at http://www.esat.kuleuven.ac.be/~dna/BioI/Software.html.


Subject(s)
Cell Cycle Proteins; DNA-Binding Proteins; Gene Expression Regulation/genetics; Software; Algorithms; Binding Sites/genetics; Computational Biology/methods; E2F Transcription Factors; Genome, Human; Humans; Liver/metabolism; Muscles/metabolism; Promoter Regions, Genetic/genetics; Transcription Factors/metabolism
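The binomial test mentioned in the abstract of entry 7 can be sketched as a one-sided tail probability. This is a generic illustration, not TOUCAN's own code: the function name is hypothetical, and estimating the background frequency `p` as the fraction of reference sequences that contain the site is an assumption of this sketch.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    k: number of gene-set sequences containing the binding site
    n: total number of sequences in the gene set
    p: assumed per-sequence probability of the site in the reference set
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


if __name__ == "__main__":
    # Illustrative numbers only: a site seen in 8 of 20 sequences,
    # with an assumed reference frequency of 0.1.
    print(binomial_upper_tail(8, 20, 0.1))
```

A small tail probability suggests the site occurs more often in the gene set than the reference frequency would predict. When many candidate sites are tested, some correction for multiple testing would normally be applied.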
8.
Nucleic Acids Res ; 31(13): 3468-70, 2003 Jul 01.
Article in English | MEDLINE | ID: mdl-12824346

ABSTRACT

INCLUSive is a suite of algorithms and tools for the analysis of gene expression data and the discovery of cis-regulatory sequence elements. The tools allow normalization, filtering and clustering of microarray data, functional scoring of gene clusters, sequence retrieval, and detection of known and unknown regulatory elements using probabilistic sequence models and Gibbs sampling. All tools are available via different web pages and as web services. The web pages are connected and integrated to reflect a methodology and facilitate complex analysis using different tools. The web services can be invoked using standard SOAP messaging. Example clients are available for download to invoke the services from a remote computer or to be integrated with other applications. All services are catalogued and described in a web service registry. The INCLUSive web portal is available for academic purposes at http://www.esat.kuleuven.ac.be/inclusive.


Subject(s)
Gene Expression Profiling/methods; Oligonucleotide Array Sequence Analysis/methods; Regulatory Sequences, Nucleic Acid; Software; Algorithms; Cluster Analysis; Internet; Registries; Sequence Analysis/methods; Systems Integration
9.
Nucleic Acids Res ; 30(1): 325-7, 2002 Jan 01.
Article in English | MEDLINE | ID: mdl-11752327

ABSTRACT

PlantCARE is a database of plant cis-acting regulatory elements, enhancers and repressors. Regulatory elements are represented by positional matrices, consensus sequences and individual sites on particular promoter sequences. Links to the EMBL, TRANSFAC and MEDLINE databases are provided when available. Data about the transcription sites are extracted mainly from the literature, supplemented with an increasing number of in silico predicted data. Apart from a general description for specific transcription factor sites, levels of confidence for the experimental evidence, functional information and the position on the promoter are given as well. New features have been implemented to search for plant cis-acting regulatory elements in a query sequence. Furthermore, links are now provided to a new clustering and motif search method to investigate clusters of co-expressed genes. New regulatory elements can be sent automatically and will be added to the database after curation. The PlantCARE relational database is available via the World Wide Web at http://sphinx.rug.ac.be:8080/PlantCARE/.


Subject(s)
Databases, Nucleic Acid; Gene Expression Regulation, Plant; Genes, Plant; Promoter Regions, Genetic; Consensus Sequence; Enhancer Elements, Genetic; Genome, Plant; Information Storage and Retrieval; Internet; Multigene Family; Regulatory Sequences, Nucleic Acid; Systems Integration; Transcription, Genetic
10.
Trends Microbiol ; 11(2): 61-6, 2003 Feb.
Article in English | MEDLINE | ID: mdl-12598125

ABSTRACT

Motif detection based on Gibbs sampling is a common procedure used to retrieve regulatory motifs in silico. Using a species-specific background model was previously shown to increase the robustness of the algorithm. Here, we demonstrate that selecting a non-species-adapted background model can have an adverse effect on the results of motif detection. The large differences in the average nucleotide composition of prokaryotic sequences exacerbate the problem of exchanging background models. Therefore, we have developed complex background models for all prokaryotic species with available genome sequences.


Subject(s)
Genomics; Models, Genetic; Regulatory Sequences, Nucleic Acid; Sequence Analysis, DNA; Algorithms; DNA, Intergenic/analysis; Escherichia coli/genetics; Genome; Promoter Regions, Genetic; Pseudomonas aeruginosa/genetics; Sigma Factor/genetics; Species Specificity
11.
Bioinformatics ; 19 Suppl 2: ii5-14, 2003 Oct.
Article in English | MEDLINE | ID: mdl-14534164

ABSTRACT

MOTIVATION: The transcriptional regulation of a metazoan gene depends on the cooperative action of multiple transcription factors that bind to cis-regulatory modules (CRMs) located in the neighborhood of the gene. By integrating multiple signals, CRMs confer an organism-specific spatial and temporal rate of transcription. RESULTS: Based on the hypothesis that genes that are needed in exactly the same conditions might share similar regulatory switches, we have developed a novel methodology to find CRMs in a set of coexpressed or coregulated genes. The ModuleSearcher algorithm finds for a given gene set the best scoring combination of transcription factor binding sites within a sequence window using an A* procedure for tree searching. To keep the level of noise low, we use DNA sequences that are most likely to contain functional cis-regulatory information, namely conserved regions between human and mouse orthologous genes. The ModuleScanner performs genomic searches with a predicted CRM or with a user-defined CRM known from the literature to find possible target genes. The validity of a set of putative targets is checked using Gene Ontology annotations. We demonstrate the use and effectiveness of the ModuleSearcher and ModuleScanner algorithms and test their specificity and sensitivity on semi-artificial data. Next, we search for a module in a cluster of gene expression profiles of human cell cycle genes. AVAILABILITY: The ModuleSearcher is available as a web service within the TOUCAN workbench for regulatory sequence analysis, which can be downloaded from http://www.esat.kuleuven.ac.be/~dna/BioI.


Subject(s)
Algorithms; Chromosome Mapping/methods; Regulatory Elements, Transcriptional/genetics; Sequence Analysis, DNA/methods; Software; Transcription Factors/genetics; Transcription, Genetic/genetics; Base Sequence; Binding Sites; Molecular Sequence Data; Protein Binding
12.
BMC Genomics ; 5(1): 34, 2004 Jun 01.
Article in English | MEDLINE | ID: mdl-15171795

ABSTRACT

BACKGROUND: The transcription start site of a metazoan gene remains poorly understood, mostly because there is no clear signal present in all genes. Now that several sequenced metazoan genomes have been annotated, we have been able to compare the base composition around the transcription start site for all annotated genes across multiple genomes. RESULTS: The most prominent feature in the base compositions is a significant local variation in G+C content over a large region around the transcription start site. The change is present in all animal phyla but the extent of variation is different between distinct classes of vertebrates, and the shape of the variation is completely different between vertebrates and arthropods. Furthermore, the height of the variation correlates with CpG frequencies in vertebrates but not in invertebrates and it also correlates with gene expression, especially in mammals. We also detect GC and AT skews in all clades (where %G is not equal to %C or %A is not equal to %T respectively) but these occur in a more confined region around the transcription start site and in the coding region. CONCLUSIONS: The dramatic changes in nucleotide composition in humans are a consequence of CpG nucleotide frequencies and of gene expression, the changes in Fugu could point to primordial CpG islands, and the changes in the fly are of a totally different kind and unrelated to dinucleotide frequencies.


Subject(s)
DNA/genetics; Evolution, Molecular; Transcription Initiation Site; AT Rich Sequence/genetics; Animals; Anopheles/genetics; Base Composition/genetics; Caenorhabditis/genetics; CpG Islands/genetics; DNA, Helminth/genetics; Databases, Genetic/standards; Drosophila melanogaster/genetics; GC Rich Sequence/genetics; Gene Expression/genetics; Genetic Variation/genetics; Humans; Mice; Rats; Takifugu/genetics; Zebrafish/genetics
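The base-composition measures discussed in the abstract of entry 12 (G+C content and GC skew) can be sketched in a few lines of Python. The skew formula (G - C) / (G + C) is the commonly used definition and is an assumption here, since the abstract only says that %G differs from %C. The function names, window size and step are hypothetical.

```python
from typing import Callable, List


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def gc_skew(seq: str) -> float:
    """(G - C) / (G + C); 0.0 when the sequence has no G or C."""
    seq = seq.upper()
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def sliding(seq: str, window: int, step: int,
            fn: Callable[[str], float]) -> List[float]:
    """Apply fn to successive windows, e.g. across a region around a start site."""
    return [fn(seq[i:i + window]) for i in range(0, len(seq) - window + 1, step)]


if __name__ == "__main__":
    region = "ATATATGCGCGGGCCATATAT"  # toy sequence, not real data
    print(sliding(region, window=7, step=7, fn=gc_content))
    print(sliding(region, window=7, step=7, fn=gc_skew))
```

AT skew follows the same pattern with A and T counts in place of G and C.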
13.
J Comput Biol ; 9(2): 447-64, 2002.
Article in English | MEDLINE | ID: mdl-12015892

ABSTRACT

Microarray experiments can reveal important information about transcriptional regulation. In our case, we look for potential promoter regulatory elements in the upstream region of coexpressed genes. Here we present two modifications of the original Gibbs sampling algorithm for motif finding (Lawrence et al., 1993). First, we introduce the use of a probability distribution to estimate the number of copies of the motif in a sequence. Second, we describe the technical aspects of the incorporation of a higher-order background model whose application we discussed in Thijs et al. (2001). Our implementation is referred to as the Motif Sampler. We successfully validate our algorithm on several data sets. First, we show results for three sets of upstream sequences containing known motifs: 1) the G-box light-response element in plants, 2) elements involved in methionine response in Saccharomyces cerevisiae, and 3) the FNR O2-responsive element in bacteria. We use these data sets to explain the influence of the parameters on the performance of our algorithm. Second, we show results for upstream sequences from four clusters of coexpressed genes identified in a microarray experiment on wounding in Arabidopsis thaliana. Several motifs could be matched to regulatory elements from plant defence pathways in our database of plant cis-acting regulatory elements (PlantCARE). Some other strong motifs do not have corresponding motifs in PlantCARE but are promising candidates for further analysis.


Subject(s)
Algorithms; Gene Expression Profiling/statistics & numerical data; Arabidopsis/genetics; Bacteria/genetics; Base Sequence; Computational Biology; DNA/genetics; Models, Genetic; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Saccharomyces cerevisiae/genetics
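The higher-order background model mentioned in the abstract of entry 13 can be illustrated as a Markov chain over nucleotides, where each base is conditioned on the preceding `order` bases. This is a generic sketch, not the Motif Sampler implementation: the function names, the pseudocount, and the uniform fallback for unseen contexts are all assumptions.

```python
from collections import defaultdict
from math import log

ALPHABET = "ACGT"


def train_background(sequences, order=3, pseudocount=1.0):
    """Estimate P(base | previous `order` bases) from training sequences."""
    counts = defaultdict(lambda: dict.fromkeys(ALPHABET, pseudocount))
    for seq in sequences:
        seq = seq.upper()
        for i in range(order, len(seq)):
            context, base = seq[i - order:i], seq[i]
            # Skip windows containing ambiguous symbols such as N.
            if base in ALPHABET and all(c in ALPHABET for c in context):
                counts[context][base] += 1
    model = {}
    for context, row in counts.items():
        total = sum(row.values())
        model[context] = {b: c / total for b, c in row.items()}
    return model


def log_prob(seq, model, order=3):
    """Log-probability of seq under the model, ignoring the first `order` bases."""
    seq = seq.upper()
    uniform = 1.0 / len(ALPHABET)
    total = 0.0
    for i in range(order, len(seq)):
        row = model.get(seq[i - order:i])
        # Assumption: fall back to a uniform distribution for unseen contexts.
        total += log(row.get(seq[i], uniform) if row else uniform)
    return total


if __name__ == "__main__":
    model = train_background(["ACGTACGTACGT"], order=1)  # toy training data
    print(log_prob("ACGT", model, order=1), log_prob("AGCT", model, order=1))
```

Training the model on intergenic sequence from the organism under study corresponds to the species-specific background the related abstracts describe.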
14.
J Mol Graph Model ; 27(2): 161-9, 2008 Sep.
Article in English | MEDLINE | ID: mdl-18485770

ABSTRACT

Within the context of early drug discovery, a new pharmacophore-based tool to score and align small molecules (Pharao) is described. The tool is built on the idea of modeling pharmacophoric features by Gaussian 3D volumes instead of the more common point or sphere representations. The smooth nature of these continuous functions has a beneficial effect on the optimization problem introduced during alignment. The usefulness of Pharao is illustrated by means of three examples: a virtual screening of trypsin-binding ligands, a virtual screening of phosphodiesterase 5-binding ligands, and an investigation of the biological relevance of an unsupervised clustering of small ligands based on Pharao.


Subject(s)
Algorithms; Drug Delivery Systems; Drug Design; Cluster Analysis; Hydrogen Bonding; Ligands; Models, Molecular; Molecular Conformation; Software; Structure-Activity Relationship
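To illustrate why Gaussian volumes give a smooth alignment objective, as the abstract of entry 14 notes, the sketch below computes the closed-form overlap integral of two spherical Gaussians exp(-alpha |r - A|^2) and exp(-beta |r - B|^2). The formula is the standard Gaussian product result. Unit amplitudes, the function name, and the example widths are assumptions, and this is not presented as Pharao's actual scoring function.

```python
from math import exp, pi
from typing import Sequence


def gaussian_overlap(center_a: Sequence[float], alpha: float,
                     center_b: Sequence[float], beta: float) -> float:
    """Overlap volume of two unit-amplitude spherical Gaussians in 3D.

    Integral over space of exp(-alpha|r-A|^2) * exp(-beta|r-B|^2)
      = (pi / (alpha + beta))**1.5 * exp(-alpha*beta/(alpha+beta) * |A-B|^2)
    """
    d2 = sum((x - y) ** 2 for x, y in zip(center_a, center_b))
    s = alpha + beta
    return (pi / s) ** 1.5 * exp(-alpha * beta / s * d2)


if __name__ == "__main__":
    # The overlap falls off smoothly as one feature moves away from the other.
    for shift in (0.0, 0.5, 1.0, 2.0):
        print(shift, gaussian_overlap((0, 0, 0), 1.0, (shift, 0, 0), 1.0))
```

Because the overlap is differentiable in the feature positions, gradient-based optimizers can be applied during alignment, which is not the case for hard-sphere or point matching.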
15.
Genome Biol ; 6(13): R113, 2005.
Article in English | MEDLINE | ID: mdl-16420672

ABSTRACT

Although proven successful in the identification of regulatory motifs, phylogenetic footprinting methods still show some shortcomings. To assess these difficulties, most apparent when applying phylogenetic footprinting to distantly related organisms, we developed a two-step procedure that combines the advantages of sequence alignment and motif detection approaches. The results on well-studied benchmark datasets indicate that the presented method outperforms other methods when the sequences become either too long or too heterogeneous in size.


Subject(s)
Computational Biology/methods; Genome/genetics; Regulatory Sequences, Nucleic Acid/genetics; Amino Acid Sequence; Animals; Base Pairing/genetics; Conserved Sequence; Databases, Genetic; Eye Proteins/chemistry; Homeodomain Proteins/chemistry; Humans; Molecular Sequence Data; PAX6 Transcription Factor; Paired Box Transcription Factors/chemistry; Phylogeny; Repressor Proteins/chemistry
16.
Bioinformatics ; 18(5): 735-46, 2002 May.
Article in English | MEDLINE | ID: mdl-12050070

ABSTRACT

MOTIVATION: Microarray experiments generate a considerable amount of data, which, when analyzed properly, helps us gain a large amount of biologically relevant information about global cellular behaviour. Clustering (grouping genes with similar expression profiles) is one of the first steps in data analysis of high-throughput expression measurements. A number of clustering algorithms have proved useful to make sense of such data. These classical algorithms, though useful, suffer from several drawbacks (e.g. they require the predefinition of arbitrary parameters like the number of clusters; they force every gene into a cluster despite a low correlation with other cluster members). In the following we describe a novel adaptive quality-based clustering algorithm that tackles some of these drawbacks. RESULTS: We propose a heuristic iterative two-step algorithm: First, we find in the high-dimensional representation of the data a sphere where the "density" of expression profiles is locally maximal (based on a preliminary estimate of the radius of the cluster; quality-based approach). In a second step, we derive an optimal radius of the cluster (adaptive approach) so that only the significantly coexpressed genes are included in the cluster. This estimation is achieved by fitting a model to the data using an EM-algorithm. By inferring the radius from the data itself, the biologist is freed from finding an optimal value for this radius by trial-and-error. The computational complexity of this method is approximately linear in the number of gene expression profiles in the data set. Finally, our method is successfully validated using existing data sets. AVAILABILITY: http://www.esat.kuleuven.ac.be/~thijs/Work/Clustering.html


Subject(s)
Algorithms; Cluster Analysis; Databases, Genetic; Gene Expression Profiling/methods; Gene Expression Profiling/statistics & numerical data; Computer Simulation; Genome, Fungal; Mitosis/genetics; Models, Genetic; Models, Statistical; Saccharomyces cerevisiae/genetics; Sensitivity and Specificity
17.
Genome Biol ; 5(2): R9, 2004.
Article in English | MEDLINE | ID: mdl-14759259

ABSTRACT

BACKGROUND: The PmrAB (BasSR) two-component regulatory system is required for Salmonella typhimurium virulence. PmrAB-controlled modifications of the lipopolysaccharide (LPS) layer confer resistance to cationic antibiotic polypeptides, which may allow bacteria to survive within macrophages. The PmrAB system also confers resistance to Fe3+-mediated killing. New targets of the system have recently been discovered that seem not to have a role in the well-described functions of PmrAB, suggesting that the PmrAB-dependent regulon might contain additional, unidentified targets. RESULTS: We performed an in silico analysis of possible targets of the PmrAB system. Using a motif model of the PmrA binding site in DNA, genome-wide screening was carried out to detect PmrAB target genes. To increase confidence in the predictions, all putative targets were subjected to a cross-species comparison (phylogenetic footprinting) using a Gibbs sampling-based motif-detection procedure. As well as the known targets, we detected additional targets with unknown functions. Four of these were experimentally validated (yibD, aroQ, mig-13 and sseJ). Site-directed mutagenesis of the PmrA-binding site (PmrA box) in yibD revealed specific sequence requirements. CONCLUSIONS: We demonstrated the efficiency of our procedure by recovering most of the known PmrAB-dependent targets and by identifying unknown targets that we were able to validate experimentally. We also pinpointed directions for further research that could help elucidate the S. typhimurium virulence pathway.


Subject(s)
Bacterial Proteins/metabolism; DNA, Bacterial/analysis; Regulatory Sequences, Nucleic Acid; Salmonella typhimurium/genetics; Transcription Factors/metabolism; Base Sequence; Binding Sites; DNA Footprinting; DNA, Bacterial/metabolism; Genes, Reporter; Genome, Bacterial; Molecular Sequence Data; Mutagenesis, Site-Directed; Phylogeny; Salmonella typhimurium/pathogenicity; Sequence Alignment; Sequence Analysis, DNA; Sequence Homology, Nucleic Acid
18.
Bioinformatics ; 18(2): 331-2, 2002 Feb.
Article in English | MEDLINE | ID: mdl-11847086

ABSTRACT

INCLUSive allows automatic multistep analysis of microarray data (clustering and motif finding). The clustering algorithm (adaptive quality-based clustering) groups together genes with highly similar expression profiles. The upstream sequences of the genes belonging to a cluster are automatically retrieved from GenBank and can be fed directly into Motif Sampler, a Gibbs sampling algorithm that retrieves statistically over-represented motifs in sets of sequences, in this case upstream regions of co-expressed genes.


Subject(s)
Multigene Family; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Software; Algorithms; Cluster Analysis; Computational Biology; Databases, Genetic; Gene Expression Profiling/statistics & numerical data