NGSmethDB: an updated genome resource for high quality, single-cytosine resolution methylomes.

Geisen, Stefanie; Barturen, Guillermo; Alganza, Ángel M; Hackenberg, Michael; Oliver, José L.

Nucleic Acids Res ; 42(Database issue): D53-9, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24271385

ABSTRACT

The updated release of 'NGSmethDB' (http://bioinfo2.ugr.es/NGSmethDB) is a repository for single-base whole-genome methylome maps for the best-assembled eukaryotic genomes. Short-read data sets from NGS bisulfite-sequencing projects of cell lines, fresh and pathological tissues are first pre-processed and aligned to the corresponding reference genome, and then the cytosine methylation levels are profiled. One major improvement is the application of a unique bioinformatics protocol to all data sets, thereby assuring the comparability of all values with each other. We implemented stringent quality controls to minimize important error sources, such as sequencing errors, bisulfite failures, clonal reads or single nucleotide variants (SNVs). This leads to reliable and high-quality methylomes, all obtained under uniform settings. Another significant improvement is the detection in parallel of SNVs, which might be crucial for many downstream analyses (e.g. SNVs and differential-methylation relationships). A next-generation methylation browser allows fast and smooth scrolling and zooming, thus speeding data download/upload, at the same time requiring fewer server resources. Several data mining tools allow the comparison/retrieval of methylation levels in different tissues or genome regions. NGSmethDB methylomes are also available as native tracks through a UCSC hub, which allows comparison with a wide range of third-party annotations, in particular phenotype or disease annotations.
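As a rough illustration of the profiling step described above, the methylation level of a cytosine in bisulfite-sequencing data is commonly taken as the fraction of covering reads that still show a C. The sketch below assumes per-site read counts are already available. The `CytosineCounts` record, the `is_snv` flag and the `min_coverage` threshold are hypothetical names and values chosen for illustration, not the NGSmethDB pipeline or its actual quality-control settings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CytosineCounts:
    """Read counts at one cytosine (hypothetical record layout)."""
    chrom: str
    pos: int
    methylated: int        # reads showing C (protected from conversion)
    unmethylated: int      # reads showing T (converted by bisulfite)
    is_snv: bool = False   # site flagged as a single nucleotide variant


def methylation_level(site: CytosineCounts, min_coverage: int = 10) -> Optional[float]:
    """Return methylated / total reads, or None if the site fails a filter.

    min_coverage is an assumed threshold, not a value taken from the abstract.
    """
    if site.is_snv:
        return None  # a C/T variant would be indistinguishable from conversion
    coverage = site.methylated + site.unmethylated
    if coverage < min_coverage:
        return None  # too few reads for a reliable estimate
    return site.methylated / coverage
```

Applying the same function and the same thresholds to every data set is what would make the resulting levels comparable across samples.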

Subject(s)

DNA Methylation , Databases, Nucleic Acid , High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA/methods , Animals , Cell Line , Cytosine/metabolism , Epigenesis, Genetic , Genetic Variation , Genome , Genomics , High-Throughput Nucleotide Sequencing/standards , Humans , Internet , Mice , Sequence Alignment , Sequence Analysis, DNA/standards

WordCluster: detecting clusters of DNA words and genomic elements.

Hackenberg, Michael; Carpena, Pedro; Bernaola-Galván, Pedro; Barturen, Guillermo; Alganza, Angel M; Oliver, José L.

Algorithms Mol Biol ; 6: 2, 2011 Jan 24.

Article in English | MEDLINE | ID: mdl-21261981

ABSTRACT

BACKGROUND: Many k-mers (or DNA words) and genomic elements are known to be spatially clustered in the genome. Well-established examples are genes, TFBSs, CpG dinucleotides, microRNA genes and ultra-conserved non-coding regions. Currently, no algorithm exists to find these clusters in a statistically comprehensible way. The detection of clustering often relies on densities and sliding-window approaches or arbitrarily chosen distance thresholds. RESULTS: We introduce here an algorithm to detect clusters of DNA words (k-mers), or any other genomic element, based on the distance between consecutive copies and an assigned statistical significance. We implemented the method in a web server connected to a MySQL backend, which also determines the co-localization with gene annotations. We demonstrate the usefulness of this approach by detecting the clusters of CAG/CTG (cytosine contexts that can be methylated in undifferentiated cells), showing that the degree of methylation varies drastically between the inside and the outside of the clusters. As another example, we used WordCluster to search for statistically significant clusters of olfactory receptor (OR) genes in the human genome. CONCLUSIONS: WordCluster seems to predict biologically meaningful clusters of DNA words (k-mers) and genomic entities. The implementation of the method as a web server is available at http://bioinfo2.ugr.es/wordCluster/wordCluster.php, including additional features such as the detection of co-localization with gene regions and an annotation enrichment tool for functional analysis of overlapped genes.
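The general idea of distance-based clustering with an assigned significance can be sketched as follows. This is a minimal illustration under stated assumptions, not the WordCluster implementation: the distance threshold (the median gap between consecutive copies) and the significance test (a binomial tail under a null model in which each position starts a copy independently) are choices made for this sketch, and all function names are hypothetical.

```python
import math
from statistics import median


def find_positions(seq, word):
    """Return 0-based start positions of every copy of word (overlaps allowed)."""
    positions = []
    start = seq.find(word)
    while start != -1:
        positions.append(start)
        start = seq.find(word, start + 1)
    return positions


def chain_clusters(positions, max_gap):
    """Group consecutive copies whose start-to-start distance is <= max_gap."""
    if not positions:
        return []
    clusters = [[positions[0]]]
    for prev, cur in zip(positions, positions[1:]):
        if cur - prev <= max_gap:
            clusters[-1].append(cur)   # close enough: extend current cluster
        else:
            clusters.append([cur])     # gap too large: start a new cluster
    return clusters


def binomial_tail(successes, trials, p):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        math.comb(trials, i) * p**i * (1 - p) ** (trials - i)
        for i in range(successes, trials + 1)
    )


def detect_clusters(seq, word, alpha=0.05):
    """Return (start, end, copies, p_value) for clusters with p_value <= alpha."""
    positions = find_positions(seq, word)
    if len(positions) < 2:
        return []
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    max_gap = median(gaps)                # assumed threshold choice
    p_word = len(positions) / len(seq)    # genome-wide frequency of the word
    results = []
    for cluster in chain_clusters(positions, max_gap):
        if len(cluster) < 2:
            continue
        span = cluster[-1] - cluster[0]
        # Chance of at least this many further copies within the span
        p_value = binomial_tail(len(cluster) - 1, span, p_word)
        if p_value <= alpha:
            results.append((cluster[0], cluster[-1] + len(word), len(cluster), p_value))
    return results
```

Calling `detect_clusters(sequence, "CAG")` on a sequence string returns one tuple per significant cluster. Deriving the threshold from the data rather than fixing it by hand is meant to avoid the arbitrarily chosen distance thresholds mentioned in the background.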
