Results 1 - 7 of 7
1.
BMC Bioinformatics ; 10: 432, 2009 Dec 18.
Article in English | MEDLINE | ID: mdl-20021665

ABSTRACT

BACKGROUND: In 2004, Bejerano et al. announced the startling discovery of hundreds of "ultraconserved elements", long genomic sequences perfectly conserved across human, mouse, and rat. Their announcement stimulated a flurry of subsequent research. RESULTS: We generalize the notion of ultraconserved element in a natural way from extraordinary human-rodent conservation to extraordinary conservation over an arbitrary set of species. We call these "Extremely Conserved Elements". There is a linear time algorithm to find all such Extremely Conserved Elements in any multiple sequence alignment, provided that the conservation is required to be across all the aligned species. For the general case of conservation across an arbitrary subset of the aligned species, we show that the question of whether there exists an Extremely Conserved Element is NP-complete. We illustrate the linear time algorithm by cataloguing all 177 Extremely Conserved Elements in the currently available 44-vertebrate whole-genome alignment, and point out some of the characteristics of these elements. CONCLUSIONS: The NP-completeness in the case of conservation across an arbitrary subset of the aligned species implies that it is unlikely an efficient algorithm exists for this general case. Despite this fact, for the interesting case of conservation across all or most of the aligned species, our algorithm is efficient enough to be practical. The 177 Extremely Conserved Elements that we catalog demonstrate many of the characteristics of the original ultraconserved elements of Bejerano et al.
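The linear-time case described above (conservation across all aligned species) can be illustrated with a single left-to-right scan over alignment columns. The sketch below is not the authors' algorithm: the abstract does not give their exact definition of an Extremely Conserved Element, so it assumes the simplest one, a run of at least `min_len` columns in which every row carries the same non-gap character. The function name `conserved_runs` and the parameter `min_len` are hypothetical.

```python
from typing import List, Tuple


def conserved_runs(alignment: List[str], min_len: int) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column ranges where every row agrees.

    Assumption (not from the paper): "conserved" means an identical,
    non-gap character in every row of the alignment.
    """
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned rows must have the same length")

    runs: List[Tuple[int, int]] = []
    start = None  # first column of the run currently being extended
    for col in range(width):
        first = alignment[0][col]
        same = first != "-" and all(row[col] == first for row in alignment)
        if same:
            if start is None:
                start = col
        else:
            # The run (if any) ended at the previous column.
            if start is not None and col - start >= min_len:
                runs.append((start, col))
            start = None
    # Close a run that reaches the last column.
    if start is not None and width - start >= min_len:
        runs.append((start, width))
    return runs
```

Each column is inspected once and each inspection touches every row once, so the running time is proportional to the total size of the alignment (rows times columns).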


Subject(s)
Algorithms , Computational Biology/methods , Sequence Alignment/methods , Animals , Base Sequence , Conserved Sequence , Genome , Humans , Mice , Rats , Sequence Analysis, DNA , Vertebrates
2.
J Bioinform Comput Biol ; 7(2): 373-88, 2009 Apr.
Article in English | MEDLINE | ID: mdl-19340921

ABSTRACT

Non-coding RNAs (ncRNAs) are transcripts that do not code for proteins. Recent findings have shown that RNA-mediated regulatory mechanisms influence a substantial portion of typical microbial genomes. We present an efficient method for finding potential ncRNAs in bacteria by clustering genomic sequences based on homology inferred from both primary sequence and secondary structure. We evaluate our approach using a set of predominantly Firmicutes sequences. Our results showed that, although primary-sequence-based homology search was inaccurate for diverged ncRNA sequences, our clustering method inferred motifs that recovered nearly all members of most known ncRNA families. Hence, our method shows promise for discovering new families of ncRNA.


Subject(s)
Chromosome Mapping/methods , Cluster Analysis , Genome/genetics , RNA, Untranslated/genetics , Sequence Analysis, RNA/methods
3.
BMC Bioinformatics ; 8: 66, 2007 Feb 27.
Article in English | MEDLINE | ID: mdl-17326819

ABSTRACT

BACKGROUND: The significant advances in microarray and proteomics analyses have resulted in an exponential increase in potential new targets and have promised to shed light on the identification of disease markers and cellular pathways. We aim to collect and decipher the HCC-related genes at the systems level. RESULTS: Here, we build an integrative platform, the Encyclopedia of Hepatocellular Carcinoma genes Online, dubbed EHCO (http://ehco.iis.sinica.edu.tw), to systematically collect, organize and compare the pileup of unsorted HCC-related studies by using natural language processing and softbots. Among the eight gene set collections, ranging across PubMed, SAGE, microarray, and proteomics data, there are 2,906 genes in total; however, more than 77% of the genes are included only once, suggesting that tremendous efforts need to be exerted to characterize the relationship between HCC and these genes. Of these HCC inventories, protein binding represents the largest proportion (~25%) from Gene Ontology analysis. In fact, many differentially expressed gene sets in EHCO could form interaction networks (e.g. HBV-associated HCC network) by using available human protein-protein interaction datasets. To further highlight the potential new targets in the inferred network from EHCO, we combine comparative genomics and interactomics approaches to analyze 120 evolutionarily conserved and overexpressed genes in HCC. Of these 120 queries, 47 can form a highly interactive network, with 18 queries serving as hubs. CONCLUSION: This architectural map may represent the first step toward deciphering hepatocarcinogenesis at the systems level. Targeting hubs and/or disrupting network formation might reveal a novel strategy for HCC treatment.


Subject(s)
Carcinoma, Hepatocellular/genetics , Encyclopedias as Topic , Gene Regulatory Networks/genetics , Liver Neoplasms/genetics , Gene Expression Profiling/methods , Gene Expression Regulation, Neoplastic/genetics , Humans , Oligonucleotide Array Sequence Analysis/methods
4.
Gene Expr ; 13(2): 107-32, 2006.
Article in English | MEDLINE | ID: mdl-17017125

ABSTRACT

The development of hepatocellular carcinoma (HCC) is generally preceded by cirrhosis, which occurs at the end stage of fibrosis. This is a common and potentially lethal problem of chronic liver disease in Asia. The development of microarrays permits us to monitor transcriptomes on a genome-wide scale; this has dramatically sped up a comprehensive understanding of the disease process. Here we used dimethylnitrosamine (DMN), a nongenotoxic hepatotoxin, to induce necroinflammation and hepatic fibrosis in rats. During the 6-week time course, histopathological, biochemical, and quantitative RT-PCR analyses confirmed the incidence of necroinflammation and hepatic fibrosis in this established rat model system. Using the Affymetrix microarray chip, 256 differentially expressed genes were identified from the liver injury samples. Hierarchical clustering of gene expression using a gene ontology database allowed the identification of several stage-specific characteristics and functionally related clusters that encode proteins related to metabolism, cell growth/maintenance, and response to external challenge. Among these genes, we classified 44 potential necroinflammation-related genes and 62 potential fibrosis-related markers or drug targets based on histopathological scores. We also compared the results with other data on well-known markers and various other available microarray datasets. In conclusion, we believe that the molecular picture of necroinflammation and hepatic fibrosis from this study may provide novel biological insights into the development of molecular classifiers of early liver damage that can be used for basic research and in clinical applications. A publicly accessible website is available at http://LiverFibrosis.nchc.org.tw:8080/LF.


Subject(s)
Dimethylnitrosamine/toxicity , Gene Expression Profiling , Liver Cirrhosis/genetics , Oligonucleotide Array Sequence Analysis , Animals , Disease Models, Animal , Liver/pathology , Liver Cirrhosis/chemically induced , Liver Cirrhosis/metabolism , Liver Cirrhosis/pathology , Male , Polymerase Chain Reaction , Rats , Rats, Sprague-Dawley , Reference Values
5.
Metagenomics (Cairo) ; 2: 235646, 2013.
Article in English | MEDLINE | ID: mdl-24013439

ABSTRACT

Study of the human microbiota in relation to human health and disease is a rapidly expanding field. To fully understand the complex relationship between the human gut microbiota and disease risks, study designs that capture the variation within and between human subjects at the population level are required, but this has been hampered by the lack of cost-effective methods to characterize this variation. Illumina sequencing is inexpensive and produces millions of reads per run, but it is unclear whether short reads can adequately represent the microbial community of a human host. In this study, we examined the utility of a profiling method, microbial nucleotide signatures (MNS), focused on low-depth sampling of the human microbiota using Illumina short reads. This method is intended to aid in human population-based studies where large sample sizes are required to adequately capture variation in disease or phenotype differences. We found that, by calculating the nucleotide diversities along the sequenced 16S rRNA gene region, which did not require assembly or phylogenetic identification, we were able to differentiate the gut microbial nucleotide signatures of 9 healthy individuals. When we further subsampled the reads down to 40,000 reads (51 bp long) per sample, the diversity profiles were relatively unchanged. Applying MNS to public datasets showed that it could differentiate between body sites. The scalability of our approach offers rapid classification of study participants at the sample sizes required for epidemiological studies. Using MNS to classify the microbiome associated with a disease state, followed by targeted in-depth sequencing, will give a comprehensive understanding of the role of the microbiome in human health.
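To make the idea of "nucleotide diversities along the sequenced region" concrete, the following is a minimal sketch of a per-position diversity profile. It is not the published MNS implementation: the abstract does not state the diversity formula, so this sketch assumes the common measure 1 - sum(p_b^2) over base frequencies p_b at each position, and it assumes all reads start at the same coordinate and have the same length. The function name `nucleotide_diversity_profile` is hypothetical.

```python
from collections import Counter
from typing import List


def nucleotide_diversity_profile(reads: List[str]) -> List[float]:
    """Per-position diversity, computed as 1 - sum(p_b^2) over bases b.

    Assumptions (not from the paper): reads share one start coordinate
    and one length; characters other than A/C/G/T are ignored.
    """
    if not reads:
        return []
    length = len(reads[0])
    if any(len(r) != length for r in reads):
        raise ValueError("reads must share one length")

    profile: List[float] = []
    for pos in range(length):
        counts = Counter(r[pos] for r in reads if r[pos] in "ACGT")
        total = sum(counts.values())
        if total == 0:
            # No callable base at this position; report zero diversity.
            profile.append(0.0)
            continue
        profile.append(1.0 - sum((c / total) ** 2 for c in counts.values()))
    return profile
```

Because the profile is computed column by column from raw reads, no assembly or taxonomic assignment is needed, which matches the property the abstract emphasizes.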

6.
Bioinformatics ; 21(12): 2883-90, 2005 Jun 15.
Article in English | MEDLINE | ID: mdl-15802287

ABSTRACT

MOTIVATION: The explosion of microarray studies has promised to shed light on the temporal expression patterns of thousands of genes simultaneously. However, available methods are far from adequate in efficiently extracting useful information to aid in a greater understanding of transcriptional regulatory networks. Biological systems, such as genetic networks and cell regulatory networks, have long been modeled as dynamic systems. This study evaluated whether the stochastic differential equation (SDE), which is prominent for modeling dynamic diffusion processes originating from irregular Brownian motion, can be applied in modeling the transcriptional regulatory network in Saccharomyces cerevisiae. RESULTS: To model the time-continuous gene-expression datasets, an SDE model is applied to depict irregular patterns. Our goal is to fit a generalized linear model by combining putative regulators to estimate the transcriptional pattern of a target gene. Goodness-of-fit is evaluated by log-likelihood and the Akaike Information Criterion. Moreover, estimations of the contribution of regulators and inference of transcriptional pattern are implemented by statistical approaches. Our SDE model is basic, but the test results agree well with the observed dynamic expression patterns. This implies that an advanced SDE model might be perfectly suited to portray transcriptional regulatory networks. AVAILABILITY: The R code is available on request. CONTACT: cykao@csie.ntu.edu.tw SUPPLEMENTARY INFORMATION: http://www.csie.ntu.edu.tw/~b89x035/yeast/
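The goodness-of-fit step mentioned above (log-likelihood and the Akaike Information Criterion) can be sketched as follows. This is not the authors' R code, which is available only on request. It assumes independent Gaussian residuals with the variance estimated by maximum likelihood, and uses the standard definition AIC = 2k - 2 ln L; the function names are hypothetical.

```python
import math
from typing import Sequence


def gaussian_log_likelihood(residuals: Sequence[float]) -> float:
    """Maximized Gaussian log-likelihood of a set of model residuals.

    Assumption (not from the paper): residuals are i.i.d. normal with
    mean zero, and the variance is set to its ML estimate sum(r^2) / n.
    """
    n = len(residuals)
    if n == 0:
        raise ValueError("need at least one residual")
    sigma2 = sum(r * r for r in residuals) / n
    if sigma2 <= 0.0:
        raise ValueError("residual variance must be positive")
    return -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood
```

When comparing candidate sets of regulators for one target gene, the model with the lowest AIC is preferred, since the 2k term penalizes adding regulators that do not improve the likelihood enough.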


Subject(s)
Gene Expression Regulation/physiology , Models, Biological , Saccharomyces cerevisiae Proteins/metabolism , Saccharomyces cerevisiae/physiology , Signal Transduction/physiology , Transcription Factors/metabolism , Transcriptional Activation/physiology , Models, Statistical , Software , Stochastic Processes
7.
Bioinformatics ; 20(17): 3273-6, 2004 Nov 22.
Article in English | MEDLINE | ID: mdl-15217821

ABSTRACT

One possible path towards understanding the biological function of a target protein is through the discovery of how it interfaces within protein-protein interaction networks. The goal of this study was to create a virtual protein-protein interaction model using the concepts of orthologous conservation (or interologs) to elucidate the interacting networks of a particular target protein. POINT (the prediction of interactome database) is a functional database for the prediction of the human protein-protein interactome based on available orthologous interactome datasets. POINT integrates several publicly accessible databases, with emphasis placed on the extraction of a large quantity of mouse, fruit fly, worm and yeast protein-protein interaction datasets from the Database of Interacting Proteins (DIP), followed by their conversion into a predicted human interactome. In addition, protein-protein interactions require both temporal synchronicity and precise spatial proximity. POINT therefore also incorporates correlated mRNA expression clusters obtained from cell cycle microarray databases and subcellular localization from Gene Ontology to further pinpoint the likelihood of biological relevance of each predicted set of interacting protein partners.
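The interolog idea described above, transferring an interaction observed in a model organism to the human orthologs of both partners, can be sketched in a few lines. This is an illustration only and does not reproduce POINT's pipeline or data formats: the function name `predict_interologs`, the input shapes, and the protein identifiers in the usage example are all hypothetical, and the expression and localization filters mentioned in the abstract are omitted.

```python
from typing import Dict, Iterable, List, Set, Tuple


def predict_interologs(
    interactions: Iterable[Tuple[str, str]],
    orthologs: Dict[str, List[str]],
) -> Set[Tuple[str, str]]:
    """Map model-organism interactions onto human ortholog pairs.

    interactions: pairs of model-organism protein identifiers.
    orthologs:    model-organism identifier -> list of human orthologs.
    A pair is predicted only when both partners have a human ortholog.
    """
    predicted: Set[Tuple[str, str]] = set()
    for a, b in interactions:
        for human_a in orthologs.get(a, ()):
            for human_b in orthologs.get(b, ()):
                # Store each pair in sorted order so (x, y) and (y, x)
                # count as the same undirected interaction.
                predicted.add(tuple(sorted((human_a, human_b))))
    return predicted


# Usage with made-up identifiers:
yeast_ppi = [("yA", "yB"), ("yB", "yC")]
yeast_to_human = {"yA": ["H1"], "yB": ["H2", "H3"]}
human_ppi = predict_interologs(yeast_ppi, yeast_to_human)
```

In this example `yC` has no human ortholog, so the second yeast interaction contributes nothing, while the first expands to two predicted human pairs because `yB` maps to two human proteins.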


Subject(s)
Databases, Protein , Information Storage and Retrieval/methods , Protein Interaction Mapping/methods , Proteome/metabolism , Sequence Alignment/methods , Sequence Analysis, Protein/methods , User-Computer Interface , Animals , Database Management Systems , Drosophila Proteins/chemistry , Drosophila Proteins/metabolism , Evolution, Molecular , Humans , Internet , Mice , Proteome/chemistry , Saccharomyces cerevisiae Proteins/chemistry , Saccharomyces cerevisiae Proteins/metabolism , Signal Transduction/physiology