Results 1 - 11 of 11
1.
BMC Bioinformatics ; 15: 140, 2014 May 13.
Article in English | MEDLINE | ID: mdl-24884349

ABSTRACT

BACKGROUND: A means to predict the effects of gene over-expression, knockouts, and environmental stimuli in silico is useful for systems biologists developing and testing hypotheses. Several studies have predicted the expression of all Escherichia coli genes from sequences and reported a correlation of 0.301 between predicted and actual expression. However, these approaches do not allow biologists to study the effects of gene perturbations on the native transcriptome. RESULTS: We developed a predictor that estimates transcriptome-scale gene expression from a small number (n = 59) of known gene expression values using a gene co-expression network, and that can be used to predict the effects of over-expression and knockdown on the E. coli transcriptome. In terms of transcriptome prediction, our results show that the correlation between predicted and actual expression values is 0.467, which is similar to the microarray intra-array variation (p-value = 0.348), suggesting that intra-array variation accounts for a substantial portion of the transcriptome prediction error. In terms of predicting the effects of gene perturbation(s), our results suggest that the expression of 83% of the genes affected by perturbation can be predicted within 40% error, and that the correlation between predicted and actual expression values among the affected genes is 0.698. With this ability to predict the effects of gene perturbations, we demonstrated that our predictor has the potential to estimate the effects of varying gene expression levels on the native transcriptome. CONCLUSION: We present a potential means to predict an entire transcriptome and a tool to estimate the effects of gene perturbations for E. coli, which will aid biologists in hypothesis development. This study forms the baseline for future work on using gene co-expression networks for gene expression prediction.


Subject(s)
Escherichia coli/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks , Endopeptidases/genetics , Endopeptidases/metabolism , Escherichia coli/metabolism , Gene Expression Regulation, Bacterial , Gene Knockout Techniques , Oligonucleotide Array Sequence Analysis
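
As a reading aid for the abstract above, the two evaluation measures it reports (the correlation between predicted and actual expression, and the share of genes predicted within 40% error) could be computed along the following lines. This is a minimal sketch, not the authors' code: the function names and sample values are hypothetical, Pearson correlation is assumed because the abstract does not name the coefficient, and "within 40% error" is assumed to mean relative error against the actual value.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fraction_within_error(predicted, actual, tolerance=0.40):
    """Share of genes whose prediction lies within `tolerance` relative error.

    Assumption: error is measured relative to the actual expression value.
    """
    hits = sum(
        1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance * abs(a)
    )
    return hits / len(actual)


# Hypothetical expression values for five genes (not taken from the study).
actual = [10.0, 8.0, 6.0, 4.0, 2.0]
predicted = [9.0, 8.5, 9.0, 4.2, 1.9]

print("correlation:", round(pearson(predicted, actual), 3))
print("within 40% error:", fraction_within_error(predicted, actual))
```

In the made-up data, only the third gene misses the tolerance (predicted 9.0 against an actual 6.0), so four of the five genes count as predicted within 40% error.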
2.
BMC Genomics ; 14: 243, 2013 Apr 12.
Article in English | MEDLINE | ID: mdl-23577827

ABSTRACT

BACKGROUND: Recent studies have found thousands of natural antisense transcripts originating from the same genomic loci as protein-coding genes but from the opposite strand. It is unclear whether the majority of antisense transcripts are functional or merely transcriptional noise. RESULTS: Using the Affymetrix Exon array with a modified cDNA synthesis protocol that enables genome-wide detection of antisense transcription, we conducted a large-scale expression analysis of antisense transcripts in nine corresponding tissues from human, mouse and rat. We detected thousands of antisense transcripts, some of which show tissue-specific expression and could be studied further for their potential function in the corresponding tissues or organs. The expression patterns of many antisense transcripts are conserved across species, suggesting selective pressure on these transcripts. Compared with protein-coding genes, antisense transcripts show a lesser degree of expression conservation. We also found a positive correlation between sense and antisense expression across tissues. CONCLUSION: Our results suggest that natural antisense transcripts in mammals are subject to selective pressure, but to a lesser degree than sense transcripts.


Subject(s)
RNA, Antisense/genetics , Transcription, Genetic , Animals , DNA, Complementary/genetics , Exons/genetics , Gene Expression Profiling , Humans , Mice , Oligonucleotide Array Sequence Analysis/methods , Organ Specificity , Rats , Reverse Transcriptase Polymerase Chain Reaction
3.
BMC Bioinformatics ; 12 Suppl 8: S6, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22152021

ABSTRACT

BACKGROUND: Previous gene normalization (GN) systems have mostly focused on disambiguation using contextual information. An effective gene mention tagger was deemed unnecessary because the subsequent steps would filter out false positives, so high recall was considered sufficient. However, unlike similar tasks in past BioCreative challenges, the BioCreative III GN task is particularly challenging because it is not species-specific. When required to process full-length articles, an ineffective gene mention tagger may produce a huge number of ambiguous false positives that overwhelm subsequent filtering steps while still missing many true positives. RESULTS: We present the GN system with which we participated in the BioCreative III GN task. Our system applies a typical 2-stage approach to GN but features a soft-tagging gene mention tagger that generates a set of overlapping gene mention variants with nearly perfect recall. The overlapping gene mention variants increase the chance of a precise match in the dictionary and alleviate the need for disambiguation. Our GN system achieved a precision of 0.9 (F-score 0.63) on the BioCreative III GN test corpus with the silver annotation of 507 articles. Its TAP-k scores are competitive with the best results among all participants. CONCLUSIONS: We show that, despite the lack of clever disambiguation in our gene normalization system, effective soft tagging of gene mention variants can indeed contribute to performance in cross-species, full-text gene normalization.


Subject(s)
Data Mining , Genes , Species Specificity , Data Mining/methods , Natural Language Processing , Periodicals as Topic , Software , Terminology as Topic
4.
IUBMB Life ; 62(3): 200-3, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20087965

ABSTRACT

Differences in gene expression are characteristic of the function of different cell types, and genes with low expression variance can be used as standards for quantitative gene expression studies. Microarray technology is used to study global gene expression within a cell; hence, it represents a suitable source of data to mine for genes with low expression variance. The coefficient of variation (COV) of each gene was determined, and a threshold of COV less than 0.1 was used to select stably expressed genes in each dataset. Our results showed that microtubule affinity-regulating kinase 3 (MARK3) has the lowest COV in eight microarray datasets. In addition, the expression of housekeeping genes, which are generally expected to be stably expressed, tends to fluctuate highly under different conditions, marking them as less reliable for use as reference genes.


Subject(s)
Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/standards , Protein Serine-Threonine Kinases/genetics , Animals , Glyceraldehyde-3-Phosphate Dehydrogenase (NADP+)/biosynthesis , Mice , Protein Serine-Threonine Kinases/biosynthesis , Ribosomal Proteins/biosynthesis
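
The selection rule described in the abstract above (keep genes whose coefficient of variation is below 0.1) can be illustrated with a short sketch. It is an illustration only, not the authors' pipeline: the gene names and intensities are hypothetical, and the use of the sample standard deviation on untransformed intensities is an assumption, since the abstract does not specify either.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    return stdev(values) / mean(values)


def stable_genes(expression, threshold=0.1):
    """Return {gene: COV} for genes with COV below `threshold`, lowest first."""
    covs = {
        gene: coefficient_of_variation(vals) for gene, vals in expression.items()
    }
    passing = [(gene, cov) for gene, cov in covs.items() if cov < threshold]
    return dict(sorted(passing, key=lambda item: item[1]))


# Hypothetical intensities across four arrays (illustrative only).
expression = {
    "gene_a": [100.0, 102.0, 98.0, 100.0],  # nearly constant
    "gene_b": [100.0, 150.0, 60.0, 90.0],   # fluctuates strongly
}

selected = stable_genes(expression)
print(selected)
```

Here only `gene_a` passes the 0.1 threshold; `gene_b` has the same mean but a much larger spread, which is the pattern the abstract reports for some housekeeping genes.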
5.
BMC Bioinformatics ; 10 Suppl 15: S7, 2009 Dec 03.
Article in English | MEDLINE | ID: mdl-19958517

ABSTRACT

BACKGROUND: Text mining tools are becoming essential for automatically processing large quantities of biological literature for knowledge discovery and information curation. Abbreviation recognition is related to named entity recognition (NER) and can be considered a pair recognition task: identifying a term and its corresponding abbreviation in free text. The successful identification of an abbreviation and its corresponding definition is not only a prerequisite for indexing the terms of text databases to retrieve articles of related interest, but also a building block for improving existing gene mention tagging and gene normalization tools. RESULTS: Our approach to abbreviation recognition (AR) is based on machine learning and exploits a novel set of rich features to learn rules from training data. Tested on the AB3P corpus, our system achieved an F-score of 89.90% with 95.86% precision at 84.64% recall, higher than the result achieved by the best-performing existing AR system. We also annotated a new corpus of 1200 PubMed abstracts derived from the BioCreative II gene normalization corpus. On our annotated corpus, our system achieved an F-score of 86.20% with 93.52% precision at 79.95% recall, which also outperforms all tested systems. CONCLUSION: By applying our system to extract all short form-long form pairs from all available PubMed abstracts, we have constructed BIOADI. Mining BIOADI reveals many interesting trends in biomedical research. We also provide offline AR software in the download section at http://bioagent.iis.sinica.edu.tw/BIOADI/.


Subject(s)
Artificial Intelligence , Computational Biology/methods , Software , Algorithms , Data Mining/methods , Natural Language Processing , PubMed
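
The F-scores quoted in the abstract above are consistent with the balanced F-score, i.e. the harmonic mean of precision and recall. The short check below recomputes both figures from the reported precision and recall; the function name is hypothetical, and treating the reported score as the balanced F1 is an assumption that the arithmetic happens to bear out.

```python
def f_score(precision, recall):
    """Balanced F-score (F1): harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract.
ab3p = f_score(0.9586, 0.8464)        # AB3P corpus
new_corpus = f_score(0.9352, 0.7995)  # newly annotated corpus

print(f"{ab3p:.2%}", f"{new_corpus:.2%}")  # prints "89.90% 86.20%"
```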
6.
Front Genet ; 10: 49, 2019.
Article in English | MEDLINE | ID: mdl-30809243

ABSTRACT

There is growing attention toward personalized medicine. This is led by a fundamental shift from the 'one size fits all' paradigm for treating patients with conditions or predisposition to diseases to one that embraces novel approaches, such as tailored target therapies, to achieve the best possible outcomes. Driven by this shift, several national and international genome projects have been initiated to reap the benefits of personalized medicine. Exome and targeted sequencing provide a balance between cost and benefit, in contrast to whole genome sequencing (WGS). Whole exome sequencing (WES) targets approximately 3% of the whole genome, which is the basis for protein-coding genes. Nonetheless, it has the characteristics of big data in large deployments. Herein, the application of WES and its relevance in advancing personalized medicine are reviewed. WES is mapped to the Big Data "10 Vs" and the resulting challenges are discussed. Applications of existing biological databases and bioinformatics tools to address the bottleneck in data processing and analysis are presented, including the need for new-generation big data analytics for the multi-omics challenges of personalized medicine. This includes the incorporation of artificial intelligence (AI) in the clinical utility landscape of genomic information, and future considerations for creating a new frontier in advancing the field of personalized medicine.

7.
Biomed Res Int ; 2014: 648389, 2014.
Article in English | MEDLINE | ID: mdl-24977157

ABSTRACT

Antibiotic resistance is a serious biomedical issue, as formerly susceptible organisms gain resistance under selective pressure. There have been contradictory results regarding the prevalence of resistance following withdrawal and disuse of specific antibiotics. Here, we use experimental evolution in "digital organisms" to examine the rate of gain and loss of resistance under the assumption that there is no fitness cost for maintaining resistance. Our results show that selective pressure is likely to result in maximum resistance with respect to that selective pressure. During deselection as a result of disuse of the specific antibiotics, a large initial loss followed by prolonged stabilization of resistance is observed, but resistance is not lost to the preselection level. This suggests that a pool of partially resistant organisms persists at a relatively constant proportion long after withdrawal of selective pressure. Hence, contradictory results regarding the prevalence of resistance following withdrawal and disuse of specific antibiotics may reflect statistical variation about a constant proportion. Our results also show that subsequent reintroduction of the same selective pressure results in rapid regain of maximal resistance. Thus, our simulation results suggest that complete elimination of resistance to specific antibiotics is unlikely after their disuse once a resistant pool of microorganisms has been established.


Subject(s)
Anti-Bacterial Agents/pharmacology , Bacteria/genetics , Chromosomes, Bacterial/genetics , Drug Resistance, Bacterial/genetics , Models, Genetic , Selection, Genetic/genetics , Bacteria/drug effects , Computer Simulation , Drug Resistance, Bacterial/drug effects , Mutation/genetics , Quantitative Trait, Heritable
8.
ISRN Bioinform ; 2013: 361321, 2013.
Article in English | MEDLINE | ID: mdl-25937945

ABSTRACT

The expression of reference genes used in gene expression studies is assumed to be stable under most circumstances. However, studies have demonstrated that genes assumed to be stably expressed in one species are not necessarily stably expressed in other organisms. This study aims to evaluate the likelihood of genus-specific reference genes for liver using comparable microarray datasets from Spermophilus lateralis and Spermophilus tridecemlineatus. The coefficient of variation (CV) of each probe was calculated, and 178 probes were common to the lowest 10% by CV of both datasets (n = 1258). All 3 lists were analysed by NormFinder. Our results suggest that the most invariant probe for S. tridecemlineatus was 02n12, while that for S. lateralis was 24j21. However, Probes 02n12 and 24j21 ranked 8644 and 926 in terms of invariancy for S. lateralis and S. tridecemlineatus, respectively. This suggests a lack of common liver-specific reference probes for S. lateralis and S. tridecemlineatus. Given that S. lateralis and S. tridecemlineatus are closely related species and the datasets are comparable, our results do not support the presence of genus-specific reference genes.
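
The comparison step in the abstract above (take the lowest 10% of probes by CV in each dataset, then count the probes common to both lists) could be sketched as follows. This is a hedged illustration rather than the study's code: the probe IDs and values are hypothetical, the sample standard deviation is an assumption, and the toy example uses a 50% cut-off only because it has four probes.

```python
from statistics import mean, stdev


def lowest_cv_probes(dataset, fraction=0.10):
    """Return the set of probe IDs in the lowest `fraction` of CV values."""
    cvs = {probe: stdev(vals) / mean(vals) for probe, vals in dataset.items()}
    ranked = sorted(cvs, key=cvs.get)  # most stable probe first
    keep = max(1, int(len(ranked) * fraction))
    return set(ranked[:keep])


def common_stable_probes(dataset_a, dataset_b, fraction=0.10):
    """Probes that fall in the low-CV subset of both datasets."""
    return lowest_cv_probes(dataset_a, fraction) & lowest_cv_probes(
        dataset_b, fraction
    )


# Hypothetical liver datasets for two species, sharing the same probe IDs.
species_a = {
    "p1": [10.0, 10.1, 9.9],
    "p2": [10.0, 11.0, 9.0],
    "p3": [10.0, 15.0, 5.0],
    "p4": [10.0, 20.0, 1.0],
}
species_b = {
    "p1": [5.0, 9.0, 1.0],
    "p2": [5.0, 5.1, 4.9],
    "p3": [5.0, 5.5, 4.5],
    "p4": [5.0, 8.0, 2.0],
}

shared = common_stable_probes(species_a, species_b, fraction=0.5)
print(shared)
```

In the toy data, `p1` is the most stable probe in the first dataset but the least stable in the second, mirroring the abstract's observation that the top-ranked probe of one species can rank poorly in the other.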

9.
ISRN Bioinform ; 2012: 790452, 2012.
Article in English | MEDLINE | ID: mdl-25969744

ABSTRACT

Lung cancer is a common cancer, and expression profiling can provide an accurate indication to advance medical intervention. However, this requires the availability of stably expressed genes as references. Recent studies have shown that genes that are stably expressed in one tissue may not be stably expressed in other tissues, suggesting the need to identify stably expressed genes in each tissue for use as reference genes. DNA microarray analysis has been used to identify reference genes with low fluctuation. Fourteen datasets covering different lung conditions were employed in our study. The coefficient of variation, followed by NormFinder, was used to identify stably expressed genes. Our results showed that classical reference genes such as GAPDH and HPRT1 were highly variable; thus, they are unsuitable as reference genes. Signal peptidase complex subunit 1 (SPCS1) and hydroxyacyl-CoA dehydrogenase beta subunit (HADHB), which are involved in fundamental biochemical processes, demonstrated high expression stability, suggesting their suitability for human lung cell profiling.

10.
ISRN Microbiol ; 2012: 965356, 2012.
Article in English | MEDLINE | ID: mdl-23724334

ABSTRACT

Escherichia coli is commonly found in the human intestine, and any changes in its adaptation or evolution may affect the human body. The relationship between E. coli and food additives is less studied than its relationship with antibiotics. E. coli in the human gut constantly interacts with food additives; thus, it is important to investigate this relationship. In this paper, we observed the evolution of E. coli cultured in different concentrations of food additives (sodium chloride, benzoic acid, and monosodium glutamate), singly or in combination, over 70 passages. Adaptability over time was estimated by generation time and cell density at stationary phase. Polymerase chain reaction (PCR)/restriction fragment length polymorphism (RFLP), using 3 primers and 3 restriction endonucleases, was used to characterize adaptation/evolution at the genomic level. The amplification and digestion profiles were tabulated and analyzed by the Nei-Li dissimilarity index. Our results demonstrate that E. coli in every treatment had adapted over 465 generations. The types of stress were found to differ even when different concentrations of the same additive were used. However, RFLP shows a convergence of genetic distances, suggesting the presence of a global stress response. In addition, monosodium glutamate may be a nutrient source and support acid resistance in E. coli.
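
For readers unfamiliar with the Nei-Li dissimilarity index mentioned in the abstract above, it is commonly computed from banding profiles as 1 - 2n_xy / (n_x + n_y), where n_x and n_y are the numbers of bands in each profile and n_xy is the number of bands they share. The sketch below applies that standard formula to hypothetical band sizes; how the study scored and matched bands is not stated in the abstract, so representing each profile as a set of fragment sizes is an assumption.

```python
def nei_li_dissimilarity(bands_x, bands_y):
    """Nei-Li dissimilarity: 1 - 2 * shared / (n_x + n_y).

    Each profile is treated as a set of band (fragment) sizes; two bands
    match only if their sizes are identical (a simplifying assumption).
    """
    bands_x, bands_y = set(bands_x), set(bands_y)
    total = len(bands_x) + len(bands_y)
    if total == 0:
        return 0.0  # two empty profiles are treated as identical
    shared = len(bands_x & bands_y)
    return 1 - 2 * shared / total


# Hypothetical band sizes in base pairs (not taken from the study).
ancestor = {200, 450, 800, 1200}
evolved = {200, 450, 950}

distance = nei_li_dissimilarity(ancestor, evolved)
print(round(distance, 4))
```

The two profiles share 2 of their 7 bands in total, giving 1 - 4/7; identical profiles give 0 and profiles with no shared bands give 1, so converging genetic distances correspond to values moving toward 0.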

11.
ISRN Microbiol ; 2011: 469053, 2011.
Article in English | MEDLINE | ID: mdl-23724305

ABSTRACT

The expression of reference genes used in gene expression studies is assumed to be stable under most circumstances. However, a number of studies have demonstrated that such genes vary under experimental conditions. In addition, genes that are stably expressed in one organ may not be stably expressed in other organs or other organisms, suggesting the need to identify reference genes for each organ and organism. This study aims to identify stably expressed genes in Escherichia coli. Microarray datasets from E. coli substrain MG1655 and 1 dataset from W3110 were analysed. The coefficient of variation (COV) was calculated, and the 10% with the lowest COV among the 4631 genes common to the 3 MG1655 sets were analysed using NormFinder. Glucan biosynthesis protein G (mdoG), which is involved in cell wall synthesis, displayed the lowest weighted COV and weighted NormFinder Stability Index for the MG1655 datasets, and was also the most stable in the dataset for substrain W3110, suggesting that mdoG is a suitable reference gene for E. coli K-12. Gene ontology over-representation analysis of the 39 genes suggested an over-representation of cell division, carbohydrate metabolism, and protein synthesis, which is consistent with the short generation time of E. coli.
