2.
Front Genet ; 10: 49, 2019.
Article in English | MEDLINE | ID: mdl-30809243

ABSTRACT

There is growing attention toward personalized medicine. This is led by a fundamental shift from the 'one size fits all' paradigm for treatment of patients with conditions or predisposition to diseases, to one that embraces novel approaches, such as tailored target therapies, to achieve the best possible outcomes. Driven by this shift, several national and international genome projects have been initiated to reap the benefits of personalized medicine. Exome and targeted sequencing provide a balance between cost and benefit, in contrast to whole genome sequencing (WGS). Whole exome sequencing (WES) targets approximately 3% of the whole genome, which is the basis for protein-coding genes. Nonetheless, it has the characteristics of big data in large deployment. Herein, the application of WES and its relevance in advancing personalized medicine are reviewed. WES is mapped to the Big Data "10 Vs" and the resulting challenges are discussed. The application of existing biological databases and bioinformatics tools to address the bottleneck in data processing and analysis is presented, including the need for new-generation big data analytics for the multi-omics challenges of personalized medicine. This includes the incorporation of artificial intelligence (AI) in the clinical utility landscape of genomic information, and future considerations for creating a new frontier toward advancing the field of personalized medicine.

3.
Biomed Res Int ; 2014: 648389, 2014.
Article in English | MEDLINE | ID: mdl-24977157

ABSTRACT

Antibiotic resistance is a serious biomedical issue as formerly susceptible organisms gain resistance under the selective pressure of antibiotics. There have been contradictory results regarding the prevalence of resistance following withdrawal and disuse of the specific antibiotics. Here, we use experimental evolution in "digital organisms" to examine the rate of gain and loss of resistance under the assumption that there is no fitness cost for maintaining resistance. Our results show that selective pressure is likely to result in maximum resistance with respect to the selective pressure. During deselection as a result of disuse of the specific antibiotics, a large initial loss and prolonged stabilization of resistance are observed, but resistance is not lost to the preselection level. This suggests that a pool of partially resistant organisms persists long after withdrawal of selective pressure at a relatively constant proportion. Hence, contradictory results regarding the prevalence of resistance following withdrawal and disuse of the specific antibiotics may reflect statistical variation about a constant proportion. Our results also show that subsequent reintroduction of the same selective pressure results in rapid regain of maximal resistance. Thus, our simulation results suggest that complete elimination of resistance to specific antibiotics is unlikely after the disuse of antibiotics once a resistant pool of microorganisms has been established.


Subject(s)
Anti-Bacterial Agents/pharmacology , Bacteria/genetics , Chromosomes, Bacterial/genetics , Drug Resistance, Bacterial/genetics , Models, Genetic , Selection, Genetic/genetics , Bacteria/drug effects , Computer Simulation , Drug Resistance, Bacterial/drug effects , Mutation/genetics , Quantitative Trait, Heritable
4.
BMC Bioinformatics ; 15: 140, 2014 May 13.
Article in English | MEDLINE | ID: mdl-24884349

ABSTRACT

BACKGROUND: A means to predict the effects of gene over-expression, knockouts, and environmental stimuli in silico is useful for systems biologists to develop and test hypotheses. Several studies have predicted the expression of all Escherichia coli genes from sequences and reported a correlation of 0.301 between predicted and actual expression. However, these do not allow biologists to study the effects of gene perturbations on the native transcriptome. RESULTS: We developed a predictor that predicts transcriptome-scale gene expression from a small number (n = 59) of known gene expressions using a gene co-expression network, and which can be used to predict the effects of over-expressions and knockdowns on the E. coli transcriptome. In terms of transcriptome prediction, our results show that the correlation between predicted and actual expression values is 0.467, which is similar to the microarray intra-array variation (p-value = 0.348), suggesting that intra-array variation accounts for a substantial portion of the transcriptome prediction error. In terms of predicting the effects of gene perturbation(s), our results suggest that the expression of 83% of the genes affected by perturbation can be predicted within 40% error and that the correlation between predicted and actual expression values among the affected genes is 0.698. With the ability to predict the effects of gene perturbations, we demonstrated that our predictor has the potential to estimate the effects of varying gene expression levels on the native transcriptome. CONCLUSION: We present a potential means to predict an entire transcriptome and a tool to estimate the effects of gene perturbations for E. coli, which will aid biologists in hypothesis development. This study forms the baseline for future work in using gene co-expression networks for gene expression prediction.


Subject(s)
Escherichia coli/genetics , Gene Expression Profiling/methods , Gene Regulatory Networks , Endopeptidases/genetics , Endopeptidases/metabolism , Escherichia coli/metabolism , Gene Expression Regulation, Bacterial , Gene Knockout Techniques , Oligonucleotide Array Sequence Analysis
5.
Electron Physician ; 6(1): 719-27, 2014.
Article in English | MEDLINE | ID: mdl-25763136

ABSTRACT

BACKGROUND: Reference genes are assumed to be stably expressed under most circumstances. Previous studies have shown that common algorithms for identifying potential reference genes, such as NormFinder, geNorm, and BestKeeper, are not suitable for microarray-sized datasets. The aim of this study was to evaluate existing methods and develop methods for identifying reference genes from microarray datasets. METHODS: We evaluated the correlation between outputs from 7 published methods for identifying reference genes, including NormFinder, geNorm, and BestKeeper, using subsets of published microarray data. From these results, seven novel combinations of published methods for identifying reference genes were evaluated. RESULTS: Our results showed that NormFinder's and geNorm's indices had high correlations (R² = 0.987, P < 0.0001), which is consistent with the findings of previous studies. However, NormFinder's and BestKeeper's indices (R² = 0.489, 0.01 < P < 0.05) and NormFinder's index and the coefficient of variance (CV) (R² = 0.483, 0.01 < P < 0.05) showed lower correlations. We developed two novel methods with high correlations with NormFinder (R² values of both methods were 0.796, P < 0.0001). In addition, the computational times required by the two novel methods were linear with the size of the dataset. CONCLUSION: Our findings suggested that both of our novel methods can be used as alternatives to NormFinder, geNorm, and BestKeeper for identifying reference genes from large datasets. These methods were implemented as a tool, OLIgonucleotide Variable Expression Ranker (OLIVER), which can be downloaded from http://sourceforge.net/projects/bactome/files/OLIVER/OLIVER_1.zip.
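
The R² values quoted in this abstract compare the stability indices that two methods assign to the same genes. As a minimal sketch only, assuming R² here means the squared Pearson correlation (the abstract does not state the exact statistic), the comparison could be computed as follows. The function names and the example index values are hypothetical and are not taken from the paper or from OLIVER.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equally long lists of numbers."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cross / sqrt(var_x * var_y)


def r_squared(xs, ys):
    """Squared Pearson correlation (assumed meaning of R² in this sketch)."""
    return pearson_r(xs, ys) ** 2


# Hypothetical stability indices from two methods for the same four genes.
method_a = [1.0, 2.0, 3.0, 4.0]
method_b = [1.0, 3.0, 2.0, 4.0]
print(r_squared(method_a, method_b))
```

An R² close to 1 would indicate that the two methods rank genes almost identically, which is how the abstract uses the NormFinder and geNorm comparison.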

6.
BMC Genomics ; 14: 243, 2013 Apr 12.
Article in English | MEDLINE | ID: mdl-23577827

ABSTRACT

BACKGROUND: Recent studies have found thousands of natural antisense transcripts originating from the same genomic loci as protein-coding genes but from the opposite strand. It is unclear whether the majority of antisense transcripts are functional or merely transcriptional noise. RESULTS: Using the Affymetrix Exon array with a modified cDNA synthesis protocol that enables genome-wide detection of antisense transcription, we conducted large-scale expression analysis of antisense transcripts in nine corresponding tissues from human, mouse and rat. We detected thousands of antisense transcripts, some of which show tissue-specific expression and could be studied further for their potential function in the corresponding tissues/organs. The expression patterns of many antisense transcripts are conserved across species, suggesting selective pressure on these transcripts. When compared to protein-coding genes, antisense transcripts show a lesser degree of expression conservation. We also found a positive correlation between sense and antisense expression across tissues. CONCLUSION: Our results suggest that natural antisense transcripts are subject to selective pressure, but to a lesser degree than sense transcripts, in mammals.


Subject(s)
RNA, Antisense/genetics , Transcription, Genetic , Animals , DNA, Complementary/genetics , Exons/genetics , Gene Expression Profiling , Humans , Mice , Oligonucleotide Array Sequence Analysis/methods , Organ Specificity , Rats , Reverse Transcriptase Polymerase Chain Reaction
7.
Dataset Pap Biol ; 2013, 2013.
Article in English | MEDLINE | ID: mdl-23457664

ABSTRACT

Microarrays are a large-scale expression profiling method that has been used to study the transcriptome of plants under various environmental conditions. However, manual inspection of microarray data is difficult at the genome level because of the large number of genes (normally at least 30,000) and the many different processes that occur within any given plant. MapMan software, which was initially developed to visualize microarray data for Arabidopsis, has been adapted to other plant species by mapping other species onto the MapMan ontology. This paper provides a detailed procedure and the relevant computing codes to generate a MapMan ontology mapping file for tobacco (Nicotiana tabacum L.) using potato and Arabidopsis as intermediates. The mapping file can be used directly with our custom-made NimbleGen oligoarray, which contains gene sequences from both the tobacco gene space sequence and the tobacco gene index 4 (NTGI4) collection of ESTs. The generated data set will be informative for scientists working on tobacco as their model plant by providing a MapMan ontology mapping file for tobacco and the homology between tobacco coding sequences and those of potato and Arabidopsis; in addition, our procedure and codes can be adapted for other plant species where the complete genome is not yet available.

8.
ISRN Bioinform ; 2013: 361321, 2013.
Article in English | MEDLINE | ID: mdl-25937945

ABSTRACT

The expression of reference genes used in gene expression studies is assumed to be stable under most circumstances. However, studies have demonstrated that genes assumed to be stably expressed in one species are not necessarily stably expressed in other organisms. This study aims to evaluate the likelihood of genus-specific reference genes for liver using comparable microarray datasets from Spermophilus lateralis and Spermophilus tridecemlineatus. The coefficient of variance (CV) of each probe was calculated and there were 178 probes common between the lowest 10% CV of both datasets (n = 1258). All 3 lists were analysed by NormFinder. Our results suggest that the most invariant probe for S. tridecemlineatus was 02n12, while that for S. lateralis was 24j21. However, our results showed that Probes 02n12 and 24j21 are ranked 8644 and 926 in terms of invariancy for S. lateralis and S. tridecemlineatus respectively. This suggests the lack of common liver-specific reference probes for both S. lateralis and S. tridecemlineatus. Given that S. lateralis and S. tridecemlineatus are closely related species and the datasets are comparable, our results do not support the presence of genus-specific reference genes.
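
The selection step described in this abstract (compute a CV per probe, keep the lowest 10% in each dataset, then find the probes common to both) can be illustrated with a short sketch. This is not the authors' code: the function names, the use of the population standard deviation, and the toy data are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean for one probe's signals."""
    m = mean(values)
    if m == 0:
        raise ValueError("CV is undefined for a zero mean")
    # Assumption: population standard deviation; the abstract does not say which.
    return pstdev(values) / m


def lowest_cv_probes(dataset, fraction=0.10):
    """Return the probe IDs whose CV falls in the lowest `fraction` of a dataset.

    `dataset` maps a probe ID to a list of expression values across samples.
    """
    ranked = sorted(dataset, key=lambda probe: coefficient_of_variation(dataset[probe]))
    keep = max(1, int(len(ranked) * fraction))
    return set(ranked[:keep])


def shared_stable_probes(dataset_a, dataset_b, fraction=0.10):
    """Probes that are in the lowest-CV fraction of both datasets."""
    return lowest_cv_probes(dataset_a, fraction) & lowest_cv_probes(dataset_b, fraction)


# Toy data with hypothetical probe IDs; a larger fraction is used because
# there are only four probes.
species_a = {"p1": [10, 10, 10], "p2": [10, 11, 10], "p3": [1, 20, 5], "p4": [2, 30, 9]}
species_b = {"p1": [5, 5, 5], "p2": [1, 9, 3], "p3": [7, 7, 8], "p4": [1, 50, 2]}
print(shared_stable_probes(species_a, species_b, fraction=0.5))
```

In the study, the shared list and the two per-species lists were then passed to NormFinder; that step is not reproduced here.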

9.
Electron Physician ; 5(1): 576-81, 2013.
Article in English | MEDLINE | ID: mdl-26120385

ABSTRACT

BACKGROUND: Escherichia coli is a widely studied prokaryotic system. A recent study demonstrated that the reduced growth of E. coli after extended culture in Luria-Bertani broth results from depletion of fermentable sugars, but that the broth can sustain extended cell culture owing to the presence of amino acids, which can be utilized as a carbon source. However, this has not been demonstrated in other media. The study aimed to determine the growth and viability of E. coli ATCC 8739 in 3 different media, Nutrient Broth (NB), Brain Heart Infusion (BHI) and Luria-Bertani Broth (LB), over 11 weeks. METHODS: Growth of E. coli ATCC 8739 was determined by optical density. Viability was determined by serial dilution/spread-plate enumeration. After 11 weeks, the media were exhausted by repeated culture. Glucose was added to the exhausted media to determine whether glucose is the growth-limiting factor. RESULTS: Our results showed that cell density in all 3 media increased to about 1 × 10⁹ cells/ml by the end of week 1, from the inoculation density of 2.67 × 10⁵ cells/ml, peaked at about 1 × 10¹³ cells/ml at week 4, before declining to about 5 × 10⁷ cells/ml at week 7. Cell density is highly correlated with genomic DNA content (r² = 0.93) but poorly correlated with optical density (r² < 0.2). Our results also showed that the spent media were able to support further growth after glucose supplementation. CONCLUSION: NB, LB and BHI are able to support extended periods of culture, and glucose depletion is the likely reason for declining cell growth.

10.
Theory Biosci ; 131(4): 215-23, 2012 Dec.
Article in English | MEDLINE | ID: mdl-22588998

ABSTRACT

The aim of this review is to find answers to some of the questions surrounding reference genes and their reliability for quantitative experiments. Reference genes are assumed to be at a constant expression level over a range of conditions, such as temperature. These genes, such as GAPDH and beta-actin, are used extensively for gene expression studies using techniques like quantitative PCR. There have been several studies carried out on identifying reference genes. However, a lot of evidence indicates issues with the general suitability of these genes. Recent studies have shown that different factors, including the environment and methods, play an important role in changing the expression levels of the reference genes. Thus, we conclude that there is no reference gene that can be deemed suitable for all experimental conditions. In addition, we believe that every experiment will require the scientific evaluation and selection of the best candidate gene for use as a reference gene to obtain reliable scientific results.


Subject(s)
Gene Expression Profiling/methods , RNA, Messenger/analysis , Real-Time Polymerase Chain Reaction/methods , Gene Expression Profiling/standards , Humans , Reference Standards , Validation Studies as Topic
11.
ISRN Bioinform ; 2012: 790452, 2012.
Article in English | MEDLINE | ID: mdl-25969744

ABSTRACT

Lung cancer is a common cancer, and expression profiling can provide an accurate indication to advance medical intervention. However, this requires the availability of stably expressed genes as references. Recent studies have shown that genes that are stably expressed in one tissue may not be stably expressed in other tissues, suggesting the need to identify stably expressed genes in each tissue for use as reference genes. DNA microarray analysis has been used to identify those reference genes with low fluctuation. Fourteen datasets with different lung conditions were employed in our study. Coefficient of variance, followed by NormFinder, was used to identify stably expressed genes. Our results showed that classical reference genes such as GAPDH and HPRT1 were highly variable; thus, they are unsuitable as reference genes. Signal peptidase complex subunit 1 (SPCS1) and hydroxyacyl-CoA dehydrogenase beta subunit (HADHB), which are involved in fundamental biochemical processes, demonstrated high expression stability, suggesting their suitability in human lung cell profiling.

12.
ISRN Microbiol ; 2012: 965356, 2012.
Article in English | MEDLINE | ID: mdl-23724334

ABSTRACT

Escherichia coli is commonly found in the human intestine, and any changes in its adaptation or evolution may affect the human body. The relationship between E. coli and food additives is less studied than that between E. coli and antibiotics. E. coli within the human gut are constantly interacting with food additives; thus, it is important to investigate this relationship. In this paper, we observed the evolution of E. coli cultured in different concentrations of food additives (sodium chloride, benzoic acid, and monosodium glutamate), singly or in combination, over 70 passages. Adaptability over time was estimated by generation time and cell density at stationary phase. Polymerase chain reaction (PCR)/restriction fragment length polymorphism (RFLP), using 3 primers and 3 restriction endonucleases, was used to characterize adaptation/evolution at the genomic level. The amplification and digestion profiles were tabulated and analyzed by the Nei-Li dissimilarity index. Our results demonstrate that E. coli in every treatment had adapted over 465 generations. The types of stress were discovered to be different even though different concentrations of the same additives were used. However, RFLP shows a convergence of genetic distances, suggesting the presence of a global stress response. In addition, monosodium glutamate may be a nutrient source and support acid resistance in E. coli.
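
For readers unfamiliar with the Nei-Li dissimilarity index mentioned above, the following is a minimal sketch of its usual definition for band-sharing data: one minus twice the number of shared bands divided by the total number of bands in the two profiles. Representing each profile as a set of fragment sizes is an assumption made here for illustration; the abstract does not describe how the profiles were encoded, and the band sizes below are invented.

```python
def nei_li_dissimilarity(bands_a, bands_b):
    """Nei-Li dissimilarity between two banding profiles.

    Each profile is a set of band identifiers (for example fragment sizes).
    Returns 0.0 for identical profiles and 1.0 for profiles sharing no bands.
    """
    bands_a, bands_b = set(bands_a), set(bands_b)
    total = len(bands_a) + len(bands_b)
    if total == 0:
        raise ValueError("both profiles are empty; the index is undefined")
    shared = len(bands_a & bands_b)
    return 1.0 - (2.0 * shared) / total


# Hypothetical fragment sizes (bp) for an ancestral and an evolved culture.
ancestor = {100, 200, 300}
evolved = {200, 300, 400}
print(nei_li_dissimilarity(ancestor, evolved))  # 1 - 4/6, about 0.333
```

A convergence of genetic distances, as reported in the abstract, would correspond to these pairwise values becoming more similar across treatments over successive passages.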

13.
BMC Bioinformatics ; 12 Suppl 8: S6, 2011 Oct 03.
Article in English | MEDLINE | ID: mdl-22152021

ABSTRACT

BACKGROUND: Previously, gene normalization (GN) systems were mostly focused on disambiguation using contextual information. An effective gene mention tagger was deemed unnecessary because the subsequent steps would filter out false positives and high recall was considered sufficient. However, unlike similar tasks in the past BioCreative challenges, the BioCreative III GN task is particularly challenging because it is not species-specific. When required to process full-length articles, an ineffective gene mention tagger may produce a huge number of ambiguous false positives that overwhelm subsequent filtering steps while still missing many true positives. RESULTS: We present our GN system, which participated in the BioCreative III GN task. Our system applies a typical 2-stage approach to GN but features a soft tagging gene mention tagger that generates a set of overlapping gene mention variants with nearly perfect recall. The overlapping gene mention variants increase the chance of a precise match in the dictionary and alleviate the need for disambiguation. Our GN system achieved a precision of 0.9 (F-score 0.63) on the BioCreative III GN test corpus with the silver annotation of 507 articles. Its TAP-k scores are competitive with the best results among all participants. CONCLUSIONS: We show that despite the lack of clever disambiguation in our gene normalization system, effective soft tagging of gene mention variants can indeed contribute to performance in cross-species and full-text gene normalization.


Subject(s)
Data Mining , Genes , Species Specificity , Data Mining/methods , Natural Language Processing , Periodicals as Topic , Software , Terminology as Topic
14.
ISRN Microbiol ; 2011: 469053, 2011.
Article in English | MEDLINE | ID: mdl-23724305

ABSTRACT

The expression of reference genes used in gene expression studies is assumed to be stable under most circumstances. However, a number of studies have demonstrated that such genes vary under experimental conditions. In addition, genes that are stably expressed in one organ may not be stably expressed in other organs or other organisms, suggesting the need to identify reference genes for each organ and organism. This study aims at identifying stably expressed genes in Escherichia coli. Microarray datasets from E. coli substrain MG1655 and 1 dataset from W3110 were analysed. The coefficient of variance (COV) was calculated and 10% of the lowest COV from 4631 genes common in the 3 MG1655 sets were analysed using NormFinder. Glucan biosynthesis protein G (mdoG), which is involved in cell wall synthesis, displayed the lowest weighted COV and weighted NormFinder Stability Index for the MG1655 datasets, and was also the most stable in the dataset for substrain W3110, suggesting that mdoG is a suitable reference gene for E. coli K-12. Gene ontology over-representation analysis on the 39 genes suggested an over-representation of cell division, carbohydrate metabolism, and protein synthesis, which supports the short generation time of E. coli.

15.
IUBMB Life ; 62(3): 200-3, 2010 Mar.
Article in English | MEDLINE | ID: mdl-20087965

ABSTRACT

Differences in gene expression are characteristic of the function of different cell types, and those genes with low expression variance can be used as standards for quantitative gene expression studies. Microarray technology is used to study global gene expression within a cell; hence, it represents a suitable source of data to mine for genes with low expression variance. The coefficient of variation (COV) of each gene was determined and a threshold of less than 0.1 COV was used to select stably expressed genes in each data set. Our results showed that microtubule affinity-regulating kinase 3 (MARK3) has the lowest COV in eight microarray datasets. In addition, the expression of housekeeping genes, which are expected to be stably expressed, tends to fluctuate highly under different conditions, marking them as being less reliable for use as reference genes.
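
The thresholding rule in this abstract (keep genes whose COV is below 0.1 within a data set) can be sketched as below. This is an illustrative reading rather than the authors' procedure: the function name, the choice of the sample standard deviation, and the example gene labels and values are assumptions.

```python
from statistics import mean, stdev


def stable_genes(expression, threshold=0.1):
    """Return {gene: COV} for genes with COV below `threshold`, lowest first.

    `expression` maps a gene name to its expression values across the arrays
    of one data set (at least two values per gene).
    """
    selected = {}
    for gene, values in expression.items():
        m = mean(values)
        if m <= 0:
            continue  # COV is not meaningful for a non-positive mean
        # Assumption: sample standard deviation; the abstract does not specify.
        cov = stdev(values) / m
        if cov < threshold:
            selected[gene] = cov
    return dict(sorted(selected.items(), key=lambda item: item[1]))


# Hypothetical expression values for three placeholder genes.
dataset = {
    "geneA": [100, 101, 99, 100],
    "geneB": [10, 50, 90, 20],
    "geneC": [200, 210, 190, 205],
}
print(list(stable_genes(dataset)))  # ['geneA', 'geneC']
```

Applying the same filter to each data set and comparing the surviving genes is one way to look for a gene that is consistently stable across data sets, as the abstract reports for MARK3.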


Subject(s)
Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/standards , Protein Serine-Threonine Kinases/genetics , Animals , Glyceraldehyde 3-Phosphate Dehydrogenase (NADP+)/biosynthesis , Mice , Protein Serine-Threonine Kinases/biosynthesis , Ribosomal Proteins/biosynthesis
16.
BMC Bioinformatics ; 10 Suppl 15: S7, 2009 Dec 03.
Article in English | MEDLINE | ID: mdl-19958517

ABSTRACT

BACKGROUND: To automatically process large quantities of biological literature for knowledge discovery and information curation, text mining tools are becoming essential. Abbreviation recognition is related to NER and can be considered as a pair recognition task of a term and its corresponding abbreviation from free text. The successful identification of an abbreviation and its corresponding definition is not only a prerequisite for indexing terms of text databases to produce articles of related interest, but also a building block to improve existing gene mention tagging and gene normalization tools. RESULTS: Our approach to abbreviation recognition (AR) is based on machine learning, which exploits a novel set of rich features to learn rules from training data. Tested on the AB3P corpus, our system demonstrated an F-score of 89.90% with 95.86% precision at 84.64% recall, higher than the result achieved by the existing best AR performance system. We also annotated a new corpus of 1200 PubMed abstracts which was derived from the BioCreative II gene normalization corpus. On our annotated corpus, our system achieved an F-score of 86.20% with 93.52% precision at 79.95% recall, which also outperforms all tested systems. CONCLUSION: By applying our system to extract all short form-long form pairs from all available PubMed abstracts, we have constructed BIOADI. Mining BIOADI reveals many interesting trends in biomedical research. In addition, we provide off-line AR software in the download section on http://bioagent.iis.sinica.edu.tw/BIOADI/.
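
The F-scores reported in this abstract are consistent with the balanced F-measure, the harmonic mean of precision and recall. The sketch below shows that calculation using the precision and recall figures quoted above; the function name and the optional `beta` parameter are illustrative choices, not part of the authors' software.

```python
def f_score(precision, recall, beta=1.0):
    """F-measure of precision and recall; beta=1 gives the harmonic mean."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision == 0 and recall == 0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Figures quoted in the abstract (percentages).
print(round(f_score(95.86, 84.64), 2))  # AB3P corpus: 89.9
print(round(f_score(93.52, 79.95), 2))  # annotated PubMed corpus: 86.2
```

Because the measure is a harmonic mean, the lower of the two values (recall, in both corpora) pulls the score down more than an arithmetic mean would.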


Subject(s)
Artificial Intelligence , Computational Biology/methods , Software , Algorithms , Data Mining/methods , Natural Language Processing , PubMed