Abstract
Recent progress has been made in the identification of protein-coding genes and miRNAs that are expressed in and alter the behavior of colonic epithelia. However, the role of long non-coding RNAs (lncRNAs) in colonic homeostasis is just beginning to be explored. By gene expression profiling of post-mitotic, differentiated tops and proliferative, progenitor-compartment bottoms of microdissected adult mouse colonic crypts, we identified several lncRNAs more highly expressed in crypt bottoms. One identified lncRNA, designated non-coding Nras functional RNA (ncNRFR), resides within the Nras locus but appears to be independent of the Nras coding transcript. Stable overexpression of ncNRFR in non-transformed, conditionally immortalized mouse colonocytes results in malignant transformation, as determined by growth in soft agar and formation of highly invasive tumors in nude mice. Moreover, ncNRFR appears to inhibit the function of the tumor suppressor let-7. These results suggest precise regulation of ncNRFR is necessary for proper cell growth in the colonic crypt, and its misregulation results in neoplastic transformation.
Subjects
Cell Transformation, Neoplastic; Colon/pathology; Colonic Neoplasms/genetics; Epithelial Cells/pathology; Gene Expression Regulation, Neoplastic; RNA, Long Noncoding/genetics; Animals; Colon/metabolism; Colonic Neoplasms/metabolism; Colonic Neoplasms/pathology; Epithelial Cells/metabolism; Gene Expression Profiling; Mice; Mice, Nude; MicroRNAs/genetics; MicroRNAs/metabolism; RNA, Long Noncoding/metabolism
Abstract
Mouse knockout technology provides a powerful means of elucidating gene function in vivo, and a publicly available genome-wide collection of mouse knockouts would be significantly enabling for biomedical discovery. To date, published knockouts exist for only about 10% of mouse genes. Furthermore, many of these are limited in utility because they have not been made or phenotyped in standardized ways, and many are not freely available to researchers. It is time to harness new technologies and efficiencies of production to mount a high-throughput international effort to produce and phenotype knockouts for all mouse genes, and place these resources into the public domain.
Subjects
Mice, Knockout; Research Embryo Creation; Alleles; Animals; Genetic Research; Mice; Phenotype; Research Embryo Creation/economics
Abstract
High-throughput experiments in biology often produce sets of genes of potential interest. Some of those gene sets might be of considerable size. Therefore, computer-assisted analysis is necessary for the biological interpretation of the gene sets, and for creating working hypotheses that can be tested experimentally. One obvious way to analyze gene set data is to associate the genes with a particular biological feature, for example, a given pathway. Statistical analysis can then be used to evaluate whether a gene set is truly associated with a feature. Over the past few years many tools that perform such analysis have been created. In this chapter, using WebGestalt as an example, it will be explained in detail how to associate gene sets with functional annotations, pathways, publication records, and protein domains.
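One common way to evaluate whether a gene set is associated with a feature such as a pathway is an over-representation test based on the hypergeometric distribution. The sketch below is an illustration only and is not taken from WebGestalt; the function names and the toy gene identifiers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric(population, annotated, drawn)."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def enrichment_p(gene_set, pathway_genes, background):
    """Return (overlap size, one-sided p-value) for over-representation."""
    background = set(background)
    # Restrict both lists to the background so the counts are comparable.
    gene_set = set(gene_set) & background
    pathway_genes = set(pathway_genes) & background
    k = len(gene_set & pathway_genes)
    p = hypergeom_sf(k, len(background), len(pathway_genes), len(gene_set))
    return k, p


# Toy data (hypothetical identifiers): 20 background genes, 5 in the pathway.
background = [f"g{i}" for i in range(20)]
pathway = ["g0", "g1", "g2", "g3", "g4"]
interesting = ["g0", "g1", "g2", "g10"]
overlap, p_value = enrichment_p(interesting, pathway, background)
print(overlap, p_value)
```

When many features are tested at once, the resulting p-values would normally also need a multiple-testing correction, which this sketch omits.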
Subjects
Databases, Genetic; Genetic Techniques/statistics & numerical data; Software; Computational Biology; Data Interpretation, Statistical; Gene Expression Profiling/statistics & numerical data; Genomics/statistics & numerical data; Oligonucleotide Array Sequence Analysis/statistics & numerical data
Abstract
High-throughput technologies have led to the rapid generation of large-scale datasets about genes and gene products. These technologies have also shifted our research focus from 'single genes' to 'gene sets'. We have developed a web-based integrated data mining system, WebGestalt (http://genereg.ornl.gov/webgestalt/), to help biologists in exploring large sets of genes. WebGestalt is composed of four modules: gene set management, information retrieval, organization/visualization, and statistics. The management module uploads, saves, retrieves and deletes gene sets, as well as performs Boolean operations to generate the unions, intersections or differences between different gene sets. The information retrieval module currently retrieves information for up to 20 attributes for all genes in a gene set. The organization/visualization module organizes and visualizes gene sets in various biological contexts, including Gene Ontology, tissue expression pattern, chromosome distribution, metabolic and signaling pathways, protein domain information and publications. The statistics module recommends and performs statistical tests to suggest biological areas that are important to a gene set and warrant further investigation. In order to demonstrate the use of WebGestalt, we have generated 48 gene sets with genes over-represented in various human tissue types. Exploration of all the 48 gene sets using WebGestalt is available for the public at http://genereg.ornl.gov/webgestalt/wg_enrich.php.
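The Boolean operations described for the gene set management module (unions, intersections and differences) map directly onto ordinary set operations. The following is a minimal sketch under that assumption; it does not reproduce WebGestalt's own code, and the function name and example gene symbols are chosen only for illustration.

```python
def combine_gene_sets(set_a, set_b, operation):
    """Combine two gene sets with a named Boolean operation."""
    a, b = set(set_a), set(set_b)
    if operation == "union":
        return a | b
    if operation == "intersection":
        return a & b
    if operation == "difference":
        # Genes present in the first set but not in the second.
        return a - b
    raise ValueError(f"unknown operation: {operation}")


# Illustrative gene symbols only.
set_a = {"Nras", "Kras", "Trp53", "Myc"}
set_b = {"Kras", "Myc", "Apc"}

for op in ("union", "intersection", "difference"):
    print(op, sorted(combine_gene_sets(set_a, set_b, op)))
```

Note that the difference is not symmetric: swapping the two arguments gives the genes unique to the second set instead.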
Subjects
Genes; Software; Computer Graphics; Data Interpretation, Statistical; Databases, Genetic; Gene Expression; Genomics; Humans; Internet; Proteomics; Systems Integration; Tissue Distribution; User-Computer Interface
Abstract
BACKGROUND: Microarray and other high-throughput technologies are producing large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in the gene sets. RESULTS: We have created a web-based tool for data analysis and data visualization for sets of genes called GOTree Machine (GOTM). This tool was originally intended to analyze sets of co-regulated genes identified from microarray analysis but is adaptable for use with other gene sets from other high-throughput analyses. GOTree Machine generates a GOTree, a tree-like structure to navigate the Gene Ontology Directed Acyclic Graph for input gene sets. This system provides user friendly data navigation and visualization. Statistical analysis helps users to identify the most important Gene Ontology categories for the input gene sets and suggests biological areas that warrant further study. GOTree Machine is available online at http://genereg.ornl.gov/gotm/. CONCLUSION: GOTree Machine has a broad application in functional genomic, proteomic and other high-throughput methods that generate large sets of interesting genes; its primary purpose is to help users sort for interesting patterns in gene sets.
Subjects
Genes, Insect/physiology; Genes/physiology; Animals; Cluster Analysis; Computational Biology/statistics & numerical data; Computer Graphics/statistics & numerical data; Data Interpretation, Statistical; Databases, Genetic/statistics & numerical data; Diptera/genetics; Gene Expression Profiling/statistics & numerical data; Genome; Genome, Human; Humans; Internet; Mice; Oligonucleotide Array Sequence Analysis/statistics & numerical data; Rats; Software/statistics & numerical data; Software Design; User-Computer Interface
Abstract
BACKGROUND: Modern biological research makes possible the comprehensive study and development of heritable mutations in the mouse model at high-throughput. Using techniques spanning genetics, molecular biology, histology, and behavioral science, researchers may examine, with varying degrees of granularity, numerous phenotypic aspects of mutant mouse strains directly pertinent to human disease states. Success of these and other genome-wide endeavors relies on a well-structured bioinformatics core that brings together investigators from widely dispersed institutions and enables them to seamlessly integrate data, observations and discussions. DESCRIPTION: MuTrack was developed as the bioinformatics core for a large mouse phenotype screening effort. It is a comprehensive collection of on-line computational tools and tracks thousands of mutagenized mice from birth through senescence and death. It identifies the physical location of mice during an intensive phenotype screening process at several locations throughout the state of Tennessee and collects raw and processed experimental data from each domain. MuTrack's statistical package allows researchers to access a real-time analysis of mouse pedigrees for aberrant behavior, and subsequent recirculation and retesting. The end result is the classification of potential and actual heritable mutant mouse strains that become immediately available to outside researchers who have expressed interest in the mutant phenotype. CONCLUSION: MuTrack demonstrates the effectiveness of using bioinformatics techniques in data collection, integration and analysis to identify unique result sets that are beyond the capacity of a solitary laboratory. By employing the research expertise of investigators at several institutions for a broad-ranging study, the TMGC has amplified the effectiveness of any one consortium member. The bioinformatics strategy presented here lends future collaborative efforts a template for a comprehensive approach to large-scale analysis.
Subjects
DNA Mutational Analysis/methods; Genome; Mutagenesis/genetics; Software; Animals; Humans; Mice
Abstract
N-ethyl-N-nitrosourea (ENU) mutagenesis is presented as a powerful approach to developing models for human disease. The efforts of three NIH Mutagenesis Centers established for the detection of neuroscience-related phenotypes are described. Each center has developed an extensive panel of phenotype screens that assess nervous system structure and function. In particular, these screens focus on complex behavioral traits from drug and alcohol responses to circadian rhythms to epilepsy. Each of these centers has developed a bioinformatics infrastructure to track the extensive number of transactions that are inherent in these large-scale projects. Over 100 new mouse mutant lines have been defined through the efforts of these three mutagenesis centers and are presented to the research community via the centralized Web presence of the Neuromice.org consortium (http://www.neuromice.org). This community resource provides visitors with the ability to search for specific mutant phenotypes, to view the genetic and phenotypic details of mutant mouse lines, and to order these mice for use in their own research program.
Subjects
Mutagenesis; Nervous System Diseases/genetics; Nervous System Physiological Phenomena; Alkylating Agents; Animals; Ethylnitrosourea; Mice
Abstract
Matrix metalloproteinase-7 (MMP-7) is a small secreted proteolytic enzyme with broad substrate specificity against ECM and non-ECM components. MMP-7 is known to be vital for tumor invasion and metastasis, and accumulating evidence also implicates it in cancer development. Using data from the Shanghai Breast Cancer Study, we conducted a two-stage study to evaluate the association of MMP-7 single nucleotide polymorphisms (SNPs) with breast cancer risk. Additionally, associated SNPs were characterized by laboratory assays. In stage 1, 11 SNPs were genotyped among 1,079 incident cases and 1,082 community controls using an Affymetrix Genotyping System. Promising SNPs were selected for stage 2 evaluation and genotyped by TaqMan allelic discrimination assays in an independent set of 1,911 cases and 1,811 controls. Three SNPs were selected for stage 2 validation (rs880197, rs10895304, and rs12184413); one had highly consistent results between the two stages of the study. In combined analysis, homozygosity for the variant T allele for rs12184413 was associated with an odds ratio (OR) of 0.7 [95% confidence interval (95% CI), 0.6-0.9] compared with the common C allele. This effect was slightly more pronounced in postmenopausal women (OR, 0.6; 95% CI, 0.4-0.8) than in premenopausal women (OR, 0.8; 95% CI, 0.6-1.1). This SNP is located 3' of the MMP-7 gene, in an area enriched with CTCF binding sites. In silico analysis suggested a regulatory role for this region, and our in vitro assays showed an allelic difference in nuclear protein binding capacity. Results from our study suggest that common MMP-7 genetic polymorphisms may contribute to breast cancer susceptibility.
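For readers unfamiliar with how an odds ratio and its 95% confidence interval are obtained from case-control genotype counts, the sketch below computes an unadjusted OR with a Wald interval from a 2x2 table. The counts are hypothetical and are not taken from the Shanghai Breast Cancer Study; the abstract does not state the model behind its estimates, which may have been adjusted for covariates.

```python
from math import exp, log, sqrt


def odds_ratio_ci(case_variant, case_reference, control_variant,
                  control_reference, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    z=1.96 corresponds to an approximate 95% interval.
    """
    odds_ratio = (case_variant * control_reference) / (
        case_reference * control_variant
    )
    # Standard error of log(OR) from the four cell counts.
    se = sqrt(
        1 / case_variant
        + 1 / case_reference
        + 1 / control_variant
        + 1 / control_reference
    )
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: variant homozygotes vs. reference genotype.
or_, lower, upper = odds_ratio_ci(70, 930, 100, 900)
print(f"OR = {or_:.2f} (95% CI {lower:.2f}-{upper:.2f})")
```

An interval that excludes 1 indicates an association at roughly the 5% level under this simple model; every cell count must be non-zero for the formula to apply.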
Subjects
Breast Neoplasms/genetics; Genetic Predisposition to Disease; Matrix Metalloproteinase 7/genetics; Polymorphism, Single Nucleotide; Adult; Base Sequence; Case-Control Studies; DNA Primers; Electrophoretic Mobility Shift Assay; Female; Humans; Middle Aged
Abstract
PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at http://www.pazar.info, is open for business.
Subjects
Databases, Genetic; Internet; Regulatory Sequences, Nucleic Acid/genetics; Transcription Factors/genetics
Abstract
Gene expression microarray data can be used for the assembly of genetic coexpression network graphs. Using mRNA samples obtained from recombinant inbred Mus musculus strains, it is possible to integrate allelic variation with molecular and higher-order phenotypes. The depth of quantitative genetic analysis of microarray data can be vastly enhanced utilizing this mouse resource in combination with powerful computational algorithms, platforms, and data repositories. The resulting network graphs transect many levels of biological scale. This approach is illustrated with the extraction of cliques of putatively co-regulated genes and their annotation using gene ontology analysis and cis-regulatory element discovery. The causal basis for co-regulation is detected through the use of quantitative trait locus mapping.
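The idea of building a coexpression graph and extracting cliques can be sketched as follows. This is an assumption-laden illustration, not the authors' pipeline: it links genes whose absolute Pearson correlation meets an arbitrary threshold and enumerates maximal cliques with the basic Bron-Kerbosch recursion, which the abstract does not name. Gene names and expression values are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_graph(expression, threshold=0.9):
    """Adjacency sets linking genes with |r| >= threshold (assumed cutoff)."""
    adjacency = {gene: set() for gene in expression}
    for g, h in combinations(expression, 2):
        if abs(pearson(expression[g], expression[h])) >= threshold:
            adjacency[g].add(h)
            adjacency[h].add(g)
    return adjacency


def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)
            return
        for v in list(candidates):
            expand(current | {v}, candidates & adjacency[v],
                   excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return cliques


# Made-up expression profiles across four hypothetical strains.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.1],
    "geneC": [1.1, 2.0, 3.2, 3.9],
    "geneD": [4.0, 1.0, 3.0, 2.0],
}
graph = coexpression_graph(expression)
cliques = [c for c in maximal_cliques(graph) if len(c) >= 2]
print(cliques)
```

Maximal clique enumeration is exponential in the worst case, so real coexpression graphs typically call for the specialized algorithms and platforms the abstract alludes to.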
Abstract
Uncontrolled expansion of adipose tissue leads to obesity, a public health epidemic affecting >30% of adult Americans. Adipose mass increases in part through the recruitment and differentiation of an existing pool of preadipocytes (PA) into adipocytes (AD). Most studies investigating adipogenesis have used primarily murine cell lines; much less is known about the relevant processes that occur in humans. Therefore, characterization of genes associated with adipocyte development is key to understanding the pathogenesis of obesity and developing treatments for this disorder. To address this issue, we performed large-scale analyses of human adipose gene expression using microarray technology. Differential gene expression between PA and AD was analyzed in 6 female patients using human cDNA microarray slides, and the data were analyzed using the Stanford Microarray Database. Statistical analysis of the gene expression was performed using SAS mixed models. Compared with PA, several genes involved in lipid metabolism were overexpressed in AD, including fatty acid binding protein, adipose differentiation-related protein, lipoprotein lipase, perilipin, and adipose most abundant transcript 1. Novel genes expressed in adipocytes included E2F5 transcription factor and SMARC (SWI/SNF-related, matrix associated, actin-dependent regulator of chromatin). PA predominantly expressed genes encoding extracellular matrix components such as fibronectin, matrix metalloprotein, and novel proteins such as lysyl oxidase. Despite the high differential expression of some of these genes, many did not differ significantly, likely due to high variability and limited statistical power. A comprehensive list of differential gene expression is presented according to cellular function. In conclusion, these studies offer an overview of the gene expression profiles in PA and AD and identify new genes with potentially important functions in adipose tissue development and obesity that merit further investigation.