Results 1 - 20 of 21
1.
Methods Mol Biol ; 2324: 49-63, 2021.
Article in English | MEDLINE | ID: mdl-34165708

ABSTRACT

Long intergenic noncoding RNAs (lincRNAs) are known to be expressed in a tissue-specific manner and to regulate functional protein-coding genes; some can even act as competing endogenous RNAs (ceRNAs), because microRNAs can bind to them instead of to the corresponding mRNA binding sites. Some lincRNAs contain remnants of protein-coding sequences, and it has been hypothesized that they might arise through a pseudogenization process. However, a major limitation in the study of this phenomenon is the lack of computational tools designed to align and analyze protein-coding sequences against noncoding sequences. To overcome this limitation, we published a method that finds the remnants of protein-coding sequences within lincRNA sequences, as well as the corresponding sequences in the parental proteins. This method, together with a visualization platform for tracing frameshifts and single point mutations within this type of sequence, is described here.


Subjects
Computational Biology/methods , Open Reading Frames/genetics , RNA, Long Noncoding/analysis , RNA, Long Noncoding/genetics , Sequence Alignment/methods , Sequence Analysis, RNA/methods , Amino Acid Sequence , MicroRNAs/genetics , Pseudogenes/genetics , RNA, Messenger/genetics
2.
Cells ; 10(4)2021 03 29.
Article in English | MEDLINE | ID: mdl-33805436

ABSTRACT

Long intergenic non-coding RNAs (lincRNAs) are long RNAs that do not encode proteins. Functional evidence is lacking for most of them. Their biogenesis is not well understood, but many lincRNAs are thought to originate from genomic duplication of coding material, which produces pseudogenes: gene copies that lose their original function and can accumulate mutations. While most pseudogenes eventually stop producing a transcript and are erased by mutations, many of these pseudogene-derived lincRNAs retain similarity to the parental gene from which they originated, possibly for functional reasons. For example, they can act as decoys for miRNAs targeting the parental gene. Functional enrichment analysis is a powerful tool for discovering the functional effects of a treatment that produces differential expression of transcripts. However, because the function of lincRNAs is not easy to define experimentally, such a tool has been lacking for them. To address this problem, we have developed an enrichment analysis tool for lincRNAs that uses the function of their parental genes as a proxy for lincRNA function and focuses on human diseases.
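The proxy idea in this abstract can be illustrated with a generic over-representation test. The sketch below is not the published tool: it assumes a one-sided hypergeometric test, and the names `parent_of`, `disease_genes` and `background` are hypothetical inputs invented for illustration. It also assumes that the selected lincRNAs are a subset of the background.

```python
from math import comb


def hypergeom_pvalue(population, annotated, sample, hits):
    """P(X >= hits) when `sample` items are drawn without replacement from
    `population` items, of which `annotated` carry the term of interest."""
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def parental_enrichment(selected, parent_of, disease_genes, background):
    """Test whether the parental genes of the selected lincRNAs are
    over-represented in a disease gene set (hypothetical inputs;
    `selected` is assumed to be a subset of `background`)."""
    background_parents = {parent_of[l] for l in background if l in parent_of}
    selected_parents = {parent_of[l] for l in selected if l in parent_of}
    annotated = len(background_parents & disease_genes)
    hits = len(selected_parents & disease_genes)
    return hypergeom_pvalue(
        len(background_parents), annotated, len(selected_parents), hits
    )
```

In practice one such test would be run per disease term, and the resulting p-values would need a multiple-testing correction.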


Subjects
Disease/genetics , Gene Expression Profiling , RNA, Long Noncoding/genetics , Breast Neoplasms/genetics , Female , Gene Expression Regulation, Neoplastic , Humans , Internet , Kaplan-Meier Estimate , Prognosis , RNA, Long Noncoding/metabolism , User-Computer Interface
3.
Front Neurol ; 11: 573560, 2020.
Article in English | MEDLINE | ID: mdl-33329316

ABSTRACT

Huntington's disease (HD) is an autosomal dominantly inherited neurodegenerative disorder caused by a trinucleotide repeat expansion in the Huntingtin gene. As disease-modifying therapies for HD are being developed, peripheral blood cells may be used to indicate disease progression and to monitor treatment response. In order to investigate whether gene expression changes can be found in the blood of individuals with HD that distinguish them from healthy controls, we performed transcriptome analysis by next-generation sequencing (RNA-seq). We detected a gene expression signature consistent with dysregulation of immune-related functions and inflammatory response in peripheral blood from HD cases vs. controls, including induction of the interferon response genes, IFITM3, IFI6 and IRF7. Our results suggest that it is possible to detect gene expression changes in blood samples from individuals with HD, which may reflect the immune pathology associated with the disease.

4.
Epigenetics Chromatin ; 12(1): 72, 2019 12 05.
Article in English | MEDLINE | ID: mdl-31805995

ABSTRACT

BACKGROUND: Our understanding of nuclear chromatin structure has increased greatly in recent years, mainly as a consequence of advances in chromatin conformation capture methods such as Hi-C. The unprecedented resolution of genome-wide interaction maps reveals functional consequences that extend beyond the initial notion of an efficient DNA packaging mechanism: gene regulation, DNA repair, chromosomal translocations and evolutionary rearrangements seem to be only the tip of the iceberg. One key concept emerging from this research is that of topologically associating domains (TADs), whose functional role in gene regulation and association with disease are not fully understood. RESULTS: We report that the lower the number of protein-coding genes inside a TAD, the higher the tendency of those genes to be associated with disease (p-value = 4 × [Formula: see text]). Moreover, housekeeping genes are less associated with disease than other genes. Accordingly, they are depleted in TADs containing fewer than three protein-coding genes (p-value = 3.9 × [Formula: see text]). We observed that TADs with higher ratios of enhancers to genes contained higher numbers of disease-associated genes. We interpret these results as an indication that sharing enhancers among genes reduces their involvement in disease. Larger TADs would have more chances to accommodate many genes and to select for enhancer sharing during evolution. CONCLUSIONS: Genes associated with human disease are not distributed randomly over the TADs. Our observations suggest general rules that confer functional stability on TADs, adding further evidence for the role of TADs as regulatory units.


Subjects
Chromatin/genetics , Disease/genetics , Open Reading Frames/genetics , Cell Line , Chromatin/chemistry , Chromatin/metabolism , Databases, Genetic , Enhancer Elements, Genetic , Humans , Transcription Initiation Site
5.
Nucleic Acids Res ; 46(17): 8720-8729, 2018 09 28.
Article in English | MEDLINE | ID: mdl-29986053

ABSTRACT

Long intergenic non-coding RNAs (lincRNAs) are non-coding transcripts >200 nucleotides long that do not overlap protein-coding sequences. Importantly, such elements are known to be tissue-specifically expressed and to play a widespread role in gene regulation across thousands of genomic loci. However, very little is known about the mechanisms for the evolutionary biogenesis of these RNA elements, especially given their poor conservation across species. It has been proposed that lincRNAs might arise from pseudogenes. To test this systematically, we developed a novel method that searches for remnants of protein-coding sequences within lincRNA transcripts; the hypothesis is that we can trace back their biogenesis from protein-coding genes or subsequent transposon/retrotransposon insertions. Applying this method, we found 203 human lincRNA genes with regions significantly similar to protein-coding sequences. Our method provides a visualization tool to trace the evolutionary biogenesis of lincRNAs with respect to protein-coding genes by sequence divergence. Subsequently, we show the expression correlation between lincRNAs and their identified parental protein-coding genes using public RNA-seq repositories, hinting at novel gene regulatory relationships. In summary, we developed a novel computational methodology to study non-coding gene sequences, which can be applied to identify the evolutionary biogenesis and function of lincRNAs.
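A common first step when looking for protein-coding remnants in a transcript is to translate it in all three forward reading frames, so that a remnant can be compared with protein sequences even after a frameshift has moved it to another frame. The sketch below illustrates only that generic step under the standard genetic code; it is an assumption for illustration and not the authors' published method.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq, frame=0):
    """Translate a nucleotide sequence starting at offset `frame` (0, 1 or 2).
    Stop codons become '*'; codons with ambiguous bases become 'X'."""
    seq = seq.upper().replace("U", "T")
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame(seq):
    """Return the translations of the three forward reading frames."""
    return [translate(seq, frame) for frame in range(3)]
```

Each translated frame could then be aligned against candidate parental proteins with any protein alignment tool; a match that switches frames partway through the transcript would point to a frameshift.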


Subjects
Computational Biology/methods , DNA, Intergenic/genetics , Open Reading Frames/genetics , RNA, Long Noncoding/genetics , Sequence Analysis, RNA/methods , Algorithms , Amino Acid Sequence , Base Sequence , Gene Expression Regulation , Humans , Sequence Alignment/methods , Sequence Analysis, Protein
6.
Nucleic Acids Res ; 45(1): 81-91, 2017 01 09.
Article in English | MEDLINE | ID: mdl-27634932

ABSTRACT

Paralog genes arise from gene duplication events during evolution, which often lead to similar proteins that cooperate in common pathways and in protein complexes. Consequently, paralogs show correlated gene expression, although the mechanisms of co-regulation remain unclear. In eukaryotes, genes are regulated in part by distal enhancer elements through looping interactions with gene promoters. These looping interactions can be measured by genome-wide chromatin conformation capture (Hi-C) experiments, which revealed self-interacting regions called topologically associating domains (TADs). We hypothesize that paralogs share common regulatory mechanisms to enable coordinated expression according to TADs. To test this hypothesis, we integrated paralogy annotations with human gene expression data in diverse tissues, genome-wide enhancer-promoter associations and Hi-C experiments in human, mouse and dog genomes. We show that paralog gene pairs are enriched for co-localization in the same TAD, share common enhancer elements more often than expected and have increased contact frequencies over large genomic distances. Combined, our results indicate that paralogs share common regulatory mechanisms and cluster not only in the linear genome but also in the three-dimensional chromatin architecture. This enables concerted expression of paralogs over diverse cell types and indicates evolutionary constraints in functional genome organization.
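The co-localization measurement described above can be sketched as a lookup of each gene's TAD followed by a count of pairs that share one. This is a minimal illustration, not the authors' pipeline: it assumes non-overlapping, half-open TAD intervals and a single reference position per gene, and all function and variable names are hypothetical.

```python
from bisect import bisect_right


def build_index(tads):
    """Group non-overlapping TADs, given as (chrom, start, end), by
    chromosome and sort them by start coordinate."""
    index = {}
    for chrom, start, end in sorted(tads):
        index.setdefault(chrom, []).append((start, end))
    return index


def tad_of(index, chrom, pos):
    """Return the TAD (chrom, start, end) containing `pos`, or None.
    Intervals are treated as half-open: start <= pos < end."""
    intervals = index.get(chrom, [])
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
        return (chrom, *intervals[i])
    return None


def same_tad_fraction(index, genes, pairs):
    """Fraction of gene pairs whose positions fall in the same TAD.
    `genes` maps a gene name to (chrom, position)."""
    shared = 0
    for a, b in pairs:
        tad_a = tad_of(index, *genes[a])
        tad_b = tad_of(index, *genes[b])
        shared += tad_a is not None and tad_a == tad_b
    return shared / len(pairs) if pairs else 0.0
```

To speak of enrichment, this fraction would be compared with the same statistic computed on control pairs, for example non-paralog pairs matched for genomic distance.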


Subjects
Chromatin/chemistry , Gene Duplication , Gene Expression Regulation , Genome , Animals , Biological Evolution , Chromatin/metabolism , Chromatin Assembly and Disassembly , Cluster Analysis , Computational Biology , Dogs , Enhancer Elements, Genetic , Humans , Mice , Promoter Regions, Genetic
7.
Genome Med ; 8(1): 28, 2016 Mar 17.
Article in English | MEDLINE | ID: mdl-26988706

ABSTRACT

BACKGROUND: NF-κB is widely involved in lymphoid malignancies; however, the functional roles and specific transcriptomes of NF-κB dimers with distinct subunit compositions have been unclear. METHODS: Using combined ChIP-sequencing and microarray analyses, we determined the cistromes and target gene signatures of canonical and non-canonical NF-κB species in Hodgkin lymphoma (HL) cells. RESULTS: We found that the various NF-κB subunits are recruited to regions with redundant κB motifs in a large number of genes. Yet canonical and non-canonical NF-κB dimers up- and downregulate gene sets that are both distinct and overlapping, and are associated with diverse biological functions. p50 and p52 are formed through NIK-dependent p105 and p100 precursor processing in HL cells and are the predominant DNA binding subunits. Logistic regression analyses of combinations of the p50, p52, RelA, and RelB subunits in binding regions that have been assigned to genes they regulate reveal a cross-contribution of p52 and p50 to canonical and non-canonical transcriptomes. These analyses also indicate that the subunit occupancy pattern of NF-κB binding regions and their distance from the genes they regulate are determinants of gene activation versus repression. The pathway-specific signatures of activated and repressed genes distinguish HL from other NF-κB-associated lymphoid malignancies and inversely correlate with gene expression patterns in normal germinal center B cells, which are presumed to be the precursors of HL cells. CONCLUSIONS: We provide insights that are relevant for lymphomas with constitutive NF-κB activation and generally for the decoding of the mechanisms of differential gene regulation through canonical and non-canonical NF-κB signaling.


Subjects
Genome-Wide Association Study , Hodgkin Disease/genetics , Hodgkin Disease/metabolism , NF-kappa B/genetics , NF-kappa B/metabolism , Binding Sites , Cell Line, Tumor , Cell Survival , Chromatin Immunoprecipitation , Computational Biology/methods , Databases, Nucleic Acid , Gene Expression Regulation, Neoplastic , High-Throughput Nucleotide Sequencing , Humans , I-kappa B Kinase/genetics , I-kappa B Kinase/metabolism , NF-kappa B p50 Subunit/genetics , NF-kappa B p50 Subunit/metabolism , NF-kappa B p52 Subunit/genetics , NF-kappa B p52 Subunit/metabolism , Nucleotide Motifs , Protein Binding , Protein Multimerization , Signal Transduction , Transcription Factor RelA/genetics , Transcription Factor RelA/metabolism , Transcription Factor RelB/genetics , Transcription Factor RelB/metabolism , Transcriptional Activation
8.
Wiley Interdiscip Rev RNA ; 3(4): 567-79, 2012.
Article in English | MEDLINE | ID: mdl-22555938

ABSTRACT

New developments are being brought to the field of molecular biology with the mounting evidence that RNA transcripts not translated into protein (noncoding RNAs, ncRNAs) hold a variety of biological functions. Computational discovery of ncRNAs is one of these developments, fueled not only by the urge to characterize these sequences but also by the need to prioritize those with the most relevant functions for experimental verification. The heterogeneity in size and mode of activity of ncRNAs is reflected in the corresponding diversity of computational methods for their study. Sequence and structural analysis, conservation across species, and position relative to other genomic elements are being used for ncRNA detection. In addition, the recent development of techniques that allow deep sequencing of cell transcripts, either globally or from isolated ncRNA-related material, is leading the field toward increased use of such high-throughput data. We expect that imminent breakthroughs will include the classification of newer types of ncRNA and new insights into miRNA and piRNA biology, eventually leading toward the completion of a catalog of all human ncRNAs.


Subjects
Computational Biology/methods , Molecular Biology/methods , RNA, Untranslated/genetics , Animals , Bacteria/genetics , Humans , Sequence Analysis, RNA , Viruses/genetics
9.
Biochimie ; 93(11): 1916-21, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21816204

ABSTRACT

Pseudogenes have mainly been considered functionless evolutionary relics since their discovery in 1977. However, multiple mechanisms of pseudogene functionality have been proposed at both the transcriptional and the post-transcriptional level. This review focuses on the role of pseudogenes as post-transcriptional regulators. Two lines of research have recently presented strong evidence of their potential function as post-transcriptional regulators of the corresponding parental genes from which they originate. First, pseudogene genomic sequences can encode siRNAs. Second, pseudogene transcripts can act as indirect post-transcriptional regulators by serving as decoys for ncRNAs, in particular miRNAs that target the parental gene. This has been demonstrated for PTEN and KRAS, two genes involved in tumorigenesis. The role of pseudogenes in disease has not been proven and seems to be the next research landmark. In this review, we chronicle the events from the initial discovery of the 'useless' pseudogene to its breakthrough as a functional molecule with hitherto unknown potential to influence human disease.


Subjects
Disease/genetics , Pseudogenes/genetics , RNA, Small Interfering/genetics , Animals , Evolution, Molecular , Gene Expression Regulation , Genomics , Humans , Mice , MicroRNAs/genetics , MicroRNAs/metabolism , RNA Processing, Post-Transcriptional/genetics , RNA Stability/genetics , RNA, Small Interfering/metabolism , Transcription, Genetic
10.
PLoS One ; 6(6): e20561, 2011.
Article in English | MEDLINE | ID: mdl-21698286

ABSTRACT

Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, has explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking EST data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation.


Subjects
Expressed Sequence Tags , MicroRNAs/chemistry , Nucleic Acid Conformation , RNA, Untranslated/chemistry , Animals , Blotting, Northern , Cell Line , Mice , Oligonucleotide Array Sequence Analysis
11.
Nucleic Acids Res ; 39(5): 1732-8, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21051341

ABSTRACT

Almost 50 years after the discovery of the prokaryotic operon, the functional relevance of gene order within operons remains unclear. In this work, we take advantage of the eroded genome of Mycobacterium leprae to add evidence supporting the notion that functionally less important genes have a tendency to be located at the end of operons. M. leprae's genome includes 1133 pseudogenes and 1614 protein-coding genes and can be compared with the closely related genome of M. tuberculosis. Assuming M. leprae's pseudogenes to represent dispensable genes, we have studied the position of these pseudogenes in the operons of M. leprae and of their orthologs in M. tuberculosis. We observed that both tend to be located in the 3' (downstream) half of the operon (P-values of 0.03 and 0.18, respectively). Analysis of pseudogenes in all available prokaryotic genomes confirms this trend (P-value of 7.1 × 10^-7). In a complementary analysis, we found a significant tendency for essential genes to be located in the 5' (upstream) half of the operon (P-value of 0.006). Our work provides an indication that, in prokaryotes, functionally less important genes have a tendency to be located at the end of operons, while more relevant genes tend to be located toward operon starts.
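One way to quantify a positional bias of this kind is to assign each gene to the 5' or 3' half of its operon and ask how surprising the observed count in the 3' half would be if both halves were equally likely. The sketch below assumes a one-sided binomial test with p = 0.5 and ignores the middle gene of odd-length operons; the abstract does not state which test the authors used, so both choices are assumptions made for illustration.

```python
from math import comb


def operon_half(position, length):
    """Classify a 0-based gene position (in transcription order) within an
    operon of `length` genes. Returns "5'" or "3'", or None for the middle
    gene of an odd-length operon."""
    centre = (length - 1) / 2
    if position < centre:
        return "5'"
    if position > centre:
        return "3'"
    return None


def binomial_tail(successes, trials, p=0.5):
    """One-sided P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )
```

For instance, `binomial_tail(n_three_prime, n_classified)` would give the probability of seeing at least `n_three_prime` pseudogenes in the 3' half among `n_classified` classified pseudogenes under the null hypothesis of no positional preference (both variable names are hypothetical).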


Subjects
Mycobacterium leprae/genetics , Operon , Pseudogenes , Gene Order , Genes, Bacterial , Genomics
12.
BMC Evol Biol ; 10: 338, 2010 Nov 03.
Article in English | MEDLINE | ID: mdl-21047404

ABSTRACT

BACKGROUND: Naturally occurring antisense transcripts (NATs) are non-coding RNAs that may regulate the activity of sense transcripts to which they bind because of complementarity. NATs that are not located in the gene they regulate (trans-NATs) have better chances to evolve than cis-NATs, which is evident when the sense strand of the cis-NAT is part of a protein-coding gene. However, the generation of a trans-NAT requires the formation of a relatively large region of complementarity to the gene it regulates. RESULTS: Pseudogene formation may be one evolutionary mechanism that generates trans-NATs to the parental gene. For example, this could occur if the parental gene is regulated by a cis-NAT that is copied as a trans-NAT in the pseudogene. To support this, we identified human pseudogenes with a trans-NAT to the parental gene in their antisense strand by analysis of the database of expressed sequence tags (ESTs). We found that the mutations that appeared in these trans-NATs after pseudogene formation do not show the flat distribution that would be expected in a non-functional transcript. Instead, we found higher similarity to the parental gene in a region near the 3' end of the trans-NATs. CONCLUSIONS: Our results do not demonstrate a functional relation between the trans-NATs arising from pseudogenes and their respective parental genes, but they add evidence for it and stress the importance of duplication mechanisms of genetic material in the generation of non-coding RNAs. We also provide a plausible explanation for the large transcripts that can be found in the antisense strand of some pseudogenes.


Subjects
Pseudogenes/genetics , RNA, Antisense/genetics , Animals , Databases, Genetic , Evolution, Molecular , Expressed Sequence Tags , Humans
13.
Methods Mol Biol ; 567: 145-54, 2009.
Article in English | MEDLINE | ID: mdl-19588091

ABSTRACT

The simultaneous genotyping of thousands of single nucleotide polymorphisms (SNPs) in a genome using SNP-Arrays is a very important tool that is revolutionizing genetics and molecular biology. We expanded the utility of this technique by using it following chromatin immunoprecipitation (ChIP) to assess the multiple genomic locations protected by a protein complex recognized by an antibody. The power of this technique is illustrated through an analysis of the changes in histone H4 acetylation, a marker of open chromatin and transcriptionally active genomic regions, which occur during differentiation of human myoblasts into myotubes. The findings have been validated by the observation of a significant correlation between the detected histone modifications and the expression of the nearby genes, as measured by DNA expression microarrays. This chapter focuses on the computational analysis of the data.


Subjects
Chromatin Immunoprecipitation/methods , Oligonucleotide Array Sequence Analysis/methods , Polymorphism, Single Nucleotide , Computational Biology/methods , Databases, Genetic , Humans , Internet , Models, Biological , Polymorphism, Single Nucleotide/genetics , Programming Languages , Software
14.
Nucleic Acids Res ; 37(Web Server issue): W141-6, 2009 Jul.
Article in English | MEDLINE | ID: mdl-19429696

ABSTRACT

The biomedical literature is represented by millions of abstracts available in the Medline database. These abstracts can be queried with the PubMed interface, which provides a keyword-based Boolean search engine. This approach shows limitations in the retrieval of abstracts related to very specific topics, as it is difficult for a non-expert user to find all of the most relevant keywords related to a biomedical topic. Additionally, when searching for more general topics, the same approach may return hundreds of unranked references. To address these issues, text mining tools have been developed to help scientists focus on relevant abstracts. We have implemented the MedlineRanker webserver, which allows a flexible ranking of Medline for a topic of interest without expert knowledge. Given some abstracts related to a topic, the program automatically deduces the most discriminative words in comparison to a random selection. These words are used to score other abstracts, including those from recent publications that have not yet been annotated, which can then be ranked by relevance. We show that our tool can be highly accurate and that it is able to process millions of abstracts in a practical amount of time. MedlineRanker is free to use and is available at http://cbdm.mdc-berlin.de/tools/medlineranker.
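The ranking idea described above (learn discriminative words from topic abstracts versus a background set, then score unseen abstracts) can be sketched with smoothed log-odds word weights. This is an illustrative assumption, not the MedlineRanker implementation: the tokenizer, the smoothing and every function name below are hypothetical.

```python
import math
import re


def words(text):
    """Lower-case a text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def word_weights(topic_docs, background_docs):
    """Weight each word by the log ratio of its smoothed document frequency
    in the topic set to that in the background set."""
    topic_sets = [words(d) for d in topic_docs]
    back_sets = [words(d) for d in background_docs]
    vocabulary = set().union(*topic_sets, *back_sets)
    weights = {}
    for w in vocabulary:
        p_topic = (sum(w in s for s in topic_sets) + 1) / (len(topic_sets) + 2)
        p_back = (sum(w in s for s in back_sets) + 1) / (len(back_sets) + 2)
        weights[w] = math.log(p_topic / p_back)
    return weights


def score(text, weights):
    """Sum the weights of the distinct known words found in a text."""
    return sum(weights.get(w, 0.0) for w in words(text))


def rank(candidates, weights):
    """Return candidate abstracts ordered from most to least topic-like."""
    return sorted(candidates, key=lambda t: score(t, weights), reverse=True)
```

Words that appear equally often in both sets receive a weight of zero, so only words that discriminate the topic from the background move an abstract up or down the ranking.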


Subjects
Information Storage and Retrieval/methods , MEDLINE , Software , User-Computer Interface
15.
BMC Res Notes ; 2: 39, 2009 Mar 10.
Article in English | MEDLINE | ID: mdl-19284540

ABSTRACT

BACKGROUND: Currently one of the largest online repositories for human and mouse stem cell gene expression data, StemBase was first designed as a simple web-interface to DNA microarray data generated by the Canadian Stem Cell Network to facilitate the discovery of gene functions relevant to stem cell control and differentiation. FINDINGS: Since its creation, StemBase has grown in both size and scope into a system with analysis tools that examine either the whole database at once, or slices of data, based on tissue type, cell type or gene of interest. As of September 1, 2008, StemBase contains gene expression data (microarray and Serial Analysis of Gene Expression) from 210 stem cell samples in 60 different experiments. CONCLUSION: StemBase can be used to study gene expression in human and murine stem cells and is available at http://www.stembase.ca.

16.
Proc Natl Acad Sci U S A ; 105(51): 20286-90, 2008 Dec 23.
Article in English | MEDLINE | ID: mdl-19095794

ABSTRACT

The properties and biology of mRNA transcripts can be affected profoundly by the choice of alternative polyadenylation sites, making definition of the 3' ends of transcripts essential for understanding their regulation. Here we show that 22-52% of sequences in commonly used human and murine "full-length" transcript databases may not currently end at bona fide polyadenylation sites. To identify probable transcript termini over the entire murine and human genomes, we analyzed the EST databases for positional clustering of EST ends. The analysis yielded 58,282 murine and 86,410 human candidate polyadenylation sites, of which 75% mapped to 23,091 known murine transcripts and 22,891 known human transcripts. The murine dataset correctly predicted 97% of the 3' ends in a manually curated and experimentally supported benchmark transcript set. Of currently known genes, 15% had no associated prediction and 25% had only a single predicted termination site. The remaining genes had an average of 3-4 alternative polyadenylation sites predicted for each murine or human transcript, respectively. The results are made available in the form of tables and an interactive web site that can be mined for rapid assessment of the validity of 3' ends in existing collections, enumeration of potential alternative 3' polyadenylation sites of known transcripts, direct retrieval of terminal sequences for design of probes, and detection of polyadenylation sites not currently mapped to known genes.
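Positional clustering of EST ends can be pictured as grouping mapped 3' end coordinates that lie close together and keeping only the groups supported by several ESTs. The sketch below is a generic single-linkage grouping on one chromosome strand; the `max_gap` and `min_support` thresholds are hypothetical, since the abstract does not give the parameters used by the authors.

```python
def cluster_ends(positions, max_gap=10, min_support=2):
    """Group sorted end coordinates so that consecutive members of a cluster
    are at most `max_gap` bases apart. Returns (first, last, count) for each
    cluster with at least `min_support` members."""
    clusters = []
    current = []
    for pos in sorted(positions):
        # Start a new cluster when the gap to the previous end is too large.
        if current and pos - current[-1] > max_gap:
            clusters.append(current)
            current = []
        current.append(pos)
    if current:
        clusters.append(current)
    return [(c[0], c[-1], len(c)) for c in clusters if len(c) >= min_support]
```

In a genome-wide setting the function would be called once per chromosome and strand, and each retained cluster would be a candidate polyadenylation site.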


Subjects
3' Flanking Region , Cluster Analysis , Expressed Sequence Tags , Animals , Humans , Methods , Mice , Poly A , Polyadenylation
17.
BMC Genomics ; 8: 322, 2007 Sep 14.
Article in English | MEDLINE | ID: mdl-17868463

ABSTRACT

BACKGROUND: SNP microarrays are designed to genotype Single Nucleotide Polymorphisms (SNPs). These microarrays report hybridization of DNA fragments and therefore can be used for the purpose of detecting genomic fragments. RESULTS: Here, we demonstrate that a SNP microarray can be effectively used in this way to perform chromatin immunoprecipitation (ChIP) on chip as an alternative to tiling microarrays. We illustrate this novel application by mapping whole-genome histone H4 hyperacetylation in human myoblasts and myotubes. We detect clusters of hyperacetylated histone H4, often spanning up to 300 kilobases of genomic sequence. Using complementary genome-wide analyses of gene expression by DNA microarray, we demonstrate that these clusters of hyperacetylated histone H4 tend to be associated with expressed genes. CONCLUSION: The use of a SNP array for a ChIP-on-chip application (ChIP on SNP-chip) will be of great value to laboratories interested in determining general rules regarding the relationship of specific chromatin modifications to transcriptional status throughout the genome and in examining the asymmetric modification of chromatin at heterozygous loci.


Subjects
Chromatin Immunoprecipitation , Genome, Human , Histones/metabolism , Polymorphism, Single Nucleotide , Acetylation , Cells, Cultured , Chromosome Mapping , Gene Expression , Humans , Oligonucleotide Array Sequence Analysis
18.
BMC Genomics ; 8: 85, 2007 Mar 29.
Article in English | MEDLINE | ID: mdl-17394647

ABSTRACT

BACKGROUND: Little is known about the genes that drive embryonic stem cell differentiation. However, such knowledge is necessary if we are to exploit the therapeutic potential of stem cells. To uncover the genetic determinants of mouse embryonic stem cell (mESC) differentiation, we have generated and analyzed 11-point time-series of DNA microarray data for three biologically equivalent but genetically distinct mESC lines (R1, J1, and V6.5) undergoing undirected differentiation into embryoid bodies (EBs) over a period of two weeks. RESULTS: We identified the initial 12-hour period as reflecting the early stages of mESC differentiation and studied probe sets showing consistent changes of gene expression in that period. Gene function analysis indicated significant up-regulation of genes related to regulation of transcription and mRNA splicing, and down-regulation of genes related to intracellular signaling. Phylogenetic analysis indicated that the genes showing the largest expression changes were more likely to have originated in metazoans. The probe sets with the most consistent gene changes in the three cell lines represented 24 down-regulated and 12 up-regulated genes, all with closely related human homologues. Whereas some of these genes are known to be involved in embryonic developmental processes (e.g. Klf4, Otx2, Smn1, Socs3, Tagln, Tdgf1), our analysis points to others (such as transcription factor Phf21a, extracellular matrix related Lama1 and Cyr61, or endoplasmic reticulum related Sc4mol and Scd2) that have not been previously related to mESC function. The majority of identified functions were related to transcriptional regulation, intracellular signaling, and cytoskeleton. Genes involved in other cellular functions important in ESC differentiation such as chromatin remodeling and transmembrane receptors were not observed in this set. CONCLUSION: Our analysis profiles for the first time gene expression at a very early stage of mESC differentiation, and identifies a functional and phylogenetic signature for the genes involved. The data generated constitute a valuable resource for further studies. All DNA microarray data used in this study are available in the StemBase database of stem cell gene expression data and in the NCBI's GEO database.


Subjects
Developmental Biology/methods , Embryonic Stem Cells/cytology , Gene Expression Regulation, Developmental , Animals , Cell Differentiation , Cell Line , Gene Expression Profiling , Genetic Markers , Humans , Kruppel-Like Factor 4 , Mice , Oligonucleotide Array Sequence Analysis , Phylogeny , Protein Structure, Tertiary , Time Factors , Transcription Factors/metabolism
19.
Methods Mol Biol ; 407: 137-48, 2007.
Article in English | MEDLINE | ID: mdl-18453254

ABSTRACT

StemBase is a database of gene expression data obtained from stem cells and their derivatives, mainly from mouse and human, using DNA microarrays and Serial Analysis of Gene Expression. Here, we describe this database and indicate ways to use it to study the expression of particular genes in stem cells or to search for genes with particular expression profiles in stem cells, which could be associated with stem cell function or used as stem cell markers.


Subjects
Biomarkers/analysis , Databases, Genetic , Gene Expression Profiling/methods , Gene Expression/physiology , Oligonucleotide Array Sequence Analysis/methods , Stem Cells/physiology , Animals , Humans , Mice
20.
BMC Bioinformatics ; 7: 159, 2006 Mar 20.
Article in English | MEDLINE | ID: mdl-16549014

ABSTRACT

BACKGROUND: The annotations of Affymetrix DNA microarray probe sets with Gene Ontology terms are carefully selected for correctness. This results in very accurate but incomplete annotations, which is not always desirable for microarray experiment evaluation. RESULTS: Here we present a protocol to expand the set of Gene Ontology annotations associated with Affymetrix DNA microarray probe sets using information from related databases. CONCLUSION: Predicted novel annotations and the evidence supporting them can be accessed at Probe2GO: http://www.ogic.ca/p2g. Scripts are available on request.


Subjects
Database Management Systems , Databases, Protein , Documentation/methods , Gene Expression Profiling/methods , Oligonucleotide Array Sequence Analysis/methods , Proteins/classification , Proteins/genetics , Algorithms , DNA Probes/genetics