ABSTRACT
Expression Atlas (http://www.ebi.ac.uk/gxa) is an added value database that provides information about gene and protein expression in different species and contexts, such as tissue, developmental stage, disease or cell type. The available public and controlled access data sets from different sources are curated and re-analysed using standardized, open source pipelines and made available for queries, download and visualization. As of August 2017, Expression Atlas holds data from 3,126 studies across 33 different species, including 731 from plants. Data from large-scale RNA sequencing studies including Blueprint, PCAWG, ENCODE, GTEx and HipSci can be visualized next to each other. In Expression Atlas, users can query genes or gene-sets of interest and explore their expression across or within species, tissues, developmental stages in a constitutive or differential context, representing the effects of diseases, conditions or experimental interventions. All processed data matrices are available for direct download in tab-delimited format or as R-data. In addition to the web interface, data sets can now be searched and downloaded through the Expression Atlas R package. Novel features and visualizations include the on-the-fly analysis of gene set overlaps and the option to view gene co-expression in experiments investigating constitutive gene expression across tissues or other conditions.
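The tab-delimited matrices mentioned in the abstract can be explored with a few lines of Python. The sketch below is illustrative only: the column names "Gene ID" and "Gene Name" and the function names are assumptions made for this example, not a documented Expression Atlas file layout, so adjust them to the header of the file actually downloaded.

```python
import csv
import io


def load_expression_matrix(text, id_column="Gene ID", name_column="Gene Name"):
    """Parse a tab-delimited expression matrix into {gene_id: {condition: value}}.

    ASSUMPTION: the first two columns hold a gene identifier and a gene name,
    and every other column is one condition (tissue, stage, ...). These column
    names are hypothetical and must match the real file header.
    """
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    matrix = {}
    for row in reader:
        gene_id = row[id_column]
        # Keep only condition columns that carry a value; empty cells are skipped.
        matrix[gene_id] = {
            condition: float(value)
            for condition, value in row.items()
            if condition not in (id_column, name_column) and value not in ("", None)
        }
    return matrix


def conditions_above(matrix, gene_id, threshold):
    """Return the sorted conditions in which a gene reaches the given expression level."""
    return sorted(c for c, v in matrix[gene_id].items() if v >= threshold)


# Tiny made-up matrix, used only to show the calls.
example = "Gene ID\tGene Name\tliver\tbrain\nG1\tabc\t12.5\t0.3\nG2\tdef\t\t7\n"
parsed = load_expression_matrix(example)
print(conditions_above(parsed, "G1", 1.0))
```

A dictionary of dictionaries keeps the sketch dependency-free; for full-size matrices a data-frame library would likely be more practical.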
Subjects
Genetic Databases , Animals , Gene Expression Profiling , Humans , Mammals/genetics , Mammals/metabolism , Oligonucleotide Array Sequence Analysis , Plants/genetics , Plants/metabolism , Proteomics , RNA Sequence Analysis , Species Specificity , User-Computer Interface
ABSTRACT
We have designed and developed a data integration and visualization platform that provides evidence about the association of known and potential drug targets with diseases. The platform is designed to support identification and prioritization of biological targets for follow-up. Each drug target is linked to a disease using integrated genome-wide data from a broad range of data sources. The platform provides either a target-centric workflow to identify diseases that may be associated with a specific target, or a disease-centric workflow to identify targets that may be associated with a specific disease. Users can easily transition between these target- and disease-centric workflows. The Open Targets Validation Platform is accessible at https://www.targetvalidation.org.
Subjects
Computational Biology/methods , Molecular Targeted Therapy , Search Engine , Software , Factual Databases , Humans , Molecular Targeted Therapy/methods , Reproducibility of Results , Web Browser , Workflow
ABSTRACT
MOTIVATION: The exponential growth of publicly available RNA-sequencing (RNA-Seq) data poses an increasing challenge to researchers wishing to discover, analyse and store such data, particularly those based in institutions with limited computational resources. EMBL-EBI is in an ideal position to address these challenges and to allow the scientific community easy access to not just raw, but also processed RNA-Seq data. We present a Web service to access the results of a systematically and continually updated standardized alignment as well as gene and exon expression quantification of all public bulk (and in the near future also single-cell) RNA-Seq runs in 264 species in the European Nucleotide Archive (ENA), using Representational State Transfer. RESULTS: The RNASeq-er API (Application Programming Interface) enables ontology-powered search for and retrieval of CRAM, bigwig and bedGraph files, gene and exon expression quantification matrices (Fragments Per Kilobase Of Exon Per Million Fragments Mapped, Transcripts Per Million, raw counts) as well as sample attributes annotated with ontology terms. To date over 270 000 RNA-Seq runs in nearly 10 000 studies (1 PB of raw FASTQ data) in 264 species in ENA have been processed and made available via the API. AVAILABILITY AND IMPLEMENTATION: The RNASeq-er API can be accessed at http://www.ebi.ac.uk/fg/rnaseq/api. The commands used to analyse the data are available in supplementary materials and at https://github.com/nunofonseca/irap/wiki/iRAP-single-library. CONTACT: rnaseq@ebi.ac.uk; rpetry@ebi.ac.uk. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
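As a rough illustration of how a client might use such a REST service, the Python sketch below builds a request URL and filters a JSON response for CRAM file locations. Only the base URL comes from the abstract. The endpoint name `getRunsByStudy`, the path layout, and the field names `MAPPING_QUALITY` and `CRAM_LOCATION` are assumptions made for this example and should be checked against the API documentation before use.

```python
import json
from urllib.parse import quote

# Base URL as given in the abstract.
API_BASE = "http://www.ebi.ac.uk/fg/rnaseq/api"


def build_runs_url(study_accession, response_format="json"):
    """Build a URL listing the runs of one study.

    ASSUMPTION: the '<format>/getRunsByStudy/<accession>' path layout is
    illustrative and not confirmed by the abstract.
    """
    return f"{API_BASE}/{response_format}/getRunsByStudy/{quote(study_accession, safe='')}"


def cram_locations(payload, min_mapping_quality=0):
    """Extract CRAM file locations from a JSON array of run records.

    ASSUMPTION: each record carries 'MAPPING_QUALITY' (a number) and
    'CRAM_LOCATION' (a URL string); both field names are hypothetical.
    """
    records = json.loads(payload)
    return [
        record["CRAM_LOCATION"]
        for record in records
        if record.get("CRAM_LOCATION")
        and record.get("MAPPING_QUALITY", 0) >= min_mapping_quality
    ]


# Made-up response body, used only to show the calls (no network access needed).
example_payload = json.dumps([
    {"RUN_IDS": "R1", "MAPPING_QUALITY": 90, "CRAM_LOCATION": "ftp://example.org/R1.cram"},
    {"RUN_IDS": "R2", "MAPPING_QUALITY": 40, "CRAM_LOCATION": "ftp://example.org/R2.cram"},
])
print(build_runs_url("E-MTAB-513"))
print(cram_locations(example_payload, min_mapping_quality=70))
```

Keeping URL construction separate from response parsing lets the parsing logic be tested offline; the actual HTTP request can be made with any client once the endpoint is confirmed.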
Subjects
Computational Biology/methods , Eukaryota/genetics , RNA Sequence Analysis/methods , Software , Transcriptome , Animals , Genetic Databases , Gene Expression , Gene Ontology , Humans , Internet
ABSTRACT
Gorillas are humans' closest living relatives after chimpanzees, and are of comparable importance for the study of human origins and evolution. Here we present the assembly and analysis of a genome sequence for the western lowland gorilla, and compare the whole genomes of all extant great ape genera. We propose a synthesis of genetic and fossil evidence consistent with placing the human-chimpanzee and human-chimpanzee-gorilla speciation events at approximately 6 and 10 million years ago. In 30% of the genome, gorilla is closer to human or chimpanzee than the latter are to each other; this is rarer around coding genes, indicating pervasive selection throughout great ape evolution, and has functional consequences in gene expression. A comparison of protein coding genes reveals approximately 500 genes showing accelerated evolution on each of the gorilla, human and chimpanzee lineages, and evidence for parallel acceleration, particularly of genes involved in hearing. We also compare the western and eastern gorilla species, estimating an average sequence divergence time 1.75 million years ago, but with evidence for more recent genetic exchange and a population bottleneck in the eastern species. The use of the genome sequence in these and future analyses will promote a deeper understanding of great ape biology and evolution.
Subjects
Molecular Evolution , Genetic Speciation , Genome/genetics , Gorilla gorilla/genetics , Animals , Female , Gene Expression Regulation , Genetic Variation/genetics , Genomics , Humans , Macaca mulatta/genetics , Molecular Sequence Data , Pan troglodytes/genetics , Phylogeny , Pongo/genetics , Proteins/genetics , Sequence Alignment , Species Specificity , Genetic Transcription
ABSTRACT
Expression Atlas (http://www.ebi.ac.uk/gxa) provides information about gene and protein expression in animal and plant samples of different cell types, organism parts, developmental stages, diseases and other conditions. It consists of selected microarray and RNA-sequencing studies from ArrayExpress, which have been manually curated, annotated with ontology terms, checked for high quality and processed using standardised analysis methods. Since the last update, Atlas has grown seven-fold (1572 studies as of August 2015), and incorporates baseline expression profiles of tissues from Human Protein Atlas, GTEx and FANTOM5, and of cancer cell lines from ENCODE, CCLE and Genentech projects. Plant studies constitute a quarter of Atlas data. For genes of interest, the user can view baseline expression in tissues, and differential expression for biologically meaningful pairwise comparisons, estimated using consistent methodology across all of Atlas. Our first proteomics study in human tissues is now displayed alongside transcriptomics data in the same tissues. Novel analyses and visualisations include: 'enrichment' in each differential comparison of GO terms, Reactome, Plant Reactome pathways and InterPro domains; hierarchical clustering (by baseline expression) of most variable genes and experimental conditions; and, for a given gene-condition, distribution of baseline expression across biological replicates.
Subjects
Genetic Databases , Gene Expression Profiling , Plants/metabolism , Proteins/metabolism , Proteomics , Animals , Tumor Cell Line , Humans , Plants/genetics , User-Computer Interface
ABSTRACT
Gramene (http://www.gramene.org) is an online resource for comparative functional genomics in crops and model plant species. Its two main frameworks are genomes (collaboration with Ensembl Plants) and pathways (The Plant Reactome and archival BioCyc databases). Since our last NAR update, the database website adopted a new Drupal management platform. The genomes section features 39 fully assembled reference genomes that are integrated using ontology-based annotation and comparative analyses, and accessed through both visual and programmatic interfaces. Additional community data, such as genetic variation, expression and methylation, are also mapped for a subset of genomes. The Plant Reactome pathway portal (http://plantreactome.gramene.org) provides a reference resource for analyzing plant metabolic and regulatory pathways. In addition to ~200 curated rice reference pathways, the portal hosts gene homology-based pathway projections for 33 plant species. Both the genome and pathway browsers interface with the EMBL-EBI's Expression Atlas to enable the projection of baseline and differential expression data from curated expression studies in plants. Gramene's archive website (http://archive.gramene.org) continues to provide previously reported resources on comparative maps, markers and QTL. To further aid our users, we have also introduced a live monthly educational webinar series and a Gramene YouTube channel carrying video tutorials.
Subjects
Genetic Databases , Plant Genome , Plants/metabolism , Gene Expression , Genetic Variation , Genomics , Internet , Metabolic Networks and Pathways , Molecular Sequence Annotation , Plants/genetics
ABSTRACT
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is an international functional genomics database at the European Bioinformatics Institute (EMBL-EBI) recommended by most journals as a repository for data supporting peer-reviewed publications. It contains data from over 7000 public sequencing and 42,000 array-based studies comprising over 1.5 million assays in total. The proportion of sequencing-based submissions has grown significantly over the last few years and has doubled in the last 18 months, whilst the rate of microarray submissions is growing slightly. All data in ArrayExpress are available in the MAGE-TAB format, which allows robust linking to data analysis and visualization tools and standardized analysis. The main development over the last two years has been the release of a new data submission tool Annotare, which has reduced the average submission time almost 3-fold. In the near future, Annotare will become the only submission route into ArrayExpress, alongside MAGE-TAB format-based pipelines. ArrayExpress is a stable and highly accessed resource. Our future tasks include automation of data flows and further integration with other EMBL-EBI resources for the representation of multi-omics data.
Subjects
Genetic Databases , Gene Expression Profiling , Oligonucleotide Array Sequence Analysis , Genomics , High-Throughput Nucleotide Sequencing , Internet , Software
ABSTRACT
In female mammals, one of the two X chromosomes is transcriptionally silenced to equalize X-linked gene dosage relative to XY males, a process termed X chromosome inactivation. Mechanistically, this is thought to occur via directed recruitment of chromatin modifying factors by the master regulator, X-inactive specific transcript (Xist) RNA, which localizes in cis along the entire length of the chromosome. A well-studied example is the recruitment of polycomb repressive complex 2 (PRC2), for which there is evidence of a direct interaction involving the PRC2 proteins Enhancer of zeste 2 (Ezh2) and Suppressor of zeste 12 (Suz12) and the A-repeat region located at the 5' end of Xist RNA. In this study, we have analyzed Xist-mediated recruitment of PRC2 using two approaches, microarray-based epigenomic mapping and superresolution 3D structured illumination microscopy. Making use of an ES cell line carrying an inducible Xist transgene located on mouse chromosome 17, we show that 24 h after synchronous induction of Xist expression, acquired PRC2 binding sites map predominantly to gene-rich regions, notably within gene bodies. Paradoxically, these new sites of PRC2 deposition do not correlate with Xist-mediated gene silencing. The 3D structured illumination microscopy was performed to assess the relative localization of PRC2 proteins and Xist RNA. Unexpectedly, we observed significant spatial separation and absence of colocalization both in the inducible Xist transgene ES cell line and in normal XX somatic cells. Our observations argue against direct interaction between Xist RNA and PRC2 proteins and, as such, prompt a reappraisal of the mechanism for PRC2 recruitment in X chromosome inactivation.
Subjects
Polycomb-Group Proteins/isolation & purification , Long Noncoding RNA/isolation & purification , RNA/genetics , Animals , Cell Line , Gene Silencing , Mice , Electron Microscopy , Oligonucleotide Array Sequence Analysis , Long Noncoding RNA/genetics , Genetic Transcription
ABSTRACT
The ArrayExpress Archive of Functional Genomics Data (http://www.ebi.ac.uk/arrayexpress) is one of three international functional genomics public data repositories, alongside the Gene Expression Omnibus at NCBI and the DDBJ Omics Archive, supporting peer-reviewed publications. It accepts data generated by sequencing or array-based technologies and currently contains data from almost a million assays, from over 30 000 experiments. The proportion of sequencing-based submissions has grown significantly over the last 2 years and has reached, in 2012, 15% of all new data. All data are available from ArrayExpress in MAGE-TAB format, which allows robust linking to data analysis and visualization tools, including Bioconductor and GenomeSpace. Additionally, R objects, for microarray data, and binary alignment format files, for sequencing data, have been generated for a significant proportion of ArrayExpress data.
Subjects
Genetic Databases , Genomics , Microarray Analysis , Genetic Databases/statistics & numerical data , Genetic Databases/trends , High-Throughput Nucleotide Sequencing , Internet , Software , User-Computer Interface
ABSTRACT
The Ensembl project (http://www.ensembl.org) provides genome resources for chordate genomes with a particular focus on human genome data as well as data for key model organisms such as mouse, rat and zebrafish. Five additional species were added in the last year including gibbon (Nomascus leucogenys) and Tasmanian devil (Sarcophilus harrisii) bringing the total number of supported species to 61 as of Ensembl release 64 (September 2011). Of these, 55 species appear on the main Ensembl website and six species are provided on the Ensembl preview site (Pre!Ensembl; http://pre.ensembl.org) with preliminary support. The past year has also seen improvements across the project.
Subjects
Genetic Databases , Genomics , Animals , Gene Expression Regulation , Genetic Variation , Humans , Mice , Molecular Sequence Annotation , Rats
ABSTRACT
The Ensembl project (http://www.ensembl.org) seeks to enable genomic science by providing high quality, integrated annotation on chordate and selected eukaryotic genomes within a consistent and accessible infrastructure. All supported species include comprehensive, evidence-based gene annotations and a selected set of genomes includes additional data focused on variation, comparative, evolutionary, functional and regulatory annotation. The most advanced resources are provided for key species including human, mouse, rat and zebrafish reflecting the popularity and importance of these species in biomedical research. As of Ensembl release 59 (August 2010), 56 species are supported of which 5 have been added in the past year. Since our previous report, we have substantially improved the presentation and integration of both data of disease relevance and the regulatory state of different cell types.
Subjects
Genetic Databases , Genomics , Animals , Genetic Variation , Humans , Mice , Molecular Sequence Annotation , Rats , Nucleic Acid Regulatory Sequences , Software , Zebrafish/genetics
ABSTRACT
Ensembl (http://www.ensembl.org) integrates genomic information for a comprehensive set of chordate genomes with a particular focus on resources for human, mouse, rat, zebrafish and other high-value sequenced genomes. We provide complete gene annotations for all supported species in addition to specific resources that target genome variation, function and evolution. Ensembl data is accessible in a variety of formats including via our genome browser, API and BioMart. This year marks the tenth anniversary of Ensembl and in that time the project has grown with advances in genome technology. As of release 56 (September 2009), Ensembl supports 51 species including marmoset, pig, zebra finch, lizard, gorilla and wallaby, which were added in the past year. Major additions and improvements to Ensembl since our previous report include the incorporation of the human GRCh37 assembly, enhanced visualisation and data-mining options for the Ensembl regulatory features and continued development of our software infrastructure.
Subjects
Computational Biology/methods , Genetic Databases , Nucleic Acid Databases , Access to Information , Animals , Computational Biology/trends , Protein Databases , Genetic Variation , Genomics/methods , Humans , Information Storage and Retrieval/methods , Internet , Tertiary Protein Structure , Software , Species Specificity
ABSTRACT
In many higher organisms, 5%-15% of histone H2A is ubiquitylated at lysine 119 (uH2A). The function of this modification and the factors involved in its establishment, however, are unknown. Here we demonstrate that uH2A occurs on the inactive X chromosome in female mammals and that this correlates with recruitment of Polycomb group (PcG) proteins belonging to Polycomb repressor complex 1 (PRC1). Based on our observations, we tested the role of the PRC1 protein Ring1B and its closely related homolog Ring1A in H2A ubiquitylation. Analysis of Ring1B null embryonic stem (ES) cells revealed extensive depletion of global uH2A levels. On the inactive X chromosome, uH2A was maintained in Ring1A or Ring1B null cells, but not in double knockout cells, demonstrating an overlapping function for these proteins in development. These observations link H2A ubiquitylation, X inactivation, and PRC1 PcG function, suggesting an unanticipated and novel mechanism for chromatin-mediated heritable gene silencing.
Subjects
Carrier Proteins/metabolism , Genetic Dosage Compensation , Gene Silencing , Histones/metabolism , Ubiquitin/metabolism , rab GTP-Binding Proteins/metabolism , Animals , Monoclonal Antibodies/metabolism , Blastocyst/metabolism , Western Blotting , Carrier Proteins/classification , Carrier Proteins/genetics , Cell Line , Genetic Crosses , Mammalian Embryo/cytology , Female , Fibroblasts/metabolism , Gene Deletion , Gene Targeting , Histones/isolation & purification , Intracellular Signaling Peptides and Proteins , Mice , Inbred C57BL Mice , Inbred CBA Mice , Restriction Mapping , Stem Cells/metabolism , rab GTP-Binding Proteins/classification , rab GTP-Binding Proteins/genetics
ABSTRACT
BACKGROUND: There is accumulating evidence that the milieu of repeat elements and other non-genic sequence features at a given chromosomal locus, here defined as the genome environment, can play an important role in regulating chromosomal processes such as transcription, replication and recombination. The availability of whole-genome sequences has allowed us to annotate the genome environment of any locus in detail. The development of genome wide experimental analyses of gene expression, chromatin modification and chromatin proteins means that it is now possible to identify potential links between chromosomal processes and the underlying genome environment. There is a need for novel bioinformatic tools that facilitate these studies. RESULTS: We developed the Genome Environment Browser (GEB) in order to visualise the integration of experimental data from large scale high throughput analyses with repeat sequence features that define the local genome environment. The browser has incorporated dynamic scales adjustable in real-time, which enables scanning of large regions of the genome as well as detailed investigation of local regions on the same page without the need to load new pages. The interface also accommodates a 2-dimensional display of repetitive features which vary substantially in size, such as LINE-1 repeats. Specific queries for preliminary quantitative analysis of genome features can also be formulated, results of which can be exported for further analysis. CONCLUSION: The Genome Environment Browser is a versatile program which can be easily adapted for displaying all types of genome data with known genomic coordinates. It is currently available at http://web.bioinformatics.ic.ac.uk/geb/.
Subjects
Computational Biology/methods , Genomics/methods , Nucleic Acid Repetitive Sequences/genetics , Software , User-Computer Interface
ABSTRACT
The Ensembl gene annotation system has been used to annotate over 70 different vertebrate species across a wide range of genome projects. Furthermore, it generates the automatic alignment-based annotation for the human and mouse GENCODE gene sets. The system is based on the alignment of biological sequences, including cDNAs, proteins and RNA-seq reads, to the target genome in order to construct candidate transcript models. Careful assessment and filtering of these candidate transcripts ultimately leads to the final gene set, which is made available on the Ensembl website. Here, we describe the annotation process in detail. Database URL: http://www.ensembl.org/index.html.
Subjects
Nucleic Acid Databases , Protein Databases , Internet , Molecular Sequence Annotation/methods , Animals , Humans , Mice
ABSTRACT
The Smchd1 gene encodes a large protein with homology to the SMC family of proteins involved in chromosome condensation and cohesion. Previous studies have found that Smchd1 has an important role in CpG island (CGI) methylation on the inactive X chromosome (Xi) and in stable silencing of some Xi genes. In this study, using genome-wide expression analysis, we showed that Smchd1 is required for the silencing of around 10% of the genes on Xi, apparently independent of CGI hypomethylation, and, moreover, that these genes nonrandomly occur in clusters. Additionally, we found that Smchd1 is required for CpG island methylation and silencing at a cluster of four imprinted genes in the Prader-Willi syndrome (PWS) locus on chromosome 7 and genes from the protocadherin-alpha and -beta clusters. All of the affected autosomal loci display developmentally regulated brain-specific methylation patterns which are lost in Smchd1 homozygous mutants. We discuss the implications of these findings for understanding the function of Smchd1 in epigenetic regulation of gene expression.
Subjects
Non-Histone Chromosomal Proteins/genetics , Genetic Epigenesis , Developmental Gene Expression Regulation , Multigene Family , X Chromosome/genetics , Animals , Cadherins/genetics , Non-Histone Chromosomal Proteins/metabolism , CpG Islands , DNA Methylation , Mammalian Embryo/metabolism , Female , Gene Deletion , Genomic Imprinting , Male , Mice , Prader-Willi Syndrome/genetics , Cell Surface Receptors/genetics
ABSTRACT
X chromosome inactivation involves multiple levels of chromatin modification, established progressively and in a stepwise manner during early development. The chromosomal protein Smchd1 was recently shown to play an important role in DNA methylation of CpG islands (CGIs), a late step in the X inactivation pathway that is required for long-term maintenance of gene silencing. Here we show that inactive X chromosome (Xi) CGI methylation can occur via either Smchd1-dependent or -independent pathways. Smchd1-dependent CGI methylation, the primary pathway, is acquired gradually over an extended period, whereas Smchd1-independent CGI methylation occurs rapidly after the onset of X inactivation. The de novo methyltransferase Dnmt3b is required for methylation of both classes of CGI, whereas Dnmt3a and Dnmt3L are dispensable. Xi CGIs methylated by these distinct pathways differ with respect to their sequence characteristics and immediate chromosomal environment. We discuss the implications of these results for understanding CGI methylation during development.
Subjects
Non-Histone Chromosomal Proteins/metabolism , CpG Islands , DNA Methylation , X Chromosome Inactivation , Alleles , Animals , Cell Line , Non-Histone Chromosomal Proteins/genetics , Mice , Protein Isoforms/genetics , Protein Isoforms/metabolism
ABSTRACT
BACKGROUND: X chromosome inactivation, the mechanism used by mammals to equalise dosage of X-linked genes in XX females relative to XY males, is triggered by chromosome-wide localisation of a cis-acting non-coding RNA, Xist. The mechanism of Xist RNA spreading and Xist-dependent silencing is poorly understood. A large body of evidence indicates that silencing is more efficient on the X chromosome than on autosomes, leading to the idea that the X chromosome has acquired sequences that facilitate propagation of silencing. LINE-1 (L1) repeats are relatively enriched on the X chromosome and have been proposed as candidates for these sequences. To determine the requirements for efficient silencing we have analysed the relationship of chromosome features, including L1 repeats, and the extent of silencing in cell lines carrying inducible Xist transgenes located on one of three different autosomes. RESULTS: Our results show that the organisation of the chromosome into large gene-rich and L1-rich domains is a key determinant of silencing efficiency. Specifically genes located in large gene-rich domains with low L1 density are relatively resistant to Xist-mediated silencing whereas genes located in gene-poor domains with high L1 density are silenced more efficiently. These effects are observed shortly after induction of Xist RNA expression, suggesting that chromosomal domain organisation influences establishment rather than long-term maintenance of silencing. The X chromosome and some autosomes have only small gene-rich L1-depleted domains and we suggest that this could confer the capacity for relatively efficient chromosome-wide silencing. CONCLUSIONS: This study provides insight into the requirements for efficient Xist mediated silencing and specifically identifies organisation of the chromosome into gene-rich L1-depleted and gene-poor L1-dense domains as a major influence on the ability of Xist-mediated silencing to be propagated in a continuous manner in cis.
ABSTRACT
BACKGROUND: X chromosome inactivation is the mechanism used in mammals to achieve dosage compensation of X-linked genes in XX females relative to XY males. Chromosome silencing is triggered in cis by expression of the non-coding RNA Xist. As such, correct regulation of the Xist gene promoter is required to establish appropriate X chromosome activity both in males and females. Studies to date have demonstrated co-transcription of an antisense RNA Tsix and low-level sense transcription prior to onset of X inactivation. The balance of sense and antisense RNA is important in determining the probability that a given Xist allele will be expressed, termed the X inactivation choice, when X inactivation commences. RESULTS: Here we investigate further the mechanism of Xist promoter regulation. We demonstrate that both sense and antisense transcription modulate Xist promoter DNA methylation in undifferentiated embryonic stem (ES) cells, suggesting a possible mechanistic basis for influencing X chromosome choice. Given the involvement of sense and antisense RNAs in promoter methylation, we investigate a possible role for the RNA interference (RNAi) pathway. We show that the Xist promoter is hypomethylated in ES cells deficient for the essential RNAi enzyme Dicer, but that this effect is probably a secondary consequence of reduced levels of de novo DNA methyltransferases in these cells. Consistent with this we find that Dicer-deficient XY and XX embryos show appropriate Xist expression patterns, indicating that Xist gene regulation has not been perturbed. CONCLUSION: We conclude that Xist promoter methylation prior to the onset of random X chromosome inactivation is influenced by relative levels of sense and antisense transcription but that this probably occurs independent of the RNAi pathway. We discuss the implications for this data in terms of understanding Xist gene regulation and X chromosome choice in random X chromosome inactivation.