Results 1 - 12 of 12
2.
Nature ; 586(7831): 683-692, 2020 Oct.
Article in English | MEDLINE | ID: mdl-33116284

ABSTRACT

Starting with the launch of the Human Genome Project three decades ago, and continuing after its completion in 2003, genomics has progressively come to have a central and catalytic role in basic and translational research. In addition, studies increasingly demonstrate how genomic information can be effectively used in clinical care. In the future, the anticipated advances in technology development, biological insights, and clinical applications (among others) will lead to more widespread integration of genomics into almost all areas of biomedical research, the adoption of genomics into mainstream medical and public-health practices, and an increasing relevance of genomics for everyday life. On behalf of the research community, the National Human Genome Research Institute recently completed a multi-year process of strategic engagement to identify future research priorities and opportunities in human genomics, with an emphasis on health applications. Here we describe the highest-priority elements envisioned for the cutting-edge of human genomics going forward, that is, at 'The Forefront of Genomics'.


Subject(s)
Biomedical Research/trends , Genome, Human/genetics , Genomics/trends , Public Health/standards , Translational Research, Biomedical/trends , Biomedical Research/economics , COVID-19/genetics , Genomics/economics , Humans , National Human Genome Research Institute (U.S.)/economics , Social Change , Translational Research, Biomedical/economics , United States
3.
Nature ; 583(7818): 693-698, 2020 Jul.
Article in English | MEDLINE | ID: mdl-32728248

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Project launched in 2003 with the long-term goal of developing a comprehensive map of functional elements in the human genome. These included genes, biochemical regions associated with gene regulation (for example, transcription factor binding sites, open chromatin, and histone marks) and transcript isoforms. The marks serve as sites for candidate cis-regulatory elements (cCREs) that may serve functional roles in regulating gene expression. The project has been extended to model organisms, particularly the mouse. In the third phase of ENCODE, nearly a million and more than 300,000 cCRE annotations have been generated for human and mouse, respectively, and these have provided a valuable resource for the scientific community.


Subject(s)
Databases, Genetic , Genome/genetics , Genomics , Molecular Sequence Annotation , Animals , Binding Sites , Chromatin/genetics , Chromatin/metabolism , DNA Methylation , Databases, Genetic/standards , Databases, Genetic/trends , Gene Expression Regulation/genetics , Genome, Human/genetics , Genomics/standards , Genomics/trends , Histones/metabolism , Humans , Mice , Molecular Sequence Annotation/standards , Quality Control , Regulatory Sequences, Nucleic Acid/genetics , Transcription Factors/metabolism
5.
Nature ; 512(7515): 445-8, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164755

ABSTRACT

The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.


Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Gene Expression Profiling , Transcriptome/genetics , Animals , Caenorhabditis elegans/embryology , Caenorhabditis elegans/growth & development , Chromatin/genetics , Cluster Analysis , Drosophila melanogaster/growth & development , Gene Expression Regulation, Developmental/genetics , Histones/metabolism , Humans , Larva/genetics , Larva/growth & development , Models, Genetic , Molecular Sequence Annotation , Promoter Regions, Genetic/genetics , Pupa/genetics , Pupa/growth & development , RNA, Untranslated/genetics , Sequence Analysis, RNA
6.
Nature ; 512(7515): 453-6, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164757

ABSTRACT

Despite the large evolutionary distances between metazoan species, they can show remarkable commonalities in their biology, and this has helped to establish fly and worm as model organisms for human biology. Although studies of individual elements and factors have explored similarities in gene regulation, a large-scale comparative analysis of basic principles of transcriptional regulatory features is lacking. Here we map the genome-wide binding locations of 165 human, 93 worm and 52 fly transcription regulatory factors, generating a total of 1,019 data sets from diverse cell types, developmental stages, or conditions in the three species, of which 498 (48.9%) are presented here for the first time. We find that structural properties of regulatory networks are remarkably conserved and that orthologous regulatory factor families recognize similar binding motifs in vivo and show some similar co-associations. Our results suggest that gene-regulatory properties previously observed for individual factors are general principles of metazoan regulation that are remarkably well-preserved despite extensive functional divergence of individual network connections. The comparative maps of regulatory circuitry provided here will drive an improved understanding of the regulatory underpinnings of model organism biology and how these relate to human biology, development and disease.


Subject(s)
Caenorhabditis elegans/genetics , Drosophila melanogaster/genetics , Evolution, Molecular , Gene Expression Regulation/genetics , Gene Regulatory Networks/genetics , Transcription Factors/metabolism , Animals , Binding Sites , Caenorhabditis elegans/growth & development , Chromatin Immunoprecipitation , Conserved Sequence/genetics , Drosophila melanogaster/growth & development , Gene Expression Regulation, Developmental/genetics , Genome/genetics , Humans , Molecular Sequence Annotation , Nucleotide Motifs/genetics , Organ Specificity/genetics , Transcription Factors/genetics
7.
Nature ; 512(7515): 449-52, 2014 Aug 28.
Article in English | MEDLINE | ID: mdl-25164756

ABSTRACT

Genome function is dynamically regulated in part by chromatin, which consists of the histones, non-histone proteins and RNA molecules that package DNA. Studies in Caenorhabditis elegans and Drosophila melanogaster have contributed substantially to our understanding of molecular mechanisms of genome function in humans, and have revealed conservation of chromatin components and mechanisms. Nevertheless, the three organisms have markedly different genome sizes, chromosome architecture and gene organization. On human and fly chromosomes, for example, pericentric heterochromatin flanks single centromeres, whereas worm chromosomes have dispersed heterochromatin-like regions enriched in the distal chromosomal 'arms', and centromeres distributed along their lengths. To systematically investigate chromatin organization and associated gene regulation across species, we generated and analysed a large collection of genome-wide chromatin data sets from cell lines and developmental stages in worm, fly and human. Here we present over 800 new data sets from our ENCODE and modENCODE consortia, bringing the total to over 1,400. Comparison of combinatorial patterns of histone modifications, nuclear lamina-associated domains, organization of large-scale topological domains, chromatin environment at promoters and enhancers, nucleosome positioning, and DNA replication patterns reveals many conserved features of chromatin organization among the three organisms. We also find notable differences in the composition and locations of repressive chromatin. These data sets and analyses provide a rich resource for comparative and species-specific investigations of chromatin composition, organization and function.


Subject(s)
Caenorhabditis elegans/cytology , Caenorhabditis elegans/genetics , Chromatin/genetics , Chromatin/metabolism , Drosophila melanogaster/cytology , Drosophila melanogaster/genetics , Animals , Cell Line , Centromere/genetics , Centromere/metabolism , Chromatin/chemistry , Chromatin Assembly and Disassembly/genetics , DNA Replication/genetics , Enhancer Elements, Genetic/genetics , Epigenesis, Genetic , Heterochromatin/chemistry , Heterochromatin/genetics , Heterochromatin/metabolism , Histones/chemistry , Histones/metabolism , Humans , Molecular Sequence Annotation , Nuclear Lamina/metabolism , Nucleosomes/chemistry , Nucleosomes/genetics , Nucleosomes/metabolism , Promoter Regions, Genetic/genetics , Species Specificity
8.
Proc Natl Acad Sci U S A ; 111(17): 6131-8, 2014 Apr 29.
Article in English | MEDLINE | ID: mdl-24753594

ABSTRACT

With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.


Subject(s)
DNA/genetics , Genome, Human/genetics , Biological Evolution , Disease/genetics , Humans , Regulatory Sequences, Nucleic Acid/genetics , Software
9.
Science ; 330(6012): 1775-87, 2010 Dec 24.
Article in English | MEDLINE | ID: mdl-21177976

ABSTRACT

We systematically generated large-scale data sets to improve genome annotation for the nematode Caenorhabditis elegans, a key model organism. These data sets include transcriptome profiling across a developmental time course, genome-wide identification of transcription factor-binding sites, and maps of chromatin organization. From this, we created more complete and accurate gene models, including alternative splice forms and candidate noncoding RNAs. We constructed hierarchical networks of transcription factor-binding and microRNA interactions and discovered chromosomal locations bound by an unusually large number of transcription factors. Different patterns of chromatin composition and histone modification were revealed between chromosome arms and centers, with similarly prominent differences between autosomes and the X chromosome. Integrating data types, we built statistical models relating chromatin, transcription factor binding, and gene expression. Overall, our analyses ascribed putative functions to most of the conserved genome.


Subject(s)
Caenorhabditis elegans/genetics , Chromosomes , Gene Expression Profiling , Gene Expression Regulation , Genome, Helminth , Molecular Sequence Annotation , Animals , Caenorhabditis elegans/growth & development , Caenorhabditis elegans/metabolism , Caenorhabditis elegans Proteins/genetics , Caenorhabditis elegans Proteins/metabolism , Chromatin/genetics , Chromatin/metabolism , Chromatin/ultrastructure , Chromosomes/genetics , Chromosomes/metabolism , Chromosomes/ultrastructure , Computational Biology/methods , Conserved Sequence , Evolution, Molecular , Gene Regulatory Networks , Genes, Helminth , Genomics/methods , Histones/metabolism , Models, Genetic , RNA, Helminth/genetics , RNA, Helminth/metabolism , RNA, Untranslated/genetics , RNA, Untranslated/metabolism , Regulatory Sequences, Nucleic Acid , Transcription Factors/genetics , Transcription Factors/metabolism
10.
Genome Res ; 19(12): 2324-33, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19767417

ABSTRACT

Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.


Subject(s)
Cloning, Molecular/methods , Computational Biology/methods , DNA, Complementary/genetics , Gene Library , Genes/genetics , Mammals/genetics , Animals , DNA/biosynthesis , Humans , Mice , National Institutes of Health (U.S.) , Rats , Reverse Transcriptase Polymerase Chain Reaction , United States
11.
Genome Res ; 14(10B): 2121-7, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489334

ABSTRACT

The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.


Subject(s)
Cloning, Molecular/methods , DNA, Complementary , Gene Library , Open Reading Frames/physiology , Animals , Computational Biology , DNA Primers , DNA, Complementary/genetics , DNA, Complementary/metabolism , Humans , Mice , National Institutes of Health (U.S.) , Rats , United States , Xenopus laevis/genetics , Zebrafish/genetics
12.
Proc Natl Acad Sci U S A ; 99(26): 16899-903, 2002 Dec 24.
Article in English | MEDLINE | ID: mdl-12477932

ABSTRACT

The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).


Subject(s)
DNA, Complementary/chemistry , Sequence Analysis, DNA , Algorithms , Animals , DNA, Complementary/analysis , Gene Library , Humans , Mice , Open Reading Frames