Search | VHL Regional Portal

1.

The UCSC Genome Browser database: 2016 update.

Speir, Matthew L; Zweig, Ann S; Rosenbloom, Kate R; Raney, Brian J; Paten, Benedict; Nejad, Parisa; Lee, Brian T; Learned, Katrina; Karolchik, Donna; Hinrichs, Angie S; Heitner, Steve; Harte, Rachel A; Haeussler, Maximilian; Guruvadoo, Luvina; Fujita, Pauline A; Eisenhart, Christopher; Diekhans, Mark; Clawson, Hiram; Casper, Jonathan; Barber, Galt P; Haussler, David; Kuhn, Robert M; Kent, W James.

Nucleic Acids Res ; 44(D1): D717-25, 2016 Jan 04.

Article in English | MEDLINE | ID: mdl-26590259

ABSTRACT

For the past 15 years, the UCSC Genome Browser (http://genome.ucsc.edu/) has served the international research community by offering an integrated platform for viewing and analyzing information from a large database of genome assemblies and their associated annotations. The UCSC Genome Browser has been under continuous development since its inception with new data sets and software features added frequently. Some release highlights of this year include new and updated genome browsers for various assemblies, including bonobo and zebrafish; new gene annotation sets; improvements to track and assembly hub support; and a new interactive tool, the "Data Integrator", for intersecting data from multiple tracks. We have greatly expanded the data sets available on the most recent human assembly, hg38/GRCh38, to include updated gene prediction sets from GENCODE, more phenotype- and disease-associated variants from ClinVar and ClinGen, more genomic regulatory data, and a new multiple genome alignment.

Subjects

Databases, Genetic; Genomics; Animals; Disease/genetics; Genes; Genome; Humans; Mice; Molecular Sequence Annotation; Software

2.

The UCSC Genome Browser database: 2015 update.

Rosenbloom, Kate R; Armstrong, Joel; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R; Fujita, Pauline A; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T; Li, Chin H; Miga, Karen H; Nguyen, Ngan; Paten, Benedict; Raney, Brian J; Smit, Arian F A; Speir, Matthew L; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James.

Nucleic Acids Res ; 43(Database issue): D670-81, 2015 Jan.

Article in English | MEDLINE | ID: mdl-25428374

ABSTRACT

Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), 'mined the web' for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled.

Subjects

Databases, Nucleic Acid; Genomics; Animals; Cricetinae; Dogs; Ebolavirus/genetics; Gene Expression; Genome; Internet; Mice; Molecular Sequence Annotation; Phenotype; Rats; Software

3.

Comparative analysis of pseudogenes across three phyla.

Sisu, Cristina; Pei, Baikang; Leng, Jing; Frankish, Adam; Zhang, Yan; Balasubramanian, Suganthi; Harte, Rachel; Wang, Daifeng; Rutenberg-Schoenberg, Michael; Clark, Wyatt; Diekhans, Mark; Rozowsky, Joel; Hubbard, Tim; Harrow, Jennifer; Gerstein, Mark B.

Proc Natl Acad Sci U S A ; 111(37): 13361-6, 2014 Sep 16.

Article in English | MEDLINE | ID: mdl-25157146

ABSTRACT

Pseudogenes are degraded fossil copies of genes. Here, we report a comparison of pseudogenes spanning three phyla, leveraging the completed annotations of the human, worm, and fly genomes, which we make available as an online resource. We find that pseudogenes are lineage specific, much more so than protein-coding genes, reflecting the different remodeling processes marking each organism's genome evolution. The majority of human pseudogenes are processed, resulting from a retrotranspositional burst at the dawn of the primate lineage. This burst can be seen in the largely uniform distribution of pseudogenes across the genome, their preservation in areas with low recombination rates, and their preponderance in highly expressed gene families. In contrast, worm and fly pseudogenes tell a story of numerous duplication events. In worm, these duplications have been preserved through selective sweeps, so we see a large number of pseudogenes associated with highly duplicated families such as chemoreceptors. However, in fly, the large effective population size and high deletion rate resulted in a depletion of the pseudogene complement. Despite large variations between these species, we also find notable similarities. Overall, we identify a broad spectrum of biochemical activity for pseudogenes, with the majority in each organism exhibiting varying degrees of partial activity. In particular, we identify a consistent amount of transcription (~15%) across all species, suggesting a uniform degradation process. Also, we see a uniform decay of pseudogene promoter activity relative to their coding counterparts and identify a number of pseudogenes with conserved upstream sequences and activity, hinting at potential regulatory roles.

Subjects

Caenorhabditis elegans/genetics; Drosophila melanogaster/genetics; Phylogeny; Pseudogenes/genetics; Animals; Evolution, Molecular; Genetic Association Studies; Humans; Molecular Sequence Annotation; Promoter Regions, Genetic/genetics; Sequence Homology, Nucleic Acid

4.

The UCSC Genome Browser database: 2014 update.

Karolchik, Donna; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Cline, Melissa S; Diekhans, Mark; Dreszer, Timothy R; Fujita, Pauline A; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A; Heitner, Steve; Hinrichs, Angie S; Learned, Katrina; Lee, Brian T; Li, Chin H; Raney, Brian J; Rhead, Brooke; Rosenbloom, Kate R; Sloan, Cricket A; Speir, Matthew L; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James.

Nucleic Acids Res ; 42(Database issue): D764-70, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24270787

ABSTRACT

The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a large collection of organisms, primarily vertebrates, with an emphasis on the human and mouse genomes. The Browser's web-based tools provide an integrated environment for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic data sets. As of September 2013, the database contained genomic sequence and a basic set of annotation 'tracks' for ~90 organisms. Significant new annotations include a 60-species multiple alignment conservation track on the mouse, updated UCSC Genes tracks for human and mouse, and several new sets of variation and ENCODE data. New software tools include a Variant Annotation Integrator that returns predicted functional effects of a set of variants uploaded as a custom track, an extension to UCSC Genes that displays haplotype alleles for protein-coding genes and an expansion of data hubs that includes the capability to display remotely hosted user-provided assembly sequence in addition to annotation data. To improve European access, we have added a Genome Browser mirror (http://genome-euro.ucsc.edu) hosted at Bielefeld University in Germany.

Subjects

Databases, Genetic; Genome; Genomics; Alleles; Animals; Genome, Human; Humans; Internet; Mice; Molecular Sequence Annotation; Polymorphism, Single Nucleotide; Sequence Alignment; Software

5.

Current status and new features of the Consensus Coding Sequence database.

Farrell, Catherine M; O'Leary, Nuala A; Harte, Rachel A; Loveland, Jane E; Wilming, Laurens G; Wallin, Craig; Diekhans, Mark; Barrell, Daniel; Searle, Stephen M J; Aken, Bronwen; Hiatt, Susan M; Frankish, Adam; Suner, Marie-Marthe; Rajput, Bhanu; Steward, Charles A; Brown, Garth R; Bennett, Ruth; Murphy, Michael; Wu, Wendy; Kay, Mike P; Hart, Jennifer; Rajan, Jeena; Weber, Janet; Snow, Catherine; Riddick, Lillian D; Hunt, Toby; Webb, David; Thomas, Mark; Tamez, Pamela; Rangwala, Sanjida H; McGarvey, Kelly M; Pujar, Shashikant; Shkeda, Andrei; Mudge, Jonathan M; Gonzalez, Jose M; Gilbert, James G R; Trevanion, Stephen J; Baertsch, Robert; Harrow, Jennifer L; Hubbard, Tim; Ostell, James M; Haussler, David; Pruitt, Kim D.

Nucleic Acids Res ; 42(Database issue): D865-72, 2014 Jan.

Article in English | MEDLINE | ID: mdl-24217909

ABSTRACT

The Consensus Coding Sequence (CCDS) project (http://www.ncbi.nlm.nih.gov/CCDS/) is a collaborative effort to maintain a dataset of protein-coding regions that are identically annotated on the human and mouse reference genome assemblies by the National Center for Biotechnology Information (NCBI) and Ensembl genome annotation pipelines. Identical annotations that pass quality assurance tests are tracked with a stable identifier (CCDS ID). Members of the collaboration, who are from NCBI, the Wellcome Trust Sanger Institute and the University of California Santa Cruz, provide coordinated and continuous review of the dataset to ensure high-quality CCDS representations. We describe here the current status and recent growth in the CCDS dataset, as well as recent changes to the CCDS web and FTP sites. These changes include more explicit reporting about the NCBI and Ensembl annotation releases being compared, new search and display options, the addition of biologically descriptive information and our approach to representing genes for which support evidence is incomplete. We also present a summary of recent and future curation targets.

Subjects

Databases, Genetic; Proteins/genetics; Animals; Exons; Genomics; Humans; Internet; Mice; Molecular Sequence Annotation; Sequence Analysis

6.

ENCODE data in the UCSC Genome Browser: year 5 update.

Rosenbloom, Kate R; Sloan, Cricket A; Malladi, Venkat S; Dreszer, Timothy R; Learned, Katrina; Kirkup, Vanessa M; Wong, Matthew C; Maddren, Morgan; Fang, Ruihua; Heitner, Steven G; Lee, Brian T; Barber, Galt P; Harte, Rachel A; Diekhans, Mark; Long, Jeffrey C; Wilder, Steven P; Zweig, Ann S; Karolchik, Donna; Kuhn, Robert M; Haussler, David; Kent, W James.

Nucleic Acids Res ; 41(Database issue): D56-63, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23193274

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE), http://encodeproject.org, has completed its fifth year of scientific collaboration to create a comprehensive catalog of functional elements in the human genome, and its third year of investigations in the mouse genome. Since the last report in this journal, the ENCODE human data repertoire has grown by 898 new experiments (totaling 2886), accompanied by a major integrative analysis. In the mouse genome, results from 404 new experiments became available this year, increasing the total to 583, collected during the course of the project. The University of California, Santa Cruz, makes this data available on the public Genome Browser http://genome.ucsc.edu for visual browsing and data mining. Download of raw and processed data files is also supported. The ENCODE portal provides specialized tools and information about the ENCODE data sets.

Subjects

Databases, Genetic; Genome, Human; Genomics; Animals; Humans; Internet; Mice; Software

7.

The UCSC Genome Browser database: extensions and updates 2013.

Meyer, Laurence R; Zweig, Ann S; Hinrichs, Angie S; Karolchik, Donna; Kuhn, Robert M; Wong, Matthew; Sloan, Cricket A; Rosenbloom, Kate R; Roe, Greg; Rhead, Brooke; Raney, Brian J; Pohl, Andy; Malladi, Venkat S; Li, Chin H; Lee, Brian T; Learned, Katrina; Kirkup, Vanessa; Hsu, Fan; Heitner, Steve; Harte, Rachel A; Haeussler, Maximilian; Guruvadoo, Luvina; Goldman, Mary; Giardine, Belinda M; Fujita, Pauline A; Dreszer, Timothy R; Diekhans, Mark; Cline, Melissa S; Clawson, Hiram; Barber, Galt P; Haussler, David; Kent, W James.

Nucleic Acids Res ; 41(Database issue): D64-9, 2013 Jan.

Article in English | MEDLINE | ID: mdl-23155063

ABSTRACT

The University of California Santa Cruz (UCSC) Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analysing and sharing both publicly available and user-generated genomic datasets. As of September 2012, genomic sequence and a basic set of annotation 'tracks' are provided for 63 organisms, including 26 mammals, 13 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms, yeast and sea hare. In the past year 19 new genome assemblies have been added, and we anticipate releasing another 28 in early 2013. Further, a large number of annotation tracks have been either added, updated by contributors or remapped to the latest human reference genome. Among these are an updated UCSC Genes track for human and mouse assemblies. We have also introduced several features to improve usability, including new navigation menus. This article provides an update to the UCSC Genome Browser database, which has been previously featured in the Database issue of this journal.

Subjects

Databases, Genetic; Genomics; Animals; Genome, Human; Humans; Internet; Mice; Molecular Sequence Annotation; Software

8.

The GENCODE pseudogene resource.

Pei, Baikang; Sisu, Cristina; Frankish, Adam; Howald, Cédric; Habegger, Lukas; Mu, Xinmeng Jasmine; Harte, Rachel; Balasubramanian, Suganthi; Tanzer, Andrea; Diekhans, Mark; Reymond, Alexandre; Hubbard, Tim J; Harrow, Jennifer; Gerstein, Mark B.

Genome Biol ; 13(9): R51, 2012 Sep 26.

Article in English | MEDLINE | ID: mdl-22951037

ABSTRACT

BACKGROUND: Pseudogenes have long been considered as nonfunctional genomic sequences. However, recent evidence suggests that many of them might have some form of biological activity, and the possibility of functionality has increased interest in their accurate annotation and integration with functional genomics data. RESULTS: As part of the GENCODE annotation of the human genome, we present the first genome-wide pseudogene assignment for protein-coding genes, based on both large-scale manual annotation and in silico pipelines. A key aspect of this coupled approach is that it allows us to identify pseudogenes in an unbiased fashion as well as untangle complex events through manual evaluation. We integrate the pseudogene annotations with the extensive ENCODE functional genomics information. In particular, we determine the expression level, transcription-factor and RNA polymerase II binding, and chromatin marks associated with each pseudogene. Based on their distribution, we develop simple statistical models for each type of activity, which we validate with large-scale RT-PCR-Seq experiments. Finally, we compare our pseudogenes with conservation and variation data from primate alignments and the 1000 Genomes project, producing lists of pseudogenes potentially under selection. CONCLUSIONS: At one extreme, some pseudogenes possess conventional characteristics of functionality; these may represent genes that have recently died. On the other hand, we find interesting patterns of partial activity, which may suggest that dead genes are being resurrected as functioning non-coding RNAs. The activity data of each pseudogene are stored in an associated resource, psiDR, which will be useful for the initial identification of potentially functional pseudogenes.

Subjects

Genome, Human; Pseudogenes; Transcription, Genetic; Animals; Binding Sites; Chromatin/chemistry; Chromatin/genetics; Humans; Models, Genetic; Models, Statistical; Molecular Sequence Annotation; Phylogeny; Primates; RNA Polymerase II/metabolism; Regulatory Sequences, Nucleic Acid; Selection, Genetic; Sequence Analysis, DNA; Transcription Factors/metabolism

9.

GENCODE: the reference human genome annotation for The ENCODE Project.

Harrow, Jennifer; Frankish, Adam; Gonzalez, Jose M; Tapanari, Electra; Diekhans, Mark; Kokocinski, Felix; Aken, Bronwen L; Barrell, Daniel; Zadissa, Amonida; Searle, Stephen; Barnes, If; Bignell, Alexandra; Boychenko, Veronika; Hunt, Toby; Kay, Mike; Mukherjee, Gaurab; Rajan, Jeena; Despacio-Reyes, Gloria; Saunders, Gary; Steward, Charles; Harte, Rachel; Lin, Michael; Howald, Cédric; Tanzer, Andrea; Derrien, Thomas; Chrast, Jacqueline; Walters, Nathalie; Balasubramanian, Suganthi; Pei, Baikang; Tress, Michael; Rodriguez, Jose Manuel; Ezkurdia, Iakes; van Baren, Jeltje; Brent, Michael; Haussler, David; Kellis, Manolis; Valencia, Alfonso; Reymond, Alexandre; Gerstein, Mark; Guigó, Roderic; Hubbard, Tim J.

Genome Res ; 22(9): 1760-74, 2012 Sep.

Article in English | MEDLINE | ID: mdl-22955987

ABSTRACT

The GENCODE Consortium aims to identify all gene features in the human genome using a combination of computational analysis, manual annotation, and experimental validation. Since the first public release of this annotation data set, few new protein-coding loci have been added, yet the number of alternative splicing transcripts annotated has steadily increased. The GENCODE 7 release contains 20,687 protein-coding and 9640 long noncoding RNA loci and has 33,977 coding transcripts not represented in UCSC genes and RefSeq. It also has the most comprehensive annotation of long noncoding RNA (lncRNA) loci publicly available with the predominant transcript form consisting of two exons. We have examined the completeness of the transcript annotation and found that 35% of transcriptional start sites are supported by CAGE clusters and 62% of protein-coding genes have annotated polyA sites. Over one-third of GENCODE protein-coding genes are supported by peptide hits derived from mass spectrometry spectra submitted to Peptide Atlas. New models derived from the Illumina Body Map 2.0 RNA-seq data identify 3689 new loci not currently in GENCODE, of which 3127 consist of two exon models indicating that they are possibly unannotated long noncoding loci. GENCODE 7 is publicly available from gencodegenes.org and via the Ensembl and UCSC Genome Browsers.

Subjects

Databases, Genetic; Genome, Human; Genomics/methods; Molecular Sequence Annotation; Animals; Computational Biology/methods; DNA, Complementary/chemistry; DNA, Complementary/genetics; Evolution, Molecular; Exons; Genetic Loci; Humans; Internet; Models, Molecular; Open Reading Frames; Pseudogenes; Quality Control; RNA Splice Sites; RNA, Long Noncoding; Reproducibility of Results; Untranslated Regions

10.

Tracking and coordinating an international curation effort for the CCDS Project.

Harte, Rachel A; Farrell, Catherine M; Loveland, Jane E; Suner, Marie-Marthe; Wilming, Laurens; Aken, Bronwen; Barrell, Daniel; Frankish, Adam; Wallin, Craig; Searle, Steve; Diekhans, Mark; Harrow, Jennifer; Pruitt, Kim D.

Database (Oxford) ; 2012: bas008, 2012.

Article in English | MEDLINE | ID: mdl-22434842

ABSTRACT

The Consensus Coding Sequence (CCDS) collaboration involves curators at multiple centers with a goal of producing a conservative set of high quality, protein-coding region annotations for the human and mouse reference genome assemblies. The CCDS data set reflects a 'gold standard' definition of best supported protein annotations, and corresponding genes, which pass a standard series of quality assurance checks and are supported by manual curation. This data set supports use of genome annotation information by human and mouse researchers for effective experimental design, analysis and interpretation. The CCDS project consists of analysis of automated whole-genome annotation builds to identify identical CDS annotations, quality assurance testing and manual curation support. Identical CDS annotations are tracked with a CCDS identifier (ID) and any future change to the annotated CDS structure must be agreed upon by the collaborating members. CCDS curation guidelines were developed to address some aspects of curation in order to improve initial annotation consistency and to reduce time spent in discussing proposed annotation updates. Here, we present the current status of the CCDS database and details on our procedures to track and coordinate our efforts. We also present the relevant background and reasoning behind the curation standards that we have developed for CCDS database treatment of transcripts that are nonsense-mediated decay (NMD) candidates, for transcripts containing upstream open reading frames, for identifying the most likely translation start codons and for the annotation of readthrough transcripts. Examples are provided to illustrate the application of these guidelines. DATABASE URL: http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi.

Subjects

Consensus Sequence; Database Management Systems; Databases, Genetic; Genomics/methods; Molecular Sequence Annotation/methods; Animals; Humans; Mice

11.

The UCSC Genome Browser database: extensions and updates 2011.

Dreszer, Timothy R; Karolchik, Donna; Zweig, Ann S; Hinrichs, Angie S; Raney, Brian J; Kuhn, Robert M; Meyer, Laurence R; Wong, Mathew; Sloan, Cricket A; Rosenbloom, Kate R; Roe, Greg; Rhead, Brooke; Pohl, Andy; Malladi, Venkat S; Li, Chin H; Learned, Katrina; Kirkup, Vanessa; Hsu, Fan; Harte, Rachel A; Guruvadoo, Luvina; Goldman, Mary; Giardine, Belinda M; Fujita, Pauline A; Diekhans, Mark; Cline, Melissa S; Clawson, Hiram; Barber, Galt P; Haussler, David; Kent, W James.

Nucleic Acids Res ; 40(Database issue): D918-23, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22086951

ABSTRACT

The University of California Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online public access to a growing database of genomic sequence and annotations for a wide variety of organisms. The Browser is an integrated tool set for visualizing, comparing, analyzing and sharing both publicly available and user-generated genomic data sets. In the past year, the local database has been updated with four new species assemblies, and we anticipate another four will be released by the end of 2011. Further, a large number of annotation tracks have been either added, updated by contributors, or remapped to the latest human reference genome. Among these are new phenotype and disease annotations, UCSC genes, and a major dbSNP update, which required new visualization methods. Growing beyond the local database, this year we have introduced 'track data hubs', which allow the Genome Browser to provide access to remotely located sets of annotations. This feature is designed to significantly extend the number and variety of annotation tracks that are publicly available for visualization and analysis from within our site. We have also introduced several usability features including track search and a context-sensitive menu of options available with a right-click anywhere on the Browser's image.

Subjects

Databases, Nucleic Acid; Genome; Animals; Disease/genetics; Genome, Human; Genomics; Humans; Internet; Molecular Sequence Annotation; Phenotype

12.

ENCODE whole-genome data in the UCSC Genome Browser: update 2012.

Rosenbloom, Kate R; Dreszer, Timothy R; Long, Jeffrey C; Malladi, Venkat S; Sloan, Cricket A; Raney, Brian J; Cline, Melissa S; Karolchik, Donna; Barber, Galt P; Clawson, Hiram; Diekhans, Mark; Fujita, Pauline A; Goldman, Mary; Gravell, Robert C; Harte, Rachel A; Hinrichs, Angie S; Kirkup, Vanessa M; Kuhn, Robert M; Learned, Katrina; Maddren, Morgan; Meyer, Laurence R; Pohl, Andy; Rhead, Brooke; Wong, Matthew C; Zweig, Ann S; Haussler, David; Kent, W James.

Nucleic Acids Res ; 40(Database issue): D912-7, 2012 Jan.

Article in English | MEDLINE | ID: mdl-22075998

ABSTRACT

The Encyclopedia of DNA Elements (ENCODE) Consortium is entering its 5th year of production-level effort generating high-quality whole-genome functional annotations of the human genome. The past year has brought the ENCODE compendium of functional elements to critical mass, with a diverse set of 27 biochemical assays now covering 200 distinct human cell types. Within the mouse genome, which has been under study by ENCODE groups for the past 2 years, 37 cell types have been assayed. Over 2000 individual experiments have been completed and submitted to the Data Coordination Center for public use. UCSC makes this data available on the quality-reviewed public Genome Browser (http://genome.ucsc.edu) and on an early-access Preview Browser (http://genome-preview.ucsc.edu). Visual browsing, data mining and download of raw and processed data files are all supported. An ENCODE portal (http://encodeproject.org) provides specialized tools and information about the ENCODE data sets.

Subjects

Databases, Nucleic Acid; Genome, Human; Genome; Mice/genetics; Animals; Humans; Internet; Molecular Sequence Annotation; Software

13.

Gene inactivation and its implications for annotation in the era of personal genomics.

Balasubramanian, Suganthi; Habegger, Lukas; Frankish, Adam; MacArthur, Daniel G; Harte, Rachel; Tyler-Smith, Chris; Harrow, Jennifer; Gerstein, Mark.

Genes Dev ; 25(1): 1-10, 2011 Jan 01.

Article in English | MEDLINE | ID: mdl-21205862

ABSTRACT

The first wave of personal genomes documents how no single individual genome contains the full complement of functional genes. Here, we describe the extent of variation in gene and pseudogene numbers between individuals arising from inactivation events such as premature termination or aberrant splicing due to single-nucleotide polymorphisms. This highlights the inadequacy of the current reference sequence and gene set. We present a proposal to define a reference gene set that will remain stable as more individuals are sequenced. In particular, we recommend that the ancestral allele be used to define the reference sequence from which a core human reference gene annotation set can be derived. In addition, we call for the development of an expanded gene set to include human-specific genes that have arisen recently and are absent from the ancestral set.

Subjects

Gene Silencing/physiology; Genetic Privacy; Molecular Sequence Annotation; Genetic Privacy/trends; Genetic Variation; Genome, Human/genetics; Humans; Polymorphism, Single Nucleotide

14.

The UCSC Genome Browser database: update 2011.

Fujita, Pauline A; Rhead, Brooke; Zweig, Ann S; Hinrichs, Angie S; Karolchik, Donna; Cline, Melissa S; Goldman, Mary; Barber, Galt P; Clawson, Hiram; Coelho, Antonio; Diekhans, Mark; Dreszer, Timothy R; Giardine, Belinda M; Harte, Rachel A; Hillman-Jackson, Jennifer; Hsu, Fan; Kirkup, Vanessa; Kuhn, Robert M; Learned, Katrina; Li, Chin H; Meyer, Laurence R; Pohl, Andy; Raney, Brian J; Rosenbloom, Kate R; Smith, Kayla E; Haussler, David; Kent, W James.

Nucleic Acids Res ; 39(Database issue): D876-82, 2011 Jan.

Article in English | MEDLINE | ID: mdl-20959295

ABSTRACT

The University of California, Santa Cruz Genome Browser (http://genome.ucsc.edu) offers online access to a database of genomic sequence and annotation data for a wide variety of organisms. The Browser also has many tools for visualizing, comparing and analyzing both publicly available and user-generated genomic data sets, aligning sequences and uploading user data. Among the features released this year are a gene search tool and annotation track drag-reorder functionality as well as support for BAM and BigWig/BigBed file formats. New display enhancements include overlay of multiple wiggle tracks through use of transparent coloring, options for displaying transformed wiggle data, a 'mean+whiskers' windowing function for display of wiggle data at high zoom levels, and more color schemes for microarray data. New data highlights include seven new genome assemblies, a Neandertal genome data portal, phenotype and disease association data, a human RNA editing track, and a zebrafish Conservation track. We also describe updates to existing tracks.

Subjects

Databases, Genetic; Genomics; Animals; Disease/genetics; Genes; Genome, Human; Hominidae/genetics; Humans; Internet; Molecular Sequence Annotation; Phenotype; RNA Editing; Software

15.

The UCSC Genome Browser database: update 2010.

Rhead, Brooke; Karolchik, Donna; Kuhn, Robert M; Hinrichs, Angie S; Zweig, Ann S; Fujita, Pauline A; Diekhans, Mark; Smith, Kayla E; Rosenbloom, Kate R; Raney, Brian J; Pohl, Andy; Pheasant, Michael; Meyer, Laurence R; Learned, Katrina; Hsu, Fan; Hillman-Jackson, Jennifer; Harte, Rachel A; Giardine, Belinda; Dreszer, Timothy R; Clawson, Hiram; Barber, Galt P; Haussler, David; Kent, W James.

Nucleic Acids Res ; 38(Database issue): D613-9, 2010 Jan.

Article in English | MEDLINE | ID: mdl-19906737

ABSTRACT

The University of California, Santa Cruz (UCSC) Genome Browser website (http://genome.ucsc.edu/) provides a large database of publicly available sequence and annotation data along with an integrated tool set for examining and comparing the genomes of organisms, aligning sequence to genomes, and displaying and sharing users' own annotation data. As of September 2009, genomic sequence and a basic set of annotation 'tracks' are provided for 47 organisms, including 14 mammals, 10 non-mammal vertebrates, 3 invertebrate deuterostomes, 13 insects, 6 worms and a yeast. New data highlights this year include an updated human genome browser, a 44-species multiple sequence alignment track, improved variation and phenotype tracks and 16 new genome-wide ENCODE tracks. New features include drag-and-zoom navigation, a Wiki track for user-added annotations, new custom track formats for large datasets (bigBed and bigWig), a new multiple alignment output tool, links to variation and protein structure tools, in silico PCR utility enhancements, and improved track configuration tools.

Subjects

Computational Biology/methods; Databases, Genetic; Databases, Nucleic Acid; Genome; Animals; Computational Biology/trends; Genetic Variation; Genome, Fungal; Genomics; Humans; Information Storage and Retrieval/methods; Internet; Invertebrates; Models, Molecular; Phenotype; Software

16.

The completion of the Mammalian Gene Collection (MGC).

Temple, Gary; Gerhard, Daniela S; Rasooly, Rebekah; Feingold, Elise A; Good, Peter J; Robinson, Cristen; Mandich, Allison; Derge, Jeffrey G; Lewis, Jeanne; Shoaf, Debonny; Collins, Francis S; Jang, Wonhee; Wagner, Lukas; Shenmen, Carolyn M; Misquitta, Leonie; Schaefer, Carl F; Buetow, Kenneth H; Bonner, Tom I; Yankie, Linda; Ward, Ming; Phan, Lon; Astashyn, Alex; Brown, Garth; Farrell, Catherine; Hart, Jennifer; Landrum, Melissa; Maidak, Bonnie L; Murphy, Michael; Murphy, Terence; Rajput, Bhanu; Riddick, Lillian; Webb, David; Weber, Janet; Wu, Wendy; Pruitt, Kim D; Maglott, Donna; Siepel, Adam; Brejova, Brona; Diekhans, Mark; Harte, Rachel; Baertsch, Robert; Kent, Jim; Haussler, David; Brent, Michael; Langton, Laura; Comstock, Charles L G; Stevens, Michael; Wei, Chaochun; van Baren, Marijke J; Salehi-Ashtiani, Kourosh.

Genome Res ; 19(12): 2324-33, 2009 Dec.

Article in English | MEDLINE | ID: mdl-19767417

ABSTRACT

Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.

Subjects

Molecular Cloning/methods , Computational Biology/methods , Complementary DNA/genetics , Gene Library , Genes/genetics , Mammals/genetics , Animals , DNA/biosynthesis , Humans , Mice , National Institutes of Health (U.S.) , Rats , Reverse Transcriptase Polymerase Chain Reaction , United States

17.

The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes.

Pruitt, Kim D; Harrow, Jennifer; Harte, Rachel A; Wallin, Craig; Diekhans, Mark; Maglott, Donna R; Searle, Steve; Farrell, Catherine M; Loveland, Jane E; Ruef, Barbara J; Hart, Elizabeth; Suner, Marie-Marthe; Landrum, Melissa J; Aken, Bronwen; Ayling, Sarah; Baertsch, Robert; Fernandez-Banet, Julio; Cherry, Joshua L; Curwen, Val; Dicuccio, Michael; Kellis, Manolis; Lee, Jennifer; Lin, Michael F; Schuster, Michael; Shkeda, Andrew; Amid, Clara; Brown, Garth; Dukhanina, Oksana; Frankish, Adam; Hart, Jennifer; Maidak, Bonnie L; Mudge, Jonathan; Murphy, Michael R; Murphy, Terence; Rajan, Jeena; Rajput, Bhanu; Riddick, Lillian D; Snow, Catherine; Steward, Charles; Webb, David; Weber, Janet A; Wilming, Laurens; Wu, Wenyu; Birney, Ewan; Haussler, David; Hubbard, Tim; Ostell, James; Durbin, Richard; Lipman, David.

Genome Res ; 19(7): 1316-23, 2009 Jul.

Article in English | MEDLINE | ID: mdl-19498102

ABSTRACT

Effective use of the human and mouse genomes requires reliable identification of genes and their products. Although multiple public resources provide annotation, different methods are used that can result in similar but not identical representation of genes, transcripts, and proteins. The collaborative consensus coding sequence (CCDS) project tracks identical protein annotations on the reference mouse and human genomes with a stable identifier (CCDS ID), and ensures that they are consistently represented on the NCBI, Ensembl, and UCSC Genome Browsers. Importantly, the project coordinates on manually reviewing inconsistent protein annotations between sites, as well as annotations for which new evidence suggests a revision is needed, to progressively converge on a complete protein-coding set for the human and mouse reference genomes, while maintaining a high standard of reliability and biological accuracy. To date, the project has identified 20,159 human and 17,707 mouse consensus coding regions from 17,052 human and 16,893 mouse genes. Three evaluation methods indicate that the entries in the CCDS set are highly likely to represent real proteins, more so than annotations from contributing groups not included in CCDS. The CCDS database thus centralizes the function of identifying well-supported, identically-annotated, protein-coding regions.

Subjects

Consensus Sequence , Genome , Open Reading Frames/genetics , Animals , Humans , Mice , Sequence Alignment

18.

The ENCODE Project at UC Santa Cruz.

Thomas, Daryl J; Rosenbloom, Kate R; Clawson, Hiram; Hinrichs, Angie S; Trumbower, Heather; Raney, Brian J; Karolchik, Donna; Barber, Galt P; Harte, Rachel A; Hillman-Jackson, Jennifer; Kuhn, Robert M; Rhead, Brooke L; Smith, Kayla E; Thakkapallayil, Archana; Zweig, Ann S; Haussler, David; Kent, W James.

Nucleic Acids Res ; 35(Database issue): D663-7, 2007 Jan.

Article in English | MEDLINE | ID: mdl-17166863

ABSTRACT

The goal of the Encyclopedia Of DNA Elements (ENCODE) Project is to identify all functional elements in the human genome. The pilot phase is for comparison of existing methods and for the development of new methods to rigorously analyze a defined 1% of the human genome sequence. Experimental datasets are focused on the origin of replication, DNase I hypersensitivity, chromatin immunoprecipitation, promoter function, gene structure, pseudogenes, non-protein-coding RNAs, transcribed RNAs, multiple sequence alignment and evolutionarily constrained elements. The ENCODE project at UCSC website (http://genome.ucsc.edu/ENCODE) is the primary portal for the sequence-based data produced as part of the ENCODE project. In the pilot phase of the project, over 30 labs provided experimental results for a total of 56 browser tracks supported by 385 database tables. The site provides researchers with a number of tools that allow them to visualize and analyze the data as well as download data for local analyses. This paper describes the portal to the data, highlights the data that has been made available, and presents the tools that have been developed within the ENCODE project. Access to the data and types of interactive analysis that are possible are illustrated through supplemental examples.

Subjects

Nucleic Acid Databases , Human Genome , Genomics , Base Sequence , Humans , Internet , Sequence Alignment , Software , User-Computer Interface

19.

Piloting the zebrafish genome browser.

DiBiase, Anthony; Harte, Rachel A; Zhou, Yi; Zon, Leonard; Kent, W James.

Dev Dyn ; 235(3): 747-53, 2006 Mar.

Article in English | MEDLINE | ID: mdl-16372332

ABSTRACT

This correspondence is a primer for the zebrafish research community on zebrafish tracks available in the UCSC Genome Browser at http://genome.ucsc.edu based on Sanger's Zv4 assembly. A primary capability of this facility is comparative informatics between humans (as well as many other model organisms) and zebrafish. The zebrafish genome sequencing project has played important roles in mutant mapping and cloning, and comparative genomic research projects. This easy-to-use genome browser aims to display and download useful genome sequence information for zebrafish mutant mapping and cloning projects. Its user-friendly interface expedites annotation of the zebrafish genome sequence.

Subjects

Computational Biology , Genetic Databases , Genome , Software , Zebrafish/genetics , Animals , Genomics , Humans , Mice , DNA Sequence Analysis , Protein Sequence Analysis

20.

Expression analysis of secreted and cell surface genes of five transformed human cell lines and derivative xenograft tumors.

Stull, Robert A; Tavassoli, Roya; Kennedy, Scot; Osborn, Steve; Harte, Rachel; Lu, Yan; Napier, Cheryl; Abo, Arie; Chin, Daniel J.

BMC Genomics ; 6: 55, 2005 Apr 18.

Article in English | MEDLINE | ID: mdl-15836779

ABSTRACT

BACKGROUND: Since the early stages of tumorigenesis involve adhesion, escape from immune surveillance, vascularization and angiogenesis, we devised a strategy to study the expression profiles of all publicly known and putative secreted and cell surface genes. We designed a custom oligonucleotide microarray containing probes for 3531 secreted and cell surface genes to study 5 diverse human transformed cell lines and their derivative xenograft tumors. The origins of these human cell lines were lung (A549), breast (MDA MB-231), colon (HCT-116), ovarian (SK-OV-3) and prostate (PC3) carcinomas. RESULTS: Three different analyses were performed: (1) A PCA-based linear discriminant analysis identified a 54 gene profile characteristic of all tumors, (2) Application of MANOVA (Pcorr < .05) to tumor data revealed a larger set of 149 differentially expressed genes. (3) After MANOVA was performed on data from individual tumors, a comparison of differential genes amongst all tumor types revealed 12 common differential genes. Seven of the 12 genes were identified by all three analytical methods. These included late angiogenic, morphogenic and extracellular matrix genes such as ANGPTL4, COL1A1, GP2, GPR57, LAMB3, PCDHB9 and PTGER3. The differential expression of ANGPTL4 and COL1A1 and other genes was confirmed by quantitative PCR. CONCLUSION: Overall, a comparison of the three analyses revealed an expression pattern indicative of late angiogenic processes. These results show that a xenograft model using multiple cell lines of diverse tissue origin can identify common tumorigenic cell surface or secreted molecules that may be important biomarker and therapeutic discoveries.

Subjects

Tumor Biomarkers/genetics , Cell Membrane/metabolism , Gene Expression Profiling/methods , Neoplastic Gene Expression Regulation , Membrane Proteins/chemistry , Pathologic Neovascularization , Analysis of Variance , Animals , Transformed Cell Line , Tumor Cell Line , Complementary DNA/metabolism , Female , Genetic Markers , Genetic Techniques , Genomics/methods , Humans , Male , Membrane Proteins/genetics , Mice , Inbred BALB C Mice , Multivariate Analysis , Neoplasm Transplantation , Nucleic Acid Hybridization , Oligonucleotide Array Sequence Analysis , Polymerase Chain Reaction , Principal Component Analysis , RNA/metabolism , Signal Transduction
