Search | BVS Integralidade em Saúde

The completion of the Mammalian Gene Collection (MGC).

Temple, Gary; Gerhard, Daniela S; Rasooly, Rebekah; Feingold, Elise A; Good, Peter J; Robinson, Cristen; Mandich, Allison; Derge, Jeffrey G; Lewis, Jeanne; Shoaf, Debonny; Collins, Francis S; Jang, Wonhee; Wagner, Lukas; Shenmen, Carolyn M; Misquitta, Leonie; Schaefer, Carl F; Buetow, Kenneth H; Bonner, Tom I; Yankie, Linda; Ward, Ming; Phan, Lon; Astashyn, Alex; Brown, Garth; Farrell, Catherine; Hart, Jennifer; Landrum, Melissa; Maidak, Bonnie L; Murphy, Michael; Murphy, Terence; Rajput, Bhanu; Riddick, Lillian; Webb, David; Weber, Janet; Wu, Wendy; Pruitt, Kim D; Maglott, Donna; Siepel, Adam; Brejova, Brona; Diekhans, Mark; Harte, Rachel; Baertsch, Robert; Kent, Jim; Haussler, David; Brent, Michael; Langton, Laura; Comstock, Charles L G; Stevens, Michael; Wei, Chaochun; van Baren, Marijke J; Salehi-Ashtiani, Kourosh.

Genome Res ; 19(12): 2324-33, 2009 Dec.

Article in English | MEDLINE | ID: mdl-19767417

ABSTRACT

Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.

Subjects

Cloning, Molecular/methods; Computational Biology/methods; DNA, Complementary/genetics; Gene Library; Genes/genetics; Mammals/genetics; Animals; DNA/biosynthesis; Humans; Mice; National Institutes of Health (U.S.); Rats; Reverse Transcriptase Polymerase Chain Reaction; United States

Targeted discovery of novel human exons by comparative genomics.

Siepel, Adam; Diekhans, Mark; Brejová, Brona; Langton, Laura; Stevens, Michael; Comstock, Charles L G; Davis, Colleen; Ewing, Brent; Oommen, Shelly; Lau, Christopher; Yu, Hung-Chun; Li, Jianfeng; Roe, Bruce A; Green, Phil; Gerhard, Daniela S; Temple, Gary; Haussler, David; Brent, Michael R.

Genome Res ; 17(12): 1763-73, 2007 Dec.

Article in English | MEDLINE | ID: mdl-17989246

ABSTRACT

A complete and accurate set of human protein-coding gene annotations is perhaps the single most important resource for genomic research after the human-genome sequence itself, yet the major gene catalogs remain incomplete and imperfect. Here we describe a genome-wide effort, carried out as part of the Mammalian Gene Collection (MGC) project, to identify human genes not yet in the gene catalogs. Our approach was to produce gene predictions by algorithms that rely on comparative sequence data but do not require direct cDNA evidence, then to test predicted novel genes by RT-PCR. We have identified 734 novel gene fragments (NGFs) containing 2188 exons with, at most, weak prior cDNA support. These NGFs correspond to an estimated 563 distinct genes, of which >160 are completely absent from the major gene catalogs, while hundreds of others represent significant extensions of known genes. The NGFs appear to be predominantly protein-coding genes rather than noncoding RNAs, unlike novel transcribed sequences identified by technologies such as tiling arrays and CAGE. They tend to be expressed at low levels and in a tissue-specific manner, and they are enriched for roles in motor activity, cell adhesion, connective tissue, and central nervous system development. Our results demonstrate that many important genes and gene fragments have been missed by traditional approaches to gene discovery but can be identified by their evolutionary signatures using comparative sequence data. However, they suggest that hundreds, not thousands, of protein-coding genes are completely missing from the current gene catalogs.

Subjects

Exons/genetics; Genomics; Animals; Base Sequence; Chickens/genetics; Computational Biology; Expressed Sequence Tags; Genome, Human; Humans; Mice; Predictive Value of Tests; Rats; Reverse Transcriptase Polymerase Chain Reaction; Zebrafish/embryology; Zebrafish/genetics
