Results 1 - 11 of 11
1.
Proc Natl Acad Sci U S A ; 111(17): 6131-8, 2014 Apr 29.
Article in English | MEDLINE | ID: mdl-24753594

ABSTRACT

With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.


Subjects
DNA/genetics; Genome, Human/genetics; Biological Evolution; Disease/genetics; Humans; Regulatory Sequences, Nucleic Acid/genetics; Software
2.
Genome Res ; 19(12): 2324-33, 2009 Dec.
Article in English | MEDLINE | ID: mdl-19767417

ABSTRACT

Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.


Subjects
Cloning, Molecular/methods; Computational Biology/methods; DNA, Complementary/genetics; Gene Library; Genes/genetics; Mammals/genetics; Animals; DNA/biosynthesis; Humans; Mice; National Institutes of Health (U.S.); Rats; Reverse Transcriptase Polymerase Chain Reaction; United States
4.
PLoS Genet ; 2(10): e168, 2006 Oct 13.
Article in English | MEDLINE | ID: mdl-17040131

ABSTRACT

Comparative genomics allows us to search the human genome for segments that were extensively changed in the last approximately 5 million years since divergence from our common ancestor with chimpanzee, but are highly conserved in other species and thus are likely to be functional. We found 202 genomic elements that are highly conserved in vertebrates but show evidence of significantly accelerated substitution rates in human. These are mostly in non-coding DNA, often near genes associated with transcription and DNA binding. Resequencing confirmed that the five most accelerated elements are dramatically changed in human but not in other primates, with seven times more substitutions in human than in chimp. The accelerated elements, and in particular the top five, show a strong bias for adenine and thymine to guanine and cytosine nucleotide changes and are disproportionately located in high recombination and high guanine and cytosine content environments near telomeres, suggesting either biased gene conversion or isochore selection. In addition, there is some evidence of directional selection in the regions containing the two most accelerated elements. A combination of evolutionary forces has contributed to accelerated evolution of the fastest evolving elements in the human genome.


Subjects
Evolution, Molecular; Genome, Human/genetics; Selection, Genetic; Animals; Base Pairing; Base Sequence; Conserved Sequence; Humans; Molecular Sequence Data; Recombination, Genetic; Regulatory Elements, Transcriptional/genetics; Sequence Analysis, DNA; Species Specificity
5.
Methods Mol Biol ; 422: 133-44, 2008.
Article in English | MEDLINE | ID: mdl-18629665

ABSTRACT

The UC Santa Cruz Genome Browser provides a number of resources that can be used for phylogenomic studies, including (1) whole-genome sequence data from a number of vertebrate species, (2) pairwise alignments of the human genome sequence to a number of other vertebrate genomes, (3) a simultaneous alignment of 17 vertebrate genomes (most of them incompletely sequenced) that covers all of the human sequence, (4) several independent sets of multiple alignments covering 1% of the human genome (ENCODE regions), (5) extensive sequence annotation for interpreting those sequences and alignments, and (6) sequence, alignments, and annotations from certain other species, including an alignment of nine insect genomes. We illustrate the use of these resources in the context of assigning rare genomic changes to the branch of the phylogenetic tree where they appear to have occurred, or of looking for evidence supporting a particular possible tree topology. Sample source code for performing such studies is available.


Subjects
Genome/genetics; Genomics/methods; Internet; Phylogeny; Animals; Chromosome Breakage; Chromosomes; Drosophila/genetics; Humans; Interspersed Repetitive Sequences/genetics; Sequence Alignment
6.
Hum Mutat ; 28(6): 554-62, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17326095

ABSTRACT

PhenCode (Phenotypes for ENCODE; http://www.bx.psu.edu/phencode) is a collaborative, exploratory project to help understand phenotypes of human mutations in the context of sequence and functional data from genome projects. Currently, it connects human phenotype and clinical data in various locus-specific databases (LSDBs) with data on genome sequences, evolutionary history, and function from the ENCODE project and other resources in the UCSC Genome Browser. Initially, we focused on a few selected LSDBs covering genes encoding alpha- and beta-globins (HBA, HBB), phenylalanine hydroxylase (PAH), blood group antigens (various genes), androgen receptor (AR), cystic fibrosis transmembrane conductance regulator (CFTR), and Bruton's tyrosine kinase (BTK), but we plan to include additional loci of clinical importance, ultimately genomewide. We have also imported variant data and associated OMIM links from Swiss-Prot. Users can find interesting mutations in the UCSC Genome Browser (in a new Locus Variants track) and follow links back to the LSDBs for more detailed information. Alternatively, they can start with queries on mutations or phenotypes at an LSDB and then display the results at the Genome Browser to view complementary information such as functional data (e.g., chromatin modifications and protein binding from the ENCODE consortium), evolutionary constraint, regulatory potential, and/or any other tracks they choose. We present several examples illustrating the power of these connections for exploring phenotypes associated with functional elements, and for identifying genomic data that could help to explain clinical phenotypes.


Subjects
Databases, Genetic; Mutation; Phenotype; Agammaglobulinaemia Tyrosine Kinase; Blood Group Antigens/genetics; Cooperative Behavior; Cystic Fibrosis Transmembrane Conductance Regulator/genetics; Databases, Genetic/standards; Genotype; Globins/genetics; Humans; Internet; Phenylalanine Hydroxylase/genetics; Protein-Tyrosine Kinases/genetics; Receptors, Androgen/genetics; Software Design; Systems Integration
7.
PLoS Comput Biol ; 2(4): e33, 2006 Apr.
Article in English | MEDLINE | ID: mdl-16628248

ABSTRACT

The discoveries of microRNAs and riboswitches, among others, have shown functional RNAs to be biologically more important and genomically more prevalent than previously anticipated. We have developed a general comparative genomics method based on phylogenetic stochastic context-free grammars for identifying functional RNAs encoded in the human genome and used it to survey an eight-way genome-wide alignment of the human, chimpanzee, mouse, rat, dog, chicken, zebra-fish, and puffer-fish genomes for deeply conserved functional RNAs. At a loose threshold for acceptance, this search resulted in a set of 48,479 candidate RNA structures. This screen finds a large number of known functional RNAs, including 195 miRNAs, 62 histone 3'UTR stem loops, and various types of known genetic recoding elements. Among the highest-scoring new predictions are 169 new miRNA candidates, as well as new candidate selenocysteine insertion sites, RNA editing hairpins, RNAs involved in transcript auto regulation, and many folds that form singletons or small functional RNA families of completely unknown function. While the rate of false positives in the overall set is difficult to estimate and is likely to be substantial, the results nevertheless provide evidence for many new human functional RNAs and present specific predictions to facilitate their further characterization.


Subjects
Genome, Human; MicroRNAs/chemistry; Nucleic Acid Conformation; Sequence Analysis, RNA/methods; 3' Untranslated Regions; Animals; Chickens; Computational Biology/methods; Conserved Sequence; Dogs; Genome; Humans; Mice; Rats; Tetraodontiformes; Zebrafish
8.
Genome Biol ; 17(1): 148, 2016 Jul 05.
Article in English | MEDLINE | ID: mdl-27380939

ABSTRACT

BACKGROUND: The success of the CRISPR/Cas9 genome editing technique depends on the choice of the guide RNA sequence, which is facilitated by various websites. Despite the importance and popularity of these algorithms, it is unclear to what extent their predictions agree with actual measurements. RESULTS: We conduct the first independent evaluation of CRISPR/Cas9 predictions. To this end, we collect data from eight SpCas9 off-target studies and compare them with the sites predicted by popular algorithms. We identify problems in one implementation but find that sequence-based off-target predictions are very reliable, identifying most off-targets with mutation rates above 0.1%, while the number of false positives can be largely reduced with a cutoff on the off-target score. We also evaluate on-target efficiency prediction algorithms against available datasets. The correlation between the predictions and the guide activity varies considerably, especially for zebrafish. Together with novel data from our labs, we find that the optimal on-target efficiency prediction model strongly depends on whether the guide RNA is expressed from a U6 promoter or transcribed in vitro. We further demonstrate that the best predictions can significantly reduce the time spent on guide screening. CONCLUSIONS: To make these guidelines easily accessible to anyone planning a CRISPR genome editing experiment, we built a new website (http://crispor.org) that predicts off-targets and helps select and clone efficient guide sequences for more than 120 genomes using different Cas9 proteins and the eight efficiency scoring systems evaluated here.


Subjects
CRISPR-Cas Systems/genetics; Gene Editing; RNA, Guide, Kinetoplastida/genetics; Software; Algorithms; Genome; Internet; Promoter Regions, Genetic; RNA, Small Nuclear/genetics
10.
Nat Genet ; 40(5): 523-7, 2008 May.
Article in English | MEDLINE | ID: mdl-18443589

ABSTRACT

It has been four years since the original publication of the draft sequence of the rat genome. Five groups are now working together to assemble, annotate and release an updated version of the rat genome. Because the rat is the prevailing model for physiology, complex disease and pharmacological studies, there is an acute need for its genomic resources to keep pace with its prominence in the laboratory. In this commentary, we describe the current status of the rat genome sequence and the plans for its impending 'upgrade'. We then cover the key online resources providing access to the rat genome, including the new SNP views at Ensembl, the RefSeq and Genes databases at the US National Center for Biotechnology Information, the Genome Browser at the University of California Santa Cruz and the disease portals for cardiovascular disease and obesity at the Rat Genome Database.


Subjects
Databases, Genetic; Genome; Rats/genetics; Animals; Computational Biology; Disease Models, Animal; Genetic Diseases, Inborn/genetics; Genetic Variation; Genomics; Haplotypes; Humans; Internet; Polymorphism, Single Nucleotide; Rats, Mutant Strains; Sequence Analysis, DNA
11.
Genome Res ; 14(10B): 2121-7, 2004 Oct.
Article in English | MEDLINE | ID: mdl-15489334

ABSTRACT

The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.


Subjects
Cloning, Molecular/methods; DNA, Complementary; Gene Library; Open Reading Frames/physiology; Animals; Computational Biology; DNA Primers; DNA, Complementary/genetics; DNA, Complementary/metabolism; Humans; Mice; National Institutes of Health (U.S.); Rats; United States; Xenopus laevis/genetics; Zebrafish/genetics