2.
Nat Biotechnol ; 30(1): 83-9, 2011 Nov 06.
Article in English | MEDLINE | ID: mdl-22057054

ABSTRACT

Pigeonpea is an important legume food crop grown primarily by smallholder farmers in many semi-arid tropical regions of the world. We used the Illumina next-generation sequencing platform to generate 237.2 Gb of sequence, which along with Sanger-based bacterial artificial chromosome end sequences and a genetic map, we assembled into scaffolds representing 72.7% (605.78 Mb) of the 833.07 Mb pigeonpea genome. Genome analysis predicted 48,680 genes for pigeonpea and also showed the potential role that certain gene families, for example, drought tolerance-related genes, have played throughout the domestication of pigeonpea and the evolution of its ancestors. Although we found a few segmental duplication events, we did not observe the recent genome-wide duplication events observed in soybean. This reference genome sequence will facilitate the identification of the genetic basis of agronomically important traits, and accelerate the development of improved pigeonpea varieties that could improve food security in many developing countries.
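
The assembled fraction quoted in this abstract can be checked with a few lines of arithmetic. The function name below is hypothetical and is used only for illustration; the two sizes are the figures stated in the abstract.

```python
def assembled_fraction(scaffold_mb: float, genome_mb: float) -> float:
    """Return the share of the estimated genome covered by scaffolds, in percent."""
    if genome_mb <= 0:
        raise ValueError("genome size must be positive")
    return 100.0 * scaffold_mb / genome_mb


# Figures from the abstract: 605.78 Mb of scaffolds, 833.07 Mb estimated genome.
pct = assembled_fraction(605.78, 833.07)
print(f"{pct:.1f}%")
```

Rounded to one decimal place, the result agrees with the 72.7% reported in the abstract.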


Subject(s)
Cajanus/genetics , Plant Genes , Plant Genome , DNA Sequence Analysis/methods , Chromosome Mapping , Bacterial Artificial Chromosomes/genetics , Genetic Markers , Molecular Sequence Annotation , Nucleic Acid Repetitive Sequences/genetics , Genomic Segmental Duplications , Glycine max/genetics , Synteny/genetics
3.
Genome Res ; 21(12): 2224-41, 2011 Dec.
Article in English | MEDLINE | ID: mdl-21926179

ABSTRACT

Low-cost short read sequencing technology has revolutionized genomics, though it is only just becoming practical for the high-quality de novo assembly of a novel large genome. We describe the Assemblathon 1 competition, which aimed to comprehensively assess the state of the art in de novo assembly methods when applied to current sequencing technologies. In a collaborative effort, teams were asked to assemble a simulated Illumina HiSeq data set of an unknown, simulated diploid genome. A total of 41 assemblies from 17 different groups were received. Novel haplotype aware assessments of coverage, contiguity, structure, base calling, and copy number were made. We establish that within this benchmark: (1) It is possible to assemble the genome to a high level of coverage and accuracy, and that (2) large differences exist between the assemblies, suggesting room for further improvements in current methods. The simulated benchmark, including the correct answer, the assemblies, and the code that was used to evaluate the assemblies is now public and freely available from http://www.assemblathon.org/.


Subject(s)
Genome/physiology , Genomics/methods , DNA Sequence Analysis/methods
4.
Nat Genet ; 43(3): 228-35, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21336279

ABSTRACT

Genome evolution studies for the phylum Nematoda have been limited by focusing on comparisons involving Caenorhabditis elegans. We report a draft genome sequence of Trichinella spiralis, a food-borne zoonotic parasite, which is the most common cause of human trichinellosis. This parasitic nematode is an extant member of a clade that diverged early in the evolution of the phylum, enabling identification of archetypical genes and molecular signatures exclusive to nematodes. We sequenced the 64-Mb nuclear genome, which is estimated to contain 15,808 protein-coding genes, at ∼35-fold coverage using whole-genome shotgun and hierarchal map-assisted sequencing. Comparative genome analyses support intrachromosomal rearrangements across the phylum, disproportionate numbers of protein family deaths over births in parasitic compared to a non-parasitic nematode and a preponderance of gene-loss and -gain events in nematodes relative to Drosophila melanogaster. This genome sequence and the identified pan-phylum characteristics will contribute to genome evolution studies of Nematoda as well as strategies to combat global parasites of humans, food animals and crops.


Subject(s)
Helminth Genome , Trichinella spiralis/genetics , Animals , Base Sequence , Conserved Sequence , Molecular Evolution , Molecular Sequence Data , Nematoda/genetics , Phylogeny , DNA Sequence Analysis/methods
5.
Nature ; 469(7331): 529-33, 2011 Jan 27.
Article in English | MEDLINE | ID: mdl-21270892

ABSTRACT

'Orang-utan' is derived from a Malay term meaning 'man of the forest' and aptly describes the southeast Asian great apes native to Sumatra and Borneo. The orang-utan species, Pongo abelii (Sumatran) and Pongo pygmaeus (Bornean), are the most phylogenetically distant great apes from humans, thereby providing an informative perspective on hominid evolution. Here we present a Sumatran orang-utan draft genome assembly and short read sequence data from five Sumatran and five Bornean orang-utan genomes. Our analyses reveal that, compared to other primates, the orang-utan genome has many unique features. Structural evolution of the orang-utan genome has proceeded much more slowly than other great apes, evidenced by fewer rearrangements, less segmental duplication, a lower rate of gene family turnover and surprisingly quiescent Alu repeats, which have played a major role in restructuring other primate genomes. We also describe a primate polymorphic neocentromere, found in both Pongo species, emphasizing the gradual evolution of orang-utan genome structure. Orang-utans have extremely low energy usage for a eutherian mammal, far lower than their hominid relatives. Adding their genome to the repertoire of sequenced primates illuminates new signals of positive selection in several pathways including glycolipid metabolism. From the population perspective, both Pongo species are deeply diverse; however, Sumatran individuals possess greater diversity than their Bornean counterparts, and more species-specific variation. Our estimate of Bornean/Sumatran speciation time, 400,000 years ago, is more recent than most previous studies and underscores the complexity of the orang-utan speciation process. Despite a smaller modern census population size, the Sumatran effective population size (N(e)) expanded exponentially relative to the ancestral N(e) after the split, while Bornean N(e) declined over the same period. 
Overall, the resources and analyses presented here offer new opportunities in evolutionary genomics, insights into hominid biology, and an extensive database of variation for conservation efforts.


Subject(s)
Genetic Variation , Genome/genetics , Pongo abelii/genetics , Pongo pygmaeus/genetics , Animals , Centromere/genetics , Cerebrosides/metabolism , Chromosomes , Molecular Evolution , Female , Gene Rearrangement/genetics , Genetic Speciation , Population Genetics , Humans , Male , Phylogeny , Population Density , Population Dynamics , Species Specificity
6.
Genome Res ; 19(3): 470-80, 2009 Mar.
Article in English | MEDLINE | ID: mdl-19204328

ABSTRACT

The majority of nematodes are gonochoristic (dioecious) with distinct male and female sexes, but the best-studied species, Caenorhabditis elegans, is a self-fertile hermaphrodite. The sequencing of the genomes of C. elegans and a second hermaphrodite, C. briggsae, was facilitated in part by the low amount of natural heterozygosity, which typifies selfing species. Ongoing genome projects for gonochoristic Caenorhabditis species seek to approximate this condition by intense inbreeding prior to sequencing. Here we show that despite this inbreeding, the heterozygous fraction of the whole genome shotgun assemblies of three gonochoristic Caenorhabditis species, C. brenneri, C. remanei, and C. japonica, is considerable. We first demonstrate experimentally that independently assembled sequence variants in C. remanei and C. brenneri are allelic. We then present gene-based approaches for recognizing heterozygous regions of WGS assemblies. We also develop a simple method for quantifying heterozygosity that can be applied to assemblies lacking gene annotations. Consistently we find that approximately 10% and 30% of the C. remanei and C. brenneri genomes, respectively, are represented by two alleles in the assemblies. Heterozygosity is restricted to autosomes and its retention is accompanied by substantial inbreeding depression, suggesting that it is caused by multiple recessive deleterious alleles and not merely by chance. Both the overall amount and chromosomal distribution of heterozygous DNA is highly variable between assemblies of close relatives produced by identical methodologies, and allele frequencies have continued to change after strains were sequenced. Our results highlight the impact of mating systems on genome sequencing projects.


Subject(s)
Chromosome Mapping/methods , Genetic Crosses , Genetic Carrier Screening/methods , Helminth Genome , Nematoda/genetics , Alleles , Animals , Disorders of Sex Development/genetics , Female , Heterozygote , Inbreeding , DNA Sequence Analysis , Species Specificity
7.
PLoS Genet ; 4(8): e1000183, 2008 Aug 29.
Article in English | MEDLINE | ID: mdl-18769710

ABSTRACT

The abundance and identity of functional variation segregating in natural populations is paramount to dissecting the molecular basis of quantitative traits as well as human genetic diseases. Genome sequencing of multiple organisms of the same species provides an efficient means of cataloging rearrangements, insertion, or deletion polymorphisms (InDels) and single-nucleotide polymorphisms (SNPs). While inbreeding depression and heterosis imply that a substantial amount of polymorphism is deleterious, distinguishing deleterious from neutral polymorphism remains a significant challenge. To identify deleterious and neutral DNA sequence variation within Saccharomyces cerevisiae, we sequenced the genome of a vineyard and oak tree strain and compared them to a reference genome. Among these three strains, 6% of the genome is variable, mostly attributable to variation in genome content that results from large InDels. Out of the 88,000 polymorphisms identified, 93% are SNPs and a small but significant fraction can be attributed to recent interspecific introgression and ectopic gene conversion. In comparison to the reference genome, there is substantial evidence for functional variation in gene content and structure that results from large InDels, frame-shifts, and polymorphic start and stop codons. Comparison of polymorphism to divergence reveals scant evidence for positive selection but an abundance of evidence for deleterious SNPs. We estimate that 12% of coding and 7% of noncoding SNPs are deleterious. Based on divergence among 11 yeast species, we identified 1,666 nonsynonymous SNPs that disrupt conserved amino acids and 1,863 noncoding SNPs that disrupt conserved noncoding motifs. The deleterious coding SNPs include those known to affect quantitative traits, and a subset of the deleterious noncoding SNPs occurs in the promoters of genes that show allele-specific expression, implying that some cis-regulatory SNPs are deleterious. 
Our results show that the genome sequences of both closely and distantly related species provide a means of identifying deleterious polymorphisms that disrupt functionally conserved coding and noncoding sequences.


Subject(s)
Single Nucleotide Polymorphism , Saccharomyces cerevisiae/genetics , Base Sequence , Binding Sites , Gene Conversion , Gene Rearrangement , Fungal Genome , Molecular Sequence Data , Phylogeny , Quercus/microbiology , Saccharomyces cerevisiae/classification , Saccharomyces cerevisiae/metabolism , Genetic Selection , Sequence Alignment , Transcription Factors/genetics , Transcription Factors/metabolism , Vitis/microbiology
8.
Nat Genet ; 40(10): 1193-8, 2008 Oct.
Article in English | MEDLINE | ID: mdl-18806794

ABSTRACT

Here we present a draft genome sequence of the nematode Pristionchus pacificus, a species that is associated with beetles and is used as a model system in evolutionary biology. With 169 Mb and 23,500 predicted protein-coding genes, the P. pacificus genome is larger than those of Caenorhabditis elegans and the human parasite Brugia malayi. Compared to C. elegans, the P. pacificus genome has more genes encoding cytochrome P450 enzymes, glucosyltransferases, sulfotransferases and ABC transporters, many of which were experimentally validated. The P. pacificus genome contains genes encoding cellulase and diapausin, and cellulase activity is found in P. pacificus secretions, indicating that cellulases can be found in nematodes beyond plant parasites. The relatively higher number of detoxification and degradation enzymes in P. pacificus is consistent with its necromenic lifestyle and might represent a preadaptation for parasitism. Thus, comparative genomics analysis of three ecologically distinct nematodes offers a unique opportunity to investigate the association between genome structure and lifestyle.


Subject(s)
Chromosome Mapping , Beetles/parasitology , Helminth Genes , Helminth Genome , Intestines/parasitology , Nematoda/physiology , Amino Acid Sequence , Animals , Caenorhabditis elegans/genetics , Molecular Evolution , Exons/genetics , Host-Parasite Interactions , Introns/genetics , Molecular Sequence Data , Phylogeny , Amino Acid Sequence Homology
9.
Nature ; 453(7192): 175-83, 2008 May 08.
Article in English | MEDLINE | ID: mdl-18464734

ABSTRACT

We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.


Subject(s)
Molecular Evolution , Genome/genetics , Platypus/genetics , Animals , Base Composition , Dentition , Female , Genomic Imprinting/genetics , Humans , Immunity/genetics , Male , Mammals/genetics , MicroRNAs/genetics , Milk Proteins/genetics , Phylogeny , Platypus/immunology , Platypus/physiology , Odorant Receptors/genetics , Nucleic Acid Repetitive Sequences/genetics , Reptiles/genetics , DNA Sequence Analysis , Spermatozoa/metabolism , Venoms/genetics , Zona Pellucida/metabolism
10.
Mol Biochem Parasitol ; 157(2): 187-92, 2008 Feb.
Article in English | MEDLINE | ID: mdl-18082904

ABSTRACT

Hookworms infect nearly a billion people. The Ancylostoma caninum hookworm of canids is a model for studying human infections and information from its genome coupled with functional genomics and proteomics can accelerate progress towards hookworm control. As a step towards a full-scale A. caninum genome project, we generated 104,000 genome survey sequences (GSSs) and determined the genome size of the canine hookworm. GSSs assembled into 57.6 Mb of unique sequence from a genome that we estimate by flow cytometry of isolated nuclei to be 347 ± 1.2 Mb, substantially larger than other Rhabditina species. Gene finding identified 5538 genes in the GSS assembly, for a total of 9113 non-redundant A. caninum genes when EST sequences are also considered. Functional classifications of many of the 70% of genes with homology to genes in other species are provided based on gene ontology and KEGG associations and secreted and membrane-bound proteins are also identified.


Subject(s)
Ancylostoma/genetics , Helminth Genome , Helminth Proteins/genetics , Animals , Expressed Sequence Tags , Genomics , Helminth Proteins/physiology , DNA Sequence Analysis , Sequence Homology
11.
Genome Res ; 16(6): 768-75, 2006 Jun.
Article in English | MEDLINE | ID: mdl-16741162

ABSTRACT

We describe a targeted approach to improve the contiguity of whole-genome shotgun sequence (WGS) assemblies at run-time, using information from Bacterial Artificial Chromosome (BAC)-based physical maps. Clone sizes and overlaps derived from clone fingerprints are used for the calculation of length constraints between any two BAC neighbors sharing 40% of their size. These constraints are used to promote the linkage and guide the arrangement of sequence contigs within a sequence scaffold at the layout phase of WGS assemblies. This process is facilitated by FASSI, a stand-alone application that calculates BAC end and BAC overlap length constraints from clone fingerprint map contigs created by the FPC package. FASSI is designed to work with the assembly tool PCAP, but its output can be formatted to work with other WGS assembly algorithms able to use length constraints for individual clones. The FASSI method is simple to implement, potentially cost-effective, and has resulted in the increase of scaffold contiguity for both the Drosophila melanogaster and Cryptococcus gattii genomes when compared to a control assembly without map-derived constraints. A 6.5-fold coverage draft DNA sequence of the Pan troglodytes (chimpanzee) genome was assembled using map-derived constraints and resulted in a 26.1% increase in scaffold contiguity.


Subject(s)
Cryptococcus/genetics , Drosophila melanogaster/genetics , Genome , Pan troglodytes/genetics , Physical Chromosome Mapping , DNA Sequence Analysis/methods , Animals , Bacterial Artificial Chromosomes/genetics , Nucleic Acid Databases , Software
12.
Nucleic Acids Res ; 34(1): 201-5, 2006.
Article in English | MEDLINE | ID: mdl-16397298

ABSTRACT

We introduce a data structure called a superword array for finding quickly matches between DNA sequences. The superword array possesses some desirable features of the lookup table and suffix array. We describe simple algorithms for constructing and using a superword array to find pairs of sequences that share a unique superword. The algorithms are implemented in a genome assembly program called PCAP.REP for computation of overlaps between reads. Experimental results produced by PCAP.REP and PCAP on a whole-genome dataset show that PCAP.REP produced a more accurate and contiguous assembly than PCAP.
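
The general idea of a sorted word index for overlap detection can be sketched as follows. This is a simplified stand-in, not the superword array of the paper or the PCAP.REP implementation: it indexes single fixed-length words rather than superwords, and it assumes that a "unique" word means one that occurs in exactly two reads. The function names and the parameter `k` are hypothetical.

```python
from itertools import groupby
from typing import List, Set, Tuple


def build_word_array(reads: List[str], k: int) -> List[Tuple[str, int, int]]:
    """Collect every length-k word as (word, read_id, offset) and sort by word.

    Sorting places identical words next to each other, as in a suffix array,
    while the fixed word length keeps construction as simple as a lookup table.
    """
    entries = [
        (read[i:i + k], rid, i)
        for rid, read in enumerate(reads)
        for i in range(len(read) - k + 1)
    ]
    entries.sort()
    return entries


def pairs_sharing_unique_word(reads: List[str], k: int) -> Set[Tuple[int, int]]:
    """Return read-id pairs that share a word found in exactly two reads.

    Assumption for this sketch: words seen in more than two reads are treated
    as repetitive and ignored.
    """
    pairs: Set[Tuple[int, int]] = set()
    for _word, run in groupby(build_word_array(reads, k), key=lambda e: e[0]):
        read_ids = sorted({rid for _, rid, _ in run})
        if len(read_ids) == 2:
            pairs.add((read_ids[0], read_ids[1]))
    return pairs


reads = ["ACGTAC", "GTACGG", "TTTTTT"]
print(pairs_sharing_unique_word(reads, k=4))
```

In this toy input the word GTAC is the only one shared by two different reads, so reads 0 and 1 are reported as a candidate overlap.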


Subject(s)
Algorithms , Genomics/methods , Computational Biology/methods , Fungal Genome , Histoplasma/genetics
13.
Curr Protoc Bioinformatics ; Chapter 11: Unit11.3, 2005 Oct.
Article in English | MEDLINE | ID: mdl-18428744

ABSTRACT

This unit describes how to use the Parallel Contig Assembly Program (PCAP) to assemble the data produced by a whole-genome shotgun sequencing project. We present a basic protocol for using PCAP on a multiprocessor computer in a 300-Mb genome assembly project. A support protocol to prepare input files for PCAP is also described. Another basic protocol for using PCAP on a distributed cluster of computers in a 3-Gb genome assembly project is presented, in addition to suggestions for understanding results from PCAP.


Subject(s)
Algorithms , Chromosome Mapping/methods , Database Management Systems , Genetic Databases , Information Storage and Retrieval/methods , DNA Sequence Analysis/methods , Software , Base Sequence , Molecular Sequence Data
14.
Nature ; 432(7018): 761-4, 2004 Dec 09.
Article in English | MEDLINE | ID: mdl-15592415

ABSTRACT

Strategies for assembling large, complex genomes have evolved to include a combination of whole-genome shotgun sequencing and hierarchal map-assisted sequencing. Whole-genome maps of all types can aid genome assemblies, generally starting with low-resolution cytogenetic maps and ending with the highest resolution of sequence. Fingerprint clone maps are based upon complete restriction enzyme digests of clones representative of the target genome, and ultimately comprise a near-contiguous path of clones across the genome. Such clone-based maps are used to validate sequence assembly order, supply long-range linking information for assembled sequences, anchor sequences to the genetic map and provide templates for closing gaps. Fingerprint maps are also a critical resource for subsequent functional genomic studies, because they provide a redundant and ordered sampling of the genome with clones. In an accompanying paper we describe the draft genome sequence of the chicken, Gallus gallus, the first species sequenced that is both a model organism and a global food source. Here we present a clone-based physical map of the chicken genome at 20-fold coverage, containing 260 contigs of overlapping clones. This map represents approximately 91% of the chicken genome and enables identification of chicken clones aligned to positions in other sequenced genomes.


Subject(s)
Chickens/genetics , Genome , Genomics , Physical Chromosome Mapping , Animals , Bacterial Artificial Chromosomes/genetics , Molecular Cloning , Contig Mapping , DNA Fingerprinting , Genetic Linkage/genetics , Sequence Tagged Sites
15.
Bioinformatics ; 20(10): 1527-34, 2004 Jul 10.
Article in English | MEDLINE | ID: mdl-14962917

ABSTRACT

MOTIVATION: Investigators utilize gap estimates for DNA sequencing projects. Standard theories assume sequences are independently and identically distributed, leading to appreciable under-prediction of gaps. RESULTS: Using a statistical scaling factor and data from 20 representative whole genome shotgun projects, we construct regression equations that relate coverage to a normalized gap measure. Prokaryotic genomes do not correlate to sequence coverage, while eukaryotes show strong correlation if the chaff is ignored. Gaps decrease at an exponential rate of only about one-third of that predicted via theory alone. Case studies suggest that departure from theory can largely be attributed to assembly difficulties for repeat-rich genomes, but bias and coverage anomalies are also important when repeats are sparse. Such factors cannot be readily characterized a priori, suggesting upper limits on the accuracy of gap prediction. We also find that diminishing coverage probability discussed in other studies is a theoretical artifact that does not arise for the typical project.
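
The "standard theory" referred to here is commonly taken to be the Lander-Waterman model, under which the expected number of contigs (and hence, roughly, of gaps) for N reads at coverage c falls as N·e^(-c) when the minimum detectable overlap is ignored. The sketch below assumes that reading. The `rate_scale` parameter is hypothetical and only illustrates the abstract's statement that gaps decrease at about one-third of the theoretical rate; the paper's regression equations and normalized gap measure are not given in the abstract and are not reproduced.

```python
import math


def expected_gaps(n_reads: int, coverage: float, rate_scale: float = 1.0) -> float:
    """Expected contig/gap count under an exponential decay in coverage.

    rate_scale = 1.0 gives the idealized Lander-Waterman value N * exp(-c)
    (minimum overlap ignored). A value near 1/3 is an illustrative stand-in
    for the slower empirical decay described in the abstract.
    """
    if n_reads < 0 or coverage < 0:
        raise ValueError("n_reads and coverage must be non-negative")
    return n_reads * math.exp(-rate_scale * coverage)


n, c = 1_000_000, 6.0
print(f"theory:          {expected_gaps(n, c):.0f}")
print(f"one-third rate:  {expected_gaps(n, c, rate_scale=1 / 3):.0f}")
```

Comparing the two printed values shows how strongly an idealized model can under-predict gaps when the effective decay rate is lower than theory assumes.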


Subject(s)
Artifacts , Statistical Data Interpretation , Genetic Models , Statistical Models , Sequence Alignment/methods , DNA Sequence Analysis/methods , Animals , Computer Simulation , Genetic Variation , Humans , Sample Size , Species Specificity
16.
Genome Res ; 13(9): 2164-70, 2003 Sep.
Article in English | MEDLINE | ID: mdl-12952883

ABSTRACT

We describe a whole-genome assembly program named PCAP for processing tens of millions of reads. The PCAP program has several features to address efficiency and accuracy issues in assembly. Multiple processors are used to perform most time-consuming computations in assembly. A more sensitive method is used to avoid missing overlaps caused by sequencing errors. Repetitive regions of reads are detected on the basis of many overlaps with other reads, instead of many shorter word matches with other reads. Contaminated end regions of reads are identified and removed. Generation of a consensus sequence for a contig is based on an alignment of reads in the contig, in which both base quality values and coverage information are used to determine every consensus base. The PCAP program was tested on a mouse whole-genome data set of 30 million reads and a human Chromosome 20 data set of 1.7 million reads. The program is freely available for academic use.
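
One way to combine base quality values and coverage when calling a consensus base is to sum the quality scores of the reads supporting each base in an alignment column. The sketch below assumes that scheme purely for illustration; the abstract does not state PCAP's actual rule, and the function name, the tie-breaking order, and the "N" placeholder are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def consensus_base(column: List[Tuple[str, int]]) -> str:
    """Pick a consensus base for one alignment column.

    column holds (base, quality) pairs, one per read covering the position.
    Each base is scored by the sum of its quality values, so both the number
    of supporting reads (coverage) and their quality influence the call.
    """
    if not column:
        return "N"  # no read covers this position
    scores: Dict[str, int] = defaultdict(int)
    for base, quality in column:
        scores[base] += quality
    # Iterate bases in sorted order so ties resolve deterministically.
    return max(sorted(scores), key=lambda b: scores[b])


# Two low-quality A calls are outvoted by a single high-quality G call.
print(consensus_base([("A", 10), ("A", 10), ("G", 40)]))
# With better quality, the two A calls win on combined support.
print(consensus_base([("A", 30), ("A", 30), ("G", 40)]))
```

A simple majority vote would call A in both cases; weighting by quality lets a single reliable read override several unreliable ones.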


Subject(s)
Contig Mapping/methods , Genome , Software , Algorithms , Animals , Computational Biology/methods , Humans , Mice , Sequence Alignment/methods
17.
Nature ; 423(6942): 825-37, 2003 Jun 19.
Article in English | MEDLINE | ID: mdl-12815422

ABSTRACT

The male-specific region of the Y chromosome, the MSY, differentiates the sexes and comprises 95% of the chromosome's length. Here, we report that the MSY is a mosaic of heterochromatic sequences and three classes of euchromatic sequences: X-transposed, X-degenerate and ampliconic. These classes contain all 156 known transcription units, which include 78 protein-coding genes that collectively encode 27 distinct proteins. The X-transposed sequences exhibit 99% identity to the X chromosome. The X-degenerate sequences are remnants of ancient autosomes from which the modern X and Y chromosomes evolved. The ampliconic class includes large regions (about 30% of the MSY euchromatin) where sequence pairs show greater than 99.9% identity, which is maintained by frequent gene conversion (non-reciprocal transfer). The most prominent features here are eight massive palindromes, at least six of which contain testis genes.


Subject(s)
Human Y Chromosomes/genetics , Molecular Evolution , Sex Determination Processes , Transducin , Human X Chromosomes/genetics , Genetic Crossing Over/genetics , DNA Transposable Elements/genetics , Euchromatin/genetics , Female , Gene Amplification/genetics , Gene Conversion/genetics , Genes/genetics , Heterochromatin/genetics , Humans , Fluorescence In Situ Hybridization , Male , Genetic Models , Multigene Family/genetics , Organ Specificity , Pseudogenes/genetics , Nucleic Acid Sequence Homology , Sex Characteristics , Species Specificity , Testis/metabolism , Genetic Transcription/genetics
...