Results 1 - 13 of 13
1.
Nature ; 500(7462): 335-9, 2013 Aug 15.
Article in English | MEDLINE | ID: mdl-23883927

ABSTRACT

Oil palm is the most productive oil-bearing crop. Although it is planted on only 5% of the total world vegetable oil acreage, palm oil accounts for 33% of vegetable oil and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8-gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. A total of 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators, which are highly expressed in the kernel. We also report the draft sequence of the South American oil palm Elaeis oleifera, which has the same number of chromosomes (2n = 32) and produces fertile interspecific hybrids with E. guineensis but seems to have diverged in the New World. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations that restrict the use of clones in commercial plantings, and should therefore help to achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop.


Subjects
Arecaceae/classification, Arecaceae/genetics, Plant Genome/genetics, Phylogeny, Carbohydrate Metabolism/genetics, Plant Chromosomes/genetics, Lipid Metabolism/genetics, Genetic Models, Molecular Sequence Data
2.
BMC Bioinformatics ; 18(1): 310, 2017 Jun 21.
Article in English | MEDLINE | ID: mdl-28633662

ABSTRACT

BACKGROUND: Identifying orthologous genes is an initial step required for phylogenetics, and it is also a common strategy employed in functional genetics to find candidates for functionally equivalent genes across multiple species. At the same time, in silico orthology prediction tools often require large computational resources only available on computing clusters. Here we present OrthoReD, an open-source orthology prediction tool with accuracy comparable to published tools that requires only a desktop computer. The low computational resource requirement of OrthoReD is achieved by repeating orthology searches on one gene of interest at a time, thereby generating a reduced dataset to limit the scope of orthology search for each gene of interest. RESULTS: The output of OrthoReD was highly similar to the outputs of two other published orthology prediction tools, OrthologID and/or OrthoDB, for the three datasets tested, which represented three phyla with different ranges of species diversity and different numbers of genomes included. Median CPU time for ortholog prediction per gene by OrthoReD executed on a desktop computer was <15 min even for the largest dataset tested, which included all coding sequences of 100 bacterial species. CONCLUSIONS: With high-throughput sequencing, unprecedented numbers of genes from non-model organisms are available with increasing need for clear information about their orthologies and/or functional equivalents in model organisms. OrthoReD is not only fast and accurate as an orthology prediction tool, but also gives researchers flexibility in the number of genes analyzed at a time, without requiring a high-performance computing cluster.


Subjects
Software, Actinobacteria/genetics, Actinobacteria/metabolism, Animals, Factual Databases, Drosophila/genetics, Drosophila/metabolism, Genetics, Genome, High-Throughput Nucleotide Sequencing, Magnoliopsida/genetics, Magnoliopsida/metabolism, Transcriptome
3.
BMC Genomics ; 16: 987, 2015 Nov 23.
Article in English | MEDLINE | ID: mdl-26596625

ABSTRACT

BACKGROUND: Understanding the phylogenetic relationships among major lineages of multicellular animals (the Metazoa) is a prerequisite for studying the evolution of complex traits such as nervous systems, muscle tissue, or sensory organs. Transcriptome-based phylogenies have dramatically improved our understanding of metazoan relationships in recent years, although several important questions remain. The branching order near the base of the tree, in particular the placement of the poriferan (sponges, phylum Porifera) and ctenophore (comb jellies, phylum Ctenophora) lineages, is one outstanding issue. Recent analyses have suggested that the comb jellies are sister to all remaining metazoan phyla including sponges. This finding is surprising because it suggests that neurons and other complex traits, present in ctenophores and eumetazoans but absent in sponges or placozoans, either evolved twice in Metazoa or were independently, secondarily lost in the lineages leading to sponges and placozoans. RESULTS: To address the question of basal metazoan relationships we assembled a novel dataset comprising 1080 orthologous loci derived from 36 publicly available genomes representing major lineages of animals. From this large dataset we procured an optimized set of partitions with high phylogenetic signal for resolving metazoan relationships. This optimized data set is amenable to the most appropriate and computationally intensive analyses using site-heterogeneous models of sequence evolution. We also employed several strategies to examine the potential for long-branch attraction to bias our inferences. Our analyses strongly support the Ctenophora as the sister lineage to other Metazoa. We find no support for the traditional view uniting the ctenophores and Cnidaria. Our findings are supported by Bayesian comparisons of topological hypotheses and we find no evidence that they are biased by long-branch attraction. CONCLUSIONS: Our study further clarifies relationships among early branching metazoan lineages. Our phylogeny supports the still-controversial position of ctenophores as sister group to all other metazoans. This study also provides a workflow and computational tools for minimizing systematic bias in genome-based phylogenetic analyses. Future studies of metazoan phylogeny will benefit from ongoing efforts to sequence the genomes of additional invertebrate taxa that will continue to inform our view of the relationships among the major lineages of animals.


Subjects
Ctenophora/genetics, Data Mining, Genomics, Phylogeny, Animals, Bias, Molecular Evolution, Genetic Loci/genetics, Humans
4.
PLoS Genet ; 7(12): e1002411, 2011 Dec.
Article in English | MEDLINE | ID: mdl-22194700

ABSTRACT

A novel result of the current research is the development and implementation of a unique functional phylogenomic approach that explores the genomic origins of seed plant diversification. We first use 22,833 sets of orthologs from the nuclear genomes of 101 genera across land plants to reconstruct their phylogenetic relationships. One of the more salient results is the resolution of some enigmatic relationships in seed plant phylogeny, such as the placement of Gnetales as sister to the rest of the gymnosperms. In using this novel phylogenomic approach, we were also able to identify overrepresented functional gene ontology categories in genes that provide positive branch support for major nodes prompting new hypotheses for genes associated with the diversification of angiosperms. For example, RNA interference (RNAi) has played a significant role in the divergence of monocots from other angiosperms, which has experimental support in Arabidopsis and rice. This analysis also implied that the second largest subunit of RNA polymerase IV and V (NRPD2) played a prominent role in the divergence of gymnosperms. This hypothesis is supported by the lack of 24nt siRNA in conifers, the maternal control of small RNA in the seeds of flowering plants, and the emergence of double fertilization in angiosperms. Our approach takes advantage of genomic data to define orthologs, reconstruct relationships, and narrow down candidate genes involved in plant evolution within a phylogenomic view of species' diversification.


Subjects
Biological Evolution, Cycadopsida/genetics, Plant Genome, Magnoliopsida/genetics, Arabidopsis/genetics, DNA-Directed RNA Polymerases, Molecular Evolution, Flowers/genetics, Plant Genes/genetics, Genomics, Oryza/genetics, Phylogeny, Plants, RNA Interference, Small Interfering RNA/genetics, Seeds
5.
Mol Phylogenet Evol ; 54(3): 950-6, 2010 Mar.
Article in English | MEDLINE | ID: mdl-19686857

ABSTRACT

A phylogenomic approach was used to generate an amino acid phylogeny for 12 whole genomes representing 10 species in the family Pasteurellaceae. Orthology of genes was determined using an approach similar to OrthologID (http://nypg.bio.nyu.edu/orthologid/about.html) and resulted in the generation of a matrix with 3130 genes with 1,194,615 aligned amino acid characters of which 239,504 characters are phylogenetically informative. Phylogenetic analysis of the concatenated matrix using all standard approaches (maximum parsimony, maximum likelihood, and Bayesian analysis) results in a single extremely robust phylogenetic hypothesis for the species examined in this study. Remarkably, no single gene partition gives the same tree as the concatenated analysis. By analyzing partitioned support in the data matrix, we show that there is very little negative support emanating from individual gene partitions to suggest that the concatenated hypothesis is not tenable. The large number of characters in the matrix allows us to test hypotheses concerning missing data and character number in phylogenomic studies, and we conclude that matrices constructed using genome level information are very robust to missing data. We show that a very large number of concatenated gene sequences (>160) are needed to reliably obtain the same topology as the overall analysis.


Subjects
Bacterial Genome, Genetic Models, Pasteurellaceae/genetics, Phylogeny, Bayes Theorem, Bacterial DNA/genetics, Bacterial Genes, Genomics/methods, Likelihood Functions, Pasteurellaceae/classification, DNA Sequence Analysis
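
The abstract for record 5 reports how many aligned characters are phylogenetically informative. As a minimal sketch, assuming the standard parsimony definition (a column is informative when at least two character states each occur in at least two taxa), such a count could be computed as follows. The function name, the set of missing-data symbols, and the toy alignment are illustrative assumptions and are not taken from the paper.

```python
from collections import Counter

# Assumed symbols for gaps and unknown residues; adjust for the data at hand.
MISSING = {"-", "?", "X"}


def count_informative_sites(alignment):
    """Count parsimony-informative columns in an alignment.

    alignment: dict mapping taxon name -> aligned sequence (equal lengths).
    """
    sequences = list(alignment.values())
    if not sequences:
        return 0
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same aligned length")

    informative = 0
    for column in zip(*sequences):
        counts = Counter(state for state in column if state not in MISSING)
        # Informative: at least two states, each seen in at least two taxa.
        shared_states = [state for state, n in counts.items() if n >= 2]
        if len(shared_states) >= 2:
            informative += 1
    return informative


# Hypothetical four-taxon amino acid alignment.
toy = {
    "taxon_A": "MKTAL",
    "taxon_B": "MKSAL",
    "taxon_C": "MRSA-",
    "taxon_D": "MRTAV",
}
print(count_informative_sites(toy))  # 2 (columns 2 and 3)
```

Only the second and third columns qualify here: the last column has a second state (V) in a single taxon, so it cannot favor one tree over another under parsimony.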
6.
Methods Mol Biol ; 537: 23-38, 2009.
Article in English | MEDLINE | ID: mdl-19378138

ABSTRACT

OrthologID (http://nypg.bio.nyu.edu/orthologid/) allows for the rapid and accurate identification of gene orthology within a character-based phylogenetic framework. The Web application has two functions - an orthologous group search and a query orthology classification. The former determines orthologous gene sets for complete genomes and identifies diagnostic characters that define each orthologous gene set; and the latter allows for the classification of unknown query sequences to orthology groups. The first module of the Web application, the gene family generator, uses an E-value based approach to sort genes into gene families. An alignment constructor then aligns members of gene families and the resulting gene family alignments are submitted to the tree builder to obtain gene family guide trees. Finally, the diagnostics generator extracts diagnostic characters from guide trees and these diagnostics are used to determine gene orthology for query sequences.


Subjects
Computational Biology/methods, Genes, Software, Amino Acid Sequence, Base Sequence, Genetic Databases, Genomics/methods, Internet, Molecular Sequence Data, Sequence Alignment/methods, DNA Sequence Analysis, User-Computer Interface
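
The abstract for record 6 says the gene family generator sorts genes into families with an E-value based approach, but it does not state the clustering rule. The sketch below assumes single-linkage grouping: any two genes connected by a hit at or below a cutoff end up in the same family. The function name, the cutoff, and the gene identifiers are hypothetical, and this is not OrthologID's actual implementation.

```python
def gene_families(hits, evalue_cutoff=1e-20):
    """Group genes into families by single linkage over significant hits.

    hits: iterable of (gene_a, gene_b, evalue) tuples, e.g. parsed from an
    all-against-all similarity search.
    """
    parent = {}

    def find(gene):
        # Union-find lookup with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for gene_a, gene_b, evalue in hits:
        root_a, root_b = find(gene_a), find(gene_b)
        if evalue <= evalue_cutoff and root_a != root_b:
            parent[root_b] = root_a  # merge the two families

    families = {}
    for gene in list(parent):
        families.setdefault(find(gene), set()).add(gene)
    return sorted((sorted(members) for members in families.values()),
                  key=lambda members: members[0])


# Hypothetical hits: the last pair is too weak to merge the two families.
example_hits = [
    ("At1", "Os1", 1e-50),
    ("Os1", "Pt1", 1e-30),
    ("At2", "Os2", 1e-40),
    ("At1", "At2", 1e-3),
]
print(gene_families(example_hits))
# [['At1', 'Os1', 'Pt1'], ['At2', 'Os2']]
```

Single linkage is the simplest choice but tends to chain loosely related genes together at permissive cutoffs, which is one reason the cutoff matters so much in practice.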
7.
BMC Bioinformatics ; 9: 103, 2008 Feb 19.
Article in English | MEDLINE | ID: mdl-18282301

ABSTRACT

BACKGROUND: The availability of sequences from whole genomes to reconstruct the tree of life has the potential to enable the development of phylogenomic hypotheses in ways that have not previously been possible. A significant bottleneck in the analysis of genomic-scale views of the tree of life is the time required for manual curation of genomic data into multi-gene phylogenetic matrices. RESULTS: To keep pace with the exponentially growing volume of molecular data in the genomic era, we have developed an automated technique, ASAP (Automated Simultaneous Analysis Phylogenetics), to assemble these multigene/multispecies matrices and to evaluate the significance of individual genes within the context of a given phylogenetic hypothesis. CONCLUSION: Applications of ASAP may enable scientists to re-evaluate species relationships and to develop new phylogenomic hypotheses based on genome-scale data.


Subjects
Algorithms, Chromosome Mapping/methods, Molecular Evolution, Phylogeny, Sequence Alignment/methods, DNA Sequence Analysis/methods, Software, Base Sequence, Molecular Sequence Data
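
Record 7 describes assembling multigene, multispecies matrices automatically. A minimal sketch of the core concatenation step is shown below, assuming each gene partition is already aligned and that taxa absent from a partition are padded with a missing-data symbol. The function name and the `?` padding are assumptions for illustration; the abstract does not describe how ASAP itself handles this.

```python
def concatenate_partitions(partitions, missing="?"):
    """Concatenate per-gene alignments into one supermatrix.

    partitions: dict mapping gene name -> {taxon: aligned sequence}.
    Returns (supermatrix, boundaries); boundaries maps each gene to its
    0-based, half-open (start, end) column range in the supermatrix.
    """
    taxa = sorted({taxon for aln in partitions.values() for taxon in aln})
    pieces = {taxon: [] for taxon in taxa}
    boundaries = {}
    position = 0
    for gene, aln in partitions.items():
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"partition {gene!r} is empty or not aligned")
        width = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack this gene with missing-data characters.
            pieces[taxon].append(aln.get(taxon, missing * width))
        boundaries[gene] = (position, position + width)
        position += width
    supermatrix = {taxon: "".join(parts) for taxon, parts in pieces.items()}
    return supermatrix, boundaries


# Hypothetical two-gene, three-species input.
example = {
    "geneA": {"sp1": "ACGT", "sp2": "ACGA"},
    "geneB": {"sp1": "GG", "sp3": "GC"},
}
matrix, bounds = concatenate_partitions(example)
print(matrix)  # {'sp1': 'ACGTGG', 'sp2': 'ACGA??', 'sp3': '????GC'}
print(bounds)  # {'geneA': (0, 4), 'geneB': (4, 6)}
```

Keeping the partition boundaries alongside the matrix is what later makes per-gene evaluations possible, since each gene's columns can be extracted or excluded again.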
8.
Curr Biol ; 23(20): 2058-62, 2013 Oct 21.
Article in English | MEDLINE | ID: mdl-24094856

ABSTRACT

Eusocial behavior has arisen in few animal groups, most notably in the aculeate Hymenoptera, a clade comprising ants, bees, and stinging wasps [1-4]. Phylogeny is crucial to understanding the evolution of the salient features of these insects, including eusociality [5]. Yet the phylogenetic relationships among the major lineages of aculeate Hymenoptera remain contentious [6-12]. We address this problem here by generating and analyzing genomic data for a representative series of taxa. We obtain a single well-resolved and strongly supported tree, robust to multiple methods of phylogenetic inference. Apoidea (spheciform wasps and bees) and ants are sister groups, a novel finding that contradicts earlier views that ants are closer to ectoparasitoid wasps. Vespid wasps (paper wasps, yellow jackets, and relatives) are sister to all other aculeates except chrysidoids. Thus, all eusocial species of Hymenoptera are contained within two major groups, characterized by transport of larval provisions and nest construction, likely prerequisites for the evolution of eusociality. These two lineages are interpolated among three other clades of wasps whose species are predominantly ectoparasitoids on concealed hosts, the inferred ancestral condition for aculeates [2]. This phylogeny provides a new framework for exploring the evolution of nesting, feeding, and social behavior within the stinging Hymenoptera.


Subjects
Ants/genetics, Bees/genetics, Phylogeny, Wasps/genetics, Animals, Ants/classification, Ants/physiology, Bees/classification, Bees/physiology, Biological Evolution, Feeding Behavior, Insect Proteins/genetics, Insect Proteins/metabolism, Molecular Sequence Data, Nesting Behavior, DNA Sequence Analysis, Social Behavior, Wasps/classification, Wasps/physiology
9.
G3 (Bethesda) ; 3(12): 2257-71, 2013 Dec 09.
Article in English | MEDLINE | ID: mdl-24142924

ABSTRACT

Drosophila suzukii Matsumura (spotted wing drosophila) has recently become a serious pest of a wide variety of fruit crops in the United States as well as in Europe, leading to substantial yearly crop losses. To enable basic and applied research of this important pest, we sequenced the D. suzukii genome to obtain a high-quality reference sequence. Here, we discuss the basic properties of the genome and transcriptome and describe patterns of genome evolution in D. suzukii and its close relatives. Our analyses and genome annotations are presented in a web portal, SpottedWingFlyBase, to facilitate public access.


Subjects
Drosophila Proteins/genetics, Drosophila/genetics, Insect Genome, Animals, Biological Evolution, Codon, DNA Transposable Elements, Female, Gene Expression, Internet, Male, Molecular Sequence Annotation, Phylogeny, Transcriptome, Web Browser
10.
Genome Biol Evol ; 2: 225-39, 2010 Jul 12.
Article in English | MEDLINE | ID: mdl-20624728

ABSTRACT

We use measures of congruence on a combined expressed sequence tag genome phylogeny to identify proteins that have potential significance in the evolution of seed plants. Relevant proteins are identified based on the direction of partitioned branch and hidden support on the hypothesis obtained on a 16-species tree, constructed from 2,557 concatenated orthologous genes. We provide a general method for detecting genes or groups of genes that may be under selection in directions that are in agreement with the phylogenetic pattern. Gene partitioning methods and estimates of the degree and direction of support of individual gene partitions to the overall data set are used. Using this approach, we correlate positive branch support of specific genes for key branches in the seed plant phylogeny. In addition to basic metabolic functions, such as photosynthesis or hormones, genes involved in posttranscriptional regulation by small RNAs were significantly overrepresented in key nodes of the phylogeny of seed plants. Two genes in our matrix are of critical importance as they are involved in RNA-dependent regulation, essential during embryo and leaf development. These are Argonaute and the RNA-dependent RNA polymerase 6 found to be overrepresented in the angiosperm clade. We use these genes as examples of our phylogenomics approach and show that identifying partitions or genes in this way provides a platform to explain some of the more interesting organismal differences among species, and in particular, in the evolution of plants.


Subjects
Molecular Evolution, Plant Genes, Plant Proteins/genetics, Plants/genetics, Amino Acid Sequence, Amino Acid Substitution, Data Mining, Genetic Epigenesis, Genomics, Magnoliopsida/classification, Magnoliopsida/genetics, Magnoliopsida/metabolism, Genetic Models, Molecular Sequence Data, Mutation, Phylogeny, Plants/classification, Plants/metabolism, Plant RNA/genetics, RNA-Dependent RNA Polymerase/genetics, Genetic Selection, Amino Acid Sequence Homology
11.
PLoS One ; 4(6): e5764, 2009 Jun 02.
Article in English | MEDLINE | ID: mdl-19503618

ABSTRACT

BACKGROUND: Genome level analyses have enhanced our view of phylogenetics in many areas of the tree of life. With the production of whole genome DNA sequences of hundreds of organisms and large-scale EST databases a large number of candidate genes for inclusion into phylogenetic analysis have become available. In this work, we exploit the burgeoning genomic data being generated for plant genomes to address one of the more important plant phylogenetic questions concerning the hierarchical relationships of the several major seed plant lineages (angiosperms, Cycadales, Ginkgoales, Gnetales, and Coniferales), which continues to be a work in progress, despite numerous studies using single, few or several genes and morphology datasets. Although most recent studies support the notion that gymnosperms and angiosperms are monophyletic and sister groups, they differ on the topological arrangements within each major group. METHODOLOGY: We exploited the EST database to construct a supermatrix of DNA sequences (over 1,200 concatenated orthologous gene partitions for 17 taxa) to examine non-flowering seed plant relationships. This analysis employed programs that offer rapid and robust orthology determination of novel, short sequences from plant ESTs based on reference seed plant genomes. Our phylogenetic analysis retrieved an unbiased (with respect to gene choice), well-resolved and highly supported phylogenetic hypothesis that was robust to various outgroup combinations. CONCLUSIONS: We evaluated character support and the relative contribution of numerous variables (e.g. gene number, missing data, partitioning schemes, taxon sampling and outgroup choice) on tree topology, stability and support metrics. Our results indicate that while missing characters and order of addition of genes to an analysis do not influence branch support, inadequate taxon sampling and limited choice of outgroup(s) can lead to spurious inference of phylogeny when dealing with phylogenomic scale data sets. As expected, support and resolution increase significantly as more informative characters are added, until reaching a threshold, beyond which support metrics stabilize, and the effect of adding conflicting characters is minimized.


Subjects
Arabidopsis/genetics, Expressed Sequence Tags, Plant Genome, Statistical Data Interpretation, Genetic Databases, Plant Genes, Genomics, Likelihood Functions, Genetic Models, Phylogeny, Plants, Seeds/metabolism, DNA Sequence Analysis
12.
Fly (Austin) ; 2(6): 291-9, 2008.
Article in English | MEDLINE | ID: mdl-19139635

ABSTRACT

The Drosophila 12 genome data set was used to construct whole genome, gene family presence/absence matrices using a broad range of E-value cutoffs as criteria for gene family inclusion. The various matrices generated behave differently in phylogenetic analyses as a function of the E-value employed. Based on an optimality criterion that maximizes internal corroboration of information, we show that E-values of e-105 to e-125 extract the most internally consistent phylogenetic signal. Functional class of most genes and gene families can be accurately determined based on the D. melanogaster genome annotation. We used the gene ontology (GO) system to create partitions based on gene function. Several measures of phylogenetic congruence (diagnosis, consistency, partitioned support, hidden support) for different higher and lower level GO categories were used to mine the data set for genes and gene families that show strong agreement or disagreement with the overall combined phylogenetic hypothesis. We propose that measures of phylogenetic congruence can be used as criteria to identify loci with related GO terms that have a significant impact on cladogenesis.


Subjects
Drosophila/physiology, Insect Genome/genetics, Animals, Drosophila/classification, Drosophila/genetics, Drosophila Proteins/metabolism, Phylogeny
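
Record 12 builds gene family presence/absence matrices for whole genomes. As a rough sketch, assuming the gene families for one E-value cutoff have already been computed, the matrix can be derived by checking which species contribute at least one gene to each family. The function name, the input layout, and the gene identifiers are hypothetical and do not come from the paper.

```python
def presence_absence_matrix(families, gene_to_species):
    """Build a species-by-family 0/1 matrix.

    families: dict mapping family id -> set of gene ids.
    gene_to_species: dict mapping gene id -> species name.
    Returns (family_ids, matrix), where matrix maps each species to a list
    of 0/1 values in the same order as family_ids.
    """
    species = sorted(set(gene_to_species.values()))
    family_ids = sorted(families)
    matrix = {}
    for name in species:
        row = []
        for family in family_ids:
            # A family is present if any member gene belongs to the species.
            present = any(gene_to_species[g] == name for g in families[family])
            row.append(1 if present else 0)
        matrix[name] = row
    return family_ids, matrix


# Hypothetical families for a single cutoff.
example_families = {
    "fam1": {"mel_g1", "sim_g1", "vir_g1"},
    "fam2": {"mel_g2", "sim_g2"},
    "fam3": {"vir_g3"},
}
example_species = {
    "mel_g1": "D. melanogaster", "mel_g2": "D. melanogaster",
    "sim_g1": "D. simulans", "sim_g2": "D. simulans",
    "vir_g1": "D. virilis", "vir_g3": "D. virilis",
}
ids, pa = presence_absence_matrix(example_families, example_species)
print(ids)  # ['fam1', 'fam2', 'fam3']
print(pa)
# {'D. melanogaster': [1, 1, 0], 'D. simulans': [1, 1, 0], 'D. virilis': [1, 0, 1]}
```

Rerunning the family clustering at a different cutoff and rebuilding the matrix would yield a different set of columns, which is the kind of cutoff-dependent variation the abstract describes.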
13.
Bioinformatics ; 22(6): 699-707, 2006 Mar 15.
Article in English | MEDLINE | ID: mdl-16410324

ABSTRACT

MOTIVATION: The determination of gene orthology is a prerequisite for mining and utilizing the rapidly increasing amount of sequence data for genome-scale phylogenetics and comparative genomic studies. Until now, most researchers have used pairwise distance comparison algorithms, such as BLAST, COG, RBH, RSD and INPARANOID, to determine gene orthology. In contrast, orthology determination within a character-based phylogenetic framework has not been utilized on a genomic scale owing to the lack of efficiency and automation. RESULTS: We have developed OrthologID, a Web application that automates the labor-intensive procedures of gene orthology determination within a character-based phylogenetic framework, thus making character-based orthology determination on a genomic scale possible. In addition to generating gene family trees and determining orthologous gene sets for complete genomes, OrthologID can also identify diagnostic characters that define each orthologous gene set, as well as diagnostic characters that are responsible for classifying query sequences from other genomes into specific orthology groups. The OrthologID database currently includes several complete plant genomes, including Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, as well as a unicellular outgroup, Chlamydomonas reinhardtii. To improve the general utility of OrthologID beyond plant species, we plan to expand our sequence database to include the fully sequenced genomes of prokaryotes and other non-plant eukaryotes. AVAILABILITY: http://nypg.bio.nyu.edu/orthologid/


Subjects
Algorithms, Chromosome Mapping/methods, Genetic Databases, Sequence Alignment/methods, DNA Sequence Analysis/methods, Software, User-Computer Interface, Artificial Intelligence, Conserved Sequence/genetics, Automated Pattern Recognition/methods, Phylogeny, Nucleic Acid Sequence Homology
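
Record 13 contrasts OrthologID with pairwise methods such as RBH (reciprocal best hits). For readers unfamiliar with that baseline, here is a minimal sketch of it: two genes are called orthologs when each is the other's top-scoring hit in the opposite genome. Ranking by bit score, the tuple layout, and the gene names are assumptions for illustration; this is the pairwise approach the abstract argues against, not OrthologID's character-based method.

```python
def best_hits(hits):
    """Map each query to its top-scoring subject.

    hits: iterable of (query, subject, bit_score) tuples. Ties keep the
    first hit encountered.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)
    backward = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if backward.get(b) == a)


# Hypothetical search results between genome A and genome B.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 140.0), ("b2", "a2", 80.0)]
print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1')]
```

In this toy case `a2` has no ortholog call because its best hit `b2` prefers `a1`, which illustrates how a purely pairwise criterion can leave paralog-rich families unresolved.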