ABSTRACT
BACKGROUND: Tools for high-throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e., the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because trees must be re-calculated for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families.
RESULTS: We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes onto our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ).
CONCLUSIONS: Our new trees for LIT genes will be a valuable resource for researchers studying the evolution of eyes or other light-interacting structures. We also introduce PIA, a high-throughput method for using phylogenetic relationships to identify LIT genes in transcriptomes from non-model organisms. With simple modifications, our methods may be used to search for different sets of genes or to annotate data sets from taxa outside of Metazoa.
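The placement step described above can be sketched as follows. This is a minimal illustration, not the PIA implementation: the abstract states only that the Evolutionary Placement Algorithm in RAxML was used, so the file names, the run name, the substitution model, and the helper function `build_epa_command` are all hypothetical. The sketch only assembles a command line using standard RAxML options (`-f v` selects the placement algorithm, `-s` the alignment containing both reference and query sequences, `-t` the pre-calculated reference tree) and does not execute it.

```python
from pathlib import Path


def build_epa_command(alignment, reference_tree, run_name,
                      model="PROTGAMMAWAG", raxml_binary="raxmlHPC"):
    """Assemble (but do not run) a RAxML command for evolutionary placement.

    The model and binary name are assumptions chosen for illustration;
    the source does not say which settings PIA uses.
    """
    return [
        raxml_binary,
        "-f", "v",                 # place query sequences on a fixed reference tree
        "-s", str(alignment),      # alignment holding reference AND query sequences
        "-t", str(reference_tree), # pre-calculated gene tree (reference taxa only)
        "-m", model,               # protein substitution model (assumed)
        "-n", run_name,            # suffix RAxML appends to its output files
    ]


# Hypothetical inputs: one gene family, with transcriptome hits already
# aligned to the reference sequences.
cmd = build_epa_command(
    Path("opsin_with_queries.phy"),
    Path("opsin_reference.tre"),
    "opsin_placement",
)
print(" ".join(cmd))
```

Because the reference tree is computed once and reused, each new transcriptome costs only an alignment and a placement run per gene family, which is the efficiency gain the abstract describes.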
Subject(s)
Light , Molecular Sequence Annotation/methods , Phylogeny , Transcriptome , Vision, Ocular/genetics , Algorithms , Animals , Eye Proteins/genetics , Genome , High-Throughput Nucleotide Sequencing , Likelihood Functions , Sequence Analysis, Protein
ABSTRACT
A growing body of work on the neuroethology of cubozoans is based largely on the capabilities of the photoreceptive tissues, and it is important to determine the molecular basis of their light sensitivity. The cubozoans rely on 24 special-purpose eyes to extract specific information from a complex visual scene to guide their behavior in the habitat. The lens eyes are the most studied photoreceptive structures, and phototransduction in their photoreceptor cells is based on light-sensitive opsin molecules. Opsins are photosensitive transmembrane proteins associated with photoreceptors in eyes, and the amino acid sequence of the opsins determines the spectral properties of the photoreceptors. Here we show that two distinct opsins (Tripedalia cystophora-lens eye expressed opsin and Tripedalia cystophora-neuropil expressed opsin, or Tc-leo and Tc-neo) are expressed in the Tripedalia cystophora rhopalium. Quantitative PCR determined the level of expression of the two opsins, and we found Tc-leo to be expressed at a higher level than Tc-neo. In situ hybridization located Tc-leo expression in the retinal photoreceptors of the lens eyes, where the opsin is involved in image formation. Tc-neo is expressed in a confined part of the neuropil and is probably involved in extraocular light sensation, presumably in relation to diurnal activity.
Subject(s)
Cnidaria/genetics , Eye/metabolism , Gene Expression , Opsins/genetics , Animals , Cnidaria/classification , Eye/ultrastructure , Female , Photoreceptor Cells, Invertebrate/metabolism , Phylogeny
ABSTRACT
Stomatopod crustaceans have complex visual systems containing up to 16 different spectral classes of photoreceptors, more than described for any other animal. A previous molecular study of this visual system focusing on the expression of opsin genes found many more transcripts than predicted on the basis of physiology, but was unable to fully document the expressed opsin genes responsible for this diversity. Furthermore, questions remain about how other components of phototransduction cascades are involved. This study continues prior investigations by examining the molecular function of stomatopods' visual systems using new whole-eye 454 transcriptome datasets from two species, Hemisquilla californiensis and Pseudosquilla ciliata. These two species represent taxonomic diversity within the order Stomatopoda, as well as variations in the anatomy and physiology of the visual system. Using an evolutionary placement algorithm to annotate the transcriptome, we identified the presence of nine components of the stomatopods' G-protein-coupled receptor (GPCR) phototransduction cascade, including two visual arrestins, subunits of the heterotrimeric G-protein, phospholipase C, transient receptor potential channels, and opsin transcripts. The set of expressed transduction genes suggests that stomatopods utilize a Gq-mediated GPCR-signaling cascade. The most notable difference in expression between the phototransduction cascades of the two species was the number of opsin contigs recovered, with 18 contigs found in retinas of H. californiensis, and 49 contigs in those of P. ciliata. Based on phylogenetic placement and fragment overlap, these contigs were estimated to represent 14 and 33 expressed transcripts, respectively. These data expand the known opsin diversity in stomatopods to clades of arthropod opsins that are sensitive to short and ultraviolet wavelengths, and confirm the results of previous studies recovering more opsin transcripts than spectrally distinct types of photoreceptors. Many of the recovered transcripts were phylogenetically placed in an evolutionary clade of crustacean opsin sequences that is rapidly expanding as the visual systems from more species are investigated. We discuss these results in relation to the emerging pattern, particularly in crustacean visual systems, of the expression of multiple opsin transcripts in photoreceptors of the same spectral class, and even in single photoreceptor cells.
Subject(s)
Biological Evolution , Crustacea/physiology , Genetic Variation , Light Signal Transduction/physiology , Photoreceptor Cells, Invertebrate/physiology , Receptors, G-Protein-Coupled/metabolism , Vision, Ocular/physiology , Amino Acid Sequence , Animals , Arrestins/metabolism , Base Sequence , Contig Mapping , Crustacea/anatomy & histology , Crustacea/genetics , DNA Primers/genetics , DNA, Complementary/genetics , Gene Expression Profiling , Light Signal Transduction/genetics , Likelihood Functions , Models, Genetic , Molecular Sequence Annotation , Molecular Sequence Data , Photoreceptor Cells, Invertebrate/cytology , Receptors, G-Protein-Coupled/genetics , Sequence Alignment , Sequence Analysis, DNA , Species Specificity
ABSTRACT
An ambitious, yet fundamental goal for comparative biology is to understand the evolutionary relationships for all of life. However, many important taxonomic groups have remained recalcitrant to inclusion in broader-scale studies. Here, we focus on the collection of nine new 454 transcriptome data sets from Ostracoda, an ancient and diverse group with a dense fossil record, which is often undersampled in broader studies. We combine the new transcriptomes with a new morphological matrix (including fossils) and existing expressed sequence tag, mitochondrial genome, nuclear genome, and ribosomal DNA data. Our analyses lead to new insights into ostracod and pancrustacean phylogeny. We obtained support for three epic pancrustacean clades that likely originated in the Cambrian: Oligostraca (Ostracoda, Mystacocarida, Branchiura, and Pentastomida); Multicrustacea (Copepoda, Malacostraca, and Thecostraca); and a clade we refer to as Allotriocarida (Hexapoda, Remipedia, Cephalocarida, and Branchiopoda). Within the Oligostraca clade, our results support ostracod monophyly, a previously unresolved question. Within Multicrustacea, we find support for Thecostraca plus Copepoda, for which we suggest the name Hexanauplia. Within Allotriocarida, some analyses support the hypothesis that Remipedia is the sister taxon to Hexapoda, but others support Branchiopoda + Cephalocarida as the sister group of hexapods. In multiple different analyses, we see better support for equivocal nodes using slow-evolving genes or when excluding distant outgroups, highlighting the increased importance of conditional data combination in this age of abundant, often anonymous data. However, when we analyze the same set of species and ignore rate of gene evolution, we find higher support when including all data, more in line with a "total evidence" philosophy. By concatenating molecular and morphological data, we place pancrustacean fossils in the phylogeny, which can be used for studies of divergence times in Pancrustacea, Arthropoda, or Metazoa. Our results and new data will allow for attributes of Ostracoda, such as its amazing fossil record and diverse biology, to be leveraged in broader-scale comparative studies. Further, we illustrate how adding extensive next-generation sequence data from understudied groups can yield important new phylogenetic insights into long-standing questions, especially when carefully analyzed in combination with other data.
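The concatenation of molecular and morphological data mentioned above can be illustrated with a small sketch. It is a generic supermatrix construction under stated assumptions, not the authors' pipeline: the function `concatenate_partitions`, the taxon names, and the toy character data are hypothetical. Taxa missing from a partition (for example, fossils with no sequence data) are padded with a missing-data symbol so that every row has the same length.

```python
def concatenate_partitions(partitions, missing="?"):
    """Join aligned partitions into one supermatrix keyed by taxon name.

    Each partition maps taxon -> aligned character string. Taxa absent
    from a partition are filled with the `missing` symbol.
    """
    taxa = sorted({taxon for part in partitions for taxon in part})
    supermatrix = {taxon: "" for taxon in taxa}
    for part in partitions:
        lengths = {len(seq) for seq in part.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within a partition must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack data for this partition.
            supermatrix[taxon] += part.get(taxon, missing * width)
    return supermatrix


# Toy data: one molecular partition and one morphological partition.
molecular = {"ostracod_A": "MKTL", "copepod_B": "MKSL"}
morphology = {"ostracod_A": "0110", "fossil_C": "01?1"}

matrix = concatenate_partitions([molecular, morphology])
for taxon, row in matrix.items():
    print(taxon, row)
```

In this toy case the fossil contributes only morphological characters, yet it shares a matrix with the sequenced taxa, which is what allows a fossil to be placed in a combined analysis.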