Results 1 - 20 of 61
1.
PLoS Biol ; 22(5): e3002633, 2024 May.
Article in English | MEDLINE | ID: mdl-38787797

ABSTRACT

Comparisons of single-cell RNA sequencing (scRNA-seq) data across species can reveal links between cellular gene expression and the evolution of cell functions, features, and phenotypes. These comparisons evoke evolutionary histories, as depicted by phylogenetic trees, that define relationships between species, genes, and cells. This Essay considers each of these in turn, laying out challenges and solutions derived from a phylogenetic comparative approach and relating these solutions to previously proposed methods for the pairwise alignment of cellular dimensional maps. This Essay contends that species trees, gene trees, cell phylogenies, and cell lineages can all be reconciled as descriptions of the same concept: the tree of cellular life. By integrating phylogenetic approaches into scRNA-seq analyses, challenges for building informed comparisons across species can be overcome, and hypotheses about gene and cell evolution can be robustly tested.


Subjects
Phylogeny , RNA Sequence Analysis , Single-Cell Analysis , Single-Cell Analysis/methods , RNA Sequence Analysis/methods , Animals , Humans , Cell Lineage/genetics , Molecular Evolution , Species Specificity
2.
PLoS Genet ; 19(1): e1010607, 2023 01.
Article in English | MEDLINE | ID: mdl-36689550

ABSTRACT

With detailed data on gene expression accessible from an increasingly broad array of species, we can test the extent to which our developmental genetic knowledge from model organisms predicts expression patterns and variation across species. But to know when differences in gene expression across species are significant, we first need to know how much evolutionary variation in gene expression we expect to observe. Here we provide an answer by analyzing RNAseq data across twelve species of Hawaiian Drosophilidae flies, focusing on gene expression differences between the ovary and other tissues. We show that over evolutionary time, there exists a cohort of ovary specific genes that is stable and that largely corresponds to described expression patterns from laboratory model Drosophila species. Our results also provide a demonstration of the prediction that, as phylogenetic distance increases, variation between species overwhelms variation between tissue types. Using ancestral state reconstruction of expression, we describe the distribution of evolutionary changes in tissue-biased expression, and use this to identify gains and losses of ovary-biased expression across these twelve species. We then use this distribution to calculate the evolutionary correlation in expression changes between genes, and demonstrate that genes with known interactions in D. melanogaster are significantly more correlated in their evolution than genes with no or unknown interactions. Finally, we use this correlation matrix to infer new networks of genes that share evolutionary trajectories, and we present these results as a dataset of new testable hypotheses about genetic roles and interactions in the function and evolution of the Drosophila ovary.


Subjects
Drosophila melanogaster , Ovary , Animals , Female , Drosophila melanogaster/genetics , Drosophila melanogaster/metabolism , Phylogeny , Hawaii , Insect Genes , Molecular Evolution , Drosophila/genetics , Gene Expression
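The abstract in record 2 describes correlating expression changes between genes across the phylogeny. A minimal sketch of that idea, assuming each gene already has one inferred expression change per branch (for example from ancestral state reconstruction), is a plain Pearson correlation over branches. The function name and the numbers are hypothetical, and the study's actual procedure may differ.

```python
import math

def evolutionary_correlation(changes_a, changes_b):
    """Pearson correlation between two genes' per-branch expression changes.

    Each list holds one inferred change per branch, in the same branch order.
    """
    n = len(changes_a)
    if n != len(changes_b) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 branches")
    mean_a = sum(changes_a) / n
    mean_b = sum(changes_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(changes_a, changes_b))
    var_a = sum((a - mean_a) ** 2 for a in changes_a)
    var_b = sum((b - mean_b) ** 2 for b in changes_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("a gene with no variation across branches has no defined correlation")
    return cov / math.sqrt(var_a * var_b)

# Made-up changes on four branches (illustrative values only).
gene_a = [0.5, -0.2, 1.0, 0.0]
gene_b = [1.0, -0.4, 2.0, 0.0]    # changes in the same direction as gene_a
gene_c = [-0.5, 0.2, -1.0, 0.0]   # changes in the opposite direction

r_same = evolutionary_correlation(gene_a, gene_b)
r_opposite = evolutionary_correlation(gene_a, gene_c)
```

Computing this for every pair of genes gives a correlation matrix of the kind the abstract mentions as the basis for inferring gene networks.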
3.
Proc Natl Acad Sci U S A ; 118(8)2021 02 23.
Article in English | MEDLINE | ID: mdl-33593896

ABSTRACT

Predator specialization has often been considered an evolutionary "dead end" due to the constraints associated with the evolution of morphological and functional optimizations throughout the organism. However, in some predators, these changes are localized in separate structures dedicated to prey capture. One of the most extreme cases of this modularity can be observed in siphonophores, a clade of pelagic colonial cnidarians that use tentilla (tentacle side branches armed with nematocysts) exclusively for prey capture. Here we study how siphonophore specialists and generalists evolve, and what morphological changes are associated with these transitions. To answer these questions, we: a) measured 29 morphological characters of tentacles from 45 siphonophore species, b) mapped these data to a phylogenetic tree, and c) analyzed the evolutionary associations between morphological characters and prey-type data from the literature. Instead of a dead end, we found that siphonophore specialists can evolve into generalists, and that specialists on one prey type have directly evolved into specialists on other prey types. Our results show that siphonophore tentillum morphology has strong evolutionary associations with prey type, and suggest that shifts between prey types are linked to shifts in the morphology, mode of evolution, and evolutionary correlations of tentilla and their nematocysts. The evolutionary history of siphonophore specialization helps build a broader perspective on predatory niche diversification via morphological innovation and evolution. These findings contribute to understanding how specialization and morphological evolution have shaped present-day food webs.


Subjects
Biological Evolution , Food Chain , Hydrozoa/physiology , Predatory Behavior/physiology , Animals , Oceans and Seas , Phylogeny
4.
Mol Biol Evol ; 39(2)2022 02 03.
Article in English | MEDLINE | ID: mdl-35134205

ABSTRACT

Siphonophores are complex colonial animals, consisting of asexually produced bodies (zooids) that are functionally specialized for specific tasks, including feeding, swimming, and sexual reproduction. Though this extreme functional specialization has captivated biologists for generations, its genomic underpinnings remain unknown. We use RNA-seq to investigate gene expression patterns in five zooids and one specialized tissue across seven siphonophore species. Analyses of gene expression across species present several challenges, including identification of comparable expression changes on gene trees with complex histories of speciation, duplication, and loss. We examine gene expression within species, conduct classical analyses examining expression patterns between species, and introduce species branch filtering, which allows us to examine the evolution of expression across species in a phylogenetic framework. Within and across species, we identified hundreds of zooid-specific and species-specific genes, as well as a number of putative transcription factors showing differential expression in particular zooids and developmental stages. We found that gene expression patterns tended to be largely consistent in zooids with the same function across species, but also some large lineage-specific shifts in gene expression. Our findings show that patterns of gene expression have the potential to define zooids in colonial organisms. Traditional analyses of the evolution of gene expression focus on the tips of gene phylogenies, identifying large-scale expression patterns that are zooid or species variable. The new explicit phylogenetic approach we propose here focuses on branches (not tips) offering a deeper evolutionary perspective into specific changes in gene expression within zooids along all branches of the gene (and species) trees.


Subjects
Hydrozoa , Animals , Gene Expression , Genome , Hydrozoa/genetics , Phylogeny , Species Specificity
5.
Mol Biol Evol ; 38(10): 4322-4333, 2021 09 27.
Article in English | MEDLINE | ID: mdl-34097041

ABSTRACT

Identifying our most distant animal relatives has emerged as one of the most challenging problems in phylogenetics. This debate has major implications for our understanding of the origin of multicellular animals and of the earliest events in animal evolution, including the origin of the nervous system. Some analyses identify sponges as our most distant animal relatives (Porifera-sister hypothesis), and others identify comb jellies (Ctenophora-sister hypothesis). These analyses vary in many respects, making it difficult to interpret previous tests of these hypotheses. To gain insight into why different studies yield different results, an important next step in the ongoing debate, we systematically test these hypotheses by synthesizing 15 previous phylogenomic studies and performing new standardized analyses under consistent conditions with additional models. We find that Ctenophora-sister is recovered across the full range of examined conditions, and Porifera-sister is recovered in some analyses under narrow conditions when most outgroups are excluded and site-heterogeneous CAT models are used. We additionally find that the number of categories in site-heterogeneous models is sufficient to explain the Porifera-sister results. Furthermore, our cross-validation analyses show CAT models that recover Porifera-sister have hundreds of additional categories and fail to fit significantly better than site-heterogeneous models with far fewer categories. Systematic and standardized testing of diverse phylogenetic models suggests that we should be skeptical of Porifera-sister results both because they are recovered under such narrow conditions and because the models in these conditions fit the data no better than other models that recover Ctenophora-sister.


Subjects
Ctenophora , Animals , Phylogeny
6.
Mol Biol Evol ; 37(2): 599-603, 2020 02 01.
Article in English | MEDLINE | ID: mdl-31633786

ABSTRACT

Phylogenetic trees and data are often stored in incompatible and inconsistent formats. The outputs of software tools that contain trees with analysis findings are often not compatible with each other, making it hard to integrate the results of different analyses in a comparative study. The treeio package is designed to connect phylogenetic tree input and output. It supports extracting phylogenetic trees as well as the outputs of commonly used analytical software. It can link external data to phylogenies and merge tree data obtained from different sources, enabling analyses of phylogeny-associated data from different disciplines in an evolutionary context. Treeio also supports export of a phylogenetic tree with heterogeneous-associated data to a single tree file, including BEAST compatible NEXUS and jtree formats; these facilitate data sharing as well as file format conversion for downstream analysis. The treeio package is designed to work with the tidytree and ggtree packages. Tree data can be processed using the tidy interface with tidytree and visualized by ggtree. The treeio package is released within the Bioconductor and rOpenSci projects. It is available at https://www.bioconductor.org/packages/treeio/.


Subjects
Computational Biology/methods , Data Mining/methods , Internet , Phylogeny , Software
7.
Proc Natl Acad Sci U S A ; 115(3): E409-E417, 2018 01 16.
Article in English | MEDLINE | ID: mdl-29301966

ABSTRACT

There is considerable interest in comparing functional genomic data across species. One goal of such work is to provide an integrated understanding of genome and phenotype evolution. Most comparative functional genomic studies have relied on multiple pairwise comparisons between species, an approach that does not incorporate information about the evolutionary relationships among species. The statistical problems that arise from not considering these relationships can lead pairwise approaches to the wrong conclusions and are a missed opportunity to learn about biology that can only be understood in an explicit phylogenetic context. Here, we examine two recently published studies that compare gene expression across species with pairwise methods, and find reason to question the original conclusions of both. One study interpreted pairwise comparisons of gene expression as support for the ortholog conjecture, the hypothesis that orthologs tend to have more similar attributes (expression in this case) than paralogs. The other study interpreted pairwise comparisons of embryonic gene expression across distantly related animals as evidence for a distinct evolutionary process that gave rise to phyla. In each study, distinct patterns of pairwise similarity among species were originally interpreted as evidence of particular evolutionary processes, but instead, we find that they reflect species relationships. These reanalyses concretely show the inadequacy of pairwise comparisons for analyzing functional genomic data across species. It will be critical to adopt phylogenetic comparative methods in future functional genomic work. Fortunately, phylogenetic comparative biology is also a rapidly advancing field with many methods that can be directly applied to functional genomic data.


Subjects
Gene Expression/physiology , Genomics/methods , Vertebrates/metabolism , Algorithms , Animals , Molecular Evolution , Phylogeny , Software , Species Specificity , Vertebrates/genetics
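To make the statistical point of record 7 concrete: among n species there are n(n-1)/2 pairwise comparisons, but a fully bifurcating tree supports only n-1 independent contrasts. The sketch below computes Felsenstein's phylogenetically independent contrasts, one well-known phylogenetic comparative method chosen here purely for illustration (the abstract does not name the methods used in the reanalyses). The tree encoding, function name, and trait values are hypothetical.

```python
import math

def independent_contrasts(node):
    """Return (node_value, extra_branch_length, contrasts) for a subtree.

    A tip is a number (its trait value). An internal node is a pair
    ((left_subtree, left_branch_length), (right_subtree, right_branch_length)).
    """
    if isinstance(node, (int, float)):
        return float(node), 0.0, []
    (left, v_left), (right, v_right) = node
    x_left, extra_left, c_left = independent_contrasts(left)
    x_right, extra_right, c_right = independent_contrasts(right)
    # Lengthen each branch to account for uncertainty in the estimated node values.
    v_left += extra_left
    v_right += extra_right
    # Standardized contrast between the two daughter lineages.
    contrast = (x_left - x_right) / math.sqrt(v_left + v_right)
    # The ancestral value is a weighted average, weighting by inverse branch length.
    value = (x_left / v_left + x_right / v_right) / (1 / v_left + 1 / v_right)
    extra = v_left * v_right / (v_left + v_right)
    return value, extra, c_left + c_right + [contrast]

# Four species in two sister pairs; all branch lengths set to 1 (made-up data).
tree = (
    (((1.0, 1.0), (3.0, 1.0)), 1.0),
    (((5.0, 1.0), (9.0, 1.0)), 1.0),
)
root_value, _, contrasts = independent_contrasts(tree)

n_species = 4
n_pairwise = n_species * (n_species - 1) // 2
```

With four species the six pairwise comparisons reduce to three contrasts, so treating every pair as an independent observation would overstate the evidence.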
8.
Mol Phylogenet Evol ; 127: 823-833, 2018 10.
Article in English | MEDLINE | ID: mdl-29940256

ABSTRACT

Siphonophores are a diverse group of hydrozoans (Cnidaria) that are found at most depths of the ocean - from the surface, like the familiar Portuguese man of war, to the deep sea. They play important roles in ocean ecosystems, and are among the most abundant gelatinous predators. A previous phylogenetic study based on two ribosomal RNA genes provided insight into the internal relationships between major siphonophore groups. There was, however, little support for many deep relationships within the clade Codonophora. Here, we present a new siphonophore phylogeny based on new transcriptome data from 29 siphonophore species analyzed in combination with 14 publicly available genomic and transcriptomic datasets. We use this new phylogeny to reconstruct several traits that are central to siphonophore biology, including sexual system (monoecy vs. dioecy), gain and loss of zooid types, life history traits, and habitat. The phylogenetic relationships in this study are largely consistent with the previous phylogeny, but we find strong support for new clades within Codonophora that were previously unresolved. These results have important implications for trait evolution within Siphonophora, including favoring the hypothesis that monoecy arose at least twice.


Subjects
Hydrozoa/classification , Phylogeny , Heritable Quantitative Trait , Animals , Ecosystem , Genome , Hydrozoa/anatomy & histology , Hydrozoa/genetics , Likelihood Functions , Phenotype , Stochastic Processes
9.
Nature ; 480(7377): 364-7, 2011 Oct 26.
Article in English | MEDLINE | ID: mdl-22031330

ABSTRACT

Molluscs (snails, octopuses, clams and their relatives) have a great disparity of body plans and, among the animals, only arthropods surpass them in species number. This diversity has made Mollusca one of the best-studied groups of animals, yet their evolutionary relationships remain poorly resolved. Open questions have important implications for the origin of Mollusca and for morphological evolution within the group. These questions include whether the shell-less, vermiform aplacophoran molluscs diverged before the origin of the shelled molluscs (Conchifera) or lost their shells secondarily. Monoplacophorans were not included in molecular studies until recently, when it was proposed that they constitute a clade named Serialia together with Polyplacophora (chitons), reflecting the serial repetition of body organs in both groups. Attempts to understand the early evolution of molluscs become even more complex when considering the large diversity of Cambrian fossils. These can have multiple dorsal shell plates and sclerites or can be shell-less but with a typical molluscan radula and serially repeated gills. To better resolve the relationships among molluscs, we generated transcriptome data for 15 species that, in combination with existing data, represent for the first time all major molluscan groups. We analysed multiple data sets containing up to 216,402 sites and 1,185 gene regions using multiple models and methods. Our results support the clade Aculifera, containing the three molluscan groups with spicules but without true shells, and they support the monophyly of Conchifera. Monoplacophora is not the sister group to other Conchifera but to Cephalopoda. Strong support is found for a clade that comprises Scaphopoda (tusk shells), Gastropoda and Bivalvia, with most analyses placing Scaphopoda and Gastropoda as sister groups. This well-resolved tree will constitute a framework for further studies of mollusc evolution, development and anatomy.


Subjects
Mollusca/classification , Mollusca/genetics , Phylogeny , Transcriptome/genetics , Animals , Bivalvia/classification , Bivalvia/genetics , Cephalopoda/classification , Cephalopoda/genetics , Gastropoda/classification , Gastropoda/genetics , Gene Expression Profiling , Likelihood Functions , Biological Models , Species Specificity
10.
Syst Biol ; 64(6): 1048-58, 2015 Nov.
Article in English | MEDLINE | ID: mdl-26231182

ABSTRACT

The Swofford-Olsen-Waddell-Hillis (SOWH) test evaluates statistical support for incongruent phylogenetic topologies. It is commonly applied to determine if the maximum likelihood tree in a phylogenetic analysis is significantly different than an alternative hypothesis. The SOWH test compares the observed difference in log-likelihood between two topologies to a null distribution of differences in log-likelihood generated by parametric resampling. The test is a well-established phylogenetic method for topology testing, but it is sensitive to model misspecification, it is computationally burdensome to perform, and its implementation requires the investigator to make several decisions that each have the potential to affect the outcome of the test. We analyzed the effects of multiple factors using seven data sets to which the SOWH test was previously applied. These factors include the number of sample replicates, likelihood software, the introduction of gaps to simulated data, the use of distinct models of evolution for data simulation and likelihood inference, and a suggested test correction wherein an unresolved "zero-constrained" tree is used to simulate sequence data. To facilitate these analyses and future applications of the SOWH test, we wrote SOWHAT, a program that automates the SOWH test. We find that inadequate bootstrap sampling can change the outcome of the SOWH test. The results also show that using a zero-constrained tree for data simulation can result in a wider null distribution and higher p-values, but does not change the outcome of the SOWH test for most of the data sets tested here. These results will help others implement and evaluate the SOWH test and allow us to provide recommendations for future applications of the SOWH test. SOWHAT is available for download from https://github.com/josephryan/SOWHAT.


Subjects
Classification/methods , Computer Simulation , Phylogeny , Primulaceae/classification , Primulaceae/genetics , Software , Statistical Data Interpretation
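The final step of the SOWH test in record 10, comparing the observed log-likelihood difference to a null distribution, can be sketched as follows. This is not SOWHAT's code: the function name is hypothetical, the numbers are made up, and a real analysis would obtain the null differences by simulating data sets and re-estimating likelihoods with phylogenetic software.

```python
def sowh_p_value(observed_delta, null_deltas):
    """One-sided p-value: fraction of simulated differences >= the observed one.

    observed_delta: log-likelihood difference between the unconstrained and
                    constrained topologies on the empirical data.
    null_deltas:    the same difference computed on each simulated data set.
    """
    if not null_deltas:
        raise ValueError("the null distribution needs at least one replicate")
    n_as_extreme = sum(1 for delta in null_deltas if delta >= observed_delta)
    return n_as_extreme / len(null_deltas)

# Made-up values for illustration: 10 simulated replicates.
observed = 12.5
null = [0.4, 1.1, 2.3, 0.0, 3.8, 0.9, 13.0, 1.7, 2.2, 0.6]
p = sowh_p_value(observed, null)
```

With only ten replicates the smallest nonzero p-value is 0.1, which illustrates why the number of replicates can change the outcome of the test.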
11.
Nature ; 527(7579): 448-9, 2015 Nov 26.
Article in English | MEDLINE | ID: mdl-26580019
12.
Proc Biol Sci ; 282(1801): 20142332, 2015 Feb 22.
Article in English | MEDLINE | ID: mdl-25589608

ABSTRACT

Bivalves are an ancient and ubiquitous group of aquatic invertebrates with an estimated 10 000-20 000 living species. They are economically significant as a human food source, and ecologically important given their biomass and effects on communities. Their phylogenetic relationships have been studied for decades, and their unparalleled fossil record extends from the Cambrian to the Recent. Nevertheless, a robustly supported phylogeny of the deepest nodes, needed to fully exploit the bivalves as a model for testing macroevolutionary theories, is lacking. Here, we present the first phylogenomic approach for this important group of molluscs, including novel transcriptomic data for 31 bivalves obtained through an RNA-seq approach, and analyse these data with published genomes and transcriptomes of other bivalves plus outgroups. Our results provide a well-resolved, robust phylogenetic backbone for Bivalvia with all major lineages delineated, addressing long-standing questions about the monophyly of Protobranchia and Heterodonta, and resolving the position of particular groups such as Palaeoheterodonta, Archiheterodonta and Anomalodesmata. This now fully resolved backbone demonstrates that genomic approaches using hundreds of genes are feasible for resolving phylogenetic questions in bivalves and other animals.


Subjects
Bivalvia/classification , Bivalvia/genetics , Transcriptome , Animals , Complementary DNA/genetics , Molecular Sequence Data , Phylogeny , RNA Sequence Analysis
13.
J Exp Zool B Mol Dev Evol ; 324(5): 435-49, 2015 Jul.
Article in English | MEDLINE | ID: mdl-26036693

ABSTRACT

The siphonophore Nanomia bijuga is a pelagic hydrozoan (Cnidaria) with complex morphological organization. Each siphonophore is made up of many asexually produced, genetically identical zooids that are functionally specialized and morphologically distinct. These zooids predominantly arise by budding in two growth zones, and are arranged in precise patterns. This study describes the cellular anatomy of several zooid types, the stem, and the gas-filled float, called the pneumatophore. The distribution of cellular morphologies across zooid types enhances our understanding of zooid function. The unique absorptive cells in the palpon, for example, indicate specialized intracellular digestive processing in this zooid type. Though cnidarians are usually thought of as mono-epithelial, we characterize at least two cellular populations in this species which are not connected to a basement membrane. This work provides a greater understanding of epithelial diversity within the cnidarians, and will be a foundation for future studies on N. bijuga, including functional assays and gene expression analyses.


Subjects
Hydrozoa/anatomy & histology , Animals , Hydrozoa/cytology , Nervous System/anatomy & histology , Nervous System/cytology
14.
Proc Biol Sci ; 281(1794): 20141739, 2014 11 07.
Article in English | MEDLINE | ID: mdl-25232139

ABSTRACT

Gastropods are a highly diverse clade of molluscs that includes many familiar animals, such as limpets, snails, slugs and sea slugs. It is one of the most abundant groups of animals in the sea and the only molluscan lineage that has successfully colonized land. Yet the relationships among and within its constituent clades have remained in flux for over a century of morphological, anatomical and molecular study. Here, we re-evaluate gastropod phylogenetic relationships by collecting new transcriptome data for 40 species and analysing them in combination with publicly available genomes and transcriptomes. Our datasets include all five main gastropod clades: Patellogastropoda, Vetigastropoda, Neritimorpha, Caenogastropoda and Heterobranchia. We use two different methods to assign orthology, subsample each of these matrices into three increasingly dense subsets, and analyse all six of these supermatrices with two different models of molecular evolution. All 12 analyses yield the same unrooted network connecting the five major gastropod lineages. This reduces deep gastropod phylogeny to three alternative rooting hypotheses. These results reject the prevalent hypothesis of gastropod phylogeny, Orthogastropoda. Our dated tree is congruent with a possible end-Permian recovery of some gastropod clades, namely Caenogastropoda and some Heterobranchia subclades.


Subjects
Molecular Evolution , Gastropoda/classification , Gastropoda/genetics , Genome/genetics , Phylogeny , Transcriptome/genetics , Animals , RNA Sequence Analysis
15.
Bioinformatics ; 29(23): 2959-63, 2013 Dec 01.
Article in English | MEDLINE | ID: mdl-24021385

ABSTRACT

MOTIVATION: Draft de novo genome assemblies are now available for many organisms. These assemblies are point estimates of the true genome sequences. Each is a specific hypothesis, drawn from among many alternative hypotheses, of the sequence of a genome. Assembly uncertainty, the inability to distinguish between multiple alternative assembly hypotheses, can be due to real variation between copies of the genome in the sample, errors and ambiguities in the sequenced data and assumptions and heuristics of the assemblers. Most assemblers select a single assembly according to ad hoc criteria, and do not yet report and quantify the uncertainty of their outputs. Those assemblers that do report uncertainty take different approaches to describing multiple assembly hypotheses and the support for each. RESULTS: Here we review and examine the problem of representing and measuring uncertainty in assemblies. A promising recent development is the implementation of assemblers that are built according to explicit statistical models. Some new assembly methods, for example, estimate and maximize assembly likelihood. These advances, combined with technical advances in the representation of alternative assembly hypotheses, will lead to a more complete and biologically relevant understanding of assembly uncertainty. This will in turn facilitate the interpretation of downstream analyses and tests of specific biological hypotheses.


Subjects
Computational Biology , High-Throughput Nucleotide Sequencing/methods , DNA Sequence Analysis/methods
16.
Nature ; 452(7188): 745-9, 2008 Apr 10.
Article in English | MEDLINE | ID: mdl-18322464

ABSTRACT

Long-held ideas regarding the evolutionary relationships among animals have recently been upended by sometimes controversial hypotheses based largely on insights from molecular data. These new hypotheses include a clade of moulting animals (Ecdysozoa) and the close relationship of the lophophorates to molluscs and annelids (Lophotrochozoa). Many relationships remain disputed, including those that are required to polarize key features of character evolution, and support for deep nodes is often low. Phylogenomic approaches, which use data from many genes, have shown promise for resolving deep animal relationships, but are hindered by a lack of data from many important groups. Here we report a total of 39.9 Mb of expressed sequence tags from 29 animals belonging to 21 phyla, including 11 phyla previously lacking genomic or expressed-sequence-tag data. Analysed in combination with existing sequences, our data reinforce several previously identified clades that split deeply in the animal tree (including Protostomia, Ecdysozoa and Lophotrochozoa), unambiguously resolve multiple long-standing issues for which there was strong conflicting support in earlier studies with less data (such as velvet worms rather than tardigrades as the sister group of arthropods), and provide molecular support for the monophyly of molluscs, a group long recognized by morphologists. In addition, we find strong support for several new hypotheses. These include a clade that unites annelids (including sipunculans and echiurans) with nemerteans, phoronids and brachiopods, molluscs as sister to that assemblage, and the placement of ctenophores as the earliest diverging extant multicellular animals. A single origin of spiral cleavage (with subsequent losses) is inferred from well-supported nodes. Many relationships between a stable subset of taxa find strong support, and a diminishing number of lineages remain recalcitrant to placement on the tree.


Subjects
Classification/methods , Phylogeny , Animals , Bayes Theorem , Computational Biology , Genetic Databases , Molecular Evolution , Expressed Sequence Tags , Gene Library , Humans , Markov Chains , Reproducibility of Results , Sample Size , Sensitivity and Specificity
17.
Nat Ecol Evol ; 8(2): 325-338, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38182680

ABSTRACT

The origin and evolution of cell types has emerged as a key topic in evolutionary biology. Driven by rapidly accumulating single-cell datasets, recent attempts to infer cell type evolution have largely been limited to pairwise comparisons because we lack approaches to build cell phylogenies using model-based approaches. Here we approach the challenges of applying explicit phylogenetic methods to single-cell data by using principal components as phylogenetic characters. We infer a cell phylogeny from a large, comparative single-cell dataset of eye cells from five distantly related mammals. Robust cell type clades enable us to provide a phylogenetic, rather than phenetic, definition of cell type, allowing us to forgo marker genes and phylogenetically classify cells by topology. We further observe evolutionary relationships between diverse vessel endothelia and identify the myelinating and non-myelinating Schwann cells as sister cell types. Finally, we examine principal component loadings and describe the gene expression dynamics underlying the function and identity of cell type clades that have been conserved across the five species. A cell phylogeny provides a rigorous framework towards investigating the evolutionary history of cells and will be critical to interpret comparative single-cell datasets that aim to ask fundamental evolutionary questions.


Subjects
Mammals , Animals , Phylogeny , RNA Sequence Analysis , Mammals/genetics
18.
Theory Biosci ; 143(1): 45-62, 2024 Feb.
Article in English | MEDLINE | ID: mdl-37947999

ABSTRACT

Counting transcripts of mRNA is a key method of observation in modern biology. With advances in counting transcripts in single cells (single-cell RNA sequencing or scRNA-seq), these data are routinely used to identify cells by their transcriptional profile, and to identify genes with differential cellular expression. Because the total number of transcripts counted per cell can vary for technical reasons, the first step of many commonly used scRNA-seq workflows is to normalize by sequencing depth, transforming counts into proportional abundances. The primary objective of this step is to reshape the data such that cells with similar biological proportions of transcripts end up with similar transformed measurements. But there is growing concern that normalization and other transformations result in unintended distortions that hinder both analyses and the interpretation of results. This has led to an intense focus on optimizing methods for normalization and transformation of scRNA-seq data. Here, we take an alternative approach, by avoiding normalization and transformation altogether. We abandon the use of distances to compare cells, and instead use a restricted algebra, motivated by measurement theory and abstract algebra, that preserves the count nature of the data. We demonstrate that this restricted algebra is sufficient to draw meaningful and practical comparisons of gene expression through the use of the dot product and other elementary operations. This approach sidesteps many of the problems with common transformations, and has the added benefit of being simpler and more intuitive. We implement our approach in the package countland, available in python and R.


Subjects
Single-Cell Analysis , Software , RNA Sequence Analysis/methods , Single-Cell Analysis/methods , Gene Expression Profiling/methods
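The contrast drawn in record 18, between depth normalization and comparing cells directly on integer counts with a dot product, can be sketched in plain Python. This does not use or reproduce the countland API; the function names and the toy count vectors are hypothetical.

```python
def depth_normalize(counts):
    """Conventional first step: convert counts to proportional abundances."""
    total = sum(counts)
    if total == 0:
        raise ValueError("cell has no counted transcripts")
    return [c / total for c in counts]

def count_dot_product(cell_a, cell_b):
    """Compare two cells on raw integer counts; the result is itself an integer."""
    if len(cell_a) != len(cell_b):
        raise ValueError("cells must be measured over the same genes")
    return sum(a * b for a, b in zip(cell_a, cell_b))

# Toy counts over four genes (made-up values).
cell_a = [4, 0, 2, 0]
cell_b = [2, 0, 1, 0]   # same genes expressed as cell_a, at half the depth
cell_c = [0, 3, 0, 3]   # expresses a disjoint set of genes

similar = count_dot_product(cell_a, cell_b)    # 4*2 + 2*1 = 10
disjoint = count_dot_product(cell_a, cell_c)   # no shared genes, so 0
```

The dot product stays within the integers and is zero exactly when two cells share no expressed genes, whereas `depth_normalize` maps `cell_a` and `cell_b` to the same proportions and discards the difference in depth.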
19.
Genome Biol Evol ; 16(3)2024 03 02.
Article in English | MEDLINE | ID: mdl-38502059

ABSTRACT

Siphonophores (Cnidaria: Hydrozoa) are abundant predators found throughout the ocean and are important constituents of the global zooplankton community. They range in length from a few centimeters to tens of meters. They are gelatinous, fragile, and difficult to collect, so many aspects of the biology of these roughly 200 species remain poorly understood. To survey siphonophore genome diversity, we performed Illumina sequencing of 32 species sampled broadly across the phylogeny. Sequencing depth was sufficient to estimate nuclear genome size from k-mer spectra in six specimens, ranging from 0.7 to 2.3 Gb, with heterozygosity estimates between 0.69% and 2.32%. Incremental k-mer counting indicates k-mer peaks can be absent with nearly 20× read coverage, suggesting minimum genome sizes range from 1.4 to 5.6 Gb in the 25 samples without peaks in the k-mer spectra. This work confirms most siphonophore nuclear genomes are large relative to the genomes of other cnidarians, but also identifies several with reduced size that are tractable targets for future siphonophore nuclear genome assembly projects. We also assembled complete mitochondrial genomes for 33 specimens from these new data, indicating a conserved gene order shared among nonsiphonophore hydrozoans, Cystonectae, and some Physonectae, revealing the ancestral mitochondrial gene order of siphonophores. Our results also suggest extensive rearrangement of mitochondrial genomes within other Physonectae and in Calycophorae. Though siphonophores comprise a small fraction of cnidarian species, this survey greatly expands our understanding of cnidarian genome diversity. This study further illustrates both the importance of deep phylogenetic sampling and the utility of k-mer-based genome skimming in understanding the genomic diversity of a clade.


Subjects
Cnidaria , Mitochondrial Genome , Hydrozoa , Animals , Cnidaria/genetics , Phylogeny , Hydrozoa/genetics , Genomics , Genome Size
20.
BMC Bioinformatics ; 14: 330, 2013 Nov 19.
Article in English | MEDLINE | ID: mdl-24252138

ABSTRACT

BACKGROUND: In the past decade, transcriptome data have become an important component of many phylogenetic studies. They are a cost-effective source of protein-coding gene sequences, and have helped projects grow from a few genes to hundreds or thousands of genes. Phylogenetic studies now regularly include genes from newly sequenced transcriptomes, as well as publicly available transcriptomes and genomes. Implementing such a phylogenomic study, however, is computationally intensive, requires the coordinated use of many complex software tools, and includes multiple steps for which no published tools exist. Phylogenomic studies have therefore been manual or semiautomated. In addition to taking considerable user time, this makes phylogenomic analyses difficult to reproduce, compare, and extend. In addition, methodological improvements made in the context of one study often cannot be easily applied and evaluated in the context of other studies. RESULTS: We present Agalma, an automated tool that constructs matrices for phylogenomic analyses. The user provides raw Illumina transcriptome data, and Agalma produces annotated assemblies, aligned gene sequence matrices, a preliminary phylogeny, and detailed diagnostics that allow the investigator to make extensive assessments of intermediate analysis steps and the final results. Sequences from other sources, such as externally assembled genomes and transcriptomes, can also be incorporated in the analyses. Agalma is built on the BioLite bioinformatics framework, which tracks provenance, profiles processor and memory use, records diagnostics, manages metadata, installs dependencies, logs version numbers and calls to external programs, and enables rich HTML reports for all stages of the analysis. Agalma includes a small test data set and a built-in test analysis of these data. In addition to describing Agalma, we here present a sample analysis of a larger seven-taxon data set. Agalma is available for download at https://bitbucket.org/caseywdunn/agalma. CONCLUSIONS: Agalma allows complex phylogenomic analyses to be implemented and described unambiguously as a series of high-level commands. This will enable phylogenomic studies to be readily reproduced, modified, and extended. Agalma also facilitates methods development by providing a complete modular workflow, bundled with test data, that will allow further optimization of each step in the context of a full phylogenomic analysis.


Subjects
Gene Expression Profiling/methods , Genomics/methods , Phylogeny , Software , Genome , DNA Sequence Analysis/methods