ABSTRACT
Most eukaryotic lineages are microbial, and many have only recently been sampled for phylogenetic studies or remain in the "dark area" of the tree of life where there are no molecular data. To assess relationships among eukaryotic lineages, we perform a taxon-rich phylogenomic analysis including 232 eukaryotes selected to maximize taxonomic diversity and up to 1554 genes chosen as vertically inherited based on their broad distribution among eukaryotes. We also include sequences from 486 bacteria and 84 archaea to assess the impact of endosymbiotic gene transfer (EGT) from plastids and to detect contamination. Overall, our analyses are consistent with other less taxon-rich estimates of the eukaryotic tree of life, and we recover strong support for five major clades: Amoebozoa, Excavata (without the genus Malawimonas), Opisthokonta, Archaeplastida, and SAR (Stramenopila, Alveolata, and Rhizaria). Our analyses also highlight the existence of "orphan" lineages, lineages that lack robust placement in the eukaryotic tree of life, and indicate the possibility of as yet undiscovered diversity. In analyses including bacteria and archaea, we find that approximately 10% of the 1554 genes, which we choose because they are found in four or five of the five major eukaryotic clades and hence may be more likely to be inherited vertically, appear to have been acquired from cyanobacteria through EGT in photosynthetic lineages. Removing these EGT genes places the green algae as sister to the glaucophytes instead of the red algae, suggesting that unknowingly including genes of plastid origin, and combining them with genes of nuclear origin, may mislead phylogenetic estimates. Finally, the large size of our data set allows comparative analyses of subsets of data; alignments built from randomly sampled sites provide greater support, particularly for deep relationships, than do equivalent-sized data sets built from randomly sampled genes.
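The closing comparison (randomly sampled sites versus randomly sampled genes) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the toy alignment, the taxon and gene names, and the helpers `concatenate`, `sample_sites`, and `sample_genes` are all hypothetical.

```python
import random

def concatenate(genes):
    """Join per-gene alignments (gene -> {taxon: sequence}) into one supermatrix."""
    taxa = sorted({t for aln in genes.values() for t in aln})
    matrix = {t: "" for t in taxa}
    for name in sorted(genes):
        aln = genes[name]
        length = len(next(iter(aln.values())))
        for t in taxa:
            # Taxa missing from a gene are padded with gap characters.
            matrix[t] += aln.get(t, "-" * length)
    return matrix

def sample_sites(genes, n_sites, rng):
    """Draw n_sites columns at random, without replacement, from the full supermatrix."""
    matrix = concatenate(genes)
    total = len(next(iter(matrix.values())))
    cols = sorted(rng.sample(range(total), n_sites))
    return {t: "".join(seq[c] for c in cols) for t, seq in matrix.items()}

def sample_genes(genes, n_sites, rng):
    """Add whole genes in random order until n_sites columns are reached, then trim."""
    order = list(genes)
    rng.shuffle(order)
    chosen, total = {}, 0
    for name in order:
        chosen[name] = genes[name]
        total += len(next(iter(genes[name].values())))
        if total >= n_sites:
            break
    matrix = concatenate(chosen)
    return {t: seq[:n_sites] for t, seq in matrix.items()}

# Hypothetical toy data: three short "genes", one of them missing a taxon.
genes = {
    "geneA": {"taxon1": "ACGTAC", "taxon2": "ACGTTC", "taxon3": "ACCTAC"},
    "geneB": {"taxon1": "GGGTTA", "taxon2": "GGGTCA"},
    "geneC": {"taxon1": "TTAACC", "taxon2": "TTAGCC", "taxon3": "TTAACG"},
}
by_sites = sample_sites(genes, 8, random.Random(0))
by_genes = sample_genes(genes, 8, random.Random(0))
```

Both subsets have the same number of columns, so any difference in support between trees built from them would reflect how the columns were chosen rather than how many there are.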
Subjects
Classification/methods, Eukaryota/classification, Eukaryota/genetics, Phylogeny, Archaea/genetics, Bacteria/genetics, Plastids/genetics
ABSTRACT
The first analyses of gene sequence data indicated that the eukaryotic tree of life consisted of a long stem of microbial groups "topped" by a crown containing plants, animals, and fungi and their microbial relatives. Although more recent multigene concatenated analyses have refined the relationships among the many branches of eukaryotes, the root of the eukaryotic tree of life has remained elusive. Inferring the root of extant eukaryotes is challenging because of the age of the group (~1.7-2.1 billion years old), tremendous heterogeneity in rates of evolution among lineages, and lack of obvious outgroups for many genes. Here, we reconstruct a rooted phylogeny of extant eukaryotes based on minimizing the number of duplications and losses among a collection of gene trees. This approach does not require outgroup sequences or assumptions of orthology among sequences. We also explore the impact of taxon and gene sampling and assess support for alternative hypotheses for the root. Using 20 gene trees from 84 diverse eukaryotic lineages, this approach recovers robust eukaryotic clades and reveals evidence for a eukaryotic root that lies between the Opisthokonta (animals, fungi and their microbial relatives) and all remaining eukaryotes.
Subjects
Eukaryota/genetics, Molecular Evolution, Phylogeny, Ribosomal DNA/genetics, Gene Duplication, Genome, DNA Sequence Analysis
ABSTRACT
BACKGROUND: Onchocerca volvulus is a filarial parasite that is a major cause of dermatitis and blindness in endemic regions primarily in sub-Saharan Africa. Widespread efforts to control the disease caused by O. volvulus infection (onchocerciasis) began in 1974 and in recent years, following successful elimination of transmission in much of the Americas, the focus of efforts in Africa has moved from control to the more challenging goal of elimination of transmission in all endemic countries. Mass drug administration (MDA) with ivermectin has reached more than 150 million people and elimination of transmission has been confirmed in four South American countries, with at least two African countries having now stopped MDA as they approach verification of elimination. It is essential that accurate data for active transmission are used to assist in making the critical decision to stop MDA, since missing low levels of transmission and infection can lead to continued spread or recrudescence of the disease. METHODOLOGY/PRINCIPAL FINDINGS: Current World Health Organization guidelines for MDA stopping decisions and post-treatment surveillance include screening pools of the Simulium blackfly vector for the presence of O. volvulus larvae using a PCR-ELISA-based molecular technique. In this study, we address the potential of an updated, practical, standardized molecular diagnostic tool with increased sensitivity and species-specificity by comparing several candidate qPCR assays. When paired with heat-stable reagents, a qPCR assay with a mitochondrial DNA target (OvND5) was found to be more sensitive and species-specific than an O150 qPCR, which targets a non-protein coding repetitive DNA sequence. The OvND5 assay detected 19/20 pools of 100 blackfly heads spiked with a single L3, compared to 16/20 for the O150 qPCR assay. 
CONCLUSIONS/SIGNIFICANCE: Given the improved sensitivity, species-specificity and resistance to PCR inhibitors, we identified OvND5 as the optimal target for field sample detection. All reagents for this assay can be shipped at room temperature with no loss of activity. The qPCR protocol we propose is also simpler, faster, and more cost-effective than the current end-point molecular assays.
Subjects
Intestinal Volvulus, Onchocerca volvulus, Onchocerciasis, Simuliidae, Animals, Humans, Mitochondrial DNA, Ivermectin/therapeutic use, Onchocerca/genetics, Onchocerca volvulus/genetics, Onchocerciasis/drug therapy, Simuliidae/parasitology
ABSTRACT
BACKGROUND: Elimination and control of Schistosoma japonicum, the most virulent of the schistosomiasis-causing blood flukes, requires the development of sensitive and specific diagnostic tools capable of providing an accurate measurement of the infection prevalence in endemic areas. Typically, detection of S. japonicum has occurred using the Kato-Katz technique, but this methodology, which requires skilled microscopists, has been shown to radically underestimate levels of infection. With the ever-improving capabilities of next-generation sequencing and bioinformatic analysis tools, identification of satellite sequences and other highly repetitive genomic elements for use as real-time PCR diagnostic targets is becoming increasingly common. Assays developed using these targets have the ability to improve the sensitivity and specificity of results for epidemiological studies that can in turn be used to inform mass drug administration and programmatic decision making. METHODOLOGY/PRINCIPAL FINDINGS: Utilizing Tandem Repeat Analyzer (TAREAN) and RepeatExplorer2, a cluster-based analysis of the S. japonicum genome was performed and a tandemly arranged genomic repeat, which we named SjTR1 (Schistosoma japonicum Tandem Repeat 1), was selected as the target for a real-time PCR diagnostic assay. Based on these analyses, a primer/probe set was designed and the assay was optimized. The resulting real-time PCR test was shown to reliably detect as little as 200 ag of S. japonicum genomic DNA and as little as 1 egg per gram of human stool. Based on these results, the index assay reported in this manuscript is more sensitive than previously published real-time PCR assays for the detection of S. japonicum. CONCLUSIONS/SIGNIFICANCE: The extremely sensitive and specific diagnostic assay described in this manuscript will facilitate the accurate detection of S. japonicum, particularly in regions with low levels of endemicity. 
This assay will be useful in providing data to inform programmatic decision makers, aiding disease control and elimination efforts.
Subjects
Feces/parasitology, Real-Time Polymerase Chain Reaction/methods, Schistosoma japonicum/isolation & purification, Schistosomiasis japonica/parasitology, Animals, DNA Primers/genetics, Female, Humans, Male, Schistosoma japonicum/genetics, Schistosomiasis japonica/diagnosis, Sensitivity and Specificity
ABSTRACT
Six eukaryotic supergroups have been proposed based on both morphological and molecular data. However, some of these supergroups are contentious and the deep relationships among them are poorly resolved. This is due to a limited number of morphological characters and few molecular markers in current use. The lack of resolution in most multigene analyses, including phylogenomic analyses, necessitates a search for additional, appropriate molecular markers to enable targeted sampling of taxa in key phylogenetic positions. We evaluated the phylogenetic signal of 860 proteins obtained from the Clusters of Orthologous Groups of proteins (COGs) database. We report a total of 17 markers that resulted in well-resolved topologies that are congruent with well-established components of the eukaryotic tree. To establish their utility, we designed universal degenerate primers for six markers, some of which showed promising results in unicellular eukaryotes. Finally, we present phylogenetic informativeness profiles for seven selected markers, revealing that the markers contain phylogenetic signal that spans the whole tree including the deeper branches.
Subjects
Eukaryota/genetics, Molecular Evolution, Phylogeny, Protein Databases, Eukaryota/classification, Genetic Markers, DNA Sequence Analysis
ABSTRACT
BACKGROUND: Optimization of polymerase chain reaction (PCR)-based diagnostics requires the careful selection of molecular targets that are both highly repetitive and pathogen-specific. Advances in both next-generation sequencing (NGS) technologies and bioinformatics-based analysis tools are facilitating this selection process, informing target choices and reducing labor. Once developed, such assays provide disease control and elimination programs with an additional set of tools capable of evaluating and monitoring intervention successes. The importance of such tools is heightened as intervention efforts approach their endpoints, as accurate and complete information is an essential component of the informed decision-making process. As global efforts for the control and elimination of both lymphatic filariasis and malaria continue to make significant gains, the benefits of diagnostics with improved analytical and clinical/field-based sensitivities and specificities will become increasingly apparent. METHODOLOGY/PRINCIPAL FINDINGS: Coupling Illumina-based NGS with informatics approaches, we have successfully identified the tandemly repeated elements in both the Wuchereria bancrofti and Plasmodium falciparum genomes of putatively greatest copy number. Utilizing these sequences as quantitative real-time PCR (qPCR)-based targets, we have developed assays capable of exploiting the most abundant tandem repeats for both organisms. For the detection of P. falciparum, analysis and development resulted in an assay with improved analytical and field-based sensitivity vs. an established ribosomal sequence-targeting assay. Surprisingly, analysis of the W. bancrofti genome predicted a ribosomal sequence to be the genome's most abundant tandem repeat. 
While resulting cycle quantification values comparing a qPCR assay targeting this ribosomal sequence and a commonly targeted repetitive DNA sequence from the literature supported our finding that this ribosomal sequence was the most prevalent tandemly repeated target in the W. bancrofti genome, the resulting assay did not significantly improve detection sensitivity in conjunction with field sample testing. CONCLUSIONS/SIGNIFICANCE: Examination of pathogen genomes facilitates the development of PCR-based diagnostics targeting the most abundant and specific genomic elements. While in some instances currently available tools may deliver equal or superior performance, systematic analysis of potential targets provides confidence that the selected assays represent the most advantageous options available and that informed assay selection is occurring in the context of a particular study's objectives.
Subjects
Culicidae/parasitology, Plasmodium falciparum/isolation & purification, Real-Time Polymerase Chain Reaction/methods, Tandem Repeat Sequences, Wuchereria bancrofti/isolation & purification, Animals, Helminth DNA, Plasmodium falciparum/genetics, Wuchereria bancrofti/genetics
ABSTRACT
Eukaryotic parasites are significant contributors to childhood illness in Niger. While helminthiases have received national attention through mass deworming efforts, the epidemiology of intestinal protozoa in Niger remains underexamined. This study employed real-time PCR diagnostics to describe the prevalence of two schistosomes, four soil-transmitted helminths, and one protozoan parasite in Boboye Department, Dosso Region. Prevalence was assessed using bulk stool specimens collected from a population-based sample of 86 children residing in 9 communities. Anthropometric measurements were used to calculate child growth z-scores and stool consistency was graded. Helminths were absent from the study population, with the exception of a single Schistosoma haematobium infection (1/86; 1.2%). Giardia duodenalis was the only protozoan present, detected in 65% (56/86) of children. Prevalence of G. duodenalis peaked in 2-year-olds with 88% (15/17) positivity. The population was generally undernourished, though growth indices did not differ significantly between children with and without G. duodenalis infection.
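The prevalence figures quoted in this abstract follow directly from the reported counts. A minimal sketch that recomputes them is given here; the function name `prevalence_pct` and the dictionary labels are illustrative assumptions, while the counts themselves are taken from the abstract.

```python
def prevalence_pct(positives, total):
    """Return prevalence as a percentage of the sampled population."""
    return 100.0 * positives / total

# Counts as reported in the abstract: (positive children, children tested).
reported = {
    "S. haematobium": (1, 86),
    "G. duodenalis, all children": (56, 86),
    "G. duodenalis, 2-year-olds": (15, 17),
}

for label, (positives, total) in reported.items():
    pct = prevalence_pct(positives, total)
    print(f"{label}: {positives}/{total} = {pct:.1f}%")
```

Rounded to the precision used in the abstract, these give 1.2%, 65%, and 88% respectively.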
ABSTRACT
Though bulk stool remains the gold standard specimen type for enteropathogen diagnosis, rectal swabs may offer comparable sensitivity with greater ease of collection for select pathogens. This study sought to evaluate the validity and reproducibility of rectal swabs as a sample collection method for the molecular diagnosis of Giardia duodenalis. Paired rectal swab and bulk stool samples were collected from 86 children ages 0-4 years living in southwest Niger, with duplicate samples collected among a subset of 50 children. Infection was detected using a previously validated real-time PCR diagnostic targeting the small subunit ribosomal RNA gene. Giardia duodenalis was detected in 65.5% (55/84) of bulk stool samples and 44.0% (37/84) of swab samples. The kappa evaluating test agreement was 0.81 (95% CI: 0.54-1.00) among duplicate stool samples (N = 49) and 0.75 (95% CI: 0.47-1.00) among duplicate rectal swabs (N = 48). Diagnostic sensitivity was 93% (95% CI: 84-98) by bulk stool and 63% (95% CI: 49-75) by rectal swabs. When restricting to the lowest three quartiles of bulk stool quantitation cycle values (an indication of relatively high parasite load), sensitivity by rectal swabs increased to 78.0% (95% CI: 64-89, P < 0.0001). These findings suggest that rectal swabs provide less sensitive and reproducible results than bulk stool for the real-time PCR diagnosis of G. duodenalis. However, their fair sensitivity for higher parasite loads suggests that swabs may be a useful tool for detecting higher burden infections when stool collection is excessively expensive or logistically challenging.
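The kappa values reported for duplicate samples measure agreement beyond what chance alone would produce. A minimal sketch of Cohen's kappa for two binary test results is shown here; the function name `cohens_kappa` and the 2x2 counts in the example are hypothetical and are not the study's data, which the abstract does not tabulate.

```python
def cohens_kappa(both_pos, first_only, second_only, both_neg):
    """Cohen's kappa for two binary ratings summarized as a 2x2 agreement table."""
    n = both_pos + first_only + second_only + both_neg
    observed = (both_pos + both_neg) / n        # proportion of samples where results agree
    p_first = (both_pos + first_only) / n       # positivity rate of the first replicate
    p_second = (both_pos + second_only) / n     # positivity rate of the second replicate
    # Agreement expected by chance if the two replicates were independent.
    expected = p_first * p_second + (1 - p_first) * (1 - p_second)
    return (observed - expected) / (1 - expected)

# Hypothetical counts for 50 duplicate pairs (illustration only).
kappa = cohens_kappa(both_pos=30, first_only=3, second_only=2, both_neg=15)
print(f"kappa = {kappa:.2f}")
```

A kappa of 1 indicates perfect agreement and 0 indicates agreement no better than chance, which is why the reported values of 0.81 and 0.75 are read as a comparison of reproducibility between specimen types.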
Subjects
Giardia lamblia/isolation & purification, Giardiasis/diagnosis, Specimen Handling/methods, Preschool Child, Routine Diagnostic Tests, Feces/parasitology, Female, Giardia lamblia/genetics, Giardiasis/parasitology, Humans, Infant, Newborn Infant, Male, Real-Time Polymerase Chain Reaction, Rectum/parasitology, Reproducibility of Results
ABSTRACT
There is growing interest in local elimination of soil-transmitted helminth (STH) infection in endemic settings. In such settings, highly sensitive diagnostics are needed to detect STH infection. We compared double-slide Kato-Katz, the most commonly used copromicroscopic detection method, to multi-parallel quantitative polymerase chain reaction (qPCR) in 2,799 stool samples from children aged 2-12 years in a setting in rural Bangladesh with predominantly low STH infection intensity. We estimated the sensitivity and specificity of each diagnostic using Bayesian latent class analysis. Compared to double-slide Kato-Katz, STH prevalence using qPCR was almost 3-fold higher for hookworm species and nearly 2-fold higher for Trichuris trichiura. Ascaris lumbricoides prevalence was lower using qPCR, and 26% of samples classified as A. lumbricoides positive by Kato-Katz were negative by qPCR. Amplicon sequencing of the 18S rDNA from 10 samples confirmed that A. lumbricoides was absent in samples classified as positive by Kato-Katz and negative by qPCR. The sensitivity of Kato-Katz was 49% for A. lumbricoides, 32% for hookworm, and 52% for T. trichiura; the sensitivity of qPCR was 79% for A. lumbricoides, 93% for hookworm, and 90% for T. trichiura. Specificity was ≥ 97% for both tests for all STH except for Kato-Katz for A. lumbricoides (specificity = 68%). There were moderate negative, monotonic correlations between qPCR cycle quantification values and eggs per gram quantified by Kato-Katz. While it is widely assumed that double-slide Kato-Katz has few false positives, our results indicate otherwise and highlight inherent limitations of the Kato-Katz technique. qPCR had higher sensitivity than Kato-Katz in this low intensity infection setting.
Subjects
Helminthiasis/diagnosis, Parasitic Intestinal Diseases/diagnosis, Microbiological Techniques/methods, Molecular Diagnostic Techniques/methods, Real-Time Polymerase Chain Reaction/methods, Ancylostomatoidea/isolation & purification, Animals, Ascaris lumbricoides/isolation & purification, Bangladesh, Child, Preschool Child, Helminth DNA/genetics, Ribosomal DNA/genetics, Feces/parasitology, Female, Humans, Infant, Male, 18S Ribosomal RNA/genetics, Rural Population, Sensitivity and Specificity, Trichuris/isolation & purification
ABSTRACT
The balance of expense and ease of use vs. specificity and sensitivity in diagnostic assays for helminth disease is an important consideration, with expense and ease often winning out in endemic areas where funds and sophisticated equipment may be scarce. In this review, we argue that molecular diagnostics, specifically new assays that have been developed with the aid of next-generation sequence data and robust bioinformatic tools, more than make up for their expense with the benefit of a clear and precise assessment of the situation on the ground. Elimination efforts associated with the London Declaration and the World Health Organization (WHO) 2020 Roadmap have resulted in areas of low disease incidence and reduced infection burdens. An accurate assessment of infection levels is critical for determining where and when the programs can be successfully ended. Thus, more sensitive assays are needed in locations where elimination efforts are approaching a successful conclusion. Although microscopy or more general PCR targets have a role to play, they can mislead and cause study results to be confounded. Hyper-specific qPCR assays enable a more definitive assessment of the situation in the field, as well as of shifting dynamics and emerging diseases.
ABSTRACT
BACKGROUND: Molecular-based surveys have indicated that Ancylostoma ceylanicum, a zoonotic hookworm, is likely the second most prevalent hookworm species infecting humans in Asia. Most current PCR-based diagnostic options for the detection of Ancylostoma species target the Internal Transcribed Spacer (ITS) regions of the ribosomal gene cluster. These regions possess a considerable degree of conservation among the species of this genus and this conservation can lead to the misidentification of infecting species or require additional labor for accurate species-level determination. We have developed a novel, real-time PCR-based assay for the sensitive and species-specific detection of A. ceylanicum that targets a non-coding, highly repetitive genomic DNA element. Comparative testing of this PCR assay with an assay that targets ITS sequences was conducted on field-collected samples from Argentina and Timor-Leste to provide further evidence of the sensitivity and species-specificity of this assay. METHODS/PRINCIPAL FINDINGS: A previously described platform for the design of primers/probe targeting non-coding highly repetitive regions was used for the development of this novel assay. The assay's limits of detection (sensitivity) and cross-reactivity with other soil-transmitted helminth species (specificity) were assessed with real-time PCR experiments. The assay was successfully used to identify infections caused by A. ceylanicum that were previously only identified to the genus level as Ancylostoma spp. when analyzed using other published primer-probe pairings. Further proof of sensitive, species-specific detection was provided using a published, semi-nested restriction fragment length polymorphism-PCR assay that differentiates between Ancylostoma species. CONCLUSIONS/SIGNIFICANCE: Due to the close proximity of people and domestic/wild animals in many regions of the world, the potential for zoonotic infections is substantial. 
Sensitive tools enabling the screening for different soil-transmitted helminth infections are essential to the success of mass deworming efforts and facilitate the appropriate interpretation of data. This study describes a novel, species-specific, real-time PCR-based assay for the detection of A. ceylanicum that will help to address the need for such tools in integrated STH deworming programs. TRIAL REGISTRATION: ANZCTR.org.au ACTRN12614000680662.
Subjects
Ancylostoma/isolation & purification, Ancylostomiasis/diagnosis, Feces/parasitology, Real-Time Polymerase Chain Reaction/methods, Ancylostoma/classification, Animals, Argentina, Helminth DNA/isolation & purification, Humans, Phylogeny, Restriction Fragment Length Polymorphism, Randomized Controlled Trials as Topic, Species Specificity, Timor-Leste, Zoonoses/parasitology
ABSTRACT
A sequestered germline in Metazoa has been argued to be an obstacle to lateral gene transfer (LGT), though few studies have specifically assessed this claim. Here, we test the hypothesis that the origin of a sequestered germline reduced LGT events in Bilateria (i.e., triploblast lineages) as compared to early-diverging Metazoa (i.e., Ctenophora, Cnidaria, Porifera, and Placozoa). We analyze single-gene phylogenies generated with over 900 species sampled from among Bacteria, Archaea, and Eukaryota to identify well-supported interdomain LGTs. We focus on ancient interdomain LGT (i.e., those between prokaryotes and multiple lineages of Metazoa) as systematic errors in single-gene tree reconstruction create uncertainties for interpreting eukaryote-to-eukaryote transfer. The breadth of the sampled Metazoa enables us to estimate the timing of LGTs, and to examine the pattern before versus after the evolution of a sequestered germline. We identified 58 LGTs found only in Metazoa and prokaryotes (i.e., bacteria and/or archaea), and seven genes transferred from prokaryotes into Metazoa plus one other eukaryotic clade. Our analyses indicate that more interdomain transfers occurred before the development of a sequestered germline, consistent with the hypothesis that this feature is an obstacle to LGT.
Subjects
Molecular Evolution, Horizontal Gene Transfer, Germ Cells, Animals, Phylogeny
ABSTRACT
BACKGROUND: The soil transmitted helminths are a group of parasitic worms responsible for extensive morbidity in many of the world's most economically depressed locations. With growing emphasis on disease mapping and eradication, the availability of accurate and cost-effective diagnostic measures is of paramount importance to global control and elimination efforts. While real-time PCR-based molecular detection assays have shown great promise, to date, these assays have utilized sub-optimal targets. By performing next-generation sequencing-based repeat analyses, we have identified high copy-number, non-coding DNA sequences from a series of soil transmitted pathogens. We have used these repetitive DNA elements as targets in the development of novel, multi-parallel, PCR-based diagnostic assays. METHODOLOGY/PRINCIPAL FINDINGS: Utilizing next-generation sequencing and the Galaxy-based RepeatExplorer web server, we performed repeat DNA analysis on five species of soil transmitted helminths (Necator americanus, Ancylostoma duodenale, Trichuris trichiura, Ascaris lumbricoides, and Strongyloides stercoralis). Employing high copy-number, non-coding repeat DNA sequences as targets, novel real-time PCR assays were designed, and assays were tested against established molecular detection methods. Each assay provided consistent detection of genomic DNA at quantities of 2 fg or less, demonstrated species-specificity, and showed an improved limit of detection over the existing, proven PCR-based assay. CONCLUSIONS/SIGNIFICANCE: The utilization of next-generation sequencing-based repeat DNA analysis methodologies for the identification of molecular diagnostic targets has the ability to improve assay species-specificity and limits of detection. By exploiting such high copy-number repeat sequences, the assays described here will facilitate soil transmitted helminth diagnostic efforts. 
We recommend similar analyses when designing PCR-based diagnostic tests for the detection of other eukaryotic pathogens.
Subjects
Helminth DNA/isolation & purification, Helminthiasis/diagnosis, Helminthiasis/parasitology, Polymerase Chain Reaction/methods, Soil/parasitology, Animals, Humans
ABSTRACT
Lateral gene transfer (LGT) has impacted the evolutionary history of eukaryotes, though to a lesser extent than in bacteria and archaea. Detecting LGT and distinguishing it from single gene tree artifacts is difficult, particularly when considering very ancient events (i.e., over hundreds of millions of years). Here, we use two independent lines of evidence (a taxon-rich phylogenetic approach and an assessment of the patterns of gene presence/absence) to evaluate the extent of LGT in the parasitic amoebozoan genus Entamoeba. Previous work has suggested that a number of genes in the genome of Entamoeba spp. were acquired by LGT. Our approach, using an automated phylogenomic pipeline to build taxon-rich gene trees, suggests that LGT is more extensive than previously thought. Our analyses reveal that genes have frequently entered the Entamoeba genome via nonvertical events, including at least 116 genes acquired directly from bacteria or archaea, plus an additional 22 genes in which Entamoeba plus one other eukaryote are nested among bacteria and/or archaea. These genes may make good candidates for novel therapeutics, as drugs targeting these genes are less likely to impact the human host. Although we recognize the challenges of inferring intradomain transfers given systematic errors in gene trees, we find 109 genes supporting LGT from a eukaryote to Entamoeba spp., and 178 genes unique to Entamoeba spp. and one other eukaryotic taxon (i.e., presence/absence data). Inspection of these intradomain LGTs provides evidence of a common sister relationship between genes of Entamoeba (Amoebozoa) and parabasalids (Excavata). We speculate that this indicates a past close relationship (e.g., symbiosis) between ancestors of these extant lineages.
Subjects
Entamoeba/classification, Entamoeba/genetics, Horizontal Gene Transfer, Parabasalidea/classification, Parabasalidea/genetics, Phylogeny, Archaea/genetics, Bacteria/genetics, Molecular Evolution, Genome
ABSTRACT
BACKGROUND: Understanding the evolutionary relationships of all eukaryotes on Earth remains a paramount goal of modern biology, yet analyzing homologous sequences across 1.8 billion years of eukaryotic evolution is challenging. Many existing tools for identifying gene orthologs are inadequate when working with heterogeneous rates of evolution and endosymbiotic/lateral gene transfer. Moreover, genomic-scale sequencing, which was once the domain of large sequencing centers, has advanced to the point where small laboratories can now generate the data needed for phylogenomic studies. This has opened the door for increased taxonomic sampling as individual research groups have the ability to conduct genome-scale projects on their favorite non-model organism. RESULTS: Here we present some of the tools developed, and insights gained, as we created a pipeline that combines data-mining from public databases and our own transcriptome data to study the eukaryotic tree of life. The first steps of a phylogenomic pipeline involve choosing taxa and loci, and making decisions about how to handle alleles, paralogs and non-overlapping sequences. Next, orthologs are aligned for analyses including gene tree reconstruction and concatenation for supermatrix approaches. To build our pipeline, we created scripts written in Python that integrate third-party tools with custom methods. As a test case, we present the placement of five amoebae on the eukaryotic tree of life based on analyses of transcriptome data. Our scripts are available on GitHub and may be used as-is for automated analyses of large-scale phylogenomics, or adapted for use in other types of studies. CONCLUSION: Analyses on the scale of all eukaryotes present challenges not necessarily found in studies of more closely related organisms. Our approach will be of relevance to others for whom existing third-party tools fail to fully answer desired phylogenetic questions.
ABSTRACT
Tubulinea is a phylogenetically stable higher-level taxon within Amoebozoa, morphologically characterized by monoaxially streaming and cylindrical pseudopods. Contemporary phylogenetic reconstructions have largely relied on SSU rDNA, and to a lesser extent, on actin genes to reveal the relationships among these organisms. Additionally, the test (shell) forming Arcellinida, one of the most species-rich amoebozoan groups, is nested within Tubulinea and suffers from substantial under-sampling of taxa. Here, we increase taxonomic and gene sampling within the Tubulinea, characterizing molecular data for 22 taxa and six genes (SSU rDNA, actin, α- and β-tubulin, elongation factor 2 and the 14-3-3 regulatory protein). We perform concatenated phylogenetic analyses using these genes as well as approximately unbiased tests to assess evolutionary relationships within the Tubulinea. We confirm the monophyly of Tubulinea and four of the six included lineages (Echinamoeboidea, Leptomyxida, Amoebida and Poseidonida). Arcellinida and Hartmannellidae, the remaining lineages, are not monophyletic in our reconstructions, although statistical testing does not allow rejection of either group. We further investigate more fine-grained morphological evolution of previously defined groups, concluding that relationships within Arcellinida are more consistent with general test and aperture shape than with test composition. We also discuss the implications of this phylogeny for interpretations of the Precambrian fossil record of testate amoebae.