ABSTRACT
BACKGROUND: New sequencing technologies have lowered financial barriers to whole genome sequencing, but resulting assemblies are often fragmented and far from 'finished'. Updating multi-scaffold drafts to chromosome-level status can be achieved through experimental mapping or re-sequencing efforts. Avoiding the costs associated with such approaches, comparative genomic analysis of gene order conservation (synteny) to predict scaffold neighbours (adjacencies) offers a potentially useful complementary method for improving draft assemblies. RESULTS: We evaluated and employed 3 gene synteny-based methods applied to 21 Anopheles mosquito assemblies to produce consensus sets of scaffold adjacencies. For subsets of the assemblies, we integrated these with additional supporting data to confirm and complement the synteny-based adjacencies: 6 with physical mapping data that anchor scaffolds to chromosome locations, 13 with paired-end RNA sequencing (RNAseq) data, and 3 with new assemblies based on re-scaffolding or long-read data. Our combined analyses produced 20 new superscaffolded assemblies with improved contiguities: 7 for which assignments of non-anchored scaffolds to chromosome arms span more than 75% of the assemblies, and a further 7 with chromosome anchoring including an 88% anchored Anopheles arabiensis assembly and, respectively, 73% and 84% anchored assemblies with comprehensively updated cytogenetic photomaps for Anopheles funestus and Anopheles stephensi. CONCLUSIONS: Experimental data from probe mapping, RNAseq, or long-read technologies, where available, all contribute to successful upgrading of draft assemblies. Our evaluations show that gene synteny-based computational methods represent a valuable alternative or complementary approach. Our improved Anopheles reference assemblies highlight the utility of applying comparative genomics approaches to improve community genomic resources.
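One simple way to picture a "consensus set of scaffold adjacencies" is a vote across methods. The abstract does not say how the three synteny-based predictions were actually combined, so the majority-vote rule below is an assumption made purely for illustration, and the function name, the `min_support` parameter, and the scaffold identifiers are hypothetical. Scaffold orientation is ignored in this sketch.

```python
from collections import Counter


def consensus_adjacencies(predictions, min_support=2):
    """Keep scaffold pairs predicted as neighbours by at least `min_support` methods.

    `predictions` is a list with one entry per method; each entry is an
    iterable of (scaffold_a, scaffold_b) pairs. The voting rule is an
    illustrative assumption, not the procedure used in the study.
    """
    support = Counter()
    for method_pairs in predictions:
        # Sort each pair so (a, b) and (b, a) are the same adjacency, and
        # use a set so a method votes at most once per adjacency.
        unique_pairs = {tuple(sorted(pair)) for pair in method_pairs}
        support.update(unique_pairs)
    return {pair for pair, votes in support.items() if votes >= min_support}


# Hypothetical predictions from three methods.
method_1 = [("s1", "s2"), ("s2", "s3")]
method_2 = [("s2", "s1"), ("s4", "s5")]
method_3 = [("s3", "s2"), ("s1", "s2")]
agreed = consensus_adjacencies([method_1, method_2, method_3])
```

With these inputs, the s1-s2 and s2-s3 adjacencies are retained because at least two methods predict them, while s4-s5, supported by a single method, is dropped.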
Subjects
Anopheles/genetics, Biological Evolution, Chromosomes, Genetic Techniques/instrumentation, Genomics/methods, Synteny, Animals, Chromosome Mapping
ABSTRACT
Complementing genome sequence with deep transcriptome and proteome data could enable more accurate assembly and annotation of newly sequenced genomes. Here, we provide a proof-of-concept of an integrated approach for analysis of the genome and proteome of Anopheles stephensi, which is one of the most important vectors of the malaria parasite. To achieve broad coverage of genes, we carried out transcriptome sequencing and deep proteome profiling of multiple anatomically distinct sites. Based on transcriptomic data alone, we identified and corrected 535 events of incomplete genome assembly involving 1196 scaffolds and 868 protein-coding gene models. This proteogenomic approach enabled us to add 365 genes that were missed during genome annotation and identify 917 gene correction events through discovery of 151 novel exons, 297 protein extensions, 231 exon extensions, 192 novel protein start sites, 19 novel translational frames, 28 events of joining of exons, and 76 events of joining of adjacent genes as a single gene. Incorporation of proteomic evidence allowed us to change the designation of more than 87 predicted "noncoding RNAs" to conventional mRNAs coded by protein-coding genes. Importantly, extension of the newly corrected genome assemblies and gene models to 15 other newly assembled Anopheline genomes led to the discovery of a large number of apparent discrepancies in assembly and annotation of these genomes. Our data provide a framework for how future genome sequencing efforts should incorporate transcriptomic and proteomic analysis in combination with simultaneous manual curation to achieve near complete assembly and accurate annotation of genomes.
Subjects
Genome/genetics, High-Throughput Nucleotide Sequencing/methods, Molecular Sequence Annotation, Transcriptome/genetics, Animals, Anopheles/genetics, Exons/genetics, Gene Expression Profiling, Proteome/genetics, Proteomics
ABSTRACT
The genome of the Neotropical malaria vector Anopheles albimanus was sequenced as part of the 16 Anopheles Genomes Project published in 2015. The draft assembly of this species consisted of 204 scaffolds with an N50 scaffold size of 18.1 Mb and a total assembly size of 170.5 Mb. It was among the smallest genomes with the longest scaffolds in the 16 Anopheles species cluster, making An. albimanus the logical choice for anchoring the genome assembly to chromosomes. In this study, we developed a high-resolution cytogenetic photomap with completely straightened polytene chromosomes from the salivary glands of the mosquito larvae. Based on this photomap, we constructed a chromosome-based genome assembly using fluorescent in situ hybridization of PCR-amplified DNA probes. Our physical mapping, assisted by an ortholog-based bioinformatics approach, identified and corrected nine misassemblies in five large genomic scaffolds. Misassemblies mostly occurred in junctions between contigs. Our comparative analysis of scaffolds with the An. gambiae genome detected multiple genetic exchanges between pericentromeric regions of chromosomal arms caused by partial-arm translocations. The final map consists of 40 ordered genomic scaffolds and corrected fragments of misassembled scaffolds. The An. albimanus physical map comprises 98.2% of the total genome assembly and represents the most complete genome map among mosquito species. This study demonstrates that physical mapping is a powerful tool for correcting errors in draft genome assemblies and for creating chromosome-anchored reference genomes.
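The N50 scaffold size quoted in this abstract is a standard contiguity statistic: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of the calculation follows; the function name and the scaffold lengths are hypothetical and are not the An. albimanus data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


# Hypothetical scaffold lengths (arbitrary units), total 30.
example_n50 = n50([10, 8, 6, 4, 2])
```

In the example, the two longest scaffolds (10 and 8) already cover 18 of 30 units, so the N50 is 8.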
Subjects
Anopheles/genetics, Chromosome Mapping/methods, Insect Genome, Malaria/genetics, Animals, Anopheles/pathogenicity, Fluorescent In Situ Hybridization, Malaria/transmission, Polytene Chromosomes/genetics, Salivary Glands, Genetic Translocation
ABSTRACT
The major vectors of malaria in sub-Saharan Africa belong to subgenus Cellia. Yet, phylogenetic relationships and temporal diversification among African mosquito species have not been unambiguously determined. Knowledge about vector evolutionary history is crucial for correct interpretation of genetic changes identified through comparative genomics analyses. In this study, we estimated a molecular phylogeny using 49 gene sequences for the African malaria vectors An. gambiae, An. funestus, An. nili, the Asian malaria mosquito An. stephensi, and the outgroup species Culex quinquefasciatus and Aedes aegypti. To infer the phylogeny, we identified orthologous sequences uniformly distributed approximately every 5 Mb in the five chromosomal arms. The sequences were aligned and the phylogenetic trees were inferred using maximum likelihood and neighbor-joining methods. Bayesian molecular dating using a relaxed log normal model was used to infer divergence times. Trees from individual genes agreed with each other, placing An. nili as a basal clade that diversified from the studied malaria mosquito species 47.6 million years ago (mya). Other African malaria vectors originated more recently, and independently acquired traits related to vectorial capacity. The lineage leading to An. gambiae diverged 30.4 mya, while the African vector An. funestus and the Asian vector An. stephensi were the most closely related sister taxa that split 20.8 mya. These results were supported by consistently high bootstrap values in concatenated phylogenetic trees generated individually for each chromosomal arm. Genome-wide multigene phylogenetic analysis is a useful approach for discerning historic relationships among malaria vectors, providing a framework for the correct interpretation of genomic changes across species, and comprehending the evolutionary origins of this ubiquitous and deadly insect-borne disease.
Subjects
Culicidae/genetics, Molecular Evolution, Genetic Speciation, Insect Vectors/genetics, Malaria/transmission, Phylogeny, Aedes/genetics, Sub-Saharan Africa, Animals, Anopheles/genetics, Insect Chromosomes, Culex/genetics, Insect Genes
ABSTRACT
BACKGROUND: Anopheles stephensi is the key vector of malaria throughout the Indian subcontinent and Middle East and an emerging model for molecular and genetic studies of mosquito-parasite interactions. The type form of the species is responsible for the majority of urban malaria transmission across its range. RESULTS: Here, we report the genome sequence and annotation of the Indian strain of the type form of An. stephensi. The 221 Mb genome assembly represents more than 92% of the entire genome and was produced using a combination of 454, Illumina, and PacBio sequencing. Physical mapping assigned 62% of the genome onto chromosomes, enabling chromosome-based analysis. Comparisons between An. stephensi and An. gambiae reveal that the rate of gene order reshuffling on the X chromosome was three times higher than that on the autosomes. An. stephensi has more heterochromatin in pericentric regions but less repetitive DNA in chromosome arms than An. gambiae. We also identify a number of Y-chromosome contigs and BACs. Interspersed repeats constitute 7.1% of the assembled genome while LTR retrotransposons alone comprise more than 49% of the Y contigs. RNA-seq analyses provide new insights into mosquito innate immunity, development, and sexual dimorphism. CONCLUSIONS: The genome analysis described in this manuscript provides a resource and platform for fundamental and translational research into a major urban malaria vector. Chromosome-based investigations provide unique perspectives on Anopheles chromosome evolution. RNA-seq analysis and studies of immunity genes offer new insights into mosquito biology and mosquito-parasite interactions.
Subjects
Anopheles/genetics, Insect Vectors/genetics, Animals, Anopheles/metabolism, Chromosome Mapping, Insect Chromosomes/genetics, Cluster Analysis, Molecular Evolution, Insect Genome, Humans, Insect Proteins/genetics, Insect Proteins/metabolism, Malaria/transmission, Phylogeny, Single Nucleotide Polymorphism, DNA Sequence Analysis, Synteny, Transcriptome, Urban Population
ABSTRACT
Cytogenetic analysis is an informative classical approach to understanding the relationships among members in a group of closely related species of mosquitoes. Anopheles ovengensis is a recently discovered species of the Anopheles nili group and is one of the important malaria vectors in the African equatorial forest. This study characterized polytene chromosomes of An. ovengensis and compared them with polytene chromosomes of An. nili. Using fluorescent in situ hybridization and chromosome banding pattern comparison, we have established correspondence between chromosomal arms of An. ovengensis and An. nili. Analysis of chromosome morphology in the two species revealed a limited similarity in the banding patterns. The most extensive reorganization occurs in pericentromeric and intercalary heterochromatin. Chromosomes of An. ovengensis are joined together by a diffuse chromocenter and they have two large regions of intercalary heterochromatin in arms 2L and 3R. In contrast, the chromocenter and intercalary heterochromatin are not seen in An. nili chromosomes. Comparative analysis of the arm association suggests the occurrence of a whole-arm translocation between the two members of the group. The observed substantial reorganizations of chromosome structure imply either a rapid rate of chromosome evolution in the An. nili group, or that the two species belong to different taxonomic groups within subgenus Cellia.
Subjects
Anopheles/genetics, Polytene Chromosomes, Animals, Anopheles/chemistry, Anopheles/classification, Chromosome Mapping, Female, Fluorescent In Situ Hybridization
ABSTRACT
BACKGROUND: Anopheles nili is a major vector of malaria in the humid savannas and forested areas of sub-Saharan Africa. Understanding the population genetic structure and evolutionary dynamics of this species is important for the development of an adequate and targeted malaria control strategy in Africa. Chromosomal inversions and microsatellite markers are commonly used for studying the population structure of malaria mosquitoes. Physical mapping of these markers onto the chromosomes further improves the toolbox, and allows inference on the demographic and evolutionary history of the target species. RESULTS: Availability of polytene chromosomes allowed us to develop a map of microsatellite markers and to study polymorphism of chromosomal inversions. Nine microsatellite markers were mapped to unique locations on all five chromosomal arms of An. nili using fluorescent in situ hybridization (FISH). Probes were obtained from 300-483 bp-long inserts of plasmid clones and from 506-559 bp-long fragments amplified with primers designed using the An. nili genome assembly generated on an Illumina platform. Two additional loci were assigned to specific chromosome arms of An. nili based on in silico sequence similarity and chromosome synteny with Anopheles gambiae. Three microsatellites were mapped inside or in the vicinity of the polymorphic chromosomal inversions 2Rb and 2Rc. A statistically significant departure from Hardy-Weinberg equilibrium, due to a deficit in heterozygotes at the 2Rb inversion, and highly significant linkage disequilibrium between the two inversions, were detected in natural An. nili populations collected from Burkina Faso. CONCLUSIONS: Our study demonstrated that next-generation sequencing can be used to improve FISH for microsatellite mapping in species with no reference genome sequence. Physical mapping of microsatellite markers in An. nili showed that their cytological locations spanned the entire five-arm complement, allowing genome-wide inferences. The knowledge about polymorphic inversions and chromosomal locations of microsatellite markers has been useful for explaining differences in genetic variability across loci and significant differentiation observed among natural populations of An. nili.
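A departure from Hardy-Weinberg equilibrium caused by a heterozygote deficit, as reported for the 2Rb inversion, can be illustrated with a chi-square goodness-of-fit calculation on karyotype counts. The abstract does not state which test the authors used, so the chi-square approach is an assumption here, and the function name and the counts below are hypothetical rather than the Burkina Faso data.

```python
def hwe_chi_square(n_std_std, n_het, n_inv_inv):
    """Chi-square statistic for Hardy-Weinberg proportions at a two-state locus.

    Arguments are observed counts of standard homozygotes, heterozygotes,
    and inverted homozygotes. Returns (chi_square, expected_heterozygotes).
    The statistic has 1 degree of freedom (3 classes - 1 - 1 estimated
    frequency).
    """
    n = n_std_std + n_het + n_inv_inv
    if n == 0:
        raise ValueError("no karyotypes observed")
    # Frequency of the standard arrangement among the 2n chromosomes.
    p = (2 * n_std_std + n_het) / (2 * n)
    q = 1 - p
    observed = (n_std_std, n_het, n_inv_inv)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi_square = sum(
        (obs - exp) ** 2 / exp for obs, exp in zip(observed, expected) if exp > 0
    )
    return chi_square, expected[1]


# Hypothetical sample of 100 mosquitoes with too few heterozygotes.
chi_square, expected_het = hwe_chi_square(40, 20, 40)
```

For these counts, 50 heterozygotes are expected but only 20 are observed, giving a statistic of 36, far above 3.84, the 5% critical value for 1 degree of freedom.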