ABSTRACT
Female Aedes aegypti mosquitoes infect more than 400 million people each year with dangerous viral pathogens including dengue, yellow fever, Zika and chikungunya. Progress in understanding the biology of mosquitoes and developing the tools to fight them has been slowed by the lack of a high-quality genome assembly. Here we combine diverse technologies to produce the markedly improved, fully re-annotated AaegL5 genome assembly, and demonstrate how it accelerates mosquito science. We anchored physical and cytogenetic maps, doubled the number of known chemosensory ionotropic receptors that guide mosquitoes to human hosts and egg-laying sites, provided further insight into the size and composition of the sex-determining M locus, and revealed copy-number variation among glutathione S-transferase genes that are important for insecticide resistance. Using high-resolution quantitative trait locus and population genomic analyses, we mapped new candidates for dengue vector competence and insecticide resistance. AaegL5 will catalyse new biological insights and intervention strategies to fight this deadly disease vector.
Subject(s)
Aedes/genetics , Arbovirus Infections/virology , Arboviruses , Genome, Insect/genetics , Genomics/standards , Insect Control , Mosquito Vectors/genetics , Mosquito Vectors/virology , Aedes/virology , Animals , Arbovirus Infections/transmission , Arboviruses/isolation & purification , DNA Copy Number Variations/genetics , Dengue Virus/isolation & purification , Female , Genetic Variation/genetics , Genetics, Population , Glutathione Transferase/genetics , Insecticide Resistance/drug effects , Male , Molecular Sequence Annotation , Multigene Family/genetics , Pyrethrins/pharmacology , Reference Standards , Sex Determination Processes/genetics
ABSTRACT
Horizontal gene transfer from viruses to eukaryotic cells is a pervasive phenomenon. Somatic viral integrations are linked to persistent viral infection whereas integrations into germline cells are maintained in host genomes by vertical transmission and may be co-opted for host functions. In the arboviral vector Aedes aegypti, an endogenous viral element from a nonretroviral RNA virus (nrEVE) was shown to produce PIWI-interacting RNAs (piRNAs) to limit infection with a cognate virus. Thus, nrEVEs may constitute a heritable, sequence-specific mechanism for antiviral immunity, analogous to piRNA-mediated silencing of transposable elements. Here, we combine population genomics and evolutionary approaches to analyse the genomic architecture of nrEVEs in A. aegypti. We conducted a genome-wide screen for adaptive nrEVEs and searched for novel population-specific nrEVEs in the genomes of 80 individual wild-caught mosquitoes from five geographical populations. We show a dynamic landscape of nrEVEs in mosquito genomes and identified five novel nrEVEs derived from two currently circulating viruses, providing evidence of the environmental-dependent modification of a piRNA cluster. Overall, our results show that virus endogenization events are complex with only a few nrEVEs contributing to adaptive evolution in A. aegypti.
Subject(s)
Aedes , Aedes/genetics , Animals , Genomics , Metagenomics , Mosquito Vectors/genetics , RNA, Small Interfering/genetics
ABSTRACT
Drug resistance is an obstacle to global malaria control, as evidenced by the recent emergence and rapid spread of delayed artemisinin (ART) clearance by mutant forms of the PfKelch13 protein in Southeast Asia. Identifying genetic determinants of ART resistance in African-derived parasites is important for surveillance and for understanding the mechanism of resistance. In this study, we carried out long-term in vitro selection of two recently isolated West African parasites (from Pikine and Thiès, Senegal) with increasing concentrations of dihydroartemisinin (DHA), the biologically active form of ART, over a 4-y period. We isolated two parasite clones, one from each original isolate, that exhibited enhanced survival to DHA in the ring-stage survival assay. Whole-genome sequence analysis identified 10 mutations in seven different genes. We chose to focus on the gene encoding PfCoronin, a member of the WD40-propeller domain protein family, because mutations in this gene occurred in both independent selections, and the protein shares the β-propeller motif with PfKelch13 protein. For functional validation, when pfcoronin mutations were introduced into the parental parasites by CRISPR/Cas9-mediated gene editing, these mutations were sufficient to reduce ART susceptibility in the parental lines. The discovery of a second gene for ART resistance may yield insights into the molecular mechanisms of resistance. It also suggests that pfcoronin mutants could emerge as a nonkelch13 type of resistance to ART in natural settings.
Subject(s)
4-Butyrolactone/analogs & derivatives , Artemisinins/pharmacology , Microfilament Proteins/genetics , Mutation/genetics , Plasmodium falciparum/drug effects , Plasmodium falciparum/genetics , 4-Butyrolactone/genetics , Antimalarials/pharmacology , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Drug Resistance/genetics , Gene Editing/methods , Humans , Malaria, Falciparum/drug therapy , Malaria, Falciparum/parasitology , WD40 Repeats/genetics
ABSTRACT
BACKGROUND: Aedes aegypti is the principal mosquito vector of Zika, dengue, and yellow fever viruses. Two subspecies of Ae. aegypti exhibit phenotypic divergence with regard to habitat, host preference, and vectorial capacity. Chromosomal inversions have been shown to play a major role in adaptation and speciation in dipteran insects and would be of great utility for studies of Ae. aegypti. However, the large and highly repetitive genome of Ae. aegypti makes it difficult to detect inversions with paired-end short-read sequencing data, and polytene chromosome analysis does not provide sufficient resolution to detect chromosome banding patterns indicative of inversions. RESULTS: To characterize chromosomal diversity in this species, we have carried out deep Illumina sequencing of linked-read (10X Genomics) libraries in order to discover inversion loci as well as SNPs. We analyzed individuals from colonies representing the geographic limits of each subspecies, one contact zone between subspecies, and a closely related sister species. Despite genome-wide SNP divergence and abundant microinversions, we do not find any inversions occurring as fixed differences between subspecies. Many microinversions are found in regions that have introgressed and have captured genes that could impact behavior, such as a cluster of odorant-binding proteins that may play a role in host feeding preference. CONCLUSIONS: Our study shows that inversions are abundant and widely shared among subspecies of Aedes aegypti and that introgression has occurred in regions of secondary contact. This library of 32 novel chromosomal inversions demonstrates the capacity for linked-read sequencing to identify previously intractable genomic rearrangements and provides a foundation for future population genetics studies in this species.
Subject(s)
Aedes/genetics , Chromosome Inversion , Genetic Introgression , Mosquito Vectors/genetics , Animals , Chromosomes , Genetic Variation , High-Throughput Nucleotide Sequencing
ABSTRACT
Detecting de novo mutations in viral and bacterial pathogens enables researchers to reconstruct detailed networks of disease transmission and is a key technique in genomic epidemiology. However, these techniques have not yet been applied to the malaria parasite, Plasmodium falciparum, in which a larger genome, slower generation times, and a complex life cycle make them difficult to implement. Here, we demonstrate the viability of de novo mutation studies in P. falciparum for the first time. Using a combination of sequencing, library preparation, and genotyping methods that have been optimized for accuracy in low-complexity genomic regions, we have detected de novo mutations that distinguish nominally identical parasites from clonal lineages. Despite its slower evolutionary rate compared with bacterial or viral species, de novo mutation can be detected in P. falciparum across timescales of just 1-2 years and evolutionary rates in low-complexity regions of the genome can be up to twice that detected in the rest of the genome. The increased mutation rate allows the identification of separate clade expansions that cannot be found using previous genomic epidemiology approaches and could be a crucial tool for mapping residual transmission patterns in disease elimination campaigns and reintroduction scenarios.
Subject(s)
Evolution, Molecular , Malaria/parasitology , Mutation , Plasmodium falciparum/genetics , Genetic Techniques , Malaria/transmission , Phylogeny
ABSTRACT
Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
Subject(s)
Conserved Sequence/genetics , Genome/genetics , Zebrafish/genetics , Animals , Chromosomes/genetics , Evolution, Molecular , Female , Genes/genetics , Genome, Human/genetics , Genomics , Humans , Male , Meiosis/genetics , Molecular Sequence Annotation , Pseudogenes/genetics , Reference Standards , Sex Determination Processes/genetics , Zebrafish Proteins/genetics
ABSTRACT
The passage through the mosquito is a major bottleneck for malaria parasite populations and a target of interventions aiming to block disease transmission. Here, we used DNA microarrays to profile the developmental transcriptomes of the rodent malaria parasite Plasmodium berghei in vivo, in the midgut of Anopheles gambiae mosquitoes, from parasite stages in the midgut blood bolus to sporulating oocysts on the basal gut wall. Data analysis identified several distinct transcriptional programmes encompassing genes putatively involved in developmental processes or in interactions with the mosquito. At least two of these programmes are associated with the ookinete development that is linked to mosquito midgut invasion and establishment of infection. Targeted disruption by homologous recombination of two of these genes resulted in mutant parasites exhibiting notable infection phenotypes. GAMER encodes a short polypeptide with granular localization in the gametocyte cytoplasm and shows a highly penetrant loss-of-function phenotype manifested as greatly reduced ookinete numbers, linked to impaired male gamete release. HADO encodes a putative magnesium phosphatase with distinctive cortical localization along the concave ookinete periphery. Disruption of HADO compromises ookinete development, leading to a significant reduction in oocyst numbers. Our data provide important insights into the molecular framework underpinning Plasmodium development in the mosquito and identify two genes with important functions at the initial stages of parasite development in the mosquito midgut.
Subject(s)
Anopheles/parasitology , Gene Expression Profiling , Plasmodium berghei/growth & development , Animals , Gastrointestinal Tract/parasitology , Malaria/transmission , Oligonucleotide Array Sequence Analysis , Plasmodium berghei/genetics , Plasmodium berghei/isolation & purification
ABSTRACT
BACKGROUND: Translating genomic technologies into healthcare applications for the malaria parasite Plasmodium falciparum has been limited by the technical and logistical difficulties of obtaining high quality clinical samples from the field. Sampling by dried blood spot (DBS) finger-pricks can be performed safely and efficiently with minimal resource and storage requirements compared with venous blood (VB). Here, the use of selective whole genome amplification (sWGA) to sequence the P. falciparum genome from clinical DBS samples was evaluated, and the results compared with current methods that use leucodepleted VB. METHODS: Parasite DNA with high (>95%) human DNA contamination was selectively amplified by Phi29 polymerase using short oligonucleotide probes of 8-12 mers as primers. These primers were selected on the basis of their differential frequency of binding the desired (P. falciparum DNA) and contaminating (human) genomes. RESULTS: Using the sWGA method, clinical samples from 156 malaria patients, including 120 paired samples for head-to-head comparison of DBS and leucodepleted VB, were sequenced. Greater than 18-fold enrichment of P. falciparum DNA was achieved from DBS extracts. The parasitaemia threshold to achieve >5× coverage for 50% of the genome was 0.03% (40 parasites per 200 white blood cells). Over 99% SNP concordance between VB and DBS samples was achieved after excluding missing calls. CONCLUSION: The sWGA methods described here provide a reliable and scalable way of generating P. falciparum genome sequence data from DBS samples. The current data indicate that it will be possible to get good quality sequence on most if not all drug resistance loci from the majority of symptomatic malaria patients. This technique overcomes a major limiting factor in P. falciparum genome sequencing from field samples, and paves the way for large-scale epidemiological applications.
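The primer-selection idea in METHODS (favouring short oligonucleotides that bind the parasite genome often and the human genome rarely) can be illustrated with a minimal sketch. It is not the authors' pipeline: it simply counts exact k-mer occurrences on one strand and ranks k-mers by the ratio of per-base frequencies. The function names, the pseudocount, and the toy sequences are all hypothetical, and a real design would also have to account for both strands, melting temperature, and primer spacing.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence (single strand only)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def rank_primers(target, background, k, pseudocount=1):
    """Rank k-mers found in `target` by target/background frequency ratio.

    Frequencies are occurrences per base. The pseudocount (an assumption
    of this sketch) avoids division by zero for k-mers absent from the
    background. Ties are broken alphabetically for reproducibility.
    """
    t = kmer_counts(target, k)
    b = kmer_counts(background, k)
    scores = {
        kmer: (t[kmer] / len(target)) / ((b[kmer] + pseudocount) / len(background))
        for kmer in t
    }
    return sorted(scores, key=lambda m: (-scores[m], m))


# Toy sequences (hypothetical): an AT-rich "target" and a GC-rich "background".
toy_target = "AAATAAATAAAT"
toy_background = "GCGCGCAAATGCGC"
ranking = rank_primers(toy_target, toy_background, k=4)
print(ranking)
```

In this toy case the k-mer AAAT is the most frequent in the target but also occurs in the background, so it ranks below k-mers that are absent from the background.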
Subject(s)
Blood/parasitology , Desiccation , Genome, Protozoan , Nucleic Acid Amplification Techniques/methods , Plasmodium falciparum/genetics , Sequence Analysis, DNA , Specimen Handling/methods , DNA Primers/genetics , DNA, Protozoan/chemistry , DNA, Protozoan/genetics , DNA, Protozoan/isolation & purification , Humans , Plasmodium falciparum/isolation & purification
ABSTRACT
BACKGROUND: The genome-wide association study (GWAS) techniques that have been used for genetic mapping in other organisms have not been successfully applied to mosquitoes, which have genetic characteristics of high nucleotide diversity, low linkage disequilibrium, and complex population stratification that render population-based GWAS essentially unfeasible at realistic sample size and marker density. METHODS: We designed a novel mapping strategy for the mosquito system that combines the power of linkage mapping with the resolution afforded by genetic association. We established founder colonies from West Africa, controlled for diversity, linkage disequilibrium and population stratification. Colonies were challenged by feeding on the infectious stage of the human malaria parasite, Plasmodium falciparum, mosquitoes were phenotyped for parasite load, and DNA pools for phenotypically similar mosquitoes were Illumina sequenced. Phenotype-genotype mapping was carried out in two stages, coarse and fine. RESULTS: In the first mapping stage, pooled sequences were analysed genome-wide for intervals displaying relative reduction in diversity between phenotype pools, and candidate genomic loci were identified for influence upon parasite infection levels. In the second mapping stage, focused genotyping of SNPs from the first mapping stage was carried out in unpooled individual mosquitoes and replicates. The second stage confirmed significant SNPs in a locus encoding two Toll-family proteins. RNAi-mediated gene silencing and infection challenge revealed that TOLL 11 protects mosquitoes against P. falciparum infection. CONCLUSIONS: We present an efficient and cost-effective method for genetic mapping using natural variation segregating in defined recent Anopheles founder colonies, and demonstrate its applicability for mapping in a complex non-model genome. This approach is a practical and preferred alternative to population-based GWAS for first-pass mapping of phenotypes in Anopheles. This design should facilitate mapping of other traits involved in physiology, epidemiology, and behaviour.
Subject(s)
Anopheles/genetics , Genome-Wide Association Study , Malaria, Falciparum/genetics , Plasmodium falciparum/genetics , Toll-Like Receptors/genetics , Animals , Anopheles/parasitology , Chromosome Mapping , Genome, Insect , Genotype , Host-Parasite Interactions/genetics , Humans , Insect Vectors/genetics , Malaria, Falciparum/parasitology , Malaria, Falciparum/transmission , Phenotype , Plasmodium falciparum/pathogenicity , Polymorphism, Single Nucleotide
ABSTRACT
VectorBase (http://www.vectorbase.org) is a NIAID-supported bioinformatics resource for invertebrate vectors of human pathogens. It hosts data for nine genomes: mosquitoes (three Anopheles gambiae genomes, Aedes aegypti and Culex quinquefasciatus), tick (Ixodes scapularis), body louse (Pediculus humanus), kissing bug (Rhodnius prolixus) and tsetse fly (Glossina morsitans). Hosted data range from genomic features and expression data to population genetics and ontologies. We describe improvements and integration of new data that expand our taxonomic coverage. Releases are bi-monthly and include the delivery of preliminary data for emerging genomes. Frequent updates of the genome browser provide VectorBase users with increasing options for visualizing their own high-throughput data. One major development is a new population biology resource for storing genomic variations, insecticide resistance data and their associated metadata. It takes advantage of improved ontologies and controlled vocabularies. Combined, these new features ensure timely release of multiple types of data in the public domain while helping overcome the bottlenecks of bioinformatics and annotation by engaging with our user community.
Subject(s)
Databases, Genetic , Genome, Insect , Insect Vectors/genetics , Animals , Culicidae/genetics , Genetic Variation , Genomics , Insecticide Resistance , Ixodes/genetics , Pediculus/genetics , Rhodnius/genetics , Tsetse Flies/genetics
ABSTRACT
Omicron surged as a variant of concern in late 2021. Several distinct Omicron variants appeared and overtook each other. We combined variant frequencies and infection estimates from a nowcasting model for each US state to estimate variant-specific infections, attack rates, and effective reproduction numbers (Rt). BA.1 rapidly emerged, and we estimate that it infected 47.7% of the US population before it was replaced by BA.2. We estimate that BA.5 infected 35.7% of the US population, persisting in circulation for nearly 6 months. Other variants (BA.2, BA.4, and XBB) together infected 30.7% of the US population. We found a positive correlation between the state-level BA.1 attack rate and social vulnerability and a negative correlation between the BA.1 and BA.2 attack rates. Our findings illustrate the complex interplay between viral evolution, population susceptibility, and social factors during the Omicron emergence in the US.
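The step of combining variant frequencies with infection estimates can be sketched as simple arithmetic. This is an illustration only, assuming that each period's estimated infections are split among variants in proportion to their frequency and that the attack rate is cumulative infections divided by population. The function name and all numbers below are hypothetical and are not taken from the study.

```python
def variant_attack_rates(infections, frequencies, population):
    """Apportion infections to variants and return per-variant attack rates.

    infections  : list of estimated total infections per period
    frequencies : list of dicts mapping variant name -> frequency in that
                  period (frequencies in one period are assumed to sum to 1)
    population  : population size used as the attack-rate denominator
    """
    totals = {}
    for period_infections, period_freqs in zip(infections, frequencies):
        for variant, freq in period_freqs.items():
            totals[variant] = totals.get(variant, 0.0) + period_infections * freq
    return {variant: total / population for variant, total in totals.items()}


# Toy example (hypothetical numbers): two periods, two variants.
toy_infections = [1000, 2000]
toy_frequencies = [{"BA.1": 1.0}, {"BA.1": 0.5, "BA.2": 0.5}]
rates = variant_attack_rates(toy_infections, toy_frequencies, population=10000)
print(rates)
```

Because repeat infections are counted each time, attack rates computed this way can sum to more than 100% across variants, which is consistent with the percentages reported in the abstract.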
Subject(s)
COVID-19 , SARS-CoV-2 , SARS-CoV-2/genetics , SARS-CoV-2/isolation & purification , COVID-19/virology , COVID-19/epidemiology , Humans , United States/epidemiology , Genome, Viral , Genomics/methods
ABSTRACT
BACKGROUND: Human malaria is transmitted by mosquitoes of the genus Anopheles. Transmission is a complex phenomenon involving biological and environmental factors of humans, parasites and mosquitoes. Among more than 500 anopheline species, only a few species from different branches of the mosquito evolutionary tree transmit malaria, suggesting that their vectorial capacity has evolved independently. Anopheles albimanus (subgenus Nyssorhynchus) is an important malaria vector in the Americas. The divergence time between Anopheles gambiae, the main malaria vector in Africa, and the Neotropical vectors has been estimated to be 100 My. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to explore the mosquito biology beyond the An. gambiae complex. RESULTS: We sequenced the transcriptome of the An. albimanus adult female. By combining Sanger, 454 and Illumina sequences from cDNA libraries derived from the midgut, cuticular fat body, dorsal vessel, salivary gland and whole body, we generated a single, high-quality assembly containing 16,669 transcripts, 92% of which mapped to the An. darlingi genome and covered 90% of the core eukaryotic genome. Bidirectional comparisons between the An. gambiae, An. darlingi and An. albimanus predicted proteomes allowed the identification of 3,772 putative orthologs. More than half of the transcripts had a match to proteins in other insect vectors and had an InterPro annotation. We identified several protein families that may be relevant to the study of Plasmodium-mosquito interaction. An open source transcript annotation browser called GDAV (Genome-Delinked Annotation Viewer) was developed to facilitate public access to the data generated by this and future transcriptome projects. CONCLUSIONS: We have explored the adult female transcriptome of one important New World malaria vector, An. albimanus. We identified protein-coding transcripts involved in biological processes that may be relevant to the Plasmodium lifecycle and can serve as the starting point for searching targets for novel control strategies. Our data increase the available genomic information regarding An. albimanus several hundred-fold, and will facilitate molecular research in medical entomology, evolutionary biology, genomics and proteomics of anopheline mosquito vectors. The data reported in this manuscript are accessible to the community via the VectorBase website (http://www.vectorbase.org/Other/AdditionalOrganisms/).
Subject(s)
Anopheles/genetics , Insect Vectors/genetics , Transcriptome/genetics , Animals , Chromosome Mapping , Databases, Genetic , Expressed Sequence Tags , Female , Gene Library , Genome , Host-Parasite Interactions , Plasmodium/physiology , Proteome/metabolism , Sequence Analysis, DNA
ABSTRACT
BACKGROUND: Quantitative transcriptome data for the malaria-transmitting mosquito Anopheles gambiae cover a broad range of biological and experimental conditions, including development, blood feeding and infection. Web-based summaries of differential expression for individual genes with respect to these conditions are a useful tool for the biologist, but they lack the context that a visualisation of all genes with respect to all conditions would give. For most organisms, including A. gambiae, such a systems-level view of gene expression is not yet available. RESULTS: We have clustered microarray-based gene-averaged expression values, available from VectorBase, for 10194 genes over 93 experimental conditions using a self-organizing map. Map regions corresponding to known biological events, such as egg production, are revealed. Many individual gene clusters (nodes) on the map are highly enriched in biological and molecular functions, such as protein synthesis, protein degradation and DNA replication. Gene families, such as odorant binding proteins, can be classified into distinct functional groups based on their expression and evolutionary history. Immunity-related genes are non-randomly distributed in several distinct regions on the map, and are generally distant from genes with house-keeping roles. Each immunity-rich region appears to represent a distinct biological context for pathogen recognition and clearance (e.g. the humoral and gut epithelial responses). Several immunity gene families, such as peptidoglycan recognition proteins (PGRPs) and defensins, appear to be specialised for these distinct roles, while three genes with physically interacting protein products (LRIM1/APL1C/TEP1) are found in close proximity. CONCLUSIONS: The map provides the first genome-scale, multi-experiment overview of gene expression in A. gambiae and should also be useful at the gene-level for investigating potential interactions. A web interface is available through the VectorBase website http://www.vectorbase.org/. It is regularly updated as new experimental data become available.
Subject(s)
Anopheles/genetics , Chromosome Mapping/methods , Transcriptome , Animals , Anopheles/metabolism , Carrier Proteins/genetics , Defensins/genetics , Gene Expression Profiling , Genome , Insect Proteins/genetics , Receptors, Odorant/genetics
ABSTRACT
VectorBase (http://www.vectorbase.org) is an NIAID-funded Bioinformatic Resource Center focused on invertebrate vectors of human pathogens. VectorBase annotates and curates vector genomes providing a web accessible integrated resource for the research community. Currently, VectorBase contains genome information for three mosquito species: Aedes aegypti, Anopheles gambiae and Culex quinquefasciatus, a body louse Pediculus humanus and a tick species Ixodes scapularis. Since our last report VectorBase has initiated a community annotation system, a microarray and gene expression repository and controlled vocabularies for anatomy and insecticide resistance. We have continued to develop both the software infrastructure and tools for interrogating the stored data.
Subject(s)
Arthropod Vectors/genetics , Culicidae/genetics , Databases, Genetic , Aedes/genetics , Animals , Anopheles/genetics , Culex/genetics , Culicidae/metabolism , Gene Expression Profiling , Genome, Insect , Genomics , Ixodes/genetics , Pediculus/genetics , Vocabulary, Controlled
ABSTRACT
VectorBase (http://www.vectorbase.org/) is a web-accessible data repository for information about invertebrate vectors of human pathogens. VectorBase annotates and maintains vector genomes providing an integrated resource for the research community. Currently, VectorBase contains genome information for two organisms: Anopheles gambiae, a vector for the Plasmodium protozoan agent causing malaria, and Aedes aegypti, a vector for the flaviviral agents causing Yellow fever and Dengue fever.
Subject(s)
Aedes/genetics , Anopheles/genetics , Databases, Genetic , Genome, Insect , Insect Vectors/genetics , Animals , Base Sequence , Conserved Sequence , Genomics , Humans , Internet , User-Computer Interface
ABSTRACT
BACKGROUND: Anopheles funestus is one of the 3 most consequential and widespread vectors of human malaria in tropical Africa. However, the lack of a high-quality reference genome has hindered the association of phenotypic traits with their genetic basis in this important mosquito. FINDINGS: Here we present a new high-quality A. funestus reference genome (AfunF3) assembled using 240× coverage of long-read single-molecule sequencing for contigging, combined with 100× coverage of short-read Hi-C data for chromosome scaffolding. The assembled contigs total 446 Mbp of sequence and contain substantial duplication due to alternative alleles present in the sequenced pool of mosquitoes from the FUMOZ colony. Using alignment and depth-of-coverage information, these contigs were deduplicated to a 211 Mbp primary assembly, which is closer to the expected haploid genome size of 250 Mbp. This primary assembly consists of 1,053 contigs organized into 3 chromosome-scale scaffolds with an N50 contig size of 632 kbp and an N50 scaffold size of 93.811 Mbp, representing a 100-fold improvement in continuity versus the current reference assembly, AfunF1. CONCLUSION: This highly contiguous and complete A. funestus reference genome assembly will serve as an improved basis for future studies of genomic variation and organization in this important disease vector.
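For readers unfamiliar with the N50 statistic quoted above, a minimal sketch of the standard definition follows: sort the sequence lengths from longest to shortest and report the length at which the running total first reaches half of the assembled bases. The function name and the toy lengths are hypothetical and are unrelated to the AfunF3 contigs.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of all assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy contig lengths (hypothetical): total 300, so half is 150.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 reaches 150, so this prints 70
```

A higher N50 indicates that more of the assembly sits in long sequences, which is why the statistic is used as a measure of continuity.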
Subject(s)
Anopheles/genetics , Chromosomes, Insect , Whole Genome Sequencing , Animals , Female , Genomics
ABSTRACT
Chromosomal inversion polymorphisms play an important role in adaptation to environmental heterogeneities. For mosquito species in the Anopheles gambiae complex that are significant vectors of human malaria, paracentric inversion polymorphisms are abundant and are associated with ecologically and epidemiologically important phenotypes. Improved understanding of these traits relies on determining mosquito karyotype, which currently depends upon laborious cytogenetic methods whose application is limited both by the requirement for specialized expertise and for properly preserved adult females at specific gonotrophic stages. To overcome this limitation, we developed sets of tag single nucleotide polymorphisms (SNPs) inside inversions whose biallelic genotype is strongly correlated with inversion genotype. We leveraged 1,347 fully sequenced An. gambiae and Anopheles coluzzii genomes in the Ag1000G database of natural variation. Beginning with principal components analysis (PCA) of population samples, applied to windows of the genome containing individual chromosomal rearrangements, we classified samples into three inversion genotypes, distinguishing homozygous inverted and homozygous uninverted groups by inclusion of the small subset of specimens in Ag1000G that are associated with cytogenetic metadata. We then assessed the correlation between candidate tag SNP genotypes and PCA-based inversion genotypes in our training sets, selecting those candidates with >80% agreement. Our initial tests both in held-back validation samples from Ag1000G and in data independent of Ag1000G suggest that when used for in silico inversion genotyping of sequenced mosquitoes, these tags perform better than traditional cytogenetics, even for specimens where only a small subset of the tag SNPs can be successfully ascertained.
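The selection of tag SNPs by agreement with inversion genotype can be illustrated with a minimal sketch. It assumes that both SNP and inversion genotypes are coded as 0, 1, or 2 (copies of the alternate allele and of the inverted arrangement, respectively), that the alternate allele tracks the inverted arrangement, and that missing SNP calls are represented as None and skipped. The function names, the coding scheme, and the toy data are hypothetical; only the >80% agreement threshold comes from the abstract.

```python
def concordance(snp_genotypes, inversion_genotypes):
    """Fraction of samples whose SNP genotype equals the inversion genotype.

    Samples with a missing SNP call (None) are excluded from the
    denominator. Returns 0.0 if no sample has a usable call.
    """
    pairs = [
        (snp, inv)
        for snp, inv in zip(snp_genotypes, inversion_genotypes)
        if snp is not None
    ]
    if not pairs:
        return 0.0
    return sum(snp == inv for snp, inv in pairs) / len(pairs)


def select_tags(candidates, inversion_genotypes, threshold=0.8):
    """Return names of candidate SNPs whose concordance exceeds the threshold."""
    return [
        name
        for name, genotypes in candidates.items()
        if concordance(genotypes, inversion_genotypes) > threshold
    ]


# Toy training set (hypothetical): 10 samples with known inversion genotypes.
toy_inversion = [0, 0, 1, 1, 2, 2, 2, 0, 1, 2]
toy_candidates = {
    "snp_a": [0, 0, 1, 1, 2, 2, 2, 0, 1, 1],        # 9 of 10 agree
    "snp_b": [0, 1, 1, 0, 2, 2, 0, 0, 1, 2],        # 7 of 10 agree
    "snp_c": [None, 0, 1, 1, 2, None, 2, 0, 1, 2],  # 8 of 8 called agree
}
tags = select_tags(toy_candidates, toy_inversion)
print(tags)
```

Excluding missing calls from the denominator mirrors the abstract's point that tags remain usable when only a subset of SNPs can be ascertained, although a real workflow would likely also require a minimum number of called samples.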
Subject(s)
Anopheles/classification , Anopheles/genetics , Chromosomes, Insect , Karyotyping , Polymorphism, Genetic , Animals , Anopheles/parasitology , Chromosome Inversion , Evolution, Molecular , Genetic Variation , Genotype , Humans , Malaria/transmission , Mosquito Vectors/classification , Mosquito Vectors/genetics , Mosquito Vectors/parasitology , Polymorphism, Single Nucleotide , Reproducibility of Results
ABSTRACT
Hall et al. have strategically used long-read sequencing technology to characterize the structure and highly repetitive content of the Y chromosome in Anopheles malaria mosquitoes. Their work confirms that this important but elusive heterochromatic sex chromosome is evolving extremely rapidly and harbors a remarkably small number of genes.
Subject(s)
Anopheles/genetics , Genome, Insect/genetics , Y Chromosome/genetics , Animals , Female , Male , Sex Determination Processes/genetics
ABSTRACT
Linking phenotypic with genotypic diversity has become a major requirement for basic and applied genome-centric biological research. To meet this need, a comprehensive database backend for efficiently storing, querying and analyzing large experimental data sets is necessary. Chado, a generic, modular, community-based database schema is widely used in the biological community to store information associated with genome sequence data. To meet the need to also accommodate large-scale phenotyping and genotyping projects, a new Chado module called Natural Diversity has been developed. The module strictly adheres to the Chado remit of being generic and ontology driven. The flexibility of the new module is demonstrated in its capacity to store any type of experiment that either uses or generates specimens or stock organisms. Experiments may be grouped or structured hierarchically, whereas any kind of biological entity can be stored as the observed unit, from a specimen to be used in genotyping or phenotyping experiments, to a group of species collected in the field that will undergo further lab analysis. We describe details of the Natural Diversity module, including the design approach, the relational schema and use cases implemented in several databases.