ABSTRACT
Anopheles gambiae is the principal vector of malaria, a disease that afflicts more than 500 million people and causes more than 1 million deaths each year. Tenfold shotgun sequence coverage was obtained from the PEST strain of A. gambiae and assembled into scaffolds that span 278 million base pairs. A total of 91% of the genome was organized in 303 scaffolds; the largest scaffold was 23.1 million base pairs. There was substantial genetic variation within this strain, and the apparent existence of two haplotypes of approximately equal frequency ("dual haplotypes") in a substantial fraction of the genome likely reflects the outbred nature of the PEST strain. The sequence produced a conservative inference of more than 400,000 single-nucleotide polymorphisms that showed a markedly bimodal density distribution. Analysis of the genome sequence revealed strong evidence for about 14,000 protein-encoding transcripts. Prominent expansions in specific families of proteins likely involved in cell adhesion and immunity were noted. An expressed sequence tag analysis of genes regulated by blood feeding provided insights into the physiological adaptations of a hematophagous insect.
Subject(s)
Anopheles/genetics , Insect Genes , Genome , DNA Sequence Analysis , Animals , Anopheles/classification , Anopheles/parasitology , Anopheles/physiology , Biological Evolution , Blood , Chromosome Inversion , Bacterial Artificial Chromosomes , Computational Biology , DNA Transposable Elements , Digestion , Drosophila melanogaster/genetics , Enzymes/chemistry , Enzymes/genetics , Enzymes/metabolism , Expressed Sequence Tags , Feeding Behavior , Gene Expression Regulation , Genetic Variation , Haplotypes , Humans , Insect Proteins/chemistry , Insect Proteins/genetics , Insect Proteins/physiology , Insect Vectors/genetics , Insect Vectors/parasitology , Insect Vectors/physiology , Falciparum Malaria/transmission , Molecular Sequence Data , Mosquito Control , Physical Chromosome Mapping , Plasmodium falciparum/growth & development , Single Nucleotide Polymorphism , Proteome , Species Specificity , Transcription Factors/chemistry , Transcription Factors/genetics , Transcription Factors/physiology
ABSTRACT
Chromosome 14 is one of five acrocentric chromosomes in the human genome. These chromosomes are characterized by a heterochromatic short arm that contains essentially ribosomal RNA genes, and a euchromatic long arm in which most, if not all, of the protein-coding genes are located. The finished sequence of human chromosome 14 comprises 87,410,661 base pairs, representing 100% of its euchromatic portion, in a single continuous segment covering the entire long arm with no gaps. Two loci of crucial importance for the immune system, as well as more than 60 disease genes, have been localized so far on chromosome 14. We identified 1,050 genes and gene fragments, and 393 pseudogenes. On the basis of comparisons with other vertebrate genomes, we estimate that more than 96% of the chromosome 14 genes have been annotated. From an analysis of the CpG island occurrences, we estimate that 70% of these annotated genes are complete at their 5' end.