ABSTRACT
The accurate and complete assembly of both haplotype sequences of a diploid organism is essential to understanding the role of variation in genome functions, phenotypes and diseases [1]. Here, using a trio-binning approach, we present a high-quality, diploid reference genome, with both haplotypes assembled independently at the chromosome level, for the common marmoset (Callithrix jacchus), a primate model system that is widely used in biomedical research [2,3]. The full spectrum of heterozygosity between the two haplotypes involves 1.36% of the genome, much higher than the 0.13% indicated by the standard estimation based on single-nucleotide heterozygosity alone. The de novo mutation rate is 0.43 × 10^-8 per site per generation, and the paternally inherited genome acquired twice as many mutations as the maternal. Our diploid assembly enabled us to discover a recent expansion of the sex-differentiation region and unique evolutionary changes in the marmoset Y chromosome. In addition, we identified many genes with signatures of positive selection that might have contributed to the evolution of Callithrix biological features. Brain-related genes were highly conserved between marmosets and humans, although several genes experienced lineage-specific copy number variations or diversifying selection, with implications for the use of marmosets as a model system.
Subjects
Callithrix/genetics , Diploidy , Molecular Evolution , Genome/genetics , Genomics/standards , Animals , Biomedical Research , DNA Copy Number Variations , Female , Germ-Line Mutation/genetics , Haplotypes/genetics , Heterozygote , Humans , INDEL Mutation/genetics , Male , Reference Standards , Genetic Selection , Sex Differentiation/genetics , Y Chromosome/genetics
ABSTRACT
High-quality and complete reference genome assemblies are fundamental for the application of genomics to biology, disease, and biodiversity conservation. However, such assemblies are available for only a few non-microbial species [1-4]. To address this issue, the international Genome 10K (G10K) consortium [5,6] has worked over a five-year period to evaluate and develop cost-effective methods for assembling highly accurate and nearly complete reference genomes. Here we present lessons learned from generating assemblies for 16 species that represent six major vertebrate lineages. We confirm that long-read sequencing technologies are essential for maximizing genome quality, and that unresolved complex repeats and haplotype heterozygosity are major sources of assembly error when not handled correctly. Our assemblies correct substantial errors, add missing sequence in some of the best historical reference genomes, and reveal biological discoveries. These include the identification of many false gene duplications, increases in gene sizes, chromosome rearrangements that are specific to lineages, a repeated independent chromosome breakpoint in bat genomes, and a canonical GC-rich pattern in protein-coding genes and their regulatory regions. Adopting these lessons, we have embarked on the Vertebrate Genomes Project (VGP), an international effort to generate high-quality, complete reference genomes for all of the roughly 70,000 extant vertebrate species and to help enable a new era of discovery across the life sciences.
Subjects
Genome , Genomics/methods , Vertebrates/genetics , Animals , Birds , Gene Library , Genome Size , Mitochondrial Genome , Haplotypes , High-Throughput Nucleotide Sequencing , Molecular Sequence Annotation , Sequence Alignment , DNA Sequence Analysis , Sex Chromosomes/genetics
ABSTRACT
After two decades of improvements, the current human reference genome (GRCh38) is the most accurate and complete vertebrate genome ever produced. However, no single chromosome has been finished end to end, and hundreds of unresolved gaps persist [1,2]. Here we present a human genome assembly that surpasses the continuity of GRCh38 [2], along with a gapless, telomere-to-telomere assembly of a human chromosome. This was enabled by high-coverage, ultra-long-read nanopore sequencing of the complete hydatidiform mole CHM13 genome, combined with complementary technologies for quality improvement and validation. Focusing our efforts on the human X chromosome [3], we reconstructed the centromeric satellite DNA array (approximately 3.1 Mb) and closed the 29 remaining gaps in the current reference, including new sequences from the human pseudoautosomal regions and from cancer-testis ampliconic gene families (CT-X and GAGE). These sequences will be integrated into future human reference genome releases. In addition, the complete chromosome X, combined with the ultra-long nanopore data, allowed us to map methylation patterns across complex tandem repeats and satellite arrays. Our results demonstrate that finishing the entire human genome is now within reach, and the data presented here will facilitate ongoing efforts to complete the other human chromosomes.
Subjects
Human X Chromosome/genetics , Human Genome/genetics , Telomere/genetics , Centromere/genetics , CpG Islands/genetics , DNA Methylation , Satellite DNA/genetics , Female , Humans , Hydatidiform Mole/genetics , Male , Pregnancy , Reproducibility of Results , Testis/metabolism
ABSTRACT
Sea turtles represent an ancient lineage of marine vertebrates that evolved from terrestrial ancestors over 100 Mya. The genomic basis of the unique physiological and ecological traits enabling these species to thrive in diverse marine habitats remains largely unknown. Additionally, many populations have drastically declined due to anthropogenic activities over the past two centuries, and their recovery is a high global conservation priority. We generated and analyzed high-quality reference genomes for the leatherback (Dermochelys coriacea) and green (Chelonia mydas) turtles, representing the two extant sea turtle families. These genomes are highly syntenic and homologous, but localized regions of noncollinearity were associated with higher copy numbers of immune, zinc-finger, and olfactory receptor (OR) genes in green turtles, with ORs related to waterborne odorants greatly expanded in green turtles. Our findings suggest that divergent evolution of these key gene families may underlie immunological and sensory adaptations assisting navigation, occupancy of neritic versus pelagic environments, and diet specialization. Reduced collinearity was especially prevalent in microchromosomes, with greater gene content, heterozygosity, and genetic distances between species, supporting their critical role in vertebrate evolutionary adaptation. Finally, diversity and demographic histories starkly contrasted between species, indicating that leatherback turtles have had a low yet stable effective population size, exhibit extremely low diversity compared with other reptiles, and harbor a higher genetic load compared with green turtles, reinforcing concern over their persistence under future climate scenarios. These genomes provide invaluable resources for advancing our understanding of evolution and conservation best practices in an imperiled vertebrate lineage.
Subjects
Turtles , Animals , Ecosystem , Population Dynamics
ABSTRACT
BACKGROUND: The Nile rat (Arvicanthis niloticus) is an important animal model because of its robust diurnal rhythm, a cone-rich retina, and a propensity to develop diet-induced diabetes without chemical or genetic modifications. A closer similarity to humans in these aspects, compared to the widely used Mus musculus and Rattus norvegicus models, holds the promise of better translation of research findings to the clinic. RESULTS: We report a 2.5 Gb, chromosome-level reference genome assembly with fully resolved parental haplotypes, generated with the Vertebrate Genomes Project (VGP). The assembly is highly contiguous, with contig N50 of 11.1 Mb, scaffold N50 of 83 Mb, and 95.2% of the sequence assigned to chromosomes. We used a novel workflow to identify 3613 segmental duplications and quantify duplicated genes. Comparative analyses revealed unique genomic features of the Nile rat, including some that affect genes associated with type 2 diabetes and metabolic dysfunctions. We discuss 14 genes that are heterozygous in the Nile rat or highly diverged from the house mouse. CONCLUSIONS: Our findings reflect the exceptional level of genomic resolution present in this assembly, which will greatly expand the potential of the Nile rat as a model organism.
Subjects
Type 2 Diabetes Mellitus , Humans , Animals , Haplotypes , Type 2 Diabetes Mellitus/genetics , Murinae , Genome , Genomics
ABSTRACT
Understanding the mechanisms driving lineage-specific evolution in both primates and rodents has been hindered by the lack of sister clades with a similar phylogenetic structure having high-quality genome assemblies. Here, we have created chromosome-level assemblies of the Mus caroli and Mus pahari genomes. Together with the Mus musculus and Rattus norvegicus genomes, this set of rodent genomes is similar in divergence times to the Hominidae (human-chimpanzee-gorilla-orangutan). By comparing the evolutionary dynamics between the Muridae and Hominidae, we identified punctate events of chromosome reshuffling that shaped the ancestral karyotype of Mus musculus and Mus caroli between 3 and 6 million yr ago, but that are absent in the Hominidae. Hominidae show between four- and sevenfold lower rates of nucleotide change and feature turnover in both neutral and functional sequences, suggesting an underlying coherence to the Muridae acceleration. Our system of matched, high-quality genome assemblies revealed how specific classes of repeats can play lineage-specific roles in related species. Recent LINE activity has remodeled protein-coding loci to a greater extent across the Muridae than the Hominidae, with functional consequences at the species level such as reproductive isolation. Furthermore, we charted a Muridae-specific retrotransposon expansion at unprecedented resolution, revealing how a single nucleotide mutation transformed a specific SINE element into an active CTCF binding site carrier specifically in Mus caroli, which resulted in thousands of novel, species-specific CTCF binding sites. Our results show that the comparison of matched phylogenetic sets of genomes will be an increasingly powerful strategy for understanding mammalian biology.
Subjects
Molecular Evolution , Genome/genetics , Muridae/genetics , Phylogeny , Animals , Binding Sites , CCCTC-Binding Factor/genetics , Chromosomes/genetics , Karyotyping/methods , Long Interspersed Nucleotide Elements/genetics , Mice , Retroelements/genetics , Species Specificity
ABSTRACT
The human reference genome assembly plays a central role in nearly all aspects of today's basic and clinical research. GRCh38 is the first coordinate-changing assembly update since 2009; it reflects the resolution of roughly 1000 issues and encompasses modifications ranging from thousands of single base changes to megabase-scale path reorganizations, gap closures, and localization of previously orphaned sequences. We developed a new approach to sequence generation for targeted base updates and used data from new genome mapping technologies and single haplotype resources to identify and resolve larger assembly issues. For the first time, the reference assembly contains sequence-based representations for the centromeres. We also expanded the number of alternate loci to create a reference that provides a more robust representation of human population variation. We demonstrate that the updates render the reference an improved annotation substrate, alter read alignments in unchanged regions, and impact variant interpretation at clinically relevant loci. We additionally evaluated a collection of new de novo long-read haploid assemblies and conclude that although the new assemblies compare favorably to the reference with respect to continuity, error rate, and gene completeness, the reference still provides the best representation for complex genomic regions and coding sequences. We assert that the collected updates in GRCh38 make the newer assembly a more robust substrate for comprehensive analyses that will promote our understanding of human biology and advance our efforts to improve health.
Subjects
Contig Mapping/methods , Human Genome , Genomics/methods , DNA Sequence Analysis/methods , Software , Contig Mapping/standards , Genomics/standards , Haploidy , Haplotypes , Humans , Genetic Polymorphism , Reference Standards , DNA Sequence Analysis/standards
ABSTRACT
We have generated an improved assembly and gene annotation of the pig X Chromosome, and a first draft assembly of the pig Y Chromosome, by sequencing BAC and fosmid clones from Duroc animals and incorporating information from optical mapping and fiber-FISH. The X Chromosome carries 1033 annotated genes, 690 of which are protein coding. Gene order closely matches that found in primates (including humans) and carnivores (including cats and dogs), which is inferred to be ancestral. Nevertheless, several protein-coding genes present on the human X Chromosome were absent from the pig, and 38 pig-specific X-chromosomal genes were annotated, 22 of which were olfactory receptors. The pig Y-specific Chromosome sequence generated here comprises 30 megabases (Mb). A 15-Mb subset of this sequence was assembled, revealing two clusters of male-specific low copy number genes, separated by an ampliconic region including the HSFY gene family, which together make up most of the short arm. Both clusters contain palindromes with high sequence identity, presumably maintained by gene conversion. Many of the ancestral X-related genes previously reported in at least one mammalian Y Chromosome are represented either as active genes or partial sequences. This sequencing project has allowed us to identify genes, both single copy and amplified, on the pig Y Chromosome, to compare the pig X and Y Chromosomes for homologous sequences, and thereby to reveal mechanisms underlying pig X and Y Chromosome evolution.
Subjects
Mammalian Chromosomes/genetics , Molecular Evolution , Swine/genetics , X Chromosome/genetics , Y Chromosome/genetics , Animals , Base Sequence , Cats/genetics , Dogs/genetics , Female , Gene Conversion , Gene Expression , Gene Library , Gene Order , Humans , Male , Molecular Sequence Data , Sequence Alignment , DNA Sequence Analysis
ABSTRACT
Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.
Subjects
Conserved Sequence/genetics , Genome/genetics , Zebrafish/genetics , Animals , Chromosomes/genetics , Molecular Evolution , Female , Genes/genetics , Human Genome/genetics , Genomics , Humans , Male , Meiosis/genetics , Molecular Sequence Annotation , Pseudogenes/genetics , Reference Standards , Sex Determination Processes/genetics , Zebrafish Proteins/genetics
ABSTRACT
For 10,000 years pigs and humans have shared a close and complex relationship. From domestication to modern breeding practices, humans have shaped the genomes of domestic pigs. Here we present the assembly and analysis of the genome sequence of a female domestic Duroc pig (Sus scrofa) and a comparison with the genomes of wild and domestic pigs from Europe and Asia. Wild pigs emerged in South East Asia and subsequently spread across Eurasia. Our results reveal a deep phylogenetic split between European and Asian wild boars ~1 million years ago, and a selective sweep analysis indicates selection on genes involved in RNA processing and regulation. Genes associated with immune response and olfaction exhibit fast evolution. Pigs have the largest repertoire of functional olfactory receptor genes, reflecting the importance of smell in this scavenging animal. The pig genome sequence provides an important resource for further improvements of this important livestock species, and our identification of many putative disease-causing variants extends the potential of the pig as a biomedical model.
Subjects
Genome/genetics , Phylogeny , Sus scrofa/classification , Sus scrofa/genetics , Animals , Demography , Animal Models , Molecular Sequence Data , Population Dynamics
ABSTRACT
MOTIVATION: For most research approaches, genome analyses are dependent on the existence of a high quality genome reference assembly. However, the local accuracy of an assembly remains difficult to assess and improve. The gEVAL browser allows the user to interrogate an assembly in any region of the genome by comparing it to different datasets and evaluating the concordance. These analyses include: a wide variety of sequence alignments, comparative analyses of multiple genome assemblies, and consistency with optical and other physical maps. gEVAL highlights allelic variations, regions of low complexity, abnormal coverage, and potential sequence and assembly errors, and offers strategies for improvement. Although gEVAL focuses primarily on sequence integrity, it can also display arbitrary annotation including from Ensembl or TrackHub sources. We provide gEVAL web sites for many human, mouse, zebrafish and chicken assemblies to support the Genome Reference Consortium, and gEVAL is also downloadable to enable its use for any organism and assembly. AVAILABILITY AND IMPLEMENTATION: Web Browser: http://geval.sanger.ac.uk, Plugin: http://wchow.github.io/wtsi-geval-plugin CONTACT: kj2@sanger.ac.uk SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Subjects
Genomics , Web Browser , Animals , Genome , Humans , Internet , Mice , Sequence Alignment
ABSTRACT
We prospectively evaluated the accuracy of the 2007 World Health Organization (WHO) criteria for diagnosing polycythemia vera (PV), especially in "early-stage" patients. A total of 28 of 30 patients were diagnosed as PV owing to an elevated Cr-51 red cell mass (RCM), JAK2 positivity, and at least 1 minor criterion. A total of 18 PV patients did not meet the WHO criterion for an increased hemoglobin value and 8 did not meet the WHO criterion for an increased hematocrit value. Bone marrow morphology was very valuable for diagnosis. Low serum erythropoietin (EPO) values were specific for PV, but normal EPO values were found at presentation (20%). We recommend revision of the WHO criteria, especially to distinguish early-stage PV from essential thrombocythemia. Major criteria remain JAK2 positivity and increased red cell volume, but Cr-51 RCM is mandatory for patients who do not meet the defined elevated hemoglobin or hematocrit value (>18.5 g/dL and 60% in men and >16.5 g/dL and 56% in women, respectively). Minor criteria remain bone marrow histology or a low serum EPO value. For patients with a normal EPO value, marrow examination is mandatory for diagnostic confirmation. Because the therapies for myeloproliferative disorders differ, our data have major clinical implications.
Subjects
Polycythemia Vera/blood , Polycythemia Vera/diagnosis , Practice Guidelines as Topic/standards , World Health Organization , Bone Marrow/pathology , Erythrocyte Volume , Erythropoietin/blood , Hematocrit , Hemoglobins/metabolism , Humans , Janus Kinase 2/metabolism , Polycythemia Vera/enzymology , Prospective Studies , Sensitivity and Specificity
ABSTRACT
Genomic regions sometimes show patterns of genetic variation distinct from the genome-wide population structure. Such deviations have often been interpreted to represent effects of selection. However, systematic investigation of whether and how non-selective factors, such as recombination rates, can affect distinct patterns has been limited. Here, we associate distinct patterns of genetic variation with reduced recombination rates in a songbird, the Eurasian blackcap (Sylvia atricapilla), using a new reference genome assembly, whole-genome resequencing data and recombination maps. We find that distinct patterns of genetic variation reflect haplotype structure at genomic regions with different prevalence of reduced recombination rate across populations. At low-recombining regions shared in most populations, distinct patterns reflect conspicuous haplotypes segregating in multiple populations. At low-recombining regions found only in a few populations, distinct patterns represent variance among cryptic haplotypes within the low-recombining populations. With simulations, we confirm that these distinct patterns evolve neutrally by reduced recombination rate, on which the effects of selection can be overlaid. Our results highlight that distinct patterns of genetic variation can emerge through evolutionary reduction of local recombination rate. The recombination landscape as an evolvable trait therefore plays an important role determining the heterogeneous distribution of genetic variation along the genome.
ABSTRACT
Vocal rhythm plays a fundamental role in sexual selection and species recognition in birds, but little is known of its genetic basis due to the confounding effect of vocal learning in model systems. Uncovering its genetic basis could facilitate identifying genes potentially important in speciation. Here we investigate the genomic underpinnings of rhythm in vocal non-learning Pogoniulus tinkerbirds using 135 individual whole genomes distributed across a southern African hybrid zone. We find rhythm speed is associated with two genes that are also known to affect human speech, Neurexin-1 and Coenzyme Q8A. Models leveraging ancestry reveal these candidate loci also impact rhythmic stability, a trait linked with motor performance which is an indicator of quality. Character displacement in rhythmic stability suggests possible reinforcement against hybridization, supported by evidence of asymmetric assortative mating in the species producing faster, more stable rhythms. Because rhythm is omnipresent in animal communication, candidate genes identified here may shape vocal rhythm across birds and other vertebrates.
Subjects
Animal Vocalization , Animals , Animal Vocalization/physiology , Male , Genomics , Genome/genetics , Female , Songbirds/genetics , Songbirds/physiology , Birds/genetics , Birds/physiology
ABSTRACT
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared with its predecessor. Gene annotations are now more complete, improving the mapping precision of genomic, transcriptomic, and proteomics datasets. We jointly analyzed 163 short-read whole-genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ~20.0 million sequence variations, of which 18,700 are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.
Subjects
Genome , Genomics , Rats , Animals , Genome/genetics , Molecular Sequence Annotation , Whole Genome Sequencing , Genetic Variation/genetics
ABSTRACT
Tasmanian devils have spawned two transmissible cancer lineages, named devil facial tumor 1 (DFT1) and devil facial tumor 2 (DFT2). We investigated the genetic diversity and evolution of these clones by analyzing 78 DFT1 and 41 DFT2 genomes relative to a newly assembled, chromosome-level reference. Time-resolved phylogenetic trees reveal that DFT1 first emerged in 1986 (1982 to 1989) and DFT2 in 2011 (2009 to 2012). Subclone analysis documents transmission of heterogeneous cell populations. DFT2 has faster mutation rates than DFT1 across all variant classes, including substitutions, indels, rearrangements, transposable element insertions, and copy number alterations, and we identify a hypermutated DFT1 lineage with defective DNA mismatch repair. Several loci show plausible evidence of positive selection in DFT1 or DFT2, including loss of chromosome Y and inactivation of MGA, but none are common to both cancers. This study reveals the parallel long-term evolution of two transmissible cancers inhabiting a common niche in Tasmanian devils.
Subjects
Molecular Evolution , Facial Neoplasms , Marsupialia , Genetic Selection , Animals , Facial Neoplasms/classification , Facial Neoplasms/genetics , Facial Neoplasms/veterinary , Genome , Marsupialia/genetics , Phylogeny
ABSTRACT
Numerous novel adaptations characterise the radiation of notothenioids, the dominant fish group in the freezing seas of the Southern Ocean. To improve understanding of the evolution of this iconic fish group, here we generate and analyse new genome assemblies for 24 species covering all major subgroups of the radiation, including five long-read assemblies. We present a new estimate for the onset of the radiation at 10.7 million years ago, based on a time-calibrated phylogeny derived from genome-wide sequence data. We identify a two-fold variation in genome size, driven by expansion of multiple transposable element families, and use the long-read data to reconstruct two evolutionarily important, highly repetitive gene family loci. First, we present the most complete reconstruction to date of the antifreeze glycoprotein gene family, whose emergence enabled survival in sub-zero temperatures, showing the expansion of the antifreeze gene locus from the ancestral to the derived state. Second, we trace the loss of haemoglobin genes in icefishes, the only vertebrates lacking functional haemoglobins, through complete reconstruction of the two haemoglobin gene clusters across notothenioid families. Both the haemoglobin and antifreeze genomic loci are characterised by multiple transposon expansions that may have driven the evolutionary history of these genes.
Subjects
Fishes , Perciformes , Animals , Fishes/genetics , Genomics , Vertebrates , Phylogeny , Hemoglobins/genetics , Antarctic Regions
ABSTRACT
Insights into the evolution of non-model organisms are limited by the lack of reference genomes of high accuracy, completeness, and contiguity. Here, we present a chromosome-level, karyotype-validated reference genome and pangenome for the barn swallow (Hirundo rustica). We complement these resources with a reference-free multialignment of the reference genome with other bird genomes and with the most comprehensive catalog of genetic markers for the barn swallow. We identify potentially conserved and accelerated genes using the multialignment and estimate genome-wide linkage disequilibrium using the catalog. We use the pangenome to infer core and accessory genes and to detect variants using it as a reference. Overall, these resources will foster population genomics studies in the barn swallow, enable detection of candidate genes in comparative genomics studies, and help reduce bias toward a single reference genome.
Subjects
Swallows , Animals , Swallows/genetics , Metagenomics , Genome/genetics , Genomics , Chromosomes
ABSTRACT
BACKGROUND: The Australian black swan (Cygnus atratus) is an iconic species with contrasting plumage to that of the closely related northern hemisphere white swans. The relative geographic isolation of the black swan may have resulted in a limited immune repertoire and increased susceptibility to infectious diseases, notably infectious diseases from which Australia has been largely shielded. Unlike mallard ducks and the mute swan (Cygnus olor), the black swan is extremely sensitive to highly pathogenic avian influenza. Understanding this susceptibility has been impaired by the absence of any available swan genome and transcriptome information. RESULTS: Here, we generate the first chromosome-length black and mute swan genomes annotated with transcriptome data, all using long-read based pipelines generated for vertebrate species. We use these genomes and transcriptomes to show that unlike other wild waterfowl, black swans lack an expanded immune gene repertoire, lack a key viral pattern-recognition receptor in endothelial cells and mount a poorly controlled inflammatory response to highly pathogenic avian influenza. We also implicate genetic differences in the SLC45A2 gene in the iconic plumage of the black swan. CONCLUSION: Together, these data suggest that the immune system of the black swan is such that, should any avian viral infection become established in its native habitat, the black swan would be in significant peril.
Subjects
Anseriformes , Avian Influenza , Animals , Transcriptome , Endothelial Cells , Australia
ABSTRACT
The seventh iteration of the reference genome assembly for Rattus norvegicus, mRatBN7.2, corrects numerous misplaced segments, reduces base-level errors approximately 9-fold, and increases contiguity 290-fold compared to its predecessor. Gene annotations are now more complete, significantly improving the mapping precision of genomic, transcriptomic, and proteomics data sets. We jointly analyzed 163 short-read whole genome sequencing datasets representing 120 laboratory rat strains and substrains using mRatBN7.2. We defined ~20.0 million sequence variations, of which 18.7 thousand are predicted to potentially impact the function of 6,677 genes. We also generated a new rat genetic map from 1,893 heterogeneous stock rats and annotated transcription start sites and alternative polyadenylation sites. The mRatBN7.2 assembly, along with the extensive analysis of genomic variations among rat strains, enhances our understanding of the rat genome, providing researchers with an expanded resource for studies involving rats.
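Several abstracts in this list summarize assembly contiguity with N50 (for example, the Nile rat record reports a contig N50 of 11.1 Mb), and the abstract above reports a fold increase in contiguity. The following is a minimal sketch of how N50 and a fold change between two assemblies could be computed. The contig lengths are hypothetical toy values, the function name `n50` is illustrative, and it is an assumption that the 290-fold figure refers to a statistic of this kind.

```python
def n50(lengths):
    """Return the N50 of a list of contig (or scaffold) lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assemblies (arbitrary units), not data from any abstract.
old_assembly = [2, 3, 4, 5, 6]
new_assembly = [40, 30, 20, 10]

old_n50 = n50(old_assembly)
new_n50 = n50(new_assembly)
fold_change = new_n50 / old_n50

print(f"old N50 = {old_n50}, new N50 = {new_n50}, fold change = {fold_change:.1f}")
```

Because N50 depends only on the distribution of sequence lengths, it says nothing about base-level accuracy, which is why abstracts such as the one above report error reduction as a separate figure.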