ABSTRACT
The gut fungal community represents an essential element of human health, yet its functional and metabolic potential remains insufficiently elucidated, largely due to the limited availability of reference genomes. To address this gap, we presented the cultivated gut fungi (CGF) catalog, encompassing 760 fungal genomes derived from the feces of healthy individuals. This catalog comprises 206 species spanning 48 families, including 69 species previously unidentified. We explored the functional and metabolic attributes of the CGF species and utilized this catalog to construct a phylogenetic representation of the gut mycobiome by analyzing over 11,000 fecal metagenomes from Chinese and non-Chinese populations. Moreover, we identified significant common disease-related variations in gut mycobiome composition and corroborated the associations between fungal signatures and inflammatory bowel disease (IBD) through animal experimentation. These resources and findings substantially enrich our understanding of the biological diversity and disease relevance of the human gut mycobiome.
Subject(s)
Fungi , Gastrointestinal Microbiome , Mycobiome , Animals , Humans , Male , Mice , Feces/microbiology , Fungi/genetics , Fungi/classification , Fungi/isolation & purification , Genome, Fungal/genetics , Genomics , Inflammatory Bowel Diseases/microbiology , Inflammatory Bowel Diseases/genetics , Metagenome , Phylogeny , Female , Adult , Middle Aged
ABSTRACT
The 1000 Genomes Project (1kGP) is the largest fully open resource of whole-genome sequencing (WGS) data consented for public distribution without access or use restrictions. The final, phase 3 release of the 1kGP included 2,504 unrelated samples from 26 populations and was based primarily on low-coverage WGS. Here, we present a high-coverage 3,202-sample WGS 1kGP resource, which now includes 602 complete trios, sequenced to a depth of 30X using Illumina. We performed single-nucleotide variant (SNV) and short insertion and deletion (INDEL) discovery and generated a comprehensive set of structural variants (SVs) by integrating multiple analytic methods through a machine learning model. We show gains in sensitivity and precision of variant calls compared to phase 3, especially among rare SNVs as well as INDELs and SVs spanning the frequency spectrum. We also generated an improved reference imputation panel, making variants discovered here accessible for association studies.
Subject(s)
Genome, Human , Whole Genome Sequencing , Female , High-Throughput Nucleotide Sequencing/methods , Humans , INDEL Mutation , Male , Polymorphism, Single Nucleotide
ABSTRACT
RNA viruses generate defective viral genomes (DVGs) that can interfere with replication of the parental wild-type virus. To examine their therapeutic potential, we created a DVG by deleting the capsid-coding region of poliovirus. Strikingly, intraperitoneal or intranasal administration of this genome, which we termed eTIP1, elicits an antiviral response, inhibits replication, and protects mice from several RNA viruses, including enteroviruses, influenza, and SARS-CoV-2. While eTIP1 replication following intranasal administration is limited to the nasal cavity, its antiviral action extends non-cell-autonomously to the lungs. eTIP1 broad-spectrum antiviral effects are mediated by both local and distal type I interferon responses. Importantly, while a single eTIP1 dose protects animals from SARS-CoV-2 infection, it also stimulates production of SARS-CoV-2 neutralizing antibodies that afford long-lasting protection from SARS-CoV-2 reinfection. Thus, eTIP1 is a safe and effective broad-spectrum antiviral generating short- and long-term protection against SARS-CoV-2 and other respiratory infections in animal models.
Subject(s)
Capsid Proteins/genetics , Defective Interfering Viruses/metabolism , Virus Replication/drug effects , Administration, Intranasal , Animals , Antiviral Agents/pharmacology , Broadly Neutralizing Antibodies/immunology , Broadly Neutralizing Antibodies/pharmacology , COVID-19 , Capsid Proteins/metabolism , Cell Line , Defective Interfering Viruses/pathogenicity , Disease Models, Animal , Genome, Viral/genetics , Humans , Influenza, Human , Interferons/metabolism , Male , Mice , Mice, Inbred C57BL , Poliovirus/genetics , Poliovirus/metabolism , Respiratory Tract Infections/virology , SARS-CoV-2/drug effects , SARS-CoV-2/pathogenicity
ABSTRACT
Structural variants contribute substantially to genetic diversity and are important evolutionarily and medically, but they are still understudied. Here we present a comprehensive analysis of structural variation in the Human Genome Diversity panel, a high-coverage dataset of 911 samples from 54 diverse worldwide populations. We identify, in total, 126,018 variants, 78% of which were not identified in previous global sequencing projects. Some reach high frequency and are private to continental groups or even individual populations, including regionally restricted runaway duplications and putatively introgressed variants from archaic hominins. By de novo assembly of 25 genomes using linked-read sequencing, we discover 1,643 breakpoint-resolved unique insertions, in aggregate accounting for 1.9 Mb of sequence absent from the GRCh38 reference. Our results illustrate the limitation of a single human reference and the need for high-quality genomes from diverse populations to fully discover and understand human genetic variation.
Subject(s)
Genetics, Population , Genomic Structural Variation , Alleles , Databases, Genetic , Gene Dosage , Gene Duplication , Gene Frequency/genetics , Genetic Variation , Genome, Human , Humans
ABSTRACT
We conducted the largest investigation of predisposition variants in cancer to date, discovering 853 pathogenic or likely pathogenic variants in 8% of 10,389 cases from 33 cancer types. Twenty-one genes showed single or cross-cancer associations, including novel associations of SDHA in melanoma and PALB2 in stomach adenocarcinoma. The 659 predisposition variants and 18 additional large deletions in tumor suppressors, including ATM, BRCA1, and NF1, showed low gene expression and frequent (43%) loss of heterozygosity or biallelic two-hit events. We also discovered 33 such variants in oncogenes, including missense variants in MET, RET, and PTPN11 associated with high gene expression. We nominated 47 additional predisposition variants from prioritized variants of uncertain significance (VUSs) supported by multiple lines of evidence involving case-control frequency, loss of heterozygosity, expression effect, and co-localization with mutations and modified residues. Our integrative approach links rare predisposition variants to functional consequences, informing future guidelines for variant classification and germline genetic testing in cancer.
Subject(s)
Germ Cells/metabolism , Neoplasms/pathology , DNA Copy Number Variations , Databases, Genetic , Gene Deletion , Gene Frequency , Genetic Predisposition to Disease , Genotype , Germ Cells/cytology , Germ-Line Mutation , Humans , Loss of Heterozygosity/genetics , Mutation, Missense , Neoplasms/genetics , Polymorphism, Single Nucleotide , Proto-Oncogene Proteins c-met/genetics , Proto-Oncogene Proteins c-ret/genetics , Tumor Suppressor Proteins/genetics
ABSTRACT
Arabidopsis thaliana serves as a model organism for the study of fundamental physiological, cellular, and molecular processes. It has also greatly advanced our understanding of intraspecific genome variation. We present a detailed map of variation in 1,135 high-quality re-sequenced natural inbred lines representing the native Eurasian and North African range and recently colonized North America. We identify relict populations that continue to inhabit ancestral habitats, primarily in the Iberian Peninsula. They have mixed with a lineage that has spread to northern latitudes from an unknown glacial refugium and is now found in a much broader spectrum of habitats. Insights into the history of the species and the fine-scale distribution of genetic diversity provide the basis for full exploitation of A. thaliana natural variation through integration of genomes and epigenomes with molecular and non-molecular phenotypes.
Subject(s)
Arabidopsis/genetics , Genome, Plant , Polymorphism, Genetic , Epigenesis, Genetic , Epigenomics , Genome-Wide Association Study , Phenotype
ABSTRACT
Regulatory elements activate promoters by recruiting transcription factors (TFs) to specific motifs. Notably, TF-DNA interactions often depend on cooperativity with colocalized partners, suggesting an underlying cis-regulatory syntax. To explore TF cooperativity in mammals, we analyze ~500 mouse and human primary cells by combining an atlas of TF motifs, footprints, ChIP-seq, transcriptomes, and accessibility. We uncover two TF groups that colocalize with most expressed factors, forming stripes in hierarchical clustering maps. The first group includes lineage-determining factors that occupy DNA elements broadly, consistent with their key role in tissue-specific transcription. The second one, dubbed universal stripe factors (USFs), comprises ~30 SP, KLF, EGR, and ZBTB family members that recognize overlapping GC-rich sequences in all tissues analyzed. Knockouts and single-molecule tracking reveal that USFs impart accessibility to colocalized partners and increase their residence time. Mammalian cells have thus evolved a TF superfamily with overlapping DNA binding that facilitates chromatin accessibility.
Subject(s)
Chromatin , Transcription Factors , Animals , Binding Sites , Chromatin/genetics , DNA/genetics , Humans , Mammals/genetics , Mammals/metabolism , Mice , Mice, Knockout , Protein Binding , Transcription Factors/metabolism
ABSTRACT
There is an urgent need to improve wheat for upcoming challenges, including biotic and abiotic stresses. Sustainable wheat improvement requires the introduction of new genes and alleles in high-yielding wheat cultivars. Using new approaches, tools, and technologies to identify and introduce new genes in wheat cultivars is critical. High-quality genomes, transcriptomes, and pangenomes provide essential resources and tools to examine wheat closely to identify and manipulate new and targeted genes and alleles. Wheat genomics has improved excellently in the past 5 years, generating multiple genomes, pangenomes, and transcriptomes. Leveraging these resources allows us to accelerate our crop improvement pipelines. This review summarizes the progress made in wheat genomics and trait discovery in the past 5 years.
ABSTRACT
The availability of public genomic resources can greatly assist biodiversity assessment, conservation, and restoration efforts by providing evidence for scientifically informed management decisions. Here we survey the main approaches and applications in biodiversity and conservation genomics, considering practical factors, such as cost, time, prerequisite skills, and current shortcomings of applications. Most approaches perform best in combination with reference genomes from the target species or closely related species. We review case studies to illustrate how reference genomes can facilitate biodiversity research and conservation across the tree of life. We conclude that the time is ripe to view reference genomes as fundamental resources and to integrate their use as a best practice in conservation genomics.
Subject(s)
Biodiversity , Conservation of Natural Resources , Genomics , Genome
ABSTRACT
The evolution of genomes in all life forms involves two distinct, dynamic types of genomic changes: gene duplication (and loss) that shape families of paralogous genes and extension (and contraction) of low-complexity regions (LCR), which occurs through dynamics of short repeats in protein-coding genes. Although the roles of each of these types of events in genome evolution have been studied, their co-evolutionary dynamics is not thoroughly understood. Here, by analyzing a wide range of genomes from diverse bacteria and archaea, we show that LCR and paralogy represent two distinct routes of evolution that are inversely correlated. The emergence of LCR is a prominent evolutionary mechanism in fast evolving, young protein families, whereas paralogy dominates the comparatively slow evolution of old protein families. The analysis of multiple prokaryotic genomes shows that the formation of LCR is likely a widespread, transient evolutionary mechanism that temporally and locally affects also ancestral functions, but apparently, fades away with time, under mutational and selective pressures, yielding to gene paralogy. We propose that compensatory relationships between short-term and longer-term evolutionary mechanisms are universal in the evolution of life.
Subject(s)
Evolution, Molecular , Prokaryotic Cells , Phylogeny , Bacteria/genetics , Archaea/genetics
ABSTRACT
Virus genomes may encode overlapping or nested open reading frames that increase their coding capacity. It is not known whether the constraints on spatial structures of the two encoded proteins limit the evolvability of nested genes. We examine the evolution of a pair of proteins, p22 and p19, encoded by nested genes in plant viruses from the genus Tombusvirus. The known structure of p19, a suppressor of RNA silencing, belongs to the RAGNYA fold from the alpha+beta class. The structure of p22, the cell-to-cell movement protein from the 30K family widespread in plant viruses, is predicted with the AlphaFold approach, suggesting a single jelly-roll fold core from the all-beta class, structurally similar to capsid proteins from plant and animal viruses. The nucleotide and codon preferences impose modest constraints on the types of secondary structures encoded in the alternative reading frames, nonetheless allowing for compact, well-ordered folds from different structural classes in two similarly-sized nested proteins. Tombusvirus p22 emerged through radiation of the widespread 30K family, which evolved by duplication of a virus capsid protein early in the evolution of plant viruses, whereas lineage-specific p19 may have emerged by a stepwise increase in the length of the overprinted gene and incremental acquisition of functionally active secondary structure elements by the protein product. This evolution of p19 toward the RAGNYA fold represents one of the first documented examples of protein structure convergence in naturally occurring proteins.
Subject(s)
Tombusvirus , Evolution, Molecular , Open Reading Frames , Protein Folding , Protein Structure, Secondary , Tombusvirus/genetics , Tombusvirus/metabolism , Viral Proteins/genetics , Viral Proteins/metabolism , Viral Proteins/chemistry , Amino Acid Sequence , Sequence Homology, Amino Acid , Models, Psychological , Protein Structure, Tertiary
ABSTRACT
Over-expression (OE) lines for the ER-tethered NAC transcription factor ANAC017 displayed de-repression of gun marker genes when grown on lincomycin (lin). RNA-seq revealed that ANAC017OE2 plants constitutively expressed more than 40% of the genes induced in wild-type with lin treatment, including plastid encoded genes ycf1.2 and the gene cluster ndhH-ndhA-ndhI-ndhG-ndhE-psaC-ndhD, documented as direct RNA targets of GUN1. Genes encoding components involved in organelle translation were enriched in constitutively expressed genes in ANAC017OE2. ANAC017OE resulted in constitutive nuclear localization, and significant constitutive binding of ANAC017 to target genes was detected by ChIP-Seq. ANAC017OE2 lines maintained the ability to green on lin, were more ABA sensitive, and did not show photo-oxidative damage after exposure of de-etiolated seedlings to continuous light, and their transcriptome response to lin was as much as 80% unique compared to gun1-1. Both double mutants, gun1-1:ANAC017OE and bzip60:ANAC017OE (but not single bzip60), have a gun molecular gene expression pattern and result in variegated and green plants, suggesting that ANAC017OE may act through an independent pathway compared to gun1. Over-expression of ANAC013 or rcd1 did not produce a GUN phenotype or green plants on lin. Thus, constitutive ANAC017OE2 establishes an alternative transcriptional program that likely acts through a number of pathways, that is, maintenance of plastid gene expression and induction of a variety of transcription factors involved in reactive oxygen species metabolism, priming plants for lin tolerance to give a gun phenotype.
Subject(s)
Arabidopsis Proteins , Arabidopsis , Gene Expression Regulation, Plant , Lincomycin , Phenotype , Transcription Factors , Lincomycin/pharmacology , Arabidopsis/genetics , Arabidopsis/metabolism , Arabidopsis Proteins/genetics , Arabidopsis Proteins/metabolism , Transcription Factors/metabolism , Transcription Factors/genetics , Plants, Genetically Modified , Genome, Plant/genetics , DNA-Binding Proteins
ABSTRACT
Hybrid sterility is a reproductive isolation barrier between diverging taxa securing the early steps of speciation. Hybrid sterility is ubiquitous in the animal and plant kingdoms, but its genetic control is poorly understood. In our previous studies, we have uncovered the sterility of hybrids between musculus and domesticus subspecies of the house mouse, which is controlled by the Prdm9 gene, the X-linked Hstx2 locus, and subspecific heterozygosity for genetic background. To further investigate this form of genic-driven chromosomal sterility, we constructed a simplified hybrid sterility model within the genome of the domesticus subspecies by swapping domesticus autosomes with their homologous partners from the musculus subspecies. We show that the "sterility" allelic combination of Prdm9 and Hstx2 can be activated by a musculus/domesticus heterozygosity of as few as two autosomes, Chromosome 17 (Chr 17) and Chr 18 and is further enhanced when another heterosubspecific autosomal pair is present, whereas it has no effect on meiotic progression in the pure domesticus genome. In addition, we identify a new X-linked hybrid sterility locus, Hstx3, at the centromeric end of Chr X, which modulates the incompatibility between Prdm9 and Hstx2. These results further support our concept of chromosomal hybrid sterility based on evolutionarily accumulated divergence between homologous sequences. Based on these and previous results, we believe that future studies should include more information on the mutual recognition of homologous chromosomes at or before the first meiotic prophase in interspecific hybrids, as this may serve as a general reproductive isolation checkpoint in mice and other species.
Subject(s)
Histone-Lysine N-Methyltransferase , Hybridization, Genetic , Animals , Mice , Histone-Lysine N-Methyltransferase/genetics , Reproductive Isolation , Genome , Infertility/genetics , Male , Female , Genetic Speciation
ABSTRACT
Previous studies reveal extensive genetic introgression between Ovis species, which affects genetic adaptation and morphological traits. However, the exact evolutionary scenarios underlying the hybridization between sheep and allopatric wild relatives remain unknown. To address this problem, we here integrate the reference genomes of several ovine and caprine species: domestic sheep, argali, bighorn sheep, snow sheep, and domestic goats. Additionally, we use 856 whole genomes representing 169 domestic sheep populations and their 6 wild relatives: Asiatic mouflon, urial, argali, snow sheep, thinhorn sheep and bighorn sheep. We implement a comprehensive set of analyses to test introgression among these species. We infer that the argali lineage originated ca. 3.08-3.35 Mya and hybridized with the ancestor of Pachyceriforms (e.g., bighorn sheep and snow sheep) at ~1.56 Mya. Previous studies show apparent introgression from North American Pachyceriforms into the Bashibai sheep, a Chinese native sheep breed, despite their wide geographic separation. We show here that, in fact, the apparent introgression from the Pachyceriforms into Bashibai can be explained by the old introgression from Pachyceriforms into argali, and subsequent recent introgression from argali into Bashibai. Our results illustrate the challenges of estimating complex introgression histories and provide an example of how indirect and direct introgression can be distinguished.
ABSTRACT
The blue whale, Balaenoptera musculus, is the largest animal known to have ever existed, making it an important case study in longevity and resistance to cancer. To further this and other blue whale-related research, we report a reference-quality, long-read-based genome assembly of this fascinating species. We assembled the genome from PacBio long reads and utilized Illumina/10×, optical maps, and Hi-C data for scaffolding, polishing, and manual curation. We also provided long read RNA-seq data to facilitate the annotation of the assembly by NCBI and Ensembl. Additionally, we annotated both haplotypes using TOGA and measured the genome size by flow cytometry. We then compared the blue whale genome with other cetaceans and artiodactyls, including vaquita (Phocoena sinus), the world's smallest cetacean, to investigate blue whale's unique biological traits. We found a dramatic amplification of several genes in the blue whale genome resulting from a recent burst in segmental duplications, though the possible connection between this amplification and giant body size requires further study. We also discovered sites in the insulin-like growth factor-1 gene correlated with body size in cetaceans. Finally, using our assembly to examine the heterozygosity and historical demography of Pacific and Atlantic blue whale populations, we found that the genomes of both populations are highly heterozygous and that their genetic isolation dates to the last interglacial period. Taken together, these results indicate how a high-quality, annotated blue whale genome will serve as an important resource for biology, evolution, and conservation research.
Subject(s)
Balaenoptera , Neoplasms , Animals , Balaenoptera/genetics , Segmental Duplications, Genomic , Genome , Demography , Neoplasms/genetics
ABSTRACT
The 5S rRNA genes are among the most conserved nucleotide sequences across all species. Similar to the 5S preservation we observe the occurrence of 5S-related nonautonomous retrotransposons, so-called Cassandras. Cassandras harbor highly conserved 5S rDNA-related sequences within their long terminal repeats, advantageously providing them with the 5S internal promoter. However, the dynamics of Cassandra retrotransposon evolution in the context of 5S rRNA gene sequence information and structural arrangement are still unclear, especially: (1) do we observe repeated or gradual domestication of the highly conserved 5S promoter by Cassandras and (2) do changes in 5S organization such as in the linked 35S-5S rDNA arrangements impact Cassandra evolution? Here, we show evidence for gradual co-evolution of Cassandra sequences with their corresponding 5S rDNAs. To follow the impact of 5S rDNA variability on Cassandra TEs, we investigate the Asteraceae family where highly variable 5S rDNAs, including 5S promoter shifts and both linked and separated 35S-5S rDNA arrangements have been reported. Cassandras within the Asteraceae mirror 5S rDNA promoter mutations of their host genome, likely as an adaptation to the host's specific 5S transcription factors and hence compensating for evolutionary changes in the 5S rDNA sequence. Changes in the 5S rDNA sequence and in Cassandras seem uncorrelated with linked/separated rDNA arrangements. We place all these observations into the context of angiosperm 5S rDNA-Cassandra evolution, discuss Cassandra's origin hypotheses (single or multiple) and Cassandra's possible impact on rDNA and plant genome organization, giving new insights into the interplay of ribosomal genes and transposable elements.
Subject(s)
RNA, Ribosomal, 5S , Retroelements , RNA, Ribosomal, 5S/genetics , Retroelements/genetics , Genes, rRNA , Base Sequence , DNA, Ribosomal/genetics , Genome, Plant , Mutation , Evolution, Molecular
ABSTRACT
G-protein-coupled receptors (GPCRs) are the largest superfamily in the human genome and the major targets of marketed drugs. Recent large-scale genomics studies revealed numerous natural variations in the general population. 54KJPN is the most extensive Japanese population genomics study, curating the whole genome sequences from about 54,000 individuals. Here, by analyzing 390 non-olfactory GPCR genes in the 54KJPN dataset, we annotated 25,443 missense single-nucleotide variations. Among them, we found 120 major variations that appear with an allele frequency greater than 0.5, including variations that occurred on posttranslational modification sites. Structural alignment of GPCRs using the generic numbering system in the GPCRdb reveals enrichment of alterations in the conserved arginine residue within the DRY motif, which contributes to downstream G-protein signaling. A comparison with the worldwide 1000 Genomes Project (1KGP) dataset found 23 variations that were present exclusively in the 54KJPN dataset. This study will be the basis for future pharmacogenomics studies for the Japanese population.
ABSTRACT
During viral replication, viruses carrying an RNA genome produce non-standard viral genomes (nsVGs), including copy-back viral genomes (cbVGs) and deletion viral genomes (delVGs), that play a crucial role in regulating viral replication and pathogenesis. Because of their critical roles in determining the outcome of RNA virus infections, the study of nsVGs has flourished in recent years, exposing a need for bioinformatic tools that can accurately identify them within next-generation sequencing data obtained from infected samples. Here, we present our data analysis pipeline, Viral Opensource DVG Key Algorithm 2 (VODKA2), that is optimized to run on a parallel computing environment for fast and accurate detection of nsVGs from large data sets.
Subject(s)
Algorithms , Genome, Viral , RNA-Seq , Computational Biology/methods , Virus Replication , RNA, Viral/genetics
ABSTRACT
Negevirus is a recently proposed taxon of arthropod-infecting virus, which is associated with plant viruses of two families (Virgaviridae and Kitaviridae). Nevertheless, the evolutionary history of negevirus-host and its relationship with plant viruses remain poorly understood. Endogenous nege-like viral elements (ENVEs) are ancient nege-like viral sequences integrated into the arthropod genomes, which can serve as the molecular fossil records of previous viral infection. In this study, 292 ENVEs were identified in 150 published arthropod genomes, revealing the evolutionary history of nege-like viruses and two related plant virus families. We discovered three novel and eight strains of nege-like viruses in 11 aphid species. Further analysis indicated that 10 ENVEs were detected in six aphid genomes, and they were divided into four types (ENVE1-ENVE4). Orthologous integration and phylogenetic analyses revealed that nege-like viruses had a history of infection of over 60 My and coexisted with aphid ancestors throughout the Cenozoic Era. Moreover, two nege-like viral proteins (CP and SP24) were highly homologous to those of plant viruses in the families Virgaviridae and Kitaviridae. CP- and SP24-derived ENVEs were widely integrated into numerous arthropod genomes. These results demonstrate that nege-like viruses have a long-term coexistence with arthropod hosts and plant viruses of the two families, Virgaviridae and Kitaviridae, which may have evolved from the nege-like virus ancestor through horizontal virus transfer events. These findings broaden our perspective on the history of viral infection in arthropods and the origins of plant viruses.
IMPORTANCE: Although negevirus is phylogenetically related to plant virus, the evolutionary history of negevirus-host and its relationship with plant virus remain largely unknown. In this study, we used endogenous nege-like viral elements (ENVEs) as the molecular fossil records to investigate the history of nege-like viral infection in arthropod hosts and the evolution of two related plant virus families (Virgaviridae and Kitaviridae). Our results showed the infection of nege-like viruses for over 60 My during the arthropod evolution. ENVEs highly homologous to viral sequences in Virgaviridae and Kitaviridae were present in a wide range of arthropod genomes but were absent in plant genomes, indicating that plant viruses in these two families possibly evolved from the nege-like virus ancestor through cross-species horizontal virus transmission. Our findings provide a new perspective on the virus-host coevolution and the origins of plant viruses.
Subject(s)
Aphids , Arthropods , Evolution, Molecular , Phylogeny , Plant Viruses , Animals , Aphids/virology , Plant Viruses/genetics , Plant Viruses/classification , Arthropods/virology , Biological Coevolution , Viral Proteins/genetics , Genome, Viral/genetics , Host-Pathogen Interactions/genetics
ABSTRACT
During virus replication in cultured cells, copy-back defective viral genomes (cbDVGs) can arise. CbDVGs are powerful inducers of innate immune responses in vitro, but their occurrence and impact on natural infections of human hosts remain poorly defined. We asked whether cbDVGs were generated in the brain of a patient who succumbed to subacute sclerosing panencephalitis (SSPE) about 20 years after acute measles virus (MeV) infection. Previous analyses of 13 brain specimens of this patient indicated that a collective infectious unit (CIU) drove lethal MeV spread. In this study, we identified 276 replication-competent cbDVG species, each present in over 100 copies in the brain. Six species were detected in multiple forebrain locations, implying that they travelled long-distance with the CIU. The cbDVG to full-length genomes ratio was often close to 1 (0.6-1.74). Most cbDVGs were 324-2,000 bases in length, corresponding to 2%-12% of the full-length genome; all are predicted to have complementary terminal sequences. If improperly encapsidated, these sequences have the potential to form double-stranded structures that can induce innate immune responses. To assess this, we examined the transcriptome of all brain specimens. Several interferon and inflammatory response genes were upregulated, but upregulation levels did not correlate with cbDVG levels in the specimens. Thus, the CIU that drove MeV pathogenesis in this brain includes, in addition to two complementary full-length genome populations, many locally restricted and few widespread cbDVG species. The widespread cbDVG species may have been positively selected but how they impacted pathogenesis remains to be determined.
IMPORTANCE: Copy-back defective viral genomes (cbDVGs) can drive virus-host interactions. They can suppress virus replication directly, by competing with full-length genomes, or indirectly by stimulating antiviral immunity. In vitro, cbDVG can slow down infections and promote persistence, but there is limited documentation of their presence in human hosts or of their impact on disease. We had the unique opportunity to analyze the brain of a patient who succumbed to subacute sclerosing panencephalitis, a rare but lethal consequence of measles. We detected more than 270 distinct cbDVG species; most were restricted to one specimen, but several reached all lobes of the forebrain, suggesting positive selection. Our analyses provide the missing knowledge of the diversity of cbDVG in a natural infection of a human host. They also reveal that a collective infectious unit that caused lethal human brain disease includes few widespread cbDVG, in addition to two ubiquitous complementary full-length genome populations.