ABSTRACT
SUMMARY: Missing regions in short-read assemblies of prokaryote genomes are often attributed to biases in sequencing technologies and to repetitive elements, the former resulting in low sequencing coverage of certain loci and the latter in unresolved loops in the de novo assembly graph. We developed SASpector, a command-line tool that compares short-read assemblies (draft genomes) to their corresponding closed assemblies and extracts missing regions to analyze them at the sequence and functional levels. SASpector makes it possible to benchmark the need for resolved genomes, can be integrated into pipelines to control the quality of assemblies, and could be used for comparative investigations of missingness in assemblies for which both short-read and long-read data are available in public databases. AVAILABILITY AND IMPLEMENTATION: SASpector is available at https://github.com/LoGT-KULeuven/SASpector. The tool is implemented in Python 3 and available through pip and Docker (0mician/saspector). SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
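The idea of extracting regions of a closed assembly that a draft assembly fails to cover can be illustrated with a minimal sketch. It is not SASpector's implementation: the function name `missing_regions`, the `min_len` parameter, and the half-open interval representation are assumptions made for illustration, and the aligned intervals are presumed to come from some prior alignment of draft contigs to the reference.

```python
from typing import List, Tuple

# Half-open interval [start, end) in reference coordinates (an assumption
# of this sketch, not a format defined by SASpector).
Interval = Tuple[int, int]


def missing_regions(ref_length: int,
                    aligned: List[Interval],
                    min_len: int = 1) -> List[Interval]:
    """Return reference regions not covered by any aligned interval.

    Hypothetical helper: `aligned` holds the reference spans covered by
    draft contigs; gaps shorter than `min_len` (>= 1) are ignored.
    """
    gaps: List[Interval] = []
    cursor = 0  # end of the region covered so far
    for start, end in sorted(aligned):
        # Clamp coordinates to the reference bounds.
        start = min(max(start, 0), ref_length)
        end = min(max(end, 0), ref_length)
        if start > cursor and start - cursor >= min_len:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    # Trailing uncovered region, if any.
    if ref_length - cursor >= min_len:
        gaps.append((cursor, ref_length))
    return gaps


if __name__ == "__main__":
    covered = [(10, 30), (20, 50), (80, 100)]
    print(missing_regions(100, covered))
```

Sorting the intervals and sweeping a single cursor handles overlapping alignments without merging them first; the resulting gaps could then be cut from the reference sequence for downstream sequence or functional analysis.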
Subject(s)
High-Throughput Nucleotide Sequencing , Software , Genome , Genomics , Sequence Analysis, DNA
ABSTRACT
Gut viruses are important, yet often neglected, players in the complex human gut microbial ecosystem. Recently, the number of human gut virome studies has been increasing; however, we are still only scratching the surface of the immense viral diversity. In this study, 254 virus-enriched fecal metagenomes from 204 Danish subjects were used to generate the Danish Enteric Virome Catalog (DEVoC) containing 12,986 nonredundant viral scaffolds, of which the majority were previously undescribed, encoding 190,029 viral genes. The DEVoC was used to compare 91 healthy gut viromes from children, adolescents, and adults, all of which had contributed to the catalog. Gut viromes of healthy Danish subjects were dominated by phages. While most phage genomes (PGs) occurred in only a single subject, indicating large virome individuality, 39 PGs were present in more than 10 healthy subjects. Among these 39 PGs, the prevalence of three was associated with age. To further study these 39 prevalent PGs, 1,880 gut virome data sets from 27 studies across the world were screened, revealing several age-, geography-, and disease-related prevalence patterns. Two PGs also showed a remarkably high prevalence worldwide: a crAss-like phage (20.6% prevalence), belonging to the tentative AlphacrAssvirinae subfamily, and a previously undescribed circular temperate phage infecting Bacteroides dorei (14.4% prevalence), called LoVEphage because it encodes lots of viral elements. Due to the LoVEphage's high prevalence and novelty, public data sets in which the LoVEphage was detected were de novo assembled, resulting in an additional 18 circular LoVEphage-like genomes (67.9 to 72.4 kb).
IMPORTANCE Through generation of the DEVoC, we added numerous previously uncharacterized viral genomes and genes to the ever-increasing worldwide pool of human gut viromes. The DEVoC, the largest human gut virome catalog generated from consistently processed fecal samples, facilitated the analysis of the 91 healthy Danish gut viromes. Characterizing the biggest cohort of healthy gut viromes from children, adolescents, and adults to date confirmed the previously established high interindividual variation in human gut viromes and demonstrated that the effect of age on the gut virome composition was limited to the prevalence of specific phages or phage groups. The identification of a previously undescribed prevalent phage illustrates the usefulness of developing virome catalogs, and we foresee that the DEVoC will benefit future analysis of the roles of gut viruses in human health and disease.
ABSTRACT
In vertebrates, the mineralocorticoid receptor (MR) is a steroid-activated nuclear receptor (NR) that plays essential roles in water-electrolyte balance and blood pressure homeostasis. It belongs to the group of oxo-steroidian NRs, together with the glucocorticoid (GR), progesterone (PR), and androgen (AR) receptors. Classically, these oxo-steroidian NRs homodimerize and bind to specific genomic sequences to activate gene expression. NRs are multi-domain proteins, and dimerization is mediated by both the DNA-binding domain (DBD) and the ligand-binding domain (LBD), with the latter thought to provide the largest dimerization interface. However, at the structural level, the dimerization of oxo-steroidian receptor LBDs has remained largely a matter of debate and, despite their sequence homology, there is currently no consensus on a common homodimer assembly across the four receptors, that is, GR, PR, AR, and MR. Here, we examined all available MR LBD crystals using different computational methods (protein common interface database; proteins, interfaces, structures and assemblies; protein-protein interaction prediction by structural matching; evolutionary protein-protein interface classifier; and the molecular mechanics Poisson-Boltzmann surface area method). All methods reach a consensus and single out an interface mediated by helices H9 and H10 and the C-terminal F domain as having the characteristics of a biologically relevant assembly. Interestingly, a similar assembly was previously identified for GRα, the closest homolog of MR. Alternative architectures that were proposed for GRα were not observed for MR. These data call for further experimental investigations of oxo-steroid dimer architectures.