ABSTRACT
KEY MESSAGE: The barley mutant xan-h.chli-1 shows phenotypic features, such as reduced leaf chlorophyll content and daily transpiration rate, typical of wild barley accessions and landraces adapted to arid climatic conditions. The pale green trait, i.e. reduced chlorophyll content, has been shown to increase the efficiency of photosynthesis and biomass accumulation when photosynthetic microorganisms and tobacco plants are cultivated at high densities. Here, we assess the effects of reducing leaf chlorophyll content in barley by altering the chlorophyll biosynthesis pathway (CBP). To this end, we have isolated and characterised the pale green barley mutant xan-h.chli-1, which carries a missense mutation in the Xan-h gene for subunit I of Mg-chelatase (HvCHLI), the first enzyme in the CBP. Intriguingly, xan-h.chli-1 is the only known viable homozygous mutant at the Xan-h locus in barley. The Arg298Lys amino-acid substitution in the ATP-binding cleft causes a slight decrease in HvCHLI protein abundance and a marked reduction in Mg-chelatase activity. Under controlled growth conditions, mutant plants display reduced accumulation of antenna and photosystem core subunits, together with reduced photosystem II yield relative to wild-type under moderate illumination, and consistently higher than wild-type levels at high light intensities. Moreover, the reduced content of leaf chlorophyll is associated with a stable reduction in daily transpiration rate, and slight decreases in total biomass accumulation and water-use efficiency, reminiscent of phenotypic features of wild barley accessions and landraces that thrive under arid climatic conditions.
Subjects
Chlorophyll , Hordeum , Lyases , Mutation, Missense , Plant Leaves , Plant Proteins , Plant Transpiration , Hordeum/genetics , Hordeum/physiology , Hordeum/enzymology , Chlorophyll/metabolism , Plant Transpiration/genetics , Plant Proteins/genetics , Plant Proteins/metabolism , Plant Leaves/genetics , Plant Leaves/physiology , Lyases/genetics , Lyases/metabolism , Photosynthesis/genetics , Phenotype , Photosystem II Protein Complex/metabolism , Photosystem II Protein Complex/genetics
ABSTRACT
Various next-generation sequencing (NGS)-based strategies have been successfully used in the recent past for tracing origins and understanding the evolution of infectious agents, investigating the spread and transmission chains of outbreaks, as well as facilitating the development of effective and rapid molecular diagnostic tests and contributing to the hunt for treatments and vaccines. The ongoing COVID-19 pandemic poses one of the greatest global threats in modern history and has already caused severe social and economic costs. The development of efficient and rapid sequencing methods to reconstruct the genomic sequence of SARS-CoV-2, the etiological agent of COVID-19, has been fundamental for the design of diagnostic molecular tests and to devise effective measures and strategies to mitigate the diffusion of the pandemic. Diverse approaches and sequencing methods can, as testified by the number of available sequences, be applied to SARS-CoV-2 genomes. However, each technology and sequencing approach has its own advantages and limitations. In the current review, we will provide a brief, but hopefully comprehensive, account of currently available platforms and methodological approaches for the sequencing of SARS-CoV-2 genomes. We also present an outline of current repositories and databases that provide access to SARS-CoV-2 genomic data and associated metadata. Finally, we offer general advice and guidelines for the appropriate sharing and deposition of SARS-CoV-2 data and metadata, and suggest that more efficient and standardized integration of current and future SARS-CoV-2-related data would greatly facilitate the struggle against this new pathogen. We hope that our 'vademecum' for the production and handling of SARS-CoV-2-related sequencing data will contribute to this objective.
Subjects
COVID-19/virology , Genome, Viral , High-Throughput Nucleotide Sequencing/methods , SARS-CoV-2/genetics , COVID-19/epidemiology , Humans , Pandemics
ABSTRACT
Genome instability is a condition characterized by the accumulation of genetic alterations and is a hallmark of cancer cells. To uncover new genes and cellular pathways affecting endogenous DNA damage and genome integrity, we exploited a Synthetic Genetic Array (SGA)-based screen in yeast. Among the positive genes, we identified VID22, reported to be involved in DNA double-strand break repair. vid22Δ cells exhibit increased levels of endogenous DNA damage, chronic DNA damage response activation and accumulate DNA aberrations in sequences displaying high probabilities of forming G-quadruplexes (G4-DNA). If not resolved, these DNA secondary structures can block the progression of both DNA and RNA polymerases and correlate with chromosome fragile sites. Vid22 binds to and protects DNA at G4-containing regions both in vitro and in vivo. Loss of VID22 causes an increase in gross chromosomal rearrangement (GCR) events dependent on G-quadruplex forming sequences. Moreover, the absence of Vid22 causes defects in the correct maintenance of G4-DNA rich elements, such as telomeres and mtDNA, and hypersensitivity to the G4-stabilizing ligand TMPyP4. We thus propose that Vid22 is directly involved in genome integrity maintenance as a novel regulator of G4 metabolism.
Subjects
G-Quadruplexes , Genomic Instability , Membrane Proteins/physiology , Saccharomyces cerevisiae Proteins/physiology , Chromosome Aberrations , DNA Damage , Genome, Fungal , Membrane Proteins/genetics , Membrane Proteins/metabolism , Saccharomyces cerevisiae/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae Proteins/metabolism , Telomere Homeostasis
ABSTRACT
Effective systems for the analysis of molecular data are fundamental for monitoring the spread of infectious diseases and studying pathogen evolution. The rapid identification of emerging viral strains and/or genetic variants potentially associated with novel phenotypic features is one of the most important objectives of genomic surveillance of human pathogens and represents one of the first lines of defense for the control of their spread. During the COVID-19 pandemic, several taxonomic frameworks have been proposed for the classification of SARS-CoV-2 isolates. These systems, which are typically based on phylogenetic approaches, represent essential tools for epidemiological studies as well as contributing to the study of the origin of the outbreak. Here, we propose an alternative, reproducible, and transparent phenetic method to study changes in SARS-CoV-2 genomic diversity over time. We suggest that our approach can complement other systems and facilitate the identification of biologically relevant variants in the viral genome. To demonstrate the validity of our approach, we present comparative genomic analyses of more than 175,000 genomes. Our method delineates 22 distinct SARS-CoV-2 haplogroups, which, based on the distribution of high-frequency genetic variants, fall into four major macrohaplogroups. We highlight biased spatiotemporal distributions of SARS-CoV-2 genetic profiles and show that seven of the 22 haplogroups (and of all of the four haplogroup clusters) showed a broad geographic distribution within China by the time the outbreak was widely recognized, suggesting early emergence and widespread cryptic circulation of the virus well before its isolation in January 2020. General patterns of genomic variability are remarkably similar within all major SARS-CoV-2 haplogroups, with UTRs consistently exhibiting the greatest variability and with s2m, a conserved secondary structure element of unknown function in the 3'-UTR of the viral genome, showing evidence of a functional shift. Although several polymorphic sites that are specific to one or more haplogroups were predicted to be under positive or negative selection, overall our analyses suggest that the emergence of novel types is unlikely to be driven by convergent evolution and independent fixation of advantageous substitutions, or by selection of recombined strains. In the absence of extensive clinical metadata for most available genome sequences, and in the context of extensive geographic and temporal biases in the sampling, many questions regarding the evolution and clinical characteristics of SARS-CoV-2 isolates remain open. However, our data indicate that the approach outlined here can be usefully employed in the identification of candidate SARS-CoV-2 genetic variants of clinical and epidemiological importance.
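The idea of grouping genomes by their high-frequency variants can be illustrated with a toy sketch. This is not the authors' phenetic method: the function names, the input layout (a mapping from genome name to a set of variant labels), and the default frequency cutoff are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict


def high_frequency_variants(genomes, min_freq=0.01):
    """Return the variants present in at least `min_freq` of the genomes.

    `genomes` maps a genome name to an iterable of variant labels.
    The 1% default cutoff is an illustrative assumption.
    """
    total = len(genomes)
    counts = Counter(v for variants in genomes.values() for v in set(variants))
    return {v for v, n in counts.items() if n / total >= min_freq}


def group_by_profile(genomes, min_freq=0.01):
    """Group genomes that carry the same set of high-frequency variants."""
    keep = high_frequency_variants(genomes, min_freq)
    groups = defaultdict(list)
    for name, variants in genomes.items():
        # Low-frequency variants are ignored when building the profile.
        groups[frozenset(set(variants) & keep)].append(name)
    return dict(groups)
```

Exact-profile grouping is the simplest possible choice; a distance-based clustering of the profiles would be a natural refinement, but the abstract does not specify one.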
Subjects
COVID-19/genetics , Evolution, Molecular , Genome, Viral , Genomics , Phylogeny , SARS-CoV-2/genetics , Humans
ABSTRACT
Over the last 5 years, a number of studies have reported the successful application of single-molecule sequencing technologies to the determination of the size and sequence of pathological expanded microsatellite repeats. However, different custom bioinformatics pipelines were employed in each study, preventing meaningful comparisons and somewhat limiting the reproducibility of the results. In this review, we provide a brief summary of state-of-the-art methods for the characterization of expanded repeat alleles, along with a detailed comparison of bioinformatics tools for the determination of repeat length and sequence, using both real and simulated data. Our reanalysis of publicly available human genome sequencing data suggests a modest, but statistically significant, increase of the error rate of single-molecule sequencing technologies at genomic regions containing short tandem repeats. However, we observe that all the methods herein tested, irrespective of the strategy used for the analysis of the data (either based on the alignment or assembly of the reads), show high levels of sensitivity in both the detection of expanded tandem repeats and the estimation of the expansion size, suggesting that approaches based on single-molecule sequencing technologies are highly effective for the detection and quantification of tandem repeat expansions and contractions.
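To make the notion of repeat-length determination concrete, the sketch below counts the longest uninterrupted run of a repeat motif in a single read. It is a deliberately naive illustration, not one of the tools compared in the review: the function name is hypothetical, and exact string matching ignores the elevated sequencing error rate at short tandem repeats that the abstract mentions.

```python
import re


def longest_repeat_run(read, motif):
    """Return the largest number of consecutive exact copies of `motif` in `read`."""
    if not motif:
        raise ValueError("motif must be a non-empty string")
    # One or more back-to-back copies of the motif, matched literally.
    pattern = re.compile(f"(?:{re.escape(motif)})+")
    runs = [len(m.group(0)) // len(motif) for m in pattern.finditer(read)]
    return max(runs, default=0)
```

A realistic estimator would need to tolerate mismatches and indels within the repeat tract, for example through alignment or assembly of the reads as the abstract describes.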
Subjects
Computational Biology , High-Throughput Nucleotide Sequencing , Microsatellite Repeats , Molecular Sequence Data , Sequence Analysis, DNA , Alleles , Chromosome Mapping , Genome, Human , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Humans , Reproducibility of Results , Sequence Analysis, DNA/methods
ABSTRACT
SUMMARY: While over 200 000 genomic sequences are currently available through dedicated repositories, ad hoc methods for the functional annotation of SARS-CoV-2 genomes do not harness all currently available resources for the annotation of functionally relevant genomic sites. Here, we present CorGAT, a novel tool for the functional annotation of SARS-CoV-2 genomic variants. By comparisons with other state-of-the-art methods, we demonstrate that, by providing a more comprehensive and rich annotation, our method can facilitate the identification of evolutionary patterns in the genome of SARS-CoV-2. AVAILABILITY AND IMPLEMENTATION: Galaxy: http://corgat.cloud.ba.infn.it/galaxy; software: https://github.com/matteo14c/CorGAT/tree/Revision_V1; docker: https://hub.docker.com/r/laniakeacloud/galaxy_corgat. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
MOTIVATION: Clinical applications of genome re-sequencing technologies typically generate large amounts of data that need to be carefully annotated and interpreted to identify genetic variants potentially associated with pathological conditions. In this context, accurate and reproducible methods for the functional annotation and prioritization of genetic variants are of fundamental importance. RESULTS: In this article, we present VINYL, a flexible and fully automated system for the functional annotation and prioritization of genetic variants. Extensive analyses of both real and simulated datasets suggest that VINYL can identify clinically relevant genetic variants more accurately than equivalent state-of-the-art methods, allowing a more rapid and effective prioritization of genetic variants in different experimental settings. As such, we believe that VINYL can establish itself as a valuable tool to assist healthcare operators and researchers in clinical genomics investigations. AVAILABILITY AND IMPLEMENTATION: VINYL is available at http://beaconlab.it/VINYL and https://github.com/matteo14c/VINYL. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
ABSTRACT
ESCRT (Endosomal Sorting Complex Required for Transport) proteins have been shown to control an increasing number of membrane-associated processes. Some of these, most prominently the regulation of receptor trafficking, profoundly shape signal transduction. Evidence in fungi, plants and multiple animal models supports the emerging concept that ESCRTs are main actors in the coordination of signaling with the changes in cells and tissues occurring during development and homeostasis. Consistent with their pleiotropic function, ESCRTs are regulated in multiple ways to tailor signaling to developmental and homeostatic needs. ESCRT activity is crucial to the correct execution of developmental programs, especially at key transitions, allowing eukaryotes to thrive and preventing the appearance of congenital defects.
Subjects
Endosomal Sorting Complexes Required for Transport , Signal Transduction , Animals , Biological Transport , Cell Membrane/metabolism , Cell Nucleus/metabolism , Central Nervous System/metabolism , Endosomal Sorting Complexes Required for Transport/genetics , Endosomal Sorting Complexes Required for Transport/metabolism , Endosomes/metabolism , Humans , Signal Transduction/genetics
ABSTRACT
Nuclear Factor Y (NF-Y) is a heterotrimeric transcription factor that binds CCAAT elements. The NF-Y trimer is composed of a Histone Fold Domain (HFD) dimer (NF-YB/NF-YC) and NF-YA, which confers DNA sequence specificity. NF-YA shares a conserved domain with the CONSTANS, CONSTANS-LIKE, TOC1 (CCT) proteins. We show that CONSTANS (CO/B-BOX PROTEIN1 BBX1), a master flowering regulator, forms a trimer with Arabidopsis thaliana NF-YB2/NF-YC3 to efficiently bind the CORE element of the FLOWERING LOCUS T promoter. We term this complex NF-CO. Using saturation mutagenesis, electrophoretic mobility shift assays, and RNA-sequencing profiling of co, nf-yb, and nf-yc mutants, we identify CCACA elements as the core NF-CO binding site. CO physically interacts with the same HFD surface required for NF-YA association, as determined by mutations in NF-YB2 and NF-YC9, and tested in vitro and in vivo. The co-7 mutation in the CCT domain, corresponding to an NF-YA arginine directly involved in CCAAT recognition, abolishes NF-CO binding to DNA. In summary, a unifying molecular mechanism of CO function relates it to the NF-YA paradigm, as part of a trimeric complex imparting sequence specificity to HFD/DNA interactions. It is likely that members of the large CCT family participate in similar complexes with At-NF-YB and At-NF-YC, broadening HFD combinatorial possibilities in terms of trimerization, DNA binding specificities, and transcriptional regulation.
Subjects
Arabidopsis Proteins/metabolism , Arabidopsis/genetics , Arabidopsis/metabolism , DNA, Plant/genetics , DNA-Binding Proteins/metabolism , Transcription Factors/metabolism , Arabidopsis Proteins/genetics , CCAAT-Binding Factor/genetics , CCAAT-Binding Factor/metabolism , Carbon Monoxide/metabolism , DNA-Binding Proteins/genetics , Protein Binding , Transcription Factors/genetics
ABSTRACT
While RNA editing by A-to-I deamination is a requisite for neuronal function in humans, it is under-investigated in single cells. Here we fill this gap by analyzing RNA editing profiles of single cells from the brain cortex of living human subjects. We show that RNA editing levels per cell are bimodally distributed and distinguish between major brain cell types, thus providing new insights into neuronal dynamics.
Subjects
Brain/metabolism , RNA Editing , Single-Cell Analysis , Transcriptome , Astrocytes/metabolism , Cluster Analysis , Databases, Nucleic Acid , Gene Expression Profiling , Gene Expression Regulation , Humans , Neurons/metabolism , Single-Cell Analysis/methods
ABSTRACT
The alarming diffusion of multidrug-resistant (MDR) bacterial strains requires investigations on nonantibiotic therapies. Among such therapies, the use of bacteriophages (phages) as antimicrobial agents, namely, phage therapy, is a promising treatment strategy supported by the findings of recent successful compassionate treatments in Europe and the United States. In this work, we combined host range and genomic information to design a 6-phage cocktail killing several clinical strains of Pseudomonas aeruginosa, including those collected from Italian cystic fibrosis (CF) patients, and analyzed the cocktail performance. We demonstrated that the cocktail composed of four novel phages (PYO2, DEV, E215 and E217) and two previously characterized phages (PAK_P1 and PAK_P4) was able to lyse P. aeruginosa both in planktonic liquid cultures and in biofilms. In addition, we showed that the phage cocktail could cure acute respiratory infection in mice and treat bacteremia in wax moth (Galleria mellonella) larvae. Furthermore, administration of the cocktail to larvae prior to bacterial infection provided prophylaxis. In this regard, the efficiency of the phage cocktail was found to be unaffected by the MDR or mucoid phenotype of the pseudomonal strain. The cocktail was found to be superior to the individual phages in destroying biofilms and providing a faster treatment in mice. We also found the Galleria larva model to be cost-effective for testing the susceptibility of clinical strains to phages, suggesting that it could be implemented in the frame of developing personalized phage therapies.
Subjects
Bacteriophages/physiology , Larva/microbiology , Moths/microbiology , Phage Therapy/methods , Pseudomonas Infections/microbiology , Pseudomonas Infections/therapy , Pseudomonas aeruginosa/pathogenicity , Pseudomonas aeruginosa/virology , Animals , Biofilms , Cystic Fibrosis/microbiology , Cystic Fibrosis/therapy , Pseudomonas Phages
ABSTRACT
OBJECTIVES: Celiac disease (CD)-associated duodenal dysbiosis has not yet been clearly defined, and the mechanisms by which CD-associated dysbiosis could contribute to CD development or exacerbation are unknown. In this study, we analyzed the duodenal microbiome of CD patients. METHODS: The microbiome was evaluated in duodenal biopsy samples of 20 adult patients with active CD, 6 CD patients on a gluten-free diet, and 15 controls by DNA sequencing of 16S ribosomal RNA libraries. Bacterial species were cultured, isolated and identified by mass spectrometry. Isolated bacterial species were used to infect CaCo-2 cells, and to stimulate normal duodenal explants and cultured human and murine dendritic cells (DCs). Inflammatory markers and cytokines were evaluated by immunofluorescence and ELISA, respectively. RESULTS: Proteobacteria was the most abundant and Firmicutes and Actinobacteria the least abundant phyla in the microbiome profiles of active CD patients. Members of the Neisseria genus (Betaproteobacteria class) were significantly more abundant in active CD patients than in the other two groups (P=0.03). Neisseria flavescens (CD-Nf) was the most abundant Neisseria species in active CD duodenum. Whole-genome sequencing of CD-Nf and control-Nf showed genetic diversity of the iron acquisition systems and of some hemoglobin-related genes. CD-Nf was able to escape the lysosomal compartment in CaCo-2 cells and to induce an inflammatory response in DCs and in ex-vivo mucosal explants. CONCLUSIONS: Marked dysbiosis and an abundance of a peculiar CD-Nf strain characterize the duodenal microbiome in active CD patients, thus suggesting that the CD-associated microbiota could contribute to the many inflammatory signals in this disorder.
Subjects
Celiac Disease/microbiology , Duodenum/microbiology , Dysbiosis/microbiology , Metagenomics , Neisseria/isolation & purification , Actinobacteria/classification , Actinobacteria/isolation & purification , Adult , Biopsy , Caco-2 Cells , Diet, Gluten-Free , Enzyme-Linked Immunosorbent Assay , Female , Fluorescent Antibody Technique , Humans , Italy , Male , Microbiota , Neisseria/classification , Proteobacteria/classification , Proteobacteria/isolation & purification
ABSTRACT
During very early stages of flower development in Arabidopsis thaliana, a series of key decisions are taken. Indeed, the position and the basic patterning of new flowers are determined in less than 4 days. Given that the scientific literature provides hard evidence for the function of only 10% of A. thaliana genes, we hypothesized that although many essential genes have already been identified, many poorly characterized genes are likely to be involved in floral patterning. In the current study, we use high-throughput sequencing to describe the transcriptome of the native inflorescence meristem, the floral meristem and the new flower immediately after the start of organ differentiation. We provide evidence that our experimental system is reliable and less affected by experimental artefacts than a widely used floral induction system. Furthermore, we show how these data can be used to identify candidate genes for functional studies, and to generate hypotheses of functional redundancies and regulatory interactions.
Subjects
Arabidopsis Proteins/genetics , Arabidopsis/genetics , Gene Expression Regulation, Plant , Transcriptome , Arabidopsis/cytology , Arabidopsis/growth & development , Arabidopsis Proteins/metabolism , Cell Differentiation , Cluster Analysis , Computational Biology , Databases, Genetic , Flowers/cytology , Flowers/genetics , Flowers/growth & development , Gene Expression Regulation, Developmental , High-Throughput Nucleotide Sequencing , In Situ Hybridization , Inflorescence/cytology , Inflorescence/genetics , Inflorescence/growth & development , Meristem/cytology , Meristem/genetics , Meristem/growth & development , Microdissection , RNA, Plant/chemistry , RNA, Plant/genetics , Sequence Analysis, RNA
ABSTRACT
The fdl1-1 mutation, caused by an Enhancer/Suppressor mutator (En/Spm) element insertion located in the third exon of the gene, identifies a novel gene encoding ZmMYB94, a transcription factor of the R2R3-MYB subfamily. The fdl1 gene was isolated through co-segregation analysis, whereas proof of gene identity was obtained using an RNAi strategy that conferred less severe, but clearly recognizable, specific mutant traits on seedlings. Fdl1 is involved in the regulation of cuticle deposition in young seedlings as well as in the establishment of a regular pattern of epicuticular wax deposition on the epidermis of young leaves. Lack of Fdl1 action also correlates with developmental defects, such as delayed germination and seedling growth, abnormal coleoptile opening and presence of curly leaves showing areas of fusion between the coleoptile and the first leaf or between the first and the second leaf. The expression profile of ZmMYB94 mRNA, determined by quantitative RT-PCR, overlaps the pattern of mutant phenotypic expression and is confined to a narrow developmental window. High expression was observed in the embryo, in the seedling coleoptile and in the first two leaves, whereas RNA level, as well as phenotypic defects, decreases at the third leaf stage. Interestingly, several of the Arabidopsis MYB genes most closely related to ZmMYB94 are also involved in the activation of cuticular wax biosynthesis, suggesting deep conservation of regulatory processes related to cuticular wax deposition between monocots and dicots.
Subjects
Plant Proteins/genetics , Transcription Factors/genetics , Zea mays/genetics , Cotyledon/genetics , Cotyledon/growth & development , Cotyledon/metabolism , Mutation , Organogenesis, Plant , Plant Proteins/metabolism , Plant Shoots/genetics , Plant Shoots/growth & development , Plant Shoots/metabolism , Seedlings/genetics , Seedlings/growth & development , Seedlings/metabolism , Seeds/genetics , Seeds/growth & development , Seeds/metabolism , Transcription Factors/metabolism , Zea mays/embryology , Zea mays/metabolism
ABSTRACT
Several bioinformatics methods have been proposed for the detection and characterization of genomic structural variation (SV) from ultra high-throughput genome resequencing data. Recent surveys show that comprehensive detection of SV events of different types between an individual resequenced genome and a reference sequence is best achieved through the combination of methods based on different principles (split mapping, reassembly, read depth, insert size, etc.). The improvement of individual predictors is thus an important objective. In this study, we propose a new method that combines deviations from expected library insert sizes and additional information from local patterns of read mapping and uses supervised learning to predict the position and nature of structural variants. We show that our approach provides greatly increased sensitivity with respect to other tools based on paired end read mapping at no cost in specificity, and it makes reliable predictions of very short insertions and deletions in repetitive and low-complexity genomic contexts that can confound tools based on split mapping of reads.
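As a rough illustration of the insert-size signal mentioned above, the sketch below flags read pairs whose insert size deviates strongly from the rest of the library. It is not the method proposed in the abstract, which additionally uses local read-mapping patterns and supervised learning; the function name, the robust z-score, and the threshold of 3 are assumptions chosen for this example.

```python
from statistics import median


def flag_discordant_pairs(insert_sizes, threshold=3.0):
    """Return indices of read pairs whose insert size is an outlier.

    Uses a robust z-score built from the median and the median absolute
    deviation (MAD). Both the statistic and the threshold are illustrative
    assumptions, not taken from the abstract.
    """
    center = median(insert_sizes)
    mad = median(abs(size - center) for size in insert_sizes)
    if mad == 0:
        # Degenerate spread estimate: skip flagging rather than divide by zero.
        return []
    scale = 1.4826 * mad  # makes the MAD comparable to a standard deviation
    return [
        index
        for index, size in enumerate(insert_sizes)
        if abs(size - center) / scale > threshold
    ]
```

In a fuller pipeline, clusters of flagged pairs mapping to the same region, rather than single pairs, would be the candidates passed on to a classifier.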
Subjects
Genomic Structural Variation , High-Throughput Nucleotide Sequencing , Sequence Analysis, DNA , Software , Genome, Human , Heterozygote , Humans , Support Vector Machine
ABSTRACT
DEV is an obligatory lytic Pseudomonas phage of the N4-like genus, recently reclassified as Schitoviridae. The DEV genome encodes 91 ORFs, including a 3398 amino acid virion-associated RNA polymerase (vRNAP). Here, we describe the complete architecture of DEV, determined using a combination of cryo-electron microscopy localized reconstruction, biochemical methods, and genetic knockouts. We built de novo structures of all capsid factors and tail components involved in host attachment. We demonstrate that DEV long tail fibers are essential for infection of Pseudomonas aeruginosa but dispensable for infecting mutants with a truncated lipopolysaccharide devoid of the O-antigen. We determine that DEV vRNAP is part of a three-gene operon conserved in 191 Schitoviridae genomes. We propose these three proteins are ejected into the host to form a genome ejection motor spanning the cell envelope. We posit that the design principles of the DEV ejection apparatus are conserved in all Schitoviridae.
Subjects
Cryoelectron Microscopy , Genome, Viral , Pseudomonas Phages , Pseudomonas aeruginosa , Pseudomonas Phages/genetics , Pseudomonas Phages/ultrastructure , Genome, Viral/genetics , Pseudomonas aeruginosa/virology , Pseudomonas aeruginosa/genetics , DNA-Directed RNA Polymerases/metabolism , DNA-Directed RNA Polymerases/genetics , Virion/ultrastructure , Virion/genetics , Open Reading Frames/genetics , Viral Proteins/genetics , Viral Proteins/metabolism , Viral Proteins/chemistry , Operon/genetics , Capsid Proteins/genetics , Capsid Proteins/metabolism , Capsid Proteins/chemistry , Capsid/metabolism , Capsid/ultrastructure
ABSTRACT
Accurate and timely monitoring of the evolution of SARS-CoV-2 is crucial for identifying and tracking potentially more transmissible/virulent viral variants, and for implementing mitigation strategies to limit their spread. Here we introduce HaploCoV, a novel software framework that enables the exploration of SARS-CoV-2 genomic diversity through space and time, to identify novel emerging viral variants and prioritize variants of potential epidemiological interest in a rapid and unsupervised manner. HaploCoV can integrate with any classification/nomenclature and incorporates an effective scoring system for the prioritization of SARS-CoV-2 variants. By performing retrospective analyses of more than 11.5 M genome sequences, we show that HaploCoV demonstrates high levels of accuracy and reproducibility and identifies the large majority of epidemiologically relevant viral variants (as flagged by international health authorities) automatically and with rapid turn-around times. Our results highlight the importance of the application of strategies based on the systematic analysis and integration of regional data for rapid identification of novel, emerging variants of SARS-CoV-2. We believe that the approach outlined in this study will contribute to relevant advances to current and future genomic surveillance methods.
Subjects
COVID-19 , Humans , COVID-19/diagnosis , COVID-19/epidemiology , Reproducibility of Results , Retrospective Studies , SARS-CoV-2/genetics
ABSTRACT
RNA editing is a widespread post-transcriptional molecular phenomenon that can increase proteomic diversity by modifying the sequence of completely or partially non-functional primary transcripts, through a variety of mechanistically and evolutionarily unrelated pathways. Editing by base substitution has been investigated in both animals and plants. However, conventional strategies based on directed Sanger sequencing are time-consuming and effectively preclude genome-wide identification of RNA editing and assessment of partial and tissue-specific editing sites. In contrast, the high-throughput RNA-Seq approach allows the generation of a comprehensive landscape of RNA editing at the genome level. Short reads from Solexa/Illumina GA and ABI SOLiD platforms have been used to investigate the editing pattern in mitochondria of Vitis vinifera, providing significant support for 401 C-to-U conversions in coding regions and an additional 44 modifications in non-coding RNAs. Moreover, 76% of all C-to-U conversions in coding genes represent partial RNA editing events and 28% of them were shown to be significantly tissue specific. Solexa/Illumina and SOLiD platforms showed different characteristics with respect to the specific issue of large-scale editing analysis, and the combined approach presented here reduces the false positive rate of discovery of editing events.
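The distinction between partial and complete C-to-U editing can be sketched as a simple read-count ratio at a genomic C position. This is a minimal illustration only: the function names and the 0.1/0.9 cutoffs are assumptions, and the abstract's calls rest on statistical support that this sketch does not model.

```python
def editing_level(base_counts):
    """Fraction of informative reads supporting a C-to-U change at one site.

    `base_counts` maps bases observed in RNA-Seq reads to read counts at a
    position where the genome carries C. An edited U is read as T in cDNA.
    Returns None when no C or T reads cover the site.
    """
    c_reads = base_counts.get("C", 0)
    t_reads = base_counts.get("T", 0)
    informative = c_reads + t_reads
    if informative == 0:
        return None
    return t_reads / informative


def classify_site(level, low=0.1, high=0.9):
    """Label a site from its editing level; the cutoffs are illustrative."""
    if level is None:
        return "uncovered"
    if level < low:
        return "unedited"
    if level > high:
        return "complete"
    return "partial"
```

Comparing such levels for the same site across tissues is one simple way to look for tissue-specific editing, although a proper analysis would test the difference statistically.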
Subjects
DNA, Mitochondrial/chemistry , RNA Editing , RNA, Plant/chemistry , RNA/chemistry , Sequence Analysis, RNA/methods , Vitis/genetics , Arabidopsis/genetics , Base Pair Mismatch , DNA, Plant/chemistry , Genes, Mitochondrial , Genome, Mitochondrial , Genomics , RNA/metabolism , RNA, Mitochondrial , RNA, Plant/metabolism
ABSTRACT
The 5' and 3' untranslated regions of eukaryotic mRNAs (UTRs) play crucial roles in the post-transcriptional regulation of gene expression through the modulation of nucleo-cytoplasmic mRNA transport, translation efficiency, subcellular localization and message stability. UTRdb is a curated database of 5' and 3' untranslated sequences of eukaryotic mRNAs, derived from several sources of primary data. Experimentally validated functional motifs are annotated and also collated as the UTRsite database, where more specific information on the functional motifs and cross-links to interacting regulatory proteins are provided. In the current update, the UTR entries have been organized in a gene-centric structure to better visualize and retrieve 5' and 3' UTR variants generated by alternative initiation and termination of transcription and alternative splicing. Experimentally validated miRNA targets and conserved sequence elements are also annotated. The integration of UTRdb with genomic data has allowed the implementation of an efficient annotation system and a powerful retrieval resource for the selection and extraction of specific UTR subsets. All internet resources implemented for retrieval and functional analysis of 5' and 3' untranslated regions of eukaryotic mRNAs are accessible at http://utrdb.ba.itb.cnr.it/.