Results 1 - 20 of 7,279
1.
BMC Bioinformatics ; 21(1): 513, 2020 Nov 10.
Article in English | MEDLINE | ID: mdl-33172385

ABSTRACT

BACKGROUND: Recent advances in sequencing technologies have led to an explosion in the number of genomes available, but accurate genome annotation remains a major challenge. The prediction of protein-coding genes in eukaryotic genomes is especially problematic, due to their complex exon-intron structures. Even the best eukaryotic gene prediction algorithms can make serious errors that will significantly affect subsequent analyses. RESULTS: We first investigated the prevalence of gene prediction errors in a large set of 176,478 proteins from ten primate proteomes available in public databases. Using the well-studied human proteins as a reference, a total of 82,305 potential errors were detected, including 44,001 deletions, 27,289 insertions and 11,015 mismatched segments where part of the correct protein sequence is replaced with an alternative erroneous sequence. We then focused on the mismatched sequence errors that cause particular problems for downstream applications. A detailed characterization allowed us to identify the potential causes for the gene misprediction in approximately half (5446) of these cases. As a proof-of-concept, we also developed a simple method which allowed us to propose improved sequences for 603 primate proteins. CONCLUSIONS: Gene prediction errors in primate proteomes affect up to 50% of the sequences. Major causes of errors include undetermined genome regions, genome sequencing or assembly issues, and limitations in the models used to represent gene exon-intron structures. Nevertheless, existing genome sequences can still be exploited to improve protein sequence quality. Perspectives of the work include the characterization of other types of gene prediction errors, as well as the development of a more comprehensive algorithm for protein sequence error correction.
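As a quick consistency check on the figures quoted in the abstract above, the three error classes can be summed and the characterized share of mismatched segments computed. This is only a worked arithmetic sketch; the variable names are illustrative and not taken from the paper.

```python
# Counts quoted in the abstract (variable names are illustrative only).
deletions = 44_001
insertions = 27_289
mismatches = 11_015

# The three classes should add up to the reported total of potential errors.
total_errors = deletions + insertions + mismatches

# Mismatched segments for which a potential cause was identified.
characterized = 5_446
fraction_characterized = characterized / mismatches

print(total_errors)                     # 82305
print(f"{fraction_characterized:.1%}")  # 49.4%
```

The total matches the 82,305 potential errors reported, and 5446 of 11,015 is consistent with "approximately half".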


Subjects
Open Reading Frames/genetics , Primates/metabolism , Proteome , Amino Acid Sequence , Animals , Databases, Protein , Gene Deletion , Humans , Mutagenesis, Insertional , Receptor-Like Protein Tyrosine Phosphatases/chemistry , Receptor-Like Protein Tyrosine Phosphatases/genetics , Receptor-Like Protein Tyrosine Phosphatases/metabolism , Sequence Alignment
2.
BMC Bioinformatics ; 21(1): 455, 2020 Oct 14.
Article in English | MEDLINE | ID: mdl-33054771

ABSTRACT

BACKGROUND: A small open reading frame (smORF) is an open reading frame with a length of less than 100 codons. Microproteins, translated from smORFs, have been found to participate in a variety of biological processes such as muscle formation and contraction, cell proliferation, and immune activation. Although previous studies have collected and annotated a large number of smORFs, the functions of the vast majority of smORFs are still unknown. It is thus increasingly important to develop computational methods to annotate the functions of these smORFs. RESULTS: In this study, we collected 617,462 unique smORFs from three studies. The expression of smORF RNAs was estimated by reannotated microarray probes. Using a speed-optimized correlation algorithm, the functions of smORFs were predicted by their correlated genes with known functional annotations. When applied to 5 known microproteins from the literature, our method successfully predicted their functions. Further validation from the UniProt database showed that at least one function of 202 out of 270 microproteins was predicted. CONCLUSIONS: We developed a method, smORFunction, to provide function predictions of smORFs/microproteins in at most 265 models generated from 173 datasets, including 48 tissues/cells, 82 diseases (and normal). The tool is available at https://www.cuilab.cn/smorfunction .
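The length criterion in the definition above (fewer than 100 codons) can be illustrated with a minimal forward-strand scan. This is a sketch under stated assumptions, not the smORFunction method: the function name `find_smorfs`, the ATG-to-stop ORF model, and the choice to exclude the stop codon from the codon count are all assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_smorfs(seq, max_codons=100):
    """Return (start, end, n_codons) for ATG..stop ORFs on the forward strand
    whose coding length (stop codon excluded, an assumption) is < max_codons."""
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != "ATG":
                i += 3
                continue
            # Walk codon by codon from the start codon until a stop codon.
            j = i
            while j + 3 <= len(seq) and seq[j:j + 3] not in STOP_CODONS:
                j += 3
            if j + 3 > len(seq):
                break  # no in-frame stop codon: not a complete ORF
            n_codons = (j - i) // 3
            if n_codons < max_codons:
                hits.append((i, j + 3, n_codons))
            i = j + 3  # resume scanning after the stop codon
    return hits

print(find_smorfs("ATGAAATAA"))  # [(0, 9, 2)]
```

A real annotation pipeline would also scan the reverse strand and may allow non-ATG start codons; both are left out here for brevity.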


Subjects
Open Reading Frames/genetics , Proteins/genetics , Software , Gene Expression Regulation , Humans , Internet , Microarray Analysis , Molecular Sequence Annotation , RNA/genetics , Reproducibility of Results
3.
BMC Bioinformatics ; 21(1): 459, 2020 Oct 15.
Article in English | MEDLINE | ID: mdl-33059593

ABSTRACT

BACKGROUND: High-throughput sequencing can establish the functional capacity of a microbial community by cataloging the protein-coding sequences (CDS) present in the metagenome of the community. The relative performance of different computational methods for identifying CDS from whole-genome shotgun sequencing is not fully established. RESULTS: Here we present an automated benchmarking workflow, using synthetic shotgun sequencing reads for which we know the true CDS content of the underlying communities, to determine the relative performance (sensitivity, positive predictive value or PPV, and computational efficiency) of different metagenome analysis tools for extracting the CDS content of a microbial community. Assembly-based methods are limited by coverage depth, with poor sensitivity for CDS at < 5X depth of sequencing, but have excellent PPV. Mapping-based techniques are more sensitive at low coverage depths, but can struggle with PPV. We additionally describe an expectation maximization based iterative algorithmic approach which we show to successfully improve the PPV of a mapping based technique while retaining improved sensitivity and computational efficiency. CONCLUSION: Our benchmarking approach reveals the trade-offs of assembly versus alignment-based approaches and the relative performance of specific implementations when one wishes to extract the protein coding capacity of microbial communities.
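The two accuracy metrics named in the abstract above follow the standard definitions: sensitivity = TP / (TP + FN) and PPV = TP / (TP + FP). The sketch below computes them for a set of predicted CDS identifiers against a known truth set, as one would with synthetic reads. The function name and the identifier sets are hypothetical and not taken from the benchmarking workflow itself.

```python
def sensitivity_and_ppv(predicted, truth):
    """Compare predicted CDS identifiers with the true CDS content.

    sensitivity = TP / (TP + FN): share of true CDS that were recovered.
    PPV         = TP / (TP + FP): share of predictions that are correct.
    """
    predicted, truth = set(predicted), set(truth)
    tp = len(predicted & truth)   # correctly detected CDS
    fn = len(truth - predicted)   # true CDS that were missed
    fp = len(predicted - truth)   # predictions not in the truth set
    sensitivity = tp / (tp + fn) if truth else float("nan")
    ppv = tp / (tp + fp) if predicted else float("nan")
    return sensitivity, ppv

# Hypothetical identifiers for illustration.
truth = {"cds1", "cds2", "cds3", "cds4"}
predicted = {"cds1", "cds2", "cdsX"}
print(sensitivity_and_ppv(predicted, truth))  # sensitivity 0.5, PPV 2/3
```

This makes the trade-off in the abstract concrete: a method that reports few but correct CDS has high PPV and low sensitivity, and the reverse holds for a permissive method.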


Subjects
Benchmarking , Computer Simulation , Metagenome , Open Reading Frames/genetics , Algorithms , Humans , Metagenomics , Microbiota/genetics , Predictive Value of Tests
4.
Elife ; 9, 2020 10 01.
Article in English | MEDLINE | ID: mdl-33001029

ABSTRACT

Understanding the emergence of novel viruses requires an accurate and comprehensive annotation of their genomes. Overlapping genes (OLGs) are common in viruses and have been associated with pandemics but are still widely overlooked. We identify and characterize ORF3d, a novel OLG in SARS-CoV-2 that is also present in Guangxi pangolin-CoVs but not other closely related pangolin-CoVs or bat-CoVs. We then document evidence of ORF3d translation, characterize its protein sequence, and conduct an evolutionary analysis at three levels: between taxa (21 members of Severe acute respiratory syndrome-related coronavirus), between human hosts (3978 SARS-CoV-2 consensus sequences), and within human hosts (401 deeply sequenced SARS-CoV-2 samples). ORF3d has been independently identified and shown to elicit a strong antibody response in COVID-19 patients. However, it has been misclassified as the unrelated gene ORF3b, leading to confusion. Our results liken ORF3d to other accessory genes in emerging viruses and highlight the importance of OLGs.


Subjects
Betacoronavirus/genetics , Coronavirus Infections/virology , Evolution, Molecular , Genes, Overlapping , Genes, Viral , Host Specificity/genetics , Open Reading Frames/genetics , Pandemics , Pneumonia, Viral/virology , Amino Acid Sequence , Animals , Antibodies, Viral/immunology , Antibody Specificity , Antigens, Viral/biosynthesis , Antigens, Viral/genetics , Antigens, Viral/immunology , Betacoronavirus/pathogenicity , Betacoronavirus/physiology , China/epidemiology , Chiroptera/virology , Coronavirus/genetics , Coronavirus Infections/epidemiology , Epitopes/genetics , Epitopes/immunology , Europe/epidemiology , Eutheria/virology , Gene Expression Regulation, Viral , Genetic Variation , Haplotypes/genetics , Humans , Models, Molecular , Mutation , Phylogeny , Pneumonia, Viral/epidemiology , Protein Biosynthesis , Protein Conformation , RNA, Viral/genetics , Sequence Alignment , Sequence Homology, Nucleic Acid , Viral Proteins/biosynthesis , Viral Proteins/genetics , Viral Proteins/immunology
5.
BMC Bioinformatics ; 21(1): 474, 2020 Oct 22.
Article in English | MEDLINE | ID: mdl-33092526

ABSTRACT

BACKGROUND: Identifying frequently mutated regions is a key approach to discover DNA elements influencing cancer progression. However, it is challenging to identify these burdened regions due to mutation rate heterogeneity across the genome and across different individuals. Moreover, it is known that this heterogeneity partially stems from genomic confounding factors, such as replication timing and chromatin organization. The increasing availability of cancer whole genome sequences and functional genomics data from the Encyclopedia of DNA Elements (ENCODE) may help address these issues. RESULTS: We developed a negative binomial regression-based Integrative Method for mutation Burden analysiS (NIMBus). Our approach addresses the over-dispersion of mutation count statistics by (1) using a Gamma-Poisson mixture model to capture the mutation-rate heterogeneity across different individuals and (2) estimating regional background mutation rates by regressing the varying local mutation counts against genomic features extracted from ENCODE. We applied NIMBus to whole-genome cancer sequences from the PanCancer Analysis of Whole Genomes project (PCAWG) and other cohorts. It successfully identified well-known coding and noncoding drivers, such as TP53 and the TERT promoter. To further characterize the burdening of non-coding regions, we used NIMBus to screen transcription factor binding sites in promoter regions that intersect DNase I hypersensitive sites (DHSs). This analysis identified mutational hotspots that potentially disrupt gene regulatory networks in cancer. We also compare this method to other mutation burden analysis methods. CONCLUSION: NIMBus is a powerful tool to identify mutational hotspots. The NIMBus software and results are available as an online resource at github.gersteinlab.org/nimbus.
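The over-dispersion that a Gamma-Poisson mixture captures can be shown numerically. If the Poisson rate is drawn from a Gamma distribution with shape r and mean μ, the counts follow a negative binomial distribution with mean μ and variance μ + μ²/r, which always exceeds the Poisson variance μ. The sketch below checks this from the probability mass function; it is a generic illustration, not the NIMBus implementation, and the function names and parameter values are assumptions.

```python
import math

def neg_binom_pmf(k, mean, shape):
    """P(K = k) when K ~ Poisson(rate) and rate ~ Gamma(shape, scale=mean/shape)."""
    p = shape / (shape + mean)  # negative binomial success probability
    log_pmf = (math.lgamma(k + shape) - math.lgamma(k + 1) - math.lgamma(shape)
               + shape * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)

def moments(mean, shape, k_max=2000):
    """Numerically sum the pmf to recover the mean and variance of the counts."""
    probs = [neg_binom_pmf(k, mean, shape) for k in range(k_max)]
    m = sum(k * pk for k, pk in enumerate(probs))
    v = sum((k - m) ** 2 * pk for k, pk in enumerate(probs))
    return m, v

m, v = moments(mean=5.0, shape=2.0)
# Expected: mean 5, variance 5 + 5**2 / 2 = 17.5 (a Poisson model would give 5).
print(round(m, 6), round(v, 6))
```

In a regression setting, the mean would additionally depend on genomic covariates such as replication timing; that part is omitted here.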


Subjects
DNA Mutational Analysis/methods , Mutation/genetics , Software , Calibration , Computer Simulation , Disease/genetics , Genome, Human , Humans , Molecular Sequence Annotation , Mutation Rate , Neoplasms/genetics , Open Reading Frames/genetics , Promoter Regions, Genetic , Regression Analysis , Whole Genome Sequencing
6.
BMC Bioinformatics ; 21(1): 431, 2020 Oct 02.
Article in English | MEDLINE | ID: mdl-33008363

ABSTRACT

BACKGROUND: This paper describes a web-based tool that uses a combination of sonification and an animated display to inquire into the SARS-CoV-2 genome. The audio data is generated in real time from a variety of RNA motifs that are known to be important in the functioning of RNA. Additionally, metadata relating to RNA translation and transcription has been used to shape the auditory and visual displays. Together these tools provide a unique approach to further understand the metabolism of the viral RNA genome. This audio provides a further means to represent the function of the RNA in addition to traditional written and visual approaches. RESULTS: Sonification of the SARS-CoV-2 genomic RNA sequence results in a complex auditory stream composed of up to 12 individual audio tracks. Each auditory motive is derived from the actual RNA sequence or from metadata. This approach has been used to represent transcription or translation of the viral RNA genome. The display highlights the real-time interaction of functional RNA elements. The sonification of codons derived from all three reading frames of the viral RNA sequence, in combination with sonified metadata, provides the framework for this display. Functional RNA motifs such as transcription regulatory sequences and stem loop regions have also been sonified. Using the tool, audio can be generated in real time from either genomic or sub-genomic representations of the RNA. Given the large size of the viral genome, a collection of interactive buttons has been provided to navigate to regions of interest, such as cleavage regions in the polyprotein, untranslated regions or each gene. These tools are available through an internet browser and the user can interact with the data display in real time. CONCLUSION: The auditory display, in combination with real-time animation of the process of translation and transcription, provides a unique insight into the large body of evidence describing the metabolism of the RNA genome. Furthermore, the tool has been used as an algorithm-based audio generator. These audio tracks can be listened to by the general community without reference to the visual display to encourage further inquiry into the science.


Subjects
Betacoronavirus/genetics , Genome, Viral , Software , Betacoronavirus/isolation & purification , Coronavirus Infections/pathology , Coronavirus Infections/virology , Genomics , Humans , Open Reading Frames/genetics , Pandemics , Pneumonia, Viral/pathology , Pneumonia, Viral/virology , RNA, Viral/chemistry , RNA, Viral/genetics , RNA, Viral/metabolism
7.
PLoS One ; 15(9): e0233197, 2020.
Article in English | MEDLINE | ID: mdl-32946445

ABSTRACT

Levels of protein translation by ribosomes are governed both by features of the translation machinery and by sequence properties of the mRNAs themselves. We focus here on a striking three-nucleotide periodicity, characterized by overrepresentation of GCN codons and underrepresentation of G at the second position of codons, that is observed in Open Reading Frames (ORFs) of mRNAs. Our examination of mRNA sequences in Saccharomyces cerevisiae revealed that this periodicity is particularly pronounced in the initial codons (the ramp region) of ORFs of genes with high protein expression. It is also found in mRNA sequences immediately following non-standard AUG start sites, located upstream or downstream of the standard annotated start sites of genes. To explore the possible influences of the ramp GCN periodicity on translation efficiency, we tested edited ramps with accentuated or depressed periodicity in two test genes, SKN7 and HMT1. Greater conformance to (GCN)n was found to significantly depress translation, whereas disrupting conformance had neutral or positive effects on translation. Our recent Molecular Dynamics analysis of a subsystem of translocating ribosomes in yeast revealed an interaction surface that H-bonds to the +1 codon that is about to enter the ribosome decoding center A site. The surface, comprised of 16S/18S rRNA C1054 and A1196 (E. coli numbering) and R146 of ribosomal protein Rps3, preferentially interacts with GCN codons, and we hypothesize that modulation of this mRNA-ribosome interaction may underlie GCN-mediated regulation of protein translation. Integration of our expression studies with large-scale reporter studies of ramp sequence variants suggests a model in which the C1054-A1196-R146 (CAR) interaction surface can act as both an accelerator and braking system for ribosome translation.
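The two sequence features that define the periodicity described above (GCN codons, and G at the second codon position) are simple to measure for a given ORF. The sketch below computes both fractions over the first codons of a sequence. The function name, the `ramp_codons` parameter and the example sequence are hypothetical; the abstract does not state how many codons make up the ramp region.

```python
def gcn_stats(orf, ramp_codons=None):
    """Return (fraction of GCN codons, fraction of codons with G at position 2).

    `ramp_codons` limits the analysis to the first N codons; the ramp length
    is an assumption here and should be set by the user.
    """
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    if ramp_codons is not None:
        codons = codons[:ramp_codons]
    if not codons:
        return 0.0, 0.0
    gcn = sum(c[:2] == "GC" for c in codons) / len(codons)
    g_second = sum(c[1] == "G" for c in codons) / len(codons)
    return gcn, g_second

# Codons: ATG, GCT, GCA, AGA -> two GCN codons, one codon with G at position 2.
print(gcn_stats("ATGGCTGCAAGA"))  # (0.5, 0.25)
```

Because only G and C are inspected, the function works equally for DNA (T) and RNA (U) representations of the ORF.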


Subjects
Codon, Initiator/genetics , Protein Biosynthesis/genetics , Ribosomes/metabolism , Saccharomyces cerevisiae/genetics , Base Composition/genetics , Codon, Initiator/metabolism , DNA-Binding Proteins/biosynthesis , DNA-Binding Proteins/genetics , Molecular Dynamics Simulation , Open Reading Frames/genetics , Protein-Arginine N-Methyltransferases/biosynthesis , Protein-Arginine N-Methyltransferases/genetics , RNA, Messenger/genetics , RNA, Messenger/metabolism , Repressor Proteins/biosynthesis , Repressor Proteins/genetics , Saccharomyces cerevisiae/cytology , Saccharomyces cerevisiae/metabolism , Saccharomyces cerevisiae Proteins/biosynthesis , Saccharomyces cerevisiae Proteins/genetics , Transcription Factors/biosynthesis , Transcription Factors/genetics
9.
PLoS One ; 15(9): e0239044, 2020.
Article in English | MEDLINE | ID: mdl-32931501

ABSTRACT

Holothuria leucospilota (Echinodermata: Holothuroidea) is a widespread tropical sea cucumber with strong value for the ecological restoration of coral reefs. Therefore, some studies regarding the artificial breeding and cultivation of H. leucospilota have been undertaken recently. However, the biological functions of the digestive system of this species have not been elucidated. In this study, a cDNA coding for α-amylase, an indicator of digestive maturity in animals, was identified from H. leucospilota and designated Hl-Amy. The full-length cDNA of the Hl-Amy gene, which is 1734 bp in length with an open reading frame (ORF) of 1578 bp, encodes a 525 amino acid (a.a.) protein with a deduced molecular weight of 59.34 kDa. According to the CAZy database annotation, Hl-Amy belongs to the class of GH-H with the official nomenclature of α-amylase (EC 3.2.1.1) or 4-α-D-glucan glucanohydrolase. The Hl-Amy protein contains a signal peptide at the N-terminus followed by a functional amylase domain, which includes the catalytic activity site. Hl-Amy mRNA was abundantly expressed in the intestine, present at a low level in the transverse vessel, and hardly detected in other selected tissues. During embryonic and larval development, Hl-Amy was constitutively expressed in all stages, and the highest expression level was observed in the blastula. By in situ hybridization (ISH), positive Hl-Amy signals were observed in different parts of the three different intestinal segments (foregut, midgut and hindgut). The Hl-Amy recombinant protein was generated in an E. coli system with codon optimization, which is necessary for Hl-Amy to be successfully expressed in this heterologous system. The Hl-Amy recombinant protein was purified by immobilized metal ion affinity chromatography (IMAC), and its starch hydrolysis activity was further detected. The optimal temperature and pH for the Hl-Amy recombinant protein were 55°C and 6.0, respectively, with an activity of 62.2 U/mg. In summary, this study has filled a knowledge gap on the biological function and expression profiles of an essential digestive enzyme in sea cucumber, which may encourage future investigation toward rationalized diets for H. leucospilota in artificial cultivation, and optimized heterologous prokaryotic systems for producing recombinant enzymes of marine origins.
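The ORF and protein lengths quoted above are consistent if the 1578 bp ORF is counted with its stop codon: 1578 / 3 = 526 codons, of which 525 encode amino acids. The short sketch below makes that arithmetic explicit; the function name is illustrative, and the inclusion of the stop codon in the ORF length is an assumption inferred from the numbers rather than stated in the abstract.

```python
def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon terminates translation and encodes no amino acid.
    return codons - 1 if includes_stop else codons

print(protein_length(1578))  # 525
```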


Subjects
Sea Cucumbers/enzymology , Sea Cucumbers/genetics , alpha-Amylases/genetics , Amino Acid Sequence/genetics , Animals , Biological Phenomena , Cloning, Molecular/methods , Codon/genetics , DNA, Complementary/genetics , Echinodermata/genetics , Gene Expression Profiling/methods , Open Reading Frames/genetics , Phylogeny , Sequence Alignment/methods , Tissue Distribution/genetics , alpha-Amylases/metabolism
10.
Nucleic Acids Res ; 48(18): 10441-10455, 2020 10 09.
Article in English | MEDLINE | ID: mdl-32941651

ABSTRACT

Comprehensive genome-wide analysis has revealed the presence of translational elements in the 3' untranslated regions (UTRs) of human transcripts. However, the mechanisms by which translation is initiated in 3' UTRs and the physiological function of their products remain unclear. This study showed that eIF4G drives the translation of various downstream open reading frames (dORFs) in 3' UTRs. The 3' UTR of GCH1, which encodes GTP cyclohydrolase 1, contains an internal ribosome entry site (IRES) that initiates the translation of dORFs. An in vitro reconstituted translation system showed that the IRES in the 3' UTR of GCH1 required eIF4G and conventional translation initiation factors, except eIF4E, for AUG-initiated translation of dORFs. The 3' UTR of GCH1-mediated translation was resistant to the mTOR inhibitor Torin 1, which inhibits cap-dependent initiation by increasing eIF4E-unbound eIF4G. eIF4G was also required for the activity of various elements, including polyU and poliovirus type 2, a short element thought to recruit ribosomes by base-pairing with 18S rRNA. These findings indicate that eIF4G mediates translation initiation of various ORFs in mammalian cells, suggesting that the 3' UTRs of mRNAs may encode various products.


Subjects
Eukaryotic Initiation Factor-4G/genetics , GTP Cyclohydrolase/genetics , Open Reading Frames/genetics , TOR Serine-Threonine Kinases/genetics , 3' Untranslated Regions/genetics , Eukaryotic Initiation Factor-4E/genetics , Humans , Naphthyridines/pharmacology , Poliovirus/genetics , Protein Biosynthesis/genetics , RNA Caps/genetics , RNA, Messenger/genetics , RNA, Ribosomal, 18S/genetics , Ribosomes/genetics , TOR Serine-Threonine Kinases/antagonists & inhibitors
11.
BMC Bioinformatics ; 21(Suppl 8): 201, 2020 Sep 16.
Article in English | MEDLINE | ID: mdl-32938407

ABSTRACT

MicroRNAs are small non-coding RNAs that post-transcriptionally regulate the expression levels of messenger RNAs. MicroRNA regulation activity depends on the recognition of binding sites located on mRNA molecules. ComiR is a web tool designed to predict the targets of a set of microRNAs, starting from their expression profile. ComiR was trained with the information regarding binding sites in the 3' UTR region, by using a reliable dataset containing the targets of endogenously expressed microRNA in D. melanogaster S2 cells. This dataset was obtained by comparing the results from two different experimental approaches, i.e., inhibition and immunoprecipitation of the AGO1 protein, a component of the microRNA-induced silencing complex. In this work, we tested whether including coding region binding sites in the ComiR algorithm improves the performance of the tool in predicting microRNA targets. We focused the analysis on the D. melanogaster species and updated the ComiR underlying database with the currently available releases of mRNA and microRNA sequences. As a result, we find that the ComiR algorithm trained with the information related to the coding regions is more efficient in predicting the microRNA targets than the algorithm trained with 3' UTR information. On the other hand, we show that 3' UTR based predictions can be seen as complementary to the coding region based predictions, which suggests that both predictions, from 3' UTR and coding regions, should be considered in a comprehensive analysis. Furthermore, we observed that the lists of targets obtained by analyzing data from one experimental approach only, that is, inhibition or immunoprecipitation of AGO1, are not reliable enough to test the performance of our microRNA target prediction algorithm. Further analysis will be conducted to investigate the effectiveness of the tool with data from other species, provided that validated datasets, as obtained from the comparison of RISC protein inhibition and immunoprecipitation experiments, become available for the same samples. Finally, we propose to upgrade the existing ComiR web tool by including the coding region based trained model, available together with the 3' UTR based one.


Subjects
Drosophila melanogaster/genetics , MicroRNAs/genetics , Open Reading Frames/genetics , RNA, Messenger/genetics , Algorithms , Animals , Humans
12.
Sci Signal ; 13(651), 2020 09 29.
Article in English | MEDLINE | ID: mdl-32994211

ABSTRACT

There are currently no antiviral therapies specific for SARS-CoV-2, the virus responsible for the global pandemic disease COVID-19. To facilitate structure-based drug design, we conducted an x-ray crystallographic study of the SARS-CoV-2 nsp16-nsp10 2'-O-methyltransferase complex, which methylates Cap-0 viral mRNAs to improve viral protein translation and to avoid host immune detection. We determined the structures for nsp16-nsp10 heterodimers bound to the methyl donor S-adenosylmethionine (SAM), the reaction product S-adenosylhomocysteine (SAH), or the SAH analog sinefungin (SFG). We also solved structures for nsp16-nsp10 in complex with the methylated Cap-0 analog m7GpppA and either SAM or SAH. Comparative analyses between these structures and published structures for nsp16 from other betacoronaviruses revealed flexible loops in open and closed conformations at the m7GpppA-binding pocket. Bound sulfates in several of the structures suggested the location of the ribonucleic acid backbone phosphates in the ribonucleotide-binding groove. Additional nucleotide-binding sites were found on the face of the protein opposite the active site. These various sites and the conserved dimer interface could be exploited for the development of antiviral inhibitors.


Subjects
Betacoronavirus/enzymology , Coronavirus Infections/drug therapy , Methyltransferases/chemistry , Pneumonia, Viral/drug therapy , Viral Nonstructural Proteins/chemistry , Adenosine/analogs & derivatives , Adenosine/metabolism , Adenosine/pharmacology , Betacoronavirus/drug effects , Binding Sites , Catalytic Domain , Crystallography, X-Ray , Dimerization , Genes, Viral/genetics , Humans , Methylation , Methyltransferases/antagonists & inhibitors , Models, Molecular , Open Reading Frames/genetics , Pandemics , Protein Binding , Protein Conformation , RNA Cap Analogs/metabolism , RNA Processing, Post-Transcriptional , RNA, Viral/metabolism , S-Adenosylhomocysteine/metabolism , S-Adenosylmethionine/metabolism , Structure-Activity Relationship , Viral Nonstructural Proteins/antagonists & inhibitors , Viral Nonstructural Proteins/metabolism
13.
Proc Natl Acad Sci U S A ; 117(40): 24936-24946, 2020 10 06.
Article in English | MEDLINE | ID: mdl-32958672

ABSTRACT

While near-cognate codons are frequently used for translation initiation in eukaryotes, their efficiencies are usually low (<10% compared to an AUG in optimal context). Here, we describe a rare case of highly efficient near-cognate initiation. A CUG triplet located in the 5' leader of POLG messenger RNA (mRNA) initiates almost as efficiently (∼60 to 70%) as an AUG in optimal context. This CUG directs translation of a conserved 260-triplet-long overlapping open reading frame (ORF), which we call POLGARF (POLG Alternative Reading Frame). Translation of a short upstream ORF 5' of this CUG governs the ratio between POLG (the catalytic subunit of mitochondrial DNA polymerase) and POLGARF synthesized from a single POLG mRNA. Functional investigation of POLGARF suggests a role in extracellular signaling. While unprocessed POLGARF localizes to the nucleoli together with its interacting partner C1QBP, serum stimulation results in rapid cleavage and secretion of a POLGARF C-terminal fragment. Phylogenetic analysis shows that POLGARF evolved ∼160 million y ago due to a mammalian-wide interspersed repeat (MIR) transposition into the 5' leader sequence of the mammalian POLG gene, which became fixed in placental mammals. This discovery of POLGARF unveils a previously undescribed mechanism of de novo protein-coding gene evolution.
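Near-cognate start codons are conventionally the triplets that differ from AUG at exactly one position, which gives nine candidates including the CUG discussed above. The sketch below enumerates them; the function name is illustrative, and the one-mismatch definition is the usual convention rather than something stated in the abstract.

```python
def near_cognate_codons(start="AUG", alphabet="ACGU"):
    """All codons differing from `start` at exactly one nucleotide position."""
    return sorted(
        start[:i] + base + start[i + 1:]
        for i in range(len(start))
        for base in alphabet
        if base != start[i]
    )

codons = near_cognate_codons()
print(len(codons))       # 9
print("CUG" in codons)   # True
```

For scale, the 260-triplet POLGARF reading frame mentioned in the abstract spans 260 x 3 = 780 nucleotides.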


Subjects
Codon, Initiator/genetics , DNA Polymerase gamma/genetics , Phylogeny , Protein Biosynthesis/genetics , Animals , Base Sequence , Carrier Proteins/genetics , Female , Humans , Mitochondrial Proteins/genetics , Open Reading Frames/genetics , Pregnancy , RNA, Messenger/genetics , Reading Frames/genetics
14.
Int J Mol Sci ; 21(15), 2020 Aug 04.
Article in English | MEDLINE | ID: mdl-32759818

ABSTRACT

The current COronaVIrus Disease 2019 (COVID-19) pandemic started in December 2019. COVID-19 cases are confirmed by the detection of SARS-CoV-2 RNA in biological samples by RT-qPCR. However, limited numbers of SARS-CoV-2 genomes were available when the first RT-qPCR methods were developed in January 2020 for initial in silico specificity evaluation and to verify whether the targeted loci are highly conserved. Now that more whole genome data have become available, we used the bioinformatics tool SCREENED and a total of 4755 publicly available SARS-CoV-2 genomes, downloaded at two different time points, to evaluate the specificity of 12 RT-qPCR tests (consisting of a total of 30 primer and probe sets) used for SARS-CoV-2 detection and the impact of the virus' genetic evolution on four of them. The exclusivity of these methods was also assessed using the human reference genome and 2624 closely related other respiratory viral genomes. The specificity of the assays was generally good and stable over time. An exception is the first method developed by the China Center for Disease Control and Prevention (CDC), which exhibits three primer mismatches present in 358 SARS-CoV-2 genomes sequenced mainly in Europe from February 2020 onwards. The best results were obtained for the assay of Chan et al. (2020) targeting the gene coding for the spike protein (S). This demonstrates that our user-friendly strategy can be used for a first in silico specificity evaluation of future RT-qPCR tests, as well as verifying that the former methods are still capable of detecting circulating SARS-CoV-2 variants.


Subjects
Betacoronavirus/genetics , Coronavirus Infections/diagnosis , Genome, Viral , Pneumonia, Viral/diagnosis , RNA, Viral/metabolism , Real-Time Polymerase Chain Reaction/methods , Betacoronavirus/isolation & purification , Coronavirus Infections/virology , Databases, Genetic , Humans , Open Reading Frames/genetics , Pandemics , Pneumonia, Viral/virology , Polymorphism, Single Nucleotide , RNA Replicase/genetics , RNA, Viral/analysis , Sensitivity and Specificity , Whole Genome Sequencing
15.
Int J Mol Sci ; 21(15), 2020 Aug 03.
Article in English | MEDLINE | ID: mdl-32756480

ABSTRACT

The pandemic of coronavirus disease 2019 (COVID-19), with rising numbers of patients worldwide, presents an urgent need for effective treatments. To date, there are no therapies or vaccines that are proven to be effective against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Several potential candidates or repurposed drugs are under investigation, including drugs that inhibit SARS-CoV-2 replication and block infection. The most promising therapy to date is remdesivir, which is US Food and Drug Administration (FDA) approved for emergency use in adults and children hospitalized with severe suspected or laboratory-confirmed COVID-19. Herein we summarize the general features of SARS-CoV-2's molecular and immune pathogenesis and discuss available pharmacological strategies, based on our present understanding of SARS-CoV and Middle East respiratory syndrome coronavirus (MERS-CoV) infections. Finally, we outline clinical trials currently in progress to investigate the efficacy of potential therapies for COVID-19.


Subjects
Adaptive Immunity , Betacoronavirus/physiology , Coronavirus Infections/pathology , Pneumonia, Viral/pathology , Anti-Inflammatory Agents/therapeutic use , Antiviral Agents/therapeutic use , Betacoronavirus/isolation & purification , Coronavirus Infections/immunology , Coronavirus Infections/therapy , Coronavirus Infections/virology , Humans , Immunotherapy , Middle East Respiratory Syndrome Coronavirus/isolation & purification , Middle East Respiratory Syndrome Coronavirus/physiology , Open Reading Frames/genetics , Pandemics , Pneumonia, Viral/immunology , Pneumonia, Viral/therapy , Pneumonia, Viral/virology
16.
Biomolecules ; 10(8), 2020 08 07.
Article in English | MEDLINE | ID: mdl-32784796

ABSTRACT

Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR) loci are found in bacterial and archaeal genomes where they provide the molecular machinery for acquisition of immunity against foreign DNA. In addition to the cas genes fundamentally required for CRISPR activity, a second class of genes is associated with the CRISPR loci, of which many have no reported function in CRISPR-mediated immunity. Here, we characterize MM_0565, which is associated with the type I-B CRISPR locus of Methanosarcina mazei Gö1. We show that purified MM_0565, composed of a CRISPR-Cas Associated Rossmann Fold (CARF) and a winged helix-turn-helix domain, forms a dimer in solution; in vivo, the dimeric MM_0565 is strongly stabilized under high salt stress. While direct effects on CRISPR-Cas transcription were not detected by genetic approaches, specific binding of MM_0565 to the leader region of both CRISPR-Cas systems was observed by microscale thermophoresis and electrophoretic mobility shift assays. Moreover, overexpression of MM_0565 strongly induced transcription of the cas1-solo gene located in the recently reported casposon, the gene product of which shows high similarity to classical Cas1 proteins. Based on our findings, and taking the absence of the expressed CRISPR locus-encoded Cas1 protein into account, we hypothesize that MM_0565 might modulate the activity of the CRISPR systems on different levels.


Subjects
CRISPR-Associated Proteins/chemistry , Clustered Regularly Interspaced Short Palindromic Repeats/genetics , Methanosarcina/genetics , Amino Acid Motifs/genetics , CRISPR-Associated Proteins/genetics , CRISPR-Cas Systems , Gene Expression Regulation, Archaeal/genetics , Methanosarcina/chemistry , Methanosarcina/metabolism , Open Reading Frames/genetics , Promoter Regions, Genetic , Protein Binding , Protein Folding , Protein Multimerization/genetics , RNA-Seq
17.
PLoS One ; 15(8): e0237559, 2020.
Article in English | MEDLINE | ID: mdl-32780783

ABSTRACT

BACKGROUND: The world is going through a critical phase of the COVID-19 pandemic, caused by the human coronavirus SARS-CoV-2. A worldwide concerted effort to identify viral genomic changes across different sub-types has identified several strong changes in the coding region. However, few studies have focused on variations in the 5' and 3' untranslated regions (UTRs) and their consequences. Considering the possible importance of these regions in host-mediated regulation of the viral RNA genome, we wanted to explore the phenomenon. METHODS: To get an idea of the global changes in 5'- and 3'-UTR sequences, we downloaded 8595 complete and high-coverage SARS-CoV-2 genome sequences from human hosts in FASTA format from the Global Initiative on Sharing All Influenza Data (GISAID), covering 15 different geographical regions. Next, we aligned them using Clustal Omega software and investigated the UTR variants. We also looked at the putative host RNA binding protein (RBP) and microRNA binding sites in these regions using 'RBPmap' and 'RNA22 v2', respectively. The expression status of selected RBPs and microRNAs was checked in lung tissue. RESULTS: We identified 28 unique variants in the SARS-CoV-2 UTR regions based on a minimum variant percentage cut-off of 0.5. Along with the 241C>T change, the important 5'-UTR change identified was 187A>G, while 29734G>C, 29742G>A/T and 29774C>T were the most familiar variants of the 3'-UTR among most of the continents. Furthermore, we found that despite the variations in the UTR regions, binding of host RBPs to them remains mostly unaltered, which further influenced the functioning of specific miRNAs. CONCLUSION: Our results show, for the first time in SARS-CoV-2 infection, a possible cross-talk between host RBPs-miRNAs and viral UTR variants, which could ultimately explain how the virus escapes the host RNA decay machinery. This knowledge might be helpful in developing anti-viral compounds in the future.


Subjects
3' Untranslated Regions/genetics , 5' Untranslated Regions/genetics , Betacoronavirus/genetics , Coronavirus Infections/metabolism , Genome, Viral/genetics , Genomic Instability/genetics , Host-Pathogen Interactions/genetics , MicroRNAs/metabolism , Pneumonia, Viral/metabolism , RNA, Viral/metabolism , RNA-Binding Proteins/metabolism , Base Sequence , Binding Sites , Coronavirus Infections/virology , Humans , Open Reading Frames/genetics , Pandemics , Pneumonia, Viral/virology , Protein Binding/genetics
18.
Sci Rep ; 10(1): 14004, 2020 Aug 19.
Article in English | MEDLINE | ID: mdl-32814791

ABSTRACT

Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), a novel evolutionarily divergent RNA virus, is responsible for the present devastating COVID-19 pandemic. To explore the genomic signatures, we comprehensively analyzed 2,492 complete and/or near-complete genome sequences of SARS-CoV-2 strains reported from across the globe to the GISAID database up to 30 March 2020. Genome-wide annotations revealed 1,516 nucleotide-level variations at different positions throughout the entire genome of SARS-CoV-2. Moreover, nucleotide (nt) deletion analysis found twelve deletion sites throughout the genome other than previously reported deletions at coding sequence of the ORF8 (open reading frame), spike, and ORF7a proteins, specifically in polyprotein ORF1ab (n = 9), ORF10 (n = 1), and 3'-UTR (n = 2). Evidence from the systematic gene-level mutational and protein profile analyses revealed a large number of amino acid (aa) substitutions (n = 744), demonstrating the heterogeneity of the viral proteins. Notably, residues of the receptor-binding domain (RBD) showing crucial interactions with angiotensin-converting enzyme 2 (ACE2) and cross-reacting neutralizing antibody were found to be conserved among the analyzed virus strains, except for the replacement of lysine with arginine at the 378th position of the cryptic epitope of a Shanghai isolate, hCoV-19/Shanghai/SH0007/2020 (EPI_ISL_416320). Furthermore, our results from the preliminary epidemiological data on SARS-CoV-2 infections revealed that the frequency of aa mutations was relatively higher in the SARS-CoV-2 genome sequences of Europe (43.07%), followed by Asia (38.09%) and North America (29.64%), while case fatality rates remained higher in the European temperate countries, such as Italy, Spain, Netherlands, France, England and Belgium. Thus, the present method of genome annotation employed at this early pandemic stage could be a promising tool for monitoring and tracking the continuously evolving pandemic situation, the associated genetic variants, and their implications for the development of effective control and prophylaxis strategies.


Subjects
Betacoronavirus/classification , Betacoronavirus/genetics , Coronavirus Infections/epidemiology , Genetic Heterogeneity , Genome, Viral/genetics , Genome-Wide Association Study/methods , Global Health , Pneumonia, Viral/epidemiology , Amino Acid Sequence/genetics , Antibodies, Neutralizing/immunology , Base Pair Mismatch , Base Sequence/genetics , Climate , Coronavirus Infections/virology , Humans , Open Reading Frames/genetics , Pandemics , Peptidyl-Dipeptidase A/metabolism , Phylogeny , Pneumonia, Viral/virology , Protein Domains/genetics , Protein Domains/immunology , Sequence Deletion , Spike Glycoprotein, Coronavirus/chemistry , Spike Glycoprotein, Coronavirus/genetics , Spike Glycoprotein, Coronavirus/metabolism
19.
PLoS Genet ; 16(8): e1008995, 2020 Aug.
Article in English | MEDLINE | ID: mdl-32833967

ABSTRACT

Pan-genomic open reading frames (ORFs) potentially carry protein-coding gene or coding variant information in a population. In this study, we suggest that pan-genomic ORFs are promising for use in the estimation of heritability and in genomic prediction. A Saccharomyces cerevisiae dataset with whole-genome SNPs, pan-genomic ORFs, and the copy numbers of those ORFs is used to test the effectiveness of ORF data as a predictor in three prediction models for 35 traits. Our results show that the ORF-based heritability can capture more genetic effects than SNP-based heritability for all traits. Compared to SNP-based genomic prediction (GBLUP), pan-genomic ORF-based genomic prediction (OBLUP) is distinctly more accurate for all traits, and the predictive abilities on average are more than doubled across all traits. For four traits, prediction based on ORF copy number (CBLUP) is more accurate than OBLUP. When using different numbers of isolates in training sets in ORF-based prediction, the predictive abilities for all traits increased as more isolates were added to the training sets, suggesting that with very large training sets the prediction accuracy will be in the range of the square root of the heritability. We conclude that pan-genomic ORFs have the potential to supplement single nucleotide polymorphisms in the estimation of heritability and in genomic prediction.


Subjects
Genome/genetics , Genomics , Open Reading Frames/genetics , Quantitative Trait Loci/genetics , Animals , Breeding , Genome-Wide Association Study , Genotype , Phenotype , Polymorphism, Single Nucleotide/genetics
20.
G3 (Bethesda) ; 10(9): 3399-3402, 2020 Sep 02.
Article in English | MEDLINE | ID: mdl-32763951

ABSTRACT

The world is facing a global pandemic of COVID-19 caused by the SARS-CoV-2 coronavirus. Here we describe a collection of codon-optimized coding sequences for SARS-CoV-2 cloned into Gateway-compatible entry vectors, which enable rapid transfer into a variety of expression and tagging vectors. The collection is freely available. We hope that widespread availability of this SARS-CoV-2 resource will enable many subsequent molecular studies to better understand the viral life cycle and how to block it.


Subjects
Betacoronavirus/genetics , Open Reading Frames/genetics , Betacoronavirus/isolation & purification , Cloning, Molecular , Coronavirus Infections/pathology , Coronavirus Infections/virology , Escherichia coli/metabolism , Humans , Pandemics , Plasmids/genetics , Plasmids/metabolism , Pneumonia, Viral/pathology , Pneumonia, Viral/virology , Potyvirus/genetics