Results 1 - 20 of 38
1.
Adv Sci (Weinh) ; 11(14): e2306311, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38298116

ABSTRACT

The G-protein-coupled human cannabinoid receptor 1 (CB1) is a promising therapeutic target for pain management, inflammation, obesity, and substance abuse disorders. The structures of CB1-Gi complexes in synthetic agonist-bound forms have been resolved to date. However, the commercial drug recognition and Gq coupling mechanisms of CB1 remain elusive. Herein, the cryo-electron microscopy (cryo-EM) structure of CB1-Gq complex, in fenofibrate-bound form, at near-atomic resolution, is reported. The structure elucidates the delicate mechanisms of the precise fenofibrate recognition and Gq protein coupling by CB1 and will facilitate future drug discovery and design.


Subjects
Cannabinoids , Fenofibrate , Humans , Cannabinoid Receptor CB1 , Cryoelectron Microscopy , GTP-Binding Proteins
2.
Org Lett ; 25(19): 3573-3577, 2023 May 19.
Article in English | MEDLINE | ID: mdl-37154605

ABSTRACT

The stereoselective synthesis of dienyl esters with high atom- and step-economy has been largely unexplored. Herein, we report an efficient approach for the synthesis of E-dienyl esters via rhodium catalysis using carboxylic acid and acetylene as C2 synthon through the cascade of cyclometalation and C-O coupling. This protocol features mild conditions, excellent functional group tolerance, and exclusive E-stereoselectivity and utility in the late-stage modification of pharmaceuticals and natural products.

3.
Nature ; 621(7978): 396-403, 2023 Sep.
Article in English | MEDLINE | ID: mdl-37130545

ABSTRACT

Messenger RNA (mRNA) vaccines are being used to combat the spread of COVID-19 (refs. 1-3), but they still exhibit critical limitations caused by mRNA instability and degradation, which are major obstacles for the storage, distribution and efficacy of the vaccine products (ref. 4). Increasing secondary structure lengthens mRNA half-life, which, together with optimal codons, improves protein expression (ref. 5). Therefore, a principled mRNA design algorithm must optimize both structural stability and codon usage. However, owing to synonymous codons, the mRNA design space is prohibitively large: for example, there are around 2.4 × 10^632 candidate mRNA sequences for the SARS-CoV-2 spike protein. This poses insurmountable computational challenges. Here we provide a simple and unexpected solution using the classical concept of lattice parsing in computational linguistics, where finding the optimal mRNA sequence is analogous to identifying the most likely sentence among similar-sounding alternatives (ref. 6). Our algorithm LinearDesign finds an optimal mRNA design for the spike protein in just 11 minutes, and can concurrently optimize stability and codon usage. LinearDesign substantially improves mRNA half-life and protein expression, and profoundly increases antibody titre by up to 128 times in mice compared to the codon-optimization benchmark on mRNA vaccines for COVID-19 and varicella-zoster virus. This result reveals the great potential of principled mRNA design and enables the exploration of previously unreachable but highly stable and efficient designs. Our work is a timely tool for vaccines and other mRNA-based medicines encoding therapeutic proteins such as monoclonal antibodies and anti-cancer drugs (refs. 7,8).


Subjects
Algorithms , COVID-19 Vaccines , COVID-19 , RNA Stability , Messenger RNA , SARS-CoV-2 , mRNA Vaccines , Animals , Humans , Mice , Codon/genetics , COVID-19/genetics , COVID-19/immunology , COVID-19/prevention & control , COVID-19 Vaccines/chemistry , COVID-19 Vaccines/genetics , COVID-19 Vaccines/immunology , Half-Life , Human Herpesvirus 3/genetics , Human Herpesvirus 3/immunology , mRNA Vaccines/chemistry , mRNA Vaccines/genetics , mRNA Vaccines/immunology , RNA Stability/genetics , RNA Stability/immunology , Messenger RNA/chemistry , Messenger RNA/genetics , Messenger RNA/immunology , Messenger RNA/metabolism , SARS-CoV-2/genetics , SARS-CoV-2/immunology
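
Note on the design-space size quoted in the abstract of record 3: the number of candidate mRNA sequences for a protein is the product of the number of synonymous codons at each amino-acid position. The following is a minimal Python sketch of that count, assuming the standard genetic code. The function names are illustrative and are not taken from LinearDesign. The sketch only counts candidates and does not implement the lattice-parsing search itself.

```python
# Number of synonymous codons per amino acid in the standard genetic code.
# The 20 values sum to 61; the remaining 3 of the 64 codons are stop codons.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}
STOP_CODONS = 3


def count_mrna_designs(protein: str, include_stop: bool = False) -> int:
    """Count the mRNA sequences that encode `protein` (one-letter codes).

    Hypothetical helper for illustration: multiplies the codon
    degeneracy of every residue, optionally including the stop codon.
    """
    total = 1
    for residue in protein.upper():
        total *= CODON_COUNTS[residue]  # KeyError on non-standard residues
    if include_stop:
        total *= STOP_CODONS
    return total


def order_of_magnitude(n: int) -> int:
    """Return the base-10 exponent of a positive integer (digits - 1)."""
    return len(str(n)) - 1


if __name__ == "__main__":
    peptide = "MFVFLVLLPLVSSQ"  # arbitrary short example peptide
    n = count_mrna_designs(peptide)
    print(f"{peptide}: {n} designs (about 10^{order_of_magnitude(n)})")
```

Because the count grows exponentially with protein length, enumerating every candidate is infeasible for a protein of more than a thousand residues, which is the motivation the abstract gives for a dynamic-programming style search.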
4.
Chem Sci ; 14(7): 1912-1918, 2023 Feb 15.
Article in English | MEDLINE | ID: mdl-36819868

ABSTRACT

Vinyl-substituted alcohols represent a highly useful class of molecular skeletons. Current methods typically require either stoichiometric metallic reagents or preformed precursors. Herein, we report a nickel catalysis-enabled synthesis of vinyl-substituted alcohols via a 5-membered oxa-metallacycle. In this protocol, acetylene, the simplest alkyne and abundant feedstock, is employed as an ideal C2 synthon. The reaction features mild conditions, good functional group tolerance and broad substrate scope. Mechanistic exploration implies that the oxa-metallacycle originating from the cyclometallation of aldehyde and acetylene is the key intermediate for this transformation, which is then terminated by a silane-mediated σ-bond metathesis and subsequent reductive elimination.

5.
Nat Commun ; 14(1): 847, 2023 02 15.
Article in English | MEDLINE | ID: mdl-36792607

ABSTRACT

Genome wide association studies for coronary artery disease (CAD) have identified a risk locus at 11q22.3. Here, we verify with mechanistic studies that rs2019090 and PDGFD represent the functional variant and gene at this locus. Further, FOXC1/C2 transcription factor binding at rs2019090 is shown to promote PDGFD transcription through the CAD promoting allele. With single cell transcriptomic and histology studies with Pdgfd knockdown in an SMC lineage tracing male atherosclerosis mouse model we find that Pdgfd promotes expansion, migration, and transition of SMC lineage cells to the chondromyocyte phenotype. Pdgfd also increases adventitial fibroblast and pericyte expression of chemokines and leukocyte adhesion molecules, which is linked to plaque macrophage recruitment. Despite these changes there is no effect of Pdgfd deletion on overall plaque burden. These findings suggest that PDGFD mediates CAD risk by promoting deleterious phenotypic changes in SMC, along with an inflammatory response that is primarily focused in the adventitia.


Subjects
Atherosclerosis , Coronary Artery Disease , Animals , Male , Mice , Alleles , Atherosclerosis/genetics , Coronary Artery Disease/genetics , Coronary Artery Disease/pathology , Genome-Wide Association Study , Protein Binding
6.
bioRxiv ; 2023 Jan 27.
Article in English | MEDLINE | ID: mdl-36747745

ABSTRACT

Platelet derived growth factor (PDGF) signaling has been extensively studied in the context of vascular disease, but the genetics of this pathway remain to be established. Genome wide association studies (GWAS) for coronary artery disease (CAD) have identified a risk locus at 11q22.3, and we have verified with fine mapping approaches that the regulatory variant rs2019090 and PDGFD represent the functional variant and putative functional gene. Further, FOXC1/C2 transcription factor (TF) binding at rs2019090 was found to promote PDGFD transcription through the CAD promoting allele. Employing a constitutive Pdgfd knockout allele along with SMC lineage tracing in a male atherosclerosis mouse model we mapped single cell transcriptomic, cell state, and lesion anatomical changes associated with gene loss. These studies revealed that Pdgfd promotes expansion, migration, and transition of SMC lineage cells to the chondromyocyte phenotype and vascular calcification. This is in contrast to protective CAD genes TCF21, ZEB2, and SMAD3 which we have shown to promote the fibroblast-like cell transition or perturb the pattern or extent of transition to the chondromyocyte phenotype. Further, Pdgfd expressing fibroblasts and pericytes exhibited greater expression of chemokines and leukocyte adhesion molecules, consistent with observed increased macrophage recruitment to the plaque. Despite these changes there was no effect of Pdgfd deletion on SMC contribution to the fibrous cap or overall lesion burden. These findings suggest that PDGFD mediates CAD risk through promoting SMC expansion and migration, in conjunction with deleterious phenotypic changes, and through promoting an inflammatory response that is primarily focused in the adventitia where it contributes to leukocyte trafficking to the diseased vessel wall.

7.
Chem Sci ; 14(7): 1919, 2023 Feb 15.
Article in English | MEDLINE | ID: mdl-36812102

ABSTRACT

[This corrects the article DOI: 10.1039/D2SC06400F.].

8.
Chem Sci ; 13(25): 7604-7609, 2022 Jun 29.
Article in English | MEDLINE | ID: mdl-35872813

ABSTRACT

A copper-catalyzed three-component carboboration of acetylene with B2Pin2 and Michael acceptors is reported. In this reaction, a cheap and abundant C2 chemical feedstock, acetylene, was used as a starting material to afford cis-alkenyl boronates bearing a homoallylic carbonyl group. The reaction was robust and could be reliably performed on the molar scale. Furthermore, the resulting cis-alkenyl boronates could be converted to diverse functionalized molecules with ease.

9.
Nat Commun ; 13(1): 2412, 2022 05 03.
Article in English | MEDLINE | ID: mdl-35504872

ABSTRACT

Human neurodegenerative disorders often exhibit similar pathologies, suggesting a shared aetiology. Key pathological features of Parkinson's disease (PD) are also observed in other neurodegenerative diseases. Pantothenate Kinase-Associated Neurodegeneration (PKAN) is caused by mutations in the human PANK2 gene, which catalyzes the initial step of de novo CoA synthesis. Here, we show that fumble (fbl), the human PANK2 homolog in Drosophila, interacts with PINK1 genetically. fbl and PINK1 mutants display similar mitochondrial abnormalities, and overexpression of mitochondrial Fbl rescues PINK1 loss-of-function (LOF) defects. Dietary vitamin B5 derivatives effectively rescue CoA/acetyl-CoA levels and mitochondrial function, reversing the PINK1 deficiency phenotype. Mechanistically, Fbl regulates Ref(2)P (p62/SQSTM1 homolog) by acetylation to promote mitophagy, whereas PINK1 regulates fbl translation by anchoring mRNA molecules to the outer mitochondrial membrane. In conclusion, Fbl (or PANK2) acts downstream of PINK1, regulating CoA/acetyl-CoA metabolism to promote mitophagy, uncovering a potential therapeutic intervention strategy in PD treatment.


Subjects
Drosophila Proteins , Neurodegenerative Diseases , Parkinson Disease , Acetyl Coenzyme A/metabolism , Animals , Drosophila/genetics , Drosophila Proteins/genetics , Drosophila Proteins/metabolism , Mitochondria/metabolism , Neurodegenerative Diseases/metabolism , Parkinson Disease/metabolism , Phosphotransferases (Alcohol Group Acceptor)/genetics , Phosphotransferases (Alcohol Group Acceptor)/metabolism , Protein Kinases/genetics , Protein Kinases/metabolism , Protein Serine-Threonine Kinases
10.
Chem Commun (Camb) ; 58(32): 4969-4972, 2022 Apr 19.
Article in English | MEDLINE | ID: mdl-35353104

ABSTRACT

The highly efficient copper-catalyzed homo-dimerization and cross-coupling of propargyl esters have been developed. Various 1-en-3,5-diynes, [5]cumulenes and 1,3-diynes were successfully furnished via the copper-allenylidene intermediates with moderate to excellent yields. Migratory insertion is proposed as the key step to achieve the selectivity at the carbene carbon of the copper-allenylidene.

11.
Proc Natl Acad Sci U S A ; 118(52), 2021 12 28.
Article in English | MEDLINE | ID: mdl-34887342

ABSTRACT

The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single-sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold's purely in silico prediction not only is close to experimentally guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5' and 3' untranslated regions (UTRs) (∼29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies undiscovered conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, small interfering RNAs (siRNAs), CRISPR-Cas13 guide RNAs, and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies and will be a useful tool in fighting the current and future pandemics.


Subjects
Algorithms , Viral RNA/chemistry , SARS-CoV-2/chemistry , Betacoronavirus/chemistry , Betacoronavirus/genetics , Conserved Sequence , Viral Genome , Mutation , Nucleic Acid Conformation , RNA Folding , Viral RNA/genetics , SARS-CoV-2/genetics , Sequence Alignment
12.
bioRxiv ; 2021 Nov 15.
Article in English | MEDLINE | ID: mdl-34816262

ABSTRACT

The constant emergence of COVID-19 variants reduces the effectiveness of existing vaccines and test kits. Therefore, it is critical to identify conserved structures in SARS-CoV-2 genomes as potential targets for variant-proof diagnostics and therapeutics. However, the algorithms to predict these conserved structures, which simultaneously fold and align multiple RNA homologs, scale at best cubically with sequence length, and are thus infeasible for coronaviruses, which possess the longest genomes (∼30,000 nt) among RNA viruses. As a result, existing efforts on modeling SARS-CoV-2 structures resort to single sequence folding as well as local folding methods with short window sizes, which inevitably neglect long-range interactions that are crucial in RNA functions. Here we present LinearTurboFold, an efficient algorithm for folding RNA homologs that scales linearly with sequence length, enabling unprecedented global structural analysis on SARS-CoV-2. Surprisingly, on a group of SARS-CoV-2 and SARS-related genomes, LinearTurboFold's purely in silico prediction not only is close to experimentally-guided models for local structures, but also goes far beyond them by capturing the end-to-end pairs between 5' and 3' UTRs (∼29,800 nt apart) that match perfectly with a purely experimental work. Furthermore, LinearTurboFold identifies novel conserved structures and conserved accessible regions as potential targets for designing efficient and mutation-insensitive small-molecule drugs, antisense oligonucleotides, siRNAs, CRISPR-Cas13 guide RNAs and RT-PCR primers. LinearTurboFold is a general technique that can also be applied to other RNA viruses and full-length genome studies, and will be a useful tool in fighting the current and future pandemics. SIGNIFICANCE STATEMENT: Conserved RNA structures are critical for designing diagnostic and therapeutic tools for many diseases including COVID-19. However, existing algorithms are much too slow to model the global structures of full-length RNA viral genomes. We present LinearTurboFold, a linear-time algorithm that is orders of magnitude faster, making it the first method to simultaneously fold and align whole genomes of SARS-CoV-2 variants, the longest known RNA virus (∼30 kilobases). Our work enables unprecedented global structural analysis and captures long-range interactions that are out of reach for existing algorithms but crucial for RNA functions. LinearTurboFold is a general technique for full-length genome studies and can help fight the current and future pandemics.

13.
NPJ Precis Oncol ; 5(1): 90, 2021 Oct 08.
Article in English | MEDLINE | ID: mdl-34625644

ABSTRACT

Non-small cell lung cancer (NSCLC) metastatic to the brain leptomeninges is rapidly fatal, cannot be biopsied, and cancer cells in the cerebrospinal fluid (CSF) are few; therefore, available tissue samples to develop effective treatments are severely limited. This study aimed to converge single-cell RNA-seq and cell-free RNA (cfRNA) analyses to both diagnose NSCLC leptomeningeal metastases (LM), and to use gene expression profiles to understand progression mechanisms of NSCLC in the brain leptomeninges. NSCLC patients with suspected LM underwent withdrawal of CSF via lumbar puncture. Four cytology-positive CSF samples underwent single-cell capture (n = 197 cells) by microfluidic chip. Using robust principal component analyses, NSCLC LM cell gene expression was compared to immune cells. Massively parallel qPCR (9216 simultaneous reactions) on human CSF cfRNA samples compared the relative gene expression of patients with NSCLC LM (n = 14) to non-tumor controls (n = 7). The NSCLC-associated gene, CEACAM6, underwent in vitro validation in NSCLC cell lines for involvement in pathologic behaviors characteristic of LM. NSCLC LM gene expression revealed by single-cell RNA-seq was also reflected in CSF cfRNA of cytology-positive patients. Tumor-associated cfRNA (e.g., CEACAM6, MUC1) was present in NSCLC LM patients' CSF, but not in controls (CEACAM6 detection sensitivity 88.24% and specificity 100%). Cell migration in NSCLC cell lines was directly proportional to CEACAM6 expression, suggesting a role in disease progression. NSCLC-associated cfRNA is detectable in the CSF of patients with LM, and corresponds to the gene expression profile of NSCLC LM cells. CEACAM6 contributes significantly to NSCLC migration, a hallmark of LM pathophysiology.

14.
BMC Med Inform Decis Mak ; 21(1): 258, 2021 09 06.
Article in English | MEDLINE | ID: mdl-34488734

ABSTRACT

BACKGROUND: Biomedical language translation requires multi-lingual fluency as well as relevant domain knowledge. Such requirements make it challenging to train qualified translators and costly to generate high-quality translations. Machine translation represents an effective alternative, but accurate machine translation requires large amounts of in-domain data. While such datasets are abundant in general domains, they are less accessible in the biomedical domain. Chinese and English are two of the most widely spoken languages, yet to our knowledge, a parallel corpus does not exist for this language pair in the biomedical domain. DESCRIPTION: We developed an effective pipeline to acquire and process an English-Chinese parallel corpus from the New England Journal of Medicine (NEJM). This corpus consists of about 100,000 sentence pairs and 3,000,000 tokens on each side. We showed that training on out-of-domain data and fine-tuning with as few as 4000 NEJM sentence pairs improve translation quality by 25.3 (13.4) BLEU for en→zh (zh→en) directions. Translation quality continues to improve at a slower pace on larger in-domain data subsets, with a total increase of 33.0 (24.3) BLEU for en→zh (zh→en) directions on the full dataset. CONCLUSIONS: The code and data are available at https://github.com/boxiangliu/ParaMed.


Subjects
Language , Natural Language Processing , China , Humans , Translating
15.
Front Aging Neurosci ; 13: 650103, 2021.
Article in English | MEDLINE | ID: mdl-33776747

ABSTRACT

Alzheimer's disease (AD) is a neurodegenerative disorder characterized by memory impairment, for which there is no effective therapy. Stem cell transplantation shows great potential in the therapy of various diseases. However, the application of stem cell therapy in neurological disorders, especially those with a long-term disease course such as AD, is limited by the delivery approach due to the presence of the blood-brain barrier. So far, the most commonly used delivery approaches in the therapy of neurological disorders with stem cells in preclinical and clinical studies are intracranial injection and intrathecal injection, both of which are invasive. In the present study, we use repetitive intranasal delivery of human neural stem cells (hNSCs) to the brains of APP/PS1 transgenic mice to investigate the effect of hNSCs on the pathology of AD. The results indicate that the intranasally transplanted hNSCs survive and exhibit extensive migration and higher neuronal differentiation, with a relatively limited glial differentiation. A proportion of intranasally transplanted hNSCs differentiate to cholinergic neurons, which rescue cholinergic dysfunction in APP/PS1 mice. In addition, intranasal transplantation of hNSCs attenuates β-amyloid accumulation by upregulating the expression of β-amyloid degrading enzymes, insulin-degrading enzymes, and neprilysin. Moreover, intranasal transplantation of hNSCs ameliorates other AD-like pathology including neuroinflammation, cholinergic dysfunction, and pericytic and synaptic loss, while enhancing adult hippocampal neurogenesis, eventually rescuing the cognitive deficits of APP/PS1 transgenic mice. Thus, our findings highlight that intranasal transplantation of hNSCs benefits cognition through multiple mechanisms, and demonstrate the great potential of intranasal administration of stem cells as a non-invasive therapeutic strategy for AD.

16.
Genome Biol ; 22(1): 49, 2021 01 26.
Article in English | MEDLINE | ID: mdl-33499903

ABSTRACT

The resources generated by the GTEx consortium offer unprecedented opportunities to advance our understanding of the biology of human diseases. Here, we present an in-depth examination of the phenotypic consequences of transcriptome regulation and a blueprint for the functional interpretation of genome-wide association study-discovered loci. Across a broad set of complex traits and diseases, we demonstrate widespread dose-dependent effects of RNA expression and splicing. We develop a data-driven framework to benchmark methods that prioritize causal genes and find no single approach outperforms the combination of multiple approaches. Using colocalization and association approaches that take into account the observed allelic heterogeneity of gene expression, we propose potential target genes for 47% (2519 out of 5385) of the GWAS loci examined.


Subjects
Gene Expression , Genetic Predisposition to Disease/genetics , Genome-Wide Association Study/methods , Genotype , Genes , Humans , Multifactorial Inheritance , Transcriptome
17.
Front Genet ; 12: 785290, 2021.
Article in English | MEDLINE | ID: mdl-35154244

ABSTRACT

Human and animal tissues consist of heterogeneous cell types that organize and interact in highly structured manners. Bulk and single-cell sequencing technologies remove cells from their original microenvironments, resulting in a loss of spatial information. Spatial transcriptomics is a recent technological innovation that measures transcriptomic information while preserving spatial information. Spatial transcriptomic data can be generated in several ways. RNA molecules are measured by in situ sequencing, in situ hybridization, or spatial barcoding to recover original spatial coordinates. The inclusion of spatial information expands the range of possibilities for analysis and visualization, and has spurred the development of numerous novel methods. In this review, we summarize the core concepts of spatial genomics technology and provide a comprehensive review of current analysis and visualization methods for spatial transcriptomics.

18.
Front Artif Intell ; 4: 732381, 2021.
Article in English | MEDLINE | ID: mdl-34988434

ABSTRACT

Recently, several studies have reported promising results with BERT-like methods on acronym tasks. In this study, we find that an older rule-based program, Ab3P, not only performs better, but that error analysis also suggests why. There is a well-known spelling convention in acronyms where each letter in the short form (SF) refers to "salient" letters in the long form (LF). The error analysis uses decision trees and logistic regression to show that there is an opportunity for many pre-trained models (BERT, T5, BioBert, BART, ERNIE) to take advantage of this spelling convention.
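
Note on the spelling convention described in the abstract of record 18: a much-simplified way to test whether a short form could be spelled from its long form is to require that the two share a first letter and that the short-form letters appear in the long form in the same order. The Python sketch below illustrates only that idea. It is an assumption made for illustration, not the Ab3P algorithm, and the function name is hypothetical.

```python
def follows_spelling_convention(short_form: str, long_form: str) -> bool:
    """Rough check of the acronym spelling convention (illustrative only).

    Returns True when the short form and long form start with the same
    letter and every short-form character occurs in the long form in the
    same left-to-right order. Case and punctuation in the short form
    are ignored.
    """
    sf = [c.lower() for c in short_form if c.isalnum()]
    lf = long_form.lower()
    if not sf or not lf or sf[0] != lf[0]:
        return False

    pos = 0
    for ch in sf:
        pos = lf.find(ch, pos)  # next occurrence at or after `pos`
        if pos == -1:
            return False
        pos += 1  # continue searching after the matched character
    return True


if __name__ == "__main__":
    for sf, lf in [("SF", "short form"), ("LF", "long form"),
                   ("BERT", "long form")]:
        print(sf, "|", lf, "|", follows_spelling_convention(sf, lf))
```

A check like this accepts any in-order match, so it is far more permissive than a rule set that weighs which letters are "salient" (for example, word-initial letters); it is meant only to make the convention concrete.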

19.
Nat Genet ; 52(11): 1158-1168, 2020 11.
Article in English | MEDLINE | ID: mdl-33106633

ABSTRACT

Genome-wide association studies of neurological diseases have identified thousands of variants associated with disease phenotypes. However, most of these variants do not alter coding sequences, making it difficult to assign their function. Here, we present a multi-omic epigenetic atlas of the adult human brain through profiling of single-cell chromatin accessibility landscapes and three-dimensional chromatin interactions of diverse adult brain regions across a cohort of cognitively healthy individuals. We developed a machine-learning classifier to integrate this multi-omic framework and predict dozens of functional SNPs for Alzheimer's and Parkinson's diseases, nominating target genes and cell types for previously orphaned loci from genome-wide association studies. Moreover, we dissected the complex inverted haplotype of the MAPT (encoding tau) Parkinson's disease risk locus, identifying putative ectopic regulatory interactions in neurons that may mediate this disease association. This work expands understanding of inherited variation and provides a roadmap for the epigenomic dissection of causal regulatory variation in disease.


Subjects
Alzheimer Disease/genetics , Brain/anatomy & histology , Neurons/physiology , Parkinson Disease/genetics , Adult , Atlases as Topic , Population Biological Variation , Chromatin Assembly and Disassembly , Cohort Studies , Genetic Enhancer Elements , Epigenomics , Genetic Heterogeneity , Genetic Predisposition to Disease , Genome-Wide Association Study , Haplotypes , Humans , Machine Learning , Single Nucleotide Polymorphism , Genetic Promoter Regions , tau Proteins/genetics
20.
J Med Internet Res ; 22(10): e22299, 2020 10 02.
Article in English | MEDLINE | ID: mdl-32931441

ABSTRACT

BACKGROUND: COVID-19 became a global pandemic not long after its identification in late 2019. The genomes of SARS-CoV-2 are being rapidly sequenced and shared on public repositories. To keep up with these updates, scientists need to frequently refresh and reclean data sets, which is an ad hoc and labor-intensive process. Further, scientists with limited bioinformatics or programming knowledge may find it difficult to analyze SARS-CoV-2 genomes. OBJECTIVE: To address these challenges, we developed CoV-Seq, an integrated web server that enables simple and rapid analysis of SARS-CoV-2 genomes. METHODS: CoV-Seq is implemented in Python and JavaScript. The web server and source code URLs are provided in this article. RESULTS: Given a new sequence, CoV-Seq automatically predicts gene boundaries and identifies genetic variants, which are displayed in an interactive genome visualizer and are downloadable for further analysis. A command-line interface is available for high-throughput processing. In addition, we aggregated all publicly available SARS-CoV-2 sequences from the Global Initiative on Sharing Avian Influenza Data (GISAID), National Center for Biotechnology Information (NCBI), European Nucleotide Archive (ENA), and China National GeneBank (CNGB), and extracted genetic variants from these sequences for download and downstream analysis. The CoV-Seq database is updated weekly. CONCLUSIONS: We have developed CoV-Seq, an integrated web service for fast and easy analysis of custom SARS-CoV-2 sequences. The web server provides an interactive module for the analysis of custom sequences and a weekly updated database of genetic variants of all publicly accessible SARS-CoV-2 sequences. We believe CoV-Seq will help improve our understanding of the genetic underpinnings of COVID-19.


Subjects
Betacoronavirus/genetics , Coronavirus Infections/virology , Data Visualization , Genetic Databases , Viral Genome/genetics , Viral Pneumonia/virology , Software , COVID-19 , Computational Biology , Coronavirus Infections/epidemiology , Humans , Pandemics , Viral Pneumonia/epidemiology , SARS-CoV-2