1.
Microbiol Spectr ; 12(8): e0361523, 2024 Aug 06.
Article in English | MEDLINE | ID: mdl-38904371

ABSTRACT

This study analyzed the characteristics of Mycoplasma pneumoniae and its macrolide antibiotic resistance through whole-genome sequencing and comparative genomics. Thirteen clinical strains isolated from 2003 to 2019 were selected, 10 of which were resistant to erythromycin (MIC >64 µg/mL), including 8 P1-type I and 2 P1-type II. Three were sensitive (<1 µg/mL) and P1-type II. One resistant strain had an A→G point mutation at position 2064 in region V of the 23S rRNA; the other resistant strains had it at position 2063, while the three sensitive strains had no mutation at these positions. Genome assembly and comparative genome analysis revealed a high level of genome consistency within each P1 type, and the primary differences in genome sequences were concentrated in the region encoding the P1 protein. In P1-type II strains, three specific gene mutations were identified: C162A and A430G in the L4 gene and T1112G in the CARDS gene. Clinical information showed that seven cases were diagnosed with severe pneumonia, all of which were infected with drug-resistant strains. Notably, BS610A4 and CYM219A1 exhibited a gene multi-copy phenomenon and shared a conserved functional domain with the DUF31 protein family. Clinically, the patients had severe refractory pneumonia with pleural effusion, necessitating treatment with glucocorticoids and bronchoalveolar lavage. The primary variations between strains occur among different P1-types, while there is a high level of genomic consistency within P1-types. Three mutation loci associated with specific types were identified, and no specific genetic alterations directly related to clinical presentation were observed. IMPORTANCE Mycoplasma pneumoniae is an important pathogen of community-acquired pneumonia, and macrolide resistance complicates clinical treatment. We analyzed the characteristics of M. pneumoniae as well as macrolide antibiotic resistance through whole-genome sequencing and comparative genomics.
The work addressed primary variations between strains that occur among different P1-types, while there is a high level of genomic consistency within P1-types. In P1-type II strains, three specific gene mutations were identified: C162A and A430G in L4 gene and T1112G mutation in the CARDS gene. All the strains isolated from severe pneumonia cases were drug-resistant, two of which exhibited a gene multi-copy phenomenon, sharing a conserved functional domain with the DUF31 protein family. Three mutation loci associated with specific types were identified, and no specific genetic alterations directly related to clinical presentation were observed.


Subject(s)
Anti-Bacterial Agents , Drug Resistance, Bacterial , Genome, Bacterial , Microbial Sensitivity Tests , Mycoplasma pneumoniae , Pneumonia, Mycoplasma , Mycoplasma pneumoniae/genetics , Mycoplasma pneumoniae/drug effects , Mycoplasma pneumoniae/classification , Mycoplasma pneumoniae/isolation & purification , Humans , Anti-Bacterial Agents/pharmacology , Genome, Bacterial/genetics , Pneumonia, Mycoplasma/microbiology , Pneumonia, Mycoplasma/drug therapy , Drug Resistance, Bacterial/genetics , Male , Female , Whole Genome Sequencing , Middle Aged , Macrolides/pharmacology , Adult , Mutation , RNA, Ribosomal, 23S/genetics , Genomics , Aged , Erythromycin/pharmacology
2.
Chem Asian J ; : e202400626, 2024 Jun 25.
Article in English | MEDLINE | ID: mdl-38924352

ABSTRACT

This study explores the synthesis, structural characterization, and host-guest interactions of heteroatom-bridged nanobelts, focusing on a cyclothianthrene nanobelt and a fused nanobelt incorporating thianthrene and phenoxathiin. Using a cyclization-followed-by-bridging synthetic approach, both molecular belts were successfully synthesized, and their structures were confirmed through NMR and MALDI-TOF-MS analysis. Crystallographic studies revealed that the cyclothianthrene nanobelt adopts an octagonal column-like conformation, while the hybrid belt forms an oval, tub-like shape, both exhibiting distinct assembly motifs. The host-guest chemistry of these nanobelts was investigated with fullerenes (C60, C70, and PC61BM). The cyclothianthrene belt showed no interaction with these fullerenes, whereas the hybrid belt demonstrated adaptive binding capabilities, forming stable complexes with C60 and C70 through π-π interactions and C-H⋅⋅⋅S hydrogen bonds. The binding constants indicated that the hybrid belt has a stronger affinity for C70 due to better size complementarity. Additionally, its interaction with PC61BM showed a specific 1:1 binding mode despite a smaller binding constant. This study underscores the impact of heteroatom incorporation on the structural and functional properties of nanobelts, offering insights for future molecular design strategies.

3.
Chem Commun (Camb) ; 60(50): 6387-6390, 2024 Jun 18.
Article in English | MEDLINE | ID: mdl-38831735

ABSTRACT

A molecular belt incorporating naphthalene moieties, featuring an ellipsoidal cavity, was precisely engineered through bottom-up synthesis. Its pre-arranged geometry exhibits excellent complementarity to fullerene C70, resulting in remarkable selective binding ability for C70 (K = 1.3 × 10⁶ M⁻¹) compared to C60 (K = 176 M⁻¹), forming a 1:1 complex. This selectivity was confirmed by the single-crystal structure of the complex, which revealed outstanding concave-convex shape complementarity between the two components. This highlights the potential application of molecular belts in the purification and separation of fullerenes.
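
The selectivity implied by the two binding constants can be checked with a short calculation. The following is a minimal Python sketch, not part of the cited work: it takes the ratio of the two reported constants and converts it to a free-energy difference via ΔG = -RT ln K. The temperature of 298.15 K is an assumption, since the abstract does not state the measurement conditions, and the function names are illustrative.

```python
import math

# Binding constants as reported in the abstract (units: M^-1).
K_C70 = 1.3e6
K_C60 = 176.0

# Assumption: room temperature; the abstract does not give the temperature.
TEMPERATURE_K = 298.15
GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def selectivity(k_preferred: float, k_other: float) -> float:
    """Ratio of association constants for two guests binding the same host."""
    return k_preferred / k_other


def delta_delta_g_kj(k_preferred: float, k_other: float, temperature_k: float) -> float:
    """Difference in binding free energy in kJ/mol, from delta G = -RT ln K."""
    return -GAS_CONSTANT * temperature_k * math.log(k_preferred / k_other) / 1000.0


ratio = selectivity(K_C70, K_C60)
ddg = delta_delta_g_kj(K_C70, K_C60, TEMPERATURE_K)
print(f"C70/C60 selectivity: {ratio:.0f}")
print(f"Free-energy difference at {TEMPERATURE_K} K: {ddg:.1f} kJ/mol")
```

Under these assumptions the ratio is roughly 7,400-fold in favor of C70, which corresponds to about 22 kJ/mol of additional binding free energy.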

4.
Angew Chem Int Ed Engl ; : e202407575, 2024 Jun 20.
Article in English | MEDLINE | ID: mdl-38899382

ABSTRACT

Crown ethers (CEs), known for their exceptional host-guest complexation, offer potential as linkers in covalent organic frameworks (COFs) for enhanced performance in catalysis and host-guest binding. However, their highly flexible conformation and low symmetry limit the diversity of CE-derived COFs. Here, we introduce a novel C3-symmetrical azacrown ether (ACE) building block, tris(pyrido)[18]crown-6 (TPy18C6), for COF fabrication (ACE-COF-1 and ACE-COF-2) via reticular synthesis. This approach enables precise integration of CEs into COFs, enhancing Ni²⁺ ion immobilization while maintaining crystallinity. The resulting Ni²⁺-doped COFs (Ni@ACE-COF-1 and Ni@ACE-COF-2) exhibit high discharge capacity (up to 1.27 mAh⋅cm⁻² at 8 mA⋅cm⁻²) and exceptional cycling stability (>1000 cycles) as cathode materials in aqueous alkaline nickel-zinc batteries. This study exemplifies the integration of macrocyclic chemistry and reticular chemistry, laying the groundwork for extending the macrocyclic-synthon-driven strategy to a diverse array of COF building blocks and ultimately yielding advanced materials tailored for specific applications.

5.
Front Plant Sci ; 15: 1371222, 2024.
Article in English | MEDLINE | ID: mdl-38567138

ABSTRACT

Pan-genome studies capture the full genomic diversity of a species and are therefore important for understanding plant evolution and guiding crop breeding. Three short-read-based strategies for plant pan-genome construction are iterative individual, iterative pooling, and map-to-pan. Their performance differs markedly under various conditions, but a comprehensive evaluation has yet to be conducted. Here, we evaluate the performance of these three pan-genome construction strategies for plants under different sequencing depths and sample sizes. We also assess the influence of the length and repeat content of novel sequences on the three strategies, and we compare their computational resource consumption. Our findings indicate that map-to-pan has the greatest recall but the lowest precision. In contrast, both iterative strategies have superior precision but lower recall. Sample number, novel sequence length, and the percentage of repeat content in novel sequences adversely affect the performance of all three strategies. Increased sequencing depth improves map-to-pan's performance but does not affect the two iterative strategies. Map-to-pan demands considerably more computational resources than the two iterative strategies. Overall, the iterative strategies, especially iterative pooling, are optimal when the sequencing depth is less than 20X. Map-to-pan is preferable when the sequencing depth exceeds 20X despite its higher computational resource consumption.
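
The trade-off between recall and precision described in this abstract can be illustrated with a toy calculation. The sketch below is only an illustration in Python, not the evaluation pipeline of the cited study: the sequence identifiers and the two example runs are hypothetical, and matching novel sequences by identifier is a simplifying assumption (a real evaluation would compare sequences by alignment).

```python
def precision_recall(predicted: set[str], truth: set[str]) -> tuple[float, float]:
    """Return (precision, recall) of a predicted set against a truth set."""
    true_positives = len(predicted & truth)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Hypothetical toy data: novel sequences truly absent from the reference,
# and the sequences reported by two imaginary construction runs.
truth = {"seq1", "seq2", "seq3", "seq4"}
permissive_run = {"seq1", "seq2", "seq3", "seq4", "seq5", "seq6"}  # reports many
conservative_run = {"seq1", "seq2"}  # reports few

permissive = precision_recall(permissive_run, truth)
conservative = precision_recall(conservative_run, truth)
print("permissive   (precision, recall):", permissive)
print("conservative (precision, recall):", conservative)
```

In this toy case the permissive run recovers every true sequence at the cost of false positives, while the conservative run reports only correct sequences but misses half of them, which mirrors the qualitative pattern reported for map-to-pan versus the iterative strategies.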

6.
Nucleic Acids Res ; 52(D1): D1651-D1660, 2024 Jan 05.
Article in English | MEDLINE | ID: mdl-37843152

ABSTRACT

Tropical crops are vital for tropical agriculture, with resource scarcity, functional diversity and extensive market demand, providing considerable economic benefits for the world's tropical agriculture-producing countries. The rapid development of sequencing technology has driven major advances in tropical crop research and generated massive amounts of data, which urgently require an effective platform for data integration and sharing. However, existing databases cannot fully satisfy researchers' requirements because of their relatively limited level of integration and infrequent updates. Here, we present the Tropical Crop Omics Database (TCOD, https://ngdc.cncb.ac.cn/tcod), a comprehensive multi-omics data platform for tropical crops. TCOD integrates diverse omics data from 15 species, encompassing 34 chromosome-level de novo assemblies, 1 255 004 genes with functional annotations, 282 436 992 unique variants from 2048 WGS samples, 88 transcriptomic profiles from 1997 RNA-Seq samples and 13 381 germplasm items. Additionally, TCOD not only employs genes as a bridge to interconnect multi-omics data, enabling cross-species comparisons based on homology relationships, but also offers user-friendly online tools for efficient data mining and visualization. In short, TCOD integrates multi-species, multi-omics data and online tools, which will facilitate research on genomic selective breeding and trait biology of tropical crops.


Subject(s)
Crops, Agricultural , Databases, Genetic , Crops, Agricultural/genetics , Transcriptome , Genome, Plant
7.
Article in English | MEDLINE | ID: mdl-37595788

ABSTRACT

Since its initial release in 2001, the human reference genome has undergone continuous improvement in quality, and the recently released telomere-to-telomere (T2T) version, T2T-CHM13, reaches the highest level of continuity and accuracy after 20 years of effort by working on a simplified, nearly homozygous genome of a hydatidiform mole cell line. Here, to provide an authentic complete diploid human genome reference for the Han Chinese, the largest population in the world, we assembled the genome of a male Han Chinese individual, T2T-YAO, which includes T2T assemblies of all the 22 + X + M and 22 + Y chromosomes in both haplotypes. The quality of T2T-YAO is much better than that of all currently available diploid assemblies, and its haploid version, T2T-YAO-hp, generated by selecting the better assembly for each autosome, reaches the top quality of fewer than one error per 29.5 Mb, even higher than that of T2T-CHM13. Derived from an individual living in the aboriginal region of the Han population, T2T-YAO shows clear ancestry and potential genetic continuity from ancient ancestors. Each haplotype of T2T-YAO possesses ∼330 Mb of exclusive sequences, ∼3100 unique genes, and tens of thousands of nucleotide and structural variations compared with CHM13, highlighting the necessity of a population-stratified reference genome. The construction of T2T-YAO, a truly accurate and authentic representative of the Chinese population, would enable precise delineation of genomic variations and advance our understanding of the heritability of diseases and phenotypes, especially within the context of the unique variations of the Chinese population.

8.
Microbiol Spectr ; 11(1): e0342622, 2023 02 14.
Article in English | MEDLINE | ID: mdl-36622170

ABSTRACT

SARS-CoV-2 has infected more than 600 million people. However, the origin of the virus is still unclear; knowing where the virus came from could help us prevent future zoonotic epidemics. Sequencing data, particularly metagenomic data, can profile the genomes of all species in the sample, including those not recognized at the time, thus allowing for the identification of the progenitor of SARS-CoV-2 in samples collected before the pandemic. We analyzed the data from 5,196 SARS-CoV-2-positive sequencing runs in the NCBI's SRA database with collection dates prior to 2020 or unknown. We found that the mutation patterns obtained from these suspicious SARS-CoV-2 reads did not match the genome characteristics of an unknown progenitor of the virus, suggesting that they may derive from circulating SARS-CoV-2 variants or other coronaviruses. Despite a negative result for tracking the progenitor of SARS-CoV-2, the methods developed in the study could assist in pinpointing the origin of various pathogens in the future. IMPORTANCE Sequences that are homologous to the SARS-CoV-2 genome were found in numerous sequencing runs that were not associated with the SARS-CoV-2 studies in the public database. It is unclear whether they are derived from the possible progenitor of SARS-CoV-2 or contamination of more recent SARS-CoV-2 variants circulated in the population due to the lack of information on the collection, library preparation, and sequencing processes. We have developed a computational framework to infer the evolutionary relationship between sequences based on the comparison of mutations, which enabled us to rule out the possibility that these suspicious sequences originate from unknown progenitors of SARS-CoV-2.


Subject(s)
COVID-19 , SARS-CoV-2 , Humans , SARS-CoV-2/genetics , Metagenomics , Mutation , Genome, Viral
9.
Nucleic Acids Res ; 51(D1): D994-D1002, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36318261

ABSTRACT

Homology is fundamental for inferring the evolutionary processes of genes and their relationships through shared ancestry. Existing homologous gene resources vary in inference methods, homologous relationships and identifiers, making it difficult to choose among them and to map homology results from one to another. Here, we present HGD (Homologous Gene Database, https://ngdc.cncb.ac.cn/hgd), a comprehensive homolog resource integrating multi-species, multi-resource and multi-omics data, as a complement to existing resources providing a public, one-stop data service. Currently, HGD houses a total of 112 383 644 homologous pairs for 37 species, including 19 animals, 16 plants and 2 microorganisms. HGD also integrates various annotations from public resources, including 16 909 homologs with traits, 276 670 homologs with variants, 398 573 homologs with expression and 536 852 homologs with gene ontology (GO) annotations. HGD provides a wide range of omics gene function annotations to help users gain a deeper understanding of gene function.


Subject(s)
Databases, Genetic , Animals , Molecular Sequence Annotation
10.
Brief Bioinform ; 23(5)2022 09 20.
Article in English | MEDLINE | ID: mdl-36088550

ABSTRACT

Somatic variants act as critical players during cancer occurrence and development. Thus, an accurate and robust method to identify them is the foundation of cutting-edge cancer genome research. However, due to the low accessibility and high individual-/sample-specificity of somatic variants in tumor samples, their detection is still fraught with challenges, particularly when paired normal samples are not available as controls. To address this issue, we developed a tumor-only somatic and germline variant identification method (TSomVar) using the random forest algorithm established on sample-specific variant datasets derived from genotype imputation, reads-mapping level annotation and functional annotation. We trained TSomVar using genomic variant datasets of three major cancer types: colorectal cancer, hepatocellular carcinoma and skin cutaneous melanoma. Compared with existing tumor-only somatic variant identification tools, TSomVar shows excellent performance in somatic variant detection, with higher accuracy and better recall on test datasets from colorectal cancer and skin cutaneous melanoma. In addition, TSomVar can accurately identify germline variants in tumor samples. Taken together, TSomVar will facilitate somatic variant exploration in cancer research.


Subject(s)
Colorectal Neoplasms , Melanoma , Neoplasms , Skin Neoplasms , High-Throughput Nucleotide Sequencing/methods , Humans , Melanoma/genetics , Neoplasms/genetics , Skin Neoplasms/genetics , Melanoma, Cutaneous Malignant
11.
Genome Biol ; 23(1): 203, 2022 09 26.
Article in English | MEDLINE | ID: mdl-36163035

ABSTRACT

BACKGROUND: The laboratory mouse was domesticated from the wild house mouse. Understanding the genetics underlying domestication in laboratory mice, especially in the widely used classical inbred mice, is vital for studies using mouse models. However, the genetic mechanism of laboratory mouse domestication remains unknown due to lack of adequate genomic sequences of wild mice. RESULTS: We analyze the genetic relationships by whole-genome resequencing of 36 wild mice and 36 inbred strains. All classical inbred mice cluster together distinctly from wild and wild-derived inbred mice. Using nucleotide diversity analysis, Fst, and XP-CLR, we identify 339 positively selected genes that are closely associated with nervous system function. Approximately one third of these positively selected genes are highly expressed in brain tissues, and genetic mouse models of 125 genes in the positively selected genes exhibit abnormal behavioral or nervous system phenotypes. These positively selected genes show a higher ratio of differential expression between wild and classical inbred mice compared with all genes, especially in the hippocampus and frontal lobe. Using a mutant mouse model, we find that the SNP rs27900929 (T>C) in gene Astn2 significantly reduces the tameness of mice and modifies the ratio of the two Astn2 (a/b) isoforms. CONCLUSION: Our study indicates that classical inbred mice experienced high selection pressure during domestication under laboratory conditions. The analysis shows the positively selected genes are closely associated with behavior and the nervous system in mice. Tameness may be related to the Astn2 mutation and regulated by the ratio of the two Astn2 (a/b) isoforms.


Subject(s)
Domestication , Genome , Animals , Mice , Nucleotides , Phenotype , Selection, Genetic , Whole Genome Sequencing
12.
Front Genet ; 13: 956781, 2022.
Article in English | MEDLINE | ID: mdl-36035123

ABSTRACT

Due to the explosion of cancer genome data and the urgent need for cancer treatment, it is becoming increasingly important to analyze and annotate cancer genomes easily and in a timely manner. However, tumor heterogeneity is recognized as a serious barrier to annotating cancer genomes at the individual patient level. In addition, the interpretation and analysis of cancer multi-omics data rely heavily on existing database resources that are often located in different data centers or research institutions, which poses a huge challenge for data parsing. Here we present CCAS (Cancer genome Consensus Annotation System, https://ngdc.cncb.ac.cn/ccas/#/home), a one-stop and comprehensive annotation system for the individual patient at the multi-omics level. CCAS integrates 20 widely recognized resources in the field to support data annotation of 10 categories of cancers covering 395 subtypes. Data from each resource are manually curated and standardized using ontology frameworks. CCAS accepts data on single nucleotide variants/insertions or deletions, expression, copy number variation, and methylation level as input files to build a consensus annotation. Outputs are arranged as tables or figures and can be searched, sorted, and downloaded. Expandable panels keep the display concise, and most figures are interactive to show additional information. Moreover, CCAS offers multidimensional annotation information, including mutation signature patterns, gene set enrichment analysis, pathways and clinical trial related information. These are helpful for intuitively understanding the molecular mechanisms of tumors and discovering key functional genes.

13.
Genes (Basel) ; 13(6)2022 05 24.
Article in English | MEDLINE | ID: mdl-35741697

ABSTRACT

Endometrial carcinoma (EC), a common female reproductive system malignant tumor, affects thousands of people with high morbidity and mortality worldwide. This study was aimed at developing a prediction model for the diagnosis of EC in the general population. First, we obtained datasets GSE63678, GSE106191, and GSE115810 from the Gene Expression Omnibus (GEO) database, dataset GSE17025 from the GEO database, and the RNA sequence of EC from The Cancer Genome Atlas (TCGA) database to constitute the training, test, and validation groups, respectively. Subsequently, the 96 most significantly differentially expressed genes (DEGs) were identified and analyzed for function and pathway enrichment in the training group. Next, we acquired the disease-specific genes by random forest and established an artificial neural network for the diagnosis. Receiver operating characteristic (ROC) curves were utilized to identify the signature across the three groups. Finally, immune infiltration was analyzed to reveal tumor-immune microenvironment (TIME) alterations in EC. The top 96 DEGs (77 down-regulated and 19 up-regulated genes) were primarily enriched in the interleukin-17 signaling pathway, protein digestion and absorption, and transcriptional misregulation in cancer. Subsequently, 14 characterizing genes of EC were identified by random forest. In the training, test, and validation groups, the artificial neural network was constructed with high diagnostic accuracies of 0.882, 0.864, and 0.839, respectively, and areas under the ROC curve (AUCs) of 0.928, 0.921, and 0.782, respectively. Finally, resting and activated mast cells were found to have increased in TIME. We constructed an artificial diagnostic model with excellent reliability for EC and uncovered variations in the immunological ecosystem of EC through integrated bioinformatics approaches, which might be potential diagnostic targets for EC.
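
The abstract reports both diagnostic accuracy and area under the ROC curve (AUC), and the two do not move in lockstep. The following Python sketch shows how each metric is computed from classifier scores; it is an illustration under stated assumptions, not the model or data of the cited study. The labels and scores are hypothetical, the 0.5 decision threshold is an assumption, and AUC is computed with the pairwise (Mann-Whitney) formulation.

```python
def accuracy(labels: list[int], scores: list[float], threshold: float = 0.5) -> float:
    """Fraction of samples whose thresholded score matches the true label."""
    correct = sum((score >= threshold) == bool(label) for label, score in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels: list[int], scores: list[float]) -> float:
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = tumor, 0 = normal) and classifier output scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

acc = accuracy(labels, scores)
auc = roc_auc(labels, scores)
print(f"accuracy at threshold 0.5: {acc:.3f}")
print(f"ROC AUC: {auc:.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can score differently on the two measures for the same dataset.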


Subject(s)
Ecosystem , Endometrial Neoplasms , Endometrial Neoplasms/diagnosis , Endometrial Neoplasms/genetics , Endometrial Neoplasms/metabolism , Female , Humans , Machine Learning , Neural Networks, Computer , Reproducibility of Results , Tumor Microenvironment
14.
Chem Sci ; 13(21): 6291-6296, 2022 Jun 01.
Article in English | MEDLINE | ID: mdl-35733896

ABSTRACT

An unprecedented zirconium metal-organic framework featuring a T-shaped benzimidazole strut was constructed and employed as a sponge-like material for selective absorption of macrocyclic guests. The neutral benzimidazole domain of the as-synthesized framework can be readily protonated and fully converted to benzimidazolium. Mechanical threading of [24]crown-8 ether wheels onto recognition sites to form pseudorotaxanes was evidenced by solution nuclear magnetic resonance, solid-state fluorescence, and infrared spectroscopy. Selective absorption of [24]crown-8 ether rather than its dibenzo counterpart was also observed. Further study reveals that this binding process is reversible and acid-base switchable. The success of docking macrocyclic guests in crystals via host-guest interactions provides an alternative route to complex functional materials with interpenetrated structures.

15.
Chem Commun (Camb) ; 58(39): 5829-5832, 2022 May 12.
Article in English | MEDLINE | ID: mdl-35388851

ABSTRACT

A mechanically interlocked [3]rotaxane was newly designed, synthesized, and employed as a ligand for constructing metal-organic frameworks (MOFs). The nano-confinement by macrocycles forces the soft bis-isophthalate axle into a pseudo-rigid conformation and coordinates to zinc(II) ions, affording a two- or three-dimensional MOF under controlled conditions. The 2D MOF exhibits a neutral framework with a periodic puckering sheet structure, while an anionic framework with a pts topology was observed for the 3D MOF. The phase purity of both bulk materials was confirmed by powder X-ray diffraction. Thermogravimetric analysis reveals that both materials are stable up to 250 °C. The success of applying mechanical bonds to rigidify flexible ligands provides new insights for the design of metal-organic frameworks.

16.
Nucleic Acids Res ; 50(D1): D1147-D1155, 2022 01 07.
Article in English | MEDLINE | ID: mdl-34643725

ABSTRACT

With the proliferation of single-cell RNA sequencing (scRNA-seq) studies of human cancers, the cellular heterogeneity, immune landscape and pathogenesis of diverse cancers have been progressively uncovered. The rapid growth of cancer scRNA-seq datasets over the past decade has created a pressing need to integrate and process them for investigations of the tumor microenvironment in various cancer types. To fill this gap, we developed a database, the Cancer Single-cell Expression Map (CancerSCEM, https://ngdc.cncb.ac.cn/cancerscem), focusing on a variety of human cancers. To date, CancerSCEM version 1.0 consists of 208 cancer samples across 28 studies and 20 human cancer types. A series of uniform, multiscale analyses was performed for each sample, including accurate cell type annotation, functional gene expression, cell interaction network and survival analysis. In addition, CancerSCEM provides a user-friendly web interface for users to browse, search, analyze online and download all the metadata as well as analytical results. Importantly, the newly constructed comprehensive online analysis platform in CancerSCEM integrates seven analysis functions, with which investigators can interactively perform cancer scRNA-seq analyses. In all, CancerSCEM offers an informative and practical way to facilitate human cancer studies and also provides insights into clinical therapy assessments.


Subject(s)
Databases, Genetic , Neoplasms/genetics , Software , Gene Expression Regulation, Neoplastic/genetics , Humans , Neoplasms/classification , RNA-Seq , Single-Cell Analysis/standards , Tumor Microenvironment/genetics
17.
Biosaf Health ; 2021 Nov 09.
Article in English | MEDLINE | ID: mdl-34778742

ABSTRACT

By re-analyzing public metagenomic data from 101 patients infected with influenza A virus during the 2007-2012 H1N1 flu seasons in France, we identified 22 samples with SARS-CoV sequences. In 3 of them, the SARS genome sequence could be fully assembled. These sequences are highly similar (99.99% and 99.7%) to the artificially constructed recombinant SARS-CoV (SARSr-CoV) strains generated by the J. Craig Venter Institute in the USA. Moreover, samples from different flu seasons have different SARS-CoV strains, and the divergence between these strains cannot be explained by natural evolution. Our study also shows that retrospective studies using public metagenomic data from past major epidemic outbreaks serve as a genomic strategy for researching the origins or spread of infectious diseases.

18.
Yi Chuan ; 43(10): 988-993, 2021 Oct 20.
Article in English | MEDLINE | ID: mdl-34702711

ABSTRACT

The Genome Sequence Archive for Human (GSA-Human) is a data repository specialized for human genetic data derived from biomedical research, and it also supports the data collection and management of National Key Research and Development Projects. GSA-Human follows a data security management strategy in accordance with the national regulations on human genetic resources. It provides two models of data access: Open-access and Controlled-access. Open-access data are universally and freely accessible to researchers worldwide, while Controlled-access ensures that data are accessed only by authorized users with the permission of the Data Access Committee (DAC). As of July 2021, GSA-Human housed more than 5.27 PB of data from 750 datasets.

19.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34402866

ABSTRACT

Genotype imputation is a statistical method for estimating missing genotypes from a denser haplotype reference panel. Existing methods usually perform well on common variants, but they may not be ideal for low-frequency and rare variants. Previous studies showed that the population similarity between study and reference panels is one of the key factors influencing imputation accuracy. Here, we developed an imputation reference panel reconstruction method (RefRGim) using convolutional neural networks (CNNs), which can generate a study-specific reference panel for each input dataset based on the genetic similarity between individuals in the current study and those in the references. The CNNs were pretrained with single nucleotide polymorphism data from the 1000 Genomes Project. Our evaluations showed that genotype imputation with RefRGim can achieve higher accuracy than imputation with the original reference panel, especially for low-frequency and rare variants. RefRGim will serve as an efficient reference panel reconstruction method for genotype imputation. RefRGim is freely available via GitHub: https://github.com/shishuo16/RefRGim.


Subject(s)
Computational Biology/methods , Genotype , Genotyping Techniques/methods , Neural Networks, Computer , Software , Algorithms , Databases, Genetic , Deep Learning , Genetics, Population/methods , Genome-Wide Association Study/methods , Humans , Reproducibility of Results , Web Browser
20.
Zool Res ; 41(6): 705-708, 2020 Nov 18.
Article in English | MEDLINE | ID: mdl-33045776

ABSTRACT

Since the first reported severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection in December 2019, coronavirus disease 2019 (COVID-19) has become a global pandemic, spreading to more than 200 countries and regions worldwide. With continued research progress and virus detection, SARS-CoV-2 genomes and sequencing data have been reported and accumulated at an unprecedented rate. To meet the need for fast analysis of these genome sequences, the National Genomics Data Center (NGDC) of the China National Center for Bioinformation (CNCB) has established an online coronavirus analysis platform, which includes de novo assembly, BLAST alignment, genome annotation, variant identification, and variant annotation modules. The online analysis platform can be freely accessed at the 2019 Novel Coronavirus Resource (2019nCoVR) (https://bigd.big.ac.cn/ncov/online/tools).


Subject(s)
Betacoronavirus/genetics , Computational Biology/methods , Coronavirus Infections/diagnosis , Genome, Viral/genetics , Genomics/methods , High-Throughput Nucleotide Sequencing/methods , Pneumonia, Viral/diagnosis , Animals , Betacoronavirus/classification , Betacoronavirus/physiology , COVID-19 , China , Computational Biology/organization & administration , Coronavirus Infections/virology , Genetic Variation , Humans , Internet , Molecular Sequence Annotation , Pandemics , Pneumonia, Viral/virology , SARS-CoV-2