Search | BVS Search Portal

1.

Long-read sequencing of 111 rice genomes reveals significantly larger pan-genomes.

Zhang, Fan; Xue, Hongzhang; Dong, Xiaorui; Li, Min; Zheng, Xiaoming; Li, Zhikang; Xu, Jianlong; Wang, Wensheng; Wei, Chaochun.

Genome Res ; 32(5): 853-863, 2022 05.

Article in English | MEDLINE | ID: mdl-35396275

ABSTRACT

The concept of pan-genome, which is the collection of all genomes from a population, has shown a great potential in genomics study, especially for crop sciences. The rice pan-genome constructed from the second-generation sequencing (SGS) data is about 270 Mb larger than Nipponbare, the rice reference genome (NipRG), but it is still disadvantaged by incompleteness and loss of genomic contexts. The third-generation sequencing (TGS) with long reads can help to construct better pan-genomes. In this paper, we report a high-quality rice pan-genome construction method by introducing a series of new steps to deal with the long-read data, including unmapped sequence block filtering, redundancy removing, and sequence block elongating. Compared to NipRG, the long-read sequencing-based pan-genome constructed from 105 rice accessions, which contains 604 Mb novel sequences, is much more comprehensive than the one constructed from ~3000 rice genomes sequenced with short reads. The repetitive sequences are the main components of novel sequences, which partially explain the differences between the pan-genomes based on TGS and SGS. Adding six wild rice accessions, there are about 879 Mb novel sequences and 19,000 novel genes in the rice pan-genome in total. In addition, we have created high-quality reference genomes for all representative rice populations, including five gapless reference genomes. This study has made significant progress in our understanding of the rice pan-genome, and this pan-genome construction method for long-read data can be applied to accelerate a broad range of genomics studies.

Subjects

Oryza , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing , Oryza/genetics , Sequence Analysis, DNA

2.

PPanG: a precision pangenome browser enabling nucleotide-level analysis of genomic variations in individual genomes and their graph-based pangenome.

Liu, Mingwei; Zhang, Fan; Lu, Huimin; Xue, Hongzhang; Dong, Xiaorui; Li, Zhikang; Xu, Jianlong; Wang, Wensheng; Wei, Chaochun.

BMC Genomics ; 25(1): 405, 2024 Apr 24.

Article in English | MEDLINE | ID: mdl-38658835

ABSTRACT

Graph-based pangenome is gaining more popularity than linear pangenome because it stores more comprehensive information of variations. However, traditional linear genome browser has its own advantages, especially the tremendous resources accumulated historically. With the fast-growing number of individual genomes and their annotations available, the demand for a genome browser to visualize genome annotation for many individuals together with a graph-based pangenome is getting higher and higher. Here we report a new pangenome browser PPanG, a precise pangenome browser enabling nucleotide-level comparison of individual genome annotations together with a graph-based pangenome. Nine rice genomes with annotations were provided by default as potential references, and any individual genome can be selected as the reference. Our pangenome browser provides unprecedented insights on genome variations at different levels from base to gene, and reveals how the structures of a gene could differ for individuals. PPanG can be applied to any species with multiple individual genomes available and it is available at https://cgm.sjtu.edu.cn/PPanG .

Subjects

Genomics , Genomics/methods , Oryza/genetics , Molecular Sequence Annotation , Genome, Plant , Genetic Variation , Software , Web Browser , Databases, Genetic , Nucleotides/genetics , Genome

3.

UHRF1 regulates alternative splicing by interacting with splicing factors and U snRNAs in a H3R2me involved manner.

Xu, Peng; Zhang, Lan; Xiao, Yao; Li, Wei; Hu, Zhiqiang; Zhang, Rukui; Li, Jin; Wu, Feizhen; Xi, Yanping; Zou, Qingping; Wang, Zhentian; Guo, Rui; Ma, Honghui; Dong, Shihua; Xiao, Min; Yang, Zhicong; Ren, Xiaoguang; Wei, Chaochun; Yu, Wenqiang.

Hum Mol Genet ; 30(22): 2110-2122, 2021 11 01.

Article in English | MEDLINE | ID: mdl-34196368

ABSTRACT

The well-established functions of UHRF1 converge to DNA biological processes, as exemplified by DNA methylation maintenance and DNA damage repair during cell cycles. However, the potential effect of UHRF1 on RNA metabolism is largely unexplored. Here, we revealed that UHRF1 serves as a novel alternative RNA splicing regulator. The protein interactome of UHRF1 identified various splicing factors. Among them, SF3B3 could interact with UHRF1 directly and participate in UHRF1-regulated alternative splicing events. Furthermore, we interrogated the RNA interactome of UHRF1, and surprisingly, we identified U snRNAs, the canonical spliceosome components, in the purified UHRF1 complex. Unexpectedly, we found H3R2 methylation status determines the binding preference of U snRNAs, especially U2 snRNAs. The involvement of U snRNAs in UHRF1-containing complex and their binding preference to specific chromatin configuration imply a finely orchestrated mechanism at play. Our results provided the resources and pinpointed the molecular basis of UHRF1-mediated alternative RNA splicing, which will help us better our understanding of the physiological and pathological roles of UHRF1 in disease development.

Subjects

Alternative Splicing , CCAAT-Enhancer-Binding Proteins/metabolism , Histones/metabolism , RNA Splicing Factors/metabolism , RNA, Small Nuclear/genetics , Ubiquitin-Protein Ligases/metabolism , CCAAT-Enhancer-Binding Proteins/genetics , Humans , Methylation , Multiprotein Complexes , Nucleic Acid Conformation , Protein Binding , RNA, Small Nuclear/metabolism , Ubiquitin-Protein Ligases/genetics

4.

GESLM algorithm for detecting causal SNPs in GWAS with multiple phenotypes.

Lyu, Ruiqi; Sun, Jianle; Xu, Dong; Jiang, Qianxue; Wei, Chaochun; Zhang, Yue.

Brief Bioinform ; 22(6), 2021 11 05.

Article in English | MEDLINE | ID: mdl-34323927

ABSTRACT

With the development of genome-wide association studies, how to gain information from a large scale of data has become an issue of common concern, since traditional methods are not fully developed to solve problems such as identifying loci-to-loci interactions (also known as epistasis). Previous epistatic studies mainly focused on local information with a single outcome (phenotype), while in this paper, we developed a two-stage global search algorithm, Greedy Equivalence Search with Local Modification (GESLM), to implement a global search of directed acyclic graph in order to identify genome-wide epistatic interactions with multiple outcome variables (phenotypes) in a case-control design. GESLM integrates the advantages of score-based methods and constraint-based methods to learn the phenotype-related Bayesian network and is powerful and robust to find the interaction structures that display both genetic associations with phenotypes and gene interactions. We compared GESLM with some common phenotype-related loci detecting methods in simulation studies. The results showed that our method improved the accuracy and efficiency compared with others, especially in an unbalanced case-control study. Besides, its application on the UK Biobank dataset suggested that our algorithm has great performance when handling genome-wide association data with more than one phenotype.

Subjects

Algorithms , Genome-Wide Association Study , Phenotype , Polymorphism, Single Nucleotide , Bayes Theorem , Datasets as Topic , Humans

5.

Visible-Light-Driven [2 + 2] Photocycloaddition for Constructing Dimers of N,N'-Diacyl-1,4-dihydropyrazines: Experimental and Theoretical Investigation.

Zhang, Xiaokun; Wei, Chaochun; Chu, Dongchen; Yan, Hong; Song, Xiuqing.

J Org Chem ; 88(19): 13946-13955, 2023 Oct 06.

Article in English | MEDLINE | ID: mdl-37676850

ABSTRACT

In this study, the visible-light-driven [2 + 2] photocycloaddition of 1,4-dihydropyrazines in solution was reported. The N,N'-diacyl-1,4-dihydropyrazines with different substituents showed completely different reactivity under the irradiation of a 430 nm blue light-emitting diode (LED) lamp. N,N'-Diacetyl-1,4-dihydropyrazine and N,N'-dipropionyl-1,4-dihydropyrazine were the only compounds capable of undergoing a [2 + 2] photocycloaddition reaction, yielding syn-dimers and cage-dimers (known as 3,6,9,12-tetraazatetraasteranes) with overall yields of 76 and 83%, correspondingly. The substituent-reactivity effect on [2 + 2] photocycloaddition of N,N'-diacyl-1,4-dihydropyrazines was investigated by density functional theory calculations. The results show that the substituents have little influence on Gibbs free energy for the [2 + 2] photocycloaddition and mainly affect the excited energy, reaction sites, and the triplet excited-state structures of 1,4-dihydropyrazines, which are closely related to whether the reaction occurs. The results offer insights into the photochemical reactivity of 1,4-dihydropyrazines and an approach for constructing dimers of N,N'-diacyl-1,4-dihydropyrazines through a solution-based visible-light-driven [2 + 2] photocycloaddition, especially for the construction of 3,6,9,12-tetraazatetraasteranes. Compared with the solid-state [2 + 2] photocycloaddition of 1,4-dihydropyrazine, this photocycloaddition will be an efficient and environmentally friendly method for synthesizing tetraazatetraasteranes with the advantages of milder reaction conditions, simple operation, adjustable reaction amounts by omitting the cocrystal growth step, etc.

6.

ivTerm-An R package for interactive visualization of functional analysis results of meta-omics data.

Dong, Xiaorui; Xue, Hongzhang; Wei, Chaochun.

J Cell Biochem ; 122(10): 1428-1434, 2021 10.

Article in English | MEDLINE | ID: mdl-34132422

ABSTRACT

Interpreting functional analysis results derived from environmental samples using direct sequencing meta-omics data, including metagenomics and meta-transcriptomics data, is challenging due to their complexity. Visualization of functional analysis results can help researchers discover relevant biological insights. Despite the availability of many R packages, there lacks interactive and comprehensive graphic systems for displaying functional terms and corresponding genes in meta-omics analysis results. Here, we present ivTerm, an R-shiny package with a user-friendly graphical interface that enables users to inspect functional annotations, compare results across multiple experiments, create customized charts, and download these charts. It provides various basic and innovative chart types to visualize functional terms and involved genes. Users can also browse the description of terms obtained from the database web servers automatically. Two examples, including a metagenome analysis data for human gut and a meta-transcriptome data for coral symbiomes, are given to show the usage of ivTerm. In the end, we compared ivTerm with existing tools with similar functions, such as GOplot, ViSEAGO, and Chordomics. The tool ivTerm is convenient and efficient for biologists to gain an integrated view and develop deep insights by interactive analysis of meta-omics data. It can accelerate the procedure to develop insights from complex meta-omics data. The code for ivTerm is freely available at https://github.com/SJTU-CGM/ivTerm.

Subjects

Computational Biology/methods , Computer Graphics , Data Visualization , Software , Data Interpretation, Statistical , Databases, Factual , Gene Expression Profiling , Gene Regulatory Networks , Genomics/methods , Humans , Metabolomics/methods , Metagenome , Transcriptome

7.

4D-QSAR Molecular Modeling and Analysis of Flavonoid Derivatives as Acetylcholinesterase Inhibitors.

Wang, Yanyu; Zhao, Yanping; Wei, Chaochun; Tian, Nana; Yan, Hong.

Biol Pharm Bull ; 44(7): 999-1006, 2021.

Article in English | MEDLINE | ID: mdl-34193695

ABSTRACT

Flavonoids are potential strikingly natural compounds with antioxidant activity and acetylcholinesterase (AChE) inhibitory activity for treating Alzheimer's disease (AD). In the present study, in line with our interests in flavonoid derivatives as AChE inhibitors, a four-dimensional quantitative structure-activity relationship (4D-QSAR) molecular model was proposed. The data required to perform 4D-QSAR analysis includes 52 compounds reported in the literature, usually analogs, and their measured biological activities in a common assay. The model was generated by a complete set of 4D-QSAR program which was written by our group. The best model was found after trying multiple experiments. It had a good predictive ability with the cross-validation correlation coefficient Q² = 0.77, the internal validation correlation coefficient R² = 0.954, and the external validation correlation coefficient R²pred = 0.715. The molecular docking analysis was also carried out to understand exceedingly the interactions between flavonoids and the AChE targets, which was in good agreement with the 4D-QSAR model. Based on the information provided by the 4D-QSAR model and molecular docking analysis, the idea for optimizing the structures of flavonoids as AChE inhibitors was put forward which may provide theoretical guidance for the research and development of new AChE inhibitors.

Subjects

Cholinesterase Inhibitors/chemistry , Flavonoids/chemistry , Models, Molecular , Quantitative Structure-Activity Relationship
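
As a rough illustration of the two internal statistics quoted in the abstract above, the sketch below computes a fitted R² and a leave-one-out cross-validated Q² for a toy model. It is not the authors' 4D-QSAR program: the single-descriptor least-squares fit stands in for the multivariate regression a QSAR study would normally use, the function names are hypothetical, and Q² is taken as 1 - PRESS/SS_tot, which is one common convention among several.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one descriptor (a stand-in model)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def r_squared(xs, ys):
    """Fitted R^2: 1 - SS_res / SS_tot, with the model trained on all compounds."""
    a, b = fit_line(xs, ys)
    y_bar = mean(ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


def q_squared_loo(xs, ys):
    """Leave-one-out Q^2: 1 - PRESS / SS_tot.

    Each compound is predicted by a model refitted without it, and PRESS
    is the sum of those squared prediction errors.
    """
    y_bar = mean(ys)
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - press / ss_tot
```

For a least-squares fit of this kind, Q² cannot exceed R², because each held-out prediction error is at least as large as the corresponding fitted residual. That is consistent with the ordering of the values reported in the abstract.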

8.

CDKAM: a taxonomic classification tool using discriminative k-mers and approximate matching strategies.

Bui, Van-Kien; Wei, Chaochun.

BMC Bioinformatics ; 21(1): 468, 2020 Oct 20.

Article in English | MEDLINE | ID: mdl-33081690

ABSTRACT

BACKGROUND: Current taxonomic classification tools use exact string matching algorithms that are effective to tackle the data from the next generation sequencing technology. However, the unique error patterns in the third generation sequencing (TGS) technologies could reduce the accuracy of these programs. RESULTS: We developed a Classification tool using Discriminative K-mers and Approximate Matching algorithm (CDKAM). This approximate matching method was used for searching k-mers, which included two phases, a quick mapping phase and a dynamic programming phase. Simulated datasets as well as real TGS datasets have been tested to compare the performance of CDKAM with existing methods. We showed that CDKAM performed better in many aspects, especially when classifying TGS data with average length 1000-1500 bases. CONCLUSIONS: CDKAM is an effective program with higher accuracy and lower memory requirement for TGS metagenome sequence classification. It produces a high species-level accuracy.

Subjects

Algorithms , High-Throughput Nucleotide Sequencing/methods , Humans
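
The two-phase k-mer search described in the abstract above can be pictured with a small sketch. This is not CDKAM's implementation: the exact dictionary lookup as the quick phase, the linear scan in the slow phase, the majority vote, and the names `classify_read`, `kmer_to_taxon` and `max_errors` are all assumptions made for illustration. Only the dynamic programming edit distance is a standard, well-defined technique.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance by dynamic programming, keeping one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion from a
                           cur[j - 1] + 1,             # insertion into a
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]


def classify_read(read, kmer_to_taxon, k, max_errors=1):
    """Vote for a taxon using exact, then approximate, k-mer hits.

    kmer_to_taxon is a hypothetical index mapping discriminative k-mers
    to taxon labels. Returns the most-voted taxon, or None if no hit.
    """
    votes = Counter()
    for i in range(len(read) - k + 1):
        window = read[i:i + k]
        taxon = kmer_to_taxon.get(window)  # quick phase: exact lookup
        if taxon is None:
            # slow phase: tolerate a few sequencing errors via edit distance
            for ref, candidate in kmer_to_taxon.items():
                if edit_distance(window, ref) <= max_errors:
                    taxon = candidate
                    break
        if taxon is not None:
            votes[taxon] += 1
    return votes.most_common(1)[0][0] if votes else None
```

A real tool would avoid scanning every reference k-mer in the slow phase. The scan is kept here only so the approximate-matching idea stays readable.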

9.

PaSS: a sequencing simulator for PacBio sequencing.

Zhang, Wenmin; Jia, Ben; Wei, Chaochun.

BMC Bioinformatics ; 20(1): 352, 2019 Jun 21.

Article in English | MEDLINE | ID: mdl-31226925

ABSTRACT

BACKGROUND: Third-generation sequencing platforms, such as PacBio sequencing, have been developed rapidly in recent years. PacBio sequencing generates much longer reads than the second-generation sequencing (or the next generation sequencing, NGS) technologies and it has unique sequencing error patterns. An effective read simulator is essential to evaluate and promote the development of new bioinformatics tools for PacBio sequencing data analysis. RESULTS: We developed a new PacBio Sequencing Simulator (PaSS). It can learn sequence patterns from PacBio sequencing data currently available. In addition to the distribution of read lengths and error rates, we included a context-specific sequencing error model. Compared to existing PacBio sequencing simulators such as PBSIM, LongISLND and NPBSS, PaSS performed better in many aspects. Assembly tests also suggest that reads simulated by PaSS are the most similar to experimental sequencing data. CONCLUSION: PaSS is an effective sequence simulator for PacBio sequencing. It will facilitate the evaluation and development of new analysis tools for the third-generation sequencing data.

Subjects

High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA , Software , Animals , Arabidopsis/genetics , Caenorhabditis elegans/genetics , Computer Simulation , Escherichia coli/genetics

10.

Discovery and characterization of the evolution, variation and functions of diversity-generating retroelements using thousands of genomes and metagenomes.

Yan, Fazhe; Yu, Xuelin; Duan, Zhongqu; Lu, Jinyuan; Jia, Ben; Qiao, Yuyang; Sun, Chen; Wei, Chaochun.

BMC Genomics ; 20(1): 595, 2019 Jul 19.

Article in English | MEDLINE | ID: mdl-31324156

ABSTRACT

BACKGROUND: Diversity-generating retroelements (DGRs) are a unique family of retroelements that generate sequence diversity of DNA to benefit their hosts by introducing variations and accelerating the evolution of target proteins. They exist widely in bacteria, archaea, phage and plasmid. However, our understanding about DGRs in natural environments was still very limited. RESULTS: We developed an efficient computational algorithm to identify DGRs, and applied it to characterize DGRs in more than 80,000 sequenced bacterial genomes as well as more than 4,000 human metagenome datasets. In total, we identified 948 non-redundant DGRs, which expanded the number of known DGRs in bacterial genomes and human microbiomes by about 55%, and provided a much more comprehensive reference for the study of DGRs. Phylogenetic analysis was done for identified DGRs. The putative target genes of DGRs were searched, and the functions of these target genes were investigated with a comprehensive alignment against the nr database. CONCLUSIONS: DGR system is a powerful and universal mechanism to generate diversity. DGR evolution is closely associated with the living environment and their cassette structures. Furthermore, it may impact a wide range of functional processes in addition to receptor-binding. These results significantly improved our understanding about DGRs.

Subjects

Evolution, Molecular , Genetic Variation , Genomics , Metagenome/genetics , Retroelements/genetics , Algorithms , Bacteria/genetics , Humans , Microbiota/genetics

11.

A near complete genome of Arachis monticola, an allotetraploid wild peanut.

Xue, Hongzhang; Zhao, Kai; Zhao, Kunkun; Han, Suoyi; Chitikineni, Annapurna; Zhang, Lin; Qiu, Ding; Ren, Rui; Gong, Fangping; Li, Zhongfeng; Ma, Xingli; Zhang, Xingguo; Varshney, Rajeev K; Zhang, Xinyou; Wei, Chaochun; Yin, Dongmei.

Plant Biotechnol J ; 22(8): 2110-2112, 2024 Aug.

Article in English | MEDLINE | ID: mdl-38436521

Subjects

Arachis , Genome, Plant , Arachis/genetics , Arachis/microbiology , Genome, Plant/genetics , Tetraploidy

12.

RPAN: rice pan-genome browser for ~3000 rice genomes.

Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun.

Nucleic Acids Res ; 45(2): 597-605, 2017 01 25.

Article in English | MEDLINE | ID: mdl-27940610

ABSTRACT

A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k.

Subjects

Genome, Plant , Genomics , Oryza/genetics , Software , Computational Biology/methods , Databases, Genetic , Genomics/methods , Molecular Sequence Annotation , Web Browser
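
The definition at the start of the abstract above (a pan-genome as the union of the gene sets of all individuals, with presence/absence variations among them) maps directly onto set operations. The sketch below is a toy illustration under that definition only. The accession and gene names are made up, the function names are hypothetical, and nothing here reflects how RPAN stores or computes its data.

```python
def pan_and_core(gene_sets):
    """Return (pan-genome, core genome) for a dict of accession -> set of genes.

    The pan-genome is the union of all gene sets. The core genome is the
    intersection, i.e. genes present in every accession.
    """
    sets = list(gene_sets.values())
    pan = set().union(*sets)
    core = set.intersection(*sets)
    return pan, core


def pav_matrix(gene_sets):
    """Build a presence/absence table: one 0/1 row per accession.

    Columns follow the sorted pan-genome gene list, which is returned too.
    """
    genes = sorted(set().union(*gene_sets.values()))
    rows = {acc: [int(g in present) for g in genes]
            for acc, present in gene_sets.items()}
    return genes, rows


# Hypothetical accessions and gene identifiers, for illustration only.
example = {
    "acc1": {"geneA", "geneB", "geneC"},
    "acc2": {"geneA", "geneC", "geneD"},
    "acc3": {"geneA", "geneB"},
}
pan, core = pan_and_core(example)
genes, rows = pav_matrix(example)
```

In this toy input, genes missing from at least one accession (the pan-genome minus the core) are the ones that show presence/absence variation.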

13.

EUPAN enables pan-genome studies of a large number of eukaryotic genomes.

Hu, Zhiqiang; Sun, Chen; Lu, Kuang-Chen; Chu, Xixia; Zhao, Yue; Lu, Jinyuan; Shi, Jianxin; Wei, Chaochun.

Bioinformatics ; 33(15): 2408-2409, 2017 Aug 01.

Article in English | MEDLINE | ID: mdl-28369371

ABSTRACT

SUMMARY: Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we proposed EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In the previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to the current re-sequencing projects primarily focusing on single nucleotide polymorphisms. AVAILABILITY AND IMPLEMENTATION: EUPAN is implemented in Perl, R and C++. It is supported under Linux and preferred for a computer cluster with LSF and SLURM job scheduling system. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html . CONTACT: ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Subjects

Eukaryota/genetics , Genetics, Population/methods , Genome , Sequence Analysis, DNA/methods , Software , Genomics/methods , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing , Polymorphism, Single Nucleotide

14.

In silico analysis of endogenous siRNAs associated transposable elements and NATs in Schistosoma japonicum reveals their putative roles during reproductive development.

Giri, Bikash Ranjan; Ye, Jiannan; Chen, Yongjun; Wei, Chaochun; Cheng, Guofeng.

Parasitol Res ; 117(5): 1549-1558, 2018 May.

Article in English | MEDLINE | ID: mdl-29568977

ABSTRACT

Schistosomiasis is a neglected tropical disease caused by trematode of the genus Schistosoma. Successful reproductive development is critical for the production of eggs, which are responsible for host pathology and disease dissemination. Endogenous small non-coding RNAs play important roles in many biological processes such as protection against foreign pathogens, cell differentiation, and chromosomal stability by regulating target gene expression at the transcriptional and post-transcriptional levels. In this study, we performed in silico analysis of endogenous small non-coding RNAs in different stages, and sex of S. japonicum focusing on endogenous small interfering RNAs (endo-siRNAs) generated from transposable elements (TEs) and natural antisense transcripts (NATs). Both total and unique siRNA populations show 18-30 nt in length, but the predominant size was 20 nt and the leading first base was adenosine. Sense TE-derived endo-siRNAs reads were higher than antisense reads at different relative positions of TEs, whereas no such difference was observed for NAT-derived endo-siRNAs. TE- and NAT-derived endo-siRNAs were more enriched in the male compared to female worms, with the higher relative expression in early phase of pairing. Putative targets of endo-siRNAs indicated more of them in males (106 and 66) than in females (6 and 23) for TE- and NAT-derived endo-siRNAs, respectively. Our preliminary study revealed vital role of endo-siRNAs during the reproductive development of S. japonicum and provide clues for putative novel targets to suppress worm reproduction and direction for effective anti-schistosomal drug development.

Subjects

DNA Transposable Elements/genetics , RNA, Small Interfering/genetics , Schistosoma japonicum/genetics , Animals , Computer Simulation , Female , Humans , Male , Schistosomiasis/parasitology , Schistosomiasis/pathology , Schistosomiasis/transmission

15.

Widespread of horizontal gene transfer in the human genome.

Huang, Wenze; Tsai, Lillian; Li, Yulong; Hua, Nan; Sun, Chen; Wei, Chaochun.

BMC Genomics ; 18(1): 274, 2017 04 04.

Article in English | MEDLINE | ID: mdl-28376762

ABSTRACT

BACKGROUND: A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is through horizontal gene transfer (HGT), which involves movement of genetic materials between different species. Horizontal gene transfer has been found prevalent in prokaryotes but very rare in eukaryote. In this paper, we investigate horizontal gene transfer in the human genome. RESULTS: From the pair-wise alignments between human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched with ion binding. Compared to known horizontal gene transfer regions in the human genome, there were few overlapping regions, which indicated horizontal gene transfer is more common than we expected in the human genome. CONCLUSIONS: Horizontal gene transfer impacts hundreds of human genes and this study provided insight into potential mechanisms of HGT in the human genome.

Subjects

Gene Transfer, Horizontal , Genome, Human , Base Composition , Chromosomes, Human/genetics , Computational Biology , Evolution, Molecular , Humans , Phylogeny , Sequence Analysis, DNA

16.

Whole-genome sequencing of six dog breeds from continuous altitudes reveals adaptation to high-altitude hypoxia.

Gou, Xiao; Wang, Zhen; Li, Ning; Qiu, Feng; Xu, Ze; Yan, Dawei; Yang, Shuli; Jia, Jia; Kong, Xiaoyan; Wei, Zehui; Lu, Shaoxiong; Lian, Linsheng; Wu, Changxin; Wang, Xueyan; Li, Guozhi; Ma, Teng; Jiang, Qiang; Zhao, Xue; Yang, Jiaqiang; Liu, Baohong; Wei, Dongkai; Li, Hong; Yang, Jianfa; Yan, Yulin; Zhao, Guiying; Dong, Xinxing; Li, Mingli; Deng, Weidong; Leng, Jing; Wei, Chaochun; Wang, Chuan; Mao, Huaming; Zhang, Hao; Ding, Guohui; Li, Yixue.

Genome Res ; 24(8): 1308-15, 2014 Aug.

Article in English | MEDLINE | ID: mdl-24721644

ABSTRACT

The hypoxic environment imposes severe selective pressure on species living at high altitude. To understand the genetic bases of adaptation to high altitude in dogs, we performed whole-genome sequencing of 60 dogs including five breeds living at continuous altitudes along the Tibetan Plateau from 800 to 5100 m as well as one European breed. More than 150× sequencing coverage for each breed provides us with a comprehensive assessment of the genetic polymorphisms of the dogs, including Tibetan Mastiffs. Comparison of the breeds from different altitudes reveals strong signals of population differentiation at the locus of hypoxia-related genes including endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) and beta hemoglobin cluster. Notably, four novel nonsynonymous mutations specific to high-altitude dogs are identified at EPAS1, one of which occurred at a quite conserved site in the PAS domain. The association testing between EPAS1 genotypes and blood-related phenotypes on additional high-altitude dogs reveals that the homozygous mutation is associated with decreased blood flow resistance, which may help to improve hemorheologic fitness. Interestingly, EPAS1 was also identified as a selective target in Tibetan highlanders, though no amino acid changes were found. Thus, our results not only indicate parallel evolution of humans and dogs in adaptation to high-altitude hypoxia, but also provide a new opportunity to study the role of EPAS1 in the adaptive processes.

Subjects

Adaptation, Physiological/genetics , Dogs/genetics , Altitude , Amino Acid Sequence , Animals , Base Sequence , Basic Helix-Loop-Helix Transcription Factors/genetics , Cell Hypoxia , DNA Mutational Analysis , Genome , High-Throughput Nucleotide Sequencing , Molecular Sequence Data , Polymorphism, Single Nucleotide

17.

MOST+: A de novo motif finding approach combining genomic sequence and heterogeneous genome-wide signatures.

Zhang, Yizhe; He, Yupeng; Zheng, Guangyong; Wei, Chaochun.

BMC Genomics ; 16 Suppl 7: S13, 2015.

Article in English | MEDLINE | ID: mdl-26099518

ABSTRACT

BACKGROUND: Motifs are regulatory elements that will activate or inhibit the expression of related genes when proteins (such as transcription factors, TFs) bind to them. Therefore, motif finding is important to understand the mechanisms of gene regulation. De novo discovery of regulatory elements, like transcription factor binding sites (TFBSs), has long been a major challenge to gain insight on mechanisms of gene regulation. Recent advances in experimental profiling of genome-wide signals such as histone modifications and DNase I hypersensitivity sites allow scientists to develop better computational methods to enhance motif discovery. However, existing methods for motif finding suffer from high false positive rates and slow speed, and it's difficult to evaluate the performance of these methods systematically. RESULT: Here we present MOST+, a motif finder integrating genomic sequences and genome-wide signals such as intensity and shape features from histone modification marks and DNase I hypersensitivity sites, to improve the prediction accuracy. MOST+ can detect motifs from a large input sequence of about 100 Mbs within a few minutes. Systematic comparison method has been established and MOST+ has been compared with existing methods. CONCLUSION: MOST+ is a fast and accurate de novo method for motif finding by integrating genomic sequence and experimental signals as clues.

Subjects

Algorithms , Computational Biology/methods , Regulatory Elements, Transcriptional , Animals , Databases, Genetic , Epigenesis, Genetic , Gene Expression Regulation , Genome , Humans , Mice

18.

Towards a deeper haplotype mining of complex traits in rice with RFGB v2.0.

Wang, Chun-Chao; Yu, Hong; Huang, Ji; Wang, Wen-Sheng; Faruquee, Muhiuddin; Zhang, Fan; Zhao, Xiu-Qin; Fu, Bin-Ying; Chen, Kai; Zhang, Hong-Liang; Tai, Shuai-Shuai; Wei, Chaochun; McNally, Kenneth L; Alexandrov, Nickolai; Gao, Xiu-Ying; Li, Jiayang; Li, Zhi-Kang; Xu, Jian-Long; Zheng, Tian-Qing.

Plant Biotechnol J ; 18(1): 14-16, 2020 01.

Article in English | MEDLINE | ID: mdl-31336017

Subjects

Databases, Genetic , Haplotypes , Multifactorial Inheritance , Oryza/genetics , Data Mining , Phenotype

19.

4D-QSAR and MIA-QSAR Studies of Aminobenzimidazole Derivatives as Fourth-generation EGFR Inhibitors.

Jia, Xuegong; Wei, Chaochun; Tian, Nana; Yan, Hong; Wang, Hongjun.

Med Chem ; 20(2): 140-152, 2024.

Article in English | MEDLINE | ID: mdl-37957859

ABSTRACT

BACKGROUND: The epidermal growth factor receptor (EGFR) protein has been intensively studied as a therapeutic target for non-small cell lung cancer (NSCLC). The aminobenzimidazole derivatives as the fourth-generation EGFR inhibitors have achieved promising results and overcame EGFR mutations at C797S, del19 and T790M in NSCLC. OBJECTIVE: In order to understand the quantitative structure-activity relationship (QSAR) of aminobenzimidazole derivatives as EGFRdel19 T790M C797S inhibitors, the four-dimensional QSAR (4D-QSAR) and multivariate image analysis (MIA-QSAR) have been performed on the data of 45 known aminobenzimidazole derivatives. METHODS: The 4D-QSAR descriptors were acquired by calculating the association energies between probes and aligned conformational ensemble profiles (CEP), and the regression models were established by partial least squares (PLS). In order to further understand and verify the 4D-QSAR model, MIA-QSAR was constructed by using chemical structure pictures to generate descriptors and PLS regression. Furthermore, the molecular docking and averaged noncovalent interactions (aNCI) analysis were also performed to further understand the interactions between ligands and the EGFR targets, which was in good agreement with the 4D-QSAR model. RESULTS: The established 4D-QSAR and MIA-QSAR models have strong stability and good external prediction ability. CONCLUSION: These results will provide theoretical guidance for the research and development of aminobenzimidazole derivatives as new EGFRdel19 T790M C797S inhibitors.

Subjects

Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Quantitative Structure-Activity Relationship , Molecular Docking Simulation , ErbB Receptors/genetics , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/chemistry , Mutation , Drug Resistance, Neoplasm

20.

4D-QSAR, ADMET properties, and molecular dynamics simulations for designing N-substituted urea/thioureas as human glutaminyl cyclase inhibitors.

Wei, Chaochun; Zhang, Haolin; Niu, Lexuan; Zhong, Qidi; Yan, Hong; Wang, Juan.

Comput Biol Chem ; 112: 108131, 2024 Jun 30.

Article in English | MEDLINE | ID: mdl-38968781

ABSTRACT

Human glutaminyl cyclase (hQC) inhibitors have great potential to be used as anti-Alzheimer's disease (AD) agents by reducing the toxic pyroform of β-amyloid in the brains of AD patients. The four-dimensional quantitative structure activity relationship (4D-QSAR) model of N-substituted urea/thioureas was established with satisfying predictive ability and statistical reliability (Q² = 0.521, R² = 0.933, R²prep = 0.619). By utilizing the developed 4D-QSAR model, a set of new N-substituted urea/thioureas was designed and evaluated for their Absorption Distribution Metabolism Excretion and Toxicity (ADMET) properties. The results of molecular dynamics (MD) simulations, Principal component analysis (PCA), free energy landscape (FEL), dynamic cross-correlation matrix (DCCM) and molecular mechanics generalized Born Poisson-Boltzmann surface area (MM-PBSA) free energy calculations, revealed that the designed compounds remained stable in the protein binding pocket and compounds b ~ f (-35.1 to -44.55 kcal/mol) showed higher binding free energy than that of compound 14 (-33.51 kcal/mol). The findings of this work will be a theoretical foundation for further research and experimental validation of urea/thiourea derivatives as hQC inhibitors.
