Results 1 - 20 of 55
1.
Genome Res ; 32(5): 853-863, 2022 05.
Article in English | MEDLINE | ID: mdl-35396275

ABSTRACT

The concept of pan-genome, which is the collection of all genomes from a population, has shown a great potential in genomics study, especially for crop sciences. The rice pan-genome constructed from the second-generation sequencing (SGS) data is about 270 Mb larger than Nipponbare, the rice reference genome (NipRG), but it is still disadvantaged by incompleteness and loss of genomic contexts. The third-generation sequencing (TGS) with long reads can help to construct better pan-genomes. In this paper, we report a high-quality rice pan-genome construction method by introducing a series of new steps to deal with the long-read data, including unmapped sequence block filtering, redundancy removing, and sequence block elongating. Compared to NipRG, the long-read sequencing-based pan-genome constructed from 105 rice accessions, which contains 604 Mb novel sequences, is much more comprehensive than the one constructed from ∼3000 rice genomes sequenced with short reads. The repetitive sequences are the main components of novel sequences, which partially explain the differences between the pan-genomes based on TGS and SGS. Adding six wild rice accessions, there are about 879 Mb novel sequences and 19,000 novel genes in the rice pan-genome in total. In addition, we have created high-quality reference genomes for all representative rice populations, including five gapless reference genomes. This study has made significant progress in our understanding of the rice pan-genome, and this pan-genome construction method for long-read data can be applied to accelerate a broad range of genomics studies.


Subject(s)
Oryza , Genome , Genomics/methods , High-Throughput Nucleotide Sequencing , Oryza/genetics , Sequence Analysis, DNA
2.
BMC Genomics ; 25(1): 405, 2024 Apr 24.
Article in English | MEDLINE | ID: mdl-38658835

ABSTRACT

Graph-based pangenomes are gaining popularity over linear pangenomes because they store more comprehensive information on variations. However, traditional linear genome browsers have their own advantages, especially the tremendous resources accumulated historically. With the fast-growing number of individual genomes and their annotations available, demand is rising for a genome browser that visualizes genome annotations for many individuals together with a graph-based pangenome. Here we report PPanG, a precise pangenome browser enabling nucleotide-level comparison of individual genome annotations together with a graph-based pangenome. Nine rice genomes with annotations are provided by default as potential references, and any individual genome can be selected as the reference. Our pangenome browser provides insights into genome variations at different levels, from base to gene, and reveals how the structure of a gene can differ among individuals. PPanG can be applied to any species with multiple individual genomes available, and it is available at https://cgm.sjtu.edu.cn/PPanG .


Subject(s)
Genomics , Genomics/methods , Oryza/genetics , Molecular Sequence Annotation , Genome, Plant , Genetic Variation , Software , Web Browser , Databases, Genetic , Nucleotides/genetics , Genome
3.
Hum Mol Genet ; 30(22): 2110-2122, 2021 11 01.
Article in English | MEDLINE | ID: mdl-34196368

ABSTRACT

The well-established functions of UHRF1 converge to DNA biological processes, as exemplified by DNA methylation maintenance and DNA damage repair during cell cycles. However, the potential effect of UHRF1 on RNA metabolism is largely unexplored. Here, we revealed that UHRF1 serves as a novel alternative RNA splicing regulator. The protein interactome of UHRF1 identified various splicing factors. Among them, SF3B3 could interact with UHRF1 directly and participate in UHRF1-regulated alternative splicing events. Furthermore, we interrogated the RNA interactome of UHRF1, and surprisingly, we identified U snRNAs, the canonical spliceosome components, in the purified UHRF1 complex. Unexpectedly, we found H3R2 methylation status determines the binding preference of U snRNAs, especially U2 snRNAs. The involvement of U snRNAs in UHRF1-containing complex and their binding preference to specific chromatin configuration imply a finely orchestrated mechanism at play. Our results provided the resources and pinpointed the molecular basis of UHRF1-mediated alternative RNA splicing, which will help us better our understanding of the physiological and pathological roles of UHRF1 in disease development.


Subject(s)
Alternative Splicing , CCAAT-Enhancer-Binding Proteins/metabolism , Histones/metabolism , RNA Splicing Factors/metabolism , RNA, Small Nuclear/genetics , Ubiquitin-Protein Ligases/metabolism , CCAAT-Enhancer-Binding Proteins/genetics , Humans , Methylation , Multiprotein Complexes , Nucleic Acid Conformation , Protein Binding , RNA, Small Nuclear/metabolism , Ubiquitin-Protein Ligases/genetics
4.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34323927

ABSTRACT

With the development of genome-wide association studies, how to gain information from a large scale of data has become an issue of common concern, since traditional methods are not fully developed to solve problems such as identifying loci-to-loci interactions (also known as epistasis). Previous epistatic studies mainly focused on local information with a single outcome (phenotype), while in this paper, we developed a two-stage global search algorithm, Greedy Equivalence Search with Local Modification (GESLM), to implement a global search of directed acyclic graph in order to identify genome-wide epistatic interactions with multiple outcome variables (phenotypes) in a case-control design. GESLM integrates the advantages of score-based methods and constraint-based methods to learn the phenotype-related Bayesian network and is powerful and robust to find the interaction structures that display both genetic associations with phenotypes and gene interactions. We compared GESLM with some common phenotype-related loci detecting methods in simulation studies. The results showed that our method improved the accuracy and efficiency compared with others, especially in an unbalanced case-control study. Besides, its application on the UK Biobank dataset suggested that our algorithm has great performance when handling genome-wide association data with more than one phenotype.


Subject(s)
Algorithms , Genome-Wide Association Study , Phenotype , Polymorphism, Single Nucleotide , Bayes Theorem , Datasets as Topic , Humans
5.
J Org Chem ; 88(19): 13946-13955, 2023 Oct 06.
Article in English | MEDLINE | ID: mdl-37676850

ABSTRACT

In this study, the visible-light-driven [2 + 2] photocycloaddition of 1,4-dihydropyrazines in solution was reported. The N,N'-diacyl-1,4-dihydropyrazines with different substituents showed completely different reactivity under the irradiation of a 430 nm blue light-emitting diode (LED) lamp. N,N'-Diacetyl-1,4-dihydropyrazine and N,N'-dipropionyl-1,4-dihydropyrazine were the only compounds capable of undergoing a [2 + 2] photocycloaddition reaction, yielding syn-dimers and cage-dimers (known as 3,6,9,12-tetraazatetraasteranes) with overall yields of 76 and 83%, correspondingly. The substituent-reactivity effect on [2 + 2] photocycloaddition of N,N'-diacyl-1,4-dihydropyrazines was investigated by density functional theory calculations. The results show that the substituents have little influence on Gibbs free energy for the [2 + 2] photocycloaddition and mainly affect the excited energy, reaction sites, and the triplet excited-state structures of 1,4-dihydropyrazines, which are closely related to whether the reaction occurs. The results offer insights into the photochemical reactivity of 1,4-dihydropyrazines and an approach for constructing dimers of N,N'-diacyl-1,4-dihydropyrazines through a solution-based visible-light-driven [2 + 2] photocycloaddition, especially for the construction of 3,6,9,12-tetraazatetraasteranes. Compared with the solid-state [2 + 2] photocycloaddition of 1,4-dihydropyrazine, this photocycloaddition will be an efficient and environmentally friendly method for synthesizing tetraazatetraasteranes with the advantages of milder reaction conditions, simple operation, adjustable reaction amounts by omitting the cocrystal growth step, etc.

6.
J Cell Biochem ; 122(10): 1428-1434, 2021 10.
Article in English | MEDLINE | ID: mdl-34132422

ABSTRACT

Interpreting functional analysis results derived from environmental samples using direct sequencing meta-omics data, including metagenomics and meta-transcriptomics data, is challenging due to their complexity. Visualization of functional analysis results can help researchers discover relevant biological insights. Despite the availability of many R packages, there is a lack of interactive and comprehensive graphic systems for displaying functional terms and corresponding genes in meta-omics analysis results. Here, we present ivTerm, an R-shiny package with a user-friendly graphical interface that enables users to inspect functional annotations, compare results across multiple experiments, create customized charts, and download these charts. It provides various basic and innovative chart types to visualize functional terms and the genes involved. Users can also browse term descriptions, which are retrieved automatically from the database web servers. Two examples, a metagenome analysis dataset for the human gut and a meta-transcriptome dataset for coral symbiomes, are given to show the usage of ivTerm. Finally, we compared ivTerm with existing tools with similar functions, such as GOplot, ViSEAGO, and Chordomics. ivTerm is convenient and efficient for biologists to gain an integrated view of complex meta-omics data and to develop insights from it more quickly through interactive analysis. The code for ivTerm is freely available at https://github.com/SJTU-CGM/ivTerm.


Subject(s)
Computational Biology/methods , Computer Graphics , Data Visualization , Software , Data Interpretation, Statistical , Databases, Factual , Gene Expression Profiling , Gene Regulatory Networks , Genomics/methods , Humans , Metabolomics/methods , Metagenome , Transcriptome
7.
Biol Pharm Bull ; 44(7): 999-1006, 2021.
Article in English | MEDLINE | ID: mdl-34193695

ABSTRACT

Flavonoids are promising natural compounds with antioxidant activity and acetylcholinesterase (AChE) inhibitory activity for treating Alzheimer's disease (AD). In the present study, in line with our interest in flavonoid derivatives as AChE inhibitors, a four-dimensional quantitative structure-activity relationship (4D-QSAR) molecular model was proposed. The data required to perform the 4D-QSAR analysis include 52 compounds reported in the literature, usually analogs, and their measured biological activities in a common assay. The model was generated with a complete 4D-QSAR program written by our group. The best model was selected after multiple trials. It had good predictive ability, with a cross-validation correlation coefficient Q2 = 0.77, an internal validation correlation coefficient R2 = 0.954, and an external validation correlation coefficient R2pred = 0.715. Molecular docking analysis was also carried out to better understand the interactions between flavonoids and the AChE target, and its results were in good agreement with the 4D-QSAR model. Based on the information provided by the 4D-QSAR model and the molecular docking analysis, ideas for optimizing the structures of flavonoids as AChE inhibitors were put forward, which may provide theoretical guidance for the research and development of new AChE inhibitors.
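The validation statistics quoted in this abstract (R2 for the fitted data, Q2 for cross-validation) can be illustrated with a minimal sketch. It assumes a single-descriptor least-squares line and leave-one-out cross-validation; the abstract does not specify the cross-validation scheme or the regression method of the 4D-QSAR program, and the function names and toy values below are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def r_squared(y_true, y_pred):
    """1 - SS_res / SS_tot, with SS_tot taken around the mean of y_true."""
    my = mean(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - my) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot

def q_squared_loo(x, y):
    """Leave-one-out Q2: each point is predicted by a model fitted without it."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    return r_squared(y, preds)

# Hypothetical toy data: one descriptor value and one activity per compound.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1]
slope, intercept = fit_line(descriptor, activity)
r2 = r_squared(activity, [slope * d + intercept for d in descriptor])
q2 = q_squared_loo(descriptor, activity)
```

With an ordinary least-squares fit, the leave-one-out Q2 computed this way cannot exceed R2, which is consistent with the reported Q2 = 0.77 being lower than R2 = 0.954.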


Subject(s)
Cholinesterase Inhibitors/chemistry , Flavonoids/chemistry , Models, Molecular , Quantitative Structure-Activity Relationship
8.
BMC Bioinformatics ; 21(1): 468, 2020 Oct 20.
Article in English | MEDLINE | ID: mdl-33081690

ABSTRACT

BACKGROUND: Current taxonomic classification tools use exact string matching algorithms that are effective for data from next-generation sequencing technologies. However, the unique error patterns of third-generation sequencing (TGS) technologies can reduce the accuracy of these programs. RESULTS: We developed a Classification tool using Discriminative K-mers and Approximate Matching algorithm (CDKAM). The approximate matching method is used for searching k-mers and includes two phases: a quick mapping phase and a dynamic programming phase. Simulated datasets as well as real TGS datasets were tested to compare the performance of CDKAM with existing methods. We showed that CDKAM performed better in many aspects, especially when classifying TGS data with an average length of 1000-1500 bases. CONCLUSIONS: CDKAM is an effective program with higher accuracy and lower memory requirements for TGS metagenome sequence classification. It achieves high species-level accuracy.
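The two-phase idea described in this abstract (a quick lookup followed by dynamic programming) can be sketched as follows. This is an illustration only, not CDKAM's actual algorithm: the exact-match dictionary lookup, the linear scan over reference k-mers, the edit-distance threshold, and the majority vote are all assumptions, and every name below is hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # match or substitution
        prev = cur
    return prev[-1]

def classify_kmer(kmer, index, max_dist=1):
    """Phase 1: exact lookup. Phase 2: nearest reference k-mer within max_dist."""
    if kmer in index:
        return index[kmer]
    best_taxon, best_dist = None, max_dist + 1
    for ref, taxon in index.items():
        d = edit_distance(kmer, ref)
        if d < best_dist:
            best_taxon, best_dist = taxon, d
    return best_taxon

def classify_read(read, index, k, max_dist=1):
    """Assign the read to the taxon supported by the most k-mers, if any."""
    votes = {}
    for i in range(len(read) - k + 1):
        taxon = classify_kmer(read[i:i + k], index, max_dist)
        if taxon is not None:
            votes[taxon] = votes.get(taxon, 0) + 1
    return max(votes, key=votes.get) if votes else None

# Hypothetical index mapping discriminative k-mers to taxa.
toy_index = {"ACGTA": "taxA", "TTGCC": "taxB"}
```

The linear scan in the second phase is quadratic in practice and only suitable for a toy index; a real tool would need a faster candidate search before running the dynamic programming step.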


Subject(s)
Algorithms , High-Throughput Nucleotide Sequencing/methods , Humans
9.
BMC Bioinformatics ; 20(1): 352, 2019 Jun 21.
Article in English | MEDLINE | ID: mdl-31226925

ABSTRACT

BACKGROUND: Third-generation sequencing platforms, such as PacBio sequencing, have been developed rapidly in recent years. PacBio sequencing generates much longer reads than the second-generation sequencing (or the next generation sequencing, NGS) technologies and it has unique sequencing error patterns. An effective read simulator is essential to evaluate and promote the development of new bioinformatics tools for PacBio sequencing data analysis. RESULTS: We developed a new PacBio Sequencing Simulator (PaSS). It can learn sequence patterns from PacBio sequencing data currently available. In addition to the distribution of read lengths and error rates, we included a context-specific sequencing error model. Compared to existing PacBio sequencing simulators such as PBSIM, LongISLND and NPBSS, PaSS performed better in many aspects. Assembly tests also suggest that reads simulated by PaSS are the most similar to experimental sequencing data. CONCLUSION: PaSS is an effective sequence simulator for PacBio sequencing. It will facilitate the evaluation and development of new analysis tools for the third-generation sequencing data.
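As a rough illustration of what a long-read simulator does, the sketch below draws a read from a reference and applies substitution, insertion, and deletion errors at fixed rates. It is a deliberately simplified, context-free model and not the PaSS implementation: PaSS learns read-length and error distributions from real data and uses a context-specific error model, none of which is reproduced here. All names and rates are hypothetical.

```python
import random

def simulate_read(reference, read_len, sub_rate, ins_rate, del_rate, rng):
    """Return (start, read): a noisy copy of reference[start:start + read_len]."""
    start = rng.randrange(len(reference) - read_len + 1)
    template = reference[start:start + read_len]
    out = []
    for base in template:
        r = rng.random()
        if r < del_rate:
            continue  # deletion: the template base is skipped
        if r < del_rate + sub_rate:
            # substitution: emit any base other than the true one
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
        if rng.random() < ins_rate:
            out.append(rng.choice("ACGT"))  # insertion after this position
    return start, "".join(out)

# Hypothetical usage with a seeded generator for reproducibility.
rng = random.Random(0)
toy_reference = "ACGT" * 25
start, read = simulate_read(toy_reference, 20, 0.01, 0.05, 0.03, rng)
```

Passing an explicit `random.Random` instance keeps runs reproducible, which matters when simulated reads are used to benchmark other tools.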


Subject(s)
High-Throughput Nucleotide Sequencing/methods , Sequence Analysis, DNA , Software , Animals , Arabidopsis/genetics , Caenorhabditis elegans/genetics , Computer Simulation , Escherichia coli/genetics
10.
BMC Genomics ; 20(1): 595, 2019 Jul 19.
Article in English | MEDLINE | ID: mdl-31324156

ABSTRACT

BACKGROUND: Diversity-generating retroelements (DGRs) are a unique family of retroelements that generate DNA sequence diversity to benefit their hosts by introducing variations and accelerating the evolution of target proteins. They exist widely in bacteria, archaea, phages and plasmids. However, our understanding of DGRs in natural environments is still very limited. RESULTS: We developed an efficient computational algorithm to identify DGRs, and applied it to characterize DGRs in more than 80,000 sequenced bacterial genomes as well as more than 4,000 human metagenome datasets. In total, we identified 948 non-redundant DGRs, which expanded the number of known DGRs in bacterial genomes and human microbiomes by about 55%, and provided a much more comprehensive reference for the study of DGRs. Phylogenetic analysis was done for the identified DGRs. The putative target genes of DGRs were searched, and the functions of these target genes were investigated with a comprehensive alignment against the nr database. CONCLUSIONS: The DGR system is a powerful and universal mechanism to generate diversity. DGR evolution is closely associated with the living environment and the cassette structures of DGRs. Furthermore, DGRs may impact a wide range of functional processes in addition to receptor binding. These results significantly improve our understanding of DGRs.


Subject(s)
Evolution, Molecular , Genetic Variation , Genomics , Metagenome/genetics , Retroelements/genetics , Algorithms , Bacteria/genetics , Humans , Microbiota/genetics
12.
Nucleic Acids Res ; 45(2): 597-605, 2017 01 25.
Article in English | MEDLINE | ID: mdl-27940610

ABSTRACT

A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k.
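The definition given in this abstract, a pan-genome as the union of the gene sets of all individuals with presence/absence variations (PAVs) among them, can be sketched with plain sets. The accession and gene names are hypothetical, and the split into core and distributed genes is a common convention added for illustration, not something stated in the abstract.

```python
from typing import Dict, Set, Tuple

def pan_genome(gene_sets: Dict[str, Set[str]]) -> Set[str]:
    """Union of the gene sets of all accessions."""
    pan: Set[str] = set()
    for genes in gene_sets.values():
        pan |= genes
    return pan

def pav_matrix(gene_sets: Dict[str, Set[str]]) -> Dict[str, Dict[str, int]]:
    """Map gene -> accession -> 1 if the gene is present, 0 if absent."""
    return {
        gene: {acc: int(gene in genes) for acc, genes in gene_sets.items()}
        for gene in sorted(pan_genome(gene_sets))
    }

def core_and_distributed(gene_sets: Dict[str, Set[str]]) -> Tuple[Set[str], Set[str]]:
    """Core genes occur in every accession; distributed genes in only some."""
    pan = pan_genome(gene_sets)
    core = set.intersection(*gene_sets.values()) if gene_sets else set()
    return core, pan - core

# Hypothetical toy input: three accessions and their annotated genes.
toy = {
    "acc1": {"g1", "g2", "g3"},
    "acc2": {"g1", "g3", "g4"},
    "acc3": {"g1", "g2"},
}
core, distributed = core_and_distributed(toy)
```

In real projects presence or absence is usually inferred from read coverage of each gene rather than from ready-made gene lists, so the input sets here stand in for the output of that step.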


Subject(s)
Genome, Plant , Genomics , Oryza/genetics , Software , Computational Biology/methods , Databases, Genetic , Genomics/methods , Molecular Sequence Annotation , Web Browser
13.
Bioinformatics ; 33(15): 2408-2409, 2017 Aug 01.
Article in English | MEDLINE | ID: mdl-28369371

ABSTRACT

SUMMARY: Pan-genome analyses are routinely carried out for bacteria to interpret the within-species gene presence/absence variations (PAVs). However, pan-genome analyses are rare for eukaryotes due to the large sizes and higher complexities of their genomes. Here we propose EUPAN, a eukaryotic pan-genome analysis toolkit, enabling automatic large-scale eukaryotic pan-genome analyses and detection of gene PAVs at a relatively low sequencing depth. In previous studies, we demonstrated the effectiveness and high accuracy of EUPAN in the pan-genome analysis of 453 rice genomes, in which we also revealed widespread gene PAVs among individual rice genomes. Moreover, EUPAN can be directly applied to current re-sequencing projects that primarily focus on single nucleotide polymorphisms. AVAILABILITY AND IMPLEMENTATION: EUPAN is implemented in Perl, R and C++. It is supported under Linux and is best run on a computer cluster with an LSF or SLURM job scheduling system. EUPAN together with its standard operating procedure (SOP) is freely available for non-commercial use (CC BY-NC 4.0) at http://cgm.sjtu.edu.cn/eupan/index.html . CONTACT: ccwei@sjtu.edu.cn or jianxin.shi@sjtu.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Eukaryota/genetics , Genetics, Population/methods , Genome , Sequence Analysis, DNA/methods , Software , Genomics/methods , Genotyping Techniques/methods , High-Throughput Nucleotide Sequencing , Polymorphism, Single Nucleotide
14.
Parasitol Res ; 117(5): 1549-1558, 2018 May.
Article in English | MEDLINE | ID: mdl-29568977

ABSTRACT

Schistosomiasis is a neglected tropical disease caused by trematodes of the genus Schistosoma. Successful reproductive development is critical for the production of eggs, which are responsible for host pathology and disease dissemination. Endogenous small non-coding RNAs play important roles in many biological processes such as protection against foreign pathogens, cell differentiation, and chromosomal stability by regulating target gene expression at the transcriptional and post-transcriptional levels. In this study, we performed in silico analysis of endogenous small non-coding RNAs in different stages and sexes of S. japonicum, focusing on endogenous small interfering RNAs (endo-siRNAs) generated from transposable elements (TEs) and natural antisense transcripts (NATs). Both total and unique siRNA populations ranged from 18 to 30 nt in length, but the predominant size was 20 nt and the most common first base was adenosine. Sense TE-derived endo-siRNA reads were more abundant than antisense reads at different relative positions of TEs, whereas no such difference was observed for NAT-derived endo-siRNAs. TE- and NAT-derived endo-siRNAs were more enriched in male than in female worms, with higher relative expression in the early phase of pairing. More putative targets of endo-siRNAs were found in males (106 and 66) than in females (6 and 23) for TE- and NAT-derived endo-siRNAs, respectively. Our preliminary study points to a vital role of endo-siRNAs during the reproductive development of S. japonicum and provides clues for putative novel targets to suppress worm reproduction and directions for effective anti-schistosomal drug development.
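The read-level summary reported in this abstract (size range, predominant length, and leading base) corresponds to a simple tally over small-RNA reads. The sketch below is a minimal illustration assuming reads are already adapter-trimmed strings; the function name, the size window defaults, and the toy reads are hypothetical and do not come from the study.

```python
from collections import Counter

def length_and_first_base(reads, min_len=18, max_len=30):
    """Tally read lengths and first bases for reads within the size window."""
    lengths, first_bases = Counter(), Counter()
    for read in reads:
        read = read.upper()
        if min_len <= len(read) <= max_len:
            lengths[len(read)] += 1
            first_bases[read[0]] += 1
    return lengths, first_bases

# Hypothetical toy reads; the 3-nt and 31-nt reads fall outside the window.
toy_reads = ["A" * 20, "a" * 20, "U" + "G" * 21, "ACG", "G" * 31]
lengths, first_bases = length_and_first_base(toy_reads)
predominant_length = lengths.most_common(1)[0][0]
leading_base = first_bases.most_common(1)[0][0]
```

Counting unique sequences instead of total reads only requires passing `set(reads)` to the same function.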


Subject(s)
DNA Transposable Elements/genetics , RNA, Small Interfering/genetics , Schistosoma japonicum/genetics , Animals , Computer Simulation , Female , Humans , Male , Schistosomiasis/parasitology , Schistosomiasis/pathology , Schistosomiasis/transmission
15.
BMC Genomics ; 18(1): 274, 2017 04 04.
Article in English | MEDLINE | ID: mdl-28376762

ABSTRACT

BACKGROUND: A fundamental concept in biology is that heritable material is passed from parents to offspring, a process called vertical gene transfer. An alternative mechanism of gene acquisition is horizontal gene transfer (HGT), which involves movement of genetic material between different species. Horizontal gene transfer has been found to be prevalent in prokaryotes but very rare in eukaryotes. In this paper, we investigate horizontal gene transfer in the human genome. RESULTS: From the pair-wise alignments between the human genome and 53 vertebrate genomes, 1,467 human genome regions (2.6 M bases) from all chromosomes were found to be more conserved with non-mammals than with most mammals. These human genome regions involve 642 known genes, which are enriched in ion-binding functions. These regions overlapped little with known horizontal gene transfer regions in the human genome, which suggests that horizontal gene transfer is more common than expected in the human genome. CONCLUSIONS: Horizontal gene transfer impacts hundreds of human genes, and this study provides insight into potential mechanisms of HGT in the human genome.


Subject(s)
Gene Transfer, Horizontal , Genome, Human , Base Composition , Chromosomes, Human/genetics , Computational Biology , Evolution, Molecular , Humans , Phylogeny , Sequence Analysis, DNA
16.
Genome Res ; 24(8): 1308-15, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24721644

ABSTRACT

The hypoxic environment imposes severe selective pressure on species living at high altitude. To understand the genetic bases of adaptation to high altitude in dogs, we performed whole-genome sequencing of 60 dogs including five breeds living at continuous altitudes along the Tibetan Plateau from 800 to 5100 m as well as one European breed. More than 150× sequencing coverage for each breed provides us with a comprehensive assessment of the genetic polymorphisms of the dogs, including Tibetan Mastiffs. Comparison of the breeds from different altitudes reveals strong signals of population differentiation at the locus of hypoxia-related genes including endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1) and beta hemoglobin cluster. Notably, four novel nonsynonymous mutations specific to high-altitude dogs are identified at EPAS1, one of which occurred at a quite conserved site in the PAS domain. The association testing between EPAS1 genotypes and blood-related phenotypes on additional high-altitude dogs reveals that the homozygous mutation is associated with decreased blood flow resistance, which may help to improve hemorheologic fitness. Interestingly, EPAS1 was also identified as a selective target in Tibetan highlanders, though no amino acid changes were found. Thus, our results not only indicate parallel evolution of humans and dogs in adaptation to high-altitude hypoxia, but also provide a new opportunity to study the role of EPAS1 in the adaptive processes.


Subject(s)
Adaptation, Physiological/genetics , Dogs/genetics , Altitude , Amino Acid Sequence , Animals , Base Sequence , Basic Helix-Loop-Helix Transcription Factors/genetics , Cell Hypoxia , DNA Mutational Analysis , Genome , High-Throughput Nucleotide Sequencing , Molecular Sequence Data , Polymorphism, Single Nucleotide
17.
BMC Genomics ; 16 Suppl 7: S13, 2015.
Article in English | MEDLINE | ID: mdl-26099518

ABSTRACT

BACKGROUND: Motifs are regulatory elements that will activate or inhibit the expression of related genes when proteins (such as transcription factors, TFs) bind to them. Therefore, motif finding is important to understand the mechanisms of gene regulation. De novo discovery of regulatory elements, like transcription factor binding sites (TFBSs), has long been a major challenge to gain insight on mechanisms of gene regulation. Recent advances in experimental profiling of genome-wide signals such as histone modifications and DNase I hypersensitivity sites allow scientists to develop better computational methods to enhance motif discovery. However, existing methods for motif finding suffer from high false positive rates and slow speed, and it's difficult to evaluate the performance of these methods systematically. RESULT: Here we present MOST+, a motif finder integrating genomic sequences and genome-wide signals such as intensity and shape features from histone modification marks and DNase I hypersensitivity sites, to improve the prediction accuracy. MOST+ can detect motifs from a large input sequence of about 100 Mbs within a few minutes. Systematic comparison method has been established and MOST+ has been compared with existing methods. CONCLUSION: MOST+ is a fast and accurate de novo method for motif finding by integrating genomic sequence and experimental signals as clues.


Subject(s)
Algorithms , Computational Biology/methods , Regulatory Elements, Transcriptional , Animals , Databases, Genetic , Epigenesis, Genetic , Gene Expression Regulation , Genome , Humans , Mice
19.
Med Chem ; 20(2): 140-152, 2024.
Article in English | MEDLINE | ID: mdl-37957859

ABSTRACT

BACKGROUND: The epidermal growth factor receptor (EGFR) protein has been intensively studied as a therapeutic target for non-small cell lung cancer (NSCLC). The aminobenzimidazole derivatives as the fourth-generation EGFR inhibitors have achieved promising results and overcame EGFR mutations at C797S, del19 and T790M in NSCLC. OBJECTIVE: In order to understand the quantitative structure-activity relationship (QSAR) of aminobenzimidazole derivatives as EGFRdel19 T790M C797S inhibitors, the four-dimensional QSAR (4D-QSAR) and multivariate image analysis (MIA-QSAR) have been performed on the data of 45 known aminobenzimidazole derivatives. METHODS: The 4D-QSAR descriptors were acquired by calculating the association energies between probes and aligned conformational ensemble profiles (CEP), and the regression models were established by partial least squares (PLS). In order to further understand and verify the 4D-QSAR model, MIA-QSAR was constructed by using chemical structure pictures to generate descriptors and PLS regression. Furthermore, the molecular docking and averaged noncovalent interactions (aNCI) analysis were also performed to further understand the interactions between ligands and the EGFR targets, which was in good agreement with the 4D-QSAR model. RESULTS: The established 4D-QSAR and MIA-QSAR models have strong stability and good external prediction ability. CONCLUSION: These results will provide theoretical guidance for the research and development of aminobenzimidazole derivatives as new EGFRdel19 T790M C797S inhibitors.


Subject(s)
Carcinoma, Non-Small-Cell Lung , Lung Neoplasms , Humans , Quantitative Structure-Activity Relationship , Molecular Docking Simulation , ErbB Receptors/genetics , Protein Kinase Inhibitors/pharmacology , Protein Kinase Inhibitors/chemistry , Mutation , Drug Resistance, Neoplasm
20.
Comput Biol Chem ; 112: 108131, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38968781

ABSTRACT

Human glutaminyl cyclase (hQC) inhibitors have great potential as anti-Alzheimer's disease (AD) agents because they reduce the toxic pyroform of β-amyloid in the brains of AD patients. A four-dimensional quantitative structure-activity relationship (4D-QSAR) model of N-substituted urea/thioureas was established with satisfactory predictive ability and statistical reliability (Q2 = 0.521, R2 = 0.933, R2prep = 0.619). Using the developed 4D-QSAR model, a set of new N-substituted urea/thioureas was designed and evaluated for their Absorption, Distribution, Metabolism, Excretion and Toxicity (ADMET) properties. The results of molecular dynamics (MD) simulations, principal component analysis (PCA), free energy landscape (FEL), dynamic cross-correlation matrix (DCCM) and molecular mechanics generalized Born Poisson-Boltzmann surface area (MM-PBSA) free energy calculations revealed that the designed compounds remained stable in the protein binding pocket and that compounds b ∼ f (-35.1 to -44.55 kcal/mol) showed more favorable (more negative) binding free energies than compound 14 (-33.51 kcal/mol). The findings of this work provide a theoretical foundation for further research and experimental validation of urea/thiourea derivatives as hQC inhibitors.
