Results 1 - 13 of 13
1.
Methods ; 229: 125-132, 2024 Sep.
Article in English | MEDLINE | ID: mdl-38964595

ABSTRACT

DNase I hypersensitive sites (DHSs) are chromatin regions highly sensitive to DNase I enzymes. Studying DHSs is crucial for understanding complex transcriptional regulation mechanisms and localizing cis-regulatory elements (CREs). Numerous studies have indicated that disease-related loci are often enriched in DHS regions, underscoring the importance of identifying DHSs. Although wet-lab experiments exist for DHS identification, they are often labor-intensive. Therefore, there is a strong need to develop computational methods for this purpose. In this study, we used experimental data to construct a benchmark dataset. Seven feature extraction methods were employed to capture information about human DHSs. The F-score was applied to filter the features. After comparing the prediction performance of various classification algorithms through five-fold cross-validation, random forest was selected to build the final model. The model achieved an overall prediction accuracy of 0.859 with an AUC value of 0.837. We hope that this model can assist scholars conducting DNase research in identifying these sites.


Subject(s)
Chromatin; Deoxyribonuclease I; Genome, Human; Humans; Deoxyribonuclease I/metabolism; Deoxyribonuclease I/genetics; Deoxyribonuclease I/chemistry; Chromatin/genetics; Chromatin/metabolism; Chromatin/chemistry; Computational Biology/methods; Algorithms; Regulatory Sequences, Nucleic Acid/genetics
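A minimal sketch of the feature-filtering step mentioned in the abstract above. The abstract does not give the formula, so this assumes the two-class F-score commonly used for feature ranking: the squared distance of each class mean from the overall mean, divided by the sum of the within-class sample variances. The function names and the data layout are hypothetical, and the random forest classifier itself is not shown.

```python
from statistics import mean


def f_score(pos, neg):
    """F-score of one feature from its values in the positive and negative class.

    Each class needs at least two values so a sample variance can be computed.
    """
    m_all, m_pos, m_neg = mean(pos + neg), mean(pos), mean(neg)
    # Between-class term: how far each class mean sits from the overall mean.
    between = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
    # Within-class term: sample variances with an n - 1 denominator.
    var_pos = sum((x - m_pos) ** 2 for x in pos) / (len(pos) - 1)
    var_neg = sum((x - m_neg) ** 2 for x in neg) / (len(neg) - 1)
    within = var_pos + var_neg
    if within == 0:
        return 0.0 if between == 0 else float("inf")
    return between / within


def rank_features(x_pos, x_neg):
    """Return (feature indices sorted by decreasing F-score, list of scores)."""
    n_features = len(x_pos[0])
    scores = [
        f_score([row[j] for row in x_pos], [row[j] for row in x_neg])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores
```

Features at the front of the returned order separate the two classes best under this score; a filter would keep the top-ranked ones and pass them to the classifier.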
2.
Int J Biol Macromol ; : 134146, 2024 Jul 25.
Article in English | MEDLINE | ID: mdl-39067723

ABSTRACT

Liquid-liquid phase separation (LLPS) regulates many biological processes including RNA metabolism, chromatin rearrangement, and signal transduction. Aberrant LLPS potentially leads to serious diseases. Therefore, the identification of LLPS proteins is crucial. Traditionally, biochemistry-based methods for identifying LLPS proteins are costly, time-consuming, and laborious. In contrast, artificial intelligence-based approaches are fast and cost-effective and can be a better alternative to biochemistry-based methods. Previous research methods employed word2vec in conjunction with machine learning or deep learning algorithms. Although word2vec captures word semantics and relationships, it might not be effective in capturing features relevant to protein classification, like physicochemical properties, evolutionary relationships, or structural features. Additionally, other studies often focused on a limited set of features for model training, including planar π contact frequency, pi-pi, and β-pairing propensities. To overcome such shortcomings, this study first constructed a reliable dataset containing 1206 protein sequences, including 603 LLPS and 603 non-LLPS protein sequences. Then a computational model was proposed to efficiently identify LLPS proteins by perceiving the semantic information of protein sequences directly, using an ESM2-36 pre-trained model based on the transformer architecture in conjunction with a convolutional neural network. The model achieved accuracies of 85.86% and 89.26% on the training and test data, respectively, surpassing the accuracy of previous studies. The performance demonstrates the potential of our computational methods as efficient alternatives for identifying LLPS proteins.

3.
PLoS Comput Biol ; 19(6): e1011218, 2023 06.
Article in English | MEDLINE | ID: mdl-37289843

ABSTRACT

Synthetic lethality (SL) occurs when mutations in two genes together lead to cell or organism death, while a single mutation in either gene does not have a significant impact. This concept can also be extended to three or more genes. Computational and experimental methods have been developed to predict and verify SL gene pairs, especially for yeast and Escherichia coli. However, there is currently no specialized platform that collects microbial SL gene pairs. Therefore, we designed a synthetic interaction database for microbial genetics that collects 13,313 SL and 2,994 Synthetic Rescue (SR) gene pairs reported in the literature, as well as 86,981 putative SL pairs obtained through a homologous transfer method in 281 bacterial genomes. Our database website provides multiple functions such as search, browse, visualization, and Blast. Based on the SL interaction data in S. cerevisiae, we revisited the issue of the essentiality of duplicated genes and observed that duplicated genes and singletons have a similar ratio of being essential when both individual essentiality and SL are considered. The Microbial Synthetic Lethal and Rescue Database (Mslar) is expected to be a useful reference resource for researchers interested in the SL and SR genes of microorganisms. Mslar is freely available to everyone on the web at http://guolab.whu.edu.cn/Mslar/.


Subject(s)
Neoplasms; Saccharomyces cerevisiae; Humans; Saccharomyces cerevisiae/genetics; Synthetic Lethal Mutations; Mutation; Genome, Bacterial/genetics; Databases, Genetic; Neoplasms/genetics
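The homologous transfer step mentioned in the abstract above can be illustrated roughly as follows. This is only a sketch under assumptions: the abstract does not describe how homologs are detected or filtered, so the homolog map is taken as given, and the function name and gene identifiers are hypothetical.

```python
def transfer_sl_pairs(reference_pairs, homolog_map):
    """Map known SL pairs from a reference organism onto a target genome.

    reference_pairs: iterable of (gene_a, gene_b) SL pairs in the reference.
    homolog_map: dict from a reference gene to its homolog in the target genome.
    A pair is transferred only when both genes have a homolog and the two
    homologs are different genes.
    """
    putative = set()
    for gene_a, gene_b in reference_pairs:
        target_a = homolog_map.get(gene_a)
        target_b = homolog_map.get(gene_b)
        if target_a is None or target_b is None or target_a == target_b:
            continue
        # Sort each pair so (x, y) and (y, x) are stored only once.
        putative.add(tuple(sorted((target_a, target_b))))
    return putative
```

Pairs produced this way are predictions rather than observations, which matches the abstract's wording of "putative" SL pairs.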
4.
ACS Omega ; 7(27): 23555-23565, 2022 Jul 12.
Article in English | MEDLINE | ID: mdl-35847330

ABSTRACT

Hexavalent chromium (Cr(VI)) pollution is a global problem, and the reduction of highly toxic Cr(VI) to less toxic Cr(III) is considered to be an effective method to address Cr(VI) pollution. In this study, low-toxicity carbon quantum dots (CQDs) were used to reduce Cr(VI) in wastewater. The results show that CQDs can directly reduce Cr(VI) at pH 2 and can achieve a reduction efficiency of 94% within 120 min. At pH values above 2, CQDs can activate peroxymonosulfate (PMS) to produce reactive oxygen species (ROS) for the reduction of Cr(VI), and the reduction efficiency can reach 99% within 120 min even under neutral conditions. The investigation of the mechanism shows that the hydroxyl groups on the surface of CQDs can be directly oxidized by Cr(VI) because of the higher redox potential of Cr(VI) at pH 2. As the pH increases, the carbonyl groups on the surface of CQDs can activate PMS to generate ROS, O₂•⁻, and ¹O₂, which result in Cr(VI) being reduced. To facilitate the practical application of CQDs, the treatment of Cr(VI) in real water samples by CQDs was simulated, and the method reduced Cr(VI) from an initial concentration of 5 mg/L to only 8 µg/L in 150 min, which is below the California water quality standard of 10 µg/L. The study provides a new method for the removal of Cr(VI) from wastewater and a theoretical basis for practical application.
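As a quick check of the real-water figures quoted in the abstract above, the removal can be expressed as a percentage once both concentrations share a unit. The function name is hypothetical; the only inputs are the 5 mg/L, 8 µg/L and 10 µg/L values stated in the abstract.

```python
def reduction_efficiency(c_initial_ug_per_l, c_final_ug_per_l):
    """Percentage of Cr(VI) removed, with both concentrations in µg/L."""
    return 100.0 * (c_initial_ug_per_l - c_final_ug_per_l) / c_initial_ug_per_l


initial = 5 * 1000  # 5 mg/L expressed in µg/L
final = 8           # µg/L remaining after 150 min
efficiency = reduction_efficiency(initial, final)

# Threshold of 10 µg/L as quoted in the abstract.
meets_standard = final < 10
```

Under these numbers the removal works out to 99.84%, consistent with the roughly 99% efficiency the abstract reports for the PMS-assisted route.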

5.
Front Microbiol ; 13: 847325, 2022.
Article in English | MEDLINE | ID: mdl-35602045

ABSTRACT

If a stop codon appears within a gene, translation will be terminated earlier than expected. Misfolding of the prematurely terminated protein is harmful to the host; hence, all functional genes would tend to avoid intragenic stop codons. Therefore, we hypothesize that nucleotides corresponding to stop codons will be less frequent at each codon position of genes. Here, we validate this inference by investigating nucleotide frequency at a large scale, and results from 19,911 prokaryote genomes revealed that nucleotides coinciding with stop codons indeed have the lowest frequency in most genomes. Interestingly, genes with any of the three types of stop codons tend to follow a T-G-A deficiency pattern, suggesting that the pressure to avoid intragenic termination is the same and that the major stop codon TGA plays a dominant role in this effect. Finally, a positive correlation between the extent of TGA deficiency and the base length was observed in Escherichia coli (E. coli) genes with experimentally verified start sites. This further supports our hypothesis. The observed T-G-A deficiency pattern would help to understand the evolution of codon usage strategies in extant organisms.
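The measurement behind this abstract, nucleotide frequency at each codon position, can be sketched as below. This is an illustration under assumptions, not the authors' pipeline: it takes a single in-frame coding sequence, drops the terminating codon so only intragenic codons are counted, and the function name is hypothetical. Reading the T-G-A pattern as T at position 1, G at position 2 and A at position 3 is also an interpretation of the abstract.

```python
from collections import Counter


def codon_position_frequencies(cds):
    """Return three dicts holding the A/C/G/T frequencies at codon positions 1-3.

    The last complete codon (assumed to be the stop codon) is excluded, and a
    trailing partial codon is ignored.
    """
    cds = cds.upper()
    complete = len(cds) - len(cds) % 3
    body = cds[: max(complete - 3, 0)]
    freqs = []
    for pos in range(3):
        counts = Counter(body[pos::3])
        total = sum(counts[b] for b in "ACGT")
        freqs.append({b: (counts[b] / total if total else 0.0) for b in "ACGT"})
    return freqs
```

For one gene, `freqs[0]["T"]`, `freqs[1]["G"]` and `freqs[2]["A"]` are the three values the deficiency hypothesis is concerned with; a genome-level figure would pool the counts over all genes.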

6.
Interdiscip Sci ; 14(2): 349-357, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34817803

ABSTRACT

In 2002, our research group observed a gene clustering pattern based on the base frequency of A versus T at the second codon position in the genome of Vibrio cholerae and found that the functional category distribution of genes in the two clusters was different. With the availability of a large number of sequenced genomes, we performed a systematic investigation of A2-T2 distribution and found that 2694 out of 2764 prokaryotic genomes have an optimal clustering number of two, indicating a consistent pattern. Analysis of the functional categories of the coding genes in each cluster in 1483 prokaryotic genomes indicated that 99.33% of the genomes exhibited a significant difference (p < 0.01) in function distribution between the two clusters. Specifically, functional category P was overrepresented in the smaller cluster of 98.65% of genomes, whereas categories J, K, and L were overrepresented in the larger cluster of over 98.52% of genomes. Lineage analysis uncovered that these preferences appear consistently across all phyla. Overall, our work revealed an almost universal clustering pattern based on the relative frequency of A2 versus T2 and its role in functional category preference. These findings will promote the understanding of the rationality of theoretical prediction of functional classes of genes from their nucleotide sequences and how protein function is determined by DNA sequence.


Subject(s)
Proteins; Base Sequence; Cluster Analysis; Codon/genetics; Proteins/genetics
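To make the two-cluster idea in the abstract above concrete, here is a small sketch that splits genes into two groups from their (A2, T2) frequencies. The abstract does not name the clustering algorithm or how the optimal number of clusters was chosen, so plain k-means with k = 2 is used here purely as a stand-in; the function names and the input points are hypothetical.

```python
def _dist2(p, c):
    """Squared Euclidean distance between two 2-D points."""
    return (p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2


def two_means(points, iterations=50):
    """Split (A2, T2) points into two clusters with k-means, k = 2.

    Centroids start at the points with the smallest and the largest A2 - T2
    difference, which keeps the sketch deterministic.
    """
    ordered = sorted(points, key=lambda p: p[0] - p[1])
    centroids = [ordered[0], ordered[-1]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each gene goes to its nearest centroid.
        new_labels = [
            0 if _dist2(p, centroids[0]) <= _dist2(p, centroids[1]) else 1
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in (0, 1):
            members = [p for p, lab in zip(points, new_labels) if lab == k]
            if members:
                centroids[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
        if new_labels == labels:
            break
        labels = new_labels
    return labels, centroids
```

With this initialisation, label 0 is the T2-rich cluster and label 1 the A2-rich cluster; comparing functional categories between the two label groups would be the next step described in the abstract.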
7.
Comput Struct Biotechnol J ; 19: 4042-4048, 2021.
Article in English | MEDLINE | ID: mdl-34527183

ABSTRACT

Studies of codon properties can deepen our understanding of the origin of primitive life and inform biotechnological applications. Here, we proposed a quantitative measurement of codon-amino acid association and found that seven out of 13 physicochemical properties have stronger associations with the nucleotide identity at the second codon position, indicating that protein structure and function may associate more closely with it than with the other two sites. When extending the effect of codon-amino acid association to the protein level, it was found that the correlation between the second codon position (measured by the relative frequencies of nucleobases T and A at this codon site) and hydrophobicity (expressed as the GRAVY value) became stronger, with 96% of genomes having R > 0.90 and p < 1e-60. Furthermore, we revealed that proteins encoded by informational genes have lower GRAVY values than operational proteins (p < 3e-37) in both prokaryotic and eukaryotic genomes. The above results reveal a complete link from codon identity (A2 versus T2) to amino acid property (hydrophilic versus hydrophobic) and then to protein functions (informational versus operational). Hence, our work may help to understand how the nucleotide sequence determines protein function.
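The GRAVY value used in the abstract above is the grand average of hydropathy: the mean Kyte-Doolittle hydropathy index over all residues of a protein. A minimal sketch follows; the function name is hypothetical, and skipping non-standard residue letters is an assumption of this sketch rather than something the abstract specifies.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(protein):
    """Mean Kyte-Doolittle hydropathy over the standard residues of a protein."""
    residues = [aa for aa in protein.upper() if aa in KYTE_DOOLITTLE]
    if not residues:
        raise ValueError("no standard amino acids in sequence")
    return sum(KYTE_DOOLITTLE[aa] for aa in residues) / len(residues)
```

Positive values indicate a more hydrophobic protein and negative values a more hydrophilic one, which is the axis the abstract relates to the T2 versus A2 frequencies.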

8.
Brief Bioinform ; 21(1): 171-181, 2020 Jan 17.
Article in English | MEDLINE | ID: mdl-30496347

ABSTRACT

Essential genes have attracted increasing attention in recent years due to the important functions of these genes in organisms. Among the methods used to identify essential genes, accurate and efficient computational methods can make up for the deficiencies of expensive and time-consuming experimental technologies. In this review, we collected studies on essential gene prediction in prokaryotes and eukaryotes and summarized the five predominant types of features used in these studies: evolutionary conservation, domain information, network topology, sequence composition and expression level. We describe how to implement the useful forms of these features and evaluate their performance based on data from Escherichia coli MG1655, Bacillus subtilis 168 and human. The prerequisites and applicable range of these features are described. In addition, we investigated the techniques used to weight features in various models. To assist researchers in the field, two freely accessible online tools that can be used directly to predict gene essentiality in prokaryotes and humans are referenced. This article provides a simple guide for the identification of essential genes in prokaryotes and eukaryotes.

9.
Genome Biol Evol ; 10(8): 2072-2085, 2018 08 01.
Article in English | MEDLINE | ID: mdl-30060177

ABSTRACT

Pandemic cholera is a major concern for public health because of its high mortality and morbidity. Mutation accumulation (MA) experiments were performed on a representative strain of the current cholera pandemic. Although the base-pair substitution mutation rates in Vibrio cholerae (1.24 × 10⁻¹⁰ per site per generation for wild-type lines and 3.29 × 10⁻⁸ for mismatch repair deficient lines) are lower than those previously reported in other bacteria using MA analysis, we discovered specific high rates (8.31 × 10⁻⁸ per site per generation for wild-type lines and 1.82 × 10⁻⁶ for mismatch repair deficient lines) of base duplication or deletion driven by large-scale copy number variations (CNVs). These duplication-deletions are located in two pathogenic islands, IMEX and the large integron island. Each element of these islands has a distinct rate of rapid integration and excision, which provides clues to the evolution of pandemicity in V. cholerae. These results also suggest that large-scale structural variants such as CNVs can accumulate rapidly during short-term evolution. Mismatch repair deficient lines exhibit a significantly increased mutation rate in the larger chromosome (Chr1) at specific regions, and this pattern is not observed in wild-type lines. We propose that the high frequency of GATC sites in Chr1 improves the efficiency of MMR, resulting in similar rates of mutation in the wild-type condition. In addition, different mutation rates and spectra were observed in the MA lines under distinct growth conditions, including minimal media, rich media and antibiotic treatments.


Subject(s)
Base Pairing/genetics; Cholera/epidemiology; Cholera/microbiology; Gene Deletion; Gene Duplication; Pandemics; Vibrio cholerae/genetics; Chromosomes, Bacterial/genetics; Culture Media; DNA Replication Timing/drug effects; Genomic Islands; Humans; Mutation Rate; Reproducibility of Results; Rifampin/pharmacology; Vibrio cholerae/drug effects
10.
Bioinformatics ; 33(12): 1758-1764, 2017 Jun 15.
Article in English | MEDLINE | ID: mdl-28158612

ABSTRACT

MOTIVATION: Previously constructed classifiers for predicting eukaryotic essential genes integrated a variety of features, including experimental ones. Obtaining satisfactory predictions from nucleotide sequence information alone would be more promising. Three groups recently identified essential genes in human cancer cell lines using wet-lab experiments, which provided an excellent opportunity to test this idea. Here we extended the Z curve method into the λ-interval form to denote nucleotide composition and association information and used it to construct an SVM classification model. RESULTS: Our model accurately predicted human gene essentiality with an AUC higher than 0.88 in both 5-fold cross-validation and jackknife tests. These results demonstrated that the essentiality of human genes could be reliably reflected by sequence information alone. We re-predicted the negative dataset with our Pheg server, and 118 genes were additionally predicted as essential. Among them, 20 were found to be homologues of mouse essential genes, indicating that some of the 118 genes are indeed essential but were overlooked by previous experiments. As the first available server, Pheg can predict essentiality for anonymous human gene sequences. It is also hoped that the λ-interval Z curve method can be effectively extended to classification issues of other DNA elements. AVAILABILITY AND IMPLEMENTATION: http://cefg.uestc.edu.cn/Pheg. CONTACT: fbguo@uestc.edu.cn. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Base Composition; Genes, Essential; Sequence Analysis, DNA/methods; Software; Animals; Eukaryota/genetics; Humans; Mice; Models, Genetic
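For orientation, the basic Z curve transform that the abstract above builds on can be sketched as follows. This shows only the classic phase-specific single-nucleotide form, which yields nine variables per coding sequence; the λ-interval extension introduced in the paper is not reproduced here because the abstract does not define it. The function name is hypothetical.

```python
def z_curve_features(seq):
    """Return 9 Z curve variables: (x, y, z) for each of the three codon positions.

    For base frequencies a, c, g, t at one codon position:
      x = (a + g) - (c + t)   purine versus pyrimidine
      y = (a + c) - (g + t)   amino versus keto
      z = (a + t) - (g + c)   weak versus strong hydrogen bonding
    """
    seq = seq.upper()
    features = []
    for phase in range(3):
        bases = [b for b in seq[phase::3] if b in "ACGT"]
        n = len(bases)
        if n == 0:
            features.extend([0.0, 0.0, 0.0])
            continue
        a, c, g, t = (bases.count(b) / n for b in "ACGT")
        features.extend([(a + g) - (c + t), (a + c) - (g + t), (a + t) - (g + c)])
    return features
```

A vector like this, computed per gene, is the kind of sequence-only input that could be handed to a classifier such as the SVM named in the abstract.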
11.
Biomed Res Int ; 2016: 7639397, 2016.
Article in English | MEDLINE | ID: mdl-27660763

ABSTRACT

Investigating essential genes is important for understanding the minimal gene set of a cell and for discovering potential drug targets. In this study, a novel approach based on multiple homology mapping and machine learning was introduced to predict essential genes. We focused on 25 bacteria that have characterized essential genes. The predictions yielded a highest area under the receiver operating characteristic (ROC) curve (AUC) of 0.9716 in a tenfold cross-validation test. Appropriate features were used to construct models for making predictions in distantly related bacteria. The accuracy of predictions was evaluated by the consistency between the predictions and the known essential genes of the target species. A highest AUC of 0.9552 and an average AUC of 0.8314 were achieved when making predictions across organisms. A recently released independent dataset from Synechococcus elongatus was obtained for further assessment of the performance of our model. The AUC score of the predictions is 0.7855, which is higher than that of other methods. This research shows that features obtained by homology mapping alone can achieve results comparable to, or even better than, those of integrated features. Meanwhile, the work indicates that machine learning-based methods can assign more effective weight coefficients than empirical formulas based on biological knowledge.
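The AUC values reported in the abstract above summarise how well prediction scores rank essential genes above non-essential ones. A minimal sketch of that metric, using its pairwise-ranking definition, is given below; it is a generic illustration with a hypothetical function name, not the evaluation code of the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for essential, 0 for non-essential.
    scores: predicted essentiality scores, higher meaning more likely essential.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The double loop is quadratic in the number of genes, which is fine for illustration but slow for genome-scale data.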

12.
Genome Biol Evol ; 8(8): 2624-31, 2016 09 03.
Article in English | MEDLINE | ID: mdl-27521813

ABSTRACT

The differences in evolutionary patterns of young protein-protein interactions (PPIs) among distinct species have long been a puzzle. However, based on our genome-wide analysis of available integrated experimental data, we confirm that young genes preferentially integrate into ancestral PPI networks, and that this pattern is consistent across all six model organisms studied, which differ widely in phenotypic complexity. We demonstrate that the level of restrictions placed on the evolution of biological networks declines with a decrease in phenotypic complexity. Compared with young PPI networks, new co-expression links have fewer evolutionary restrictions, so a young gene with a high probability of being coexpressed with other young genes emerges relatively frequently in the four simpler genomes among the six studied. However, such young-young coexpression is not favorable for a young gene evolving into a coexpression hub, so the coexpression pattern could gradually decline. To explain this apparent contradiction, we suggest that young genes that are initially peripheral to networks are temporarily coexpressed with other young genes, driving functional evolution because of low selective pressure. However, as the expression levels of genes increase and they gradually develop a greater effect on fitness, young genes start to be coexpressed more with members of ancestral networks and less with other young genes. Our findings provide new insights into the evolution of biological networks.


Subject(s)
Evolution, Molecular; Gene Regulatory Networks; Protein Interaction Maps; Animals; Archaea/genetics; Bacteria/genetics; Fungi/genetics; Genetic Fitness; Genome; Humans; Phenotype
13.
Shanghai Kou Qiang Yi Xue ; 21(6): 718-20, 2012 Dec.
Article in Chinese | MEDLINE | ID: mdl-23364564

ABSTRACT

This study compared the existing course materials for basic restorative dentistry with past materials, identified the weaknesses of the teaching mode before the reform, and explored educational reform in terms of teaching content, method and evaluation, in order to improve teaching quality.


Subject(s)
Dental Restoration, Permanent; Education, Dental; Technology, Dental; Dental Materials; Humans