Results 1 - 15 of 15
1.
Brief Bioinform ; 24(2), 2023 03 19.
Article in English | MEDLINE | ID: mdl-36882016

ABSTRACT

Precisely calling chromatin loops has profound implications for further analysis of gene regulation and disease mechanisms. Technological advances in chromatin conformation capture (3C) assays make it possible to identify chromatin loops in the genome. However, the variety of experimental protocols has resulted in different levels of bias, which require distinct methods to call true loops from the background. Although many bioinformatics tools have been developed to address this problem, there is still no dedicated introduction to loop-calling algorithms. This review provides an overview of the loop-calling tools for various 3C-based techniques. We first discuss the background biases produced by different experimental techniques and the denoising algorithms. Then, the completeness and priority of each tool are categorized and summarized according to the data source to which it applies. The summary of these works can help researchers select the most appropriate method to call loops and perform downstream analysis. In addition, this survey is also useful for bioinformatics scientists aiming to develop new loop-calling algorithms.


Subject(s)
Chromatin , Computational Biology , Computational Biology/methods , Chromatin/genetics , Chromosomes , Algorithms , Genome
2.
Brief Bioinform ; 24(2), 2023 03 19.
Article in English | MEDLINE | ID: mdl-36772998

ABSTRACT

Chronic diseases, because of their insidious onset and long latent period, have become the major global disease burden. However, current chronic disease diagnosis methods based on genetic markers or imaging analysis are difficult to deploy widely because of their high cost. This study analyzed massive data from routine blood and biochemical tests of 32 448 patients and developed a novel framework for cost-effective chronic disease prediction with high accuracy (AUC 87.32%). Based on the best-performing XGBoost algorithm, 20 classification models were further constructed for 17 types of chronic diseases, including 9 types of cancers, 5 types of cardiovascular diseases and 3 types of mental illness. The highest accuracy of the model was 90.13% for cardia cancer, and the lowest was 76.38% for rectal cancer. The model interpretation with the SHAP algorithm showed that CREA, R-CV, GLU and NEUT% might be important indices for identifying most chronic diseases. PDW and R-CV were also found to be crucial indices in classifying the three types of chronic diseases (cardiovascular disease, cancer and mental illness). In addition, R-CV has a higher specificity for cancer, ALP for cardiovascular disease and GLU for mental illness. The association between chronic diseases was further revealed. Finally, we built a user-friendly explainable machine-learning-based clinical decision support system (DisPioneer: http://bioinfor.imu.edu.cn/dispioneer) to assist in predicting, classifying and treating chronic diseases. This cost-effective work with simple blood tests will benefit more people and motivate clinical implementation and further investigation of chronic disease prevention and surveillance programs.


Subject(s)
Cardiovascular Diseases , Mental Disorders , Humans , Cardiovascular Diseases/diagnosis , Cardiovascular Diseases/genetics , Cost-Benefit Analysis , Chronic Disease , Algorithms
3.
Brief Bioinform ; 25(1), 2023 11 22.
Article in English | MEDLINE | ID: mdl-38040491

ABSTRACT

Pancreatic cancer is a globally recognized highly aggressive malignancy, posing a significant threat to human health and characterized by pronounced heterogeneity. In recent years, researchers have uncovered that the development and progression of cancer are often attributed to the accumulation of somatic mutations within cells. However, cancer somatic mutation data exhibit characteristics such as high dimensionality and sparsity, which pose new challenges in utilizing these data effectively. In this study, we propagated the discrete somatic mutation data of pancreatic cancer through a network propagation model based on protein-protein interaction networks. This resulted in smoothed somatic mutation profile data that incorporate protein network information. Based on this smoothed mutation profile data, we obtained the activity levels of different metabolic pathways in pancreatic cancer patients. Subsequently, using the activity levels of various metabolic pathways in cancer patients, we employed a deep clustering algorithm to establish biologically and clinically relevant metabolic subtypes of pancreatic cancer. Our study holds scientific significance in classifying pancreatic cancer based on somatic mutation data and may provide a crucial theoretical basis for the diagnosis and immunotherapy of pancreatic cancer patients.


Subject(s)
Genomics , Pancreatic Neoplasms , Humans , Prognosis , Genomics/methods , Pancreatic Neoplasms/genetics , Mutation , Cluster Analysis
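The abstract of entry 3 does not state which propagation rule was used, so the following is only a minimal sketch of one common choice, random walk with restart, to show how a sparse binary mutation vector can be smoothed over a protein-protein interaction network. The function name `propagate`, the restart weight `alpha = 0.7`, and the four-gene toy network are all assumptions made for illustration; the edges are not curated interactions.

```python
def propagate(adjacency, f0, alpha=0.7, tol=1e-8, max_iter=1000):
    """Smooth per-gene scores over an undirected network (random walk with restart).

    adjacency: dict mapping each gene to the set of its neighbours (symmetric).
    f0: dict mapping each gene to its initial score (1.0 = mutated, 0.0 = not).
    alpha: assumed weight given to network diffusion versus the initial profile.
    """
    genes = sorted(adjacency)
    degree = {g: len(adjacency[g]) for g in genes}
    f = dict(f0)
    for _ in range(max_iter):
        nxt = {}
        for g in genes:
            # Each neighbour passes on its score divided by its own degree.
            inflow = sum(f[n] / degree[n] for n in adjacency[g])
            nxt[g] = alpha * inflow + (1 - alpha) * f0[g]
        delta = max(abs(nxt[g] - f[g]) for g in genes)
        f = nxt
        if delta < tol:
            break
    return f


# Hypothetical toy network: a simple path SMAD4 - KRAS - TP53 - CDKN2A.
ppi = {
    "KRAS": {"TP53", "SMAD4"},
    "TP53": {"KRAS", "CDKN2A"},
    "SMAD4": {"KRAS"},
    "CDKN2A": {"TP53"},
}
# One patient with a single mutated gene.
mutations = {"KRAS": 1.0, "TP53": 0.0, "SMAD4": 0.0, "CDKN2A": 0.0}
smoothed = propagate(ppi, mutations)
```

After propagation, genes that were not mutated but sit close to a mutated gene receive a non-zero score, which is what makes the profile less sparse before any pathway scoring or clustering step.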
4.
Methods ; 229: 156-162, 2024 Sep.
Article in English | MEDLINE | ID: mdl-39019099

ABSTRACT

Diabetes stands as one of the most prevalent chronic diseases globally. The conventional methods for diagnosing diabetes are frequently overlooked until individuals manifest noticeable symptoms of the condition. This study aimed to address this gap by collecting comprehensive datasets, including 1000 instances of blood routine data from diabetes patients and an equivalent dataset from healthy individuals. To differentiate diabetes patients from their healthy counterparts, a computational framework was established, encompassing eXtreme Gradient Boosting (XGBoost), random forest, support vector machine, and elastic net algorithms. Notably, the XGBoost model emerged as the most effective, exhibiting superior predictive results with an area under the receiver operating characteristic curve (AUC) of 99.90% in the training set and 98.51% in the testing set. Moreover, the model showcased commendable performance during external validation, achieving an overall accuracy of 81.54%. The probability generated by the model serves as a risk score for diabetes susceptibility. Further interpretability was achieved through the utilization of the Shapley additive explanations (SHAP) algorithm, identifying pivotal indicators such as mean corpuscular hemoglobin concentration (MCHC), lymphocyte ratio (LY%), standard deviation of red blood cell distribution width (RDW-SD), and mean corpuscular hemoglobin (MCH). This enhances our understanding of the predictive mechanisms underlying diabetes. To facilitate the application in clinical and real-life settings, a nomogram was created based on the logistic regression algorithm, which can provide a preliminary assessment of the likelihood of an individual having diabetes. Overall, this research contributes valuable insights into the predictive modeling of diabetes, offering potential applications in clinical practice for more effective and timely diagnoses.


Subject(s)
Diabetes Mellitus , Machine Learning , Humans , Diabetes Mellitus/blood , Diabetes Mellitus/diagnosis , Female , Male , Support Vector Machine , Algorithms , ROC Curve , Middle Aged , Erythrocyte Indices , Adult
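Entry 4 reports model quality as the area under the ROC curve (AUC). As a hedged illustration of what that number measures, the sketch below computes AUC directly from its rank interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. The function name `roc_auc` and the six labels and scores are invented for the example and are unrelated to the study's data.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical risk scores: 1 = patient, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)  # 8 of the 9 pairs are ordered correctly
```

The pairwise loop is quadratic in the number of samples, which is fine for a demonstration; a sort-based rank computation would be the usual choice for large cohorts.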
5.
Nucleic Acids Res ; 51(D1): D924-D932, 2023 01 06.
Article in English | MEDLINE | ID: mdl-36189903

ABSTRACT

The emerging importance of embryonic development research rapidly increases the demand for a professional resource related to multi-omics data. However, the lack of a global embryogenesis repository and systematic analysis tools limits progress in stem cell research, human congenital diseases and assisted reproduction. Here, we developed EmAtlas, which collects the most comprehensive multi-omics data and provides multi-scale tools to explore spatiotemporal activation during mammalian embryogenesis. EmAtlas contains data on multiple types of gene expression, chromatin accessibility, DNA methylation, nucleosome occupancy, histone modifications, and transcription factors, which display the complete spatiotemporal landscape in mouse and human across several time points, involving gametogenesis, preimplantation, even fetus and neonate, and each tissue involves various cell types. To characterize signatures involved in the tissue, cell, genome, gene and protein levels during mammalian embryogenesis, analysis tools on these five scales were developed. Additionally, we proposed EmRanger to deliver extensive development-related biological background annotations. Through the user-friendly interface, users can utilize these tools to analyze, browse, visualize, and download data. EmAtlas is freely accessible at http://bioinfor.imu.edu.cn/ematlas.


Subject(s)
Embryo, Mammalian , Embryonic Development , Animals , Humans , Infant, Newborn , Mice , Chromatin/genetics , DNA Methylation , Embryonic Development/genetics , Genome , Mammals/genetics , Nucleosomes , Atlases as Topic
6.
Methods ; 204: 223-233, 2022 08.
Article in English | MEDLINE | ID: mdl-34999214

ABSTRACT

ABCB1 is an important gene that is closely related to analgesic tolerance to opioids and plays an important role in their postoperative treatment. Recent studies have demonstrated that ABCB1 genotype is significantly associated with chemoresistance and chemosensitivity in breast cancer patients. It has therefore become very important to investigate the role of ABCB1 in predicting drug response in breast cancer patients. In this study, by conducting Cox proportional hazards regression analysis in breast cancer patients, significant differences were found in prognosis between the ABCB1 high- and low-expression subtypes. Meanwhile, by using immune infiltration profiles as well as transcriptomics datasets, the ABCB1 high subtype was found to be significantly enriched in many immune-related KEGG pathways and biological processes, and was characterized by high infiltration levels of immune cell types. Furthermore, bioinformatics inference revealed that the ABCB1 subtypes were associated with the therapeutic effect of immunotherapy, which would be important for patient prognosis. In conclusion, these findings may help in recognizing the diversity between ABCB1 subtypes in the tumor immune microenvironment, and may unravel prognosis outcomes and immunotherapy utility for ABCB1 in breast cancer.


Subject(s)
Biological Phenomena , Breast Neoplasms , ATP Binding Cassette Transporter, Subfamily B/genetics , ATP Binding Cassette Transporter, Subfamily B/therapeutic use , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Breast Neoplasms/pathology , Female , Humans , Prognosis , Tumor Microenvironment/genetics
7.
Amino Acids ; 53(2): 239-251, 2021 Feb.
Article in English | MEDLINE | ID: mdl-33486591

ABSTRACT

Enzymes have been proven to play considerable roles in disease diagnosis and biological functions. Feature extraction that truly reflects the intrinsic properties of a protein is the most critical step for the automatic identification of enzymes. Although many feature extraction methods have been proposed, some challenges remain. In this study, we developed a predictor called IHEC_RAAC, which can identify whether a protein is a human enzyme and distinguish the function of the human enzyme. To improve the feature representation ability, protein sequences were encoded by a new feature vector called 'reduced amino acid cluster'. We calculated 673 amino acid reduction alphabets to determine the optimal feature representation scheme. The tenfold cross-validation test showed that IHEC_RAAC identified human enzymes with an accuracy of 74.66% and further discriminated the human enzyme classes with an accuracy of 54.78%, which were 2.06% and 8.68% higher than the state-of-the-art predictors, respectively. Additionally, the results from the independent dataset indicated that IHEC_RAAC can effectively predict human enzymes and human enzyme classes to further provide guidance for protein research. A user-friendly web server, IHEC_RAAC, is freely accessible at http://bioinfor.imu.edu.cn/ihecraac.


Subject(s)
Amino Acids/chemistry , Computational Biology/methods , Databases, Protein , Enzymes/chemistry , Algorithms , Humans , Online Systems , Proteins/chemistry , Software , Support Vector Machine
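Entry 7 encodes protein sequences with reduced amino acid alphabets before classification. As a rough sketch of the general idea only, the code below maps the 20 standard residues onto a smaller alphabet and then computes k-mer composition features over the reduced sequence. The six-cluster grouping in `CLUSTERS` is a hypothetical scheme chosen for illustration and is not one of the 673 alphabets evaluated in the paper; the function names are likewise assumptions.

```python
from collections import Counter
from itertools import product

# Hypothetical six-cluster reduction (aliphatic, aromatic, polar, positive,
# negative, special). The first letter of each cluster is its representative.
CLUSTERS = ["AVLIMC", "FWYH", "STNQ", "KR", "DE", "GP"]
REDUCE = {aa: cluster[0] for cluster in CLUSTERS for aa in cluster}


def reduce_sequence(seq):
    """Replace every residue by the representative letter of its cluster."""
    return "".join(REDUCE[aa] for aa in seq.upper())


def kmer_composition(seq, k=1):
    """Frequencies of all k-mers over the reduced alphabet, in a fixed order."""
    reduced = reduce_sequence(seq)
    letters = sorted(cluster[0] for cluster in CLUSTERS)
    kmers = ["".join(p) for p in product(letters, repeat=k)]
    windows = len(reduced) - k + 1
    counts = Counter(reduced[i:i + k] for i in range(windows))
    total = max(windows, 1)  # avoid division by zero for very short input
    return [counts[m] / total for m in kmers]


# With six clusters, dipeptide composition has 6 * 6 = 36 features
# instead of 20 * 20 = 400 for the full alphabet.
features = kmer_composition("MKVLDEGF", k=2)
```

Non-standard residue codes such as `X` raise a `KeyError` in this sketch; a real pipeline would need an explicit policy for them. The resulting fixed-length vector is the kind of input a support vector machine or similar classifier can consume.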
8.
Biochem Biophys Res Commun ; 503(3): 1911-1918, 2018 09 10.
Article in English | MEDLINE | ID: mdl-30064908

ABSTRACT

Lysophosphatidylcholine (LPC) is a bioactive lipid constituent of oxidized low density lipoprotein (ox-LDL). It regulates various cellular functions, including migration of circulating monocytes, expression of endothelial adhesion molecules, and proliferation and migration of vascular smooth muscle cells (VSMCs). LPC can also be hydrolyzed into lysophosphatidic acid (LPA) by autotaxin (ATX), which possesses lysophospholipase D (lyso-PLD) activity. The aim of this study was to explore the effects of LPC on proliferation and migration of human artery smooth muscle cells (HASMCs) and the involvement of the LPC-ATX-LPA pathway in these processes. In vitro, we found that LPC and LPA stimulated HASMC proliferation and migration. Knockdown of LPA1 by siRNA and inhibition of Gi protein with pertussis toxin (PTX) had the opposite effect. Silencing of LPC receptor genes did not significantly affect the LPC-induced proliferation and migration. We detected high mRNA and protein expression of ATX in HASMCs and measured lyso-PLD activity. In an atherosclerotic rabbit model, we observed a high LPC level and high lyso-PLD activity in blood, and high expression of LPA1 in aorta walls. We also found that the neointima appeared to be thickened and mRNA expression of LPA1 appeared to be increased. These results revealed that LPC was converted into LPA by ATX to induce proliferation and migration in HASMCs through the LPA1/Gi/o/MAP kinase signaling pathway. Our research suggested that the LPC-ATX-LPA system contributed to the atherogenic action induced by ox-LDL. An LPA1 antagonist may be considered as a potential therapeutic and preventative drug for cardiovascular disease.


Subject(s)
Atherosclerosis/metabolism , Lysophosphatidylcholines/metabolism , Muscle, Smooth, Vascular/metabolism , Receptors, Lysophosphatidic Acid/metabolism , Animals , Atherosclerosis/genetics , Cell Movement , Cell Proliferation , Cells, Cultured , Chromatography, Thin Layer , Humans , Muscle, Smooth, Vascular/cytology , Rabbits
9.
Mol Ther Nucleic Acids ; 34: 102044, 2023 Dec 12.
Article in English | MEDLINE | ID: mdl-37869261

ABSTRACT

Single-cell studies have demonstrated that somatic cell reprogramming is a continuous process of cell fate transition. Only some reprogramming intermediates can overcome the molecular bottlenecks to acquire pluripotency. To decipher the underlying decisive factors driving cell fate, we identified induced pluripotent stem cells or stromal-like cells (iPSCs/SLCs) and iPSCs or trophoblast-like cells (iPSCs/TLCs) fate bifurcations by reconstructing the cellular trajectory. The mesenchymal-epithelial transition and the activation of pluripotency networks are the main molecular events in successful reprogramming. Correspondingly, intermediates diverge into SLCs accompanied by the inhibition of cell cycle genes and the activation of extracellular matrix genes, whereas the TLC fate is characterized by the up-regulation of placenta development genes. Combining putative gene regulatory networks, seven (Taf7, Ezh2, Klf2, etc.) and three key factors (Cdc5l, Klf4, and Nanog) were individually identified as drivers of successful reprogramming by triggering downstream pluripotent networks during iPSCs/SLCs and iPSCs/TLCs fate bifurcation. Conversely, 11 factors (Cebpb, Sox4, Junb, etc.) and four factors (Gata2, Jund, Ctnnb1, etc.) drive the SLC fate and the TLC fate, respectively. Our study sheds new light on the decisive factors driving cell fate, which is helpful for improving reprogramming efficiency through manipulating cell fates to avoid alternative fates.

10.
J Mol Biol ; 435(14): 168117, 2023 07 15.
Article in English | MEDLINE | ID: mdl-37086947

ABSTRACT

Metal-binding proteins are essential for vital activities and carry out their roles by acting in concert with metal cations. MbPA (The Metal-binding Protein Atlas) is the most comprehensive resource to date dedicated to curating metal-binding proteins. Currently, it contains 106,373 entries and 440,187 sites related to 54 metals and 8169 species. Users can view all metal-binding proteins and species-specific proteins in MbPA. There are also metal-proteomics data that quantitatively describe protein expression in different tissues and organs. By analyzing the data of the amino acid residues at the metal-binding sites, it is found that about 80% of the metal ions tend to bind to cysteine, aspartic acid, glutamic acid, and histidine. Moreover, we use Diversity Measure to confirm that the diversity of metal binding is specific to different areas of the periodic table, and further elucidate the binding modes of 19 transition metals on 20 amino acids. In addition, MbPA also includes 6855 potential pathogenic mutations related to metalloproteins. The resource is freely available at http://bioinfor.imu.edu.cn/mbpa.


Subject(s)
Metalloproteins , Amino Acids/chemistry , Binding Sites , Cations/chemistry , Metalloproteins/chemistry , Metalloproteins/genetics , Metals/chemistry
11.
Cell Biosci ; 13(1): 41, 2023 Feb 28.
Article in English | MEDLINE | ID: mdl-36849879

ABSTRACT

BACKGROUND: The placenta, as a unique exchange organ between mother and fetus, is essential for successful human pregnancy and fetal health. Preeclampsia (PE) caused by placental dysfunction contributes to both maternal and infant morbidity and mortality. Accurate identification of PE patients plays a vital role in the formulation of treatment plans. However, the traditional clinical methods for diagnosing PE have a high misdiagnosis rate. RESULTS: Here, we first designed a computational biology method that used single-cell transcriptome (scRNA-seq) of healthy pregnancy (38 wk) and early-onset PE (28-32 wk) to identify pathological cell subpopulations and predict PE risk. Based on machine learning methods and feature selection techniques, we observed that the Tuning ReliefF (TURF) score hybrid with XGBoost (TURF_XGB) achieved optimal performance, with 92.61% accuracy and 92.46% recall for classifying nine cell subpopulations of healthy placentas. Biological landscapes of placenta heterogeneity could be mapped by the 110 marker genes screened by TURF_XGB, which revealed the superiority of the TURF feature mining. Moreover, we processed the PE dataset with LASSO to obtain 497 biomarkers. Integration analysis of the above two gene sets revealed that dendritic cells were closely associated with early-onset PE, and C1QB and C1QC might drive preeclampsia by mediating inflammation. In addition, an ensemble model-based risk stratification card was developed to classify preeclampsia patients, and its area under the receiver operating characteristic curve (AUC) could reach 0.99. For broader accessibility, we designed an accessible online web server (http://bioinfor.imu.edu.cn/placenta). CONCLUSION: Single-cell transcriptome-based preeclampsia risk assessment using an ensemble machine learning framework is a valuable asset for clinical decision-making. C1QB and C1QC may be involved in the development and progression of early-onset PE by affecting the complement and coagulation cascades pathway that mediates inflammation, which has important implications for better understanding the pathogenesis of PE.

12.
Biochim Biophys Acta Gene Regul Mech ; 1865(7): 194861, 2022 10.
Article in English | MEDLINE | ID: mdl-35998875

ABSTRACT

DNMT3A/B and TET1 play indispensable roles in regulating DNA methylation that undergoes extensive reprogramming during mammalian embryogenesis. Yet the competitive and cooperative relationships between TET1 and DNMT3A/B remain largely unknown in the human embryonic stem cells. Here, we revealed that the main DNA-binding domain of TET1 contains more positive charges by using charge reduction of amino acid alphabet, followed by DNMT3A and DNMT3B. The genome-wide binding profiles showed that TET1 prefers binding to the proximal promoters and CpG islands compared with DNMT3A/B. Moreover, the binding regions of these three transcription factors can be divided into specific and co-binding regions. And a stronger inhibitory effect of DNMT3A on TET1 demethylation was observed in co-binding regions. Furthermore, we integrated TET1 knockout data to further discuss the competitive binding patterns of TET1 and DNMT3A/B. The lack of TET1 increased the occupation of DNMT3A/B at the specific binding regions of TET1 causing focal hypermethylation. The knockout of TET1 was also accompanied by a reduction of DNMT3A/B binding in the co-binding regions, further confirming the cooperative binding function between TET1 and DNMT3A/B. In conclusion, our studies found that the competitive binding of TET1 and DNMT3A/B cooperatively shapes the global DNA methylation pattern in human embryonic stem cells.


Subject(s)
DNA (Cytosine-5-)-Methyltransferases , DNA Methylation , DNA Methyltransferase 3A , Human Embryonic Stem Cells , Mixed Function Oxygenases , Proto-Oncogene Proteins , Amino Acids/metabolism , Binding, Competitive , DNA/metabolism , DNA (Cytosine-5-)-Methyltransferases/genetics , DNA (Cytosine-5-)-Methyltransferases/metabolism , DNA Methyltransferase 3A/genetics , DNA Methyltransferase 3A/metabolism , Human Embryonic Stem Cells/metabolism , Humans , Mixed Function Oxygenases/genetics , Mixed Function Oxygenases/metabolism , Proto-Oncogene Proteins/genetics , Proto-Oncogene Proteins/metabolism , Transcription Factors/metabolism , DNA Methyltransferase 3B
13.
Brief Funct Genomics ; 21(2): 128-141, 2022 04 11.
Article in English | MEDLINE | ID: mdl-34755827

ABSTRACT

Breast cancer is a kind of malignant tumor that occurs in breast tissue, which is the most common cancer in women. Cellular metabolism is a critical determinant of the viability and function of cancer cells in tumor microenvironment. In this study, based on the gene expression profile of metabolism-related genes, the prognostic value of 20 metabolic pathways in patients with breast cancer was identified. A universal risk stratification signature that relies on 20 metabolic pathways was established and validated in training cohort, two testing cohorts and The Cancer Genome Atlas pan cancer cohort. Then, the relationship between metabolic risk score subtype, prognosis, immune infiltration level, cancer genotypes and their impact on therapeutic benefit were characterized. Results demonstrated that the patients with the low metabolic risk score subtype displayed good prognosis, high level of immune infiltration and exhibited a favorable response to neoadjuvant chemotherapy and immunotherapy. Taken together, the work presented in this study may deepen the understanding of metabolic hallmarks of breast cancer, and may provide some valuable information for personalized therapies in patients with breast cancer.


Subject(s)
Breast Neoplasms , Biomarkers, Tumor/genetics , Breast Neoplasms/genetics , Female , Gene Expression Profiling , Gene Expression Regulation, Neoplastic , Humans , Prognosis , Risk Factors , Tumor Microenvironment/genetics
14.
Comput Math Methods Med ; 2021: 5518209, 2021.
Article in English | MEDLINE | ID: mdl-33927782

ABSTRACT

Antioxidant proteins, which can prevent free radicals from damaging organisms, perform significant functions in disease control and in delaying aging. Accurate identification of antioxidant proteins has important implications for the development of new drugs and the treatment of related diseases, as they play a critical role in the control or prevention of cancer and aging-related conditions. Since experimental identification techniques are time-consuming and expensive, many computational methods have been proposed to identify antioxidant proteins. Although the accuracy of these methods is acceptable, there are still some challenges. In this study, we developed a computational model called ANPrAod to identify antioxidant proteins based on a support vector machine. To eliminate potentially redundant features and improve prediction accuracy, we calculated 673 amino acid reduction alphabets to find the optimal feature representation scheme. The final model produced an overall accuracy of 87.53% with an ROC of 0.7266 in five-fold cross-validation, which was better than the existing methods. The results on the independent dataset also demonstrated the robustness and reliability of ANPrAod, which could be a promising tool for antioxidant protein identification and contribute to hypothesis-driven experimental design.


Subject(s)
Antioxidants/chemistry , Proteins/chemistry , Algorithms , Amino Acid Sequence , Cluster Analysis , Computational Biology , Databases, Protein , Humans , Peptides/chemistry , ROC Curve , Sequence Analysis, Protein , Support Vector Machine
15.
Comb Chem High Throughput Screen ; 23(6): 536-545, 2020.
Article in English | MEDLINE | ID: mdl-32238133

ABSTRACT

BACKGROUND: As the pathogen of malaria, the malaria parasite secretes a variety of proteins for its growth and reproduction. OBJECTIVE: The identification of the secretory proteins of the malaria parasite is an important reference for the development of anti-malaria vaccines and medicines. METHODS: In this study, a computational classification method was developed to identify the secreted proteins of Plasmodium. Amino acid composition, dipeptide composition, and tripeptide composition as well as reduced amino acid alphabets were used to represent protein sequences. We then trained SVM models on each feature type and optimized the features. RESULTS: 74 types of reduced amino acid alphabets were employed to predict secretory proteins. The results showed that the accuracy improved to 91.67% with a Matthews correlation coefficient (MCC) of 0.84 using dipeptide composition, and the highest prediction accuracy reached 92.26% after feature selection, which demonstrated that our method is reliable for predicting malaria parasite secreted proteins. CONCLUSION: An intuitive web server, iSP-RAAC (http://bioinfor.imu.edu.cn/isppseraac), was established for the convenience of experimental scientists.


Subject(s)
Plasmodium falciparum/chemistry , Prostatic Secretory Proteins/analysis , Algorithms , Amino Acid Sequence , Amino Acids/chemistry , Animals , Databases, Protein , Sequence Analysis, Protein
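Entry 15 reports both accuracy and the Matthews correlation coefficient (MCC). The sketch below shows how the two are derived from a binary confusion matrix using the standard definitions. The counts passed in at the end are invented for illustration and do not come from the study.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; ranges from -1 to 1.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical confusion matrix for a secreted / non-secreted classifier.
acc = accuracy(tp=45, tn=40, fp=5, fn=10)
mcc = matthews_corrcoef(tp=45, tn=40, fp=5, fn=10)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix in both numerator and denominator, so it stays informative when the two classes are of unequal size.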