Results 1 - 19 of 19
1.
Comput Biol Med ; 138: 104911, 2021 11.
Article in English | MEDLINE | ID: mdl-34634637

ABSTRACT

Transcriptomics and metabolomics data often contain missing values or outliers due to limitations of the data acquisition techniques. Most statistical methods require complete datasets for downstream analysis. A number of methods have been developed for missing value imputation using the classical mean and variance based on maximum likelihood estimators, which are not robust against outliers. Consequently, the performance of these methods deteriorates in the presence of outliers. Hence, precise imputation of missing values and handling of outliers are both important at the same time. Therefore, in this paper, we developed a robust iterative approach using robust estimators based on the minimum beta divergence method, which simultaneously imputes missing values and outliers. We investigate the performance of the proposed method in comparison with six frequently used missing value imputation methods, namely Zero, KNN, robust SVD, EM, random forest (RF) and the weighted least square approach (WLSA), through feature selection using both simulated and real datasets. Ten performance indices were used to identify the optimal method: Frobenius norm (FOBN), accuracy (ACC), sensitivity (SN), specificity (SP), positive predictive value (PPV), negative predictive value (NPV), detection rate (DR), misclassification error rate (MER), the area under the ROC curve (AUC) and computational runtime. Evaluation based on both simulated and real data suggests the superiority of the proposed method over the other traditional methods across various rates of outliers and missing values. In the absence of outliers, the suggested approach performs almost equally to the other methods. The proposed method is accurate, simple, and requires less computational time than the other methods. Therefore, our recommendation is to apply the proposed procedure for large-scale transcriptomics and metabolomics data analysis. 
The computational tool has been implemented in an R package, which is publicly available from https://CRAN.R-project.org/package=rMisbeta.
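
The abstract above does not give the estimating equations, so the following is only a minimal univariate sketch of the general idea, assuming the commonly cited Gaussian form of the minimum β-divergence estimator (weights exp(-β/2 · z²), with the variance rescaled by 1 + β). The function names, the value β = 0.2, and the weight cutoff of 0.2 are illustrative assumptions; this is not the rMisbeta API and not necessarily the authors' algorithm.

```python
import math
from statistics import median

def beta_weights(xs, mu, var, beta):
    """Gaussian beta-weights: close to 1 for typical values, close to 0 for outliers."""
    return [math.exp(-0.5 * beta * (x - mu) ** 2 / var) for x in xs]

def robust_mean_var(xs, beta=0.2, iters=50):
    """Iteratively reweighted mean and variance (assumed beta-divergence form)."""
    mu = median(xs)                                   # robust starting point
    mad = median(abs(x - mu) for x in xs)
    var = max((1.4826 * mad) ** 2, 1e-12)             # MAD-based starting variance
    for _ in range(iters):
        w = beta_weights(xs, mu, var, beta)
        total = sum(w)
        new_mu = sum(wi * x for wi, x in zip(w, xs)) / total
        weighted_ss = sum(wi * (x - mu) ** 2 for wi, x in zip(w, xs))
        var = max((1.0 + beta) * weighted_ss / total, 1e-12)
        mu = new_mu
    return mu, var

def impute_row(row, beta=0.2, cutoff=0.2):
    """Replace missing entries (None) and low-weight entries with the robust mean.

    The cutoff on the beta-weight is a hypothetical choice for this sketch.
    """
    observed = [x for x in row if x is not None]
    mu, var = robust_mean_var(observed, beta)
    cleaned = []
    for x in row:
        is_outlier = x is not None and beta_weights([x], mu, var, beta)[0] < cutoff
        cleaned.append(mu if x is None or is_outlier else x)
    return cleaned

row = [5.0, 5.2, 4.8, 5.1, 4.9, 50.0, None]   # one outlier, one missing value
print(impute_row(row))
```

In this toy row the value 50.0 receives a weight near zero, so it is treated like the missing entry and both are replaced by the robust mean of the remaining values.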


Subject(s)
Computational Biology; Transcriptome; Algorithms; Data Analysis; Least-Squares Analysis; Metabolomics; Transcriptome/genetics
2.
Inform Med Unlocked ; 25: 100702, 2021.
Article in English | MEDLINE | ID: mdl-34423108

ABSTRACT

The novel coronavirus (SARS-CoV-2) has spread rapidly and has now reached more than 150 countries worldwide. The disease it causes is referred to as COVID-19. SARS-CoV-2 mainly affects the human respiratory system and can lead to serious illness or even death in the presence of different comorbidities. However, most people infected with COVID-19 show mild to moderate symptoms, and no medication is suggested. Still, drugs for other diseases have been used to treat COVID-19. Nevertheless, the absence of vaccines and proper drugs against the COVID-19 virus has increased the mortality rate. Although sex is a risk factor for COVID-19, none of the studies considered this risk factor for identifying biomarkers from the RNA-seq count dataset. Men are more likely than women to experience severe symptoms with different comorbidities and show greater mortality. From this standpoint, we aim to identify shared gene signatures between males and females from the human COVID-19 RNA-seq count dataset of peripheral blood cells using a robust voom approach. We identified 1341 overlapping DEGs between the male and female datasets. The gene ontology (GO) annotation and pathway enrichment analysis revealed that the DEGs are involved in various BP categories such as nucleosome assembly, DNA conformation change, and DNA packaging, and in different KEGG pathways such as cell cycle, ECM-receptor interaction, and progesterone-mediated oocyte maturation. Ten hub proteins (UBC, KIAA0101, APP, CDK1, SUMO2, SP1, FN1, CDK2, E2F1, and TP53) were unveiled using PPI network analysis. The top three miRNAs (mir-17-5p, mir-20a-5p, mir-93-5p) and TFs (PPARG, E2F1 and KLF5) were uncovered. In conclusion, the top ten significant drugs (roscovitine, curcumin, simvastatin, fulvestrant, troglitazone, alvocidib, L-alanine, tamoxifen, serine, and doxorubicin) were retrieved using drug repurposing analysis of the overlapping DEGs, and these might be therapeutic agents for COVID-19.

3.
Brief Bioinform ; 22(6)2021 11 05.
Article in English | MEDLINE | ID: mdl-34260684

ABSTRACT

Coronavirus disease 2019 (COVID-19) is an infectious disease caused by the newly discovered coronavirus, SARS-CoV-2. Increased severity of COVID-19 has been observed in patients with diabetes mellitus (DM). This study aimed to identify common transcriptional signatures, regulators and pathways between COVID-19 and DM. We have integrated human whole-genome transcriptomic datasets from COVID-19 and DM, followed by functional assessment with gene ontology (GO) and pathway analyses. In peripheral blood mononuclear cells (PBMCs), among the upregulated differentially expressed genes (DEGs), 32 were found to be commonly modulated in COVID-19 and type 2 diabetes (T2D), while 10 DEGs were commonly downregulated. As regards type 1 diabetes (T1D), 21 DEGs were commonly upregulated, and 29 DEGs were commonly downregulated in COVID-19 and T1D. Moreover, 35 DEGs were commonly upregulated in SARS-CoV-2 infected pancreas organoids and T2D islets, while 14 were commonly downregulated. Several GO terms were found in common between COVID-19 and DM. Prediction of the putative transcription factors involved in the upregulation of genes in COVID-19 and DM identified RELA as implicated in both PBMCs and pancreas. Here, for the first time, we have characterized the biological processes and pathways commonly dysregulated in COVID-19 and DM, which could be used in the near future to design personalized treatment for COVID-19 patients suffering from DM as a comorbidity.


Subject(s)
COVID-19/genetics; Diabetes Mellitus/genetics; SARS-CoV-2/genetics; Transcriptome/genetics; COVID-19/pathology; COVID-19/virology; Computational Biology; Diabetes Mellitus/pathology; Gene Expression Profiling; Gene Expression Regulation/genetics; Humans; Leukocytes, Mononuclear/pathology; Leukocytes, Mononuclear/virology; Protein Interaction Maps/genetics; SARS-CoV-2/pathogenicity
4.
Saudi J Biol Sci ; 28(10): 5647-5656, 2021 Oct.
Article in English | MEDLINE | ID: mdl-34127904

ABSTRACT

COVID-19 has emerged as a global health threat. Chronic kidney disease (CKD) patients are immunocompromised and may have a high risk of infection by SARS-CoV-2. We aimed to detect common transcriptomic signatures and pathways between COVID-19 and CKD by systems biology analysis. We analyzed transcriptomic data obtained from peripheral blood mononuclear cells (PBMC) infected with SARS-CoV-2 and PBMC of CKD patients. We identified 49 differentially expressed genes (DEGs) that were common to COVID-19 and CKD. The gene ontology and pathway analysis showed that the DEGs were associated with "platelet degranulation", "regulation of wound healing", "platelet activation", "focal adhesion", "regulation of actin cytoskeleton" and "PI3K-Akt signalling pathway". The protein-protein interaction (PPI) network encoded by the common DEGs showed ten hub proteins (EPHB2, PRKAR2B, CAV1, ARHGEF12, HSP90B1, ITGA2B, BCL2L1, E2F1, TUBB1, and C3). In addition, we identified significant transcription factors and microRNAs that may regulate the common DEGs. We performed protein-drug interaction analysis and identified potential drugs, namely aspirin, estradiol, rapamycin, and nebivolol. The identified common gene signature and pathways between COVID-19 and CKD may be therapeutic targets in COVID-19 patients with CKD comorbidity.

5.
Brief Bioinform ; 22(5)2021 09 02.
Article in English | MEDLINE | ID: mdl-33839760

ABSTRACT

The current coronavirus disease 2019 (COVID-19) pandemic has caused massive loss of life. Clinical trials of vaccines and drugs are currently being conducted around the world; however, no effective drug is yet available for COVID-19. Identification of key genes and perturbed pathways in COVID-19 may uncover potential drug targets and biomarkers. We aimed to identify key gene modules and hub targets involved in COVID-19. We analyzed SARS-CoV-2 infected peripheral blood mononuclear cell (PBMC) transcriptomic data through gene coexpression analysis. We identified 1520 and 1733 differentially expressed genes (DEGs) from the GSE152418 and CRA002390 PBMC datasets, respectively (FDR < 0.05). We found four key gene modules and a hub gene signature based on module membership (MMhub) statistics and protein-protein interaction (PPI) networks (PPIhub). Functional annotation by enrichment analysis of the genes of these modules demonstrated immune and inflammatory response biological processes enriched by the DEGs. The pathway analysis revealed that the hub genes were enriched in the IL-17 signaling and cytokine-cytokine receptor interaction pathways. Then, we demonstrated the classification performance of the hub genes (PLK1, AURKB, AURKA, CDK1, CDC20, KIF11, CCNB1, KIF2C, DTL and CDC6) with accuracy >0.90, suggesting the biomarker potential of the hub genes. The regulatory network analysis showed transcription factors and microRNAs that target these hub genes. Finally, drug-gene interaction analysis suggests amsacrine, BRD-K68548958, naproxol, palbociclib and teniposide as the top-scored repurposed drugs. The identified biomarkers and pathways might be therapeutic targets for COVID-19.


Subject(s)
Brain Neoplasms/pathology; Central Nervous System Diseases/pathology; Computational Biology/methods; Glioblastoma/pathology; Machine Learning; Algorithms; Disease Progression; Humans
6.
Brain Sci ; 10(10)2020 Oct 17.
Article in English | MEDLINE | ID: mdl-33080834

ABSTRACT

BACKGROUND: Autism spectrum disorder (ASD) is a neurodevelopmental disorder with deficits in social communication ability and repetitive behavior. The pathophysiological events involved in the brain in this complex disease are still unclear. METHODS: In this study, we aimed to profile the gene expression signatures of the brain cortex of ASD patients, using two publicly available RNA-seq studies, in order to discover new ASD-related genes. RESULTS: We detected 1567 differentially expressed genes (DEGs) by meta-analysis, of which 1194 were upregulated and 373 were downregulated. Several previously reported ASD-related genes were also identified. Our meta-analysis identified 235 new DEGs that were not detected in the individual RNA-seq studies. Some of those genes, including seven DEGs (PAK1, DNAH17, DOCK8, DAPP1, PCDHAC2, ERBIN, and SLC7A7), have been confirmed in previous reports to be associated with ASD. Gene Ontology (GO) and pathway analysis showed several molecular pathways enriched by the DEGs, namely osteoclast differentiation, TNF signaling pathway, and complement and coagulation cascade. Topological analysis of the protein-protein interactions of the ASD brain cortex revealed proteomics hub gene signatures: MYC, TP53, HDAC1, CDK2, BAG3, CDKN1A, GABARAPL1, EZH2, VIM, and TRAF1. We also identified the transcription factors (TFs) regulating the DEGs, namely FOXC1, GATA2, YY1, FOXL1, USF2, NFIC, NFKB1, E2F1, TFAP2A, and HINFP. CONCLUSION: Novel core genes and molecular signatures involved in ASD were identified by our meta-analysis.

7.
Eur J Pharmacol ; 887: 173594, 2020 Nov 15.
Article in English | MEDLINE | ID: mdl-32971089

ABSTRACT

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) disease, more commonly known as COVID-19, has emerged as a world health pandemic. There are a couple of treatment methods for COVID-19; however, well-established drugs and vaccines are urgently needed to treat it. New drug discovery is a tremendous challenge; repurposing existing drugs could shorten the time and reduce the expense compared with de novo drug development. In this study, we aimed to decode the molecular signatures and pathways of the host cells in response to SARS-CoV-2 and to rapidly identify repurposable drugs using bioinformatics and network biology strategies. We have analyzed available transcriptomic RNA-seq COVID-19 data to identify differentially expressed genes (DEGs). We detected 177 DEGs specific for COVID-19, of which 122 were upregulated and 55 were downregulated compared to control (FDR<0.05 and logFC ≥ 1). The DEGs were significantly involved in the immune and inflammatory response. The pathway analysis revealed that the DEGs were found in influenza A, measles, cytokine signaling in the immune system, interleukin-4, interleukin-13, interleukin-17 signaling, and TNF signaling pathways. Protein-protein interaction analysis showed 10 hub genes (BIRC3, ICAM1, IRAK2, MAP3K8, S100A8, SOCS3, STAT5A, TNF, TNFAIP3, TNIP1). The regulatory network analysis showed significant transcription factors (TFs) that target the DEGs, namely FOXC1, GATA2, YY1, FOXL1, and NFKB1. Finally, drug repositioning analysis was performed with these 10 hub genes, and three drugs were validated in silico by molecular docking. The transcriptomic signatures, molecular pathways, and regulatory biomolecules shed light on candidate biomarkers and drug targets that have potential roles in managing COVID-19. ICAM1 and TNFAIP3 were the key hubs that demonstrated good binding affinities with the repurposed drug candidates. 
Dabrafenib, radicicol, and AT-7519 were the top-scored repurposed drugs and showed efficient docking results when tested against the hub genes. The identified drugs should be further evaluated in molecular-level wet-lab experiments prior to clinical studies in the treatment of COVID-19.
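
The DEG thresholds quoted in this abstract (FDR < 0.05 and logFC ≥ 1) can be illustrated with a short sketch. It assumes that the FDR is a Benjamini-Hochberg adjusted p-value and that the logFC cutoff applies to the absolute value, since both upregulated and downregulated genes are reported; the abstract confirms neither. The gene names and numbers below are placeholders, not data from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def select_degs(results, fdr_cutoff=0.05, logfc_cutoff=1.0):
    """Split (gene, logFC, p-value) records into up- and downregulated DEGs."""
    fdr = benjamini_hochberg([p for _, _, p in results])
    up, down = [], []
    for (gene, logfc, _), q in zip(results, fdr):
        if q < fdr_cutoff and abs(logfc) >= logfc_cutoff:
            (up if logfc > 0 else down).append(gene)
    return up, down

# Placeholder records: (gene, log2 fold change, raw p-value).
results = [("geneA", 2.1, 0.001), ("geneB", -1.5, 0.02),
           ("geneC", 0.4, 0.03), ("geneD", 3.0, 0.5)]
up, down = select_degs(results)
print(up, down)
```

Here geneC is dropped because its fold change is too small and geneD because its adjusted p-value is too large, which is the kind of two-criterion filter the abstract describes.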


Subject(s)
Coronavirus Infections/drug therapy; Coronavirus Infections/genetics; Drug Repositioning; Epithelial Cells/drug effects; Lung/cytology; Pneumonia, Viral/drug therapy; Pneumonia, Viral/genetics; Transcriptome; Antiviral Agents/therapeutic use; COVID-19; Cells, Cultured; Computational Biology; Computer Simulation; Gene Expression Regulation/genetics; Humans; Pandemics; Signal Transduction/drug effects; Signal Transduction/genetics; Transcription Factors/genetics
8.
Genomics ; 112(2): 1290-1299, 2020 03.
Article in English | MEDLINE | ID: mdl-31377428

ABSTRACT

Alzheimer's disease (AD) is a progressive neurodegenerative disease characterized by the accumulation of amyloid plaques and neurofibrillary tangles in the brain. However, there are no peripheral biomarkers available that can detect AD onset. This study aimed to identify the molecular signatures in AD through an integrative analysis of blood gene expression data. We used two microarray datasets (GSE4226 and GSE4229) comparing peripheral blood transcriptomes of AD patients and controls to identify differentially expressed genes (DEGs). Gene set and protein overrepresentation analysis, protein-protein interaction (PPI), DEGs-Transcription Factors (TFs) interactions, DEGs-microRNAs (miRNAs) interactions, protein-drug interactions, and protein subcellular localization analyses were performed on DEGs common to the datasets. We identified 25 common DEGs between the two datasets. Integration of genome-scale transcriptome datasets with biomolecular networks revealed hub genes (NOL6, ATF3, TUBB, UQCRC1, CASP2, SND1, VCAM1, BTF3, VPS37B), common transcription factors (FOXC1, GATA2, NFIC, PPARG, USF2, YY1) and miRNAs (mir-20a-5p, mir-93-5p, mir-16-5p, let-7b-5p, mir-708-5p, mir-24-3p, mir-26b-5p, mir-17-5p, mir-193-3p, mir-186-5p). Evaluation of histone modifications revealed that the hub genes possess several histone modification sites associated with AD. Protein-drug interactions revealed 10 compounds that affect the identified AD candidate biomolecules, including anti-neoplastic agents (Vinorelbine, Vincristine, Vinblastine, Epothilone D, Epothilone B, CYT997, and ZEN-012), a dermatological (Podofilox) and an immunosuppressive agent (Colchicine). The subcellular localization of the molecular signatures varied, including nuclear, plasma membrane and cytosolic proteins. In the present study, we identified blood-cell derived molecular signatures that might be useful as candidate peripheral biomarkers in AD. 
We also identified potential drugs and epigenetic data associated with these molecules that may be useful in designing therapeutic approaches to ameliorate AD.


Subject(s)
Alzheimer Disease/genetics; Protein Interaction Maps; Transcriptome; Alzheimer Disease/drug therapy; Humans; MicroRNAs/genetics; MicroRNAs/metabolism; Molecular Targeted Therapy; Neuroprotective Agents/therapeutic use; Systems Biology; Transcription Factors/genetics; Transcription Factors/metabolism
9.
Genomics ; 112(2): 2000-2010, 2020 03.
Article in English | MEDLINE | ID: mdl-31756426

ABSTRACT

BACKGROUND: Identification of differentially expressed genes (DEGs) under two or more experimental conditions is an important task for elucidating the molecular basis of phenotypic variation. In recent years, next generation sequencing (RNA-seq) has become a very attractive and competitive alternative to microarrays because of the falling cost of sequencing and the limitations of microarrays. A number of methods have been developed for detecting DEGs from RNA-seq data. Most of these methods are based on either the Poisson distribution or the negative binomial (NB) distribution. However, identification of DEGs based on read count data using skewed distributions is inflexible and complicated in the presence of outliers or extreme values. RESULTS: Most of the existing DEG selection methods produce lower accuracies and higher false discoveries in the presence of outliers. Some robust approaches, such as edgeR_robust and DEseq2, perform well in the presence of outliers for large-sample cases, but they perform poorly for small-sample cases. To address this issue, an alternative approach has emerged that transforms the RNA-seq data into microarray-like data. Among the various transformation methods, voom with the limma pipeline has proven better for RNA-seq data. However, limma with the voom transformation is sensitive to outliers in small-sample cases. Therefore, in this paper, we robustify the voom approach using the minimum β-divergence method. We demonstrate the performance of the proposed method in comparison with seven popular biomarker selection methods: DEseq, DEseq2, SAMseq, Bayseq, limma (voom), edgeR and edgeR_robust, using both simulated and real datasets. Both types of experimental results show that the proposed method outperforms the competing methods in the presence of outliers and performs almost equally to them in the absence of outliers. 
CONCLUSION: We observe the improved performance of the proposed method in simulations and in real RNA-seq count data analysis for both small- and large-sample cases in the presence of outliers. Therefore, we propose using the proposed method instead of existing methods to obtain better performance in selecting DEGs.
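
The step of turning RNA-seq counts into microarray-like values can be sketched as follows. This shows only the log-counts-per-million transformation that voom is documented to start from, log2((count + 0.5) / (library size + 1) × 10^6); it omits voom's precision weights and the robust β-divergence step the paper proposes. The function name and the toy count matrix are illustrative assumptions.

```python
import math

def voom_log_cpm(counts):
    """Convert a genes x samples count matrix to log2 counts per million.

    The offsets 0.5 and 1 keep the logarithm finite for zero counts.
    """
    n_samples = len(counts[0])
    # Library size = total reads per sample (column sums).
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    return [
        [math.log2((row[j] + 0.5) / (lib_sizes[j] + 1.0) * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]

# Toy matrix: 3 genes (rows) x 2 samples (columns).
counts = [[10, 20],
          [0, 5],
          [90, 75]]
for row in voom_log_cpm(counts):
    print([round(v, 3) for v in row])
```

After this transformation the values are continuous and roughly comparable across samples, which is what allows linear-model tools designed for microarrays to be applied.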


Subject(s)
Algorithms; Gene Expression Profiling/methods; RNA-Seq/methods; Animals; Gene Expression Profiling/standards; Humans; Mice; MicroRNAs/genetics; RNA-Seq/standards; Transcriptome
10.
Medicina (Kaunas) ; 55(6)2019 Jun 11.
Article in English | MEDLINE | ID: mdl-31212673

ABSTRACT

Background and objectives: Identification of cancer biomarkers that are differentially expressed (DE) between two biological conditions is an important task in many microarray studies. Several methods exist in the literature for this purpose, but most of them are designed for unpaired samples and are not suitable for paired samples. Furthermore, the traditional methods use p-values or fold change (FC) values to detect the DE genes. However, p-value based results sometimes do not agree with FC based results because of the smaller pooled variance of gene expressions, which occurs when the variance of each individual condition becomes smaller. Some methods combine both p-values and FC values to solve this problem, but those methods also perform poorly for small sample cases in the presence of outlying expressions. To overcome this problem, in this paper, we propose a hybrid robust SAM-FC approach, designed for paired samples, that combines the rank of FC values and the rank of p-values computed by the SAM statistic using the minimum β-divergence method. Materials and Methods: The proposed method introduces a weight function known as the β-weight function. This weight function produces larger weights for usual expressions and smaller weights for unusual expressions. The β-weight function plays a significant role in the performance of the proposed method. The proposed method uses the β-weight function as a measure for outlier detection by setting β = 0.2. We unify the classical and robust estimates using the β-weight function, such that maximum likelihood estimators (MLEs) are used in the absence of outliers and minimum β-divergence estimators are used in the presence of outliers, to obtain reasonable p-values and FC values in the proposed method. 
Results: We examined the performance of the proposed method in comparison with some popular methods (t-test, SAM, LIMMA, Wilcoxon, WAD, RP, and FCROS) using both simulated and real gene expression profiles for both small and large sample cases. From the simulation and real spike-in data analysis results, we observed that the proposed method outperforms the other methods for small sample cases in the presence of outliers and otherwise performs almost equally to the other robust methods (Wilcoxon, RP, and FCROS). From the head and neck cancer (HNC) gene expression dataset, the proposed method identified two additional genes (CYP3A4 and NOVA1) that are significantly enriched in linoleic acid metabolism, drug metabolism, steroid hormone biosynthesis and metabolic pathways. The survival analysis through Kaplan-Meier curves revealed that the combined effect of these two genes has prognostic capability and that they might be promising biomarkers of HNC. Moreover, we retrieved 12 candidate drugs based on gene interactions from GLAD4U and DrugBank literature-based gene associations. Conclusions: Using pathway analysis, disease association study, protein-protein interactions and survival analysis, we found that our proposed two additional genes might be involved in critical cancer pathways. Furthermore, the identified drugs showed statistical significance, which indicates that the proteins associated with these genes might be therapeutic targets in cancer.


Subject(s)
Biomarkers, Tumor/analysis; Diagnostic Techniques and Procedures/standards; Biomarkers, Tumor/genetics; Computer Simulation; Diagnostic Techniques and Procedures/instrumentation; Diagnostic Techniques and Procedures/statistics & numerical data; Gene Expression Profiling/instrumentation; Gene Expression Profiling/methods; Humans; Prognosis
11.
Medicina (Kaunas) ; 55(5)2019 May 22.
Article in English | MEDLINE | ID: mdl-31121943

ABSTRACT

Background and objectives: Alzheimer's disease (AD) is a progressive neurodegenerative disease that results in severe dementia. Ischemic stroke (IS) is one of the risk factors for AD, but the molecular mechanisms that underlie IS and AD are not well understood. We thus aimed to identify common molecular biomarkers and pathways in IS and AD that can help predict the progression of these diseases and provide clues to important pathological mechanisms. Materials and Methods: We analyzed the microarray gene expression datasets of IS and AD. To obtain robust results, combinatorial statistical methods were used to analyze the datasets, and 26 transcripts (22 unique genes) were identified that were abnormally expressed in both IS and AD. Results: Gene Ontology (GO) and KEGG pathway analyses indicated that these 26 common dysregulated genes identified several altered molecular pathways: alcoholism, MAPK signaling, glycine metabolism, serine metabolism, and threonine metabolism. Further protein-protein interaction (PPI) analysis revealed the pathway hub proteins PDE9A, GNAO1, DUSP16, NTRK2, PGAM2, MAG, and TXLNA. Transcriptional and post-transcriptional components were then identified, and significant transcription factors (SPIB, SMAD3, and SOX2) were found. Conclusions: Protein-drug interaction analysis revealed that PDE9A interacts with the drugs caffeine, γ-glutamyl glycine, and 3-isobutyl-1-methyl-7H-xanthine. Thus, we identified novel putative links between pathological processes in IS and AD at the transcript level, and identified possible mechanistic and gene expression links between IS and AD.


Subject(s)
Alzheimer Disease/blood; Biomarkers/blood; Brain Ischemia/blood; 3',5'-Cyclic-AMP Phosphodiesterases/analysis; 3',5'-Cyclic-AMP Phosphodiesterases/blood; Alzheimer Disease/complications; Biomarkers/analysis; Brain Ischemia/complications; Dual-Specificity Phosphatases/analysis; Dual-Specificity Phosphatases/blood; GTP-Binding Protein alpha Subunits, Gi-Go/analysis; GTP-Binding Protein alpha Subunits, Gi-Go/blood; Humans; Membrane Glycoproteins/analysis; Membrane Glycoproteins/blood; Mitogen-Activated Protein Kinase Phosphatases/analysis; Mitogen-Activated Protein Kinase Phosphatases/blood; Myelin-Associated Glycoprotein/analysis; Myelin-Associated Glycoprotein/blood; Receptor, trkB/analysis; Receptor, trkB/blood; Signal Transduction/physiology; Stroke/blood; Stroke/complications; Vesicular Transport Proteins/analysis; Vesicular Transport Proteins/blood
12.
Medicina (Kaunas) ; 55(1)2019 Jan 17.
Article in English | MEDLINE | ID: mdl-30658502

ABSTRACT

Colorectal cancer (CRC) is the second most common cause of cancer-related death in the world, but early diagnosis improves survival in CRC. This report aimed to identify molecular biomarker signatures in CRC. We analyzed two microarray datasets (GSE35279 and GSE21815) from the Gene Expression Omnibus (GEO) to identify mutual differentially expressed genes (DEGs). We integrated DEGs with protein-protein interaction and transcriptional/post-transcriptional regulatory networks to identify reporter signaling and regulatory molecules; utilized functional overrepresentation and pathway enrichment analyses to elucidate their roles in biological processes and molecular pathways; performed survival analyses to evaluate their prognostic performance; and applied drug repositioning analyses through Connectivity Map (CMap) and geneXpharma tools to hypothesize possible drug candidates targeting reporter molecules. A total of 727 upregulated and 99 downregulated DEGs were detected. The PI3K/Akt signaling, Wnt signaling, extracellular matrix (ECM) interaction, and cell cycle were identified as significantly enriched pathways. Ten hub proteins (ADNP, CCND1, CD44, CDK4, CEBPB, CENPA, CENPH, CENPN, MYC, and RFC2), 10 transcription factors (ETS1, ESR1, GATA1, GATA2, GATA3, AR, YBX1, FOXP3, E2F4, and PRDM14) and two microRNAs (miRNAs) (miR-193b-3p and miR-615-3p) were detected as reporter molecules. The survival analyses through Kaplan-Meier curves indicated remarkable performance of the reporter molecules in the estimation of survival probability in CRC patients. In addition, several drug candidates including anti-neoplastic and immunomodulating agents were repositioned. This study presents biomarker signatures at protein and RNA levels with prognostic capability in CRC. 
We think that the molecular signatures and candidate drugs presented in this study might be useful in future studies aimed at the development of accurate diagnostic and/or prognostic biomarker screens and efficient therapeutic strategies in CRC.


Subject(s)
Biomarkers, Tumor/genetics; Colorectal Neoplasms/diagnosis; Colorectal Neoplasms/drug therapy; ELAV-Like Protein 2/genetics; Genes, Regulator/genetics; Genes, Reporter/genetics; MicroRNAs/genetics; Molecular Targeted Therapy; Transcription Factors/genetics; Antineoplastic Agents/therapeutic use; Colorectal Neoplasms/genetics; Colorectal Neoplasms/mortality; Databases, Genetic; Early Diagnosis; Gene Expression Profiling; Gene Expression Regulation, Neoplastic; Humans; Immunologic Factors/therapeutic use; Kaplan-Meier Estimate; Prognosis; Signal Transduction; Survival Analysis; Systems Biology/methods
13.
Bioinformation ; 14(5): 206-212, 2018.
Article in English | MEDLINE | ID: mdl-30108417

ABSTRACT

Dyserythropoietic anemia is a genetic disorder of erythropoiesis characterized by morphological abnormalities of erythroblasts. It is caused by mutation of the human gene C15orf41. The uncharacterized C15orf41 protein is involved in the formation of a functional complex structure. The uncharacterized C15orf41 protein is thermostable, unstable and acidic. It is associated with a TPD (Treponema Pallidum) domain (residue positions 135 to 265) and three PTM sites: K50 (Acetylation), T114 (Phosphorylation) and K176 (Ubiquitination). C15orf41 is paralogous to isoform-1 (gi|194018542|) and open reading frame isoform-CRA_c (gi|119612744|) of Homo sapiens, located on chromosome 15. It interacts with the human ATP (Adenosine Triphosphate) binding domain 4 (ATPBD4) with a similarity score of 0.725 according to protein-protein interaction (PPI) network analysis. These data provide valuable insights towards the functional characterization of the human gene C15orf41.

14.
Bioinformation ; 14(5): 213-218, 2018.
Article in English | MEDLINE | ID: mdl-30108418

ABSTRACT

Lysine acetylation is one of the decisive categories of protein post-translational modification (PTM); it is involved in many significant cellular developments and severe diseases in the biological system. The experimental identification of protein acetylation sites is painstaking, time-consuming and expensive. Hence, there is significant interest in the development of computational approaches for consistent prediction of acetylation sites using protein sequences. Feature selection from protein sequences plays a significant role in acetylation site prediction. We describe an improved feature selection approach for acetylation site prediction based on the kernel naive Bayes classifier (KNBC). We have shown that the KNBC built on features selected by a new feature selection method outperforms the existing methods for identification of acetylation sites. The sensitivity, specificity, ACC (Accuracy), MCC (Matthews Correlation Coefficient) and AUC (Area under Curve of ROC) of our proposed method are 80.71%, 93.39%, 76.73%, 41.37% and 83.0%, respectively, with an optimum window size of 47. Thus the kernel naive Bayes classifier finds application in acetylation site prediction.
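
For readers unfamiliar with the performance indices named in this abstract, the sketch below computes sensitivity, specificity, accuracy and MCC from the four cells of a binary confusion matrix using their standard definitions. The counts are made up for illustration and are not taken from the study; AUC is omitted because it needs ranked scores rather than counts.

```python
import math

def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification indices from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                  # true positive rate
    specificity = tn / (tn + fp)                  # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    return {"SN": sensitivity, "SP": specificity, "ACC": accuracy, "MCC": mcc}

# Hypothetical counts: 100 true sites and 100 non-sites.
metrics = classification_metrics(tp=80, fn=20, tn=90, fp=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

MCC is often reported alongside accuracy for site-prediction problems because it remains informative when positive and negative examples are imbalanced.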

15.
Bioinformation ; 13(10): 327-332, 2017.
Article in English | MEDLINE | ID: mdl-29162964

ABSTRACT

Patient classification through feature selection (FS) based on gene expression data (GED) has become popular in the research community. The t-test is a well-known statistical FS method in GED analysis. However, it produces more false positives and lower accuracies for small sample sizes or in the presence of outliers. To overcome the shortcomings of the t-test with small sample sizes, SAM has been applied to GED, but it is highly sensitive to outliers. Recently, robust SAM using the minimum β-divergence estimators has overcome the problems of the classical t-test and SAM, and it has been successfully applied to the identification of differentially expressed (DE) genes. However, it has not been applied to classification. Therefore, in this paper, we employ robust SAM as a feature selection approach along with classifiers for patient classification. We demonstrate the performance of robust SAM in comparison with the classical t-test and SAM along with four popular classifiers (LDA, KNN, SVM and naive Bayes) using both simulated and real gene expression datasets. The results obtained from simulation and real data analysis confirm that the performance of the four classifiers is better with robust SAM than with the classical t-test and SAM. From a real colon cancer dataset, we identified 21 additional DE genes using robust SAM that were not identified by the classical t-test or SAM. To reveal the biological functions and pathways of these 21 genes, we performed KEGG pathway enrichment analysis and found that these genes are involved in some important cancer-related pathways.

16.
Biomed Res Int ; 2017: 5310198, 2017.
Article in English | MEDLINE | ID: mdl-28819626

ABSTRACT

Identification of differentially expressed (DE) genes across two or more conditions is an important task for the discovery of a few biomarker genes. Significance Analysis of Microarrays (SAM) is a popular statistical approach for identification of DE genes in both small- and large-sample cases. However, it is sensitive to outlying gene expressions and has low power in the presence of outliers. Therefore, in this paper, an attempt is made to robustify the SAM approach using the minimum β-divergence estimators instead of the maximum likelihood estimators of the parameters. We demonstrate the performance of the proposed method in comparison with other popular statistical methods (ANOVA, SAM, LIMMA, KW, EBarrays, GaGa, and BRIDGE) using both simulated and real gene expression datasets. We observe that all methods show good and almost equal performance in the absence of outliers in the large-sample cases, while in the small-sample cases only three methods (SAM, LIMMA, and the proposed method) show almost equal and better performance than the others with two or more conditions. However, in the presence of outliers, on average, only the proposed method performs better than the others in both small- and large-sample cases with each condition.
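For orientation, the classical two-condition SAM statistic that this abstract sets out to robustify can be sketched as below. This is a minimal illustration rather than the authors' code. The function name `sam_statistic` is hypothetical, and the fudge constant `s0` is passed in as a fixed value for simplicity, whereas SAM itself chooses it from the data. The sample means and pooled variance used here are the classical estimates that, according to the abstract, the robust version replaces with minimum β-divergence estimates.

```python
import math


def sam_statistic(group1, group2, s0=0.0):
    """Relative difference d = (mean1 - mean2) / (s + s0) for one gene.

    s is the pooled standard error of the mean difference; s0 is a small
    positive constant that damps genes with very low variance
    (assumed fixed here for illustration).
    """
    n1, n2 = len(group1), len(group2)
    mean1 = sum(group1) / n1
    mean2 = sum(group2) / n2
    sum_sq = (sum((x - mean1) ** 2 for x in group1)
              + sum((x - mean2) ** 2 for x in group2))
    s = math.sqrt((1.0 / n1 + 1.0 / n2) * sum_sq / (n1 + n2 - 2))
    return (mean1 - mean2) / (s + s0)


# A single outlier inflates the pooled standard error and shrinks |d|.
print(sam_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))
print(sam_statistic([1.0, 2.0, 3.0], [4.0, 5.0, 60.0]))
```

Because both the mean and the pooled variance react strongly to a single extreme value, the statistic loses power under contamination, which is the weakness the abstract describes.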


Subject(s)
Gene Expression Profiling/statistics & numerical data, Gene Expression Regulation/genetics, Microarray Analysis/statistics & numerical data, Oligonucleotide Array Sequence Analysis/statistics & numerical data, Algorithms, Biometry

17.
Biomed Res Int ; 2017: 3020627, 2017.
Article in English | MEDLINE | ID: mdl-28848763

ABSTRACT

The naïve Bayes classifier (NBC) is one of the most popular classifiers for class prediction or pattern recognition from microarray gene expression data (MGED). However, it is very sensitive to outliers when the classical estimates of the location and scale parameters are used. This is one of the most important drawbacks of the classical NBC for gene expression data analysis. Gene expression datasets are often contaminated by outliers because of the several steps involved in the data generating process, from hybridization of DNA samples to image analysis. Therefore, in this paper, an attempt is made to robustify the Gaussian NBC by the minimum β-divergence method. The role of the minimum β-divergence method in this article is to produce robust estimators of the location and scale parameters from the training dataset and to detect and modify outliers in the test dataset. The performance of the proposed method depends on the tuning parameter β. It reduces to the traditional naïve Bayes classifier when β → 0. We investigated the performance of the proposed beta naïve Bayes classifier (β-NBC) in comparison with some popular existing classifiers (NBC, KNN, SVM, and AdaBoost) using both simulated and real gene expression datasets. We observed that the proposed method improved on the others in the presence of outliers. Otherwise, it keeps almost equal performance.
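The abstract does not give the estimating equations, so the sketch below uses the iteratively reweighted form commonly written for minimum β-divergence estimation under a univariate Gaussian model. That form, the median/MAD starting values, and the name `beta_location_scale` are all assumptions made for illustration. Each observation gets the weight exp(-β (x - μ)² / (2σ²)), and setting β = 0 makes every weight 1, which recovers the classical mean and maximum likelihood variance, consistent with the β → 0 statement above.

```python
import math
import statistics


def beta_location_scale(values, beta=0.2, max_iter=200, tol=1e-10):
    """Iteratively reweighted location and variance estimates (a sketch).

    Assumed update rules:
        w_i  = exp(-beta * (x_i - mu)^2 / (2 * var))
        mu   = sum(w_i * x_i) / sum(w_i)
        var  = (1 + beta) * sum(w_i * (x_i - mu)^2) / sum(w_i)
    """
    values = list(values)
    # Robust starting point: median and scaled MAD (an assumption).
    mu = statistics.median(values)
    mad = statistics.median([abs(x - mu) for x in values])
    var = (1.4826 * mad) ** 2 or statistics.pvariance(values)
    if var == 0.0:
        raise ValueError("values have zero spread")

    def weights_for(mu, var):
        return [math.exp(-0.5 * beta * (x - mu) ** 2 / var) for x in values]

    for _ in range(max_iter):
        w = weights_for(mu, var)
        total = sum(w)
        new_mu = sum(wi * x for wi, x in zip(w, values)) / total
        new_var = (1.0 + beta) * sum(
            wi * (x - new_mu) ** 2 for wi, x in zip(w, values)) / total
        if new_var <= 0.0:
            break  # keep the last valid estimates
        converged = abs(new_mu - mu) < tol and abs(new_var - var) < tol
        mu, var = new_mu, new_var
        if converged:
            break
    return mu, var, weights_for(mu, var)


data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]  # last value is an outlier
mu, var, weights = beta_location_scale(data, beta=0.2)
print(mu, var, weights[-1])
```

The final weights double as a simple outlier flag: an observation whose weight is close to zero contributes almost nothing to the estimates.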


Subject(s)
Gene Expression Profiling/methods, Oligonucleotide Array Sequence Analysis/methods, Transcriptome/genetics, Computer Simulation, Head and Neck Neoplasms/genetics, Head and Neck Neoplasms/metabolism, Humans, Normal Distribution

18.
Bioinformation ; 13(6): 202-208, 2017.
Article in English | MEDLINE | ID: mdl-28729763

ABSTRACT

Metabolomic biomarker detection is very important in drug discovery and in the early prediction of lung cancer. The mortality rate can be decreased if cancer is predicted at an early stage. Current diagnostic techniques for lung cancer are not prognostic. However, if we know which metabolites show considerably different intensity levels between cancer subjects and control subjects, it becomes easier to diagnose the disease early and to discover drugs. Therefore, in this paper we identified the influential metabolites in plasma and serum blood samples for lung cancer, and we also identified biomarkers that will be helpful for early disease prediction as well as for drug discovery. To identify the influential metabolites, we used one parametric test (Student's t-test) and one nonparametric test (the Kruskal-Wallis test). We also categorized the up-regulated and down-regulated metabolites with a heatmap plot and identified the biomarkers with a support vector machine (SVM) classifier and pathway analysis. Our analysis yielded 27 influential (p-value < 0.05) metabolites from the plasma samples and 13 influential (p-value < 0.05) metabolites from the serum samples. Based on the SVM importance plot, pathway analysis and correlation network analysis, we declared 4 metabolites (taurine, aspartic acid, glutamine and pyruvic acid) as plasma biomarkers and 3 metabolites (aspartic acid, taurine and inosine) as serum biomarkers.
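The nonparametric screening step can be illustrated with a small, dependency-free sketch of the Kruskal-Wallis test for the two-group case (cancer versus control). The function names are hypothetical and the intensity values are made up. The sketch omits the tie correction and uses the chi-square approximation with one degree of freedom, whose upper tail equals erfc(sqrt(H/2)); that approximation is rough for very small samples. A metabolite would be flagged as influential when p < 0.05, the threshold given in the abstract.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_two_groups(cancer, control):
    """Return (H, p) for two groups; no tie correction is applied."""
    pooled = list(cancer) + list(control)
    n = len(pooled)
    ranks = average_ranks(pooled)
    r1 = sum(ranks[:len(cancer)])
    r2 = sum(ranks[len(cancer):])
    h = (12.0 / (n * (n + 1))
         * (r1 ** 2 / len(cancer) + r2 ** 2 / len(control))
         - 3 * (n + 1))
    h = max(h, 0.0)  # guard against tiny negative rounding errors
    p = math.erfc(math.sqrt(h / 2.0))  # chi-square (1 df) upper tail
    return h, p


# Made-up intensities for one metabolite.
h, p = kruskal_wallis_two_groups([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
print(h, p, p < 0.05)
```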

19.
Biomed Res Int ; 2017: 2437608, 2017.
Article in English | MEDLINE | ID: mdl-28293630

ABSTRACT

Metabolomics is a sophisticated, high-throughput technology based on the entire set of metabolites, which is known as the connector between genotypes and phenotypes. For any phenotypic change, identification of potential metabolites (biomarkers) is very important because it provides diagnostic as well as prognostic markers and can help to develop new biomolecular therapies. Biomarker identification from metabolomics data is hampered by the use of high-throughput technology, which produces high-dimensional data matrices containing missing values as well as outliers. Missing value imputation and outlier handling techniques therefore play an important role in identifying biomarkers correctly. Although several missing value imputation techniques are available, outliers reduce the accuracy of imputation as well as the accuracy of biomarker identification. Therefore, in this paper we propose a new biomarker identification technique combining groupwise robust singular value decomposition, the t-test, and the fold-change approach, which can identify biomarkers more accurately from metabolomics datasets. We also compared the performance of the proposed technique with that of other traditional techniques for biomarker identification using both simulated and real data, in the absence and presence of outliers. Applying the proposed method to a hepatocellular carcinoma (HCC) dataset, we identified four upregulated and two downregulated metabolites as potential metabolomic biomarkers for HCC.
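Of the three components named in the abstract, the fold-change step is the simplest to illustrate. The sketch below labels a metabolite as upregulated or downregulated from the log2 ratio of group means. The function names, the made-up intensities and the threshold of 1.0 (a two-fold change) are assumptions, since the abstract does not state a cut-off, and the robust SVD and t-test steps are not shown.

```python
import math


def log2_fold_change(case, control):
    """log2 of the ratio of mean intensities (case over control)."""
    case_mean = sum(case) / len(case)
    control_mean = sum(control) / len(control)
    return math.log2(case_mean / control_mean)


def regulation(case, control, threshold=1.0):
    """Label a metabolite by its log2 fold change.

    threshold=1.0 means at least a two-fold change; this cut-off is
    an assumption chosen for illustration.
    """
    lfc = log2_fold_change(case, control)
    if lfc >= threshold:
        return "upregulated"
    if lfc <= -threshold:
        return "downregulated"
    return "unchanged"


print(regulation([8.0, 8.0], [2.0, 2.0]))
print(regulation([1.0, 1.0], [4.0, 4.0]))
print(regulation([3.0, 3.0], [2.5, 2.5]))
```

In practice the fold-change label would be combined with a significance test so that large but noisy differences are not reported as biomarkers.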


Subject(s)
Tumor Biomarkers/metabolism, Hepatocellular Carcinoma/metabolism, Liver Neoplasms/metabolism, Metabolomics, Algorithms, Hepatocellular Carcinoma/diagnosis, Computational Biology, Factual Databases, False Positive Reactions, Gas Chromatography-Mass Spectrometry, Genetic Association Studies, Humans, Liver Neoplasms/diagnosis, Statistical Models, Prognosis, ROC Curve