Results 1 - 10 of 10
1.
BMC Genomics ; 18(1): 666, 2017 Aug 29.
Article in English | MEDLINE | ID: mdl-28851270

ABSTRACT

BACKGROUND: Transposable elements (TEs) pervade mammalian genomes. However, compared with mouse, fewer studies have focused on TE expression patterns in rat, particularly comparisons across different organs, developmental stages and sexes. In addition, TEs can influence the expression of nearby genes, but the temporal and spatial extent of this influence remains unclear. RESULTS: To evaluate TE transcription patterns, we profiled their transcript levels in 11 organs for both sexes across four developmental stages of rat. The results show that most short interspersed elements (SINEs) are expressed in all conditions, and SINEs are also the major TE type with common expression patterns. In contrast, long terminal repeats (LTRs) are more likely to exhibit specific expression patterns. The expression tendencies of TEs and genes are similar in most cases. For example, few genes or TEs are specifically expressed in the liver, muscle and heart. However, TEs outperform genes in classifying organs, which implies that they have higher organ specificity than genes. By associating TEs with the closest genes in the genome, we find that their expression levels are correlated, in some cases independently of the distance between them. CONCLUSIONS: TEs associate with their nearest genes in a sex-dependent manner. A gene can be associated with more than one TE. Our work can help to functionally annotate the genome and further the understanding of the role of TEs in gene regulation.


Subjects
DNA Transposable Elements/genetics , Gene Expression Profiling , Gene Expression Regulation, Developmental , Sex Characteristics , Animals , Female , Genomics , Male , Organ Specificity , Rats
2.
PLoS One ; 12(3): e0174436, 2017.
Article in English | MEDLINE | ID: mdl-28346469

ABSTRACT

Hepatocellular carcinoma (HCC) remains a major cause of death and lacks reliable biomarkers. A deep understanding of the pathogenesis of HCC is therefore of great importance. The emergence of circular RNA (circRNA) provides a new way to study the pathogenesis of human disease. Here, we employed a prediction tool to identify circRNAs from RNA-seq data. Then, to investigate the biological function of the circRNAs, the candidate circRNAs were associated with protein-coding genes (PCGs) using GREAT. We found significant alterations in candidate circRNA expression between normal and tumor samples. Additionally, the PCGs associated with these candidate circRNAs were also found to have discriminative expression patterns between normal and tumor samples. Enrichment analysis showed that these PCGs were predominantly enriched for liver- and cardiovascular-related diseases such as atherosclerosis, myocardial ischemia and coronary heart disease, and participated in various metabolic processes. A further network analysis indicated that these PCGs play important roles in the regulatory network and the protein-protein interaction (PPI) network. Finally, we built classification models to distinguish normal and tumor samples using the candidate circRNAs and their associated genes, respectively. Both obtained satisfactory results (AUC of ~0.99 for circRNA and PCG). Our findings suggest that circRNA could be a critical factor in HCC, and they provide a useful resource for exploring the pathogenesis of HCC.


Subjects
Carcinoma, Hepatocellular/genetics , Gene Expression Profiling , Liver Neoplasms/genetics , RNA , Algorithms , Carcinoma, Hepatocellular/pathology , Databases, Genetic , Humans , Liver Neoplasms/pathology , RNA, Circular
3.
Sci Rep ; 7: 43709, 2017 Mar 06.
Article in English | MEDLINE | ID: mdl-28262806

ABSTRACT

Genome-wide association studies (GWAS) have identified more than sixty single nucleotide polymorphisms (SNPs) associated with increased risk for type 2 diabetes (T2D). However, the identification of causal risk SNPs for T2D pathogenesis is complicated by the fact that each risk SNP is a surrogate for hundreds of SNPs, most of which reside in non-coding regions. Here we provide a comprehensive annotation of 65 known T2D-related SNPs and inspect putative functional SNPs that probably cause protein dysfunction, disruption of response elements of known transcription factors related to T2D genes, or disruption of regulatory response elements of four histone marks in pancreas and pancreatic islets. Some of the newly identified risk SNPs have been reported as T2D-related SNPs in recent studies. Further, we found that accumulating the modest effects of single sites markedly enhanced risk prediction based on 1989 T2D samples and 3000 healthy controls. The area under the receiver operating characteristic curve (AROC) increased from 0.58 to 0.62 using the genotype score alone when the putative risk SNPs were added. In addition, the net reclassification improvement was 10.03% on addition of the new risk SNPs. Taken together, functional annotation can provide a prioritized list of potential risk SNPs for further estimation of individual T2D susceptibility.
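As an illustration of how a genotype score can be turned into an AROC value, here is a minimal Python sketch. It assumes an unweighted score (a simple count of risk alleles), which the abstract does not specify, and all genotypes are made-up toy values. The function names `genotype_score` and `auc` are hypothetical and are not taken from the study.

```python
def genotype_score(risk_allele_counts):
    """Unweighted genotype score: total number of risk alleles (0, 1 or 2 per SNP)."""
    return sum(risk_allele_counts)


def auc(case_scores, control_scores):
    """AUC as the probability that a case outscores a control (ties count 0.5)."""
    wins = 0.0
    for c in case_scores:
        for h in control_scores:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy genotypes (hypothetical): each row is one person, each column one SNP.
cases = [[2, 1, 1], [1, 1, 2], [2, 2, 0]]
controls = [[0, 1, 1], [1, 0, 0], [2, 1, 1]]

case_scores = [genotype_score(g) for g in cases]
control_scores = [genotype_score(g) for g in controls]
toy_auc = auc(case_scores, control_scores)
print(f"AUC on toy data: {toy_auc:.3f}")
```

Adding further SNP columns and recomputing the AUC mirrors, in spirit only, the comparison between the score with and without the putative risk SNPs.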


Subjects
Diabetes Mellitus, Type 2/genetics , Genetic Predisposition to Disease , Genome-Wide Association Study , Polymorphism, Single Nucleotide , Computational Biology/methods , Diabetes Mellitus, Type 2/metabolism , Epigenesis, Genetic , Exons , Genomics/methods , Histones/metabolism , Humans , Linkage Disequilibrium , Molecular Sequence Annotation , Odds Ratio , Promoter Regions, Genetic , ROC Curve , Regulatory Sequences, Nucleic Acid , Risk Assessment , Transcription Factors/metabolism
4.
BMC Bioinformatics ; 18(Suppl 14): 472, 2017 Dec 28.
Article in English | MEDLINE | ID: mdl-29297280

ABSTRACT

BACKGROUND: Endometrial cancers (ECs) are among the most common types of malignant tumor in females. Substantial efforts have been made to identify significantly mutated genes (SMGs) in ECs and use them as biomarkers for the classification of histological subtypes and the prediction of clinical outcomes. However, the impact of non-significantly mutated genes (non-SMGs), which may also play important roles in the prognosis of EC patients, has not been extensively studied. Therefore, for biomarker discovery in ECs it is essential to further investigate the non-SMGs that are highly associated with clinical outcomes. RESULTS: Of the 9681 non-SMGs reported by the mutation annotation pipeline, 1053, 1273 and 395 were differentially expressed between patient groups divided by the clinical endpoints of histological grade, histological type and International Federation of Gynecology and Obstetrics (FIGO) stage of ECs, respectively. In the gene set enrichment analysis, the cancer-related pathways, namely the neuroactive ligand-receptor interaction signaling pathway, cAMP signaling pathway and calcium signaling pathway, were significantly enriched with the differentially expressed non-SMGs for all three endpoints. Using the variable combination population analysis (VCPA) approach, we further identified from the differentially expressed non-SMGs 23, 19 and 24 non-SMGs that were highly associated with histological grade, histological type and FIGO stage, respectively, and found that 69.6% (16/23), 78.9% (15/19) and 66.7% (16/24) of them had previously been reported to be correlated with cancers. In addition, the averaged areas under the receiver operating characteristic curve (AUCs) achieved by predictive models with the identified non-SMGs as predictors of histological type, histological grade and FIGO stage were 0.993, 0.961 and 0.832, respectively, which were superior to those achieved by models with SMGs as features (averaged AUCs = 0.928, 0.864 and 0.535, respectively). CONCLUSIONS: Besides the SMGs, the non-SMGs reported in the mutation annotation analysis may also include crucial genes that are highly associated with clinical outcomes. Combining mutation status with gene expression profiles can efficiently identify cancer-related non-SMGs as predictors for cancer prognosis and provide additional candidates for biomarker discovery.


Subjects
Biomarkers, Tumor/genetics , Endometrial Neoplasms/diagnosis , Endometrial Neoplasms/genetics , Genes, Neoplasm , Mutation/genetics , Sequence Analysis, RNA , Endometrial Neoplasms/pathology , Female , Humans , Middle Aged , Models, Genetic , Neoplasm Staging
5.
Sci Rep ; 5: 13867, 2015 Sep 09.
Article in English | MEDLINE | ID: mdl-26350590

ABSTRACT

The prediction of drug-target interactions is a key step in the drug discovery process, which serves to identify new drugs or novel targets for existing drugs. However, experimental methods for predicting drug-target interactions are expensive and time-consuming. Therefore, the in silico prediction of drug-target interactions has recently attracted increasing attention. In this study, we propose an eigenvalue transformation technique and apply this technique to two representative algorithms, the Regularized Least Squares classifier (RLS) and the semi-supervised link prediction classifier (SLP), that have been used to predict drug-target interaction. The results of computational experiments with these techniques show that algorithms including eigenvalue transformation achieved better performance on drug-target interaction prediction than did the original algorithms. These findings show that eigenvalue transformation is an efficient technique for improving the performance of methods for predicting drug-target interactions. We further show that, in theory, eigenvalue transformation can be viewed as a feature transformation on the kernel matrix. Accordingly, although we only apply this technique to two algorithms in the current study, eigenvalue transformation also has the potential to be applied to other algorithms based on kernels.
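The idea of transforming a kernel through its eigenvalues can be sketched as follows. This is only an illustration under stated assumptions: the transformation is taken to be a power `alpha` applied to the eigenvalues (the abstract does not give the exact form), the RLS scoring uses the common closed form K(K + σI)⁻¹Y, and the matrices are toy values. The names `transform_kernel` and `rls_predict` are hypothetical.

```python
import numpy as np


def transform_kernel(K, alpha):
    """Raise the eigenvalues of a symmetric PSD kernel matrix to the power alpha."""
    eigvals, eigvecs = np.linalg.eigh(K)
    eigvals = np.clip(eigvals, 0.0, None)  # guard against tiny negative round-off
    return eigvecs @ np.diag(eigvals ** alpha) @ eigvecs.T


def rls_predict(K, Y, sigma=1.0):
    """Regularized least squares scores: K (K + sigma * I)^-1 Y."""
    n = K.shape[0]
    return K @ np.linalg.solve(K + sigma * np.eye(n), Y)


# Toy drug-drug similarity kernel and drug-target interaction matrix (hypothetical).
K = np.array([[1.0, 0.8, 0.1],
              [0.8, 1.0, 0.2],
              [0.1, 0.2, 1.0]])
Y = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 1.0]])

K_half = transform_kernel(K, alpha=0.5)  # assumed exponent, chosen for illustration
scores = rls_predict(K_half, Y)
print(scores.round(3))
```

Because only the eigenvalues change while the eigenvectors are kept, the same step could in principle be placed in front of any kernel-based predictor, which is the point the abstract makes about other algorithms.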


Subjects
Algorithms , Computational Biology/methods , Drug Discovery/methods , Computer Simulation , Reproducibility of Results , Workflow
6.
Biomed Res Int ; 2015: 890381, 2015.
Article in English | MEDLINE | ID: mdl-25961044

ABSTRACT

Human papillomavirus 16 (HPV16) carries a high risk of causing various cancers and other diseases, especially cervical cancer. Investigating the pathogenesis of HPV16 is therefore very important for public health. The protein-protein interaction (PPI) network between HPV16 and human was used as a means to improve our understanding of its pathogenesis. By adopting sequence and topological features, a support vector machine (SVM) model was built to predict new interactions between HPV16 and human proteins. All interactions were comprehensively investigated and analyzed. The analysis indicated that HPV16 enlarges its scope of influence by interacting with as many human proteins as possible. These interactions affect a broad array of processes in cell cycle progression. Furthermore, HPV16 was not only highly prone to interact with hub proteins and bottleneck proteins, but could also effectively affect a wide range of signaling pathways. In addition, we found that HPV16 evolved high carcinogenicity on the condition that its own reproduction had been ensured. This work will contribute potential new targets for antiviral therapeutics and help experimental research in the future.


Subjects
Host-Pathogen Interactions/genetics , Human papillomavirus 16/genetics , Protein Interaction Maps/genetics , Uterine Cervical Neoplasms/genetics , Carcinogenesis/genetics , Computational Biology , Female , Genome, Human , Human papillomavirus 16/pathogenicity , Humans , Support Vector Machine , Uterine Cervical Neoplasms/metabolism , Uterine Cervical Neoplasms/pathology
7.
J Comput Aided Mol Des ; 29(4): 349-60, 2015 Apr.
Article in English | MEDLINE | ID: mdl-25527073

ABSTRACT

The assessment of binding affinity between ligands and target proteins plays an essential role in the drug discovery and design process. As an alternative to widely used scoring approaches, machine learning methods have also been proposed for fast prediction of binding affinity with promising results. However, most of them were developed as all-purpose models that disregard the specific functions of different protein families, even though proteins from different functional families generally have different structures and physicochemical features. In this study, we propose a random forest method to predict protein-ligand binding affinity based on a comprehensive feature set covering protein sequence, binding pocket, ligand structure and intermolecular interaction. Feature processing and compression were implemented separately for each protein family dataset; the results indicate that different features contribute to different models, so an individual representation for each protein family is necessary. Three family-specific models were constructed for three important protein target families: HIV-1 protease, trypsin and carbonic anhydrase. For comparison, two generic models including diverse protein families were also built. The evaluation results show that models built on family-specific datasets outperform those built on the generic datasets. The Pearson and Spearman correlation coefficients (Rp and Rs) on the test sets are 0.740, 0.874, 0.735 and 0.697, 0.853, 0.723 for HIV-1 protease, trypsin and carbonic anhydrase, respectively. Comparisons with other methods further demonstrate that individual representation and model construction for each protein family is a more reasonable way to predict affinity for a particular protein family.
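For readers unfamiliar with the two evaluation metrics, the following minimal Python sketch computes the Pearson coefficient (Rp) and the Spearman coefficient (Rs) between measured and predicted affinities. The affinity values are made-up toy numbers, and the helper names are hypothetical; Spearman is computed here as the Pearson correlation of average ranks.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    return pearson(ranks(x), ranks(y))


# Toy measured vs. predicted binding affinities (hypothetical values).
measured = [5.2, 6.1, 7.4, 8.0, 9.3]
predicted = [5.0, 6.5, 7.0, 8.4, 9.0]
print(f"Rp = {pearson(measured, predicted):.3f}, Rs = {spearman(measured, predicted):.3f}")
```

Rp rewards predictions that are linearly related to the measurements, while Rs only checks that the ordering of ligands is preserved, which is why the two are usually reported together.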


Subjects
Artificial Intelligence , Drug Design , Proteins/metabolism , Carbonic Anhydrases/metabolism , Computer-Aided Design , Databases, Protein , HIV Protease/metabolism , Humans , Ligands , Models, Biological , Protein Binding , Trypsin/metabolism
8.
PLoS One ; 9(9): e105889, 2014.
Article in English | MEDLINE | ID: mdl-25180585

ABSTRACT

BACKGROUND: Early and accurate identification of adverse drug reactions (ADRs) is critically important for drug development and clinical safety. Computer-aided prediction of ADRs has attracted increasing attention in recent years, and many computational models have been proposed. However, because the different computational models have not been systematically analyzed and compared, there remain limitations in designing more effective algorithms and selecting more useful features. There is therefore an urgent need to review and analyze previous computational models to obtain general conclusions that can guide the construction of more effective computational models for predicting ADRs. PRINCIPAL FINDINGS: In the current study, we compare and analyze the performance of existing computational methods for predicting ADRs, and we also implement and evaluate additional algorithms that were previously used for predicting drug targets. Our results indicate that topological and intrinsic features are complementary to an extent and that the Jaccard coefficient has an important and general effect on the prediction of drug-ADR associations. Comparing the structure of the algorithms showed that their final formulas can all be converted to the form of a linear model. Based on this finding, we propose a new algorithm called the general weighted profile method, which yielded the best overall performance among the algorithms investigated in this paper. CONCLUSION: Several meaningful conclusions and useful findings regarding the prediction of ADRs are provided for selecting optimal features and algorithms.
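To make the role of the Jaccard coefficient concrete, here is a small Python sketch of a plain weighted profile predictor, in which each known drug votes for its ADRs with a weight equal to its Jaccard similarity to the query drug. This is the basic weighted profile idea only, not the paper's general weighted profile method, whose formula the abstract does not give. Drug names, feature sets and ADRs are hypothetical.

```python
def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def weighted_profile(query, known_features, known_adrs):
    """Score each ADR for `query` as a similarity-weighted average over known drugs."""
    sims = {d: jaccard(query, feats) for d, feats in known_features.items()}
    total = sum(sims.values())
    all_adrs = set().union(*known_adrs.values())
    if total == 0:
        return {adr: 0.0 for adr in all_adrs}
    return {
        adr: sum(s for d, s in sims.items() if adr in known_adrs[d]) / total
        for adr in all_adrs
    }


# Hypothetical feature sets (e.g. targets or substructures) and known ADRs.
known_features = {"drugA": {"t1", "t2"}, "drugB": {"t2", "t3"}, "drugC": {"t4"}}
known_adrs = {"drugA": {"nausea"}, "drugB": {"nausea", "rash"}, "drugC": {"headache"}}

scores = weighted_profile({"t1", "t2", "t3"}, known_features, known_adrs)
print(sorted(scores.items()))
```

Each score is a linear combination of the known drug-ADR labels with similarity-derived weights, which is one simple instance of the "linear model in form" observation in the abstract.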


Subjects
Computer Simulation , Drug-Related Side Effects and Adverse Reactions , Adverse Drug Reaction Reporting Systems , Algorithms , Area Under Curve , Humans
9.
Amino Acids ; 46(8): 2025-35, 2014 Aug.
Article in English | MEDLINE | ID: mdl-24849655

ABSTRACT

Single-nucleotide polymorphisms (SNPs) are the most frequent form of genetic variation. Non-synonymous SNPs (nsSNPs) occurring in coding regions result in single amino acid substitutions that are associated with human hereditary diseases. Many approaches have been designed to distinguish deleterious from neutral nsSNPs based on sequence-level information. The novelty of this work is that combinations of protein-protein interaction (PPI) network topological features were introduced for predicting disease-related nsSNPs. Based on a dataset compiled from Swiss-Prot, a random forest model was constructed with an average accuracy of 80.43% and an MCC of 0.60 in a rigorous tenfold cross-validation test. On an independent dataset, our model achieved an accuracy of 88.05% and an MCC of 0.67. Compared with previous studies, our approach showed superior prediction ability. The results showed that the incorporated PPI network topological features outperform conventional features. Further analysis indicated that disease-related proteins are topologically different from other proteins. This study suggests that nsSNPs may share some topological information of proteins and that changes in topological attributes could provide clues for elucidating functional shifts due to nsSNPs.
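The two metrics quoted above, accuracy and the Matthews correlation coefficient (MCC), are both computed from the confusion matrix of a binary classifier. A minimal Python sketch follows; the counts are invented for illustration and are not taken from the study.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts (deleterious = positive class).
tp, tn, fp, fn = 30, 50, 10, 10
print(f"accuracy = {accuracy(tp, tn, fp, fn):.3f}, MCC = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC stays informative when the two classes are unbalanced, which is a common reason for reporting both.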


Subjects
Amino Acid Substitution/genetics , Genetic Diseases, Inborn/genetics , Polymorphism, Single Nucleotide/genetics , Protein Interaction Maps , Computational Biology , Databases, Protein , Humans , Proteins/chemistry , Sequence Analysis, Protein
10.
Comput Biol Chem ; 49: 71-8, 2014 Apr.
Article in English | MEDLINE | ID: mdl-24440656

ABSTRACT

Estrogen receptor status and the pathologic response to preoperative chemotherapy are two important indicators of the chemotherapeutic sensitivity of tumors in breast cancer, and they are used to guide the selection of specific regimens for patients. Microarray-based gene expression profiling, which has been successfully applied to the discovery of tumor biomarkers and the prediction of drug response, has been suggested for predicting cancer outcomes using gene signatures differentially expressed between two clinical states. However, when only statistical methods such as Student's t test are used for gene selection, many false positive genes unrelated to the phenotypic differences are included in the lists of differentially expressed genes (DEGs), which subsequently affects the performance of the predictive models. To improve the prediction of clinical outcomes, we optimized the selection of DEGs with a combined strategy in which the DEGs were first identified by statistical methods and then filtered by a similarity profiling approach that is used for candidate gene prioritization. We first verified the molecular functions of the DEGs identified by the combined strategy using gene expression data generated in microarray experiments on Si-Wu-Tang, a popular formula in traditional Chinese medicine. The results showed that, for the Si-Wu-Tang experimental data set, cancer-related signaling pathways were significantly enriched in gene set enrichment analysis when using the DEG lists generated by the combined strategy, confirming the potentially cancer-preventive effect of Si-Wu-Tang. To verify the performance of the predictive models in clinical application, we used the combined strategy to select DEGs as features from the gene expression data of clinical samples collected from breast cancer patients, and constructed models to predict the chemotherapeutic sensitivity of tumors in breast cancer. After the DEG lists were refined by the similarity profiling approach, the Matthews correlation coefficients for predicting estrogen receptor status and the pathologic response to preoperative chemotherapy were 0.770 and 0.428, respectively, with the DEGs selected by fold change ranking, and 0.748 and 0.373, respectively, with the DEGs selected by SAM. These values were generally higher than those achieved with unrefined DEG lists and those achieved by the candidate models in the second phase of the Microarray Quality Control project (0.732 and 0.301, respectively). Our results demonstrate that integrating statistical methods with gene prioritization methods based on similarity profiling is a powerful tool for DEG selection, which effectively improved the performance of prediction models in clinical applications and can better guide personalized chemotherapy.
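The two-stage selection described above (rank genes statistically, then filter the candidates) can be sketched in Python as follows. This is a rough illustration only: the first stage uses fold change ranking on group means, and the second stage simply thresholds a precomputed similarity score, because the abstract does not describe how the similarity profiling is computed. All gene names, expression values, scores and the threshold are hypothetical.

```python
from math import log2


def rank_by_fold_change(expr_a, expr_b, top_n):
    """Return the top_n genes by absolute log2 fold change of mean expression (B vs. A)."""
    def mean(xs):
        return sum(xs) / len(xs)

    lfc = {g: log2(mean(expr_b[g]) / mean(expr_a[g])) for g in expr_a}
    return sorted(lfc, key=lambda g: abs(lfc[g]), reverse=True)[:top_n]


def refine(candidates, similarity_score, threshold):
    """Keep candidates whose precomputed similarity score reaches the threshold."""
    return [g for g in candidates if similarity_score.get(g, 0.0) >= threshold]


# Toy expression values for two clinical states (hypothetical).
state_a = {"g1": [10, 12], "g2": [8, 8], "g3": [5, 5], "g4": [20, 20]}
state_b = {"g1": [40, 48], "g2": [8, 8], "g3": [10, 10], "g4": [4, 4]}

# Hypothetical similarity of each gene to a set of known disease-related genes.
similarity_score = {"g1": 0.9, "g3": 0.7, "g4": 0.2}

candidates = rank_by_fold_change(state_a, state_b, top_n=3)
refined = refine(candidates, similarity_score, threshold=0.5)
print(candidates, refined)
```

The refined list would then serve as the feature set for the downstream predictive model.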


Subjects
Antineoplastic Agents, Phytogenic/therapeutic use , Breast Neoplasms/drug therapy , Breast Neoplasms/genetics , Drugs, Chinese Herbal/therapeutic use , Antineoplastic Agents, Phytogenic/pharmacology , Drugs, Chinese Herbal/pharmacology , Female , Gene Expression Profiling , Humans , MCF-7 Cells , Oligonucleotide Array Sequence Analysis , Predictive Value of Tests