Results 1 - 20 of 49
1.
Bioinformatics ; 36(12): 3833-3840, 2020 06 01.
Article in English | MEDLINE | ID: mdl-32399550

ABSTRACT

MOTIVATION: Non-linear ordinary differential equation (ODE) models that contain numerous parameters are suitable for inferring an emulated gene regulatory network (eGRN). However, the number of experimental measurements is usually far smaller than the number of parameters of the eGRN model, which leads to an underdetermined problem. There is no unique solution to the inference problem for an eGRN when measurements are insufficient. RESULTS: This work proposes an evolutionary modelling algorithm (EMA) that is based on evolutionary intelligence to cope with the underdetermined problem. EMA uses an intelligent genetic algorithm to solve the large-scale parameter optimization problem. An EMA-based method, GREMA, infers a novel type of gene regulatory network with confidence levels for every inferred regulation. The higher the confidence level is, the more accurate the inferred regulation is. GREMA gradually determines the regulations of an eGRN with confidence levels in descending order using either an S-system or a Hill function-based ODE model. The experimental results showed that the regulations with high confidence levels are more accurate and robust than regulations with low confidence levels. Evolutionary intelligence enhanced the mean accuracy of GREMA by 19.2% when using the S-system model with benchmark datasets. An increase in the number of experimental measurements may increase the mean confidence level of the inferred regulations. GREMA performed well compared with existing methods that have been previously applied to the same S-system, DREAM4 challenge and SOS DNA repair benchmark datasets. AVAILABILITY AND IMPLEMENTATION: All of the datasets that were used and the GREMA-based tool are freely available at https://nctuiclab.github.io/GREMA. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.


Subject(s)
Algorithms , Gene Regulatory Networks , Biological Evolution , Computational Biology , Intelligence
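
The S-system named in the abstract above has a standard power-law form, dx_i/dt = alpha_i * prod_j x_j^g_ij - beta_i * prod_j x_j^h_ij. The sketch below evaluates that right-hand side and integrates it with forward Euler steps. The two-gene network, every parameter value, and the function names are invented for illustration; none of this is taken from GREMA.

```python
import math

def s_system_rhs(x, alpha, beta, g, h):
    """dx_i/dt = alpha_i * prod_j x_j**g[i][j] - beta_i * prod_j x_j**h[i][j]."""
    n = len(x)
    dx = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        dx.append(production - degradation)
    return dx

def euler_simulate(x0, alpha, beta, g, h, dt=0.01, steps=1000):
    """Forward Euler integration; states are clamped to stay positive."""
    x = list(x0)
    for _ in range(steps):
        dx = s_system_rhs(x, alpha, beta, g, h)
        x = [max(xi + dt * dxi, 1e-9) for xi, dxi in zip(x, dx)]
    return x

# Hypothetical two-gene network: gene 2 represses gene 1, gene 1 activates gene 2.
alpha = [2.0, 1.0]
beta = [1.0, 1.0]
g = [[0.0, -0.5], [0.5, 0.0]]   # kinetic orders of the production terms
h = [[1.0, 0.0], [0.0, 1.0]]    # kinetic orders of the degradation terms
final_state = euler_simulate([1.0, 1.0], alpha, beta, g, h)
```

With n genes the model has 2n rate constants and 2n^2 kinetic orders, which is why a handful of time-series measurements can leave the inference problem underdetermined.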
2.
Bioinformatics ; 33(5): 661-668, 2017 03 01.
Article in English | MEDLINE | ID: mdl-28062441

ABSTRACT

Motivation: Numerous ubiquitination sites remain undiscovered because of the limitations of mass spectrometry-based methods. Existing prediction methods use randomly selected non-validated sites as non-ubiquitination sites to train ubiquitination site prediction models. Results: We propose an evolutionary screening algorithm (ESA) to select effective negatives among non-validated sites and an ESA-based prediction method, ESA-UbiSite, to identify human ubiquitination sites. The ESA selects non-validated sites least likely to be ubiquitination sites as training negatives. Moreover, the ESA and ESA-UbiSite use a set of well-selected physicochemical properties together with a support vector machine for accurate prediction. Experimental results show that ESA-UbiSite with effective negatives achieved 0.92 test accuracy and a Matthews's correlation coefficient of 0.48, better than existing prediction methods. The ESA increased ESA-UbiSite's test accuracy from 0.75 to 0.92 and can improve other post-translational modification site prediction methods. Availability and Implementation: An ESA-UbiSite-based web server has been established at http://iclab.life.nctu.edu.tw/iclab_webtools/ESAUbiSite/ . Contact: syho@mail.nctu.edu.tw. Supplementary information: Supplementary data are available at Bioinformatics online.


Subject(s)
Computational Biology/methods , Software , Support Vector Machine , Ubiquitination , Humans
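
The abstract above reports both test accuracy and a Matthews correlation coefficient (MCC). As a reminder of how the two differ, the sketch below computes both from a confusion matrix. The counts are hypothetical and were chosen only to show that a class-imbalanced test set can give high accuracy with a moderate MCC; they are not the ESA-UbiSite results.

```python
import math

def accuracy_and_mcc(tp, tn, fp, fn):
    """Return (accuracy, Matthews correlation coefficient) for a binary confusion matrix."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc

# Hypothetical imbalanced test set: 10 positives, 90 negatives.
acc, mcc = accuracy_and_mcc(tp=5, tn=87, fp=3, fn=5)
```

Here the accuracy is 0.92 while the MCC is only about 0.52, because half of the positives are missed.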
3.
BMC Bioinformatics ; 16 Suppl 18: S14, 2015.
Article in English | MEDLINE | ID: mdl-26681483

ABSTRACT

BACKGROUND: Protein-protein interactions (PPIs) are involved in various biological processes, and the underlying mechanism of the interactions plays a crucial role in therapeutics and protein engineering. Most machine learning approaches have been developed for predicting the binding affinity of protein-protein complexes based on structure and functional information. This work aims to predict the binding affinity of heterodimeric protein complexes from sequences only. RESULTS: This work proposes a support vector machine (SVM) based binding affinity classifier, called SVM-BAC, to classify heterodimeric protein complexes based on the prediction of their binding affinity. SVM-BAC identified 14 of 580 sequence descriptors (physicochemical, energetic and conformational properties of the 20 amino acids) to classify 216 heterodimeric protein complexes into low and high binding affinity. SVM-BAC yielded a training accuracy, sensitivity, specificity, AUC and test accuracy of 85.80%, 0.89, 0.83, 0.86 and 83.33%, respectively, better than existing machine learning algorithms. The 14 features and support vector regression were further used to estimate the binding affinities (Pkd) of 200 heterodimeric protein complexes. In a jackknife test, the prediction achieved a correlation coefficient of 0.34 and a mean absolute error of 1.4. We further analyze three informative physicochemical properties according to their contribution to prediction performance. Results reveal that the following properties are effective in predicting the binding affinity of heterodimeric protein complexes: apparent partition energy based on buried molar fractions, relations between chemical structure and biological activity in principal component analysis IV, and normalized frequency of beta turn.
CONCLUSIONS: The proposed sequence-based prediction method SVM-BAC uses an optimal feature selection method to identify 14 informative features to classify and predict the binding affinity of heterodimeric protein complexes. The characterization analysis revealed that the average numbers of beta turns and hydrogen bonds at protein-protein interfaces are higher in high binding affinity complexes than in low binding affinity complexes.


Subject(s)
Proteins/chemistry , Support Vector Machine , Area Under Curve , Dimerization , Hydrogen Bonding , Principal Component Analysis , Protein Binding , Protein Interaction Maps , Protein Structure, Tertiary , Proteins/metabolism , ROC Curve
4.
ScientificWorldJournal ; 2014: 327306, 2014.
Article in English | MEDLINE | ID: mdl-24955394

ABSTRACT

The rapid and reliable identification of promoter regions is important as the number of sequenced genomes grows rapidly. Various methods have been developed, but few investigate the effectiveness of sequence-based features in promoter prediction. This study proposes a knowledge acquisition method (named PromHD) based on if-then rules for promoter prediction in human and Drosophila species. PromHD utilizes an effective feature-mining algorithm and a reference feature set of 167 DNA sequence descriptors (DNASDs), comprising three descriptors of physicochemical properties (absorption maxima, molecular weight, and molar absorption coefficient), 128 top-ranked descriptors of 4-mer motifs, and 36 global sequence descriptors. PromHD identifies two feature subsets with 99 and 74 DNASDs and yields test accuracies of 96.4% and 97.5% in human and Drosophila species, respectively. Based on the 99- and 74-dimensional feature vectors, PromHD generates several if-then rules by using the decision tree mechanism for promoter prediction. The top-ranked informative rules with high certainty grades reveal that the global sequence descriptor, the length of nucleotide A at the first position of the sequence, and two physicochemical properties, absorption maxima and molecular weight, are effective in distinguishing promoters from non-promoters in human and Drosophila species, respectively.


Subject(s)
Algorithms , Drosophila/genetics , Promoter Regions, Genetic/genetics , Animals , Humans
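
The 4-mer motif descriptors mentioned in the abstract above are drawn from the 4^4 = 256 possible DNA 4-mers. The sketch below computes the relative frequency of each one in a sequence. How PromHD ranks these motifs to keep 128 of them is not described in the abstract, so the function is only a generic k-mer counter with a hypothetical name and a made-up input.

```python
from collections import Counter
from itertools import product

def kmer_frequencies(seq, k=4):
    """Relative frequency of every DNA k-mer over ACGT (overlapping windows)."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Windows containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in kmers)
    return {m: (counts[m] / total if total else 0.0) for m in kmers}

# Made-up 10-nucleotide sequence with 7 overlapping 4-mer windows.
features = kmer_frequencies("ACGTACGTAC")
```

The resulting dictionary has one entry per 4-mer and can be turned into a fixed-length feature vector by reading the values in key order.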
5.
Environ Monit Assess ; 185(5): 4125-39, 2013 May.
Article in English | MEDLINE | ID: mdl-22961329

ABSTRACT

On August 8, 2009, Typhoon Morakot brought heavy rain to Taiwan, causing numerous landslides and debris flows in the Taihe village area of Meishan Township, Chiayi County, in south-central Taiwan. In Taihe, land is primarily used for agriculture, and land use management may be a factor in the area's landslides. This study explores Typhoon Morakot-induced landslides and land use changes between 1999 and 2009 using GIS with the aid of field investigation. SPOT 5 satellite images with a resolution of 2.5 m are used for landslide interpretation and manually digitized in GIS. A statistical analysis of the landslide frequency-area distribution was used to identify the landslide characteristics associated with different types of land use. There were 243 landslides with a total area of 2.75 km² in the study area. The area is located in intrinsically fragile combinations of sandstone and shale. Typhoon Morakot-induced landslides show a power-law distribution in the study area. Landslides were mainly located in steep slope areas containing natural forest and in areas planted with bamboo, tea, and betel nut. Land covered with natural forest shows the highest landslide ratio, followed by bamboo, betel nut, and tea. Landslides thus show a higher ratio in areas planted with shallow-rooted vegetation such as bamboo, betel nut, and tea. Furthermore, the degree of basin development is proportional to the landslide ratio. The results show that a change in vegetation cover results in a modified landslide area and frequency, and that areas with changed land use have higher landslide ratios than unchanged areas. Land use management and community-based disaster prevention are needed in mountainous areas of Taiwan for hazard mitigation.


Subject(s)
Disasters/statistics & numerical data , Environmental Monitoring/methods , Landslides/statistics & numerical data , Risk Management/methods , Agriculture , Cities , Geographic Information Systems , Plants , Risk Assessment , Taiwan , Urbanization , Weather
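
The abstract above states that landslide areas follow a power-law distribution but does not say how the distribution was fitted. One common choice, assumed here purely for illustration, is the maximum-likelihood estimate of the exponent for a continuous power-law tail p(a) proportional to a^(-alpha) for a >= a_min. The areas below are made up so that the answer is easy to check by hand.

```python
import math

def power_law_exponent(areas, a_min):
    """MLE of alpha for p(a) ~ a**(-alpha), using only areas >= a_min."""
    tail = [a for a in areas if a >= a_min]
    log_sum = sum(math.log(a / a_min) for a in tail)
    if log_sum == 0.0:
        raise ValueError("need at least one area strictly above a_min")
    return 1.0 + len(tail) / log_sum

# Hypothetical landslide areas in square metres (not data from the study).
areas = [1000.0, 1000.0 * math.e, 1000.0 * math.e, 1000.0 * math.e ** 2]
alpha_hat = power_law_exponent(areas, a_min=1000.0)
```

For these four values the log-ratios sum to 4, so the estimate is 1 + 4/4 = 2. Real inventories usually also need a principled choice of a_min, which this sketch takes as given.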
6.
BMC Bioinformatics ; 13 Suppl 17: S3, 2012.
Article in English | MEDLINE | ID: mdl-23282103

ABSTRACT

BACKGROUND: Existing methods for predicting protein solubility on overexpression in Escherichia coli advance performance by using ensemble classifiers such as two-stage support vector machine (SVM) based classifiers and a number of feature types such as physicochemical properties, amino acid and dipeptide composition, accompanied by feature selection. It is desirable to develop a simple and easily interpretable method for predicting protein solubility, compared to existing complex SVM-based methods. RESULTS: This study proposes a novel scoring card method (SCM) that uses dipeptide composition only to estimate solubility scores of sequences for predicting protein solubility. SCM calculates the propensities of 400 individual dipeptides to be soluble using statistical discrimination between soluble and insoluble proteins of a training data set. Subsequently, the propensity scores of all dipeptides are further optimized using an intelligent genetic algorithm. The solubility score of a sequence is determined by the weighted sum of all propensity scores and dipeptide composition. To evaluate SCM by performance comparisons, four data sets with different sizes and variation degrees of experimental conditions were used. The results show that the simple method SCM with interpretable propensities of dipeptides has promising performance, compared with existing SVM-based ensemble methods with a number of feature types. Furthermore, the propensities of dipeptides and solubility scores of sequences can provide insights into protein solubility. For example, the analysis of dipeptide scores shows high propensity of α-helix structure and thermophilic proteins to be soluble. CONCLUSIONS: The propensities of individual dipeptides to be soluble vary for proteins under altered experimental conditions.
For accurately predicting protein solubility using SCM, it is better to customize the score card of dipeptide propensities by using a training data set under the same specified experimental conditions. The proposed method SCM with solubility scores and dipeptide propensities can be easily applied to protein function prediction problems in which dipeptide composition features play an important role. AVAILABILITY: The datasets used, source codes of SCM, and supplementary files are available at http://iclab.life.nctu.edu.tw/SCM/.


Subject(s)
Dipeptides/chemistry , Recombinant Proteins/chemistry , Support Vector Machine , Amino Acids/chemistry , Databases, Protein , Escherichia coli/genetics , Escherichia coli/metabolism , Recombinant Proteins/biosynthesis , Solubility
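
The scoring step described in the abstract above (a weighted sum of dipeptide propensity scores and dipeptide composition) can be sketched in a few lines. The scorecard values, the 500.0 default, and the function names are hypothetical; the real propensities are learned from training data and refined by a genetic algorithm, which this sketch does not attempt.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def dipeptide_composition(seq):
    """Fraction of each of the 400 dipeptides among overlapping pairs in seq."""
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    for i in range(len(seq) - 1):
        dp = seq[i:i + 2]
        if dp in counts:          # skip pairs with non-standard residues
            counts[dp] += 1
    total = sum(counts.values())
    return {dp: (c / total if total else 0.0) for dp, c in counts.items()}

def solubility_score(seq, propensity):
    """Composition-weighted sum of dipeptide propensity scores."""
    comp = dipeptide_composition(seq)
    return sum(comp[dp] * propensity.get(dp, 0.0) for dp in comp)

# Hypothetical scorecard: every dipeptide scores 500 except "KE".
propensity = {dp: 500.0 for dp in dipeptide_composition("")}
propensity["KE"] = 900.0
score = solubility_score("KEKE", propensity)
```

For "KEKE" the composition is 2/3 "KE" and 1/3 "EK", so the score is 2/3 * 900 + 1/3 * 500. A sequence would then be called soluble when its score exceeds a threshold chosen on the training set.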
7.
J Virol ; 85(21): 11291-9, 2011 Nov.
Article in English | MEDLINE | ID: mdl-21880770

ABSTRACT

Epstein-Barr virus (EBV)-encoded molecules have been detected in the tumor tissues of several cancers, including nasopharyngeal carcinoma (NPC), suggesting that EBV plays an important role in tumorigenesis. However, the nature of EBV with respect to genome width in vivo and whether EBV undergoes clonal expansion in the tumor tissues are still poorly understood. In this study, next-generation sequencing (NGS) was used to sequence DNA extracted directly from the tumor tissue of a patient with NPC. Apart from the human sequences, a clinically isolated EBV genome 164.7 kb in size was successfully assembled and named GD2 (GenBank accession number HQ020558). Sequence and phylogenetic analyses showed that GD2 was closely related to GD1, a previously assembled variant derived from a patient with NPC. GD2 contains the most prevalent EBV variants reported in Cantonese patients with NPC, suggesting that it might be the prevalent strain in this population. Furthermore, GD2 could be grouped into a single subtype according to common classification criteria and contains only 6 heterozygous point mutations, suggesting the monoclonal expansion of GD2 in NPC. This study represents the first genome-wide analysis of a clinical isolate of EBV directly extracted from NPC tissue. Our study reveals that NGS allows the characterization of genome-wide variations of EBV in clinical tumors and provides evidence of monoclonal expansion of EBV in vivo. The pipeline could also be applied to the study of other pathogen-related malignancies. With additional NGS studies of NPC, it might be possible to uncover the potential causative EBV variant involved in NPC.


Subject(s)
DNA, Viral/genetics , Epstein-Barr Virus Infections/complications , Herpesvirus 4, Human/genetics , Herpesvirus 4, Human/isolation & purification , Nasopharyngeal Neoplasms/virology , Carcinoma , China , Cluster Analysis , DNA, Viral/chemistry , Herpesvirus 4, Human/classification , High-Throughput Nucleotide Sequencing , Humans , Molecular Sequence Data , Nasopharyngeal Carcinoma , Phylogeny , Sequence Analysis, DNA , Sequence Homology
8.
J Theor Biol ; 312: 105-13, 2012 Nov 07.
Article in English | MEDLINE | ID: mdl-22967952

ABSTRACT

Protein secretion is an important biological process for both eukaryotes and prokaryotes. Several sequence-based methods rely mainly on various types of complementary features to design accurate classifiers for predicting non-classical secretory proteins. Gene Ontology (GO) terms are increasingly informative in predicting protein functions. However, the number of GO terms used is often very large. For example, there are 60,020 GO terms used in the prediction method Euk-mPLoc 2.0 for subcellular localization. This study proposes a novel approach to identify a small set of m top-ranked GO terms that serve as the only type of input features to design a support vector machine (SVM) based method, Sec-GO, to predict non-classical secretory proteins in both eukaryotes and prokaryotes. To evaluate the Sec-GO method, two existing methods and their datasets are adopted for performance comparisons. The Sec-GO method using m=436 GO terms yields an independent test accuracy of 96.7% on mammalian proteins, much better than the existing method SPRED (82.2%), which uses frequencies of tri-peptides and short peptides, secondary structure, and physicochemical properties as input features of a random forest classifier. Furthermore, when applied to Gram-positive bacterial proteins, Sec-GO with m=158 GO terms has a test accuracy of 94.5%, superior to NClassG+ (90.0%), which uses SVM with several feature types, comprising amino acid composition, di-peptides, physicochemical properties and the position specific weighting matrix. Analysis of the distribution of secretory proteins in a GO database indicates that the percentage of non-classical secretory proteins annotated by GO is larger than that of classical secretory proteins in both eukaryotes and prokaryotes. Of the m top-ranked GO features, the top four GO terms are all annotated by such subcellular locations as GO:0005576 (Extracellular region).
Additionally, the method Sec-GO is easily implemented and its web tool of prediction is available at iclab.life.nctu.edu.tw/secgo.


Subject(s)
Computational Biology/methods , Databases, Protein , Eukaryotic Cells/metabolism , Prokaryotic Cells/metabolism , Proteins/metabolism , Vocabulary, Controlled , Amino Acid Sequence , Animals , Humans , Information Storage and Retrieval/methods , Molecular Sequence Data , Proteins/genetics , Reproducibility of Results , Support Vector Machine
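
Using a small set of selected GO terms as the only input features, as the abstract above describes, amounts to building a fixed-length vector per protein. The sketch below assumes a simple binary presence/absence encoding, which the abstract does not confirm. The three selected terms are illustrative: only GO:0005576 is mentioned in the source, and the protein's annotations are made up.

```python
def go_feature_vector(protein_go_terms, selected_terms):
    """1 if the protein is annotated with the selected GO term, else 0."""
    annotated = set(protein_go_terms)
    return [1 if term in annotated else 0 for term in selected_terms]

# Illustrative selection of m = 3 terms (the real method selects hundreds).
selected = ["GO:0005576", "GO:0005615", "GO:0005886"]
# Made-up annotations for one query protein.
vec = go_feature_vector(["GO:0005615", "GO:0005576", "GO:0016020"], selected)
```

Vectors of this kind, one per protein, would then be passed to a classifier such as an SVM.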
9.
BMC Bioinformatics ; 12 Suppl 1: S47, 2011 Feb 15.
Article in English | MEDLINE | ID: mdl-21342579

ABSTRACT

BACKGROUND: Existing methods of predicting DNA-binding proteins used valuable features of physicochemical properties to design support vector machine (SVM) based classifiers. Generally, selection of physicochemical properties and determination of their corresponding feature vectors rely mainly on known properties of binding mechanism and experience of designers. However, there exists a troublesome problem for designers that some different physicochemical properties have similar vectors of representing 20 amino acids and some closely related physicochemical properties have dissimilar vectors. RESULTS: This study proposes a systematic approach (named Auto-IDPCPs) to automatically identify a set of physicochemical and biochemical properties in the AAindex database to design SVM-based classifiers for predicting and analyzing DNA-binding domains/proteins. Auto-IDPCPs consists of 1) clustering 531 amino acid indices in AAindex into 20 clusters using a fuzzy c-means algorithm, 2) utilizing an efficient genetic algorithm based optimization method IBCGA to select an informative feature set of size m to represent sequences, and 3) analyzing the selected features to identify related physicochemical properties which may affect the binding mechanism of DNA-binding domains/proteins. The proposed Auto-IDPCPs identified m = 22 features of properties belonging to five clusters for predicting DNA-binding domains with a five-fold cross-validation accuracy of 87.12%, which is promising compared with the accuracy of 86.62% of the existing method PSSM-400. For predicting DNA-binding sequences, the accuracy of 75.50% was obtained using m = 28 features, where PSSM-400 has an accuracy of 74.22%. Auto-IDPCPs and PSSM-400 have accuracies of 80.73% and 82.81%, respectively, applied to an independent test data set of DNA-binding domains. 
Typical physicochemical properties discovered include hydrophobicity, secondary structure, charge, solvent accessibility, polarity, flexibility, normalized van der Waals volume, and pK (pK-C, pK-N, pK-COOH and pK-a(RCOOH)). CONCLUSIONS: The proposed approach Auto-IDPCPs would help designers to investigate informative physicochemical and biochemical properties by considering prediction accuracy and analysis of the binding mechanism simultaneously. Auto-IDPCPs is also applicable to predicting and analyzing other protein functions from sequences.


Subject(s)
Algorithms , DNA-Binding Proteins/chemistry , Sequence Analysis, Protein/methods , Amino Acids/chemistry , Cluster Analysis , Computational Biology/methods , Databases, Protein , Protein Binding
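
The first step of Auto-IDPCPs clusters amino acid indices with a fuzzy c-means algorithm. The core of that algorithm is the membership update, sketched below for one data point and a set of cluster centres. The fuzzifier value of 2 and the toy one-dimensional points are assumptions; the abstract gives no such settings. The parameter is named `fuzzifier` to avoid confusion with the feature-set size m used in the abstract.

```python
def fcm_memberships(point, centers, fuzzifier=2.0):
    """Fuzzy c-means membership of one point in each cluster (memberships sum to 1)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, ctr)) ** 0.5 for ctr in centers]
    # A point lying exactly on one or more centres belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [v / sum(hits) for v in hits]
    power = 2.0 / (fuzzifier - 1.0)
    return [1.0 / sum((d_i / d_k) ** power for d_k in dists) for d_i in dists]

# Toy example: a point at 0 with centres at 1 and 3.
u = fcm_memberships([0.0], [[1.0], [3.0]])
```

In the full algorithm this update alternates with recomputing each centre as a membership-weighted mean until the memberships stop changing.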
10.
Chin J Cancer ; 30(3): 182-8, 2011 Mar.
Article in English | MEDLINE | ID: mdl-21352695

ABSTRACT

Gene therapy is one of the most attractive fields in tumor therapy. In past decades, significant progress has been achieved. Various approaches, such as viral and non-viral vectors and physical methods, have been developed to make gene delivery safer and more efficient. Several therapeutic strategies have evolved, including gene-based (tumor suppressor genes, suicide genes, antiangiogenic genes, cytokine and oxidative stress-based genes) and RNA-based (antisense oligonucleotides and RNA interference) approaches. In addition, immune response-based strategies (dendritic cell- and T cell-based therapy) are also under investigation in tumor gene therapy. This review highlights the progress and recent developments in gene delivery systems, therapeutic strategies, and possible clinical directions for gene therapy.


Subject(s)
Dendritic Cells/immunology , Gene Transfer Techniques , Genetic Therapy/methods , Genetic Vectors , Neoplasms/therapy , Genes, Transgenic, Suicide , Genes, Tumor Suppressor , Humans , Neoplasms/genetics , RNA Interference
11.
J Gastroenterol Hepatol ; 25(7): 1315-20, 2010 Jul.
Article in English | MEDLINE | ID: mdl-20594262

ABSTRACT

In an earlier study, we found that hepatitis C virus core protein, HCV-C, participated in the malignant transformation of HCV-C transfected normal human biliary epithelial (hBE) cells by activating telomerase. Here we further investigated the signaling of the malignant transformation. METHODS: Reverse transcription-polymerase chain reaction (RT-PCR), western blotting and immunoprecipitation were used to analyze the expression of HCV-C, human telomerase reverse transcriptase (hTERT), nuclear factor-kappaB (NF-kappaB) and NF-kappaB inhibitor alpha (IkappaBalpha) genes and the phosphorylation level of IkappaBalpha protein. Electrophoretic mobility shift assays (EMSA) and NF-kappaB-linked luciferase reporter assays were carried out to measure NF-kappaB activity. RESULTS: The expression of HCV-C and hTERT was detected only in HCV-C-transfected hBE (hBE-HCV-C) cells but not in vector-transfected or parental hBE cells. More NF-kappaB protein accumulated in nuclear extracts of hBE-HCV-C cells than in those of control cells, though the total NF-kappaB protein level showed no difference among these cells. DNA binding activity of NF-kappaB and the NF-kappaB-linked luciferase activity were much higher in HCV-C-transfected hBE cells than in vector- or non-transfected hBE cells. In addition, the IkappaBalpha phosphorylation level, but not the IkappaBalpha mRNA or protein levels, was increased after HCV-C transfection. CONCLUSIONS: Hepatitis C virus core protein activates the NF-kappaB pathway in hBE cells by increasing the phosphorylation of IkappaBalpha. The pathway may be responsible for HCV-C-induced malignant transformation of hBE cells.


Subject(s)
Biliary Tract/metabolism , Cell Transformation, Neoplastic/metabolism , Cell Transformation, Viral , Epithelial Cells/metabolism , NF-kappa B/metabolism , Signal Transduction , Viral Core Proteins/metabolism , Binding Sites , Blotting, Western , Cell Line , Cell Transformation, Neoplastic/genetics , Cell Transformation, Viral/genetics , DNA/metabolism , Electrophoretic Mobility Shift Assay , Genes, Reporter , Humans , I-kappa B Proteins/metabolism , Immunoprecipitation , NF-KappaB Inhibitor alpha , Phosphorylation , Reverse Transcriptase Polymerase Chain Reaction , Telomerase/metabolism , Time Factors , Transcription, Genetic , Transfection , Viral Core Proteins/genetics
12.
Chin J Cancer ; 29(9): 775-80, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20800018

ABSTRACT

The application of nanotechnology significantly benefits clinical practice in cancer diagnosis, treatment, and management. In particular, nanotechnology offers promise for the targeted delivery of drugs, genes, and proteins to tumor tissues, thereby alleviating the toxicity of anticancer agents in healthy tissues. This article reviews current nanotechnology platforms for anticancer drug delivery, including polymeric nanoparticles, liposomes, dendrimers, nanoshells, carbon nanotubes, superparamagnetic nanoparticles, and nucleic acid-based nanoparticles [DNA, RNA interference (RNAi), and antisense oligonucleotide (ASO)], as well as nanotechnologies for combination therapeutic strategies, for example, nanotechnologies combined with multidrug-resistance modulators, ultrasound, hyperthermia, or photodynamic therapy. This review raises awareness of the advantages and challenges of applying these therapeutic nanotechnologies, in light of some recent advances in nanotechnologic drug delivery and cancer therapy.


Subject(s)
Antineoplastic Agents/therapeutic use , Nanoparticles/therapeutic use , Nanotechnology/trends , Neoplasms/drug therapy , Antineoplastic Agents/administration & dosage , Dendrimers/therapeutic use , Drug Carriers , Drug Delivery Systems , Drug Resistance, Multiple/drug effects , Drug Resistance, Neoplasm/drug effects , Humans , Liposomes/therapeutic use , Magnetite Nanoparticles/therapeutic use , Nanoshells/therapeutic use , Nanotubes, Carbon , Polymers/therapeutic use
13.
Kaohsiung J Med Sci ; 36(3): 206-211, 2020 Mar.
Article in English | MEDLINE | ID: mdl-31749314

ABSTRACT

Recently published studies have suggested a potential link between single nucleotide polymorphisms (SNPs) of Toll-like receptor-4 (TLR4) and the risk of urinary tract infection (UTI); however, no consensus has been reached. To further understand the relationship between TLR SNPs and urinary tract infections, we searched for related studies published in PubMed, EMBASE, and Web of Science before October 30, 2018, for systematic review and meta-analysis. Our study included 10 case-control studies, comprising 1476 urinary tract infection patients and 1449 healthy controls, on TLR4 (rs4986790, rs4986791). R 3.4.2 and Stata 15.0 software were used for the analysis. Overall, there was no statistically significant association between rs4986790 and urinary tract infection in the four genetic models. However, in the subgroup analysis, the Asian population showed a significant difference in the allelic model (G vs A: OR = 1.88 [95% CI: 1.42-2.49], P = .03). There was also a significant difference in the dominant model (GG + AG vs AA: OR = 1.97 [95% CI: 1.46-2.66], P = .01). Because few studies were available, no meaningful conclusion can be drawn regarding the relationship between TLR4 (rs4986791) and the risk of urinary tract infections in general. Nevertheless, our meta-analysis shows that in Asian populations, TLR4 (rs4986790) may be associated with the risk of urinary tract infection.


Subject(s)
Genetic Predisposition to Disease/genetics , Polymorphism, Single Nucleotide/genetics , Toll-Like Receptor 4/genetics , Urinary Tract Infections/metabolism , Animals , Female , Humans , Male , Urinary Tract Infections/genetics
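
The odds ratios with 95% confidence intervals quoted in the abstract above are pooled meta-analysis estimates. As a simpler illustration of what such a figure means, the sketch below computes the odds ratio and its log-based (Woolf) confidence interval for a single 2x2 table under the allelic model. The allele counts are hypothetical and the pooling across studies is not shown.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) CI for a 2x2 table.

    a: risk-allele count in cases      b: risk-allele count in controls
    c: other-allele count in cases     d: other-allele count in controls
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical allele counts for one study (G vs A), not data from the meta-analysis.
or_est, ci_low, ci_high = odds_ratio_ci(a=30, b=20, c=70, d=80)
```

An interval that lies entirely above 1 suggests that the risk allele is associated with higher odds of disease. A zero cell would need a continuity correction, which this sketch omits.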
14.
BMC Bioinformatics ; 9: 80, 2008 Feb 01.
Article in English | MEDLINE | ID: mdl-18241343

ABSTRACT

BACKGROUND: Gene Ontology (GO) annotation, which describes the function of genes and gene products across species, has recently been used to predict protein subcellular and subnuclear localization. Existing GO-based prediction methods for protein subcellular localization use the known accession numbers of query proteins to obtain their annotated GO terms. An accurate prediction method for predicting subcellular localization of novel proteins without known accession numbers, using only the input sequence, is worth developing. RESULTS: This study proposes an efficient sequence-based method (named ProLoc-GO) by mining informative GO terms for predicting protein subcellular localization. For each protein, BLAST is used to obtain a homology with a known accession number to the protein for retrieving the GO annotation. A large number n of all annotated GO terms that have ever appeared are then obtained from a large set of training proteins. A novel genetic algorithm based method (named GOmining) combined with a classifier of support vector machine (SVM) is proposed to simultaneously identify a small number m out of the n GO terms as input features to SVM, where m <

Subject(s)
Algorithms , Cells/metabolism , Computational Biology/methods , Databases, Protein , Proteins/physiology , Sequence Analysis, Protein/methods , Software , Pattern Recognition, Automated/methods , Structure-Activity Relationship
15.
Biochem Biophys Res Commun ; 375(3): 440-5, 2008 Oct 24.
Article in English | MEDLINE | ID: mdl-18722351

ABSTRACT

This study aimed to identify tumor proteins that elicit a humoral response in patients with esophageal squamous cell carcinoma (ESCC). Autologous sera of 15 newly diagnosed patients with ESCC and 15 age- and gender-matched healthy controls were analyzed individually for antibody-based reactivity against proteins from a mixture of 15 homogenized ESCC tissues resolved by two-dimensional PAGE. One protein spot, which reacted with sera from ESCC patients but not with those from controls, was identified as CDC25B by mass spectrometry and Western blotting. High expression of CDC25B was detected in ESCC cell lines and primary tumor tissues, but not in normal esophageal tissues. In addition, CDC25B expression was significantly higher in tumor tissue of patients with sera positive for CDC25B-Abs than in that of patients without CDC25B-Abs. Finally, anti-CDC25B antibodies were readily detectable in sera from 45 of 124 (36.29%) patients with ESCC, 13 of 150 (8.67%) patients with other types of cancer and 0 of 102 (0%) healthy individuals. Thus, CDC25B autoantibodies may have clinical utility in ESCC screening and diagnosis.


Subject(s)
Antibodies, Neoplasm/blood , Autoantibodies/blood , Biomarkers, Tumor/blood , Carcinoma, Squamous Cell/diagnosis , Esophageal Neoplasms/diagnosis , Proteomics , cdc25 Phosphatases/immunology , Antibodies, Neoplasm/immunology , Antibody Formation , Autoantibodies/immunology , Biomarkers, Tumor/immunology , Carcinoma, Squamous Cell/blood , Carcinoma, Squamous Cell/immunology , Esophageal Neoplasms/blood , Esophageal Neoplasms/immunology , Humans
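
The detection rates in the abstract above follow directly from the reported counts, and the short check below reproduces them. Treating the ESCC rate as a sensitivity and deriving a specificity against healthy individuals is an interpretation added here, not a claim made in the abstract.

```python
def detection_rate(positives, total):
    """Percentage of samples with detectable antibodies, rounded to 2 decimals."""
    return round(100.0 * positives / total, 2)

escc_rate = detection_rate(45, 124)      # patients with ESCC
other_rate = detection_rate(13, 150)     # patients with other cancers
healthy_rate = detection_rate(0, 102)    # healthy individuals

# Interpretation (assumption): sensitivity for ESCC and specificity vs healthy controls.
sensitivity = escc_rate
specificity_vs_healthy = 100.0 - healthy_rate
```

Read this way, the marker is highly specific against healthy controls but detects only about a third of ESCC cases.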
16.
Cancer Res ; 66(12): 6225-32, 2006 Jun 15.
Article in English | MEDLINE | ID: mdl-16778197

ABSTRACT

The Bmi-1 oncoprotein regulates proliferation and oncogenesis in human cells. Its overexpression leads to senescence bypass in human fibroblasts and immortalization of human mammary epithelial cells. In this study, we report that compared with normal nasopharyngeal epithelial cells (NPEC), Bmi-1 is overexpressed in nasopharyngeal carcinoma cell lines. Importantly, Bmi-1 was also found to be overexpressed in 29 of 75 nasopharyngeal carcinoma tumors (38.7%) by immunohistochemical analysis. In contrast to nasopharyngeal carcinoma, there was no detectable expression of Bmi-1 in noncancerous nasopharyngeal epithelium. Moreover, high Bmi-1 expression positively correlated with poor prognosis of nasopharyngeal carcinoma patients. We also report that the overexpression of Bmi-1 leads to bypass of senescence and immortalization of NPECs, which normally express p16(INK4a) and exhibit finite replicative life span. Overexpression of Bmi-1 in NPECs led to the induction of human telomerase reverse transcriptase activity and reduction of p16(INK4a) expression. Mutational analysis of Bmi-1 showed that both RING finger and helix-turn-helix domains of it are required for immortalization of NPECs. Our findings suggest that Bmi-1 plays an important role in the development and progression of nasopharyngeal carcinoma, and that Bmi-1 is a valuable marker for assessing the prognosis of nasopharyngeal carcinoma patients. Furthermore, this study provides the first cellular proto-oncogene immortalized nasopharyngeal epithelial cell line, which may serve as a cell model system for studying the mechanisms involved in the tumorigenesis of nasopharyngeal carcinoma.


Subject(s)
Biomarkers, Tumor/biosynthesis , Nasopharyngeal Neoplasms/metabolism , Nuclear Proteins/biosynthesis , Proto-Oncogene Proteins/biosynthesis , Repressor Proteins/biosynthesis , Biomarkers, Tumor/genetics , Cell Line, Tumor , Cyclin-Dependent Kinase Inhibitor p16/biosynthesis , Cyclin-Dependent Kinase Inhibitor p16/genetics , DNA Damage , Disease Progression , Down-Regulation , Epithelial Cells/enzymology , Epithelial Cells/metabolism , Female , Humans , Immunohistochemistry , Male , Middle Aged , Nasopharyngeal Neoplasms/enzymology , Nasopharyngeal Neoplasms/genetics , Nasopharyngeal Neoplasms/pathology , Neoplasm Staging , Nuclear Proteins/genetics , Polycomb Repressive Complex 1 , Prognosis , Proto-Oncogene Mas , Proto-Oncogene Proteins/genetics , RNA, Messenger/biosynthesis , RNA, Messenger/genetics , Repressor Proteins/genetics , Telomerase/metabolism
17.
Sci Rep ; 8(1): 951, 2018 01 17.
Article in English | MEDLINE | ID: mdl-29343727

ABSTRACT

Cyclic AMP receptor protein (CRP), a global regulator in Escherichia coli, regulates more than 180 genes via two roles: activation and repression. Few methods are available for predicting the regulatory roles from the binding sites of transcription factors. This work proposes an accurate method, PredCRP, to derive an optimised model (named PredCRP-model) and a set of four interpretable rules (named PredCRP-ruleset) for predicting and analysing the regulatory roles of CRP from sequences of CRP-binding sites. A dataset consisting of 169 CRP-binding sites with regulatory roles strongly supported by evidence was compiled. The PredCRP-model, which combines 12 informative features of CRP-binding sites with a support vector machine, achieved training and test accuracies of 0.98 and 0.93, respectively. PredCRP-ruleset has two activation rules and two repression rules derived using the 12 features and the decision tree method C4.5. This work further screened and identified 23 previously unobserved regulatory interactions in Escherichia coli. Using quantitative PCR for validation, PredCRP-model and PredCRP-ruleset achieved test accuracies of 0.96 (=22/23) and 0.91 (=21/23), respectively. The proposed method is suitable for designing predictors for the regulatory roles of all global regulators in Escherichia coli. PredCRP can be accessed at https://github.com/NctuICLab/PredCRP.


Subject(s)
Binding Sites/physiology , Cyclic AMP Receptor Protein/metabolism , Escherichia coli Proteins/metabolism , Escherichia coli/metabolism , Cyclic AMP/metabolism , DNA, Bacterial/genetics , Gene Expression Regulation, Bacterial/genetics , Protein Binding/physiology , Transcription Factors/metabolism
18.
J Chromatogr B Analyt Technol Biomed Life Sci ; 854(1-2): 320-7, 2007 Jul 01.
Article in English | MEDLINE | ID: mdl-17467348

ABSTRACT

4-Anilinoquinazolines (e.g. Iressa and Glivec) are a class of epidermal growth factor receptor tyrosine kinase (EGFR-TK) inhibitors widely used to treat non-small cell lung cancer and other tumors. However, the low clinical response rate, resistance, and host toxicity of currently available EGFR-TK inhibitors have prompted the development of a second generation of TK inhibitors with improved efficacy and selectivity and less resistance. CH330331 is a recently synthesized novel 4-anilinoquinazoline analog with confirmed anticancer activity in vitro and in vivo. To predict its oral pharmacokinetic behavior and transport nature in the intestine before it enters clinical trials, we have developed and validated a high-performance liquid chromatographic (HPLC) method for the determination of CH330331 in Caco-2 (a human colon cancer cell line) monolayers. The developed HPLC method was sensitive and reliable, with acceptable accuracy (90-110% of nominal values) and precision (intra- and inter-assay R.S.D. < 10%). The total running time was within 10 min, with acceptable separation of the target analytes. The lower limit of quantitation (LLOQ) for CH330331 was 200 ng/ml when a 100 µl sample aliquot was injected onto the HPLC. The validated HPLC method was applied to characterize the epithelial transport of CH330331 in Caco-2 monolayers. The transport of CH330331 across the Caco-2 monolayers from the apical to the basolateral side was 8- to 10-fold higher than that from the basolateral to the apical side. Co-incubation with sodium azide or MK-571, but not verapamil, significantly inhibited the apical to basolateral transport of CH330331. These findings provide initial evidence that the intestinal absorption of CH330331 is mediated by an active mechanism. Further studies are required to explore the interaction of CH330331 with ATP-binding cassette transporters and the possible influence on its pharmacokinetics and pharmacodynamics.


Subject(s)
Chromatography, High Pressure Liquid/methods , ErbB Receptors/antagonists & inhibitors , Protein Kinase Inhibitors/pharmacology , Quinazolines/pharmacology , Spectrophotometry, Ultraviolet/methods , Biological Transport , Caco-2 Cells , Epithelium/metabolism , Humans , Protein Kinase Inhibitors/pharmacokinetics , Quinazolines/pharmacokinetics , Reproducibility of Results
19.
Biosystems ; 90(2): 405-13, 2007.
Article in English | MEDLINE | ID: mdl-17140725

ABSTRACT

Amphiphilic pseudo-amino acid composition (Am-Pse-AAC) with extra sequence-order information is a useful feature for representing enzymes. This study first utilizes the k-nearest neighbor (k-NN) rule to analyze the distribution of enzymes in the Am-Pse-AAC feature space. This analysis indicates that the distributions of multiple classes of enzymes overlap heavily. To cope with the overlap problem, this study proposes an efficient non-parametric classifier for predicting enzyme subfamily class using an adaptive fuzzy k-nearest neighbor (AFK-NN) method, where k and a fuzzy strength parameter m are adaptively specified. The fuzzy membership values of a query sample Q are dynamically determined according to the position of Q and its weighted distances to the k nearest neighbors. Using the same enzymes of the oxidoreductases family for comparison, the prediction accuracy of AFK-NN is 76.6%, which is better than those of the Support Vector Machine (73.6%), the decision tree method C5.0 (75.4%) and the existing covariant-discriminant algorithm (70.6%) using a jackknife test. To evaluate the generalization ability of AFK-NN, datasets for all six families of entirely sequenced enzymes are established from the newly updated SWISS-PROT and ENZYME databases. The accuracy of AFK-NN on the new large-scale dataset of the oxidoreductases family is 83.3%, and the mean accuracy over the six families is 92.1%.


Subject(s)
Amino Acids/chemistry , Enzymes/chemistry , Systems Biology , Algorithms , Animals , Computer Simulation , Databases, Protein , Fuzzy Logic , Genetic Vectors , Models, Statistical , Models, Theoretical , Oxidoreductases/genetics , Reproducibility of Results , Sequence Alignment , Sequence Analysis, Protein
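
The abstract of record 19 describes fuzzy memberships computed from weighted distances to the k nearest neighbors. The following is a minimal sketch of a generic distance-weighted fuzzy k-NN rule, not the AFK-NN method itself: it assumes crisp training labels, Euclidean distance, and fixed values of k and m, whereas the abstract states that AFK-NN specifies k and m adaptively. The function name `fuzzy_knn` and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def fuzzy_knn(train, query, k=3, m=2.0):
    """Return {label: membership} for `query` via distance-weighted fuzzy k-NN.

    train: list of (feature_vector, label) pairs with crisp labels (assumption).
    k, m:  fixed here (m must be > 1); AFK-NN is said to choose them adaptively.
    """
    # Rank training samples by Euclidean distance to the query and keep k.
    neighbors = sorted(
        ((math.dist(x, query), label) for x, label in train),
        key=lambda pair: pair[0],
    )[:k]

    # An exact match would produce an infinite weight, so answer crisply.
    if neighbors[0][0] == 0.0:
        return {neighbors[0][1]: 1.0}

    # Closer neighbors get larger weights: 1 / d^(2 / (m - 1)).
    exponent = 2.0 / (m - 1.0)
    weights = defaultdict(float)
    for dist, label in neighbors:
        weights[label] += 1.0 / dist ** exponent

    # Normalize so the memberships over all labels sum to 1.
    total = sum(weights.values())
    return {label: w / total for label, w in weights.items()}

# Toy one-dimensional data, invented purely for illustration.
toy_train = [((0.0,), "A"), ((1.0,), "A"), ((4.0,), "B"), ((5.0,), "B")]
memberships = fuzzy_knn(toy_train, (2.0,), k=3, m=2.0)
predicted = max(memberships, key=memberships.get)
```

Returning a membership per class rather than a single label is what makes such a rule useful when class distributions overlap: a query near a class boundary gets split memberships instead of a hard vote.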
20.
Biosystems ; 90(2): 573-81, 2007.
Article in English | MEDLINE | ID: mdl-17291684

ABSTRACT

Accurate methods for predicting protein subnuclear localization rely on the cooperation between informative features and classifier design. Support vector machine (SVM) based learning methods have been shown to be effective for predicting protein subcellular and subnuclear localization. This study proposes an evolutionary support vector machine (ESVM) based classifier with automatic selection from a large set of physicochemical composition (PCC) features to design an accurate system for predicting protein subnuclear localization, named ProLoc. ESVM, which combines an inheritable genetic algorithm with SVM, can automatically determine the best number m of PCC features and simultaneously identify m out of 526 PCC features. To evaluate ESVM, this study uses two datasets, SNL6 and SNL9, which have 504 proteins localized in 6 subnuclear compartments and 370 proteins localized in 9 subnuclear compartments, respectively. Using leave-one-out cross-validation, ProLoc with the selected m=33 and m=28 PCC features has accuracies of 56.37% for SNL6 and 72.82% for SNL9. These are better than the 51.4% obtained on SNL6 by the SVM-based system using k-peptide composition features and the 64.32% obtained on SNL9 by an optimized evidence-theoretic k-nearest neighbor classifier using pseudo amino acid composition.


Subject(s)
Computational Biology/methods , Systems Biology , Algorithms , Amino Acids/chemistry , Animals , Automation , Biology/methods , Decision Support Techniques , Humans , Models, Genetic , Models, Theoretical , Reproducibility of Results , Software
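
The accuracies in record 20 are reported under leave-one-out cross-validation. As a rough illustration of that evaluation protocol only, the sketch below holds out each sample once and scores a prediction made from the remaining samples. It deliberately swaps in a 1-nearest-neighbor stand-in for the classifier; it is not ESVM or ProLoc and involves no genetic algorithm or feature selection. All names and data are hypothetical.

```python
import math

def nearest_neighbor_predict(train, query):
    """Stand-in classifier (1-NN); the study itself uses an SVM instead."""
    return min(train, key=lambda sample: math.dist(sample[0], query))[1]

def leave_one_out_accuracy(samples, predict=nearest_neighbor_predict):
    """Hold out each sample once, predict it from the rest, return accuracy."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        # Training set is every sample except the held-out one.
        rest = samples[:i] + samples[i + 1:]
        if predict(rest, features) == label:
            correct += 1
    return correct / len(samples)

# Toy one-dimensional data, invented purely for illustration.
toy_samples = [
    ((0.0,), "A"),
    ((1.0,), "A"),
    ((2.5,), "B"),   # sits closer to class A, so it is misclassified
    ((10.0,), "B"),
    ((11.0,), "B"),
]
accuracy = leave_one_out_accuracy(toy_samples)
```

Because every sample is used for testing exactly once, the protocol suits small datasets such as the 504- and 370-protein sets described above, at the cost of one model fit per sample.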