1.
J Med Syst ; 40(9): 206, 2016 Sep.
Article in English | MEDLINE | ID: mdl-27518854

ABSTRACT

This paper describes the design of an ellipsis and coreference resolution module integrated in a computerized virtual patient dialogue system. Real medical diagnosis dialogues have been collected and analyzed. Several groups of diagnosis-related concepts were defined and used to construct rules, patterns, and features to detect and resolve ellipsis and coreference. The best F-scores of ellipsis detection and resolution were 89.15 % and 83.40 %, respectively. The best F-scores of phrasal coreference detection and resolution were 93.83 % and 83.40 %, respectively. The accuracy of pronominal anaphora resolution was 92 % for the 3rd-person singular pronouns referring to specific entities, and 97.31 % for other pronouns.


Subject(s)
Communication , Physician-Patient Relations , User-Computer Interface , Taiwan
2.
Biomed Res Int ; 2014: 678971, 2014.
Article in English | MEDLINE | ID: mdl-24800246

ABSTRACT

Simple sequence repeats (SSRs) are not only applied as genetic markers in evolutionary studies but also play an important role in gene regulatory activities. Efficient identification of conserved and exclusive SSRs through cross-species comparison is helpful for understanding the evolutionary mechanisms and associations between specific gene groups and SSR motifs. In this paper, we developed an online cross-species comparative system and integrated it with a tag cloud visualization technique for identifying potential SSR biomarkers within fourteen frequently used model species. Ultraconserved or exclusive SSRs among cross-species orthologous genes could be effectively retrieved and displayed through a friendly interface design. Four different types of test cases were applied to demonstrate and verify the retrieved SSR biomarker candidates. Through statistical analysis and enhanced tag cloud representation of defined functionally related genes and cross-species clusters, the proposed system can correctly represent the patterns, loci, colors, and sizes of identified SSRs in accordance with gene functions, pattern qualities, and conserved characteristics among species.
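
The abstract does not describe how SSRs are detected or compared, so the following is only a minimal sketch of the general idea, assuming perfect tandem repeats with units of 1 to 6 bases and at least three copies. The function names `find_ssrs` and `conserved_and_exclusive`, the thresholds, and the example sequences are all hypothetical and are not taken from the published system.

```python
import re

def find_ssrs(sequence, min_unit=1, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for each perfect tandem repeat in a DNA string."""
    seq = sequence.upper()
    # Group 2 is the repeat unit (shortest unit tried first);
    # group 1 is the whole repeat tract.
    pattern = re.compile(
        r"(([ACGT]{%d,%d}?)\2{%d,})" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(2)
        hits.append((match.start(), unit, len(match.group(1)) // len(unit)))
    return hits

def conserved_and_exclusive(sequences_by_species):
    """Split SSR motifs into those shared by all species and those unique to one."""
    motifs = {
        species: {unit for _, unit, _ in find_ssrs(seq)}
        for species, seq in sequences_by_species.items()
    }
    conserved = set.intersection(*motifs.values())
    exclusive = {}
    for species, own in motifs.items():
        others = [m for name, m in motifs.items() if name != species]
        exclusive[species] = own - set().union(*others)
    return conserved, exclusive

# Hypothetical orthologous fragments, for illustration only.
example = {
    "human": "GGCACACACACATT",
    "mouse": "TTCACACAGGAGAGAGTT",
}
conserved, exclusive = conserved_and_exclusive(example)
```

This sketch treats rotated motifs such as `GA` and `AG` as different, and it ignores imperfect repeats; a real comparison would likely normalize motifs and restrict the search to orthologous gene regions.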


Subject(s)
Genetic Markers/genetics , Genomics/methods , Microsatellite Repeats/genetics , Models, Genetic , Animals , Computer Simulation , DNA/chemistry , DNA/genetics , Databases, Genetic , Humans , Species Specificity
3.
Int J Data Min Bioinform ; 10(2): 121-45, 2014.
Article in English | MEDLINE | ID: mdl-25796734

ABSTRACT

Selecting informative genes is the most important task for data analysis on microarray gene expression data. In this work, we aim at identifying regulatory gene pairs from microarray gene expression data. However, microarray data often contain multiple missing expression values. Missing value imputation is thus needed before further processing for regulatory gene pairs becomes possible. We develop a novel approach to first impute missing values in microarray time series data by combining k-Nearest Neighbour (KNN), Dynamic Time Warping (DTW) and Gene Ontology (GO). After missing values are imputed, we then perform gene regulation prediction based on our proposed DTW-GO distance measurement of gene pairs. Experimental results show that our approach is more accurate when compared with existing missing value imputation methods on real microarray data sets. Furthermore, our approach can also discover more regulatory gene pairs that are known in the literature than other methods.
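
As a rough illustration of the imputation step, the sketch below combines a textbook Dynamic Time Warping distance with a simple k-nearest-neighbour average. It is an assumption-laden simplification, not the authors' method: the Gene Ontology component of the DTW-GO distance is omitted, neighbours are assumed to be complete profiles, and the names `dtw_distance` and `knn_impute` are hypothetical.

```python
import math

def dtw_distance(a, b):
    """Classic DTW distance between two numeric series (absolute-difference cost)."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed warping moves.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def knn_impute(target, candidates, k=2):
    """Fill None entries in `target` with the mean of the k DTW-nearest profiles.

    Assumes every candidate profile is complete and as long as `target`.
    """
    observed = [i for i, value in enumerate(target) if value is not None]
    target_observed = [target[i] for i in observed]
    ranked = sorted(
        candidates,
        key=lambda c: dtw_distance(target_observed, [c[i] for i in observed]),
    )
    neighbours = ranked[:k]
    return [
        value if value is not None
        else sum(c[i] for c in neighbours) / len(neighbours)
        for i, value in enumerate(target)
    ]

# Hypothetical expression profiles over three time points.
profiles = [[1.0, 2.0, 3.0], [1.0, 4.0, 3.0], [9.0, 9.0, 9.0]]
filled = knn_impute([1.0, None, 3.0], profiles, k=2)
```

Here the distance is computed only on the time points observed in the target profile; an unweighted mean is used for simplicity, whereas a distance-weighted mean would be an equally reasonable choice.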


Subject(s)
Algorithms , Databases, Genetic , Gene Expression Profiling/methods , Gene Ontology , Genes, Regulator/genetics , Oligonucleotide Array Sequence Analysis/methods , Protein Interaction Mapping/methods , Data Mining/methods , Gene Expression Regulation/genetics
4.
Int J Data Min Bioinform ; 7(2): 214-27, 2013.
Article in English | MEDLINE | ID: mdl-23777177

ABSTRACT

To determine the structure of a protein by X-ray crystallography, the protein must first be purified and crystallized. However, some proteins cannot be crystallized, which makes the average cost of protein structure determination much higher. It is therefore desirable to predict the crystallizability of a protein by a computational method before starting the wet-lab procedure. Features from the primary structure of a target protein are collected first. With a proper set of features, protein crystallizability can be predicted with high accuracy. In this research, 74 features from previous studies are re-examined by two filter-mode feature selection methods. The selected features are then used for crystallization prediction by three versions of AdaBoost. Support Vector Machines (SVMs) are also tested for comparison. The best prediction accuracy of AdaBoost reaches 93 percent, and 48 important features are identified from the collected 74 features.
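
The abstract does not name the two filter-mode methods, so the sketch below only illustrates what a filter-style selection looks like in general: each feature is scored independently of any classifier and the top-scoring ones are kept. Absolute Pearson correlation with the class label is used purely as a stand-in score, and the function names, feature names, and data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either series has zero variance."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)

def rank_features(rows, labels, names):
    """Order feature names by |correlation with the label|, strongest first."""
    scores = {
        name: abs(pearson([row[j] for row in rows], labels))
        for j, name in enumerate(names)
    }
    return sorted(names, key=lambda name: scores[name], reverse=True)

def select_top(rows, labels, names, k):
    """Keep the k best-ranked features (filter mode: no classifier involved)."""
    return rank_features(rows, labels, names)[:k]

# Hypothetical sequence-derived features for four proteins
# (label 1 = crystallizable, 0 = not).
names = ["hydrophobicity", "constant", "noise"]
rows = [
    [0.1, 5.0, 1.0],
    [0.2, 5.0, 0.0],
    [0.9, 5.0, 1.0],
    [0.8, 5.0, 0.0],
]
labels = [0, 0, 1, 1]
best = select_top(rows, labels, names, k=1)
```

The reduced feature set would then be passed to a classifier such as AdaBoost or an SVM; that training step is left out here.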


Subject(s)
Crystallography, X-Ray , Proteins/chemistry , Computational Biology , Crystallization , Databases, Protein , Support Vector Machine
5.
BMC Bioinformatics ; 14 Suppl 4: S4, 2013.
Article in English | MEDLINE | ID: mdl-23514235

ABSTRACT

BACKGROUND: Protein-ligand interactions are key processes in triggering and controlling biological functions within cells. Prediction of protein binding regions on the protein surface assists in understanding the mechanisms and principles of molecular recognition. In silico geometrical shape analysis is a primary step in analyzing the spatial characteristics of protein binding regions and facilitates applications of bioinformatics in drug discovery and design. Here, we describe the novel software PLB-SAVE, which uses parallel processing technology and is ideally suited to extract the geometrical construct of solid angles from surface atoms. Representative clusters and corresponding anchors were identified from all surface elements and were assigned according to the ranking of their solid angles. In addition, cavity depth indicators were obtained by proportional transformation of solid angles, and cavity volumes were calculated by scanning multiple directional vectors within each selected cavity. Both depth and volume characteristics were combined with various weighting coefficients to rank predicted potential binding regions. RESULTS: Two test datasets from LigASite, each containing 388 bound and unbound structures, were used to predict binding regions using PLB-SAVE and two well-known prediction systems, SiteHound and MetaPocket2.0 (MPK2). PLB-SAVE outperformed the other programs with accuracy rates of 94.3% for unbound proteins and 95.5% for bound proteins via a tenfold cross-validation process. Additionally, because the parallel processing architecture was designed to enhance computational efficiency, we obtained an average 160-fold speedup in computation. CONCLUSIONS: In silico binding region prediction is considered the initial stage in structure-based drug design. To improve the efficacy of biological experiments for drug development, we developed PLB-SAVE, which uses only geometrical features of proteins and achieves good overall performance for protein-ligand binding region prediction. Based on the same approach and rationale, this method can also be applied to predict carbohydrate-antibody interactions for further design and development of carbohydrate-based vaccines. PLB-SAVE is available at http://save.cs.ntou.edu.tw.


Subject(s)
Ligands , Proteins/chemistry , Software , Vaccines/chemistry , Computational Biology/methods , Computer Simulation , Databases, Protein , Drug Design , Humans , Models, Molecular , Protein Binding , Protein Structure, Tertiary , Proteins/metabolism