1.
Genet Epidemiol ; 36(2): 99-106, 2012 Feb.
Article in English | MEDLINE | ID: mdl-22851473

ABSTRACT

Complex genetic disorders are a result of a combination of genetic and nongenetic factors, all potentially interacting. Machine learning methods hold the potential to identify multilocus and environmental associations thought to drive complex genetic traits. Decision trees, a popular machine learning technique, offer a computationally low complexity algorithm capable of detecting associated sets of single nucleotide polymorphisms (SNPs) of arbitrary size, including modern genome-wide SNP scans. However, interpretation of the importance of an individual SNP within these trees can present challenges. We present a new decision tree algorithm denoted as Bagged Alternating Decision Trees (BADTrees) that is based on identifying common structural elements in a bootstrapped set of Alternating Decision Trees (ADTrees). The algorithm is of order nk², where n is the number of SNPs considered and k is the number of SNPs in the tree constructed. Our simulation study suggests that BADTrees have higher power and lower type I error rates than ADTrees alone and comparable power with lower type I error rates compared to logistic regression. We illustrate the application of this method using simulated data as well as data from the Lupus Large Association Study 1 (7,822 SNPs in 3,548 individuals). Our results suggest that BADTrees hold promise as a low computational order algorithm for detecting complex combinations of SNP and environmental factors associated with disease.
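The bagging idea in this abstract can be illustrated with a much simpler stand-in. The sketch below is not the BADTrees algorithm and does not build ADTrees: it assumes a single-split "stump" per bootstrap replicate and simply counts how often each SNP is chosen. The function names, the carrier/non-carrier split, and the scoring rule are all illustrative assumptions.

```python
import random
from collections import Counter


def best_split_snp(genotypes, labels):
    """Return the index of the SNP whose carrier/non-carrier split
    best separates cases (label 1) from controls (label 0)."""
    n_snps = len(genotypes[0])
    best_idx, best_score = 0, -1.0
    for j in range(n_snps):
        carriers = [y for g, y in zip(genotypes, labels) if g[j] > 0]
        others = [y for g, y in zip(genotypes, labels) if g[j] == 0]
        if not carriers or not others:
            continue  # this SNP does not split the sample
        # Difference in case proportion between the two groups.
        score = abs(sum(carriers) / len(carriers) - sum(others) / len(others))
        if score > best_score:
            best_idx, best_score = j, score
    return best_idx


def bagged_selection_counts(genotypes, labels, n_boot=50, seed=0):
    """Resample individuals with replacement n_boot times and count
    how often each SNP is picked as the best single split."""
    rng = random.Random(seed)
    n = len(labels)
    counts = Counter()
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        boot_g = [genotypes[i] for i in idx]
        boot_y = [labels[i] for i in idx]
        counts[best_split_snp(boot_g, boot_y)] += 1
    return counts
```

A SNP that is selected in most bootstrap replicates is a more stable candidate than one that appears in a single tree, which is the intuition behind looking for common elements across a bootstrapped set.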


Subject(s)
Decision Trees , Genetic Diseases, Inborn/genetics , Models, Genetic , Molecular Epidemiology/methods , Polymorphism, Single Nucleotide , Algorithms , Artificial Intelligence , Computer Simulation , Genetic Markers , Genome-Wide Association Study , HLA Antigens/genetics , Humans , Linkage Disequilibrium , Models, Statistical , Reproducibility of Results , Software
2.
Chem Biodivers ; 2(11): 1533-52, 2005 Nov.
Article in English | MEDLINE | ID: mdl-17191953

ABSTRACT

A major pharmaceutical problem is designing diverse and selective lead compounds. The human genome sequence provides opportunities to discover compounds that are protein selective if we can develop methods to identify specificity determinants from sequence alone. We have analyzed sequence and structural diversity of sheep COX-1 and mouse COX-2 proteins by Active Site Profiling (ASP). Eleven residues that should serve as specificity determinants between COX-1 and COX-2 were identified; however, the literature suggests that only one has been utilized in structure-based discovery. ASP was used to create a position-specific scoring matrix, which was used to identify possible cross-reacting proteins from the human sequences. This method proved selective for cyclooxygenases, comparing well with results using BLAST. The methods identify a probable misannotation of a cyclooxygenase that has high sequence similarity scores using BLAST, but that ASP shows does not contain the residues necessary for cyclooxygenase function. ASP analysis of human COX proteins suggests that some specificity determinants that distinguish COX-1 and COX-2 proteins are similar between sheep COX-1/mouse COX-2 and human COX-1/COX-2; however, residue identities at those positions are not necessarily conserved. Our results lay groundwork for development of family-specific pattern recognition methods to selectively match compounds with proteins.
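A position-specific scoring matrix of the kind mentioned above can be sketched in a few lines. This is a generic log-odds PSSM under stated assumptions (uniform background frequencies, a fixed pseudocount, ungapped fragments of equal length); it is not the ASP procedure itself, and the fragments in the usage note are made up rather than real COX residues.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(fragments, pseudocount=1.0):
    """Build a log-odds matrix (one dict per position) from aligned,
    equal-length fragments, assuming a uniform background."""
    length = len(fragments[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(fragments) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        column = Counter(frag[pos] for frag in fragments)
        pssm.append({
            aa: math.log2(((column[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def score_window(pssm, window):
    """Sum the per-position log-odds scores for one window.
    Raises KeyError for letters outside AMINO_ACIDS."""
    return sum(col[aa] for col, aa in zip(pssm, window))


def best_hit(pssm, sequence):
    """Slide the matrix along the sequence; return (score, start index)."""
    width = len(pssm)
    return max(
        (score_window(pssm, sequence[i:i + width]), i)
        for i in range(len(sequence) - width + 1)
    )
```

For example, `best_hit(build_pssm(["HYSW", "HYSW", "HFSW"]), "AAAAHYSWAAAA")` places the best-scoring window at start index 4, where the hypothetical motif sits.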


Subject(s)
Cyclooxygenase 1/chemistry , Cyclooxygenase 1/genetics , Cyclooxygenase 2/chemistry , Cyclooxygenase 2/genetics , Membrane Proteins/chemistry , Membrane Proteins/genetics , Amino Acid Sequence , Animals , Binding Sites/physiology , Cyclooxygenase 1/metabolism , Cyclooxygenase 2/metabolism , Cyclooxygenase 2 Inhibitors/chemistry , Cyclooxygenase 2 Inhibitors/metabolism , Cyclooxygenase Inhibitors/chemistry , Cyclooxygenase Inhibitors/metabolism , Humans , Membrane Proteins/metabolism , Mice , Molecular Sequence Data , Sequence Homology, Amino Acid , Sheep, Domestic
3.
IEEE Trans Biomed Eng ; 57(9): 2229-38, 2010 Sep.
Article in English | MEDLINE | ID: mdl-20172778

ABSTRACT

Bioluminescence tomography (BLT) is an inverse source problem that localizes and quantifies bioluminescent probe distribution in 3-D. The generic BLT model is ill-posed, leading to nonunique solutions and aberrant reconstruction in the presence of measurement noise and optical parameter mismatches. In this paper, we introduce the knowledge of the number of bioluminescence sources to stabilize the BLT problem. Based on this regularized BLT model, we develop a differential evolution-based reconstruction algorithm to determine the source locations and strengths accurately and reliably. Then, we evaluate this novel approach in numerical, phantom, and mouse studies.
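To make the idea of fixing the number of sources and searching with differential evolution concrete, here is a minimal sketch. The forward model is a toy 1-D stand-in (each source contributes `s / (1 + (d - x)^2)` at detector position `d`), not a photon-transport model, and the DE variant (rand/1/bin with greedy replacement), the parameter defaults, and all function names are assumptions rather than the paper's implementation.

```python
import random


def forward(params, detectors):
    """Toy forward model. params is [x1, s1, x2, s2, ...]:
    position and strength of each source."""
    sources = list(zip(params[0::2], params[1::2]))
    return [sum(s / (1.0 + (d - x) ** 2) for x, s in sources) for d in detectors]


def misfit(params, detectors, measured):
    """Sum of squared differences between measured and predicted signals."""
    predicted = forward(params, detectors)
    return sum((m - p) ** 2 for m, p in zip(measured, predicted))


def differential_evolution(measured, detectors, n_sources, bounds,
                           pop_size=30, generations=200, f=0.7, cr=0.9, seed=0):
    """Fit n_sources (position, strength) pairs. bounds is
    [(x_lo, x_hi), (s_lo, s_hi)] and is reused for every source."""
    rng = random.Random(seed)
    dim = 2 * n_sources
    lo, hi = zip(*(bounds * n_sources))
    pop = [[rng.uniform(lo[k], hi[k]) for k in range(dim)] for _ in range(pop_size)]
    cost = [misfit(p, detectors, measured) for p in pop]
    history = [min(cost)]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for k in range(dim):
                if rng.random() < cr or k == forced:
                    v = pop[a][k] + f * (pop[b][k] - pop[c][k])
                    v = min(max(v, lo[k]), hi[k])  # clip to bounds
                else:
                    v = pop[i][k]
                trial.append(v)
            trial_cost = misfit(trial, detectors, measured)
            if trial_cost <= cost[i]:  # greedy replacement
                pop[i], cost[i] = trial, trial_cost
        history.append(min(cost))
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best], history
```

Fixing `n_sources` in advance shrinks the search space to `2 * n_sources` parameters, which is the sense in which prior knowledge of the source count regularizes this toy problem.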


Subject(s)
Image Processing, Computer-Assisted/methods , Luminescent Measurements/methods , Tomography/methods , Animals , Cell Line, Tumor , Computer Simulation , Finite Element Analysis , Humans , Luciferases/chemistry , Luciferases/genetics , Mice , Mice, SCID , Neoplasm Transplantation , Neoplasms/pathology , Phantoms, Imaging , X-Ray Microtomography
4.
J Digit Imaging ; 18(1): 42-54, 2005 Mar.
Article in English | MEDLINE | ID: mdl-15645334

ABSTRACT

We present a fully automated three-dimensional (3-D) segmentation algorithm to extract the colon lumen surface in CT colonography. Focusing on significant-size polyp detection, we target an efficient algorithm that maximizes overall colon coverage, minimizes the extracolonic components, maintains local shape accuracy, and achieves high segmentation speed. Two-dimensional (2-D) image processing techniques are employed first, resulting in automatic seed placement and better colon coverage. This is followed by near-air threshold 3-D region-growing using an improved marching-cubes algorithm, which provides fast and accurate surface generation. The algorithm constructs a well-organized vertex-triangle structure that uniquely employs a hash table method, yielding an order of magnitude speed improvement. We segment two scans, prone and supine, independently, with the goal of improved colon coverage. Both segmentations would be available for subsequent polyp detection systems. Segmenting and analyzing both scans improves surface coverage by at least 6% over supine or prone alone. According to subjective evaluation, the average coverage is about 87.5% of the entire colon. Employing near-air threshold and elongation criteria, only 6% of the data sets include extracolonic components (EC) in the segmentation. The observed surface shape accuracy of the segmentation is adequate for significant-size (6 mm) polyp detection, which is also verified by the results of the prototype detection algorithm. The segmentation takes less than 5 minutes on an AMD 1-GHz single-processor PC, which includes reading the volume data and writing the surface results. The surface-based segmentation algorithm is practical for subsequent polyp detection algorithms in that it produces high coverage, has a low EC rate, maintains local shape accuracy, and has a computational efficiency that makes real-time polyp detection possible. A fully automatic or computer-aided polyp detection system using this technique is likely to benefit future colon cancer early screening.
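Two of the building blocks named in this abstract, threshold-based 3-D region growing and hash-table vertex sharing, can be sketched independently of the marching-cubes step. The code below is a generic illustration under assumptions (6-connectivity, a volume stored as nested lists indexed `[z][y][x]`, an arbitrary threshold); it does not reproduce the paper's seed placement, elongation criteria, or surface extraction.

```python
from collections import deque

NEIGHBORS = ((1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1))


def region_grow(volume, seed, threshold):
    """Return the set of voxels 6-connected to seed whose intensity
    is at or below threshold (air-like voxels)."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    z, y, x = seed
    if volume[z][y][x] > threshold:
        return set()
    region = {seed}
    queue = deque([seed])
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in NEIGHBORS:
            n = (z + dz, y + dy, x + dx)
            if (0 <= n[0] < nz and 0 <= n[1] < ny and 0 <= n[2] < nx
                    and n not in region
                    and volume[n[0]][n[1]][n[2]] <= threshold):
                region.add(n)
                queue.append(n)
    return region


def index_triangles(triangles):
    """Share vertices between triangles by looking each coordinate
    tuple up in a dict (a hash table) instead of searching a list."""
    vertex_index = {}
    vertices, faces = [], []
    for tri in triangles:
        face = []
        for v in tri:
            if v not in vertex_index:
                vertex_index[v] = len(vertices)
                vertices.append(v)
            face.append(vertex_index[v])
        faces.append(tuple(face))
    return vertices, faces
```

The dict lookup makes each vertex insertion an average constant-time operation, which is one plausible reading of why a hash table speeds up building a shared vertex-triangle structure.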


Subject(s)
Colonography, Computed Tomographic/methods , Image Processing, Computer-Assisted/methods , Imaging, Three-Dimensional/methods , Algorithms , Colon/diagnostic imaging , Colonic Polyps/diagnostic imaging , Contrast Media , Humans , Insufflation , Prone Position , Radiographic Image Enhancement/methods , Supine Position
5.
J Chem Inf Model ; 45(6): 1749-58, 2005.
Article in English | MEDLINE | ID: mdl-16309281

ABSTRACT

The utility of the supervised Kohonen self-organizing map was assessed and compared to several statistical methods used in QSAR analysis. The self-organizing map (SOM) describes a family of nonlinear, topology preserving mapping methods with attributes of both vector quantization and clustering that provides visualization options unavailable with other nonlinear methods. In contrast to most chemometric methods, the supervised SOM (sSOM) is shown to be relatively insensitive to noise and feature redundancy. Additionally, sSOMs can make use of descriptors having only nominal linear correlation with the target property. Results herein are contrasted to partial least squares, stepwise multiple linear regression, the genetic functional algorithm, and genetic partial least squares, collectively referred to throughout as the "standard methods". The k-nearest neighbor (kNN) classification method was also applied to provide a direct comparison with a different classification method. The widely studied dihydrofolate reductase (DHFR) inhibition data set of Hansch and Silipo is used to evaluate the ability of sSOMs to classify unknowns as a function of increasing class resolution. The contribution of the sSOM neighborhood kernel to its predictive ability is assessed in two experiments: (1) training with the k-means clustering limit, where the neighborhood radius is zero throughout the training regimen, and (2) training the sSOM until the neighborhood radius is reduced to zero. Results demonstrate that sSOMs provide more accurate predictions than standard linear QSAR methods.
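The role of the neighborhood radius in the two experiments can be seen in a single SOM update step. The sketch below is a plain (unsupervised) online update on a 1-D map with an assumed Gaussian kernel; it is not the supervised SOM used in the paper, and the function names and kernel choice are illustrative.

```python
import math


def best_matching_unit(weights, x):
    """Index of the map unit whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def som_update(weights, x, learning_rate, radius):
    """One online update on a 1-D map. With radius == 0 only the
    winning unit moves, which corresponds to the k-means-like limit."""
    bmu = best_matching_unit(weights, x)
    updated = []
    for i, w in enumerate(weights):
        grid_dist = abs(i - bmu)
        if radius > 0:
            h = math.exp(-(grid_dist ** 2) / (2.0 * radius ** 2))
        else:
            h = 1.0 if grid_dist == 0 else 0.0
        updated.append([wi + learning_rate * h * (xi - wi) for wi, xi in zip(w, x)])
    return updated
```

With a positive radius, units adjacent to the winner are also pulled toward the input, which is what gives the map its topology-preserving behavior; shrinking the radius to zero during training gradually removes that coupling.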


Subject(s)
Artificial Intelligence , Databases, Factual/statistics & numerical data , Pharmacology/statistics & numerical data , Quantitative Structure-Activity Relationship , Algorithms , Cluster Analysis , Data Interpretation, Statistical , Drug Design , Models, Molecular , Models, Statistical , Molecular Conformation , Nonlinear Dynamics , Predictive Value of Tests , Tetrahydrofolate Dehydrogenase/chemistry
6.
J Comput Aided Mol Des ; 18(7-9): 483-93, 2004.
Article in English | MEDLINE | ID: mdl-15729848

ABSTRACT

Modeling non-linear descriptor-target activity/property relationships with many dependent descriptors has been a long-standing challenge in the design of biologically active molecules. In an effort to address this problem, we couple the supervised self-organizing map with the genetic algorithm. Although self-organizing maps are non-linear and topology-preserving techniques that hold great potential for modeling and decoding relationships, the large number of descriptors in typical quantitative structure-activity relationship or quantitative structure-property relationship analysis may lead to spurious correlation(s) and/or difficulty in the interpretation of resulting models. To reduce the number of descriptors to a manageable size, we chose the genetic algorithm for descriptor selection because of its flexibility and efficiency in solving complex problems. Feasibility studies were conducted using six different datasets of moderate-to-large size and moderate-to-great diversity, each with a different biological endpoint. Since favorable training set statistics do not necessarily indicate a highly predictive model, the quality of all models was confirmed by withholding a portion of each dataset for external validation. We also address the variability introduced into modeling through dataset partitioning and through the stochastic nature of the combined genetic algorithm supervised self-organizing map method using the z-score and other tests. Experiments show that the combined method provides comparable accuracy to the supervised self-organizing map alone, while using significantly fewer descriptors in the models generated. We observed consistently better results than partial least squares models. We conclude that the combination of genetic algorithms with the supervised self-organizing map shows great potential as a quantitative structure-activity/property relationship modeling tool.
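Descriptor selection with a genetic algorithm can be sketched as a search over boolean masks. The version below is a generic GA (binary tournament selection, one-point crossover, bit-flip mutation, one elite carried over) and is an assumption about the general shape of such a method, not the authors' implementation. The `fitness` callable is hypothetical: in a setting like the one described it would train a model on the selected descriptors and return a validation score.

```python
import random


def genetic_select(n_descriptors, fitness, pop_size=20, generations=40,
                   mutation_rate=0.05, seed=0):
    """Evolve boolean masks over n_descriptors (n_descriptors >= 2).
    fitness(mask) must be deterministic; higher is better."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_descriptors)]
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        next_pop = [pop[0][:]]  # elitism: keep the current best mask
        while len(next_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=fitness)
            p2 = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randrange(1, n_descriptors)  # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not bit) if rng.random() < mutation_rate else bit
                     for bit in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    history.append(fitness(best))
    return best, history
```

A fitness function that rewards predictive quality while penalizing the number of selected descriptors pushes the search toward the smaller models the abstract reports.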


Subject(s)
Algorithms , Models, Molecular , Quantitative Structure-Activity Relationship