ABSTRACT
Many computer-aided diagnosis (CAD) problems are best modelled as multiple-instance learning (MIL) problems with unbalanced data, i.e., the training data typically consist of a few positive bags and a very large number of negative instances. Existing MIL algorithms are much too computationally expensive for these datasets. We describe CH, a framework for learning a Convex Hull representation of multiple instances that is significantly faster than existing MIL algorithms. Our CH framework applies to any standard hyperplane-based learning algorithm and, for some algorithms, is guaranteed to find the globally optimal solution. Experimental studies on two different CAD applications further demonstrate that the proposed algorithm significantly improves diagnostic accuracy when compared to both MIL and traditional classifiers. Although not designed for standard MIL problems (which have both positive and negative bags and relatively balanced datasets), comparisons against other MIL methods on benchmark problems also indicate that the proposed method is competitive with the state of the art.
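A minimal sketch of the convex-hull idea, assuming each bag is summarized by one point formed as a convex combination of its instances and then scored by a hyperplane. The function names, the toy bag, and the hyperplane (w, b) are illustrative assumptions, not taken from the paper, and the sketch does not show how the weights or the hyperplane are learned.

```python
def convex_combination(instances, weights):
    """Return sum_j weights[j] * instances[j], a point in the bag's convex hull."""
    if len(instances) != len(weights):
        raise ValueError("need exactly one weight per instance")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    dim = len(instances[0])
    return [sum(w * x[k] for w, x in zip(weights, instances)) for k in range(dim)]


def bag_score(instances, weights, w, b):
    """Score the bag's representative point with the hyperplane (w, b)."""
    point = convex_combination(instances, weights)
    return sum(wk * pk for wk, pk in zip(w, point)) + b


# Hypothetical positive bag: three candidate regions with two features each.
bag = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
w, b = [1.0, 1.0], -1.5  # hypothetical hyperplane

uniform_score = bag_score(bag, [1 / 3, 1 / 3, 1 / 3], w, b)  # all instances alike
one_hot_score = bag_score(bag, [0.0, 1.0, 0.0], w, b)        # weight on one instance
```

With uniform weights the representative point falls on the negative side of this hyperplane, while putting all weight on the second instance moves it to the positive side. This illustrates why letting the weights vary per bag can matter when only some instances in a positive bag are truly positive.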
Subjects
Algorithms , Artificial Intelligence , Colonic Neoplasms/diagnostic imaging , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Pulmonary Embolism/diagnostic imaging , Humans , Radiography , Reproducibility of Results , Sensitivity and Specificity
ABSTRACT
BACKGROUND AND PURPOSE: Hypoxia is a common feature of solid tumors and is associated with therapy resistance, increased malignancy and poor prognosis. Several approaches have been developed to identify patients harboring hypoxic tumors, including microarray-based gene signatures. However, studies to date have largely ignored the strong time dependency of hypoxia-regulated gene expression. We hypothesized that using time-dependent patterns of gene expression during hypoxia would enable the development of superior prognostic expression signatures. MATERIALS AND METHODS: Using published data from the microarray study of Chi et al., we extracted gene signatures correlating with induction during either early or late hypoxic exposure. Gene signatures were derived from a human mammary epithelial cell line (HMEC) exposed in vitro to 0% or 2% oxygen. Gene signatures correlating with early and late up-regulation were tested by Kaplan-Meier survival, univariate, and multivariate analysis on a data set of patients with primary breast cancer treated conventionally (surgery plus, where indicated, radiotherapy and systemic therapy). RESULTS: The two early hypoxia gene signatures extracted from 0% and 2% hypoxia showed significant prognostic power (log-rank test: p=0.004 at 0%, p=0.034 at 2%), in contrast to the late hypoxia signatures. Both early gene signatures were linked to the insulin pathway. In the multivariate Cox regression analysis, the early hypoxia signature (p=0.254) was the 4th best prognostic factor after lymph node status (p=0.002), tumor size (p=0.016) and Elston grade (p=0.111). On this data set it provided more information than ER status or p53 status. CONCLUSIONS: Hypoxic stress elicits a wide panel of temporal responses corresponding to different biological pathways. Early hypoxia signatures were shown to have significant prognostic power. These data suggest that gene signatures identified from in vitro experiments could contribute to individualized medicine.
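For readers unfamiliar with the Kaplan-Meier analysis mentioned above, the following is a minimal sketch of the product-limit estimator. The follow-up times and event flags are made-up numbers, not data from the study, and the function name is hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up durations
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Subjects censored at t still count as at risk at t.
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up (e.g. in years) for five patients.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

In a signature study, one such curve would be computed per patient group (for example, high versus low signature score) and the curves compared with a log-rank test, which this sketch does not implement.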
Subjects
Cell Hypoxia/genetics , Gene Expression Profiling , Hypoxia-Inducible Factor 1/genetics , Hypoxia-Inducible Factor 1/metabolism , Neoplasms/genetics , Oxygen/metabolism , Databases, Genetic , Epithelial Cells/metabolism , Female , Humans , Middle Aged , Neoplasms/diagnosis , Neoplasms/physiopathology , Oligonucleotide Array Sequence Analysis , Predictive Value of Tests , Prognosis , Survival Analysis , Time Factors
Subjects
Artificial Intelligence , Coronary Artery Disease/diagnostic imaging , Echocardiography/methods , Image Enhancement/methods , Image Interpretation, Computer-Assisted/methods , Pattern Recognition, Automated/methods , Ventricular Dysfunction, Left/diagnostic imaging , Algorithms , Cluster Analysis , Coronary Artery Disease/complications , Humans , Linear Models , Models, Cardiovascular , Reproducibility of Results , Sensitivity and Specificity , Ventricular Dysfunction, Left/complications
ABSTRACT
Machine learning techniques have been widely used to predict cognitive processes from fMRI data. However, these models do not describe the fMRI signal well when it is generated by multiple cognitive processes that are simultaneously active. In this paper we consider the problem of accurately modeling the fMRI signal of a human subject who is performing a task involving multiple concurrent cognitive processes. We present a Hierarchical Clustering extension of Hidden Process Models which, by taking advantage of automatically discovered similarities in the activation among neighboring voxels, achieves significantly better performance than standard generative models in terms of Average Log Likelihood.
Subjects
Algorithms , Artificial Intelligence , Brain/physiology , Cognition/physiology , Magnetic Resonance Imaging , Models, Biological , Brain Mapping/methods , Humans , Models, Statistical , Pattern Recognition, Visual/physiology , Signal Processing, Computer-Assisted
ABSTRACT
Today's healthcare organizations have both an ethical and a legal responsibility to protect patient privacy. However, the HIPAA privacy rule allows for the release of de-identified patient data for certain purposes. Secure encryption technology can be used to encrypt patient-identified data so that only the owners of the original data can re-identify the patient. It further allows consistent de-identification over episodic collection events.
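The consistency property described above can be illustrated with a short sketch. It is an assumption-laden stand-in: it uses keyed hashing (HMAC-SHA256 from the Python standard library) rather than the encryption scheme the abstract refers to, and the key, identifiers, and function names are hypothetical. The data owner, who holds the key and the original identifiers, re-identifies by recomputing pseudonyms into a lookup table.

```python
import hashlib
import hmac


def pseudonym(patient_id: str, secret_key: bytes) -> str:
    """Deterministic pseudonym: same ID and key always give the same token."""
    return hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()


def build_reidentification_table(patient_ids, secret_key):
    """Owner-side lookup from pseudonym back to the original identifier."""
    return {pseudonym(pid, secret_key): pid for pid in patient_ids}


# Hypothetical key and record numbers, for illustration only.
key = b"example-key-held-by-data-owner"
visit_1 = pseudonym("MRN-0001", key)   # first collection event
visit_2 = pseudonym("MRN-0001", key)   # later collection event, same patient
other = pseudonym("MRN-0002", key)     # different patient

table = build_reidentification_table(["MRN-0001", "MRN-0002"], key)
```

Because the token depends only on the identifier and the key, records collected at different times for the same patient link together without exposing the identifier, and a party without the key cannot regenerate the mapping.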
Subjects
Computer Security , Confidentiality
ABSTRACT
We apply machine learning to the problem of subpopulation assessment for Caesarian Section. In subpopulation assessment, we are interested in making predictions not for a single patient, but for groups of patients. Typically, in any large population, different subpopulations will have different "outcome" rates. In our example, the C-section rate of a population of 22,176 expectant mothers is 16.8%; yet, the 17 physician groups that serve this population have vastly different group C-section rates, ranging from 11% to 23%. The ultimate goal of subpopulation assessment is to determine if these variations in the observed rates can be attributed to (a) variations in intrinsic risk of the patient sub-populations (i.e. some groups contain more "high-risk C-section" patients), or (b) differences in physician practice (i.e. some groups do more C-sections). Our results indicate that although there is some variation in intrinsic risk, there is also much variation in physician practice.
Subjects
Artificial Intelligence , Cesarean Section/statistics & numerical data , Decision Trees , Neural Networks, Computer , Practice Patterns, Physicians'/statistics & numerical data , Data Interpretation, Statistical , Female , Humans , Pregnancy
ABSTRACT
The C-section rate of a population of 22,175 expectant mothers is 16.8%; yet the 17 physician groups that serve this population have vastly different group C-section rates, ranging from 13% to 23%. Our goal is to determine retrospectively whether the variations in the observed rates can be attributed to variations in the intrinsic risk of the patient sub-populations (i.e., some groups contain more "high-risk C-section" patients) or to differences in physician practice (i.e., some groups do more C-sections). We apply machine learning to this problem by training models to predict standard practice from retrospective data. We then use the models of standard practice to evaluate the C-section rate of each physician practice. Our results indicate that although there is variation in intrinsic risk among the groups, there is also much variation in physician practice.
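One way to read the evaluation step is as a comparison of each group's observed rate with the rate a standard-practice model would expect for that group's patients. The sketch below assumes per-patient predicted risks are already available from some trained model; the records, group labels, and function name are invented for illustration and do not reproduce the authors' procedure.

```python
from collections import defaultdict


def group_rates(records):
    """Compare observed and model-expected rates per group.

    records: iterable of (group, predicted_risk, had_csection) where
    predicted_risk is the model's probability and had_csection is 0 or 1.
    """
    totals = defaultdict(lambda: [0, 0.0, 0])  # count, summed risk, observed events
    for group, risk, outcome in records:
        entry = totals[group]
        entry[0] += 1
        entry[1] += risk
        entry[2] += outcome
    result = {}
    for group, (count, risk_sum, observed) in totals.items():
        expected_rate = risk_sum / count   # what the patient mix alone predicts
        observed_rate = observed / count   # what the group actually did
        result[group] = {
            "expected": expected_rate,
            "observed": observed_rate,
            "excess": observed_rate - expected_rate,
        }
    return result


# Two hypothetical groups with the same patient mix but different practice.
toy_records = [
    ("A", 0.10, 0), ("A", 0.20, 0), ("A", 0.30, 1), ("A", 0.20, 0),
    ("B", 0.10, 1), ("B", 0.20, 0), ("B", 0.30, 1), ("B", 0.20, 0),
]
toy_rates = group_rates(toy_records)
```

In this toy data both groups have the same expected rate, so the gap between their observed rates would point to practice differences rather than intrinsic risk. A group whose expected rate is itself high would instead indicate a higher-risk patient mix.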