Results 1 - 12 of 12
1.
Proc Mach Learn Res ; 162: 25955-25972, 2022 Jul.
Article in English | MEDLINE | ID: mdl-37139473

ABSTRACT

Transformers have emerged as a preferred model for many tasks in natural language processing and vision. Recent efforts on training and deploying Transformers more efficiently have identified many strategies to approximate the self-attention matrix, a key module in a Transformer architecture. Effective ideas include various prespecified sparsity patterns, low-rank basis expansions and combinations thereof. In this paper, we revisit classical Multiresolution Analysis (MRA) concepts such as Wavelets, whose potential value in this setting remains underexplored thus far. We show that simple approximations based on empirical feedback, together with design choices informed by modern hardware and implementation challenges, eventually yield an MRA-based approach for self-attention with an excellent performance profile across most criteria of interest. We undertake an extensive set of experiments and demonstrate that this multi-resolution scheme outperforms most efficient self-attention proposals and is favorable for both short and long sequences. Code is available at https://github.com/mlpen/mra-attention.
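The paper's full MRA construction is not reproduced here; as a rough NumPy illustration of only the coarsest level of such a scheme (each query attends to block-pooled keys and values instead of individual tokens), with the pooling choice and all shapes being assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def pooled_attention(Q, K, V, block=4):
    """Coarse, low-resolution attention: average-pool keys and values into
    blocks, so each query attends to n // block summaries instead of n
    tokens. This is only the coarsest level of a multiresolution scheme;
    the paper's method additionally refines important regions at finer
    scales, which is not reproduced here."""
    n, d = K.shape
    nb = n // block
    K_c = K[: nb * block].reshape(nb, block, d).mean(axis=1)  # (nb, d)
    V_c = V[: nb * block].reshape(nb, block, d).mean(axis=1)  # (nb, d)
    A = softmax(Q @ K_c.T / np.sqrt(d))  # (n, nb): cost O(n * n/block)
    return A @ V_c

rng = np.random.default_rng(0)
n, d = 16, 8
Q, K, V = rng.normal(size=(3, n, d))
out = pooled_attention(Q, K, V, block=4)
print(out.shape)  # (16, 8)
```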

2.
Proc AAAI Conf Artif Intell ; 35(16): 14138-14148, 2021.
Article in English | MEDLINE | ID: mdl-34745767

ABSTRACT

Transformers have emerged as a powerful tool for a broad range of natural language processing tasks. A key component that drives the impressive performance of Transformers is the self-attention mechanism that encodes the influence or dependence of other tokens on each specific token. While beneficial, the quadratic complexity of self-attention in the input sequence length has limited its application to longer sequences - a topic being actively studied in the community. To address this limitation, we propose Nyströmformer - a model that exhibits favorable scalability as a function of sequence length. Our idea is based on adapting the Nyström method to approximate standard self-attention with O(n) complexity. The scalability of Nyströmformer enables application to longer sequences with thousands of tokens. We perform evaluations on multiple downstream tasks on the GLUE benchmark and IMDB reviews with standard sequence length, and find that our Nyströmformer performs comparably to, and in a few cases even slightly better than, standard self-attention. On longer sequence tasks in the Long Range Arena (LRA) benchmark, Nyströmformer performs favorably relative to other efficient self-attention methods. Our code is available at https://github.com/mlpen/Nystromformer.
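As background, the Nyström-style approximation the paper adapts can be sketched in a few lines of NumPy. This is a simplified sketch: segment means are one landmark choice, and the paper's iterative pseudoinverse approximation is replaced by `np.linalg.pinv` for brevity, so details differ from the actual Nyströmformer implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def nystrom_attention(Q, K, V, m=8):
    """Nyström-style approximation of softmax attention with m landmark
    points, giving O(n) cost in the sequence length n. The full n x n
    attention matrix is never formed: the three factors are (n, m),
    (m, m) and (m, n)."""
    n, d = Q.shape
    s = n // m
    Q_l = Q[: m * s].reshape(m, s, d).mean(axis=1)  # (m, d) landmarks
    K_l = K[: m * s].reshape(m, s, d).mean(axis=1)
    F = softmax(Q @ K_l.T / np.sqrt(d))    # (n, m)
    A = softmax(Q_l @ K_l.T / np.sqrt(d))  # (m, m)
    B = softmax(Q_l @ K.T / np.sqrt(d))    # (m, n)
    return F @ np.linalg.pinv(A) @ (B @ V)  # (n, d)

rng = np.random.default_rng(1)
n, d = 32, 8
Q, K, V = rng.normal(size=(3, n, d))
approx = nystrom_attention(Q, K, V, m=8)
print(approx.shape)  # (32, 8)
```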

3.
Proc AAAI Conf Artif Intell ; 34(4): 5487-5494, 2020 Jun 16.
Article in English | MEDLINE | ID: mdl-34094697

ABSTRACT

Data dependent regularization is known to benefit a wide variety of problems in machine learning. Often, these regularizers cannot be easily decomposed into a sum over a finite number of terms, e.g., a sum over individual example-wise terms. The Fβ measure, Area under the ROC curve (AUCROC) and Precision at a fixed recall (P@R) are some prominent examples that are used in many applications. We find that for most medium to large sized datasets, scalability issues severely limit our ability to leverage the benefits of such regularizers. Importantly, despite some recent progress, the key technical impediment is that such objectives remain difficult to optimize via backpropagation procedures. While an efficient general-purpose strategy for this problem still remains elusive, in this paper, we show that for many data-dependent nondecomposable regularizers that are relevant in applications, sizable gains in efficiency are possible with minimal code-level changes; in other words, no specialized tools or numerical schemes are needed. Our procedure involves a reparameterization followed by a partial dualization - this leads to a formulation that has provably cheap projection operators. We present a detailed analysis of the runtime and convergence properties of our algorithm. On the experimental side, we show that a direct use of our scheme significantly improves the state-of-the-art IoU measures reported for the MSCOCO Stuff segmentation dataset.
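The reparameterization and partial dualization themselves are not shown here; as background for why such objectives resist standard training, a small sketch of AUC-ROC as a pairwise quantity, on synthetic scores and labels:

```python
import numpy as np

def auc_roc(scores, labels):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked
    correctly, ties counting one half. The sum runs over *pairs* of
    examples, so the objective does not decompose into one term per
    example; this is what makes measures like AUCROC, F-beta and P@R
    awkward for standard minibatch backpropagation."""
    pos = scores[labels == 1]
    neg = scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()

scores = np.array([0.9, 0.8, 0.3, 0.2])
labels = np.array([1, 0, 1, 0])
print(auc_roc(scores, labels))  # 0.75
```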

4.
Comput Vis ECCV ; 11218: 575-590, 2018 Sep.
Article in English | MEDLINE | ID: mdl-32318687

ABSTRACT

A sizable body of work on relative attributes provides evidence that relating pairs of images along a continuum of strength pertaining to a visual attribute yields improvements in a variety of vision tasks. In this paper, we show how emerging ideas in graph neural networks can yield a solution to various problems that broadly fall under relative attribute learning. Our main idea is the observation that relative attribute learning naturally benefits from exploiting the graph of dependencies among the different relative attributes of images, especially when only partial ordering is provided at training time. We use message passing to perform end-to-end learning of the image representations, their relationships, and the interplay between different attributes. Our experiments show that this simple framework is effective in achieving competitive accuracy with specialized methods for both relative attribute learning and binary attribute prediction, while relaxing the requirements on the training data, the number of parameters, or both.
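As a generic illustration of the message-passing idea (not the paper's specific architecture), a minimal NumPy sketch in which each node mixes its own features with the mean of its neighbors' features; the graph and all weights are synthetic:

```python
import numpy as np

def message_pass(H, A, W_self, W_nbr, steps=2):
    """Generic neural message passing: at each step every node (here an
    image/attribute representation) combines its own features with the
    mean of its neighbors' features. A plain GNN update, used only to
    illustrate the mechanism."""
    deg = A.sum(axis=1, keepdims=True).clip(min=1)
    for _ in range(steps):
        msg = (A @ H) / deg                    # mean over neighbors
        H = np.tanh(H @ W_self + msg @ W_nbr)  # combine + nonlinearity
    return H

rng = np.random.default_rng(2)
n_nodes, d = 5, 4
H = rng.normal(size=(n_nodes, d))
# Hypothetical symmetric dependency graph over images/attributes.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 0, 1, 0],
              [1, 0, 0, 0, 1],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 1, 0]], dtype=float)
W_self, W_nbr = rng.normal(size=(2, d, d))
out = message_pass(H, A, W_self, W_nbr)
print(out.shape)  # (5, 4)
```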

5.
Artif Intell Med ; 65(2): 89-96, 2015 Oct.
Article in English | MEDLINE | ID: mdl-26363683

ABSTRACT

OBJECTIVE: The ability to predict patient readmission risk is extremely valuable for hospitals, especially under the Hospital Readmission Reduction Program of the Centers for Medicare and Medicaid Services, which went into effect on October 1, 2012. There is a plethora of work in the literature on developing readmission risk prediction models, but most of them do not have sufficient prediction accuracy to be deployed in a clinical setting, partly because different hospitals may have different characteristics in their patient populations. METHODS AND MATERIALS: We propose a generic framework for institution-specific readmission risk prediction, which takes patient data from a single institution and produces a statistical risk prediction model optimized for that particular institution and, optionally, for a specific condition. This provides great flexibility in model building and also yields institution-specific insights into the readmitted patient population. We have experimented with classification methods such as support vector machines, and prognosis methods such as Cox regression. We compared our methods with industry-standard methods such as the LACE model, and showed that the proposed framework is not only more flexible but also more effective. RESULTS: We applied our framework to patient data from three hospitals, and obtained initial results for heart failure (HF), acute myocardial infarction (AMI) and pneumonia (PN) patients, as well as patients with all conditions. On Hospital 2, the LACE model yielded AUCs of 0.57, 0.56, 0.53 and 0.55 for AMI, HF, PN and All Cause readmission prediction, respectively, while the proposed model yielded 0.66, 0.65, 0.63 and 0.74 for the corresponding conditions, all significantly better than the LACE counterparts. The proposed models that leverage all features at discharge time are more accurate than the models that only leverage features at admission time (0.66 vs. 0.61 for AMI, 0.65 vs. 0.61 for HF, 0.63 vs. 0.56 for PN, 0.74 vs. 0.60 for All Cause). Furthermore, the proposed admission-time models already outperform LACE, which is a discharge-time model (0.61 vs. 0.57 for AMI, 0.61 vs. 0.56 for HF, 0.56 vs. 0.53 for PN, 0.60 vs. 0.55 for All Cause). Similar conclusions can be drawn for the other hospitals. The same performance comparison also holds for precision and recall at top-decile predictions. Most of the performance improvements are statistically significant. CONCLUSIONS: The institution-specific readmission risk prediction framework is more flexible and more effective than one-size-fits-all models like LACE, sometimes two to three times more effective. The admission-time models can give early warning signs compared to the discharge-time models, and may help hospital staff intervene early while the patient is still in the hospital.


Subjects
Theoretical Models, Patient Readmission, Humans, Proportional Hazards Models, Risk Assessment, Support Vector Machine
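A minimal sketch of the admission-time vs. discharge-time comparison described above, using plain logistic regression on synthetic data as a stand-in for the paper's SVM and Cox models; all feature names and coefficients are invented:

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, steps=500):
    """Plain logistic regression by gradient descent, used here only as a
    stand-in for the classification and prognosis models of the framework."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(3)
n = 400
# Hypothetical features: columns 0-1 are known at admission time
# (say, age and comorbidity count); column 2 only at discharge
# (say, length of stay).
X = rng.normal(size=(n, 3))
y = (X @ np.array([0.5, 0.5, 1.0]) + rng.normal(size=n) > 0).astype(float)

w_admission = fit_logistic(X[:, :2], y)  # admission-time model
w_discharge = fit_logistic(X, y)         # discharge-time model
print(w_admission.shape, w_discharge.shape)  # (2,) (3,)
```

The discharge-time model sees one extra informative feature, mirroring the paper's finding that discharge-time models are more accurate while admission-time models still allow earlier intervention.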
6.
J Hypertens ; 31(11): 2142-50; discussion 2150, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24077244

ABSTRACT

OBJECTIVE: Data mining represents an alternative approach to identifying new predictors of multifactorial diseases. This work aimed at building an accurate predictive model for incident hypertension using data mining procedures. METHODS: The primary study population consisted of 1605 normotensive individuals aged 20-79 years with 5-year follow-up from the population-based Study of Health in Pomerania (SHIP). The initial set was randomly split into a training and a testing set. We used a probabilistic graphical model, namely a Bayesian network, to create a predictive model for incident hypertension, and compared its predictive performance with the established Framingham risk score for hypertension. Finally, the model was validated in 2887 participants from INTER99, a Danish community-based intervention study. RESULTS: In the training set of SHIP data, the Bayesian network used a small subset of relevant baseline features including age, mean arterial pressure, rs16998073, serum glucose and urinary albumin concentrations. Furthermore, we detected relevant interactions between age and serum glucose as well as between rs16998073 and urinary albumin concentrations (area under the receiver operating characteristic curve, AUC 0.76). The model was confirmed in the SHIP validation set (AUC 0.78) and externally replicated in INTER99 (AUC 0.77). Compared with the established Framingham risk score for hypertension, the predictive performance of the new model was similar in the SHIP validation set and moderately better in INTER99. CONCLUSION: Data mining procedures identified a predictive model for incident hypertension that includes innovative and easy-to-measure variables. The findings promise great applicability in screening settings and clinical practice.


Subjects
Data Mining, Hypertension/epidemiology, Adult, Age Factors, Aged, Algorithms, Bayes Theorem, Female, Germany/epidemiology, Humans, Hypertension/etiology, Male, Middle Aged, Theoretical Models, ROC Curve, Random Allocation, Risk Assessment, Risk Factors, Young Adult
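A toy illustration of prediction in a Bayesian network with an interaction between two parent variables, in the spirit of the model described above. The structure and every probability here are invented for illustration; the real network and its conditional probability tables are learned from SHIP data.

```python
# Toy two-parent Bayesian network for a binary outcome.
cpt_hypertension = {
    # (older, high_glucose): P(incident hypertension | parents)
    (False, False): 0.05,
    (False, True):  0.15,
    (True,  False): 0.20,
    (True,  True):  0.45,  # interaction: jointly worse than either alone
}
p_older, p_high_glucose = 0.4, 0.3  # made-up marginal priors

def p_hypertension():
    """Marginal P(hypertension), summing over parent configurations."""
    total = 0.0
    for older in (False, True):
        for glu in (False, True):
            p_parents = ((p_older if older else 1 - p_older) *
                         (p_high_glucose if glu else 1 - p_high_glucose))
            total += p_parents * cpt_hypertension[(older, glu)]
    return total

print(round(p_hypertension(), 4))  # 0.158
```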
7.
Nat Rev Cardiol ; 10(6): 308-16, 2013 Jun.
Article in English | MEDLINE | ID: mdl-23528962

ABSTRACT

The primary goals of personalized medicine are to optimize diagnostic and treatment strategies by tailoring them to the specific characteristics of an individual patient. In this Review, we summarize basic concepts and methods of personalizing cardiovascular medicine. In-depth characterization of study participants and patients in general practice using standardized methods is a pivotal component of study design in personalized medicine. Standardization and quality assurance of clinical data are similarly important, but in daily practice imprecise definitions of clinical variables can reduce power and introduce bias, which limits the validity of the data obtained as well as their potential clinical applicability. Changes in statistical methods with personalized medicine include a shift from dichotomous outcomes towards continuously measured variables, predictive modelling, individualized medical decisions, subgroup analyses, and data-mining strategies. A variety of approaches to personalized medicine exist in cardiovascular research and clinical practice that might have the potential to individualize diagnostic and therapeutic procedures. For some of the emerging methods, such as data mining, the most efficient way to use these tools is not yet fully understood. In addition, the predictive models, although promising, are far from mature, and are likely to be greatly improved by using available large-scale data sets.


Subjects
Cardiovascular Diseases/therapy, Evidence-Based Medicine/statistics & numerical data, Statistical Models, Precision Medicine/statistics & numerical data, Cardiovascular Diseases/diagnosis, Cardiovascular Diseases/genetics, Data Interpretation, Statistical, Data Mining/statistics & numerical data, Evidence-Based Medicine/methods, Evidence-Based Medicine/standards, Genetic Predisposition to Disease, Humans, Phenotype, Practice Guidelines as Topic, Precision Medicine/methods, Precision Medicine/standards, Prognosis, Quality Indicators, Health Care/statistics & numerical data
8.
PLoS One ; 6(12): e28320, 2011.
Article in English | MEDLINE | ID: mdl-22163293

ABSTRACT

BACKGROUND: Highly parallel analysis of gene expression has recently been used to identify gene sets or 'signatures' to improve patient diagnosis and risk stratification. Once a signature is generated, traditional statistical testing is used to evaluate its prognostic performance. However, due to the dimensionality of microarrays, this can lead to false interpretation of these signatures. PRINCIPAL FINDINGS: A method was developed to test batches of a user-specified number of randomly chosen signatures in patient microarray datasets. The percentage of randomly generated signatures yielding prognostic value was assessed using ROC analysis, by calculating the area under the curve (AUC), in six publicly available cancer patient microarray datasets. We found that a signature consisting of randomly selected genes has an average 10% chance of reaching significance when assessed in a single dataset, but this can range from 1% to ∼40% depending on the dataset in question. Increasing the number of validation datasets markedly reduces this number. CONCLUSIONS: We have shown that the use of an arbitrary cut-off value for evaluating signature significance is not suitable for this type of research; instead, it should be defined for each dataset separately. Our method can be used to establish and evaluate the performance of any derived gene signature in a dataset by comparing it to thousands of randomly generated signatures. It will be of most interest for cases where few data are available and testing in multiple datasets is limited.


Subjects
Computational Biology/methods, Gene Expression Profiling, Gene Expression Regulation, Neoplastic, Neoplasms/genetics, Neoplasms/metabolism, Area Under Curve, Breast Neoplasms/metabolism, Data Interpretation, Statistical, Humans, Kidney Neoplasms/metabolism, Lung Neoplasms/metabolism, Statistical Models, Oligonucleotide Array Sequence Analysis, Prognosis, ROC Curve, Reproducibility of Results
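A hedged sketch of the described procedure on synthetic data: draw many random gene sets of a given size, score each patient, and record the AUC of that score against the outcome, building the null distribution against which a proposed signature can be judged. Scoring a signature by the mean expression of its genes is a simplifying assumption.

```python
import numpy as np

def random_signature_aucs(expr, outcome, sig_size, n_random=200, seed=0):
    """Null distribution for signature significance: AUCs of many
    randomly chosen gene signatures. `expr` is genes x patients."""
    rng = np.random.default_rng(seed)
    pos, neg = outcome == 1, outcome == 0
    aucs = np.empty(n_random)
    for i in range(n_random):
        genes = rng.choice(expr.shape[0], size=sig_size, replace=False)
        score = expr[genes].mean(axis=0)  # one score per patient
        diff = score[pos][:, None] - score[neg][None, :]
        aucs[i] = (diff > 0).mean() + 0.5 * (diff == 0).mean()
    return aucs

rng = np.random.default_rng(42)
expr = rng.normal(size=(200, 60))      # 200 genes x 60 patients
outcome = rng.integers(0, 2, size=60)  # synthetic binary outcome
aucs = random_signature_aucs(expr, outcome, sig_size=20)
print((aucs > 0.65).mean())  # fraction of random signatures that "work"
```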
9.
AMIA Annu Symp Proc ; 2010: 682-6, 2010 Nov 13.
Article in English | MEDLINE | ID: mdl-21347065

ABSTRACT

This paper describes a machine learning, text processing approach that allows the extraction of key medical information from unstructured text in Electronic Medical Records. The approach utilizes a novel text representation that shares the simplicity of the widely used bag-of-words representation, but can also capture some of the semantic information in the text. The large dimensionality of this type of learning model is controlled by ℓ1 regularization, which favors parsimonious models. Experimental results demonstrate the accuracy of the approach in extracting medical assertions that can be associated with polarity and relevance detection.


Subjects
Electronic Health Records, Natural Language Processing, Humans, Semantics
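The paper's novel text representation is not reproduced here; a minimal sketch of the ℓ1-regularized learning component on a plain term-count matrix, fit by proximal gradient descent (soft-thresholding), with synthetic documents and labels:

```python
import numpy as np

def l1_logistic(X, y, lam=0.1, lr=0.1, steps=500):
    """Logistic regression with an l1 penalty, fit by proximal gradient
    descent: a gradient step on the logistic loss followed by
    soft-thresholding, which drives many weights to exactly zero and
    yields the kind of parsimonious model the paper favors."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w = w - lr * X.T @ (p - y) / len(y)                     # gradient
        w = np.sign(w) * np.maximum(np.abs(w) - lr * lam, 0.0)  # prox
    return w

# Tiny synthetic "term-count" matrix: 6 documents x 5 terms, with
# made-up binary assertion labels.
X = np.array([[3, 0, 1, 0, 0], [2, 1, 0, 0, 1], [4, 0, 0, 1, 0],
              [0, 1, 2, 0, 1], [0, 0, 1, 1, 0], [0, 2, 0, 1, 1]],
             dtype=float)
y = np.array([1, 1, 1, 0, 0, 0], dtype=float)
w = l1_logistic(X, y)
print(np.count_nonzero(w), "of", w.size, "weights are non-zero")
```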
10.
IEEE Trans Biomed Eng ; 55(3): 1015-21, 2008 Mar.
Article in English | MEDLINE | ID: mdl-18334393

ABSTRACT

Many computer-aided diagnosis (CAD) problems can be best modelled as a multiple-instance learning (MIL) problem with unbalanced data, i.e., the training data typically consists of a few positive bags, and a very large number of negative instances. Existing MIL algorithms are much too computationally expensive for these datasets. We describe CH, a framework for learning a Convex Hull representation of multiple instances that is significantly faster than existing MIL algorithms. Our CH framework applies to any standard hyperplane-based learning algorithm, and for some algorithms, is guaranteed to find the global optimal solution. Experimental studies on two different CAD applications further demonstrate that the proposed algorithm significantly improves diagnostic accuracy when compared to both MIL and traditional classifiers. Although not designed for standard MIL problems (which have both positive and negative bags and relatively balanced datasets), comparisons against other MIL methods on benchmark problems also indicate that the proposed method is competitive with the state-of-the-art.


Subjects
Algorithms, Artificial Intelligence, Colonic Neoplasms/diagnostic imaging, Image Enhancement/methods, Image Interpretation, Computer-Assisted/methods, Pattern Recognition, Automated/methods, Pulmonary Embolism/diagnostic imaging, Humans, Radiography, Reproducibility of Results, Sensitivity and Specificity
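A minimal sketch of the convex-hull representation idea: each bag is replaced by a single point inside the convex hull of its instances. Uniform weights (the bag mean) are used here; the CH framework learns the combination weights jointly with the hyperplane, which is not reproduced in this sketch.

```python
import numpy as np

def bag_representatives(bags):
    """Replace each multiple-instance bag by one point inside the convex
    hull of its instances: a convex combination with non-negative
    weights summing to one. Uniform weights give the bag mean."""
    reps = []
    for bag in bags:
        w = np.full(len(bag), 1.0 / len(bag))  # uniform convex weights
        reps.append(w @ bag)
    return np.array(reps)

rng = np.random.default_rng(5)
# Three synthetic bags (e.g., candidate lesions detected in one scan each).
bags = [rng.normal(size=(k, 4)) for k in (3, 5, 2)]
reps = bag_representatives(bags)
print(reps.shape)  # (3, 4): one representative point per bag
```

A standard hyperplane classifier can then be trained on `reps` as ordinary single-instance data, which is what makes the scheme compatible with any hyperplane-based learner.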
11.
Radiother Oncol ; 83(3): 374-82, 2007 Jun.
Article in English | MEDLINE | ID: mdl-17532074

ABSTRACT

BACKGROUND AND PURPOSE: Hypoxia is a common feature of solid tumors associated with therapy resistance, increased malignancy and poor prognosis. Several approaches have been developed with the hope of identifying patients harboring hypoxic tumors, including the use of microarray-based gene signatures. However, studies to date have largely ignored the strong time dependency of hypoxia-regulated gene expression. We hypothesized that using time-dependent patterns of gene expression during hypoxia would enable development of superior prognostic expression signatures. MATERIALS AND METHODS: Using published data from the microarray study of Chi et al., we extracted gene signatures correlating with induction during either early or late hypoxic exposure. Gene signatures were derived from a human mammary epithelial cell line (HMEC) exposed in vitro to 0% or 2% oxygen. Gene signatures correlating with early and late up-regulation were tested by means of Kaplan-Meier survival, univariate, and multivariate analysis on a data set of patients with primary breast cancer treated conventionally (surgery plus, when indicated, radiotherapy and systemic therapy). RESULTS: We found that the two early hypoxia gene signatures, extracted at 0% and 2% oxygen, showed significant prognostic power (log-rank test: p=0.004 at 0%, p=0.034 at 2%), in contrast to the late hypoxia signatures. Both early gene signatures were linked to the insulin pathway. In the multivariate Cox regression analysis, the early hypoxia signature (p=0.254) was found to be the 4th best prognostic factor after lymph node status (p=0.002), tumor size (p=0.016) and Elston grade (p=0.111). On this data set it indeed provided more information than ER status or p53 status. CONCLUSIONS: Hypoxic stress elicits a wide panel of temporal responses corresponding to different biological pathways. Early hypoxia signatures were shown to have significant prognostic power. These data suggest that gene signatures identified from in vitro experiments could contribute to individualized medicine.


Subjects
Cell Hypoxia/genetics, Gene Expression Profiling, Hypoxia-Inducible Factor 1/genetics, Hypoxia-Inducible Factor 1/metabolism, Neoplasms/genetics, Oxygen/metabolism, Genetic Databases, Epithelial Cells/metabolism, Female, Humans, Middle Aged, Neoplasms/diagnosis, Neoplasms/physiopathology, Oligonucleotide Array Sequence Analysis, Predictive Value of Tests, Prognosis, Survival Analysis, Time Factors
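As background for the Kaplan-Meier and log-rank analyses mentioned above, a small pure-Python Kaplan-Meier estimator on hypothetical follow-up data:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate, the curve behind log-rank
    comparisons. `events` is 1 for an observed event and 0 for a
    censored observation; processing tied event times one at a time
    yields the same product as the grouped formula."""
    pairs = sorted(zip(times, events))
    n_at_risk = len(pairs)
    surv, curve = 1.0, []
    for t, e in pairs:
        if e == 1:
            surv *= (n_at_risk - 1) / n_at_risk
        curve.append((t, surv))
        n_at_risk -= 1
    return curve

# Hypothetical follow-up times (months); event=0 marks censoring.
curve = kaplan_meier([5, 8, 12, 12, 20], [1, 0, 1, 1, 0])
print(round(curve[-1][1], 3))  # 0.267
```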