Results 1 - 7 of 7
1.
IEEE Trans Pattern Anal Mach Intell ; 44(8): 4052-4064, 2022 Aug.
Article in English | MEDLINE | ID: mdl-33571089

ABSTRACT

Nonnegative matrix factorization (NMF) is a linear dimensionality reduction technique for analyzing nonnegative data. A key aspect of NMF is the choice of the objective function, which depends on the noise model (or statistics of the noise) assumed on the data. In many applications, the noise model is unknown and difficult to estimate. In this paper, we define a multi-objective NMF (MO-NMF) problem, where several objectives are combined within the same NMF model. We propose to use Lagrange duality to judiciously optimize for a set of weights to be used within the framework of the weighted-sum approach, that is, we minimize a single objective function which is a weighted sum of all the objective functions. We design a simple algorithm based on multiplicative updates to minimize this weighted sum. We show how this can be used to find distributionally robust NMF (DR-NMF) solutions, that is, solutions that minimize the largest error among all objectives, using a dual approach solved via a heuristic inspired by the Frank-Wolfe algorithm. We illustrate the effectiveness of this approach on synthetic, document and audio data sets. The results show that DR-NMF is robust to our lack of knowledge of the noise model of the NMF problem.
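The two formulations named in this abstract can be written out as a sketch. The symbols are assumptions introduced here for illustration and are not taken from the paper: V is the nonnegative data matrix, W and H are the factors, D_1, ..., D_p are the candidate objective functions, and λ is the weight vector.

```latex
% Weighted-sum approach: one scalar objective for a fixed weight vector \lambda
\min_{W \ge 0,\, H \ge 0} \; \sum_{i=1}^{p} \lambda_i \, D_i(V \,\|\, WH),
\qquad
\lambda \in \Delta_p = \Big\{ \lambda \ge 0 : \sum_{i=1}^{p} \lambda_i = 1 \Big\}

% Distributionally robust NMF: minimize the largest error among all objectives
\min_{W \ge 0,\, H \ge 0} \; \max_{1 \le i \le p} D_i(V \,\|\, WH)
\;=\;
\min_{W \ge 0,\, H \ge 0} \; \max_{\lambda \in \Delta_p} \; \sum_{i=1}^{p} \lambda_i \, D_i(V \,\|\, WH)
```

The equality holds because a linear function of λ over the simplex attains its maximum at a vertex, that is, at a single objective. Swapping the min and the max gives a dual problem over the weights, which is presumably what the "dual approach" in the abstract refers to.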

2.
Annu Int Conf IEEE Eng Med Biol Soc ; 2021: 4152-4158, 2021 11.
Article in English | MEDLINE | ID: mdl-34892140

ABSTRACT

Mortality risk is a major concern for patients who have just been discharged from the intensive care unit (ICU). Many studies have constructed machine learning models to predict such risk. Although these models are highly accurate, they are less amenable to interpretation, and clinicians are typically unable to gain further insights into the patients' health conditions and the underlying factors that influence their mortality risk. In this paper, we use patients' profiles extracted from the MIMIC-III clinical database to construct risk calculators based on different machine learning techniques such as logistic regression, decision trees, random forests, k-nearest neighbors and multilayer perceptrons. We perform an extensive benchmarking study that compares the most salient features as predicted by various methods. We observe a high degree of agreement across the considered machine learning methods; in particular, age, blood urea nitrogen level and an indicator variable (whether the patient is discharged from the cardiac surgery recovery unit) are commonly predicted to be the most salient features for determining patients' mortality risks. Our work has the potential to help clinicians interpret risk predictions.


Subjects
Intensive Care Units , Machine Learning , Databases, Factual , Humans , Logistic Models , Neural Networks, Computer
3.
Entropy (Basel) ; 21(5), 2019 May 07.
Article in English | MEDLINE | ID: mdl-33267192

ABSTRACT

We revisit the distributed hypothesis testing (or hypothesis testing with communication constraints) problem from the viewpoint of privacy. Instead of observing the raw data directly, the transmitter observes a sanitized or randomized version of it. We impose an upper bound on the mutual information between the raw and randomized data. Under this scenario, the receiver, which is also provided with side information, is required to make a decision on whether the null or alternative hypothesis is in effect. We first provide a general lower bound on the type-II exponent for an arbitrary pair of hypotheses. Next, we show that if the distribution under the alternative hypothesis is the product of the marginals of the distribution under the null (i.e., testing against independence), then the exponent is known exactly. Moreover, we show that the strong converse property holds. Using ideas from Euclidean information theory, we also provide an approximate expression for the exponent when the communication rate is low and the privacy level is high. Finally, we illustrate our results with a binary and a Gaussian example.
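As an illustration of bounding the mutual information between raw and randomized data, the sketch below computes the leakage of a binary randomized-response mechanism. The mechanism, the uniform input distribution, and the function names are assumptions chosen for illustration; the abstract does not specify a particular sanitization scheme.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def randomized_response_leakage(flip_prob: float) -> float:
    """Mutual information I(X; Y) in bits for a hypothetical mechanism.

    X is a uniform raw bit and Y equals X flipped with probability
    flip_prob (a binary symmetric channel), so I(X; Y) = 1 - h(flip_prob).
    """
    if not 0.0 <= flip_prob <= 1.0:
        raise ValueError("flip_prob must lie in [0, 1]")
    return 1.0 - binary_entropy(flip_prob)


if __name__ == "__main__":
    for q in (0.0, 0.1, 0.3, 0.5):
        print(f"flip_prob={q:.1f}  leakage={randomized_response_leakage(q):.3f} bits")
```

A privacy constraint of the form I(X; Y) ≤ ε then translates into a minimum flip probability: the closer flip_prob is to 0.5, the less the randomized bit reveals about the raw bit.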

4.
Nat Med ; 22(6): 606-13, 2016 06.
Article in English | MEDLINE | ID: mdl-27183217

ABSTRACT

Human leukocyte antigen class I (HLA)-restricted CD8+ T lymphocyte (CTL) responses are crucial to HIV-1 control. Although HIV can evade these responses, the longer-term impact of viral escape mutants remains unclear, as these variants can also reduce intrinsic viral fitness. To address this, we here developed a metric to determine the degree of HIV adaptation to an HLA profile. We demonstrate that transmission of viruses that are pre-adapted to the HLA molecules expressed in the recipient is associated with impaired immunogenicity, elevated viral load and accelerated CD4+ T cell decline. Furthermore, the extent of pre-adaptation among circulating viruses explains much of the variation in outcomes attributed to the expression of certain HLA alleles. Thus, viral pre-adaptation exploits 'holes' in the immune response. Accounting for these holes may be key for vaccine strategies seeking to elicit functional responses from viral variants, and to HIV cure strategies that require broad CTL responses to achieve successful eradication of HIV reservoirs.


Subjects
Adaptation, Physiological/immunology , CD8-Positive T-Lymphocytes/immunology , HIV Infections/transmission , HIV-1/immunology , Histocompatibility Antigens Class I/immunology , Immune Evasion/immunology , AIDS Vaccines/immunology , Africa, Southern , British Columbia , CD4 Lymphocyte Count , Cohort Studies , Evolution, Molecular , HIV Infections/immunology , HIV-1/genetics , Humans , Immune Evasion/genetics , Immunity, Cellular/immunology , Linear Models , Models, Immunological , Proportional Hazards Models , Receptors, Antigen, T-Cell/immunology , Viral Load , Virus Replication/genetics
5.
IEEE Trans Neural Netw Learn Syst ; 25(12): 2226-39, 2014 Dec.
Article in English | MEDLINE | ID: mdl-25420245

ABSTRACT

We propose a novel framework of using a parsimonious statistical model, known as mixture of Gaussian trees, for modeling the possibly multimodal minority class to solve the problem of imbalanced time-series classification. By exploiting the fact that close-by time points are highly correlated due to smoothness of the time-series, our model significantly reduces the number of covariance parameters to be estimated from O(d^2) to O(Ld), where L is the number of mixture components and d is the dimensionality. Thus, our model is particularly effective for modeling high-dimensional time-series with a limited number of instances in the minority positive class. In addition, the computational complexity for learning the model is only of the order O(L n_+ d^2), where n_+ is the number of positively labeled samples. We conduct extensive classification experiments based on several well-known time-series data sets (both single- and multimodal) by first randomly generating synthetic instances from our learned mixture model to correct the imbalance. We then compare our results with several state-of-the-art oversampling techniques and the results demonstrate that when our proposed model is used in oversampling, the same support vector machines classifier achieves much better classification accuracy across the range of data sets. In fact, the proposed method achieves the best average performance 30 times out of 36 multimodal data sets according to the F-value metric. Our results are also highly competitive compared with nonoversampling-based classifiers for dealing with imbalanced time-series data sets.
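To give a feel for the reduction in covariance parameters quoted above, the sketch below compares the two growth rates for a few sizes. The function names and the example values of d and L are hypothetical, and the figures are orders of growth taken from the abstract, not exact parameter counts for the authors' model.

```python
def full_covariance_order(d: int) -> int:
    """Order of growth for an unstructured covariance matrix: d squared."""
    return d * d


def tree_mixture_order(num_components: int, d: int) -> int:
    """Order of growth quoted for a mixture of Gaussian trees: L times d."""
    return num_components * d


if __name__ == "__main__":
    # Hypothetical series lengths d with L = 4 mixture components.
    for d in (50, 500, 5000):
        full = full_covariance_order(d)
        tree = tree_mixture_order(4, d)
        print(f"d={d:5d}  d^2={full:10d}  L*d={tree:7d}  ratio={full / tree:8.1f}")
```

The ratio d / L grows linearly with the series length, which is why the saving matters most for long time-series with few minority-class instances.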

6.
IEEE Trans Pattern Anal Mach Intell ; 35(7): 1592-605, 2013 Jul.
Article in English | MEDLINE | ID: mdl-23681989

ABSTRACT

This paper addresses the estimation of the latent dimensionality in nonnegative matrix factorization (NMF) with the β-divergence. The β-divergence is a family of cost functions that includes the squared Euclidean distance, Kullback-Leibler (KL) and Itakura-Saito (IS) divergences as special cases. Learning the model order is important as it is necessary to strike the right balance between data fidelity and overfitting. We propose a Bayesian model based on automatic relevance determination (ARD) in which the columns of the dictionary matrix and the rows of the activation matrix are tied together through a common scale parameter in their prior. A family of majorization-minimization (MM) algorithms is proposed for maximum a posteriori (MAP) estimation. A subset of scale parameters is driven to a small lower bound in the course of inference, with the effect of pruning the corresponding spurious components. We demonstrate the efficacy and robustness of our algorithms by performing extensive experiments on synthetic data, the swimmer dataset, a music decomposition example, and a stock price prediction task.
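The special cases mentioned in this abstract can be checked with a small scalar implementation of the β-divergence. This is a minimal sketch using the commonly used definition for strictly positive scalars; the function name is hypothetical, and under this definition β = 2 yields half the squared Euclidean distance.

```python
import math


def beta_divergence(x: float, y: float, beta: float) -> float:
    """Scalar beta-divergence d_beta(x | y) for strictly positive x and y.

    beta = 2 gives half the squared Euclidean distance, beta = 1 the
    (generalized) Kullback-Leibler divergence, and beta = 0 the
    Itakura-Saito divergence. The last two are the limits of the
    general formula, so they are handled as separate cases.
    """
    if x <= 0 or y <= 0:
        raise ValueError("x and y must be strictly positive")
    if beta == 1:
        return x * math.log(x / y) - x + y
    if beta == 0:
        return x / y - math.log(x / y) - 1
    numerator = x**beta + (beta - 1) * y**beta - beta * x * y ** (beta - 1)
    return numerator / (beta * (beta - 1))


if __name__ == "__main__":
    for b in (2, 1, 0):
        print(f"beta={b}: d(3 | 1) = {beta_divergence(3.0, 1.0, b):.4f}")
```

For matrices, the cost is typically the sum of this scalar divergence over all entries of the data matrix and its approximation.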


Subjects
Algorithms , Models, Theoretical , Pattern Recognition, Automated/methods , Bayes Theorem , Computer Simulation , Databases, Factual , Economics , Humans , Models, Statistical , Swimming
7.
Am J Respir Crit Care Med ; 181(11): 1200-6, 2010 Jun 01.
Article in English | MEDLINE | ID: mdl-20167852

ABSTRACT

RATIONALE: The pattern of IgE response (over time or to specific allergens) may reflect different atopic vulnerabilities which are related to the presence of asthma in a fundamentally different way from the current definition of atopy. OBJECTIVES: To redefine the atopic phenotype by identifying latent structure within a complex dataset, taking into account the timing and type of sensitization to specific allergens, and relating these novel phenotypes to asthma. METHODS: In a population-based birth cohort in which multiple skin and IgE tests have been taken throughout childhood, we used a machine learning approach to cluster children into multiple atopic classes in an unsupervised way. We then investigated the relation between these classes and asthma (symptoms, hospitalizations, lung function and airway reactivity). MEASUREMENTS AND MAIN RESULTS: A five-class model indicated a complex latent structure, in which children with atopic vulnerability were clustered into four distinct classes (Multiple Early [112/1053, 10.6%]; Multiple Late [171/1053, 16.2%]; Dust Mite [47/1053, 4.5%]; and Non-dust Mite [100/1053, 9.5%]), with a fifth class describing children with No Latent Vulnerability (623/1053, 59.2%). The association with asthma was considerably stronger for the Multiple Early class compared with other classes and conventionally defined atopy (odds ratio [95% CI]: 29.3 [11.1-77.2] versus 12.4 [4.8-32.2] versus 11.6 [4.8-27.9] for Multiple Early class versus Ever Atopic versus Atopic age 8). Lung function and airway reactivity were significantly poorer among children in the Multiple Early class. Cox regression demonstrated a highly significant increase in risk of hospital admissions for wheeze/asthma after age 3 yr only among children in the Multiple Early class (HR 9.2 [3.5-24.0], P < 0.001). CONCLUSIONS: IgE antibody responses do not reflect a single phenotype of atopy, but several different atopic vulnerabilities which differ in their relation with asthma presence and severity.
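The class sizes reported in this abstract can be cross-checked with a few lines of arithmetic. The counts below are copied from the abstract; the variable names are introduced here for illustration only.

```python
# Class counts as reported in the abstract (cohort of 1053 children).
class_counts = {
    "Multiple Early": 112,
    "Multiple Late": 171,
    "Dust Mite": 47,
    "Non-dust Mite": 100,
    "No Latent Vulnerability": 623,
}
cohort_size = 1053

# Percentage share of each class, rounded to one decimal place.
class_shares = {
    name: round(100 * count / cohort_size, 1)
    for name, count in class_counts.items()
}

if __name__ == "__main__":
    for name, share in class_shares.items():
        print(f"{name:25s} {class_counts[name]:4d}/{cohort_size}  {share:5.1f}%")
    print("classes cover the whole cohort:",
          sum(class_counts.values()) == cohort_size)
```

The five counts sum to the full cohort, and the rounded shares match the percentages quoted in the abstract.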


Subjects
Asthma/classification , Animals , Asthma/epidemiology , Asthma/immunology , Child , Cluster Analysis , Cohort Studies , Disease Susceptibility , Female , Hospitalization , Humans , Immunoglobulin E/blood , Male , Mothers , Multivariate Analysis , Phenotype , Plethysmography, Whole Body , Pyroglyphidae , Respiratory Function Tests , Respiratory Sounds , Skin Tests , Smoking/epidemiology , Spirometry , United Kingdom/epidemiology