Results 1 - 7 of 7

1.
Arch Toxicol ; 92(9): 2913-2922, 2018 Sep.
Article in English | MEDLINE | ID: mdl-29995190

ABSTRACT

The development and application of high-throughput in vitro assays is an important advance for risk assessment in the twenty-first century. However, significant challenges remain in incorporating in vitro assays into routine toxicity-testing practice. In this paper, a robust learning approach was developed to infer the in vivo point of departure (POD) from in vitro assay data. Assay data from the ToxCast and Tox21 projects were used to derive in vitro PODs for several hundred chemicals. These were combined with in vivo PODs from ToxRefDB for rat and mouse liver to build a high-dimensional robust regression model. This approach separates the chemicals into a well-predicted majority set and a minority outlier set, from which salient relationships can then be learned. For both mouse and rat liver PODs, over 93% of chemicals have values inferred from in vitro PODs that fall within ±1 of the in vivo PODs on the log10 scale (the target learning region, or TLR), with R2 of 0.80 (rat) and 0.78 (mouse) for these chemicals. This is comparable with extrapolation between related species (mouse and rat), which places 93% of chemicals within the TLR with an R2 of 0.78. Chemicals in the outlier set tend to have more biologically variable characteristics. With the continued accumulation of high-throughput data for a wide range of chemicals, predictive modeling can provide a valuable complement to adverse outcome pathway-based approaches in risk assessment.
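The majority/outlier split described in this abstract can be illustrated with a toy robust-regression loop: fit, flag large-residual points as outliers, and refit on the remainder. This is a minimal one-dimensional sketch of the general idea, not the paper's high-dimensional model; the function name and the z-score cutoff are invented for illustration.

```python
import statistics

def robust_fit(x, y, z_cut=2.5, iters=5):
    """Iteratively fit a 1-D least-squares line, flag points whose residual
    exceeds z_cut standard deviations as outliers, and refit on the rest.
    Returns (slope, intercept, inlier_mask): a toy analogue of splitting
    chemicals into a well-predicted majority set and an outlier set."""
    mask = [True] * len(x)
    slope = intercept = 0.0
    for _ in range(iters):
        xs = [a for a, m in zip(x, mask) if m]
        ys = [b for b, m in zip(y, mask) if m]
        mx, my = statistics.mean(xs), statistics.mean(ys)
        var = sum((a - mx) ** 2 for a in xs)
        slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / var
        intercept = my - slope * mx
        # Residuals are computed over ALL points so flagged points can re-enter.
        resid = [b - (slope * a + intercept) for a, b in zip(x, y)]
        sd = statistics.pstdev(resid) or 1.0
        mask = [abs(r) <= z_cut * sd for r in resid]
    return slope, intercept, mask
```

On a line y = 2x with one grossly corrupted point, the corrupted point lands in the outlier set and the refit recovers the clean slope.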


Subjects
Models, Theoretical; Toxicity Tests, Chronic/methods; Animals; Databases, Factual; High-Throughput Screening Assays/methods; High-Throughput Screening Assays/statistics & numerical data; Humans; Liver/drug effects; Mice; Rats; Toxicity Tests, Chronic/statistics & numerical data
2.
Neural Netw ; 176: 106383, 2024 Aug.
Article in English | MEDLINE | ID: mdl-38781758

ABSTRACT

Label noise, categorized into closed-set noise and open-set noise, is prevalent in real-world scenarios and can seriously hinder a model's ability to generalize. Identifying noise is challenging because noisy samples closely resemble true positives. Existing approaches often assume a single noise source, oversimplify closed-set noise, or treat open-set noise as toxic and eliminate it, limiting their practical effect. To address these issues, we present a novel approach named uncertainty-guided label correction with wavelet-transformed discriminative representation enhancement (Ultra), designed to mitigate the effects of mixed noise. Specifically, our approach considers a more practical noise setting. To achieve robust mixed-noise identification, we first employ a learnable wavelet filter that obtains discriminative features and automatically filters spurious cues at the representation level. We then introduce a two-fold uncertainty estimation to stably locate noise at the level of the corrupted supervision signal. These insights pave the way for a simple yet potent label-correction technique that makes comprehensive use of open-set noise, which, unlike harmful closed-set noise, can be rendered non-toxic when handled appropriately. Experimental validation on datasets with synthetic mixed noise, web-noise corruption, and a real-world dataset confirms the effectiveness and generality of Ultra. Furthermore, our approach enhances the applicability of efficient techniques (e.g., supervised contrastive learning) in label-noise scenarios.
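As one concrete, much-simplified illustration of uncertainty-guided label correction (not the Ultra method itself, which uses wavelet features and a two-fold uncertainty estimate), a label can be replaced only when the model confidently disagrees with it, and kept whenever the prediction is uncertain:

```python
def correct_labels(probs, labels, conf_threshold=0.9):
    """Toy uncertainty-guided label correction (illustrative only).
    probs: per-sample lists of class probabilities.
    labels: the given (possibly noisy) integer labels."""
    corrected = []
    for p, y in zip(probs, labels):
        pred = max(range(len(p)), key=p.__getitem__)
        if pred != y and p[pred] >= conf_threshold:
            corrected.append(pred)  # confident disagreement -> correct the label
        else:
            corrected.append(y)     # uncertain, or already agreeing -> keep it
    return corrected
```

The confidence threshold here plays the role that the paper's uncertainty estimate plays: it decides which corrupted supervision signals are safe to rewrite.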


Subjects
Wavelet Analysis; Uncertainty; Algorithms; Humans; Neural Networks, Computer
3.
Genes (Basel) ; 14(2), 2023 Feb 1.
Article in English | MEDLINE | ID: mdl-36833313

ABSTRACT

Outliers in the training or test set used to fit and evaluate a classifier on transcriptomics data can considerably change the estimated performance of the model. As a result, an accuracy that is either too pessimistic or too optimistic is reported, and the estimated model performance cannot be reproduced on independent data. It then also becomes doubtful whether the classifier qualifies for clinical use. We estimate classifier performance in simulated gene expression data with artificial outliers and in two real-world datasets. As a new approach, we apply two outlier detection methods within a bootstrap procedure to estimate the outlier probability for each sample, and we evaluate classifiers before and after outlier removal by means of cross-validation. We found that removing outliers changed the classification performance notably, and in most cases improved it. Given that there are various, sometimes unclear, reasons for a sample to be an outlier, we strongly advocate always reporting the performance of a transcriptomics classifier both with and without outliers in the training and test data. This provides a more complete picture of a classifier's performance and prevents reporting models that later turn out to be inapplicable for clinical diagnosis.
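The bootstrap outlier-probability idea can be sketched as follows. This is a simplified stand-in using a single z-score detector on univariate data rather than the two detection methods the abstract describes on expression profiles; all names and thresholds are illustrative.

```python
import random
import statistics

def outlier_probability(data, n_boot=200, z_cut=3.0, seed=0):
    """Estimate a per-sample outlier probability: repeatedly resample the
    data with replacement (bootstrap), compute mean/stdev on each resample,
    and count how often each original sample falls beyond z_cut standard
    deviations of the resample's statistics."""
    rng = random.Random(seed)
    counts = [0] * len(data)
    for _ in range(n_boot):
        boot = [rng.choice(data) for _ in data]
        m = statistics.mean(boot)
        s = statistics.pstdev(boot)
        if s == 0:
            continue  # degenerate resample; skip it
        for i, v in enumerate(data):
            if abs(v - m) > z_cut * s:
                counts[i] += 1
    return [c / n_boot for c in counts]
```

Samples with a high estimated probability are candidates for removal before the with/without-outliers evaluation the abstract recommends.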


Subjects
Gene Expression Profiling; Transcriptome; Probability; Research Design
4.
Big Data ; 11(2): 105-116, 2023 Apr.
Article in English | MEDLINE | ID: mdl-36315168

ABSTRACT

Artificial neural networks (ANNs) have frequently been used for forecasting problems in recent years. One of the most popular ANN types today is the Pi-Sigma artificial neural network (PS-ANN). PS-ANNs have a high-order structure and use both multiplicative and additive neuron models in their architecture. This high-order structure gives them superior forecasting performance, but the multiplicative neuron model also makes them sensitive to outliers in a data set. In this study, a new robust learning algorithm for PS-ANNs, based on particle swarm optimization and Huber's loss function, is proposed. To evaluate the proposed method, the Dow Jones stock exchange and Australian beer consumption data sets are analyzed and the results are compared with many ANN types proposed in the literature. The performance of the proposed method in the presence of outliers is also investigated by injecting outliers into these data sets. The proposed learning algorithm performs satisfactorily on both the original data sets and those contaminated with outliers.
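Huber's loss, the ingredient that gives the proposed algorithm its robustness, is quadratic for small residuals and linear beyond a threshold, so a single gross outlier cannot dominate the training objective. Shown here is the standard definition on its own, outside the particle-swarm training loop the paper uses:

```python
def huber(residual, delta=1.0):
    """Huber's loss: 0.5*r^2 for |r| <= delta, and delta*(|r| - 0.5*delta)
    beyond that, so large residuals grow linearly instead of quadratically."""
    r = abs(residual)
    return 0.5 * r * r if r <= delta else delta * (r - 0.5 * delta)
```

For a residual of 10 with delta = 1, the Huber loss is 9.5, versus 50.0 for the squared-error loss 0.5·r², which is exactly the damping effect exploited here.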


Subjects
Algorithms; Neural Networks, Computer; Australia; Forecasting
5.
Proc IEEE Int Conf Data Min ; 2022: 1299-1304, 2022.
Article in English | MEDLINE | ID: mdl-37057074

ABSTRACT

Unsupervised Domain Adaptation (UDA) provides a promising solution for learning without supervision, which transfers knowledge from relevant source domains with accessible labeled training data. Existing UDA solutions hinge on clean training data with a short-tail distribution from the source domain, which can be fragile when the source domain data is corrupted either inherently or via adversarial attacks. In this work, we propose an effective framework to address the challenges of UDA from corrupted source domains in a principled manner. Specifically, we perform knowledge ensemble from multiple domain-invariant models that are learned on random partitions of training data. To further address the distribution shift from the source to the target domain, we refine each of the learned models via mutual information maximization, which adaptively obtains the predictive information of the target domain with high confidence. Extensive empirical studies demonstrate that the proposed approach is robust against various types of poisoned data attacks while achieving high asymptotic performance on the target domain.
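The knowledge-ensemble step, averaging predictions from models trained on random partitions of the source data, can be sketched as below. This is illustrative only: the paper's members are domain-invariant networks refined by mutual-information maximization, not arbitrary callables.

```python
def ensemble_predict(models, x):
    """Average class-probability predictions from models trained on random
    partitions of (possibly corrupted) source data. Corrupted samples only
    affect some partitions, so averaging dampens their influence."""
    probs = [m(x) for m in models]
    n_classes = len(probs[0])
    return [sum(p[i] for p in probs) / len(models) for i in range(n_classes)]
```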

6.
Neural Netw ; 143: 209-217, 2021 Nov.
Article in English | MEDLINE | ID: mdl-34157645

ABSTRACT

In practice, deep neural networks (DNNs) are often trained on data containing large amounts of noisy labels. Because DNNs have enough capacity to fit arbitrary noisy labels, training them robustly is known to be difficult: noisy labels degrade performance through the memorization effect of over-fitting. Earlier state-of-the-art methods used small-loss tricks to efficiently address robust training with noisy labels. In this paper, the relationship between uncertainty and clean labels is analyzed. We present a novel training method, "Uncertain Aware Co-Training (UACT)", which uses not only the small-loss trick but also labels that are likely to be clean, selected on the basis of uncertainty. Our robust learning technique avoids over-fitting DNNs to extremely noisy labels. By making better use of the uncertainty acquired from the network itself, we achieve good generalization performance. We compare the proposed method with current state-of-the-art algorithms on noisy versions of MNIST, CIFAR-10, CIFAR-100, T-ImageNet, and News to demonstrate its effectiveness.
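The small-loss trick mentioned in the abstract can be sketched as follows: early in training, DNNs fit clean labels before noisy ones, so the fraction of samples with the smallest current loss is treated as probably clean. This is the generic version, assuming a known noise rate; it is not UACT's uncertainty-based selection.

```python
def select_small_loss(losses, noise_rate):
    """Return indices of the (1 - noise_rate) fraction of samples with the
    smallest loss, ordered from smallest loss upward; these are treated as
    probably-clean and used for the next update."""
    n_keep = int(len(losses) * (1.0 - noise_rate))
    ranked = sorted(range(len(losses)), key=lambda i: losses[i])
    return ranked[:n_keep]
```

In co-training variants, two networks each select small-loss samples and teach them to their peer, which is the setting UACT extends with uncertainty.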


Subjects
Algorithms; Neural Networks, Computer; Uncertainty
7.
J Mach Learn Res ; 19(1): 517-564, 2018 Jan.
Article in English | MEDLINE | ID: mdl-34421397

ABSTRACT

We present a Distributionally Robust Optimization (DRO) approach to estimate a robustified regression plane in a linear regression setting, when the observed samples are potentially contaminated with adversarially corrupted outliers. Our approach mitigates the impact of outliers by hedging against a family of probability distributions on the observed data, some of which assign very low probabilities to the outliers. The set of distributions under consideration are close to the empirical distribution in the sense of the Wasserstein metric. We show that this DRO formulation can be relaxed to a convex optimization problem which encompasses a class of models. By selecting proper norm spaces for the Wasserstein metric, we are able to recover several commonly used regularized regression models. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior (prediction bias), and the other concerns the discrepancy between the estimated and true regression planes (estimation bias). Extensive numerical results demonstrate the superiority of our approach to a host of regression models, in terms of the prediction and estimation accuracies. We also consider the application of our robust learning procedure to outlier detection, and show that our approach achieves a much higher AUC (Area Under the ROC Curve) than M-estimation (Huber, 1964, 1973).
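A sketch of the kind of relaxation the abstract describes (this is the general form of such Wasserstein-DRO bounds, not a quotation of the paper's exact formulation): with an order-1 Wasserstein ball of radius ε around the empirical distribution and an absolute loss, the worst-case expected loss is bounded by the empirical loss plus a dual-norm penalty on the coefficients,

```latex
\min_{\beta}\ \sup_{Q:\,W_1(Q,\hat{P}_N)\le\varepsilon}\ \mathbb{E}^{Q}\bigl[\,\lvert y-\mathbf{x}^{\top}\beta\rvert\,\bigr]
\ \le\ \frac{1}{N}\sum_{i=1}^{N}\bigl\lvert y_i-\mathbf{x}_i^{\top}\beta\bigr\rvert
\ +\ \varepsilon\,\bigl\lVert(-\beta,\,1)\bigr\rVert_{*},
```

where the dual norm ‖·‖_* depends on the norm chosen for the Wasserstein metric. Different norm choices turn the penalty into familiar regularizers, which is how the commonly used regularized regression models mentioned in the abstract are recovered.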
