Results 1 - 8 of 8
1.
Sensors (Basel) ; 24(3)2024 Jan 31.
Article in English | MEDLINE | ID: mdl-38339655

ABSTRACT

During heavy traffic flow with a substantial number of vehicles, the data reflecting the strain response of asphalt pavement under vehicle load exhibit notable fluctuations and abnormal values, which can be attributed to the complex operating environment. Thus, there is a need for a real-time anomalous-data diagnosis system that can effectively extract dynamic strain features, such as peak values and peak separation, from the large amount of data. This paper presents a dynamic response signal data analysis method that uses the DBSCAN clustering algorithm and the findpeaks function. The method is designed to analyze data collected by sensors installed within the pavement. The first step involves denoising the data using low-pass filters and other techniques. Subsequently, the DBSCAN algorithm, improved using the K-Dist method, is used to diagnose abnormal data after denoising. The refined findpeaks function is then applied to carry out adaptive feature extraction on the denoised, anomaly-free data. The enhanced DBSCAN algorithm is tested via simulation, which illustrates its effectiveness in detecting abnormal data in the road dynamic response signal. The findpeaks function enables relatively accurate identification of peak values, and thus of the strain signal peaks of complex multi-axle lorries. This study is valuable for efficient data processing and effective information utilization in pavement monitoring.
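
The two stages named in this abstract (density-based anomaly diagnosis, then peak extraction) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the largest-jump rule for reading eps off the k-distance curve, the choice min_pts = k + 1, the helper names, and the sample readings are all assumptions made for illustration, and `find_peaks_simple` is a simple stand-in for a findpeaks-style function.

```python
def k_dist_eps(values, k):
    """Pick eps from the sorted k-distance curve (assumed largest-jump rule)."""
    kth = []
    for i, v in enumerate(values):
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        kth.append(dists[k - 1])  # distance to the k-th nearest neighbour
    kth.sort()
    jumps = [kth[i + 1] - kth[i] for i in range(len(kth) - 1)]
    return kth[jumps.index(max(jumps))]  # k-distance just before the knee


def dbscan_1d(values, eps, min_pts):
    """Return a cluster id per value, or -1 for noise (1-D DBSCAN)."""
    n = len(values)
    labels = [None] * n

    def neighbours(i):
        return [j for j in range(n) if abs(values[i] - values[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # only core points expand the cluster
                queue.extend(nb)
    return labels


def find_peaks_simple(signal, min_height, min_distance):
    """Local maxima above min_height, at least min_distance samples apart."""
    candidates = [i for i in range(1, len(signal) - 1)
                  if signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
                  and signal[i] >= min_height]
    kept = []
    for i in sorted(candidates, key=lambda idx: signal[idx], reverse=True):
        if all(abs(i - j) >= min_distance for j in kept):
            kept.append(i)  # tallest peaks win when two are too close
    return sorted(kept)


# Hypothetical baseline readings with two abnormal values (indices 5 and 8).
readings = [10.1, 10.3, 9.8, 10.0, 10.2, 55.0, 9.9, 10.4, -30.0, 10.1]
k = 3
labels = dbscan_1d(readings, k_dist_eps(readings, k), min_pts=k + 1)
anomalies = [i for i, lab in enumerate(labels) if lab == -1]

# Hypothetical denoised strain trace of a three-axle vehicle.
strain = [0, 1, 5, 2, 0, 1, 7, 3, 0, 0, 4, 1, 0]
peaks = find_peaks_simple(strain, min_height=3, min_distance=3)
separation = [b - a for a, b in zip(peaks, peaks[1:])]
```

In this sketch the clustering runs on reading values only; a real pavement signal would likely need features that keep genuine axle peaks from being labelled as noise.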

2.
Sensors (Basel) ; 23(10)2023 May 22.
Article in English | MEDLINE | ID: mdl-37430873

ABSTRACT

Seriously abnormal data exist in the synchronous monitoring data of transformer DC bias; they severely contaminate the data features and can even affect the identification of transformer DC bias. To ensure the reliability and validity of synchronous monitoring data, this paper proposes a multi-criteria method for identifying abnormal data in the synchronous monitoring of transformer DC bias. By analyzing abnormal data of different types, the characteristics of abnormal data are obtained. Based on these, abnormal data identification indexes are introduced: gradient, sliding kurtosis, and the Pearson correlation coefficient. First, the Pauta criterion is used to determine the threshold of the gradient index. Then, the gradient is used to identify suspected abnormal data. Finally, the sliding kurtosis and Pearson correlation coefficient are used to identify the abnormal data. Data from the synchronous monitoring of transformer DC bias in a certain power grid are used to verify the proposed method. The reported accuracy of the proposed method in identifying mutated abnormal data and zero-value abnormal data is 100%. Compared with traditional abnormal data identification methods, the accuracy of the proposed method is significantly improved.
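
The first two steps described here (a Pauta, or three-sigma, threshold on the gradient index, then flagging suspected samples) can be sketched as follows. This is a minimal illustration under assumptions: the gradient is taken as the first difference, the series is synthetic, and the function names are hypothetical. The sliding-kurtosis and Pearson-correlation stages are not shown.

```python
from statistics import mean, pstdev


def gradient(series):
    """First difference between consecutive samples (assumed gradient index)."""
    return [b - a for a, b in zip(series, series[1:])]


def pauta_flags(values):
    """Indices whose deviation from the mean exceeds three standard deviations."""
    mu, sigma = mean(values), pstdev(values)
    return [i for i, v in enumerate(values) if abs(v - mu) > 3 * sigma]


# Hypothetical monitoring series: a small ripple plus one abrupt sample.
series = [1.0 + 0.1 * ((i % 3) - 1) for i in range(30)]
series[15] = 9.0

suspect_gradients = pauta_flags(gradient(series))

# A sample bracketed by two flagged gradients is treated as an isolated spike.
suspect_samples = [i + 1 for i in suspect_gradients
                   if i + 1 in suspect_gradients]
```

Note that the three-sigma rule needs enough data to work: with about ten or fewer values, no single value can exceed the bound at all, so the window length matters.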

3.
Sensors (Basel) ; 22(4)2022 Feb 14.
Article in English | MEDLINE | ID: mdl-35214354

ABSTRACT

Abnormal electricity data, caused by electricity theft or meter failure, lead to inaccurate aggregation results. These inaccurate results not only harm the interests of users but also affect the decision-making of the power system. However, existing data aggregation schemes do not consider the impact of abnormal data, and filtering out such data remains a challenge. To solve this problem, we propose a lightweight and privacy-friendly data aggregation scheme against abnormal data, in which valid data are correctly aggregated while abnormal data are filtered out during the aggregation process. Because it adopts lightweight matrix encryption, the scheme is well suited to resource-limited smart meters. Our protocol outperforms other schemes in two respects: it filters abnormal data automatically, without additional processes, and it detects the sources of abnormal data. Finally, a detailed security analysis shows that the proposed scheme can protect the privacy of users' data. In addition, the results of extensive simulations demonstrate that the additional computation cost of filtering the abnormal data is within an acceptable range, which shows that the proposed scheme remains effective.


Subjects
Computer Security, Privacy, Algorithms, Confidentiality, Data Aggregation
4.
Cell Mol Biol (Noisy-le-grand) ; 66(7): 103-110, 2020 Oct 31.
Article in English | MEDLINE | ID: mdl-33287929

ABSTRACT

In view of the shortcomings of current abnormal data detection systems for protein gene libraries, such as a low detection rate and a high false detection rate, an abnormal data detection system for protein gene libraries based on data mining technology is designed. A protein gene enters the firewall module of the system and is passed to the immune module when it does not match the firewall rules. In the immune module, the protein gene is first presented to the memory detector; if the memory detector does not match it, the protein gene is presented to the mature detector. If the mature detector does not match it either, the protein gene is determined to be a normal protein gene data package; if it matches, the protein gene is considered abnormal data, which is processed by the collaborative stimulation module and by the control module, controlled by a C8051F060 chip, to detect the abnormal data of the protein gene library. The immune module generates new protein gene sequences through an immature detector, simulates the immune mechanism of the protein gene through a mature detector module, and simulates the secondary response in the abnormal data detection system through the memory detector. The system introduces data mining technology into the detection and uses a two-level dynamic optimization algorithm to calculate the ASG similarity value of the protein gene secondary structure arrangement. According to this value, abnormal data detection for the protein gene library is realized by randomly generating protein genes, negative selection, clone selection, and copying memory cells through gene expression. The experimental results show that the system can quickly detect abnormal data in the protein gene library while maintaining detection efficiency, with a detection accuracy of 97.1%. The system can also reduce the rate at which normal protein genes are wrongly detected as abnormal.


Subjects
Computational Biology/methods, Data Mining, Gene Library, Proteins/analysis, Reproducibility of Results
5.
Sensors (Basel) ; 20(15)2020 Jul 29.
Article in English | MEDLINE | ID: mdl-32751248

ABSTRACT

Sensor networks in real-world environments, such as smart cities or ambient intelligent platforms, provide applications with large and heterogeneous sets of data streams. Detecting outliers (observations that do not conform to an expected behavior) has therefore become a crucial task for establishing and maintaining secure and reliable databases on this kind of platform. However, the procedures used to obtain accurate models for erratic observations have to operate with low complexity in terms of storage and computational time, in order to accommodate the limited processing and storage capabilities of the sensor nodes in these environments. In this work, we analyze three binary classifiers based on three statistical prediction models, namely ARIMA (Auto-Regressive Integrated Moving Average), GAM (Generalized Additive Model), and LOESS (LOcal RegrESSion), for outlier detection with low memory consumption and computational time. As a result, we provide (1) the best classifier and settings to detect outliers, based on the ARIMA model, and (2) two real-world classified datasets as ground truths for future research.
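
The idea of a binary outlier classifier built on a prediction model can be sketched with a deliberately small stand-in: a first-order autoregressive model, AR(1), fitted by least squares, which is the simplest special case of ARIMA rather than the full models compared in the paper. The residual threshold of three standard deviations, the function names, and the sample readings are assumptions for illustration.

```python
from statistics import mean, pstdev


def fit_ar1(train):
    """Least-squares fit of x[t] = c + phi * x[t-1]; returns (c, phi, sigma)."""
    x, y = train[:-1], train[1:]
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    phi = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    c = my - phi * mx
    residuals = [b - (c + phi * a) for a, b in zip(x, y)]
    return c, phi, pstdev(residuals)


def classify_stream(stream, last_value, model, k=3.0):
    """Label each new observation True (outlier) or False (normal)."""
    c, phi, sigma = model
    labels, prev = [], last_value
    for obs in stream:
        pred = c + phi * prev  # one-step-ahead prediction
        is_outlier = abs(obs - pred) > k * sigma
        labels.append(is_outlier)
        # Keep only one value of state; outliers are replaced by the prediction.
        prev = pred if is_outlier else obs
    return labels


# Hypothetical temperature-like readings used for training.
train = [20.0, 20.5, 21.0, 20.6, 20.1, 19.8, 20.2, 20.7,
         21.1, 20.8, 20.3, 19.9, 20.1, 20.6, 21.0]
model = fit_ar1(train)

stream = [20.7, 20.4, 35.0, 20.2, 20.0]
labels = classify_stream(stream, train[-1], model)
```

Once fitted, the classifier stores only three parameters and the previous value, which is in the spirit of the low-memory constraint the abstract describes.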

6.
J Am Stat Assoc ; 118(543): 2029-2044, 2023.
Article in English | MEDLINE | ID: mdl-37771510

ABSTRACT

This paper develops an incremental learning algorithm based on the quadratic inference function (QIF) to analyze streaming datasets with correlated outcomes, such as longitudinal data and clustered data. We propose a renewable QIF (RenewQIF) method within a paradigm of renewable estimation and incremental inference, in which parameter estimates are recursively renewed with current data and summary statistics of historical data, but with no use of any historical subject-level raw data. We compare our renewable estimation method with both the offline QIF and the offline generalized estimating equations (GEE) approaches, which process the entire cumulative subject-level data all together, and show theoretically and numerically that our renewable procedure enjoys statistical and computational efficiency. We also propose an approach to diagnose the homogeneity assumption of regression coefficients via a sequential goodness-of-fit test as a screening procedure for occurrences of abnormal data batches. We implement the proposed methodology by extending Spark's existing Lambda architecture for the operation of statistical inference and data quality diagnosis. We illustrate the proposed methodology with extensive simulation studies and an analysis of streaming car crash datasets from the National Automotive Sampling System-Crashworthiness Data System (NASS CDS). The supplementary material is available online.
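
The renewable-estimation idea (update the estimate from each new batch plus summary statistics of past batches, never revisiting raw historical data) can be illustrated with a much simpler model than QIF. The sketch below uses ordinary least squares for a straight line, where the summary statistics are five running sums; it is an analogy under that assumption, not the RenewQIF method, and the class name and data are hypothetical.

```python
class RenewableLine:
    """Straight-line least squares renewed batch by batch from running sums."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, batch):
        """Fold a batch of (x, y) pairs into the summaries, then discard it."""
        for x, y in batch:
            self.n += 1
            self.sx += x
            self.sy += y
            self.sxx += x * x
            self.sxy += x * y

    def estimate(self):
        """Return (intercept, slope) from the summaries alone."""
        denom = self.n * self.sxx - self.sx ** 2
        slope = (self.n * self.sxy - self.sx * self.sy) / denom
        intercept = (self.sy - slope * self.sx) / self.n
        return intercept, slope


def offline_fit(data):
    """Reference fit that needs all raw data at once."""
    n = len(data)
    mx = sum(x for x, _ in data) / n
    my = sum(y for _, y in data) / n
    slope = (sum((x - mx) * (y - my) for x, y in data)
             / sum((x - mx) ** 2 for x, _ in data))
    return my - slope * mx, slope


# Hypothetical data arriving in two batches.
batch1 = [(1, 2.1), (2, 3.9), (3, 6.2)]
batch2 = [(4, 8.1), (5, 9.8)]

model = RenewableLine()
model.update(batch1)
model.update(batch2)
renewed = model.estimate()
offline = offline_fit(batch1 + batch2)
```

For this linear case the renewed estimate coincides with the offline one up to rounding, while memory use stays constant however many batches arrive.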

7.
J Agric Biol Environ Stat ; 26(3): 428-445, 2021.
Article in English | MEDLINE | ID: mdl-33840991

ABSTRACT

Ordinary differential equation (ODE) models are widely used to describe complex dynamical systems. When estimating ODE parameters from noisy data, a common distributional assumption is the Gaussian distribution. It is known that the Gaussian distribution is not robust when abnormal data exist. In this article, we develop a hierarchical semiparametric mixed-effects ODE model for longitudinal data under the Bayesian framework. For robust inference on ODE parameters, we consider a class of heavy-tailed distributions to model the random effects of ODE parameters and the observation errors. An MCMC method is proposed to sample ODE parameters from the posterior distributions. Our proposed method is illustrated by studying a gene regulation experiment. Simulation studies show that our proposed method provides satisfactory results for the semiparametric mixed-effects ODE models with finite samples. Supplementary materials accompanying this paper appear online. SUPPLEMENTARY INFORMATION: Supplementary materials for this article are available at 10.1007/s13253-021-00446-2.

8.
Big Data ; 7(2): 99-113, 2019 06.
Article in English | MEDLINE | ID: mdl-31074632

ABSTRACT

To address the problems of abnormal values in water intake monitoring data and centralized uploaded reports, the abnormal data region discrimination (ADRD) algorithm and the cross-monitoring points historical correlation repair (CMHCR) method are proposed to discriminate and repair the abnormal data. The characteristics of the abnormal data distribution are analyzed, and the ADRD algorithm is proposed. ADRD uses the relationship between zero values and abnormally large values, together with the ratio of an abnormally large value to the expectation, to distinguish the abnormal data region. The correlation between the monitoring data of current detection points and the historical data of different detection points is analyzed. The results show that the data of the current monitoring point and the historical data of the corresponding point do not fully conform to the maximum correlation. Therefore, the CMHCR method is proposed to repair abnormal data. Experiments using ADRD are performed on actual half-year water intake data from 2016 and 2017. The experimental results show that the proposed algorithm and method can correctly distinguish the abnormal data region and repair the abnormal data properly.
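
The abstract gives only the outline of ADRD, so the following is a loose sketch of one way the two cues it names could be combined: a run of zero values followed by a value whose ratio to an expected value is large. Everything specific here is an assumption made for illustration, including the use of the median of positive readings as the expectation, the ratio threshold of 2.0, the function name, and the data. The CMHCR repair step is not shown.

```python
from statistics import median


def find_abnormal_regions(series, ratio_threshold=2.0):
    """Return (start, end) index pairs: a zero run ending in a large value."""
    expected = median(v for v in series if v > 0)  # assumed expectation
    regions = []
    i, n = 0, len(series)
    while i < n:
        if series[i] == 0:
            start = i
            while i < n and series[i] == 0:
                i += 1  # skip to the first non-zero value after the run
            if i < n and series[i] / expected >= ratio_threshold:
                regions.append((start, i))  # zeros plus the lump that follows
                i += 1
        else:
            i += 1
    return regions


# Hypothetical daily water intake: three missing days reported as one lump.
intake = [12, 11, 13, 0, 0, 0, 47, 12, 10, 0, 11, 12]
regions = find_abnormal_regions(intake)
```

In this example the isolated zero at index 9 is not flagged, because the value that follows it is close to the expected level.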


Subjects
Big Data, Water Resources, Algorithms, China