Results 1 - 20 of 30
1.
Stat Med ; 2024 Jun 19.
Article in English | MEDLINE | ID: mdl-38897797

ABSTRACT

The analysis of streaming time-to-event cohorts has garnered significant research attention. Most existing methods require observed cohorts from a study sequence to be independent and identically sampled from a common model. This assumption may be easily violated in practice. Our methodology operates within the framework of online data updating, where risk estimates for each cohort of interest are continuously refreshed using the latest observations and historical summary statistics. At each streaming stage, we introduce parameters to quantify the potential discrepancy between batch-specific effects from adjacent cohorts. We then employ penalized estimation techniques to identify nonzero discrepancy parameters, allowing us to adaptively adjust risk estimates based on current data and historical trends. We illustrate our proposed method through extensive empirical simulations and a lung cancer data analysis.

2.
Stat Med ; 43(14): 2734-2746, 2024 Jun 30.
Article in English | MEDLINE | ID: mdl-38693559

ABSTRACT

Streaming data routinely generated by social networks, mobile or web applications, e-commerce, and electronic health records present new opportunities to monitor the impact of an intervention on an outcome via causal inference methods. However, most existing causal inference methods have been focused on and applied to static data, that is, a fixed data set in which observations are pooled and stored before performing statistical analysis. There is thus a pressing need to turn static causal inference into online causal learning to support near real-time monitoring of treatment effects. In this paper, we present a framework for online estimation and inference of treatment effects that can incorporate new information as it becomes available without revisiting prior observations. We show that, under mild regularity conditions, the proposed online estimator is asymptotically equivalent to the offline oracle estimator obtained by pooling all data. Our proposal is motivated by the need for near real-time vaccine effectiveness and safety monitoring, and our proposed method is applied to a case study on COVID-19 vaccine safety surveillance.
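
As a simplified illustration of incorporating new data "without revisiting prior observations", the sketch below keeps only a count and a mean per treatment arm and folds each incoming batch into those summaries. It is not the estimator proposed in the paper: the naive difference in means assumes randomized treatment, and the names `RunningMean` and `online_difference_in_means` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningMean:
    """Summary statistics retained between batches: count and mean only."""
    n: int = 0
    mean: float = 0.0

    def update(self, batch):
        """Fold a new batch into the summary without revisiting old data."""
        for y in batch:
            self.n += 1
            self.mean += (y - self.mean) / self.n


def online_difference_in_means(batches):
    """Yield the running treated-minus-control mean after each batch.

    Each batch is a list of (treated, outcome) pairs. An arm with no
    observations yet keeps its initial mean of 0.0.
    """
    treated, control = RunningMean(), RunningMean()
    for batch in batches:
        treated.update([y for t, y in batch if t])
        control.update([y for t, y in batch if not t])
        yield treated.mean - control.mean
```

For this simple statistic the online value after the last batch coincides with the mean difference computed on the pooled data, which is the kind of equivalence the abstract establishes asymptotically for its own estimator.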


Subjects
COVID-19 Vaccines , COVID-19 , Product Surveillance, Postmarketing , Humans , Product Surveillance, Postmarketing/statistics & numerical data , Product Surveillance, Postmarketing/methods , COVID-19 Vaccines/adverse effects , COVID-19/prevention & control , Causality , Models, Statistical , SARS-CoV-2 , Computer Simulation
3.
BMC Med Inform Decis Mak ; 24(1): 77, 2024 Mar 18.
Article in English | MEDLINE | ID: mdl-38500135

ABSTRACT

OBJECTIVE: To address the challenge of assessing sedation status in critically ill patients in the intensive care unit (ICU), we aimed to develop a non-contact automatic classifier of agitation using artificial intelligence and deep learning. METHODS: We collected the video recordings of ICU patients and cut them into 30-second (30-s) and 2-second (2-s) segments. All of the segments were annotated with the status of agitation as "Attention" and "Non-attention". After transforming the video segments into movement quantification, we constructed the models of agitation classifiers with Threshold, Random Forest, and LSTM and evaluated their performances. RESULTS: The video recording segmentation yielded 427 30-s and 6405 2-s segments from 61 patients for model construction. The LSTM model achieved remarkable accuracy (ACC 0.92, AUC 0.91), outperforming other methods. CONCLUSION: Our study proposes an advanced monitoring system combining LSTM and image processing to ensure mild patient sedation in ICU care. LSTM proves to be the optimal choice for accurate monitoring. Future efforts should prioritize expanding data collection and enhancing system integration for practical application.


Subjects
Deep Learning , Psychomotor Agitation , Humans , Psychomotor Agitation/diagnosis , Artificial Intelligence , Intensive Care Units , Critical Care
4.
Biostatistics ; 2022 Oct 26.
Article in English | MEDLINE | ID: mdl-36288541

ABSTRACT

In many biomedical applications, outcome is measured as a "time-to-event" (e.g., disease progression or death). To assess the connection between features of a patient and this outcome, it is common to assume a proportional hazards model and fit a proportional hazards regression (or Cox regression). To fit this model, a log-concave objective function known as the "partial likelihood" is maximized. For moderate-sized data sets, an efficient Newton-Raphson algorithm that leverages the structure of the objective function can be employed. However, in large data sets this approach has two issues: (i) The computational tricks that leverage structure can also lead to computational instability; (ii) The objective function does not naturally decouple: Thus, if the data set does not fit in memory, the model can be computationally expensive to fit. This additionally means that the objective is not directly amenable to stochastic gradient-based optimization methods. To overcome these issues, we propose a simple, new framing of proportional hazards regression: This results in an objective function that is amenable to stochastic gradient descent. We show that this simple modification allows us to efficiently fit survival models with very large data sets. This also facilitates training complex, for example, neural-network-based, models with survival data.
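
To make the "does not naturally decouple" point concrete, the sketch below evaluates the standard negative log partial likelihood, assuming no tied event times. It shows the classical objective, not the reformulation proposed in the paper; the function name and argument layout are hypothetical.

```python
import math


def neg_log_partial_likelihood(times, events, scores):
    """Negative Cox log partial likelihood, assuming no tied event times.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    scores : linear predictors x_i' beta, one per subject
    """
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored subjects contribute only through risk sets
        # The risk set couples subject i to every subject still at risk,
        # so this term cannot be computed from subject i's data alone.
        risk = sum(math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        total -= scores[i] - math.log(risk)
    return total
```

Because each event term sums over its whole risk set, a mini-batch drawn at random does not give a per-observation loss in the usual sense, which is why the classical objective is awkward for stochastic gradient methods.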

5.
Stat Med ; 42(7): 1013-1044, 2023 Mar 30.
Article in English | MEDLINE | ID: mdl-36897184

ABSTRACT

In this work we introduce the personalized online super learner (POSL), an online personalizable ensemble machine learning algorithm for streaming data. POSL optimizes predictions with respect to baseline covariates, so personalization can vary from completely individualized, that is, optimization with respect to subject ID, to many individuals, that is, optimization with respect to common baseline covariates. As an online algorithm, POSL learns in real time. As a super learner, POSL is grounded in statistical optimality theory and can leverage a diversity of candidate algorithms, including online algorithms with different training and update times, fixed/offline algorithms that are not updated during POSL's fitting procedure, pooled algorithms that learn from many individuals' time series, and individualized algorithms that learn from within a single time series. POSL's ensembling of the candidates can depend on the amount of data collected, the stationarity of the time series, and the mutual characteristics of a group of time series. Depending on the underlying data-generating process and the information available in the data, POSL is able to adapt to learning across samples, through time, or both. For a range of simulations that reflect realistic forecasting scenarios and in a medical application, we examine the performance of POSL relative to other current ensembling and online learning methods. We show that POSL is able to provide reliable predictions for both short and long time series, and it's able to adjust to changing data-generating environments. We further cultivate POSL's practicality by extending it to settings where time series dynamically enter and exit.


Subjects
Algorithms , Machine Learning , Humans
6.
Sheng Wu Yi Xue Gong Cheng Xue Za Zhi ; 40(1): 103-109, 2023 Feb 25.
Article in Chinese | MEDLINE | ID: mdl-36854554

ABSTRACT

Internet of Things (IoT) technology plays an important role in smart healthcare. This paper discusses an IoT solution for emergency medical devices in hospitals. Based on a cloud-edge-device architecture, different medical devices were connected; streaming data were parsed, distributed, and computed at the edge nodes; and data were stored, analyzed, and visualized in the cloud nodes. The IoT system has been running stably for nearly 20 months since it was deployed in the emergency department in January 2021. Preliminary analysis of the collected data, IoT performance testing, and the development of an early warning model verified the feasibility and reliability of the in-hospital emergency medical device IoT, which can collect data over long periods on a large scale and support the development and deployment of machine learning models. The paper ends with an outlook on medical device data exchange and wireless transmission in the IoT of emergency medical devices, the connection of emergency equipment inside and outside the hospital, and the next step of analyzing IoT data to develop intelligent emergency IoT applications.


Subjects
Internet of Things , Reproducibility of Results , Internet , Machine Learning , Technology
7.
Sensors (Basel) ; 22(2)2022 Jan 08.
Article in English | MEDLINE | ID: mdl-35062419

ABSTRACT

Power system failures or outages due to short-circuits or "faults" can result in long service interruptions leading to significant socio-economic consequences. It is critical for electrical utilities to quickly ascertain fault characteristics, including location, type, and duration, to reduce the service time of an outage. Existing fault detection mechanisms (relays and digital fault recorders) are slow to communicate the fault characteristics upstream to the substations and control centers for action to be taken quickly. Fortunately, due to availability of high-resolution phasor measurement units (PMUs), more event-driven solutions can be captured in real time. In this paper, we propose a data-driven approach for determining fault characteristics using samples of fault trajectories. A random forest regressor (RFR)-based model is used to detect real-time fault location and its duration simultaneously. This model is based on combining multiple uncorrelated trees with state-of-the-art boosting and aggregating techniques in order to obtain robust generalizations and greater accuracy without overfitting or underfitting. Four cases were studied to evaluate the performance of RFR: 1. Detecting fault location (case 1), 2. Predicting fault duration (case 2), 3. Handling missing data (case 3), and 4. Identifying fault location and length in a real-time streaming environment (case 4). A comparative analysis was conducted between the RFR algorithm and state-of-the-art models, including deep neural network, Hoeffding tree, neural network, support vector machine, decision tree, naive Bayesian, and K-nearest neighborhood. Experiments revealed that RFR consistently outperformed the other models in detection accuracy, prediction error, and processing time.


Subjects
Algorithms , Neural Networks, Computer , Bayes Theorem , Support Vector Machine
8.
Sensors (Basel) ; 21(4)2021 Feb 04.
Article in English | MEDLINE | ID: mdl-33557367

ABSTRACT

Efficient and accurate estimation of the probability distribution of a data stream is an important problem in many sensor systems. It is especially challenging when the data stream is non-stationary, i.e., its probability distribution changes over time. Statistical models for non-stationary data streams demand agile adaptation for concept drift while tolerating temporal fluctuations. To this end, a statistical model needs to forget old data samples and to detect concept drift swiftly. In this paper, we propose FlexSketch, an online probability density estimation algorithm for data streams. Our algorithm uses an ensemble of histograms, each of which represents a different length of data history. FlexSketch updates each histogram for a new data sample and generates probability distribution by combining the ensemble of histograms while monitoring discrepancy between recent data and existing models periodically. When it detects concept drift, a new histogram is added to the ensemble and the oldest histogram is removed. This allows us to estimate the probability density function with high update speed and high accuracy using only limited memory. Experimental results demonstrate that our algorithm shows improved speed and accuracy compared to existing methods for both stationary and non-stationary data streams.
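
The need to "forget old data samples" can be illustrated with a much simpler device than FlexSketch itself: a single fixed-bin histogram whose counts decay on every update. This is a swapped-in technique for illustration only; it has no ensemble and no drift detector, and the class name and `decay` parameter are hypothetical.

```python
class DecayingHistogram:
    """Fixed-bin histogram whose counts decay so old samples fade out."""

    def __init__(self, low, high, bins, decay=0.99):
        self.low, self.high, self.bins, self.decay = low, high, bins, decay
        self.counts = [0.0] * bins

    def _index(self, x):
        # Clamp out-of-range samples into the first or last bin.
        pos = int((x - self.low) / (self.high - self.low) * self.bins)
        return min(max(pos, 0), self.bins - 1)

    def update(self, x):
        """Down-weight all existing mass, then add the new sample."""
        self.counts = [c * self.decay for c in self.counts]
        self.counts[self._index(x)] += 1.0

    def density(self, x):
        """Estimated probability density at x (0.0 before any update)."""
        total = sum(self.counts)
        if total == 0.0:
            return 0.0
        width = (self.high - self.low) / self.bins
        return self.counts[self._index(x)] / (total * width)
```

A smaller `decay` adapts faster after the distribution shifts but gives noisier estimates on a stationary stream, which is the trade-off the abstract addresses by combining histograms of different history lengths.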

9.
Sensors (Basel) ; 21(4)2021 Feb 14.
Article in English | MEDLINE | ID: mdl-33672793

ABSTRACT

Developing robot control software systems is difficult because of a wide variety of requirements, including hardware systems and sensors, even though robots are in high demand nowadays. Middleware systems, such as Robot Operating System (ROS), are being developed and widely used to tackle this difficulty. Streaming data Sharing Manager (SSM) is one such middleware system that allows developers to write and read sensor data with timestamps using a Personal Computer (PC). The timestamp feature is essential for the robot control system because it usually uses multiple sensors with their own measurement cycles, meaning that measured sensor values with different timestamps become useless for the robot control. Using SSM allows developers to use measured sensor values with the same timestamps; however, SSM assumes that only one PC is used. Consequently, if one process consumes CPU resources intensively, other processes cannot meet their assumed deadlines, leading to unexpected behavior of the robot. This paper proposes an SSM middleware, named Distributed Streaming data Sharing Manager (DSSM), that enables distributing processes on SSM to different PCs. We have developed a prototype of DSSM and confirmed its behavior. In addition, we apply DSSM to an existing real SSM-based robot control system that autonomously controls an unmanned vehicle robot. We then reveal its advantages and disadvantages via several experiments that measure resource usage.

10.
Sensors (Basel) ; 21(1)2021 Jan 02.
Article in English | MEDLINE | ID: mdl-33401750

ABSTRACT

In the hospital, a sleep posture monitoring system is usually adopted to transform sensing signals into sleep behaviors. However, a home-care sleep posture monitoring system needs to be user friendly. In this paper, we present iSleePost, a user-friendly home-care intelligent sleep posture monitoring system. We address the labor-intensive labeling issue of traditional machine learning approaches in the training phase. Our proposed mobile health (mHealth) system leverages the communications and computation capabilities of mobile phones for provisioning a continuous sleep posture monitoring service. Our experiments show that iSleePost can achieve up to 85 percent accuracy in recognizing sleep postures. More importantly, iSleePost demonstrates that an easy-to-wear wrist sensor can accurately quantify sleep postures after our designed training phase. It is our hope that the design concept of iSleePost can shed some light on quantifying human sleep postures in the future.


Subjects
Posture , Wrist , Electrocardiography , Humans , Monitoring, Physiologic , Sleep
11.
Sensors (Basel) ; 21(20)2021 Oct 12.
Article in English | MEDLINE | ID: mdl-34695987

ABSTRACT

In smart buildings, many different systems work in coordination to accomplish their tasks. In this process, the sensors associated with these systems collect large amounts of data generated in a streaming fashion, which is prone to concept drift. Such data are heterogeneous due to the wide range of sensors collecting information about different characteristics of the monitored systems. All these make the monitoring task very challenging. Traditional clustering algorithms are not well equipped to address the mentioned challenges. In this work, we study the use of MV Multi-Instance Clustering algorithm for multi-view analysis and mining of smart building systems' sensor data. It is demonstrated how this algorithm can be used to perform contextual as well as integrated analysis of the systems. Various scenarios in which the algorithm can be used to analyze the data generated by the systems of a smart building are examined and discussed in this study. In addition, it is also shown how the extracted knowledge can be visualized to detect trends in the systems' behavior and how it can aid domain experts in the systems' maintenance. In the experiments conducted, the proposed approach was able to successfully detect the deviating behaviors known to have previously occurred and was also able to identify some new deviations during the monitored period. Based on the results obtained from the experiments, it can be concluded that the proposed algorithm has the ability to be used for monitoring, analysis, and detecting deviating behaviors of the systems in a smart building domain.


Subjects
Data Analysis , Electrocardiography , Algorithms , Cluster Analysis , Monitoring, Physiologic
12.
Sensors (Basel) ; 20(5)2020 Feb 26.
Article in English | MEDLINE | ID: mdl-32110907

ABSTRACT

Designing algorithms for detecting outliers in streaming data has become an important task in many common applications, arising in areas such as fraud detection, network analysis, and environment monitoring. Because real-time data may arrive in the form of streams rather than batches, properties such as concept drift, temporal context, transiency, and uncertainty need to be considered. In addition, data processing needs to be incremental, scalable, and feasible with limited memory resources. These requirements create big challenges for the accuracy of existing outlier detection algorithms when they are implemented in an incremental fashion, especially in the streaming environment. To address these problems, we first propose C_KDE_WR, which uses a sliding window and a kernel function to process the streaming data online, and report results demonstrating high throughput on real-time streaming data, implemented in a CUDA framework on a Graphics Processing Unit (GPU). We also present another algorithm, C_LOF, based on a very popular and effective outlier detection algorithm called Local Outlier Factor (LOF), which unfortunately works only on batched data. Using a novel incremental approach that compensates for the drawback of high complexity in LOF, we show how to implement it in a streaming context and obtain results in a timely manner. Like C_KDE_WR, C_LOF employs a sliding window and statistical summaries to help make decisions based on the data in the current window. It also addresses all the challenges of streaming data addressed by C_KDE_WR. In addition, we report a comparative evaluation of the accuracy of C_KDE_WR against the state-of-the-art SOD_GPU using Precision, Recall, and F-score metrics. Furthermore, a t-test is performed to demonstrate the significance of the improvement. We further report the testing results of C_LOF under different parameter settings and plot ROC and PR curves, with their area under the curve (AUC) and Average Precision (AP) values calculated respectively. Experimental results show that C_LOF can overcome the masquerading problem, which often exists in outlier detection on streaming data. We provide complexity analysis and report experimental results on the accuracy of both the C_KDE_WR and C_LOF algorithms in order to evaluate their effectiveness as well as their efficiency.
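
The combination of a sliding window and a kernel function can be sketched on the CPU as follows. This is a minimal single-threaded illustration of the general idea, not the C_KDE_WR algorithm or its CUDA implementation; the function name, the Gaussian kernel, and the `window`, `bandwidth`, and `threshold` defaults are all assumptions.

```python
import math
from collections import deque


def kde_outlier_flags(stream, window=50, bandwidth=1.0, threshold=0.01):
    """Flag points whose kernel density under the current window is low.

    Returns one boolean per input value. The first value is never flagged
    because there is no history to score it against.
    """
    recent = deque(maxlen=window)  # only the sliding window is kept in memory
    flags = []
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    for x in stream:
        if recent:
            density = sum(
                math.exp(-0.5 * ((x - y) / bandwidth) ** 2) for y in recent
            ) / (len(recent) * norm)
            flags.append(density < threshold)
        else:
            flags.append(False)
        recent.append(x)
    return flags
```

Each point is scored against the window before being added to it, so memory stays bounded by `window` and the cost per point is linear in the window size.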

13.
IEEE Trans Knowl Data Eng ; 31(7): 1281-1295, 2019 Jul.
Article in English | MEDLINE | ID: mdl-31435181

ABSTRACT

Differential Privacy (DP) has received increasing attention as a rigorous privacy framework. Many existing studies employ traditional DP mechanisms (e.g., the Laplace mechanism) as primitives to continuously release private data for protecting privacy at each time point (i.e., event-level privacy), which assume that the data at different time points are independent, or that adversaries do not have knowledge of correlation between data. However, continuously generated data tend to be temporally correlated, and such correlations can be acquired by adversaries. In this paper, we investigate the potential privacy loss of a traditional DP mechanism under temporal correlations. First, we analyze the privacy leakage of a DP mechanism under temporal correlation that can be modeled using Markov Chain. Our analysis reveals that, the event-level privacy loss of a DP mechanism may increase over time. We call the unexpected privacy loss temporal privacy leakage (TPL). Although TPL may increase over time, we find that its supremum may exist in some cases. Second, we design efficient algorithms for calculating TPL. Third, we propose data releasing mechanisms that convert any existing DP mechanism into one against TPL. Experiments confirm that our approach is efficient and effective.
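
For reference, the traditional event-level release that the abstract analyzes can be sketched as adding independent Laplace noise of scale sensitivity/epsilon at every time point. This shows only the baseline mechanism, not the paper's TPL calculation or its corrected release mechanisms; the function names are hypothetical, and the noise is drawn with the standard library as the difference of two exponential variates.

```python
import random


def laplace_noise(scale, rng=random):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_counts(counts, epsilon, sensitivity=1.0, rng=random):
    """Perturb each time point independently with Laplace(sensitivity / epsilon) noise."""
    scale = sensitivity / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]
```

The noise at each time point is drawn independently, which is exactly the setting in which, according to the abstract, temporal correlation in the underlying data can let the privacy loss grow over time.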

14.
Entropy (Basel) ; 21(1)2019 Jan 09.
Article in English | MEDLINE | ID: mdl-33266756

ABSTRACT

In this work, three techniques for enhancing various chaos-based joint compression and encryption (JCAE) schemes are proposed. They respectively improved the execution time, compression ratio, and estimation accuracy of three different chaos-based JCAE schemes. The first uses auxiliary data structures to significantly accelerate an existing chaos-based JCAE scheme. The second solves the problem of huge multidimensional lookup table overheads by sieving out a small number of important sub-tables. The third increases the accuracy of frequency distribution estimations, used for compressing streaming data, by weighting symbols in the plaintext stream according to their positions in the stream. Finally, two modified JCAE schemes leveraging the above three techniques are obtained, one applicable to static files and the other working for streaming data. Experimental results show that the proposed schemes do run faster and generate smaller files than existing JCAE schemes, which verified the effectiveness of the three newly proposed techniques.

15.
Sensors (Basel) ; 16(5)2016 Apr 27.
Article in English | MEDLINE | ID: mdl-27128918

ABSTRACT

The Web of Things aims to make physical world objects and their data accessible through standard Web technologies to enable intelligent applications and sophisticated data analytics. Due to the amount and heterogeneity of the data, it is challenging to perform data analysis directly; especially when the data is captured from a large number of distributed sources. However, the size and scope of the data can be reduced and narrowed down with search techniques, so that only the most relevant and useful data items are selected according to the application requirements. Search is fundamental to the Web of Things while challenging by nature in this context, e.g., mobility of the objects, opportunistic presence and sensing, continuous data streams with changing spatial and temporal properties, efficient indexing for historical and real time data. The research community has developed numerous techniques and methods to tackle these problems as reported by a large body of literature in the last few years. A comprehensive investigation of the current and past studies is necessary to gain a clear view of the research landscape and to identify promising future directions. This survey reviews the state-of-the-art search methods for the Web of Things, which are classified according to three different viewpoints: basic principles, data/knowledge representation, and contents being searched. Experiences and lessons learned from the existing work and some EU research projects related to Web of Things are discussed, and an outlook to the future research is presented.

16.
Front Big Data ; 6: 1271639, 2023.
Article in English | MEDLINE | ID: mdl-37928176

ABSTRACT

The contemporary surge in data production is fueled by diverse factors, with contributions from numerous stakeholders across various sectors. Comparing the volumes at play among different big data entities is challenging due to the scarcity of publicly available data. This survey aims to offer a comprehensive perspective on the orders of magnitude involved in yearly data generation by some public and private leading organizations, using an array of online sources for estimation. These estimates are based on meaningful, individual data production metrics and plausible per-unit sizes. The primary objective is to offer insights into the comparative scales of major big data players, their sources, and data production flows, rather than striving for precise measurements or incorporating the latest updates. The results are succinctly conveyed through a visual representation of the relative data generation volumes across these entities.

17.
Neural Netw ; 153: 314-324, 2022 Sep.
Article in English | MEDLINE | ID: mdl-35772252

ABSTRACT

This paper considers the completion problem for partially observed high-order streaming data, which is cast as an online low-rank tensor completion problem. Though the online low-rank tensor completion problem has drawn a lot of attention in recent years, most existing methods are designed based on traditional decompositions, such as CP and Tucker. Inspired by the advantages of Tensor Ring decomposition over the traditional decompositions in expressing high-order data and its superiority in missing value estimation, this paper proposes two online subspace learning and imputation methods based on Tensor Ring decomposition. Specifically, we first propose an online Tensor Ring subspace learning and imputation model by formulating an exponentially weighted least squares problem with Frobenius norm regularization of the TR-cores. Then, two commonly used optimization algorithms, i.e., alternating recursive least squares and stochastic gradient algorithms, are developed to solve the proposed model. Numerical experiments show that the proposed methods are more effective at exploiting the time-varying subspace than the conventional Tensor Ring completion methods. Besides, the proposed methods are shown to outperform state-of-the-art online methods in streaming data completion under varying missing ratios and noise.


Subjects
Algorithms
18.
ISA Trans ; 129(Pt B): 594-608, 2022 Oct.
Article in English | MEDLINE | ID: mdl-35164962

ABSTRACT

The setting of alarm thresholds is a critical concern of alarm management systems in industrial processes. Conventional alarm thresholds rarely account for changes of operating conditions in production processes, which degrades the effectiveness of alarm management systems. In response to this problem, this paper proposes an adaptive alarm threshold setting approach based on stream data clustering (SDC). Firstly, we develop a stream data clustering algorithm, termed the a-DenStream algorithm, which realizes industrial flow data clustering through online micro-clustering and offline integration. Subsequently, we develop the C-BOUND algorithm to extract the edges of the clustering results. In response to alarms associated with multiple operating conditions, segmentations are conducted to set alarm threshold groups and build a multi-condition alarm threshold model. Consequently, an adaptive alarm threshold setting method based on model matching is created. The effectiveness of the proposed method is demonstrated by experiments on a coal gasification chemical process. The proposed method has potential application to alarm management in industrial processes with multiple operating conditions.

19.
J Supercomput ; 78(5): 7078-7105, 2022.
Article in English | MEDLINE | ID: mdl-34754141

ABSTRACT

The COronaVIrus Disease 2019 (COVID-19) pandemic is unfortunately highly transmissible between people. In order to detect and track people suspected of COVID-19 infection and consequently limit the pandemic spread, this paper presents a framework integrating machine learning (ML), cloud, fog, and Internet of Things (IoT) technologies to propose a novel smart COVID-19 disease monitoring and prognosis system. The proposal leverages IoT devices that collect streaming data from both medical (e.g., X-ray machine, lung ultrasound machine, etc.) and non-medical (e.g., bracelet, smartwatch, etc.) devices. Moreover, the proposed hybrid fog-cloud framework provides two kinds of federated ML as a service (federated MLaaS): (i) the distributed batch MLaaS, which is implemented in the cloud environment for long-term decision-making, and (ii) the distributed stream MLaaS, which is installed in a hybrid fog-cloud environment for short-term decision-making. The stream MLaaS uses a shared federated prediction model stored in the cloud, whereas the real-time symptom data processing and COVID-19 prediction are done in the fog. The federated ML models are determined after evaluating a set of both batch and stream ML algorithms from Python libraries. The evaluation considers both quantitative (i.e., performance in terms of accuracy, precision, root mean squared error, and F1 score) and qualitative (i.e., quality of service in terms of server latency, response time, and network latency) metrics to assess these algorithms. This evaluation shows that the stream ML algorithms have the potential to be integrated into COVID-19 prognosis, allowing early prediction of suspected COVID-19 cases.

20.
Int J Mach Learn Cybern ; 12(6): 1803-1824, 2021 Jun.
Article in English | MEDLINE | ID: mdl-34149955

ABSTRACT

Accurate online density estimation is crucial to numerous applications that are prevalent with streaming data. Existing online approaches for density estimation somewhat lack prompt adaptability and robustness when facing concept-drifting and noisy streaming data, resulting in delayed or even deteriorated approximations. To alleviate this issue, in this work, we first propose an adaptive local online kernel density estimator (ALoKDE) for real-time density estimation on data streams. ALoKDE consists of two tightly integrated strategies: (1) a statistical test for concept drift detection and (2) an adaptive weighted local online density estimation when a drift does occur. Specifically, using a weighted form, ALoKDE seeks to provide an unbiased estimation by factoring in the statistical hallmarks of the latest learned distribution and any potential distributional changes that could be introduced by each incoming instance. A robust variant of ALoKDE, i.e., R-ALoKDE, is further developed to effectively handle data streams with varied types/levels of noise. Moreover, we analyze the asymptotic properties of ALoKDE and R-ALoKDE, and also derive their theoretical error bounds regarding bias, variance, MSE and MISE. Extensive comparative studies on various artificial and real-world (noisy) streaming data demonstrate the efficacies of ALoKDE and R-ALoKDE in online density estimation and real-time classification (with noise).
