Results 1 - 20 of 34
1.
Sensors (Basel) ; 24(3)2024 Jan 28.
Article in English | MEDLINE | ID: mdl-38339572

ABSTRACT

The effective operation of distributed energy sources relies significantly on the communication systems employed in microgrids. This article explores the fundamental communication requirements, structures, and protocols necessary to establish a secure connection in microgrids, and examines the present difficulties facing, and progress in, smart microgrid communication technologies, including wired and wireless networks. Furthermore, it evaluates the incorporation of diverse security methods. A case study illustrates the implementation of a distributed cyber-security communication system in a microgrid setting. The article concludes by emphasizing ongoing research endeavors and suggesting potential future research paths in the field of microgrid communications.

2.
BMC Bioinformatics ; 24(1): 77, 2023 Mar 03.
Article in English | MEDLINE | ID: mdl-36869285

ABSTRACT

BACKGROUND: Data archiving and distribution are essential to scientific rigor and reproducibility of research. The National Center for Biotechnology Information's Database of Genotypes and Phenotypes (dbGaP) is a public repository for scientific data sharing. To support curation of thousands of complex data sets, dbGaP has detailed submission instructions that investigators must follow when archiving their data. RESULTS: We developed dbGaPCheckup, an R package which implements a series of check, awareness, reporting, and utility functions to support data integrity and proper formatting of the subject phenotype data set and data dictionary prior to dbGaP submission. For example, as a tool, dbGaPCheckup ensures that the data dictionary contains all fields required by dbGaP, and additional fields required by dbGaPCheckup; the number and names of variables match between the data set and data dictionary; there are no duplicated variable names or descriptions; observed data values are not more extreme than the logical minimum and maximum values stated in the data dictionary; and more. The package also includes functions that implement a series of minor/scalable fixes when errors are detected (e.g., a function to reorder the variables in the data dictionary to match the order listed in the data set). Finally, we also include reporting functions that produce graphical and textual descriptives of the data to further reduce the likelihood of data integrity issues. The dbGaPCheckup R package is available on CRAN ( https://CRAN.R-project.org/package=dbGaPCheckup ) and developed on GitHub ( https://github.com/lwheinsberg/dbGaPCheckup ). CONCLUSION: dbGaPCheckup is an innovative assistive and timesaving tool that fills an important gap for researchers by making dbGaP submission of large and complex data sets less error prone.


Subjects
Biotechnology , Information Dissemination , Reproducibility of Results , Databases, Factual , Phenotype
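
dbGaPCheckup itself is an R package; purely as an illustration of the kinds of checks the record describes (name agreement between data set and dictionary, duplicate detection, logical-range validation), here is a minimal Python sketch that assumes hypothetical dictionary columns VARNAME, MIN, and MAX rather than the package's actual API.

```python
import pandas as pd

def check_submission(data: pd.DataFrame, dictionary: pd.DataFrame) -> list[str]:
    """Return a list of problems found when comparing a phenotype data set
    against its data dictionary (columns VARNAME, MIN, MAX are assumed)."""
    problems = []

    # 1. Variable names must match between data set and dictionary.
    data_vars, dict_vars = set(data.columns), set(dictionary["VARNAME"])
    if data_vars != dict_vars:
        problems.append(f"name mismatch: {data_vars ^ dict_vars}")

    # 2. No duplicated variable names in the dictionary.
    dups = dictionary["VARNAME"][dictionary["VARNAME"].duplicated()]
    if not dups.empty:
        problems.append(f"duplicated variables: {sorted(dups)}")

    # 3. Observed values must stay within the documented logical range.
    for _, row in dictionary.dropna(subset=["MIN", "MAX"]).iterrows():
        col = row["VARNAME"]
        if col in data and ((data[col] < row["MIN"]) | (data[col] > row["MAX"])).any():
            problems.append(f"{col}: values outside [{row['MIN']}, {row['MAX']}]")

    return problems
```
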
3.
Sensors (Basel) ; 23(17)2023 Aug 22.
Article in English | MEDLINE | ID: mdl-37687787

ABSTRACT

In the manufacturing process, equipment failure is directly related to productivity, so predictive maintenance plays a very important role. Industrial parks are distributed, and the data produced by their heterogeneous equipment are likewise heterogeneous, which makes predictive maintenance challenging. In this paper, we propose two main techniques to enable effective predictive maintenance in this environment. First, we propose a 1DCNN-BiLSTM model for time series anomaly detection and predictive maintenance of manufacturing processes. The model combines a 1D convolutional neural network (1DCNN) and a bidirectional LSTM (BiLSTM), which is effective in extracting features from time series data and detecting anomalies. Second, we combine a federated learning framework with these models to account for distributional shifts in time series data and to perform anomaly detection and predictive maintenance on that basis. We use a pump dataset to evaluate the performance of combinations of several federated learning frameworks and time series anomaly detection models. Experimental results show that the proposed framework achieves a test accuracy of 97.2%, which shows its potential for real-world predictive maintenance in the future.
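
As a rough illustration of the architecture described above, the following Keras sketch stacks a 1D convolution over a bidirectional LSTM for binary anomaly detection; the window length, channel count, and layer sizes are assumptions, and the federated aggregation loop is omitted.

```python
import tensorflow as tf
from tensorflow.keras import layers

WINDOW, FEATURES = 128, 8   # assumed sliding-window length and sensor channels

def build_1dcnn_bilstm() -> tf.keras.Model:
    """1D-CNN feature extractor followed by a bidirectional LSTM,
    ending in a binary normal/anomaly output."""
    model = tf.keras.Sequential([
        layers.Input(shape=(WINDOW, FEATURES)),
        layers.Conv1D(64, kernel_size=3, padding="same", activation="relu"),
        layers.MaxPooling1D(pool_size=2),
        layers.Bidirectional(layers.LSTM(32)),
        layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

# In a federated setting, each site would train a local copy of this model on its
# own windows and share only weight updates with the aggregator.
```
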

4.
Sensors (Basel) ; 23(13)2023 Jul 07.
Article in English | MEDLINE | ID: mdl-37448071

ABSTRACT

The landing gear structure suffers from large loads during aircraft takeoff and landing, and an accurate prediction of landing gear performance is beneficial to ensure flight safety. Nevertheless, the landing gear performance prediction method based on machine learning has a strong reliance on the dataset, in which the feature dimension and data distribution will have a great impact on the prediction accuracy. To address these issues, a novel MCA-MLPSA is developed. First, an MCA (multiple correlation analysis) method is proposed to select key features. Second, a heterogeneous multilearner integration framework is proposed, which makes use of different base learners. Third, an MLPSA (multilayer perceptron with self-attention) model is proposed to adaptively capture the data distribution and adjust the weights of each base learner. Finally, the excellent prediction performance of the proposed MCA-MLPSA is validated by a series of experiments on the landing gear data.


Subjects
Aircraft , Machine Learning , Neural Networks, Computer
5.
Magn Reson Med ; 87(2): 932-947, 2022 02.
Article in English | MEDLINE | ID: mdl-34545955

ABSTRACT

PURPOSE: Supervised machine learning (ML) provides a compelling alternative to traditional model fitting for parameter mapping in quantitative MRI. The aim of this work is to demonstrate and quantify the effect of different training data distributions on the accuracy and precision of parameter estimates when supervised ML is used for fitting. METHODS: We fit a two- and three-compartment biophysical model to diffusion measurements from in-vivo human brain, as well as simulated diffusion data, using both traditional model fitting and supervised ML. For supervised ML, we train several artificial neural networks, as well as random forest regressors, on different distributions of ground truth parameters. We compare the accuracy and precision of parameter estimates obtained from the different estimation approaches using synthetic test data. RESULTS: When the distribution of parameter combinations in the training set matches those observed in healthy human data sets, we observe high precision, but inaccurate estimates for atypical parameter combinations. In contrast, when training data is sampled uniformly from the entire plausible parameter space, estimates tend to be more accurate for atypical parameter combinations but may have lower precision for typical parameter combinations. CONCLUSION: This work highlights that estimation of model parameters using supervised ML depends strongly on the training-set distribution. We show that high precision obtained using ML may mask strong bias, and visual assessment of the parameter maps is not sufficient for evaluating the quality of the estimates.


Subjects
Diffusion Magnetic Resonance Imaging , Machine Learning , Algorithms , Humans , Magnetic Resonance Imaging , Neural Networks, Computer
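
A toy Python sketch of the central comparison: train the same regressor on parameters drawn from a narrow "typical" distribution versus a uniform one, then test on an atypical value. A mono-exponential decay stands in for the paper's multi-compartment diffusion model, so this mirrors the idea only, not the actual experiment.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
b = np.linspace(0, 3, 16)                       # assumed acquisition settings

def signal(d):                                  # stand-in forward model with noise
    return np.exp(-np.outer(d, b)) + rng.normal(0, 0.02, (len(d), len(b)))

# Training set A: parameters drawn like a "healthy cohort" (narrow Gaussian).
# Training set B: parameters drawn uniformly over the plausible range.
d_typical = rng.normal(1.0, 0.1, 5000).clip(0.1, 3.0)
d_uniform = rng.uniform(0.1, 3.0, 5000)

fit_typical = RandomForestRegressor(n_estimators=100).fit(signal(d_typical), d_typical)
fit_uniform = RandomForestRegressor(n_estimators=100).fit(signal(d_uniform), d_uniform)

# Atypical test value: the uniformly trained fitter is usually less biased here.
d_test = np.full(1000, 2.5)
for name, fit in [("typical", fit_typical), ("uniform", fit_uniform)]:
    est = fit.predict(signal(d_test))
    print(f"{name}: bias={est.mean() - 2.5:+.3f}  sd={est.std():.3f}")
```
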
6.
Sensors (Basel) ; 22(7)2022 Mar 25.
Article in English | MEDLINE | ID: mdl-35408155

ABSTRACT

Data distribution is a cornerstone of efficient automation for intelligent machines in Industry 4.0. Although in the recent literature there have been several comparisons of relevant methods, we identify that most of those comparisons are either theoretical or based on abstract simulation tools, unable to uncover the specific, detailed impacts of the methods on the underlying networking infrastructure. In this respect, as a first contribution of this paper, we develop more detailed and fine-tuned solutions for robust data distribution in smart factories in stationary and mobile scenarios of wireless industrial networking. Using the technological enablers of WirelessHART, RPL and the methodological enabler of proxy selection as building blocks, we compose the protocol stacks of four different methods (both centralized and decentralized) for data distribution in wireless industrial networks over the IEEE 802.15.4 physical layer. We implement the presented methods in the highly detailed OMNeT++ simulation environment and we evaluate their performance via an extensive simulation analysis. Interestingly enough, we demonstrate that the careful selection of a limited set of proxies for data caching in the network can lead to an increased data delivery success rate and low data access latency. Next, we describe two test cases demonstrated in an industrial smart factory environment. First, we show the collaboration between robotic elements and wireless data services. Second, we show the integration with an industrial fog node which controls the shop-floor devices. We report selected results at much larger scales, obtained via simulations.


Subjects
Computer Communication Networks , Wireless Technology , Automation , Computer Simulation , Industry
7.
J Med Syst ; 46(11): 73, 2022 Oct 03.
Article in English | MEDLINE | ID: mdl-36190581

ABSTRACT

Processing full-length cystoscopy videos is challenging for documentation and research purposes. We therefore designed a surgeon-guided framework to extract short video clips with bladder lesions for more efficient content navigation and extraction. Screenshots of bladder lesions were captured during transurethral resection of bladder tumor, then manually labeled according to case identification, date, lesion location, imaging modality, and pathology. The framework used the screenshot to search for and extract a corresponding 10-second video clip. Each video clip included a one-second space holder with a QR barcode describing the video content. The success of the framework was measured by the secondary use of these short clips and the reduction of the storage volume required for video materials. From 86 cases, the framework successfully generated 249 video clips from 230 screenshots, with 14 erroneous video clips from 8 screenshots excluded. The HIPAA-compliant barcodes provided information on video contents with 100% data completeness. A web-based educational gallery was curated with various diagnostic categories and annotated frame sequences. Compared with the unedited videos, the informative short video clips reduced the storage volume by 99.5%. In conclusion, our framework expedites the generation of visual contents with the surgeon's instruction for cystoscopy and the potential incorporation of video data towards applications including clinical documentation, education, and research.


Subjects
Cystoscopy , Urinary Bladder Neoplasms , Cystoscopy/methods , Diagnostic Imaging , Documentation , Humans , Urinary Bladder/diagnostic imaging , Urinary Bladder/pathology , Urinary Bladder Neoplasms/diagnostic imaging , Urinary Bladder Neoplasms/surgery
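
The clip-extraction step can be approximated with the standard ffmpeg command line; the sketch below cuts a 10-second segment starting at a matched timestamp. The file names and the timestamp lookup are hypothetical, and the QR space-holder insertion described in the record is omitted.

```python
import subprocess

def extract_clip(source: str, start_seconds: float, out_path: str, duration: float = 10.0) -> None:
    """Cut a short clip from a full-length cystoscopy recording with ffmpeg.
    Stream copy (-c copy) avoids re-encoding, so the cut lands on keyframes."""
    subprocess.run(
        [
            "ffmpeg",
            "-ss", str(start_seconds),   # seek to the matched screenshot time
            "-i", source,
            "-t", str(duration),         # keep 10 seconds by default
            "-c", "copy",
            out_path,
        ],
        check=True,
    )

# Hypothetical usage: extract_clip("case_086_full.mp4", 1325.0, "case_086_lesion_03.mp4")
```
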
8.
Sensors (Basel) ; 21(13)2021 Jul 02.
Article in English | MEDLINE | ID: mdl-34283084

ABSTRACT

Renewable energy sources, which are controllable under the management of microgrids with the contribution of energy storage systems and smart inverters, can support power system frequency regulation alongside traditional frequency control providers. This will not be viable without a robust communication architecture that meets all communication requirements of frequency regulation, including latency, reliability, and security. Therefore, this paper focuses on providing a communication framework for interaction between the power grid management system and the microgrid central controller. In this scenario, the microgrid control center is integrated into the utility grid as a frequency regulation supporter for the main grid. This communication structure emulates the information model of the IEC 61850 protocol to ensure interoperability. By employing the Data Distribution Service (DDS), an IoT transmission protocol, the structure satisfies the communication requirements for interaction over a wide-area network. This paper presents an interoperable information model for interactions between the microgrid central controller and power system management sectors based on the IEC 61850-8-2 standard. Furthermore, we evaluate our scenario by measuring the latency, reliability, and security performance of DDS on a real communication testbed.

9.
Entropy (Basel) ; 23(8)2021 Aug 11.
Article in English | MEDLINE | ID: mdl-34441170

ABSTRACT

Rett syndrome is a disease that involves acute cognitive impairment and, consequently, a complex and varied symptomatology. This study evaluates the EEG signals of twenty-nine patients and classifies them according to the level of movement artifact. The main goal is to achieve an artifact rejection strategy that performs well on all signals, regardless of the artifact level. Two different methods have been studied: one based on the data distribution and the other based on an energy function with entropy as its main component. The method based on the data distribution shows poor performance with signals containing high-amplitude outliers. On the contrary, the method based on the energy function is more robust to outliers; as it does not depend on the data distribution, it is not affected by artifactual events. A double rejection strategy was chosen, first on a motion signal (accelerometer or EEG low-pass filtered between 1 and 10 Hz) and then on the EEG signal. The results showed higher performance when both artifact rejection methods were combined: the energy-based method to isolate motion artifacts, and the data-distribution-based method to eliminate the remaining lower-amplitude artifacts. In conclusion, a new method that proves to be robust for all types of signals was designed.
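
As a simple illustration of a data-distribution-based rejection rule of the kind discussed, the sketch below flags one-second EEG epochs whose peak amplitude exceeds a percentile-derived threshold; the epoch length and threshold factor are assumptions, not the authors' exact parameters.

```python
import numpy as np

def reject_epochs(eeg: np.ndarray, fs: int, epoch_s: float = 1.0, k: float = 3.0) -> np.ndarray:
    """Split a single EEG channel into fixed-length epochs and flag those whose
    peak amplitude exceeds median + k * IQR of the per-epoch peaks."""
    n = int(fs * epoch_s)
    epochs = eeg[: len(eeg) // n * n].reshape(-1, n)
    peaks = np.abs(epochs).max(axis=1)
    q1, q3 = np.percentile(peaks, [25, 75])
    threshold = np.median(peaks) + k * (q3 - q1)
    return peaks > threshold                      # True = epoch rejected

# e.g. mask = reject_epochs(raw_channel, fs=256) marks the high-amplitude epochs.
```
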

10.
Mol Genet Metab ; 123(4): 495-500, 2018 04.
Article in English | MEDLINE | ID: mdl-29530534

ABSTRACT

Deficiency of beta-glucocerebrosidase (GBA) leads to Gaucher disease (GD), an inherited disorder characterised by storage of glucosylceramide (GlcCer) in lysosomes of tissue macrophages. Macrophages activated by accumulated GlcCer secrete chitotriosidase. Plasma chitotriosidase activity is significantly elevated in patients with active GD and has been suggested to indicate total body Gaucher cell load. Two biomarkers are used to assess the severity of GD: chitotriosidase has been measured for over 20 years, and deacylated GlcCer, known as glucosylsphingosine (GlcSph), is thought to be even more suitable, as it is almost a direct storage substrate. In this paper we focused entirely on statistical analysis, performing a thorough search for possible relations, dependencies and differences in the levels of these two biomarkers in a cohort of 64 Polish GD patients. We found that treatment of GD with enzyme replacement therapy (ERT) changes the distribution of the disease biomarkers; their levels follow a normal distribution only in untreated patients. The variable "disease biomarker level" was found to be dependent on the binary variable "treated with ERT or not". It was found to be independent of the following variables: "disease type", "splenectomized or not", and "heterozygous for the 24-bp duplication CHIT1 variant" or "CHIT1 wild type". An almost perfect linear correlation (coefficient of determination R² = 0.99) between chitotriosidase activity and GlcSph level was revealed in splenectomized patients.


Subjects
Biomarkers/blood , Gaucher Disease/metabolism , Gaucher Disease/pathology , Hexosaminidases/metabolism , Models, Statistical , Psychosine/analogs & derivatives , Gaucher Disease/classification , Humans , Phenotype , Psychosine/metabolism
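
The reported relationship is an ordinary linear correlation; the sketch below shows how Pearson's r, R², and the regression line would be computed in Python on made-up paired measurements (not the cohort data).

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (not the study's cohort data).
chitotriosidase = np.array([1200.0, 3400.0, 560.0, 7800.0, 2500.0, 4100.0])
glcsph = np.array([95.0, 260.0, 48.0, 610.0, 200.0, 330.0])

r, p_value = stats.pearsonr(chitotriosidase, glcsph)
print(f"Pearson r = {r:.3f}, R^2 = {r**2:.3f}, p = {p_value:.3g}")

# A least-squares line gives the same R^2 for a simple linear relationship.
slope, intercept, r_lin, p_lin, se = stats.linregress(chitotriosidase, glcsph)
print(f"GlcSph ~ {slope:.4f} * chitotriosidase + {intercept:.1f}")
```
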
11.
Sensors (Basel) ; 18(10)2018 Oct 12.
Article in English | MEDLINE | ID: mdl-30322019

ABSTRACT

We claim the strong potential of data-centric communications in Unmanned Aircraft Systems (UAS), as a suitable paradigm to enhance collaborative operations via efficient information sharing, as well as to build systems supporting flexible mission objectives. In particular, this paper analyzes the primary contributions to data dissemination in UAS that can be given by the Data Distribution Service (DDS) open standard, as a solid and industry-mature data-centric technology. Our study is not restricted to traditional UAS where a set of Unmanned Aerial Vehicles (UAVs) transmit data to the ground station that controls them. Instead, we contemplate flexible UAS deployments with multiple UAV units of different sizes and capacities, which are interconnected to form an aerial communication network, enabling the provision of value-added services over a delimited geographical area. In addition, the paper outlines an approach to address the issues inherent to the utilization of network-level multicast, a baseline technology in DDS, in the considered UAS deployments. We complete our analysis with a practical experiment aimed at validating the feasibility and the advantages of using DDS in a multi-UAV deployment scenario. For this purpose, we use a UAS testbed built from heterogeneous hardware equipment, including a number of interconnected micro aerial vehicles, carrying single board computers as payload, as well as real equipment from a tactical UAS from the Spanish Ministry of Defense.

12.
BMC Vet Res ; 13(1): 397, 2017 Dec 22.
Article in English | MEDLINE | ID: mdl-29273034

ABSTRACT

BACKGROUND: Today's globalised and interconnected world is characterized by intertwined and quickly evolving relationships between animals, humans and their environment, and by an escalating amount of accessible data for public health. The public veterinary services must exploit new modeling and decision strategies to face these changes. The organization and control of data flows have become crucial to effectively evaluate the evolution and safety concerns of a given situation in the territory. This paper discusses what is needed to develop modern strategies to optimize data distribution to the stakeholders. MAIN TEXT: While the system manager and knowledge engineer have traditionally been concerned with increasing the speed of data flow and improving data quality, nowadays they need to worry about data overflow as well. To avoid this risk, an information system should be capable of selecting the data which need to be shown to the human operator. In this perspective, two aspects need to be distinguished: data classification vs data distribution. Data classification is the problem of organizing data depending on what they refer to and on the way they are obtained; data distribution is the problem of selecting which data are accessible to which stakeholder. Data classification can be established and implemented via ontological analysis and formal logic, but we claim that a context-based selection of data should be integrated into the data distribution application. Data distribution should provide these new features: (a) the organization of situation types, distinguishing at least ordinary vs extraordinary scenarios (contextualization of scenarios); (b) the possibility to focus on the data that are really important in a given scenario (data contextualization by scenarios); and (c) the classification of which data are relevant to which stakeholder (data contextualization by users). SHORT CONCLUSION: To efficaciously and efficiently manage the information needed for today's health and safety challenges, public veterinary services should contextualize and filter the continuous and growing flow of data by setting suitable frameworks to classify data, users' roles and possible situations.


Subjects
Information Dissemination , Animals , Data Collection , Information Dissemination/methods , Public Health , Safety , Veterinary Medicine/methods , Veterinary Medicine/organization & administration
13.
Sensors (Basel) ; 18(1)2017 Dec 26.
Article in English | MEDLINE | ID: mdl-29278370

ABSTRACT

eHealth systems have adopted recent advances in sensing technologies together with advances in information and communication technologies (ICT) in order to provide people-centered services that improve the quality of life of an increasingly elderly population. As these eHealth services are founded on the acquisition and processing of sensitive data (e.g., personal details, diagnosis, treatments and medical history), any security threat would damage the public's confidence in them. This paper proposes a solution for the design and runtime management of indoor eHealth applications with security requirements. The proposal allows application definitions customized to patient particularities, including the early detection of health deterioration and suitable reactions (events), as well as security needs. At runtime, security support is twofold: a secured component-based platform supervises application execution and provides event management, whilst the security of the communications among application components is also guaranteed. Additionally, the proposed event management scheme adopts the fog computing paradigm to enable local storage and processing of event-related data, thus saving communication bandwidth when communicating with the cloud. As a proof of concept, this proposal has been validated through the monitoring of the health status of diabetic patients at a nursing home.

14.
J Mod Appl Stat Methods ; 16(1): 744-752, 2017.
Article in English | MEDLINE | ID: mdl-30393468

ABSTRACT

Temporal changes in methods for collecting longitudinal data can generate inconsistent distributions of affected variables, but effects on parameter estimates have not been well described. We examined differences in Apgar scores of infants born in 2000-2006 to women with ovulatory dysfunction (risk) or tubal obstruction (reference) who underwent assisted reproductive technology (ART), using Florida, Massachusetts, and Michigan birth certificate data linked to the Centers for Disease Control and Prevention's National ART Surveillance System database. Florida had inconsistent information on induction of labor (a control variable) because of a 2004 change in birth certificate format. Because we wanted to control for bias that may be introduced by the inconsistent distribution of labor induction, we used multiply imputed data in the analysis. We used the Cox-Iannacchione weighted sequential hot deck method to conduct multiple imputation for the labor induction values in Florida data collected before this change, and for missing values in Florida data collected after the change and in the overall Massachusetts and Michigan data. The adjusted odds ratios for low Apgar score were 1.94 (95% confidence interval [CI] 1.32-2.85) using imputed induction of labor and 1.83 (95% CI 1.20-2.80) using non-imputed induction of labor. Compared with the estimate from multiple imputation, the estimate obtained using non-imputed induction of labor was biased towards the null, with inflated standard errors, but the magnitude of the differences was small.
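
For readers unfamiliar with the effect measure, the sketch below computes a crude odds ratio and its 95% confidence interval from an invented 2×2 table using the log-OR standard error; the study's estimates are covariate-adjusted and imputation-based, so this only illustrates the arithmetic.

```python
import math

# Hypothetical 2x2 table (exposure = ovulatory dysfunction, outcome = low Apgar).
a, b = 40, 960      # exposed: events, non-events
c, d = 55, 2445     # reference (tubal obstruction): events, non-events

or_hat = (a * d) / (b * c)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
lower = math.exp(math.log(or_hat) - 1.96 * se_log_or)
upper = math.exp(math.log(or_hat) + 1.96 * se_log_or)
print(f"OR = {or_hat:.2f}, 95% CI {lower:.2f}-{upper:.2f}")
```
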

15.
Ecol Evol ; 14(4): e11237, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38633526

ABSTRACT

Graphs in research articles can increase the comprehension of statistical data but may mislead readers if poorly designed. We propose a new plot type, the sea stack plot, which combines vertical histograms and summary statistics to represent large univariate datasets accurately, usefully, and efficiently. We compare five commonly used plot types (dot and whisker plots, boxplots, density plots, univariate scatter plots, and dot plots) to assess their relative strengths and weaknesses when representing distributions of data commonly observed in biological studies. We find the assessed plot types are either difficult to read at large sample sizes or have the potential to misrepresent certain distributions of data, showing the need for an improved method of data visualisation. We present an analysis of the plot types used in four ecology and conservation journals covering multiple areas of these research fields, finding widespread use of uninformative bar charts and dot and whisker plots (60% of all panels showing univariate data from multiple groups for the purpose of comparison). Some articles presented more informative figures by combining plot types (16% of panels), generally boxplots and a second layer such as a flat density plot, to better display the data. This shows an appetite for more effective plot types within conservation and ecology, which may further increase if accurate and user-friendly plot types were made available. Finally, we describe sea stack plots and explain how they overcome the weaknesses associated with other alternatives to uninformative plots when used for large and/or unevenly distributed data. We provide a tool to create sea stack plots with our R package 'seastackplot', available through GitHub.
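
The authors provide the R package 'seastackplot'; the matplotlib sketch below only approximates the stated idea (a vertical histogram per group plus a summary marker) and is not the package's implementation.

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
groups = {"site A": rng.lognormal(0.0, 0.6, 800), "site B": rng.normal(3.0, 1.0, 800)}

fig, ax = plt.subplots()
for i, (name, values) in enumerate(groups.items()):
    counts, edges = np.histogram(values, bins=25)
    lengths = counts / counts.max() * 0.4        # scale bar lengths into the column
    centers = (edges[:-1] + edges[1:]) / 2
    ax.barh(centers, lengths, height=np.diff(edges), left=i, color="steelblue")
    ax.plot(i, np.median(values), marker="_", color="black", markersize=20)  # summary stat
ax.set_xticks(range(len(groups)))
ax.set_xticklabels(list(groups))
ax.set_ylabel("value")
plt.show()
```
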

16.
Biosystems ; 236: 105126, 2024 Feb.
Article in English | MEDLINE | ID: mdl-38278505

ABSTRACT

The inference of gene regulatory networks (GRNs) is a widely addressed problem in Systems Biology. GRNs can be modeled as Boolean networks, which is the simplest approach for this task. However, Boolean models need binarized data. Several approaches have been developed for the discretization of gene expression data (GED). Also, the advance of data extraction technologies, such as single-cell RNA sequencing (scRNA-Seq), provides a new vision of gene expression and brings new challenges for dealing with its specificities, such as the large proportion of zero values. This work proposes a new discretization approach for dealing with scRNA-Seq time-series data, named Distribution and Successive Spline Points Discretization (DSSPD), which considers the data distribution and a proper preprocessing step. Here, Cartesian Genetic Programming (CGP) is used to infer GRNs using the results of DSSPD. The proposal is compared with CGP with the standard data handling and five state-of-the-art algorithms on curated models and experimental data. The results show that the proposal improves the results of CGP in all tested cases and outperforms the state-of-the-art algorithms in most cases.


Subjects
Gene Regulatory Networks , Single-Cell Gene Expression Analysis , Tosyl Compounds , Gene Regulatory Networks/genetics , Algorithms , Systems Biology , Gene Expression Profiling/methods
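
DSSPD's spline-based procedure is specific to the paper; as a generic stand-in, the sketch below binarizes an expression matrix with a per-gene threshold taken from the distribution of nonzero values, illustrating the kind of distribution-aware discretization involved.

```python
import numpy as np

def binarize_expression(expr: np.ndarray) -> np.ndarray:
    """Binarize a genes x timepoints scRNA-Seq-like matrix.
    Dropout zeros stay 0; other values are thresholded per gene at the
    median of that gene's nonzero measurements."""
    binary = np.zeros_like(expr, dtype=int)
    for g in range(expr.shape[0]):
        nonzero = expr[g][expr[g] > 0]
        if nonzero.size == 0:
            continue                      # gene never expressed
        binary[g] = (expr[g] >= np.median(nonzero)).astype(int)
    return binary

expr = np.array([[0.0, 1.2, 3.5, 0.0],
                 [2.0, 2.1, 0.1, 0.0]])
print(binarize_expression(expr))          # [[0 0 1 0], [1 1 0 0]]
```
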
17.
Sci Total Environ ; 946: 174099, 2024 Oct 10.
Article in English | MEDLINE | ID: mdl-38917894

ABSTRACT

This paper highlights the critical role of pH or proton activity measurements in environmental studies and emphasises the importance of applying proper statistical approaches when handling pH data. This allows for more informed decisions to effectively manage environmental data, such as data from mining-influenced water. The pH and {H+} of the same system display different distributions, with pH mostly displaying a normal or bimodal distribution and {H+} showing a lognormal distribution. It is therefore a challenge to decide whether to use pH or {H+} when computing the mean or other measures of central tendency for further environmental statistical analyses. In this study, different statistical techniques were applied to understand the distribution of pH and {H+} at four different mine sites: Metsämonttu in Finland, Felsendome Rabenstein in Germany, and the Eastrand and Westrand mine water treatment plants in South Africa. Based on the statistical results, the geometric mean can be used to calculate the average pH if the distribution is unimodal. For a multimodal pH data distribution, peak-identifying methods can be applied to extract the mean for each data population and use these for further statistical analyses.
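
The recommendation to average pH via the geometric mean can be checked numerically: the geometric mean of {H+} corresponds exactly to the arithmetic mean of the pH values, while the arithmetic mean of {H+} is pulled toward the most acidic samples. The readings below are invented.

```python
import numpy as np

ph = np.array([3.2, 4.1, 6.8, 7.2, 2.9])           # invented mine-water pH readings
h_activity = 10.0 ** (-ph)                          # {H+} recovered from pH

arithmetic_mean_h = h_activity.mean()
geometric_mean_h = np.exp(np.log(h_activity).mean())

print("pH from arithmetic mean of {H+}:", -np.log10(arithmetic_mean_h))
print("pH from geometric mean of {H+}: ", -np.log10(geometric_mean_h))
print("arithmetic mean of pH:          ", ph.mean())
# The last two agree: the geometric mean of {H+} is equivalent to the plain mean of pH.
```
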

18.
Article in English | MEDLINE | ID: mdl-36798963

ABSTRACT

A method for managing, securing, and validating health data distribution records using a genetic-based hashing algorithm in a decentralized environment is presented in this research report. The rationale for choosing blockchain is to secure the transaction of health data and protect these data from fraudulent manipulation and corruption by a contributor to the chain, or by any individual. Our approach uses technology that provides an efficient surveillance measure, including transparency of records, immunity from fraud, and protection from tampering, as well as sustaining the order of data. For medical research, the results here provide a genetic-based hashing algorithm for data security, which has lower computational complexity, low space coverage, higher security and integrity, and a high avalanche effect. The simulation shows the validity, immunity, and integrity of the data record. The technique modified in this secure decentralized network is a 512-bit cryptographic hashing algorithm. In this study, a genetic algorithm (GA) is used to generate a key that must be used in the encryption and decryption of medical data. A GA is a metaheuristic approach inspired by the laws of genetics, and it is generally used to generate high-quality solutions for complex problems. Applications of GAs are possible in medical fields such as radiology, oncology, cardiology, endocrinology, surgery, and radiotherapy in healthcare management.

19.
Math Biosci Eng ; 20(1): 1195-1128, 2023 Jan.
Article in English | MEDLINE | ID: mdl-36650808

ABSTRACT

Most current deep learning-based news headline generation models only target domain-specific news data. When a new news domain appears, it is usually costly to obtain a large amount of data with reference truth on the new domain for model training, so text generation models trained by traditional supervised approaches often do not generalize well on the new domain. Inspired by the idea of transfer learning, this paper designs a cross-domain transfer text generation method based on domain data distribution alignment, intermediate domain redistribution, and zero-shot learning semantic prototype transduction, focusing on the data problem with no reference truth in the target domain. Eventually, the model can be guided by the most relevant source domain data to generate headlines for target domain news text through the semantic correlation between source and target domain data during training, even without any reference truth for news headlines in the target domain, which improves the usability of the text generation model in real scenarios. The experimental results show that the proposed transfer text generation method has a good domain transfer effect and outperforms other existing transfer text generation methods on various text generation evaluation indexes, proving the method's effectiveness.


Subjects
Semantics
20.
Radiol Artif Intell ; 5(3): e220082, 2023 May.
Article in English | MEDLINE | ID: mdl-37293342

ABSTRACT

Purpose: To investigate the correlation between differences in data distributions and federated deep learning (Fed-DL) algorithm performance in tumor segmentation on CT and MR images. Materials and Methods: Two Fed-DL datasets were retrospectively collected (from November 2020 to December 2021): one dataset of liver tumor CT images (Federated Imaging in Liver Tumor Segmentation [or, FILTS]; three sites, 692 scans) and one publicly available dataset of brain tumor MR images (Federated Tumor Segmentation [or, FeTS]; 23 sites, 1251 scans). Scans from both datasets were grouped according to site, tumor type, tumor size, dataset size, and tumor intensity. To quantify differences in data distributions, the following four distance metrics were calculated: earth mover's distance (EMD), Bhattacharyya distance (BD), χ2 distance (CSD), and Kolmogorov-Smirnov distance (KSD). Both federated and centralized nnU-Net models were trained by using the same grouped datasets. Fed-DL model performance was evaluated by using the ratio of Dice coefficients, θ, between federated and centralized models trained and tested on the same 80:20 split datasets. Results: The Dice coefficient ratio (θ) between federated and centralized models was strongly negatively correlated with the distances between data distributions, with correlation coefficients of -0.920 for EMD, -0.893 for BD, and -0.899 for CSD. However, KSD was weakly correlated with θ, with a correlation coefficient of -0.479. Conclusion: Performance of Fed-DL models in tumor segmentation on CT and MRI datasets was strongly negatively correlated with the distances between data distributions. Keywords: CT, Abdomen/GI, Liver, Comparative Studies, MR Imaging, Brain/Brain Stem, Convolutional Neural Network (CNN), Federated Deep Learning, Tumor Segmentation, Data Distribution. Supplemental material is available for this article. © RSNA, 2023. See also the commentary by Kwak and Bai in this issue.
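
The four distance metrics are standard and can be reproduced on any pair of intensity samples; the sketch below uses scipy plus normalized histograms, with the binning and the synthetic site data as assumptions.

```python
import numpy as np
from scipy.stats import wasserstein_distance, ks_2samp

rng = np.random.default_rng(0)
site_a = rng.normal(60, 15, 5000)      # stand-in tumor-intensity samples, site A
site_b = rng.normal(75, 20, 5000)      # stand-in tumor-intensity samples, site B

# Histogram-based distances need a common binning and normalized counts.
bins = np.linspace(min(site_a.min(), site_b.min()), max(site_a.max(), site_b.max()), 65)
p, _ = np.histogram(site_a, bins=bins)
q, _ = np.histogram(site_b, bins=bins)
p = p / p.sum()
q = q / q.sum()
eps = 1e-12

emd = wasserstein_distance(site_a, site_b)                # earth mover's distance
bd = -np.log(np.sum(np.sqrt(p * q)) + eps)                # Bhattacharyya distance
csd = 0.5 * np.sum((p - q) ** 2 / (p + q + eps))          # chi-square distance
ksd = ks_2samp(site_a, site_b).statistic                  # Kolmogorov-Smirnov distance

print(f"EMD={emd:.2f}  BD={bd:.3f}  CSD={csd:.3f}  KSD={ksd:.3f}")
```
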
