ABSTRACT
HYPOTHESIS: The formation of distorted lamellar phases, distinguished by their arrangement of crumpled, stacked layers, is frequently accompanied by the disruption of long-range order, leading to the interconnected network structures commonly observed in the sponge phase. Nevertheless, traditional scattering functions grounded in deterministic modeling fall short of fully representing these intricate structural characteristics. We hypothesize that a deep learning method, in conjunction with the generalized leveled-wave approach for describing the structural features of distorted lamellar phases, can quantitatively unveil the inherent spatial correlations within these phases. EXPERIMENTS AND SIMULATIONS: This report outlines a novel strategy that integrates convolutional neural networks and variational autoencoders, supported by stochastically generated density fluctuations, into a regression framework for extracting the structural features of distorted lamellar phases from small-angle neutron scattering (SANS) data. To evaluate the efficacy of the proposed approach, we conducted computational accuracy assessments and applied it to experimentally measured SANS spectra of AOT surfactant solutions, a frequently studied lamellar system. FINDINGS: The findings demonstrate that deep learning provides a dependable, quantitative approach for investigating the morphology of a wide variety of distorted lamellar phases. It is adaptable for deciphering structures ranging from the lamellar to the sponge phase, including intermediate structures exhibiting fused topological features. This research highlights the effectiveness of deep learning methods in tackling complex problems in soft matter structural analysis and beyond.
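As an illustration of the leveled-wave construction behind the stochastically generated density fluctuations, here is a minimal 2D sketch (grid size, wave count, and wavenumber are illustrative placeholders, not values from the study): superpose randomly oriented plane waves of a single wavenumber, then clip ("level") the sum to obtain a two-phase morphology.

```python
import numpy as np

rng = np.random.default_rng(0)

def leveled_wave_field(n_waves=200, grid=64, box=10.0, k0=2 * np.pi, level=0.0):
    """Stochastic density fluctuation on a 2D grid: superpose randomly
    oriented plane waves of wavenumber k0, then clip the sum at `level`."""
    x = np.linspace(0.0, box, grid)
    X, Y = np.meshgrid(x, x)
    field = np.zeros_like(X)
    for _ in range(n_waves):
        theta = rng.uniform(0.0, 2 * np.pi)   # random wave direction
        phi = rng.uniform(0.0, 2 * np.pi)     # random phase
        field += np.cos(k0 * (X * np.cos(theta) + Y * np.sin(theta)) + phi)
    field /= np.sqrt(n_waves)                 # approx. unit-variance Gaussian field
    return np.where(field > level, 1.0, -1.0) # leveled two-phase structure

morphology = leveled_wave_field()
```

With `level=0.0` the two phases occupy roughly equal volume fractions; raising the level biases the morphology toward one phase, which is one way such models interpolate between lamellar-like and sponge-like structures.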
ABSTRACT
Geographical research using historical maps has progressed considerably, as the digitization of topographic maps across years provides valuable data and advances in machine learning provide powerful analytic tools. Nevertheless, analysis of historical maps based on supervised learning can be limited by laborious manual map annotation. In this work, we propose a semi-supervised learning method that can transfer map annotations across years, enabling map comparison and anthropogenic studies across time. Our novel two-stage framework first performs style transfer of topographic maps across years and versions; supervised learning is then applied to the synthesized maps with annotations. We investigate the proposed semi-supervised training with style-transferred maps and annotations on four widely used deep neural networks (DNNs), namely U-Net, the fully convolutional network (FCN), DeepLabV3, and MobileNetV3. The best-performing network, U-Net, achieves [Formula: see text] and [Formula: see text] when trained on style-transfer-synthesized maps, which indicates that the proposed framework is capable of detecting target features (bridges) on historical maps without annotations. In a comprehensive comparison, the [Formula: see text] of U-Net trained on the Contrastive Unpaired Translation (CUT)-generated dataset ([Formula: see text]) is 57.3% higher than the comparative score ([Formula: see text]) of the least valid configuration (MobileNetV3 trained on the CycleGAN-synthesized dataset). We also discuss the remaining challenges and future research directions.
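The metrics hidden behind the [Formula: see text] placeholders are typically mask-overlap scores such as intersection-over-union; as a hedged illustration (toy masks, not the paper's data), IoU between a predicted and an annotated bridge mask can be computed as:

```python
import numpy as np

def iou(pred, target):
    """Intersection-over-Union between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return inter / union if union else 1.0

# Toy 8x8 masks standing in for predicted / annotated bridge pixels.
pred = np.zeros((8, 8), dtype=int); pred[2:6, 2:6] = 1   # 16 px
gt   = np.zeros((8, 8), dtype=int); gt[3:7, 3:7] = 1     # 16 px
score = iou(pred, gt)
print(round(score, 3))  # 9 / 23 ≈ 0.391
```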
Subject(s)
Neural Networks, Computer; Supervised Machine Learning; Image Processing, Computer-Assisted/methods
ABSTRACT
Human detection and pose estimation are essential for understanding human activities in images and videos. Mainstream multi-human pose estimation methods take a top-down approach: human detection is performed first, and each detected person bounding box is then fed into a pose estimation network. This top-down approach suffers from early commitment to initial detections in crowded scenes and in other cases with ambiguities or occlusions, leading to pose estimation failures. We propose DetPoseNet, an end-to-end multi-human detection and pose estimation framework in a unified three-stage network. Our method consists of a coarse-pose proposal extraction sub-net, a coarse-pose-based proposal filtering module, and a multi-scale pose refinement sub-net. The coarse-pose proposal sub-net extracts whole-body bounding boxes and body keypoint proposals in a single shot. The coarse-pose filtering step, based on the person and keypoint proposals, can effectively rule out unlikely detections, thus improving subsequent processing. The pose refinement sub-net performs cascaded pose estimation on each refined proposal region. Multi-scale supervision and multi-scale regression are used in the pose refinement sub-net to strengthen context feature learning. Structure-aware loss and keypoint masking are applied to further improve refinement robustness. Our framework is flexible enough to accept most existing top-down pose estimators in the role of the pose refinement sub-net. Experiments on the COCO and OCHuman datasets demonstrate the effectiveness of the proposed framework. The proposed method is computationally efficient (5-6x speedup), estimating multi-person poses with refined bounding boxes in under a second.
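The proposal filtering idea can be sketched as follows; the thresholds and record fields below are hypothetical, chosen only to illustrate rejecting person boxes whose coarse keypoints are unconvincing:

```python
def filter_proposals(proposals, min_keypoints=3, box_thresh=0.5, kp_thresh=0.3):
    """Keep a person box only if its detection score is high enough AND at
    least `min_keypoints` of its coarse keypoints are individually confident."""
    kept = []
    for p in proposals:
        confident_kps = sum(1 for s in p["kp_scores"] if s >= kp_thresh)
        if p["box_score"] >= box_thresh and confident_kps >= min_keypoints:
            kept.append(p)
    return kept

proposals = [
    {"box_score": 0.9, "kp_scores": [0.8, 0.7, 0.6, 0.1]},  # kept
    {"box_score": 0.6, "kp_scores": [0.2, 0.1, 0.1, 0.1]},  # too few confident keypoints
    {"box_score": 0.3, "kp_scores": [0.9, 0.9, 0.9, 0.9]},  # low box score
]
survivors = filter_proposals(proposals)
print(len(survivors))  # 1
```

Only proposals passing this joint test reach the (more expensive) pose refinement sub-net, which is where the reported speedup plausibly comes from.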
ABSTRACT
This paper proposes the Parallel Residual Bi-Fusion Feature Pyramid Network (PRB-FPN) for fast and accurate single-shot object detection. The feature pyramid (FP) is widely used in recent visual detection; however, the top-down pathway of the FP cannot preserve accurate localization due to pooling shift, and its advantage is weakened as deeper backbones with more layers are used. In addition, it cannot maintain accurate detection of both small and large objects at the same time. To address these issues, we propose a new parallel FP structure with bi-directional (top-down and bottom-up) fusion and associated improvements to retain high-quality features for accurate localization. We provide the following design improvements: 1) a parallel bi-fusion FP structure with a bottom-up fusion module (BFM) to detect both small and large objects at once with high accuracy; 2) a concatenation and re-organization (CORE) module that provides a bottom-up pathway for feature fusion, leading to a bi-directional fusion FP that can recover lost information from lower-layer feature maps; 3) further purification of the CORE feature to retain richer contextual information, where such CORE purification in both the top-down and bottom-up pathways can be finished in only a few iterations; 4) the addition of a residual design to CORE, yielding a new Re-CORE module that enables easy training and integration with a wide range of deeper or lighter backbones. The proposed network achieves state-of-the-art performance on the UAVDT17 and MS COCO datasets.
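A minimal single-channel sketch of the bi-directional (top-down then bottom-up) data flow, assuming simple addition in place of the learned CORE concatenation and convolutions, so this illustrates only the fusion topology, not the module itself:

```python
import numpy as np

def upsample2x(f):
    """Nearest-neighbour 2x upsampling of a 2D feature map."""
    return f.repeat(2, axis=0).repeat(2, axis=1)

def downsample2x(f):
    """2x2 average pooling of a 2D feature map (even sizes assumed)."""
    return f.reshape(f.shape[0] // 2, 2, f.shape[1] // 2, 2).mean(axis=(1, 3))

def bidirectional_fuse(c3, c4, c5):
    """Top-down pass injects coarse semantics downward; the bottom-up pass
    carries fine localization detail back upward, so every level sees both."""
    p5 = c5                         # top-down
    p4 = c4 + upsample2x(p5)
    p3 = c3 + upsample2x(p4)
    n3 = p3                         # bottom-up
    n4 = p4 + downsample2x(n3)
    n5 = p5 + downsample2x(n4)
    return n3, n4, n5

c3, c4, c5 = np.ones((16, 16)), np.ones((8, 8)), np.ones((4, 4))
n3, n4, n5 = bidirectional_fuse(c3, c4, c5)
print(n3.shape, n4.shape, n5.shape)  # (16, 16) (8, 8) (4, 4)
```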
ABSTRACT
We propose a fast online video pose estimation method that detects and tracks human upper-body poses based on conditional dynamic Bayesian modeling of pose modes, without referring to future frames. Estimating human body poses from videos is an important task with many applications. Our method extends fast image-based pose estimation to live video streams by leveraging the temporal correlation of articulated poses between frames. Video pose estimation is inferred over a time window using a conditional dynamic Bayesian network (CDBN), which we term the time-windowed CDBN. Specifically, latent pose modes and their transitions are modeled and co-determined from the combination of three modules: 1) inference based on current observations; 2) modeling of mode-to-mode transitions as a probabilistic prior; and 3) modeling of state-to-mode transitions using multimode softmax regression. Given the predicted pose modes, the body poses, in terms of arm joint locations, can then be determined more accurately and robustly. Our method is well suited to high frame rate (HFR) scenarios, where pose mode transitions can effectively capture action-related temporal information to boost performance. We evaluate our method on a newly collected HFR-Pose dataset and four major video pose datasets (VideoPose2, TUM Kitchen, FLIC, and Penn_Action). Our method achieves improvements in both accuracy and efficiency over existing online video pose estimation methods.
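Combining a mode-to-mode transition prior with state-to-mode softmax regression can be sketched as a simple Bayesian update over discrete pose modes; all dimensions, matrices, and the update rule below are synthetic placeholders illustrating the idea, not the paper's exact CDBN:

```python
import numpy as np

def softmax(z):
    z = z - z.max()           # numerical stability
    e = np.exp(z)
    return e / e.sum()

def predict_mode(state, prev_mode, W, T):
    """Posterior over pose modes: state-to-mode evidence (softmax regression
    W @ state) reweighted by the mode-to-mode transition prior T[prev_mode]."""
    likelihood = softmax(W @ state)   # evidence from the current observation
    prior = T[prev_mode]              # temporal prior from the previous mode
    post = likelihood * prior
    return post / post.sum()

rng = np.random.default_rng(1)
n_modes, d = 4, 6
W = rng.normal(size=(n_modes, d))                            # regression weights
T = np.full((n_modes, n_modes), 0.1) + 0.6 * np.eye(n_modes) # "sticky" modes
T /= T.sum(axis=1, keepdims=True)
post = predict_mode(rng.normal(size=d), prev_mode=2, W=W, T=T)
print(post.sum())  # 1.0 — a valid distribution over pose modes
```

At high frame rates the sticky prior dominates between frames, which matches the abstract's observation that mode transitions carry useful temporal information.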
ABSTRACT
Accurate pharmacokinetic (PK) modeling of dynamic contrast-enhanced MRI (DCE-MRI) in prostate cancer (PCa) requires knowledge of the concentration time course of the contrast agent in the feeding vasculature, the so-called arterial input function (AIF). The purpose of this study was to compare AIF choices in differentiating peripheral zone PCa from non-neoplastic prostatic tissue (NNPT), using PK analysis of high-temporal-resolution prostate DCE-MRI data with whole-mount pathology (WMP) validation. This prospective study was performed in 30 patients who underwent multiparametric endorectal prostate MRI at 3.0 T with WMP validation. PCa foci were annotated on WMP slides and MR images using 3D Slicer. Foci ≥0.5 cm³ were contoured as tumor regions of interest (TROIs) on subtraction DCE (early arterial minus pre-contrast) images. PK analyses of TROI and NNPT data were performed using automatic AIF (aAIF) and model AIF (mAIF) methods. A paired t-test compared mean and 90th percentile (p90) PK parameters obtained with the two AIF approaches. Receiver operating characteristic (ROC) analysis determined the diagnostic accuracy (DA) of the PK parameters. Logistic regression determined the correlation between PK parameters and histopathology. Mean TROI and NNPT PK parameters were higher using the aAIF vs. the mAIF (p < 0.05). There was no significant difference in DA between the AIF methods: DA was highest for the p90 volume transfer constant K(trans) (aAIF area under the ROC curve (Az) = 0.827; mAIF Az = 0.93). Tumor cell density correlated with aAIF K(trans) (p = 0.03). Our results indicate that DCE-MRI with both AIF methods is excellent at discriminating PCa from NNPT. If quantitative DCE-MRI is to be used as a biomarker in PCa, the same AIF method should be used consistently throughout the study.
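The diagnostic accuracy reported via ROC analysis reduces to the area under the curve (Az); as a toy illustration with hypothetical K(trans) values (not the study's data), Az can be computed as a rank statistic, i.e., the probability that a random tumor score exceeds a random NNPT score:

```python
def auc(pos, neg):
    """Area under the ROC curve as the fraction of (positive, negative)
    pairs ranked correctly; ties count as half a win."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

ktrans_tumor = [0.45, 0.62, 0.38, 0.71]   # hypothetical TROI K(trans) values
ktrans_nnpt  = [0.12, 0.40, 0.33, 0.15]   # hypothetical NNPT K(trans) values
print(auc(ktrans_tumor, ktrans_nnpt))     # 0.9375
```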
Subject(s)
Gadolinium DTPA/pharmacokinetics; Image Interpretation, Computer-Assisted/methods; Magnetic Resonance Imaging/methods; Models, Biological; Prostatic Neoplasms/metabolism; Prostatic Neoplasms/pathology; Aged; Algorithms; Computer Simulation; Contrast Media/pharmacokinetics; Humans; Image Enhancement/methods; Male; Middle Aged; Reproducibility of Results; Sensitivity and Specificity
ABSTRACT
Pharmacokinetic analysis of dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) time-course data allows estimation of quantitative parameters such as K(trans) (rate constant for plasma/interstitium contrast agent transfer), v(e) (extravascular extracellular volume fraction), and v(p) (plasma volume fraction). A plethora of factors in DCE-MRI data acquisition and analysis can affect the accuracy and precision of these parameters and, consequently, the utility of quantitative DCE-MRI for assessing therapy response. In this multicenter data analysis challenge, DCE-MRI data acquired at one center from 10 patients with breast cancer before and after the first cycle of neoadjuvant chemotherapy were shared and processed with 12 software tools based on the Tofts model (TM), extended TM, and Shutter-Speed model. Inputs of tumor region of interest definition, pre-contrast T1, and arterial input function were controlled to focus on the variations in parameter value and response prediction capability caused by differences in models and associated algorithms. Considerable parameter variations were observed, with within-subject coefficient of variation (wCV) values for K(trans) and v(p) as high as 0.59 and 0.82, respectively. Parameter agreement improved when only algorithms based on the same model were compared; e.g., the K(trans) intraclass correlation coefficient increased to as high as 0.84. Agreement in parameter percentage change was much better than that in absolute parameter value; e.g., the pairwise concordance correlation coefficient improved from 0.047 (for K(trans)) to 0.92 (for K(trans) percentage change) when comparing two TM algorithms. Nearly all algorithms provided good to excellent (univariate logistic regression c-statistic values ranging from 0.8 to 1.0) early prediction of therapy response using the metrics of mean tumor K(trans) and k(ep) (= K(trans)/v(e), the intravasation rate constant) after the first therapy cycle and the corresponding percentage changes. The results suggest that the interalgorithm parameter variations are largely systematic and are not likely to significantly affect the utility of DCE-MRI for assessment of therapy response.
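For reference, the extended Tofts model mentioned above fits the tissue concentration curve C_t(t) against the arterial input function C_p(t) as:

```latex
% Extended Tofts model: tissue contrast agent concentration vs. the AIF C_p(t)
C_t(t) = v_p\,C_p(t) + K^{\mathrm{trans}} \int_0^t C_p(\tau)\,
         e^{-k_{ep}(t-\tau)}\,d\tau,
\qquad k_{ep} = \frac{K^{\mathrm{trans}}}{v_e}
```

Setting v(p) = 0 recovers the standard TM; the Shutter-Speed model additionally accounts for finite intercompartmental water exchange kinetics.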
ABSTRACT
A Klebsiella sp. HE1 strain isolated from hydrogen-producing sewage sludge was examined for its ability to produce H2 and other valuable soluble metabolites (e.g., ethanol and 2,3-butanediol) from a sucrose-based medium. The effects of pH and carbon substrate concentration on the production of soluble and gaseous products were investigated. The major soluble metabolite produced by Klebsiella sp. HE1 was 2,3-butanediol, accounting for 42-58% of the soluble microbial products (SMP); its production efficiency was enhanced by increasing the initial culture pH to 7.3 (without pH control). The HE1 strain also produced ethanol (contributing 29-42% of total SMP) and small amounts of lactic acid and acetic acid. The gaseous products consisted of H2 (25-36%) and CO2 (64-75%). The optimal cumulative hydrogen production (2.7 l) and hydrogen yield (0.92 mol H2 (mol sucrose)^-1) were obtained at an initial sucrose concentration of 30 g COD l^-1 (i.e., 26.7 g l^-1), which also led to the highest production rates for H2 (3.26 mmol h^-1 l^-1), ethanol (6.75 mmol h^-1 l^-1), and 2,3-butanediol (7.14 mmol h^-1 l^-1). The highest yields for H2, ethanol, and 2,3-butanediol were 0.92, 0.81, and 0.59 mol (mol sucrose)^-1, respectively. As for overall energy production performance, the highest energy generation rate was 27.7 kJ h^-1 l^-1 and the best energy yield was 2.45 kJ (mol sucrose)^-1, obtained at sucrose concentrations of 30 and 20 g COD l^-1, respectively.
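The stated equivalence of 30 g COD l^-1 and 26.7 g l^-1 sucrose follows from the stoichiometric oxygen demand of sucrose; a short check (the COD factor is derived here from the balanced oxidation equation, not taken from the paper):

```python
# Complete oxidation of sucrose: C12H22O11 + 12 O2 -> 12 CO2 + 11 H2O,
# so 1 mol sucrose (342.3 g) demands 12 * 32 = 384 g of O2 (its COD).
M_SUCROSE = 342.3                        # g/mol
COD_PER_GRAM = 12 * 32.0 / M_SUCROSE     # ≈ 1.12 g COD per g sucrose

cod_loading = 30.0                       # g COD / l, the optimal loading
sucrose_conc = cod_loading / COD_PER_GRAM
print(round(sucrose_conc, 1))            # 26.7 g/l, matching the stated value
```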