Results 1 - 15 of 15
1.
J Acoust Soc Am ; 154(1): 5-15, 2023 Jul 01.
Article in English | MEDLINE | ID: mdl-37403993

ABSTRACT

The classification of underwater acoustic signals has garnered a great deal of attention in recent years due to its potential applications in military and civilian contexts. While deep neural networks have emerged as the preferred method for this task, the representation of the signals plays a crucial role in determining the performance of the classification. However, the representation of underwater acoustic signals remains an under-explored area. In addition, the annotation of large-scale datasets for the training of deep networks is a challenging and expensive task. To tackle these challenges, we propose a novel self-supervised representation learning method for underwater acoustic signal classification. Our approach consists of two stages: a pretext learning stage using unlabeled data and a downstream fine-tuning stage using a small amount of labeled data. The pretext learning stage involves randomly masking the log Mel spectrogram and reconstructing the masked part using the Swin Transformer architecture. This allows us to learn a general representation of the acoustic signal. Our method achieves a classification accuracy of 80.22% on the DeepShip dataset, outperforming or matching previous competitive methods. Furthermore, our classification method demonstrates good performance in low signal-to-noise ratio or few-shot settings.
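
The pretext stage described above (mask parts of the log Mel spectrogram, then reconstruct them) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the patch size, the mask ratio, the use of mean squared error, and the function names `random_patch_mask` and `masked_mse` are all assumptions, and the Swin Transformer that would produce the reconstruction is left out.

```python
import random


def random_patch_mask(n_rows, n_cols, patch, ratio, rng):
    """Return a 0/1 grid; 1 marks a masked cell. Whole square patches are masked together.

    patch and ratio are illustrative hyperparameters, not values from the paper.
    """
    patches = [(r, c) for r in range(0, n_rows, patch) for c in range(0, n_cols, patch)]
    n_masked = int(round(ratio * len(patches)))
    mask = [[0] * n_cols for _ in range(n_rows)]
    for r0, c0 in rng.sample(patches, n_masked):
        for r in range(r0, min(r0 + patch, n_rows)):
            for c in range(c0, min(c0 + patch, n_cols)):
                mask[r][c] = 1
    return mask


def masked_mse(target, prediction, mask):
    """Mean squared error computed over the masked cells only (assumed loss)."""
    total, count = 0.0, 0
    for t_row, p_row, m_row in zip(target, prediction, mask):
        for t, p, m in zip(t_row, p_row, m_row):
            if m:
                total += (t - p) ** 2
                count += 1
    return total / count if count else 0.0


# Toy 8 x 8 "spectrogram" (mel bins x time frames) filled with ones.
spec = [[1.0] * 8 for _ in range(8)]
mask = random_patch_mask(8, 8, patch=4, ratio=0.5, rng=random.Random(0))
zeros = [[0.0] * 8 for _ in range(8)]
loss = masked_mse(spec, zeros, mask)
```

Restricting the loss to masked cells forces the encoder to infer hidden content from the visible context, which is the usual motivation for this kind of pretext task.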

2.
Entropy (Basel) ; 23(8)2021 Aug 13.
Article in English | MEDLINE | ID: mdl-34441184

ABSTRACT

Recently, deep reinforcement learning (RL) algorithms have achieved significant progress in the multi-agent domain. However, training for increasingly complex tasks would be time-consuming and resource intensive. To alleviate this problem, efficient leveraging of historical experience is essential, which is under-explored in previous studies because most existing methods fail to achieve this goal in a continuously dynamic system owing to their complicated design. In this paper, we propose a method for knowledge reuse called "KnowRU", which can be easily deployed in the majority of multi-agent reinforcement learning (MARL) algorithms without requiring complicated hand-coded design. We employ the knowledge distillation paradigm to transfer knowledge among agents to shorten the training phase for new tasks while improving the asymptotic performance of agents. To empirically demonstrate the robustness and effectiveness of KnowRU, we perform extensive experiments on state-of-the-art MARL algorithms in collaborative and competitive scenarios. The results show that KnowRU outperforms recently reported methods and not only successfully accelerates the training phase, but also improves the training performance, emphasizing the importance of the proposed knowledge reuse for MARL.
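
The knowledge distillation paradigm mentioned above is commonly expressed as a divergence between a teacher's and a student's output distributions. The sketch below shows that generic loss for two agents' action logits; the temperature value and the function names are assumptions, and it does not reproduce how KnowRU selects teachers or schedules the transfer.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) between temperature-softened action distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical action logits of an experienced agent and a newly initialized one.
same = distillation_loss([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
different = distillation_loss([1.0, 2.0, 3.0], [3.0, 2.0, 1.0])
```

In practice such a term would be added to the student's usual reinforcement learning objective with some weight, which is again a design choice not specified in the abstract.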

3.
J Acoust Soc Am ; 147(6): EL441, 2020 06.
Article in English | MEDLINE | ID: mdl-32611167

ABSTRACT

Understanding the dynamic system that produces speech is essential to advancing speech science, and several simultaneous sensory streams can be leveraged to describe the process. Because the functional deformation of the tongue correlates with the speaker's lip shapes, this paper explores the association between them. The problem is formulated as a sequence-to-sequence learning task, and a deep neural network is trained using unlabeled lip videos to predict an upcoming ultrasound tongue image sequence. Experimental results show that the machine learning model can predict the tongue's motion with satisfactory performance, which demonstrates that the learned neural network can build the association between the two imaging modalities.


Subjects
Lip; Tongue; Lip/diagnostic imaging; Neural Networks, Computer; Speech; Tongue/diagnostic imaging; Ultrasonography
4.
J Acoust Soc Am ; 145(6): EL521, 2019 06.
Article in English | MEDLINE | ID: mdl-31255155

ABSTRACT

Audio tagging aims to infer descriptive labels from audio clips; it is challenging because of limited data and noisy labels. This paper describes a solution to the tagging task. The main contributions are as follows: an ensemble learning framework combines statistical features with the outputs of deep classifiers, with the goal of utilizing complementary information. Moreover, a sample re-weighting strategy is employed to address the noisy-label problem within the framework. The approach achieves a mean average precision of 0.958, outperforming the baseline system by a large margin.


Subjects
Deep Learning; Nerve Net/physiology; Neural Networks, Computer; Skin Neoplasms/physiopathology; Biometry/methods; Humans
5.
Entropy (Basel) ; 21(3)2019 Mar 19.
Article in English | MEDLINE | ID: mdl-33267009

ABSTRACT

In a decentralized multi-robot exploration problem, the robots have to cooperate effectively to map an unknown environment as quickly as possible without a centralized controller. In the past few decades, a set of "human-designed" cooperation strategies have been proposed to address this problem, such as the well-known frontier-based approach. However, many real-world settings, especially those that are constantly changing, are too complex for humans to design efficient decentralized strategies. This paper presents a novel approach, the Attention-based Communication neural network (CommAttn), to "learn" the cooperation strategies automatically in the decentralized multi-robot exploration problem. The communication neural network enables the robots to learn the cooperation strategies with explicit communication. Moreover, the attention mechanism we introduce can precisely calculate whether communication is necessary for each pair of agents by considering the relevance of each received message, which enables the robots to communicate only with the necessary partners. The empirical results on a simulated multi-robot disaster exploration scenario demonstrate that our proposal outperforms the traditional "human-designed" methods, as well as other competing "learning-based" methods, in the exploration task.

6.
Entropy (Basel) ; 21(4)2019 Apr 02.
Article in English | MEDLINE | ID: mdl-33267071

ABSTRACT

Recently, deep learning has achieved state-of-the-art performance in more areas than traditional shallow-architecture machine-learning methods. However, in order to achieve higher accuracy, it is usually necessary to extend the network depth or ensemble the results of different neural networks. Increasing network depth or ensembling different networks increases the demand for memory and computing resources. This leads to difficulties in deploying deep-learning models in resource-constrained scenarios such as drones, mobile phones, and autonomous driving. Improving network performance without expanding the network scale has become a hot research topic. In this paper, we propose a cross-architecture online-distillation approach to solve this problem by transmitting supplementary information between different networks. We use the ensemble method to aggregate networks of different structures, thus forming better teachers than traditional distillation methods. In addition, discontinuous distillation with progressively enhanced constraints is used to replace fixed distillation in order to reduce the loss of information diversity in the distillation process. Our training method improves the distillation effect and achieves strong network-performance improvement. We used some popular models to validate the results. On the CIFAR100 dataset, AlexNet's accuracy was improved by 5.94%, VGG by 2.88%, ResNet by 5.07%, and DenseNet by 1.28%. Extensive experiments were conducted to demonstrate the effectiveness of the proposed method. On the CIFAR10, CIFAR100, and ImageNet datasets, we observed significant improvements over traditional knowledge distillation.
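
One simple way to form an ensemble teacher from networks of different structures is to average their softened class probabilities and train each network against that average. The sketch below assumes this plain averaging rule, a fixed temperature, and a cross-entropy student loss; the abstract does not state the aggregation rule or the schedule of the "progressively enhanced constraints", so none of this should be read as the paper's exact procedure.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_soft_targets(all_logits, temperature=3.0):
    """Average the softened predictions of several networks into one teacher distribution."""
    probs = [softmax(logits, temperature) for logits in all_logits]
    n_models = len(probs)
    n_classes = len(probs[0])
    return [sum(p[k] for p in probs) / n_models for k in range(n_classes)]


def soft_cross_entropy(teacher_probs, student_logits, temperature=3.0):
    """Cross-entropy of one student's softened prediction against the ensemble teacher."""
    q = softmax(student_logits, temperature)
    return -sum(p * math.log(qk) for p, qk in zip(teacher_probs, q) if p > 0)


# Hypothetical logits from two peer networks for a two-class toy problem.
teacher = ensemble_soft_targets([[2.0, 0.0], [0.0, 2.0]])
student_loss = soft_cross_entropy(teacher, [2.0, 0.0])
```

Because the teacher is recomputed from the current peers at every step, no separately pre-trained teacher is needed, which is what distinguishes online distillation from the traditional two-stage setup.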

7.
J Acoust Soc Am ; 141(6): EL531, 2017 06.
Article in English | MEDLINE | ID: mdl-28618815

ABSTRACT

Tongue gestural target classification is of great interest to researchers in the speech production field. Recently, deep convolutional neural networks (CNN) have shown superiority to standard feature extraction techniques in a variety of domains. In this letter, both CNN-based speaker-dependent and speaker-independent tongue gestural target classification experiments are conducted to classify tongue gestures during natural speech production. The CNN-based method achieves state-of-the-art performance, even though no pre-training of the CNN (with the exception of a data augmentation preprocessing) was carried out.


Subjects
Gestures; Neural Networks, Computer; Signal Processing, Computer-Assisted; Speech Acoustics; Tongue/diagnostic imaging; Tongue/physiology; Ultrasonography/methods; Voice Quality; Biomechanical Phenomena; Deep Learning; Female; Humans; Male; Pattern Recognition, Automated
8.
Molecules ; 22(12)2017 Nov 23.
Article in English | MEDLINE | ID: mdl-29168750

ABSTRACT

The automatic detection of diabetic retinopathy is of vital importance, as the disease is the main cause of irreversible vision loss in the working-age population in the developed world. Early detection of diabetic retinopathy can be very helpful for clinical treatment; although several different feature extraction approaches have been proposed, the classification of retinal images is still tedious even for trained clinicians. Recently, deep convolutional neural networks have shown superior performance in image classification compared to previous handcrafted feature-based image classification methods. Thus, in this paper, we explored the use of deep convolutional neural network methodology for the automatic classification of diabetic retinopathy using color fundus images, and obtained an accuracy of 94.5% on our dataset, outperforming the results obtained by using classical approaches.


Subjects
Diabetic Retinopathy/diagnosis; Fluorescein Angiography; Neural Networks, Computer; Algorithms; Fundus Oculi; Humans; Image Processing, Computer-Assisted; Retina/diagnostic imaging; Retina/pathology
9.
J Acoust Soc Am ; 139(5): EL154, 2016 05.
Article in English | MEDLINE | ID: mdl-27250201

ABSTRACT

The feasibility of automatic re-initialization of contour tracking in ultrasound tongue sequences is explored using an image similarity-based method. To this end, the re-initialization method was incorporated into current state-of-the-art tongue tracking algorithms, and a quantitative comparison was made between different algorithms by computing the mean sum of distances error. The results demonstrate that with automatic re-initialization, the tracking error can be reduced from an average of 5-6 pixels to about 4 pixels. This result was obtained by using a large number of hand-labeled frames and similarity measurements to extract the contours.
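
The mean sum of distances error compares a tracked contour with a hand-labeled one. The abstract does not spell out the formula; the sketch below uses one common symmetric definition (average nearest-point distance, taken in both directions), so the exact normalization is an assumption, as is the function name.

```python
import math


def mean_sum_of_distances(contour_u, contour_v):
    """Symmetric mean nearest-point distance between two contours, in pixels.

    Each contour is a list of (x, y) points. For every point, the distance to the
    closest point on the other contour is taken, and all distances are averaged.
    """
    def nearest(point, contour):
        return min(math.dist(point, other) for other in contour)

    total = sum(nearest(u, contour_v) for u in contour_u)
    total += sum(nearest(v, contour_u) for v in contour_v)
    return total / (len(contour_u) + len(contour_v))


# Toy contours: the tracked contour sits 4 pixels below the labeled one.
labeled = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
tracked = [(0.0, 4.0), (1.0, 4.0), (2.0, 4.0)]
error = mean_sum_of_distances(labeled, tracked)
```

Averaging in both directions keeps the measure from rewarding a tracked contour that covers only part of the labeled one.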

10.
Clin Linguist Phon ; 30(3-5): 313-27, 2016.
Article in English | MEDLINE | ID: mdl-26786063

ABSTRACT

A new contour-tracking algorithm is presented for ultrasound tongue image sequences, which can follow the motion of tongue contours over long durations with good robustness. To cope with missing segments caused by noise, or by the tongue midsagittal surface being parallel to the direction of ultrasound wave propagation, active contours with a contour-similarity constraint are introduced, which can be used to provide 'prior' shape information. Also, in order to address the accumulation of tracking errors over long sequences, we present an automatic re-initialization technique based on the complex wavelet image similarity index. Experiments on synthetic data and on real 60 frames-per-second (fps) data from different subjects demonstrate that the proposed method gives good contour tracking for ultrasound image sequences even over durations of minutes, which can be useful in applications such as speech recognition, where very long sequences must be analyzed in their entirety.


Subjects
Algorithms; Tongue/physiology; Ultrasonography; Female; Humans; Male; Models, Biological; Tongue/diagnostic imaging
11.
IEEE Open J Eng Med Biol ; 5: 226-237, 2024.
Article in English | MEDLINE | ID: mdl-38606402

ABSTRACT

Recently, deep learning-based methods have emerged as the preferred approach for ultrasound data analysis. However, these methods often require large-scale annotated datasets for training deep models, which are not readily available in practical scenarios. Additionally, the presence of speckle noise and other imaging artifacts can introduce numerous hard examples for ultrasound data classification. In this paper, drawing inspiration from self-supervised learning techniques, we present a pre-training method based on mask modeling specifically designed for ultrasound data. Our study investigates three different mask modeling strategies: random masking, vertical masking, and horizontal masking. By employing these strategies, our pre-training approach aims to predict the masked portion of the ultrasound images. Notably, our method does not rely on externally labeled data, allowing us to extract representative features without the need for human annotation. Consequently, we can leverage unlabeled datasets for pre-training. Furthermore, to address the challenges posed by hard samples in ultrasound data, we propose a novel hard sample mining strategy. To evaluate the effectiveness of our proposed method, we conduct experiments on two datasets. The experimental results demonstrate that our approach outperforms other state-of-the-art methods in ultrasound image classification. This indicates the superiority of our pre-training method and its ability to extract discriminative features from ultrasound data, even in the presence of hard examples.
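
The three mask modeling strategies named above can be illustrated with a small mask generator. This is a sketch under assumptions: "vertical" is read as hiding whole image columns and "horizontal" as hiding whole rows, masking is done per pixel rather than per patch, and the mask ratio and the function name `make_mask` are hypothetical.

```python
import random


def make_mask(height, width, strategy, ratio, rng):
    """Return a 0/1 grid where 1 marks a pixel hidden from the model.

    strategy: "random" hides individual cells, "vertical" hides whole columns,
    "horizontal" hides whole rows (an assumed reading of the strategy names).
    """
    mask = [[0] * width for _ in range(height)]
    if strategy == "random":
        cells = [(r, c) for r in range(height) for c in range(width)]
        for r, c in rng.sample(cells, int(round(ratio * len(cells)))):
            mask[r][c] = 1
    elif strategy == "vertical":
        for c in rng.sample(range(width), int(round(ratio * width))):
            for r in range(height):
                mask[r][c] = 1
    elif strategy == "horizontal":
        for r in rng.sample(range(height), int(round(ratio * height))):
            mask[r] = [1] * width
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return mask


rng = random.Random(1)
masks = {name: make_mask(4, 8, name, 0.5, rng) for name in ("random", "vertical", "horizontal")}
```

With equal ratios, all three strategies hide the same number of pixels; they differ only in how the hidden region is laid out, which changes how much neighboring context remains for the prediction.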

12.
Cancer Manag Res ; 14: 51-65, 2022.
Article in English | MEDLINE | ID: mdl-35018121

ABSTRACT

OBJECTIVE: To develop an approach for automatically analyzing bone metastases (BMs) on bone scintigrams based on deep learning technology. METHODS: This research included a bone scan classification model, a regional segmentation model, an assessment model for tumor burden, and a diagnostic report generation model. Two hundred eighty patients with BMs and 341 patients without BMs were included. Eighty percent of the cases were randomly drawn from the two groups as the training set; the remaining cases formed the testing set. Deep residual convolutional neural networks with different structures were used to determine whether metastatic bone lesions existed, and the regions of the lesions were automatically segmented. The bone scan tumor burden index (BSTBI) was calculated; finally, a diagnostic report could be automatically generated. The sensitivity, specificity, and accuracy of the classification model were compared with those of three physicians with different levels of clinical experience. The Dice coefficient was used to evaluate the segmentation model and was compared with the result of the nnU-Net model. The correlation between BSTBI and blood alkaline phosphatase (ALP) level was analyzed to verify the efficiency of BSTBI. The performance of the report generation model was evaluated by the accuracy of its report interpretation. RESULTS: In the testing set, the sensitivity, specificity, and accuracy of the classification model were 92.59%, 85.51%, and 88.62%, respectively. The accuracy showed no statistically significant difference from that of the moderately experienced and experienced physicians and clearly outperformed that of the inexperienced physician. The Dice coefficient for the BMs area was 0.7387 in the segmentation stage. Based on the whole model framework, our segmentation model outperformed nnU-Net. The BSTBI value changed as the BMs changed. There was a positive correlation between BSTBI and ALP level. The accuracy of the report generation model was 78.05%.
CONCLUSION: Automatic analysis frameworks for BMs based on deep learning can accurately identify BMs and preliminarily realize a fully automatic analysis process from raw data to report generation. BSTBI can be used as a quantitative evaluation indicator to assess the effect of therapy on BMs in different patients or in the same patient before and after treatment.
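
The evaluation measures used in this abstract have standard definitions, sketched below for binary labels and binary segmentation masks. The toy labels and masks are made up for illustration and are not data from the study; the function names are likewise hypothetical.

```python
def classification_metrics(y_true, y_pred):
    """Sensitivity, specificity, and accuracy for binary labels (1 = metastasis present)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2|A and B| / (|A| + |B|) for flat 0/1 masks of equal length."""
    intersection = sum(a * b for a, b in zip(mask_a, mask_b))
    size = sum(mask_a) + sum(mask_b)
    return 2.0 * intersection / size if size else 1.0


# Made-up toy data: seven scans and a four-pixel segmentation mask.
metrics = classification_metrics([1, 1, 1, 0, 0, 0, 0], [1, 1, 0, 0, 0, 0, 1])
overlap = dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0])
```

Sensitivity and specificity are reported separately because accuracy alone can hide poor performance on the smaller class when the two groups differ in size.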

13.
J Imaging ; 8(8)2022 Jul 31.
Article in English | MEDLINE | ID: mdl-36005456

ABSTRACT

Breast cancer is the most common malignancy in women worldwide, and is responsible for more than half a million deaths each year. The appropriate therapy depends on the evaluation of the expression of various biomarkers, such as the human epidermal growth factor receptor 2 (HER2) transmembrane protein, through specialized techniques, such as immunohistochemistry or in situ hybridization. In this work, we present the HER2 on hematoxylin and eosin (HEROHE) challenge, a parallel event of the 16th European Congress on Digital Pathology, which aimed to predict the HER2 status in breast cancer based only on hematoxylin-eosin-stained tissue samples, thus avoiding specialized techniques. The challenge consisted of a large, annotated, whole-slide images dataset (509), specifically collected for the challenge. Models for predicting HER2 status were presented by 21 teams worldwide. The best-performing models are presented by detailing the network architectures and key parameters. Methods are compared and approaches, core methodologies, and software choices contrasted. Different evaluation metrics are discussed, as well as the performance of the presented models for each of these metrics. Potential differences in ranking that would result from different choices of evaluation metrics highlight the need for careful consideration at the time of their selection, as the results show that some metrics may misrepresent the true potential of a model to solve the problem for which it was developed. The HEROHE dataset remains publicly available to promote advances in the field of computational pathology.

14.
J Pathol Inform ; 13: 100149, 2022.
Article in English | MEDLINE | ID: mdl-36605109

ABSTRACT

The French Society of Pathology (SFP) organized its first data challenge in 2020 with the help of the Health Data Hub (HDH). The organization of this event first consisted of collecting nearly 5000 cervical biopsy slides obtained from 20 pathology centers. After confirming that patients had not objected to the inclusion of their slides in the project, the slides were anonymized, digitized, and annotated by expert pathologists, and finally uploaded to a data challenge platform for competitors from around the world. Competing teams had to develop algorithms that could distinguish 4 diagnostic classes in cervical epithelial lesions. Among the many submissions from competitors, the best algorithms achieved an overall score close to 95%. The final part of the competition lasted only 6 weeks, and the goal of SFP and HDH is now to allow the collection to be published in open access for the scientific community. In this report, we have performed a "post-competition analysis" of the results. We first described the algorithmic pipelines of 3 top competitors. We then analyzed several difficult cases that even the top competitors could not predict correctly. A medical committee of several expert pathologists looked for possible explanations for these erroneous results by reviewing the images, and we present their findings here for a broad audience of pathologists and data scientists in the field of digital pathology.

15.
IEEE Trans Med Imaging ; 40(12): 3413-3423, 2021 12.
Article in English | MEDLINE | ID: mdl-34086562

ABSTRACT

Detecting various types of cells in and around the tumor matrix holds a special significance in characterizing the tumor micro-environment for cancer prognostication and research. Automating the tasks of detecting, segmenting, and classifying nuclei can free up pathologists' time for higher-value tasks and reduce errors due to fatigue and subjectivity. To encourage the computer vision research community to develop and test algorithms for these tasks, we prepared a large and diverse dataset of nucleus boundary annotations and class labels. The dataset has over 46,000 nuclei from 37 hospitals, 71 patients, four organs, and four nucleus types. We also organized a challenge around this dataset as a satellite event at the International Symposium on Biomedical Imaging (ISBI) in April 2020. The challenge saw wide participation from across the world, and the top methods were able to match inter-human concordance for the challenge metric. In this paper, we summarize the dataset and the key findings of the challenge, including the commonalities and differences between the methods developed by various participants. We have released the MoNuSAC2020 dataset to the public.


Subjects
Algorithms; Cell Nucleus; Humans; Image Processing, Computer-Assisted