1.
Sensors (Basel) ; 24(14), 2024 Jul 20.
Article in English | MEDLINE | ID: mdl-39066111

ABSTRACT

In air traffic control (ATC), speech communication over radio is the primary way for the controller and the pilot to exchange information. As a result, the integration of automatic speech recognition (ASR) systems holds considerable potential for reducing controllers' workload and plays a crucial role in various ATC scenarios, which makes it particularly significant for ATC research. This article reviews the applications of ASR technology in the ATC communication system. First, it gives an overview of current research, including ATC corpora, ASR models, evaluation measures and application scenarios. Second, it proposes a more comprehensive and accurate evaluation methodology tailored for ATC, considering advancements in communication sensing systems and deep learning techniques. This methodology helps researchers enhance ASR systems and improve the overall performance of ATC systems. Finally, future research recommendations are identified based on the primary challenges and issues. The authors hope this work will serve as a clear technical roadmap for ASR endeavors within the ATC domain and make a valuable contribution to the research community.

2.
Sci Rep ; 14(1): 9791, 2024 Apr 29.
Article in English | MEDLINE | ID: mdl-38684909

ABSTRACT

In air traffic control (ATC), Key Information Recognition (KIR) of ATC instructions plays a pivotal role in automation. The field's specialized nature has led to a scarcity of related research and a gap with the industry's cutting-edge developments. To address this, an end-to-end deep learning framework, Small Sample Learning for Key Information Recognition (SLKIR), is introduced for enhancing KIR in ATC instructions. SLKIR incorporates a novel Multi-Head Local Lexical Association Attention (MHLA) mechanism, designed to identify the boundary words of key information more accurately by capturing their latent representations. The framework also includes a prompt-focused task intended to strengthen the core network's semantic comprehension of ATC instructions. To overcome the category imbalance in the boundary word and prompt discrimination tasks, tailored loss function optimization strategies are implemented, which speed up learning and boost recognition accuracy. The framework's efficacy and adaptability are demonstrated through experiments on two distinct ATC instruction datasets. Notably, SLKIR outperforms the leading baseline model, W2NER, achieving a 3.65% increase in F1 score on the commercial flight dataset and a 12.8% increase on the training flight dataset. This study is the first to apply small-sample learning to KIR for ATC, and the source code of SLKIR will be available at: https://github.com/PANPANKK/ATC_KIR.
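
The abstract reports F1 gains but does not say how F1 is computed. A common convention for this kind of task is exact-match F1 over labeled spans, and the sketch below assumes that convention. The function name `span_f1`, the label names, and the example instruction are hypothetical and are not taken from the SLKIR source code.

```python
from typing import Set, Tuple

# (start token index, end token index inclusive, label) -- illustrative format
Span = Tuple[int, int, str]


def span_f1(gold: Set[Span], pred: Set[Span]) -> float:
    """Exact-match F1: a predicted span counts only if both its
    boundaries and its label equal those of a gold span."""
    if not gold and not pred:
        return 1.0  # nothing to find, nothing predicted
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical instruction tokens: ["CCA1234", "climb", "to", "8400", "meters"]
gold = {(0, 0, "CALLSIGN"), (1, 1, "ACTION"), (3, 4, "ALTITUDE")}
# The prediction misses the right boundary word of the altitude span.
pred = {(0, 0, "CALLSIGN"), (1, 1, "ACTION"), (3, 3, "ALTITUDE")}

score = span_f1(gold, pred)
print(f"span-level F1 = {score:.3f}")
```

Under exact matching, a single wrong boundary word turns an otherwise correct span into both a false positive and a false negative, which is one reason boundary-word accuracy matters for this metric.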

3.
Front Neurorobot ; 18: 1360094, 2024.
Article in English | MEDLINE | ID: mdl-38505326

ABSTRACT

Introduction: Enhancing the generalization and reliability of speech recognition models in the field of air traffic control (ATC) is a challenging task. ATC speech data is limited in storage, difficult to acquire, and costly to label, which may result in data sample bias and class imbalance and, in turn, in uncertain and inaccurate speech recognition results. This study investigates a method for assessing the quality of ATC speech based on accents. Different combinations of data quality categories are selected according to the requirements of different model application scenarios to address these issues.
Methods: The impact of accents on the performance of speech recognition models is analyzed, and a fusion feature phoneme recognition model based on prior text information is constructed to identify the phonemes uttered by speakers. The model consists of an audio encoding module, a prior text encoding module, a feature fusion module, and fully connected layers. It takes speech and its corresponding prior text as input and outputs a predicted phoneme sequence of the speech. Accented speech is recognized as phonemes that do not match the phoneme sequence transcribed from the actual speech text, so accents in ATC communication are quantified by calculating the differences between the recognized phoneme sequence and the transcribed phoneme sequence. Additionally, speech with different levels of accent is fed into different types of speech recognition models to analyze and compare their recognition accuracy.
Results: Experimental results show that, under the same experimental conditions, different levels of accents affect speech recognition accuracy in ATC communication by up to 26.37%.
Discussion: This further demonstrates that accents affect the accuracy of speech recognition models in ATC communication and can be considered as one of the metrics for evaluating the quality of ATC speech.
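
The abstract says accents are quantified from the differences between the recognized and the transcribed phoneme sequences, but it does not name the difference measure. One plausible choice is a length-normalized Levenshtein (edit) distance, sketched below as an assumption. The function names and the ARPAbet-style phoneme symbols are illustrative only.

```python
from typing import List, Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two phoneme sequences
    (unit cost for substitution, insertion, and deletion)."""
    prev: List[int] = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # phoneme in ref missing from hyp
            insertion = curr[j - 1] + 1  # extra phoneme in hyp
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def phoneme_mismatch_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalized by the reference length; used here as
    an assumed stand-in for an accent score."""
    if not ref:
        raise ValueError("reference phoneme sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Phonemes transcribed from the text "three" versus what a phoneme
# recognizer might output for an accented utterance (hypothetical).
transcribed = ["TH", "R", "IY"]
recognized = ["T", "R", "IY"]

rate = phoneme_mismatch_rate(transcribed, recognized)
print(f"mismatch rate = {rate:.3f}")
```

Utterances could then be binned into accent levels by thresholding this rate. The thresholds would have to be chosen empirically, since the abstract does not give them.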

4.
Front Neurorobot ; 17: 1285831, 2023.
Article in English | MEDLINE | ID: mdl-37885770

ABSTRACT

Using computers to replace pilot seats in air traffic control (ATC) simulators is an effective way to improve controller training efficiency and reduce training costs. To achieve this, we propose a deep reinforcement learning model, RoBERTa-RL (RoBERTa with Reinforcement Learning), for generating pilot repetitions. RoBERTa-RL is based on the pre-trained language model RoBERTa and is optimized through transfer learning and reinforcement learning. Transfer learning is used to address the issue of scarce data in the ATC domain, while reinforcement learning algorithms are employed to optimize the RoBERTa model and overcome the limitations in model generalization caused by transfer learning. We selected a real-world area control dataset as the training and testing dataset for the target task, and a tower control dataset generated from civil aviation radio land-air communication rules as the test dataset for evaluating model generalization. On the area control dataset, RoBERTa-RL achieved ROUGE-1, ROUGE-2, and ROUGE-L scores of 0.9962, 0.992, and 0.996, respectively. On the tower control dataset, the scores were 0.982, 0.954, and 0.982, respectively. To overcome the limitations of ROUGE in this field, we conducted a detailed evaluation of the proposed model architecture using keyword-based evaluation criteria for the generated repetition instructions. These criteria calculate various keyword-based metrics from the segmented repetition instruction text. Under the keyword-based criteria, the model achieved an overall accuracy of 98.8% on the area control dataset and 81.8% on the tower control dataset. In terms of generalization, RoBERTa-RL improved accuracy by 56% compared to the model before improvement and by 47.5% compared to various comparative models. These results indicate that employing reinforcement learning strategies to enhance deep learning algorithms can effectively mitigate poor generalization in text generation tasks, and this approach holds promise for future application in other related domains.
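
For readers unfamiliar with the ROUGE scores quoted above, the following is a minimal sketch of ROUGE-N and ROUGE-L as F1 measures over whitespace tokens. It is a simplified illustration, not the evaluation code used in the study, and the example instruction and repetition are hypothetical.

```python
from collections import Counter
from typing import List


def _f1(matches: int, ref_len: int, cand_len: int) -> float:
    """Harmonic mean of precision and recall from raw match counts."""
    if matches == 0:
        return 0.0
    precision = matches / cand_len
    recall = matches / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_n(reference: List[str], candidate: List[str], n: int = 1) -> float:
    """ROUGE-N F1: clipped n-gram overlap between reference and candidate."""
    ref = Counter(tuple(reference[i:i + n]) for i in range(len(reference) - n + 1))
    cand = Counter(tuple(candidate[i:i + n]) for i in range(len(candidate) - n + 1))
    overlap = sum((ref & cand).values())  # Counter & keeps the minimum counts
    return _f1(overlap, sum(ref.values()), sum(cand.values()))


def lcs_length(a: List[str], b: List[str]) -> int:
    """Length of the longest common subsequence (dynamic programming)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(reference: List[str], candidate: List[str]) -> float:
    """ROUGE-L F1 based on the longest common subsequence."""
    return _f1(lcs_length(reference, candidate), len(reference), len(candidate))


# Hypothetical expected repetition versus a generated one.
reference = "climb to 8400 meters CCA1234".split()
candidate = "climb 8400 meters CCA1234".split()

r1 = rouge_n(reference, candidate, 1)
r2 = rouge_n(reference, candidate, 2)
rl = rouge_l(reference, candidate)
print(f"ROUGE-1={r1:.4f} ROUGE-2={r2:.4f} ROUGE-L={rl:.4f}")
```

In this sketch, a repetition that reads back the wrong altitude, such as "climb to 8900 meters CCA1234", would still score a ROUGE-1 of 0.8 against the reference, even though the single wrong token is the safety-critical one. This illustrates the kind of limitation that keyword-based criteria are meant to catch.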
