Abstract

VT-MCNet: High-accuracy automatic modulation classification model based on vision transformer
Thien-Thanh Dao, Dae-Il Noh, Mikio Hasegawa, Hiroo Sekiya, and Won-Joo Hwang,
IEEE Communications Letters, vol. 28, no. 1, pp. 98-102, Apr. 2024.

The evolution of cognitive radio networks depends heavily on automatic modulation classification (AMC). However, existing methods struggle to attain high AMC accuracy because they extract features from signals ineffectively. To counter this, we propose a vision-centric approach employing diverse kernel sizes to augment signal feature extraction. In addition, we refine the transformer architecture by incorporating a dual-branch multi-layer perceptron network, enabling diverse pattern learning and enhancing the model’s running speed. Specifically, our architecture allows the system to focus on relevant portions of the input sequence, thereby improving classification accuracy in both high and low signal-to-noise regimes. On the widely recognized DeepSig dataset, our deep model, termed VT-MCNet, outperforms prior leading-edge deep networks in terms of classification accuracy and computational cost. Notably, VT-MCNet reaches a cumulative classification rate of up to 99.24%, while the state-of-the-art method, even with higher computational complexity, achieves only 99.06%.
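
The three ideas named in the abstract (feature extraction with several kernel sizes, attention over the input sequence, and a dual-branch multi-layer perceptron) can be illustrated with a minimal NumPy sketch. This is not the VT-MCNet implementation: the function names, the frame length, the kernel sizes, the patch length, the layer widths, the random weights, and the choice to sum the two MLP branches are all assumptions made purely for illustration.

```python
import numpy as np


def multi_kernel_features(iq, kernel_sizes, rng):
    """Filter each I/Q channel with one random 1-D kernel per kernel size.

    iq: array of shape (2, n_samples). Returns (2 * len(kernel_sizes), n_samples).
    """
    feats = []
    for k in kernel_sizes:
        kernel = rng.standard_normal(k) / np.sqrt(k)
        for channel in iq:
            # mode="same" keeps the output as long as the input signal
            feats.append(np.convolve(channel, kernel, mode="same"))
    return np.stack(feats, axis=0)


def to_tokens(feats, patch_len):
    """Cut the time axis into non-overlapping patches, one token per patch."""
    n_maps, n_samples = feats.shape
    n_tokens = n_samples // patch_len
    patches = feats[:, : n_tokens * patch_len].reshape(n_maps, n_tokens, patch_len)
    return patches.transpose(1, 0, 2).reshape(n_tokens, n_maps * patch_len)


def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract the max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product attention over the token sequence."""
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights


def dual_branch_mlp(tokens, branch_a, branch_b):
    """Two parallel two-layer ReLU MLPs; summing them is an assumption."""
    def mlp(x, w1, w2):
        return np.maximum(x @ w1, 0.0) @ w2
    return mlp(tokens, *branch_a) + mlp(tokens, *branch_b)


rng = np.random.default_rng(0)
iq = rng.standard_normal((2, 128))            # hypothetical I/Q frame
feats = multi_kernel_features(iq, (3, 5, 7), rng)
tokens = to_tokens(feats, patch_len=16)

d_in, d_model, d_hidden = tokens.shape[1], 32, 64
w_q, w_k, w_v = (rng.standard_normal((d_in, d_model)) / np.sqrt(d_in) for _ in range(3))
attended, attn_weights = self_attention(tokens, w_q, w_k, w_v)

branch_a = (rng.standard_normal((d_model, d_hidden)) / np.sqrt(d_model),
            rng.standard_normal((d_hidden, d_model)) / np.sqrt(d_hidden))
branch_b = (rng.standard_normal((d_model, d_hidden)) / np.sqrt(d_model),
            rng.standard_normal((d_hidden, d_model)) / np.sqrt(d_hidden))
out = dual_branch_mlp(attended, branch_a, branch_b)
```

Each row of `attn_weights` is a probability distribution over the tokens, which is the mechanism that lets a transformer weight some portions of the input sequence more than others. A trained model would learn the kernels and weight matrices rather than draw them at random.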