Showing 2 results for Classification.
Dalila Yessad
Volume 20, Issue 4 (11-2024)
Abstract
This paper introduces the CTDRCepstrum, a novel feature extraction technique designed to differentiate human activities using Doppler radar classification. Real data were collected from a Doppler radar system, capturing nine return echoes while monitoring three distinct human activities: walking, fast walking, and running. These activities were performed by three subjects, either individually or in pairs. We focus on analyzing the Doppler signatures using time-frequency reassignment, emphasizing advantages such as improved component separability. The proposed CTDRCepstrum explores different window functions, transforming each echo signal into three reassigned forms of the Short-Time Fourier Transform (STFT): the time-reassigned STFT (TSTFT), the time-derivative STFT (TDSTFT), and the reassigned STFT (RSTFT). A convolutional neural network (CNN) model was then trained on a feature vector formed by combining the cepstral analysis results of the three reassigned forms. Experimental results demonstrate the effectiveness of the proposed method, achieving a classification accuracy of 99.83% when the Bartlett-Hanning window is used to extract key features from real-time Doppler radar data of moving targets.
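The abstract gives no code, but the window and cepstral steps are standard signal processing. The Python sketch below illustrates them: an STFT of a Doppler echo using SciPy's built-in Bartlett-Hanning ('barthann') window, followed by a real cepstrum of each frame. The reassignment step itself is omitted, and the function name, parameter values, and simulated signal are illustrative assumptions, not the authors' implementation.

    import numpy as np
    from scipy import signal

    def cepstral_features(echo, fs, nperseg=256, n_ceps=32):
        """Sketch only: real cepstrum of each STFT frame of a Doppler echo.
        Names and defaults are illustrative, not the paper's code."""
        # STFT with a Bartlett-Hanning ('barthann') window
        f, t, Z = signal.stft(echo, fs=fs, window='barthann', nperseg=nperseg)
        # Log-magnitude spectrum; a small epsilon avoids log(0)
        log_mag = np.log(np.abs(Z) + 1e-12)
        # Real cepstrum along the frequency axis of each frame;
        # keep the first n_ceps quefrency coefficients as features
        ceps = np.fft.irfft(log_mag, axis=0)
        return ceps[:n_ceps, :]

    # Example: a simulated 80 Hz Doppler tone sampled at 1 kHz
    fs = 1000
    t = np.arange(0, 1.0, 1 / fs)
    echo = np.cos(2 * np.pi * 80 * t) + 0.1 * np.random.randn(t.size)
    feats = cepstral_features(echo, fs)   # shape: (n_ceps, n_frames)

In the paper's pipeline, cepstra computed from the three reassigned STFT forms would be combined into one feature vector per echo before being passed to the CNN.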
Manh-Hung Ha, Duc-Chinh Nguyen, Thai-Kim Dinh, Tran Tien-Tam, Do Tien Thanh, Oscal Tzyh-Chiang Chen
Volume 22, Issue 1 (3-2026)
Abstract
This paper develops a robust and efficient method for the classification of Vietnamese Sign Language gestures. The study leverages deep learning techniques, specifically a Graph Convolutional Network (GCN), to analyze hand skeletal points for gesture recognition. A custom Vietnamese Sign Language dataset (ViSL) of 33 characters and numbers was built, and experiments were conducted to validate the model's performance and compare it with existing architectures. The proposed approach integrates multiple streams of GCN, based on the lightweight MobileNet architecture. The custom dataset is preprocessed with Mediapipe to extract key skeletal points, which form the input to the multi-stream GCN. Experiments were conducted to evaluate the proposed model's accuracy against traditional architectures such as VGG and ViT. The experimental results highlight the proposed model's superior performance: it achieves a test accuracy of 99.94% on the custom ViSL dataset and accuracies of 0.993 and 0.994 on the American Sign Language (ASL) and ASL MNIST datasets, respectively. The multi-stream GCN approach significantly outperformed traditional architectures in terms of both accuracy and computational efficiency. This study demonstrates the effectiveness of multi-stream GCNs based on MobileNet for ViSL recognition, showcasing their potential for real-world applications.
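The abstract describes Mediapipe skeletal extraction feeding a GCN but includes no code. The Python sketch below shows one plausible way to assemble that input: the Mediapipe Hands API and its HAND_CONNECTIONS bone list are real, while hand_skeleton, gcn_layer, the single-hand setting, and all dimensions are illustrative assumptions; the authors' multi-stream, MobileNet-based design is not reproduced here.

    import numpy as np
    import cv2
    import mediapipe as mp

    mp_hands = mp.solutions.hands

    def hand_skeleton(image_bgr):
        """Sketch: extract the 21 Mediapipe hand landmarks as a (21, 3)
        array of normalized (x, y, z) coordinates; None if no hand found."""
        with mp_hands.Hands(static_image_mode=True, max_num_hands=1) as hands:
            results = hands.process(cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB))
        if not results.multi_hand_landmarks:
            return None
        lm = results.multi_hand_landmarks[0].landmark
        return np.array([[p.x, p.y, p.z] for p in lm], dtype=np.float32)

    # Mediapipe's bone connections define the graph adjacency a GCN operates on.
    A = np.zeros((21, 21), dtype=np.float32)
    for i, j in mp_hands.HAND_CONNECTIONS:
        A[i, j] = A[j, i] = 1.0
    np.fill_diagonal(A, 1.0)  # self-loops, as in the standard GCN propagation rule

    def gcn_layer(H, A, W):
        """One graph-convolution step: ReLU(D^-1/2 A D^-1/2 H W)."""
        d = A.sum(axis=1)
        A_norm = A / np.sqrt(np.outer(d, d))
        return np.maximum(A_norm @ H @ W, 0.0)

A multi-stream variant, as in the paper, would run several such graph streams through parallel GCN branches and fuse their outputs before classification.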