Joseph, Ajay Mathew, Ullah, Fath U min ORCID: 0000-0002-1243-9358 and Talavera, Estefania (2024) Body-part Tubelet Transformer for Human-Related Crime Classification. 2024 IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). pp. 1-8. ISSN 2643-6205
Official URL: https://doi.org/10.1109/AVSS61716.2024.10672609
Abstract
Detecting human-related crimes in surveillance videos is increasingly challenging, especially when the human actions involved are relatively similar. In this work, we propose a transformer-based model that induces bias through the incorporation of a Tubelet embedder module, a 3D convolutional layer. The aim is to capture spatiotemporal embeddings from skeletal trajectories extracted from videos using 3D convolutional operations. Our experiments are conducted on the Human-Related Crime dataset, revealing that the use of tubelet embeddings maintains performance competitive with the state of the art (49% accuracy), while considerably reducing the computational complexity of the model.
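
As an illustration of the tubelet embedding idea mentioned in the abstract, the following is a minimal NumPy sketch and not the authors' implementation. It assumes a generic input tensor laid out as (channels, frames, height, width) and non-overlapping tubelets, in which case a 3D convolution whose stride equals its kernel size reduces to cutting the input into blocks and applying one shared linear projection. The function name `tubelet_embed`, the tensor layout, and all sizes are hypothetical; how the paper arranges skeletal trajectories before the 3D convolution is not stated here.

```python
import numpy as np


def tubelet_embed(x, weight, bias):
    """Project non-overlapping spatiotemporal blocks (tubelets) to tokens.

    x:      (C, T, H, W) input tensor (assumed layout)
    weight: (D, C, t, h, w) kernel, as in a 3D convolution
    bias:   (D,)
    Returns an (N, D) array with N = (T//t) * (H//h) * (W//w) tokens.
    """
    C, T, H, W = x.shape
    D, Cw, t, h, w = weight.shape
    assert C == Cw, "channel mismatch between input and kernel"
    assert T % t == 0 and H % h == 0 and W % w == 0, "input must tile evenly"
    nt, nh, nw = T // t, H // h, W // w

    # Split every axis into (block index, offset inside block).
    blocks = x.reshape(C, nt, t, nh, h, nw, w)
    # Bring block indices to the front: (nt, nh, nw, C, t, h, w).
    blocks = blocks.transpose(1, 3, 5, 0, 2, 4, 6)
    # One flattened row per tubelet.
    blocks = blocks.reshape(nt * nh * nw, C * t * h * w)

    # Shared linear projection, equivalent to a conv with stride == kernel size.
    return blocks @ weight.reshape(D, -1).T + bias


rng = np.random.default_rng(0)
x = rng.standard_normal((3, 16, 8, 8))           # hypothetical input
weight = rng.standard_normal((64, 3, 4, 4, 4))   # 64-dim tokens, 4x4x4 tubelets
bias = np.zeros(64)
tokens = tubelet_embed(x, weight, bias)          # shape (16, 64)
```

Because each token summarizes a whole tubelet rather than a single frame patch, the sequence passed to the transformer is shorter, which is one way such an embedding can reduce computational cost.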