Body-part Tubelet Transformer for Human-Related Crime Classification

Joseph, Ajay Mathew, Ullah, Fath U min (ORCID: 0000-0002-1243-9358) and Talavera, Estefania (2024) Body-part Tubelet Transformer for Human-Related Crime Classification. 2024 IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS). pp. 1-8. ISSN 2643-6205

PDF (AAM) - Accepted Version (2MB)

Official URL: https://doi.org/10.1109/AVSS61716.2024.10672609

Abstract

Detecting human-related crimes in surveillance videos is an increasingly difficult challenge, especially when the human actions involved are relatively similar. In this work, we propose a transformer-based model that introduces an inductive bias through a Tubelet embedder module, implemented as a 3D convolutional layer. The module captures spatiotemporal embeddings from skeletal trajectories extracted from videos. Our experiments on the Human-Related Crime dataset show that tubelet embeddings remain competitive with the state of the art (49% accuracy) while considerably reducing the computational complexity of the model.
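
The tubelet embedding idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the input layout (channels, frames, height, width), the tubelet size, the embedding width, and the function name `tubelet_embed` are all assumptions made for illustration. The sketch relies on the fact that a 3D convolution whose stride equals its kernel size is equivalent to cutting the input into non-overlapping tubelets and applying one shared linear projection to each.

```python
import numpy as np

def tubelet_embed(clip, weight, bias, tubelet=(2, 4, 4)):
    """Hypothetical tubelet embedder (illustrative only).

    clip:   array of shape (C, T, H, W), an assumed input layout
    weight: array of shape (D, C*t*h*w), a flattened 3D conv kernel
    bias:   array of shape (D,)
    Returns one D-dimensional token per non-overlapping tubelet: (N, D).
    """
    c, T, H, W = clip.shape
    t, h, w = tubelet
    if T % t or H % h or W % w:
        raise ValueError("clip dimensions must be divisible by the tubelet size")
    nt, nh, nw = T // t, H // h, W // w
    # Split each axis into (number of tubelets, extent within a tubelet).
    x = clip.reshape(c, nt, t, nh, h, nw, w)
    # Move the tubelet-grid axes to the front, then flatten each tubelet
    # in (C, t, h, w) order so it lines up with the flattened kernel.
    x = x.transpose(1, 3, 5, 0, 2, 4, 6).reshape(nt * nh * nw, c * t * h * w)
    # Shared linear projection, same as a stride-equals-kernel 3D convolution.
    return x @ weight.T + bias

# Usage with made-up sizes: 2 channels, 4 frames, an 8x8 grid, 16-dim tokens.
rng = np.random.default_rng(0)
clip = rng.standard_normal((2, 4, 8, 8))
weight = rng.standard_normal((16, 2 * 2 * 4 * 4))
bias = np.zeros(16)
tokens = tubelet_embed(clip, weight, bias)
```

With these assumed sizes the clip is reduced to 8 tokens of width 16, which would then be passed to the transformer encoder. Fewer and coarser tokens are one plausible reason such an embedder lowers computational cost, although the abstract itself does not detail the mechanism.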
