📝 Reinforcement Learning-Based Mixture of Vision Transformers for Video Violence Recognition 🔭
"The proposed transformer-based Mixture of Experts (MoE) video violence recognition system consists of two main modules: (1) a backbone network and (2) an intelligent router." [gal30b+] 🤖 #CV
🔗 https://arxiv.org/abs/2310.03108v1 #arxiv
https://creative.ai/system/media_attachments/files/111/187/992/126/163/637/original/916b2727e08f4532.jpg
https://creative.ai/system/media_attachments/files/111/187/992/196/317/774/original/ad0df19c5f24a731.jpg
https://creative.ai/system/media_attachments/files/111/187/992/251/302/816/original/d726792826f2e237.jpg
https://creative.ai/system/media_attachments/files/111/187/992/323/054/661/original/1987f19a11b28f6a.jpg