📝 Exploiting Modality-Specific Features for Multi-Modal Manipulation Detection and Grounding 🔭
"Designs a novel framework that explores modality-specific features while preserving the capability for multi-modal alignment by introducing visual/language pre-trained encoders and dual-branch cross-attention." [gal30b+] 🤖 #CV
🔗 https://arxiv.org/abs/2309.12657v1 #arxiv
https://creative.ai/system/media_attachments/files/111/129/406/846/171/904/original/4cb94cb54069b0dd.jpg
https://creative.ai/system/media_attachments/files/111/129/406/899/702/280/original/0b0c3a6d488ef5d6.jpg