📝 AV-MaskEnhancer: Enhancing Video Representations Through Audio-Visual Masked Autoencoder 🔭 "AV-MaskEnhancer is an approach that combines the visual and audio information to learn more effective high-quality video representation by mask autoencoders (MAE)." [gal30b+] 🤖 #CV #MM 🔗 https://arxiv.org/abs/2309.08738v1 #arxiv https://creative.ai/system/media_attachments/files/111/094/509/041/616/169/original/f2428ee83eab6b0c.jpg https://creative.ai/system/media_attachments/files/111/094/509/102/233/015/original/dcbeefd06a4e553d.jpg https://creative.ai/system/media_attachments/files/111/094/509/176/727/635/original/92f714df9928d373.jpg