 📝 Temporal Collection and Distribution for Referring Video Object Segmentation 🔭

"Given a video sequence, the proposed framework simultaneously maintains a global referent token and a sequence of object queries across the frames, where the former is responsible for capturing video-level referent according to the language expression, while the latter serves to better locate and segment objects with each frame." [gal30b+] 🤖 #CV
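To make the quoted idea concrete, here is a rough sketch of how a single global referent token might collect information from per-frame object queries and then distribute it back. This is not the paper's implementation: the function names, the tensor shapes, the use of plain scaled dot-product attention for collection, and the broadcast add for distribution are all assumptions made for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attend(query, keys, values):
    # Scaled dot-product attention: query (m, d), keys/values (n, d) -> (m, d).
    scores = query @ keys.T / np.sqrt(query.shape[-1])
    return softmax(scores, axis=-1) @ values

def collect_and_distribute(referent, object_queries):
    # referent: (d,) hypothetical video-level referent token.
    # object_queries: (T, N, d) hypothetical queries, N per frame over T frames.
    T, N, d = object_queries.shape
    # Collection (assumed form): the referent token attends over every
    # object query in every frame and is updated with a residual.
    flat = object_queries.reshape(T * N, d)
    new_referent = referent + attend(referent[None, :], flat, flat)[0]
    # Distribution (simplified assumption): the updated referent is
    # broadcast-added to each object query in each frame.
    new_queries = object_queries + new_referent
    return new_referent, new_queries

rng = np.random.default_rng(0)
referent = rng.normal(size=8)
queries = rng.normal(size=(4, 5, 8))  # 4 frames, 5 queries each, dim 8
referent, queries = collect_and_distribute(referent, queries)
print(referent.shape, queries.shape)
```

In a real model the collection and distribution steps would use learned projections and would also condition on the language expression; the sketch only shows the direction of information flow between the two kinds of tokens.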

🔗 https://arxiv.org/abs/2309.03473v1 #arxiv

https://creative.ai/system/media_attachments/files/111/031/342/723/287/204/original/11b932ef425b11d5.jpg

https://creative.ai/system/media_attachments/files/111/031/342/782/780/419/original/a079f158e003156d.jpg

https://creative.ai/system/media_attachments/files/111/031/342/844/632/914/original/5012c407119ef14d.jpg

https://creative.ai/system/media_attachments/files/111/031/342/922/210/999/original/6022d922f2835c7a.jpg