📝 End-to-End (Instance)-Image Goal Navigation Through Correspondence as an Emergent Phenomenon 🔭

"The first stage pretext task is cross-view completion, where the model is trained to take an RGB-D observation from view A (source view) and complete the depth from a different view B (target view)." [gal30b+]

🤖 #CV
⚙️ https://github.com/naver/croco
🔗 https://arxiv.org/abs/2309.16634v1 #arxiv

https://creative.ai/system/media_attachments/files/111/160/135/958/555/129/original/99e5ff621492e9e4.jpg
https://creative.ai/system/media_attachments/files/111/160/136/013/789/567/original/94e1d8e61b84df21.jpg
https://creative.ai/system/media_attachments/files/111/160/136/075/645/168/original/0ca32a39ee60155b.jpg
https://creative.ai/system/media_attachments/files/111/160/136/124/352/752/original/052307eba1d1122e.jpg