 📝 InstructDiffusion: A Generalist Modeling Interface for Vision Tasks 🔭

"Formulates human instructions to a pixel prediction task, where an InstructDiffusion model is trained to predict pixels according to user instructions, such as encircling the man's left shoulder in red or applying a blue mask to the left car." [gal30b+] 🤖 #CV

🔗 https://arxiv.org/abs/2309.03895v1 #arxiv

https://creative.ai/system/media_attachments/files/111/034/174/024/504/315/original/14090293f7bd3e8b.jpg

https://creative.ai/system/media_attachments/files/111/034/174/126/303/393/original/2e534adef4024c3c.jpg

https://creative.ai/system/media_attachments/files/111/034/174/182/459/235/original/e0b71e57dff1796a.jpg

https://creative.ai/system/media_attachments/files/111/034/174/239/131/280/original/8b09709e3300955c.jpg