📝 Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation 🔭
"Proposes a method that marries pixel-based and latent-based video diffusion models (VDMs) for text-to-video generation (text-to-image is also feasible)." [gal30b+] 🤖 #CV
⚙️ https://github.com/showlab/Show-1
🔗 https://arxiv.org/abs/2309.15818v1 #arxiv
https://creative.ai/system/media_attachments/files/111/151/723/144/059/573/original/a84058b605be30c3.jpg
https://creative.ai/system/media_attachments/files/111/151/723/200/845/119/original/6645367b0cc461bd.jpg
https://creative.ai/system/media_attachments/files/111/151/723/254/030/357/original/635ac6d6defa52c5.jpg
https://creative.ai/system/media_attachments/files/111/151/723/302/778/793/original/1d137594b610462b.jpg