📝 HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption 🔭
"$\textit{CCEval}$ is a GPT-4 assisted method that can assess the detailed captioning capability of large vision-language model (LVLM)." [gal30b+] 🤖 #CV
⚙️ https://github.com/haotian-liu/LLaVA
🔗 https://arxiv.org/abs/2310.01779v1 #arxiv
https://creative.ai/system/media_attachments/files/111/178/437/816/986/283/original/21dc7386dc43cc55.jpg
https://creative.ai/system/media_attachments/files/111/178/437/871/851/904/original/b3d8ec44a43d93f1.jpg