Discussion about this post

Sahar Mor

I found CogVLM to be the most performant of the existing multimodal models.

Also, did you get the chance to try Apple's recently open-sourced Ferret? https://github.com/apple/ml-ferret
