2 Comments
Sahar Mor

I found CogVLM to be the most performant multimodal model out of the existing ones.

Also, did you get the chance to try Apple's recently open-sourced Ferret? https://github.com/apple/ml-ferret

Ben Dickson

I haven't played around with Ferret. Is it good?
