1/5 MiniCPM-V 4.6 (1.3B) is now live 🚀🚀
High-res visual processing, optimized for consumer-grade and mobile hardware. We’ve leveraged the latest LLaVA-UHD v4 technique to cut vision encoding costs by 55%, enabling native edge deployment with extreme efficiency.
🔥 Beats Gemma4-E2B-it and Qwen3.5-0.8B across key multimodal and Artificial Analysis benchmarks — scoring higher than Qwen3.5-0.8B using just 2.5% of its token budget.
⚡ TTFT of 75.7 ms, 2.2x faster than Qwen3.5-0.8B, even with 3136² high-res images.
🏗️ ~1.5x the token throughput of Qwen3.5-0.8B on a single RTX 4090.
Try the model here:
🤗 Hugging Face:
huggingface.co/openbmb/MiniC…
💻 GitHub:
github.com/OpenBMB/MiniCPM-V
🔭 Modelscope:
modelscope.cn/models/OpenBMB…
🌐 Web Demo:
huggingface.co/spaces/openbm…
📱 App Demo:
github.com/OpenBMB/MiniCPM-V…