Top OSS multimodal model from OpenGVLab
The InternVL2 series from Shanghai AI Lab is consistently top-ranked on open-source multimodal benchmarks. It is strong at document understanding, chart analysis, and multi-image reasoning.
Vision-language models for image understanding, captioning, visual QA, and document parsing
AIchitect's Genome scanner detects InternVL2 in your project via these signals:
transformers
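The page does not describe how the Genome scanner works internally, so the following is only a minimal sketch of what a dependency-based signal check could look like, assuming the signal is the presence of the `transformers` package in a `requirements.txt`-style file. The `SIGNALS` table and the `detect_signals` function are hypothetical names introduced for illustration, not part of AIchitect.

```python
import re

# Hypothetical signal table: package name -> tool it hints at.
# Only the "transformers" signal comes from the page; the structure is assumed.
SIGNALS = {"transformers": "InternVL2"}


def detect_signals(requirements_text: str) -> set[str]:
    """Return the tools hinted at by package names in requirements.txt-style text."""
    found = set()
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        # Take the leading package name, before any extras or version specifier.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match is None:
            continue  # e.g. option lines such as "-r other.txt"
        name = match.group(0).lower().replace("_", "-")
        if name in SIGNALS:
            found.add(SIGNALS[name])
    return found


if __name__ == "__main__":
    sample = "torch>=2.0\ntransformers==4.37.2  # Hugging Face\n"
    print(detect_signals(sample))
```

Because `transformers` is a general-purpose library used by many models, a match on this signal alone is a weak hint; the sketch matches whole package names so that a package such as `sentence-transformers` does not trigger it.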
Explore the full AI landscape
See how InternVL2 fits into the bigger picture — browse all 207 tools and their relationships.