⚠ This tool appears inactive — no commits in 90+ days. Consider an alternative.
Multimodal · Open Source · ✦ Free Tier

Moondream

Tiny OSS vision language model

11,000 stars · Health: 45 (Slowing) · App Infrastructure

About

A 2B-parameter vision-language model optimized to run on edge devices and single GPUs. It supports image captioning, visual question answering (QA), and object detection, and runs via Ollama or directly from Python.
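As a sketch of the Ollama path mentioned above: Ollama's `/api/generate` endpoint accepts base64-encoded images alongside a text prompt. The helper below (a hypothetical name, not part of any Moondream API) builds such a request payload, assuming a local Ollama server with the `moondream` model pulled.

```python
import base64
import json


def build_ollama_request(prompt, image_path, model="moondream"):
    """Build a JSON payload for Ollama's /api/generate endpoint.

    Ollama accepts base64-encoded images in the `images` field;
    the model name assumes `ollama pull moondream` has been run.
    """
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {
        "model": model,
        "prompt": prompt,
        "images": [image_b64],
        "stream": False,  # return one complete response instead of chunks
    }


# Sending the request (assumes an Ollama server on localhost:11434):
# import urllib.request
# payload = build_ollama_request("Describe this image.", "photo.jpg")
# req = urllib.request.Request(
#     "http://localhost:11434/api/generate",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# print(json.load(urllib.request.urlopen(req))["response"])
```

The network call is left commented out so the payload construction can be inspected without a running server.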

Choose Moondream when…

  • You need a vision model that runs on a single GPU or edge device
  • You want a compact model for image captioning and visual QA
  • Low memory footprint is a hard constraint

Builder Slot

How does your AI see and understand images?
Optional for most stacks

Vision-language models for image understanding, captioning, visual QA, and document parsing

  • Dev Tools: Not applicable
  • App Infra: Optional
  • Hybrid: Optional


Stack Genome Detection

AIchitect's Genome scanner detects Moondream in your project via these signals:

pip packages: `moondream`
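A pip-package signal like this can be approximated by scanning a project's `requirements.txt` for the package name. The function below is a hypothetical sketch of that check, not AIchitect's actual scanner; it strips version specifiers and extras before comparing names.

```python
import re


def detects_moondream(requirements_text):
    """Hypothetical sketch of a pip-package detection signal:
    return True if the requirements text lists the `moondream` package."""
    for line in requirements_text.splitlines():
        # Strip version specifiers, extras, and markers:
        # "moondream==0.0.5" or "moondream[gpu]>=0.1" -> "moondream"
        name = re.split(r"[=<>!~\[;]", line.strip(), maxsplit=1)[0].strip()
        if name.lower() == "moondream":
            return True
    return False
```

A real scanner would also consult lockfiles and `pyproject.toml`, but the name-matching step is the same idea.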

Integrates with (1)

Ollama · LLM Infrastructure


Pricing

✦ Free tier available

Badge

Add to your GitHub README

```markdown
[![Moondream](https://aichitect.dev/badge/tool/moondream)](https://aichitect.dev/tool/moondream)
```

Explore the full AI landscape

See how Moondream fits into the bigger picture — browse all 207 tools and their relationships.
