LLM Infrastructure · Open Source · ✦ Free Tier

llama.cpp

C++ LLM inference for local and edge deployment

68,000 stars · Health 80 · Active · Dev Productivity & App Infrastructure

About

Highly optimized C++ inference engine for running quantized LLMs on CPU and GPU. The foundation for Ollama and many local AI tools.
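
For a concrete sense of what that looks like in practice, here is a minimal sketch of local inference through the llama-cpp-python bindings (the same pip package the Genome scanner looks for below). The model path, context size, and thread count are placeholders, not recommendations.

```python
# Minimal sketch, assuming llama-cpp-python is installed
# (pip install llama-cpp-python) and a quantized GGUF model has
# already been downloaded; the model path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/llama-3-8b-instruct.Q4_K_M.gguf",  # placeholder
    n_ctx=4096,    # context window
    n_threads=8,   # CPU threads; tune for your hardware
)

out = llm(
    "Q: What is llama.cpp used for? A:",
    max_tokens=64,
    stop=["Q:"],
)
print(out["choices"][0]["text"])
```

Because the model file is quantized GGUF, everything runs in a single process on CPU; passing `n_gpu_layers` offloads layers to a GPU when one is available.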

Choose llama.cpp when…

  • You want maximum efficiency for local LLM inference
  • You're running models on CPU or edge hardware
  • Quantized model performance is your optimization target

Builder Slot

Where do your models actually run? (Required for most stacks)

LLM providers and inference servers — where the actual model computation happens

  • Dev Tools: Not applicable
  • App Infra: Required
  • Hybrid: Required


Stack Genome Detection

AIchitect's Genome scanner detects llama.cpp in your project via these signals:

  • pip packages: llama-cpp-python
  • config files: Modelfile
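
The scanner's internals aren't documented here, but conceptually the check is simple. The sketch below is a hypothetical illustration that greps dependency and config files for the two signals above; the file names, matching rules, and function name are all assumptions for illustration.

```python
# Hypothetical detection sketch; AIchitect's real Genome scanner is not
# shown here, so everything below is an assumption for illustration.
from pathlib import Path

def detect_llama_cpp(project_root: str) -> bool:
    root = Path(project_root)

    # Signal 1: llama-cpp-python declared as a pip dependency.
    for dep_file in ("requirements.txt", "pyproject.toml"):
        path = root / dep_file
        if path.is_file() and "llama-cpp-python" in path.read_text(errors="ignore"):
            return True

    # Signal 2: a Modelfile in the project (also used by Ollama, which
    # builds on llama.cpp).
    return (root / "Modelfile").is_file()

if __name__ == "__main__":
    print(detect_llama_cpp("."))
```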

Integrates with (1)

RunPod (LLM Infrastructure)


Pricing

✦ Free tier available

Ruled out by 1 stack

Edge / On-Device AI Stack
Ollama already uses llama.cpp under the hood — listing both creates redundancy without adding value.

Badge

Add to your GitHub README

llama.cpp on AIchitect:
[![llama.cpp](https://aichitect.dev/badge/tool/llama-cpp)](https://aichitect.dev/tool/llama-cpp)

Explore the full AI landscape

See how llama.cpp fits into the bigger picture — browse all 207 tools and their relationships.
