Prompt & Eval · Open Source · ✦ Free Tier

Inspect

Open-source LLM evaluation framework by the UK AI Safety Institute

⭐ 1,800 stars · Health 75 (Active) · App Infrastructure

About

Inspect is an open-source framework for building LLM evaluations, developed by the UK AI Safety Institute. It provides task composition, built-in datasets, scorers, and solvers for systematic benchmarking of LLM capabilities, safety, and alignment properties.

Choose Inspect when…

• you are running capability and safety evaluations on LLMs
• you are building custom benchmarks for model comparison
• you need a government-backed evaluation methodology

Builder Slot

How do you know it's working? (Optional for most stacks)

Tests, evals, and experiment tracking to measure and improve your AI output quality

Dev Tools: Not applicable
App Infra: Recommended
Hybrid: Optional

Other tools in this slot:

PromptFoo, DeepEval, RAGAS, Vellum, PromptLayer, Agenta, TruLens, Humanloop

Stack Genome Detection

AIchitect's Genome scanner detects Inspect in your project via these signals:

pip packages

inspect-ai
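
The pip-package signal above can be illustrated with a short sketch. This is not AIchitect's actual scanner, whose implementation the page does not describe. It assumes the scanner reads a `requirements.txt`-style file and compares package names after PEP 503 normalization. The function names `normalize`, `requirement_name`, and `detects_inspect` are hypothetical.

```python
import re
from typing import Optional

# The one pip signal listed on this page.
TARGET_PACKAGE = "inspect-ai"


def normalize(name: str) -> str:
    """Normalize a package name per PEP 503: lowercase, runs of -, _, . become -."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(line: str) -> Optional[str]:
    """Return the normalized package name on one requirements line, or None."""
    line = line.split("#", 1)[0].strip()  # drop comments and whitespace
    if not line or line.startswith("-"):  # skip blanks and options such as -r
        return None
    match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
    return normalize(match.group(0)) if match else None


def detects_inspect(requirements_text: str) -> bool:
    """True if any requirement line names the inspect-ai package (assumed logic)."""
    target = normalize(TARGET_PACKAGE)
    return any(
        requirement_name(line) == target
        for line in requirements_text.splitlines()
    )
```

Calling `detects_inspect(open("requirements.txt").read())` would report whether the signal is present. Normalizing names first means spellings such as `Inspect_AI` still match, while similarly named packages such as `inspector` do not.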

Alternatives to consider (1)

DeepEval

Pricing

✦ Free tier available

Open Source: Free

Badge

Add to your GitHub README

[![Inspect](https://aichitect.dev/badge/tool/inspect-ai)](https://aichitect.dev/tool/inspect-ai)

Explore the full AI landscape

See how Inspect fits into the bigger picture — browse all 207 tools and their relationships.
