
Two ways to prove agent quality

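|        | Flint AI Scan                     | Flint AI Eval                  |
|--------|-----------------------------------|--------------------------------|
| What   | Catch issues in Python agent code | Test agent behavior at runtime |
| Proof  | Clean scan or fix list            | 0.0-1.0 reliability score      |
| Output | Code and configuration findings   | Runtime evaluation results     |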
Run them separately or together for full coverage.
  • AI-powered analysis. Understands context, not just patterns. Flags real problems, not false alarms.
  • Behavioral testing. LLM-as-judge scores agent reliability.
  • 100% free. First results in minutes.

Try it now

Install Flint AI CLI and configure your LLM provider:
Requirements:
  • Python 3.13 or later
  • OpenGrep (required for flintai scan)
Supported frameworks: Google ADK, Google GenAI, Anthropic, OpenAI, OpenAI Agents SDK, LangGraph, CrewAI, AutoGen, HuggingFace Transformers, HuggingFace smolagents
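Before installing, it can help to confirm the two requirements listed above. The following is a minimal sketch, not part of Flint AI CLI: the helper `check_prerequisites` is hypothetical, and it assumes the OpenGrep executable is named `opengrep` on your PATH.

```python
import shutil
import sys

MIN_PYTHON = (3, 13)          # from the requirements list above
OPENGREP_BINARY = "opengrep"  # assumed executable name; adjust if yours differs


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of problems; an empty list means both requirements are met."""
    # Compare only (major, minor), e.g. (3, 13).
    version = tuple(version_info or sys.version_info)[:2]
    problems = []
    if version < MIN_PYTHON:
        problems.append(
            f"Python {version[0]}.{version[1]} found; 3.13 or later is required."
        )
    # shutil.which returns None when the executable is not on PATH.
    if which(OPENGREP_BINARY) is None:
        problems.append(
            f"'{OPENGREP_BINARY}' not found on PATH (needed for flintai scan)."
        )
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing your environment; calling `check_prerequisites()` with no arguments inspects the running interpreter and PATH.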
1. Install

    pip install flintai-cli

2. Configure your LLM provider

flintai-cli uses AI to analyze agent code and score reliability. Run the interactive setup:

    flintai init

You’ll be prompted to select a provider (Gemini, OpenAI, Anthropic, or LiteLLM), choose a model, and enter your API key.

flintai init runs automatically the first time you use Flint AI CLI in a non-CI environment. You can re-run it at any time to reconfigure.
Run into issues? See install troubleshooting →
What’s next? Choose your path:
  • Scan in less than 5 minutes: find agent code issues before deployment.
  • Eval agent behavior at runtime: get a 0.0-1.0 reliability score.
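One way to use a 0.0-1.0 reliability score is as a release gate. The sketch below is illustrative only: `passes_gate` and the 0.8 default threshold are hypothetical, and because this page does not describe how Flint AI reports the score, the function takes it as a plain float.

```python
def passes_gate(score: float, threshold: float = 0.8) -> bool:
    """Return True when a 0.0-1.0 reliability score meets the threshold.

    The 0.8 default is an arbitrary example value, not a Flint AI recommendation.
    """
    # Reject values outside the documented 0.0-1.0 range instead of guessing.
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be between 0.0 and 1.0, got {score}")
    return score >= threshold
```

For example, `passes_gate(0.9)` allows a release under the default threshold, while `passes_gate(0.5)` does not.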

Why Flint AI CLI?

  • Context, not patterns. Follows data flows. Flags real issues, not every match.
  • Ship with confidence. Validate behavior, catch risks, prove readiness.
  • Fast results. Install, scan, and ship in minutes.
  • Built for AI developers. Ask questions, get grounded answers. No context switching. Connect via MCP →

Start here

  • Install: get started in minutes
  • Learn: explore tutorials
  • Explore: browse built-in tests