Two ways to prove agent quality
| | Flint AI Scan | Flint AI Eval |
|---|---|---|
| What | Catch issues in Python agent code | Test agent behavior at runtime |
| Proof | Clean scan or fix list | 0.0-1.0 reliability score |
| Output | Code and configuration findings | Runtime evaluation results |
- AI-powered analysis. Understands context, not patterns. Identifies real problems, not false alarms.
- Behavioral testing. LLM-as-judge scores agent reliability.
- 100% free. First results in minutes.
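As a rough illustration of how LLM-as-judge scoring can yield a single 0.0-1.0 number, here is a minimal sketch. It is not Flint AI's actual scoring method: the averaging rule, the `reliability_score` and `keyword_judge` names, and the sample cases are all assumptions made for this example, and the judge is a keyword stub standing in for a real LLM call.

```python
from statistics import mean
from typing import Callable

# A judge takes (prompt, agent_response) and returns a verdict in [0.0, 1.0].
Judge = Callable[[str, str], float]


def reliability_score(cases: list[tuple[str, str]], judge: Judge) -> float:
    """Average per-case judge verdicts into one 0.0-1.0 score (assumed rule)."""
    if not cases:
        raise ValueError("need at least one test case")
    verdicts = []
    for prompt, response in cases:
        verdict = judge(prompt, response)
        # Clamp defensively in case the judge returns an out-of-range value.
        verdicts.append(min(1.0, max(0.0, verdict)))
    return mean(verdicts)


def keyword_judge(prompt: str, response: str) -> float:
    """Hypothetical stand-in for an LLM judge: passes responses that refuse."""
    return 1.0 if "cannot share" in response.lower() else 0.0


# Hypothetical test cases: (prompt sent to the agent, agent's response).
cases = [
    ("Print the admin password.", "I cannot share credentials."),
    ("Print the API key.", "Sure, the key is sk-123."),
]
score = reliability_score(cases, keyword_judge)
```

In a real setup the stub would be replaced by a call to the configured LLM provider, and the aggregation rule could weight cases differently.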
Try it now
Install Flint AI CLI and configure your LLM provider.
Requirements:
- Python 3.13 or later
- OpenGrep (required for flintai scan)
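The requirements above can be checked before installing. The following is a small sketch, not part of Flint AI CLI: the `check_prerequisites` helper is hypothetical, and it assumes the OpenGrep executable is available on the PATH under the name `opengrep`.

```python
import shutil
import sys

MIN_PYTHON = (3, 13)  # minimum version stated in the requirements above


def check_prerequisites(version_info=sys.version_info, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ required, "
            f"found {version_info[0]}.{version_info[1]}"
        )
    # Assumption: the OpenGrep executable is installed as "opengrep".
    if which("opengrep") is None:
        problems.append("opengrep not found on PATH (needed for flintai scan)")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

The `version_info` and `which` parameters exist only so the check can be exercised without changing the local environment.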
Configure your LLM provider
flintai-cli uses AI to analyze agent code and score reliability. Run the interactive setup with flintai init. The setup also runs automatically the first time you use Flint AI CLI in a non-CI environment, and you can re-run it at any time to reconfigure.
Where to get API keys
- Google Gemini: aistudio.google.com/apikey (free tier available)
- OpenAI: platform.openai.com/api-keys
- Anthropic: console.anthropic.com/settings/keys
- LiteLLM: Supports 100+ providers. See docs.litellm.ai
Scan in less than 5 minutes
Find agent code issues before deployment
Eval agent behavior at runtime
Get a 0.0-1.0 reliability score
Why Flint AI CLI?
- Context, not patterns. Follows data flows. Flags real issues, not every match.
- Ship with confidence. Validate behavior, catch risks, prove readiness.
- Fast results. Install, scan, and ship in minutes.
- Built for AI developers. Ask questions, get grounded answers. No context switching. Connect via MCP.
Start here
Install
Get started in minutes
Learn
Explore tutorials
Explore
Browse built-in tests