Test your prompts, agents, and RAGs. Red teaming/pentesting/vulnerability scanning for AI. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
📊 Project Info
- Language: TypeScript
- Stars: ⭐ 12,536
- Forks: 1,155
- Today: +718
- Ranking: #3
- Collection: Overall
- Trending Date: March 11, 2026
- Last Push: 3/11/2026
🏷️ Topics
ci, ci-cd, cicd, evaluation, evaluation-framework, llm, llm-eval, llm-evaluation, llm-evaluation-framework, llmops, pentesting, prompt-engineering, prompt-testing, prompts, rag, red-teaming, testing, vulnerability-scanners

