Standards for building agents, better
Updated Jan 5, 2026 - TypeScript
Agentic testing for agentic codebases
Ship agents you can audit.
Agent testing automation 🤖 by simulating users 👥 and agents 🤝 with a judge ⚖️ (langwatch-scenario)
A Multi-Agent System for Cross-Checking Phishing URLs.
Qualitative benchmark suite for evaluating AI coding agents and orchestration paradigms on realistic, complex development tasks
Logic static security scanner for AI agents. OWASP LLM Top 10, EU AI Act compliance.
🧮 Solve mathematical problems and write proofs in natural language using this easy-to-use reasoning harness. Enhance your problem-solving skills effortlessly.