Quickstart¶
Get up and running with AgentProbe in minutes.
1. Install AgentProbe¶
2. Initialize Your Project¶
This creates an agentprobe.yaml configuration file with sensible defaults.
3. Write Your First Test¶
Create a file tests/test_my_agent.py:
from agentprobe import scenario, expect
@scenario(
name="greeting_test",
input_text="Say hello to the user",
tags=["smoke"],
)
def test_greeting():
"""Agent should produce a friendly greeting."""
pass
The @scenario decorator registers the function as a test case. When AgentProbe discovers this file, it creates a TestCase with the specified name, input, and tags.
4. Discover Tests¶
Output:
5. Working with Traces¶
AgentProbe records execution traces for every agent run. Traces capture tool calls, token usage, latency, and outputs.
List Traces¶
View a Specific Trace¶
This shows full trace details including agent name, model, input/output, token counts, latency, and tool calls.
Example: Multi-Scenario Test File¶
from agentprobe import scenario
@scenario(
name="summarization_test",
input_text="Summarize the following article: ...",
tags=["eval", "summarization"],
)
def test_summarization():
"""Agent should produce a concise summary."""
pass
@scenario(
name="tool_usage_test",
input_text="Search for the current weather in London",
tags=["tools", "search"],
)
def test_tool_usage():
"""Agent should invoke the search tool."""
pass
@scenario(
name="error_handling_test",
input_text="Process this invalid JSON: {broken",
tags=["robustness"],
timeout=10.0,
)
def test_error_handling():
"""Agent should handle malformed input gracefully."""
pass
Example: Using Assertions¶
from agentprobe import scenario, expect
@scenario(name="math_test", input_text="What is 2 + 2?")
def test_math():
pass
# After running, evaluate the trace:
# expect(trace).output_contains("4")
# expect(trace).has_tool_calls(count=0)
Next Steps¶
- Configuration --- Customize AgentProbe for your project
- Writing Tests --- In-depth test authoring guide
- Evaluators --- Set up behavioral evaluation
- Safety Testing --- Run security test suites
- CLI Reference --- All available commands