Test Runs
A Test Run executes a test case and captures the agent’s output, performance metrics (latency, tokens, cost), and execution trace.
Quick Start
# Assumes `client` is an already-initialized SDK client.
run = client.runs.create(test_case_id="tc_xyz789")
print(f"Run ID: {run.id}")
print(f"Success: {run.success}")
print(f"Output: {run.output}")
Run Results
Each run returns:
success - Whether the test passed
output - The agent’s response
metrics - Latency, tokens, and cost
trace_id - Link to the execution trace
run = client.runs.create(test_case_id="tc_xyz789")
print(f"Success: {run.success}")
print(f"Latency: {run.metrics.latency_ms}ms")
print(f"Cost: ${run.metrics.cost_usd:.4f}")
Running Multiple Tests
# Multiple test cases
run = client.runs.create(test_case_ids=["tc_001", "tc_002", "tc_003"])
# By tags
run = client.runs.create(tags=["refunds"])
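When several test cases run together, it often helps to roll the per-run fields up into one summary. The sketch below is not part of the SDK: `summarize_runs` and `fake_run` are hypothetical helpers, and the stand-in objects expose only the `success`, `metrics.latency_ms`, and `metrics.cost_usd` fields described under Run Results. The shape of what a multi-test call returns is not documented here, so the helper simply accepts any iterable of run-like objects.

```python
from types import SimpleNamespace


def summarize_runs(runs):
    """Aggregate pass rate, total cost, and mean latency over run-like objects."""
    runs = list(runs)
    if not runs:
        return {"total": 0, "passed": 0, "pass_rate": 0.0,
                "total_cost_usd": 0.0, "mean_latency_ms": 0.0}
    passed = sum(1 for r in runs if r.success)
    return {
        "total": len(runs),
        "passed": passed,
        "pass_rate": passed / len(runs),
        "total_cost_usd": sum(r.metrics.cost_usd for r in runs),
        "mean_latency_ms": sum(r.metrics.latency_ms for r in runs) / len(runs),
    }


def fake_run(success, latency_ms, cost_usd):
    """Hypothetical stand-in exposing only the fields documented above."""
    metrics = SimpleNamespace(latency_ms=latency_ms, cost_usd=cost_usd)
    return SimpleNamespace(success=success, metrics=metrics)


# Illustrative data only; real runs would come from the client.
sample_runs = [
    fake_run(True, 100, 0.25),
    fake_run(False, 300, 0.25),
    fake_run(True, 200, 0.25),
    fake_run(True, 200, 0.25),
]
summary = summarize_runs(sample_runs)
print(f"Passed {summary['passed']}/{summary['total']}")
print(f"Total cost: ${summary['total_cost_usd']:.4f}")
```

Because the helper relies only on attribute access, it should work on real run objects as long as they expose the same three fields.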
Async Mode
For long-running tests:
run = client.runs.create(test_case_id="tc_xyz789", async_mode=True)
# Check status later
run_result = client.runs.get(run.id)
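Checking status later usually means polling until the run finishes. The sketch below shows one way to do that with a timeout. `poll_until` is a hypothetical helper, not an SDK function, and the source does not document how a finished run is identified, so the completion check is passed in as a caller-supplied predicate.

```python
import time


def poll_until(fetch, is_done, interval_s=2.0, timeout_s=300.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_done(result) is true or the timeout expires."""
    deadline = clock() + timeout_s
    while True:
        result = fetch()
        if is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError(f"gave up after {timeout_s} seconds")
        sleep(interval_s)


# Demo with a fake fetcher that completes on the third call.
_states = iter([{"done": False}, {"done": False}, {"done": True, "value": 42}])
demo_result = poll_until(
    fetch=lambda: next(_states),
    is_done=lambda r: r["done"],
    interval_s=0,
    sleep=lambda _seconds: None,  # skip real sleeping in the demo
)
print(demo_result)

# With the client, usage might look like this (the predicate is an assumption):
# run_result = poll_until(lambda: client.runs.get(run.id), is_done=my_is_finished)
```

Injecting `sleep` and `clock` keeps the helper testable without real delays; in normal use the defaults are enough.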
Compare Agent Versions
run_v1 = client.runs.create(test_case_id="tc_xyz789")
run_v2 = client.runs.create(test_case_id="tc_xyz789", agent_id="agent_v2")
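Once both runs exist, the comparison itself comes down to diffing their results. The following is a minimal sketch, assuming each run exposes the `success`, `metrics.latency_ms`, and `metrics.cost_usd` fields listed under Run Results. `compare_runs` is a hypothetical helper, and the two stand-in objects carry made-up values purely for illustration.

```python
from types import SimpleNamespace


def compare_runs(baseline, candidate):
    """Report pass/fail changes and metric deltas (candidate minus baseline)."""
    return {
        "both_passed": bool(baseline.success and candidate.success),
        "regressed": bool(baseline.success and not candidate.success),
        "latency_delta_ms": candidate.metrics.latency_ms - baseline.metrics.latency_ms,
        "cost_delta_usd": candidate.metrics.cost_usd - baseline.metrics.cost_usd,
    }


# Stand-ins for run_v1 and run_v2; the values are illustrative, not measured.
fake_v1 = SimpleNamespace(
    success=True, metrics=SimpleNamespace(latency_ms=1200, cost_usd=0.5))
fake_v2 = SimpleNamespace(
    success=True, metrics=SimpleNamespace(latency_ms=900, cost_usd=0.25))

diff = compare_runs(fake_v1, fake_v2)
print(f"Regressed: {diff['regressed']}")
print(f"Latency change: {diff['latency_delta_ms']:+d} ms")
print(f"Cost change: {diff['cost_delta_usd']:+.4f} USD")
```

Negative deltas mean the candidate was faster or cheaper than the baseline. With real data, `compare_runs(run_v1, run_v2)` would be the equivalent call.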
Next Steps
Evaluate runs with metrics:
run = client.runs.create(test_case_id="tc_xyz789")
evaluation = client.evaluate(
run_id=run.id,
scorers=["answer_relevancy", "faithfulness", "bias"]
)
Evaluations - Score runs with 29 built-in metrics
API Reference - Full API documentation