Test Runs

A Test Run executes a test case and captures the agent’s output, performance metrics (latency, tokens, cost), and execution trace.

Quick Start

run = client.runs.create(test_case_id="tc_xyz789")

print(f"Run ID: {run.id}")
print(f"Success: {run.success}")
print(f"Output: {run.output}")

Run Results

Each run returns:
  • success - Whether the test passed
  • output - The agent’s response
  • metrics - Latency, tokens, and cost
  • trace_id - ID of the execution trace
run = client.runs.create(test_case_id="tc_xyz789")
print(f"Success: {run.success}")
print(f"Latency: {run.metrics.latency_ms}ms")
print(f"Cost: ${run.metrics.cost_usd:.4f}")
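The fields listed above can be folded into a one-line summary for logs or CI output. The sketch below is illustrative only: `summarize_run` is a hypothetical helper, not part of the SDK, and `example_run` is a stand-in object with made-up values so the snippet runs without a client. It reads only the attributes shown on this page (`success`, `metrics.latency_ms`, `metrics.cost_usd`, `trace_id`).

```python
from types import SimpleNamespace


def summarize_run(run):
    """Build a one-line summary from the run fields documented above."""
    status = "PASS" if run.success else "FAIL"
    return (
        f"{status} | latency={run.metrics.latency_ms}ms "
        f"| cost=${run.metrics.cost_usd:.4f} | trace={run.trace_id}"
    )


# Stand-in with invented values; a real run would come from client.runs.create(...).
example_run = SimpleNamespace(
    success=True,
    output="Refund issued.",
    metrics=SimpleNamespace(latency_ms=820, cost_usd=0.0031),
    trace_id="trace_example",
)

print(summarize_run(example_run))
```

With a real client, the same helper could be called on the object returned by `client.runs.create(...)`, assuming it exposes these attributes as shown.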

Running Multiple Tests

By IDs or Tags

# Multiple test cases
run = client.runs.create(test_case_ids=["tc_001", "tc_002", "tc_003"])

# By tags
run = client.runs.create(tags=["refunds"])

Async Mode

For long-running tests, create the run with async_mode=True and fetch the result later:
run = client.runs.create(test_case_id="tc_xyz789", async_mode=True)
# Check status later
run_result = client.runs.get(run.id)
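"Check status later" usually means polling. A minimal polling sketch follows, under stated assumptions: `wait_for_run` is a hypothetical helper, and this page does not document how a finished run is signalled, so the completion check is passed in as `is_done` rather than hard-coded. The fetch function is also injected, which lets the helper be exercised without a live client.

```python
import time


def wait_for_run(fetch_run, run_id, is_done, interval_s=2.0, timeout_s=300.0):
    """Poll fetch_run(run_id) until is_done(run) is true or the timeout expires.

    fetch_run: callable that returns the current run, e.g. client.runs.get
    is_done:   callable deciding whether the run has finished (caller-supplied,
               because the completion field is not documented on this page)
    """
    deadline = time.monotonic() + timeout_s
    while True:
        run = fetch_run(run_id)
        if is_done(run):
            return run
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish within {timeout_s}s")
        time.sleep(interval_s)


# Hypothetical usage; the `status` attribute and its values are assumptions:
# finished = wait_for_run(
#     client.runs.get,
#     run.id,
#     is_done=lambda r: getattr(r, "status", None) in ("completed", "failed"),
# )
```

Using a fixed interval keeps the sketch short; exponential backoff may be a better fit for runs that take minutes.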

Compare Agent Versions

# Run the same test case twice; the second run sets agent_id explicitly
run_v1 = client.runs.create(test_case_id="tc_xyz789")
run_v2 = client.runs.create(test_case_id="tc_xyz789", agent_id="agent_v2")
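Once both runs exist, the comparison itself is a difference of their metrics. The sketch below assumes each run exposes `success`, `metrics.latency_ms`, and `metrics.cost_usd` as shown under Run Results. `compare_runs` is a hypothetical helper, and the two stand-in runs use invented numbers so the snippet runs on its own.

```python
from types import SimpleNamespace


def compare_runs(baseline, candidate):
    """Return candidate-minus-baseline deltas for latency and cost."""
    return {
        "success_changed": baseline.success != candidate.success,
        "latency_delta_ms": candidate.metrics.latency_ms - baseline.metrics.latency_ms,
        # Rounded to avoid floating-point noise in small dollar amounts.
        "cost_delta_usd": round(
            candidate.metrics.cost_usd - baseline.metrics.cost_usd, 6
        ),
    }


# Stand-ins with invented values; real runs would be run_v1 and run_v2.
baseline = SimpleNamespace(
    success=True, metrics=SimpleNamespace(latency_ms=900, cost_usd=0.0040)
)
candidate = SimpleNamespace(
    success=True, metrics=SimpleNamespace(latency_ms=750, cost_usd=0.0052)
)

diff = compare_runs(baseline, candidate)
print(diff)
```

A negative latency delta means the candidate was faster; a positive cost delta means it was more expensive. A single pair of runs is a small sample, so repeated runs give a more reliable comparison.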

Next Steps

Evaluate runs with metrics:
run = client.runs.create(test_case_id="tc_xyz789")

evaluation = client.evaluate(
    run_id=run.id,
    scorers=["answer_relevancy", "faithfulness", "bias"]
)

Evaluations

Score runs with 29 built-in metrics

API Reference

Full API documentation