Installation

Install the Playgent SDK:
pip install playgent

Step 1: Get your API key

  1. Go to the Playgent dashboard
  2. Navigate to Account → API Keys
  3. Click Create API Key and save it securely
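
To keep the saved key out of source control, one common approach is to read it from an environment variable instead of pasting it into code. The sketch below assumes a variable named PLAYGENT_API_KEY and a helper called load_api_key; both names are illustrative and are not part of the Playgent SDK.

```python
import os


def load_api_key(env_var: str = "PLAYGENT_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The variable name is an assumption made for this sketch; the SDK
    itself is only shown taking the key via Playgent(api_key=...).
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Playgent API key"
        )
    return key
```

The returned value can then be passed to the client as Playgent(api_key=load_api_key()).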

Step 2: Create an agent

Register your agent with Playgent:
from playgent import Playgent

client = Playgent(api_key="your-api-key")

agent = client.agents.create(
    name="Customer Support Agent",
    provider="openai",
    system_prompt="You are a helpful customer support agent...",
    default_scorers=["relevance", "faithfulness"]
)

Step 3: Create a test case

Define test cases for your agent:
test_case = client.test_cases.create(
    name="Refund Request",
    agent_id=agent.id,
    turns=[{
        "input": {"text": "I want a refund for order #1234"},
        "expected_behavior": "Agent should ask for order details",
        "scorers": ["relevance", "completeness"]
    }]
)
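
Since turns is a list, a test case can presumably hold more than one turn. A small helper keeps each turn dict in the same shape as the example above. This is only a sketch: make_turn is a hypothetical helper, not an SDK function, and the second turn is invented for illustration.

```python
from typing import Any


def make_turn(text: str, expected_behavior: str, scorers: list[str]) -> dict[str, Any]:
    """Build one turn dict in the shape used by the test case example."""
    return {
        "input": {"text": text},
        "expected_behavior": expected_behavior,
        "scorers": list(scorers),
    }


turns = [
    make_turn(
        "I want a refund for order #1234",
        "Agent should ask for order details",
        ["relevance", "completeness"],
    ),
    # Hypothetical follow-up turn, not taken from the Playgent docs.
    make_turn(
        "It arrived damaged",
        "Agent should explain the refund process",
        ["relevance"],
    ),
]
```

The resulting list could be passed as the turns argument of client.test_cases.create.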

Step 4: Run tests

Execute your tests and get results:
run = client.runs.create(test_case_id=test_case.id)

# Get results
result = client.runs.get(run.id)
print(f"Pass rate: {result.summary.turns_passed}/{len(result.turns)}")
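
The print statement above shows a raw count. If a percentage is more useful, it can be derived from the same two numbers. The helpers below are a minimal sketch with hypothetical names; they work on plain integers and guard against a run that has no turns.

```python
def pass_rate(turns_passed: int, total_turns: int) -> float:
    """Return the fraction of turns that passed, or 0.0 when there are no turns."""
    if total_turns <= 0:
        return 0.0
    return turns_passed / total_turns


def format_pass_rate(turns_passed: int, total_turns: int) -> str:
    """Format the count together with a whole-number percentage."""
    rate = pass_rate(turns_passed, total_turns)
    return f"{turns_passed}/{total_turns} ({rate:.0%})"


print(format_pass_rate(3, 4))  # prints "3/4 (75%)"
```

With the objects from the example above, the call would be format_pass_rate(result.summary.turns_passed, len(result.turns)).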

Step 5: Evaluate with 27 metrics (optional)

Run comprehensive evaluations using built-in RAG, safety, agentic, and multi-turn metrics:
# Evaluate a single response
evaluation = client.evaluate(
    input="What is your refund policy?",
    output="Returns accepted within 30 days for full refund.",
    context=["Policy: 30 day returns for full refund"],
    scorers=[
        "answer_relevancy",  # RAG metric
        "faithfulness",      # RAG metric
        "bias",              # Safety metric
        "toxicity",          # Safety metric
    ]
)

print(f"Overall pass: {evaluation.overall_pass}")
for scorer, scorer_result in evaluation.results.items():
    print(f"{scorer}: {scorer_result.score:.2f}")
Playgent provides 27 built-in metrics covering RAG (RAGAS), safety, agentic workflows, and multi-turn conversations.
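
When many scorers are requested, it can help to list only the weakest ones. The sketch below assumes each entry in evaluation.results exposes a numeric score attribute, as in the loop above, and that higher scores are better. That may not hold for every metric, so check how each scorer is defined. The scorers_below helper, the 0.5 threshold, and the sample scores are all made up for illustration.

```python
from types import SimpleNamespace


def scorers_below(results, threshold=0.5):
    """Return (scorer, score) pairs whose score is below threshold, lowest first."""
    low = [(name, r.score) for name, r in results.items() if r.score < threshold]
    return sorted(low, key=lambda pair: pair[1])


# Stand-in data shaped like evaluation.results; the scores are invented.
sample_results = {
    "answer_relevancy": SimpleNamespace(score=0.92),
    "faithfulness": SimpleNamespace(score=0.41),
    "completeness": SimpleNamespace(score=0.30),
}

print(scorers_below(sample_results, threshold=0.5))
```

With a real evaluation, the call would be scorers_below(evaluation.results).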

Next steps

API Reference

Explore all available endpoints

27 Evaluation Metrics

Explore RAG, safety, agentic, and multi-turn metrics