Run comprehensive evaluation with 20+ built-in metrics including RAG, agentic, and multi-turn evaluations
Evaluate agent responses using Playgent’s comprehensive suite of evaluation metrics. No setup is required: industry-standard RAG metrics (RAGAS), agentic workflow evaluations, multi-turn conversation analysis, and custom LLM-as-judge evaluations are available out of the box.
Playval
Answer Relevancy
Faithfulness
Contextual Precision
Contextual Recall
Contextual Relevancy
Task Completion
Tool Correctness
Argument Correctness
Step Efficiency
Plan Adherence
Plan Quality
Bias
Toxicity
Non-Advice
Misuse
PII Leakage
Role Violation
Turn Relevancy
Role Adherence
Knowledge Retention
Conversation Completeness
Goal Accuracy
Tool Use
Topic Adherence
Turn Faithfulness
Turn Contextual Precision
Turn Contextual Recall
Turn Contextual Relevancy
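The display names above appear to correspond to snake_case scorer IDs (for example, Non-Advice and non_advice, or PII Leakage and pii_leakage). The following Python sketch illustrates that naming convention. The helper `to_scorer_id` is hypothetical and not part of Playgent; the rule it applies is inferred only from the names shown on this page.

```python
def to_scorer_id(display_name: str) -> str:
    """Convert a metric display name to a snake_case scorer ID.

    Hypothetical helper: the rule (lowercase, with spaces and hyphens
    turned into underscores) is inferred from the names on this page.
    """
    return display_name.strip().lower().replace("-", "_").replace(" ", "_")


if __name__ == "__main__":
    examples = [
        "Answer Relevancy",
        "Non-Advice",
        "PII Leakage",
        "Turn Contextual Recall",
    ]
    for name in examples:
        print(f"{name} -> {to_scorer_id(name)}")
```

A helper like this could be handy when copying metric names from the list into a configuration, though the IDs documented for each scorer remain the authority.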
playval
RAG: answer_relevancy, faithfulness, contextual_precision, contextual_recall, contextual_relevancy
Safety: bias, toxicity, non_advice, misuse, pii_leakage, role_violation
Agentic: task_completion, tool_correctness, argument_correctness, step_efficiency, plan_adherence, plan_quality
Multi-Turn: turn_relevancy, role_adherence, knowledge_retention, conversation_completeness, goal_accuracy, tool_use, topic_adherence, turn_faithfulness, turn_contextual_precision, turn_contextual_recall, turn_contextual_relevancy
Or use custom scorer IDs created via Create Custom Scorer.
context parameter
conversation_history parameter
tool_calls and optionally agent_plan
playval for custom evaluation criteria via expected_behavior
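The scorer IDs and input parameters listed above suggest that different metric groups need different inputs. The Python sketch below checks a selection of scorers against the inputs you plan to send. The scorer IDs are copied from this page. The pairing of groups with inputs (RAG with `context`, multi-turn with `conversation_history`, agentic with `tool_calls`, `playval` with `expected_behavior`) is an assumption drawn from the parameter names, and `missing_inputs` and `category_of` are hypothetical helpers, not Playgent functions.

```python
from typing import Dict, Iterable, List, Optional, Set

# Scorer IDs grouped by category, copied from the list on this page.
SCORERS_BY_CATEGORY: Dict[str, List[str]] = {
    "playval": ["playval"],
    "rag": [
        "answer_relevancy", "faithfulness", "contextual_precision",
        "contextual_recall", "contextual_relevancy",
    ],
    "safety": [
        "bias", "toxicity", "non_advice", "misuse",
        "pii_leakage", "role_violation",
    ],
    "agentic": [
        "task_completion", "tool_correctness", "argument_correctness",
        "step_efficiency", "plan_adherence", "plan_quality",
    ],
    "multi_turn": [
        "turn_relevancy", "role_adherence", "knowledge_retention",
        "conversation_completeness", "goal_accuracy", "tool_use",
        "topic_adherence", "turn_faithfulness", "turn_contextual_precision",
        "turn_contextual_recall", "turn_contextual_relevancy",
    ],
}

# ASSUMPTION: which input each group needs. This pairing is inferred from
# the parameter names on this page and is not confirmed by the source.
REQUIRED_INPUTS: Dict[str, Set[str]] = {
    "playval": {"expected_behavior"},
    "rag": {"context"},
    "safety": set(),
    "agentic": {"tool_calls"},  # agent_plan is described as optional
    "multi_turn": {"conversation_history"},
}


def category_of(scorer_id: str) -> Optional[str]:
    """Return the category of a built-in scorer ID, or None if unknown."""
    for category, ids in SCORERS_BY_CATEGORY.items():
        if scorer_id in ids:
            return category
    return None  # e.g. a custom scorer ID


def missing_inputs(scorer_ids: Iterable[str], provided: Set[str]) -> Dict[str, List[str]]:
    """Map each selected scorer to the assumed inputs that were not provided."""
    problems: Dict[str, List[str]] = {}
    for scorer_id in scorer_ids:
        category = category_of(scorer_id)
        if category is None:
            continue  # custom scorers: requirements are not known here
        missing = sorted(REQUIRED_INPUTS[category] - provided)
        if missing:
            problems[scorer_id] = missing
    return problems


if __name__ == "__main__":
    selected = ["faithfulness", "tool_correctness", "bias"]
    print(missing_inputs(selected, provided={"context"}))
```

Running a check like this before submitting an evaluation could surface a missing input early. Custom scorer IDs are skipped because their requirements are not described on this page.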