Agent Evaluation
Quick Start
```python
import mixedvoices as mv
from mixedvoices.metrics import Metric, empathy

# Create a project with metrics
hangup_metric = Metric(
    name="call_hangup",
    definition="FAILS if the bot faces problems in ending the call",
    scoring="binary",
)
project = mv.create_project("dental_clinic", metrics=[empathy, hangup_metric])

# Create a version
v1 = project.create_version(
    "v1",
    prompt="You are a friendly dental receptionist...",
    metadata={"model": "gpt-4", "deployment_date": "2024-01-15"},
)

# Generate test cases (existing_conversation is a prior call transcript)
test_generator = mv.TestCaseGenerator(v1.prompt)
test_cases = (
    test_generator.add_from_transcripts([existing_conversation])
    .add_edge_cases(2)
    .add_from_descriptions(["An elderly patient", "A rushed parent"])
    .generate()
)

# Create and run the evaluator (MyAgent is your agent class,
# see Implementing Your Agent)
evaluator = project.create_evaluator(test_cases, metric_names=["empathy", "call_hangup"])
evaluator.run(v1, MyAgent, agent_starts=False)
```
Test Case Generation
Implementing Your Agent
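The `evaluator.run(v1, MyAgent, agent_starts=False)` call in the Quick Start expects an agent class. Below is a minimal, self-contained sketch of the shape such a class might take, assuming the evaluator calls a `respond` method that returns the agent's reply together with a flag indicating whether the conversation has ended. This interface is an assumption for illustration; check the MixedVoices library itself for the authoritative base class and signature.

```python
from typing import Tuple


class MyAgent:
    """Hypothetical dental-receptionist agent.

    The respond() interface sketched here is an assumption,
    not the verified MixedVoices contract.
    """

    def __init__(self) -> None:
        self.prompt = "You are a friendly dental receptionist..."

    def respond(self, input_text: str) -> Tuple[str, bool]:
        # A real agent would call your LLM with self.prompt and the
        # conversation history; canned replies illustrate the shape.
        if "bye" in input_text.lower():
            # Second element signals the conversation has ended.
            return "Thanks for calling, goodbye!", True
        return "Sure, I can help you book an appointment.", False
```

During `evaluator.run`, the evaluator drives simulated callers against this class; with `agent_starts=False`, presumably the simulated user speaks first.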
Running Evaluations
Best Practices