# Metrics
In MixedVoices, metrics help you evaluate and analyze your voice agent's performance. Each metric can be either binary (PASS/FAIL) or continuous (0-10 scale), allowing for both strict checks and nuanced performance evaluation.
## Built-in Metrics
MixedVoices comes with several pre-defined metrics that cover common evaluation needs:
```python
from mixedvoices.metrics import (
    empathy,              # Measures emotional intelligence and response appropriateness
    hallucination,        # Checks for made-up information
    conciseness,          # Evaluates response brevity and clarity
    context_awareness,    # Assesses understanding of conversation context
    adaptive_qa,          # Measures ability to handle follow-up questions
    objection_handling,   # Evaluates handling of customer objections
    scheduling,           # Assesses appointment scheduling accuracy
    verbatim_repetition,  # Checks for unnecessary repetition
)

# Get all default metrics at once
from mixedvoices.metrics import get_all_default_metrics

metrics = get_all_default_metrics()
```
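Each built-in metric is a ready-made `Metric` object, so you can inspect it before deciding which ones to use (a sketch; the attribute names are assumed to mirror the `Metric` constructor arguments shown in the next section):

```python
# Sketch: inspect a built-in metric's fields. Assumes the attributes
# match the Metric constructor arguments (name, definition, scoring).
print(empathy.name)        # "empathy"
print(empathy.definition)  # the description the evaluator scores against
print(empathy.scoring)     # "binary" or "continuous"
```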
## Creating Custom Metrics
You can create custom metrics to evaluate specific aspects of your agent's performance:
```python
from mixedvoices.metrics import Metric

# Binary metric (PASS/FAIL)
call_hangup = Metric(
    name="call_hangup",
    definition="FAILS if the bot faces problems in ending the call appropriately",
    scoring="binary",
)

# Continuous metric (0-10 scale)
accent_handling = Metric(
    name="accent_handling",
    definition="Measures how well the agent understands and adapts to different accents",
    scoring="continuous",
)

# Metric that needs to check against the agent's prompt
hallucination_check = Metric(
    name="factual_accuracy",
    definition="Checks if agent makes claims not supported by its prompt",
    scoring="binary",
    include_prompt=True,  # Prompt will be included during evaluation
)
```
## Using Metrics in Projects
Metrics can be supplied when a project is created, and added or updated afterwards:
```python
import mixedvoices as mv
from mixedvoices.metrics import Metric, empathy, get_all_default_metrics

# Create a project with specific metrics
project = mv.create_project(
    "dental_clinic",
    metrics=[empathy, call_hangup],
)

# Or use all default metrics
project = mv.create_project(
    "medical_clinic",
    metrics=get_all_default_metrics(),
)

# Add new metrics to an existing project
project.add_metrics([accent_handling])

# Update an existing metric by passing a Metric with the same name
# (this replacement definition is illustrative)
new_call_hangup = Metric(
    name="call_hangup",
    definition="FAILS if the bot does not end the call cleanly once the conversation is over",
    scoring="binary",
)
project.update_metric(new_call_hangup)

# List the names of the project's metrics
metric_names = project.list_metric_names()
```
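Because the `medical_clinic` project above was created with all default metrics and then had `accent_handling` added, the returned names would cover all nine (a sketch; the exact return format is an assumption):

```python
# Sketch: list_metric_names is assumed to return a list of strings
print(metric_names)  # e.g. ["empathy", "hallucination", ..., "accent_handling"]
```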
## Example: Metric Set
```python
# Create a comprehensive set of metrics for a medical receptionist
metrics = [
    Metric(
        name="hipaa_compliance",
        definition="Checks if agent maintains patient privacy standards",
        scoring="binary",
    ),
    Metric(
        name="urgency_detection",
        definition="Measures ability to identify and escalate medical emergencies",
        scoring="continuous",
    ),
    Metric(
        name="insurance_handling",
        definition="Evaluates accuracy in insurance information collection",
        scoring="continuous",
    ),
]

project = mv.create_project("medical_reception", metrics=metrics)
```
## Evaluation with Metrics
When creating an evaluator, you can choose which metrics to use. Check Agent Evaluation for more details.
```python
# Use all project metrics (the default)
evaluator = project.create_evaluator(test_cases)

# Or use specific metrics; these should be a subset of the project's metrics
evaluator = project.create_evaluator(
    test_cases,
    metric_names=["hipaa_compliance", "urgency_detection"],
)
```
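Since `metric_names` takes a list of names, the output of `list_metric_names` can be passed straight through (a sketch, equivalent to the default of using every project metric):

```python
# Sketch: explicitly select every project metric by name
evaluator = project.create_evaluator(
    test_cases,
    metric_names=project.list_metric_names(),
)
```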
## Tips

### Creating Effective Metrics

- **Clear Definitions**: Make metric definitions specific and measurable
- **Appropriate Scoring**: Choose binary for pass/fail requirements, continuous for nuanced evaluation
- **Prompt Awareness**: Use `include_prompt=True` for metrics that need to check against agent knowledge
- **Consistent Naming**: Use lowercase, descriptive names without spaces
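Put together, a metric that follows all four guidelines might look like this (an illustrative sketch; the name and definition are hypothetical):

```python
# Illustrative: lowercase snake_case name, a specific measurable definition,
# binary scoring for a strict requirement, and the prompt included
refund_policy_accuracy = Metric(
    name="refund_policy_accuracy",
    definition="FAILS if the agent states refund terms that contradict the policy in its prompt",
    scoring="binary",
    include_prompt=True,
)
```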