
Opik Configuration Guide

Step-by-step guide to help you configure Opik to generate and submit evaluation metrics in the format required by our TRACE Metrics API


In this guide, we explain how to configure and use Opik to systematically evaluate language model outputs.
We define standardized test cases (including input, actual output, expected output, and retrieval context) and run a set of quality metrics such as answer relevancy, faithfulness, hallucination, and bias. These metrics are mapped to broader evaluation pillars such as performance, fairness & bias, safety, and reliability, providing a structured way to quantify model quality.
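
For example, a standardized test case can be scored directly with Opik's LLM-as-a-judge metrics. The sketch below is illustrative only: it assumes the AnswerRelevance and Hallucination metrics in your installed Opik version accept input, output, and context keyword arguments, and that a judge model (for example an OpenAI API key) is already configured; the test case content is made up.

from opik.evaluation.metrics import AnswerRelevance, Hallucination

# Illustrative test case: user input, model output, and retrieval context.
test_case = {
    "input": "Is Paris safe for tourists?",
    "output": "Paris is generally safe for visitors who take standard precautions.",
    "context": ["Paris is the capital of France and a major tourist destination."],
}

relevance = AnswerRelevance().score(
    input=test_case["input"],
    output=test_case["output"],
    context=test_case["context"],
)
hallucination = Hallucination().score(
    input=test_case["input"],
    output=test_case["output"],
    context=test_case["context"],
)

# Each call returns a score object with a numeric value (and a textual reason).
print(relevance.value, hallucination.value)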

After collecting these raw evaluation metrics, we submit them to the TRACE Metrics API.

TRACE processes these results to generate AI governance evidence, answering questions such as:

  • Does the AI system comply with NIST AI RMF, EU AI Act, or similar guidelines?

  • How safe, fair, and robust is the system in production?

  • Are there indicators of hallucination, bias, or inconsistent behavior?

This workflow supports teams and compliance stakeholders by:

  • Providing transparent, explainable evidence for responsible AI

  • Enabling dashboards and historical monitoring of AI performance and risk

  • Helping align AI systems with internal policies and external regulatory requirements

This combined approach ensures that evaluation is not just technical, but also supports governance, auditability, and long-term risk management.

Required Fields

Field Name | Description
metric_key | Standardized name (e.g. AnswerRelevance)
value | Raw metric value from Opik (float)
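
For illustration only, a single metric expressed with these fields might look like the entry below; the metric name and value are hypothetical, and the exact wrapping of entries inside the request is shown in the payload examples later in this guide.

{
  "metric_key": "AnswerRelevance",
  "value": 0.87
}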

Sample Opik Code (Python)

from opik.evaluation.metrics import Equals, Moderation

# Example output and reference
output = """Paris is the capital of France and one of the most visited cities in the world.
While some tourists express concerns about safety in certain neighborhoods, Paris remains a vibrant and welcoming city.
Visitors are advised to stay vigilant, especially in crowded areas, but overall, the city is considered safe for travelers."""
reference = """Paris is the capital of France and a major tourist destination.
While no city is entirely without risk, Paris is generally safe for visitors who take standard precautions."""

metrics = [
    Equals(case_sensitive=False),
    Moderation()
]

# Score each metric and collect the raw values, keyed by metric class name.
metric_results = {}
for m in metrics:
    if isinstance(m, Equals):
        # Equals compares the output against the reference text.
        result = m.score(output=output, reference=reference)
    elif isinstance(m, Moderation):
        # Moderation only inspects the output text.
        result = m.score(output=output)
    else:
        continue
    metric_results[m.__class__.__name__] = result.value
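
After the loop, metric_results maps each metric's class name to its raw score. With the inputs above it would look something like the dictionary below (the numbers are illustrative, not real Opik output); this is the dictionary submitted under the "opik" key in the payload later in this guide.

metric_results = {
    "Equals": 0.0,        # output and reference are not identical
    "Moderation": 0.05,   # low moderation risk
}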

Metric-to-Pillar Mapping

Metric Name | Canonical Space | Pillar | Better High
Equals | exact_match | performance | true
Contains | substring_match | performance | true
RegexMatch | regex_match | performance | true
IsJson | json_validity | reliability | true
LevenshteinRatio | levenshtein_similarity | performance | true
SentenceBLEU | sentence_bleu | performance | true
CorpusBLEU | corpus_bleu | performance | true
ROUGE | rouge_score | performance | true
G-Eval | geval_score | task_adherence | true
Moderation | moderation_risk | safety | false
Usefulness | usefulness_score | performance | true
Answer Relevance | answer_relevance | performance | true
Context Precision | context_precision | performance | true
Context Recall | context_recall | performance | true
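
If you prefer to report results under the canonical names from the table above instead of Opik class names, a small lookup dictionary is enough. This is a sketch: the dictionary simply restates the table, and the to_canonical helper is hypothetical, not part of Opik or the TRACE API.

# Restates the Metric-to-Pillar table: Opik metric name -> canonical key.
CANONICAL_KEYS = {
    "Equals": "exact_match",
    "Contains": "substring_match",
    "RegexMatch": "regex_match",
    "IsJson": "json_validity",
    "LevenshteinRatio": "levenshtein_similarity",
    "SentenceBLEU": "sentence_bleu",
    "CorpusBLEU": "corpus_bleu",
    "ROUGE": "rouge_score",
    "GEval": "geval_score",
    "Moderation": "moderation_risk",
    "Usefulness": "usefulness_score",
    "AnswerRelevance": "answer_relevance",
    "ContextPrecision": "context_precision",
    "ContextRecall": "context_recall",
}

def to_canonical(metric_results):
    # Keep the original name when a metric is not listed in the table.
    return {CANONICAL_KEYS.get(name, name): value for name, value in metric_results.items()}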


Submit Results via API

Prepare Canonical Payload

{
    "metric_metadata": {
        "application_name": "chat-application",
        "version": "1.0.0",
        "provider": "opik",
        "use_case": "transportation"
    },
    "metric_data": {
        "opik": metric_results  # see the sample code above for metric_results
    }
}

Send Via API

import requests

BASE_URL = "https://api.cognitiveview.com"
url = f"{BASE_URL}/metrics"

# auth_token is your TRACE Metrics API subscription key (see the next section).
headers = {
    "Ocp-Apim-Subscription-Key": auth_token,
    "Content-Type": "application/json",
}

payload = {
    "metric_metadata": {
        "application_name": "chat-application",
        "version": "1.0.0",
        "provider": "opik",
        "use_case": "transportation"
    },
    "metric_data": {
        "opik": metric_results
    }
}

response = requests.post(url, headers=headers, json=payload)
print(f"Status Code: {response.status_code}")
print("Response JSON:", response.json())

How to get your TRACE Metrics API subscription key

To use the TRACE Metrics API, you must first obtain a subscription key from CognitiveView. Follow these steps:

  1. Log in to CognitiveView

  2. Go to System Settings

    • In the main menu, navigate to System Settings.

  3. Find or generate your subscription key

    • Look for the section labeled API Access or Subscription Key.

    • If a key already exists, copy it.

    • If not, click Generate Key to create a new one.

  4. Copy and store the key securely

    • You’ll need this key to authenticate API requests.

    • Keep it safe and do not share it publicly.
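
One common way to keep the key out of source code is to read it from an environment variable; the sketch below is only an example, and the variable name TRACE_SUBSCRIPTION_KEY is arbitrary rather than an official setting.

import os

# Export the key in your shell first, e.g. export TRACE_SUBSCRIPTION_KEY="..."
auth_token = os.environ["TRACE_SUBSCRIPTION_KEY"]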

Send via curl or any HTTP Client

curl -X POST https://api.cognitiveview.com/metrics \
  -H "Ocp-Apim-Subscription-Key: <your-subscription-key>" \
  -H "Content-Type: application/json" \
  -d @eval_payload.json
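
The curl command reads the request body from eval_payload.json. A minimal sketch for producing that file from the payload built earlier (only the file name comes from the curl example above):

import json

# Write the payload dictionary from the previous section to disk.
with open("eval_payload.json", "w") as f:
    json.dump(payload, f, indent=2)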

Summary

Step | Action
1 | Choose Opik metrics relevant to your run_type
2 | Run metrics and get raw scores
3 | Submit to /metrics or mcp://... endpoint

Additional resources

  • Explore example notebooks & sample code on our GitHub: see how to call the TRACE Metrics API step by step.

Questions? Reach out: support@cognitiveview.ai
