How to use evaluations

3 short demo videos to answer all your questions!

Written by Dan Cleary
Updated this week

Evaluations allow you to systematically assess your prompt outputs using a data-driven approach. Whether you want to check for specific text using string-based rules, leverage an LLM as a judge, or validate outputs against a dataset with target values, this guide shows you how to get started.

Overview

Evaluations help you:

- Gain Objective Insights: Evaluate prompt outputs using measurable criteria.

- Iterate Quickly: Identify issues and refine your prompts based on clear, automated feedback.

- Empower Your Team: Designed for both developers and product managers, evaluations make prompt refinement accessible to everyone.

Use Case 1: String-Based Evaluations Using Regex

Setting Up a String-Based Evaluation

1. Navigate to the Evals Tab: Open your PromptHub dashboard and go to the Evals tab.

2. Create a New Evaluation: Set up an evaluation for the prompt you want to test, for example a product feedback classifier prompt.

3. Choose String-Based Method: Select the string-based evaluation option. Use operations such as “regex contains” or “does not contain” to check whether the prompt output includes specific values.
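
Conceptually, a "regex contains" check behaves like the short Python sketch below. This is only an illustration of the idea, not PromptHub's implementation; the patterns and the sample output are made up for the example.

```python
import re

# Example output from a product feedback classifier (made up for illustration)
output = "Category: Feature Request | Sentiment: Positive"

# "regex contains": pass if the pattern appears anywhere in the output
contains_pattern = r"Feature Request|Bug Report|Praise"
passes_contains = re.search(contains_pattern, output) is not None

# "does not contain": pass if the pattern never appears in the output
forbidden_pattern = r"I'm sorry|as an AI"
passes_not_contains = re.search(forbidden_pattern, output) is None

print(passes_contains, passes_not_contains)  # True True
```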

Running a String-Based Evaluation

1. Go to the Playground: Open the playground for the prompt you want to test.

2. Turn on the Evaluation: Scroll down in the left panel and check the box next to each evaluation you want to run.

3. Click Run Test and Review the Results: The system will check the output against your string-based rules, helping you determine if the prompt output meets your criteria.

Use Case 2: LLM-as-a-Judge Evaluations

Setting Up an LLM-as-a-Judge Evaluation

1. Open the Evals Tab: In your PromptHub dashboard, navigate to the Evals tab and open the evaluation you want to use.

2. Select the LLM Method: Choose LLM as the evaluation type.

3. Configure the Evaluator Prompt: Write a custom evaluator prompt in the provided text box, or select one from your library.

- In our example, the evaluator checks whether the output follows a specific JSON structure (see the illustrative prompt below).
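
The exact evaluator prompt is up to you. As a rough illustration (the wording, key names, and PASS/FAIL convention below are assumptions for this example, not a required format), a JSON-structure check might ask something like:

```
You are evaluating the output of another prompt.
Check whether the output below is valid JSON and contains the keys
"category" and "sentiment".
Respond with PASS if it does; otherwise respond with FAIL and briefly
explain what is wrong.

Output to evaluate:
[prompt output inserted here]
```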

Use Case 3: Evaluations with Datasets
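
As described in the overview, dataset evaluations validate prompt outputs against target values from a dataset. Conceptually, the comparison works like the Python sketch below; the dataset rows, the exact-match check, and the get_prompt_output helper are hypothetical placeholders for illustration, not PromptHub's implementation or file format.

```python
# Conceptual sketch of a dataset evaluation: compare each output to its target value.
# The dataset rows and get_prompt_output() are hypothetical, for illustration only.
dataset = [
    {"input": "The app keeps crashing on login", "target": "Bug Report"},
    {"input": "Love the new dashboard!", "target": "Praise"},
    {"input": "Please add dark mode", "target": "Feature Request"},
]

def get_prompt_output(text: str) -> str:
    # Placeholder: in practice this would run your prompt against a model.
    return "Bug Report"

results = [get_prompt_output(row["input"]).strip() == row["target"] for row in dataset]
print(f"Pass rate: {sum(results)}/{len(results)}")  # e.g. "Pass rate: 1/3"
```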

Evaluations Tips & Troubleshooting

- Multiple Evaluators: You can run multiple evaluators at once by selecting more than one checkbox in the playground.

- Eval model config: The evaluator uses the same model configuration as the prompt you are testing, unless you configure an LLM evaluator with an evaluator prompt from your library.

Additional Information

- Availability: Evaluations are currently available on the Team and Enterprise tiers.

- Future Enhancements: More evaluation methods and configuration options are planned for future updates!
