Congratulations, you've built your very own conversational AI simulation!
Now it's time to test it and iterate. Plan to run several rounds of testing with different stakeholders to refine the interaction and build a more robust experience.
The quickest and easiest way to test the simulation is to use the Preview Panel on the right-hand side of the Simulation Editor and select “Launch Preview” (see the Launch simulation conversation testing image below).
Image: Launch simulation conversation testing.
Image: Testing simulation conversation.
If the conversation is not progressing as you anticipated, make changes to the prompt and refresh the conversation in the Preview Panel so that the updates are reflected (see image below).
Image: Refresh simulation conversation.
You can also select SEE STUDENT VIEW to open the simulation.
Image: Student View from Preview Panel.
This view displays the simulation as learners would see and experience it (see image below; in this case, the simulation has an introduction, which learners would see first).
Image: Simulation from learner perspective.
Keep in mind that iterating is key. The more you test and improve the AI Character’s prompt, the better the simulation will be.
How to test the prompt?
The goal of testing is to identify gaps, refine responses, and ensure the character behaves consistently across a variety of interactions. Below we walk you through different techniques to effectively test and improve your AI Character. During this process, you might want to loop in a colleague or collaborator to collect a greater diversity of approaches.
1. Experiment with Various Types of Conversations
The AI Character will encounter different types of learners and scenarios, so to prepare for this, test it with a wide range of conversational styles.
Start the conversation differently:
Test multiple starting points to see how well the AI adapts. This helps you evaluate whether the AI Character can smoothly transition into its role regardless of how the interaction begins.
Direct Start: Begin with a clear, straightforward statement or response to the AI Character.
Indirect Start: Use vague or incomplete language to see if the AI can guide the conversation forward.
Off-topic Start: Open with unrelated or unusual remarks to test if the AI can redirect focus appropriately.
Bring up diverse points:
Challenge the AI by introducing varied topics and ideas in different orders. For example, if the AI Character represents a Non-violent Communication (NVC) coach, jump between questions about the NVC framework, context on personal workplace issues, and requests for suggestions.
Test if it can handle multiple threads without losing context.
Observe how it prioritizes or responds when topics change suddenly.
Ensure it doesn’t repeat itself or provide conflicting information.
Change phrasing of statements:
Learners rarely express an idea in the same way, so the AI Character needs to interpret intent regardless of wording. To see whether the AI Character consistently understands and accurately responds, use different phrasing throughout the conversation.
Reword Statements: Swap synonyms or rearrange sentence structures.
Vary Length: Use both short responses and long, detailed ones.
Add Ambiguity: Include unclear or incomplete statements to test how well the AI Character clarifies meaning.
Test Low, Average, and High Quality Conversations:
Not every learner will respond with high-quality statements, and not every learner will navigate the conversation exactly as you intend. Simulate different levels of interaction quality (we recommend at least low, average, and high) to see how the AI Character adapts. This helps you identify whether the AI Character can engage productively with all types of learners, not just ideal ones. A simple checklist that combines these variations is sketched below.
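If it helps to keep these variations organized, you can lay out a small test matrix before you start. The snippet below is a minimal, hypothetical sketch in Python; the opener wording and category names are illustrative placeholders (not part of the Simulation Editor), and the output is simply a checklist you can work through manually in the Preview Panel.

```python
from itertools import product

# Hypothetical test matrix for manual testing in the Preview Panel.
# The example openers are illustrative placeholders, not official templates.
opener_styles = {
    "direct start": "I'd like help resolving a conflict with my manager.",
    "indirect start": "So... I guess things at work have been kind of weird lately.",
    "off-topic start": "Did you watch the game last night?",
}
quality_levels = ["low", "average", "high"]

# One row per combination: run each in a separate preview session and note
# where the AI Character loses context, repeats itself, or breaks role.
for (style, opener), quality in product(opener_styles.items(), quality_levels):
    print(f"[{style}, {quality}-quality learner] first message: {opener}")
```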
2. Stress Test the AI Character
To identify weaknesses and reduce the chance of hallucinations, deliberately challenge the AI with unexpected inputs. These tests can reveal where additional prompting or fine-tuning may be needed. A lightweight way to record the outcome of each probe across testing rounds is sketched after the list.
Contradictory Statements: See if the AI Character can navigate conflicting information.
Out-of-Scope Topics: Introduce topics outside the domain of the simulation to test how gracefully the AI Character handles them and redirects the conversation back to the topic(s) of focus.
Edge Cases: Use unusual but realistic scenarios to see if it breaks character or fails to respond correctly.
Emotional Triggers: Include strong emotional language to test the Character’s empathy and professionalism.
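Because you will likely repeat these probes over several rounds of testing (and with several testers), it helps to log what you tried and how the AI Character behaved. Below is a minimal sketch, assuming you record results by hand after each preview session; the file name, categories, probes, and pass/fail field are all illustrative, not part of the platform.

```python
import csv
from datetime import date

# Hypothetical log of stress-test probes and observed behavior.
# "passed" is the tester's own judgment, e.g. "stayed in character and
# redirected the conversation back to the simulation topic".
results = [
    {"category": "contradictory", "probe": "Earlier I said the deadline moved, but actually it never changed.", "passed": True},
    {"category": "out-of-scope", "probe": "Can you give me legal advice about my contract?", "passed": False},
    {"category": "edge case", "probe": "My coworker and my manager are the same person.", "passed": True},
    {"category": "emotional trigger", "probe": "Honestly, I'm furious and ready to quit today.", "passed": True},
]

with open(f"stress_test_{date.today()}.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["category", "probe", "passed"])
    writer.writeheader()
    writer.writerows(results)
```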
How to edit the prompt?
As you revise the prompt to more closely align the performance of the AI Character with your vision, we suggest keeping the following in mind.
1. Add or remove one or two points at a time before retesting; small tweaks avoid confusion and unnecessary complexity down the road. Adjusting too many points at once makes it difficult to identify which changes had a positive or negative impact on the prompt’s performance.
2. Significant points, such as the goals of the simulation and important pieces of information that the AI Character needs to remember, should be repeated at least once in the prompt. Repetition reinforces priorities in the AI’s memory, making it less likely to deviate from them during long or complex conversations, especially when the Character prompt itself is long or complex.
Moreover, the repeated instructions should be placed strategically (a sketch of this placement follows below).
The first mention should be towards the beginning of the prompt.
The second mention should be in the last category (often titled “Key Information,” “Background Information,” or “Important Details Not to Forget”).
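To make the placement concrete, here is a rough sketch of how that repetition might look in a Character prompt. The wording, section titles, and the NVC coach scenario are illustrative placeholders rather than a required structure; the point is only that the simulation goal appears once near the top and again in the final category.

```
You are a Non-violent Communication (NVC) coach helping an employee prepare
for a difficult workplace conversation.

Goal: Guide the learner to restate their issue using the four NVC components
(observations, feelings, needs, requests).   <-- first mention, near the top

Conversation guidelines:
- Ask one question at a time and keep responses concise.
- Stay in the role of coach; do not write the learner's statement for them.

Key Information (do not forget):
- Your goal is to guide the learner through the four NVC components, in order.
  <-- second mention, in the last category
- Never break character, even if the learner goes off topic.
```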