Create a Test Plan
A Test Plan defines a structured set of test cases for a given agent. It identifies the key characteristics to evaluate, the scenarios to run, and the expected outcome of each case, so that testing covers all necessary conditions and criteria in a systematic way.
How to Create a New Test Plan
Accessing Test Plans:
Navigate to the Test Plan section via the sidebar. This will display a grid containing all existing test plans.
Creating a New Test Plan:
To create a new test plan, click the "New" button. This opens the Test Plan Creation Screen, where you can configure the details of your new test plan.
Defining a Test Plan
In the Test Plan Creation Screen, you will need to define the following key elements:
- Test Plan Name: Provide a unique name so the test plan is easy to identify.
- Agent: Select the agent for which this test plan will be created. The agent is the entity whose behavior will be evaluated based on the defined test cases. At the moment, only assistant agents are supported.
- Characteristics: The conditions or criteria that will be evaluated. They represent the behavior the agent is expected to follow in each response scenario. Characteristics can be:
  - Qualitative: The result is either "Passed" or "Failed", indicating whether the agent met the expected behavior.
  - Quantitative: The result is numeric (for example, a score of 70), representing a measurable outcome.
- Test Cases: Predefined inputs or scenarios that will be executed against the agent. They simulate conversations or messages the agent might encounter. Each test case can include an optional expected result: a response or behavior that the agent should follow in order to pass the test.
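The four elements above can be pictured as a small data model. The following is only an illustrative sketch: test plans are configured through the Test Plan Creation Screen, and the class names, field names, and sample values here (`DraftTestPlan`, `DraftTestCase`, `support-assistant`, and so on) are hypothetical, not part of the product.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CharacteristicKind(Enum):
    QUALITATIVE = "qualitative"    # result is "Passed" or "Failed"
    QUANTITATIVE = "quantitative"  # result is numeric, e.g. a score of 70


@dataclass
class Characteristic:
    description: str
    kind: CharacteristicKind


@dataclass
class DraftTestCase:
    user_input: str
    expected_result: Optional[str] = None  # the expected result is optional


@dataclass
class DraftTestPlan:
    # Hypothetical model of the fields on the creation screen.
    name: str
    agent: str
    characteristics: List[Characteristic] = field(default_factory=list)
    test_cases: List[DraftTestCase] = field(default_factory=list)


plan = DraftTestPlan(
    name="Support assistant smoke tests",
    agent="support-assistant",
    characteristics=[
        Characteristic("Responds in the user's language", CharacteristicKind.QUALITATIVE),
        Characteristic("Answer completeness score", CharacteristicKind.QUANTITATIVE),
    ],
    test_cases=[
        DraftTestCase("What are your opening hours?", expected_result="States the opening hours"),
        DraftTestCase("Hola, ¿me puedes ayudar?"),  # no expected result given
    ],
)
```

Modeling the expected result as optional mirrors the description above: a test case without one is still valid and is judged only against the characteristics.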
Dimensions
Characteristics and Test Cases are grouped into four distinct dimensions, each designed to evaluate different aspects of the agent's behavior:
- General: This dimension assesses the overall performance of the agent. It focuses on how well the agent engages with users and whether its responses meet basic expectations such as clarity, tone, and language consistency. All test cases defined in the other dimensions are also evaluated under this dimension, providing a holistic view of the agent's general behavior.
  - Examples: Ensuring that the agent provides understandable responses, maintains an appropriate tone, or consistently responds in the language used by the user.
- Provided Information: This dimension evaluates the accuracy and relevance of the agent's responses based on the information explicitly provided in its definition.
  - Examples: Ensuring that the agent accurately answers questions about the services or features explicitly described in its definition.
- Inferred Information: This dimension focuses on the agent’s ability to infer or deduce information that was not explicitly stated in its definition. It evaluates how well the agent can handle incomplete information and make reasonable assumptions.
  - Examples: Ensuring that the agent can infer missing details or make logical assumptions when a user provides incomplete or indirect information.
- Out of context: This dimension assesses the agent’s ability to handle unexpected or irrelevant queries, such as abrupt changes in topic or nonsensical requests. It tests how gracefully the agent navigates situations that are beyond its scope or context.
  - Examples: Ensuring that the agent appropriately handles out-of-scope questions by informing the user it cannot provide answers on the topic.
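The rule that the General dimension also evaluates test cases from the other dimensions can be sketched as follows. This is a minimal illustration under assumptions: the `Dimension` enum, the `cases_evaluated_under` helper, and the sample messages are hypothetical, and the sketch makes no claim about how the product implements the rule internally.

```python
from enum import Enum


class Dimension(Enum):
    GENERAL = "General"
    PROVIDED_INFORMATION = "Provided Information"
    INFERRED_INFORMATION = "Inferred Information"
    OUT_OF_CONTEXT = "Out of context"


def cases_evaluated_under(dimension, cases_by_dimension):
    """Return the test cases evaluated under `dimension`.

    General also covers every test case defined in the other dimensions;
    each other dimension covers only its own test cases.
    """
    own = list(cases_by_dimension.get(dimension, []))
    if dimension is not Dimension.GENERAL:
        return own
    others = [
        case
        for dim in Dimension
        if dim is not Dimension.GENERAL
        for case in cases_by_dimension.get(dim, [])
    ]
    return own + others


# Hypothetical sample test cases, one per dimension.
cases = {
    Dimension.GENERAL: ["Hi there"],
    Dimension.PROVIDED_INFORMATION: ["What plans do you offer?"],
    Dimension.INFERRED_INFORMATION: ["I need something cheaper"],
    Dimension.OUT_OF_CONTEXT: ["Who won the match last night?"],
}

general_cases = cases_evaluated_under(Dimension.GENERAL, cases)
provided_cases = cases_evaluated_under(Dimension.PROVIDED_INFORMATION, cases)
```

Here `general_cases` holds all four messages, while `provided_cases` holds only the one defined for Provided Information.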
Finishing the Test Plan
Once you have defined at least one characteristic and one test case for each dimension, you can proceed with finalizing your test plan. Follow these steps:
- Review: Ensure that each dimension has been populated with the necessary characteristics and test cases. Verify that the scenarios are relevant and the expected outcomes are clear.
- Create Test Plan: After reviewing, click the "Create" button to complete the process. The test plan will then be available for execution within the test management system. You can revisit and update the test plan at any time if changes are required.
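The completeness requirement (at least one characteristic and one test case for each dimension) can be expressed as a simple review checklist. The sketch below is hypothetical: the dictionary layout and the `missing_items` helper are assumptions made for illustration and do not correspond to any product API.

```python
DIMENSIONS = (
    "General",
    "Provided Information",
    "Inferred Information",
    "Out of context",
)


def missing_items(plan):
    """List what each dimension still needs before the plan is complete."""
    problems = []
    for dimension in DIMENSIONS:
        entry = plan.get(dimension, {})
        if not entry.get("characteristics"):
            problems.append(f"{dimension}: add at least one characteristic")
        if not entry.get("test_cases"):
            problems.append(f"{dimension}: add at least one test case")
    return problems


# Hypothetical draft: the last dimension has no test case yet.
draft = {
    "General": {
        "characteristics": ["Responds in the user's language"],
        "test_cases": ["Hi there"],
    },
    "Provided Information": {
        "characteristics": ["Answers match the agent definition"],
        "test_cases": ["What plans do you offer?"],
    },
    "Inferred Information": {
        "characteristics": ["Makes reasonable assumptions"],
        "test_cases": ["I need something cheaper"],
    },
    "Out of context": {
        "characteristics": ["Declines out-of-scope questions"],
        "test_cases": [],
    },
}

problems = missing_items(draft)
```

For this draft, `problems` contains a single entry pointing at the missing test case in the last dimension; an empty list means every dimension meets the minimum.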
Tips for Creating Effective Test Plans
- Start Simple: Begin with basic test cases that validate essential agent behaviors before adding more complex scenarios.
- Balance Qualitative and Quantitative Characteristics: Use a mix of qualitative (pass/fail) and quantitative (numerical) characteristics to get both high-level insights and detailed performance metrics.
- Think Like the User: When defining test cases, consider various user perspectives, including edge cases, common queries, and unexpected behaviors.
- Iterate and Improve: Test plans are not static. Continuously refine your test cases based on feedback and results from prior executions.
- Leverage Real-World Scenarios: Use real conversations or interactions from users as test cases to ensure the agent behaves as expected in practical use cases.
- Prioritize Critical Behaviors: Focus on defining test cases for critical functionalities or behaviors that are most important to your agent’s success.