
Create a Test Plan

A Test Plan defines a structured set of test cases for a given agent. It identifies the key characteristics to evaluate, defines the scenarios, and establishes the expected outcome for each case. The Test Plan ensures that all necessary conditions and criteria are covered during testing, providing a systematic approach to quality assurance.

How to Create a New Test Plan

Accessing Test Plans

Navigate to the Test Plan section via the sidebar. This will display a grid containing all existing test plans.

Creating a New Test Plan

To create a new test plan, click the "New" button. This action will redirect you to the Test Plan Creation Screen, where you can configure the details of your new test plan.

Test plan access on sidebar

Defining a Test Plan

In the Test Plan Creation Screen, you will need to define the following key elements:

Test Plan Name

Provide a unique name for the test plan to easily identify it.

Agent

Select the agent for which this test plan will be created. The agent is the entity whose behavior will be evaluated based on the defined test cases.

Characteristics

Characteristics are a set of conditions or criteria that will be evaluated. These conditions represent the expected behavior the agent should follow in each response scenario. Characteristics can be:

  • Qualitative: The result will be either "Passed" or "Failed", indicating whether the agent met the expected behavior.
  • Quantitative: The result will be numeric, e.g., a score like 70, representing a measurable outcome.
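The two kinds of characteristic can be pictured as a small data model. The following is a minimal sketch, not Agent Quality Studio's API: the names `CharacteristicKind`, `Characteristic`, and `format_result` are hypothetical and only illustrate how a qualitative result maps to "Passed"/"Failed" while a quantitative result stays numeric.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class CharacteristicKind(Enum):
    QUALITATIVE = "qualitative"    # outcome is Passed / Failed
    QUANTITATIVE = "quantitative"  # outcome is a number, e.g. 70


@dataclass(frozen=True)
class Characteristic:
    """Hypothetical model of a characteristic; not a product type."""
    name: str
    description: str
    kind: CharacteristicKind


def format_result(characteristic: Characteristic, raw: Union[bool, int, float]) -> str:
    """Render a raw evaluation outcome according to the characteristic kind."""
    if characteristic.kind is CharacteristicKind.QUALITATIVE:
        if not isinstance(raw, bool):
            raise TypeError("qualitative results must be a boolean")
        return "Passed" if raw else "Failed"
    # bool is a subclass of int, so reject it explicitly for numeric results
    if isinstance(raw, bool) or not isinstance(raw, (int, float)):
        raise TypeError("quantitative results must be numeric")
    return str(raw)


tone = Characteristic("Tone", "Keeps a polite tone", CharacteristicKind.QUALITATIVE)
relevance = Characteristic("Relevance", "Scores answer relevance", CharacteristicKind.QUANTITATIVE)
```

Here `format_result(tone, True)` yields "Passed" and `format_result(relevance, 70)` yields "70".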

Test Cases

Test cases are predefined inputs or scenarios that will be executed against the agent. They simulate potential conversations or messages the agent might encounter. Each test case can include an optional expected result, that is, a response or behavior that the agent should follow in order to pass the test. Each test case also has a code, a short identifier that makes the case easier to locate when analyzing the results.
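As an illustration, a test case can be thought of as a record with a code, an input, and an optional expected result. This is a sketch under stated assumptions: `CaseDefinition` and `index_by_code` are hypothetical names, and rejecting duplicate codes is an assumption made here so that each code identifies exactly one case.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class CaseDefinition:
    """Hypothetical model of a test case; not a product type."""
    code: str                              # short identifier used when reading results
    input_message: str                     # the scenario sent to the agent
    expected_result: Optional[str] = None  # optional expected response or behavior


def index_by_code(cases: List[CaseDefinition]) -> Dict[str, CaseDefinition]:
    """Build a lookup by code (assumption: codes are unique within a plan)."""
    index: Dict[str, CaseDefinition] = {}
    for case in cases:
        if case.code in index:
            raise ValueError(f"duplicate test case code: {case.code}")
        index[case.code] = case
    return index


cases = [
    CaseDefinition("TC-001", "What are your opening hours?", "States the opening hours"),
    CaseDefinition("TC-002", "Tell me a joke about tax law."),  # no expected result
]
by_code = index_by_code(cases)
```

Looking up `by_code["TC-002"]` returns the second case, whose `expected_result` is `None` because it was left unset.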

Input Parameters

Differences Between Agent Types
To design a test plan, it is essential to define the test cases to be evaluated. Each test case represents an execution of the agent for a specific scenario, which varies depending on the type of agent.

  • Activity, AIProxy, Chat, Plan
    For these agents, the scenario is determined by the input parameters. If the agent has required input parameters, they must be specified for each test case. Generally, test cases will differ based on these parameters, as they represent different evaluation scenarios.

    Defining system agent test case

  • Assistant, Copilot
    For these agents, input parameters are optional, since the scenario is given by the definition of the test case (i.e., the message to which the agent must respond). Although input parameters are not required to establish the cases for this type of agent, Agent Quality Studio lets you configure them so that the same input parameter configuration can be reused across different test plan executions.

    As in the test plan execution screen, you can configure global parameters and override them only in the test cases that need different values.

    Defining conversational agent test case
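The relationship between global parameters and per-case overrides can be sketched as a dictionary merge. This is an illustrative sketch only: `resolve_parameters`, the parameter names, and the case codes are hypothetical and do not describe how Agent Quality Studio stores its configuration.

```python
from typing import Any, Dict


def resolve_parameters(global_params: Dict[str, Any], case_overrides: Dict[str, Any]) -> Dict[str, Any]:
    """Return the effective parameters for one test case.

    Values in case_overrides win; everything else falls back to global_params.
    """
    return {**global_params, **case_overrides}


# Hypothetical parameter names and values, for illustration only.
global_params = {"language": "en", "customer_tier": "standard"}
overrides_by_case = {
    "TC-001": {},                            # uses the global values as-is
    "TC-002": {"customer_tier": "premium"},  # overrides a single parameter
}

effective = {
    code: resolve_parameters(global_params, overrides)
    for code, overrides in overrides_by_case.items()
}
```

With this layout, only the cases that need a different value carry an override, and a change to a global value reaches every case that does not override it.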

Important
The agent version used by Agent Quality Studio to build the input parameter form is the version that is published at the time the test plan is created. However, when creating a test plan run, a specific version of the agent can be selected, regardless of its status (draft, unpublished, or published). When the selected version differs from the one used to create the test plan, a smart merge is performed between both input parameter settings, ensuring that the values set for the parameters shared between the agent versions are preserved.
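One way to picture the merge is shown below. The source only states that values of shared parameters are preserved. How parameters that exist in just one version are treated is an assumption of this sketch, and `merge_parameters` is a hypothetical helper, not the product's implementation.

```python
from typing import Any, Dict


def merge_parameters(saved_values: Dict[str, Any], selected_version_defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Keep saved values for parameters that both versions share.

    Assumption: parameters that exist only in the selected version take that
    version's default, and parameters missing from it are dropped.
    """
    return {
        name: saved_values.get(name, default)
        for name, default in selected_version_defaults.items()
    }


# Hypothetical values: what was set when the plan was created...
saved = {"language": "es", "region": "LATAM"}
# ...and the parameters (with defaults) of the version chosen for the run.
selected_defaults = {"language": "en", "channel": "web"}

merged = merge_parameters(saved, selected_defaults)
```

In this example `language` is shared, so its saved value "es" survives the merge, while `channel` is new and `region` no longer applies.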

Defining a test plan

Dimensions

Characteristics and Test Cases are grouped into four distinct dimensions, each designed to evaluate different aspects of the agent's behavior:

  1. General: This dimension assesses the overall performance of the agent. It focuses on how well the agent engages with users and whether its responses meet basic expectations such as clarity, tone, and language consistency.

    • Examples: Ensuring that the agent provides understandable responses, maintains an appropriate tone, or consistently responds in the language used by the user. Additionally, all test cases defined in other dimensions will also be evaluated under this dimension, providing a holistic view of the agent's general behavior.
  2. Provided Information: This dimension evaluates the accuracy and relevance of the agent's responses based on the information explicitly provided in its definition.

    • Examples: Ensuring that the agent accurately answers questions about the services or features it has explicitly defined.
  3. Inferred Information: In this dimension, the focus is on the agent’s ability to infer or deduce information that was not explicitly stated in its definition. It evaluates how well the agent can handle incomplete information and make reasonable assumptions.

    • Examples: Ensuring that the agent can infer missing details or make logical assumptions when a user provides incomplete or indirect information.
  4. Out of context: This dimension assesses the agent’s ability to handle unexpected or irrelevant queries, such as abrupt changes in topic or nonsensical requests. It tests how gracefully the agent navigates situations that are beyond its scope or context.

    • Examples: Ensuring that the agent appropriately handles out-of-scope questions by informing the user it cannot provide answers on the topic.
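The grouping above, including the rule that test cases from the other dimensions are also evaluated under General, can be sketched as follows. The `Dimension` enum, `cases_evaluated_under`, and the case codes are hypothetical names chosen for illustration.

```python
from enum import Enum
from typing import Dict, List


class Dimension(Enum):
    GENERAL = "General"
    PROVIDED_INFORMATION = "Provided Information"
    INFERRED_INFORMATION = "Inferred Information"
    OUT_OF_CONTEXT = "Out of context"


def cases_evaluated_under(dimension: Dimension, cases_by_dimension: Dict[Dimension, List[str]]) -> List[str]:
    """List the test case codes evaluated under a dimension.

    General also covers the cases defined in every other dimension.
    """
    own = list(cases_by_dimension.get(dimension, []))
    if dimension is not Dimension.GENERAL:
        return own
    others = [
        code
        for dim in Dimension  # iterates in definition order
        if dim is not Dimension.GENERAL
        for code in cases_by_dimension.get(dim, [])
    ]
    return own + others


# Hypothetical case codes, one per dimension.
plan_cases = {
    Dimension.GENERAL: ["GEN-1"],
    Dimension.PROVIDED_INFORMATION: ["PRO-1"],
    Dimension.INFERRED_INFORMATION: ["INF-1"],
    Dimension.OUT_OF_CONTEXT: ["OOC-1"],
}
```

Calling `cases_evaluated_under(Dimension.GENERAL, plan_cases)` returns all four codes, whereas any other dimension returns only its own cases.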

Finishing the Test Plan

Once you have defined at least one characteristic and one test case for each dimension, you can proceed with finalizing your test plan. Follow these steps:

  1. Review: Ensure that each dimension has been populated with the necessary characteristics and test cases. Verify that the scenarios are relevant and the expected outcomes are clear.

  2. Create Test Plan: After reviewing, click the "Create" button to complete the process. The test plan will now be available for execution within the test management system. You can revisit and update the test plan at any time if changes are required.
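The review step can be approximated with a small completeness check that mirrors the rule of at least one characteristic and one test case per dimension. This is a sketch with a hypothetical plan layout (`draft_plan`) and helper (`missing_items`); it does not reflect how the tool validates plans internally.

```python
from typing import Dict, List

DIMENSIONS = ("General", "Provided Information", "Inferred Information", "Out of context")


def missing_items(plan: Dict[str, Dict[str, List[str]]]) -> List[str]:
    """Describe what each dimension still needs before the plan can be created."""
    problems = []
    for dimension in DIMENSIONS:
        content = plan.get(dimension, {})
        if not content.get("characteristics"):
            problems.append(f"{dimension}: add at least one characteristic")
        if not content.get("test_cases"):
            problems.append(f"{dimension}: add at least one test case")
    return problems


# Hypothetical draft: one dimension is incomplete and one is missing entirely.
draft_plan = {
    "General": {"characteristics": ["Clear tone"], "test_cases": ["GEN-1"]},
    "Provided Information": {"characteristics": ["Accurate"], "test_cases": ["PRO-1"]},
    "Inferred Information": {"characteristics": ["Reasonable inference"], "test_cases": []},
}

problems = missing_items(draft_plan)
```

An empty list from `missing_items` means every dimension meets the minimum. For the draft above, the list names the missing test case in Inferred Information and both missing items in Out of context.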

Tips for Creating Effective Test Plans

  • Start Simple: Begin with basic test cases that validate essential agent behaviors before adding more complex scenarios.
  • Balance Qualitative and Quantitative Characteristics: Use a mix of qualitative (pass/fail) and quantitative (numerical) characteristics to get both high-level insights and detailed performance metrics.
  • Think Like the User: When defining test cases, consider various user perspectives, including edge cases, common queries, and unexpected behaviors.
  • Iterate and Improve: Test plans are not static. Continuously refine your test cases based on feedback and results from prior executions.
  • Leverage Real-World Scenarios: Use real conversations or interactions from users as test cases to ensure the agent behaves as expected in practical use cases.
  • Prioritize Critical Behaviors: Focus on defining test cases for critical functionalities or behaviors that are most important to your agent’s success.