Testing Overview

Test agents deterministically with MockLLM, TestHarness, and StepDebugger.

Testing Tools

Agents are non-deterministic, so AgentForge ships utilities that pin down behavior for tests:

MockLLM — scripted LLM responses for reproducible tests
TestHarness — run agents and inspect results with rich assertions
StepDebugger — step through agent execution one step at a time
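
To make "scripted responses" concrete, here is a minimal sketch of the idea: a mock that plays back a fixed list of responses in order, so every run sees the same model output. `ScriptedLLM` and its `next()` method are hypothetical names used only for illustration. This is not AgentForge's MockLLM implementation, and throwing when the script runs out is a choice made for this sketch.

```typescript
// Hypothetical stand-in, NOT AgentForge's MockLLM: it only illustrates scripted playback.
type ScriptedToolCall = { name: string; input: Record<string, unknown> };
type ScriptedResponse = { text?: string; toolCalls?: ScriptedToolCall[] };

class ScriptedLLM {
  private cursor = 0;
  private readonly responses: ScriptedResponse[];

  constructor(responses: ScriptedResponse[]) {
    this.responses = responses;
  }

  // Return the next scripted response; fail loudly if the script runs out
  // (an assumption of this sketch, so a test never silently continues).
  next(): ScriptedResponse {
    const response = this.responses[this.cursor];
    if (response === undefined) {
      throw new Error(`Script exhausted after ${this.responses.length} responses`);
    }
    this.cursor += 1;
    return response;
  }
}

const scripted = new ScriptedLLM([
  { toolCalls: [{ name: 'lookup', input: { id: '123' } }] },
  { text: 'Found the item.' },
]);

const firstReply = scripted.next();
const secondReply = scripted.next();
console.log(firstReply.toolCalls?.[0]?.name); // lookup
console.log(secondReply.text); // Found the item.
```

Because the playback order is fixed, a test built on such a script produces the same sequence of tool calls and text on every run.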

Quick Start

import { createMockLLM, createTestHarness } from '@ahzan-agentforge/core';
import { describe, it, expect } from 'vitest';

describe('my agent', () => {
  it('should call the lookup tool', async () => {
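    // Scripted responses: first a call to the lookup tool, then a final text reply.
    const mockLLM = createMockLLM({
      responses: [
        { toolCalls: [{ name: 'lookup', input: { id: '123' } }] },
        { text: 'Found the item.' },
      ],
    });

    // myAgentConfig is your agent's configuration, defined elsewhere in your project.
    const harness = createTestHarness({
      agent: myAgentConfig,
      llm: mockLLM,
    });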

    const result = await harness.run({ task: 'Find item 123' });

    expect(result.status).toBe('completed');
    expect(result.toolCalls('lookup')).toHaveLength(1);
    expect(result.hasError()).toBe(false);
  });
});

Key Points

MockLLM responses use the text field (not content) for text output.
On a TestHarness result, toolCalls() and hasError() are methods, not properties, so call them with parentheses.
toolCalls(toolName?) filters by tool name; without an argument it returns all calls.
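
The method semantics above can be sketched with a small stand-in. `makeResult` and `RecordedCall` are hypothetical names invented for this example, and the real TestHarness result may carry more fields; the sketch only mirrors the documented behavior of toolCalls(toolName?) and hasError().

```typescript
// Hypothetical stand-in for a run result, NOT the real TestHarness result type.
type RecordedCall = { name: string; input: Record<string, unknown> };

function makeResult(calls: RecordedCall[], error?: Error) {
  return {
    // With a name, return only matching calls; with no argument, return all of them.
    toolCalls: (toolName?: string): RecordedCall[] =>
      toolName === undefined ? [...calls] : calls.filter((c) => c.name === toolName),
    // A method, so it must be invoked: sampleResult.hasError()
    hasError: (): boolean => error !== undefined,
  };
}

const sampleResult = makeResult([
  { name: 'lookup', input: { id: '123' } },
  { name: 'search', input: { query: 'item' } },
  { name: 'lookup', input: { id: '456' } },
]);

console.log(sampleResult.toolCalls().length); // 3
console.log(sampleResult.toolCalls('lookup').length); // 2
console.log(sampleResult.hasError()); // false
```

The parentheses matter in assertions: result.hasError without them is a function reference, so a check such as expect(result.hasError).toBe(false) fails even when the run succeeded.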

Next Steps

MockLLM — scripted LLM responses
TestHarness — test runner
StepDebugger — step-by-step debugging
Recipes — common test patterns
