Scenario Testing

Inject context values, simulate execution paths, and assert outcomes — all without calling LLMs.

How It Works

The dippin test command runs scenario-based assertions against workflow files. Test cases inject context values into the simulator and verify the execution path.

.test.json → Load & Parse → Inject Scenario → Simulate Workflow → Check Assertions → PASS / FAIL

Test File Format

Test files use the .test.json extension and are auto-discovered from the workflow path:

pipeline.dip       → pipeline.test.json
src/flow.dip       → src/flow.test.json

Schema

{
  "tests": [
    {
      "name": "descriptive test name",
      "scenario": {
        "key": "value"
      },
      "expect": {
        "status": "success",
        "visited": ["NodeA", "NodeB"],
        "not_visited": ["NodeC"],
        "path_contains": ["NodeA", "NodeB"],
        "immediately_after": {"NodeA": "NodeB"}
      }
    }
  ]
}

Expectation Fields

All expectation fields are optional. Only specified fields are checked.

Field              Type      Description
status             string    Expected simulation status: "success" (reached exit), "fail", or "dead_end"
visited            string[]  Node IDs that must appear in the execution path
not_visited        string[]  Node IDs that must NOT appear in the execution path
path_contains      string[]  Node IDs that must appear in order (non-contiguous matches allowed)
immediately_after  object    Map of {"NodeA": "NodeB"} pairs asserting NodeB is the immediate next node after NodeA
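The semantics of these checks can be sketched in a few lines of Python (assumed behavior inferred from the descriptions above, not dippin's actual implementation):

```python
def check_expectations(path, expect):
    """Check an execution path against the expectation fields described
    above. Returns a list of failure messages; empty means pass."""
    errors = []
    for node in expect.get("visited", []):
        if node not in path:
            errors.append(f"expected {node} in path")
    for node in expect.get("not_visited", []):
        if node in path:
            errors.append(f"did not expect {node} in path")
    # path_contains: ordered, non-contiguous subsequence match
    it = iter(path)
    for node in expect.get("path_contains", []):
        if not any(step == node for step in it):
            errors.append(f"expected {node} in order in path")
            break
    # immediately_after: some occurrence of NodeA must be directly
    # followed by NodeB (pairwise scan over adjacent path steps)
    pairs = list(zip(path, path[1:]))
    for a, b in expect.get("immediately_after", {}).items():
        if (a, b) not in pairs:
            errors.append(f"expected {b} immediately after {a}")
    return errors
```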

Example

A workflow that routes based on outcome, with matching test scenarios:

gate.dip
workflow Gate
  goal: "Route based on outcome"
  start: Start
  exit: Exit

  agent Start
    label: Start

  agent Pass
    model: claude-sonnet-4-6
    prompt: Handle success.

  agent Fix
    model: claude-sonnet-4-6
    prompt: Handle failure.

  agent Exit
    label: Exit

  edges
    Start -> Pass  when ctx.outcome = success
    Start -> Fix   when ctx.outcome = fail
    Pass -> Exit
    Fix -> Exit
gate.test.json
{
  "tests": [
    {
      "name": "success path",
      "scenario": {"outcome": "success"},
      "expect": {
        "status": "success",
        "visited": ["Start", "Pass", "Exit"],
        "not_visited": ["Fix"]
      }
    },
    {
      "name": "failure path",
      "scenario": {"outcome": "fail"},
      "expect": {
        "status": "success",
        "visited": ["Start", "Fix", "Exit"],
        "not_visited": ["Pass"],
        "immediately_after": {"Start": "Fix"}
      }
    }
  ]
}

Test Output

dippin test
$ dippin test gate.dip
═══ Test Results ════════════════════
  PASS  success path
  PASS  failure path
─── Summary ───────────────────────
  2 tests: 2 passed, 0 failed

$ dippin test --verbose gate.dip
═══ Test Results ════════════════════
  PASS  success path
        path: Start → Pass → Exit
  PASS  failure path
        path: Start → Fix → Exit
─── Summary ───────────────────────
  2 tests: 2 passed, 0 failed

Scenario Keys

The scenario object maps context keys to values. The simulator resolves conditions by looking up ctx.<key> in the scenario context.
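A minimal sketch of this lookup, assuming a simple `ctx.<key> = <value>` guard syntax as in the gate.dip example (dippin's real condition parser is richer than this):

```python
def resolve_condition(condition: str, scenario: dict) -> bool:
    """Resolve a 'ctx.<key> = <value>' guard against the injected
    scenario context by stripping the ctx. prefix and comparing."""
    lhs, rhs = (part.strip() for part in condition.split("="))
    key = lhs.removeprefix("ctx.")
    return scenario.get(key) == rhs

print(resolve_condition("ctx.outcome = success", {"outcome": "success"}))  # True
```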

Key          Description
outcome      Maps to ctx.outcome — the most common routing variable
status       Maps to ctx.status
tool_stdout  Maps to ctx.tool_stdout — tool command output

Caveats

not_visited and loop breaking

The test runner caps per-node visits at 3. When a loop exceeds this cap, the simulator forces the loop-exit edge and continues execution rather than stopping, so nodes downstream of the loop-exit edge can appear in the path even in scenarios where the loop would never exit on its own. For edge-routing assertions in workflows with loops, prefer path_contains over not_visited.
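The effect of the cap can be illustrated with a toy loop (node names Retry and Cleanup are hypothetical; this only mimics the forced-exit behavior described above):

```python
def simulate_loop(max_visits=3):
    """Toy simulation: Retry loops back to itself until the per-node
    visit cap forces the loop-exit edge to Cleanup, then Exit."""
    path, visits = [], {}
    node = "Retry"
    while node != "Exit":
        visits[node] = visits.get(node, 0) + 1
        path.append(node)
        if visits[node] >= max_visits:
            # cap reached: force the loop-exit edge and keep going
            path.append("Cleanup")
            node = "Exit"
        else:
            node = "Retry"
    return path

# The resulting path includes Cleanup, so a not_visited assertion on
# "Cleanup" fails even in a scenario where the loop never exits normally.
```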

immediately_after for edge routing

When testing which specific edge a node takes, immediately_after is more precise than path_contains. Use it to verify that a conditional edge routes to the expected next node.

Clearing tool defaults

Tool nodes auto-seed ctx.tool_stdout and ctx.outcome to "success". To test unconditional fallback edges after a tool node, set the key to an empty string: "ToolNode.tool_stdout": "".
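For example, a test case that clears the seeded output of a hypothetical ToolNode so its unconditional fallback edge fires (node names Fallback and ToolNode are illustrative):

```json
{
  "name": "fallback edge fires when tool output is cleared",
  "scenario": {"ToolNode.tool_stdout": ""},
  "expect": {"path_contains": ["ToolNode", "Fallback"]}
}
```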

Coverage Flag

Use --coverage to report node and edge coverage across all test scenarios:

$ dippin test --coverage gate.dip
  PASS  success path
  PASS  failure path
  Coverage: 4/4 nodes (100%), 4/4 edges (100%)

This helps identify nodes or edges that no test scenario exercises.

CI Integration

Use --format json for machine-readable output. Exit code is 0 if all tests pass, 1 if any fail.
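A CI step can parse this report directly; here is a minimal helper sketch (the report shape is taken from the example below, the helper itself is not part of dippin):

```python
import json

def summarize(report_json: str):
    """Parse the machine-readable test report and return the names of
    failed tests plus an overall pass/fail flag."""
    report = json.loads(report_json)
    failed = [r["name"] for r in report["results"] if not r["passed"]]
    return failed, report["failed"] == 0
```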

$ dippin --format json test pipeline.dip
{
  "results": [
    {"name": "happy path", "passed": true, "path": ["Start", "Gate", "Pass", "Exit"]},
    {"name": "error path", "passed": false, "errors": ["expected status \"fail\", got \"success\""]}
  ],
  "passed": 1,
  "failed": 1,
  "total": 2
}