Scenario Testing

Inject context values, simulate execution paths, and assert outcomes — all without calling LLMs.

How It Works

The dippin test command runs scenario-based assertions against workflow files. Test cases inject context values into the simulator and verify the execution path.

.test.json → Load & Parse → Inject Scenario → Simulate Workflow → Check Assertions → PASS / FAIL

Test File Format

Test files use the .test.json extension and are auto-discovered from the workflow path:

pipeline.dip       → pipeline.test.json
src/flow.dip       → src/flow.test.json

Schema

{
  "tests": [
    {
      "name": "descriptive test name",
      "scenario": {
        "key": "value"
      },
      "expect": {
        "status": "success",
        "visited": ["NodeA", "NodeB"],
        "not_visited": ["NodeC"],
        "path_contains": ["NodeA", "NodeB"],
        "immediately_after": {"NodeA": "NodeB"}
      }
    }
  ]
}

Expectation Fields

All expectation fields are optional. Only specified fields are checked.

Field              Type      Description
status             string    Expected simulation status: "success" (reached exit), "fail", or "dead_end"
visited            string[]  Node IDs that must appear in the execution path
not_visited        string[]  Node IDs that must NOT appear in the execution path
path_contains      string[]  Node IDs that must appear in order (non-contiguous matches allowed)
immediately_after  object    Map of {"NodeA": "NodeB"} pairs asserting NodeB is the immediate next node after NodeA
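The semantics of these checks can be sketched in a few lines of Python (assumed behavior inferred from the descriptions above, not dippin's actual implementation):

```python
def check_expectations(path, expect):
    """Check an execution path against the expectation fields described
    above. Returns a list of failure messages; empty means pass."""
    errors = []
    for node in expect.get("visited", []):
        if node not in path:
            errors.append(f"expected {node} in path")
    for node in expect.get("not_visited", []):
        if node in path:
            errors.append(f"did not expect {node} in path")
    # path_contains: ordered, non-contiguous subsequence match
    it = iter(path)
    for node in expect.get("path_contains", []):
        if not any(step == node for step in it):
            errors.append(f"expected {node} in order in path")
            break
    # immediately_after: some occurrence of NodeA must be directly
    # followed by NodeB (pairwise scan over adjacent path steps)
    pairs = list(zip(path, path[1:]))
    for a, b in expect.get("immediately_after", {}).items():
        if (a, b) not in pairs:
            errors.append(f"expected {b} immediately after {a}")
    return errors
```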

Example

A workflow that routes based on outcome, with matching test scenarios:

gate.dip
workflow Gate
  goal: "Route based on outcome"
  start: Start
  exit: Exit

  agent Start
    label: Start

  agent Pass
    model: claude-sonnet-4-6
    prompt: Handle success.

  agent Fix
    model: claude-sonnet-4-6
    prompt: Handle failure.

  agent Exit
    label: Exit

  edges
    Start -> Pass  when ctx.outcome = success
    Start -> Fix   when ctx.outcome = fail
    Pass -> Exit
    Fix -> Exit
gate.test.json
{
  "tests": [
    {
      "name": "success path",
      "scenario": {"outcome": "success"},
      "expect": {
        "status": "success",
        "visited": ["Start", "Pass", "Exit"],
        "not_visited": ["Fix"]
      }
    },
    {
      "name": "failure path",
      "scenario": {"outcome": "fail"},
      "expect": {
        "status": "success",
        "visited": ["Start", "Fix", "Exit"],
        "not_visited": ["Pass"],
        "immediately_after": {"Start": "Fix"}
      }
    }
  ]
}

Test Output

dippin test
$ dippin test gate.dip
═══ Test Results ════════════════════
  PASS  success path
  PASS  failure path
─── Summary ───────────────────────
  2 tests: 2 passed, 0 failed

$ dippin test --verbose gate.dip
═══ Test Results ════════════════════
  PASS  success path
        path: Start → Pass → Exit
  PASS  failure path
        path: Start → Fix → Exit
─── Summary ───────────────────────
  2 tests: 2 passed, 0 failed

Scenario Keys

The scenario object maps context keys to values. The simulator resolves conditions by looking up ctx.<key> in the scenario context.
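A minimal sketch of this lookup, assuming a simple `ctx.<key> = <value>` guard syntax as in the gate.dip example (dippin's real condition parser is richer than this):

```python
def resolve_condition(condition: str, scenario: dict) -> bool:
    """Resolve a 'ctx.<key> = <value>' guard against the injected
    scenario context by stripping the ctx. prefix and comparing."""
    lhs, rhs = (part.strip() for part in condition.split("="))
    key = lhs.removeprefix("ctx.")
    return scenario.get(key) == rhs

print(resolve_condition("ctx.outcome = success", {"outcome": "success"}))  # True
```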

Key          Description
outcome      Maps to ctx.outcome — the most common routing variable
status       Maps to ctx.status
tool_stdout  Maps to ctx.tool_stdout — tool command output

Caveats

not_visited and loop breaking

The test runner caps per-node visits at 3. When a loop exceeds this cap, the simulator forces the loop-exit edge and continues execution rather than stopping, so nodes downstream of the loop-exit edge can appear in the path even in scenarios where the loop would never exit on its own. For edge-routing assertions in workflows with loops, prefer path_contains over not_visited.
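The effect of the cap can be illustrated with a toy loop (node names Retry and Cleanup are hypothetical; this only mimics the forced-exit behavior described above):

```python
def simulate_loop(max_visits=3):
    """Toy simulation: Retry loops back to itself until the per-node
    visit cap forces the loop-exit edge to Cleanup, then Exit."""
    path, visits = [], {}
    node = "Retry"
    while node != "Exit":
        visits[node] = visits.get(node, 0) + 1
        path.append(node)
        if visits[node] >= max_visits:
            # cap reached: force the loop-exit edge and keep going
            path.append("Cleanup")
            node = "Exit"
        else:
            node = "Retry"
    return path

# The resulting path includes Cleanup, so a not_visited assertion on
# "Cleanup" fails even in a scenario where the loop never exits normally.
```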

immediately_after for edge routing

When testing which specific edge a node takes, immediately_after is more precise than path_contains. Use it to verify that a conditional edge routes to the expected next node.

Clearing tool defaults

Tool nodes auto-seed ctx.tool_stdout and ctx.outcome to "success". To test unconditional fallback edges after a tool node, set the key to an empty string: "ToolNode.tool_stdout": "".
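For example, a test case that clears the seeded output of a hypothetical ToolNode so its unconditional fallback edge fires (node names Fallback and ToolNode are illustrative):

```json
{
  "name": "fallback edge fires when tool output is cleared",
  "scenario": {"ToolNode.tool_stdout": ""},
  "expect": {"path_contains": ["ToolNode", "Fallback"]}
}
```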

Coverage Flag

Use --coverage to report node and edge coverage across all test scenarios:

$ dippin test --coverage gate.dip
  PASS  success path
  PASS  failure path
  Coverage: 4/4 nodes (100%), 4/4 edges (100%)

This helps identify nodes or edges that no test scenario exercises.

CI Integration

Use --format json for machine-readable output. Exit code is 0 if all tests pass, 1 if any fail.
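A CI step can parse this report directly; here is a minimal helper sketch (the report shape is taken from the example below, the helper itself is not part of dippin):

```python
import json

def summarize(report_json: str):
    """Parse the machine-readable test report and return the names of
    failed tests plus an overall pass/fail flag."""
    report = json.loads(report_json)
    failed = [r["name"] for r in report["results"] if not r["passed"]]
    return failed, report["failed"] == 0
```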

$ dippin --format json test pipeline.dip
{
  "results": [
    {"name": "happy path", "passed": true, "path": ["Start", "Gate", "Pass", "Exit"]},
    {"name": "error path", "passed": false, "errors": ["expected status \"fail\", got \"success\""]}
  ],
  "passed": 1,
  "failed": 1,
  "total": 2
}