forge-lcdl

Decomposition graph (`forge_lcdl.graph`)

Purpose

Small LLMs and constrained prompts often produce graphs of small steps (nodes with dependencies) instead of one-shot “mega” outputs. The forge_lcdl.graph package is an optional orchestration shell: validate a DAG, track per-node status, parse plan_decision_pack / draft_pack-shaped JSON into LcdlGraph, and run nodes sequentially in deterministic order with a user-supplied run_node callable.

It does not replace forge_lcdl.operators (seq, fallback_chain, etc.) or run_task. A future consumer can call run_task from inside run_node when wiring to real tasks.

Relationship to DecisionPack / draft_pack

The task plan_decision_pack (and helpers like run_plan_decision_pack_v1) targets objects with schema_version == 2 whose nodes field maps each node id to an object whose type is one of kernel, rank, invoke_pack, or terminal (plus optional fields). Cheap models may omit task_id or depends_on.

Use parse_decision_pack(obj) to accept either:

  • a wrapper {"draft_pack": {...}}, or
  • the raw pack dict.

Parsing is tolerant where possible: missing depends_on becomes (), input merges with static_payload, terminal nodes get task_id lcdl.terminal, and metadata.default_task_id can supply a default for kernel-like nodes when task_id is absent.
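The tolerant rules above can be sketched in a few lines of plain Python. This is an illustration only, not the code behind parse_decision_pack: the helper names unwrap_pack and normalize_node are hypothetical, and the merge order (input keys overriding static_payload keys on conflict) is an assumption the text does not settle.

```python
from typing import Any, Dict, Mapping, Optional

TERMINAL_TASK_ID = "lcdl.terminal"


def unwrap_pack(obj: Mapping[str, Any]) -> Mapping[str, Any]:
    """Accept either {"draft_pack": {...}} or the raw pack dict."""
    return obj["draft_pack"] if "draft_pack" in obj else obj


def normalize_node(
    raw: Mapping[str, Any], default_task_id: Optional[str] = None
) -> Dict[str, Any]:
    """Hypothetical helper: fill in the fields a cheap model may have omitted."""
    if raw.get("type") == "terminal":
        # Terminal nodes always get the reserved task id.
        task_id = TERMINAL_TASK_ID
    else:
        # Fall back to the pack-level default when task_id is absent.
        task_id = raw.get("task_id") or default_task_id
    # Assumption: input wins over static_payload when both define a key.
    merged = {**(raw.get("static_payload") or {}), **(raw.get("input") or {})}
    return {
        "task_id": task_id,
        "depends_on": tuple(raw.get("depends_on") or ()),  # missing -> ()
        "input": merged,
    }


node = normalize_node({"type": "terminal", "static_payload": {"done": True}})
print(node["task_id"], node["depends_on"], node["input"])
```

A kernel node without task_id picks up the default, for example normalize_node({"type": "kernel"}, default_task_id="llm_boolean_gate").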

API overview

| Piece | Role |
| --- | --- |
| LcdlNode, LcdlGraph, NodeStatus | Frozen dataclasses and statuses: pending, running, ok, err, blocked, skipped |
| validate_graph, ready_nodes, mark_* | Pure validation and immutable transitions |
| GraphExecutor | Loop: ready_nodes → first ready (sorted by node_id) → run_node or terminal short-circuit |
| GraphExecutionPolicy | stop_on_first_error (default) vs continue after an error |
| parse_decision_pack | JSON/dict → LcdlGraph; raises DecisionPackParseError on bad shape |

Dependency and blocking behavior

  • ready_nodes only schedules nodes whose every dependency is ok. A dependency in any other state (err, blocked, skipped, or pending) does not unblock its children. The MVP uses lazy blocking: children stay pending when a parent is err or blocked.
  • graph_is_complete: every node is ok, err, blocked, or skipped.
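The two rules above can be modelled over plain dicts. The sketch below is not the package's implementation: ready_nodes_sketch and graph_is_complete_sketch are hypothetical names, the graph is reduced to a dependency map plus a status map, and sorting the ready ids is an assumption made here to keep the output deterministic.

```python
from typing import Dict, Tuple

FINISHED = {"ok", "err", "blocked", "skipped"}


def ready_nodes_sketch(
    deps: Dict[str, Tuple[str, ...]], status: Dict[str, str]
) -> Tuple[str, ...]:
    """Ids of pending nodes whose every dependency is ok."""
    return tuple(
        sorted(
            nid
            for nid, parents in deps.items()
            if status[nid] == "pending"
            and all(status[p] == "ok" for p in parents)
        )
    )


def graph_is_complete_sketch(status: Dict[str, str]) -> bool:
    """True when no node is still pending or running."""
    return all(s in FINISHED for s in status.values())


deps = {"a": (), "b": ("a",), "c": ("b",)}
status = {"a": "pending", "b": "pending", "c": "pending"}
print(ready_nodes_sketch(deps, status))  # ('a',)

# Lazy blocking: once "a" fails, "b" and "c" stay pending and never become ready.
status["a"] = "err"
print(ready_nodes_sketch(deps, status))  # ()
print(graph_is_complete_sketch(status))  # False
```

In this model a failed parent leaves its descendants pending, so the completeness check stays false until something marks them blocked or skipped.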

Terminal nodes (lcdl.terminal)

Nodes with task_id == "lcdl.terminal" (including parsed type: "terminal") are not passed to run_node. The executor marks them running then ok with result = node.input (merged static payload / input).
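A minimal sketch of that short-circuit, assuming a node is a plain dict and run_node takes the node as its only argument (both are simplifications for illustration; the real LcdlNode and run_node signature may differ). The function execute_one is hypothetical.

```python
from typing import Any, Callable, Dict, List, Tuple

TERMINAL_TASK_ID = "lcdl.terminal"


def execute_one(
    node: Dict[str, Any], run_node: Callable[[Dict[str, Any]], Any]
) -> Tuple[List[str], Any]:
    """Return the status trace and the result for a single node."""
    trace = ["running"]
    if node["task_id"] == TERMINAL_TASK_ID:
        # Terminal nodes never reach run_node; the result is the node's input.
        result = node["input"]
    else:
        result = run_node(node)
    trace.append("ok")
    return trace, result


def must_not_be_called(node: Dict[str, Any]) -> Any:
    raise AssertionError("run_node should not be called for terminal nodes")


terminal = {"task_id": TERMINAL_TASK_ID, "input": {"done": True}}
print(execute_one(terminal, must_not_be_called))  # (['running', 'ok'], {'done': True})
```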

Example: three-node draft_pack-compatible JSON

{
  "schema_version": 2,
  "pack_id": "example-3",
  "start": "a",
  "nodes": {
    "a": {
      "type": "kernel",
      "task_id": "llm_boolean_gate",
      "version": "v1",
      "input": {"facts": {}, "question": "Is this feasible?"},
      "depends_on": []
    },
    "b": {
      "type": "kernel",
      "task_id": "llm_enum_route",
      "version": "v1",
      "input": {"labels": ["path_a", "path_b"], "text": "choose"},
      "depends_on": ["a"]
    },
    "c": {
      "type": "terminal",
      "depends_on": ["b"],
      "static_payload": {"done": true}
    }
  },
  "metadata": {"title": "Three-step pipeline"}
}

After g = parse_decision_pack(pack_or_wrapper) and validate_graph(g), a GraphExecutor runs a then b in order; c completes automatically with result equal to the terminal node’s input (here roughly {"done": true}).
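To make the expected order concrete without depending on the package, the following self-contained sketch replays the same walk over a trimmed copy of the pack. It does not import forge_lcdl: run_pack_sketch and stub_run_node are hypothetical, the (node_id, node) call shape for run_node is an assumption, and treating a raised exception as err is one plausible reading of stop_on_first_error rather than documented behavior.

```python
from typing import Any, Callable, Dict, List, Tuple

# Trimmed Python form of the three-node pack above.
pack = {
    "nodes": {
        "a": {"type": "kernel", "task_id": "llm_boolean_gate",
              "input": {"facts": {}, "question": "Is this feasible?"},
              "depends_on": []},
        "b": {"type": "kernel", "task_id": "llm_enum_route",
              "input": {"labels": ["path_a", "path_b"], "text": "choose"},
              "depends_on": ["a"]},
        "c": {"type": "terminal", "depends_on": ["b"],
              "static_payload": {"done": True}},
    }
}


def run_pack_sketch(
    pack: Dict[str, Any],
    run_node: Callable[[str, Dict[str, Any]], Any],
    stop_on_first_error: bool = True,
) -> Tuple[List[str], Dict[str, str], Dict[str, Any]]:
    nodes = pack["nodes"]
    status = {nid: "pending" for nid in nodes}
    results: Dict[str, Any] = {}
    order: List[str] = []
    while True:
        # Ready = pending nodes whose dependencies are all ok, sorted by id.
        ready = sorted(
            nid for nid, n in nodes.items()
            if status[nid] == "pending"
            and all(status[d] == "ok" for d in n.get("depends_on", []))
        )
        if not ready:
            break
        nid, node = ready[0], nodes[ready[0]]
        order.append(nid)
        if node.get("type") == "terminal":
            # Terminal short-circuit: result is the merged payload and input.
            results[nid] = {**node.get("static_payload", {}), **node.get("input", {})}
            status[nid] = "ok"
            continue
        try:
            results[nid] = run_node(nid, node)
            status[nid] = "ok"
        except Exception:
            status[nid] = "err"
            if stop_on_first_error:
                break
    return order, status, results


def stub_run_node(node_id: str, node: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a real task call; echoes which task would have run.
    return {"echo": node["task_id"]}


order, status, results = run_pack_sketch(pack, stub_run_node)
print(order)         # ['a', 'b', 'c']
print(results["c"])  # {'done': True}
```

In a real wiring, the stub would be replaced by a run_node that dispatches on task_id, for instance by calling run_task.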

Local verification

From the forge-lcdl repo root, use python3 (some systems have no python symlink) and put src on the path unless the package is installed in editable mode. When writing ad hoc checks against the API, keep two details in mind:

  • LcdlGraph.nodes is a dict[str, LcdlNode] (keys are node ids). LcdlNode requires all fields including status, result, error, verification, attempts.
  • ready_nodes(graph) returns tuple[str, ...] of node ids, not LcdlNode instances.

One-liner suite:

./scripts/verify-graph-mvp.sh

Manual equivalents:

export PYTHONPATH=src
python3 -m pytest -q tests/test_graph_mvp.py tests/test_operators.py
python3 -m compileall -q src tests

Future wiring

For catalog tasks, model profiles, and transport, see:

Risk / follow-up: mapping generic kernel nodes to real task_id values may require pack-level defaults (metadata.default_task_id) or explicit task_id per node; parallel execution is out of scope for this MVP.