# forge-lcdl

For a compact narrative (for humans and agents), start with docs/WHAT-IS-LCDL.md: what LCDL is, what it deliberately is not, and how forge-certificators wraps governed pw_* tasks and Phase A routing. Consumer paths…

Private Python library providing governed synchronous LLM tasks, domain-agnostic OpenAI-compatible helpers (forge_lcdl.generic), and control-flow operators (forge_lcdl.operators), so scripts that embed LLM calls do not have to scatter transport code and JSON-parsing workarounds.

## Responsible use and boundaries

Use this library for authorized practice material you own, internal QA, and question-bank ingestion with explicit rights to the content. Do not build bypasses for proctored exams, paywalls, logins, CAPTCHA, rate limits, or anti-bot measures; do not add credential theft or logging of secrets (API keys, tokens, raw auth headers, full .env). Prefer deterministic extraction and validation; keep browser execution in consumers/runtime, not inside LCDL tasks. Full contributor policy: CONTRIBUTING.md.

## Install

```shell
cd forge-lcdl
pip install -e ".[dev]"
pytest
```

## Quick use

```python
from forge_lcdl import (
    read_certificator_profile,
    run_task,
    TaskRunner,
)
from forge_lcdl.generic import chat_with_json_mode_then_plain, parse_json_object_lenient
from forge_lcdl.operators import fallback_chain, until_ok
from forge_lcdl.result import Err, Ok
```
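chat_with_json_mode_then_plain and parse_json_object_lenient exist to recover structured output from imperfect model responses. A minimal sketch of the lenient-parse idea (illustrative only; names and details here are not the library's actual implementation):

```python
import json
import re

def parse_json_object_lenient_sketch(text: str) -> dict:
    """Illustrative: strip a ```json fence, then fall back to the first {...} slice."""
    # 1. Remove a surrounding Markdown code fence, if present.
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```\s*$", text.strip(), re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # 2. Try a straight parse first.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # 3. Fall back to the outermost {...} slice in the text.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        return json.loads(text[start : end + 1])
    raise ValueError("no JSON object found")
```

The library's helper handles more edge cases; this only shows why a single json.loads call is not enough against real model output.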

## Task runner (reference: chunk classify)

```python
profile = read_certificator_profile()
out = run_task(
    "pw_chunk_classify",
    "v1",
    {"url": "https://example.com/exam", "chunks": [{"chunk_id": "a", "text_snippet": "Q1?"}]},
    profile=profile,
)
```

Contract: src/forge_lcdl/contracts/pw_chunk_classify/v1/contract.md.

## Generic catalog tasks (v1)

Solution-agnostic governed tasks are registered in runner.run_task via TASK_REGISTRY_V1. Examples:

| task_id | Purpose |
|---|---|
| extract_schema_from_text | Pull JSON matching a schema description from free text |
| llm_boolean_gate | Facts + question → yes/no + confidence |
| llm_enum_route | Single-label routing |
| llm_multi_label | Per-label scores |
| word_problem_to_calc_plan | Word problem → variables + expression + formula_eval_safe |
| summarize_sections_schema, timeline_extract, contradiction_scan, … | Fixed-shape summaries / analysis |
| decompose_problem | Problem statement → subproblems + assumptions |
| plan_decision_pack | Draft DecisionPack v2 JSON graph |
| board_game_choose_move | Player view + enumerated legal move_ids → validated choice |
| board_game_rank_moves | Rank all legal moves (permutation check) |
| board_game_coach_notes | Teaching notes constrained to listed legal moves |

Contracts live under src/forge_lcdl/contracts/<task_id>/v1/contract.md.
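word_problem_to_calc_plan delegates arithmetic to formula_eval_safe rather than letting the model compute. As a rough sketch of what allowlisted arithmetic evaluation can look like (an illustration of the idea, not the math_safe.py implementation):

```python
import ast
import operator

# Only literals, bound variable names, and these operators are permitted.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def eval_arithmetic(expr: str, variables: dict) -> float:
    """Illustrative: walk the AST, rejecting calls, attributes, and unknown nodes."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed node: {type(node).__name__}")
    return walk(ast.parse(expr, mode="eval"))
```

Anything outside the allowlist — function calls, attribute access, imports — raises instead of executing, which is the point of keeping the model's expression and its evaluation separate.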

## Deterministic games (engine vs LCDL)

- Engine (stdlib only): forge_lcdl.game_engine — legal moves, immutable states, JSON replay, reference rulesets (tic_tac_toe, nim, connect_four_like, checkers_am (American/English draughts), resource_build_lite, hidden_hand_lite, trade_lite, coop_defense_lite). See docs/GAME-ENGINE.md.
- LCDL helpers: forge_lcdl.game_lcdl builds task payloads from engine state and validates model outputs against engine move_id sets. The engine never imports LLM helpers; constrained tasks reject invented move_ids.
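The constrained board-game tasks treat the engine's enumerated move_ids as ground truth. A minimal sketch of that validation contract (function and field names here are hypothetical, not forge_lcdl.game_lcdl's API):

```python
def validate_choice(model_output: dict, legal_move_ids: set) -> str:
    """Illustrative: reject any move_id the engine did not enumerate."""
    move_id = model_output.get("move_id")
    if move_id not in legal_move_ids:
        raise ValueError(
            f"model chose {move_id!r}, not in legal set {sorted(legal_move_ids)}"
        )
    return move_id
```

Because the legal set comes from the deterministic engine, a hallucinated move can never reach game state; it fails validation before replay.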

## Page mechanics and Playwright discovery (source ingest)

Consumer repos run Playwright and validators; forge-lcdl supplies governed tasks for pw_page_kind_route, pw_quiz_mechanics_discover, pw_mechanics_repair, and related helpers. Start with docs/PLAYWRIGHT-DISCOVERY.md (end-to-end loop and payloads) and docs/PAGE-MECHANICS.md (schema split and task catalog).

## Messages and transport aliases

FileRef, LlmMessage, and build_openai_messages live in messages.py; transport_chat_completions, transport_chat_with_policies, transport_batch_sequential, and estimate_token_budget make up the catalog transport surface.
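For orientation, OpenAI-style chat payloads are just role/content dictionaries; a minimal sketch of assembling one (this does not reflect build_openai_messages's actual signature):

```python
def build_messages(user_text: str, system: str = None) -> list:
    """Illustrative: assemble an OpenAI-style messages array."""
    messages = []
    if system:
        # System prompt, when present, always comes first.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user_text})
    return messages
```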

## Injectable transport (tests)

```python
from forge_lcdl.types import ChatResult

def fake_chat(messages, **kwargs):
    # Deterministic stand-in for the real gateway transport.
    return ChatResult(True, '{"chunk_results":[{"chunk_id":"a","is_question_block":true,"confidence":0.9,"reason":"mcq"}]}')

runner = TaskRunner(chat=fake_chat)
runner.run("pw_chunk_classify", "v1", {...}, profile=profile)
```

## Layout

| Path | Role |
|---|---|
| src/forge_lcdl/generic/ | URL builders, headers policy, chat_with_json_mode_then_plain, lenient JSON parse, merge-by-key, UTF-8 truncation, transport_batch_sequential, estimate_token_budget |
| src/forge_lcdl/transport.py | Blocking urllib chat/completions |
| src/forge_lcdl/env.py | read_certificator_profile / read_taxonomy_profile |
| src/forge_lcdl/operators.py | seq, repeat, for_each, until_ok, branch, fallback_chain, try_catch, optional_step |
| src/forge_lcdl/messages.py | LlmMessage, FileRef, build_openai_messages |
| src/forge_lcdl/math_safe.py | formula_eval_safe (allowlisted arithmetic) |
| src/forge_lcdl/execution/, retrieval/, inference/, prompts/ | LcdlClient, RAG adapters, planner tasks, prompt cache helpers — docs/CLIENT-API.md, docs/RAG.md |
| src/forge_lcdl/game_engine/ | Stdlib-only deterministic games (legality, replay, player views) |
| src/forge_lcdl/game_lcdl/ | Payload builders + validators for board-game catalog tasks |
| contracts/ | Per-task Markdown contracts (under src/forge_lcdl/contracts/) |
| docs/GAME-ENGINE.md | Boundary between game_engine and LLM layer |
| docs/PLAYWRIGHT-DISCOVERY.md | Source ingest: probes, routing, mechanics inference, repair (consumer Playwright runtime vs LCDL) |
| docs/PAGE-MECHANICS.md | Mechanics artifact conventions (page_mechanics_v1 vs page_mechanics.v1) and focused task IDs |
| CONTRIBUTING.md | Governance, scope, layering (LCDL vs runtime), contribution norms |
| docs/EXTRACTION-CONVERGENCE.md | Staged LLM + deterministic convergence playbook (DOM/PDF) |
| docs/operators/ | Operator reference |
| docs/WHAT-IS-LCDL.md | Narrative overview (humans + agents); boundaries vs Fleet / Playwright |
| docs/ADOPTION.md | Dependencies + forge-certificators consumer index |

## Replacing in-script LLM plumbing (cheat sheet)

| You have today | Use in forge-lcdl |
|---|---|
| Double chat_completion (JSON mode, then response_format=None) | chat_with_json_mode_then_plain |
| Second attempt with an extra user suffix, plain mode | chat_with_json_then_nudge_plain |
| Strip ```json fences, then json.loads | strip_markdown_code_fence / parse_json_object_lenient |
| "First {…} slice" recovery | built into parse_json_object_lenient |
| {"error":{…}} in a 200 body | check the pattern in task code; helper: is_gateway_error_json |
| Merge classifier rows by chunk_id | merge_by_key(items, updates, "chunk_id") |
| User JSON capped at 100k chars | truncate_utf8_bytes |
| Append raw_http_body to logs | format_chat_error_message |
| Retry until predicate / max N | until_ok |
| First success among strategies | fallback_chain over lambda: Ok(...) / Err(...) |
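For reference, the semantics of the last two rows can be sketched with a minimal Ok/Err pair (illustrative only; the real types live in forge_lcdl.result and the real operators in forge_lcdl.operators):

```python
from dataclasses import dataclass

@dataclass
class Ok:
    value: object

@dataclass
class Err:
    error: str

def fallback_chain(*strategies):
    """Run strategies in order; return the first Ok, else the last Err."""
    last = Err("no strategies")
    for strategy in strategies:
        result = strategy()
        if isinstance(result, Ok):
            return result
        last = result
    return last

def until_ok(step, max_attempts: int):
    """Re-run step until it returns Ok or attempts are exhausted."""
    last = Err("no attempts")
    for _ in range(max_attempts):
        last = step()
        if isinstance(last, Ok):
            return last
    return last
```

The point of both operators is that retry and fallback logic lives in one place instead of being re-written as ad-hoc loops around each LLM call.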

## Live Granite tests (optional integration)

Integration tests live under tests/integration/. They call the real OpenAI-compatible Granite gateway and are marked granite, so a default pytest run stays quiet when credentials are absent: if the env file is missing, or LLM_BASE_URL or the API key is empty after loading, those tests skip and make no network calls.

| Variable | Role |
|---|---|
| (default file) | ../forge-composer-workbench/project-management-certification/certification-llm.local.env, relative to this repo (same pattern as the PM certification workbench; the file is gitignored). |
| FORGE_LCDL_GRANITE_ENV_FILE | Absolute path, or path relative to the forge-lcdl repo root, to any KEY=value Granite env file (LLM_BASE_URL, LLM_API_KEY or OPENAI_API_KEY, optional LLM_TIMEOUT_SEC). |
| FORGE_LCDL_LIVE_MODEL | After the file is merged into the process environment, integration tests set LLM_MODEL to this value. Default when unset: ctx-unlim-qwen3-8b:latest (confirm with GET /v1/models on your deployment). |
| FORGE_LCDL_STRESS_CONTEXT | Set to 1 / true / yes / on to enable the optional stress_context probe (large user payload). |
| FORGE_LCDL_CONTEXT_CHARS | Approximate user-padding size for the stress probe (default 50000). Gateway 413/400 or timeouts are deployment limits, not necessarily forge-lcdl bugs. |
| FORGE_LCDL_USE_THERMAL_GUARD | Reserved for future optional forge_composer thermal-guard wiring via pre_chat; integration tests do not require forge_composer today. |
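As a sketch of what loading such a KEY=value env file involves (the test harness's actual loader may differ in details):

```python
def load_env_file(path: str) -> dict:
    """Illustrative: parse KEY=value lines, skipping blanks and # comments."""
    env = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Tolerate quoted values the way shell-style env files often write them.
            env[key.strip()] = value.strip().strip('"').strip("'")
    return env
```

Whatever the loader does, values are merged into the process environment before the tests read LLM_BASE_URL and the API key; no secrets should ever be logged.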

Commands:

```shell
cd forge-lcdl
pytest -m granite          # live smoke + pw_chunk_classify (+ stress only if FORGE_LCDL_STRESS_CONTEXT=1)
pytest -m stress_context   # subset: large-prompt probe (still needs the Granite env; run with -m granite or a full run)
pytest                     # unit tests; integration tests skip when no env file
```

“Unlimited context” in catalog names is a product label, not a hard guarantee; treat stress results as informational.

## Sibling package: forge-lcdl-runtime

Disk-backed ChatSession, RAG-lite rag_concat_prefix, decision tables/trees, matrix helpers, and Playwright/MCP bridge stubs live in the separate repo ../forge-lcdl-runtime. After installing forge-lcdl in editable mode, install the runtime with pip install -e . from that folder.

## Git

Initialize a private remote (GitHub/GitLab) and push this tree. Do not commit .env files or live gateway URLs.