# forge-lcdl

For a compact narrative (for humans and agents), start with docs/WHAT-IS-LCDL.md: what LCDL is, what it deliberately is not, and how forge-certificators wraps governed pw_* tasks and Phase A routing. Consumer paths…

Private Python library providing governed synchronous LLM tasks, domain-agnostic OpenAI-compatible helpers (forge_lcdl.generic), and control-flow operators (forge_lcdl.operators), so scripts that embed LLM calls do not have to scatter transport code and JSON-parsing workarounds.

## Responsible use and boundaries

Use this library for authorized practice material you own, internal QA, and question-bank ingestion with explicit rights to the content. Do not build bypasses for proctored exams, paywalls, logins, CAPTCHA, rate limits, or anti-bot measures; do not add credential theft or logging of secrets (API keys, tokens, raw auth headers, full .env). Prefer deterministic extraction and validation; keep browser execution in consumers/runtime, not inside LCDL tasks. Full contributor policy: CONTRIBUTING.md.

## Install

```shell
cd forge-lcdl
pip install -e ".[dev]"
pytest
```

## Quick use

```python
from forge_lcdl import (
    read_certificator_profile,
    run_task,
    TaskRunner,
)
from forge_lcdl.generic import chat_with_json_mode_then_plain, parse_json_object_lenient
from forge_lcdl.operators import fallback_chain, until_ok
from forge_lcdl.result import Err, Ok
```
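chat_with_json_mode_then_plain and parse_json_object_lenient exist to recover structured output from imperfect model responses. A minimal sketch of the lenient-parse idea (illustrative only; names and details here are not the library's actual implementation):

```python
import json
import re

def parse_json_object_lenient_sketch(text: str) -> dict:
    """Illustrative: strip a ```json fence, then fall back to the first {...} slice."""
    # 1. Remove a surrounding Markdown code fence, if present.
    fenced = re.match(r"^```(?:json)?\s*(.*?)\s*```\s*$", text.strip(), re.DOTALL)
    if fenced:
        text = fenced.group(1)
    # 2. Try a straight parse first.
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # 3. Fall back to the outermost {...} slice in the text.
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        return json.loads(text[start : end + 1])
    raise ValueError("no JSON object found")
```

The library's helper handles more edge cases; this only shows why a single json.loads call is not enough against real model output.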

## Task runner (reference: chunk classify)

```python
profile = read_certificator_profile()
out = run_task(
    "pw_chunk_classify",
    "v1",
    {"url": "https://example.com/exam", "chunks": [{"chunk_id": "a", "text_snippet": "Q1?"}]},
    profile=profile,
)
```

Contract: src/forge_lcdl/contracts/pw_chunk_classify/v1/contract.md.

## Generic catalog tasks (v1)

Solution-agnostic governed tasks are registered in runner.run_task via TASK_REGISTRY_V1. Examples:

| task_id | Purpose |
|---|---|
| extract_schema_from_text | Pull JSON matching a schema description from free text |
| llm_boolean_gate | Facts + question → yes/no + confidence |
| llm_enum_route | Single-label routing |
| llm_multi_label | Per-label scores |
| word_problem_to_calc_plan | Word problem → variables + expression + formula_eval_safe |
| summarize_sections_schema, timeline_extract, contradiction_scan, … | Fixed-shape summaries / analysis |
| decompose_problem | Problem statement → subproblems + assumptions |
| plan_decision_pack | Draft DecisionPack v2 JSON graph |
| board_game_choose_move | Player view + enumerated legal move_ids → validated choice |
| board_game_rank_moves | Rank all legal moves (permutation check) |
| board_game_coach_notes | Teaching notes constrained to listed legal moves |

Contracts live under src/forge_lcdl/contracts/<task_id>/v1/contract.md.
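word_problem_to_calc_plan delegates arithmetic to formula_eval_safe rather than letting the model compute. As a rough sketch of what allowlisted arithmetic evaluation can look like (an illustration of the idea, not the math_safe.py implementation):

```python
import ast
import operator

# Only literals, bound variable names, and these operators are permitted.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}

def eval_arithmetic(expr: str, variables: dict) -> float:
    """Illustrative: walk the AST, rejecting calls, attributes, and unknown nodes."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.Name) and node.id in variables:
            return variables[node.id]
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"disallowed node: {type(node).__name__}")
    return walk(ast.parse(expr, mode="eval"))
```

Anything outside the allowlist — function calls, attribute access, imports — raises instead of executing, which is the point of keeping the model's expression and its evaluation separate.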

## Deterministic games (engine vs LCDL)

- Engine (stdlib only): forge_lcdl.game_engine — legal moves, immutable states, JSON replay, reference rulesets (tic_tac_toe, nim, connect_four_like, checkers_am (American/English draughts), resource_build_lite, hidden_hand_lite, trade_lite, coop_defense_lite). See docs/GAME-ENGINE.md.
- LCDL helpers: forge_lcdl.game_lcdl builds task payloads from engine state and validates model outputs against engine move_id sets. The engine never imports LLM helpers; constrained tasks reject invented move_ids.
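The constrained board-game tasks treat the engine's enumerated move_ids as ground truth. A minimal sketch of that validation contract (function and field names here are hypothetical, not forge_lcdl.game_lcdl's API):

```python
def validate_choice(model_output: dict, legal_move_ids: set) -> str:
    """Illustrative: reject any move_id the engine did not enumerate."""
    move_id = model_output.get("move_id")
    if move_id not in legal_move_ids:
        raise ValueError(
            f"model chose {move_id!r}, not in legal set {sorted(legal_move_ids)}"
        )
    return move_id
```

Because the legal set comes from the deterministic engine, a hallucinated move can never reach game state; it fails validation before replay.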

## Page mechanics and Playwright discovery (source ingest)

Consumer repos run Playwright and validators; forge-lcdl supplies governed tasks for pw_page_kind_route, pw_quiz_mechanics_discover, pw_mechanics_repair, and related helpers. Start with docs/PLAYWRIGHT-DISCOVERY.md (end-to-end loop and payloads) and docs/PAGE-MECHANICS.md (schema split and task catalog).

## Messages and transport aliases

FileRef, LlmMessage, and build_openai_messages live in messages.py; transport_chat_completions, transport_chat_with_policies, transport_batch_sequential, and estimate_token_budget make up the catalog transport surface.
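For orientation, OpenAI-style chat payloads are just role/content dictionaries; a minimal sketch of assembling one (this does not reflect build_openai_messages's actual signature):

```python
def build_messages(user_text: str, system: str = None) -> list:
    """Illustrative: assemble an OpenAI-style messages array."""
    messages = []
    if system:
        # System prompt, when present, always comes first.
        messages.append({"role": "system", "content": system})
    messages.append({"role": "user", "content": user_text})
    return messages
```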

## Injectable transport (tests)

```python
from forge_lcdl.types import ChatResult

def fake_chat(messages, **kwargs):
    # Deterministic stand-in for the real gateway transport.
    return ChatResult(True, '{"chunk_results":[{"chunk_id":"a","is_question_block":true,"confidence":0.9,"reason":"mcq"}]}')

runner = TaskRunner(chat=fake_chat)
runner.run("pw_chunk_classify", "v1", {...}, profile=profile)
```

## Layout

| Path | Role |
|---|---|
| src/forge_lcdl/generic/ | URL builders, headers policy, chat_with_json_mode_then_plain, lenient JSON parse, merge-by-key, UTF-8 truncation, transport_batch_sequential, estimate_token_budget |
| src/forge_lcdl/transport.py | Blocking urllib chat/completions |
| src/forge_lcdl/env.py | read_certificator_profile / read_taxonomy_profile |
| src/forge_lcdl/operators.py | seq, repeat, for_each, until_ok, branch, fallback_chain, try_catch, optional_step |
| src/forge_lcdl/messages.py | LlmMessage, FileRef, build_openai_messages |
| src/forge_lcdl/math_safe.py | formula_eval_safe (allowlisted arithmetic) |
| src/forge_lcdl/execution/, retrieval/, inference/, prompts/ | LcdlClient, RAG adapters, planner tasks, prompt cache helpers — docs/CLIENT-API.md, docs/RAG.md |
| src/forge_lcdl/game_engine/ | Stdlib-only deterministic games (legality, replay, player views) |
| src/forge_lcdl/game_lcdl/ | Payload builders + validators for board-game catalog tasks |
| contracts/ | Per-task Markdown contracts (under src/forge_lcdl/contracts/) |
| docs/GAME-ENGINE.md | Boundary between game_engine and LLM layer |
| docs/PLAYWRIGHT-DISCOVERY.md | Source ingest: probes, routing, mechanics inference, repair (consumer Playwright runtime vs LCDL) |
| docs/PAGE-MECHANICS.md | Mechanics artifact conventions (page_mechanics_v1 vs page_mechanics.v1) and focused task IDs |
| CONTRIBUTING.md | Governance, scope, layering (LCDL vs runtime), contribution norms |
| docs/EXTRACTION-CONVERGENCE.md | Staged LLM + deterministic convergence playbook (DOM/PDF) |
| docs/operators/ | Operator reference |
| docs/WHAT-IS-LCDL.md | Narrative overview (humans + agents); boundaries vs Fleet / Playwright |
| docs/ADOPTION.md | Dependencies + forge-certificators consumer index |

## Replacing in-script LLM plumbing (cheat sheet)

| You have today | Use in forge-lcdl |
|---|---|
| Double chat_completion (JSON mode, then response_format=None) | chat_with_json_mode_then_plain |
| Second attempt with an extra user suffix, plain mode | chat_with_json_then_nudge_plain |
| Strip ```json fences, then json.loads | strip_markdown_code_fence / parse_json_object_lenient |
| "First {…} slice" recovery | built into parse_json_object_lenient |
| {"error":{…}} in a 200 body | check the pattern in task code; helper: is_gateway_error_json |
| Merge classifier rows by chunk_id | merge_by_key(items, updates, "chunk_id") |
| User JSON capped at 100k chars | truncate_utf8_bytes |
| Append raw_http_body to logs | format_chat_error_message |
| Retry until predicate / max N | until_ok |
| First success among strategies | fallback_chain over lambda: Ok(...) / Err(...) |
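For reference, the semantics of the last two rows can be sketched with a minimal Ok/Err pair (illustrative only; the real types live in forge_lcdl.result and the real operators in forge_lcdl.operators):

```python
from dataclasses import dataclass

@dataclass
class Ok:
    value: object

@dataclass
class Err:
    error: str

def fallback_chain(*strategies):
    """Run strategies in order; return the first Ok, else the last Err."""
    last = Err("no strategies")
    for strategy in strategies:
        result = strategy()
        if isinstance(result, Ok):
            return result
        last = result
    return last

def until_ok(step, max_attempts: int):
    """Re-run step until it returns Ok or attempts are exhausted."""
    last = Err("no attempts")
    for _ in range(max_attempts):
        last = step()
        if isinstance(last, Ok):
            return last
    return last
```

The point of both operators is that retry and fallback logic lives in one place instead of being re-written as ad-hoc loops around each LLM call.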

## Live Granite tests (optional integration)

Integration tests live under tests/integration/. They call the real OpenAI-compatible Granite gateway and are marked granite, so a default pytest run stays quiet when credentials are absent: if the env file is missing, or LLM_BASE_URL or the API key is empty after loading, those tests skip and make no network calls.

| Variable | Role |
|---|---|
| (default file) | ../forge-composer-workbench/project-management-certification/certification-llm.local.env, relative to this repo (same pattern as the PM certification workbench; the file is gitignored). |
| FORGE_LCDL_GRANITE_ENV_FILE | Absolute path, or path relative to the forge-lcdl repo root, to any KEY=value Granite env file (LLM_BASE_URL, LLM_API_KEY or OPENAI_API_KEY, optional LLM_TIMEOUT_SEC). |
| FORGE_LCDL_LIVE_MODEL | After the file is merged into the process environment, integration tests set LLM_MODEL to this value. Default when unset: ctx-unlim-qwen3-8b:latest (confirm with GET /v1/models on your deployment). |
| FORGE_LCDL_STRESS_CONTEXT | Set to 1 / true / yes / on to enable the optional stress_context probe (large user payload). |
| FORGE_LCDL_CONTEXT_CHARS | Approximate user-padding size for the stress probe (default 50000). Gateway 413/400 or timeouts are deployment limits, not necessarily forge-lcdl bugs. |
| FORGE_LCDL_USE_THERMAL_GUARD | Reserved for future optional forge_composer thermal-guard wiring via pre_chat; integration tests do not require forge_composer today. |
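As a sketch of what loading such a KEY=value env file involves (the test harness's actual loader may differ in details):

```python
def load_env_file(path: str) -> dict:
    """Illustrative: parse KEY=value lines, skipping blanks and # comments."""
    env = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Tolerate quoted values the way shell-style env files often write them.
            env[key.strip()] = value.strip().strip('"').strip("'")
    return env
```

Whatever the loader does, values are merged into the process environment before the tests read LLM_BASE_URL and the API key; no secrets should ever be logged.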

Commands:

```shell
cd forge-lcdl
pytest -m granite          # live smoke + pw_chunk_classify (+ stress only if FORGE_LCDL_STRESS_CONTEXT=1)
pytest -m stress_context   # subset: large-prompt probe (still needs the Granite env; run with -m granite or a full run)
pytest                     # unit tests; integration tests skip when no env file
```

“Unlimited context” in catalog names is a product label, not a hard guarantee; treat stress results as informational.

## Sibling package: forge-lcdl-runtime

Disk-backed ChatSession, RAG-lite rag_concat_prefix, decision tables/trees, matrix helpers, and Playwright/MCP bridge stubs live in the separate repo ../forge-lcdl-runtime. After installing forge-lcdl in editable mode, install the runtime with pip install -e . from that folder.

## Git

Initialize a private remote (GitHub/GitLab) and push this tree. Do not commit .env files or live gateway URLs.