forge-lcdl

Prompt layout and cache telemetry

Forge LCDL exposes provider cache usage on chat results and accepts an optional extra_body on chat requests, so higher layers can pass provider-specific keys (for example, OpenAI-style prompt cache hints) without changing run_task…

ChatResult and UsageDetails

forge_lcdl.types.ChatResult includes:

  • prompt_tokens, completion_tokens, total_tokens
  • cached_tokens (when the gateway returns usage.prompt_tokens_details.cached_tokens in the JSON body)

The usage property returns a UsageDetails view of the same fields.
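
As a rough illustration of how these fields can be combined, the sketch below computes a cache hit ratio. UsageSketch and cache_hit_ratio are hypothetical stand-ins written for this example, not part of forge_lcdl, and the sketch assumes cached_tokens is None when the gateway does not report it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UsageSketch:
    """Hypothetical stand-in mirroring the usage fields listed above."""

    prompt_tokens: int
    completion_tokens: int
    total_tokens: int
    cached_tokens: Optional[int] = None  # assumption: None when not reported


def cache_hit_ratio(usage: UsageSketch) -> Optional[float]:
    """Return cached_tokens / prompt_tokens, or None if it cannot be computed."""
    if usage.cached_tokens is None or usage.prompt_tokens <= 0:
        return None
    return usage.cached_tokens / usage.prompt_tokens


usage = UsageSketch(
    prompt_tokens=1200,
    completion_tokens=80,
    total_tokens=1280,
    cached_tokens=900,
)
print(cache_hit_ratio(usage))  # 0.75
```

Returning None rather than 0.0 keeps "no cache data reported" distinct from "nothing was served from cache".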

Transport

chat_completion_sync(..., extra_body=None) merges extra_body into the POST JSON after standard fields (model, temperature, messages, …). Use only provider extensions you trust; do not put secrets in extra_body.
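
The merge order can be pictured with a small standalone sketch. build_payload_sketch is a hypothetical helper, not the library's implementation, and it assumes that a key in extra_body overrides a standard field of the same name; the actual collision behavior should be checked against the transport code. The prompt_cache_key value is only an illustrative provider hint.

```python
from typing import Any, Optional


def build_payload_sketch(
    model: str,
    messages: list[dict[str, str]],
    temperature: float,
    extra_body: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a chat request body, applying extra_body after the standard fields."""
    payload: dict[str, Any] = {
        "model": model,
        "temperature": temperature,
        "messages": messages,
    }
    if extra_body:
        # Assumption: later keys win, so extra_body can override standard fields.
        payload.update(extra_body)
    return payload


payload = build_payload_sketch(
    model="example-model",  # hypothetical model name
    messages=[{"role": "user", "content": "ping"}],
    temperature=0.0,
    extra_body={"prompt_cache_key": "task-v1"},
)
print(sorted(payload))
```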

chat_once / chat_with_json_mode_then_plain / chat_with_json_then_nudge_plain forward extra_body.

JSON contract tasks

run_json_contract_task(..., extra_body=..., canonical_user_json=False)

  • extra_body — passed through to chat policy (single completion path).
  • canonical_user_json=True — sorts keys and uses compact separators for the serialized user payload (stable prompts when caching).
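
The effect of canonical_user_json=True can be approximated with the standard json module. This is a minimal sketch of the described behavior (sorted keys, compact separators); canonical_user_json_sketch is a hypothetical name, and the library's own serializer may differ in details such as non-ASCII handling.

```python
import json
from typing import Any


def canonical_user_json_sketch(payload: Any) -> str:
    """Serialize with sorted keys and no extra whitespace."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":"))


# Same content, different key insertion order.
a = {"q": "hi", "k": 3}
b = {"k": 3, "q": "hi"}

print(canonical_user_json_sketch(a))  # {"k":3,"q":"hi"}
print(canonical_user_json_sketch(a) == canonical_user_json_sketch(b))  # True
```

Because logically equal payloads serialize to identical bytes, the prompt text does not vary with dictionary ordering between calls.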

Prompt planning helpers

Package forge_lcdl.prompts:

  • PromptPlan — stable_prefix_messages vs dynamic_suffix_messages, optional prompt_cache_key
  • build_prompt_cache_key — hashes task_id, version, contract/schema hashes, optional corpus_version
  • canonical_json, stable_hash
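
One way to derive such a key is to hash a canonical JSON rendering of the inputs. The sketch below assumes SHA-256 and the parameter names shown; both functions are hypothetical illustrations, and the real build_prompt_cache_key signature and digest format may differ.

```python
import hashlib
import json
from typing import Any, Optional


def stable_hash_sketch(value: Any) -> str:
    """Hash a JSON-serializable value via its canonical (sorted, compact) form."""
    text = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_prompt_cache_key_sketch(
    task_id: str,
    version: str,
    contract_hash: str,
    schema_hash: str,
    corpus_version: Optional[str] = None,
) -> str:
    """Combine the identifying fields into one deterministic key."""
    parts = {
        "task_id": task_id,
        "version": version,
        "contract_hash": contract_hash,
        "schema_hash": schema_hash,
    }
    if corpus_version is not None:
        parts["corpus_version"] = corpus_version
    return stable_hash_sketch(parts)


schema_hash = stable_hash_sketch({"type": "object", "required": ["answer"]})
key_v1 = build_prompt_cache_key_sketch("summarize", "1", "c0ffee", schema_hash)
key_v2 = build_prompt_cache_key_sketch("summarize", "2", "c0ffee", schema_hash)
print(key_v1 != key_v2)  # True: a version bump yields a new key
```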

Design rule: put stable instructions and schemas first; put user text, retrieved evidence, and conversation tail last so shared prefixes hit provider caches more often.
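
The design rule can be sketched with a small container that keeps the two message groups separate and concatenates them in a fixed order. PromptPlanSketch reuses the field names mentioned above but is a hypothetical stand-in; the messages() method is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Optional

Message = dict[str, str]


@dataclass
class PromptPlanSketch:
    stable_prefix_messages: list[Message]
    dynamic_suffix_messages: list[Message]
    prompt_cache_key: Optional[str] = None

    def messages(self) -> list[Message]:
        """Stable content first, per-request content last."""
        return [*self.stable_prefix_messages, *self.dynamic_suffix_messages]


# Instructions and schema stay identical across requests.
prefix = [
    {"role": "system", "content": "You answer in JSON."},
    {"role": "system", "content": 'Schema: {"answer": "string"}'},
]

plan_a = PromptPlanSketch(prefix, [{"role": "user", "content": "What is LCDL?"}])
plan_b = PromptPlanSketch(prefix, [{"role": "user", "content": "Summarize the log."}])

# Both requests begin with the same two messages, so a provider-side
# prefix cache has a chance to reuse them.
print(plan_a.messages()[:2] == plan_b.messages()[:2])  # True
```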

Example: reading cache stats after a call

from forge_lcdl.transport import chat_completion_sync
from forge_lcdl.env import read_certificator_profile

# Load the profile used for the request.
profile = read_certificator_profile()
res = chat_completion_sync(
    [{"role": "user", "content": "ping"}],
    profile=profile,
    temperature=0.0,
)
if res.ok:
    # cached_tokens is filled in only when the gateway returns
    # usage.prompt_tokens_details.cached_tokens.
    print("cached_tokens", res.cached_tokens, "prompt", res.prompt_tokens)

See also

  • CLIENT-API.md — high-level client (future: trace summaries may include cache ratios).