# Prompt layout and cache telemetry
Forge LCDL exposes observable provider cache usage and an optional `extra_body` on chat requests, so higher layers can pass provider-specific keys (for example, OpenAI-style prompt cache hints) without changing `run_task`…
## `ChatResult` and `UsageDetails`
`forge_lcdl.types.ChatResult` includes:

- `prompt_tokens`, `completion_tokens`, `total_tokens`
- `cached_tokens` (when the gateway returns `usage.prompt_tokens_details.cached_tokens` in the JSON body)

The `usage` property returns a `UsageDetails` view of the same fields.
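
A sketch of consuming the view (assuming `UsageDetails` exposes the counters above as attributes and that `cached_tokens` may be `None`):

```python
from forge_lcdl.types import ChatResult  # import path per the section above

def log_cache_ratio(res: ChatResult) -> None:
    # Sketch only: assumes UsageDetails mirrors the ChatResult counters,
    # and that cached_tokens is None when the gateway omits
    # usage.prompt_tokens_details.cached_tokens.
    u = res.usage
    cached = u.cached_tokens or 0
    if u.prompt_tokens:
        print(f"cache hit ratio: {cached / u.prompt_tokens:.1%}")
    print("tokens:", u.prompt_tokens, "+", u.completion_tokens, "=", u.total_tokens)
```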
## Transport
- `chat_completion_sync(..., extra_body=None)` merges `extra_body` into the POST JSON after the standard fields (`model`, `temperature`, `messages`, …). Use only provider extensions you trust; do not put secrets in `extra_body`.
- `chat_once` / `chat_with_json_mode_then_plain` / `chat_with_json_then_nudge_plain` forward `extra_body`.
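
A sketch of passing a provider extension through `extra_body` (the `prompt_cache_key` body field is an OpenAI-style cache hint; treat the key name as an assumption and adjust for your provider):

```python
from forge_lcdl.transport import chat_completion_sync
from forge_lcdl.env import read_certificator_profile

profile = read_certificator_profile()

# extra_body is merged into the POST JSON after the standard fields,
# so provider-specific keys ride along without transport changes.
# "prompt_cache_key" is an OpenAI-style cache hint (assumption: your
# gateway forwards and honors it).
res = chat_completion_sync(
    [
        {"role": "system", "content": "You are a terse assistant."},
        {"role": "user", "content": "ping"},
    ],
    profile=profile,
    temperature=0.0,
    extra_body={"prompt_cache_key": "handbook-demo-v1"},
)
```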
## JSON contract tasks
`run_json_contract_task(..., extra_body=..., canonical_user_json=False)`

- `extra_body`: passed through to the chat policy (single completion path).
- `canonical_user_json=True`: sorts keys and uses compact separators for the serialized user payload (stable prompts when caching).
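
To see why `canonical_user_json=True` yields stable prompts, here is the canonicalization it implies, sketched with the standard library (the exact serialization inside `run_json_contract_task` may differ):

```python
import json

payload = {"b": 2, "a": 1}
reordered = {"a": 1, "b": 2}

# Canonical form: sorted keys, compact separators. Equivalent dicts
# serialize to byte-identical strings, so the prompt stays stable
# across runs and provider caches can match it.
canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
assert canonical == json.dumps(reordered, sort_keys=True, separators=(",", ":"))
print(canonical)  # {"a":1,"b":2}
```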
## Prompt planning helpers
Package `forge_lcdl.prompts`:

- `PromptPlan`: `stable_prefix_messages` vs `dynamic_suffix_messages`, optional `prompt_cache_key`
- `build_prompt_cache_key`: hashes `task_id`, `version`, contract/schema hashes, optional `corpus_version`
- `canonical_json`, `stable_hash`
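
A plausible shape for `canonical_json` and `stable_hash`, and for the key that `build_prompt_cache_key` derives from them (a sketch, not the package's actual implementation):

```python
import hashlib
import json

def canonical_json_sketch(obj) -> str:
    # Deterministic serialization: sorted keys, compact separators.
    return json.dumps(obj, sort_keys=True, separators=(",", ":"))

def stable_hash_sketch(obj) -> str:
    # Digest of the canonical form; stable across runs and processes.
    return hashlib.sha256(canonical_json_sketch(obj).encode("utf-8")).hexdigest()

# A key in the spirit of build_prompt_cache_key: task identity, version,
# contract/schema hash, optional corpus version (values are illustrative).
cache_key = stable_hash_sketch({
    "task_id": "summarize",
    "version": 3,
    "contract_hash": stable_hash_sketch({"type": "object"}),
    "corpus_version": "2024-06",
})
print(cache_key[:16])
```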
Design rule: put stable instructions and schemas first; put user text, retrieved evidence, and the conversation tail last so shared prefixes hit provider caches more often.
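
As an illustration of the rule, a sketch of a plan laid out prefix-first (field names follow the list above; the keyword constructor is an assumption about `PromptPlan`):

```python
from forge_lcdl.prompts import PromptPlan

# Stable material first: identical across requests, so providers can
# cache the shared prefix. Dynamic material last: changes per request.
# Assumption: PromptPlan accepts its fields as keyword arguments.
plan = PromptPlan(
    stable_prefix_messages=[
        {"role": "system", "content": "Follow the output schema exactly."},
        {"role": "system", "content": 'Schema: {"type": "object"}'},
    ],
    dynamic_suffix_messages=[
        {"role": "user", "content": "Summarize the retrieved evidence for ticket 4711."},
    ],
    prompt_cache_key="summarize:v3",
)
messages = plan.stable_prefix_messages + plan.dynamic_suffix_messages
```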
## Example: reading cache stats after a call
```python
from forge_lcdl.transport import chat_completion_sync
from forge_lcdl.env import read_certificator_profile

profile = read_certificator_profile()
res = chat_completion_sync(
    [{"role": "user", "content": "ping"}],
    profile=profile,
    temperature=0.0,
)
if res.ok:
    # cached_tokens reflects usage.prompt_tokens_details.cached_tokens.
    print("cached_tokens", res.cached_tokens, "prompt", res.prompt_tokens)
```
## See also
- CLIENT-API.md — high-level client (future: trace summaries may include cache ratios).