forge-lcdl

Model routing and profiles

Sprint 2 introduces forge_lcdl.models: offline ModelProfile metadata and a deterministic ModelRouter that chooses primary / optional verifier / fallback model_id strings plus orchestration hints. This layer is orthogonal…

Cheap-first policy

  • Default primary model id is ctx-unlim-qwen3-8b:latest, matching integration defaults (tests/integration/conftest.py / README).
  • Simple tasks: primary only, no decomposition or verification flags.
  • Medium tasks: primary stays cheap; the router may suggest a fallback model_id (ibm/granite4:tiny-h, the standard-tier string) that consumers can use in a fallback_chain-style setup. The router itself does not switch transport.
  • Complex tasks: require decomposition and verification; verifier defaults to the built-in standard profile id when registered.
  • Risk (risk=True or TaskComplexity.risky): escalation_reasons is non-empty (e.g. risk_flag, complexity_risky), and requires_verification is set.

No network calls, no provider APIs, and no changes to transport.py or task runners in this sprint.
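
The policy bullets above can be sketched as a small pure function. This is a minimal illustration, not the forge_lcdl.models implementation: Complexity, DecisionSketch, choose_sketch, and the requires_decomposition field name are hypothetical stand-ins, and using ibm/granite4:tiny-h as the verifier id assumes it is the registered built-in standard profile.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Complexity(Enum):
    """Hypothetical stand-in for TaskComplexity."""
    simple = "simple"
    medium = "medium"
    complex = "complex"
    risky = "risky"


CHEAP_ID = "ctx-unlim-qwen3-8b:latest"  # default primary id from the text
STANDARD_ID = "ibm/granite4:tiny-h"     # standard-tier id from the text


@dataclass
class DecisionSketch:
    """Hypothetical decision record; real field names may differ."""
    primary: str
    verifier: Optional[str] = None
    fallback: Optional[str] = None
    requires_decomposition: bool = False
    requires_verification: bool = False
    escalation_reasons: List[str] = field(default_factory=list)


def choose_sketch(complexity, risk=False, preferred_model=None):
    """Apply the cheap-first rules deterministically, with no I/O."""
    decision = DecisionSketch(primary=preferred_model or CHEAP_ID)
    if complexity is Complexity.medium:
        # Suggest a fallback only; the caller decides whether to use it.
        decision.fallback = STANDARD_ID
    elif complexity is Complexity.complex:
        decision.requires_decomposition = True
        decision.requires_verification = True
        decision.verifier = STANDARD_ID
    if risk:
        decision.escalation_reasons.append("risk_flag")
    if complexity is Complexity.risky:
        decision.escalation_reasons.append("complexity_risky")
    if decision.escalation_reasons:
        # Any risk signal forces verification.
        decision.requires_verification = True
    return decision
```

Because the function depends only on its arguments, the same inputs always yield the same decision, which is the property the deterministic router is described as having.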

Tiers

Tier      Meaning
cheap     Default execution path; tight caps
standard  Second-pass / verifier candidate in built-in registry
strong    Enum value reserved for future profiles
verifier  Enum value reserved for dedicated verifier profiles
unknown   Synthetic profile when model_id is not in the built-in registry
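
To illustrate how the unknown tier might behave, here is a rough sketch of a registry lookup that synthesizes a profile for unregistered ids. Tier, ProfileSketch, REGISTRY, and lookup_profile are hypothetical names, not the library's API, and the tier assigned to each built-in id is an assumption drawn from the policy text.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    cheap = "cheap"
    standard = "standard"
    strong = "strong"      # reserved for future profiles
    verifier = "verifier"  # reserved for dedicated verifier profiles
    unknown = "unknown"


@dataclass(frozen=True)
class ProfileSketch:
    """Hypothetical, offline-only profile metadata."""
    model_id: str
    tier: Tier


# Assumed built-in registry: one cheap and one standard profile.
REGISTRY = {
    "ctx-unlim-qwen3-8b:latest": ProfileSketch("ctx-unlim-qwen3-8b:latest", Tier.cheap),
    "ibm/granite4:tiny-h": ProfileSketch("ibm/granite4:tiny-h", Tier.standard),
}


def lookup_profile(model_id):
    """Return the registered profile, or a synthetic 'unknown' one."""
    return REGISTRY.get(model_id, ProfileSketch(model_id, Tier.unknown))
```

Returning a synthetic profile instead of raising keeps routing total: a caller that passes an unregistered preferred model still gets a usable profile, clearly marked as unknown.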

API sketch

import json

from forge_lcdl.models import ModelRouter, TaskComplexity, routing_decision_to_json

router = ModelRouter()
decision = router.choose(
    "my_task",
    TaskComplexity.complex,
    risk=False,
    preferred_model=None,  # omit for default cheap id
)
# routing_decision_to_json returns a dict; sort_keys=True keeps the output stable.
print(json.dumps(routing_decision_to_json(decision), sort_keys=True))

JSON serialization

routing_decision_to_json returns a dict suitable for json.dumps(..., sort_keys=True) for stable logs.
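
As a small illustration of why sort_keys=True gives stable logs, the sketch below serializes a plain dict standing in for the value returned by routing_decision_to_json. The keys shown are assumptions for illustration, not the actual output schema.

```python
import json

# Hypothetical stand-in for a serialized routing decision.
decision_dict = {
    "primary": "ctx-unlim-qwen3-8b:latest",
    "fallback": None,
    "escalation_reasons": [],
    "requires_verification": False,
}

# sort_keys=True orders keys alphabetically, independent of insertion order.
line = json.dumps(decision_dict, sort_keys=True)

# The same content inserted in reverse order serializes to the same string.
reordered = dict(reversed(list(decision_dict.items())))
same_line = json.dumps(reordered, sort_keys=True)
```

Here line and same_line are identical strings, so log lines can be diffed or hashed without spurious differences from key ordering.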

Reserved escalation tokens

retry_budget_exhausted is reserved for future retry-budget / operator integration; it is not emitted by ModelRouter yet.

See also

  • CONTRACT-SPEC.md for task metadata (ContractSpec). Future wiring may map task_id → default TaskComplexity.