forge-lcdl

Task `pw_page_kind_route` v1

Summary

Given a bounded page_probe snapshot plus optional operator hints, classify page_kind, estimate confidence, propose supported strategies, cite short evidence strings, and recommend which follow-up probes the source-ingest runtime should run.

LCDL does not execute Playwright or network calls; the runtime supplies page_probe and consumes this routing JSON.

Inputs

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| url | string | yes | Page URL for context (must be non-empty after strip). |
| page_probe | object | yes | Bounded probe JSON (page_probe_v1 aligned); may be {} when empty. |
| operator_hints | string | no | Short human hints appended for the model; omit or "" if none. |
| temperature | number | no | Default 0.05. |
| timeout_sec | int | no | Default profile.timeout_sec. |

The user JSON is UTF-8 encoded and capped at 100000 bytes before the LLM call (the same pattern as other pw tasks).
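The input rules above can be sketched as follows. This is a minimal illustration, not the task's actual code: `build_user_json` and `cap_utf8` are hypothetical helper names, and the sketch assumes the cap is applied by truncating the serialized text at a character boundary (the spec only says the payload is capped, not how).

```python
import json

MAX_USER_JSON_BYTES = 100_000  # cap stated in the spec


def build_user_json(url, page_probe, operator_hints=""):
    """Hypothetical helper: validate the inputs and serialize the user payload."""
    if not isinstance(url, str) or not url.strip():
        raise ValueError("url must be a non-empty string after strip")
    if not isinstance(page_probe, dict):
        raise ValueError("page_probe must be an object (it may be {})")
    payload = {"url": url, "page_probe": page_probe}
    if operator_hints:  # omitted or "" means no hints
        payload["operator_hints"] = operator_hints
    return json.dumps(payload, ensure_ascii=False)


def cap_utf8(text, limit=MAX_USER_JSON_BYTES):
    """Assumed capping strategy: keep at most `limit` UTF-8 bytes of `text`."""
    data = text.encode("utf-8")
    if len(data) <= limit:
        return text
    # errors="ignore" drops a trailing partial multi-byte sequence, if any,
    # so the result never ends in half a character.
    return data[:limit].decode("utf-8", errors="ignore")
```

A caller would run `cap_utf8(build_user_json(url, page_probe, hints))` before handing the text to the model. Note that truncated text is no longer guaranteed to be valid JSON; whether the real task truncates or rejects oversized input is not stated here.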

Output

JSON object (task Ok.value):

| Field | Type | Notes |
| --- | --- | --- |
| page_kind | string | One of the allowed values below. |
| confidence | number | Coerced to [0.0, 1.0] after parse. |
| supported_strategies | array of string | Short strategy ids meaningful to the runtime (free-form labels). |
| evidence | array of string | Short factual cues from the probe (no secrets). |
| next_probe_needed | object | Exactly three boolean flags: interaction_probe, network_probe, static_chunk_probe. |

Allowed page_kind

  • static_mcq_page
  • interactive_quiz
  • paginated_static_quiz
  • api_backed_quiz
  • pdf_or_document
  • login_or_blocked
  • unknown

Any other string fails schema validation after parse.
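As a rough sketch of the two value-level rules above (the closed page_kind set and the confidence coercion), the checks might look like this. The function names are hypothetical, and treating NaN as 0.0 is an assumption; the spec only states that confidence is coerced to [0.0, 1.0].

```python
ALLOWED_PAGE_KINDS = frozenset({
    "static_mcq_page",
    "interactive_quiz",
    "paginated_static_quiz",
    "api_backed_quiz",
    "pdf_or_document",
    "login_or_blocked",
    "unknown",
})


def check_page_kind(value):
    """Reject anything that is not one of the allowed page_kind strings."""
    if not isinstance(value, str) or value not in ALLOWED_PAGE_KINDS:
        raise ValueError(f"page_kind not allowed: {value!r}")
    return value


def clamp_confidence(value):
    """Coerce a parsed confidence to a float in [0.0, 1.0]."""
    number = float(value)  # raises on non-numeric input
    if number != number:   # NaN compares unequal to itself (assumed -> 0.0)
        return 0.0
    return max(0.0, min(1.0, number))
```

For example, `clamp_confidence(1.7)` yields `1.0` and `check_page_kind("quiz")` raises `ValueError`.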

Policy

  • The model must output exactly one JSON object (no Python, no prose outside the JSON).
  • The output must never instruct bypassing login, CAPTCHA, paywalls, anti-bot protections, or proctored exam controls.
  • Prefer login_or_blocked or unknown when the probe is insufficient or gated.

Implementation

  • Uses run_json_contract_task (chat_with_json_mode_then_plain, parse_json_object_lenient).
  • Required keys enforced before post-validation; next_probe_needed must contain the three boolean fields.
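The structural checks described above could be approximated as follows. This is a sketch under stated assumptions, not the internals of run_json_contract_task: `check_required_keys` and `check_next_probe_needed` are hypothetical names, and rejecting extra keys inside next_probe_needed is one reading of "exactly"; the real task may tolerate them.

```python
REQUIRED_KEYS = (
    "page_kind",
    "confidence",
    "supported_strategies",
    "evidence",
    "next_probe_needed",
)
PROBE_FLAGS = ("interaction_probe", "network_probe", "static_chunk_probe")


def check_required_keys(obj):
    """Fail early if the parsed object lacks any top-level output field."""
    if not isinstance(obj, dict):
        raise ValueError("model output must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in obj]
    if missing:
        raise ValueError(f"missing required keys: {missing}")
    return obj


def check_next_probe_needed(flags):
    """Require the three probe flags, each a real boolean (not 0/1 or "true")."""
    if not isinstance(flags, dict):
        raise ValueError("next_probe_needed must be an object")
    if set(flags) != set(PROBE_FLAGS):  # assumed strict: no missing, no extra
        raise ValueError(f"expected exactly these flags: {PROBE_FLAGS}")
    for name in PROBE_FLAGS:
        if not isinstance(flags[name], bool):
            raise ValueError(f"{name} must be a boolean")
    return flags
```

Running the key check before the value-level validation mirrors the order stated in the bullets: required keys first, then post-validation.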

Changelog

  • v1 — Initial page-kind routing task.