forge-lcdl

Task `pw_extractor_synthesize_exemplar` v1

Summary

Synthesizes the body of a Python module that defines extract_questions(page) for the Playwright sync API. The model works from exemplar chunks plus a minimal page probe summary, not the full HTML. Returns two strings: structure_notes and extractor_python.

Inputs

| Field | Type | Notes |
| --- | --- | --- |
| url | string | Page URL for context |
| operator_hints | string | Optional; alias page_hints accepted |
| page_probe_summary | object | Required. Title, headings sample, text prefix, etc. |
| exemplar_chunks | list | Required. Chunk dicts with trimmed text/html snippets |
| temperature | number | Optional; default 0.1 |
| timeout_sec | int | Optional; default 420 |
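
The field rules above can be sketched as a small normalization step. This is an illustrative helper, not the task's actual code: the name normalize_inputs is hypothetical, and two behaviors are assumptions that the table does not state (url falls back to an empty string, and operator_hints wins when both it and page_hints are present).

```python
from typing import Any

DEFAULT_TEMPERATURE = 0.1   # default listed in the Inputs table
DEFAULT_TIMEOUT_SEC = 420   # default listed in the Inputs table


def normalize_inputs(raw: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical helper: check required fields and fill in defaults."""
    for key in ("page_probe_summary", "exemplar_chunks"):
        if key not in raw:
            raise ValueError(f"missing required input: {key}")
    if not isinstance(raw["page_probe_summary"], dict):
        raise TypeError("page_probe_summary must be an object (dict)")
    if not isinstance(raw["exemplar_chunks"], list):
        raise TypeError("exemplar_chunks must be a list")

    # operator_hints is optional and page_hints is accepted as an alias.
    # Assumption: operator_hints takes precedence when both are given.
    hints = raw.get("operator_hints", raw.get("page_hints", ""))

    return {
        "url": raw.get("url", ""),  # assumption: empty string when absent
        "operator_hints": hints,
        "page_probe_summary": raw["page_probe_summary"],
        "exemplar_chunks": raw["exemplar_chunks"],
        "temperature": float(raw.get("temperature", DEFAULT_TEMPERATURE)),
        "timeout_sec": int(raw.get("timeout_sec", DEFAULT_TIMEOUT_SEC)),
    }
```

A caller that passes only page_probe_summary, exemplar_chunks, and page_hints would get the hints back under operator_hints, with temperature 0.1 and timeout_sec 420.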

The user JSON payload is encoded as UTF-8 and truncated to 120000 bytes.
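
One way to apply a byte-level limit without leaving a broken character at the end is sketched below. The helper names truncate_utf8 and build_user_json are hypothetical, and the task may cut the payload differently; only the 120000-byte limit comes from this document.

```python
import json

MAX_USER_JSON_BYTES = 120_000  # limit stated in this document


def truncate_utf8(text: str, max_bytes: int = MAX_USER_JSON_BYTES) -> str:
    """Cut text to at most max_bytes of UTF-8 without splitting a character."""
    encoded = text.encode("utf-8")
    if len(encoded) <= max_bytes:
        return text
    # errors="ignore" drops a trailing partial multi-byte sequence, if any.
    return encoded[:max_bytes].decode("utf-8", errors="ignore")


def build_user_json(payload: dict) -> str:
    """Hypothetical: serialize the inputs, then apply the byte limit."""
    return truncate_utf8(json.dumps(payload, ensure_ascii=False))
```

A payload that is actually cut this way will generally no longer parse as JSON, so trimming exemplar chunks before serialization is the safer way to stay under the limit.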

Output

{
  "structure_notes": "<string>",
  "extractor_python": "<module body with def extract_questions>"
}
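
A caller may want to confirm this shape before using the result. The sketch below is one possible check, not part of the task itself: check_output is a hypothetical name, and the sample module body (its .question selector and list-of-dicts return value) is invented for illustration. The sample is only parsed, never executed, so Playwright is not needed to run the check.

```python
import ast
from typing import Any

# Hypothetical extractor_python value; the selector and the return shape
# are illustrative assumptions, not something this document specifies.
SAMPLE_EXTRACTOR = '''
def extract_questions(page):
    questions = []
    for item in page.locator(".question").all():
        questions.append({"text": item.inner_text().strip()})
    return questions
'''


def check_output(result: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the shape looks right."""
    problems = []
    for key in ("structure_notes", "extractor_python"):
        if not isinstance(result.get(key), str):
            problems.append(f"{key} must be a string")

    code = result.get("extractor_python")
    if isinstance(code, str):
        try:
            tree = ast.parse(code)
        except SyntaxError as exc:
            problems.append(f"extractor_python does not parse: {exc.msg}")
        else:
            # Only top-level function definitions are considered.
            names = {n.name for n in tree.body if isinstance(n, ast.FunctionDef)}
            if "extract_questions" not in names:
                problems.append(
                    "extractor_python lacks a top-level def extract_questions"
                )
    return problems
```

Parsing with ast catches syntax errors and a missing entry point without running model-generated code.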

Policy

Uses the shared discover system prompts, plus an exemplar note and extra Locator API guardrails. The model response is parsed with parse_json_object_lenient. If the model omits the expected JSON keys, the task attempts to recover extractor_python from a fenced code block in the response.
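
The parse-then-recover flow can be sketched roughly as follows. This is not the implementation of parse_json_object_lenient, whose behavior is not described here. The function parse_with_recovery, the outermost-braces heuristic, the fence pattern, and the empty-string fallback for structure_notes are all assumptions made for illustration.

```python
import json
import re
from typing import Any

FENCE = "`" * 3  # three backticks, built this way to keep the example readable
FENCE_RE = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def parse_with_recovery(raw: str) -> dict[str, Any]:
    """Hypothetical sketch: lenient JSON parse, then fenced-code recovery."""
    result: dict[str, Any] = {}

    # Lenient step (assumed heuristic): try the outermost {...} span so that
    # prose before or after the JSON object does not break parsing.
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        try:
            parsed = json.loads(raw[start:end + 1])
            if isinstance(parsed, dict):
                result = parsed
        except json.JSONDecodeError:
            pass

    # Recovery step: if extractor_python is missing, take the first fenced
    # block that defines extract_questions.
    if not isinstance(result.get("extractor_python"), str):
        for match in FENCE_RE.finditer(raw):
            body = match.group(1)
            if "def extract_questions" in body:
                result["extractor_python"] = body.strip() + "\n"
                break

    # Assumption: fall back to empty notes rather than failing.
    result.setdefault("structure_notes", "")
    return result
```

With this sketch, a response that contains only prose and a fenced Python block still yields an extractor_python string, while a well-formed JSON object passes through unchanged.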