forge-lcdl

Retrieval and RAG (`forge_lcdl.retrieval`)

Optional, dependency-light retrieval for the execution engine: a Retriever protocol, EvidencePack, deterministic keyword context adapter, and governed tasks rag_query_plan, rag_enough_context_gate, answer_from_evidence.

LCDL does not require a vector database. The default adapter wraps forge_lcdl.context.build_context_pack (repo scan + rank + trim).

Modes (ExecutionPolicy.rag)

Mode      Behavior
off       No retrieval; the input must satisfy the contract on its own (e.g. caller-supplied evidence).
auto      Uses InferencePlanner (which may call rag_query_plan) plus heuristics; may skip retrieval if the planner judges it unnecessary.
on        Resolves queries (from question / policy-on path) and retrieves when a retriever is configured.
required  Same as on, but fails if there is no retriever, no chunks are returned, or (when the gate is enabled) the context is insufficient.

Retriever protocol

Implement retrieve(self, query: RetrievalQuery) -> tuple[RetrievedChunk, ...].

Built-ins:

  • EmptyRetriever — returns no chunks (for tests, or to disable retrieval programmatically).
  • KeywordContextRetriever(repo_path, budget_chars=...) calls build_context_pack(query.query, repo_path, ...), then maps ContextItems to RetrievedChunks (ids derived from paths; top_k applied).
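
A custom retriever only needs the retrieve method from the protocol above. The following is a minimal sketch, not the library's implementation: the RetrievalQuery and RetrievedChunk classes are local stand-ins so the example runs without forge_lcdl installed, their fields (source_id, chunk_id, text) are assumptions, and InMemoryRetriever is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical stand-ins; the real forge_lcdl types may carry different fields.
@dataclass(frozen=True)
class RetrievalQuery:
    query: str
    top_k: int = 4


@dataclass(frozen=True)
class RetrievedChunk:
    source_id: str
    chunk_id: str
    text: str


class Retriever(Protocol):
    def retrieve(self, query: RetrievalQuery) -> tuple[RetrievedChunk, ...]: ...


class InMemoryRetriever:
    """Ranks a fixed set of documents by how many query terms they contain."""

    def __init__(self, docs: dict[str, str]) -> None:
        self._docs = docs

    def retrieve(self, query: RetrievalQuery) -> tuple[RetrievedChunk, ...]:
        terms = set(query.query.lower().split())
        scored = []
        for source_id, text in self._docs.items():
            score = len(terms & set(text.lower().split()))
            if score > 0:
                scored.append((score, source_id, text))
        # Highest score first; ties broken by source_id so output is deterministic.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return tuple(
            RetrievedChunk(source_id=sid, chunk_id=f"{sid}#0", text=text)
            for _, sid, text in scored[: query.top_k]
        )


retriever: Retriever = InMemoryRetriever(
    {
        "docs/contract.md": "contract spec for governed tasks",
        "docs/intro.md": "getting started guide",
    }
)
hits = retriever.retrieve(RetrievalQuery(query="contract spec", top_k=4))
```

Because the protocol is structural, InMemoryRetriever does not need to inherit from anything; matching the method signature is enough for a type checker.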

Evidence pack

build_evidence_pack(chunks, corpus_id=...) produces:

  • canonical_text — deterministic join for hashing and model input.
  • hash — short SHA-256 prefix for traces and cache scoping.
  • chunks — structured for evidence_pack_to_input_dict (user JSON for answer_from_evidence).
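
To illustrate why a deterministic join matters for hashing, here is a small sketch of the idea. It is not build_evidence_pack itself: the Chunk class, the join format, and the 12-character prefix length are all assumptions made for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:  # hypothetical stand-in for RetrievedChunk
    source_id: str
    chunk_id: str
    text: str


def canonical_text(chunks: tuple[Chunk, ...]) -> str:
    # Assumed format: one labelled block per chunk, joined in the given order.
    return "\n\n".join(
        f"[{c.source_id}:{c.chunk_id}]\n{c.text.strip()}" for c in chunks
    )


def short_hash(text: str, length: int = 12) -> str:
    # SHA-256 hex digest truncated to a short prefix (length is an assumption).
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:length]


chunks = (
    Chunk("README.md", "README.md#0", "LCDL does not require a vector database."),
    Chunk("CONTRACT-SPEC.md", "CONTRACT-SPEC.md#0", "contract.json v2 fields."),
)
pack_text = canonical_text(chunks)
pack_hash = short_hash(pack_text)
```

The same chunks in the same order always yield the same text and therefore the same hash, which is what makes the prefix usable as a cache key or trace identifier.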

Citations

validate_output_citations(output, evidence) returns error strings if citations reference unknown source_id / chunk_id. The rag.citations verifier runs when listed in contract.json verification_policy.verifier_ids and evidence is present in the verifier context. If insufficient_evidence is true, the verifier passes without strict citation checks.
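
The citation check described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's code: the output shape (a citations list of dicts plus an insufficient_evidence flag), the error wording, and the function names check_citations and verifier_passes are hypothetical.

```python
def check_citations(output: dict, evidence_chunks: list[dict]) -> list[str]:
    """Return one error string per citation that points at an unknown chunk."""
    # Assumed evidence shape: each chunk dict has source_id and chunk_id keys.
    known = {(c["source_id"], c["chunk_id"]) for c in evidence_chunks}
    errors = []
    for index, cite in enumerate(output.get("citations", [])):
        key = (cite.get("source_id"), cite.get("chunk_id"))
        if key not in known:
            errors.append(f"citation {index}: unknown chunk {key[0]}/{key[1]}")
    return errors


def verifier_passes(output: dict, evidence_chunks: list[dict]) -> bool:
    # Per the text above, insufficient_evidence skips strict citation checks.
    if output.get("insufficient_evidence") is True:
        return True
    return not check_citations(output, evidence_chunks)


evidence = [{"source_id": "README.md", "chunk_id": "README.md#0"}]
good = {"citations": [{"source_id": "README.md", "chunk_id": "README.md#0"}]}
bad = {"citations": [{"source_id": "README.md", "chunk_id": "README.md#9"}]}
declined = {"insufficient_evidence": True, "citations": []}
```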

Contract hooks

contract.json v2 (see CONTRACT-SPEC.md):

  • capabilities.supports_rag, requires_rag, returns_citations
  • rag_policy: mode, top_k, require_citations, enough_context_gate, etc.
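
As a rough illustration of how these fields might sit together, the sketch below builds a contract fragment as a Python dict and prints it as JSON. The field names come from this page; the nesting, the chosen values, and the omission of every other contract.json field are assumptions, so CONTRACT-SPEC.md remains the authority.

```python
import json

# Hypothetical fragment: values are illustrative, not defaults.
contract_fragment = {
    "capabilities": {
        "supports_rag": True,
        "requires_rag": False,
        "returns_citations": True,
    },
    "rag_policy": {
        "mode": "auto",  # one of: off, auto, on, required
        "top_k": 4,
        "require_citations": True,
        "enough_context_gate": True,
    },
    # Listing rag.citations here enables the citation verifier.
    "verification_policy": {"verifier_ids": ["rag.citations"]},
}

rendered = json.dumps(contract_fragment, indent=2)
print(rendered)
```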

The answer_from_evidence input schema accepts question alone for engine-driven RAG; evidence is optional in the caller's input because the engine adds it when it augments the input with retrieved chunks.

Failure messages (engine)

Examples:

  • rag required but no retriever configured
  • rag required but no evidence found
  • insufficient evidence (gate + required)
  • rag required but task contract does not accept evidence
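
The messages above map onto a short chain of checks for the required mode. The sketch below shows one plausible ordering; the function name, its parameters, and the order in which conditions are tested are assumptions, and only the message strings are taken from this page.

```python
from typing import Optional, Sequence


def required_mode_failure(
    has_retriever: bool,
    accepts_evidence: bool,
    chunks: Sequence[object],
    gate_enabled: bool = False,
    enough_context: bool = True,
) -> Optional[str]:
    """Return a failure message for rag mode 'required', or None if all checks pass."""
    if not accepts_evidence:
        return "rag required but task contract does not accept evidence"
    if not has_retriever:
        return "rag required but no retriever configured"
    if not chunks:
        return "rag required but no evidence found"
    # The enough-context gate only applies when it is enabled in rag_policy.
    if gate_enabled and not enough_context:
        return "insufficient evidence"
    return None


message = required_mode_failure(
    has_retriever=True, accepts_evidence=True, chunks=(), gate_enabled=True
)
```

Here message is the "no evidence found" failure, because the retriever exists but returned nothing.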

Example: unit test pattern

from pathlib import Path

from forge_lcdl.retrieval import KeywordContextRetriever, RetrievalQuery

# Scan the current directory with an 8000-character context budget.
r = KeywordContextRetriever(Path(".").resolve(), budget_chars=8000)
chunks = r.retrieve(RetrievalQuery(query="contract spec", top_k=4))
# The Retriever protocol returns a tuple of chunks.
assert isinstance(chunks, tuple)

See also