The GOAP-lite Selector
Built behind a flag · pilot pending
How a goal-directed group of states picks its next state — deterministically, from the slot contract, while the LLM stays language-only. This is phase 7 (the last) of the goal hierarchy.
Built in three commits (f68fb73 → b27f69d → 48cf042), behind a dual gate: a segment must declare
selector: goap_lite and the runtime must set RIFF_GOAP_SELECTOR=1. With the flag
unset (the default) it is a proven no-op: existing flows are byte-for-byte unchanged. Done:
the additive capability schema, the pure riff/goap_selector.py module with 7 predicate
tests, and the integration into evaluate_next_step. Remaining (Codex's
plan, increments 6/7): the first pilot flow conversion, and the held-out non-regression check before promotion. The deterministic recovery
half shipped earlier (see Stall recovery).
What & why
Today a flow's states are wired into a hand-authored chain: ask_name → ask_phone →
ask_address. That chain is brittle — it can't adapt when a caller front-loads three facts at
once, or withholds one, or answers out of order. The GOAP-lite selector replaces the
chain inside a GoalSegment: the segment owns a
slot contract, and after every caller turn the FSM recomputes which member state to
enter next, based on which target slots are still unfilled. The member states become
capabilities ("I can collect name + phone") rather than fixed next-steps.
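To make the "states as capabilities" idea concrete, here is a minimal sketch of what such declarations could look like. The field names collects, repairs and selector come from the schema named in this document; required_slots, members, the ask_contact state and the helper states_that_collect as written here are illustrative assumptions, not the project's actual definitions.

```python
from __future__ import annotations
from dataclasses import dataclass


@dataclass(frozen=True)
class StateDef:
    name: str
    collects: tuple[str, ...] = ()  # slots this state can ask for
    repairs: tuple[str, ...] = ()   # slots this state can re-ask or fix


@dataclass(frozen=True)
class GoalSegment:
    name: str
    required_slots: tuple[str, ...]  # hypothetical name for the slot contract
    members: tuple[StateDef, ...]    # hypothetical name for the member states
    selector: str = "goap_lite"


def states_that_collect(segment: GoalSegment, slot: str) -> list[str]:
    """Return member state names able to collect `slot`, in declared order."""
    return [s.name for s in segment.members if slot in s.collects]


booking = GoalSegment(
    name="collect_contact",
    required_slots=("name", "phone", "address"),
    members=(
        StateDef("ask_name", collects=("name",)),
        StateDef("ask_contact", collects=("name", "phone")),
        StateDef("ask_address", collects=("address",)),
    ),
)
```

In this sketch no state names a successor. The only question the selector can ask is which states are able to collect a given slot, which is what lets it reroute when a caller answers out of order.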
Why “GOAP-lite”?
GOAP (Goal-Oriented Action Planning, from game AI — F.E.A.R.) picks actions by their preconditions and effects to reach a goal, instead of hard-coding behavior chains. Full GOAP runs a planner (A* over world states) — too open-ended and opaque for a phone call where every “tick” is an expensive caller turn and safety demands determinism.
RIFF borrows only the idea, not the planner: states declare what slots they
collect/repair and their preconditions, and the selector does one cheap,
deterministic thing — pick the next state that can satisfy the next missing slot. No
search, no planner opacity. That’s the “-lite.”
The mechanism
The whole selector is this pure function — deterministic, no LLM in the loop:
    # The FSM picks the next member state of a GoalSegment.
    def choose_next(segment, ctx):
        apply_slot_observations(ctx)  # extraction/validation only
        if exit_guard(segment, ctx):
            return segment.exit_target
        repair = first_repair_needed(segment, ctx)  # a slot invalid / low-confidence?
        if repair:
            return repair.state
        slot = first_unsatisfied_slot(segment, ctx)
        if slot is None:
            return segment.exit_target
        candidates = [s for s in states_that_collect(slot) if preconditions_hold(s, ctx)]
        if not candidates:
            return segment.fallback_state
        return min(candidates, key=deterministic_cost_tuple)
Read it top to bottom: validate what we have → can we leave? → is anything broken (repair first)? → what’s the next missing slot? → which state can collect it? → pick deterministically. The order encodes the policy.
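The same ordering can be exercised end to end with a small runnable toy. This is a sketch under stated assumptions, not the riff/goap_selector.py implementation: the slot ledger is a plain dict mapping a slot to "valid" or "invalid" (absent means unfilled), the State and Segment classes are hypothetical, and the cost tuple is assumed to be (declared order, name).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    name: str
    order: int             # declared order, used as the deterministic tie-break
    collects: tuple = ()   # slots this state can ask for
    repairs: tuple = ()    # slots this state can fix
    requires: tuple = ()   # slots that must already be valid (preconditions)


@dataclass(frozen=True)
class Segment:
    required: tuple
    states: tuple
    exit_target: str = "read_back"
    fallback_state: str = "fallback"


def choose_next(segment, ledger):
    """Pick the next state name from the ledger alone; no LLM input."""
    # Exit guard: leave only when every required slot is valid.
    if all(ledger.get(s) == "valid" for s in segment.required):
        return segment.exit_target
    # Repair comes before collecting anything new.
    broken = [s for s in segment.required if ledger.get(s) == "invalid"]
    if broken:
        slot, capability = broken[0], "repairs"
    else:
        slot = next(s for s in segment.required if ledger.get(s) != "valid")
        capability = "collects"
    candidates = [
        st for st in segment.states
        if slot in getattr(st, capability)
        and all(ledger.get(r) == "valid" for r in st.requires)
    ]
    if not candidates:
        return segment.fallback_state
    # Assumed cost tuple: declared order first, then name.
    return min(candidates, key=lambda st: (st.order, st.name)).name


SEG = Segment(
    required=("name", "phone", "address"),
    states=(
        State("ask_name", 0, collects=("name",)),
        State("ask_phone", 1, collects=("phone",)),
        State("ask_address", 2, collects=("address",)),
        State("fix_phone", 3, repairs=("phone",)),
    ),
)
```

Because the function reads only the segment and the ledger, calling it twice with the same inputs always yields the same state, which is the property the real selector relies on.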
The LLM never chooses the transition: any next_state it suggests is ignored and logged. The
FSM owns every transition; the selector is what lets the group act together toward its
goal without handing control to the model. This is RIFF’s core safety boundary.
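One way to picture that boundary is a sketch in which the model's reply is treated as language only and any state it proposes is discarded. The function name resolve_transition, the reply shape and the logger name are assumptions made for illustration; the real code path may differ.

```python
import logging

logger = logging.getLogger("riff.sketch")  # hypothetical logger name


def resolve_transition(llm_reply: dict, fsm_choice: str) -> str:
    """Return the FSM's choice; a model-suggested next_state is only logged."""
    suggested = llm_reply.get("next_state")
    if suggested is not None:
        logger.info(
            "ignoring LLM-suggested next_state=%r; FSM chose %r",
            suggested,
            fsm_choice,
        )
    return fsm_choice
```

The return value never depends on llm_reply, so no wording from the model can change where the call goes next.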
Failure modes & deterministic defenses
| Risk | Defense |
|---|---|
| Oscillation (A↔B forever) | Hysteresis — keep pursuing the current slot until it’s valid / hits max_attempts / is blocked. Tie-break by declared order, never by LLM wording. |
| Premature exit | Exit only when every required slot is valid (mentioned ≠ valid), with confidence + confirmation satisfied. |
| Deadlock (no state collects a slot) | Compile-time check: every required slot has ≥1 collecting/repairing state; ordering acyclic. Runtime fallback_state if no candidate is actionable. |
| Repair loops | Per-slot + per-segment attempt caps, then deterministic fallback. |
| Bad re-entry | Statechart history is a hint; recompute the contract from the slot ledger before choosing. |
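Two of the defenses in the table are simple enough to sketch directly: the compile-time coverage check and hysteresis with a per-slot attempt cap. The helper names, the dict shapes and the rule "move on to the first non-valid slot that still has attempts left" are assumptions for illustration, not the shipped lint or selector code.

```python
def uncovered_slots(required_slots, states):
    """Compile-time check: required slots no state collects or repairs.

    `states` maps a state name to {"collects": [...], "repairs": [...]}.
    """
    covered = set()
    for caps in states.values():
        covered.update(caps.get("collects", ()))
        covered.update(caps.get("repairs", ()))
    return [s for s in required_slots if s not in covered]


def slot_to_pursue(current, ledger, attempts, max_attempts, required):
    """Hysteresis: stay on `current` until it is valid or its cap is hit."""
    if (
        current is not None
        and ledger.get(current) != "valid"
        and attempts.get(current, 0) < max_attempts
    ):
        return current
    # Otherwise take the first slot, in declared order, that is still
    # unsatisfied and has attempts left; None means "use the fallback".
    for slot in required:
        if ledger.get(slot) != "valid" and attempts.get(slot, 0) < max_attempts:
            return slot
    return None
```

A non-empty result from uncovered_slots would fail the build, and a None from slot_to_pursue is the point where a deterministic fallback would take over.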
The dual gate (how it stays safe)
The selector hooks into StateManager.evaluate_next_step at exactly one place —
after the declared transitions: loop and before the missing-slot ask
fallback. So authored exits, tool/confirm branches, and the when: stalled recovery all
still win; the selector only acts when no declared edge fired. And it acts only behind two locks:
    # riff/state_manager.py: evaluate_next_step, after the declared transition loop
    if os.environ.get("RIFF_GOAP_SELECTOR") == "1":  # lock 1: runtime flag
        seg = active_segment_for_state(self.flow, ctx.current_state)
        if seg is not None:  # lock 2: state is in a selector:goap_lite segment
            dec = choose_next(self.flow, seg, ctx, evaluate_guard)
            if dec and dec.to_state != ctx.current_state:
                return NextStep(recommended_action="transition",
                                candidate_transition=dec.to_state, ...)
With the flag unset (default), the first if is false and the whole block is skipped:
121 unit tests pass unchanged. With RIFF_GOAP_SELECTOR=1, weather/coffee/austin_plumbing
still score 10.0, because they have no goap segments and lock 2 never opens. With the
flag on and a member state, the selector routes deterministically: for example, ask_name with the
name filled routes to ask_phone for the next missing slot (“goap_lite: collect
phone via ask_phone”).
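The two locks can be isolated into a small predicate to show why either one alone keeps the selector dormant. This is a sketch: selector_enabled, the dict-shaped segment and the "chain" selector value are hypothetical, and the env parameter exists only so the example can run without touching the real process environment.

```python
import os


def selector_enabled(segment, env=None):
    """True only when both locks are open."""
    env = os.environ if env is None else env
    # Lock 1: the runtime flag must be exactly "1".
    if env.get("RIFF_GOAP_SELECTOR") != "1":
        return False
    # Lock 2: the current state must belong to a goap_lite segment.
    return segment is not None and segment.get("selector") == "goap_lite"
```

A flow with no goap_lite segment passes None (or a segment with another selector) and stays on the existing path even when the flag is set.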
What's shipped vs. remaining
| Increment | Status |
|---|---|
| 1. Additive capability schema (StateDef.collects/repairs/preconditions/effects, GoalSegment.selector) + loader | ✓ shipped f68fb73 |
| 3. Pure riff/goap_selector.py (choose_next + predicates) + 7 tests | ✓ shipped b27f69d |
| 4. Integration into evaluate_next_step behind the dual gate | ✓ shipped 48cf042 |
| 2. Lint gates for goap segments (collector coverage, single-segment membership, exit guard) | ✓ shipped 3c836b3 |
| 5/6. Replay A/B probe flow + first pilot conversion of a real collect segment | planned |
| 7. Held-out non-regression before promoting any production flow | planned |
The hardest, riskiest piece (deterministic recovery when a caller stonewalls a
required slot) shipped earlier as the stalled guard + sets:
primitive (commit e22399d), which is choose_next’s
fallback_state branch made concrete. See Stall
recovery. With the lint gates also shipped, what remains is proving it on a real flow (pilot + held-out), not
the core mechanism.
Use case
A booking flow’s collect segment has three target slots (name, phone, address)
and three member states. A cooperative caller front-loads all three in one breath: the selector sees
all slots filled, exit_guard passes, and it jumps straight to read-back — no walking
empty “ask” states. A withholding caller gives only two: the selector routes to the state
that collects the third; if they keep stonewalling, the stalled recovery edge books a
default and exits gracefully instead of looping to escalation. One contract, every caller shape
handled — without a hand-wired chain.
Where it fits
The selector is the action half of the goal hierarchy: the GoalSegment schema declares the contract, the group-cohesion metric measures how well a group’s states already work together, and the selector is how a group would be driven to maximize that cohesion — deterministically, with the LLM kept to language. It is the last phase precisely because it depends on everything below it being in place.