What it takes to turn RIFF — a working AI voice agent — into a paid after-hours answering service, priced like a night-shift worker: $1 / hour on shift (8pm–8am = $12 a night, unlimited calls).
RIFF already answers real phone calls: Telnyx routes the call, Gemini Live handles the voice, and a deterministic state machine drives the conversation through YAML-defined flows (booking, inquiries, message-taking). The conversation engine is production-grade. The business wrapper around it — billing, multi-tenant accounts, message hand-off, secrets/auth — is the work that stands between today and the first paying customer.
You hire RIFF for a shift the way you'd hire a person — $1 for every hour it's on the line, billed for the window, not per call or per minute. A typical 8pm–8am shift is 12 hours = $12. The customer pays for coverage (hours staffed); RIFF only spends money on the talk minutes when calls actually come in. That gap is the margin.
At moderate volume the margin is large; the live risk is a single line that rings nonstop. The companion spreadsheet (riff_cost_per_hour.csv) recalculates all of this from real numbers as soon as vendor prices are filled in.
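The coverage-versus-talk-time gap can be sketched numerically. This is a minimal illustration, not the contents of riff_cost_per_hour.csv: the `ShiftEconomics` class and the `0.05` per-minute cost are placeholders invented for the example, and only the $1/hour price and the 12-hour shift come from the text above.

```python
from dataclasses import dataclass

PRICE_PER_HOUR = 1.00  # from the plan: $1 per coverage-hour


@dataclass
class ShiftEconomics:
    # PLACEHOLDER: blended telephony + model cost per minute of live talk.
    # The real figure is unknown until vendor prices are filled in.
    cost_per_talk_minute: float
    hours: int = 12  # the 8pm-8am shift

    def revenue(self) -> float:
        """What the customer pays: hours staffed, regardless of call volume."""
        return PRICE_PER_HOUR * self.hours

    def margin(self, talk_minutes: float) -> float:
        """Revenue minus what RIFF spends on minutes actually talked."""
        return self.revenue() - self.cost_per_talk_minute * talk_minutes

    def break_even_minutes(self) -> float:
        """Talk minutes per shift at which the margin reaches zero."""
        return self.revenue() / self.cost_per_talk_minute


if __name__ == "__main__":
    econ = ShiftEconomics(cost_per_talk_minute=0.05)  # placeholder rate
    for minutes in (30, 120, 12 * 60):  # quiet night, busy night, nonstop line
        print(f"{minutes:4d} talk min -> margin ${econ.margin(minutes):.2f}")
    print(f"break-even at {econ.break_even_minutes():.0f} talk minutes")
```

With any positive per-minute cost, the nonstop case (720 talk minutes in a 12-hour shift) is the one to check first, since it is the only scenario where cost scales with the full window.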
No competitor prices per shift-hour — the "hire a worker for the night" framing is open whitespace. The two flanks:
| Alternative | What you pay | Catch |
|---|---|---|
| Live human / answering service | $250–$720/mo, or $15–25/hr | Per-minute meter + night surcharge; one 10-min 2am call ≈ $20–30 |
| Budget AI bots (Rosie, Dialzara) | $29–$49/mo | Cheaper sticker, but cap you at 60–250 min/mo |
| RIFF | $1/hr ($12/night) | Dedicated coverage for the window, unlimited calls |
Positioning: not the cheapest sticker — the cheapest dedicated worker. The value line: one after-hours call to a human service can cost a whole RIFF night. Lead with reliability (won't hallucinate, takes accurate messages), not price — research shows "too cheap" reads as "untrustworthy" for calls people care about. Competitor pricing verified on vendor sites, June 2026.
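The comparison in the table can be checked with a few lines of arithmetic. The figures below are taken from the table; the 30-nights-per-month figure is an assumption added for illustration, and the variable names are hypothetical.

```python
NIGHT_HOURS = 12          # 8pm-8am shift, from the plan
RIFF_RATE = 1             # dollars per coverage-hour, from the plan
NIGHTS_PER_MONTH = 30     # ASSUMPTION: every night of a 30-day month is covered

riff_night = RIFF_RATE * NIGHT_HOURS            # one RIFF shift
riff_month = riff_night * NIGHTS_PER_MONTH      # a month of nightly shifts

# Human service billed hourly at $15-25/hr for the same 12-hour window.
human_hourly_night = (15 * NIGHT_HOURS, 25 * NIGHT_HOURS)

# Table estimate for a single 10-minute 2am call on a per-minute human service.
human_single_call = (20, 30)

print(f"RIFF night: ${riff_night}, RIFF month: ${riff_month}")
print(f"Human, hourly, one night: ${human_hourly_night[0]}-${human_hourly_night[1]}")
print(f"One human-answered call covers a RIFF night: {human_single_call[0] >= riff_night}")
```

Under the 30-night assumption RIFF's monthly total lands inside the human-service monthly range and well above the budget-bot sticker, which is consistent with positioning on dedicated coverage instead of price.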
| Capability | Status | Notes |
|---|---|---|
| Voice conversation engine | Ready | Telnyx + Gemini Live + deterministic FSM, ~4,800 tests |
| Flow authoring (YAML) | Ready | Booking, inquiry, message-taking flows exist |
| Calendar / SMS providers | Ready | Pluggable, degrade gracefully when down |
| Secrets & config | Partial | Local .env (not committed); needs a prod secret store |
| Observability | Partial | On-disk logs; no central dashboard or alerts |
| Human message hand-off | Stub | The core answering-service job — must be finished |
| Multi-tenant accounts | Missing | Single business / number / calendar today |
| Auth on control plane | Missing | Call transcripts readable without auth |
| Usage metering & billing | Missing | Tokens measured internally; no per-customer billing |
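The multi-tenant gap in the table (single business / number / calendar today) could be closed along these lines: look up a tenant by the dialed number and stamp a `tenant_id` on every session. This is a hedged sketch only. `TenantConfig`, `TenantRegistry`, `Session`, and `open_session` are hypothetical names for illustration and are not taken from the RIFF codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantConfig:
    """Hypothetical per-tenant settings: everything that differs per customer."""
    tenant_id: str
    business_name: str
    flow_path: str      # path to the tenant's YAML flow
    calendar_id: str
    phone_number: str   # inbound number routed to this tenant


@dataclass
class Session:
    """Hypothetical call session; tenant_id scopes storage and billing."""
    call_id: str
    tenant_id: str


class TenantRegistry:
    def __init__(self, tenants: list[TenantConfig]) -> None:
        self._by_number = {t.phone_number: t for t in tenants}

    def for_number(self, dialed: str) -> TenantConfig:
        """Resolve the tenant from the number the caller dialed."""
        tenant = self._by_number.get(dialed)
        if tenant is None:
            raise LookupError(f"no tenant configured for {dialed}")
        return tenant


def open_session(registry: TenantRegistry, call_id: str, dialed: str) -> Session:
    """Create a session that carries its tenant_id from the first moment."""
    tenant = registry.for_number(dialed)
    return Session(call_id=call_id, tenant_id=tenant.tenant_id)
```

Keeping tenants as data (one `TenantConfig` per customer) is what would let a second customer be added as a new registry entry, without touching the call-handling code.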
- tenant_id on every session, per-tenant config (flow, business name, calendar, phone-number routing), tenant-scoped storage. Serve customer #2 without a code change.
- .env in production.
- /healthz, structured logs shipped somewhere queryable, alerts on call errors and degraded providers.

Today's voice path uses Gemini Live native audio, the most natural-sounding option but priced on audio tokens (the dominant cost). An alternative path already partly built in RIFF: a cascade of speech-to-text → text LLM → text-to-speech.
- riff/adapters/dashscope.py), used today only for evaluation, not live calls.
- tts_server.py, Piper ≈ free) + local STT (Whisper) + a Qwen text model via DashScope. Text tokens cost an order of magnitude less than native-audio tokens, and local TTS removes the priciest line entirely.

NEXT: Model both paths side-by-side in the cost sheet (Scenario A: Gemini Live · Scenario B: Qwen cascade) once live Qwen/DashScope pricing is confirmed.
Pull live vendor pricing (Telnyx inbound per-minute, Gemini Live audio tokens, Qwen/DashScope tokens) into riff_cost_per_hour.csv. That single number — cost per coverage-hour — tells us whether $1/hr is a business or a hobby, and whether the Qwen cascade is worth building.
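One way the side-by-side model might be laid out, before real prices exist, is shown below. Every rate in this sketch is a placeholder chosen only to make the code run; none is a quoted vendor price. The `Scenario` class and its field names are hypothetical and do not reflect the actual layout of riff_cost_per_hour.csv.

```python
from dataclasses import dataclass

PRICE_PER_COVERAGE_HOUR = 1.00  # from the plan


@dataclass
class Scenario:
    name: str
    telephony_per_min: float   # PLACEHOLDER: inbound per-minute rate
    model_per_min: float       # PLACEHOLDER: model tokens converted to $/talk-minute
    tts_per_min: float = 0.0   # zero when TTS runs locally

    def cost_per_talk_minute(self) -> float:
        return self.telephony_per_min + self.model_per_min + self.tts_per_min

    def cost_per_coverage_hour(self, talk_minutes_per_hour: float) -> float:
        """Spend per hour staffed, given how many minutes of that hour are live calls."""
        return self.cost_per_talk_minute() * talk_minutes_per_hour

    def is_viable(self, talk_minutes_per_hour: float) -> bool:
        """True when an hour of coverage costs less than it earns."""
        return self.cost_per_coverage_hour(talk_minutes_per_hour) < PRICE_PER_COVERAGE_HOUR


# Placeholder rates only; replace with confirmed vendor pricing.
scenario_a = Scenario("A: Gemini Live", telephony_per_min=0.01, model_per_min=0.04)
scenario_b = Scenario("B: Qwen cascade", telephony_per_min=0.01, model_per_min=0.004)

if __name__ == "__main__":
    for talk in (5, 20, 60):  # talk minutes per coverage-hour; 60 = nonstop line
        for s in (scenario_a, scenario_b):
            cost = s.cost_per_coverage_hour(talk)
            print(f"{s.name:16s} {talk:2d} min/hr -> ${cost:.2f}/hr viable={s.is_viable(talk)}")
```

The useful output is the row at 60 talk minutes per hour: if a scenario stays under $1 there, the nonstop-line risk is covered by the price alone.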