A 24/7 voice-driven apartment viewing scheduling assistant using Twilio/WebRTC, an 8-step FSM, and Google Calendar integration.
https://github.com/davidbmar/voice-calendar-scheduler-FSM · public · shipped

An automated voice agent that handles inbound calls or browser connections to schedule apartment viewings. A Finite State Machine guides each caller through preference gathering, RAG-based listing search, availability checking, and final booking on Google Calendar. Faster-Whisper provides speech-to-text and Piper provides text-to-speech.
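To make the FSM idea concrete, here is a minimal sketch of a slot-driven state machine. It covers only the four phases named above, not the project's actual 8 steps. The stage names, slot names, and the `Conversation` class are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    # Hypothetical stage names based on the four phases named in the description.
    GATHER_PREFERENCES = auto()
    SEARCH_LISTINGS = auto()
    CHECK_AVAILABILITY = auto()
    BOOK_VIEWING = auto()
    DONE = auto()


# Forward transitions: each stage hands off to exactly one successor.
NEXT_STAGE = {
    Stage.GATHER_PREFERENCES: Stage.SEARCH_LISTINGS,
    Stage.SEARCH_LISTINGS: Stage.CHECK_AVAILABILITY,
    Stage.CHECK_AVAILABILITY: Stage.BOOK_VIEWING,
    Stage.BOOK_VIEWING: Stage.DONE,
}

# The slot (assumed name) that must be filled before a stage can be left.
REQUIRED_SLOT = {
    Stage.GATHER_PREFERENCES: "preferences",
    Stage.SEARCH_LISTINGS: "listing_id",
    Stage.CHECK_AVAILABILITY: "time_slot",
    Stage.BOOK_VIEWING: "confirmed",
}


@dataclass
class Conversation:
    stage: Stage = Stage.GATHER_PREFERENCES
    slots: dict = field(default_factory=dict)

    def update(self, **new_slots) -> Stage:
        """Record slot values, then advance while the current stage's slot is filled."""
        self.slots.update(new_slots)
        while self.stage in REQUIRED_SLOT and self.slots.get(REQUIRED_SLOT[self.stage]):
            self.stage = NEXT_STAGE[self.stage]
        return self.stage
```

Because the loop advances whenever a stage's slot is already filled, a caller who volunteers a time before picking a listing skips the availability question once the listing is known. A real orchestrator would likely also handle corrections and backward transitions.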
git clone --recursive https://github.com/davidbmar/voice-calendar-scheduler-FSM
cd voice-calendar-scheduler-FSM
./scripts/setup.sh
cp .env.example .env
$EDITOR .env
./scripts/start.sh
flowchart TD
Caller[Caller Phone or Browser]
Twilio[Twilio PSTN]
WebRTC[Browser WebRTC]
MediaStream[TwilioMediaStreamChannel]
Signaling[WebRTC Signaling WS]
Session[SchedulingSession]
STT[STT Faster Whisper]
FSM[FSM Orchestrator]
TTS[TTS Piper]
RAG[RAG Service LanceDB]
GCal[Google Calendar API]
LLM[LLM Claude or Ollama]
Caller -->|PSTN| Twilio
Caller -->|WebRTC| WebRTC
Twilio --> MediaStream
WebRTC --> Signaling
MediaStream --> Session
Signaling --> Session
Session --> STT
Session --> FSM
Session --> TTS
FSM --> RAG
FSM --> GCal
FSM --> LLM
LLM --> FSM
Built with Python 3.11+, using FastAPI for the backend. Twilio Media Streams carry PSTN audio and WebRTC carries browser audio. The core logic lives in an engine included as a git submodule, which provides FSM orchestration, LLM abstraction (Claude/Ollama), and voice processing. Apartment data is indexed in LanceDB for RAG-based search, and bookings are managed via the Google Calendar API.
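Twilio Media Streams deliver call audio as base64-encoded 8 kHz mu-law bytes inside JSON `media` messages, so the PSTN path needs a conversion to linear PCM before speech-to-text. The sketch below shows one way to do that with the standard G.711 mu-law expansion. The function names are hypothetical, and the repository's `TwilioMediaStreamChannel` may handle this differently.

```python
import base64
import json
import struct


def ulaw_byte_to_pcm16(u: int) -> int:
    """Expand one G.711 mu-law byte into a signed 16-bit linear sample."""
    u = ~u & 0xFF                      # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def media_message_to_pcm16(message: str) -> bytes:
    """Convert one Media Streams WebSocket message to little-endian PCM16.

    Returns empty bytes for non-media events such as "start" or "stop".
    """
    event = json.loads(message)
    if event.get("event") != "media":
        return b""
    ulaw = base64.b64decode(event["media"]["payload"])
    samples = [ulaw_byte_to_pcm16(b) for b in ulaw]
    return struct.pack(f"<{len(samples)}h", *samples)
```

The output is still sampled at 8 kHz. Speech-to-text models such as Whisper generally expect 16 kHz input, so a resampling step would normally follow.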
sequenceDiagram
participant Caller
participant Channel as Voice Channel
participant Session as Scheduling Session
participant FSM as FSM Orchestrator
participant Tool as External Tools
Caller->>Channel: Speak Audio
Channel->>Session: Stream PCM Audio
Session->>FSM: Process Input
FSM->>Tool: Query Listings or Calendar
Tool-->>FSM: Return Data
FSM->>Session: Generate Response Text
Session->>Channel: Synthesize Speech
Channel->>Caller: Play Audio
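The sequence above amounts to a per-utterance pipeline: audio in, text through the FSM, audio out. A minimal sketch of that turn loop follows, with each stage injected as a plain callable. `TurnPipeline` and its field names are hypothetical, and the lambdas are stand-ins rather than real Faster-Whisper, FSM, or Piper calls.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TurnPipeline:
    # Each stage is a plain callable so real engines can be swapped in later.
    transcribe: Callable[[bytes], str]   # stands in for speech-to-text
    decide: Callable[[str], str]         # stands in for the FSM orchestrator
    synthesize: Callable[[str], bytes]   # stands in for text-to-speech

    def handle_turn(self, pcm_audio: bytes) -> bytes:
        """Run one caller utterance through STT, the FSM, and TTS."""
        text = self.transcribe(pcm_audio)
        if not text.strip():
            # Nothing was recognized, so there is nothing to answer.
            return b""
        reply = self.decide(text)
        return self.synthesize(reply)


# Stub stages for demonstration only: "audio" here is just UTF-8 text.
pipeline = TurnPipeline(
    transcribe=lambda audio: audio.decode("utf-8"),
    decide=lambda text: f"You said: {text}",
    synthesize=lambda reply: reply.encode("utf-8"),
)
```

Keeping the stages as injected callables means the same loop can serve both the Twilio and WebRTC channels, and each stage can be tested without audio hardware or model weights.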
Clone the repository with --recursive so the engine submodule is included. Run ./scripts/setup.sh to create a virtual environment and install dependencies. Copy .env.example to .env and set the API keys for the LLM, Twilio, and Google Calendar. Then run ./scripts/start.sh to start the RAG service, the backend, and the optional editor.
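Since a missing key in .env typically only shows up once a call is in progress, a small pre-flight check can be useful. The sketch below reports which settings are unset. The variable names are assumptions for illustration, so check .env.example for the names the project actually uses. The LLM key may also be unnecessary when running against Ollama.

```python
import os

# Hypothetical variable names; the real ones are defined in .env.example.
REQUIRED_VARS = (
    "ANTHROPIC_API_KEY",
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "GOOGLE_CALENDAR_ID",
)


def missing_settings(env=None, required=REQUIRED_VARS):
    """Return the names in `required` that are unset or blank in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name, "").strip()]
```

Calling `missing_settings()` with no arguments checks the current process environment, so it would need to run after the .env file has been loaded.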