Deep dive
You swap the default utterance generator from Anthropic Haiku to Alibaba Qwen using a lightweight HTTP client. This cuts costs and dependencies while preserving Anthropic as a fallback option.
The `Persona` class hardcoded Anthropic Haiku for default utterances, creating unnecessary cost and vendor lock-in. You built `AlibabaUtteranceClient` with plain `urllib` to hit DashScope’s OpenAI-compatible endpoint directly. This avoids heavy SDK bloat. The `load_alibaba_creds()` function parses `~/.aws/alibaba` for keys, allowing tests to skip integration when configs are missing. `Persona` now defaults to this new client but keeps `AnthropicUtteranceClient` ready for switching.
Strategy Pattern for pluggable LLM backends
Configuration loading with environment variable overrides
Add retry logic to `AlibabaUtteranceClient` to mitigate `urllib` limitations.
Validate the `~/.aws/alibaba` file format strictly to prevent whitespace parsing errors.
Benchmark Qwen latency against Haiku to adjust downstream timeout expectations.