Droid

A local text-to-speech experiment for android-style voice design. The first pass compares core DSP/vocoder paths; the new Space Channel pass tests ten network-operations droid voices against the same prompt set.

10 Space Channel presets

66 playable samples

0.5 s typical generation time for a_dsp, b_vocoder, and d_hybrid

Local: macOS, Piper, NumPy/SciPy

Space Channel Droid Voices

Ten new presets tuned from the current MVP feedback: keep the metallic a_dsp character, reduce scratch, and push toward a Euro/protocol network-operations voice.


retirement_party_mvp, space_ops_switchboard, uap_archive_vocoder, bug_intake_clerk, deep_space_radio

Genesis Of The Variants

`a_dsp` - Processed Speech Droid

This began as the fastest baseline: synthesize clean Piper speech, then make the voice mechanical with ring modulation, comb filtering, bit-crush, tone leak, and soft clipping. It keeps the words clear because the human-like TTS remains the dominant layer.
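The effect chain named above can be sketched in a few lines of NumPy/SciPy. This is only an illustration of the general techniques, not the project's actual code: the function names, the stage order, and every parameter value (carrier frequency, delay, bit depth, leak level, drive) are assumptions chosen for the example, and the input is assumed to be a mono float array in the range -1 to 1.

```python
import numpy as np
from scipy import signal


def ring_mod(x, sr, freq_hz=70.0, depth=0.4):
    """Blend the dry signal with a copy multiplied by a sine carrier."""
    t = np.arange(len(x)) / sr
    return (1.0 - depth) * x + depth * x * np.sin(2 * np.pi * freq_hz * t)


def comb_filter(x, sr, delay_ms=5.0, feedback=0.35):
    """Feedback comb filter: y[n] = x[n] + feedback * y[n - d]."""
    d = max(1, int(sr * delay_ms / 1000.0))
    a = np.zeros(d + 1)
    a[0], a[d] = 1.0, -feedback
    return signal.lfilter([1.0], a, x)


def bit_crush(x, bits=10):
    """Quantize amplitudes to a coarser grid (assumed bit depth)."""
    levels = 2 ** (bits - 1)
    return np.round(x * levels) / levels


def tone_leak(x, sr, freq_hz=220.0, level=0.01):
    """Add a quiet constant sine under the speech."""
    t = np.arange(len(x)) / sr
    return x + level * np.sin(2 * np.pi * freq_hz * t)


def soft_clip(x, drive=1.5):
    """tanh saturation; the output always stays within -1 to 1."""
    return np.tanh(drive * x)


def droid_chain(x, sr):
    """Apply the stages in the order the description lists them."""
    y = ring_mod(x, sr)
    y = comb_filter(y, sr)
    y = bit_crush(y)
    y = tone_leak(y, sr)
    return soft_clip(y)
```

Because every stage is a cheap array operation on an already-rendered waveform, the speech itself passes through largely intact, which is consistent with the words staying clear.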

`b_vocoder` - Carrier-Synthesis Droid

This variant came from the classic robot-vocoder idea: extract the speech envelope and use it to drive a saw/square synthetic carrier. It is the most machine-generated path, with stronger droid character and less natural intelligibility.
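A minimal channel-vocoder sketch shows the idea of speech envelopes driving a synthetic carrier. It is not the project's implementation: the band count, band edges, envelope cutoff, carrier pitch, and the `channel_vocoder` name are all assumptions, and only a sawtooth carrier is shown.

```python
import numpy as np
from scipy import signal


def channel_vocoder(speech, sr, f0=110.0, n_bands=12, fmin=100.0, fmax=6000.0):
    """Impose per-band speech envelopes on a sawtooth carrier.

    Assumes a mono float array and sr greater than 2 * fmax.
    """
    speech = np.asarray(speech, dtype=float)
    t = np.arange(len(speech)) / sr
    carrier = signal.sawtooth(2 * np.pi * f0 * t)

    # Log-spaced band edges and a low-pass filter for envelope smoothing.
    edges = np.geomspace(fmin, fmax, n_bands + 1)
    env_sos = signal.butter(2, 50.0, btype="low", fs=sr, output="sos")

    out = np.zeros(len(speech))
    for lo, hi in zip(edges[:-1], edges[1:]):
        band_sos = signal.butter(2, [lo, hi], btype="band", fs=sr, output="sos")
        # Envelope: rectify the speech band, then smooth it.
        env = signal.sosfilt(env_sos, np.abs(signal.sosfilt(band_sos, speech)))
        env = np.maximum(env, 0.0)
        # Shape the same band of the carrier with that envelope.
        out += signal.sosfilt(band_sos, carrier) * env

    peak = np.max(np.abs(out)) if len(out) else 0.0
    return out / peak if peak > 0 else out
```

Since the output contains only filtered carrier, none of the original speech waveform survives, which matches the trade-off described above: stronger machine character, lower natural intelligibility.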

`c_light` - Premium Android

This tests the opposite hypothesis: start from a higher-quality Piper voice and apply only subtle metallic treatment. It is meant to sound like a cleaner protocol-style android rather than a harsh machine.

`d_hybrid` - Vocoder-Forward Hybrid

This was retuned after the first blend sounded too much like a_dsp. The new version makes the vocoder the primary layer and mixes in a smaller processed-speech layer only to recover clarity. It should sit closer to b_vocoder than to a_dsp.
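The vocoder-forward blend can be pictured as a weighted sum in which the vocoder layer holds more than half of the mix. The sketch below is an assumption about how such a blend might look; `mix_hybrid` and the 0.75 weight are illustrative and not taken from the project.

```python
import numpy as np


def mix_hybrid(vocoded, processed, vocoder_weight=0.75):
    """Blend two renders of the same phrase, vocoder layer dominant."""
    if not 0.5 < vocoder_weight <= 1.0:
        raise ValueError("vocoder_weight must be in (0.5, 1.0] to stay vocoder-forward")
    vocoded = np.asarray(vocoded, dtype=float)
    processed = np.asarray(processed, dtype=float)
    # Trim to the shorter render so the layers line up sample for sample.
    n = min(len(vocoded), len(processed))
    mixed = vocoder_weight * vocoded[:n] + (1.0 - vocoder_weight) * processed[:n]
    # Only scale down if the sum would exceed full scale.
    peak = np.max(np.abs(mixed)) if n else 0.0
    return mixed / peak if peak > 1.0 else mixed
```

Lowering `vocoder_weight` toward 0.5 would move the result back toward the processed-speech sound that the first blend was criticized for.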

Listen

Affirmative. All systems nominal. Awaiting your command.

| Variant | Generation time |
| --- | --- |
| a_dsp | 0.504 s |
| b_vocoder | 0.537 s |
| c_light | 0.891 s |
| d_hybrid | 0.544 s |

I am a synthetic intelligence. State your designation.

| Variant | Generation time |
| --- | --- |
| a_dsp | 0.501 s |
| b_vocoder | 0.525 s |
| c_light | 0.820 s |
| d_hybrid | 0.530 s |

Directive seven one four acknowledged. Power cell at forty two percent.

| Variant | Generation time |
| --- | --- |
| a_dsp | 0.525 s |
| b_vocoder | 0.553 s |
| c_light | 0.942 s |
| d_hybrid | 0.568 s |

I do not experience fear, but I recognize the urgency of your request.

| Variant | Generation time |
| --- | --- |
| a_dsp | 0.530 s |
| b_vocoder | 0.575 s |
| c_light | 1.013 s |
| d_hybrid | 0.587 s |

What This Shows

The effects are cheap. Generation time is dominated by launching Piper and synthesizing speech. a_dsp, b_vocoder, and d_hybrid all share the same Piper render, so they cluster around the same latency; b_vocoder and d_hybrid add only a few dozen milliseconds (roughly 25 to 60 ms across the four prompts above) for vocoder-band processing. c_light is slower because it uses a higher-quality Piper voice.
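The overhead claim can be checked directly from the four sets of timings listed under Listen. The snippet below uses only those published numbers; the variable names are illustrative, and it does not read timings.csv, whose layout is not shown here.

```python
# Generation times in seconds, one entry per prompt, copied from the Listen section.
timings = {
    "a_dsp": [0.504, 0.501, 0.525, 0.530],
    "b_vocoder": [0.537, 0.525, 0.553, 0.575],
    "c_light": [0.891, 0.820, 0.942, 1.013],
    "d_hybrid": [0.544, 0.530, 0.568, 0.587],
}

# Mean generation time per variant.
means = {name: sum(vals) / len(vals) for name, vals in timings.items()}

# Extra time relative to the a_dsp baseline, in milliseconds.
overhead_ms = {name: (means[name] - means["a_dsp"]) * 1000.0 for name in timings}

for name in timings:
    print(f"{name:10s} mean {means[name]:.3f} s  overhead {overhead_ms[name]:+.1f} ms")
```

On these four prompts the mean overhead comes out in the low tens of milliseconds for b_vocoder and d_hybrid and at about 0.4 s for c_light, which supports reading the Piper render, not the effects, as the dominant cost.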

For sub-200 ms replies, the practical path is to pre-generate common short responses and keep live synthesis for dynamic text. Full timing data is available in timings.csv.
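One way to picture the pre-generation approach is a small lookup keyed by variant and normalized text, with live synthesis as the fallback. This is a hedged sketch, not the project's code: `ReplyCache`, `normalize`, and the `synthesize(variant, text)` callable are hypothetical names, and the cache is held in memory here purely to keep the example short.

```python
from typing import Callable, Dict, Iterable, Tuple


def normalize(text: str) -> str:
    """Collapse case and whitespace so near-identical phrases share one entry."""
    return " ".join(text.lower().split())


class ReplyCache:
    """Serve pre-rendered audio when available, otherwise synthesize live."""

    def __init__(self, synthesize: Callable[[str, str], bytes]):
        # synthesize(variant, text) -> audio bytes; a stand-in for the real pipeline.
        self._synthesize = synthesize
        self._store: Dict[Tuple[str, str], bytes] = {}

    def pregenerate(self, variant: str, phrases: Iterable[str]) -> None:
        """Render common short responses ahead of time."""
        for phrase in phrases:
            self._store[(variant, normalize(phrase))] = self._synthesize(variant, phrase)

    def get(self, variant: str, text: str) -> Tuple[bytes, bool]:
        """Return (audio, cache_hit); dynamic text falls through to live synthesis."""
        key = (variant, normalize(text))
        if key in self._store:
            return self._store[key], True
        return self._synthesize(variant, text), False
```

A cache hit costs only a dictionary lookup, so the roughly half-second synthesis time applies only to text that was not anticipated.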

Built locally in ~/src/droid-voice. Deployed as a static listening page on davidbmar.com.
