Developer Resources
Everything you need to install, configure, and deploy OpenVoiceUI. From zero to a working voice AI assistant in under 5 minutes.
Choose your preferred installation method. All options get you to a working voice AI assistant.
One command. Sets up the project, installs dependencies, and walks you through LLM configuration interactively.
npx openvoiceui setup
Download Pinokio, search for "OpenVoiceUI" in the app store, and click Install. Zero terminal interaction required.
Pinokio App Store → Search "OpenVoiceUI" → Install
Clone the repo, copy the example env file, add your API keys, and start both OpenVoiceUI and OpenClaw in containers.
git clone https://github.com/MCERQUA/OpenVoiceUI
cd OpenVoiceUI
cp .env.example .env # Add your API keys
docker compose up
Deploy to any Linux VPS with Docker installed. Point a domain, set up SSL via Cloudflare, and your assistant is accessible from anywhere. Runs on 2 cores / 4GB RAM minimum.
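As a rough pre-flight check for the stated minimum, the sketch below compares a host's core count and physical memory against 2 cores and 4GB. It is illustrative only and not part of OpenVoiceUI: the function names are hypothetical, "4GB" is read as 4 × 10^9 bytes, and the memory lookup relies on `os.sysconf`, which is assumed to be available (it is on Linux).

```python
import os

MIN_CORES = 2
MIN_RAM_BYTES = 4 * 10**9  # assumption: "4GB" read as 4 x 10^9 bytes


def meets_minimum(cores: int, ram_bytes: int) -> bool:
    """Return True when both the core count and RAM reach the stated minimum."""
    return cores >= MIN_CORES and ram_bytes >= MIN_RAM_BYTES


def detect() -> tuple[int, int]:
    """Return (logical cores, physical RAM in bytes); relies on Linux sysconf names."""
    cores = os.cpu_count() or 0
    ram = os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES")
    return cores, ram


if __name__ == "__main__":
    try:
        cores, ram = detect()
    except (AttributeError, OSError, ValueError):
        print("Could not read system resources on this platform")
    else:
        verdict = "ok" if meets_minimum(cores, ram) else "below the stated minimum"
        print(f"{cores} cores, {ram / 10**9:.1f} GB RAM: {verdict}")
```

A VPS sold as 4GB can report slightly less usable memory than its nominal size, so a borderline result is worth checking by hand.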
See the deployment guide on GitHub →
Open the repository in VS Code, accept the "Reopen in Container" prompt, and the dev container spins up with all dependencies pre-configured.
Open in Dev Container → Automatic setup
After installation, configure your LLM, TTS, and STT providers in the .env file.
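Before starting the containers, it can help to confirm which providers the .env file actually configures. The following is a minimal sketch under stated assumptions, not OpenVoiceUI's own loader: `parse_env` and `summarize` are hypothetical helpers, the parser handles only simple KEY=VALUE lines, and the variable names are the ones shown in this page's provider examples.

```python
from pathlib import Path

# Variable names taken from this page's .env examples.
LLM_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GROQ_API_KEY", "OLLAMA_BASE_URL")


def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment such as: KEY=value # note
        value = value.split(" #", 1)[0].strip().strip('"').strip("'")
        values[key.strip()] = value
    return values


def summarize(env: dict[str, str]) -> dict[str, object]:
    """Report which LLM variables are non-empty and which TTS/STT provider is set."""
    return {
        "llm": [key for key in LLM_KEYS if env.get(key)],
        "tts": env.get("TTS_PROVIDER") or None,
        "stt": env.get("STT_PROVIDER") or None,
    }


if __name__ == "__main__":
    path = Path(".env")
    if path.exists():
        print(summarize(parse_env(path.read_text())))
    else:
        print("No .env file found in the current directory")
```

The check only tests for non-empty values, so a placeholder left over from .env.example would still count as set.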
Set your preferred AI model. OpenClaw routes to any Anthropic-compatible API endpoint.
# OpenAI
OPENAI_API_KEY=sk-...
# Anthropic
ANTHROPIC_API_KEY=sk-ant-...
# Groq (fast + free tier)
GROQ_API_KEY=gsk_...
# Ollama (local, free)
OLLAMA_BASE_URL=http://localhost:11434
Choose how your AI speaks. Multiple TTS engines supported.
# Supertonic (self-hosted)
TTS_PROVIDER=supertonic
TTS_URL=http://supertonic:5050
# Browser native
TTS_PROVIDER=browser
# Custom endpoint
TTS_PROVIDER=custom
TTS_URL=http://your-tts:8080
Configure how your voice is captured and transcribed.
# Web Speech API (Chrome, free)
STT_PROVIDER=webspeech
# Deepgram (streaming)
DEEPGRAM_API_KEY=...
# Groq Whisper (batch)
GROQ_API_KEY=gsk_...
How the pieces fit together. Three components, one seamless experience.
Browser: Voice capture (STT), audio playback (TTS), canvas rendering, desktop environment. Runs in any modern browser.
Flask Server: OpenVoiceUI application server. Handles file management, canvas pages, uploads, TTS routing, and API endpoints.
OpenClaw: LLM router and agent orchestrator. Manages sessions, tool execution, sub-agents, skills, and model switching.
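The division of responsibilities described above can be pictured as a chain of calls. The sketch below is purely illustrative: `Router`, `Server`, and `echo_provider` are hypothetical stand-ins, not OpenVoiceUI or OpenClaw APIs, and the real components communicate over the network rather than through in-process function calls.

```python
from dataclasses import dataclass
from typing import Callable

# All names here are hypothetical stand-ins, not OpenVoiceUI or OpenClaw APIs.
LLMProvider = Callable[[str], str]


@dataclass
class Router:
    """Stands in for OpenClaw: forwards the prompt to the active provider."""
    providers: dict[str, LLMProvider]
    active: str

    def complete(self, prompt: str) -> str:
        return self.providers[self.active](prompt)


@dataclass
class Server:
    """Stands in for the application server between browser and router."""
    router: Router

    def handle_transcript(self, transcript: str) -> dict[str, str]:
        reply = self.router.complete(transcript)
        # The browser would pass the reply text to its TTS engine for playback.
        return {"transcript": transcript, "reply": reply}


def echo_provider(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return f"You said: {prompt}"


if __name__ == "__main__":
    server = Server(Router(providers={"echo": echo_provider}, active="echo"))
    # In the real system, the browser's STT step would produce this transcript.
    print(server.handle_transcript("hello"))
```

Switching models in this toy version amounts to changing `active` on the router, which loosely mirrors the model switching role the router is described as having.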
Browser (Voice + Canvas) ↔ Flask Server (API + Files) ↔ OpenClaw (LLM + Tools) ↔ Any LLM Provider
OpenVoiceUI is free, open source, and MIT licensed. Start building your voice AI assistant today.