FreeSpeech: AI Voice Restoration for the Mute Community - ElevenLabs Worldwide Hackathon
AI Tinkerers - New York City
Hackathon Showcase

FreeSpeech: AI Voice Restoration for the Mute Community

Team led by an Anthropic Claude Ambassador (UPenn/Amazon) and a Drexel CS/ML builder, focused on RAG systems, AWS tooling, and applied full-stack AI.

2 members

Project Description
User Problem: Millions of people who have lost their voice to ALS, laryngectomy, or other conditions are forced to communicate through robotic text-to-speech or slow typing, which strips away their identity and makes real-time, natural conversation nearly impossible.
Core Flow: FreeSpeech restores natural, real-time communication through two integrated features:
Personalized Voice Cloning: Users upload pre-existing voice recordings or videos, and we generate a custom ElevenLabs voice model that authentically replicates their voice—not a generic TTS output.
Conversational Co-Pilot: An always-listening agent transcribes live conversations (in-person, phone, or Zoom), then generates Dynamic Smart Replies—context-aware response options that adapt to conversational flow. Unlike static “Yes/No/Maybe” buttons, options might surface as “I love you,” “I miss you,” or “Let me think about it” based on emotional and topical context. Users can select a suggestion or type freely, with all output spoken in their cloned voice.
In addition, FreeSpeech enhances real-time communication through:
Tool-Calling Sub-Agent: Users can trigger an embedded research agent mid-conversation (e.g., “Find nearby restaurants matching my preferences”), which returns results directly into the Smart Replies interface for seamless integration.
Key Demo: A live conversation where the co-pilot listens, generates contextual replies, triggers a sub-agent search, and speaks responses in a cloned voice—all in real time.
End-to-End Run: Launch the app → simulate uploading voice samples for cloning → start a conversation (mic input or Zoom integration) → receive dynamic smart replies → select or customize response → hear output in your personalized voice.
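The end-to-end run above can be sketched as a single conversation turn. This is a minimal illustration under stated assumptions, not the team's code: `Conversation`, `run_turn`, and the injected callables are hypothetical names, and the real speech-to-text, Claude, and ElevenLabs calls are replaced by stand-ins so the flow can be read and run without API keys.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Conversation:
    """Rolling transcript shared by all turns (hypothetical structure)."""
    history: List[str] = field(default_factory=list)


def run_turn(
    convo: Conversation,
    heard_text: str,
    suggest: Callable[[List[str]], List[str]],
    choose: Callable[[List[str]], str],
    speak: Callable[[str], bytes],
) -> bytes:
    """One pass of the loop: heard speech -> smart replies -> chosen reply -> audio."""
    convo.history.append(f"Partner: {heard_text}")
    options = suggest(convo.history)   # stand-in for the LLM reply generator
    reply = choose(options)            # user picks a suggestion or types freely
    convo.history.append(f"User: {reply}")
    return speak(reply)                # stand-in for cloned-voice synthesis


# Stand-ins so the sketch runs offline; real versions would call external APIs.
def fake_suggest(history: List[str]) -> List[str]:
    return ["I miss you too", "Let me think about it"]


def fake_speak(text: str) -> bytes:
    return text.encode("utf-8")


convo = Conversation()
audio = run_turn(convo, "I miss you", fake_suggest, lambda opts: opts[0], fake_speak)
```

Passing the three stages in as callables keeps the loop itself independent of any vendor SDK, which is one plausible way to swap the transcription or synthesis provider without touching the orchestration.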

Judging Criteria
Criterion: How FreeSpeech Addresses It
Working Prototype: Functional demo with voice cloning, real-time transcription, dynamic reply generation, and sub-agent tool calling. Users can run a full conversation loop.
Technical Complexity & Integration: Multi-model orchestration: speech-to-text transcription, Claude for contextual reply generation and steering, ElevenLabs for voice synthesis, and a sub-agent with tool calling for live research, all synchronized in real time.
Innovation & Creativity: First solution combining voice identity restoration with an agentic conversational co-pilot. Dynamic Smart Replies and steerable context represent novel UX paradigms for assistive tech.
Real-World Impact: Directly addresses accessibility for the mute/nonverbal community, restoring not just communication but identity through personalized voice. Reduces conversational latency from minutes to seconds.
Theme Alignment: Exemplifies agentic AI: autonomous listening, context-aware generation, user-steerable outputs, and tool-calling sub-agents working collaboratively with human intent.

Tech Stack
LLM/Agent: Claude API (reply generation, context management, steering logic)
Voice Cloning & TTS: ElevenLabs API (custom voice model creation + synthesis)
Speech-to-Text: Deepgram or Whisper API (real-time transcription)
Sub-Agent Tools: Web search, location APIs for contextual research
Frontend: React/Next.js
Backend: Python/FastAPI for orchestration
Memory: Conversation history storage for personalization (future: Notion/Google Drive integrations)
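To make the sub-agent tooling in the stack above more concrete, here is a small sketch of how a tool call might be dispatched and its result turned into Smart Replies. Everything in it is an assumption for illustration: the `TOOLS` registry, the `find_restaurants` tool, and the `{"name": ..., "input": ...}` call shape (which loosely mirrors the fields of an LLM tool-use request) are hypothetical, and the tool returns placeholder data rather than calling a real search or location API.

```python
import json
from typing import Any, Callable, Dict, List

ToolFn = Callable[..., Any]
TOOLS: Dict[str, ToolFn] = {}  # hypothetical registry of sub-agent tools


def tool(name: str) -> Callable[[ToolFn], ToolFn]:
    """Decorator that registers a function under a tool name."""
    def register(fn: ToolFn) -> ToolFn:
        TOOLS[name] = fn
        return fn
    return register


@tool("find_restaurants")
def find_restaurants(cuisine: str, near: str) -> List[str]:
    # Placeholder result; a real tool would query a web-search or location API.
    return [f"{cuisine} place near {near} (placeholder)"]


def dispatch(tool_call_json: str) -> Dict[str, Any]:
    """Look up the requested tool and run it with the supplied arguments."""
    call = json.loads(tool_call_json)
    name = call["name"]
    if name not in TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}
    return {"ok": True, "result": TOOLS[name](**call.get("input", {}))}


def as_smart_replies(outcome: Dict[str, Any]) -> List[str]:
    """Convert a tool outcome into options for the Smart Replies interface."""
    if not outcome["ok"]:
        return ["Sorry, I couldn't look that up."]
    return [f"How about {item}?" for item in outcome["result"]]


request = '{"name": "find_restaurants", "input": {"cuisine": "Thai", "near": "Civic Hall"}}'
replies = as_smart_replies(dispatch(request))
```

Returning a fallback reply for an unknown tool, instead of raising, keeps the conversation moving even when the research step fails.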

Setup & Demo Steps
Clone repo and install dependencies (npm install, pip install -r requirements.txt)
Add API keys (Claude, ElevenLabs, Deepgram) to .env
Upload voice samples via onboarding flow
Start live conversation mode
Observe: transcription → smart reply generation → voice output in cloned voice
Trigger sub-agent mid-conversation to demonstrate tool calling
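Before starting the demo, it can help to confirm that the API keys from the setup steps are actually present. The check below is a sketch only: the variable names `ANTHROPIC_API_KEY`, `ELEVENLABS_API_KEY`, and `DEEPGRAM_API_KEY` are assumed, since the writeup does not list the exact keys the project's .env expects.

```python
import os
from typing import List, Mapping, Sequence

# Assumed variable names; adjust to whatever the project's .env actually uses.
REQUIRED_KEYS = ("ANTHROPIC_API_KEY", "ELEVENLABS_API_KEY", "DEEPGRAM_API_KEY")


def missing_keys(
    env: Mapping[str, str],
    required: Sequence[str] = REQUIRED_KEYS,
) -> List[str]:
    """Return the required keys that are absent or blank in the given environment."""
    return [key for key in required if not env.get(key, "").strip()]


def check_env(env: Mapping[str, str] = os.environ) -> None:
    """Raise a clear error if any key is missing, instead of failing mid-demo."""
    missing = missing_keys(env)
    if missing:
        raise RuntimeError("Missing API keys: " + ", ".join(missing))
```

Calling `check_env()` at backend startup would surface a missing key immediately. Note that this reads the process environment, so it assumes the .env file has already been loaded by whatever mechanism the app uses.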

No code was written before the hackathon. Beforehand, we only ideated on the product and drafted a possible schema for the app's file structure, so we could hit the ground running once the timer started.

AI Tinkerers Civic Hall NYC Clerk Convex ElevenLabs

GitHub
