ZerahUI - AI Tinkerers - New York City Hackathon
AI Tinkerers - New York City
Hackathon Showcase

ZerahUI

Zerah organizes your life within a 3D particle sphere, using bilingual voice interaction and an intuitive spatial-temporal orrery.

4 members Watch Demo

Zerah — The Personal Assistant That Lives in a Sphere

We didn’t build another chat box. We built a living, breathing visual field that understands you, speaks to you, and organizes your entire life in three dimensions.


The Problem

Every productivity tool today is a list. A calendar grid. A kanban column. A flat surface that makes you feel like you’re managing spreadsheets, not living your life. You open five apps to see your week. You miss urgent items buried in a tab. You talk to an AI and get a wall of text back.

We asked a different question: what if your assistant could feel spatial? What if it could talk back?


What We Built

Zerah is an agentic personal assistant built around a 3D particle sphere. Your life — work, health, finances, personal tasks, learning, and urgent alerts — exists as six luminous clusters orbiting inside a single field. Nothing is hidden in a menu. Everything is present, always, in one glance.

The Sphere

Six categories of your life are represented as glowing dot clouds. Each cluster has its own color, its own position in 3D space, its own gravitational center. The sphere breathes. It rotates. When something is urgent, it pulses. When you select a cluster, it expands toward you and every task inside it becomes visible, floating in space, labeled in light.

The Orrery

At the top of the screen sits a miniature solar system — four concentric orbital rings, each representing a time scale. The innermost ring orbits fast: that’s your day. Moving outward: week, month, year. The center node resets everything.

There are no labels until you need them. Hover any ring and tick marks materialize — 24 for a day, 7 for a week, 12 for a year. Click a ring and the entire sphere reorganizes itself temporally, pulling tasks into orbital bands by when they’re scheduled. The visual metaphor is immediate: things close to the center are near in time. Things far out are far away. No reading required. Your brain understands it before your eyes finish moving.

Clusters That Talk Back

This is the feature that changes everything. When you select a cluster — by clicking, by touching, or by voice — Zerah speaks to you. Not a notification sound. A full natural-language summary of what’s in that cluster, delivered in your voice, in your language.

“Work. You have five items. First, team standup at nine AM. Next, client presentation on Wednesday. Also, the product roadmap is due in two weeks…”

Switch to the week view and Zerah reads a summary of everything scheduled across all clusters for the next seven days. The sphere becomes a spatial voice assistant. It doesn’t wait for you to ask. It tells you what you need to know the moment you look at it.


Technical Architecture

Zerah is a single HTML file. No framework. No build step. No node_modules. It runs anywhere.

  • Rendering: Canvas 2D, fully custom particle engine with 3D projection, depth sorting, and per-dot physics
  • Voice Input: Web Speech API (Chrome/Edge) with a clean external STT integration surface for Deepgram, Whisper, or any provider
  • Voice Output: SpeechSynthesis API with smart voice selection, designed to be swapped for ElevenLabs or Azure TTS in production
  • Bilingual: Full EN/ES support with automatic language detection from transcript content, character heuristics, and stopword analysis — no explicit language switch required
  • Agent API: window.Zerah exposes a complete programmatic surface for backend integration — push tasks, trigger pulses, respond from agents, drive voice state externally
  • Guardrails: A full claude.md governs every agent in the system — anti-hallucination rules, voice security, PII handling, confidence gating, prompt injection defense, and bilingual routing contracts

The backend hooks are three async functions in a CONFIG block at the top of the file. Point them at your API and the interface comes alive with real data.


Goals

Short term (hackathon)

  • Demonstrate the full voice loop: speak → sphere reacts → cluster responds → Zerah talks back
  • Show bilingual switching mid-conversation with no explicit command
  • Connect to a live Claude API backend that populates clusters in real time

Medium term

  • Replace Web Speech API with Deepgram for production-grade STT accuracy
  • Replace SpeechSynthesis with ElevenLabs for a voice that matches the aesthetic
  • Connect calendar, email, and health APIs to populate clusters automatically
  • Add persistent memory via a memory agent that learns your patterns

Long term

  • Make the sphere collaborative — a shared field for teams where each person’s tasks appear as a distinct orbital layer
  • Mobile-native with gesture controls replacing mouse interaction
  • AR layer so the sphere lives in your physical space

What Makes This Different

Most AI assistants are reactive. You type, they answer. You ask, they respond. The interaction is sequential, text-based, and fundamentally flat.

Zerah is proactive and spatial. It doesn’t wait for a question. It shows you everything, all at once, in a form your visual cortex can process in under a second. And when you point at any part of your life, it speaks. The assistant comes to you.

The design philosophy: the interface should feel like the AI is already thinking, already organized, already aware — and you’re just looking through a window into its mind.


How to Contribute

We’re looking for people who care about the intersection of AI, spatial UI, and voice interaction. Specifically:

Designers — We have the aesthetic foundation. We need someone to push the visual language further: motion design, haptics, sound design for the sphere itself.

Backend engineers — The agent architecture is designed and documented. We need someone to build the orchestrator: calendar integration, email agents, live task syncing.

Voice / NLP engineers — The STT/TTS integration surface is fully defined. We need someone to wire Deepgram and ElevenLabs and tune the language detection model.

Mobile engineers — The core is canvas-based and touch-aware. We need someone to native-wrap it and add gesture layers: pinch-to-zoom the orrery, swipe between time views, haptic feedback on cluster selection.

AI engineers — The backend hook is three async functions. We need someone to build the Claude orchestrator that populates clusters from real-world data sources and generates the spoken summaries dynamically instead of templating them.

If any part of this excites you — the visual, the voice, the agent architecture, or just the idea that a calendar shouldn’t look like a spreadsheet — we want you on the team.


Demo

The full working demo runs in a single HTML file. No installation.

  1. Download zerah.html
  2. Run python3 -m http.server 8080 in the same folder
  3. Open http://localhost:8080/zerah.html in Chrome or Edge
  4. Press Space to speak — or just click a cluster and listen

Try saying: “show my work” — or in Spanish: “muéstrame el trabajo”

Try clicking the week ring on the orrery and watch your entire life reorganize itself in three dimensions.

Then listen to Zerah tell you what’s in it.


Built in 6 hours. One file. Zero dependencies. Full voice. Six dimensions of your life, one sphere.

N/A With did the whole thing here from scratch

Anthropic CopilotKit Cursor Descript ElevenLabs Gemini VAPI