Glass Monkey
Team consisting of NymbleAI co-founders, featuring a Johns Hopkins biophysics graduate and a Yeshiva CS alumnus skilled in Python, LLM orchestration, and multi-agent systems.
Project Description
Project Overview
Glass Monkey is a transparent AI overlay that lives on top of your entire desktop. Instead of opening a chat window, you pull down a sheet of glass over your world. A 3D monkey character hangs from a vine, watches your screen through Claude’s vision API, and offers observations without being asked. You draw circles and arrows directly on your work; Claude responds spatially, placing annotations right where they matter. The monkey spawns creative modes: Constellation draws glowing threads between cross-domain connections, Seance summons competing intellectual perspectives as orbs you can collide to trigger debates, and Wander scatters ambient thoughts that drift and fade like passing ideas.
The core conviction: AI should be proactive, not reactive. It should stop living inside chat windows that force context-switching. Glass Monkey brings the AI to you, on your actual work surface, surfacing connections you never asked for. The user is always the builder, never the audience.
Stability
The prototype achieves stability through graceful degradation. The full experience runs with nothing more than an API key. Optional services (Screenpipe OCR, Redis memory) enhance but never gate the experience. Low-confidence responses route to chat instead of cluttering the glass. Annotation limits, coordination locks, and auto-fading prevent visual overload and race conditions. Click-through mode ensures you never lose access to apps underneath.
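As a rough illustration of the confidence routing, annotation limits, and auto-fading described above, here is a minimal TypeScript sketch. The 0.6 threshold, the cap of five annotations, the eight-second lifetime, and every identifier (GlassSurface, Observation, route, prune) are assumptions made for this example, not values or names taken from the Glass Monkey codebase.

```typescript
// Hypothetical shapes: none of these names come from the actual project.
interface Observation {
  text: string;
  confidence: number; // 0..1, assumed to arrive in the model's structured JSON
  x: number;
  y: number;
}

interface Annotation extends Observation {
  expiresAt: number; // ms timestamp after which the annotation fades out
}

const CONFIDENCE_THRESHOLD = 0.6; // assumed cutoff
const MAX_ANNOTATIONS = 5; // assumed cap on simultaneous annotations
const FADE_AFTER_MS = 8000; // assumed annotation lifetime

class GlassSurface {
  annotations: Annotation[] = [];
  chatLog: string[] = [];

  // Route one observation: confident ones go on the glass, the rest to chat.
  route(obs: Observation, now: number): "glass" | "chat" {
    this.prune(now);
    if (obs.confidence < CONFIDENCE_THRESHOLD) {
      this.chatLog.push(obs.text);
      return "chat";
    }
    // Enforce the cap by dropping the oldest annotation first.
    while (this.annotations.length >= MAX_ANNOTATIONS) {
      this.annotations.shift();
    }
    this.annotations.push({ ...obs, expiresAt: now + FADE_AFTER_MS });
    return "glass";
  }

  // Remove annotations whose lifetime has elapsed.
  prune(now: number): void {
    this.annotations = this.annotations.filter((a) => a.expiresAt > now);
  }
}
```

Passing the current time in explicitly, rather than reading the clock inside the class, keeps the fade logic easy to exercise without waiting on real timers.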
Shattering Conventions with Claude’s Reasoning
No sidebar, no modal, no chat-first interface. Claude’s reasoning directly drives what you see and where. A three-tier cascade (Haiku for fast background watching, Sonnet for detailed vision analysis, Opus for deep pattern recognition) powers the experience. The monkey’s parallel agent system (Think, Build, Find) fires concurrent Claude calls that merge into a synthesized “Golden Thread” insight. Claude determines annotation placement via fuzzy text matching, confidence-based routing, and spatial positioning. Claude also drives the monkey’s expressions and behavioral state. Every visual element on screen is a product of Claude reasoning about what you are doing right now.
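The parallel Think/Build/Find fan-out could be sketched roughly as follows. This is an illustration under assumptions: the agent names come from the description above, but the function names (runAgents, goldenThread), the types, and the idea of passing the agents in as plain async functions are hypothetical. In a real build each agent would presumably wrap a Claude call; here they are left as injectable functions so the sketch stays self-contained.

```typescript
// Hypothetical sketch: only the agent names come from the project description.
type AgentName = "Think" | "Build" | "Find";
type Agent = (context: string) => Promise<string>;

interface AgentResult {
  agent: AgentName;
  output: string;
}

// Fire all agents concurrently and keep whichever ones succeed.
// A rejected agent is mapped to null so one failure cannot sink the others.
async function runAgents(
  agents: Record<AgentName, Agent>,
  context: string,
): Promise<AgentResult[]> {
  const names = Object.keys(agents) as AgentName[];
  const attempts = names.map((agent) =>
    agents[agent](context).then(
      (output): AgentResult | null => ({ agent, output }),
      (): AgentResult | null => null,
    ),
  );
  const settled = await Promise.all(attempts);
  return settled.filter((r): r is AgentResult => r !== null);
}

// Merge the surviving agent outputs into one insight.
// `synthesize` stands in for the synthesis step and is supplied by the caller.
async function goldenThread(
  agents: Record<AgentName, Agent>,
  synthesize: (results: AgentResult[]) => Promise<string>,
  context: string,
): Promise<string | null> {
  const results = await runAgents(agents, context);
  return results.length > 0 ? synthesize(results) : null;
}
```

Catching each agent's rejection individually is a design choice made for this sketch: it lets the synthesis step run on partial results, which fits the graceful-degradation goal stated under Stability.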
Technologies, Frameworks, and Libraries
Desktop: Electron (transparent overlay, global hotkeys, IPC), Node.js, macOS native screencapture + Sharp
Frontend: React 19 + TypeScript, tldraw (drawing surface), Framer Motion (animations), Tailwind CSS, Vanilla JS + SVG/Canvas/DOM (Monkey renderer)
3D & Physics: Three.js v0.170 (cel-shaded monkey, vine geometry), custom Verlet physics (ragdoll, rope tail), FABRIK inverse kinematics, Matter.js
AI: Anthropic SDK + Claude Vision API (Haiku 4.5 / Sonnet 4.6 / Opus 4.6 cascade), three-agent parallel system (Think/Build/Find + Synthesis), Fuse.js (fuzzy text anchoring), structured JSON prompting with confidence-based routing
Optional: Screenpipe (continuous OCR), Redis Stack (persistent memory), DuckDuckGo API (web search)
Typography: Google Fonts (Caveat, IBM Plex Mono)
Every tool serves one goal: replacing the chat box with spatial, contextual, always-present intelligence that comes to you where you already work.
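For readers curious about the custom Verlet physics listed under 3D & Physics, the general technique for a rope (such as the tail or vine) can be sketched as below. This is a generic 2D position-Verlet rope with distance constraints, not the project's actual code: the Point shape, makeRope, stepRope, the gravity value, the damping factor, and the iteration count are all assumptions chosen for illustration.

```typescript
// Generic position-Verlet rope sketch; all names and constants are assumed.
interface Point {
  x: number;
  y: number;
  px: number; // previous x, used in place of an explicit velocity
  py: number; // previous y
  pinned: boolean; // pinned points never move (e.g. the attachment point)
}

// Build a horizontal rope whose first point is pinned.
function makeRope(count: number, segment: number): Point[] {
  const pts: Point[] = [];
  for (let i = 0; i < count; i++) {
    pts.push({ x: i * segment, y: 0, px: i * segment, py: 0, pinned: i === 0 });
  }
  return pts;
}

// Advance the rope by one time step. Positive y points down (screen space).
function stepRope(
  pts: Point[],
  segment: number,
  dt: number,
  gravity = 980,
  damping = 0.99,
  iterations = 8,
): void {
  // 1. Verlet integration: infer velocity from the previous position.
  for (const p of pts) {
    if (p.pinned) continue;
    const vx = (p.x - p.px) * damping;
    const vy = (p.y - p.py) * damping;
    p.px = p.x;
    p.py = p.y;
    p.x += vx;
    p.y += vy + gravity * dt * dt;
  }
  // 2. Constraint relaxation: nudge neighbours back toward the rest length.
  for (let k = 0; k < iterations; k++) {
    for (let i = 0; i < pts.length - 1; i++) {
      const a = pts[i];
      const b = pts[i + 1];
      const dx = b.x - a.x;
      const dy = b.y - a.y;
      const dist = Math.sqrt(dx * dx + dy * dy) || 1e-9;
      const diff = (dist - segment) / dist;
      // Share the correction equally unless one end is pinned.
      const wa = a.pinned ? 0 : b.pinned ? 1 : 0.5;
      const wb = b.pinned ? 0 : a.pinned ? 1 : 0.5;
      a.x += dx * diff * wa;
      a.y += dy * diff * wa;
      b.x -= dx * diff * wb;
      b.y -= dy * diff * wb;
    }
  }
}
```

Calling stepRope(rope, 10, 1 / 60) once per frame on a rope built with makeRope(6, 10) lets the free end swing down under gravity while the pinned end stays fixed. More relaxation iterations give a stiffer rope at higher cost per frame.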