Bones AI - The Agent-Driven Just In Time UI for Any App
Team of Datadog experts including a Director of AI, Applied Scientist, and Senior PM, specializing in AI agents, ML Ops, and Cornell/UVA-educated engineering leadership.
Project Description
- Overview -
Bones is a macOS menu bar tool that replaces the traditional AI chat window with an agent-driven interaction model: it can see your screen and collaborate with you on it. A pixel-art skeleton lives in your menu bar; grab him with your mouse and drag him onto any app window. Once dropped on a window, Bones (powered by Claude) can “see” your app or web page through real-time screen capture and interact with it directly, using its own mouse and keyboard. It also has a variety of tools that let it create Just-in-Time (JIT) UI, generating widgets, overlays, or even running connected apps on the fly based on what you’re doing.
Bones can be proactive (acting on what it thinks would be useful) or reactive (acting on what you ask for). It is also extensible through skills and apps that are selected based on the context it sees. For example, if it sees a Partiful website, it can use an installed Partiful skill and app, which guide the UI generation and the available actions, so knowledge and abilities can be picked up on the fly. Because Bones generates UI to fit your current view, it can accomplish tasks for you, collaborate with you, and customize your desktop, acting as both a helpful AI agent and a direct layer that augments your abilities, for fun or profit.
- Technologies, Frameworks & Libraries -
Native macOS (Swift / AppKit)
- Swift 6.2
- AppKit, ScreenCaptureKit, CoreGraphics / CGEvent, ApplicationServices / Accessibility APIs (AXUIElement), etc.
Python Agent
- Anthropic agent SDK with Claude Opus 4.6
- Extensible through contextual apps
Redis
- Redis stores JIT-generated UI so the agent can search over it later using vector search. Before creating new UI, the agent checks whether similar UI has been generated before and reuses it when possible. For now, this data is stored locally only.
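The reuse check described above can be sketched roughly as follows. This is a minimal illustration, not the project's code: it uses an in-memory list in place of Redis vector search, and the names `UIStore`, `find_reusable`, and the 0.9 similarity threshold are assumptions made for the example. The embeddings are plain lists of floats supplied by the caller.

```python
import math
from dataclasses import dataclass, field


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class UIStore:
    """In-memory stand-in for a local vector index of generated UI (hypothetical)."""
    threshold: float = 0.9  # assumed cutoff: reuse only above this similarity
    entries: list = field(default_factory=list)  # (embedding, ui_source) pairs

    def add(self, embedding, ui_source):
        """Remember a piece of generated UI under its embedding."""
        self.entries.append((list(embedding), ui_source))

    def find_reusable(self, query):
        """Return the most similar stored UI, or None if nothing is close enough."""
        best_score, best_ui = 0.0, None
        for embedding, ui_source in self.entries:
            score = cosine_similarity(query, embedding)
            if score > best_score:
                best_score, best_ui = score, ui_source
        return best_ui if best_score >= self.threshold else None
```

In this sketch the agent would call `find_reusable` first and only generate new UI (then `add` it) when the result is `None`.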
Architecture notes on the update “loop” and how the different pieces work together
- JSON Lines protocol (bidirectional stdin/stdout communication between Swift and Python processes)
- Streaming token delivery (text_delta messages enable real-time response rendering in the sidebar)
- Tool orchestration loop (Python agent manages multi-step agentic workflows: screenshot → reason → act → verify), up to 20 iterations per turn
- Change detection to trigger the agents as you switch web pages, scroll through files, etc.
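As a rough sketch of how the JSON Lines protocol, token streaming, and the bounded tool loop listed above could fit together on the Python side: the `text_delta` message type and the 20-iteration cap come from the notes above, while the `done` message, the field names, and the `decide` / `execute_tool` callables are hypothetical stand-ins for the model call and the native tools.

```python
import io
import json

MAX_ITERATIONS = 20  # per-turn cap from the architecture notes


def send(stream, message):
    """Write one message as a single JSON line and flush so the peer sees it at once."""
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def read_messages(stream):
    """Yield one decoded message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def run_turn(out, decide, execute_tool):
    """Run one agent turn: call tools until the model answers or the cap is hit.

    `decide(history)` stands in for the model call (hypothetical). It returns
    {"type": "tool", "name": ...} to act, or {"type": "text", "text": ...} to finish.
    """
    history = []
    for _ in range(MAX_ITERATIONS):
        step = decide(history)
        if step["type"] == "text":
            # Stream the answer piece by piece as text_delta messages.
            for token in step["text"].split(" "):
                send(out, {"type": "text_delta", "text": token + " "})
            send(out, {"type": "done"})
            return history
        result = execute_tool(step["name"])
        history.append((step["name"], result))
    # The cap was reached without a final answer.
    send(out, {"type": "done", "reason": "iteration_limit"})
    return history
```

In the real app `out` would be the process's stdout, with the Swift side reading it line by line; an `io.StringIO` works as a stand-in when experimenting.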
- How These Enable Intelligence-Driven UI -
The entire stack is designed so Claude can be proactive and drive the interface directly on your desktop with just-in-time augmentations. The Accessibility API integration gives Claude a structured understanding of any app’s UI (roles, labels, frames, codes), in addition to visual data from viewing the screen. The overlay system lets Claude generate arbitrary interactive UI on the fly (dashboards, controls, visualizations) that can call back into native app interaction through the window.bones.* bridge. The content change detector proactively feeds Claude visual updates without user prompting. The widget system lets Claude place contextual information (color swatches from a design tool, JSON from an API response, code snippets) at precise screen locations, anchored to the elements it references. The run_javascript tool gives Claude deep access to the browser DOM for web-based workflows. Claude’s reasoning determines what UI exists, where it appears, and what it does.
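One simple way a content change detector like the one mentioned above might decide when to notify the agent is to compare each captured frame against the last frame it reported. The sketch below is an assumption-heavy illustration, not the project's implementation: the `ChangeDetector` class, the raw-bytes frame format, and the 5% threshold are all hypothetical.

```python
class ChangeDetector:
    """Flags a frame when enough of it differs from the last reported frame."""

    def __init__(self, threshold=0.05):
        self.threshold = threshold  # assumed fraction of bytes that must differ
        self.baseline = None        # last frame that triggered a notification

    def should_notify(self, frame):
        """Return True if `frame` (raw bytes) differs enough from the baseline."""
        # First frame, or a size change such as a window resize: always report.
        if self.baseline is None or len(frame) != len(self.baseline):
            self.baseline = frame
            return True
        changed = sum(a != b for a, b in zip(self.baseline, frame))
        if changed / max(len(frame), 1) >= self.threshold:
            self.baseline = frame
            return True
        return False
```

Keeping the baseline fixed until a notification fires means slow changes, such as gradual scrolling, accumulate against the last reported frame instead of being missed one small step at a time.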