PersonaFlow - AI Tinkerers - New York City Hackathon

PersonaFlow

Team led by Dhruv Sharma, Ph.D. (ENS, École Polytechnique), MSCI Director—expert in LLM internals, agents/RL, vector search, C++/Python, portfolio optimization and risk analytics.

1 member

PersonaFlow: AI-Powered Persona Testing Platform

The Business Value: Automated UX Intelligence for Modern Product Teams

PersonaFlow transforms the slow, expensive process of user feedback into an automated, continuous intelligence pipeline.

For agile product teams, understanding user experience across different customer segments is critical but traditionally requires weeks of user research, surveys, and testing sessions. PersonaFlow eliminates this bottleneck by providing immediate, persona-driven UX feedback directly within the development lifecycle.

The Vision: UX Expert in Your CI/CD Pipeline

Imagine integrating a team of UX experts directly into your deployment process. With every significant release, PersonaFlow automatically deploys AI agents—each embodying key user personas—against your application. Within minutes, you receive detailed reports: “Casual Casey got confused by the new checkout flow due to case-sensitive search” or “Power-User Paula discovered a potential data leak in the admin endpoints.”

This isn’t just automated testing—it’s automated user empathy at scale.

Immediate ROI for Product Teams

  • Faster feedback cycles: From weeks of user research to minutes of automated testing
  • Comprehensive user persona coverage: Test against multiple personas simultaneously
  • Actionable insights: Specific, prioritized recommendations for each user segment
  • Continuous validation: Integrate directly into CI/CD for every deployment
  • Time and cost efficiency: Replace expensive and time-consuming user testing sessions with automated persona agents

Key Technical Innovations

1. Dual-AI Architecture: Dynamic Personas + Testing Agents

  • Architect Agent (Vertex AI): Dynamically generates custom personas from a description of the target audience
  • Testing Agents (Gemma 3-12B): Execute persona-driven testing with real-time decision logging
  • Intelligent coordination: Architect synthesizes final reports with strategic recommendations
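
A minimal sketch of how the two roles might be coordinated, assuming both models are wrapped as plain Python callables. `Persona`, `run_pipeline`, `stub_architect`, and `stub_tester` are hypothetical names for illustration, not PersonaFlow's actual code; real calls to Vertex AI or Gemma would replace the stubs.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Persona:
    name: str
    traits: str


# Both roles are modeled as callables so real LLM clients can be swapped in.
Architect = Callable[[str], list[Persona]]
Tester = Callable[[Persona], list[str]]


def stub_architect(audience: str) -> list[Persona]:
    """Hypothetical stand-in for the Architect agent's persona generation."""
    return [
        Persona("Casual Casey", f"non-technical member of: {audience}"),
        Persona("Power-User Paula", f"expert member of: {audience}"),
    ]


def stub_tester(persona: Persona) -> list[str]:
    """Hypothetical stand-in for a testing agent; returns findings as strings."""
    return [f"{persona.name}: example finding"]


def run_pipeline(audience: str, architect: Architect, tester: Tester) -> dict[str, list[str]]:
    """Generate personas, run one testing agent per persona, collect findings."""
    personas = architect(audience)
    return {p.name: tester(p) for p in personas}


report = run_pipeline("online electronics shoppers", stub_architect, stub_tester)
for name, findings in report.items():
    print(name, "->", findings)
```

Keeping the Architect and the testers behind plain function signatures means the final report synthesis can be tested without any model deployed.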

2. Strategic Flaw Discovery System

Our mock API contains intentionally designed flaws that mirror real-world UX issues:

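# Example: case-sensitive search (frustrates casual users)
from fastapi import FastAPI

app = FastAPI()
products = [{"name": "Wireless Mouse"}]  # illustrative entry; the real catalog is not shown here

@app.get("/search")
def search_products(q: str):
    # Intentional flaw: case-sensitive substring match (no .lower() on either side)
    results = [p for p in products if q in p["name"]]
    return {"results": results, "total": len(results)}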

Comprehensive test coverage validates both correct functionality AND intentional flaws:

from fastapi.testclient import TestClient

client = TestClient(app)  # app is the mock API shown above

def test_search_case_sensitivity():
    """Verify the case-sensitivity flaw behaves as designed."""
    response = client.get("/search", params={"q": "wireless mouse"})  # lowercase
    assert response.json()["total"] == 0  # no results (intended flaw)

    response = client.get("/search", params={"q": "Wireless Mouse"})  # proper case
    assert response.json()["total"] > 0  # has results

3. Real-Time Agent Monitoring with WebSockets

Real-time streaming of agent thoughts and actions:

# Real-time agent observation
# (LogMessage and manager are project-level helpers not shown in this excerpt)
from datetime import datetime
from fastapi import WebSocket

@app.websocket("/api/test-sessions/{session_id}/logs")
async def websocket_logs(websocket: WebSocket, session_id: str):
    async def broadcast_log(type: str, message: str):
        log = LogMessage(
            timestamp=datetime.now().isoformat(),
            type=type,  # "thinking", "acting", "observing"
            message=message,
        )
        await manager.send_log(session_id, log)
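
The `manager` and `LogMessage` objects used in the snippet above are not shown. One way they might look is sketched below, assuming each connected socket exposes an async `send_json` method (as Starlette's `WebSocket` does). `ConnectionManager`, its methods, and `FakeSocket` are hypothetical names for illustration, not the project's actual implementation.

```python
import asyncio
from dataclasses import asdict, dataclass


@dataclass
class LogMessage:
    timestamp: str
    type: str  # "thinking", "acting", or "observing"
    message: str


class ConnectionManager:
    """Tracks open sockets per test session and fans log messages out to them."""

    def __init__(self) -> None:
        self._sessions: dict[str, list] = {}

    def connect(self, session_id: str, websocket) -> None:
        self._sessions.setdefault(session_id, []).append(websocket)

    def disconnect(self, session_id: str, websocket) -> None:
        sockets = self._sessions.get(session_id, [])
        if websocket in sockets:
            sockets.remove(websocket)

    async def send_log(self, session_id: str, log: LogMessage) -> None:
        # Iterate over a copy so a failing socket can be dropped mid-loop.
        for ws in list(self._sessions.get(session_id, [])):
            try:
                await ws.send_json(asdict(log))
            except Exception:
                self.disconnect(session_id, ws)


class FakeSocket:
    """Test double that records what would have been sent to a browser."""

    def __init__(self) -> None:
        self.sent: list[dict] = []

    async def send_json(self, data: dict) -> None:
        self.sent.append(data)


manager = ConnectionManager()
fake = FakeSocket()
manager.connect("demo", fake)
asyncio.run(
    manager.send_log("demo", LogMessage("2025-01-01T00:00:00", "thinking", "Trying search"))
)
print(fake.sent)
```

Keying sockets by session ID lets several browser tabs watch the same test run without the agent loop knowing how many observers exist.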

Demonstration of Core Requirements

LLM Deployment

  • Gemma 3-12B deployed on GPU-enabled Google Cloud Run
  • Vertex AI integration for persona generation
  • Advanced prompt engineering with persona context
  • Robust response parsing with error recovery
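
As an illustration of response parsing with error recovery, the sketch below assumes the testing agent is asked to reply with a JSON object and sometimes wraps it in extra chatter. The function name `parse_agent_response`, the `FALLBACK_ACTION` shape, and the `action` key are assumptions for the example, not the project's actual schema.

```python
import json
import re

# Assumed safe default when nothing parseable is found.
FALLBACK_ACTION = {"action": "noop", "reason": "unparseable model output"}


def _try_load(text: str):
    """Return the parsed dict, or None if text is not a JSON object."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def parse_agent_response(raw: str) -> dict:
    """Extract a JSON object from model output, tolerating surrounding text."""
    # 1. Try the whole string first.
    parsed = _try_load(raw.strip())
    if parsed is not None:
        return parsed
    # 2. Fall back to the span from the first "{" to the last "}".
    match = re.search(r"\{.*\}", raw, flags=re.DOTALL)
    if match:
        parsed = _try_load(match.group(0))
        if parsed is not None:
            return parsed
    # 3. Recover with a no-op so the agent loop can continue.
    return dict(FALLBACK_ACTION)


clean = parse_agent_response('{"action": "search", "q": "mouse"}')
chatty = parse_agent_response('Sure! Here is my plan: {"action": "search", "q": "mouse"} Hope that helps.')
broken = parse_agent_response("I think I should search for a mouse.")
print(clean, chatty, broken, sep="\n")
```

Returning a no-op instead of raising keeps a single malformed reply from ending the whole persona session; whether that trade-off suits a given deployment is a design choice.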

Agent Development

  • Custom agent architecture built from the ground up (no frameworks used)
  • REASON→PLAN→ACT→OBSERVE decision loop
  • Memory management and conversation context
  • Tool integration with API interaction capabilities
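
The decision loop listed above could be sketched roughly as follows, assuming the reasoning and planning steps are LLM calls wrapped as callables and tools are plain functions. `AgentState`, `run_agent`, and the stub functions are hypothetical; the stubs replace the model so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentState:
    goal: str
    memory: list[str] = field(default_factory=list)  # running observation log


def run_agent(
    state: AgentState,
    reason: Callable[[AgentState], str],
    plan: Callable[[str], dict],
    tools: dict[str, Callable[..., str]],
    max_steps: int = 5,
) -> AgentState:
    """Run REASON -> PLAN -> ACT -> OBSERVE until the plan says finish."""
    for _ in range(max_steps):
        thought = reason(state)  # REASON: decide what to try next
        step = plan(thought)  # PLAN: {"tool": name, "args": {...}}
        if step["tool"] == "finish":
            break
        result = tools[step["tool"]](**step["args"])  # ACT: call the tool
        state.memory.append(f"{step['tool']} -> {result}")  # OBSERVE: record outcome
    return state


def stub_reason(state: AgentState) -> str:
    """Stand-in for the model: stop once anything has been observed."""
    return "done" if state.memory else "try a lowercase search"


def stub_plan(thought: str) -> dict:
    if thought == "done":
        return {"tool": "finish", "args": {}}
    return {"tool": "search", "args": {"q": "wireless mouse"}}


def stub_search(q: str) -> str:
    """Mimics the mock API's case-sensitive search against a one-item catalog."""
    catalog = ["Wireless Mouse"]
    hits = [name for name in catalog if q in name]
    return f"{len(hits)} results"


final = run_agent(AgentState(goal="buy a mouse"), stub_reason, stub_plan, {"search": stub_search})
print(final.memory)
```

The `max_steps` cap is a guard against a model that never emits a finish step.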

Agent Deployment

  • Microservices on Google Cloud Run
  • Real-time WebSocket streaming

Service Architecture

Core Services

| Service    | Technology           | Purpose                                        | Deployment                 |
|------------|----------------------|------------------------------------------------|----------------------------|
| UI         | Next.js/React        | User interface and real-time monitoring        | Local/Vercel               |
| API Server | FastAPI + WebSockets | Orchestrates testing, handles real-time logs   | Local (deployment shortly) |
| Mock API   | FastAPI              | Target API with strategic flaws for testing    | Google Cloud Run           |
| Vertex AI  | Google Cloud         | Architect agent for dynamic persona generation | Managed Service            |
| Gemma LLM  | Vertex AI Model      | Persona-based API testing agent                | Google Cloud Run + GPU     |

Strategic Design Philosophy

PersonaFlow is designed around the principle of Strategic Failure Discovery:

  1. Intentional Flaws: The mock API contains carefully designed issues that real APIs commonly have
  2. Persona-Driven Discovery: Different personas naturally discover different types of issues
  3. Realistic Testing: Personas behave like real users, not perfect test scripts
  4. Actionable Insights: Final reports provide specific, prioritized recommendations
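
To make "prioritized recommendations" concrete, here is a small sketch of ordering findings by severity. The `Finding` type, the three-level severity scale, and `has_blocker` are assumptions for illustration; the two sample findings paraphrase the example reports quoted earlier on this page.

```python
from dataclasses import dataclass

# Assumed severity scale; a lower rank means higher priority.
SEVERITY_RANK = {"critical": 0, "major": 1, "minor": 2}


@dataclass
class Finding:
    persona: str
    severity: str  # one of the SEVERITY_RANK keys
    summary: str


def prioritize(findings: list[Finding]) -> list[Finding]:
    """Order findings by severity, then persona name for a stable report."""
    return sorted(findings, key=lambda f: (SEVERITY_RANK[f.severity], f.persona))


def has_blocker(findings: list[Finding]) -> bool:
    """True if any finding is critical; a CI job could fail the build on this."""
    return any(f.severity == "critical" for f in findings)


findings = [
    Finding("Casual Casey", "major", "Search is case-sensitive"),
    Finding("Power-User Paula", "critical", "Admin endpoints may leak data"),
]
ordered = prioritize(findings)
for f in ordered:
    print(f"[{f.severity}] {f.persona}: {f.summary}")
```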

Next Steps

  1. Integrate with an agent traceability tool like Opik (there just wasn’t enough time).
  2. Test different models to see which one follows the persona description most faithfully.
  3. Improve latency, possibly via quantized models.
  4. User testing with actual PMs… or even simulated PMs (à la Karpathy’s recursive self-improvement).
AI Tinkerers Betaworks Gemini-cli Google Cloud claude-code.

Repo for project


Slides from talk on "embodied" agents
