Hackathon Showcase

PersonaFlow

Team led by Dhruv Sharma, Ph.D. (ENS, École Polytechnique), MSCI Director. Expertise: LLM internals, agents/RL, vector search, C++/Python, portfolio optimization, and risk analytics.

1 member

Project Description

PersonaFlow: AI-Powered Persona Testing Platform

The Business Value: Automated UX Intelligence for Modern Product Teams

PersonaFlow transforms the slow, expensive process of user feedback into an automated, continuous intelligence pipeline.

For agile product teams, understanding user experience across different customer segments is critical but traditionally requires weeks of user research, surveys, and testing sessions. PersonaFlow eliminates this bottleneck by providing immediate, persona-driven UX feedback directly within the development lifecycle.

The Vision: UX Expert in Your CI/CD Pipeline

Imagine integrating a team of UX experts directly into your deployment process. With every significant release, PersonaFlow automatically deploys AI agents—each embodying key user personas—against your application. Within minutes, you receive detailed reports: “Casual Casey got confused by the new checkout flow due to case-sensitive search” or “Power-User Paula discovered a potential data leak in the admin endpoints.”

This isn’t just automated testing—it’s automated user empathy at scale.

Immediate ROI for Product Teams

Faster feedback cycles: From weeks of user research to minutes of automated testing
Comprehensive user persona coverage: Test against multiple personas simultaneously
Actionable insights: Specific, prioritized recommendations for each user segment
Continuous validation: Integrate directly into CI/CD for every deployment
Time and cost efficiency: Replace expensive and time-consuming user testing sessions with automated persona agents

Key Technical Innovations

1. Dual-AI Architecture: Dynamic Personas + Testing Agents

Architect Agent (Vertex AI): Dynamically generates custom personas based on target audience description
Testing Agents (Gemma 3-12B): Execute persona-driven testing with real-time decision logging
Intelligent coordination: Architect synthesizes final reports with strategic recommendations
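The division of labor above can be sketched as a small coordinator. This is a minimal illustration, not the project's actual code: the Persona class, the run_session function, and the three stub callables are hypothetical names standing in for the Vertex AI Architect and the Gemma testing agents, so the sketch runs without any model access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Persona:
    name: str
    description: str


# Hypothetical signatures: each callable stands in for a model-backed component.
PersonaGenerator = Callable[[str], List[Persona]]     # Architect: audience -> personas
PersonaTester = Callable[[Persona], List[str]]        # Testing agent: persona -> findings
ReportWriter = Callable[[Dict[str, List[str]]], str]  # Architect: findings -> report


def run_session(audience: str,
                generate: PersonaGenerator,
                test: PersonaTester,
                summarize: ReportWriter) -> str:
    """Generate personas, test with each one, then synthesize a single report."""
    personas = generate(audience)
    findings = {p.name: test(p) for p in personas}
    return summarize(findings)


# Stub components (assumptions for illustration only; no LLM is called).
def stub_generate(audience: str) -> List[Persona]:
    return [
        Persona("Casual Casey", f"Occasional shopper among {audience}"),
        Persona("Power-User Paula", f"Expert user among {audience}"),
    ]


def stub_test(persona: Persona) -> List[str]:
    if persona.name == "Casual Casey":
        return ["Search returned nothing for a lowercase query"]
    return []


def stub_summarize(findings: Dict[str, List[str]]) -> str:
    lines = [f"{name}: {len(issues)} issue(s)" for name, issues in findings.items()]
    return "\n".join(lines)


report = run_session("online electronics shoppers",
                     stub_generate, stub_test, stub_summarize)
print(report)
```

Passing the three components in as callables keeps the coordinator independent of any particular model provider, which is one plausible way to swap models later.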

2. Strategic Flaw Discovery System

Our mock API contains intentionally designed flaws that mirror real-world UX issues:

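# Example: case-sensitive search (frustrates casual users)
from fastapi import FastAPI

app = FastAPI()
products = [{"name": "Wireless Mouse"}]  # sample data, assumed for illustration

@app.get("/search")
def search_products(q: str):
    # Intentional flaw: substring match is case-sensitive (no .lower())
    results = [p for p in products if q in p["name"]]
    return {"results": results, "total": len(results)}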

Comprehensive test coverage validates both correct functionality AND intentional flaws:

from fastapi.testclient import TestClient

client = TestClient(app)  # `app` is the mock API's FastAPI instance

def test_search_case_sensitivity():
    """Verify the case-sensitivity flaw behaves as designed."""
    response = client.get("/search", params={"q": "wireless mouse"})  # lowercase
    assert response.json()["total"] == 0  # no results (intended)

    response = client.get("/search", params={"q": "Wireless Mouse"})  # proper case
    assert response.json()["total"] > 0  # has results
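For contrast, here is a framework-free sketch of the flawed matching next to the fix a persona report would likely recommend. The product list and both function names are assumptions for illustration; only the "Wireless Mouse" item comes from the test above.

```python
from typing import Dict, List

# Sample catalog (assumed; the real mock API's data is not shown in this write-up).
PRODUCTS: List[Dict[str, str]] = [
    {"name": "Wireless Mouse"},
    {"name": "USB-C Cable"},
]


def search_case_sensitive(q: str) -> List[Dict[str, str]]:
    """The intentional flaw: a raw substring match."""
    return [p for p in PRODUCTS if q in p["name"]]


def search_case_insensitive(q: str) -> List[Dict[str, str]]:
    """A possible fix: normalize both sides before matching."""
    needle = q.casefold()
    return [p for p in PRODUCTS if needle in p["name"].casefold()]


print(len(search_case_sensitive("wireless mouse")))
print(len(search_case_insensitive("wireless mouse")))
```

str.casefold() is used rather than str.lower() because it also handles non-English case mappings.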

3. Real-Time Agent Monitoring with WebSockets

Real-time streaming of agent thoughts and actions:

# Real-time agent observation (excerpt: `app`, `manager`, and `LogMessage` are defined elsewhere)
from datetime import datetime
from fastapi import WebSocket

@app.websocket("/api/test-sessions/{session_id}/logs")
async def websocket_logs(websocket: WebSocket, session_id: str):
    async def broadcast_log(log_type: str, message: str):
        log = LogMessage(
            timestamp=datetime.now().isoformat(),
            type=log_type,  # "thinking", "acting", or "observing"
            message=message,
        )
        await manager.send_log(session_id, log)
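The manager object used above is not shown in this write-up. One way such a component could work is sketched below, assuming it keeps a list of sockets per session and fans each log entry out to them. LogManager, FakeSocket, and the dict-shaped log are hypothetical; the only thing borrowed from the real stack is that a Starlette/FastAPI WebSocket exposes an async send_json method, which FakeSocket imitates so the sketch runs with the standard library alone.

```python
import asyncio
from collections import defaultdict
from typing import Any, DefaultDict, Dict, List


class LogManager:
    """Tracks sockets per test session and fans each log entry out to them."""

    def __init__(self) -> None:
        self._sockets: DefaultDict[str, List[Any]] = defaultdict(list)

    def connect(self, session_id: str, websocket: Any) -> None:
        self._sockets[session_id].append(websocket)

    def disconnect(self, session_id: str, websocket: Any) -> None:
        if websocket in self._sockets[session_id]:
            self._sockets[session_id].remove(websocket)

    async def send_log(self, session_id: str, log: Dict[str, str]) -> None:
        # Iterate over a copy so failed sockets can be removed safely.
        for ws in list(self._sockets[session_id]):
            try:
                await ws.send_json(log)
            except Exception:
                # Drop sockets that can no longer receive messages.
                self.disconnect(session_id, ws)


class FakeSocket:
    """Stand-in for a WebSocket; records what would have been sent."""

    def __init__(self) -> None:
        self.sent: List[Dict[str, str]] = []

    async def send_json(self, data: Dict[str, str]) -> None:
        self.sent.append(data)


async def demo() -> List[Dict[str, str]]:
    manager = LogManager()
    sock = FakeSocket()
    manager.connect("s1", sock)
    await manager.send_log("s1", {"type": "thinking", "message": "Trying search"})
    await manager.send_log("other", {"type": "acting", "message": "Not for s1"})
    return sock.sent


sent = asyncio.run(demo())
print(sent)
```

Keying sockets by session ID means a viewer only receives logs for the session it subscribed to.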

Demonstration of Core Requirements

LLM Deployment

Gemma 3-12B deployed on GPU-enabled Google Cloud Run
Vertex AI integration for persona generation
Advanced prompt engineering with persona context
Robust response parsing with error recovery
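As an illustration of what response parsing with error recovery can look like, the sketch below tries to read a model reply as JSON, falls back to the outermost brace-delimited span when the model wraps the JSON in chatter, and finally returns a safe default. The action schema, parse_agent_reply, and FALLBACK_ACTION are assumptions, not the project's actual format.

```python
import json
from typing import Any, Dict

# Hypothetical default used when nothing parseable is found.
FALLBACK_ACTION: Dict[str, Any] = {
    "action": "noop",
    "reason": "unparseable model output",
}


def parse_agent_reply(text: str) -> Dict[str, Any]:
    """Extract a JSON object from an LLM reply, recovering from extra text."""
    candidates = [text]

    # Models often surround JSON with prose or code fences, so also try
    # the span from the first "{" to the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidates.append(text[start:end + 1])

    for candidate in candidates:
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed

    # Nothing usable: return a copy so callers cannot mutate the default.
    return dict(FALLBACK_ACTION)


print(parse_agent_reply('Sure! Here is my plan: {"action": "GET"} Hope that helps.'))
```

Returning a no-op action instead of raising lets the agent loop continue and log the failure rather than crash mid-session.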

Agent Development

Custom agent architecture built from the ground up (no frameworks used)
REASON→PLAN→ACT→OBSERVE decision loop
Memory management and conversation context
Tool integration with API interaction capabilities
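The REASON→PLAN→ACT→OBSERVE loop can be sketched in a few lines. Everything here is a hypothetical stand-in: AgentState, run_loop, and the stub reason/plan functions replace the LLM calls, and search_tool imitates the mock API's case-sensitive search so the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class AgentState:
    persona: str
    goal: str
    memory: List[str] = field(default_factory=list)


CATALOG = ["Wireless Mouse", "USB-C Cable"]  # assumed sample data


def search_tool(query: str) -> str:
    """Imitates the case-sensitive search endpoint."""
    hits = [name for name in CATALOG if query in name]
    return f"{len(hits)} result(s)"


def stub_reason(state: AgentState) -> str:
    # Stand-in for an LLM call that reflects on memory so far.
    if not state.memory:
        return "I have not searched yet."
    if state.memory[-1].endswith("-> 0 result(s)"):
        return "Nothing came back; maybe capitalization matters."
    return "I found what I wanted."


def stub_plan(state: AgentState, thought: str) -> Tuple[str, str]:
    # Stand-in for an LLM call that turns a thought into a tool call.
    if "not searched" in thought:
        return ("search", "wireless mouse")
    if "capitalization" in thought:
        return ("search", "Wireless Mouse")
    return ("stop", "")


def run_loop(state: AgentState,
             reason: Callable[[AgentState], str],
             plan: Callable[[AgentState, str], Tuple[str, str]],
             tools: Dict[str, Callable[[str], str]],
             max_steps: int = 5) -> List[str]:
    for _ in range(max_steps):
        thought = reason(state)                # REASON
        tool_name, arg = plan(state, thought)  # PLAN
        if tool_name == "stop":
            break
        observation = tools[tool_name](arg)    # ACT
        # OBSERVE: record the outcome so the next iteration can use it.
        state.memory.append(f"{tool_name}({arg!r}) -> {observation}")
    return state.memory


state = AgentState(persona="Casual Casey", goal="Find a wireless mouse")
trace = run_loop(state, stub_reason, stub_plan, {"search": search_tool})
print(trace)
```

The max_steps cap is a simple guard against an agent that never decides to stop.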

Agent Deployment

Microservices on Google Cloud Run
Real-time WebSocket streaming

Service Architecture

Core Services

Service	Technology	Purpose	Deployment
UI	Next.js/React	User interface and real-time monitoring	Local/Vercel
API Server	FastAPI + WebSockets	Orchestrates testing, handles real-time logs	Local (deployment planned shortly)
Mock API	FastAPI	Target API with strategic flaws for testing	Google Cloud Run
Vertex AI	Google Cloud	Architect agent for dynamic persona generation	Managed Service
Gemma LLM	Vertex AI Model	Persona-based API testing agent	Google Cloud Run + GPU

Strategic Design Philosophy

PersonaFlow is designed around the principle of Strategic Failure Discovery:

Intentional Flaws: The mock API contains carefully designed issues that real APIs commonly have
Persona-Driven Discovery: Different personas naturally discover different types of issues
Realistic Testing: Personas behave like real users, not perfect test scripts
Actionable Insights: Final reports provide specific, prioritized recommendations

Next Steps

Integrate with an agent traceability tool like Opik (there just wasn’t enough time :( ).
Test different models and verify which one best follows the persona description.
Latency improvements, possibly via quantized models.
User testing with actual PMs… or even simulated PMs (à la Karpathy’s recursive self-improvement).

Team

Dhruv Sharma

Products & Tools

AI Tinkerers, Betaworks, Gemini-cli, Google Cloud, claude-code

Additional Links

https://github.com/dhruvshrma/persona-flow

Repo for project


https://dhruv-sharma.ovh/post/talk-ccs-2023/presentation_ccs2023.pdf

Slides from talk on "embodied" agents
