XL AI
Team: Xavier Huang (Founder, Mandi Tech; AI R&D across web/mobile, Python/JS, and adaptive tutoring) and Lyndsay Goldfarb (Owner, XL Marketing; event/project director).
Project Description
Our project is an AI-powered study assistant for college students, built for the Cloud Run Hackathon. It covers three areas: deploying an LLM, developing an agent on top of it, and deploying that agent so it scales. The assistant answers questions through a chat interface and targets the hackathon's judging criteria of technical excellence, innovation, and usability.
LLM Deployment
Custom LLM (Gemma) Deployment:
We deployed the open-source Gemma 3 4B model as a secure, authenticated REST API on Google Cloud Run. This gives us scalable, low-latency inference and an HTTP endpoint that other cloud-native services can call.
Service-to-Service Authentication:
All requests to the LLM are authenticated using Google Cloud identity tokens, ensuring secure and authorized access.
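A minimal sketch of what such a call could look like from the agent side. The `/generate` path, the `{"prompt": ...}` body, and the function names are hypothetical, since the write-up does not document the API's shape. The token helper assumes the `google-auth` package and ambient service-account credentials, and is imported lazily so the request-building part runs without them.

```python
import json
import urllib.request


def fetch_identity_token(audience: str) -> str:
    """Fetch a Google-signed identity token for the target service URL.

    Assumes google-auth is installed and credentials are available
    (for example, the service account attached to a Cloud Run service).
    """
    import google.auth.transport.requests
    from google.oauth2 import id_token

    auth_request = google.auth.transport.requests.Request()
    return id_token.fetch_id_token(auth_request, audience)


def build_llm_request(service_url: str, prompt: str, token: str) -> urllib.request.Request:
    """Build an authenticated POST request for the LLM service.

    The /generate path and the JSON body are illustrative assumptions.
    """
    body = json.dumps({"prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{service_url.rstrip('/')}/generate",
        data=body,
        headers={
            # Cloud Run checks this bearer token before the request reaches the container.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

In use, the caller would obtain `token = fetch_identity_token(service_url)` and pass the built request to `urllib.request.urlopen`. A request without a valid token is rejected by Cloud Run before it reaches the model.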
Agent Development
Agent Development Kit (ADK):
We built our agent using Google’s ADK, enabling tool-based reasoning and modular extensibility.
Retrieval-Augmented Generation (RAG):
The agent integrates with Pinecone, a managed vector database, to perform semantic search over a large PDF textbook. For each question, the agent retrieves the passages most similar to the query and passes them to the LLM, so answers are grounded in contextually relevant text from the document itself.
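To make the retrieval step concrete, here is a small pure-Python sketch of nearest-neighbour search over embeddings. It is not the Pinecone client: Pinecone performs an equivalent search as a managed service. The cosine metric, the three-dimensional toy vectors, and the chunk ids are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vector, index, k=2):
    """Return the ids of the k chunks whose embeddings are most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), chunk_id) for chunk_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [chunk_id for _, chunk_id in scored[:k]]


# Toy index: chunk id -> embedding. Real embeddings have hundreds of dimensions.
toy_index = {
    "chunk-intro": [1.0, 0.0, 0.0],
    "chunk-recursion": [0.0, 1.0, 0.0],
    "chunk-sorting": [0.0, 0.6, 0.8],
}

print(top_k_chunks([0.0, 1.0, 0.05], toy_index, k=2))
# prints ['chunk-recursion', 'chunk-sorting']
```

The text of the returned chunks is what gets placed in the LLM prompt as context.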
Custom Tools:
The agent exposes several tools, including:
ask_pdf_pinecone: Answers questions using semantic search over the PDF and responds in the author’s style.
Code generation, brainstorming, and concept explanation tools, all powered by the deployed LLM.
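As a rough illustration of how a tool like ask_pdf_pinecone might turn retrieved passages into an author-style answer, the sketch below assembles a prompt for the LLM. The function name, the prompt wording, and the `author` parameter are hypothetical. The project's actual prompt is not shown in this write-up.

```python
def build_author_style_prompt(question, chunks, author="the textbook's author"):
    """Combine retrieved passages and a question into one prompt string.

    The excerpts are numbered so the model can refer back to them.
    """
    context = "\n\n".join(
        f"[{i}] {chunk.strip()}" for i, chunk in enumerate(chunks, start=1)
    )
    return (
        f"You are answering as {author}. Match the tone and style of the excerpts.\n"
        "Use only the excerpts below; say so if they do not contain the answer.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


prompt = build_author_style_prompt(
    "What is recursion?",
    ["  A recursive function calls itself on a smaller input. ", "Every recursion needs a base case."],
)
```

Restricting the model to the supplied excerpts is one common way to keep answers tied to the source text. Whether the project applies the same restriction is an assumption.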
Agent Deployment
Cloud Run Deployment:
The entire agent backend (FastAPI + ADK) is containerized and deployed to Google Cloud Run, ensuring high availability, scalability, and easy updates.
Environment Management:
All secrets and configuration are managed via environment variables and .env files for security and portability.
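One way to read configuration from the environment is sketched below. The variable names `PINECONE_API_KEY` and `LLM_SERVICE_URL` are hypothetical stand-ins for whatever the project actually uses. `PORT` is the variable Cloud Run sets to tell a container which port to listen on. For local runs, a `.env` file could be loaded first with python-dotenv's `load_dotenv()`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    pinecone_api_key: str
    llm_service_url: str
    port: int


def load_settings(env=None):
    """Read settings from a mapping (defaults to os.environ) and fail fast if any are missing."""
    env = os.environ if env is None else env
    required = ("PINECONE_API_KEY", "LLM_SERVICE_URL")  # hypothetical names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return Settings(
        pinecone_api_key=env["PINECONE_API_KEY"],
        llm_service_url=env["LLM_SERVICE_URL"],
        # Fall back to 8080 when PORT is not set, e.g. during local development.
        port=int(env.get("PORT", "8080")),
    )
```

Failing at startup when a secret is missing surfaces a misconfigured deployment immediately, rather than on the first user request.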
Unique Features & User Experience
Conversational UI:
Users interact with the agent through a modern web UI (ADK Dev UI), enabling natural language queries and rich, chat-based interaction.
Author-Style Responses:
When answering questions from the PDF, the agent mimics the tone and style of the textbook’s author, enhancing trust and educational value.
Seamless RAG:
The integration with Pinecone allows for fast, accurate retrieval of relevant information, even from large documents.
Extensible Tooling:
The agent’s modular tool system makes it easy to add new capabilities or connect to additional data sources.
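The idea of a modular tool system can be illustrated with a plain registry of functions. This is a generic sketch and not ADK's API. The decorator, the `TOOLS` dictionary, and the two example tools are hypothetical, and the tools return prompt strings rather than calling the deployed LLM.

```python
from typing import Callable, Dict

# Maps a tool name to the function that implements it.
TOOLS: Dict[str, Callable[[str], str]] = {}


def register_tool(func):
    """Decorator that adds a function to the registry under its own name."""
    TOOLS[func.__name__] = func
    return func


@register_tool
def explain_concept(topic: str) -> str:
    return f"Explain the concept of {topic} step by step for a college student."


@register_tool
def brainstorm(topic: str) -> str:
    return f"List five distinct ideas related to {topic}."


def run_tool(name: str, argument: str) -> str:
    """Look up a tool by name and call it, raising KeyError for unknown names."""
    if name not in TOOLS:
        raise KeyError(f"Unknown tool: {name}")
    return TOOLS[name](argument)
```

Adding a capability then means writing one more decorated function. The dispatch code does not change.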
Technologies, Frameworks, and Tools
Google Cloud Run: For scalable, serverless deployment of both the LLM and the agent backend.
Google Agent Development Kit (ADK): For agent orchestration, tool management, and conversational UI.
FastAPI: For building the backend API and integrating with ADK.
Pinecone: For managed vector search and semantic retrieval from large documents.
PyPDF2: For PDF text extraction and chunking.
Python: The primary programming language for all backend and agent logic.
Docker: For containerization and reproducible deployments.
python-dotenv: For loading environment variables from .env files.
GitHub: For version control and collaboration.
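The PDF extraction and chunking step listed above can be sketched as follows. The chunk size, the overlap, and the character-based splitting are assumptions, since the write-up does not state the project's chunking parameters. `extract_pdf_text` uses PyPDF2's `PdfReader` and is imported lazily so the chunking function can be tried without the package installed.

```python
def extract_pdf_text(path):
    """Concatenate the extracted text of every page in a PDF (requires PyPDF2)."""
    from PyPDF2 import PdfReader

    reader = PdfReader(path)
    # extract_text() can return None or an empty string for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)


def chunk_text(text, chunk_size=500, overlap=50):
    """Split text into fixed-size character chunks that overlap their neighbours."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
# prints ['abcd', 'defg', 'ghij']
```

Overlapping chunks reduce the chance that a sentence relevant to a question is split across a chunk boundary and missed at retrieval time.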
Value and Technical Excellence
This project combines secure, scalable LLM deployment with tool-based agent reasoning and retrieval-augmented generation. Pinecone-backed semantic search and a modular agent architecture let the assistant answer questions from the textbook itself rather than from the model's general knowledge alone. The solution is cloud-native, extensible, and suited to real-world educational or enterprise applications.
Business Value
ProfAI gives students immediate, contextually accurate textbook help, reducing their dependence on instructors and generic search engines. Its serverless architecture scales with demand, which keeps it cost-efficient and reliable and makes it a good fit for educational platforms that want to improve student learning and engagement.