Browser Assistant
Team led by a Stevens Applied AI master's student and SDE2 at FanCode with Android (Kotlin, Coroutines, DI), Python ML, and CV/NLP prototyping experience.
YouTube Video
Project Description
This is a Chrome extension that helps kids, elderly people, and novice users browse the internet by showing a screenshot together with additional help or explanation that Gemini generates from that screenshot. The user submits a query such as “How to book flight tickets from JFK to CLT?”. The extension calls the backend, which searches the internet using Tavily (the project initially used the Google Search API but switched to Tavily after reaching the search limit) and then takes a screenshot using Puppeteer. The screenshot is passed, along with the user's prompt, to the Gemini multimodal LLM, which generates additional context or help. Finally, all of the data is sent back to the Chrome extension.

The project has two parts, a frontend and a backend. The frontend is plain HTML and JavaScript, and the backend is written in Node.js. The project was generated using gemini-cli, and Guardian Angel, the name of the Chrome extension, was also auto-generated by Gemini.
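The backend flow described above (search, screenshot, explanation, response) could be orchestrated roughly as in the following sketch. It is not the project's actual code. The names `answerQuery`, `buildPrompt`, and the `steps` object are hypothetical, the prompt wording is invented for illustration, and the sketch assumes that the search step yields a page URL that the screenshot step then captures. The three steps are injected as functions so the example runs without network access; in the real backend they would wrap Tavily, Puppeteer, and Gemini.

```javascript
// Hypothetical orchestration of the backend flow: search -> screenshot -> explain.
// The step functions are injected, so this sketch makes no real API calls.

// Builds the text prompt sent to the multimodal model with the screenshot.
// The wording is an illustrative assumption, not the project's prompt.
function buildPrompt(query) {
  return (
    'The user asked: "' + query + '". ' +
    "Using the attached screenshot, explain in simple steps what to do next."
  );
}

async function answerQuery(query, steps) {
  const trimmed = String(query || "").trim();
  if (!trimmed) {
    throw new Error("query must be a non-empty string");
  }
  // 1. Search the web and pick a page URL (Tavily in the project).
  const url = await steps.search(trimmed);
  // 2. Capture a screenshot of that page (Puppeteer in the project).
  const screenshot = await steps.screenshot(url);
  // 3. Ask the multimodal model for help based on the prompt and screenshot.
  const help = await steps.explain(buildPrompt(trimmed), screenshot);
  // 4. Return everything to the extension in a single payload.
  return { query: trimmed, url, screenshot, help };
}

// Usage with stub steps standing in for the real services.
answerQuery("How to book flight tickets from JFK to CLT?", {
  search: async () => "https://example.com/flights",
  screenshot: async () => "<base64 PNG>",
  explain: async (prompt) => "Stub help for: " + prompt,
}).then((result) => console.log(result.help));
```

Injecting the steps keeps the orchestration independent of any one provider, which would make a change like the switch from the Google Search API to Tavily a matter of replacing only the `search` function.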