TextEvolve: Automated LLM Design

A live demo of TextEvolve, an automated system that writes, tests, and refines Python scripts for LLM tasks using iterative feedback and memory.

Video

Overview

I’ll be presenting TextEvolve, a system I’ve been working on for a few months.

TextEvolve was created to drastically reduce the time it takes to design and refine LLM-based applications.

You give TextEvolve a text dataset of input/output pairs (for example Q&A datasets, LLM benchmarks, or real text applications), and the system uses LLMs to write programs that map inputs to outputs. Over time, through memory and LLM-based feedback, it learns which program approaches work well.

For example, given a question-answering dataset that requires complex reasoning over reference passages, the system might design a program on iteration 1 that fetches the context and makes a single, simple LLM call. Through LLM-based feedback, the system finds that this approach doesn’t handle the complex reasoning well, so on iteration 2 it writes more elaborate control flow to handle different types of questions. On iteration 3 it decides a better approach is to revise the iteration 1 program and add a ReAct-style loop, and so on.
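To make the difference between those iterations concrete, here is a minimal sketch of what an iteration-1 style script and a ReAct-style revision could look like. This is illustrative only and not output from TextEvolve: the `LLM` callable, the function names, and the `LOOKUP:` / `FINAL:` step format are all assumptions made for this example.

```python
from typing import Callable

# Hypothetical interface: any function that takes a prompt and returns text.
LLM = Callable[[str], str]


def simple_answer(question: str, passage: str, llm: LLM) -> str:
    """Iteration-1 style: put the passage in the prompt and make one LLM call."""
    prompt = f"Passage:\n{passage}\n\nQuestion: {question}\nAnswer:"
    return llm(prompt).strip()


def react_answer(question: str, passage: str, llm: LLM, max_steps: int = 4) -> str:
    """ReAct-style revision: alternate model steps with passage lookups.

    The model is expected (by assumption) to reply with either
    'LOOKUP: <term>' to search the passage or 'FINAL: <answer>' to stop.
    """
    sentences = [s.strip() for s in passage.split(".") if s.strip()]
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript).strip()
        transcript += step + "\n"
        if step.startswith("FINAL:"):
            return step[len("FINAL:"):].strip()
        if step.startswith("LOOKUP:"):
            # Act: search the passage, then feed the result back as an observation.
            term = step[len("LOOKUP:"):].strip().lower()
            hits = [s for s in sentences if term in s.lower()]
            transcript += "Observation: " + (" | ".join(hits) or "no match") + "\n"
    # No final answer within the step budget: fall back to the simple approach.
    return simple_answer(question, passage, llm)
```

Passing the model in as a plain callable keeps the sketch independent of any particular LLM provider and makes it easy to test with a scripted stand-in.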

I’ll probably do exactly this kind of demo: talk through how the system works and show the different approaches (simple Python scripts), the memory files, and the experimental log that it generates. For most datasets you can review the performance history for each iteration and see the system gradually improve over time.

Concretely, the system uses LLMs to:

1. write a script, end-to-end, that attempts to map the dataset inputs to outputs
2. run the script on a batch of test data
3. generate feedback
4. write feedback and previous attempts into memory
5. pick a strategy for the next iteration: try something new? refine a good past attempt? combine the best performing attempts?
6. repeat
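The loop above can be sketched in a few lines of Python. This is a rough outline under stated assumptions, not TextEvolve's actual code: the `Attempt` and `Memory` classes, the `evolve` function, and the three callables are hypothetical names, and `pick_strategy` uses a simple random rule as a stand-in for the LLM-driven strategy choice described above.

```python
import random
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Attempt:
    iteration: int
    strategy: str   # "explore", "refine", or "combine"
    script: str     # source code of the generated program
    accuracy: float
    feedback: str


@dataclass
class Memory:
    attempts: list[Attempt] = field(default_factory=list)

    def best(self, k: int = 1) -> list[Attempt]:
        """Return the k highest-accuracy attempts seen so far."""
        return sorted(self.attempts, key=lambda a: a.accuracy, reverse=True)[:k]


def pick_strategy(memory: Memory, rng: random.Random, explore_rate: float = 0.3) -> str:
    """Stand-in heuristic (an assumption): explore sometimes, else reuse past attempts."""
    if not memory.attempts or rng.random() < explore_rate:
        return "explore"
    if len(memory.attempts) >= 2 and rng.random() < 0.5:
        return "combine"
    return "refine"


def evolve(
    write_script: Callable[[str, list[Attempt]], str],  # LLM writes a program
    evaluate: Callable[[str], float],                   # run it on a test batch
    critique: Callable[[str, float], str],              # LLM-generated feedback
    iterations: int,
    seed: int = 0,
) -> Memory:
    memory = Memory()
    rng = random.Random(seed)
    for i in range(1, iterations + 1):
        strategy = pick_strategy(memory, rng)
        if strategy == "explore":
            parents = []                       # try something new
        elif strategy == "combine":
            parents = memory.best(2)           # merge the best attempts
        else:
            parents = memory.best(1)           # refine the best attempt
        script = write_script(strategy, parents)
        accuracy = evaluate(script)
        feedback = critique(script, accuracy)
        # Store the attempt and its feedback so later iterations can build on it.
        memory.attempts.append(Attempt(i, strategy, script, accuracy, feedback))
    return memory
```

Keeping every attempt in memory, rather than only the current best, is what lets a later iteration go back and revise an earlier program, as in the iteration 3 example above.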

For the sake of comparison, it’s kind of like DSPy on steroids or an alternative to AlphaEvolve.

Links

https://github.com/nickcdryan/textevolve
TextEvolve: LLM-driven automated program discovery generating optimized Python scripts.

Tech stack