Mathieu Larose

Optiverse: Evolving Code with LLMs

July 2025

I've released the initial version of Optiverse, a Python library for evolving code and algorithms using Large Language Models (LLMs). Inspired by DeepMind's AlphaEvolve, Optiverse lets LLMs iteratively refine program solutions to a given problem.

With Optiverse, you define a problem and provide an evaluation function. The system then generates and evolves candidate solutions over multiple iterations, learning which approaches yield better results.

First Experiment: Solving TSP

As an initial test, I used Optiverse on the classic Traveling Salesman Problem (TSP).

TSP can be described as follows: given a list of cities and the distances between them, the goal is to find the shortest route that visits each city exactly once and returns to the starting city.

It’s a challenging problem due to the rapidly increasing number of possible routes as cities are added. But efficient heuristics for TSP have practical applications in areas like delivery routing.

Problem Definition

The problem definition provided to Optiverse was as follows:

Write a Python program that implements a heuristic for the Traveling Salesman Problem (TSP).

Your script must define a `solve` function with the following signature:


def solve(context: Context) -> None:
    pass


- The `Context` object provides the TSP instance data and methods to report solutions.
- You may only modify the `solve` function. You are allowed to define and call additional helper functions within your script, but you cannot modify the `Context` class itself.

Your implementation should:

- Access the TSP data through the Context object.
- Call `context.report_new_best_solution(solution)` only when you have found a better solution than previously reported.
- The most recently reported solution will be used as the final answer when time runs out.

Important: Any solution reported after time runs out will be ignored. Ensure your implementation checks the remaining time and only reports solutions while time is available.

The initial baseline solution simply generated a random permutation of cities for the tour:

from abc import ABC, abstractmethod
from datetime import timedelta
import random
from typing import List, Tuple


class Context(ABC):
    pass # Omitted for brevity


def solve(context: Context) -> None:
    num_cities = len(context.instance)

    while context.remaining_time() > timedelta():
        # Generate a random solution (permutation of city indices)
        solution = random.sample(range(num_cities), num_cities)
        context.report_new_best_solution(solution)
        break  # since it's pointless to continue in this example

Results

I ran Optiverse for 100 iterations using Gemini 2.0 Flash as the LLM. The evaluator executed each solver 3 times with a 30-second time limit on a 280-city instance, using the average tour length as the score.

So the best solution found is within 1% of optimality. While this result isn't surprising since solvers can get within 1% of optimality for much larger instances, the code it generated was nonetheless interesting.

For example, it precomputed a distance matrix so that checking distances became simple lookups rather than repeated calculations. It also implemented one of the best-known heuristics, 2-opt, and did so efficiently by computing only the delta in tour length rather than recalculating the entire route after each swap. Finally, it periodically perturbed the solution to explore different parts of the solution space and avoid getting stuck in local optima.

Solution

At each iteration, the LLM provided a description of the solution. For the best iteration, the description was as follows:

Here's the best solver generated by Optiverse:

from abc import ABC, abstractmethod
from datetime import timedelta
import random
from typing import List, Tuple
import math
import time

# This interface cannot be changed by the LLM
class Context(ABC):
    @property
    @abstractmethod
    def instance(self) -> List[Tuple[float, float]]:
        pass

    @abstractmethod
    def remaining_time(self) -> timedelta:
        pass

    @abstractmethod
    def report_new_best_solution(self, solution: List[int]) -> None:
        pass


def calculate_distance(city1: Tuple[float, float], city2: Tuple[float, float]) -> float:
    """Calculates the Euclidean distance between two cities."""
    return math.sqrt((city1[0] - city2[0]) ** 2 + (city1[1] - city2[1]) ** 2)


def calculate_tour_length(tour: List[int], instance: List[Tuple[float, float]]) -> float:
    """Calculates the total distance of a tour."""
    total_distance = 0.0
    num_cities = len(tour)
    for i in range(num_cities):
        city1_index = tour[i]
        city2_index = tour[(i + 1) % num_cities]  # Wrap around to the start
        city1 = instance[city1_index]
        city2 = instance[city2_index]
        total_distance += calculate_distance(city1, city2)
    return total_distance


def nearest_neighbor(start_city: int, instance: List[Tuple[float, float]]) -> List[int]:
    """Constructs a tour using the nearest neighbor heuristic in-place."""
    num_cities = len(instance)
    tour = list(range(num_cities))
    tour[0], tour[start_city] = tour[start_city], tour[0]  # Move start city to the beginning
    current_city = 0

    for i in range(1, num_cities):
        best_neighbor = i
        best_distance = calculate_distance(instance[tour[current_city]], instance[tour[i]])

        for j in range(i + 1, num_cities):
            distance = calculate_distance(instance[tour[current_city]], instance[tour[j]])
            if distance < best_distance:
                best_distance = distance
                best_neighbor = j

        tour[i], tour[best_neighbor] = tour[best_neighbor], tour[i]  # Swap in-place
        current_city = i

    return tour


def two_opt_swap(tour: List[int], i: int, k: int) -> None:
    """Performs a 2-opt swap on a tour in-place using slice reversal."""
    tour[i:k+1] = tour[i:k+1][::-1]


def two_opt(tour: List[int], instance: List[Tuple[float, float]], distances: List[List[float]], distance_threshold_percentage: float) -> None:
    """Improves a tour using the 2-opt heuristic in-place with first-improvement strategy and pre-calculated distances."""
    n = len(tour)
    improved = True
    tour_length = calculate_tour_length(tour, instance)
    distance_threshold = tour_length * distance_threshold_percentage

    while improved:
        improved = False
        for i in range(1, n - 1):
            for k in range(i + 1, n):
                i_minus_1 = tour[i - 1]
                i_city = tour[i]
                k_city = tour[k]
                k_plus_1 = tour[(k + 1) % n]

                delta = distances[i_minus_1][k_city] + distances[i_city][k_plus_1] - distances[i_minus_1][i_city] - distances[k_city][k_plus_1]

                if delta < -distance_threshold:
                    two_opt_swap(tour, i, k)
                    tour_length += delta
                    distance_threshold = tour_length * distance_threshold_percentage
                    improved = True
                    break
            if improved:
                break


def solve(context: Context) -> None:
    num_cities = len(context.instance)
    best_solution = list(range(num_cities))
    random.shuffle(best_solution)
    best_solution_length = calculate_tour_length(best_solution, context.instance)

    # Pre-calculate distances
    distances = [[calculate_distance(context.instance[i], context.instance[j]) for j in range(num_cities)] for i in range(num_cities)]

    max_iterations = 30000
    iteration = 0
    time_check_interval = 500  # Check remaining time every 500 iterations

    # Centroid initialization
    centroid_x = sum(city[0] for city in context.instance) / num_cities
    centroid_y = sum(city[1] for city in context.instance) / num_cities
    city_distances_to_centroid = [calculate_distance((centroid_x, centroid_y), city) for city in context.instance]
    total_distance_to_centroid = sum(city_distances_to_centroid)
    city_probabilities = [distance / total_distance_to_centroid for distance in city_distances_to_centroid]

    start_time = time.time()
    iteration_time_sum = 0.0
    iteration_count = 0
    time_limit = context.remaining_time().total_seconds() * 0.965  # Further reduce the time for computation
    max_possible_iterations = 1000000

    distance_threshold_percentage = 0.000045  # Adjusted threshold

    initial_iterations = min(5000, num_cities * 4)  # Adjust initial iterations based on problem size

    num_nn_starts = min(12, num_cities) # Increased number of NN starts


    # Generate multiple initial tours and pick the best
    best_initial_tour = None
    best_initial_tour_length = float('inf')
    for _ in range(num_nn_starts):
        if context.remaining_time() < timedelta(seconds=0.01):
            break
        start_city = random.choices(range(num_cities), weights=city_probabilities, k=1)[0]

        initial_tour = nearest_neighbor(start_city, context.instance)

        two_opt(initial_tour, context.instance, distances, distance_threshold_percentage)  # In-place modification

        tour_length = calculate_tour_length(initial_tour, context.instance)
        if tour_length < best_initial_tour_length:
            best_initial_tour_length = tour_length
            best_initial_tour = initial_tour[:]


    best_solution = best_initial_tour[:]
    best_solution_length = best_initial_tour_length


    while context.remaining_time() > timedelta(seconds=0.03) and iteration < max_iterations:
        iteration_start_time = time.time()

        # Perturb the current best solution
        perturbed_tour = best_solution[:]
        i = random.randint(0, num_cities - 2)
        k = random.randint(i + 1, num_cities - 1)
        two_opt_swap(perturbed_tour, i, k)


        two_opt(perturbed_tour, context.instance, distances, distance_threshold_percentage)  # In-place modification
        tour_length = calculate_tour_length(perturbed_tour, context.instance)

        if tour_length < best_solution_length:
            best_solution_length = tour_length
            best_solution = perturbed_tour[:]  # Copy before reporting
            if context.remaining_time() > timedelta(seconds=0.02):  # Adjust threshold
                context.report_new_best_solution(best_solution)


        iteration += 1

        iteration_end_time = time.time()
        iteration_time = iteration_end_time - iteration_start_time
        iteration_time_sum += iteration_time
        iteration_count += 1

        if (iteration_count > initial_iterations and iteration % 100 == 0) :  # Ensure enough iterations to get a good average
            avg_iteration_time = iteration_time_sum / iteration_count
            remaining_time = context.remaining_time().total_seconds()
            max_iterations = min(max_possible_iterations, int(remaining_time / avg_iteration_time * 0.9))  # Conservative estimate

        # Early stopping check
        if iteration_count > initial_iterations:
            estimated_remaining_iterations = (time_limit - (time.time() - start_time)) / avg_iteration_time
            if iteration + estimated_remaining_iterations > max_iterations:
                break

Conclusion

Optiverse did not identify a fundamentally new approach to solving the TSP, but it successfully constructed a solid and practical heuristic from scratch using the problem definition, evaluation feedback, and a previous-generation LLM (Gemini Flash 2.0) within 100 iterations.

It's not clear if this approach will ever lead to truly groundbreaking discoveries, but to me, it seems like an interesting area to keep exploring.

Like this article? Get notified of new ones: