Experiment Overview

This page explains the EvoNash experiment in plain language: what we do, why we do it, and how it connects to game theory, neural networks, and the future of AI. No prior background is required.

About This Project: The Science Fair Experiment at a Glance

EvoNash is a science fair project that asks a simple question: if we let digital "organisms" with tiny artificial brains evolve in a mini world, does it help to change their "genes" more when they are doing poorly and less when they are doing well? Or is it better to always change them by the same amount, like flipping a coin the same way every time? This project builds a real experiment—a computer platform that runs on a powerful graphics card—to answer that question. Below we explain every part of the experiment in simple terms.

Abstract (What We Did in One Paragraph)

This experiment tests whether adaptive mutation—changing an organism's "genes" (the numbers inside its brain) more when the parent did poorly and less when the parent did well—helps a population of 1,000 digital organisms reach a stable outcome (called a Nash equilibrium) faster than a control group that always uses the same amount of random change (static mutation). We put the organisms in a simple 2D world (a "petri dish") where they can move, eat food, and shoot at each other to steal energy. Their brains are small neural networks that we do not program; we only evolve them by keeping the best performers and randomly mutating their weights. We run two groups side by side, measure how many generations it takes each group to "settle down," and use statistics to see if the adaptive group really got there faster. The results tell us whether this kind of smart mutation could help future AI and evolutionary algorithms.

The Problem (Why We Did This)

In many real-world and computer experiments, we use evolution to improve things: we keep the best performers, copy them with small random changes (mutations), and repeat. But how much should we change them? If we change too much every time, good solutions get destroyed and we search almost at random. If we change too little, we might get stuck in a "local" good outcome and never find a better one. Most classic methods use a fixed amount of mutation—the same for everyone, every time. This project asks: what if we adapt the amount of mutation to how well the parent did? Struggling organisms get more random changes (a chance to try something new); successful ones get fewer changes (we keep what works). We wanted to test whether that idea actually speeds up how fast a population finds a stable, balanced outcome (a Nash equilibrium) in a simple but real experiment.

The Hypothesis (What We Think Will Happen)

Our hypothesis is: if we use adaptive mutation—where the amount of random change is inversely proportional to the parent's fitness (so low-performing parents produce more heavily mutated offspring, and high-performing parents produce less mutated offspring)—then the population will reach a Nash equilibrium (a stable mix of strategies where no one benefits by changing alone) in fewer generations than a control group that uses a fixed mutation rate. In other words, we predict that "smarter" mutation will help the population settle down faster.

Methodology (How We Run the Experiment)

We run two groups of experiments. Everything is the same in both groups except one thing: how much we mutate the offspring. In the control group, we always add the same small random amount of change to the brain weights (static mutation). In the experimental group, we add more change when the parent had a low fitness score and less change when the parent had a high fitness score (adaptive mutation). For each group we:

  • Start with 1,000 random neural-network "brains" in the same petri dish world.
  • Let them live for many "ticks" (moments)—moving, eating food, and sometimes shooting each other—and track who has the most energy.
  • At the end of each generation, we pick the top 20% by fitness score, copy their brains to create offspring, and mutate those copies (more or less depending on the group).
  • We repeat for many generations until the population's behavior stabilizes—meaning the mix of strategies stops changing much (we call that reaching Nash equilibrium).
  • We record when that happened (which generation) and how well the population did (peak fitness).
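The per-generation loop above can be sketched in Python. This is an illustrative reconstruction, not the platform's actual code: the fitness function is passed in, and `base_sigma` and the exact adaptive scaling rule are assumed values.

```python
import numpy as np

def evolve(population, fitness_fn, generations, rng,
           adaptive=False, base_sigma=0.05):
    """Sketch of one run: score, select top 20%, copy, mutate, repeat.
    `population` is an (N, genome_size) array of brain weights."""
    peak_history = []
    for _ in range(generations):
        fitness = fitness_fn(population)                 # score everyone
        peak_history.append(float(fitness.max()))        # record peak fitness
        cutoff = max(1, int(0.2 * len(population)))      # top 20% survive
        elite = np.argsort(fitness)[::-1][:cutoff]
        parents = np.resize(elite, len(population))      # refill the population
        offspring = population[parents].copy()
        if adaptive:
            # Experimental group: mutation scales inversely with parent fitness.
            f = fitness[parents]
            sigma = base_sigma * (1.0 - f / (f.max() + 1e-9))
            noise = rng.normal(0.0, 1.0, offspring.shape) * sigma[:, None]
        else:
            # Control group: the same fixed amount of change every time.
            noise = rng.normal(0.0, base_sigma, offspring.shape)
        population = offspring + noise
    return population, peak_history
```

Running both groups through this loop with the same starting population and comparing how quickly each stabilizes is the heart of the experiment.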

Then we compare the two groups using statistics: did the adaptive-mutation group reach Nash equilibrium in fewer generations? If yes, that supports our hypothesis.

Why We Use the Same Seed (Fair Start)

To be fair, we start the control and experimental groups with the same random seed. A seed is like a recipe that decides the starting conditions. Using the same seed means both groups begin with the same kind of brains and the same world setup. The only thing that is different is the mutation strategy. That way, if one group reaches a stable result faster, we know it happened because of the mutation strategy—not because it got a luckier start.

We don't rely on just one seed. We repeat the experiment with many different seeds. This gives us enough data to be confident that the result isn't just a coincidence.
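In code, a "same seed" start simply means both groups are built from the same random number generator state. A minimal sketch (the genome size is derived from the 24-64-4 network described later, counting all weights and biases):

```python
import numpy as np

# 24-64-4 network: 24*64 weights + 64 biases into the hidden layer,
# then 64*4 weights + 4 biases into the output layer = 1,860 numbers per brain.
GENOME_SIZE = 24 * 64 + 64 + 64 * 4 + 4

def paired_start(seed, n_organisms=1000):
    """Build the starting population for BOTH groups from the same seed.
    Identical starting brains mean any later difference comes from the
    mutation strategy, not from a luckier start."""
    rng = np.random.default_rng(seed)
    start = rng.normal(0.0, 1.0, (n_organisms, GENOME_SIZE))
    return start.copy(), start.copy()   # one independent copy per group

control_start, adaptive_start = paired_start(seed=42)
```

Repeating this with many different seeds (42, 43, ...) produces the independent paired runs used for the statistics.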

Variables (What We Change, What We Measure, What We Keep the Same)

In any good experiment we control what we change and what we measure. Here:

  • What we change (independent variable): The mutation strategy—fixed (control) vs adaptive (experimental).
  • What we measure (dependent variables): (1) How many generations it took to reach Nash equilibrium (our main outcome), and (2) the highest fitness score the population reached (peak fitness).
  • What we keep the same (constants): Population size (1,000), the rules of the petri dish (physics, food, shooting), how we select parents (top 20%), and the shape of the neural network (24 inputs, 64 hidden neurons, 4 outputs). Keeping these the same lets us fairly compare the two mutation strategies.

Why This Matters

This project combines evolution (trial and error over generations), game theory (Nash equilibrium—when no one benefits by changing strategy alone), and neural networks (small artificial brains). Understanding whether adaptive mutation speeds up convergence can help future AI and evolutionary algorithms—for example, in robotics, multi-agent systems, or automated design. The rest of this page explains each part of the experiment in more detail, so you can understand exactly what we did and why.

1. What is a neural network?

A neural network is a computer model inspired by how brain cells work. It is made of many simple "cells" (called neurons) that receive numbers, do simple math, and send numbers to other cells. No one programs the network step-by-step to solve the problem. Instead, we give it a structure—layers and connections—and then we change the strength of those connections (the "weights") through learning or, in our case, evolution.

Think of it like a recipe where we only adjust the amounts of ingredients, not the steps. In this experiment, each organism's "brain" is one small neural network. It is nothing like a human brain in size or complexity, but the same basic idea: numbers go in, math happens, and numbers come out that become actions.

2. How do neural networks work?

The inputs are numbers that represent what the organism "knows"—for example, how far away the nearest food pellet is, or how close the nearest enemy organism is. Since the world wraps around (toroidal geometry), there are no walls—just open space in every direction. These numbers flow through layers. The first layer takes the inputs and multiplies them by learned "weights," adds "biases," and then applies a simple rule (like "if the result is negative, treat it as zero") so the network can learn patterns that are not straight lines. The result becomes the input to the next layer, and so on.

The output layer produces the final numbers. In our experiment, that is four numbers that control thrust, turn, shoot, and split (see Key terms below). The only thing that changes during evolution is the weights and biases; the layout of the network stays the same. If the first weight is big, that input has a big effect on the next layer; if it is small, it has a small effect.

3. Key terms

The following terms are used throughout this overview. They are defined here so you can refer back anytime.

Raycasts

"Raycast" is not a common word—it comes from computer graphics. In this experiment, raycasts are virtual beams or sensors. The organism sends out 8 "beams" in different directions (like headlights or radar). Each beam reports how far the nearest food pellet or other organism is (and sometimes the size of the other organism). So the organism does not "see" pictures; it gets 24 numbers (8 directions × 3 types of data). Since the world wraps around, there are no walls to hit—the raycasts go on forever until they find something or reach maximum range.

The four actions

  • Thrust — The strength of "move forward," from 0 (don't move) to 1 (full power). The organism accelerates in the direction it is facing.
  • Turn — Rotate left or right, from -1 to 1. It changes which direction the organism is facing.
  • Shoot — Fire a projectile. If it hits another organism, the shooter steals some of their energy (that is predation). There is a cooldown: after shooting, the organism must wait a short time before it can shoot again.
  • Split — Trigger reproduction: the organism splits to create offspring. Like shooting, it has a cooldown before it can be used again.

Moving

Moving in the experiment is the result of thrust and the simulation's physics. Each moment (each "tick"), the organism's velocity is updated by its thrust, and its position is updated by its velocity. So moving = thrust + physics; the organism does not teleport.
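One tick of movement might look like the sketch below. The Euler integration step, the drag factor, and the specific constants are assumptions for illustration; only "velocity is updated by thrust, position by velocity, with wrap-around borders" comes from the text.

```python
import math

def tick(x, y, vx, vy, heading, thrust, turn,
         world=100.0, accel=0.5, drag=0.98, turn_rate=0.1, dt=1.0):
    """One simulation tick: thrust updates velocity, velocity updates position.
    Toroidal world: positions are taken modulo the world size (no walls)."""
    heading += turn * turn_rate                       # turn output in [-1, 1]
    vx = (vx + thrust * accel * math.cos(heading)) * drag
    vy = (vy + thrust * accel * math.sin(heading)) * drag
    x = (x + vx * dt) % world                         # wrap around the edges
    y = (y + vy * dt) % world
    return x, y, vx, vy, heading
```

Going off one edge brings the organism back on the other side, which is exactly what the modulo operation implements.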

Fitness Score (in depth)

Fitness Score is a number that measures how well an organism performed during its lifetime in the petri dish. In our experiment, each organism's fitness is calculated with a simple formula:

Fitness = Ticks Survived + Remaining Energy

Ticks survived is how many moments (out of 750 per generation) the organism stayed alive. An organism that survives the entire generation gets 750 points for survival alone. Remaining energy is how much energy the organism still has at the end of the generation. Organisms that collected food efficiently and avoided unnecessary energy loss will have more energy left over.

This means an organism that survives the full generation and ends with lots of energy will have the highest fitness score. For example, an organism that survived all 750 ticks and ended with 150 energy would score 750 + 150 = 900. One that died at tick 300 with 0 energy scores just 300.
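The formula and the worked examples above translate directly into code:

```python
def fitness_score(ticks_survived, remaining_energy, max_ticks=750):
    """Fitness = ticks survived + energy remaining at the end of the generation."""
    assert 0 <= ticks_survived <= max_ticks
    return ticks_survived + remaining_energy

# The worked examples from the text:
print(fitness_score(750, 150))  # survived all 750 ticks, 150 energy -> 900
print(fitness_score(300, 0))    # died at tick 300 with no energy  -> 300
```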

We use fitness scores to select parents (top 20% by score get to reproduce), to set the mutation rate in the experimental group (low fitness = more mutation, high fitness = less), and to measure how well the population did (peak fitness = highest score anyone reached). So fitness score is the single number that drives evolution and our statistical analysis.

Other terms

  • Tick — One moment or step in the simulation (like one frame in a video).
  • Generation — One full round of life, then selection, breeding, and mutation.
  • Cooldown — A wait time before an action (e.g. shoot) can be used again.
  • Metabolism — The organism losing a little energy every tick (like burning calories).
  • Foraging — Getting energy by eating food pellets.
  • Predation — Getting energy by shooting another organism and stealing their energy.
  • Policy entropy — A number that measures how "mixed" or "certain" one organism's decisions are (averaged over the population we get mean policy entropy).
  • Entropy variance — How much organisms differ from each other in that "mixed vs certain" measure; when it is low and stable, the population has settled on a similar mix of strategies (we use this to detect Nash equilibrium).
  • Convergence — The population settling into a stable mix of strategies (Nash equilibrium).
  • Fitness Score — How well an organism did; our primary performance measure.
  • Weights — The numbers inside the neural network that get evolved.
  • Mutation — Randomly changing those weights a little when creating offspring.

4. How does this experiment implement neural networks?

In this experiment, each organism is controlled by a specific neural network architecture described as 24-64-4. This notation describes the shape of its "brain" and how it processes information. You can think of it like a team of workers passing messages down a line.

1. The Inputs: 24 "Eyes" (Sensors)

The network starts with 24 inputs. Imagine the organism has 8 eyes looking in 8 different directions (forward, backward, left, right, and diagonals). Each eye measures exactly 3 things:

  • How far is the nearest Food pellet in this direction?
  • How far is the nearest Enemy organism in this direction?
  • How far is the nearest boundary wrap point? (Since the world is toroidal—it wraps around like the surface of a donut—there are no walls. This value tells the organism how far it is from the wrap-around edge, which affects how distances to food and enemies are measured.)

With 8 directions × 3 measurements each, that gives us the 24 inputs. These numbers are the only thing the organism "sees." Because the world wraps around, organisms cannot hide in corners or against walls—the environment is completely open in every direction.

2. The Hidden Layer: 64 "Thinkers"

The information then travels to a large hidden layer containing 64 neurons.

  • This large group of 64 neurons works together to process the raw input data. They identify patterns (like "food is close") and calculate the best strategy simultaneously. Having many neurons in a single layer allows the network to capture complex relationships between the inputs directly.

3. The Outputs: 4 Actions

Finally, the network produces 4 outputs, which are the instructions sent to the organism's body:

  • Thrust: How hard to push forward (0 to 100%).
  • Turn: Which way to steer (-1 for left, +1 for right).
  • Shoot: Whether to fire a projectile (if this number is high enough, it shoots).
  • Split: Whether to reproduce/split (if high enough).

This entire process repeats every tick, many times per second, allowing the organism to react to its world in real time.
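Here is a minimal NumPy sketch of one 24-64-4 decision. The ReLU hidden layer matches the "if negative, treat it as zero" rule described earlier; the exact output squashing (sigmoid for thrust, tanh for turn, simple thresholds for shoot and split) is an assumption consistent with the ranges above, not a confirmed detail of the platform.

```python
import numpy as np

def forward(inputs, W1, b1, W2, b2):
    """One decision: 24 sensor values in, 4 action values out."""
    hidden = np.maximum(0.0, inputs @ W1 + b1)    # 24 -> 64, ReLU
    raw = hidden @ W2 + b2                        # 64 -> 4
    thrust = 1.0 / (1.0 + np.exp(-raw[0]))        # squashed to [0, 1]
    turn = np.tanh(raw[1])                        # squashed to [-1, 1]
    shoot = raw[2] > 0.0                          # fire if above threshold
    split = raw[3] > 0.0                          # split if above threshold
    return thrust, turn, shoot, split

# Random weights stand in for an evolved brain:
rng = np.random.default_rng(0)
W1, b1 = rng.normal(0.0, 0.1, (24, 64)), np.zeros(64)
W2, b2 = rng.normal(0.0, 0.1, (64, 4)), np.zeros(4)
thrust, turn, shoot, split = forward(rng.normal(0.0, 1.0, 24), W1, b1, W2, b2)
```

Evolution never touches this function; it only changes the numbers in `W1`, `b1`, `W2`, and `b2`.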

We run 1,000 organisms at once. To do this quickly we use a graphics card (GPU) to run all 1,000 brains in parallel—like having 1,000 calculators working at the same time. The software runs on the GPU so we can simulate many generations in a reasonable time.
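The reason a GPU helps is that all 1,000 forward passes can be expressed as a single batched matrix operation. Here is the idea in NumPy; a real GPU implementation would run the same expression on device arrays, across thousands of cores at once.

```python
import numpy as np

# Stack every organism's sensors into one matrix and every organism's
# hidden-layer weights into one 3D array, then do the whole population's
# hidden layer in a single batched multiply.
rng = np.random.default_rng(1)
n = 1000
sensors = rng.normal(0.0, 1.0, (n, 24))      # one 24-vector per organism
W1 = rng.normal(0.0, 0.1, (n, 24, 64))       # each organism has its OWN weights
b1 = np.zeros((n, 64))

# Batched matmul: for each organism i, sensors[i] @ W1[i] -> 64 hidden values.
hidden = np.maximum(0.0, np.einsum('ni,nij->nj', sensors, W1) + b1)
print(hidden.shape)  # (1000, 64)
```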

5. Why did we choose a petri dish?

The petri dish is our controlled mini-world for the experiment. Think of a real petri dish in biology: a simple, closed environment where we can watch life (here, digital organisms) under fixed rules. That helps science because we can repeat the experiment exactly—same rules, same starting conditions—and change only one thing: how much we mutate the "genes" (weights) of the neural networks. So we can fairly compare two strategies.

The world is 2D (flat, like a tabletop) and continuous (organisms can be anywhere, not just on a grid), with wrap-around borders (toroidal geometry): going off one edge brings you back on the other side. This means there are no walls or corners to hide in—organisms must survive in the open. The physics are simple (movement and collisions) so the computer can simulate thousands of organisms without extra complexity. The petri dish is our lab bench—simple, repeatable, and designed so we can learn about evolution and mutation, not about the environment.

6. What are the organisms?

The organisms (also called agents) are digital creatures represented as circles moving in the 2D petri dish. Each has energy—like health or fuel. They lose a little energy every moment (metabolism, like burning calories just to stay alive) and gain energy in two ways: by eating food (static pellets that give a set amount of energy) or by predation (shooting a projectile at another organism to steal some of their energy).

Foraging is safer but can be slow; predation is riskier but can yield big gains. They have no hands or eyes; their only "senses" are the numbers from the raycasts and their own state, and their only "actions" are the four outputs (thrust, turn, shoot, split). No one programmed them to "go toward food" or "avoid enemies"—they only have a brain (neural network) that turns what they sense into actions. Over time, organisms that keep their energy high survive and reproduce; others die out.

7. How do their neural network brains work and what are their motivations?

Each moment (tick), every organism gets a list of 24 numbers: from 8 directions, how far to the nearest food and enemy (and sometimes enemy size), plus a few numbers about itself (energy level, speed, whether it is on cooldown for shooting or splitting). That list is the input to its neural network. The network outputs 4 numbers that control thrust, turn, shoot, and split.

No one programmed the organisms to "go toward food" or "avoid enemies." The network just has weights that get evolved; any "strategy" we see (foraging, fleeing, attacking) emerges from which organisms had more offspring. So their motivation is not written in code; it is implicit: organisms that by chance behave in ways that keep energy high get to reproduce, so over many generations the population tends to act in ways that help survival. We measure their success with a fitness score (see Key terms); higher fitness means the organism survived longer and ended with more energy. Think of it like nature selecting the best survivors.

8. Why do they act the way they do?

We do not tell the organisms how to behave. We only select the best performers (top 20% by fitness score), copy their neural network weights to create offspring, and randomly change (mutate) those weights a little. So "why they act the way they do" is: their brains were shaped by many generations of trial and error.

Organisms that happened to have weights that led to good survival and reproduction left more copies; bad strategies died out. It is like breeding dogs for speed—we did not design the legs; we just kept the fastest and over time they got faster. At the start, behavior is almost random; after many generations we often see recognizable strategies (some organisms forage, some attack) because those strategies won in the petri dish. They act the way they do because evolution favored those behaviors in this environment.

9. How do we conduct the experiment?

We run two groups of experiments, identical in every way except how much we mutate the offspring.

  • Control group — We use a fixed mutation amount (we always change the weights by the same small random amount), like flipping a coin the same way every time.
  • Experimental group — We use adaptive mutation: we change the weights more when the parent did poorly and less when the parent did well. Struggling organisms get more random changes (more chance to try something new); successful ones get fewer changes (we keep what works).

For each group we start with 1,000 random neural networks, run the petri dish for many generations (each generation = one round of life, selection, breeding, and mutation), and we stop when the population's behavior stabilizes—meaning the mix of strategies stops changing much. We call that approaching a Nash equilibrium (see next section). We record when that happened (which generation) and how well the population did (peak fitness). Then we compare the two groups: did the adaptive-mutation group reach stability faster? We use statistics to check if the difference is real or just luck.
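The single difference between the groups can be isolated into one function. The inverse-fitness scaling shown here is an assumed form; the text specifies only "more mutation for low fitness, less for high."

```python
def mutation_sigma(parent_fitness, pop_max_fitness,
                   base_sigma=0.05, adaptive=True):
    """How much random change an offspring's weights receive.
    Control group: always base_sigma.
    Experimental group: scales inversely with the parent's fitness
    (the exact linear form here is an illustrative assumption)."""
    if not adaptive:
        return base_sigma
    f = parent_fitness / max(pop_max_fitness, 1e-9)   # normalize to [0, 1]
    return base_sigma * (1.0 - f)                     # low fitness -> more change
```

For example, with a population maximum of 900, the best parent's offspring receive almost no mutation while a zero-fitness parent's offspring receive the full `base_sigma`.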

10. What is game theory? What is Nash equilibrium? Why is it the key metric?

Game theory is the study of situations where multiple decision-makers (players) choose actions, and each person's outcome depends not only on their own choice but on what others do. Think of two people dividing a pizza: if you ask for more, the other might take less; your best choice depends on what you think they will do. In our experiment, the "players" are the organisms. Each one chooses how to behave (forage, attack, flee) based on its neural network, and its success (energy, survival, reproduction) depends on what the other 999 are doing. So the petri dish is a "game" in the game-theory sense.

Nash equilibrium is a situation where no one can improve their outcome by changing their strategy alone, given what everyone else is doing. It is named after the mathematician John Nash. At a Nash equilibrium, if you are the only one who switches from "forage" to "attack," you don't do better—so no one has a reason to switch. It describes a stable outcome: everyone is doing the best they can given what others do.

In our experiment, each organism has a strategy (the way its brain turns inputs into actions). The population has a mix of strategies. We say the population has reached a Nash-like equilibrium when the mix of strategies stops changing from generation to generation: the kinds of behavior have settled into a stable balance. At that point, no organism would do better by behaving differently, given how the rest of the population is behaving. We detect this by watching entropy variance—how much the organisms differ from each other in how "mixed" or "certain" their decisions are. When everyone is behaving similarly, that difference drops and stays low; when it stays low for many generations in a row, we treat that as having reached Nash equilibrium.

Why Nash equilibrium is the key metric: Our hypothesis is that adaptive mutation helps the population reach Nash equilibrium faster than fixed mutation. So the key metric is how many generations it takes to reach Nash equilibrium—that is our primary outcome. If the adaptive-mutation group reaches Nash equilibrium in fewer generations than the control group, that supports the hypothesis. Nash equilibrium is not just a fancy name for "they settled down"—it is the specific, stable outcome from game theory that we use to define "settled," and the generation at which we reach it is the main number we use to test our hypothesis.

11. How we detect Nash equilibrium (technical)

Detection criterion. Nash equilibrium is detected using entropy variance across the population, not mean policy entropy. For each generation we compute a scalar policy entropy per agent (expected entropy of the action distribution over a fixed set of sample inputs). The entropy variance is the variance of those per-agent entropies across the population.

Why variance rather than mean entropy. Mean policy entropy indicates how mixed or deterministic the average policy is, but it does not measure population-level homogeneity. At equilibrium we require that the strategy mix has stabilized—i.e., that agents no longer differ substantially in behavior. That corresponds to low variance of policy entropy across agents: when all agents have similar entropies, the population has converged to a homogeneous strategy mix. We therefore define convergence as the generation at which entropy variance falls below a threshold and remains below it for a fixed stability window (after an initial phase of divergence), with a post-convergence buffer to confirm stability.
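The detection criterion above can be sketched directly in code. The entropy formula is standard Shannon entropy; the threshold and window values are placeholders, not the experiment's actual settings.

```python
import numpy as np

def policy_entropy(action_probs):
    """Shannon entropy (in nats) of one agent's action distribution."""
    p = np.clip(np.asarray(action_probs, dtype=float), 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

def converged(entropy_var_history, threshold=0.01, window=10):
    """Convergence = entropy variance stays below the threshold for
    `window` consecutive generations (stability window)."""
    if len(entropy_var_history) < window:
        return False
    return all(v < threshold for v in entropy_var_history[-window:])

# One generation's measurement: the VARIANCE of per-agent entropies
# across the population, not the mean entropy itself.
agent_probs = np.random.default_rng(2).dirichlet(np.ones(4), size=1000)
entropies = [policy_entropy(p) for p in agent_probs]
entropy_variance = float(np.var(entropies))
```

A generation-by-generation list of `entropy_variance` values fed into `converged` yields the convergence generation, which is the primary outcome compared between groups.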

12. Why do we need GPU workers?

We have 1,000 organisms, each with a neural network that does many multiplications every moment, and we run hundreds of generations. Doing that on an ordinary computer (CPU) would take a very long time—hours or days. A GPU (graphics card) is built to do thousands of simple math operations at once (originally for drawing graphics). We use it to run all 1,000 brains in parallel—like having 1,000 people each do one multiplication at the same time instead of one person doing 1,000.

Workers are the computers that have the GPU and actually run the simulation. The website you see is the "controller"; it sends the experiment settings to a worker, the worker runs the petri dish and evolution on its GPU, and sends the results back. So we need GPU workers to finish the experiment in a reasonable time and to separate the heavy computation (worker) from the interface and storage (web app). Think of the worker as a lab technician who runs the experiment and mails back the data.

13. What are we measuring?

The primary metric for proving our hypothesis is how many generations it takes to reach Nash equilibrium (convergence velocity). The other metrics (peak fitness, policy entropy) support the analysis, but convergence to Nash is the key outcome we compare between the two groups.

Important detail: Fitness score tells us how well organisms did, while entropy variance tells us how similar their decision-making styles are. Two groups could have similar fitness but still behave differently. That is why we track both performance and behavior.

  • Convergence velocity ("when did they reach Nash equilibrium?") — We record the generation number at which the population's behavior becomes stable: the variety of strategies (who forages, who attacks) stops changing much from generation to generation. We check this using entropy variance—how much the organisms differ from each other in how mixed or certain their decisions are. When that difference is small and stays small for many generations, everyone is behaving similarly and we say we have reached a Nash-like equilibrium. So convergence velocity = how many generations it took to get there. Faster convergence = fewer generations.
  • Peak fitness ("how good did they get?") — We record the highest fitness score that any organism (or the population) reached (see Key terms for how we calculate it). This tells us how well the evolved strategies performed in the petri dish.
  • Policy entropy ("how predictable are one organism's decisions?") — This number tells us whether an organism is still experimenting (high entropy) or has settled on a stable style (low entropy). We look at the variance of that number across all organisms to detect equilibrium: when the variance is low, everyone is similar; when it stays low for many generations, we have reached Nash equilibrium.

We are measuring how fast the population stabilizes and how well it does, and we compare these between the control and experimental groups.

14. Why is this relevant for the future of AI, and how could it be expanded?

This experiment sits at the intersection of three powerful fields: evolutionary computing (improving AI through trial and error over generations), game theory (understanding strategic decision-making when multiple agents interact), and neural networks (giving agents brains that can process information and make decisions). That combination isn't just academic—it has direct applications to some of the most important challenges facing humanity.

Real-World Applications

  • Autonomous vehicles and robotics: Self-driving cars must constantly make decisions in a world full of other drivers, pedestrians, and cyclists—all making their own decisions simultaneously. This is exactly the kind of multi-agent game theory problem our experiment studies. If adaptive mutation helps populations of AI agents find stable strategies faster, that same principle could help robot fleets (warehouse robots, delivery drones, self-driving trucks) learn to coordinate efficiently without crashing into each other.
  • Economics and market design: Stock markets, auctions, and supply chains are all systems where many agents (traders, companies, consumers) interact strategically. Economists use Nash Equilibrium to predict market outcomes. Our experiment tests whether there are faster ways to find these equilibria—which could help design fairer auction systems, more efficient markets, or better pricing algorithms.
  • Drug discovery and protein design: Pharmaceutical companies use evolutionary algorithms to search through billions of possible molecular structures to find effective drugs. The question of "how much should we mutate?" is directly relevant—adaptive mutation could help these searches converge on promising drug candidates faster, potentially saving years of research time.
  • Climate and resource management: Managing shared resources (fisheries, forests, water supplies) involves many stakeholders making independent decisions. Game theory helps model these "tragedy of the commons" situations. Understanding how populations converge to stable strategies could inform policies that help communities reach sustainable equilibria faster.
  • Multi-agent AI systems: As AI becomes more common, we increasingly have situations where multiple AI systems interact—chatbots negotiating, trading algorithms competing, or recommendation systems influencing each other. Understanding how populations of AI agents reach equilibrium is crucial for ensuring these systems behave predictably and safely.
  • Cybersecurity: Attackers and defenders in cybersecurity are engaged in a constant strategic game. Evolutionary approaches to security (where defense strategies evolve in response to attacks) could benefit from adaptive mutation to find robust defense strategies more quickly.

Why Adaptive Mutation Matters Beyond This Experiment

The core question—"should we change things more when they're not working and less when they are?"—is fundamental to optimization everywhere. Currently, most evolutionary algorithms use fixed mutation rates, which is like always turning every knob by the same amount regardless of whether you're close to a good solution or far away. If our experiment demonstrates that adaptive mutation accelerates convergence, that finding could be applied to:

  • Training neural networks more efficiently (reducing the enormous energy cost of training large AI models)
  • Optimizing engineering designs (aircraft wings, circuit layouts, antenna shapes) faster
  • Evolving game-playing AI (like AlphaGo's successors) with less computational cost
  • Accelerating scientific simulations that use evolutionary search methods

How This Experiment Could Be Expanded

This kind of experiment could grow in many directions. We could use bigger or more complex worlds (3D environments, multiple food types, predator-prey ecosystems), larger populations or bigger neural networks, or entirely different mutation strategies (for example, letting the organisms learn their own mutation rate over time). We could also introduce cooperation (organisms that work together to hunt) or communication (organisms that signal to each other), creating richer game-theoretic scenarios. The goal is to show that evolutionary game-theoretic experiments can scale from a science fair project to tools that benefit both scientific research and real-world industry.