MonarchofCoding/EvoCoT-Prototype


Evo-CoT: Evolutionary Optimization of Chain-of-Thoughts

This repository contains code for the Evo-CoT framework, which uses staged evolutionary algorithms to generate, align, and correct chain-of-thought (CoT) reasoning exemplars. The framework is designed to explore novel reasoning patterns, refine them for problem alignment, and select top-quality CoTs using LLM-based correction.


  1. Installation

1.1 Python Version

Python >= 3.10 recommended.

1.2 Dependencies

Install required packages using pip:

pip install -r requirements.txt

Key dependencies include:

numpy – numerical computations

scipy – scientific operations (optional)

matplotlib – plotting results

tqdm – progress bars

transformers – LLM alignment and evaluation (for Stage 2/3)

torch – PyTorch backend for LLM inference

(Optional: jupyter or ipython for interactive experiments)


  2. Dataset / Population Initialization

The framework requires an initial CoT population stored as JSON (population.json).

Each entry must include:

problem : problem statement

cot : initial chain-of-thought

answer : ground truth answer (optional for exploration)

Download Instructions:

If using a benchmark dataset (e.g., GSM8K, MATH, or custom problems), preprocess into the above JSON format.

Example JSON snippet:

[
  {
    "problem": "In a class of 40 students, 80% have puppies. 25% of those also have parrots. How many have both?",
    "cot": "First calculate the number of students with puppies. Then compute the subset with parrots.",
    "answer": "8"
  },
  ...
]
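Entries in this format can be loaded and sanity-checked with a short helper. The sketch below is illustrative; the function name `load_population` is an assumption and not part of the repository:

```python
import json

def load_population(path="population.json"):
    """Load a CoT population and verify the required keys are present."""
    with open(path) as f:
        population = json.load(f)
    for i, entry in enumerate(population):
        # "answer" is optional during exploration, so only these two are required
        for key in ("problem", "cot"):
            if key not in entry:
                raise ValueError(f"entry {i} is missing required key '{key}'")
    return population
```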


  3. Running Experiments

3.1 Stage 1: Exploration

python stage1_exploration.py

Generates diverse CoTs using meta-heuristics, semantic-preserving mutations, and crossovers.

Logs fitness, diversity, and generation statistics.
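The generate-mutate-crossover loop can be pictured with a toy sketch. The operators below (`mutate`, `crossover`, `evolve`) are illustrative assumptions, not the actual implementation in stage1_exploration.py:

```python
import random

def mutate(cot, rng):
    # Toy "semantic-preserving" mutation: swap one discourse connective
    # for an equivalent word, leaving the reasoning content unchanged.
    swaps = {"First": "Initially", "Then": "Next"}
    words = cot.split()
    candidates = [i for i, w in enumerate(words) if w in swaps]
    if candidates:
        i = rng.choice(candidates)
        words[i] = swaps[words[i]]
    return " ".join(words)

def crossover(a, b, rng):
    # One-point crossover over sentence boundaries of two CoTs.
    sa, sb = a.split(". "), b.split(". ")
    cut = rng.randint(0, min(len(sa), len(sb)))
    return ". ".join(sa[:cut] + sb[cut:])

def evolve(population, fitness_fn, generations=5, seed=0):
    # Minimal generational loop: score, keep the top half as parents,
    # then rebuild the population from mutated crossovers.
    rng = random.Random(seed)
    for _ in range(generations):
        scored = sorted(population, key=fitness_fn, reverse=True)
        parents = scored[: max(2, len(scored) // 2)]
        population = [
            mutate(crossover(*rng.sample(parents, 2), rng), rng)
            for _ in range(len(population))
        ]
    return population
```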

3.2 Stage 2: Alignment

python stage2_alignment.py

Aligns top Stage 1 CoTs to their respective problems using LLM guidance.

No evolution occurs here; purely alignment and structural refinement.

3.3 Stage 3: Correction & Ranking

python stage3_correction.py

Uses LLM-based scoring to assign correctness fitness.

Ranks and selects Top-K CoTs.
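Top-K selection from the LLM scores can be sketched as follows; `select_top_k` is a hypothetical helper name, and the scores are assumed to be precomputed:

```python
def select_top_k(cots, scores, k=3):
    """Rank CoTs by an LLM-assigned correctness score and keep the Top-K."""
    ranked = sorted(zip(cots, scores), key=lambda pair: pair[1], reverse=True)
    return [cot for cot, _ in ranked[:k]]
```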


  4. Reproducibility Notes

Random seeds are set in all stages, but LLM-based alignment may introduce non-determinism.

Stage 1 results can vary slightly depending on mutation and crossover operations.

Save population snapshots (population_stage1_genX.json) to resume experiments or compare intermediate results.
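Seeding across the libraries listed in the dependencies might look like the sketch below. `set_seeds` is a hypothetical helper, not the repository's code; the torch and numpy seeding is guarded so the helper also runs without those packages installed:

```python
import random

def set_seeds(seed=42):
    """Seed the RNGs used across stages. Note that LLM decoding can
    remain non-deterministic (e.g. sampling on GPU) even when seeded."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
    except ImportError:
        pass
```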


  5. Cost & Computational Considerations

Stage 1 with a 2,000-member population evolved over 80 generations is computationally intensive (on the order of 160,000 fitness evaluations in total).

LLM-based alignment and correction (Stage 2/3) can be GPU-accelerated for efficiency.

Suggested workflow for budgeted experiments:

  1. Run smaller populations or fewer generations for prototype testing.

  2. Run full-scale experiments on high-memory GPU nodes for final results.

Track elapsed time, mutation/crossover counts, and population diversity to monitor experiment efficiency.
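A minimal tracker for those statistics might look like this; `RunStats` is a hypothetical name, and a diversity measure would plug in alongside the counters:

```python
import time

class RunStats:
    """Track elapsed time and operator counts during a Stage 1 run."""

    def __init__(self):
        self.start = time.perf_counter()
        self.mutations = 0
        self.crossovers = 0

    def elapsed(self):
        return time.perf_counter() - self.start

    def summary(self):
        return (f"elapsed={self.elapsed():.1f}s "
                f"mutations={self.mutations} crossovers={self.crossovers}")
```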


  6. Plotting Results

Use matplotlib to visualize fitness trends across generations:

import matplotlib.pyplot as plt

plt.plot(generations, avg_fitness, label='Average Fitness', color='blue')
plt.plot(generations, best_fitness, label='Best Fitness', color='red')
plt.xlabel('Generation')
plt.ylabel('Fitness')
plt.title('Stage 1 Fitness Evolution')
plt.legend()
plt.savefig('stage1_fitness_plot.png')
plt.show()

Upload .png or .pdf images to Overleaf for paper figures.

