Skip to content

Conversation

@init27
Copy link
Contributor

@init27 init27 commented Nov 2, 2025

This idea came super thanks to @cpich3g from our Hackathon.
We are continuing here but thanks again Justin for leading and inspiring!

Adds a complete Pokemon battle environment integration using poke-env and Pokemon Showdown, following OpenEnv's HTTP-based framework patterns.

Design Decisions

Event Loop Management

  • poke-env runs on dedicated POKE_LOOP background thread
  • FastAPI runs on main uvicorn event loop
  • Proper async synchronization via asyncio.run_coroutine_threadsafe()
  • Lazy initialization of asyncio.Event objects on correct loop

Memory Management

  • Automatic battle cleanup every 10 episodes (prevents accumulation)
  • Previous battle task cancellation on reset
  • Stable memory usage over 100+ episodes

Concurrency

  • Single lock for both reset() and step() operations
  • Prevents race conditions in battle state access
  • Thread-safe action queuing

Error Handling

  • Illegal move validation with fallback to random legal action
  • Battle state validation before step execution
  • Comprehensive None checks for Pokemon attributes
  • Graceful handling of finished battles

File Structure

src/envs/pokemon_env/
├── init.py # Package exports
├── models.py # Action, Observation, State dataclasses
├── client.py # HTTPEnvClient implementation
├── README.md # Environment documentation
└── server/
├── init.py
├── app.py # FastAPI application
├── pokemon_environment.py # Core environment logic
├── Dockerfile # Container with Showdown + OpenEnv
└── requirements.txt # Dependencies

examples/project-pikachu/
├── test_local_pokemon.py # Direct environment tests (6 tests)
├── test_http_pokemon.py # HTTP client tests (6 tests)
└── [documentation files] # Testing guides and architecture docs

Reward Modes

Sparse (default):

  • +1.0 for win, -1.0 for loss, 0.0 otherwise

Dense:

  • +0.2 per opponent Pokemon fainted
  • -0.2 per own Pokemon fainted
  • +0.05 per HP% damage dealt
  • +0.5 bonus for win, -0.5 penalty for loss

Credits

@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Nov 2, 2025
@init27 init27 requested a review from zkwentz November 2, 2025 03:57
@github-actions
Copy link

github-actions bot commented Nov 2, 2025

⚠️ Deployment failed for pokemon_env

  • Space repo:
  • Live URL:

Please resolve your environment.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

@init27 init27 requested a review from Darktex November 2, 2025 03:57
Added observations on popular choices and practices in Pokemon-RL projects.
@github-actions
Copy link

github-actions bot commented Nov 2, 2025

✅ Deployment succeeded for pokemon_env

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

Add observations section to Readme
@github-actions
Copy link

github-actions bot commented Nov 2, 2025

✅ Deployment succeeded for pokemon_env

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

cpich3g and others added 5 commits November 3, 2025 11:55
- Introduced a comprehensive tutorial for using the OpenEnv Pokemon Battle Environment.
- Included sections on setup, battle mechanics, observations, available moves, and actions.
- Demonstrated a complete battle simulation and a simple strategy agent.
- Ensured proper connection to the environment and handling of battle states.
- Created a Jupyter notebook tutorial for the Pokemon Battle Environment, covering setup, battle mechanics, and agent strategies.
- Added a shell script to test the Pokemon environment Docker image, ensuring services are running and performing basic environment interactions.
Updated Pokemon_Env & added a test notebook
@github-actions
Copy link

github-actions bot commented Nov 3, 2025

✅ Deployment succeeded for pokemon_env

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

Added system requirements, hardware recommendations, setup steps, and common issues with fixes for the Basic Pokemon RL Training environment.
Updated the checklist for setting up the Pokémon RL training environment, including system requirements, hardware recommendations, setup instructions, and known issues.
Removed unnecessary comments and cleaned up formatting.
Added instructions for forking the Pokemon-Showdown repository and noted compatibility issues with random battle generation.
@github-actions
Copy link

github-actions bot commented Nov 5, 2025

✅ Deployment succeeded for pokemon_env

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

1 similar comment
@github-actions
Copy link

github-actions bot commented Nov 5, 2025

✅ Deployment succeeded for pokemon_env

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

Created a  Basic Setup and Installation.md file
@github-actions
Copy link

github-actions bot commented Nov 5, 2025

✅ Deployment succeeded for pokemon_env

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

@github-actions
Copy link

github-actions bot commented Nov 5, 2025

✅ Deployment succeeded for pokemon_env

Nice work! Wait for a code review and we're ready to go.

You can iterate locally or validate fixes by running scripts/deploy_to_hf.sh --env "pokemon_env".

@Darktex Darktex requested a review from Copilot November 7, 2025 18:19
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR adds a comprehensive Pokemon battle environment integration to OpenEnv using poke-env and Pokemon Showdown. The implementation provides HTTP-based access to Pokemon battles for reinforcement learning and AI training.

  • Complete Pokemon environment integration with poke-env library
  • Docker containerization with Pokemon Showdown server and OpenEnv API server
  • Client library for HTTP-based environment interaction
  • Comprehensive test suite and documentation

Reviewed Changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 24 comments.

Show a summary per file
File Description
src/envs/pokemon_env/server/pokemon_environment.py Core environment implementation with async/event loop synchronization
src/envs/pokemon_env/client.py HTTP client for Pokemon environment
src/envs/pokemon_env/models.py Data models for actions, observations, and state
src/envs/pokemon_env/server/app.py FastAPI server application
src/envs/pokemon_env/server/Dockerfile Multi-service Docker image configuration
src/envs/pokemon_env/test_enhancements.py Comprehensive test suite
examples/project-pikachu/test_*.py Integration and HTTP client tests
examples/project-pikachu/TESTING.md Testing guide and troubleshooting

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

if obs.available_switches and random.random() < 0.2 and active_hp < 0.3:
switch_idx = random.choice(range(len(obs.available_switches)))
action = PokemonAction(action_type="switch", action_index=switch_idx)
action_desc = f"Switch to {obs.team[switch_idx + 1].species.title()}"
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Array index mismatch: switch_idx is selected from available_switches (line 498), but used to index obs.team[switch_idx + 1] (line 500). The available_switches list contains indices into the team, not raw team indices. This should be obs.team[obs.available_switches[switch_idx]] to correctly map from the chosen switch index to the actual team member.

Suggested change
action_desc = f"Switch to {obs.team[switch_idx + 1].species.title()}"
action_desc = f"Switch to {obs.team[obs.available_switches[switch_idx]].species.title()}"

Copilot uses AI. Check for mistakes.
dynamax: Whether to dynamax this turn (if applicable)
terastallize: Whether to terastallize this turn (if applicable)
"""
action_type: Literal["move", "switch"] = "move"
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The action_type field doesn't include 'forfeit' or 'default' as valid options, but the pokemon_environment.py implementation handles these action types (lines 142, 193). Consider adding these to the Literal type for consistency: Literal[\"move\", \"switch\", \"forfeit\", \"default\"].

Suggested change
action_type: Literal["move", "switch"] = "move"
action_type: Literal["move", "switch", "forfeit", "default"] = "move"

Copilot uses AI. Check for mistakes.
except ImportError as e:
raise ImportError(
"poke-env is not installed. "
"Please install it with: pip install poke-env"
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corrected command formatting. The installation command should use backticks for code formatting consistency: 'Please install it with: pip install poke-env'

Suggested change
"Please install it with: pip install poke-env"
"Please install it with: `pip install poke-env`"

Copilot uses AI. Check for mistakes.
# Otherwise, use a move
elif obs.available_moves:
move_idx = random.choice(range(len(obs.available_moves)))
move_name = obs.available_moves[move_idx]
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

obs.available_moves is a list of integers (move indices, not move names) based on the PokemonObservation model definition. Using it directly as move_name will result in displaying an integer instead of the actual move name. To get the move name, use: obs.active_pokemon.moves[obs.available_moves[move_idx]]['id'] if available.

Suggested change
move_name = obs.available_moves[move_idx]
move_index = obs.available_moves[move_idx]
move_name = obs.active_pokemon.moves[move_index]['id'] if obs.active_pokemon and obs.active_pokemon.moves and move_index < len(obs.active_pokemon.moves) else str(move_index)

Copilot uses AI. Check for mistakes.
Comment on lines +34 to +36
mega_evolve: bool = False
dynamax: bool = False
terastallize: bool = False
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing documentation for the z_move field. While mega_evolve, dynamax, and terastallize are documented in the docstring, z_move is referenced in pokemon_environment.py but not included in the PokemonAction dataclass or its docstring. Either add the field or remove references to it in the implementation.

Copilot uses AI. Check for mistakes.
"""

import sys
import os
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'os' is not used.

Suggested change
import os

Copilot uses AI. Check for mistakes.
import sys
import os
import argparse
import time
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'time' is not used.

Suggested change
import time

Copilot uses AI. Check for mistakes.
"""

import sys
import os
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'os' is not used.

Suggested change
import os

Copilot uses AI. Check for mistakes.

import sys
import os
import time
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'time' is not used.

Suggested change
import time

Copilot uses AI. Check for mistakes.
sys.path.insert(0, str(project_root / "src"))

# Import models and environment
from envs.pokemon_env.models import PokemonAction, PokemonObservation, PokemonState
Copy link

Copilot AI Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'PokemonObservation' is not used.
Import of 'PokemonState' is not used.

Suggested change
from envs.pokemon_env.models import PokemonAction, PokemonObservation, PokemonState
from envs.pokemon_env.models import PokemonAction

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leaving my manual review here and claude review coming next...

action_index: int = 0
move_id: Optional[str] = None
switch_pokemon: Optional[str] = None
mega_evolve: bool = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these be inside Action type then?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or like, these should be "can_mega_evolve", no?

ability: Optional[str]
item: Optional[str]

attack: int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't get to see the opponent pokemon's stats ever though

active_pokemon: Optional[PokemonData] = None
opponent_active_pokemon: Optional[PokemonData] = None
team: List[PokemonData] = field(default_factory=list)
opponent_team: List[PokemonData] = field(default_factory=list)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need a placeholder for opponent pokemon that you have not yet seen. eg at the very start of the battle you see that the other guy has 6 pokemon, then they launch the first and you see that he has 5 more but you don't know which ones

turn: int = 0
forced_switch: bool = False

can_mega_evolve: bool = False
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are already there in the PokemonData of the active pokemon

def _compute_reward(self, battle, done: bool) -> float:
"""Compute reward based on reward_mode."""

if self.reward_mode == "sparse":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not do magic numbers here, these should be configurable

# Final outcome bonus
if done:
if battle.won:
reward += 0.5
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(You probably want something more like 10 for this)

self._last_player_fainted = player_fainted

# Small reward for opponent HP damage
if battle.opponent_active_pokemon and hasattr(battle.opponent_active_pokemon, 'current_hp_fraction'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should likely add a reward for statuses. Paralyzed, Toxin, Spikes, Leech Seed etc.

There should be another reward for REMOVING statuses. What made Blissey super strong was that she could learn Aromatherapy which removes status debuffs on your entire team.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also add a reward for buffs which are a big deal in this game (we banned using Curse on Snorlax back in the day because he became ridiculously strong). Marowak is one Swords Dance away from 999 Attack which one-shots most pokemon in the game. Buffs are heavily quantized, eg Dragon Dance is +1 Atk +1 Spd while Swords Dance is +2 Atk and Belly Drum is +4 Atk (but you lose 50% hp!). Each +1 is +50%.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, there should be a reward for cleaning buffs. Skarmory was played for this reason: he could Whirlwind out your pokemon (buffs do not survive a switch unlike statuses)

# Small reward for opponent HP damage
if battle.opponent_active_pokemon and hasattr(battle.opponent_active_pokemon, 'current_hp_fraction'):
current_hp = battle.opponent_active_pokemon.current_hp_fraction
if current_hp is not None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also HP are often recovered. Moves like Recover, etc. There should be another reward for recovery of HP

Copy link
Contributor

@Darktex Darktex left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Summary

This PR adds a comprehensive Pokemon battle environment integration using poke-env and Pokemon Showdown. The implementation demonstrates excellent async synchronization, proper memory management, and strong test coverage. The architecture follows OpenEnv patterns well.

Highlights ✅

  • Excellent async event loop bridging between FastAPI and poke-env's POKE_LOOP
  • Memory leak prevention with automatic battle cleanup
  • Comprehensive error handling with fallback to legal actions
  • Dual reward modes (sparse/dense) for different RL approaches
  • Strong test coverage (needs relocation)
  • Thread-safe with proper locking

Critical Issues 🔴

Two blocking issues must be fixed before merge:

  1. Missing Copyright Headers: All Python files in src/envs/pokemon_env/ need Meta copyright headers
  2. Tests in Wrong Location: Tests should be in tests/envs/pokemon_env/, not examples/

Important Issues 🟡

  1. Test Suite Integration: Tests need proper pytest structure and CI/CD integration
  2. Dockerfile Organization: Multiple Dockerfiles need clarification (see proposal below)
  3. Hard-coded Port: Dockerfile should respect PORT environment variable
  4. Error Messages: Add context to observation metadata for better debugging
  5. Documentation Cleanup: Remove development artifacts from examples (see proposal below)

Minor Issues 🟢

Issues #8-12 are accepted as minor improvements for future consideration.


Proposals

Proposal for Issue #4: Dockerfile Organization

Current situation: Three Dockerfiles exist:

  • Dockerfile - Combines both Showdown + OpenEnv with supervisor ✅
  • Dockerfile.showdown - Standalone Showdown
  • Dockerfile.pokemonenv - Standalone OpenEnv

Recommendation: Keep the main Dockerfile (it's well-designed) and remove Dockerfile.showdown and Dockerfile.pokemonenv because:

  1. The main Dockerfile already handles everything with multi-stage build
  2. Having multiple Dockerfiles creates confusion about which to use
  3. The README should document the single, consolidated approach
  4. Separate containers are rarely needed for this use case (Showdown must run alongside the env)

Alternative (if separate builds are truly needed): Keep all three but:

  • Rename them clearly: Dockerfile, Dockerfile.showdown-standalone, Dockerfile.env-only
  • Add clear documentation in README explaining when to use each
  • Add a decision tree: "Use main Dockerfile unless you need X"

My recommendation: Remove the extra Dockerfiles. The all-in-one approach is simpler and matches the architecture (two services that must communicate).


Proposal for Issue #7: Documentation Cleanup

Current state: examples/project-pikachu/ contains:

  • test_local_pokemon.py (move to tests) ✓
  • test_http_pokemon.py (move to tests) ✓
  • BasicSetupCheckList.md - Development artifact ❌
  • IMPLEMENTATION_SUMMARY.md - Development artifact ❌
  • Poke-Env.MD - Library documentation (external) ❌
  • Readme.MD - Duplicate of main README ❌
  • TESTING.md - Internal testing guide ❌

Recommendation: Remove entire examples/project-pikachu/ directory and instead:

  1. Create examples/pokemon_env_example.py with a simple, documented example:
#!/usr/bin/env python3
"""
Simple Pokemon Battle Environment Example.

This example shows how to use the Pokemon environment with OpenEnv.

Prerequisites:
    docker run -p 8000:8000 -p 9980:9980 pokemon-env:latest

Usage:
    python examples/pokemon_env_example.py
"""

from envs.pokemon_env import PokemonEnv, PokemonAction
import random

def main():
    # Connect to server
    client = PokemonEnv(base_url="http://localhost:9980")
    
    # Reset environment
    result = client.reset()
    print(f"Battle started! Active Pokemon: {result.observation.active_pokemon.species}")
    
    # Battle loop
    step = 0
    while not result.done and step < 100:
        # Choose random legal action
        if result.observation.available_moves:
            action = PokemonAction(
                action_type="move",
                action_index=random.choice(result.observation.available_moves)
            )
        elif result.observation.available_switches:
            action = PokemonAction(
                action_type="switch",
                action_index=random.choice(result.observation.available_switches)
            )
        else:
            break
        
        result = client.step(action)
        step += 1
        
        if step % 10 == 0:
            print(f"Turn {step}: Reward={result.reward:.2f}")
    
    print(f"\nBattle finished in {step} turns")
    print(f"Final reward: {result.reward}")
    state = client.state()
    print(f"Winner: {state.battle_winner}")

if __name__ == "__main__":
    main()
  1. Merge useful content into main README:

    • Architecture details from IMPLEMENTATION_SUMMARY.md
    • Any unique troubleshooting from TESTING.md
  2. Benefits:

    • Single source of truth (README.md)
    • Clear, runnable example for users
    • No confusion about what's documentation vs development artifacts
    • Cleaner repository structure

Overall: This is a high-quality implementation. Once copyright headers are added and tests are relocated, this will be an excellent addition to OpenEnv! 🎉

@@ -0,0 +1,127 @@
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL: Missing Copyright Header

All Python files in the OpenEnv repository must include the Meta copyright header. This file and all other Python files in src/envs/pokemon_env/ are missing this required header.

Required header:

# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

Files needing headers:

  • models.py
  • client.py
  • __init__.py
  • server/app.py
  • server/pokemon_environment.py
  • server/__init__.py

Please add this header to the top of each file before the module docstring.

@@ -0,0 +1,310 @@
#!/usr/bin/env python3
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 CRITICAL: Tests in Wrong Location

Tests should be located in tests/envs/pokemon_env/, not in examples/. The examples directory is for user-facing demonstration code, not test suites.

Required changes:

  1. Create tests/envs/pokemon_env/ directory
  2. Move test_local_pokemon.pytests/envs/pokemon_env/test_environment.py
  3. Move test_http_pokemon.pytests/envs/pokemon_env/test_client.py
  4. Remove manual sys.path manipulation (tests should use proper imports)
  5. Follow pytest conventions and integrate with existing test infrastructure

Reference: See tests/envs/atari_env/ or tests/envs/echo_env/ for proper structure.


# Validate battle state
if self._current_battle.finished:
logger.warning("Step called on finished battle, returning final state")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 IMPORTANT: Incomplete Error Messages

When step() is called on a finished battle, the warning is logged but not exposed to the user in the observation.

Improvement:

logger.warning("Step called on finished battle, returning final state")
obs = self._battle_to_observation(self._current_battle, reward=None, done=True)
obs.metadata["warning"] = "Step called on already finished battle"
return obs

This helps users debug issues when they accidentally call step after the battle ends.

\n\
[program:openenv]\n\
command=uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port 9980\n\
directory=/app\n\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 IMPORTANT: Hard-coded Port

The supervisor config hard-codes port 9980. This should respect the PORT environment variable for deployment flexibility.

Fix:

[program:openenv]
command=bash -c "uvicorn envs.pokemon_env.server.app:app --host 0.0.0.0 --port ${PORT:-9980}"
directory=/app
environment=PYTHONPATH="/app/src",PORT="${PORT:-9980}"

This allows deployments to override the port via environment variable while keeping 9980 as the default.

self._episodes_completed = 0
self._cleanup_interval = 10 # Clean up every 10 episodes

def _pokemon_to_data(self, pokemon) -> Optional[PokemonData]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR: Type Annotation

The pokemon parameter could have a type hint for better documentation:

from typing import Any

def _pokemon_to_data(self, pokemon: Any) -> Optional[PokemonData]:

Since this comes from the external poke-env library, Any is appropriate here.


# Battle history cleanup interval
self._episodes_completed = 0
self._cleanup_interval = 10 # Clean up every 10 episodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 MINOR: Magic Number

The cleanup interval is hard-coded. Consider making it configurable:

def __init__(
    self,
    battle_format: str = "gen8randombattle",
    player_username: Optional[str] = None,
    opponent: Optional[Player] = None,
    reward_mode: str = "sparse",
    max_turns: int = 1000,
    cleanup_interval: int = 10,  # Add this parameter
):
    ...
    self._cleanup_interval = cleanup_interval

This allows fine-tuning in production environments if needed.


```bash
# Build both images (run from project root directory)
docker build -t pokemon-showdown:latest -f src/envs/pokemon_env/server/Dockerfile.showdown .
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 IMPORTANT: Dockerfile Documentation

The README mentions building Dockerfile.showdown and Dockerfile.pokemonenv separately, but the main Dockerfile appears to handle everything with supervisor.

Please clarify:

  1. Which Dockerfiles are actually needed?
  2. When to use each one?

See the proposal in the main review for a recommendation to simplify this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot. New Environment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants