Skip to content

Conversation

mnoukhov
Copy link
Contributor

No description provided.

mnoukhov and others added 26 commits August 14, 2025 20:44
* Add Claude Code GitHub Workflow (#959)

* "Claude PR Assistant workflow"

* "Claude Code Review workflow"

* Fix DPO script (#956)

* minor fix

* minor fix

* lint

* update debug script

* update debug script

---------

Co-authored-by: Finbarr Timbers <[email protected]>
Co-authored-by: Hamish Ivison <[email protected]>
Copy link

claude bot commented Aug 28, 2025

PR Review: OLMo 2 Retrofit

I've reviewed this pull request that retrofits the codebase for OLMo 2.5 support. Here's my analysis:

Positive Changes

  1. Proper Flash Attention Migration: The change from use_flash_attention_2=True/False to attn_implementation="flash_attention_2"/"eager" in dpo_tune_cache.py:626 follows the newer transformers API pattern and is consistent with other files in the codebase.

  2. Import Cleanup: Moving from setup_logger to logger_utils.setup_logger() in dpo_tune_cache.py:527 improves code organization.

  3. GitHub Workflows: The added Claude Code review workflows (.github/workflows/claude-code-review.yml and .github/workflows/claude.yml) will provide automated code review capabilities.

⚠️ Areas of Concern

Dependency Management Issues

  1. External Git Dependencies:

    Risk: External forks may become stale, unsupported, or introduce security vulnerabilities. Consider:

    • Documenting the specific changes needed from these forks
    • Creating a plan to migrate back to official releases
    • Regular security audits of these dependencies
  2. Pinned Ray Version: Changed from >=2.44.1 to ==2.46.0 (pyproject.toml:22) - this is good for reproducibility but ensure compatibility testing.

Configuration & Deployment

  1. Hard-coded Paths: scripts/train/rlvr/grpo_olmo25.sh contains hard-coded paths like /weka/oe-training-default/ai2-llm/checkpoints/... (line 19). Consider using environment variables or configuration files.

  2. Docker Build Dependencies: The Dockerfile clones a git repository during build (line 68), which could fail if the repository becomes unavailable during builds.

Code Quality

  1. Missing Error Handling: The GRPO script doesn't include error handling for the git clone operations or path validations.

  2. Resource Configuration: The training script specifies very high resource requirements (4 nodes, 24 vLLM engines) without validation or fallback options.

🔧 Recommendations

  1. Dependency Security:

    • Add version pinning for the git dependencies
    • Consider using commit SHAs instead of branch names for better reproducibility
    • Document the specific patches/changes needed
  2. Configuration Management: Replace hard-coded paths with environment variables

  3. Error Handling: Add validation for critical paths and dependencies in scripts.

  4. Documentation: Consider adding a README section explaining the OLMo 2.5 retrofit changes and setup requirements.

🧪 Test Coverage

The PR doesn't include new tests for the OLMo 2.5 specific functionality. Consider adding:

  • Integration tests for the new model loading paths
  • Validation tests for the attention implementation changes

🔒 Security

  • The GitHub workflow tokens and Docker operations appear secure
  • No hardcoded secrets detected
  • External dependencies should be monitored for security updates

Overall Assessment

This is a functional retrofit for OLMo 2.5 support, but the heavy reliance on external forks and hard-coded configurations presents maintenance challenges. The code changes themselves are technically sound and follow established patterns in the codebase.

Recommendation: ✅ Approve with conditions - address dependency management and add error handling before merging to production environments.

cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

cursor[bot]

This comment was marked as outdated.

docker image doesn't have jq so get username without it
update comment to remove reference to removed flag
--oe_eval_log_to_wandb
WANDB_ARG=""
if [[ -n "$WANDB_RUN_PATH" ]]; then
beaker_user=$(beaker account whoami --format json | jq -r '.[0].name')
beaker_user=$(beaker account whoami --format text | awk 'NR==2 {print $2}')
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Username Extraction Reliability Issue

The beaker username extraction switched from robust JSON parsing (jq) to fragile text parsing (awk). This new approach relies on a specific text output format (line 2, column 2), making it vulnerable to breaking if the beaker account whoami command's output changes.

Fix in Cursor Fix in Web

--gpus ${NUM_GPUS} \
--budget ai2/oe-adapt \
-- \
source configs/beaker_configs/ray_node_setup.sh \&\& \
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: Shell Script Command Chaining Error

The shell commands use \&\& instead of &&. The backslash escapes the ampersand, which prevents proper command chaining and will cause the script to fail.

Additional Locations (1)

Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants