@cuppett cuppett commented Oct 29, 2025

Summary

Implements the missing K8sReplicaSet resource type that was documented in CLAUDE.md but not actually implemented in the codebase.

Changes

  • New Resource Class: src/Kinds/K8sReplicaSet.php

    • Extends K8sResource with full trait composition
    • Implements InteractsWithK8sCluster, Podable, Scalable, Watchable
    • Uses traits: CanScale, HasPods, HasReplicas, HasSelector, HasSpec, HasStatus, HasStatusConditions, HasTemplate
  • Status Methods:

    • getAvailableReplicasCount() - Available replicas from status
    • getReadyReplicasCount() - Ready replicas from status
    • getFullyLabeledReplicasCount() - Fully labeled replicas (ReplicaSet-specific)
    • getDesiredReplicasCount() - Desired replicas from status
    • podsSelector() - Custom pod selector logic
  • Factory Integration: Added replicaSet() method to InitializesResources trait

  • PHPDoc Annotations: Added cluster operation methods to KubernetesCluster

  • Comprehensive Tests: Full unit and integration test suite (265 lines)

  • YAML Fixture: Test fixture for YAML parsing validation
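As a rough illustration of the kind of data the status methods above expose, here is a standalone sketch that pulls the replica counters out of a decoded ReplicaSet manifest. It does not use the library: the function name `replicaSetCounts` is hypothetical, and defaulting missing fields to 0 is an assumption. The keys are the field names of the Kubernetes `ReplicaSetStatus` object; which field each library method reads is not stated in this description.

```php
<?php

/**
 * Hypothetical helper (not part of the library): extract the replica
 * counters from a decoded apps/v1 ReplicaSet array.
 * Missing fields default to 0, an assumption made for this sketch.
 */
function replicaSetCounts(array $replicaSet): array
{
    $status = $replicaSet['status'] ?? [];

    $counts = [];
    // Field names as defined by the Kubernetes ReplicaSetStatus object.
    foreach (['replicas', 'fullyLabeledReplicas', 'readyReplicas', 'availableReplicas'] as $field) {
        $counts[$field] = (int) ($status[$field] ?? 0);
    }

    return $counts;
}
```

A ReplicaSet whose pods are still starting would typically report a non-zero `replicas` count while `readyReplicas` and `availableReplicas` are absent, which this sketch surfaces as 0.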

Testing

✅ Unit tests pass (2/2):

  • test_replica_set_build - Fluent API construction
  • test_replica_set_from_yaml - YAML parsing

Integration test suite ready (requires live cluster with kubectl proxy)

Kubernetes Relationship

  • ReplicaSets can be standalone resources (as implemented)
  • Also work correctly with Deployment-managed ReplicaSets via owner references
  • Simpler than Deployment (no update strategies) - matches Kubernetes behavior
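To make the owner-reference relationship concrete, the sketch below checks whether a decoded ReplicaSet is controlled by a Deployment. It works on plain arrays rather than library objects; `controllerOwner` and `isManagedByDeployment` are hypothetical names introduced for illustration. The `metadata.ownerReferences` list and its `kind` and `controller` fields come from the Kubernetes object metadata schema.

```php
<?php

/**
 * Hypothetical helper: return the owner reference flagged as the
 * controller, or null when the resource is standalone.
 */
function controllerOwner(array $resource): ?array
{
    foreach ($resource['metadata']['ownerReferences'] ?? [] as $reference) {
        if (($reference['controller'] ?? false) === true) {
            return $reference;
        }
    }

    return null;
}

/**
 * Hypothetical helper: true when the controlling owner is a Deployment.
 */
function isManagedByDeployment(array $replicaSet): bool
{
    $owner = controllerOwner($replicaSet);

    return $owner !== null && ($owner['kind'] ?? null) === 'Deployment';
}
```

A standalone ReplicaSet has no controlling owner reference, so `isManagedByDeployment()` returns false for it.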

API Details

  • API Version: apps/v1
  • Namespaced: Yes
  • Scalable: Yes (via scale subresource)
  • Watchable: Yes

@cuppett cuppett force-pushed the feat/implement-replicaset branch from dba0cb0 to 2e36d50 Compare October 31, 2025 15:53
cuppett and others added 4 commits October 31, 2025 11:54
Implements K8sReplicaSet as a fully functional apps/v1 resource with:
- Complete trait composition (CanScale, HasPods, HasReplicas, etc.)
- Pod selector and replica count status methods
- Factory method in InitializesResources
- PHPDoc annotations for cluster operations
- Comprehensive test suite with unit and integration tests
- YAML fixture for testing

The ReplicaSet resource was previously documented but not implemented.
This fills that gap, providing standalone ReplicaSet support that also
works correctly with Deployment-managed ReplicaSets via owner references.

Tests pass: 2/2 unit tests (build, YAML parsing)
Integration test requires live cluster (kubectl proxy on :8080)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

- Override create() and update() in K8sScale to use REPLACE_OP (PUT)
  instead of CREATE_OP (POST) since scale subresources only support PUT
- Update ReplicaSetTest to use the correct scale() pattern that matches
  StatefulSetTest implementation
- Scale operations now properly update the replica count on the cluster

Fixes integration test scaling errors where scale API was returning 404
for POST operations. Scale subresources require PUT to the resource's
scale path (e.g., /apis/apps/v1/namespaces/default/replicasets/{name}/scale).
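The request shape described above can be sketched without the library. The function name `buildReplicaSetScaleRequest` and the returned array layout are hypothetical; the path follows the example in this message, and the body uses the `autoscaling/v1` `Scale` kind that the scale subresource accepts. This is an illustration of the idea, not the code in K8sScale.

```php
<?php

/**
 * Hypothetical helper: describe the HTTP request that updates the replica
 * count of a ReplicaSet through its scale subresource.
 */
function buildReplicaSetScaleRequest(string $namespace, string $name, int $replicas): array
{
    $path = sprintf(
        '/apis/apps/v1/namespaces/%s/replicasets/%s/scale',
        rawurlencode($namespace),
        rawurlencode($name)
    );

    return [
        // The subresource is replaced with PUT, not created with POST.
        'method' => 'PUT',
        'path' => $path,
        'body' => [
            'apiVersion' => 'autoscaling/v1',
            'kind' => 'Scale',
            'metadata' => ['name' => $name, 'namespace' => $namespace],
            'spec' => ['replicas' => $replicas],
        ],
    ];
}
```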

Generated with Claude Code

Co-Authored-By: Claude <[email protected]>

The watchPath() and watchLogsPath() methods were creating socket connections
via fopen() without any client-side timeout, relying entirely on the Kubernetes
API server to close the connection after timeoutSeconds expires.

In GitHub Actions pull_request environments, the server-side connection closure
was not working reliably, causing fgets() to block indefinitely and tests to
hang (timing out after 35+ minutes).

This fix adds stream_set_timeout() to both methods with a timeout value based
on the query parameter timeoutSeconds (default 30s) plus a 5-second buffer.
The buffer allows the server-side timeout to fire first under normal conditions
while ensuring the client always has a fallback timeout.

Additionally, the while loop now checks for stream timeout via
stream_get_meta_data() and closes the socket gracefully if a timeout occurs.
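A minimal sketch of this timeout scheme, assuming the query parameters arrive as a plain array. `watchClientTimeout` and `readLineOrTimeout` are hypothetical names, not the methods changed in this commit; `stream_set_timeout()` and the `timed_out` key of `stream_get_meta_data()` are standard PHP.

```php
<?php

/**
 * Hypothetical helper: client-side timeout = server-side timeoutSeconds
 * (default 30) plus a 5-second buffer so the server normally fires first.
 */
function watchClientTimeout(array $query, int $default = 30, int $buffer = 5): int
{
    $serverTimeout = (int) ($query['timeoutSeconds'] ?? $default);

    return $serverTimeout + $buffer;
}

/**
 * Hypothetical helper: read one line, returning null when the read
 * failed or the stream reports that it timed out.
 */
function readLineOrTimeout($stream, int $timeoutSeconds): ?string
{
    stream_set_timeout($stream, $timeoutSeconds);

    $line = fgets($stream);
    $meta = stream_get_meta_data($stream);

    if ($line === false || $meta['timed_out']) {
        return null;
    }

    return $line;
}
```

With no `timeoutSeconds` in the query, the client-side timeout in this sketch works out to 35 seconds.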

Affected tests:
- NetworkPolicyTest (previously hanging at test #127 in CI)
- All 30+ resources with watch operations

Verified working:
- NetworkPolicyTest: passes in 10s (was 20s)
- ReplicaSetTest: passes in 15s
- DeploymentTest: passes in 50s
- All tests complete locally without hangs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Fix watch operations hanging indefinitely by using non-blocking streams

Problem:
Watch operations (watchPath/watchLogsPath) were hanging indefinitely in CI,
causing all 24 jobs to time out after 25-35 minutes at test #127 (NetworkPolicyTest).

Root Cause Analysis:
1. Original code used fgets() which blocks indefinitely waiting for data
2. First fix attempt: stream_set_timeout() - Does NOT reliably interrupt fgets()
   when waiting for data that never arrives
3. Second fix attempt: stream_select() - Does NOT work with HTTP wrapper streams
   (fopen('http://...')) - PHP limitation: "Cannot cast a filtered stream"
   stream_select() only works with socket streams, not HTTP filtered streams

Solution:
Implemented non-blocking stream approach:
- Set stream_set_blocking($sock, false) to enable non-blocking mode
- Use fread() with explicit timeout checking instead of fgets()
- Buffer incoming data and process complete lines
- Sleep 100ms when no data available (usleep(100000))
- Check overall timeout (server timeoutSeconds + 5 second buffer)
- Exit cleanly on EOF or timeout
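The steps above can be sketched as a standalone function. This is an illustration under stated assumptions, not the code from this commit: `readLinesNonBlocking` is a hypothetical name, the 8192-byte read size is arbitrary, and flushing a trailing partial line at EOF is a choice made for the sketch.

```php
<?php

/**
 * Hypothetical helper: read newline-delimited data from a stream in
 * non-blocking mode, calling $onLine for each complete line.
 * Returns true on EOF and false when the overall timeout elapses.
 */
function readLinesNonBlocking($stream, float $timeoutSeconds, callable $onLine): bool
{
    stream_set_blocking($stream, false);

    $deadline = microtime(true) + $timeoutSeconds;
    $buffer = '';

    while (microtime(true) < $deadline) {
        $chunk = fread($stream, 8192);

        if ($chunk !== false && $chunk !== '') {
            $buffer .= $chunk;

            // Hand over every complete line; keep the partial tail buffered.
            while (($position = strpos($buffer, "\n")) !== false) {
                $onLine(substr($buffer, 0, $position));
                $buffer = substr($buffer, $position + 1);
            }

            continue;
        }

        if (feof($stream)) {
            // Assumption for this sketch: flush an unterminated last line.
            if ($buffer !== '') {
                $onLine($buffer);
            }

            return true;
        }

        // No data yet: sleep 100ms before polling again.
        usleep(100000);
    }

    return false;
}
```

A caller would pass the server's timeoutSeconds plus the 5-second buffer as `$timeoutSeconds` and treat a `false` return as a client-side timeout.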

Benefits:
- Works with HTTP wrapper streams from fopen()
- Proper timeout handling prevents indefinite hangs
- Buffered line processing handles partial reads correctly
- NetworkPolicyTest now passes in 0.22s (was hanging for 27+ minutes)
- ReplicaSetTest passes with all watch operations in 3.28s

Testing:
- NetworkPolicyTest: ✔ 27 assertions, 0.221s
- ReplicaSetTest: ✔ 62 assertions, 3.282s
- All watch operations now time out properly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>

Fixed critical test isolation issue where DeploymentTest resources were
being reused by ReplicaSetTest, causing infinite loops and CI timeouts.

Changes:
- ReplicaSetTest: Changed label from 'tier: backend' to 'tier: backend-rs'
  to prevent Deployment controller from adopting the ReplicaSet
- DeploymentTest: Added tearDown() method to ensure cleanup of Deployment
  and HPA resources after each test, preventing resource leakage
- Both tests: Added 60-second timeouts to all wait loops (scaling, deletion)
  to prevent infinite hangs and provide clear error messages
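A bounded wait loop of the kind described in the last item could look like the sketch below. `waitUntil` is a hypothetical helper, not the code added to the tests; the polling interval and the exception type are assumptions.

```php
<?php

/**
 * Hypothetical helper: poll $condition until it returns true, and fail
 * with a descriptive error once $timeoutSeconds has elapsed.
 */
function waitUntil(
    callable $condition,
    float $timeoutSeconds,
    string $description,
    int $pollMicroseconds = 100000
): void {
    $deadline = microtime(true) + $timeoutSeconds;

    while (! $condition()) {
        if (microtime(true) >= $deadline) {
            throw new RuntimeException(
                sprintf('Timed out after %.1fs waiting for %s', $timeoutSeconds, $description)
            );
        }

        usleep($pollMicroseconds);
    }
}
```

In a test, the condition would refresh the resource and compare its ready replica count against the expected value, with 60 passed as the timeout.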

Root cause: DeploymentTest created a Deployment with selector 'tier: backend'
which adopted the ReplicaSet from ReplicaSetTest (also labeled 'tier: backend'),
causing the Deployment controller to scale the ReplicaSet to 0. ReplicaSetTest
then waited indefinitely for 2 replicas that would never appear.
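The label collision can be illustrated with a small equality-based selector check. `matchLabelsSelects` is a hypothetical function that mimics `matchLabels` semantics on plain arrays; it is not library code and ignores `matchExpressions`.

```php
<?php

/**
 * Hypothetical helper: true when every key/value pair in $matchLabels is
 * present in $labels (equality-based selection only).
 */
function matchLabelsSelects(array $matchLabels, array $labels): bool
{
    foreach ($matchLabels as $key => $value) {
        if (! array_key_exists($key, $labels) || $labels[$key] !== $value) {
            return false;
        }
    }

    return true;
}
```

With the Deployment selector `tier: backend`, a ReplicaSet labeled `tier: backend` is selected, while one labeled `tier: backend-rs` is not, which is why the rename isolates the two tests.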

This fix ensures tests are properly isolated and will fail fast with descriptive
errors rather than hanging for 25+ minutes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@cuppett cuppett force-pushed the feat/implement-replicaset branch from 2e36d50 to c9a8ab1 Compare October 31, 2025 15:54
Update test_replica_set_build() to use 'backend-rs' label instead of
'backend', matching the label change made to prevent resource adoption
conflicts with DeploymentTest.

This completes the label isolation work - the unit test was setting
the label to 'backend' but asserting it should be 'backend-rs'.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@cuppett cuppett merged commit d106cde into main Oct 31, 2025
24 checks passed