
refs sparkfabrik-innovation-team/board#3230: init module #2

Merged
3 commits merged into main on Nov 5, 2024

Conversation

@nbucci (Contributor) commented on Oct 30, 2024

PR Type

Enhancement


Description

  • Implemented a new Terraform module for Kubernetes Application Sleep Cycles
  • Created resources for managing namespaces, service accounts, and RBAC
  • Implemented CronJobs for automatically scaling deployments up and down based on working hours
  • Added ConfigMaps for environment variables and application configuration
  • Included a Bash script for handling the scaling logic
  • Updated provider requirements from Google to Kubernetes
  • Revised README with new module description and auto-generated documentation
  • Added various customizable variables for flexible configuration
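For context, a minimal invocation of the module might look like the following. This is a hypothetical sketch: the module source path is an assumption, while the `namespace` and `create_namespace` variable names are taken from the code excerpts quoted later in this conversation.

```hcl
# Hypothetical usage sketch. The source path is assumed; the variable
# names come from the module's variables.tf / main.tf shown in this PR.
module "app_sleep_cycles" {
  source = "./terraform-kubernetes-application-sleep-cycles" # assumed path

  namespace        = "my-app" # illustrative value
  create_namespace = true     # the module creates the namespace when true
}
```

With the module defaults quoted later in this conversation, the CronJobs would scale the selected Deployments down to 0 replicas at 20:00 every day and back up at 07:30 on weekdays.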

Changes walkthrough 📝

Enhancement

  main.tf: Implement Kubernetes Application Sleep Cycles (+197/-3)
    • Replaced the Google Storage Bucket resource with Kubernetes resources
    • Added resources for namespace, service account, cluster roles, and role bindings
    • Created ConfigMaps for environment variables and application configuration
    • Implemented CronJobs for scaling deployments up and down

  outputs.tf: Update outputs for Kubernetes resources (+7/-3)
    • Removed the output for the Google Storage Bucket name
    • Added outputs for namespace and Kubernetes labels

  variables.tf: Define variables for Kubernetes Sleep Cycles module (+106/-2)
    • Removed the `name` variable
    • Added multiple variables for Kubernetes configuration
    • Included variables for namespace, labels, service account, and CronJob settings

  files/k8s-working-hours-cronjob.yaml.tftpl: Add CronJob template for working hours (+48/-0)
    • Added a new template file for the Kubernetes CronJob

  files/working-hours.sh: Add scaling script for working hours (+46/-0)
    • Added a new Bash script for scaling deployments

Configuration changes

  versions.tf: Update provider and version requirements (+4/-5)
    • Changed the required provider from Google to Kubernetes
    • Updated the Terraform version requirement

Documentation

  README.md: Update README for Kubernetes Sleep Cycles module (+65/-5)
    • Updated the title and description for the new module
    • Added auto-generated Terraform documentation

    💡 PR-Agent usage: comment /help on the PR to get a list of all available PR-Agent tools and their descriptions.

    @sparkfabrik-ai-bot (bot) commented on Oct 30, 2024

    PR Reviewer Guide 🔍

    (Review updated until commit 631361d)

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 Security concerns

    Sensitive information exposure:
    The module creates a kubernetes_secret_v1 resource (lines 39-52 in main.tf) that generates a service account token. While this is a common practice, it's important to ensure that access to this token is strictly controlled and that it's not exposed in logs or other outputs. Additionally, the use of wait_for_service_account_token = true means the token will be available immediately, which could potentially be a security risk if not properly managed.
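    One way a consumer of the module could mitigate this, assuming the token needs to be surfaced at all, is to mark any output that carries it as sensitive so Terraform redacts it from plan and apply logs. A hedged sketch: the output name is hypothetical, while `kubernetes_secret_v1.this` and the `token` data key follow the resource shown in the suggestions below and the standard layout of a `kubernetes.io/service-account-token` secret.

```hcl
# Hypothetical output; not part of this PR. "token" is the data key that
# Kubernetes populates in a kubernetes.io/service-account-token secret.
output "service_account_token" {
  value     = kubernetes_secret_v1.this.data["token"]
  sensitive = true # keeps the token out of CLI output and logs
}
```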

    ⚡ Key issues to review

    Security Concern
    The kubernetes_secret_v1 resource is creating a service account token, which could potentially be a security risk if not properly managed.

    Resource Naming
    The use of generate_name for the kubernetes_secret_v1 resource may lead to unexpected behavior or difficulties in resource management.

    Default Configuration
    The default working hours schedules (scale_down at 20:00 and scale_up at 07:30) may not be suitable for all use cases and should be reviewed.
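    Teams with different working hours can override these defaults when instantiating the module. The variable names below are the ones defined in variables.tf in this PR; the values are purely illustrative.

```hcl
# Illustrative overrides; only the variable names are taken from this PR.
working_hours_scale_down_schedule = "0 22 * * *"  # scale down at 22:00 daily
working_hours_scale_up_schedule   = "0 9 * * 1-5" # scale up at 09:00, Mon-Fri
cronjob_timezone                  = "Europe/Rome" # assumed IANA timezone format
```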

    @sparkfabrik-ai-bot (bot) commented on Oct 30, 2024

    PR Code Suggestions ✨

    Latest suggestions up to 631361d

    Enhancement
    Add error handling and logging to the bash script for better debugging and monitoring

    Consider adding error handling and logging to the bash script. This will help with
    debugging and monitoring the scaling operations.

    files/working-hours.sh [35-46]

     for KUBE_NAMESPACE in $(echo "${NAMESPACES}" | tr "," " "); do
       echo "Processing namespace: ${KUBE_NAMESPACE}"
       for DEPLOYMENT in $(kubectl -n "${KUBE_NAMESPACE}" get deployments -l "${DEPLOYMENTS_LABEL_SELECTOR}" -o jsonpath='{.items[*].metadata.name}'); do
         echo "Processing deployment: ${KUBE_NAMESPACE}/${DEPLOYMENT}"
         if [ "${DRY_RUN}" -eq "1" ]; then
           echo "DRY-RUN: kubectl -n ${KUBE_NAMESPACE} scale --replicas ${GO_TO_REPLICAS} deployment ${DEPLOYMENT}"
           continue
         fi
         echo "Scaling deployment ${KUBE_NAMESPACE}/${DEPLOYMENT} to ${GO_TO_REPLICAS} replicas"
    -    kubectl -n "${KUBE_NAMESPACE}" scale --replicas "${GO_TO_REPLICAS}" deployment "${DEPLOYMENT}"
    +    if ! kubectl -n "${KUBE_NAMESPACE}" scale --replicas "${GO_TO_REPLICAS}" deployment "${DEPLOYMENT}"; then
    +      echo "ERROR: Failed to scale deployment ${KUBE_NAMESPACE}/${DEPLOYMENT}" >&2
    +      exit 1
    +    fi
    +    echo "Successfully scaled deployment ${KUBE_NAMESPACE}/${DEPLOYMENT}"
       done
     done
     
    Suggestion importance[1-10]: 9

    Why: Improved error handling and logging in the bash script is crucial for effective debugging and monitoring of scaling operations.

    Add validation rules for cron schedule variables to ensure they are valid expressions

    Consider adding validation rules for critical variables to ensure they meet specific
    criteria. This can help prevent configuration errors and improve the module's
    robustness.

    variables.tf [98-108]

     variable "working_hours_scale_down_schedule" {
       description = "Cron schedule to scale down the Deployments. Remember that this is relative to the timezone defined in the `cronjob_timezone` variable."
       type        = string
       default     = "0 20 * * *"
    +  validation {
    +    condition     = can(regex("^[0-9*/-]+ [0-9*/-]+ [0-9*/-]+ [0-9*/-]+ [0-9*/-]+$", var.working_hours_scale_down_schedule))
    +    error_message = "The working_hours_scale_down_schedule must be a valid cron expression."
    +  }
     }
     
     variable "working_hours_scale_up_schedule" {
       description = "Cron schedule to scale up the Deployments. Remember that this is relative to the timezone defined in the `cronjob_timezone` variable."
       type        = string
       default     = "30 7 * * 1-5"
    +  validation {
    +    condition     = can(regex("^[0-9*/-]+ [0-9*/-]+ [0-9*/-]+ [0-9*/-]+ [0-9*/-]+$", var.working_hours_scale_up_schedule))
    +    error_message = "The working_hours_scale_up_schedule must be a valid cron expression."
    +  }
     }
     
    Suggestion importance[1-10]: 8

    Why: Adding validation rules for cron schedules prevents configuration errors, significantly improving the module's reliability.

    Add a failure policy to the CronJob resource to handle potential job failures

    Consider adding a failure_policy to the CronJob resource to specify how the job
    should behave if it fails. This can help manage potential issues with the scaling
    operations.

    main.tf [154-175]

     resource "kubernetes_manifest" "scale_down" {
       manifest = yamldecode(
         templatefile(
           "${path.module}/files/k8s-working-hours-cronjob.yaml.tftpl",
           {
             name               = "${var.working_hours_resource_prefix}-scale-down"
             namespace          = local.final_namespace
             labels             = local.k8s_full_labels
             suspend            = var.working_hours_suspend
             schedule           = var.working_hours_scale_down_schedule
             timezone           = var.cronjob_timezone
             image              = var.working_hours_docker_image
             config_map_app     = kubernetes_config_map_v1.app.metadata[0].name
             config_map_app_env = kubernetes_config_map_v1.app_env.metadata[0].name
             service_account    = kubernetes_service_account_v1.this.metadata[0].name
    +        failure_policy     = "Retry"
     
             # This is the scale down script, so we want to scale down the replicas to 0.
             go_to_replicas = 0
           }
         )
       )
     }
     
    Suggestion importance[1-10]: 7

    Why: Adding a failure policy improves the robustness of the CronJob, helping to manage potential issues with scaling operations.

    Best practice
    Use the most recent stable API version for Kubernetes resources

    Consider using a more specific API version for the Kubernetes resources. Instead of
    'kubernetes_*_v1', use the most recent stable API version for each resource type.
    This ensures compatibility with the latest Kubernetes versions and access to newer
    features.

    main.tf [11-21]

    -resource "kubernetes_namespace_v1" "this" {
    +resource "kubernetes_namespace" "this" {
       count = var.create_namespace ? 1 : 0
     
       metadata {
         name = var.namespace
         labels = merge(
           local.k8s_full_labels,
           { name = var.namespace },
         )
       }
     }
     
    Suggestion importance[1-10]: 6

    Why: Using the most recent stable API version ensures compatibility with newer Kubernetes versions, but the current version is still functional.


    Previous suggestions

    Suggestions up to commit afdce41
    Security
    Add a lifecycle block to prevent accidental deletion of critical resources

    Consider adding a 'lifecycle' block to the 'kubernetes_secret_v1' resource to
    prevent accidental deletion of the secret.

    main.tf [39-52]

     resource "kubernetes_secret_v1" "this" {
       metadata {
         generate_name = "${var.service_account_name}-"
         namespace     = local.final_namespace
         labels        = local.k8s_full_labels
     
         annotations = {
           "kubernetes.io/service-account.name" = kubernetes_service_account_v1.this.metadata[0].name
         }
       }
     
       type                           = "kubernetes.io/service-account-token"
       wait_for_service_account_token = true
    +
    +  lifecycle {
    +    prevent_destroy = true
    +  }
     }
     
    Suggestion importance[1-10]: 8

    Why: This is a valuable security enhancement that can prevent accidental deletion of important Kubernetes secrets.

    Enhancement
    Add validation blocks to variables to ensure input values are within expected ranges or formats

    Consider adding validation blocks to the variables to ensure that the input values
    are within expected ranges or formats.

    variables.tf [98-108]

     variable "working_hours_scale_down_schedule" {
       description = "Cron schedule to scale down the Deployments. Remember that this is relative to the timezone defined in the `cronjob_timezone` variable."
       type        = string
       default     = "0 20 * * *"
    +  validation {
    +    condition     = can(regex("^([0-59]|[0-5][0-9]) ([0-23]|[0-1][0-9]|2[0-3]) ([1-31]|[1-2][0-9]|3[0-1]) ([1-12]|0[1-9]) ([0-7]|[0-6])$", var.working_hours_scale_down_schedule))
    +    error_message = "The working_hours_scale_down_schedule must be a valid cron expression."
    +  }
     }
     
     variable "working_hours_scale_up_schedule" {
       description = "Cron schedule to scale up the Deployments. Remember that this is relative to the timezone defined in the `cronjob_timezone` variable."
       type        = string
       default     = "30 7 * * 1-5"
    +  validation {
    +    condition     = can(regex("^([0-59]|[0-5][0-9]) ([0-23]|[0-1][0-9]|2[0-3]) ([1-31]|[1-2][0-9]|3[0-1]) ([1-12]|0[1-9]) ([0-7]|[0-6])$", var.working_hours_scale_up_schedule))
    +    error_message = "The working_hours_scale_up_schedule must be a valid cron expression."
    +  }
     }
     
    Suggestion importance[1-10]: 7

    Why: Adding validation for cron expressions is a good practice to catch potential errors early, improving code robustness.

    Maintainability
    Use more descriptive resource names to improve code readability and maintainability

    Consider using a more descriptive name for the 'kubernetes_manifest' resources to
    improve readability and maintainability.

    main.tf [154-175]

    -resource "kubernetes_manifest" "scale_down" {
    +resource "kubernetes_manifest" "working_hours_scale_down_cronjob" {
       manifest = yamldecode(
         templatefile(
           "${path.module}/files/k8s-working-hours-cronjob.yaml.tftpl",
           {
             name               = "${var.working_hours_resource_prefix}-scale-down"
             namespace          = local.final_namespace
             labels             = local.k8s_full_labels
             suspend            = var.working_hours_suspend
             schedule           = var.working_hours_scale_down_schedule
             timezone           = var.cronjob_timezone
             image              = var.working_hours_docker_image
             config_map_app     = kubernetes_config_map_v1.app.metadata[0].name
             config_map_app_env = kubernetes_config_map_v1.app_env.metadata[0].name
             service_account    = kubernetes_service_account_v1.this.metadata[0].name
     
             # This is the scale down script, so we want to scale down the replicas to 0.
             go_to_replicas = 0
           }
         )
       )
     }
     
    Suggestion importance[1-10]: 6

    Why: The suggestion improves code readability, but the current naming is not incorrect and the benefit is relatively minor.

    Best practice
    Use more specific Kubernetes resource types to improve compatibility and maintainability

    Consider using a more specific API version for the Kubernetes resources. For
    example, instead of 'kubernetes_namespace_v1', use 'kubernetes_namespace'.

    main.tf [11-21]

    -resource "kubernetes_namespace_v1" "this" {
    +resource "kubernetes_namespace" "this" {
       count = var.create_namespace ? 1 : 0
     
       metadata {
         name = var.namespace
         labels = merge(
           local.k8s_full_labels,
           { name = var.namespace },
         )
       }
     }
     
    Suggestion importance[1-10]: 5

    Why: While the suggestion is valid, the current usage of 'kubernetes_namespace_v1' is not incorrect and may be intentional for version specificity.


    @nbucci marked this pull request as draft on October 30, 2024 at 16:58
    @nbucci marked this pull request as ready for review on November 5, 2024 at 16:03
    @sparkfabrik-ai-bot (bot): Persistent review updated to latest commit 631361d

    @nbucci merged commit 5b8a5a4 into main on Nov 5, 2024. 1 check passed.
    @Monska85 deleted the platform/#3230 branch on November 6, 2024 at 12:43