feat(ecs): add ecs scheduling #21

Open
wants to merge 1 commit into
base: main
12 changes: 12 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,18 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.0](https://github.com/padok-team/terraform-aws-start-stop-scheduler/compare/v0.3.0...v0.4.0) (2023-12-01)

### ⚠ BREAKING CHANGES

* upgrade to latest aws and terraform version

### Features

* **ecs:** add python code to schedule ecs services
* **ecs:** add iam permissions to update ecs services


## [0.3.0](https://github.com/padok-team/terraform-aws-start-stop-scheduler/compare/v0.2.0...v0.3.0) (2023-03-21)


47 changes: 42 additions & 5 deletions README.md
@@ -10,6 +10,7 @@ It supports :

- **AutoscalingGroups**: it suspends the ASG and terminates its instances. At the start, it resumes the ASG, which launches new instances by itself.
- RDS: supports simple RDS DB instances. It runs the stop and start functions on them.
- ECS: scales ECS services to 0 on stop and back to a desired count of 1 on start.
- ~~EC2 instances~~: maybe

The lambda function is _idempotent_, so you can launch it on an already stopped/started resource without any risks! It simplifies your job when planning with crons.
@@ -102,27 +103,64 @@ aws lambda invoke --function-name <function_name_from_output> --payload '{"actio
```

<!-- BEGIN_TF_DOCS -->
## Requirements

| Name | Version |
|------|---------|
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | ~> 1.0 |
| <a name="requirement_archive"></a> [archive](#requirement\_archive) | ~> 2.0 |
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | ~> 5.0 |

## Providers

| Name | Version |
|------|---------|
- | <a name="provider_archive"></a> [archive](#provider\_archive) | 2.3.0 |
- | <a name="provider_aws"></a> [aws](#provider\_aws) | 4.59.0 |
| <a name="provider_archive"></a> [archive](#provider\_archive) | ~> 2.0 |
| <a name="provider_aws"></a> [aws](#provider\_aws) | ~> 5.0 |

## Modules

No modules.

## Resources

| Name | Type |
|------|------|
| [aws_cloudwatch_event_rule.start_stop](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_event_rule) | resource |
| [aws_cloudwatch_event_target.start_stop](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_event_target) | resource |
| [aws_cloudwatch_log_group.start_stop_scheduler](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_log_group) | resource |
| [aws_iam_role.lambda](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role) | resource |
| [aws_iam_role_policy.lambda_autoscalinggroup](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy) | resource |
| [aws_iam_role_policy.lambda_ec2](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy) | resource |
| [aws_iam_role_policy.lambda_ecs](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy) | resource |
| [aws_iam_role_policy.lambda_rds](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy) | resource |
| [aws_iam_role_policy.lambda_tagging_api](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy) | resource |
| [aws_iam_role_policy_attachment.lambda_basic](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy_attachment) | resource |
| [aws_lambda_function.start_stop_scheduler](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_function) | resource |
| [aws_lambda_permission.allow_cloudwatch_start](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/lambda_permission) | resource |
| [archive_file.lambda_zip](https://registry.terraform.io/providers/hashicorp/archive/latest/docs/data-sources/file) | data source |
| [aws_iam_policy_document.lambda_assume_role_policy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.lambda_autoscalinggroup](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.lambda_ec2](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.lambda_ecs](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.lambda_rds](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_iam_policy_document.lambda_tagging_api](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/iam_policy_document) | data source |
| [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |

## Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
- | <a name="input_name"></a> [name](#input\_name) | A name used to create resources in module | `string` | n/a | yes |
- | <a name="input_schedules"></a> [schedules](#input\_schedules) | List of map containing, the following keys: name (for jobs name), start (cron for the start schedule), stop (cron for stop schedule), tag\_key and tag\_value (target recources) | <pre>list(object({<br> name = string<br> start = string<br> stop = string<br> tag_key = string<br> tag_value = string<br> }))</pre> | n/a | yes |
| <a name="input_asg_schedule"></a> [asg\_schedule](#input\_asg\_schedule) | Run the scheduler on AutoScalingGroup. | `bool` | `true` | no |
| <a name="input_aws_regions"></a> [aws\_regions](#input\_aws\_regions) | List of AWS region where the scheduler will be applied. By default target the current region. | `list(string)` | `null` | no |
| <a name="input_custom_iam_lambda_role"></a> [custom\_iam\_lambda\_role](#input\_custom\_iam\_lambda\_role) | Use a custom role for the lambda. Useful if you cannot create IAM resources directly with your AWS profile, or to share a role between several resources. | `bool` | `false` | no |
| <a name="input_custom_iam_lambda_role_arn"></a> [custom\_iam\_lambda\_role\_arn](#input\_custom\_iam\_lambda\_role\_arn) | Custom role arn used for the lambda. Used only if custom\_iam\_lambda\_role is set to true. | `string` | `null` | no |
| <a name="input_ec2_schedule"></a> [ec2\_schedule](#input\_ec2\_schedule) | Run the scheduler on EC2 instances. (only allows downscaling) | `bool` | `false` | no |
| <a name="input_ecs_schedule"></a> [ecs\_schedule](#input\_ecs\_schedule) | Run the scheduler on ECS services. | `bool` | `false` | no |
| <a name="input_lambda_timeout"></a> [lambda\_timeout](#input\_lambda\_timeout) | Amount of time your Lambda Function has to run in seconds. | `number` | `10` | no |
| <a name="input_name"></a> [name](#input\_name) | A name used to create resources in module | `string` | n/a | yes |
| <a name="input_rds_schedule"></a> [rds\_schedule](#input\_rds\_schedule) | Run the scheduler on RDS. | `bool` | `true` | no |
| <a name="input_schedules"></a> [schedules](#input\_schedules) | List of maps containing the following keys: name (job name), start (cron for the start schedule), stop (cron for the stop schedule), tag\_key and tag\_value (target resources) | <pre>list(object({<br> name = string<br> start = string<br> stop = string<br> tag_key = string<br> tag_value = string<br> }))</pre> | n/a | yes |
| <a name="input_tags"></a> [tags](#input\_tags) | Custom Resource tags | `map(string)` | `{}` | no |

## Outputs
@@ -139,7 +177,6 @@ aws lambda invoke --function-name <function_name_from_output> --payload '{"actio
| <a name="output_lambda_function_version"></a> [lambda\_function\_version](#output\_lambda\_function\_version) | Latest published version of your Lambda function |
| <a name="output_lambda_iam_role_arn"></a> [lambda\_iam\_role\_arn](#output\_lambda\_iam\_role\_arn) | The ARN of the IAM role used by Lambda function |
| <a name="output_lambda_iam_role_name"></a> [lambda\_iam\_role\_name](#output\_lambda\_iam\_role\_name) | The name of the IAM role used by Lambda function |

<!-- END_TF_DOCS -->

## Advanced features
75 changes: 75 additions & 0 deletions function/scheduler/ecs.py
@@ -0,0 +1,75 @@
# -*- coding: utf-8 -*-

"""
ECS service scheduler.

Source: https://github.com/diodonfrost/terraform-aws-lambda-scheduler-stop-start/blob/master/package/scheduler/
"""

import logging
from dataclasses import dataclass
from typing import Dict, Iterator, List, Any

import boto3
from botocore.exceptions import ClientError


logger = logging.getLogger()


@dataclass
class ECSService:
    """ECS service"""

    service_name: str
    cluster_name: str
    ecs: Any

    def stop(self) -> None:
        """
        Stop AWS ECS service
        """
        try:
            self.ecs.update_service(
                cluster=self.cluster_name, service=self.service_name, desiredCount=0
            )
        except Exception as e:
            logger.warning(e)

    def start(self, terminate: bool = True) -> None:
        """
        Start AWS ECS service
        """
        try:
            self.ecs.update_service(
                cluster=self.cluster_name, service=self.service_name, desiredCount=1
Contributor

There is an issue here.
Maybe your previous desiredCount was not 1?

Author

Proposal to solve this problem:

  • On stop, for each service

    • Create an SSM parameter if it does not exist (e.g. /padok/start-stop-scheduler-ecs-desired-count-service-<service-name>)
    • Set its value to the current desired count
  • On start, for each service

    • Read the SSM parameter containing the desired count
    • Use it for update_service, or fall back to 1 if it does not exist (so that the scheduler does not fail to restart the service)

            )
        except Exception as e:
            logger.warning(e)
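
A minimal sketch of the proposal from the review thread above (save the desired count on stop, restore it on start), not code from this PR. The names `stop_service`, `start_service`, `PARAM_PREFIX` and `DEFAULT_DESIRED_COUNT` are hypothetical; the parameter prefix is copied from the example name in the proposal. The `ecs` and `ssm` arguments are assumed to be `boto3.client("ecs")` and `boto3.client("ssm")`.

```python
from typing import Any

# Hypothetical prefix, taken from the example name in the proposal.
PARAM_PREFIX = "/padok/start-stop-scheduler-ecs-desired-count-service-"
# Fallback used when no count was saved, as suggested in the proposal.
DEFAULT_DESIRED_COUNT = 1


def stop_service(ecs: Any, ssm: Any, cluster: str, service: str) -> None:
    """Save the current desired count to SSM, then scale the service to 0."""
    described = ecs.describe_services(cluster=cluster, services=[service])
    current = described["services"][0]["desiredCount"]
    # Only save a non-zero count, so that a second "stop" on an already
    # stopped service does not overwrite the saved value with 0.
    if current > 0:
        ssm.put_parameter(
            Name=PARAM_PREFIX + service,
            Value=str(current),
            Type="String",
            Overwrite=True,
        )
    ecs.update_service(cluster=cluster, service=service, desiredCount=0)


def start_service(ecs: Any, ssm: Any, cluster: str, service: str) -> None:
    """Restore the saved desired count, or fall back to the default."""
    try:
        value = ssm.get_parameter(Name=PARAM_PREFIX + service)["Parameter"]["Value"]
        desired = int(value)
    except ssm.exceptions.ParameterNotFound:
        desired = DEFAULT_DESIRED_COUNT
    ecs.update_service(cluster=cluster, service=service, desiredCount=desired)
```

With this approach the Lambda role would presumably also need `ecs:DescribeServices`, `ssm:GetParameter` and `ssm:PutParameter` in addition to `ecs:UpdateService`. Keying the parameter on the service name alone follows the proposal; services with the same name in different clusters would share a parameter, so including the cluster name in the key may be worth considering.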


def list_ecs_services_by_tags(tag_key: str, tag_value: str) -> List[ECSService]:
    """
    List the ECS services that carry the given tag key/value pair.
    """

    ecs = boto3.client("ecs")
    rgta = boto3.client("resourcegroupstaggingapi")

    ecs_list = []
    paginator = rgta.get_paginator("get_resources")
    page_iterator = paginator.paginate(
        TagFilters=[{"Key": tag_key, "Values": [tag_value]}],
        ResourceTypeFilters=["ecs:service"],
    )
    for page in page_iterator:
        for resource_tag_map in page["ResourceTagMappingList"]:
            # Assumes the ARN ends with "service/<cluster-name>/<service-name>".
            ecs_list.append(
                ECSService(
                    cluster_name=resource_tag_map["ResourceARN"].split("/")[-2],
                    service_name=resource_tag_map["ResourceARN"].split("/")[-1],
                    ecs=ecs,
                )
            )

    return ecs_list
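
The `split("/")` calls above rely on the ARN ending in `service/<cluster-name>/<service-name>`. ECS also has an older service ARN format without the cluster name (`...:service/<service-name>`), for which `split("/")[-2]` would return the ARN prefix rather than a cluster. The helper below is a hedged sketch of a stricter parser; `parse_ecs_service_arn` is a hypothetical name, not part of this PR.

```python
from typing import Optional, Tuple


def parse_ecs_service_arn(arn: str) -> Tuple[Optional[str], str]:
    """Return (cluster_name, service_name) from an ECS service ARN.

    cluster_name is None for the older ARN format, which omits the cluster.
    """
    # An ARN has six colon-separated fields; the last one is the resource.
    fields = arn.split(":", 5)
    if len(fields) != 6:
        raise ValueError(f"not an ARN: {arn}")
    parts = fields[5].split("/")  # e.g. ["service", "my-cluster", "my-service"]
    if parts[0] != "service":
        raise ValueError(f"not an ECS service ARN: {arn}")
    if len(parts) == 3:
        return parts[1], parts[2]
    if len(parts) == 2:
        return None, parts[1]
    raise ValueError(f"unexpected ECS service ARN: {arn}")
```

A caller could skip (and log) services whose cluster comes back as `None`, since `update_service` would otherwise target the default cluster.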
20 changes: 17 additions & 3 deletions function/scheduler/main.py
@@ -14,6 +14,7 @@
from scheduler.autoscaling import list_asg_by_tags
from scheduler.rds import list_rds_by_tags
from scheduler.ec2 import list_ec2_by_tags
from scheduler.ecs import list_ecs_services_by_tags

logging.basicConfig()
logger = logging.getLogger()
@@ -22,6 +23,7 @@
ASG_SCHEDULE = os.getenv("ASG_SCHEDULE", "true")
RDS_SCHEDULE = os.getenv("RDS_SCHEDULE", "true")
EC2_SCHEDULE = os.getenv("EC2_SCHEDULE", "true")
ECS_SCHEDULE = os.getenv("ECS_SCHEDULE", "true")


def lambda_handler(event, context):
@@ -52,7 +54,6 @@ def lambda_handler(event, context):
    response = {"action": action, "tag": tag, "affected_resources": {}}

    if ASG_SCHEDULE == "true":

        logger.info(f"Select autoscaling groups with tags {tag['key']}={tag['value']}")
        asgs = list_asg_by_tags(tag["key"], tag["value"])

@@ -69,7 +70,6 @@ def lambda_handler(event, context):
        response["affected_resources"]["asg"] = [a.name for a in asgs]

    if RDS_SCHEDULE == "true":

        logger.info(f"Select RDS instances with tags {tag['key']}={tag['value']}")
        rds_list = list_rds_by_tags(tag["key"], tag["value"])

@@ -85,7 +85,6 @@ def lambda_handler(event, context):
        response["affected_resources"]["rds"] = [r.db_id for r in rds_list]

    if EC2_SCHEDULE == "true":

        logger.info(f"Select EC2 instances with tags {tag['key']}={tag['value']}")
        ec2_list = list_ec2_by_tags(tag["key"], tag["value"])

@@ -97,6 +96,21 @@ def lambda_handler(event, context):

        response["affected_resources"]["ec2"] = [r.instance_id for r in ec2_list]

    if ECS_SCHEDULE == "true":
        logger.info(f"Select ECS services with tags {tag['key']}={tag['value']}")
        ecs_list = list_ecs_services_by_tags(tag["key"], tag["value"])

        logger.info(f"Run {action} function on {len(ecs_list)} ecs services")

        for ecs in ecs_list:
            logger.info(f"Run {action} on {ecs.service_name}")
            if action == "start":
                ecs.start()
            elif action == "stop":
                ecs.stop()

        response["affected_resources"]["ecs"] = [r.service_name for r in ecs_list]

    return response


23 changes: 22 additions & 1 deletion main.tf
@@ -114,6 +114,26 @@ data "aws_iam_policy_document" "lambda_ec2" {
  }
}

resource "aws_iam_role_policy" "lambda_ecs" {
  count = var.custom_iam_lambda_role ? 0 : 1

  name_prefix = "${local.name_prefix}_ecs"
  role        = aws_iam_role.lambda[0].id
  policy      = data.aws_iam_policy_document.lambda_ecs.json
}

data "aws_iam_policy_document" "lambda_ecs" {
  statement {
    actions = [
      "ecs:UpdateService",
    ]

    resources = [
      "*",
    ]
  }
}

resource "aws_iam_role_policy" "lambda_ec2" {
  count = var.custom_iam_lambda_role ? 0 : 1

@@ -144,14 +164,15 @@ resource "aws_lambda_function" "start_stop_scheduler" {

  source_code_hash = filebase64sha256(data.archive_file.lambda_zip.output_path)

-  runtime = "python3.8"
  runtime = "python3.11"

  environment {
    variables = {
      AWS_REGIONS  = var.aws_regions == null ? data.aws_region.current.name : join(", ", var.aws_regions)
      RDS_SCHEDULE = tostring(var.rds_schedule)
      ASG_SCHEDULE = tostring(var.asg_schedule)
      EC2_SCHEDULE = tostring(var.ec2_schedule)
      ECS_SCHEDULE = tostring(var.ecs_schedule)
    }
  }
}

6 changes: 6 additions & 0 deletions variables.tf
@@ -53,6 +53,12 @@ variable "ec2_schedule" {
  type        = bool
}

variable "ecs_schedule" {
  default     = false
  description = "Run the scheduler on ECS services."
  type        = bool
}

variable "aws_regions" {
  default     = null
  description = "List of AWS region where the scheduler will be applied. By default target the current region."
2 changes: 1 addition & 1 deletion versions.tf
@@ -4,7 +4,7 @@ terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
-      version = "~> 4.0"
      version = "~> 5.0"
    }

    archive = {