noteable-io/terraform-aws-secure-for-cloud

 
 

Sysdig Secure for Cloud in AWS

Terraform module that deploys the Sysdig Secure for Cloud stack in AWS.

Provides unified threat-detection, compliance, forensics and analysis through these major components:

  • Threat Detection: Tracks abnormal and suspicious activities in your cloud environment based on Falco language. Managed through cloud-connector module.

  • Compliance: Enables the evaluation of standard compliance frameworks. Requires both modules cloud-connector and cloud-bench.

  • Identity and Access Management: Analyzes user access and flags overly permissive policies. Requires both the cloud-connector and cloud-bench modules.

  • Image Scanning: Automatically scans all container images pushed to the registry (ECR) and the images that run on the AWS workload (currently ECS). Managed through cloud-connector.
    Disabled by default; enable it through the deploy_image_scanning_ecr and deploy_image_scanning_ecs input variables.

For other Cloud providers check: GCP, Azure


Notice

  • AWS regions
  • Resource creation inventory: find all the resources created by the Sysdig examples in the resource group sysdig-secure-for-cloud (AWS Resource Groups & Tag Editor)
  • All Sysdig Secure for Cloud features except Image Scanning are enabled by default. You can enable it through the deploy_scanning input variable parameters.
    • ECR image scanning in the management account is not supported, since keeping an ECR in the management account is not a best practice. However, there is a workaround in case you need to scan images pushed to the management account ECR.
  • Deployment cost: this example will create resources that cost money.
    Run terraform destroy when you no longer need them.
  • For free subscription users, beware that organizational examples may not deploy properly due to the 1 cloud-account limitation. Open an Issue so we can help you here!

Usage

If you're unsure about what to use or how to use this module, please fill in the questionnaire report as an issue and let us know your context. We will be happy to help and improve our module.

Required Permissions

Provisioning Permissions

The Terraform provider credentials/token require administrative permissions in order to create the resources specified in the per-example diagram.

Some components may vary, or may be deployed on different accounts (depending on the example). You can check the full list of resources in the "Resources" section of each module's README. You can also check our source code and suggest changes.

This is an overall schema of the resources created for the default setup.

  • Cloudtrail / SNS / S3 / SQS
  • SSM Parameter for Sysdig API Token Storage
  • Sysdig Workload: ECS / AppRunner creation (a K8s cluster is a prerequisite and is not created)
    • each compute solution requires a role to assume for execution
  • CodeBuild for on-demand image scanning
  • Sysdig role for Compliance

Runtime Permissions

General Permissions

ssm: GetParameters

sqs: ReceiveMessage
sqs: DeleteMessage

s3: ListBucket
s3: GetObject

Image-Scanning specific

codebuild: StartBuild

ecr: GetAuthorizationToken
ecr: BatchCheckLayerAvailability
ecr: GetDownloadUrlForLayer
ecr: GetRepositoryPolicy
ecr: DescribeRepositories
ecr: ListImages
ecr: DescribeImages
ecr: BatchGetImage
ecr: GetLifecyclePolicy
ecr: GetLifecyclePolicyPreview
ecr: ListTagsForResource
ecr: DescribeImageScanFindings

ecs: DescribeTaskDefinition

  • Other Notes:
    • Only Sysdig workload related permissions are specified above; permissions internal to the infrastructure (such as Cloudtrail permissions to publish to SNS, or the SNS-SQS subscription) are not detailed.
    • For better security, permissions are pinned to specific resources instead of using *
    • Check Organizational Use Case - Role Summary for more details
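
As an illustration of what "resource pinned" means for the general permissions listed above, here is a minimal Python sketch that assembles an IAM policy document and rejects a bare `*` resource. The actions come from the list above; the ARNs, the helper names and the grouping into one statement per service are assumptions made for this example, not the policy the module actually generates.

```python
import json

# Actions copied from the "General Permissions" list above.
GENERAL_ACTIONS = {
    "ssm": ["GetParameters"],
    "sqs": ["ReceiveMessage", "DeleteMessage"],
    "s3": ["ListBucket", "GetObject"],
}


def build_statement(service, actions, resource_arns):
    """Return one Allow statement pinned to explicit resource ARNs."""
    if not resource_arns or any(arn == "*" for arn in resource_arns):
        raise ValueError(f"{service}: pin permissions to resources instead of '*'")
    return {
        "Effect": "Allow",
        "Action": [f"{service}:{name}" for name in actions],
        "Resource": list(resource_arns),
    }


def build_policy(resources_by_service):
    """Assemble a policy document covering the general runtime permissions."""
    statements = [
        build_statement(service, actions, resources_by_service[service])
        for service, actions in GENERAL_ACTIONS.items()
    ]
    return {"Version": "2012-10-17", "Statement": statements}


# Hypothetical ARNs, for illustration only.
EXAMPLE_RESOURCES = {
    "ssm": ["arn:aws:ssm:eu-central-1:111111111111:parameter/example-sysdig-token"],
    "sqs": ["arn:aws:sqs:eu-central-1:111111111111:example-queue"],
    "s3": [
        "arn:aws:s3:::example-trail-bucket",
        "arn:aws:s3:::example-trail-bucket/*",
    ],
}

policy = build_policy(EXAMPLE_RESOURCES)
print(json.dumps(policy, indent=2))
```

The sketch leaves out the image-scanning actions on purpose: the management-account workaround later in this README grants ecr:GetAuthorizationToken on `*`, so that action cannot be pinned in the same way.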

Confirm the Services are Working

Check official documentation on Secure for cloud - AWS, Confirm the Services are working

Forcing Events - Threat Detection

Choose one of the rules contained in an activated Runtime Policy for AWS, such as the Sysdig AWS Activity Logs policy, and trigger it in your AWS account. For example, 'Delete Bucket Public Access Block' can be easily tested by going to an S3 bucket > Permissions > Block public access (bucket settings) > edit > uncheck 'Block all public access'.

Remember that if you add new rules to the policy, you need to give the changes time to propagate.

In the cloud-connector logs you should see entries similar to this one:

A public access block for a bucket has been deleted (requesting user=OrganizationAccountAccessRole, requesting IP=x.x.x.x, AWS region=eu-central-1, bucket=***

If that's not working as expected, check the following:

  • are events consumed in the sqs queue, or are they pending?
  • are events being sent to sns topic?
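
One possible way to answer the first question is to read the approximate message counters of the queue. The sketch below assumes a client that behaves like `boto3.client("sqs")` and uses its `get_queue_attributes` call; the helper names, the `FakeSqsClient` stand-in and the wording of the diagnosis are made up for this example.

```python
def queue_backlog(sqs_client, queue_url):
    """Return (pending, in_flight) approximate message counts for a queue.

    `sqs_client` is expected to behave like boto3.client("sqs").
    """
    response = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=[
            "ApproximateNumberOfMessages",
            "ApproximateNumberOfMessagesNotVisible",
        ],
    )
    attributes = response["Attributes"]
    pending = int(attributes["ApproximateNumberOfMessages"])
    in_flight = int(attributes["ApproximateNumberOfMessagesNotVisible"])
    return pending, in_flight


def diagnose(pending, in_flight):
    """Turn the counters into a rough hint; not an authoritative diagnosis."""
    if pending > 0:
        return "messages are pending: the consumer may not be reading the queue"
    if in_flight > 0:
        return "messages are in flight: the consumer is reading the queue"
    return "queue is empty: events are either fully consumed or never reach the queue"


class FakeSqsClient:
    """Offline stand-in for boto3.client("sqs"), used only for this demo."""

    def __init__(self, pending, in_flight):
        self.pending = pending
        self.in_flight = in_flight

    def get_queue_attributes(self, QueueUrl, AttributeNames):
        # SQS returns attribute values as strings.
        return {
            "Attributes": {
                "ApproximateNumberOfMessages": str(self.pending),
                "ApproximateNumberOfMessagesNotVisible": str(self.in_flight),
            }
        }


counts = queue_backlog(FakeSqsClient(3, 1), "https://example.invalid/queue")
print(counts, diagnose(*counts))
```

With real credentials, pass `boto3.client("sqs")` and the URL of the queue created by the deployment instead of the stand-in. An empty queue is ambiguous, so check the SNS topic as well.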

In Secure > Events you should see the event coming through, but beware you may need to activate specific levels such as Info depending on the rule you're firing.

Alternatively, use the Terraform example module found in examples/trigger-events to trigger the Create IAM Policy that Allows All event.

Forcing Events - Image Scanning

Image scanning is not activated by default. Ensure you have the required scanning enablers in place.

  • For ECR image scanning, upload any image to an AWS ECR repository. You can find CLI instructions in the AWS UI.
  • For ECS running image scanning, deploy any task in your own cluster, or in the one that we create to deploy our workload (e.g. the amazon/amazon-ecs-sample image).

It may take some time, but you should see logs detecting the new image in the ECS cloud-connector task

{"component":"ecs-action","message":"processing detection {\"account\":\"***\",\"region\":\"eu-west-3\",\"taskDefinition\":\"apache:1\"}. source=aws_cloudtrail"}
{"component":"ecs-action","message":"analyzing task 'apache:1' in region 'eu-west-3'"}
{"component":"ecs-action","message":"starting ECS scanning for container index 0 in task 'apache:1'"}

and a CodeBuild project being launched successfully
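
If the cloud-connector task is chatty, filtering its output down to the scanning lines can help. The following sketch assumes the logs are JSON objects, one per line, with the `component` and `message` fields seen in the sample above; the function name is hypothetical.

```python
import json


def ecs_scan_messages(log_lines):
    """Yield the message of every JSON log line emitted by the ecs-action component."""
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if isinstance(record, dict) and record.get("component") == "ecs-action":
            yield record.get("message", "")


# The three sample lines shown above (raw string keeps the escaped quotes intact).
SAMPLE_LOGS = r'''
{"component":"ecs-action","message":"processing detection {\"account\":\"***\",\"region\":\"eu-west-3\",\"taskDefinition\":\"apache:1\"}. source=aws_cloudtrail"}
{"component":"ecs-action","message":"analyzing task 'apache:1' in region 'eu-west-3'"}
{"component":"ecs-action","message":"starting ECS scanning for container index 0 in task 'apache:1'"}
'''

for message in ecs_scan_messages(SAMPLE_LOGS.splitlines()):
    print(message)
```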



Troubleshooting

Q-General: Need to modify cloud-connector config (to troubleshoot with debug loglevel, modify ingestors for testing, ...)

A: In both the ECS and AppRunner workload types, the cloud-connector configuration is passed as a base64-encoded string through the CONFIG env var.
S: Get the current value, decode it, edit the desired setting (e.g. logging: debug), encode it again, and restart the workload with this new definition.
For information on all the modifiable configuration, see the Cloud-Connector Chart reference.
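
The decode, edit and encode steps can be sketched in Python as follows. This assumes the decoded configuration is plain text with a top-level `logging:` line, as the example above suggests; the function name and the sample configuration content are hypothetical, so check your real CONFIG value before relying on it.

```python
import base64


def set_log_level(encoded_config, level="debug"):
    """Decode a base64 CONFIG value, set the top-level `logging:` line, re-encode it."""
    text = base64.b64decode(encoded_config).decode("utf-8")
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if line.startswith("logging:"):
            lines[index] = f"logging: {level}"
            break
    else:
        # No top-level logging line found: add one at the top.
        lines.insert(0, f"logging: {level}")
    updated = "\n".join(lines) + "\n"
    return base64.b64encode(updated.encode("utf-8")).decode("ascii")


# Hypothetical configuration content, for illustration only.
current = base64.b64encode(b"logging: info\nrules: []\n").decode("ascii")
updated = set_log_level(current)
print(base64.b64decode(updated).decode("utf-8"))
```

The returned string is what you would set as the new CONFIG value before restarting the workload.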

Q-General: Getting error "Error: cannot verify credentials" on "sysdig_secure_trusted_cloud_identity" data

A: This happens when Sysdig credentials are not working correctly.
S: Check that the sysdig provider block is correctly configured, with the right values for the sysdig_secure_url and sysdig_secure_api_token variables. Check the Sysdig SaaS per-region URLs if required.

Q-General: I'm not able to see Cloud Infrastructure Entitlements Management (CIEM) results

A: Make sure you installed both cloud-bench and cloud-connector modules

Q-General-Networking: What are the requirements for inbound/outbound connections?

A: Refer to the Sysdig SaaS Region and IP Ranges Documentation to get the Sysdig SaaS endpoint, and allow both outbound (for compute vulnerability report) and inbound (for scheduled compliance checkups) connections.
An ECS type deployment will create the following security-group setup

Q-Scanning: I'm not seeing any image scanning results

A: Several steps need to be checked.
S: First, image scanning is not activated by default. Ensure you have the required scanning enablers in place.
Currently, images are scanned on registry/repository push events, and on deployment to the supported compute services. Make sure these events are triggered.
Dig into the secure for cloud compute log (cloud-connector) and check for errors.
If the previous logs are OK, check the logs of the spawned scanning service.

Q-AWS-Scanning: Images pushed to Management Account ECR are not scanned

A: We don't scan images from the management account ECR because it is not a best practice to have an ECR in this account.
S: As a workaround, the following role has to be created in the management account

  • Role Name: OrganizationAccountAccessRole
  • Permissions Policies:
    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Sid": "CustomPolicy",
              "Effect": "Allow",
              "Action": "ecr:GetAuthorizationToken",
              "Resource": "*"
          }
      ]
    }
  • Trust Relationships:
    {
      "Version": "2012-10-17",
      "Statement": [
          {
              "Effect": "Allow",
              "Principal": {
                  "AWS": "arn:aws:iam::<<managementAccountID>>:root"
              },
              "Action": "sts:AssumeRole"
          }
      ]
    }

Q-AWS: Getting error "BadRequestException: Cannot create group: group already exists"

A: This happens when a previous installation of secure-for-cloud exists. On each account where Sysdig has to create resources, it creates a resource group named after the name variable (which defaults to sfc in the main examples).
S: Remove the previous installation or, if multiple setups are required, use the name variable to change the resource-group name.

Q-AWS: In the ECS compute flavor of secure for cloud, I don't see any logs in the cloud-connector component

A: This may be because the task is not able to start, normally because it lacks the permissions to even fetch the secure apiToken stored in the AWS SSM service.
S: Access the task and see if there is any value in the "Stopped Reason" field.

Q-AWS: Getting error "Error: failed creating ECS Task Definition: ClientException: No Fargate configuration exists for given values."

A: Your ECS task_size values aren't valid for Fargate. Specifically, your mem_limit value is too big for the cpu_limit you specified
S: Check supported task cpu and memory values
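
To catch this before running terraform apply, the CPU/memory pair can be checked against the Fargate task-size table. The table below reflects the combinations documented by AWS at the time of writing and may change, so treat the AWS documentation as the source of truth. The function name is hypothetical, and the mapping of cpu_limit to CPU units and mem_limit to MiB is an assumption about how the module passes these values.

```python
# CPU units -> allowed memory values in MiB (AWS Fargate task sizes).
FARGATE_MEMORY_MIB = {
    256: (512, 1024, 2048),
    512: tuple(range(1024, 4096 + 1, 1024)),        # 1 GB to 4 GB, 1 GB steps
    1024: tuple(range(2048, 8192 + 1, 1024)),       # 2 GB to 8 GB, 1 GB steps
    2048: tuple(range(4096, 16384 + 1, 1024)),      # 4 GB to 16 GB, 1 GB steps
    4096: tuple(range(8192, 30720 + 1, 1024)),      # 8 GB to 30 GB, 1 GB steps
    8192: tuple(range(16384, 61440 + 1, 4096)),     # 16 GB to 60 GB, 4 GB steps
    16384: tuple(range(32768, 122880 + 1, 8192)),   # 32 GB to 120 GB, 8 GB steps
}


def is_valid_fargate_size(cpu_units, memory_mib):
    """Return True when the CPU/memory pair is a supported Fargate task size."""
    return memory_mib in FARGATE_MEMORY_MIB.get(cpu_units, ())


print(is_valid_fargate_size(256, 512))   # supported pair
print(is_valid_fargate_size(256, 4096))  # memory too big for this CPU value
```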

Q-AWS: Getting error "400 Invalid parameter: TopicArn" when trying to reuse an existing cloudtrail-sns

│ Error: error creating SNS Topic Subscription: InvalidParameter: Invalid parameter: TopicArn
│ 	status code: 400, request id: 1fe94ceb-9f58-5d39-a4df-169f55d25eba
│
│   with module.cloudvision_aws_single_account.module.cloud_connector.module.cloud_connector_sqs.aws_sns_topic_subscription.this,
│   on ../../../modules/infrastructure/sqs-sns-subscription/main.tf line 6, in resource "aws_sns_topic_subscription" "this":
│    6: resource "aws_sns_topic_subscription" "this" {

A: In order to subscribe to an SNS topic, the SQS queue must be in the same region.
S: Change the aws provider region variable so that all resources are in the same region.
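
A quick way to spot the mismatch is to compare the region field of both ARNs, relying on the standard ARN layout `arn:partition:service:region:account-id:resource`. The helper names and the ARNs below are hypothetical.

```python
def arn_region(arn):
    """Extract the region field from an ARN."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return parts[3]


def same_region(topic_arn, queue_arn):
    """Return True when the SNS topic and the SQS queue live in the same region."""
    return arn_region(topic_arn) == arn_region(queue_arn)


# Hypothetical ARNs, for illustration only.
topic = "arn:aws:sns:eu-central-1:111111111111:example-cloudtrail-topic"
queue = "arn:aws:sqs:eu-west-3:111111111111:example-queue"
print(same_region(topic, queue))
```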

Q-AWS: Getting error "400 availabilityZoneId is invalid" when creating the ECS subnet

│ Error: error creating subnet: InvalidParameterValue: Value (apne1-az3) for parameter availabilityZoneId is invalid. Subnets can currently only be created in the following availability zones: apne1-az1, apne1-az2, apne1-az4.
│ 	status code: 400, request id: 6e32d757-2e61-4220-8106-22ccf814e1fe
│
│   with module.vpc.aws_subnet.public[1],
│   on .terraform/modules/vpc/main.tf line 376, in resource "aws_subnet" "public":
│  376: resource "aws_subnet" "public" {

A: For the ECS workload deployment, a VPC is created under the hood. Some AWS availability zones, such as 'apne1-az3' in the 'ap-northeast-1' region, do not support NAT gateways, which are activated by default.
S: Explicitly set the availability zones for the vpc module through the ecs_vpc_region_azs variable to work around the error until AWS adds support in your region.

Q-AWS: I get 400 api error AuthorizationHeaderMalformed on the Sysdig workload ECS Task

error while receiving the messages: error retrieving from S3 bucket=crit-start-trail: operation error S3: GetObject,
https response error StatusCode: 400, RequestID: ***, HostID: ***,
api error AuthorizationHeaderMalformed: The authorization header is malformed; a non-empty Access Key (AKID) must be provided in the credential."}

A: When the S3 bucket where cloudtrail events are stored is not in the same account as the one where the Cloud Connector workload is deployed, the assumeRole configuration is required. This error happens when the ECS TaskRole has no permission to assume this role.
S: Grant the sts:AssumeRole permission to the role used.

Q-AWS: Getting error 409 EntityAlreadyExists

A: You, or someone else in the same environment, have probably already deployed a resource with the sysdig terraform module, and a naming collision is happening.
S: If you want to maintain several versions, make use of the name input var of the examples.

Q-AWS-Datasources: I cannot see my account alias in the Data Sources > Cloud page

A: There are several possible causes.
Check that your aws account has an alias set up. It's not the same as the account name.

$ aws iam list-account-aliases

If that is fine, check that the deploy_benchmark flag is enabled on your account, so that the trust relationship between Sysdig and your cloud infrastructure is enabled. To validate the trust relationship, expect no errors from the following API call.

$ curl -v https://<SYSDIG_SECURE_ENDPOINT>/api/cloud/v2/accounts/<AWS_ACCOUNT_ID>/validateRole \
--header 'Authorization: Bearer <SYSDIG_SECURE_API_TOKEN>'
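
For scripted checks, the same request can be built with the Python standard library. This is only a sketch mirroring the curl call above: the function name and the placeholder values are hypothetical, and the request is built but not sent.

```python
import urllib.request


def build_validate_role_request(endpoint, account_id, api_token):
    """Build (without sending) the validateRole request shown in the curl call above."""
    url = f"https://{endpoint}/api/cloud/v2/accounts/{account_id}/validateRole"
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_token}"},
        method="GET",
    )


# Placeholder values, for illustration only.
request = build_validate_role_request("secure.example.invalid", "111111111111", "example-token")
print(request.get_method(), request.full_url)

# To actually send it, something like:
#   with urllib.request.urlopen(request) as response:
#       print(response.status)
```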



Upgrading

  • Uninstall previous deployment resources before upgrading
$ terraform destroy
  • Upgrade the full terraform example with
$ terraform init -upgrade
$ terraform plan
$ terraform apply
  • If required, you can upgrade the cloud-connector component by restarting the task (stop task). Because it's not pinned to a specific version, it will download the latest one.



Authors

Module is maintained and supported by Sysdig.

License

Apache 2 Licensed. See LICENSE for full details.
