This project demonstrates how to export AWS VPC Flow Logs to S3 and subsequently ingest them into ClickHouse Cloud. It includes Terraform configurations to set up the necessary AWS infrastructure and a traffic simulator for testing purposes.
## Prerequisites

- AWS CLI installed and configured with appropriate credentials
- Terraform v1.10.0 or later
- An AWS account with appropriate permissions
- A ClickHouse Cloud account (for log ingestion)
## Project Structure

```
.
├── main.tf               # Main Terraform configuration
├── variables.tf          # Variable definitions
├── outputs.tf            # Output definitions
├── ec2_log_simulator.tf  # EC2 instance for traffic simulation
└── .gitignore            # Git ignore file
```
## Getting Started

- Clone the repository:

  ```bash
  git clone https://github.com/ClickHouse/aws_vpc_logs_demo.git
  cd aws_vpc_logs_demo
  ```

- Initialize Terraform:

  ```bash
  terraform init
  ```

- Configure your AWS credentials:

  ```bash
  aws configure sso
  # Make sure the profile is named "sa", OR update the profile name in main.tf.
  # Then add the following to your Bash or Zsh profile to set the AWS_PROFILE
  # and AWS_CONFIG_FILE environment variables:
  export AWS_PROFILE=sa
  export AWS_CONFIG_FILE=$HOME/.aws/config
  ```

- Review and modify variables:
  - Copy `terraform.tfvars.example` to `terraform.tfvars` (if provided)
  - Adjust variables according to your needs

- Deploy the infrastructure:

  ```bash
  terraform plan   # Review the changes
  terraform apply  # Apply the changes
  ```
## Configuration

The project supports various deployment scenarios through variables:

- `deploy_vpc`: Create a new VPC (true/false)
- `deploy_s3`: Create a new S3 bucket (true/false)
- `deploy_flow_logs`: Enable VPC Flow Logs (true/false)
- `deploy_simulator`: Deploy EC2 traffic simulator (true/false)
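For example, a `terraform.tfvars` that provisions everything except the simulator could look like this (a sketch; only the four flag names come from this project, and your defaults may differ):

```hcl
# terraform.tfvars -- example values for the deployment flags described above.
deploy_vpc       = true  # create a new VPC
deploy_s3        = true  # create a new S3 bucket for the flow logs
deploy_flow_logs = true  # enable VPC Flow Logs delivery to S3
deploy_simulator = false # skip the EC2 traffic simulator
```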
## Features

### VPC Flow Logs

- Captures network traffic in your VPC
- Configurable aggregation intervals
- Logs stored in S3 bucket
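A minimal sketch of what such a flow log resource can look like in Terraform (the resource names and referenced VPC/bucket are illustrative, not copied from `main.tf`):

```hcl
# Illustrative sketch: deliver VPC Flow Logs to S3 with a 1-minute aggregation interval.
resource "aws_flow_log" "this" {
  vpc_id                   = aws_vpc.main.id              # assumes a VPC resource named "main"
  log_destination          = aws_s3_bucket.flow_logs.arn  # assumes a bucket resource named "flow_logs"
  log_destination_type     = "s3"
  traffic_type             = "ALL"
  max_aggregation_interval = 60 # seconds; AWS accepts 60 or 600
}
```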
### S3 Bucket

- Secure storage for VPC Flow Logs
- Versioning enabled
- Configurable public/private access
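A sketch of a versioned bucket with public access blocked, using the current AWS provider's split resources (names are illustrative; the project's actual configuration may differ):

```hcl
# Illustrative sketch: private, versioned S3 bucket for flow logs.
resource "aws_s3_bucket" "flow_logs" {
  bucket_prefix = "vpc-flow-logs-"
}

resource "aws_s3_bucket_versioning" "flow_logs" {
  bucket = aws_s3_bucket.flow_logs.id
  versioning_configuration {
    status = "Enabled"
  }
}

resource "aws_s3_bucket_public_access_block" "flow_logs" {
  bucket                  = aws_s3_bucket.flow_logs.id
  block_public_acls       = true
  block_public_policy     = true
  ignore_public_acls      = true
  restrict_public_buckets = true
}
```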
### EC2 Traffic Simulator

- Generates sample network traffic
- Runs on Amazon Linux 2
- Automatically sends HTTP requests to generate flow logs
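The traffic generation can be as simple as a `user_data` loop of HTTP requests; this is a sketch, not the actual contents of `ec2_log_simulator.tf`:

```hcl
# Illustrative sketch: Amazon Linux 2 instance that curls a few endpoints in a
# loop so the VPC produces flow log records.
resource "aws_instance" "simulator" {
  ami           = data.aws_ami.amazon_linux_2.id # assumes an AMI data source is defined
  instance_type = "t3.micro"

  user_data = <<-EOF
    #!/bin/bash
    while true; do
      curl -s -o /dev/null https://example.com
      curl -s -o /dev/null https://aws.amazon.com
      sleep 10
    done
  EOF
}
```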
## Development

- Create a new branch:

  ```bash
  git checkout -b feature/your-feature-name
  ```

- Make your changes to the Terraform configurations
- Test your changes:

  ```bash
  terraform fmt       # Format the code
  terraform validate  # Validate the configuration
  terraform plan      # Review changes
  ```

- Commit your changes:

  ```bash
  git add .
  git commit -m "Description of your changes"
  ```
### Best Practices

- Always format your Terraform code using `terraform fmt`
- Use meaningful variable names and descriptions
- Add tags to resources for better organization
- Keep sensitive information in variables and never commit it
## ClickHouse Integration

After setting up the infrastructure:

- Configure ClickHouse Cloud to read from the S3 bucket
- Set up the appropriate table schema for VPC Flow Logs
- Create a data pipeline to continuously ingest logs

(Detailed ClickHouse integration steps to be added.)
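Until those steps are written, here is one hedged sketch of what a one-shot import could look like using ClickHouse's `s3` table function. The bucket URL, credentials, and table name are placeholders, and it assumes the default (version 2) space-separated flow log format with a header line per file:

```sql
-- Sketch only: table for default-format VPC Flow Log records.
CREATE TABLE vpc_flow_logs
(
    version      UInt8,
    account_id   String,
    interface_id String,
    srcaddr      String,
    dstaddr      String,
    srcport      UInt16,
    dstport      UInt16,
    protocol     UInt8,
    packets      UInt64,
    bytes        UInt64,
    start_time   DateTime,
    end_time     DateTime,
    action       LowCardinality(String),
    log_status   LowCardinality(String)
)
ENGINE = MergeTree
ORDER BY (start_time, srcaddr, dstaddr);

-- One-shot import from S3; replace the placeholders with your bucket and keys.
INSERT INTO vpc_flow_logs
SELECT *
FROM s3(
    'https://<your-bucket>.s3.amazonaws.com/AWSLogs/**/*.log.gz',
    '<aws_access_key_id>', '<aws_secret_access_key>',
    'CSV',
    'version UInt8, account_id String, interface_id String, srcaddr String, dstaddr String, srcport UInt16, dstport UInt16, protocol UInt8, packets UInt64, bytes UInt64, start_time DateTime, end_time DateTime, action String, log_status String'
)
SETTINGS format_csv_delimiter = ' ', input_format_csv_skip_first_lines = 1;
```

Note that NODATA/SKIPDATA records carry `-` in the numeric fields, so a production pipeline would likely read those columns as `String` and cast, and would use a continuous mechanism (e.g. ClickPipes) rather than a one-shot `INSERT ... SELECT`.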
## Cleanup

To destroy the infrastructure:

```bash
terraform destroy
```
## Contributing

- Fork the repository
- Create a feature branch
- Commit your changes
- Push to the branch
- Create a Pull Request
## Security Considerations

- The EC2 simulator allows SSH access from any IP (0.0.0.0/0) - modify for production
- S3 bucket public access is configurable - ensure appropriate settings for your use case
- Review and adjust IAM permissions as needed
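As a sketch of the first point, the simulator's SSH ingress could be tightened to a trusted CIDR. The resource and variable names below are illustrative, not necessarily those used in `ec2_log_simulator.tf`:

```hcl
# Illustrative only: restrict SSH to a trusted CIDR instead of 0.0.0.0/0.
variable "ssh_allowed_cidr" {
  description = "CIDR block allowed to SSH into the simulator (e.g. your own IP as a /32)"
  type        = string
  default     = "203.0.113.10/32" # TEST-NET-3 placeholder; replace with your IP
}

resource "aws_security_group" "simulator_ssh" {
  name_prefix = "simulator-ssh-"

  ingress {
    description = "SSH from trusted CIDR only"
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = [var.ssh_allowed_cidr]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}
```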
## Support

Create a new issue in the repository!
## TODO

- Add ClickHouse Integration Steps
- Add Grafana Dashboard
- Clean up the Terraform code