From 29b9997402ed81c485919c0ea7422e160360c184 Mon Sep 17 00:00:00 2001
From: Jeff Leach
Date: Tue, 3 Jun 2025 13:17:26 -0500
Subject: [PATCH 01/18] Update to SS Proxy

---
 terraform-ss-proxy/README.md                  | 144 ++++++++++
 .../example_venue_project_config.conf         |  24 ++
 terraform-ss-proxy/install.sh                 | 175 ++++++++++++
 terraform-ss-proxy/main.tf                    | 224 +++++++++++++++
 terraform-ss-proxy/reload-apache.cgi          |  47 ++++
 terraform-ss-proxy/sync_apache_config.sh      |  83 ------
 terraform-ss-proxy/trigger_reload.js          | 259 ++++++++++++++++++
 terraform-ss-proxy/unity-cs-main.conf         |  85 ++++++
 8 files changed, 958 insertions(+), 83 deletions(-)
 create mode 100644 terraform-ss-proxy/README.md
 create mode 100644 terraform-ss-proxy/example_venue_project_config.conf
 create mode 100755 terraform-ss-proxy/install.sh
 create mode 100644 terraform-ss-proxy/main.tf
 create mode 100644 terraform-ss-proxy/reload-apache.cgi
 delete mode 100755 terraform-ss-proxy/sync_apache_config.sh
 create mode 100644 terraform-ss-proxy/trigger_reload.js
 create mode 100644 terraform-ss-proxy/unity-cs-main.conf

diff --git a/terraform-ss-proxy/README.md b/terraform-ss-proxy/README.md
new file mode 100644
index 00000000..9ba48e01
--- /dev/null
+++ b/terraform-ss-proxy/README.md
@@ -0,0 +1,144 @@
+# Shared Services Proxy Instructions
+
+## Overview
+
+The shared services proxy runs as an EC2 instance with Apache2 as a service. The configuration is maintained as files in an S3 bucket; adding, modifying, or deleting a file automatically triggers a server configuration update.
+
+## EC2 Instance Setup
+
+### Instance Requirements
+
+Create an EC2 instance with the following specifications:
+- **Instance Type**: `t2.large`
+- **Storage**: 12 GB
+- **IAM Role**: `U-CS_Service_Role`
+- **AMI**: Use the standard Ubuntu AMI as documented in the SSM Parameters
+- **Security Groups**: Configure to allow HTTPS traffic on port 4443
+
+### IAM Permissions Required
+
+The EC2 instance's IAM role (`U-CS_Service_Role`) must have the following AWS permissions to run the Terraform infrastructure setup:
+
+**Required AWS Service Permissions:**
+- **SQS**: Create and manage FIFO queues (`sqs:CreateQueue`, `sqs:SetQueueAttributes`, `sqs:TagQueue`)
+- **Lambda**: Create functions and manage configurations (`lambda:CreateFunction`, `lambda:UpdateFunctionCode`, `lambda:UpdateFunctionConfiguration`, `lambda:AddPermission`, `lambda:CreateEventSourceMapping`)
+- **IAM**: Create and manage Lambda execution roles (`iam:CreateRole`, `iam:AttachRolePolicy`, `iam:PutRolePolicy`, `iam:PassRole`)
+- **S3**: Configure bucket notifications (`s3:PutBucketNotification`, `s3:GetBucketNotification`)
+- **CloudWatch**: Create log groups (`logs:CreateLogGroup`, `logs:PutRetentionPolicy`)
+
+**Note**: All IAM roles created by this script will use the specified permission boundary ARN to ensure compliance with organizational policies.
+
+### Prerequisites
+
+Before setting up the server, several elements need to be deployed in the SS venue:
+- A Route53 entry for a dual-stack ALB which points to the EC2 instance
+- The S3 bucket for configuration files
+- Proper security groups and networking configuration
+
+## Installation
+
+### Quick Start
+
+1. SSH into your EC2 instance
+2. Clone this repository
+3. Navigate to the `terraform-ss-proxy` directory
+4. 
Run the installation script: + +```bash +./install.sh +``` + +### Configuration Variables + +The install script uses the following default variables (can be overridden via environment variables): + +```bash +S3_BUCKET_NAME="unity-cs-config-bucket" +PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI-APIG" +AWS_REGION="us-west-2" +APACHE_HOST="www.dev.mdps.mcp.nasa.gov" +APACHE_PORT="4443" +DEBOUNCE_DELAY="30" +``` + +To override defaults, export environment variables before running the script: + +```bash +export S3_BUCKET_NAME="my-custom-bucket" +export AWS_REGION="us-east-1" +./install.sh +``` + +## Terraform Infrastructure + +The installation script automatically sets up the required AWS infrastructure using Terraform: + +### What Gets Created + +1. **SQS FIFO Queue**: `apache-config-reload.fifo` + - Used for debouncing multiple configuration changes + - Content-based deduplication enabled + - 14-day message retention + +2. **Lambda Function**: `apache-config-reload-trigger` + - Handles both S3 events and SQS message processing + - Implements debouncing logic to prevent rapid successive reloads + - Makes HTTPS calls to Apache reload endpoint + +3. **IAM Role**: `apache-reload-lambda-role` + - Includes required permissions for SQS, CloudWatch Logs + - Uses the specified permission boundary ARN + +4. **S3 Bucket Notifications** + - Triggers Lambda on `.conf` file changes + - Monitors `ObjectCreated:*` and `ObjectRemoved:*` events + +5. **SQS Event Source Mapping** + - Connects SQS queue to Lambda function + - Batch size of 1 to maintain FIFO ordering + +### Terraform State + +The Terraform configuration uses a local backend and stores state in `terraform.tfstate` in the same directory. + +### Manual Terraform Operations + +If you need to run Terraform commands manually: + +```bash +cd terraform-ss-proxy + +# Initialize +terraform init + +# Plan with custom variables +terraform plan \ + -var="s3_bucket_name=my-bucket" \ + -var="permission_boundary_arn=arn:aws:iam::123456789012:policy/MyBoundary" + +# Apply +terraform apply + +# Destroy (when needed) +terraform destroy +``` + +## Configuration Management + +### File Upload Requirements + +- Files uploaded to the S3 bucket **must have a `.conf` extension** +- Only `.conf` files will trigger configuration reloads +- Files are automatically synced to `/etc/apache2/venues.d/` on the server + +### Reload Process + +1. **S3 Event**: `.conf` file uploaded/modified/deleted +2. **Lambda Trigger**: S3 event triggers Lambda function +3. **SQS Queuing**: Lambda sends message to FIFO queue for debouncing +4. **Processing**: Lambda processes SQS message after debounce delay +5. **Apache Reload**: Lambda makes HTTPS call to reload Apache configuration + +### Debouncing + +The system implements a configurable debounce delay (default: 30 seconds) to prevent excessive Apache reloads when multiple configuration files are changed rapidly. 
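+
+For example, once the stack is up, a single upload is enough to exercise the whole pipeline end to end (the bucket name is the default from above; `my-venue.conf` is just an illustrative file name):
+
+```bash
+# Hypothetical venue config upload; any .conf object change works
+aws s3 cp my-venue.conf s3://unity-cs-config-bucket/my-venue.conf
+
+# Deleting it also triggers a (debounced) reload
+aws s3 rm s3://unity-cs-config-bucket/my-venue.conf
+```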
\ No newline at end of file
diff --git a/terraform-ss-proxy/example_venue_project_config.conf b/terraform-ss-proxy/example_venue_project_config.conf
new file mode 100644
index 00000000..cde07c38
--- /dev/null
+++ b/terraform-ss-proxy/example_venue_project_config.conf
@@ -0,0 +1,24 @@
+# Local variables for this venue
+Define VENUE_ALB_HOST your_alb_for_venue_http_proxy
+Define VENUE_ALB_PORT 8080
+Define VENUE_ALB_PATH your_path
+
+# WebSocket upgrade handling
+RewriteCond %{HTTP:Connection} Upgrade [NC]
+RewriteCond %{HTTP:Upgrade} websocket [NC]
+RewriteCond %{REQUEST_URI} "${VENUE_ALB_PATH}"
+RewriteRule ${VENUE_ALB_PATH}(.*) ws://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENUE_ALB_PATH}$1 [P,L]
+
+# Location block for this venue
+<Location "${VENUE_ALB_PATH}">
+    AuthType openid-connect
+    Require valid-user
+    ProxyPass "http://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENUE_ALB_PATH}"
+    ProxyPassReverse "http://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENUE_ALB_PATH}"
+    RequestHeader set "X-Forwarded-Host" "www.dev.mdps.mcp.nasa.gov:4443"
+</Location>
+
+# Clean up
+UnDefine VENUE_ALB_HOST
+UnDefine VENUE_ALB_PORT
+UnDefine VENUE_ALB_PATH
\ No newline at end of file
diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh
new file mode 100755
index 00000000..6e029362
--- /dev/null
+++ b/terraform-ss-proxy/install.sh
@@ -0,0 +1,175 @@
+#!/bin/bash
+
+# Parse command line arguments
+TERRAFORM_ONLY=false
+while [[ $# -gt 0 ]]; do
+    case $1 in
+        --terraform-only)
+            TERRAFORM_ONLY=true
+            shift
+            ;;
+        *)
+            echo "Unknown option $1"
+            echo "Usage: $0 [--terraform-only]"
+            echo "  --terraform-only    Only run Terraform infrastructure setup"
+            exit 1
+            ;;
+    esac
+done
+
+# Default configuration variables
+S3_BUCKET_NAME="ucs-shared-services-apache-config-dev-test"
+PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI-APIG"
+AWS_REGION="us-west-2"
+APACHE_HOST="www.dev.mdps.mcp.nasa.gov"
+APACHE_PORT="4443"
+DEBOUNCE_DELAY="30"
+OIDC_CLIENT_ID="ee3duo3i707h93vki01ivja8o"
+COGNITO_USER_POOL_ID="us-west-2_yaOw3yj0z"
+
+echo "Using configuration:"
+echo "  S3_BUCKET_NAME: $S3_BUCKET_NAME"
+echo "  PERMISSION_BOUNDARY_ARN: $PERMISSION_BOUNDARY_ARN"
+echo "  AWS_REGION: $AWS_REGION"
+echo "  APACHE_HOST: $APACHE_HOST"
+echo "  APACHE_PORT: $APACHE_PORT"
+echo "  DEBOUNCE_DELAY: $DEBOUNCE_DELAY"
+echo ""
+
+if [ "$TERRAFORM_ONLY" = true ]; then
+    echo "Running in Terraform-only mode..."
+else
+    # Check if Terraform is installed
+    if ! command -v terraform &> /dev/null; then
+        echo "Terraform not found. Installing Terraform..."
+
+        # Linux
+        wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
+        echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
+        sudo apt update && sudo apt install terraform
+    else
+        echo "Terraform is already installed: $(terraform version)"
+    fi
+
+    # Install Apache2
+    sudo apt-get update
+    sudo apt-get install -y apache2
+fi
+
+if [ "$TERRAFORM_ONLY" = false ]; then
+    # Enable Apache2 modules
+    sudo a2enmod rewrite cgid ratelimit ssl headers
+
+    # Prepare ratelimit path
+    sudo mkdir -p /var/lib/apache2/ratelimit
+    sudo chown www-data:www-data /var/lib/apache2/ratelimit
+
+    # Add Config
+    sudo cp unity-cs-main.conf /etc/apache2/sites-enabled/unity-cs-main.conf
+
+    # Update with system parameters
+    # First, look up the client secret
+    echo "Looking up Cognito client secret..."
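+    # Note: this lookup needs the cognito-idp:DescribeUserPoolClient permission
+    # on the instance role, and the secret comes back in plaintext, so it is
+    # substituted into the Apache config below rather than printed.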
+    CLIENT_SECRET=$(aws cognito-idp describe-user-pool-client \
+        --user-pool-id ${COGNITO_USER_POOL_ID} \
+        --client-id ${OIDC_CLIENT_ID} \
+        --region ${AWS_REGION} \
+        --query 'UserPoolClient.ClientSecret' \
+        --output text)
+
+    if [ -z "$CLIENT_SECRET" ]; then
+        echo "Error: Could not retrieve Cognito client secret. Check your credentials and permissions."
+        exit 1
+    else
+        echo "Successfully retrieved Cognito client secret."
+    fi
+
+    # Update the configuration with the pool ID and client secret
+    sudo sed -i "s/\${COGNITO_POOL_ID}/${COGNITO_USER_POOL_ID}/" /etc/apache2/sites-enabled/unity-cs-main.conf
+    sudo sed -i "s/\${OIDC_CLIENT_ID}/${OIDC_CLIENT_ID}/" /etc/apache2/sites-enabled/unity-cs-main.conf
+    sudo sed -i "s/\${OIDC_CLIENT_SECRET}/${CLIENT_SECRET}/" /etc/apache2/sites-enabled/unity-cs-main.conf
+
+    # Remove the default site
+    sudo rm /etc/apache2/sites-enabled/000-default.conf
+
+    # Generate a self-signed certificate and key with predefined values to avoid prompts
+    sudo mkdir -p /etc/ssl/certs/
+    sudo mkdir -p /etc/ssl/private/
+    sudo openssl req -x509 -nodes -days 3650 -newkey rsa:2048 \
+        -keyout /etc/ssl/private/apache-selfsigned.key \
+        -out /etc/ssl/certs/apache-selfsigned.crt \
+        -subj "/C=US/ST=CA/L=Pasadena/O=MDPS/OU=IT/CN=localhost"
+
+    # Set proper permissions
+    echo "Setting permissions..."
+    sudo chmod 600 /etc/ssl/private/apache-selfsigned.key
+    sudo chmod 644 /etc/ssl/certs/apache-selfsigned.crt
+
+    # Remove the default index.html
+    sudo rm /var/www/html/index.html
+
+    # Prepare CGI Script
+    sudo mkdir -p /etc/apache2/venues.d/
+    sudo chown www-data:www-data /etc/apache2/venues.d/
+    sudo cp reload-apache.cgi /usr/lib/cgi-bin/reload-apache.cgi
+    sudo chown www-data:www-data /usr/lib/cgi-bin/reload-apache.cgi
+    sudo chmod 755 /usr/lib/cgi-bin/reload-apache.cgi
+
+    # Update sudoers
+    # Using tee to append to sudoers to handle permissions, and ensuring lines are added only once.
+    if ! sudo grep -q "www-data ALL=(ALL) NOPASSWD: /usr/sbin/apachectl configtest" /etc/sudoers; then
+        echo "www-data ALL=(ALL) NOPASSWD: /usr/sbin/apachectl configtest" | sudo tee -a /etc/sudoers
+    fi
+
+    if ! sudo grep -q "www-data ALL=(ALL) NOPASSWD: /usr/sbin/apachectl graceful" /etc/sudoers; then
+        echo "www-data ALL=(ALL) NOPASSWD: /usr/sbin/apachectl graceful" | sudo tee -a /etc/sudoers
+    fi
+
+    echo "Apache installation complete."
+fi
+
+# Generate a random token for Lambda (needed for both modes)
+if [ "$TERRAFORM_ONLY" = false ]; then
+    SECURE_TOKEN=$(openssl rand -hex 16)
+    echo "Generated secure token: ${SECURE_TOKEN}"
+    echo "This token will be used for Lambda authentication."
+
+    # Update the unity-cs-main.conf file with the new token
+    # The sed command uses a different delimiter (#) to avoid issues if the token contains slashes.
+    sudo sed -i "s#your_secure_token_here#${SECURE_TOKEN}#g" /etc/apache2/sites-enabled/unity-cs-main.conf
+    echo "Apache configuration updated with secure token."
+else
+    # Pull the token from /etc/apache2/sites-enabled/unity-cs-main.conf if it exists
+    if [ -f "/etc/apache2/sites-enabled/unity-cs-main.conf" ]; then
+        SECURE_TOKEN=$(sudo grep -oP "X-Reload-Token.*?'\K[^']+" /etc/apache2/sites-enabled/unity-cs-main.conf 2>/dev/null || echo "")
+        if [ -z "$SECURE_TOKEN" ]; then
+            echo "Error: Could not extract token from existing Apache config."
+            exit 1
+        else
+            echo "Using existing token from Apache configuration: ${SECURE_TOKEN}"
+        fi
+    else
+        echo "Apache config not found, must run full install."
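+        # A full install is required first: Terraform needs the reload_token
+        # value, and that token only exists once the Apache config has been
+        # written with one.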
+ exit 1 + fi +fi + +# Run Terraform to create AWS infrastructure +echo "Setting up AWS infrastructure with Terraform..." +cd "$(dirname "$0")" + +# Initialize and apply Terraform +terraform init + +terraform apply -auto-approve \ + -var="s3_bucket_name=${S3_BUCKET_NAME}" \ + -var="permission_boundary_arn=${PERMISSION_BOUNDARY_ARN}" \ + -var="reload_token=${SECURE_TOKEN}" \ + -var="aws_region=${AWS_REGION}" \ + -var="apache_host=${APACHE_HOST}" \ + -var="apache_port=${APACHE_PORT}" \ + -var="debounce_delay=${DEBOUNCE_DELAY}" + +echo "AWS infrastructure setup complete!" +echo "Lambda function created and configured to monitor S3 bucket and process via SQS FIFO queue." diff --git a/terraform-ss-proxy/main.tf b/terraform-ss-proxy/main.tf new file mode 100644 index 00000000..feda57b5 --- /dev/null +++ b/terraform-ss-proxy/main.tf @@ -0,0 +1,224 @@ +terraform { + required_providers { + aws = { + source = "hashicorp/aws" + version = "~> 5.0" + } + } + required_version = ">= 1.0" + + # Local backend for state management + backend "local" { + path = "terraform.tfstate" + } +} + +provider "aws" { + region = var.aws_region +} + +# Variables +variable "s3_bucket_name" { + description = "Name of the S3 bucket to monitor for config changes" + type = string +} + +variable "permission_boundary_arn" { + description = "ARN of the permission boundary policy to apply to IAM roles" + type = string +} + +variable "reload_token" { + description = "Secure token for Apache reload authentication" + type = string + sensitive = true +} + +variable "aws_region" { + description = "AWS region" + type = string + default = "us-west-2" +} + +variable "apache_host" { + description = "Apache host for reload endpoint" + type = string + default = "www.dev.mdps.mcp.nasa.gov" +} + +variable "apache_port" { + description = "Apache port for reload endpoint" + type = string + default = "4443" +} + +variable "debounce_delay" { + description = "Debounce delay in seconds" + type = number + default = 30 +} + +# SQS FIFO Queue for debouncing +resource "aws_sqs_queue" "apache_reload_queue" { + name = "apache-config-reload.fifo" + fifo_queue = true + content_based_deduplication = true + + # Visibility timeout should be longer than Lambda timeout + visibility_timeout_seconds = 300 + + # Message retention + message_retention_seconds = 1209600 # 14 days + + tags = { + Name = "SS Proxy Config Reload Queue" + Purpose = "debounce-config-changes" + } +} + +# Lambda execution role +resource "aws_iam_role" "lambda_role" { + name = "unity-cs-proxy-reload-lambda-role" + permissions_boundary = var.permission_boundary_arn + + assume_role_policy = jsonencode({ + Version = "2012-10-17" + Statement = [ + { + Action = "sts:AssumeRole" + Effect = "Allow" + Principal = { + Service = "lambda.amazonaws.com" + } + } + ] + }) + + tags = { + Name = "SS Proxy Reload Lambda Role" + } +} + +# IAM policy for Lambda +resource "aws_iam_role_policy" "lambda_policy" { + name = "unity-cs-proxy-reload-lambda-policy" + role = aws_iam_role.lambda_role.id + + policy = jsonencode({ + Version = "2012-10-17" + Statement = [ + { + Effect = "Allow" + Action = [ + "logs:CreateLogGroup", + "logs:CreateLogStream", + "logs:PutLogEvents" + ] + Resource = "arn:aws:logs:*:*:*" + }, + { + Effect = "Allow" + Action = [ + "sqs:SendMessage", + "sqs:ReceiveMessage", + "sqs:DeleteMessage", + "sqs:GetQueueAttributes" + ] + Resource = aws_sqs_queue.apache_reload_queue.arn + } + ] + }) +} + +# Lambda function +resource "aws_lambda_function" "apache_reload_trigger" { + filename = 
"trigger_reload.zip" + function_name = "ss-proxy-config-reload-trigger" + role = aws_iam_role.lambda_role.arn + handler = "trigger_reload.handler" + runtime = "nodejs18.x" + timeout = 60 + + environment { + variables = { + APACHE_HOST = var.apache_host + APACHE_PORT = var.apache_port + RELOAD_TOKEN = var.reload_token + SQS_QUEUE_URL = aws_sqs_queue.apache_reload_queue.url + DEBOUNCE_DELAY = var.debounce_delay + } + } + + depends_on = [ + aws_iam_role_policy.lambda_policy, + aws_cloudwatch_log_group.lambda_logs, + ] + + tags = { + Name = "SS Proxy Config Reload Trigger" + } +} + +# CloudWatch Log Group for Lambda +resource "aws_cloudwatch_log_group" "lambda_logs" { + name = "/aws/lambda/apache-config-reload-trigger" + retention_in_days = 14 + + tags = { + Name = "SS Proxy Reload Lambda Logs" + } +} + +# S3 bucket notification +resource "aws_s3_bucket_notification" "bucket_notification" { + bucket = var.s3_bucket_name + + lambda_function { + lambda_function_arn = aws_lambda_function.apache_reload_trigger.arn + events = ["s3:ObjectCreated:*", "s3:ObjectRemoved:*"] + filter_suffix = ".conf" + } + + depends_on = [aws_lambda_permission.allow_s3] +} + +# Lambda permission for S3 +resource "aws_lambda_permission" "allow_s3" { + statement_id = "AllowExecutionFromS3Bucket" + action = "lambda:InvokeFunction" + function_name = aws_lambda_function.apache_reload_trigger.function_name + principal = "s3.amazonaws.com" + source_arn = "arn:aws:s3:::${var.s3_bucket_name}" +} + +# SQS event source mapping for Lambda +resource "aws_lambda_event_source_mapping" "sqs_trigger" { + event_source_arn = aws_sqs_queue.apache_reload_queue.arn + function_name = aws_lambda_function.apache_reload_trigger.arn + batch_size = 1 + enabled = true +} + +# Create the Lambda deployment package +data "archive_file" "lambda_zip" { + type = "zip" + source_file = "trigger_reload.js" + output_path = "trigger_reload.zip" +} + +# Outputs +output "sqs_queue_url" { + description = "URL of the SQS FIFO queue" + value = aws_sqs_queue.apache_reload_queue.url +} + +output "lambda_function_name" { + description = "Name of the Lambda function" + value = aws_lambda_function.apache_reload_trigger.function_name +} + + +output "lambda_function_arn" { + description = "ARN of the Lambda function" + value = aws_lambda_function.apache_reload_trigger.arn +} \ No newline at end of file diff --git a/terraform-ss-proxy/reload-apache.cgi b/terraform-ss-proxy/reload-apache.cgi new file mode 100644 index 00000000..3551a768 --- /dev/null +++ b/terraform-ss-proxy/reload-apache.cgi @@ -0,0 +1,47 @@ +#!/bin/bash + +# Get environment from SSM +export ENV_SSM_PARAM="/unity/account/venue" +ENVIRONMENT=$(aws ssm get-parameter --name ${ENV_SSM_PARAM} --query "Parameter.Value" --output text) + +# Set variables +S3_BUCKET="ucs-shared-services-apache-config-${ENVIRONMENT}" +LOCAL_FILE="/etc/apache2/sites-enabled/unity-cs.conf" +TEMP_FILE="/tmp/unity-cs.conf" +SLACK_WEBHOOK=$(aws ssm get-parameter --name "/unity/shared-services/slack/apache-config-webhook-url" --with-decryption --query "Parameter.Value" --output text) + +# Function to send message to Slack and exit +send_to_slack() { + local message="$1" + local exit_code="$2" + local env_prefix="[Unity-venue-${ENVIRONMENT}] " + curl --silent --output /dev/null -X POST -H 'Content-type: application/json' \ + --data "{\"text\":\"${env_prefix}${message}\"}" \ + "${SLACK_WEBHOOK}" +} + +# Download files from S3 +aws s3 sync s3://${S3_BUCKET}/ /etc/apache2/venues.d/ --exclude "*" --include "*.conf" --quiet + +# Test the 
config +echo "Content-type: application/json" +echo "" +CONFIG_TEST=$(sudo apache2ctl configtest 2>&1) +if [[ "$CONFIG_TEST" != *"Syntax OK"* ]]; then + send_to_slack "❌ Apache config sync failed: Failed Config Test" 1 + echo '{"status":"error","message":"Failed to validate config"}' +else + + # Log the request for auditing + echo "[$(date)] Apache config reload requested" >> /var/log/apache2/reload.log + + # Execute the graceful reload + RESULT=$(sudo /usr/sbin/apachectl graceful 2>&1) + SUCCESS=$? + + if [ $SUCCESS -eq 0 ]; then + echo '{"status":"success","message":"Apache configuration reloaded successfully"}' + else + echo '{"status":"error","message":"Failed to reload Apache configuration: '"$RESULT"'"}' + fi +fi \ No newline at end of file diff --git a/terraform-ss-proxy/sync_apache_config.sh b/terraform-ss-proxy/sync_apache_config.sh deleted file mode 100755 index e06cbd2b..00000000 --- a/terraform-ss-proxy/sync_apache_config.sh +++ /dev/null @@ -1,83 +0,0 @@ -#!/bin/bash - -# Get environment from SSM -export ENV_SSM_PARAM="/unity/account/venue" -ENVIRONMENT=$(aws ssm get-parameter --name ${ENV_SSM_PARAM} --query "Parameter.Value" --output text) - -# Set variables -S3_BUCKET="ucs-shared-services-apache-config-${ENVIRONMENT}" -S3_FILE_PATH="unity-cs.conf" -LOCAL_FILE="/etc/apache2/sites-enabled/unity-cs.conf" -TEMP_FILE="/tmp/unity-cs.conf" -SLACK_WEBHOOK=$(aws ssm get-parameter --name "/unity/shared-services/slack/apache-config-webhook-url" --with-decryption --query "Parameter.Value" --output text) - -# Function to send message to Slack and exit -send_to_slack_and_exit() { - local message="$1" - local exit_code="$2" - local env_prefix="[Unity-venue-${ENVIRONMENT}] " - curl -X POST -H 'Content-type: application/json' \ - --data "{\"text\":\"${env_prefix}${message}\"}" \ - "${SLACK_WEBHOOK}" - exit "$exit_code" -} - -# Check if aws cli is installed -if ! command -v aws &> /dev/null; then - echo "AWS CLI is not installed. Please install it first." - send_to_slack_and_exit "❌ Apache config sync failed: AWS CLI not installed" 1 -fi - -# Download the file from S3 to a temp location -if ! aws s3 cp "s3://${S3_BUCKET}/${S3_FILE_PATH}" "${TEMP_FILE}"; then - echo "Failed to download configuration from S3" - send_to_slack_and_exit "❌ Apache config sync failed: Unable to download from S3" 1 -fi - -# Check if the local file exists -if [ ! -f "${LOCAL_FILE}" ]; then - echo "Local configuration file does not exist. Creating new one." - sudo mv "${TEMP_FILE}" "${LOCAL_FILE}" - sudo chown root:root "${LOCAL_FILE}" - sudo chmod 644 "${LOCAL_FILE}" - sudo systemctl reload apache2 - send_to_slack_and_exit "✅ New Apache configuration created and applied" 0 -fi - -# Compare the files -if diff "${TEMP_FILE}" "${LOCAL_FILE}" >/dev/null; then - echo "No changes detected in configuration" - rm "${TEMP_FILE}" - exit 0 -else - echo "Changes detected in configuration. Testing new config..." - - # Generate diff for potential notification - DIFF_OUTPUT=$(diff "${TEMP_FILE}" "${LOCAL_FILE}" | sed ':a;N;$!ba;s/\n/\\n/g' | sed 's/"/\\"/g') - - # Create a backup of the current config - BACKUP_FILE="${LOCAL_FILE}.backup" - sudo cp "${LOCAL_FILE}" "${BACKUP_FILE}" - - # Test the new configuration without moving it yet - sudo cp "${TEMP_FILE}" "${LOCAL_FILE}" - - if sudo apache2ctl configtest; then - echo "Apache configuration test passed. Applying changes..." 
- sudo chown root:root "${LOCAL_FILE}" - sudo chmod 644 "${LOCAL_FILE}" - sudo systemctl reload apache2 - echo "Apache configuration updated successfully" - sudo rm "${TEMP_FILE}" - sudo rm "${BACKUP_FILE}" - send_to_slack_and_exit "✅ Apache configuration updated successfully\nChanges made:\n${DIFF_OUTPUT}" 0 - else - echo "Apache configuration test failed. Reverting to original configuration..." - # Capture the apache config test error - CONFIG_TEST_ERROR=$(sudo apache2ctl configtest 2>&1) - sudo mv "${BACKUP_FILE}" "${LOCAL_FILE}" - rm "${TEMP_FILE}" - echo "Kept original configuration file" - send_to_slack_and_exit "❌ Apache configuration test failed. Original configuration kept.\n\nConfig Test Error:\n${CONFIG_TEST_ERROR}\n\nAttempted Changes:\n${DIFF_OUTPUT}" 1 - fi -fi diff --git a/terraform-ss-proxy/trigger_reload.js b/terraform-ss-proxy/trigger_reload.js new file mode 100644 index 00000000..481ce077 --- /dev/null +++ b/terraform-ss-proxy/trigger_reload.js @@ -0,0 +1,259 @@ +const https = require('https'); +const { SQSClient, SendMessageCommand, GetQueueAttributesCommand } = require('@aws-sdk/client-sqs'); + +const sqs = new SQSClient({}); + +// Configuration - set these as environment variables in Lambda +const APACHE_HOST = process.env.APACHE_HOST || 'www.dev.mdps.mcp.nasa.gov'; +const APACHE_PORT = process.env.APACHE_PORT || '4443'; +const RELOAD_TOKEN = process.env.RELOAD_TOKEN; +const RELOAD_PATH = '/reload-config'; +const SQS_QUEUE_URL = process.env.SQS_QUEUE_URL; +const DEBOUNCE_DELAY = parseInt(process.env.DEBOUNCE_DELAY) || 30; // seconds + +exports.handler = async (event) => { + console.log('Lambda triggered by event:', JSON.stringify(event, null, 2)); + + // Validate required configuration + if (!RELOAD_TOKEN) { + console.error('RELOAD_TOKEN environment variable is required'); + return { + statusCode: 500, + body: JSON.stringify({ error: 'Missing RELOAD_TOKEN configuration' }) + }; + } + + try { + // Check if this is an S3 event or SQS event + if (event.Records && event.Records[0].eventSource === 'aws:s3') { + return await handleS3Event(event); + } else if (event.Records && event.Records[0].eventSource === 'aws:sqs') { + return await handleSQSEvent(event); + } else { + console.log('Unknown event source'); + return { + statusCode: 400, + body: JSON.stringify({ error: 'Unknown event source' }) + }; + } + } catch (error) { + console.error('Error processing event:', error); + return { + statusCode: 500, + body: JSON.stringify({ + error: 'Failed to process event', + details: error.message + }) + }; + } +}; + +async function handleS3Event(event) { + console.log('Processing S3 event'); + + if (!SQS_QUEUE_URL) { + console.error('SQS_QUEUE_URL environment variable is required for S3 events'); + return { + statusCode: 500, + body: JSON.stringify({ error: 'Missing SQS_QUEUE_URL configuration' }) + }; + } + + // Process S3 event records + const s3Events = event.Records || []; + const relevantEvents = s3Events.filter(record => { + const eventName = record.eventName; + const objectKey = record.s3?.object?.key || ''; + + // Only process PUT/POST/DELETE events for .conf files + return (eventName.startsWith('ObjectCreated') || + eventName.startsWith('ObjectRemoved')) && + objectKey.endsWith('.conf'); + }); + + if (relevantEvents.length === 0) { + console.log('No relevant S3 events found (looking for .conf file changes)'); + return { + statusCode: 200, + body: JSON.stringify({ message: 'No config file changes detected' }) + }; + } + + console.log(`Found ${relevantEvents.length} relevant 
config file changes`); + + // Send a generic message to SQS to trigger the debounced reload + // Use a fixed message body for content-based deduplication + const sqsParams = { + QueueUrl: SQS_QUEUE_URL, + MessageBody: 'S3 Config Changed', // Generic message for deduplication + MessageGroupId: 'apache-config-reload', // For FIFO queue + DelaySeconds: 0 // Send immediately to SQS + }; + + try { + const command = new SendMessageCommand(sqsParams); + const result = await sqs.send(command); + console.log('Message sent to SQS:', result.MessageId); + + return { + statusCode: 200, + body: JSON.stringify({ + message: 'Config change notification sent to SQS', + messageId: result.MessageId, + eventsProcessed: relevantEvents.length + }) + }; + } catch (sqsError) { + console.error('Failed to send SQS message:', sqsError); + throw sqsError; + } +} + +async function handleSQSEvent(event) { + console.log('Processing SQS event'); + + // Process all SQS messages (though typically there should be only one per invocation) + const results = []; + + for (const record of event.Records) { + try { + const messageBody = JSON.parse(record.body); + console.log('Processing SQS message:', messageBody); + + // Implement debouncing by checking if there are newer messages in the queue + const shouldProceed = await checkIfShouldProceed(); + + if (!shouldProceed) { + console.log('Skipping reload - newer messages detected in queue'); + results.push({ + messageId: record.messageId, + status: 'skipped', + reason: 'newer_messages_pending' + }); + continue; + } + + // Wait for the debounce period to allow any in-flight messages to arrive + console.log(`Waiting ${DEBOUNCE_DELAY} seconds for additional changes...`); + await new Promise(resolve => setTimeout(resolve, DEBOUNCE_DELAY * 1000)); + + // Check again after waiting + const shouldProceedAfterWait = await checkIfShouldProceed(); + + if (!shouldProceedAfterWait) { + console.log('Skipping reload after wait - newer messages detected'); + results.push({ + messageId: record.messageId, + status: 'skipped', + reason: 'newer_messages_after_wait' + }); + continue; + } + + // Proceed with Apache reload + const reloadResult = await makeReloadRequest(); + console.log('Apache reload completed:', reloadResult); + + results.push({ + messageId: record.messageId, + status: 'completed', + reloadResult: reloadResult + }); + + } catch (messageError) { + console.error('Error processing SQS message:', messageError); + results.push({ + messageId: record.messageId, + status: 'error', + error: messageError.message + }); + } + } + + return { + statusCode: 200, + body: JSON.stringify({ + message: 'SQS messages processed', + results: results + }) + }; +} + +async function checkIfShouldProceed() { + if (!SQS_QUEUE_URL) { + return true; // If no SQS configured, proceed + } + + try { + // Check if there are messages in the queue + const command = new GetQueueAttributesCommand({ + QueueUrl: SQS_QUEUE_URL, + AttributeNames: ['ApproximateNumberOfMessages', 'ApproximateNumberOfMessagesNotVisible'] + }); + const queueAttributes = await sqs.send(command); + + const visibleMessages = parseInt(queueAttributes.Attributes.ApproximateNumberOfMessages) || 0; + const invisibleMessages = parseInt(queueAttributes.Attributes.ApproximateNumberOfMessagesNotVisible) || 0; + + console.log(`Queue status - Visible: ${visibleMessages}, In-flight: ${invisibleMessages}`); + + // If there are other visible messages, don't proceed (let the newer message handle it) + return visibleMessages === 0; + + } catch (error) { + 
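+        // Failing open here (returning true below) means a reload may fire while
+        // other messages are in flight, but it ensures a transient SQS error
+        // cannot permanently block config reloads.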
console.error('Error checking queue status:', error);
+        // If we can't check the queue, proceed to be safe
+        return true;
+    }
+}
+
+function makeReloadRequest() {
+    return new Promise((resolve, reject) => {
+        const options = {
+            hostname: APACHE_HOST,
+            port: APACHE_PORT,
+            path: RELOAD_PATH,
+            method: 'GET',
+            headers: {
+                'X-Reload-Token': RELOAD_TOKEN,
+                'User-Agent': 'AWS-Lambda-S3-Trigger/1.0'
+            },
+            // For self-signed certificates or testing
+            rejectUnauthorized: process.env.NODE_TLS_REJECT_UNAUTHORIZED !== '0'
+        };
+
+        const req = https.request(options, (res) => {
+            let data = '';
+
+            res.on('data', (chunk) => {
+                data += chunk;
+            });
+
+            res.on('end', () => {
+                console.log(`Apache reload response (${res.statusCode}):`, data);
+
+                if (res.statusCode >= 200 && res.statusCode < 300) {
+                    resolve({
+                        statusCode: res.statusCode,
+                        response: data
+                    });
+                } else {
+                    reject(new Error(`HTTP ${res.statusCode}: ${data}`));
+                }
+            });
+        });
+
+        req.on('error', (error) => {
+            console.error('Request error:', error);
+            reject(error);
+        });
+
+        // Set timeout
+        req.setTimeout(30000, () => {
+            req.destroy();
+            reject(new Error('Request timeout'));
+        });
+
+        req.end();
+    });
+}
\ No newline at end of file
diff --git a/terraform-ss-proxy/unity-cs-main.conf b/terraform-ss-proxy/unity-cs-main.conf
new file mode 100644
index 00000000..fd531e4d
--- /dev/null
+++ b/terraform-ss-proxy/unity-cs-main.conf
@@ -0,0 +1,85 @@
+
+    # Update these with the proper values for the SS Account
+    Define PORT_NUM 8443
+    Define UNITY_SERVER_NAME unity-shared-services-httpd-server-dev
+    Define UNITY_COGNITO_USER_POOL_ID ${COGNITO_POOL_ID}
+    Define UNITY_OIDC_CLIENT_ID ${OIDC_CLIENT_ID}
+    Define UNITY_OIDC_CLIENT_SECRET ${OIDC_CLIENT_SECRET}
+    Define UNITY_OIDC_REDIRECT_URI https://www.dev.mdps.mcp.nasa.gov:${PORT_NUM}/unity/redirect-url
+    Define UNITY_OIDC_OIDC_CRYPTO_PASSPHRASE https://www.dev.mdps.mcp.nasa.gov:${PORT_NUM}/unity/redirect-url
+
+    ServerName ${UNITY_SERVER_NAME}
+    ServerAlias ${UNITY_SERVER_NAME}
+    ServerAdmin postmaster@unity.httpd
+    ErrorLog ${APACHE_LOG_DIR}/error.log
+    CustomLog ${APACHE_LOG_DIR}/access.log combined
+    SSLProxyEngine On
+    SSLCertificateFile /etc/ssl/certs/apache-selfsigned.crt
+    SSLCertificateKeyFile /etc/ssl/private/apache-selfsigned.key
+    SSLProxyCheckPeerCN off
+    SSLProxyCheckPeerExpire on
+    SSLProxyCheckPeerName off
+    ProxyRequests Off
+
+    OIDCScope "openid email profile"
+    OIDCProviderMetadataURL https://cognito-idp.us-west-2.amazonaws.com/${UNITY_COGNITO_USER_POOL_ID}/.well-known/openid-configuration
+    OIDCClientID ${UNITY_OIDC_CLIENT_ID}
+    OIDCClientSecret ${UNITY_OIDC_CLIENT_SECRET}
+
+    # OIDCRedirectURI is a vanity URL that must point to a path protected by this module but must NOT point to any content
+    OIDCRedirectURI ${UNITY_OIDC_REDIRECT_URI}
+    OIDCCryptoPassphrase ${UNITY_OIDC_OIDC_CRYPTO_PASSPHRASE}
+
+    # Set max number of state cookies to avoid piling up OIDC state cookies
+    OIDCStateMaxNumberOfCookies 3 true
+
+    Define JPL_URL https://www.jpl.nasa.gov/
+
+    # Map the reload-config URL to the actual script
+    ScriptAlias /reload-config /usr/lib/cgi-bin/reload-apache.cgi
+
+    # Create the endpoint handler
+
+        # Explicitly disable OIDC authentication for this endpoint
+        AuthType None
+
+        # Check for the required token using conditional logic
+
+            Require all denied
+
+
+            Require all granted
+
+
+        SetHandler cgi-script
+        Options +ExecCGI
+
+
+
+        AuthType openid-connect
+        Require valid-user
+        # No content needed - OIDC module handles this internally
+
+
+    ProxyPass "/data/" 
"http://internal-uds-dev-ds-alb-694766956.us-west-2.elb.amazonaws.com:8005/data/" + ProxyPassReverse "/data/" "http://internal-uds-dev-ds-alb-694766956.us-west-2.elb.amazonaws.com:8005/data/" + + + ProxyPreserveHost on + AuthType openid-connect + Require valid-user + + + + ProxyPass "/stac_fast_api/" "http://internal-uds-sbx-ds-alb-1539439526.us-west-2.elb.amazonaws.com:8080/" + ProxyPassReverse "/stac_fast_api/" "http://internal-uds-sbx-ds-alb-1539439526.us-west-2.elb.amazonaws.com:8080/" + + ProxyPreserveHost on + AuthType openid-connect + Require valid-user + + + # Include all venue configurations (if any) + IncludeOptional /etc/apache2/venues.d/*.conf + + \ No newline at end of file From f0ec44891354d7dbf8998e5b04a14569f8fd0214 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 13:43:48 -0500 Subject: [PATCH 02/18] Fixes and updates --- terraform-ss-proxy/README.md | 50 +++++++++++++++++++--- terraform-ss-proxy/install.sh | 62 +++++++++++++++++++--------- terraform-ss-proxy/main.tf | 2 +- terraform-ss-proxy/trigger_reload.js | 2 +- 4 files changed, 90 insertions(+), 26 deletions(-) diff --git a/terraform-ss-proxy/README.md b/terraform-ss-proxy/README.md index 9ba48e01..a1436ee6 100644 --- a/terraform-ss-proxy/README.md +++ b/terraform-ss-proxy/README.md @@ -53,12 +53,14 @@ Before setting up the server, several elements need to be deployed in the SS ven The install script uses the following default variables (can be overridden via environment variables): ```bash -S3_BUCKET_NAME="unity-cs-config-bucket" +S3_BUCKET_NAME="ucs-shared-services-apache-config-dev-test" PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI-APIG" AWS_REGION="us-west-2" APACHE_HOST="www.dev.mdps.mcp.nasa.gov" APACHE_PORT="4443" DEBOUNCE_DELAY="30" +OIDC_CLIENT_ID="ee3duo3i707h93vki01ivja8o" +COGNITO_USER_POOL_ID="us-west-2_yaOw3yj0z" ``` To override defaults, export environment variables before running the script: @@ -69,6 +71,21 @@ export AWS_REGION="us-east-1" ./install.sh ``` +### Install Options + +The install script supports the following options: + +```bash +# Full installation (Apache + Terraform) +./install.sh + +# Terraform infrastructure only (requires existing Apache config) +./install.sh --terraform-only + +# Destroy Terraform infrastructure (cleanup) +./install.sh --destroy-terraform +``` + ## Terraform Infrastructure The installation script automatically sets up the required AWS infrastructure using Terraform: @@ -80,12 +97,12 @@ The installation script automatically sets up the required AWS infrastructure us - Content-based deduplication enabled - 14-day message retention -2. **Lambda Function**: `apache-config-reload-trigger` +2. **Lambda Function**: `ss-proxy-config-reload-trigger` - Handles both S3 events and SQS message processing - Implements debouncing logic to prevent rapid successive reloads - Makes HTTPS calls to Apache reload endpoint -3. **IAM Role**: `apache-reload-lambda-role` +3. **IAM Role**: `unity-cs-proxy-reload-lambda-role` - Includes required permissions for SQS, CloudWatch Logs - Uses the specified permission boundary ARN @@ -93,7 +110,11 @@ The installation script automatically sets up the required AWS infrastructure us - Triggers Lambda on `.conf` file changes - Monitors `ObjectCreated:*` and `ObjectRemoved:*` events -5. **SQS Event Source Mapping** +5. **CloudWatch Log Group**: `/aws/lambda/ss-proxy-config-reload-trigger` + - Logs Lambda function execution and errors + - 14-day retention for troubleshooting + +6. 
**SQS Event Source Mapping** - Connects SQS queue to Lambda function - Batch size of 1 to maintain FIFO ordering @@ -141,4 +162,23 @@ terraform destroy ### Debouncing -The system implements a configurable debounce delay (default: 30 seconds) to prevent excessive Apache reloads when multiple configuration files are changed rapidly. \ No newline at end of file +The system implements a configurable debounce delay (default: 30 seconds) to prevent excessive Apache reloads when multiple configuration files are changed rapidly. + +## Cleanup + +### Destroying Infrastructure + +To remove all AWS infrastructure created by this setup: + +```bash +./install.sh --destroy-terraform +``` + +This will: +- Destroy the Lambda function and its execution role +- Remove the SQS FIFO queue +- Delete the CloudWatch log group +- Remove S3 bucket notifications +- Clean up all related AWS resources + +**Note**: This does not affect the Apache installation or configuration files on the EC2 instance. \ No newline at end of file diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh index 6e029362..86c3a0d0 100755 --- a/terraform-ss-proxy/install.sh +++ b/terraform-ss-proxy/install.sh @@ -2,16 +2,23 @@ # Parse command line arguments TERRAFORM_ONLY=false +DESTROY_TERRAFORM=false while [[ $# -gt 0 ]]; do case $1 in --terraform-only) TERRAFORM_ONLY=true shift ;; + --destroy-terraform) + DESTROY_TERRAFORM=true + TERRAFORM_ONLY=true + shift + ;; *) echo "Unknown option $1" - echo "Usage: $0 [--terraform-only]" - echo " --terraform-only Only run Terraform infrastructure setup" + echo "Usage: $0 [--terraform-only] [--destroy-terraform]" + echo " --terraform-only Only run Terraform infrastructure setup" + echo " --destroy-terraform Destroy Terraform infrastructure (implies --terraform-only)" exit 1 ;; esac @@ -36,7 +43,9 @@ echo " APACHE_PORT: $APACHE_PORT" echo " DEBOUNCE_DELAY: $DEBOUNCE_DELAY" echo "" -if [ "$TERRAFORM_ONLY" = true ]; then +if [ "$DESTROY_TERRAFORM" = true ]; then + echo "Running in Terraform destroy mode..." +elif [ "$TERRAFORM_ONLY" = true ]; then echo "Running in Terraform-only mode..." else # Check if Terraform is installed @@ -129,7 +138,7 @@ if [ "$TERRAFORM_ONLY" = false ]; then echo "Apache installation complete." fi -# Generate a random token for Lambda (needed for both modes) +# Generate a random token for Lambda (needed for both modes, except destroy) if [ "$TERRAFORM_ONLY" = false ]; then SECURE_TOKEN=$(openssl rand -hex 16) echo "Generated secure token: ${SECURE_TOKEN}" @@ -139,7 +148,7 @@ if [ "$TERRAFORM_ONLY" = false ]; then # The sed command uses a different delimiter (#) to avoid issues if the token contains slashes. sudo sed -i "s#your_secure_token_here#${SECURE_TOKEN}#g" /etc/apache2/sites-enabled/unity-cs-main.conf echo "Apache configuration updated with secure token." -else +elif [ "$DESTROY_TERRAFORM" = false ]; then # Pull the token from /etc/apache2/sites-enabled/unity-cs-main.conf if it exists if [ -f "/etc/apache2/sites-enabled/unity-cs-main.conf" ]; then SECURE_TOKEN=$(sudo grep -oP "X-Reload-Token.*?'\K[^']+" /etc/apache2/sites-enabled/unity-cs-main.conf 2>/dev/null || echo "") @@ -155,21 +164,36 @@ else fi fi -# Run Terraform to create AWS infrastructure -echo "Setting up AWS infrastructure with Terraform..." 
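+# Note: terraform destroy still has to be given every required input variable,
+# which is why the destroy path below passes a dummy reload_token.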
+# Run Terraform cd "$(dirname "$0")" -# Initialize and apply Terraform +# Initialize Terraform terraform init -terraform apply -auto-approve \ - -var="s3_bucket_name=${S3_BUCKET_NAME}" \ - -var="permission_boundary_arn=${PERMISSION_BOUNDARY_ARN}" \ - -var="reload_token=${SECURE_TOKEN}" \ - -var="aws_region=${AWS_REGION}" \ - -var="apache_host=${APACHE_HOST}" \ - -var="apache_port=${APACHE_PORT}" \ - -var="debounce_delay=${DEBOUNCE_DELAY}" - -echo "AWS infrastructure setup complete!" -echo "Lambda function created and configured to monitor S3 bucket and process via SQS FIFO queue." +if [ "$DESTROY_TERRAFORM" = true ]; then + echo "Destroying AWS infrastructure with Terraform..." + terraform destroy -auto-approve \ + -var="s3_bucket_name=${S3_BUCKET_NAME}" \ + -var="permission_boundary_arn=${PERMISSION_BOUNDARY_ARN}" \ + -var="reload_token=dummy" \ + -var="aws_region=${AWS_REGION}" \ + -var="apache_host=${APACHE_HOST}" \ + -var="apache_port=${APACHE_PORT}" \ + -var="debounce_delay=${DEBOUNCE_DELAY}" + + echo "AWS infrastructure destruction complete!" + echo "Lambda function, SQS queue, and related resources have been removed." +else + echo "Setting up AWS infrastructure with Terraform..." + terraform apply -auto-approve \ + -var="s3_bucket_name=${S3_BUCKET_NAME}" \ + -var="permission_boundary_arn=${PERMISSION_BOUNDARY_ARN}" \ + -var="reload_token=${SECURE_TOKEN}" \ + -var="aws_region=${AWS_REGION}" \ + -var="apache_host=${APACHE_HOST}" \ + -var="apache_port=${APACHE_PORT}" \ + -var="debounce_delay=${DEBOUNCE_DELAY}" + + echo "AWS infrastructure setup complete!" + echo "Lambda function created and configured to monitor S3 bucket and process via SQS FIFO queue." +fi diff --git a/terraform-ss-proxy/main.tf b/terraform-ss-proxy/main.tf index feda57b5..b69265f6 100644 --- a/terraform-ss-proxy/main.tf +++ b/terraform-ss-proxy/main.tf @@ -161,7 +161,7 @@ resource "aws_lambda_function" "apache_reload_trigger" { # CloudWatch Log Group for Lambda resource "aws_cloudwatch_log_group" "lambda_logs" { - name = "/aws/lambda/apache-config-reload-trigger" + name = "/aws/lambda/ss-proxy-config-reload-trigger" retention_in_days = 14 tags = { diff --git a/terraform-ss-proxy/trigger_reload.js b/terraform-ss-proxy/trigger_reload.js index 481ce077..7c3498cc 100644 --- a/terraform-ss-proxy/trigger_reload.js +++ b/terraform-ss-proxy/trigger_reload.js @@ -117,7 +117,7 @@ async function handleSQSEvent(event) { for (const record of event.Records) { try { - const messageBody = JSON.parse(record.body); + const messageBody = record.body; console.log('Processing SQS message:', messageBody); // Implement debouncing by checking if there are newer messages in the queue From 5272eac9069eb354332804131f6a5e6ac3b8741d Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 14:19:55 -0500 Subject: [PATCH 03/18] Updates --- terraform-ss-proxy/README.md | 29 ++++++++++++++++++++++++++- terraform-ss-proxy/install.sh | 16 +++++++++++++-- terraform-ss-proxy/reload-apache.cgi | 6 +----- terraform-ss-proxy/unity-cs-main.conf | 14 ++++++------- 4 files changed, 50 insertions(+), 15 deletions(-) diff --git a/terraform-ss-proxy/README.md b/terraform-ss-proxy/README.md index a1436ee6..ed011ba9 100644 --- a/terraform-ss-proxy/README.md +++ b/terraform-ss-proxy/README.md @@ -175,10 +175,37 @@ To remove all AWS infrastructure created by this setup: ``` This will: + - Destroy the Lambda function and its execution role - Remove the SQS FIFO queue - Delete the CloudWatch log group - Remove S3 bucket notifications - Clean up all related 
AWS resources -**Note**: This does not affect the Apache installation or configuration files on the EC2 instance. \ No newline at end of file +**Note**: This does not affect the Apache installation or configuration files on the EC2 instance. + +## Troubleshooting + +### 403 Forbidden on /reload-config + +If you get a 403 error when testing the reload endpoint, check: + +1. **Token Configuration**: Verify the token was properly replaced in the Apache config: + ```bash + sudo grep "X-Reload-Token" /etc/apache2/sites-enabled/unity-cs-main.conf + ``` + It should show your actual token, not `REPLACE_WITH_SECURE_TOKEN`. + +2. **Test the endpoint manually**: + ```bash + # Get the token from the config + TOKEN=$(sudo grep -oP "X-Reload-Token.*?'\K[^']+" /etc/apache2/sites-enabled/unity-cs-main.conf) + + # Test the endpoint + curl -k -H "X-Reload-Token: $TOKEN" https://localhost:4443/reload-config + ``` + +3. **Check Apache error logs**: + ```bash + sudo tail -f /var/log/apache2/error.log + ``` \ No newline at end of file diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh index 86c3a0d0..70fb16bc 100755 --- a/terraform-ss-proxy/install.sh +++ b/terraform-ss-proxy/install.sh @@ -24,7 +24,13 @@ while [[ $# -gt 0 ]]; do esac done + +# Get environment from SSM +export ENV_SSM_PARAM="/unity/account/venue" +ENVIRONMENT=$(aws ssm get-parameter --name ${ENV_SSM_PARAM} --query "Parameter.Value" --output text) + # Default configuration variables +# S3_BUCKET="ucs-shared-services-apache-config-${ENVIRONMENT}" S3_BUCKET_NAME="ucs-shared-services-apache-config-dev-test" PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI-APIG" AWS_REGION="us-west-2" @@ -122,6 +128,10 @@ if [ "$TERRAFORM_ONLY" = false ]; then sudo mkdir -p /etc/apache2/venues.d/ sudo chown www-data:www-data /etc/apache2/venues.d/ sudo cp reload-apache.cgi /usr/lib/cgi-bin/reload-apache.cgi + + # Replace S3 bucket placeholder in CGI script + sudo sed -i "s#REPLACE_WITH_S3_BUCKET_NAME#${S3_BUCKET_NAME}#g" /usr/lib/cgi-bin/reload-apache.cgi + sudo chown www-data:www-data /usr/lib/cgi-bin/reload-apache.cgi sudo chmod 755 /usr/lib/cgi-bin/reload-apache.cgi @@ -146,14 +156,16 @@ if [ "$TERRAFORM_ONLY" = false ]; then # Update the unity-cs-main.conf file with the new token # The sed command uses a different delimiter (#) to avoid issues if the token contains slashes. - sudo sed -i "s#your_secure_token_here#${SECURE_TOKEN}#g" /etc/apache2/sites-enabled/unity-cs-main.conf + sudo sed -i "s#REPLACE_WITH_SECURE_TOKEN#${SECURE_TOKEN}#g" /etc/apache2/sites-enabled/unity-cs-main.conf echo "Apache configuration updated with secure token." elif [ "$DESTROY_TERRAFORM" = false ]; then # Pull the token from /etc/apache2/sites-enabled/unity-cs-main.conf if it exists if [ -f "/etc/apache2/sites-enabled/unity-cs-main.conf" ]; then - SECURE_TOKEN=$(sudo grep -oP "X-Reload-Token.*?'\K[^']+" /etc/apache2/sites-enabled/unity-cs-main.conf 2>/dev/null || echo "") + # Extract token from SetEnvIf directive: SetEnvIf X-Reload-Token "^TOKEN_HERE$" valid_token + SECURE_TOKEN=$(sudo grep -oP 'SetEnvIf X-Reload-Token "\^\K[^$]+' /etc/apache2/sites-enabled/unity-cs-main.conf 2>/dev/null || echo "") if [ -z "$SECURE_TOKEN" ]; then echo "Error: Could not extract token from existing Apache config." 
+ echo "Looking for: SetEnvIf X-Reload-Token \"^TOKEN\$\" valid_token" exit 1 else echo "Using existing token from Apache configuration: ${SECURE_TOKEN}" diff --git a/terraform-ss-proxy/reload-apache.cgi b/terraform-ss-proxy/reload-apache.cgi index 3551a768..c6a48dfa 100644 --- a/terraform-ss-proxy/reload-apache.cgi +++ b/terraform-ss-proxy/reload-apache.cgi @@ -1,11 +1,7 @@ #!/bin/bash -# Get environment from SSM -export ENV_SSM_PARAM="/unity/account/venue" -ENVIRONMENT=$(aws ssm get-parameter --name ${ENV_SSM_PARAM} --query "Parameter.Value" --output text) - # Set variables -S3_BUCKET="ucs-shared-services-apache-config-${ENVIRONMENT}" +S3_BUCKET="REPLACE_WITH_S3_BUCKET_NAME" LOCAL_FILE="/etc/apache2/sites-enabled/unity-cs.conf" TEMP_FILE="/tmp/unity-cs.conf" SLACK_WEBHOOK=$(aws ssm get-parameter --name "/unity/shared-services/slack/apache-config-webhook-url" --with-decryption --query "Parameter.Value" --output text) diff --git a/terraform-ss-proxy/unity-cs-main.conf b/terraform-ss-proxy/unity-cs-main.conf index fd531e4d..643e20db 100644 --- a/terraform-ss-proxy/unity-cs-main.conf +++ b/terraform-ss-proxy/unity-cs-main.conf @@ -43,13 +43,13 @@ # Explicitly disable OIDC authentication for this endpoint AuthType None - # Check for the required token using conditional logic - - Require all denied - - - Require all granted - + # Use SetEnvIf to check the token and set environment variable + SetEnvIf X-Reload-Token "^REPLACE_WITH_SECURE_TOKEN$" valid_token + + # Require the environment variable to be set + + Require env valid_token + SetHandler cgi-script Options +ExecCGI From 2ce4bfb4ab64182161ee70cd68014cb1b27ac931 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 16:22:40 -0500 Subject: [PATCH 04/18] Various Fixes and Updates, inc. command to match sudoers --- terraform-ss-proxy/install.sh | 7 +++++++ terraform-ss-proxy/reload-apache.cgi | 11 ++++++++--- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh index 70fb16bc..d778f8ba 100755 --- a/terraform-ss-proxy/install.sh +++ b/terraform-ss-proxy/install.sh @@ -132,6 +132,13 @@ if [ "$TERRAFORM_ONLY" = false ]; then # Replace S3 bucket placeholder in CGI script sudo sed -i "s#REPLACE_WITH_S3_BUCKET_NAME#${S3_BUCKET_NAME}#g" /usr/lib/cgi-bin/reload-apache.cgi + # Replace environment placeholder in CGI script + sudo sed -i "s#REPLACE_WITH_ENVIRONMENT_NAME#${ENVIRONMENT}#g" /usr/lib/cgi-bin/reload-apache.cgi + + # Create and set ownership of reload log file + sudo touch /var/log/apache2/reload.log + sudo chown www-data:www-data /var/log/apache2/reload.log + sudo chown www-data:www-data /usr/lib/cgi-bin/reload-apache.cgi sudo chmod 755 /usr/lib/cgi-bin/reload-apache.cgi diff --git a/terraform-ss-proxy/reload-apache.cgi b/terraform-ss-proxy/reload-apache.cgi index c6a48dfa..29e7eab7 100644 --- a/terraform-ss-proxy/reload-apache.cgi +++ b/terraform-ss-proxy/reload-apache.cgi @@ -2,6 +2,7 @@ # Set variables S3_BUCKET="REPLACE_WITH_S3_BUCKET_NAME" +ENVIRONMENT="REPLACE_WITH_ENVIRONMENT_NAME" LOCAL_FILE="/etc/apache2/sites-enabled/unity-cs.conf" TEMP_FILE="/tmp/unity-cs.conf" SLACK_WEBHOOK=$(aws ssm get-parameter --name "/unity/shared-services/slack/apache-config-webhook-url" --with-decryption --query "Parameter.Value" --output text) @@ -9,7 +10,6 @@ SLACK_WEBHOOK=$(aws ssm get-parameter --name "/unity/shared-services/slack/apach # Function to send message to Slack and exit send_to_slack() { local message="$1" - local exit_code="$2" local 
env_prefix="[Unity-venue-${ENVIRONMENT}] " curl --silent --output /dev/null -X POST -H 'Content-type: application/json' \ --data "{\"text\":\"${env_prefix}${message}\"}" \ @@ -19,12 +19,17 @@ send_to_slack() { # Download files from S3 aws s3 sync s3://${S3_BUCKET}/ /etc/apache2/venues.d/ --exclude "*" --include "*.conf" --quiet +# Do short pause to to make sure +sleep 2 + # Test the config echo "Content-type: application/json" echo "" -CONFIG_TEST=$(sudo apache2ctl configtest 2>&1) +CONFIG_TEST=$(sudo /usr/sbin/apachectl configtest 2>&1) if [[ "$CONFIG_TEST" != *"Syntax OK"* ]]; then - send_to_slack "❌ Apache config sync failed: Failed Config Test" 1 + echo $CONFIG_TEST + send_to_slack "❌ Apache config sync failed: Failed Config Test" + echo "[$(date)] Reload Failed: ${CONFIG_TEST}" >> /var/log/apache2/reload.log echo '{"status":"error","message":"Failed to validate config"}' else From dd7a4d08a336e85059eed06361d3a03b5bd655df Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 16:32:01 -0500 Subject: [PATCH 05/18] Switch to system logging --- terraform-ss-proxy/install.sh | 4 ---- terraform-ss-proxy/reload-apache.cgi | 4 ++-- 2 files changed, 2 insertions(+), 6 deletions(-) diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh index d778f8ba..9262f9bb 100755 --- a/terraform-ss-proxy/install.sh +++ b/terraform-ss-proxy/install.sh @@ -135,10 +135,6 @@ if [ "$TERRAFORM_ONLY" = false ]; then # Replace environment placeholder in CGI script sudo sed -i "s#REPLACE_WITH_ENVIRONMENT_NAME#${ENVIRONMENT}#g" /usr/lib/cgi-bin/reload-apache.cgi - # Create and set ownership of reload log file - sudo touch /var/log/apache2/reload.log - sudo chown www-data:www-data /var/log/apache2/reload.log - sudo chown www-data:www-data /usr/lib/cgi-bin/reload-apache.cgi sudo chmod 755 /usr/lib/cgi-bin/reload-apache.cgi diff --git a/terraform-ss-proxy/reload-apache.cgi b/terraform-ss-proxy/reload-apache.cgi index 29e7eab7..b5d0011b 100644 --- a/terraform-ss-proxy/reload-apache.cgi +++ b/terraform-ss-proxy/reload-apache.cgi @@ -29,12 +29,12 @@ CONFIG_TEST=$(sudo /usr/sbin/apachectl configtest 2>&1) if [[ "$CONFIG_TEST" != *"Syntax OK"* ]]; then echo $CONFIG_TEST send_to_slack "❌ Apache config sync failed: Failed Config Test" - echo "[$(date)] Reload Failed: ${CONFIG_TEST}" >> /var/log/apache2/reload.log + logger -t "apache-reload" "Reload Failed: ${CONFIG_TEST}" echo '{"status":"error","message":"Failed to validate config"}' else # Log the request for auditing - echo "[$(date)] Apache config reload requested" >> /var/log/apache2/reload.log + logger -t "apache-reload" "Apache config reload requested" # Execute the graceful reload RESULT=$(sudo /usr/sbin/apachectl graceful 2>&1) From c10416fbe07bef6b73c48ff9f95ef0fb9714f44d Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 16:37:18 -0500 Subject: [PATCH 06/18] Dynamically update port num in apache config --- terraform-ss-proxy/install.sh | 1 + 1 file changed, 1 insertion(+) diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh index 9262f9bb..4e73c95a 100755 --- a/terraform-ss-proxy/install.sh +++ b/terraform-ss-proxy/install.sh @@ -103,6 +103,7 @@ if [ "$TERRAFORM_ONLY" = false ]; then sudo sed -i "s/\${COGNITO_POOL_ID}/${COGNITO_USER_POOL_ID}/" /etc/apache2/sites-enabled/unity-cs-main.conf sudo sed -i "s/\${OIDC_CLIENT_ID}/${OIDC_CLIENT_ID}/" /etc/apache2/sites-enabled/unity-cs-main.conf sudo sed -i "s/\${OIDC_CLIENT_SECRET}/${CLIENT_SECRET}/" /etc/apache2/sites-enabled/unity-cs-main.conf + sudo sed -i 
"s/\${PORT_NUM}/${APACHE_PORT}/" /etc/apache2/sites-enabled/unity-cs-main.conf # Remove the default sudo rm /etc/apache2/sites-enabled/000-default.conf From 64369b82e3b0f87d4a52b49ba4f22f6a32cc44b2 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 16:55:56 -0500 Subject: [PATCH 07/18] Docs and minor updates --- terraform-ss-proxy/README.md | 32 ++++++++++++++++++++++++--- terraform-ss-proxy/install.sh | 21 ++++++++++-------- terraform-ss-proxy/unity-cs-main.conf | 2 +- 3 files changed, 42 insertions(+), 13 deletions(-) diff --git a/terraform-ss-proxy/README.md b/terraform-ss-proxy/README.md index ed011ba9..830383ea 100644 --- a/terraform-ss-proxy/README.md +++ b/terraform-ss-proxy/README.md @@ -42,7 +42,8 @@ Before setting up the server, several elements need to be deployed in the SS ven 1. SSH into your EC2 instance 2. Clone this repository 3. Navigate to the `terraform-ss-proxy` directory -4. Run the installation script: +4. **Configure parameters in install.sh** (see Configuration section below) +5. Run the installation script: ```bash ./install.sh @@ -50,7 +51,7 @@ Before setting up the server, several elements need to be deployed in the SS ven ### Configuration Variables -The install script uses the following default variables (can be overridden via environment variables): +**IMPORTANT**: Before running the installation, you must edit the configuration variables in `install.sh` to match your environment. The script uses the following default variables: ```bash S3_BUCKET_NAME="ucs-shared-services-apache-config-dev-test" @@ -63,7 +64,11 @@ OIDC_CLIENT_ID="ee3duo3i707h93vki01ivja8o" COGNITO_USER_POOL_ID="us-west-2_yaOw3yj0z" ``` -To override defaults, export environment variables before running the script: +**Required Steps**: +1. Edit `install.sh` and update these variables for your environment +2. Ensure the values match your AWS infrastructure and requirements + +Alternatively, you can override defaults by exporting environment variables before running the script: ```bash export S3_BUCKET_NAME="my-custom-bucket" @@ -71,6 +76,27 @@ export AWS_REGION="us-east-1" ./install.sh ``` +### Amazon Cognito Configuration + +**CRITICAL**: Before running the installation, you must configure your Amazon Cognito App Client with the correct callback URL. + +1. **Navigate to your Cognito User Pool** in the AWS Console +2. **Go to App Clients** section +3. **Edit your App Client** (the one specified in `OIDC_CLIENT_ID`) +4. **Add the following to "Allowed callback URLs"**: + ``` + https://{APACHE_HOST}:{APACHE_PORT}/unity/redirect-url + ``` + + For example, with the default values: + ``` + https://www.dev.mdps.mcp.nasa.gov:4443/unity/redirect-url + ``` + +5. **Save the changes** + +**Note**: The callback URL must exactly match your `APACHE_HOST` and `APACHE_PORT` values, followed by `/unity/redirect-url`. If these don't match, authentication will fail. 
+
 ### Install Options
 
 The install script supports the following options:

diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh
index 4e73c95a..def889a9 100755
--- a/terraform-ss-proxy/install.sh
+++ b/terraform-ss-proxy/install.sh
@@ -29,16 +29,16 @@ done
 export ENV_SSM_PARAM="/unity/account/venue"
 ENVIRONMENT=$(aws ssm get-parameter --name ${ENV_SSM_PARAM} --query "Parameter.Value" --output text)
 
-# Default configuration variables
-# S3_BUCKET="ucs-shared-services-apache-config-${ENVIRONMENT}"
-S3_BUCKET_NAME="ucs-shared-services-apache-config-dev-test"
+# Default configuration variables (can be overridden with environment variables)
+# S3_BUCKET="${S3_BUCKET_NAME:-ucs-shared-services-apache-config-${ENVIRONMENT}}"
+S3_BUCKET_NAME="${S3_BUCKET_NAME:-ucs-shared-services-apache-config-dev-test}"
 PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI-APIG"
-AWS_REGION="us-west-2"
-APACHE_HOST="www.dev.mdps.mcp.nasa.gov"
-APACHE_PORT="4443"
-DEBOUNCE_DELAY="30"
-OIDC_CLIENT_ID="ee3duo3i707h93vki01ivja8o"
-COGNITO_USER_POOL_ID="us-west-2_yaOw3yj0z"
+AWS_REGION="${AWS_REGION:-us-west-2}"
+APACHE_HOST="${APACHE_HOST:-www.dev.mdps.mcp.nasa.gov}"
+APACHE_PORT="${APACHE_PORT:-4443}"
+DEBOUNCE_DELAY="${DEBOUNCE_DELAY:-30}"
+OIDC_CLIENT_ID="${OIDC_CLIENT_ID:-ee3duo3i707h93vki01ivja8o}"
+COGNITO_USER_POOL_ID="${COGNITO_USER_POOL_ID:-us-west-2_yaOw3yj0z}"
 
 echo "Using configuration:"
 echo "  S3_BUCKET_NAME: $S3_BUCKET_NAME"
@@ -180,6 +180,9 @@ elif [ "$DESTROY_TERRAFORM" = false ]; then
     fi
 fi
 
+# Make sure Apache is reloaded (in case the script is re-run)
+sudo apachectl graceful
+
 # Run Terraform
 cd "$(dirname "$0")"
 
diff --git a/terraform-ss-proxy/unity-cs-main.conf b/terraform-ss-proxy/unity-cs-main.conf
index 643e20db..887af725 100644
--- a/terraform-ss-proxy/unity-cs-main.conf
+++ b/terraform-ss-proxy/unity-cs-main.conf
@@ -1,6 +1,6 @@
 # Update these with the proper values for the SS Account
-    Define PORT_NUM 8443
+    Define PORT_NUM ${PORT_NUM}
     Define UNITY_SERVER_NAME unity-shared-services-httpd-server-dev
     Define UNITY_COGNITO_USER_POOL_ID ${COGNITO_POOL_ID}
     Define UNITY_OIDC_CLIENT_ID ${OIDC_CLIENT_ID}

From 78c869599371d7218c6ab5c3e6266088817adebe Mon Sep 17 00:00:00 2001
From: Jeff Leach
Date: Tue, 3 Jun 2025 16:59:18 -0500
Subject: [PATCH 08/18] Add missing delete flag

---
 terraform-ss-proxy/reload-apache.cgi | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/terraform-ss-proxy/reload-apache.cgi b/terraform-ss-proxy/reload-apache.cgi
index b5d0011b..b45d5c06 100644
--- a/terraform-ss-proxy/reload-apache.cgi
+++ b/terraform-ss-proxy/reload-apache.cgi
@@ -17,7 +17,7 @@ send_to_slack() {
 }
 
 # Download files from S3
-aws s3 sync s3://${S3_BUCKET}/ /etc/apache2/venues.d/ --exclude "*" --include "*.conf" --quiet
+aws s3 sync s3://${S3_BUCKET}/ /etc/apache2/venues.d/ --exclude "*" --include "*.conf" --delete --quiet
 
 # Short pause to make sure the sync has fully settled
 sleep 2

From 8686aab87093bd7f1d7a516f635e2528ead63624 Mon Sep 17 00:00:00 2001
From: Jeff Leach
Date: Tue, 3 Jun 2025 17:50:30 -0500
Subject: [PATCH 09/18] Better logging, update to example

---
 terraform-ss-proxy/example_venue_project_config.conf | 2 +-
 terraform-ss-proxy/reload-apache.cgi                 | 3 +--
 2 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/terraform-ss-proxy/example_venue_project_config.conf b/terraform-ss-proxy/example_venue_project_config.conf
index cde07c38..751c941b 100644
--- a/terraform-ss-proxy/example_venue_project_config.conf
+++ b/terraform-ss-proxy/example_venue_project_config.conf
@@ -15,7 +15,7 @@ RewriteRule ${VENUE_ALB_PATH}(.*) ws://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENU
     Require valid-user
     ProxyPass "http://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENUE_ALB_PATH}"
     ProxyPassReverse "http://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENUE_ALB_PATH}"
-    RequestHeader set "X-Forwarded-Host" "www.dev.mdps.mcp.nasa.gov:4443"
+    RequestHeader set "X-Forwarded-Host" "www.dev.mdps.mcp.nasa.gov:${PORT_NUM}"
 </Location>
 
 # Clean up
diff --git a/terraform-ss-proxy/reload-apache.cgi b/terraform-ss-proxy/reload-apache.cgi
index b45d5c06..3f4c2feb 100644
--- a/terraform-ss-proxy/reload-apache.cgi
+++ b/terraform-ss-proxy/reload-apache.cgi
@@ -27,8 +27,7 @@ echo "Content-type: application/json"
 echo ""
 CONFIG_TEST=$(sudo /usr/sbin/apachectl configtest 2>&1)
 if [[ "$CONFIG_TEST" != *"Syntax OK"* ]]; then
-    echo $CONFIG_TEST
-    send_to_slack "❌ Apache config sync failed: Failed Config Test"
+    send_to_slack "❌ Apache config sync failed: Failed Config Test: $CONFIG_TEST"
     logger -t "apache-reload" "Reload Failed: ${CONFIG_TEST}"
     echo '{"status":"error","message":"Failed to validate config"}'
 else

From 5cae7dcbb940e4fb2bb6f1068326211e350932c8 Mon Sep 17 00:00:00 2001
From: Jeff Leach
Date: Tue, 3 Jun 2025 18:25:07 -0500
Subject: [PATCH 10/18] Fixed reload logic and throttling

---
 terraform-ss-proxy/README.md         | 47 +++++++++++++-
 terraform-ss-proxy/trigger_reload.js | 91 +++++++---------------------
 2 files changed, 65 insertions(+), 73 deletions(-)

diff --git a/terraform-ss-proxy/README.md b/terraform-ss-proxy/README.md
index 830383ea..aef42c4e 100644
--- a/terraform-ss-proxy/README.md
+++ b/terraform-ss-proxy/README.md
@@ -59,7 +59,7 @@ PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI
 AWS_REGION="us-west-2"
 APACHE_HOST="www.dev.mdps.mcp.nasa.gov"
 APACHE_PORT="4443"
-DEBOUNCE_DELAY="30"
+RELOAD_DELAY="15"
 OIDC_CLIENT_ID="ee3duo3i707h93vki01ivja8o"
 COGNITO_USER_POOL_ID="us-west-2_yaOw3yj0z"
@@ -186,9 +186,50 @@ terraform destroy
 4. **Processing**: Lambda processes SQS message after debounce delay
 5. **Apache Reload**: Lambda makes HTTPS call to reload Apache configuration
 
-### Debouncing
+### Reload Throttling
 
-The system implements a configurable debounce delay (default: 30 seconds) to prevent excessive Apache reloads when multiple configuration files are changed rapidly.
+The system implements intelligent throttling to ensure Apache configuration reloads are properly spaced while guaranteeing that every S3 event eventually triggers a reload.
+
+#### How Throttling Works
+
+1. **Time-Based Bucketing**: The system divides time into intervals (default: 15 seconds, configurable via `RELOAD_DELAY`)
+
+2. **Boundary Calculation**: When an S3 event occurs, the Lambda calculates the next time boundary:
+   ```
+   Current time: 14:32:07
+   Interval: 15 seconds
+   Next boundary: 14:32:15 (rounded up to next 15-second mark)
+   ```
+
+3. **SQS Message Scheduling**:
+   - Message is sent to SQS with `MessageDeduplicationId` = boundary timestamp
+   - `DelaySeconds` = time remaining until that boundary
+   - Multiple events targeting the same boundary are automatically deduplicated
+
+4. 
**Guaranteed Processing**: Each time window gets exactly one reload, but every S3 event is accounted for + +#### Example Timeline + +``` +14:32:07 - S3 event → SQS message for boundary 14:32:15 (8s delay) +14:32:10 - S3 event → REJECTED (duplicate for same boundary) +14:32:12 - S3 event → REJECTED (duplicate for same boundary) +14:32:15 - First message delivered → Apache reload triggered +14:32:23 - S3 event → SQS message for boundary 14:32:30 (7s delay) +14:32:30 - Second message delivered → Apache reload triggered +``` + +#### Configuration + +- **`RELOAD_DELAY`**: Time interval in seconds (default: 15) +- **Lambda Timeout**: Should be set to 15 seconds +- **SQS FIFO**: Uses content-based deduplication with 5-minute window + +This approach ensures: +- ✅ No more than one reload per interval +- ✅ Every S3 event eventually triggers a reload +- ✅ No wasted Lambda execution time +- ✅ Automatic deduplication of rapid changes ## Cleanup diff --git a/terraform-ss-proxy/trigger_reload.js b/terraform-ss-proxy/trigger_reload.js index 7c3498cc..83bf6a2d 100644 --- a/terraform-ss-proxy/trigger_reload.js +++ b/terraform-ss-proxy/trigger_reload.js @@ -1,5 +1,5 @@ const https = require('https'); -const { SQSClient, SendMessageCommand, GetQueueAttributesCommand } = require('@aws-sdk/client-sqs'); +const { SQSClient, SendMessageCommand } = require('@aws-sdk/client-sqs'); const sqs = new SQSClient({}); @@ -9,7 +9,7 @@ const APACHE_PORT = process.env.APACHE_PORT || '4443'; const RELOAD_TOKEN = process.env.RELOAD_TOKEN; const RELOAD_PATH = '/reload-config'; const SQS_QUEUE_URL = process.env.SQS_QUEUE_URL; -const DEBOUNCE_DELAY = parseInt(process.env.DEBOUNCE_DELAY) || 30; // seconds +const RELOAD_DELAY = parseInt(process.env.RELOAD_DELAY) || 15; // seconds exports.handler = async (event) => { console.log('Lambda triggered by event:', JSON.stringify(event, null, 2)); @@ -81,13 +81,18 @@ async function handleS3Event(event) { console.log(`Found ${relevantEvents.length} relevant config file changes`); - // Send a generic message to SQS to trigger the debounced reload - // Use a fixed message body for content-based deduplication + // Calculate next reload boundary (rounded up to next RELOAD_DELAY interval) + const now = Math.floor(Date.now() / 1000); + const nextBoundary = Math.ceil(now / RELOAD_DELAY) * RELOAD_DELAY; + const delaySeconds = Math.max(0, nextBoundary - now); + + // Send message with time-based deduplication to ensure proper throttling const sqsParams = { QueueUrl: SQS_QUEUE_URL, - MessageBody: 'S3 Config Changed', // Generic message for deduplication - MessageGroupId: 'apache-config-reload', // For FIFO queue - DelaySeconds: 0 // Send immediately to SQS + MessageBody: 'S3 Config Changed', + MessageGroupId: 'apache-config-reload', + MessageDeduplicationId: nextBoundary.toString(), // Time-based deduplication + DelaySeconds: delaySeconds // Delay until the boundary timestamp }; try { @@ -98,9 +103,11 @@ async function handleS3Event(event) { return { statusCode: 200, body: JSON.stringify({ - message: 'Config change notification sent to SQS', + message: `Config change notification sent to SQS for reload at ${new Date(nextBoundary * 1000).toISOString()}`, messageId: result.MessageId, - eventsProcessed: relevantEvents.length + eventsProcessed: relevantEvents.length, + nextBoundary: nextBoundary, + delaySeconds: delaySeconds }) }; } catch (sqsError) { @@ -112,7 +119,8 @@ async function handleS3Event(event) { async function handleSQSEvent(event) { console.log('Processing SQS event'); - // Process all SQS 
messages (though typically there should be only one per invocation) + // Process the SQS message and trigger reload immediately + // The delay was already handled by SQS DelaySeconds const results = []; for (const record of event.Records) { @@ -120,37 +128,7 @@ async function handleSQSEvent(event) { const messageBody = record.body; console.log('Processing SQS message:', messageBody); - // Implement debouncing by checking if there are newer messages in the queue - const shouldProceed = await checkIfShouldProceed(); - - if (!shouldProceed) { - console.log('Skipping reload - newer messages detected in queue'); - results.push({ - messageId: record.messageId, - status: 'skipped', - reason: 'newer_messages_pending' - }); - continue; - } - - // Wait for the debounce period to allow any in-flight messages to arrive - console.log(`Waiting ${DEBOUNCE_DELAY} seconds for additional changes...`); - await new Promise(resolve => setTimeout(resolve, DEBOUNCE_DELAY * 1000)); - - // Check again after waiting - const shouldProceedAfterWait = await checkIfShouldProceed(); - - if (!shouldProceedAfterWait) { - console.log('Skipping reload after wait - newer messages detected'); - results.push({ - messageId: record.messageId, - status: 'skipped', - reason: 'newer_messages_after_wait' - }); - continue; - } - - // Proceed with Apache reload + // Trigger Apache reload immediately const reloadResult = await makeReloadRequest(); console.log('Apache reload completed:', reloadResult); @@ -179,33 +157,6 @@ async function handleSQSEvent(event) { }; } -async function checkIfShouldProceed() { - if (!SQS_QUEUE_URL) { - return true; // If no SQS configured, proceed - } - - try { - // Check if there are messages in the queue - const command = new GetQueueAttributesCommand({ - QueueUrl: SQS_QUEUE_URL, - AttributeNames: ['ApproximateNumberOfMessages', 'ApproximateNumberOfMessagesNotVisible'] - }); - const queueAttributes = await sqs.send(command); - - const visibleMessages = parseInt(queueAttributes.Attributes.ApproximateNumberOfMessages) || 0; - const invisibleMessages = parseInt(queueAttributes.Attributes.ApproximateNumberOfMessagesNotVisible) || 0; - - console.log(`Queue status - Visible: ${visibleMessages}, In-flight: ${invisibleMessages}`); - - // If there are other visible messages, don't proceed (let the newer message handle it) - return visibleMessages === 0; - - } catch (error) { - console.error('Error checking queue status:', error); - // If we can't check the queue, proceed to be safe - return true; - } -} function makeReloadRequest() { return new Promise((resolve, reject) => { @@ -248,8 +199,8 @@ function makeReloadRequest() { reject(error); }); - // Set timeout - req.setTimeout(30000, () => { + // Set timeout (shorter since lambda timeout is 15s) + req.setTimeout(10000, () => { req.destroy(); reject(new Error('Request timeout')); }); From 902193559543e3ad6fa815bb4b53a0f53addbc90 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 18:27:28 -0500 Subject: [PATCH 11/18] Restore bucket name --- terraform-ss-proxy/install.sh | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh index def889a9..e3dbeda0 100755 --- a/terraform-ss-proxy/install.sh +++ b/terraform-ss-proxy/install.sh @@ -30,8 +30,7 @@ export ENV_SSM_PARAM="/unity/account/venue" ENVIRONMENT=$(aws ssm get-parameter --name ${ENV_SSM_PARAM} --query "Parameter.Value" --output text) # Default configuration variables (can be overridden with environment variables) -# 
S3_BUCKET="${S3_BUCKET_NAME:-ucs-shared-services-apache-config-${ENVIRONMENT}}" -S3_BUCKET_NAME="${S3_BUCKET_NAME:-ucs-shared-services-apache-config-dev-test}" +S3_BUCKET_NAME="${S3_BUCKET_NAME:-ucs-shared-services-apache-config-${ENVIRONMENT}}" PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI-APIG" AWS_REGION="${AWS_REGION:-us-west-2}" APACHE_HOST="${APACHE_HOST:-www.dev.mdps.mcp.nasa.gov}" From 94eeacb486ad2a2ad8a41448089b905a76999c28 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Tue, 3 Jun 2025 18:54:28 -0500 Subject: [PATCH 12/18] Fixes issues with FIFO delay --- terraform-ss-proxy/README.md | 36 +++++++++++++++++++--------- terraform-ss-proxy/install.sh | 6 ++--- terraform-ss-proxy/main.tf | 2 +- terraform-ss-proxy/trigger_reload.js | 36 ++++++++++++++++++++-------- 4 files changed, 55 insertions(+), 25 deletions(-) diff --git a/terraform-ss-proxy/README.md b/terraform-ss-proxy/README.md index aef42c4e..c5ef9e16 100644 --- a/terraform-ss-proxy/README.md +++ b/terraform-ss-proxy/README.md @@ -201,35 +201,49 @@ The system implements intelligent throttling to ensure Apache configuration relo Next boundary: 14:32:15 (rounded up to next 15-second mark) ``` -3. **SQS Message Scheduling**: - - Message is sent to SQS with `MessageDeduplicationId` = boundary timestamp - - `DelaySeconds` = time remaining until that boundary - - Multiple events targeting the same boundary are automatically deduplicated +3. **SQS Message Queuing**: + - Message is sent to SQS FIFO queue immediately (no DelaySeconds) + - `MessageDeduplicationId` = boundary timestamp for automatic deduplication + - Multiple events targeting the same boundary are automatically deduplicated by SQS -4. **Guaranteed Processing**: Each time window gets exactly one reload, but every S3 event is accounted for +4. **Lambda-Based Delay**: When the Lambda function receives an SQS message: + - It calculates the remaining time until the target boundary + - Waits (using setTimeout) until that exact boundary time + - Then triggers the Apache reload + +5. 
**Guaranteed Processing**: Each time window gets exactly one reload, but every S3 event is accounted for #### Example Timeline ``` -14:32:07 - S3 event → SQS message for boundary 14:32:15 (8s delay) +14:32:07 - S3 event → SQS message for boundary 14:32:15 (queued immediately) 14:32:10 - S3 event → REJECTED (duplicate for same boundary) 14:32:12 - S3 event → REJECTED (duplicate for same boundary) -14:32:15 - First message delivered → Apache reload triggered -14:32:23 - S3 event → SQS message for boundary 14:32:30 (7s delay) -14:32:30 - Second message delivered → Apache reload triggered +14:32:07 - Lambda receives message → waits 8 seconds → reload at 14:32:15 +14:32:23 - S3 event → SQS message for boundary 14:32:30 (queued immediately) +14:32:23 - Lambda receives message → waits 7 seconds → reload at 14:32:30 ``` +#### FIFO Queue Compatibility + +This approach is designed for **FIFO queue compatibility**: +- **No DelaySeconds**: FIFO queues don't support DelaySeconds > 0 +- **Lambda-based timing**: Delay logic moved to Lambda function processing +- **Consistent timing**: AWS Lambda timing is reliable across instances +- **Order preservation**: FIFO guarantees maintain reload sequence + #### Configuration - **`RELOAD_DELAY`**: Time interval in seconds (default: 15) -- **Lambda Timeout**: Should be set to 15 seconds +- **Lambda Timeout**: Should be set to at least 30 seconds to handle delays - **SQS FIFO**: Uses content-based deduplication with 5-minute window This approach ensures: - ✅ No more than one reload per interval - ✅ Every S3 event eventually triggers a reload -- ✅ No wasted Lambda execution time +- ✅ FIFO queue compatibility (no DelaySeconds restriction) - ✅ Automatic deduplication of rapid changes +- ✅ Precise timing control within Lambda function ## Cleanup diff --git a/terraform-ss-proxy/install.sh b/terraform-ss-proxy/install.sh index e3dbeda0..0bdcbdd3 100755 --- a/terraform-ss-proxy/install.sh +++ b/terraform-ss-proxy/install.sh @@ -35,7 +35,7 @@ PERMISSION_BOUNDARY_ARN="arn:aws:iam::237868187491:policy/mcp-tenantOperator-AMI AWS_REGION="${AWS_REGION:-us-west-2}" APACHE_HOST="${APACHE_HOST:-www.dev.mdps.mcp.nasa.gov}" APACHE_PORT="${APACHE_PORT:-4443}" -DEBOUNCE_DELAY="${DEBOUNCE_DELAY:-30}" +RELOAD_DELAY="${RELOAD_DELAY:-15}" OIDC_CLIENT_ID="${OIDC_CLIENT_ID:-ee3duo3i707h93vki01ivja8o}" COGNITO_USER_POOL_ID="${COGNITO_USER_POOL_ID:-us-west-2_yaOw3yj0z}" @@ -197,7 +197,7 @@ if [ "$DESTROY_TERRAFORM" = true ]; then -var="aws_region=${AWS_REGION}" \ -var="apache_host=${APACHE_HOST}" \ -var="apache_port=${APACHE_PORT}" \ - -var="debounce_delay=${DEBOUNCE_DELAY}" + -var="debounce_delay=${RELOAD_DELAY}" echo "AWS infrastructure destruction complete!" echo "Lambda function, SQS queue, and related resources have been removed." @@ -210,7 +210,7 @@ else -var="aws_region=${AWS_REGION}" \ -var="apache_host=${APACHE_HOST}" \ -var="apache_port=${APACHE_PORT}" \ - -var="debounce_delay=${DEBOUNCE_DELAY}" + -var="debounce_delay=${RELOAD_DELAY}" echo "AWS infrastructure setup complete!" echo "Lambda function created and configured to monitor S3 bucket and process via SQS FIFO queue." 
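
As a quick sanity check on the boundary arithmetic described above, the ceil-rounding that `trigger_reload.js` performs can be reproduced with shell integer math. This is an illustrative sketch only; the standalone variables here are not part of the patch:

```bash
# Mirrors Math.ceil(now / RELOAD_DELAY) * RELOAD_DELAY from trigger_reload.js
RELOAD_DELAY=15
now=$(date +%s)   # current epoch seconds
next_boundary=$(( (now + RELOAD_DELAY - 1) / RELOAD_DELAY * RELOAD_DELAY ))
echo "now=${now} boundary=${next_boundary} wait=$(( next_boundary - now ))s"
# Every S3 event arriving before next_boundary yields the same
# MessageDeduplicationId, so the FIFO queue collapses them into one reload.
```
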
diff --git a/terraform-ss-proxy/main.tf b/terraform-ss-proxy/main.tf index b69265f6..69632c4e 100644 --- a/terraform-ss-proxy/main.tf +++ b/terraform-ss-proxy/main.tf @@ -145,7 +145,7 @@ resource "aws_lambda_function" "apache_reload_trigger" { APACHE_PORT = var.apache_port RELOAD_TOKEN = var.reload_token SQS_QUEUE_URL = aws_sqs_queue.apache_reload_queue.url - DEBOUNCE_DELAY = var.debounce_delay + RELOAD_DELAY = var.debounce_delay } } diff --git a/terraform-ss-proxy/trigger_reload.js b/terraform-ss-proxy/trigger_reload.js index 83bf6a2d..da4b45df 100644 --- a/terraform-ss-proxy/trigger_reload.js +++ b/terraform-ss-proxy/trigger_reload.js @@ -9,7 +9,7 @@ const APACHE_PORT = process.env.APACHE_PORT || '4443'; const RELOAD_TOKEN = process.env.RELOAD_TOKEN; const RELOAD_PATH = '/reload-config'; const SQS_QUEUE_URL = process.env.SQS_QUEUE_URL; -const RELOAD_DELAY = parseInt(process.env.RELOAD_DELAY) || 15; // seconds +const RELOAD_DELAY = parseInt(process.env.RELOAD_DELAY) || 15; // seconds (for 15-second windows) exports.handler = async (event) => { console.log('Lambda triggered by event:', JSON.stringify(event, null, 2)); @@ -84,15 +84,18 @@ async function handleS3Event(event) { // Calculate next reload boundary (rounded up to next RELOAD_DELAY interval) const now = Math.floor(Date.now() / 1000); const nextBoundary = Math.ceil(now / RELOAD_DELAY) * RELOAD_DELAY; - const delaySeconds = Math.max(0, nextBoundary - now); // Send message with time-based deduplication to ensure proper throttling + // DelaySeconds removed for FIFO queue compatibility - delay logic moved to Lambda processing const sqsParams = { QueueUrl: SQS_QUEUE_URL, - MessageBody: 'S3 Config Changed', + MessageBody: JSON.stringify({ + message: 'S3 Config Changed', + targetBoundary: nextBoundary, + timestamp: now + }), MessageGroupId: 'apache-config-reload', - MessageDeduplicationId: nextBoundary.toString(), // Time-based deduplication - DelaySeconds: delaySeconds // Delay until the boundary timestamp + MessageDeduplicationId: nextBoundary.toString() // Time-based deduplication }; try { @@ -107,7 +110,7 @@ async function handleS3Event(event) { messageId: result.MessageId, eventsProcessed: relevantEvents.length, nextBoundary: nextBoundary, - delaySeconds: delaySeconds + note: 'Delay logic now handled in Lambda processing' }) }; } catch (sqsError) { @@ -119,22 +122,35 @@ async function handleS3Event(event) { async function handleSQSEvent(event) { console.log('Processing SQS event'); - // Process the SQS message and trigger reload immediately - // The delay was already handled by SQS DelaySeconds const results = []; for (const record of event.Records) { try { - const messageBody = record.body; + const messageBody = JSON.parse(record.body); console.log('Processing SQS message:', messageBody); - // Trigger Apache reload immediately + // Calculate delay until target boundary + const now = Math.floor(Date.now() / 1000); + const targetBoundary = messageBody.targetBoundary; + const delayUntilBoundary = Math.max(0, targetBoundary - now); + + console.log(`Target boundary: ${new Date(targetBoundary * 1000).toISOString()}, delay: ${delayUntilBoundary}s`); + + if (delayUntilBoundary > 0) { + console.log(`Waiting ${delayUntilBoundary} seconds until boundary...`); + await new Promise(resolve => setTimeout(resolve, delayUntilBoundary * 1000)); + } + + // Trigger Apache reload at the boundary const reloadResult = await makeReloadRequest(); console.log('Apache reload completed:', reloadResult); results.push({ messageId: record.messageId, status: 
'completed',
+        targetBoundary: targetBoundary,
+        actualReloadTime: Math.floor(Date.now() / 1000),
+        delayWaited: delayUntilBoundary,
         reloadResult: reloadResult
       });

From c3973630cde5f862579512a8cbe30cda073dc91a Mon Sep 17 00:00:00 2001
From: Jeff Leach
Date: Wed, 4 Jun 2025 07:46:13 -0500
Subject: [PATCH 13/18] Add missing RequestHeader

---
 terraform-ss-proxy/example_venue_project_config.conf | 1 +
 1 file changed, 1 insertion(+)

diff --git a/terraform-ss-proxy/example_venue_project_config.conf b/terraform-ss-proxy/example_venue_project_config.conf
index 751c941b..73d12ac9 100644
--- a/terraform-ss-proxy/example_venue_project_config.conf
+++ b/terraform-ss-proxy/example_venue_project_config.conf
@@ -15,6 +15,7 @@ RewriteRule ${VENUE_ALB_PATH}(.*) ws://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENU
     Require valid-user
     ProxyPass "http://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENUE_ALB_PATH}"
     ProxyPassReverse "http://${VENUE_ALB_HOST}:${VENUE_ALB_PORT}${VENUE_ALB_PATH}"
+    RequestHeader set "X-Forwarded-Proto" expr=%{REQUEST_SCHEME}
     RequestHeader set "X-Forwarded-Host" "www.dev.mdps.mcp.nasa.gov:${PORT_NUM}"
 </Location>

From 2ad2535f9bc162ca8047eb815fe0411092cf86be Mon Sep 17 00:00:00 2001
From: Jeff Leach
Date: Wed, 4 Jun 2025 13:01:01 -0500
Subject: [PATCH 14/18] Add test file

---
 terraform-ss-proxy/test_race_conditions.sh | 361 +++++++++++++++++++++
 1 file changed, 361 insertions(+)
 create mode 100755 terraform-ss-proxy/test_race_conditions.sh

diff --git a/terraform-ss-proxy/test_race_conditions.sh b/terraform-ss-proxy/test_race_conditions.sh
new file mode 100755
index 00000000..f4d7f7fa
--- /dev/null
+++ b/terraform-ss-proxy/test_race_conditions.sh
@@ -0,0 +1,361 @@
+#!/bin/bash
+
+# Test script for Apache config reload race conditions
+# Generates random config files and uploads them to S3 with random timing
+# Then verifies all files are present in /etc/apache2/venues.d/
+
+set -e
+
+# Configuration
+S3_BUCKET_NAME="${S3_BUCKET_NAME:-ucs-shared-services-apache-config-dev-test}"
+MIN_FILES="${MIN_FILES:-3}"
+MAX_FILES="${MAX_FILES:-8}"
+MAX_DELAY_SECONDS="${MAX_DELAY_SECONDS:-30}"
+VENUES_DIR="/etc/apache2/venues.d"
+TEST_PREFIX="test-race-"
+
+# Colors for output
+RED='\033[0;31m'
+GREEN='\033[0;32m'
+YELLOW='\033[1;33m'
+BLUE='\033[0;34m'
+NC='\033[0m' # No Color
+
+# Function to log with timestamp and color
+log() {
+    local level=$1
+    local message=$2
+    local timestamp=$(date '+%H:%M:%S')
+
+    case $level in
+        "INFO")    echo -e "${BLUE}[$timestamp] INFO:${NC} $message" ;;
+        "SUCCESS") echo -e "${GREEN}[$timestamp] SUCCESS:${NC} $message" ;;
+        "WARNING") echo -e "${YELLOW}[$timestamp] WARNING:${NC} $message" ;;
+        "ERROR")   echo -e "${RED}[$timestamp] ERROR:${NC} $message" ;;
+    esac
+}
+
+# Function to generate random config file content
+generate_config() {
+    local proxy_name=$1
+    local path_name=$2
+
+    cat << EOF
+# Local variables for this venue
+Define VENUE_ALB_HOST $proxy_name
+Define VENUE_ALB_PORT 8080
+Define VENUE_ALB_PATH $path_name
+
+# WebSocket upgrade handling
+RewriteCond %{HTTP:Connection} Upgrade [NC]
+RewriteCond %{HTTP:Upgrade} websocket [NC]
+RewriteCond %{REQUEST_URI} "\${VENUE_ALB_PATH}"
+RewriteRule \${VENUE_ALB_PATH}(.*) ws://\${VENUE_ALB_HOST}:\${VENUE_ALB_PORT}\${VENUE_ALB_PATH}\$1 [P,L] [END]
+
+# Location block for this venue
+<Location \${VENUE_ALB_PATH}>
+    AuthType openid-connect
+    Require valid-user
+    ProxyPass "http://\${VENUE_ALB_HOST}:\${VENUE_ALB_PORT}\${VENUE_ALB_PATH}"
+    ProxyPassReverse "http://\${VENUE_ALB_HOST}:\${VENUE_ALB_PORT}\${VENUE_ALB_PATH}"
+    RequestHeader set "X-Forwarded-Proto" 
expr=%{REQUEST_SCHEME}
+    RequestHeader set "X-Forwarded-Host" "www.dev.mdps.mcp.nasa.gov:\${PORT_NUM}"
+</Location>
+
+# Clean up
+UnDefine VENUE_ALB_HOST
+UnDefine VENUE_ALB_PORT
+UnDefine VENUE_ALB_PATH
+EOF
+}
+
+# Function to generate random proxy name
+generate_proxy_name() {
+    local prefixes=("app" "service" "api" "web" "data" "auth" "admin" "dashboard" "gateway" "worker")
+    local suffixes=("prod" "dev" "test" "stage" "blue" "green" "alpha" "beta" "main" "backup")
+    local environments=("east" "west" "central" "internal" "external" "public" "private" "secure" "fast")
+
+    local prefix=${prefixes[$RANDOM % ${#prefixes[@]}]}
+    local env=${environments[$RANDOM % ${#environments[@]}]}
+    local suffix=${suffixes[$RANDOM % ${#suffixes[@]}]}
+    local num=$((RANDOM % 999 + 1))
+
+    echo "${prefix}-${env}-${suffix}-${num}.example.com"
+}
+
+# Function to generate random path name
+generate_path_name() {
+    local paths=("unity" "data" "api" "admin" "dashboard" "portal" "app" "service" "gateway" "auth")
+    local versions=("v1" "v2" "v3" "beta" "alpha" "latest" "stable" "dev")
+    local features=("core" "main" "lite" "pro" "basic" "advanced" "secure" "fast")
+
+    local path=${paths[$RANDOM % ${#paths[@]}]}
+    local version=${versions[$RANDOM % ${#versions[@]}]}
+    local feature=${features[$RANDOM % ${#features[@]}]}
+
+    echo "/${path}/${version}/${feature}"
+}
+
+# Function to cleanup test files from S3
+cleanup_s3() {
+    log "INFO" "Cleaning up test files from S3 bucket: $S3_BUCKET_NAME"
+
+    # List and delete test files
+    local test_files=$(aws s3 ls "s3://$S3_BUCKET_NAME/" | grep "$TEST_PREFIX" | awk '{print $4}' || true)
+
+    if [ -n "$test_files" ]; then
+        echo "$test_files" | while read -r file; do
+            if [ -n "$file" ]; then
+                log "INFO" "Deleting s3://$S3_BUCKET_NAME/$file"
+                aws s3 rm "s3://$S3_BUCKET_NAME/$file"
+            fi
+        done
+    else
+        log "INFO" "No test files found to clean up"
+    fi
+}
+
+# Function to verify files in venues directory
+verify_venues_dir() {
+    local expected_files=("$@")
+    local wait_time=60
+
+    log "INFO" "Waiting $wait_time seconds for Apache reload to complete..."
+    sleep $wait_time
+
+    log "INFO" "Verifying files in $VENUES_DIR"
+
+    # Get actual files (only test files)
+    local actual_files=()
+    if sudo ls "$VENUES_DIR"/ 2>/dev/null | grep -q "$TEST_PREFIX"; then
+        while IFS= read -r file; do
+            actual_files+=("$file")
+        done < <(sudo ls "$VENUES_DIR"/ | grep "$TEST_PREFIX" | sort)
+    fi
+
+    # Sort expected files for comparison
+    local sorted_expected=($(printf '%s\n' "${expected_files[@]}" | sort))
+
+    log "INFO" "Expected files (${#sorted_expected[@]}): ${sorted_expected[*]}"
+    log "INFO" "Actual files (${#actual_files[@]}): ${actual_files[*]}"
+
+    # Check if arrays match
+    local success=true
+
+    if [ ${#sorted_expected[@]} -ne ${#actual_files[@]} ]; then
+        log "ERROR" "File count mismatch! Expected ${#sorted_expected[@]}, found ${#actual_files[@]}"
+        success=false
+    else
+        for i in "${!sorted_expected[@]}"; do
+            if [ "${sorted_expected[$i]}" != "${actual_files[$i]}" ]; then
+                log "ERROR" "File mismatch at position $i: expected '${sorted_expected[$i]}', found '${actual_files[$i]}'"
+                success=false
+                break
+            fi
+        done
+    fi
+
+    if [ "$success" = true ]; then
+        log "SUCCESS" "All files verified successfully! ✅"
+        return 0
+    else
+        log "ERROR" "File verification failed! ❌"
+
+        # Show detailed diff
+        log "INFO" "Files missing from venues directory:"
+        for file in "${sorted_expected[@]}"; do
+            if ! 
printf '%s\n' "${actual_files[@]}" | grep -q "^$file$"; then + log "WARNING" " Missing: $file" + fi + done + + log "INFO" "Extra files in venues directory:" + for file in "${actual_files[@]}"; do + if ! printf '%s\n' "${sorted_expected[@]}" | grep -q "^$file$"; then + log "WARNING" " Extra: $file" + fi + done + + return 1 + fi +} + +# Function to run race condition test +run_race_test() { + local test_num=$1 + + log "INFO" "🏁 Starting Race Condition Test #$test_num" + + # Generate random number of files + local num_files=$((RANDOM % (MAX_FILES - MIN_FILES + 1) + MIN_FILES)) + log "INFO" "Generating $num_files random config files" + + # Generate file names and delays + local files=() + local delays=() + local temp_dir=$(mktemp -d) + + for i in $(seq 1 $num_files); do + local filename="${TEST_PREFIX}venue-${test_num}-${i}.conf" + local proxy_name=$(generate_proxy_name) + local path_name=$(generate_path_name) + local delay=$((RANDOM % MAX_DELAY_SECONDS)) + + files+=("$filename") + delays+=("$delay") + + # Generate config file + local filepath="$temp_dir/$filename" + generate_config "$proxy_name" "$path_name" > "$filepath" + + log "INFO" "Created $filename (proxy: $proxy_name, path: $path_name, delay: ${delay}s)" + done + + # Sort by delay to create the upload schedule + local upload_schedule=() + for i in "${!files[@]}"; do + upload_schedule+=("${delays[$i]}:${files[$i]}") + done + IFS=$'\n' upload_schedule=($(sort -n <<< "${upload_schedule[*]}")) + + log "INFO" "Upload schedule (delay:filename):" + for item in "${upload_schedule[@]}"; do + log "INFO" " $item" + done + + # Start uploads with timing + local start_time=$(date +%s) + log "INFO" "🚀 Starting timed uploads at $(date '+%H:%M:%S')" + + for item in "${upload_schedule[@]}"; do + local delay_time=${item%:*} + local filename=${item#*:} + local filepath="$temp_dir/$filename" + + # Calculate time to wait + local current_time=$(date +%s) + local elapsed=$((current_time - start_time)) + local wait_time=$((delay_time - elapsed)) + + if [ $wait_time -gt 0 ]; then + log "INFO" "⏱️ Waiting ${wait_time}s before uploading $filename" + sleep $wait_time + fi + + # Upload to S3 + local upload_time=$(date '+%H:%M:%S') + log "INFO" "📤 Uploading $filename at $upload_time" + aws s3 cp "$filepath" "s3://$S3_BUCKET_NAME/$filename" + + if [ $? -eq 0 ]; then + log "SUCCESS" "✅ Uploaded $filename" + else + log "ERROR" "❌ Failed to upload $filename" + fi + done + + local end_time=$(date +%s) + local total_time=$((end_time - start_time)) + log "INFO" "📊 All uploads completed in ${total_time}s" + + # Verify results + verify_venues_dir "${files[@]}" + local verify_result=$? + + # Cleanup temp directory + rm -rf "$temp_dir" + + return $verify_result +} + +# Main execution +main() { + log "INFO" "🧪 Apache Config Reload Race Condition Tester" + log "INFO" "S3 Bucket: $S3_BUCKET_NAME" + log "INFO" "File Range: $MIN_FILES-$MAX_FILES files" + log "INFO" "Max Delay: $MAX_DELAY_SECONDS seconds" + log "INFO" "Test Prefix: $TEST_PREFIX" + + # Check dependencies + if ! command -v aws >/dev/null 2>&1; then + log "ERROR" "AWS CLI is required but not installed" + exit 1 + fi + + # Verify S3 bucket access + log "INFO" "Verifying S3 bucket access..." + if ! 
aws s3 ls "s3://$S3_BUCKET_NAME/" >/dev/null 2>&1; then + log "ERROR" "Cannot access S3 bucket: $S3_BUCKET_NAME" + exit 1 + fi + log "SUCCESS" "S3 bucket access verified" + + # Cleanup any existing test files first + cleanup_s3 + + # Run tests + local test_count=${1:-1} + local passed=0 + local failed=0 + + for test_num in $(seq 1 $test_count); do + log "INFO" "==============================================" + + if run_race_test $test_num; then + log "SUCCESS" "🎉 Test #$test_num PASSED" + ((passed++)) + else + log "ERROR" "💥 Test #$test_num FAILED" + ((failed++)) + fi + + # Cleanup S3 after each test + log "INFO" "Cleaning up after test #$test_num" + cleanup_s3 + + # Wait between tests if not the last one + if [ $test_num -lt $test_count ]; then + log "INFO" "Waiting 30 seconds before next test..." + sleep 30 + fi + done + + # Final results + log "INFO" "==============================================" + log "INFO" "📊 FINAL RESULTS" + log "SUCCESS" "✅ Passed: $passed" + if [ $failed -gt 0 ]; then + log "ERROR" "❌ Failed: $failed" + else + log "INFO" "❌ Failed: $failed" + fi + log "INFO" "🎯 Success Rate: $(( passed * 100 / (passed + failed) ))%" + + if [ $failed -eq 0 ]; then + log "SUCCESS" "🏆 All tests passed!" + exit 0 + else + log "ERROR" "💔 Some tests failed!" + exit 1 + fi +} + +# Handle command line arguments +if [ "$1" = "--help" ] || [ "$1" = "-h" ]; then + echo "Usage: $0 [TEST_COUNT]" + echo "" + echo "Environment variables:" + echo " S3_BUCKET_NAME S3 bucket for config files (default: ucs-shared-services-apache-config-dev-test)" + echo " MIN_FILES Minimum files per test (default: 3)" + echo " MAX_FILES Maximum files per test (default: 8)" + echo " MAX_DELAY_SECONDS Maximum delay between uploads (default: 30)" + echo "" + echo "Examples:" + echo " $0 # Run 1 test" + echo " $0 5 # Run 5 tests" + echo " S3_BUCKET_NAME=my-bucket $0 3 # Custom bucket, 3 tests" + exit 0 +fi + +# Run main function +main "${1:-1}" \ No newline at end of file From 9f668ead26c1f5b8a9de160843515f848fb7bc65 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Wed, 4 Jun 2025 13:24:41 -0500 Subject: [PATCH 15/18] Updates for test script --- terraform-ss-proxy/test_race_conditions.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/terraform-ss-proxy/test_race_conditions.sh b/terraform-ss-proxy/test_race_conditions.sh index f4d7f7fa..fd41867c 100755 --- a/terraform-ss-proxy/test_race_conditions.sh +++ b/terraform-ss-proxy/test_race_conditions.sh @@ -300,6 +300,7 @@ main() { for test_num in $(seq 1 $test_count); do log "INFO" "==============================================" + log "INFO" "=== $test_num of $test_count ===" if run_race_test $test_num; then log "SUCCESS" "🎉 Test #$test_num PASSED" @@ -358,4 +359,4 @@ if [ "$1" = "--help" ] || [ "$1" = "-h" ]; then fi # Run main function -main "${1:-1}" \ No newline at end of file +main "${$1:-1}" \ No newline at end of file From c3ff5050cfc5df1ac7deb51b34c02b1257cb46d6 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Wed, 4 Jun 2025 13:26:29 -0500 Subject: [PATCH 16/18] Fix typo --- terraform-ss-proxy/test_race_conditions.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/terraform-ss-proxy/test_race_conditions.sh b/terraform-ss-proxy/test_race_conditions.sh index fd41867c..b6b82df2 100755 --- a/terraform-ss-proxy/test_race_conditions.sh +++ b/terraform-ss-proxy/test_race_conditions.sh @@ -359,4 +359,4 @@ if [ "$1" = "--help" ] || [ "$1" = "-h" ]; then fi # Run main function -main "${$1:-1}" \ No newline at end of file +main "${1:-1}" 
\ No newline at end of file From 683d5b0301e05e4f7a38ead0f047a9e4aed4566e Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Wed, 4 Jun 2025 13:37:22 -0500 Subject: [PATCH 17/18] Test script fixes --- terraform-ss-proxy/test_race_conditions.sh | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/terraform-ss-proxy/test_race_conditions.sh b/terraform-ss-proxy/test_race_conditions.sh index b6b82df2..9cb01ee9 100755 --- a/terraform-ss-proxy/test_race_conditions.sh +++ b/terraform-ss-proxy/test_race_conditions.sh @@ -157,7 +157,6 @@ verify_venues_dir() { if [ "$success" = true ]; then log "SUCCESS" "All files verified successfully! ✅" - return 0 else log "ERROR" "File verification failed! ❌" @@ -176,7 +175,6 @@ verify_venues_dir() { fi done - return 1 fi } @@ -300,7 +298,7 @@ main() { for test_num in $(seq 1 $test_count); do log "INFO" "==============================================" - log "INFO" "=== $test_num of $test_count ===" + log "INFO" "=== $(printf '%4s' $test_num) of $(printf '%4s' $test_count) ===" if run_race_test $test_num; then log "SUCCESS" "🎉 Test #$test_num PASSED" From 30b323194fc91e78783333649903ad9c87f56824 Mon Sep 17 00:00:00 2001 From: Jeff Leach Date: Fri, 6 Jun 2025 14:33:35 -0500 Subject: [PATCH 18/18] Fix for test script --- terraform-ss-proxy/test_race_conditions.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/terraform-ss-proxy/test_race_conditions.sh b/terraform-ss-proxy/test_race_conditions.sh index 9cb01ee9..9ac6d69b 100755 --- a/terraform-ss-proxy/test_race_conditions.sh +++ b/terraform-ss-proxy/test_race_conditions.sh @@ -4,8 +4,6 @@ # Generates random config files and uploads them to S3 with random timing # Then verifies all files are present in /etc/apache2/venues.d/ -set -e - # Configuration S3_BUCKET_NAME="${S3_BUCKET_NAME:-ucs-shared-services-apache-config-dev-test}" MIN_FILES="${MIN_FILES:-3}" @@ -157,6 +155,7 @@ verify_venues_dir() { if [ "$success" = true ]; then log "SUCCESS" "All files verified successfully! ✅" + return 0 else log "ERROR" "File verification failed! ❌" @@ -175,6 +174,7 @@ verify_venues_dir() { fi done + return 1 fi }
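
The FIFO deduplication window that the reload pipeline relies on can also be probed by hand from a host with queue access. Below is a minimal sketch using the AWS CLI; the queue URL is a placeholder for the value Terraform creates, and the probe itself is not part of these patches:

```bash
# Hypothetical manual probe of SQS FIFO deduplication (queue URL is a placeholder).
QUEUE_URL="https://sqs.us-west-2.amazonaws.com/<ACCOUNT_ID>/apache-config-reload.fifo"
BOUNDARY=$(( ($(date +%s) + 14) / 15 * 15 ))   # same ceil-to-15s rounding as the Lambda

# Two sends sharing a MessageDeduplicationId inside the 5-minute dedup window:
# both calls return success, but SQS enqueues only the first message.
for i in 1 2; do
  aws sqs send-message \
    --queue-url "$QUEUE_URL" \
    --message-body "{\"message\":\"S3 Config Changed\",\"targetBoundary\":$BOUNDARY}" \
    --message-group-id apache-config-reload \
    --message-deduplication-id "$BOUNDARY"
done
```

If the Lambda event source mapping is active, only one reload should fire at the boundary, which is the same behavior the race-condition test script verifies end to end.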