# Prompt: Implement Event-Driven ECS Fargate Trigger from S3 Upload

## 🎯 Goal

Build an AWS infrastructure pattern using Terraform and Python that runs a containerized Python ingestion script in ECS Fargate whenever a new file is uploaded to an S3 bucket. The ingestion script processes a CSV file and writes to a database.

## 🔧 Implementation Summary

1. **S3 Upload → EventBridge**
   - Enable S3 event notifications via EventBridge when new `.csv` files are uploaded to a specific bucket and prefix.
2. **EventBridge → Lambda**
   - An EventBridge rule triggers a Python Lambda function.
   - The Lambda extracts the S3 bucket and key from the event payload.
   - The Lambda calls ECS `RunTask` with:
     - `INPUT_FILE=s3://<bucket>/<key>` (as an env var)
     - `DATABASE_URL=<value>` (as an env var)
3. **Lambda → ECS Fargate**
   - Launches an ECS Fargate task with:
     - a predefined task definition
     - the prebuilt ingestion container image
     - a task IAM role with access to S3 + DB
     - CPU and memory parameters
     - logging to CloudWatch

## 🐍 Ingestion Script

A Python script already exists. It:

- is containerized and pushed to ECR
- accepts `INPUT_FILE` and `DATABASE_URL` via CLI or env vars
- uses `UPath` and `pandas` to read from S3
- logs via a custom logger

See script: `src/ingestion/cli/ingest_nec_from_xlb.py` in this repo.

## 🔧 Terraform Resources (Required)

This repo already contains Terraform code for much of the necessary infrastructure; see `terraform/infra/main.tf`. Rely on the existing infra where feasible.

Generate Terraform code to create:

1. **ECS Cluster**
   - Launch type: `FARGATE`
   - Mimic the existing ECS cluster
2. **ECS Task Definition**
   - Mimic the existing ECS task definition
   - Container image: `<ECR image URI>`
   - CPU/memory: 512/1024
   - Environment variables passed at runtime
   - Logging to CloudWatch
   - Execution and task roles, where needed or recommended
3.
**IAM Roles**
   - Lambda execution role:
     - `ecs:RunTask`, `iam:PassRole`
   - ECS task role:
     - `s3:GetObject` on the input bucket
     - Access to Secrets Manager & RDS
4. **Lambda Function**
   - Runtime: Python 3.11
   - Source code inline
   - Extract bucket/key from the EventBridge payload
   - Call ECS `RunTask` with env vars
   - Add optional retry logic (e.g. 2 attempts, 5 sec delay)
5. **S3 Bucket + EventBridge Integration**
   - Existing or new bucket: `<bucket name>`
   - EventBridge enabled for `s3:ObjectCreated:*`
   - Prefix filter: e.g. `input/`
   - EventBridge rule targeting the Lambda
6. **CloudWatch Log Groups**
   - For the Lambda
   - For the ECS task

## 📝 Python Lambda Function (Requirements)

- Accept EventBridge input from S3
- Validate the `.csv` extension
- Construct the full `s3://bucket/key` URI
- Call ECS `RunTask` using boto3
- Pass `INPUT_FILE` and `DATABASE_URL` as env vars
- Handle errors and log them
- Retry a configurable number of times on transient errors

## 🧠 Assumptions

- The container logs to stdout/stderr
- The container exits with a non-zero code on failure
- Each task finishes in under 15 minutes

## 📦 Additional Output

Please generate:

- `lambda/trigger_ecs.py`: Lambda function code
- `ecs_task.tf`: ECS task and cluster resources
- `lambda.tf`: Lambda and EventBridge rule
- `iam.tf`: IAM roles and policies
- `variables.tf` and `outputs.tf`: optional, for clarity

---

## 🧠 Necessary information

1. **What's the name of the S3 bucket and key prefix to monitor?**
   > `s3://com.greenlite.file-ingest/nec-uploads/`
2. **What's the full URI of the container image in ECR?**
   > `841622231873.dkr.ecr.us-east-1.amazonaws.com/file-ingest-service:latest`
3. **What's the ECS cluster name (or should we create a new one)?**
   > `file-ingest-service-${ENV}-cluster`, where `ENV` is an environment variable specifying the runtime environment (e.g. dev, stg, prd)
4. **What VPC/subnets/security groups should the ECS task run in?**
   > Derive this from the Terraform that currently exists in this repo and recommend settings.
5.
**Where should logs go (CloudWatch log group name)?**
   > e.g. `/ecs/file-ingest-service/nec-ingestion-task`
6. **What is the database connection method?**
   > `DATABASE_URL` is pulled from Secrets Manager under the key `file-ingest-service-database-url`.
7. **Should the ECS task run in a public subnet or behind NAT?**
   > It should run in a private subnet with access to S3, ECR, and RDS. Make use of the existing VPC endpoints.
8. **Retry behavior for Lambda on failure?**
   > 5 retries with a 15-second delay, logging on each retry.
9. **Do you want lifecycle policies for the bucket or ECR images?**
   > No need.
10. **Should the Lambda be written as inline code in Terraform, or packaged as a ZIP from a directory?**
    > If it can be kept simple, implement it inline in Terraform.
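To make the expected shape of `lambda/trigger_ecs.py` concrete, here is a minimal sketch of the handler described above: extract bucket/key from the EventBridge payload, validate the `.csv` extension, build the `s3://` URI, and call ECS `RunTask` with retries. The cluster, task definition, container name, subnets, and security groups are read from environment variables that the generated Terraform would set; the default values and helper names here are illustrative assumptions, not part of the existing repo.

```python
import json
import logging
import os
import time

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Assumed to be injected by Terraform as Lambda environment variables;
# the fallback values below are placeholders for illustration only.
CLUSTER = os.environ.get("ECS_CLUSTER", "file-ingest-service-dev-cluster")
TASK_DEFINITION = os.environ.get("TASK_DEFINITION", "nec-ingestion-task")
CONTAINER_NAME = os.environ.get("CONTAINER_NAME", "ingest")
SUBNETS = os.environ.get("SUBNETS", "").split(",")
SECURITY_GROUPS = os.environ.get("SECURITY_GROUPS", "").split(",")
DATABASE_URL = os.environ.get("DATABASE_URL", "")
MAX_RETRIES = int(os.environ.get("MAX_RETRIES", "5"))
RETRY_DELAY_SECONDS = int(os.environ.get("RETRY_DELAY_SECONDS", "15"))


def parse_event(event):
    """Extract (bucket, key) from an EventBridge S3 Object Created event.

    Returns None if the object is not a .csv file.
    """
    detail = event["detail"]
    bucket = detail["bucket"]["name"]
    key = detail["object"]["key"]
    if not key.lower().endswith(".csv"):
        return None
    return bucket, key


def run_task(ecs, input_uri):
    """Call ECS RunTask, retrying on transient errors with a fixed delay."""
    overrides = {
        "containerOverrides": [{
            "name": CONTAINER_NAME,
            "environment": [
                {"name": "INPUT_FILE", "value": input_uri},
                {"name": "DATABASE_URL", "value": DATABASE_URL},
            ],
        }]
    }
    last_error = None
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            response = ecs.run_task(
                cluster=CLUSTER,
                taskDefinition=TASK_DEFINITION,
                launchType="FARGATE",
                networkConfiguration={
                    "awsvpcConfiguration": {
                        "subnets": SUBNETS,
                        "securityGroups": SECURITY_GROUPS,
                        # Private subnet; S3/ECR/RDS reached via VPC endpoints.
                        "assignPublicIp": "DISABLED",
                    }
                },
                overrides=overrides,
            )
            if response.get("failures"):
                raise RuntimeError(json.dumps(response["failures"]))
            return response["tasks"][0]["taskArn"]
        except Exception as exc:
            last_error = exc
            logger.warning("RunTask attempt %d/%d failed: %s",
                           attempt, MAX_RETRIES, exc)
            if attempt < MAX_RETRIES:
                time.sleep(RETRY_DELAY_SECONDS)
    raise RuntimeError(f"RunTask failed after {MAX_RETRIES} attempts") from last_error


def handler(event, context):
    # boto3 is imported lazily so the pure helpers above can be unit-tested
    # without AWS credentials or the SDK installed.
    import boto3

    parsed = parse_event(event)
    if parsed is None:
        logger.info("Ignoring non-CSV object: %s", event["detail"]["object"]["key"])
        return {"status": "skipped"}
    bucket, key = parsed
    input_uri = f"s3://{bucket}/{key}"
    task_arn = run_task(boto3.client("ecs"), input_uri)
    logger.info("Started ECS task %s for %s", task_arn, input_uri)
    return {"status": "started", "taskArn": task_arn}
```

Keeping `parse_event` and `run_task` as separate functions (with the ECS client passed in) leaves the retry and validation logic testable with a stub client; whether this fits within Terraform's inline-source size limits, per question 10, would need to be checked once the final code is written.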