
Event-driven infrastructure as code

Introduction

This is the first post in my Out with the old blog where I explore handling various cloud-based workloads with new techniques that can make your technology stack more scalable and reliable while also reducing costs and improving developer productivity.

Since all modern cloud projects should have an infrastructure as code (IaC) foundation, that’s where this series starts. Specifically, using the AWS CDK to solve a simple use case - process files uploaded to an S3 bucket. To keep things simple, the processing will be restricted to reading a JSON file containing a single array of objects and storing the objects in DynamoDB.
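The processing logic itself is deliberately trivial. As a sketch (a hypothetical helper, not the exact code from the repository), the core of it is just parsing the file contents and validating that they form an array:

```typescript
// Hypothetical parsing helper: turn a JSON file's contents into an
// array of items ready to be written to DynamoDB. Names here are
// illustrative, not the repository's actual code.
function parseItems(fileContents: string): Record<string, unknown>[] {
  const parsed: unknown = JSON.parse(fileContents)
  if (!Array.isArray(parsed)) {
    throw new Error('expected the file to contain a single JSON array')
  }
  return parsed as Record<string, unknown>[]
}
```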

Architecture

This article covers two different architectural approaches to solving this problem. First it describes a containerized application using Amazon Elastic Container Service (ECS) and a Fargate task. Next it details how to build the same application with a serverless approach using AWS Lambda.

Methodology

I chose the CDK for this project because, while there are many great IaC tools available (e.g., Terraform), I wanted to stick with AWS tools/services. That left me to choose between CloudFormation and CDK, which is a no-brainer for the reasons outlined here.

I also created a version of the infrastructure built using AWS SAM. SAM is a good fit for small serverless projects and one-offs but has the same issues as CloudFormation when the project gets large. To keep this post short, I didn’t detail the SAM template but included it in the corresponding Git repository.

This article is not intended as an exhaustive how-to reference and assumes a passing familiarity with the AWS services involved. I’ve included links to each service mentioned in case you are not familiar with them.

The complete source code for this article can be found here.

ECS + Fargate

I’m a big believer in using managed services whenever possible, and the easiest way to run containerized applications on AWS is with Amazon Elastic Container Service (ECS) and Fargate.

The following sections describe the individual components of the CDK code for the ECS + Fargate application.

CloudTrail

The application needs to run when a new file is added to an S3 bucket. Unfortunately, ECS tasks are not valid destinations for S3 event notifications, so I needed to enable CloudTrail on the S3 bucket in order to receive data events:

new Trail(this, 'trail').addS3EventSelector([{
  bucket
}])

Note that CloudTrail has costs associated with it that don’t apply to S3 event destinations, which are free.

Cluster

ECS tasks and services run in a cluster. Fortunately, the CDK provides reasonable cluster defaults but, in order to save on NAT gateway costs, I restricted the cluster’s VPC to a single availability zone:

const cluster = new ecs.Cluster(this, 'cluster', {
  vpc: new ec2.Vpc(this, 'vpc', {
    maxAzs: 1
  })
})

Container

For the application code itself, I created a task definition and the container in which it will run:

const taskDefinition = new ecs.FargateTaskDefinition(this, 'task-definition')
const container = taskDefinition.addContainer('container', {
  image: ecs.ContainerImage.fromAsset('src/'),
  environment: {
    TABLE_NAME: table.tableName
  },
  logging: new ecs.AwsLogDriver({
    streamPrefix: 'processS3'
  })
})
bucket.grantRead(taskDefinition.taskRole)
table.grantWriteData(taskDefinition.taskRole)

The code above illustrates one of the CDK’s biggest benefits - abstracting away low-level functionality with high-level constructs.

First, the ecs.ContainerImage.fromAsset() method causes the CDK to automatically build your Docker image and upload it to Amazon Elastic Container Registry when you deploy new code.

Second, the grantRead() and grantWriteData() methods create an IAM role and define the appropriate policies allowing the application to read from the S3 bucket and write to the DynamoDB table.
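For reference, the S3 half of those grants expands into a policy statement roughly like the following (the exact actions may vary by CDK version, and the bucket ARN is a placeholder):

```json
{
  "Effect": "Allow",
  "Action": ["s3:GetObject*", "s3:GetBucket*", "s3:List*"],
  "Resource": ["arn:aws:s3:::my-bucket", "arn:aws:s3:::my-bucket/*"]
}
```

Writing statements like this by hand (and keeping them in sync as resources change) is exactly the kind of boilerplate the grant methods eliminate.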

Interestingly, Fargate tasks do not log to CloudWatch Logs by default. To get logs out of the container, you’ll need to configure the AWS log driver for ECS as shown above.

Event Notification

The final piece of the infrastructure is an Amazon EventBridge rule to launch the ECS task when a new object is added to the S3 bucket:

Trail.onEvent(this, 's3-rule', {
  eventPattern: {
    source: ['aws.s3'],
    detail: {
      eventName: ['PutObject', 'CompleteMultipartUpload']
    }
  },
  target: new EcsTask({
    cluster,
    taskDefinition,
    containerOverrides: [{
      containerName: container.containerName,
      environment: [
        {
          name: 'BUCKET_NAME',
          value: EventField.fromPath('$.detail.requestParameters.bucketName')
        }, {
          name: 'OBJECT_NAME',
          value: EventField.fromPath('$.detail.requestParameters.key')
        }
      ]
    }]
  })
})

One minor thing to note here: EventBridge rules don’t support suffix matching, so the application code needs to handle (ignore) non-JSON files.
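In practice, the guard at the top of the task’s entry point can be as simple as this (a hypothetical helper, shown only to illustrate the idea):

```typescript
// Since the EventBridge rule can't match on a '.json' suffix, the
// container's entry point filters object keys itself before doing
// any work. Illustrative helper, not the repository's exact code.
function shouldProcess(objectKey: string): boolean {
  return objectKey.toLowerCase().endsWith('.json')
}
```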

Lambda

The Lambda version, as you would expect, is less complex than the containerized one. Here is the entire CDK construct to achieve the same functionality as above:

const handler = new lambda.Function(this, 'handler', {
  code: lambda.Code.fromAsset('src/'),
  handler: 'index.handler',
  runtime: lambda.Runtime.NODEJS_12_X,
  environment: {
    DDB_TABLE: table.tableName
  },
  events: [new S3EventSource(bucket, {
    events: [s3.EventType.OBJECT_CREATED],
    filters: [{
      suffix: '.json'
    }]
  })]
})
bucket.grantRead(handler)
table.grantWriteData(handler)

As with the containerized solution, the CDK automatically builds and deploys your code, configures an event to invoke it when a new object is added to the bucket, and grants your code the rights to read from the S3 bucket and write to the DynamoDB table.
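For completeness, here is a sketch of the event-parsing half of what `index.handler` might look like (the DynamoDB writes are elided, and the helper name is hypothetical). One subtlety worth calling out: S3 delivers object keys URL-encoded, with `+` standing in for spaces, so they must be decoded before use:

```typescript
// Minimal shape of the S3 event records the Lambda receives.
interface S3Record {
  s3: { bucket: { name: string }, object: { key: string } }
}

// Extract the bucket and decoded object key from each record.
// S3 event keys arrive URL-encoded ('+' for spaces), so decode them
// before calling GetObject. Illustrative helper, not the repo's code.
function objectsFromEvent(event: { Records: S3Record[] }): { bucket: string, key: string }[] {
  return event.Records.map(r => ({
    bucket: r.s3.bucket.name,
    key: decodeURIComponent(r.s3.object.key.replace(/\+/g, ' '))
  }))
}
```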

Conclusion

The CDK does a great job abstracting away a lot of your infrastructure’s plumbing. However, the containerized solution is still pretty complicated, requiring lots of knowledge about different AWS services. Is the added complexity worth the trouble? It depends…

Control

The containerized solution gives you a lot more control over the computing environment. This is particularly important if you need access to a GPU or more resources (CPUs, memory, etc.) than the FaaS provider allows.

Usage Pattern

Your choice of architecture will depend largely on your usage pattern:

  • Lambdas have a 15-minute timeout, so if processing a file takes longer than that, the invocation will stop abruptly. That said, if you have a process that takes longer than 15 minutes to run, you should consider an architecture that scales processing horizontally.
  • While containerized tasks can run for as long as needed, this solution isn’t ideal if you process a lot of files. Fargate tasks on ECS take about a minute to spin up and, since the tasks are running in a VPC, you have the potential to exhaust your VPC’s resources.

Cost

Even though ECS and Fargate tasks utilize a serverless pricing model (you only pay for what you use), the NAT gateway they need does not. The AWS managed NAT gateway costs a bit more than $30 per month per availability zone. Of course this cost can be amortized over all the tasks/services running in the VPC.

Comparing actual compute costs is difficult without more detail about the usage patterns. Lambda doesn’t charge for the first million invocations each month, making it virtually free for many workloads. Fargate tasks have a small hourly fee with no free tier. As a rule of thumb, until you have lots of traffic, Lambda will be less expensive. For a more detailed analysis, see this article on getting it right between EC2, Fargate and Lambda.

In addition to the hard costs, you should also consider the soft costs. Setting up and managing clusters, including resource balancing, auto scaling and security, takes an ongoing effort by trained (i.e., expensive) employees. With managed services like Lambda, the cloud provider handles these complexities for you, freeing up your developers to spend more time implementing business logic instead of managing infrastructure.

Finally

I’d love to hear what you think about this post. Please post your comments below.

Also, if you have a use case you are wondering how to handle with a serverless architecture, let me know and I may tackle it in an upcoming article.

This post is licensed under CC BY 4.0 by the author.
