Before starting the trip better check whether you’re properly equipped

After reading the first part of this series, you should have a general idea about the business and technical goals that have a direct impact on the choice of event-based architecture. This time, I am going to cover tools, AWS services and third-party frameworks that speed up different parts of the process. I will also show you some values we got and problems we came across.

Equipment

Basically, we’ve collected the tools we use in event-based scenarios (serverless, for some people) in one bag called “BaseGear”:

Serverless Framework (https://serverless.com/)
Terraform/Terragrunt (https://github.com/gruntwork-io/terragrunt)
CI/CD pipeline for serverless/terragrunt deployments - designed and implemented by our teammate Magda
Amplify (https://aws-amplify.github.io) and Vue.js (https://vuejs.org)

We’ve already posted articles about Amplify and Vue (https://chaosgears.com/lets-start-developing-using-vue-and-amplify/) and about CI/CD (https://chaosgears.com/serverless-pipeline-for-ec2-configuration-management-3-solution-architecture-and-implementation-of-serverless-pipeline/) on our blog, so I’ll skip those topics in this article.

Serverless Framework

This framework has been described in one of the previous articles, so I won’t focus on its features. However, let me give you some advice that, hopefully, will be valuable to your projects:

We always build separate folders per microservice, per needed lambda layer, and create a new “serverless project” within that. This particular project had the following structure:

~/common/bango/
.
├── customer-bango-backend
├── customer-bango-algorithm-equipment-layer
├── customer-bango-aws-custom-logging-layer
├── customer-bango-aws-ssm-cache-layer
├── customer-bango-frontend
└── customer-bango-image-terragrunt

~/common/bango/customer-bango-backend/
.
├── README.md
├── algorithms-service
├── analysis-service
├── system-health-service
└── terraform

~/common/bango/customer-bango-backend/algorithms-service
.
├── files
├── functions
├── messages
├── node_modules
├── package-lock.json
├── package.json
├── requirements.txt
├── resources
├── serverless.yml
└── tests

We try to keep database and S3 bucket definitions separate from the serverless stack, especially in the “prod” environment, rather than putting such services together with the rest. We call them static resources and define them in Terraform code. Serverless Framework uses CloudFormation behind the scenes, so it manages the full lifecycle of every resource declared in the stack. Keep in mind that one simple mistake can accidentally delete those resources, such as DynamoDB tables. So either set DeletionPolicy: Retain or move the definition outside the stack, like we do. Better safe than sorry...
Use “exclude” to keep control over the packaging process and pack only the code you need (exclude and include let you define globs for files that are left out of, or added back into, the resulting artifact).
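
As a rough illustration of the DeletionPolicy advice, here is a minimal Python sketch that scans a CloudFormation template (already loaded as a dictionary, for example from the compiled template in the .serverless directory) and reports stateful resources that are not protected. The function name, the set of resource types, and the sample template are assumptions made for this example; they are not taken from our project.

```python
# Resource types treated as "static" (stateful) in this illustration only.
STATEFUL_TYPES = {"AWS::DynamoDB::Table", "AWS::S3::Bucket"}


def unprotected_resources(template):
    """Return logical IDs of stateful resources lacking DeletionPolicy: Retain."""
    resources = template.get("Resources", {})
    return sorted(
        logical_id
        for logical_id, resource in resources.items()
        if resource.get("Type") in STATEFUL_TYPES
        and resource.get("DeletionPolicy") != "Retain"
    )


if __name__ == "__main__":
    # Hypothetical template fragment, written only for this example.
    template = {
        "Resources": {
            "AlgorithmsTable": {
                "Type": "AWS::DynamoDB::Table",
                "DeletionPolicy": "Retain",
            },
            "PackagesBucket": {"Type": "AWS::S3::Bucket"},
            "Worker": {"Type": "AWS::Lambda::Function"},
        }
    }
    print(unprotected_resources(template))  # ['PackagesBucket']
```

A check like this could run in the pipeline before deployment, so that a forgotten DeletionPolicy fails the build instead of surfacing after a table is gone.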

DynamoDB Streams definition and its “serverless.yml” problem

NOTE (from AWS documentation): “Lambda polls shards in your DynamoDB Stream for records at a base rate of 4 times per second. When records are available, Lambda invokes your function and waits for the result. If processing succeeds, Lambda resumes polling until it receives more records.”

To avoid less efficient synchronous Lambda function invocations, set the batchSize or batchWindow parameter. The batch size configures the limit parameter in the GetRecords API: DynamoDB Streams will return up to that many records if they are available in the buffer. The batchWindow property specifies the maximum amount of time to wait before triggering a Lambda invocation with a batch of records.
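
For readers who manage the trigger outside of serverless.yml, the same two settings can be sketched with Boto3. In the Lambda API, batchSize corresponds to BatchSize and batchWindow to MaximumBatchingWindowInSeconds. The helper name, its default values, and the sample ARN below are assumptions for this example only.

```python
def stream_mapping_params(stream_arn, function_name,
                          batch_size=100, batch_window_seconds=5):
    """Build the keyword arguments for Lambda's create_event_source_mapping."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    if batch_window_seconds < 0:
        raise ValueError("batch_window_seconds cannot be negative")
    return {
        "EventSourceArn": stream_arn,
        "FunctionName": function_name,
        "StartingPosition": "LATEST",
        # Upper bound on records handed to one invocation.
        "BatchSize": batch_size,
        # Maximum time Lambda waits while gathering a batch.
        "MaximumBatchingWindowInSeconds": batch_window_seconds,
    }


if __name__ == "__main__":
    # Hypothetical stream ARN and function name.
    params = stream_mapping_params(
        "arn:aws:dynamodb:eu-west-1:123456789012:table/example/stream/2019-01-01T00:00:00.000",
        "example-function",
    )
    print(params)
    # With AWS credentials configured, the mapping could then be created with:
    #   import boto3
    #   boto3.client("lambda").create_event_source_mapping(**params)
```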

The issue with the configuration in “serverless.yml” is that when you use the schema presented below, it doesn’t respect the Lambda function trigger configuration seen in the AWS Console.

NOTE: We suggest moving the configuration of the DynamoDB Stream and the DynamoDB Table to a separate file in another directory. It simply makes the structure tidier and better organized:

Lambda function trigger configuration in AWS console:

Lambda layers definition - each layer we use for Lambda functions is also defined inside the serverless file and pinned with a dedicated version variable:

Generally, I haven’t come across any problems with layers defined this way. However, I did notice one thing about the CloudFormation parameters that hold the layers’ values. Let’s use a real case as an example:

We published changes for a selected group of Lambda functions via a CloudFormation stack. Below, I pasted only the necessary part of the parameters section:

The point is that we’ve automated the whole process. We coded a function that gets the latest available Lambda version and then publishes a newer one accordingly. However, the AWS documentation (https://boto3.amazonaws.com/v1/documentation/api/1.9.42/reference/services/lambda.html#Lambda.Client.list_versions_by_function) gave no indication that a single API call returns at most 50 versions (we had over 60 at that time). This led to a problem, because we thought we were publishing a new function version (via def publish_new_version(self, uploadId), presented below) and re-pointing the “prod” alias to it. In reality, as you can see below, versions greater than 270 were available in the AWS Console, but get_latest_published was returning only 50 items with a maximum value of 185, and that was the version the alias was pointed to. It also caused additional problems: each time we updated the layer version, the change wasn’t seen by the Lambda function.

To sum it up, if you’ve exceeded 50 function versions, use aliases, and update your Lambda layers/functions via CloudFormation/Boto3, use the “NextMarker”.

versionsPublished = ['$LATEST', '9', '13', '17', '21', '25', '29', '33', '37', '41','45', '49', '53', '57', '61', '65', '69', '73', '77', '81', '85','89', '93', '97', '101', '102', '105', '109', '113', '117', '121','125', '129', '130', '131', '132', '133', '137', '141', '145', '149','153', '157', '161', '165', '169', '173', '177', '181', '185']

Getting the latest version of a Lambda function with the “NextMarker” key lets you collect more than 50 versions (in particular the most recently published one) via the Lambda API:
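
A minimal sketch of that pagination, assuming a Boto3 Lambda client is passed in: list_versions_by_function takes the request parameter Marker and returns NextMarker while more pages remain. The helper names here are hypothetical and are not the get_latest_published function from our code base.

```python
def list_all_versions(lambda_client, function_name):
    """Collect every version of a function, following NextMarker across pages."""
    versions = []
    kwargs = {"FunctionName": function_name}
    while True:
        page = lambda_client.list_versions_by_function(**kwargs)
        versions.extend(item["Version"] for item in page.get("Versions", []))
        marker = page.get("NextMarker")
        if not marker:
            # No NextMarker means this was the last page.
            return versions
        kwargs["Marker"] = marker


def latest_published_version(lambda_client, function_name):
    """Return the highest numeric version as a string, or None if none exists."""
    numeric = [
        int(version)
        for version in list_all_versions(lambda_client, function_name)
        if version != "$LATEST"
    ]
    return str(max(numeric)) if numeric else None


# Usage, assuming AWS credentials and a hypothetical function name:
#   import boto3
#   print(latest_published_version(boto3.client("lambda"), "example-function"))
```

Versions are compared as integers on purpose: compared as strings, '99' would sort above '185'.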

Publish new version of Lambda function:
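
A hedged sketch of the publishing step, again assuming a Boto3 Lambda client is passed in. It is not our publish_new_version method; the helper name and the default alias are assumptions for this example. It re-points the alias to the version number returned by publish_version itself, so it does not depend on listing versions at all.

```python
def publish_and_repoint(lambda_client, function_name, alias_name="prod"):
    """Publish a new function version and point the given alias at it."""
    published = lambda_client.publish_version(FunctionName=function_name)
    new_version = published["Version"]
    # Use the version reported by publish_version, not a value derived from
    # a (possibly truncated) version listing.
    lambda_client.update_alias(
        FunctionName=function_name,
        Name=alias_name,
        FunctionVersion=new_version,
    )
    return new_version


# Usage, assuming AWS credentials and a hypothetical function name:
#   import boto3
#   publish_and_repoint(boto3.client("lambda"), "example-function")
```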

Definitely use serverless plugins. You can add them with sls plugin install -n serverless-PLUGIN_NAME or, for some of them (like Step Functions), with npm install serverless-step-functions

They save you a lot of development time. Just remember to follow the issues in their repositories. Below are some of the plugins we’ve been using:

serverless-python-requirements - a Serverless v1.x plugin to automatically bundle dependencies from requirements.txt and make them available in your PYTHONPATH.

serverless-plugin-aws-alerts - adds CloudWatch alarms to functions. Below, a simple example, showing how to use a default alarm “functionErrors”:

There are more default alarms:

With following default configurations:

If you want, you can create your own alarm or override default alarm’s parameters. We used that in another microservice. Here’s the example:

serverless-pseudo-parameters - you can use #{AWS::AccountId}, #{AWS::Region}, etc., in any of your config strings. The plugin replaces those values with the proper pseudo parameters, wrapped in the CloudFormation Fn::Sub function. Example from our code:
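
To make the substitution idea concrete, here is a small Python sketch of the transformation. It only illustrates the concept; it is not the plugin’s source code or our configuration, and the function name and sample ARN are made up for this example.

```python
import re

# Matches placeholders such as #{AWS::Region}.
_PSEUDO_PARAMETER = re.compile(r"#\{([^}]+)\}")


def to_fn_sub(value):
    """Rewrite '#{...}' placeholders into a CloudFormation Fn::Sub expression."""
    if not _PSEUDO_PARAMETER.search(value):
        # Nothing to substitute, keep the plain string.
        return value
    return {"Fn::Sub": _PSEUDO_PARAMETER.sub(r"${\1}", value)}


if __name__ == "__main__":
    # Hypothetical ARN built from pseudo parameters.
    print(to_fn_sub("arn:aws:sns:#{AWS::Region}:#{AWS::AccountId}:alerts"))
    # {'Fn::Sub': 'arn:aws:sns:${AWS::Region}:${AWS::AccountId}:alerts'}
```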

serverless-plugin-lambda-dead-letter - can assign a DeadLetterConfig to a Lambda function and optionally create a new SQS queue or SNS Topic with a simple syntax. Keeping in mind the principle that everything fails, we implement “backup forces” for our Lambdas in order to handle unprocessed events. Below, one of our functions with DLQ configured:
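
The plugin only wires the queue to the function; something still has to deal with the events that land there. As a complementary sketch (not part of our project), the snippet below drains an SQS dead-letter queue with a Boto3 SQS client and hands each failed event to a callback. It assumes the message body is the JSON payload of the original asynchronous invocation; the function and parameter names are hypothetical.

```python
import json


def drain_dlq(sqs_client, queue_url, reprocess, max_batches=10):
    """Read failed events from a dead-letter queue and reprocess them.

    A message is deleted only after reprocess() returns without raising,
    so an event that fails again stays in the queue.
    """
    handled = 0
    for _ in range(max_batches):
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=1,
        )
        messages = response.get("Messages", [])
        if not messages:
            break
        for message in messages:
            reprocess(json.loads(message["Body"]))
            sqs_client.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=message["ReceiptHandle"],
            )
            handled += 1
    return handled


# Usage, assuming AWS credentials and a hypothetical queue URL:
#   import boto3
#   drain_dlq(boto3.client("sqs"), "https://sqs.eu-west-1.amazonaws.com/123456789012/example-dlq", print)
```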

serverless-step-functions – basically, it simplifies the Step Functions state machines definition in a serverless project.

The definition: part points to a YAML file with the configuration of the particular states. I prefer to configure it this way instead of putting the whole definition in the “serverless.yml” file.

├── files
├── functions
├── messages
├── node_modules
├── package-lock.json
├── package.json
├── requirements.txt
├── resources
├── new_algorithms_package_delete_stepfunctions.yml
├── serverless.yml
└── tests

After the deployment, we got the diagram shown below. As you can see, a loop is used to wait for the result returned from the CloudFormation stack. Basically, it allows you to get rid of synchronous calls and react depending on the status returned by another service.
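
The polling step of such a loop can be sketched as a small Lambda-style function that a Choice state inspects: while the outcome is IN_PROGRESS, the state machine waits and calls it again. This is an assumption-laden illustration, not our actual state implementation; the event key stack_name, the outcome labels, and the function names are all hypothetical. The CloudFormation client is passed in so the logic can be exercised without AWS.

```python
def classify_stack_status(status):
    """Map a CloudFormation StackStatus to a coarse outcome for a Choice state."""
    if status.endswith("_IN_PROGRESS"):
        return "IN_PROGRESS"
    if "FAILED" in status or "ROLLBACK" in status:
        return "FAILED"
    if status.endswith("_COMPLETE"):
        return "SUCCEEDED"
    # Unknown statuses are treated as failures so the loop cannot spin forever.
    return "FAILED"


def check_stack(event, cloudformation_client):
    """Look up the stack status and attach it to the state machine input."""
    response = cloudformation_client.describe_stacks(StackName=event["stack_name"])
    status = response["Stacks"][0]["StackStatus"]
    return {
        **event,
        "stack_status": status,
        "outcome": classify_stack_status(status),
    }


# Usage, assuming AWS credentials and a hypothetical stack name:
#   import boto3
#   check_stack({"stack_name": "example-stack"}, boto3.client("cloudformation"))
```

Note that describe_stacks raises an error for a stack name that no longer exists, so a deletion workflow would need to handle that case separately.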

Terraform/Terragrunt

I am pretty sure each of you knows these tools, so I’ll add only a short note. There is some debate about whether Terragrunt is necessary at all. According to the description from their repo, “Terragrunt is a thin wrapper for Terraform that provides extra tools for working with multiple Terraform modules”. Therefore, do not expect Terragrunt to solve all the issues you have with Terraform. Personally, I like it for the way it organizes the repo:

├── aws
│   └── eu-west-1
│       ├── deployments-terragrunt
│       │   └── terraform.tfvars
│       ├── docker-images-terragrunt
│       │   └── terraform.tfvars
│       ├── frontend-build
│       │   └── terraform.tfvars
│       ├── terraform.tfvars
│       └── website
│           └── terraform.tfvars
├── deployment
│   └── buildspec.yml
└── modules
    ├── aws-tf-codebuild
    ├── aws-tf-codebuild-multisource
    ├── aws-tf-codepipeline-base
    ├── aws-tf-codepipeline-github
    ├── aws-tf-codepipeline-multisource
    ├── aws-tf-ecr
    ├── aws-tf-lambda
    ├── aws-tf-s3-host-website
    ├── aws-tf-tags
    ├── image-builder
    ├── multisource-pipeline
    └── terragrunt-deployments

Then, with a simple “terraform.tfvars”, you can set the variables used in particular Terraform modules located in “source = "../../..//modules/"”, together with the specific module/product you want to use. Our example depicts a CI/CD pipeline for Terraform infrastructure changes built on top of CodePipeline + CodeBuild + Lambda Notification (terragrunt-deployments).

An attentive reader might have noticed the buildspec_path = "./terraform/deployment/buildspec.yml" statement. In this particular case, we used it as a source buildspec file for CodeBuild that is invoking Terragrunt commands and making changes in the environment.

Is it packed already? Next stop “serverless architecture”

So far, I’ve covered tools we use to make things easier and those that save us time. Nonetheless, I would deceive you and blur the reality if I was to say that they work out-of-the-box. For me and my team, it’s all about the estimation: how much time we need to start using a new tool effectively, and how much time we save by using it. Business doesn’t care about tools, it cares about time. My advice is: don’t bind yourself to tools but rather to the question “what/how much will I achieve if I use it?”. Roll up your sleeves, more chapters are on the way...

