More about implemented solution, with all the steps of configuration, and a DEMO
Hello and welcome back to the third and final part of the article about the serverless pipeline and how it is used to automate configuration management of EC2 instances. In the previous chapter we went through the client requirements and all the problems we encountered while working on the project. We also looked at which solutions can be used at every stage of the process, and in which cases. Today will be more about the implemented solution, with all the steps of configuration, and a DEMO. To sum everything up, we will finish with conclusions reached after a few months of use and answer some questions:
- What was good?
- What could be better?
- What are our plans for the future?
We wanted to have a pipeline enhanced with notifications sent directly to Slack, so we created a CodePipeline with 3 stages - Source, Deploy and Notify. Our pipeline manages the configuration of EC2 instances. Let’s have a look at the workflow.
The whole mechanism is started by a user or, more specifically, by a change (event) in the git repository, e.g. a PUSH into a specific branch.
The diagram shown below describes the main AWS services required to build a pipeline: CodeBuild, CodePipeline, ECR, S3, Systems Manager, CloudWatch and Lambda.
Ansible playbooks are executed in a CodeBuild project. This is the heart of the pipeline and its second stage.
Code: Definition of CodeBuild project in Terraform.
Access to other AWS resources is granted through an IAM service role. It allows us, for example, to read from the repository in ECR, create Network Interfaces in the subnet, send logs to a CloudWatch Log Group and S3, get the artifact from S3 and parameters from SSM, and store the cache in a separate S3 bucket.
All those actions are defined in the project as well. Target artifacts and the cache for a project are stored in S3, and the environment uses a predefined image from ECR with Ansible configured. The source is an artifact generated in the first stage. The specification of the project (buildspec) is stored in a separate file, so we are pointing to that directory. And the most important part: the definition specifying that our job will run inside our VPC and a private subnet. The security group allows only outbound access:
- HTTPS to SSM, CloudWatch Logs and S3 VPC Endpoints.
- SSH to subnets with target EC2 instances.
The CodeBuild module is defined to create, all at once, the resources required to run a project - IAM role, Security Group, S3 bucket for cache and, of course, the CodeBuild project. The module source can be located in the same repository or accessed remotely. In this case, we are pointing to a remote source in a separate GitHub repository, released and tagged with a proper version according to Semantic Versioning 2.0.0 (MAJOR.MINOR.PATCH).
Code: Calling the CodeBuild module in Terraform.
The basic infrastructure with the VPC Endpoints configuration is defined in a separate stack (the VPC and the S3 Gateway Endpoint are not managed by Terraform). The state files for our stacks are stored in separate remote states on S3. Because of this structure, we can reference the variables required to run that project in several ways:
The CodeBuild project still needs information about the Ansible playbooks and where we would like to execute them. We can define this using shell commands in the buildspec file. Here, we specify the location of the Ansible configuration files, additional plugins that depend on the service, required parameters from AWS SSM Parameter Store (SSH keys) and the rest of the variables.
Code: Buildspec file of CodeBuild project for Ansible playbooks execution.
Now, it’s important to remember that CodeBuild doesn’t offer input widgets such as a drop-down list for variables, only a simple text field, at least for now. That may change in the future, but at the time we had to work with a shell script that exports environment variables for us.
Code: Shell script for managing environment variables.
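The idea of replacing a drop-down list with a script that validates a choice and exports the matching variables can be sketched as follows. This is only a minimal illustration written in Python rather than the shell script used in the project; the preset names (`dev`, `prod`) and the variable names (`ANSIBLE_INVENTORY`, `TARGET_REGION`) are hypothetical and do not come from our configuration.

```python
import shlex

# Hypothetical presets: the real variable names and values are not given in the article.
PRESETS = {
    "dev": {"ANSIBLE_INVENTORY": "inventories/dev", "TARGET_REGION": "eu-west-1"},
    "prod": {"ANSIBLE_INVENTORY": "inventories/prod", "TARGET_REGION": "eu-central-1"},
}


def render_exports(choice, presets=PRESETS):
    """Return shell `export` lines for the chosen preset.

    Raises ValueError for an unknown choice, which is the check a
    drop-down list would otherwise have done for us.
    """
    if choice not in presets:
        allowed = ", ".join(sorted(presets))
        raise ValueError(f"unknown choice {choice!r}; allowed: {allowed}")
    # shlex.quote keeps values safe when the lines are later sourced by a shell.
    return [
        f"export {name}={shlex.quote(value)}"
        for name, value in sorted(presets[choice].items())
    ]
```

The returned lines could be written to a file and sourced in the buildspec before the playbooks run, so the free-text field only has to carry the name of a preset.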
The order of pipeline stages is defined by CodePipeline. It allows multiple sources, actions and stages, run in parallel or sequentially, depending on your needs.
Our pipeline was created in 3 steps. For each of them CodePipeline requires proper permissions defined in an IAM role, like:
It is not possible to parametrize everything inside the resource; the number and types of pipeline stages have to be defined statically, like:
Stage 1: Defining the Source, which, in our case, is a private GitHub repository. The target branch and OAuth token are stored in AWS SSM Parameter Store.
Stage 2: Referencing the already created CodeBuild project for Ansible playbooks execution.
Stage 3: Referencing the Lambda function that is already in place with all required variables.
Code: Definition of CodePipeline project and stages in Terraform
Our AWS Lambda function uses the Python 3.7 runtime which, at that moment, provides boto3 1.9.221, botocore 1.12.221 and an underlying Amazon Linux v1 environment. Besides the libraries imported in the pipeline_slack_notification.py script, it also requires the requests package to post the message to the Slack channel. The message contains the pipeline URL, AWS account ID, region, date, pipeline name and the commit ID of the change, as described in the message.json template.
Code: Python script with template for message.
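To illustrate how a JSON template can be filled with the values listed above, here is a minimal sketch. The template layout, the placeholder names and the `render_message` helper are assumptions made for this example and are not the contents of message.json; the console URL format is also an assumption.

```python
import json
from datetime import datetime, timezone
from string import Template

# Hypothetical stand-in for message.json; the real template may look different.
MESSAGE_JSON = """
{
  "text": "Pipeline $pipeline_name finished: $pipeline_url",
  "attachments": [
    {"fields": [
      {"title": "Account", "value": "$account_id", "short": true},
      {"title": "Region", "value": "$region", "short": true},
      {"title": "Date", "value": "$date", "short": true},
      {"title": "Commit", "value": "$commit_id", "short": true}
    ]}
  ]
}
"""


def fill(node, values):
    """Recursively substitute $placeholders in every string of a parsed JSON tree."""
    if isinstance(node, str):
        return Template(node).substitute(values)
    if isinstance(node, list):
        return [fill(item, values) for item in node]
    if isinstance(node, dict):
        return {key: fill(value, values) for key, value in node.items()}
    return node  # numbers, booleans and null are left untouched


def render_message(pipeline_name, account_id, region, commit_id, now=None):
    """Build the Slack payload as a Python dict."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    values = {
        "pipeline_name": pipeline_name,
        "account_id": account_id,
        "region": region,
        "commit_id": commit_id,
        "date": now.strftime("%Y-%m-%d %H:%M:%S UTC"),
        # Assumed console URL format for a pipeline view.
        "pipeline_url": (
            f"https://{region}.console.aws.amazon.com/codesuite/codepipeline/"
            f"pipelines/{pipeline_name}/view?region={region}"
        ),
    }
    return fill(json.loads(MESSAGE_JSON), values)
```

Substituting after parsing the JSON, rather than in the raw text, means a value containing quotes or backslashes cannot break the structure of the message.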
To get the commit ID we use the commit.py function. Here the AWS SDK for Python (Boto3) looks up the current CodePipeline and the ID of its last execution (commit).
Code: Python script that gets information about last commit ID used in CodePipeline.
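A lookup of this kind might be sketched as below. It is not the commit.py script itself: the function names are hypothetical, and the sketch assumes that `list_pipeline_executions` returns the newest execution first and that the first source revision is the one we care about.

```python
def default_client():
    """Create a CodePipeline client; boto3 is imported lazily so the
    lookup below can be exercised with a stub instead of real AWS access."""
    import boto3

    return boto3.client("codepipeline")


def latest_commit_id(client, pipeline_name):
    """Return the revision (commit) ID of the most recent execution, or None.

    Assumes the API lists executions newest first, so asking for a single
    result yields the execution that is currently running or last finished.
    """
    response = client.list_pipeline_executions(
        pipelineName=pipeline_name, maxResults=1
    )
    summaries = response.get("pipelineExecutionSummaries", [])
    if not summaries:
        return None
    revisions = summaries[0].get("sourceRevisions", [])
    return revisions[0]["revisionId"] if revisions else None
```

Inside the Lambda function this would be called as `latest_commit_id(default_client(), pipeline_name)`, with the pipeline name taken from the function's configuration.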
The main function (pipeline_slack_notification.py) takes care of sending messages to Slack. This requires the following environment variables:
What is the pipeline_slack_notification.py function responsible for? In a few words: getting all required information from the pipeline and the change, generating a message based on the message template, and sending that message immediately to the defined Slack channel.
Code: Python script that generates the message and sends it to the Slack channel.
An example of the Slack message output generated by the Lambda function:
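The sending step on its own can be sketched as follows, assuming the message is delivered through a Slack incoming webhook. The `send_to_slack` helper and the `SLACK_WEBHOOK_URL` variable name are hypothetical; the actual environment variables used by our function may be named differently.

```python
import json
import os


def send_to_slack(message, webhook_url=None, post=None):
    """POST `message` (a dict) as JSON to a Slack webhook and return the status code.

    `post` defaults to requests.post; it is a parameter so the function
    can be exercised without network access.
    """
    if post is None:
        import requests  # third-party package, has to be bundled with the Lambda

        post = requests.post
    # SLACK_WEBHOOK_URL is a hypothetical variable name used for this sketch.
    url = webhook_url or os.environ["SLACK_WEBHOOK_URL"]
    response = post(
        url,
        data=json.dumps(message),
        headers={"Content-Type": "application/json"},
        timeout=10,
    )
    response.raise_for_status()  # surface HTTP errors in the Lambda logs
    return response.status_code
```

Raising on a failed request makes a broken webhook visible as a failed invocation instead of a silently missing notification.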
Thanks to native AWS services we have reduced maintenance time to a minimum: it comes down to rebuilding Docker images and updating the pipeline configuration to use the newer one.
Unlike local deployments, which lasted 4 to 5 minutes and were often interrupted by environmental errors or expiring tokens, the new ones were shortened to ~50 seconds.
AWS native services are fully managed: AWS assumes responsibility for providing the most up-to-date and secure solutions. You pay only for what you use, and it really does work like this. After 5 months we are paying ~$1.85 monthly for the usage of the main AWS services (CodeBuild, CodePipeline and ECR) in our pipelines, in total across 2 AWS accounts and 5 regions.
*existing for more than 30 days and with at least one code change that runs through it during the month
What was good?
What could be better?
Someone could say that the AWS services used in our solution are too primitive, that CodePipeline, CodeBuild or other AWS services we know lack many functionalities. Remember, however, that this does not mean it will never change. AWS is still actively developing its solutions. For example, CodePipeline announced this month that it will pass environment variables globally, across the pipeline. Until recently, we could only configure variables in CodeBuild. It is in our interest to report demand for new features and give feedback to AWS. In the end, we all want to use the most effective solutions :)
Although the presented solution for automating EC2 instance configuration is not finished, it has allowed us to significantly speed up the work at a very low cost.
In the future, we plan at least the full parameterization of our pipeline and cross-region / cross-account deployment to reduce the number of pipelines. Moreover, we want to test the infrastructure to increase reliability. For security reasons, we would also like to use a private version control system, like AWS CodeCommit, as a source. It allows finer-grained access rights to specific repositories, branches and Pull Requests, whereas GitHub grants permissions to all repositories within the organisation. These are not all of the changes, of course. Our needs will change and grow over time. The main thing is to achieve all goals as effectively as possible, and that is what we have done.