Overview
This week I’m going to talk about CI and CD. These are important parts of the AWS Developer Associate exam, which we know from the “Recommended Knowledge and Experience” on the exam page:
Ability to use a CI/CD pipeline to deploy applications on AWS
Ability to write code using AWS security best practices (e.g., not using secret and access keys in the code, instead using IAM roles)
This seems the right time to discuss the subject, as it builds on previous study goals we’ve set. Think about where we’ve got to with deployments:
- API Gateway lets us stage deployments and cutting over to them is a ‘first class citizen’.
- Deployment types give us an abstraction of how we get ‘there’, through concepts like ‘weighted routing’, ‘canary deployments’, ‘blue/green deployments’ and so on.
- Since we know about deployments themselves, we want to work out how to get there in automated fashion, with minimal human intervention – hence our CI/CD learning.
As ever, the table of sections, the resources and the help are here to help you dive in.
CodeBuild
Let’s start with CodeBuild’s definition:
We can summarise CodeBuild very simply:
- Takes source from somewhere (that being S3, CodeCommit, Bitbucket etc)
- Uses an environment and a build spec together
- Produces an artefact.
Let’s look at the architecture a bit more. Understanding the collaboration between each service and CodeBuild lets you quickly get the high level view. If nothing else that’s useful for the breadth of knowledge needed for the exam. If you can understand this diagram from the concepts page, you’re well on your way.
CodeBuild 101
Since you can get this info direct from the AWS documentation, I’d rather ‘add value’ by helping you nail the essentials first. We’ll focus on the relationship between source code, the build project, and the build environment first. As ever I’ve given you a questions list to help your thinking along.
What you should take from reviewing the concepts is:
- One way or another, you create a build project in AWS. This specifies things like how to get source, which build commands are to be run, and where the output is to go. In other words it’s the “what”, but not the “how”.
- CodeBuild will create an environment based off the build project. This is the “how”, since it’s a concrete ‘place’ where the work happens.
- We know the environment is the concrete concept and the build spec is the abstract one: your source is downloaded into, and built in, a specific instance of an environment, whereas the commands from the build spec are just a blueprint.
- Put another way, the buildspec is the method from a recipe, the source code is the ingredient list from the recipe and the environment is the actual kitchen.
Once you have understood that, you can consolidate those ideas with a tutorial or two.
Tutorials
I’d go along with the suggestions from the AWS page at this point – play around with the console and even try out the samples that seem of interest. Always keep the original diagram in mind.
Whichever tutorial/sample you pick up, can you find the interaction with any other components in the diagram? Where does S3 play a part? How about CloudFormation? Or CloudWatch?
I always find being hands on with a real example, and then relating it back to the big picture helps me cement my understanding. Remember to also try and apply the questions sheet to whatever you pick up.
What about the exam?
With the fundamentals in hand (and the AWS-cited required experience) we can then focus on the following points, which came up in various practice papers for me.
The Build Env idea
From exam practice questions and experience, I’d definitely be across how the build spec file works. Note that the YAML file binds commands to phases (install, pre_build, build and post_build). If you’ve used anything like Spring or even Java Servlets, remember how we have lifecycle events like bean registration and servlet destruction. It’s the same idea here: we just declare commands for the phases we’re interested in.
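To make the phase binding concrete, here is a minimal sketch that models a buildspec as a Python dict (the real file is YAML) and lists the commands in the order the phases run on a successful build. The phase names are CodeBuild’s own; the Maven commands, the artifact path and the helper function are made up for illustration.

```python
# Phase names and their order come from CodeBuild; everything else is hypothetical.
PHASE_ORDER = ["install", "pre_build", "build", "post_build"]

# A buildspec modelled as a dict. Note the phases are declared out of order
# on purpose: declaration order does not matter, the phase name does.
buildspec = {
    "version": 0.2,
    "phases": {
        "build": {"commands": ["mvn package"]},
        "pre_build": {"commands": ["echo Running tests", "mvn test"]},
        "post_build": {"commands": ["echo Build finished"]},
    },
    "artifacts": {"files": ["target/app.jar"]},
}


def commands_in_run_order(spec):
    """Return (phase, command) pairs in phase order for a successful build."""
    ordered = []
    for phase in PHASE_ORDER:
        # Phases that were not declared (here: install) are simply skipped.
        for command in spec["phases"].get(phase, {}).get("commands", []):
            ordered.append((phase, command))
    return ordered


for phase, command in commands_in_run_order(buildspec):
    print(f"{phase}: {command}")
```

The point to take away is the same as with servlet or bean lifecycle callbacks: you never call the phases yourself, you only say what should happen when each one comes around.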
Build Spec Overrides
You can also override a build – look at this option as an example:
aws codebuild start-build --cli-input-json file://start-build.json
Why is this important? The takeaway I want you to think critically about is this:
If a technology provides hooks to carry out certain operations (e.g. different entry points for starting the build, and a way of overriding values in the build), then let’s think about that. Maybe there are different IAM operations, and therefore maybe different roles. Try and take that as a general learning.
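As a rough sketch of what a start-build.json input might contain, the snippet below builds one in Python. The field names (projectName, buildspecOverride, environmentVariablesOverride) are from the CodeBuild StartBuild API; the project name, buildspec path and variable are hypothetical. Note that whoever runs this needs the codebuild:StartBuild permission, which is a different IAM action from the ones used to create or edit the project.

```python
import json


def make_start_build_input(project_name, buildspec_path, env_overrides):
    """Build the JSON document passed via --cli-input-json (sketch only)."""
    return {
        "projectName": project_name,
        # Replaces the buildspec configured on the project for this one build.
        "buildspecOverride": buildspec_path,
        # Replaces or adds environment variables for this one build.
        "environmentVariablesOverride": [
            {"name": name, "value": value, "type": "PLAINTEXT"}
            for name, value in sorted(env_overrides.items())
        ],
    }


# Hypothetical project, path and variable, purely for illustration.
start_build_input = make_start_build_input(
    "my-demo-project",
    "ci/buildspec-integration.yml",
    {"STAGE": "test"},
)

# Save this output as start-build.json to use it with the command above.
print(json.dumps(start_build_input, indent=2))
```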
Env Vars are useful
Be aware of how we can provide values to CodeBuild through environment variables – skimming is enough. I’m hoping it’s obvious that these provide a way of avoiding hard-coded values that vary across environments, i.e. one of the development best practices that might come up in an exam question.
But not secure
One thing you shouldn’t be using plain env vars for is values that need to be secure, since CodeBuild (or indeed anyone with the right permissions) can echo these values out.
So understand Systems Manager (SSM) Parameter Store, because you may well get an exam question on secure variables.
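A small sketch of the difference, using the shape of a CodeBuild environment variable entry (name, value, type). The type values PLAINTEXT and PARAMETER_STORE are CodeBuild’s; the variable names, the parameter path and the helper functions are hypothetical. The key idea is that for a PARAMETER_STORE variable the project only holds the parameter’s name, never the secret itself.

```python
def env_var(name, value, secret=False):
    """Build one environment variable entry for a CodeBuild project.

    For secrets, `value` is the Parameter Store parameter NAME; the real
    value is resolved at build time.
    """
    return {
        "name": name,
        "value": value,
        "type": "PARAMETER_STORE" if secret else "PLAINTEXT",
    }


# Hypothetical variables for illustration.
environment_variables = [
    env_var("STAGE", "test"),
    env_var("DB_PASSWORD", "/demo/db/password", secret=True),
]


def values_visible_in_project_config(variables):
    """Values a reader of the project configuration sees verbatim."""
    return [v["value"] for v in variables if v["type"] == "PLAINTEXT"]


print(values_visible_in_project_config(environment_variables))
```

For this to work, the role the build runs under also needs permission to read the parameter (ssm:GetParameters), which ties back to the earlier point about hooks implying IAM operations.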
You need to enable VPCs (when you need them)
CodeBuild does not connect to any VPC by default. So if your tests need something that exists in a VPC, you have to enable that. This is important to know for two reasons. One is that the exam may ask you to troubleshoot, so be aware of the possibility of VPC troubleshooting questions.
The other is more general advice on testing practices. When you write unit/integration tests, you shouldn’t be depending on an external service (ideally). If you have something like a mock server, you can still test that XML/JSON gets sent and can be unmarshalled/deserialized into the right objects. But since you’re in complete control of the infrastructure, if your test fails, you can rule out intermittent network errors. Your test cases become 100% reproducible.
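Here is a minimal sketch of that testing idea, using only the Python standard library. The fetch_order function, the /orders path and the payload are all hypothetical; the technique is simply to inject the network call so the test can replace it with a stub and still check that the JSON deserializes into the expected object.

```python
import json
from unittest import mock


def fetch_order(http_get, order_id):
    """Fetch an order and deserialize the JSON body into a dict.

    `http_get` is injected so a test can swap the real network call for a stub.
    """
    body = http_get(f"/orders/{order_id}")
    return json.loads(body)


def test_fetch_order_deserializes_response():
    # The stub stands in for the external service: no VPC, no network.
    stub_get = mock.Mock(return_value='{"id": 42, "status": "SHIPPED"}')

    order = fetch_order(stub_get, 42)

    stub_get.assert_called_once_with("/orders/42")
    assert order == {"id": 42, "status": "SHIPPED"}


test_fetch_order_deserializes_response()
print("ok")
```

A test like this runs identically on a laptop and inside CodeBuild, so a failure points at the code rather than at connectivity.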
The idea of not being able to reach out unless it’s enabled is one of the ideas underpinning ‘least privilege’. Let’s finish off the section on CodeBuild with a look into security principles that might feature in the exam.
Security
You should be aware of these at a high level when it comes to CodeBuild:
- That artifact encryption is possible using a KMS customer managed key (CMK). There’s a point on that here – at work, for example, all our source in S3 is encrypted at rest, so we need a KMS solution at build time.
- Least Privilege in IAM is important all the time, so try and understand how CodeBuild uses its service role from this CodeBuild IAM page to pre-empt exam questions on CodeBuild privileges.
- As well as service roles with temporary credentials, realise that often in enterprise setups, the build account may deploy into a different AWS account – so at least refresh on cross-account roles.
- We’ve covered SSM Parameter Store already, but try and understand the difference between it and Secrets Manager.
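To give the least-privilege point some shape, here is a sketch of the two policy documents behind the role a build runs under, written as Python dicts. The IAM actions and the codebuild.amazonaws.com principal are real; the account ID, region, project, bucket and key ID are placeholders I have made up, and a real project may need more (or fewer) permissions than this.

```python
import json

# Hypothetical identifiers, for illustration only.
ACCOUNT = "111122223333"
REGION = "eu-west-1"
PROJECT = "my-demo-project"
BUCKET = "my-demo-artifacts"
KEY_ARN = f"arn:aws:kms:{REGION}:{ACCOUNT}:key/EXAMPLE-KEY-ID"

# Trust policy: only the CodeBuild service may assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "codebuild.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permissions policy: each statement is scoped to a named resource.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "WriteBuildLogs",
            "Effect": "Allow",
            "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
            "Resource": f"arn:aws:logs:{REGION}:{ACCOUNT}:log-group:/aws/codebuild/{PROJECT}*",
        },
        {
            "Sid": "ReadSourceWriteArtifacts",
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:PutObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/*",
        },
        {
            "Sid": "UseArtifactKey",
            "Effect": "Allow",
            "Action": ["kms:Decrypt", "kms:GenerateDataKey"],
            "Resource": KEY_ARN,
        },
    ],
}

print(json.dumps(permissions_policy, indent=2))
```

When an exam question offers a policy with "Resource": "*" next to one scoped like this, the scoped one is usually what ‘least privilege’ is pointing you towards.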
So that’s it for CodeBuild in as much as it produces an artifact, so what happens with it afterwards? It needs to ‘get somewhere’. We’ve covered a bit of CodeDeploy previously. We’ll revisit it in the context of CI/CD now.
CodeDeploy
We covered the deployment techniques of CodeDeploy previously, but haven’t really touched on the concepts as much. Looking at the welcome page, we can see that CodeDeploy can be summarised as follows:
- Takes an artifact (not necessarily from CodeBuild).
- Puts it somewhere that it can actually be reached.
- Has various strategies for shifting traffic (maybe in your dev account, you can tolerate an outage in the name of speed, but not so in test/prod).
Concepts
I would start with the CodeDeploy video as a refresher:
Then move quickly to the components page. It should be quick to skim the deployment types since we’ve already covered those. What is important, though, is nearer the bottom of the page, where it says “For information about other components in the CodeDeploy workflow”.
We want to cover:
I’ve set some questions up on this so you know what to focus on since there’s a lot to wade through. But the concepts above came up in the exams in various guises, so it’s important to get the fundamentals.
Deployments
These diagrams are helpful to grasp the components as chunks. This one is from EC2. There’s a revision, feeding into a deployment configuration, pushing to a deployment group, which collaborates with an Auto Scaling group.
Instance Health
So in terms of the Auto Scaling groups, how do we know when an instance is good to be cut over to? Instance health is the next thing to grasp – ‘how to cut over given criteria xyz’ seems like something that might appear in a multiple choice exam, especially given the emphasis on deployments in the exam guide.
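One way to get a feel for instance health is the ‘minimum healthy hosts’ setting on a deployment configuration, which can be given as a host count or a fleet percentage. The sketch below is my own simplified model of it: the HOST_COUNT and FLEET_PERCENT type names are CodeDeploy’s, the function is hypothetical, and the rounding-up of fractional instances is my reading of the documentation rather than something verified here.

```python
import math


def max_offline_at_once(fleet_size, kind, value):
    """How many instances could be taken out for deployment at the same time.

    kind is "HOST_COUNT" (value = instances that must stay healthy) or
    "FLEET_PERCENT" (value = percentage of the fleet that must stay healthy).
    """
    if kind == "HOST_COUNT":
        min_healthy = value
    elif kind == "FLEET_PERCENT":
        # Assumption: fractional instances are rounded up.
        min_healthy = math.ceil(fleet_size * value / 100)
    else:
        raise ValueError(f"unknown minimum healthy hosts type: {kind}")
    return max(0, fleet_size - min_healthy)


# Nine instances, at least 50% must stay healthy.
print(max_offline_at_once(9, "FLEET_PERCENT", 50))
# Ten instances, at least eight must stay healthy.
print(max_offline_at_once(10, "HOST_COUNT", 8))
```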
App Spec Hooks
My practice papers gave me questions on the various CodeDeploy app lifecycle events, so I’d be familiar with at least the overall sequence of the hooks (i.e. application stop/before install/after install/application start). I didn’t learn every combination by rote, but I did flashcard up the ‘general’ lifecycle.
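If a flashcard in code form helps, the sketch below lays out the ‘general’ lifecycle for an in-place EC2/on-premises deployment without a load balancer. The event names are CodeDeploy’s; the script paths and the helper are hypothetical, and I have simplified each hook to a plain list of script paths (a real appspec.yml entry also carries things like timeout and runas).

```python
# Lifecycle events in run order (in-place, EC2/on-premises, no load balancer).
LIFECYCLE = [
    "ApplicationStop",
    "DownloadBundle",
    "BeforeInstall",
    "Install",
    "AfterInstall",
    "ApplicationStart",
    "ValidateService",
]

# These two are carried out by the CodeDeploy agent; you cannot attach scripts.
AGENT_ONLY = {"DownloadBundle", "Install"}


def scripts_in_run_order(hooks):
    """Return (event, script) pairs in lifecycle order, rejecting bad hooks."""
    unknown = set(hooks) - set(LIFECYCLE)
    if unknown:
        raise ValueError(f"not a lifecycle event: {sorted(unknown)}")
    reserved = set(hooks) & AGENT_ONLY
    if reserved:
        raise ValueError(f"reserved for the agent: {sorted(reserved)}")
    return [(event, script) for event in LIFECYCLE for script in hooks.get(event, [])]


# Hypothetical hooks section, declared out of order on purpose.
hooks = {
    "ApplicationStart": ["scripts/start_server.sh"],
    "BeforeInstall": ["scripts/install_dependencies.sh"],
    "ApplicationStop": ["scripts/stop_server.sh"],
}

for event, script in scripts_in_run_order(hooks):
    print(f"{event}: {script}")
```

Deployments behind a load balancer add traffic-blocking events before this sequence and traffic-allowing events after it, which is one of the ‘combinations’ worth recognising rather than memorising.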
Deployment Groups
We covered this in part in the previous deployment types lesson, but we need to understand deployment groups and deployment configurations – how does CodeDeploy orchestrate these? If you can understand that general point, you’ve got a better chance of understanding the permutations of questions in the exam.
Traffic Shifting
Again, something we covered a bit last time – what are the ways to get traffic from old instances to new instances? Don’t forget that serverless and server-based compute have different strategies.
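For the serverless side, the two shifting styles you will see in the predefined deployment configurations are canary (a small slice first, then everything) and linear (equal steps at a fixed interval). The sketch below is a simplified model of those two shapes, assuming the first increment happens at minute 0; the function names are my own.

```python
def canary_schedule(first_percent, wait_minutes):
    """Two steps: a small slice first, then all remaining traffic."""
    return [(0, first_percent), (wait_minutes, 100)]


def linear_schedule(step_percent, interval_minutes):
    """Equal steps at a fixed interval until all traffic has shifted."""
    schedule = []
    shifted = 0
    minute = 0
    while shifted < 100:
        shifted = min(100, shifted + step_percent)
        schedule.append((minute, shifted))
        minute += interval_minutes
    return schedule


# (minute, percent of traffic on the new version)
print(canary_schedule(10, 5))   # e.g. the "Canary10Percent5Minutes" shape
print(linear_schedule(10, 1))   # e.g. the "Linear10PercentEvery1Minute" shape
```

Reading a configuration name like these as ‘shape, percentage, interval’ is usually enough to eliminate wrong options in a multiple choice question.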
Tutorials
In terms of tutorials, I just used the simple Tutorial: Deploy WordPress to a non-Windows instance example to get the simple action of deployment understood. Then I followed up with one around auto-scaling because that ties into ideas of load balancing and health.
That’s quite a lot to cover from the point of view of CodeBuild and CodeDeploy – we’ll also want to understand how to join them together, and that’s where CodePipeline comes in.
CodePipeline
This is the magic that orchestrates the source being built and deployed – we use it at work via AWS CDK (not something you need for the Developer Associate exam). In fact, in our projects, CodeBuild projects cannot be invoked EXCEPT via CodePipeline.
So what do you need to know about CodePipeline? We’ll start with the video to get the quick overview.
As ever, look at the overview page and the questions sheet to get a grounding and find the meaning in the documentation. A few things of note:
Cross Account Access
CodePipeline frequently needs cross account access since it’s a fairly common pattern now to have a separate build account, and one for each deployment environment.
As per the questions sheet, work out how IAM allows this cross account access to truly understand the pattern involved. Review the concepts page of course and refer to the DevOps example to consolidate your understanding of ‘AWS recommended’ CI/CD pipelines.
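The IAM side of the pattern boils down to two halves that must agree, sketched here as Python dicts. The sts:AssumeRole action and the policy structure are standard IAM; the account IDs and role name are placeholders I have invented. A real cross-account pipeline typically also needs the artifact bucket and its KMS key shared with the target account, which this sketch leaves out.

```python
import json

# Hypothetical identifiers, for illustration only.
BUILD_ACCOUNT = "111111111111"
TARGET_ACCOUNT = "222222222222"
DEPLOY_ROLE = "demo-cross-account-deploy"

# Half 1, lives in the TARGET account: the trust policy on the deploy role
# says the build account is allowed to assume it.
target_role_trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{BUILD_ACCOUNT}:root"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Half 2, lives in the BUILD account: the pipeline's role is allowed to
# assume that specific role in the target account, and nothing broader.
build_account_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sts:AssumeRole",
            "Resource": f"arn:aws:iam::{TARGET_ACCOUNT}:role/{DEPLOY_ROLE}",
        }
    ],
}

print(json.dumps(target_role_trust_policy, indent=2))
print(json.dumps(build_account_policy, indent=2))
```

If either half is missing, the assume-role call fails, which is a handy way to reason about ‘access denied’ style troubleshooting questions.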
CodePipeline Tutorials
I would play this one a bit smart. Let’s first look at the available tutorials. That tells me what I need to know about what CodePipeline collaborates with. This knowledge alone might help me to include/exclude some options from an exam question.
Then let’s be ambitious and try the ECR-ECS tutorial – this one tutorial will cover a lot of what we need to know about where the responsibility of CodePipeline starts, and where that of CodeBuild/CodeDeploy finishes.
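Whichever tutorial you pick, the division of labour looks roughly like the sketch below: CodePipeline owns the stages and the hand-off of artifacts, while each action delegates the actual work to a provider. The stage/action/actionTypeId shape follows the CodePipeline pipeline structure, and the provider names are real, but the pipeline and artifact names are hypothetical and I have left out the role, artifact store and per-action configuration to keep it short. It is a generic source/build/deploy pipeline, not the ECR-ECS one from the tutorial.

```python
def action(name, category, provider, inputs=(), outputs=()):
    """Build one action entry; configuration and roles are omitted for brevity."""
    return {
        "name": name,
        "actionTypeId": {
            "category": category,
            "owner": "AWS",
            "provider": provider,
            "version": "1",
        },
        "inputArtifacts": [{"name": n} for n in inputs],
        "outputArtifacts": [{"name": n} for n in outputs],
    }


# Hypothetical three-stage pipeline.
pipeline = {
    "name": "demo-pipeline",
    "stages": [
        {"name": "Source", "actions": [
            action("FetchSource", "Source", "CodeCommit", outputs=["SourceOutput"])]},
        {"name": "Build", "actions": [
            action("BuildAndTest", "Build", "CodeBuild",
                   inputs=["SourceOutput"], outputs=["BuildOutput"])]},
        {"name": "Deploy", "actions": [
            action("DeployApp", "Deploy", "CodeDeploy", inputs=["BuildOutput"])]},
    ],
}


def hand_offs(p):
    """List (stage, provider, input artifacts, output artifacts) per action."""
    rows = []
    for stage in p["stages"]:
        for act in stage["actions"]:
            rows.append((
                stage["name"],
                act["actionTypeId"]["provider"],
                [a["name"] for a in act["inputArtifacts"]],
                [a["name"] for a in act["outputArtifacts"]],
            ))
    return rows


for row in hand_offs(pipeline):
    print(row)
```

Reading the output top to bottom, each stage’s output artifact is the next stage’s input, and that chain is the part CodePipeline is responsible for.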
Best Practice
What would exam prep be without ‘Best Practice’ guides? Wrap up your CodePipeline learning with a review of both generic best practice and ones that are more security centric.
Honourable Mention – CodeStar
You need to at least know what this is. Even though I didn’t include it in my own study plan, I did get a question or two on it in my practice papers. Consider where this might be a fit for questions along the lines of ‘How can I do CI/CD in the simplest possible way?’.
At least have a quick skim of this features page.
Conclusion
As ever, I’m trying to give you the tools to learn just enough to make reasoned decisions in the exam. Learn the fundamentals, learn the use cases in breadth and improve your critical thinking. That was how I managed to get across the amount of required material without spending forever on it.
For me personally, CodeBuild took me a little while to grasp because I dived straight into tutorials and didn’t understand the crossover into CodePipeline and CodeDeploy. That’s why I made the point on starting with the core principle of CodeBuild and then explained the roles of CodeDeploy and CodePipeline.
Hopefully separating this article into sections in this fashion will be helpful to some of you. I hope you found this week useful. Please feel free to give feedback as ever. Next time we’re going to extend our look at the available technology for securing services and resources, now that we’ve played with many of the technologies throughout this course.