SNS, SQS and Loose Coupling

Overview

This week I wanted to group like concepts together, around loose coupling and real-time processing. Exam questions that imply high throughput will often lean on using queues, for reasons we’ll explain in this article.

Exam questions that mention sheer speed and real-time processing will often lean on Kinesis. We’ll cover where to look for Kinesis info on both Streams (processing and de-coupling of messages) and Firehose (capturing streaming data and delivering it to a destination such as S3).

Study guide

As before, I’ve linked off to all relevant sections in this guide.

Section in this Guide | Link to AWS Docs | Links to Questions
SQS | Overview | N/A
SQS Concepts | Basic Architecture | Basic Concept Questions
SQS Costs | Reducing costs | Inline
SQS Standard Queue Tutorial | Standard queue with Java | N/A – you can expand with your learning tests.
SQS FIFO Queue Tutorial | FIFO queue with Java | N/A – you can expand with your learning tests.
SNS | Overview | N/A
SNS Concepts | How SNS Works | Basic Concept Questions
SNS Topic Creation Tutorial | SNS Using the SDK | N/A – you can expand with your learning tests.
SNS Endpoint Subscription Tutorial | Subscribing an endpoint to a topic | N/A – you can expand with your learning tests.
SNS Topic Publishing | Publishing a message to a topic | N/A – you can expand with your learning tests.
Kinesis Overview | Kinesis WhitePaper | Kinesis Questions (read before starting whitepaper)
Kinesis Components | Kinesis Architecture | N/A – your own flashcards should originate from here.
Kinesis Firehose | What is Kinesis Firehose | Questions before reading
Kinesis Quotas | Kinesis Data Streams Quotas | N/A – make your own flashcard for exam ‘trivia’

SQS

The big picture of SQS

Queues and topics provide us with a very important meta-concept: don’t assume that the service you want to communicate with will always be available. If we design with that in mind, we get to the point where we understand something like eventual consistency. If you’re thinking about high throughput systems, think of the following:

Is it important that each transaction sits around, blocking, waiting for the final state to come back from whichever data store we use? Or is it enough to get some commitment back that our transaction has been received, and will eventually get to the desired end state?

What’s ideal in this context is that we can get ‘in and out’ of our transaction quickly.

Most transactions in reality can be done without waiting around. And to be clear, we’re not talking even minutes or seconds in a lot of cases. The important point is that a queue gives the receiver the freedom to manage the workload at its own pace, and gives the sender confidence that the job will be done eventually.
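To make the ‘in and out’ idea concrete, here is a minimal sketch using Python's in-memory `queue.Queue` as a stand-in for SQS. It is not the SQS API; the `run_demo` function and its parameters are names I've made up for illustration. The sender hands each job to the queue and gets control back straight away, while a single worker drains the queue at its own (slower) pace.

```python
import queue
import threading
import time


def run_demo(num_jobs: int = 5, work_seconds: float = 0.01):
    """Push jobs through an in-memory queue; return (accepted, processed)."""
    jobs = queue.Queue()
    processed = []

    def worker():
        # The receiver pulls work whenever it is ready for more.
        while True:
            job = jobs.get()
            if job is None:  # sentinel value: no more work is coming
                break
            time.sleep(work_seconds)  # simulate slow processing
            processed.append(job)

    consumer = threading.Thread(target=worker)
    consumer.start()

    accepted = []
    for i in range(num_jobs):
        jobs.put(i)  # returns immediately; the job is "received", not "done"
        accepted.append(i)

    jobs.put(None)
    consumer.join()  # only this demo waits; a real sender would carry on
    return accepted, processed


accepted, processed = run_demo()
```

The sender's loop finishes long before the worker does, yet every accepted job ends up processed. That is the trade being made: a quick acknowledgement now, with the end state reached eventually.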

SQS Concepts

In terms of understanding how it works, I’d recommend you get an understanding of the basic architecture of SQS from the AWS docs.

Questions to prompt your study can be found here.

SQS and Cost

The exam questions I came across were around cost, so we need to understand how to optimise SQS for cost too. A few questions inline this time:

Batching message operations can be used to reduce costs. What are the options available to me and how do they work?

How can I use the SDK to help batch messages? What restriction do I have?

When would I use long polling as a cost-reduction strategy?

What is the best practice for application development that dovetails with SQS?

How can I use SQS for pricing tiers?
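As a rough way to reason about the batching and long polling questions above, here is some back-of-the-envelope arithmetic. It assumes SQS's limit of 10 messages per batch request and a maximum long-poll wait of 20 seconds, and it ignores payload size (large payloads are billed as more than one request). The function names, and the idea of a consumer polling an idle queue in a loop, are my own simplifications rather than anything from the AWS docs.

```python
import math

MAX_BATCH_SIZE = 10     # SQS batch actions take at most 10 messages per request
MAX_LONG_POLL_WAIT = 20  # seconds; the largest wait time a receive call can use


def requests_needed(message_count: int, batch_size: int = 1) -> int:
    """API requests needed to send message_count messages in batches."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError("batch_size must be between 1 and 10")
    return math.ceil(message_count / batch_size)


def idle_receives_per_hour(seconds_per_call: int) -> int:
    """Receive calls one consumer makes per hour against an empty queue.

    seconds_per_call is either the loop interval for short polling
    (an assumption about the client) or the long-poll wait time.
    """
    if seconds_per_call < 1:
        raise ValueError("seconds_per_call must be at least 1")
    return 3600 // seconds_per_call


unbatched = requests_needed(1000)                  # one request per message
batched = requests_needed(1000, MAX_BATCH_SIZE)    # ten messages per request
short_polling = idle_receives_per_hour(1)          # assumed 1-second loop
long_polling = idle_receives_per_hour(MAX_LONG_POLL_WAIT)
```

Since SQS charges per request, fewer requests for the same number of messages, and fewer empty receives on a quiet queue, both translate into a smaller bill.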

SQS Tutorials

You can look at using standard queues with Java and FIFO queues with Java if you want to expand your learning at a deeper level.

SNS

If you’ve used something like a queue and topic in JMS before, you’ll get the concept of SNS. A queue is just one kind of subscriber to a topic. A nice way to think of the difference is that the topic has all the smarts to work out where to send each message, and a queue is one of the places it can send to.
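The topic-and-subscribers idea can be sketched in a few lines. This is an in-memory toy, not the SNS SDK: the `Topic` class and the subscriber names are hypothetical, and a `deque` stands in for an SQS queue subscribed to the topic.

```python
from collections import deque


class Topic:
    """A toy topic that pushes every published message to all subscribers."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, deliver):
        # deliver is any callable that accepts a message
        self._subscribers.append(deliver)

    def publish(self, message):
        # The publisher does not know (or care) who is listening.
        for deliver in self._subscribers:
            deliver(message)
        return len(self._subscribers)


orders_queue = deque()  # stand-in for a queue subscription
emails_sent = []        # stand-in for an email subscription

topic = Topic()
topic.subscribe(orders_queue.append)
topic.subscribe(lambda m: emails_sent.append(f"email: {m}"))

delivered = topic.publish("order-42 created")
```

The publisher makes one call and the topic fans the message out. Adding a third subscriber later needs no change to the publishing code, which is the loose coupling this article keeps coming back to.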

I’ve included a questions sheet to get you started.

SNS Concepts

In terms of understanding how it works, I’d recommend you get an understanding of how SNS works from the AWS docs.

SNS Tutorials

In terms of the exam, I’d say getting your hands dirty creating, publishing to and subscribing to topics is enough. If you can get an understanding of filters and attributes, and how they might affect receipt of messages, that’s a bonus.
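To show how filters and attributes can affect receipt, here is a simplified sketch of exact-match filtering. It assumes a filter policy shaped like SNS's JSON policies (each attribute name maps to a list of accepted values) and deliberately leaves out the richer operators such as prefix or numeric matching. The `matches` function and the sample attributes are made up for illustration.

```python
def matches(filter_policy: dict, attributes: dict) -> bool:
    """True if the message attributes satisfy every key in the policy.

    Simplification: exact string matching only.
    """
    return all(
        key in attributes and attributes[key] in allowed
        for key, allowed in filter_policy.items()
    )


# Hypothetical subscription that only wants order events for two regions.
policy = {"event_type": ["order_placed"], "region": ["eu", "us"]}

wanted = matches(policy, {"event_type": "order_placed", "region": "eu"})
wrong_region = matches(policy, {"event_type": "order_placed", "region": "apac"})
missing_attr = matches(policy, {"event_type": "order_placed"})
no_policy = matches({}, {"event_type": "anything"})
```

A message has to satisfy every attribute named in the policy, so a missing attribute means no delivery to that subscriber. A subscription with no policy receives everything.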

Kinesis Overview

I didn’t have to do any tutorials on Kinesis to pass the exam, BUT practice papers tripped me up with enough variations of Kinesis questions for me to try and learn the architecture properly.

Remember what we said earlier about sheer speed and real-time processing. If you get questions relating to real-time processing, large throughput and huge amounts of records per second, Kinesis might be one of the options.

There is a whitepaper here, but I’d recommend reading these questions first to understand where to focus your effort.

Kinesis Components

The architecture guide is here – I’d recommend building your flashcards starting from this. If you understand the architecture, then you have a chance of understanding variations on questions.

If you can understand something fundamentally, you can remember it under (exam) pressure. If you learn by rote, you can be tripped up by a slightly different question in the exam.

Kinesis Streams

The streams guide starts here – I’d recommend you read this page plus all the following:

Kinesis Firehose

Firehose is an important part of the Kinesis family as it’s the means by which streaming data is captured and loaded into a destination such as S3. Streams are the ‘unit of currency’ if you will, and Firehose takes care of the delivery.

Notice from the docs and the high-level architecture diagram that the data source itself is not specified. What does this imply? Perhaps that the origin of the data is ultimately unimportant. Loose coupling again, perhaps?


[Figure: Amazon Kinesis Data Firehose data flow for Amazon S3]

Overview from AWS here, and questions sheet here.

Kinesis Quotas

The section on sizes and limits is worth remembering by rote. I don’t normally recommend that, but it came up in enough practice papers, and also in my exam, that it’s worth being aware of how it works. Questions on throttling/performance might well rely on clues from the rest of the question about what your setup is.
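As an example of the kind of arithmetic these questions rely on, here is a small sketch that estimates a shard count from throughput figures. It assumes the per-shard quotas for a provisioned Kinesis data stream with shared-throughput consumers: 1 MB/s or 1,000 records/s for writes, and 2 MB/s for reads. Check the quotas page for the current numbers before relying on them. The `shards_needed` function and the sample workloads are my own.

```python
import math

WRITE_MB_PER_SHARD = 1.0        # MB per second written, per shard
WRITE_RECORDS_PER_SHARD = 1000  # records per second written, per shard
READ_MB_PER_SHARD = 2.0         # MB per second read, per shard (shared)


def shards_needed(write_mb_s: float, records_s: float, read_mb_s: float) -> int:
    """Smallest shard count that satisfies all three per-shard limits."""
    return max(
        1,
        math.ceil(write_mb_s / WRITE_MB_PER_SHARD),
        math.ceil(records_s / WRITE_RECORDS_PER_SHARD),
        math.ceil(read_mb_s / READ_MB_PER_SHARD),
    )


bandwidth_bound = shards_needed(write_mb_s=5, records_s=2000, read_mb_s=6)
record_bound = shards_needed(write_mb_s=0.5, records_s=4500, read_mb_s=1)
read_bound = shards_needed(write_mb_s=1, records_s=500, read_mb_s=8)
```

The point to take into the exam is that whichever limit is hit first decides the shard count. Lots of tiny records can throttle a stream just as easily as a few large ones.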

Conclusion

If you’re just after exam pointers, hopefully the question sheets I’ve provided are of use to you. I have a study plan offer where you can get all the study plan flashcard sets for free when you buy your exam papers through my Whizlabs link.

For the exam itself, consider the following when you see a question (but don’t follow blindly of course):

  • If it mentions real-time processing – could Kinesis be a candidate?
  • If it mentions loading streaming data from different sources into a store such as S3 – could Firehose be used?
  • High throughput – queues let us ‘hand off’ processing until later, which helps with throughput in some circumstances
  • If we need notification – SNS is an obvious candidate, but read the question carefully!

I hope you found this week useful. Please feel free to give feedback as ever. Next week we’ll be looking at ElastiCache.