VPC Flow Log Architecture

Observable uses VPC Flow Logs to passively monitor traffic in Amazon Web Services (AWS) virtual private cloud (VPC) environments. This post describes how that process works at a high level, focusing on the architecture: retrieving flow logs from a customer’s CloudWatch Logs account, processing them, and storing the results.

The most straightforward way to do that is to write an application that uses the CloudWatch Logs GetLogEvents API call. Our engineering team published the flowlogs-reader library as an open source project to make it easier to deal with flow logs for testing and analysis.
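As a minimal sketch of that approach (not Observable's implementation, and not the flowlogs-reader API), the following pages through one log stream with `get_log_events` and parses each message. It assumes the default version 2 flow log format of 14 space-separated fields and a boto3-style CloudWatch Logs client supplied by the caller; `FlowRecord`, `parse_flow_record`, and `read_log_stream` are illustrative names.

```python
from typing import Iterator, NamedTuple, Optional


class FlowRecord(NamedTuple):
    srcaddr: str
    dstaddr: str
    srcport: int
    dstport: int
    protocol: int
    packets: int
    byte_count: int
    action: str


def parse_flow_record(message: str) -> Optional[FlowRecord]:
    """Parse one default-format (version 2) flow log line; None if it has no data."""
    fields = message.split()
    # NODATA and SKIPDATA records carry "-" placeholders instead of traffic fields.
    if len(fields) != 14 or fields[13] != "OK":
        return None
    return FlowRecord(
        srcaddr=fields[3],
        dstaddr=fields[4],
        srcport=int(fields[5]),
        dstport=int(fields[6]),
        protocol=int(fields[7]),
        packets=int(fields[8]),
        byte_count=int(fields[9]),
        action=fields[12],
    )


def read_log_stream(client, log_group: str, log_stream: str) -> Iterator[FlowRecord]:
    """Page through one log stream from its start, yielding parsed records.

    `client` is assumed to behave like boto3.client("logs").
    """
    kwargs = {
        "logGroupName": log_group,
        "logStreamName": log_stream,
        "startFromHead": True,
    }
    while True:
        response = client.get_log_events(**kwargs)
        for event in response["events"]:
            record = parse_flow_record(event["message"])
            if record is not None:
                yield record
        token = response["nextForwardToken"]
        # GetLogEvents signals the end of the stream by returning the same token.
        if token == kwargs.get("nextToken"):
            break
        kwargs["nextToken"] = token
```

Each page costs one API call per log stream, which is why this pattern is simple to write but sensitive to API rate limits.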

Processing flow logs with the CloudWatch Logs API

Unfortunately, the straightforward approach runs into problems with large log volumes. The API is rate limited, and once you hit that limit you may not be able to retrieve logs fast enough to keep up in real time.

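Luckily, CloudWatch Logs can integrate with Amazon Kinesis for real-time processing. There are two main steps: create a Kinesis stream, then create a CloudWatch Logs subscription that delivers the log group's events to that stream.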
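The setup can be sketched roughly as follows, assuming the two steps are creating a Kinesis stream and adding a CloudWatch Logs subscription filter that targets it. The function takes boto3-style Kinesis and CloudWatch Logs clients from the caller; the function name, the filter name, and the pre-existing IAM role that lets CloudWatch Logs write to the stream are all assumptions made for illustration.

```python
def subscribe_log_group_to_kinesis(
    kinesis,
    logs,
    log_group: str,
    stream_name: str,
    role_arn: str,
    shard_count: int = 1,
    filter_name: str = "flow-logs-to-kinesis",  # hypothetical filter name
) -> str:
    """Create a Kinesis stream and point a log group's subscription at it.

    `kinesis` and `logs` are assumed to behave like boto3.client("kinesis")
    and boto3.client("logs"). Returns the ARN of the new stream.
    """
    # Step 1: create the stream and wait until it is ready to accept records.
    kinesis.create_stream(StreamName=stream_name, ShardCount=shard_count)
    kinesis.get_waiter("stream_exists").wait(StreamName=stream_name)
    description = kinesis.describe_stream(StreamName=stream_name)
    stream_arn = description["StreamDescription"]["StreamARN"]

    # Step 2: subscribe the log group. An empty filter pattern matches every event.
    logs.put_subscription_filter(
        logGroupName=log_group,
        filterName=filter_name,
        filterPattern="",
        destinationArn=stream_arn,
        roleArn=role_arn,
    )
    return stream_arn
```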

Using Kinesis for real-time retrieval of flow logs

Once your subscription is active, you can write an application that uses the Kinesis GetRecords API call to pull from the Kinesis shards.
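A minimal sketch of such a consumer follows. CloudWatch Logs delivers each Kinesis record as a gzip-compressed JSON document containing a `messageType` and a list of `logEvents`, so the record payload has to be decompressed before the flow log lines can be read. The client is assumed to behave like `boto3.client("kinesis")`; `decode_subscription_record`, `read_shard`, and the `max_batches` cap are illustrative choices, not part of any library.

```python
import gzip
import json
from typing import Iterator, List


def decode_subscription_record(data: bytes) -> List[str]:
    """Unpack one CloudWatch Logs subscription record into its log messages."""
    payload = json.loads(gzip.decompress(data))
    # CONTROL_MESSAGE records are delivery checks and carry no flow log data.
    if payload.get("messageType") != "DATA_MESSAGE":
        return []
    return [event["message"] for event in payload["logEvents"]]


def read_shard(kinesis, stream_name: str, shard_id: str, max_batches: int) -> Iterator[str]:
    """Pull up to `max_batches` batches from one shard, oldest records first."""
    iterator = kinesis.get_shard_iterator(
        StreamName=stream_name,
        ShardId=shard_id,
        ShardIteratorType="TRIM_HORIZON",
    )["ShardIterator"]
    for _ in range(max_batches):
        response = kinesis.get_records(ShardIterator=iterator, Limit=1000)
        for record in response["Records"]:
            yield from decode_subscription_record(record["Data"])
        iterator = response.get("NextShardIterator")
        if iterator is None:  # the shard has been closed
            break
```

A long-running consumer would loop indefinitely, pause briefly between batches, and checkpoint its position so it can resume after a restart; those parts are omitted here.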

Kinesis can scale to deliver large log streams, but the application doing the processing also needs to scale gracefully. That can mean rewriting a single-threaded application to use multiple threads and, if the analysis is CPU-bound, moving to a larger EC2 instance (or multiple EC2 instances).
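One way to picture that restructuring is to give each shard's records to its own worker and merge the partial results. The sketch below uses a thread pool and a deliberately simple analysis (total bytes per source address, assuming the default 14-field flow log format); `bytes_by_source` and `process_shards` are illustrative names. In CPython, CPU-bound analysis would typically swap in `ProcessPoolExecutor`, which offers the same `map` interface.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable


def bytes_by_source(messages: Iterable[str]) -> Counter:
    """Sum the bytes field per source address for one shard's flow log lines."""
    totals = Counter()
    for message in messages:
        fields = message.split()
        # Skip malformed lines and NODATA/SKIPDATA records.
        if len(fields) == 14 and fields[13] == "OK":
            totals[fields[3]] += int(fields[9])
    return totals


def process_shards(shard_batches: Dict[str, Iterable[str]], max_workers: int = 4) -> Counter:
    """Analyze each shard's lines in a separate worker and merge the totals."""
    combined = Counter()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for totals in pool.map(bytes_by_source, shard_batches.values()):
            combined.update(totals)  # Counter.update adds counts together
    return combined
```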

Scaling flow log processing with parallel EC2 resources

Using more EC2 resources can be expensive, and it also involves an operational burden: the server has to stay up, the application has to be kept running, and so on. Fortunately, there is another option. AWS Lambda integrates nicely with Kinesis, and allows you to run many instances of processing applications in parallel. There are no servers to maintain, and your application will scale automatically as log volumes fluctuate.
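To give a sense of the shape of such a function, here is a minimal sketch of a Lambda handler attached to a Kinesis stream. Lambda passes Kinesis records with a base64-encoded `data` field, and the CloudWatch Logs payload inside is gzip-compressed JSON. The handler name and the accept/reject tally are illustrative assumptions, standing in for whatever analysis the application actually performs.

```python
import base64
import gzip
import json


def handler(event, context):
    """Count accepted and rejected flows in one batch of Kinesis records."""
    accepted = 0
    rejected = 0
    for record in event["Records"]:
        compressed = base64.b64decode(record["kinesis"]["data"])
        payload = json.loads(gzip.decompress(compressed))
        # Skip the control messages CloudWatch Logs sends to test delivery.
        if payload.get("messageType") != "DATA_MESSAGE":
            continue
        for log_event in payload["logEvents"]:
            fields = log_event["message"].split()
            if len(fields) != 14:  # not a default-format flow log line
                continue
            if fields[12] == "ACCEPT":
                accepted += 1
            elif fields[12] == "REJECT":
                rejected += 1
    return {"accepted": accepted, "rejected": rejected}
```

Because Lambda invokes the handler once per batch, parallelism comes from the service running many invocations at once rather than from threads inside the function.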

Scaling flow log processing with Lambda functions

A future post will go over an example of a Lambda application that can be used to analyze flow logs.

Note: The CloudWatch Logs > Kinesis > Lambda pipeline can be further simplified to skip over Kinesis, if all the resources involved are in the same customer account. See Amazon's documentation for details on how to use CloudWatch Logs subscriptions with Lambda.
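In that simplified arrangement the event arrives in a different envelope: a single base64-encoded, gzip-compressed payload under `awslogs.data` rather than a list of Kinesis records. A minimal sketch of unpacking it, with `direct_handler` as an illustrative name:

```python
import base64
import gzip
import json


def direct_handler(event, context):
    """Unpack an event delivered straight from a CloudWatch Logs subscription."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    messages = [log_event["message"] for log_event in payload["logEvents"]]
    return {"logGroup": payload.get("logGroup"), "messages": messages}
```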
