A Python AWS Lambda function for fetching CloudFront logs from S3 and bulk uploading them to Loggly
The handler function is designed to fire when a new, gzipped CloudFront log file is dropped into an S3 bucket. (See the CloudFront docs for how to enable access logging.) It fetches the contents of the file, gzip-decompresses it, parses the TSV-formatted events into JSON, and pushes them to Loggly via their bulk upload endpoint.
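Conceptually, the handler does something like the following. This is a minimal sketch (Python 2.7, matching the runtime used below), not the repo's actual code: the token and tags would really come from `config.cfg`, and error handling and the `max_lines` chunking are omitted.

```python
import gzip
import json
import urllib2
from StringIO import StringIO

import boto3

# In the real code these come from config.cfg (see the config settings below)
LOGGLY_TOKEN = 'your-loggly-token'
LOGGLY_TAGS = 'cloudfront'


def lambda_handler(event, context):
    rec = event['Records'][0]['s3']
    bucket = rec['bucket']['name']
    key = rec['object']['key']

    # Fetch the gzipped log object and decompress it in memory
    raw = boto3.client('s3').get_object(Bucket=bucket, Key=key)['Body'].read()
    lines = gzip.GzipFile(fileobj=StringIO(raw)).read().splitlines()

    # CloudFront access logs start with "#Version:" and "#Fields:" comment
    # lines; the names in "#Fields:" become the JSON keys for each event
    fields, events = [], []
    for line in lines:
        if line.startswith('#Fields:'):
            fields = line.split()[1:]
        elif line and not line.startswith('#'):
            events.append(json.dumps(dict(zip(fields, line.split('\t')))))

    # POST the newline-separated JSON events to Loggly's bulk endpoint
    url = 'https://logs-01.loggly.com/bulk/{0}/tag/{1}/'.format(
        LOGGLY_TOKEN, LOGGLY_TAGS)
    req = urllib2.Request(url, data='\n'.join(events),
                          headers={'content-type': 'text/plain'})
    urllib2.urlopen(req)
```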
- An AWS account that includes usage rights for the relevant services, Lambda & S3
- Clone the repo and `cd` into the directory:

```
git clone https://github.com/harvard-dce/S3CloudfrontToLoggly.git
cd S3CloudfrontToLoggly
```
- Make a local copy of the config file:

```
cp config.cfg.example config.cfg
```
- Update `config.cfg` with the appropriate values (see the config settings below)
- Create the zip file:

```
zip S3CloudfrontToLoggly.zip S3CloudfrontToLoggly.py config.cfg
```
- Open the Lambda console: https://console.aws.amazon.com/lambda/home
- Click Create a Lambda Function
- Click Skip to skip the blueprint
- In the Name field enter "S3CloudfrontToLoggly" (or whatever you want)
- Choose the python2.7 runtime
- Select Upload a .ZIP file and select the zip file you created earlier
- Enter "S3CloudfrontToLoggly.lambda_handler" as the Handler
- For the Role select "S3 execution role" and follow the steps to create the new role
- The remaining memory and timeout defaults depend on the size of your log files (I'm using 1024MB memory and a 10s timeout)
- Click Next and then Create function
- Click the Event Sources tab in your new function
- Click Add event source
- Choose "S3" as the Event source type
- Select the bucket where your CloudFront logs are dumped
- For Event type choose "Object Created (All)" -> "PUT"
- Click Submit and you're done! (A sample test invocation is sketched below.)
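If you want to exercise the function without waiting for CloudFront, the S3 PUT notification that Lambda hands to the handler looks roughly like the dict below, trimmed to the fields the function actually uses. The bucket and key names here are made up; point them at a real gzipped log file in your bucket.

```python
# Hypothetical local test invocation; bucket and key names are made up
from S3CloudfrontToLoggly import lambda_handler

test_event = {
    'Records': [{
        'eventSource': 'aws:s3',
        'eventName': 'ObjectCreated:Put',
        's3': {
            'bucket': {'name': 'my-cloudfront-log-bucket'},
            'object': {'key': 'EXXXXXXXXXXXX.2015-10-01-12.abc123.gz'},
        },
    }]
}

lambda_handler(test_event, None)
```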
In the Monitoring tab of your function there is a link to the CloudWatch log streams, where you can check for success/errors.
Config settings (in `config.cfg`):

- `loggly_token` - the access token for your Loggly account
- `loggly_tags` - any additional tags you want added to the events, separated by commas (see Loggly's tag documentation for more)
- `include_bucket_tag` - if enabled, include the S3 source bucket name as a Loggly event tag
- `max_lines` - max number of events to send at once. The default is probably fine; you just don't want the uploaded "chunks" to be larger than 5MB, Loggly's bulk upload limit.
- `debug` - if enabled, extra debug info will be written to the Lambda function's log output
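For reference, a hypothetical `config.cfg` based on the settings above. The section name and values here are made up, so use `config.cfg.example` as the actual starting point:

```
[loggly]
loggly_token = your-loggly-token
loggly_tags = cloudfront,cdn
include_bucket_tag = true
max_lines = 2000
debug = false
```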