Going Serverless With Lambda

29 January 2017

I hate hosting things and will avoid having to run a server at all costs. I don’t like having to keep the OS up to date. I don’t like having to worry about weak applications exposing the entire server to a hack. I don’t like having to take my own backups. I don’t like having to juggle different versions of software. I could go on.

I typically host my projects with PaaS providers, almost always Heroku. It ticks most of the boxes and pulls me out of managing a server into focusing solely on developing the application.

I’m a big fan of Amazon Web Services and was really interested in Lambda when it was announced. That said, I hadn’t had much of a chance to play with it, so I decided to give it a proper go on my latest project, tier12.info.

AWS Lambda

At its core, Lambda is a piece of code you upload to Amazon that you can execute using an AWS API call. There’s minimal configuration required to get things up and running, besides understanding and configuring AWS concepts like function roles and security groups. You just need to specify a runtime, a memory limit, a time limit and, of course, the code that you want to run.

There are a number of runtimes to choose from, like C#, Java, Python and Node. This makes it really accessible to a huge number of developers. I chose Node as I was most familiar with JavaScript out of the available runtimes.

Pricing is really interesting: execution time is billed by the GB-second (at the time of writing, $0.00001667 per GB-second), plus $0.0000002 per request. This model means you can really stretch your dollar - 10 million function executions at 128MB for ~200ms each will set you back $6.17 ($2.00 for the requests and $4.17 for the execution time). In reality the majority of your cost here would fall into the free tier (all of the execution time and the first million requests), meaning that the real cost would be $1.80!
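The arithmetic behind those figures can be sketched in a few lines of Node. The prices are the ones quoted above; the free tier allowances (400,000 GB-seconds and 1 million requests per month) and the function name estimateMonthlyCost are assumptions made for illustration, so check the current pricing page before relying on them.

```javascript
// Prices as quoted in the post (early 2017).
const PRICE_PER_GB_SECOND = 0.00001667;
const PRICE_PER_REQUEST = 0.0000002;

// Assumed monthly free tier allowances; not stated in the post.
const FREE_GB_SECONDS = 400000;
const FREE_REQUESTS = 1000000;

// Hypothetical helper: estimates a monthly bill in dollars.
function estimateMonthlyCost(invocations, memoryMb, durationMs, applyFreeTier) {
  // GB-seconds = invocations x memory in GB x duration in seconds.
  const gbSeconds = invocations * (memoryMb / 1024) * (durationMs / 1000);

  const billableGbSeconds = applyFreeTier
    ? Math.max(0, gbSeconds - FREE_GB_SECONDS)
    : gbSeconds;
  const billableRequests = applyFreeTier
    ? Math.max(0, invocations - FREE_REQUESTS)
    : invocations;

  const compute = billableGbSeconds * PRICE_PER_GB_SECOND;
  const requests = billableRequests * PRICE_PER_REQUEST;
  return { compute, requests, total: compute + requests };
}

const withoutFreeTier = estimateMonthlyCost(10000000, 128, 200, false);
console.log(withoutFreeTier.total.toFixed(2)); // 6.17

const withFreeTier = estimateMonthlyCost(10000000, 128, 200, true);
console.log(withFreeTier.total.toFixed(2)); // 1.80
```

At 128MB and 200ms, 10 million invocations come to 250,000 GB-seconds, which sits under the assumed free allowance, so only the 9 million requests beyond the first million are billed.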

Using Heroku rather than rolling your own on DigitalOcean or Linode can cost significantly more. The cost saving of moving as much execution logic as possible out to Lambda is appealing enough in itself, even if you ignore the rest of the benefits.

Structuring the project

I used Lambda for communication with the Bungie API, pulling down the player’s inventory and then pushing the inventory up to my Firebase database. The first function pulled the data and split it into chunks. The second function took that chunk and pushed it to the database.

Node in Lambda

The basic structure of the Lambda function in Node is:

exports.handler = (event, context, callback) => {
  callback(null, true);
};

All of your logic is handled inside the handler function. When you’re done, you call the callback function. The callback parameter has the signature callback(error, result). If you encounter an error during execution and want to report that back to Lambda, then pass a JSON serialisable value in as the first argument. If your function run was a success, then pass something useful back as the result argument and set error to null.
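A minimal sketch of both paths through callback(error, result) might look like the following. The doWork function and its membershipId check are hypothetical stand-ins for real logic, not part of the project described here.

```javascript
// Hypothetical unit of work, for illustration only.
function doWork(event) {
  if (!event || !event.membershipId) {
    throw new Error('membershipId is required');
  }
  return { membershipId: event.membershipId, ok: true };
}

const handler = (event, context, callback) => {
  let result;
  try {
    result = doWork(event);
  } catch (err) {
    // Failure: report the error as the first argument.
    return callback(err);
  }
  // Success: error is null and the result is the second argument.
  callback(null, result);
};

exports.handler = handler;
```

Keeping the success callback outside the try block means an exception thrown by the callback itself is not mistaken for a failure in doWork.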

If any errors are encountered during the run, they will be well documented in the CloudWatch logs. Everything that your function outputs (even with console.log) is piped into that interface so you can view it easily. You can get to the logs by viewing your function in the AWS console.

Publishing your function code

To get your code up to Lambda you can either use the web interface or upload a zip file through the AWS CLI. Uploading a zip is the only way to include the node_modules/ directory produced by npm install. I used the following snippet to get a new version of the code uploaded once I had created my function:

#!/bin/bash -xe

# A quick sanity check of my code to make sure it has valid syntax.
node -c index.js

# Create the package.
zip -r /tmp/lambda-upload.zip index.js node_modules/

# Upload the package and publish it immediately.
aws lambda update-function-code \
  --profile personal \
  --region eu-west-1 \
  --function-name PutYourFunctionNameHere \
  --zip-file fileb:///tmp/lambda-upload.zip \
  --publish

# Clean up.
rm -f /tmp/lambda-upload.zip

Triggering the function

Once I had the function working and my workflow ready, I needed to trigger my first Lambda function call from PHP. This is easily achieved with the (remarkably complete!) AWS SDK for PHP.

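<?php
// Assumes the SDK was installed with Composer.
require 'vendor/autoload.php';

use Aws\Lambda\LambdaClient;

// The static factory() method is deprecated in version 3 of the SDK.
$client = new LambdaClient([
    'region' => 'eu-west-1',
    'version' => 'latest'
]);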

$result = $client->invoke([
    'FunctionName' => 'PutYourFunctionNameHere',
    'InvocationType' => 'Event',
    'Payload' => json_encode([
        'accessToken' => $_SESSION['access_token'],
        'platformId' => $_SESSION['platform_id'],
        'membershipId' => $_SESSION['membership_id'],
        'characterIds' => $_SESSION['character_ids'],
        'apiKey' => $_ENV['BUNGIE_API_KEY']
    ])
]);

I structured my Lambda function so that everything that it needed to talk to Firebase and the Bungie API was passed to it as a parameter when it was triggered. This included the OAuth access token from Bungie and the OAuth application identifier provided as an API key.

For the first function, these payload values are available in the event argument to the handler:

exports.handler = (event, context, callback) => {
    var api_key = event.apiKey;
    var access_token = event.accessToken;
    var platform_id = event.platformId;
    var membership_id = event.membershipId;
    var character_ids = event.characterIds;

    // ...
};

Building the pipeline

The next step was getting this first function to trigger multiple invocations of the next function. The traditional model of achieving parallelisation like this is using a barrier or something like Promise.all to wait for all of our threads/promises to complete/resolve. This approach would extend the run time of my function, costing me more money while it slept. As we’ve seen, the billing model of Lambda discourages keeping functions running longer than required, so I decided to use SNS as a publish-consume mechanism between the different functions.

Once the chunking was done, I pushed a message onto the SNS topic that my second function was subscribed to. This meant that whenever a message was added to that topic the second function would be triggered with the SNS message as its payload.

As the snippet below shows, I took the same approach to the second function as the first: all the parameters that the function needed to execute were passed in as the payload.

var promises = [];

item_chunks.forEach(function (chunk) {
  var params = {
    Message: JSON.stringify({
      accessToken: event.accessToken,
      platformId: event.platformId,
      membershipId: event.membershipId,
      itemData: chunk,
      apiKey: event.apiKey
    }),
    Subject: 'Pushed from MyFunctionName',
    TopicArn: sns_topic
  };

  promises.push(sns.publish(params).promise());
});

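// Report success once every chunk has been published,
// and pass any publish failure back to Lambda.
Promise.all(promises).then(function () {
  callback(null, item_ids_uniqued);
}).catch(callback);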

The handler of the second function looked a bit different as the payload data was held in a slightly different place within the event argument:

exports.handler = (event, context, callback) => {
  var snsPayload = event.Records[0].Sns;

  var snsMessage = JSON.parse(snsPayload.Message);

  var api_key = snsMessage.apiKey;
  var access_token = snsMessage.accessToken;
  var platform_id = snsMessage.platformId;
  var membership_id = snsMessage.membershipId;
  var received_items = snsMessage.itemData;

  // ...
};
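To make the shape of that event concrete, here is a small sketch that pulls the parsed message out of every record rather than only the first. The helper name parseSnsMessages and the sample values are made up for illustration; only the Records[n].Sns.Message path comes from the handler above.

```javascript
// Hypothetical helper: returns the parsed JSON body of each SNS record.
function parseSnsMessages(event) {
  const records = (event && event.Records) || [];
  return records
    // Skip anything that does not look like an SNS record.
    .filter((record) => record.Sns && typeof record.Sns.Message === 'string')
    // The Message field is a JSON string, so it has to be parsed.
    .map((record) => JSON.parse(record.Sns.Message));
}

// Trimmed-down sample event with made-up values.
const sampleEvent = {
  Records: [
    {
      Sns: {
        Subject: 'Pushed from MyFunctionName',
        Message: JSON.stringify({ membershipId: '42', itemData: [1, 2, 3] })
      }
    }
  ]
};

const messages = parseSnsMessages(sampleEvent);
console.log(messages[0].membershipId); // 42
```

Feeding a hand-built event like sampleEvent into the handler is also a cheap way to exercise the function locally before uploading it.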

Chunking and limits

An issue that I ran into was deciding the size of the chunks. My initial approach was to make as many chunks as I could, balancing the overhead of a new invocation with the speedup I could get from the parallelisation. I quickly encountered the limits that Amazon places on your Lambda account, and the throttling that comes with them.

The throttling limit is in place to stop you from inadvertently running up a massive bill, for example when a bug in your code triggers millions of executions. I was able to get this increased by opening a ticket with Amazon support. I decided to err on the side of bigger chunks as the speedup that I was seeing wasn’t as big as I expected and I wanted to be kinder to the Bungie API.
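The chunking itself is not shown in this post, but a fixed-size split is one simple way to do it. In the sketch below, the chunkItems name and the chunk size are assumptions; the size is the knob that trades the number of invocations against the work done in each one.

```javascript
// Hypothetical helper: splits an array into chunks of at most chunkSize items.
function chunkItems(items, chunkSize) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += chunkSize) {
    // slice() stops at the end of the array, so the last chunk may be shorter.
    chunks.push(items.slice(i, i + chunkSize));
  }
  return chunks;
}

console.log(chunkItems([1, 2, 3, 4, 5], 2)); // [ [ 1, 2 ], [ 3, 4 ], [ 5 ] ]
```

A larger chunkSize means fewer invocations and less chance of hitting the throttle, at the cost of less parallelism.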

The rest of the stuff I did was more about Firebase and the challenges of building the things that I needed on top of that style of database.

Conclusion

My experience with Lambda has been mostly positive. The possibility of building a real serverless API or application feels very close. Being able to easily plug into the myriad of other AWS services is really appealing and they all play nicely together.

I’ve got more reading and experimenting to do around the Serverless project and Amazon API Gateway to make the entire workflow easier, but I’ll definitely be giving Lambda a serious look for any applications I build in the future.

Even if you’re just moving a small component part of your project out to Lambda, the benefits it can give you around scalability and cost are well worth your time investigating.

Nick Jones