Originally posted at Serverless
I use AWS Lambda for almost all of my projects these days, from Flask apps and Slack bots to cron jobs and monitoring tools. I love how cheap and easy it is to deploy something valuable.
Python is my go-to language, but handling Python packages in Lambda can be tricky. Many important packages need to compile C extensions, like psycopg2 for Postgres access, or numpy, scipy, pandas, or sklearn for numerical analysis. If you compile these on a Mac or Windows system, the resulting binaries won't match Lambda's Linux environment, and you'll get an error when your Lambda tries to load them.
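To see why the build machine matters, it helps to look at what the interpreter itself reports. The sketch below (the function name describe_build_target is just an illustration) prints the platform tag and the file suffixes your Python accepts for compiled extension modules. Run it on your laptop and the values will describe your laptop, not the Linux environment Lambda uses.

```python
import importlib.machinery
import sysconfig


def describe_build_target():
    """Report the platform and extension-module suffixes of this interpreter."""
    return {
        # Platform identifier of the machine running this interpreter.
        "platform": sysconfig.get_platform(),
        # File suffixes this interpreter will try when importing compiled modules.
        "extension_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
    }


if __name__ == "__main__":
    info = describe_build_target()
    print("platform:", info["platform"])
    print("extension suffixes:", info["extension_suffixes"])
```

A package compiled on one platform ships binaries for that platform only, which is why building inside a Lambda-like Linux environment matters.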
The import path also requires finesse. You can install your dependencies directly into your top-level directory, but that clutters up your workspace. If you install them into a subdirectory like deps/ or vendored/, you have to mess with your sys.path at the beginning of your function.
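For reference, the sys.path workaround usually looks something like the sketch below. The helper name add_vendored_to_path and the default directory name vendored are assumptions for illustration; adjust them to whatever layout you use.

```python
import os
import sys


def add_vendored_to_path(base_dir, dirname="vendored"):
    """Put <base_dir>/<dirname> at the front of sys.path, at most once."""
    vendored = os.path.join(base_dir, dirname)
    if vendored not in sys.path:
        # Insert at position 0 so vendored packages win over anything
        # with the same name elsewhere on the path.
        sys.path.insert(0, vendored)
    return vendored
```

In a handler module you would call add_vendored_to_path(os.path.dirname(os.path.abspath(__file__))) before importing any third-party package, which is exactly the kind of boilerplate the plugin lets you avoid.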
But there is a much better way. In this post, I’ll show you how, using the serverless-python-requirements plugin for the Serverless Framework.
Initial Setup
Let’s get our environment ready. If you have Node and NPM installed, run npm install -g serverless to install the Serverless Framework globally.
You’ll also need to configure your environment with AWS credentials.
Note: if you need a refresher on how to install the Framework or get AWS credentials, check out the Prerequisites section at the top of our Quick Start Guide.
Creating your service locally
For this quick demo, we’ll deploy a Lambda function that uses the popular NumPy package.
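We can create a service from a template. I’m going to use Python 3, but this works with Python 2 as well. The command is serverless create --template aws-python3 --name numpy-test --path numpy-test, run from the directory where you keep your projects.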
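This will create a Serverless Python 3 template project at the given path (numpy-test/) with a service name of numpy-test. You'll need to change into that directory (cd numpy-test) and create a virtual environment for developing locally, for example with virtualenv venv --python=python3 followed by source venv/bin/activate to activate it.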
(Note: further reading here about how and why to use virtual environments with Python.)
Let’s set up the function we want to deploy. Edit the contents of handler.py so that it contains the following:
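A minimal sketch of such a handler, assuming a small array example along the lines of the NumPy Quick Start (the exact array and the printed message are illustrative choices):

```python
import numpy as np


def main(event, context):
    # Build a 3x5 array holding the integers 0 through 14.
    a = np.arange(15).reshape(3, 5)

    print("Your numpy array:")
    print(a)


if __name__ == "__main__":
    # Lets us run the handler locally with `python handler.py`;
    # the event and context values are unused placeholders.
    main('', '')
```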
This is a super simple function using an example from the NumPy Quick Start. When working with Lambda, you’ll need to define a function that accepts two arguments: event and context. You can read more at AWS about the Lambda Function Handler for Python.
Notice the last two lines of the file, which give us a way to quickly test the function locally. If we run python handler.py, it will run our main() function. Let's give it a shot.
Ah, the import fails because we haven’t installed numpy in our virtual environment yet. Let's do that now with pip install numpy, then run pip freeze > requirements.txt to save the package versions of our environment to a requirements.txt file.
If we run python handler.py again, we’ll see the output we want, with no import error this time.
Perfect.
Deploying your service
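Our function is working locally, and it’s ready for us to deploy to Lambda. Edit the serverless.yml file so that it sets the service name to numpy-test, uses aws as the provider with the python3.6 runtime, and defines a single function named numpy whose handler is handler.main.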
This is a basic service called numpy-test. It will deploy a single Python 3.6 function named numpy to AWS, and the entry point for the numpy function is the main function in the handler.py module.
Our last step before deploying is to add the serverless-python-requirements plugin. Run npm init to create a package.json file for saving your node dependencies and accept the defaults, then run npm install --save serverless-python-requirements to install the plugin.
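To configure the service to use the plugin, we'll add two sections to serverless.yml: a plugins list containing serverless-python-requirements, and a custom section with a pythonRequirements block that sets dockerizePip: non-linux.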
Note: a previous version of this post set dockerizePip: true instead of dockerizePip: non-linux. You'll need serverless-python-requirements v3.0.5 or higher for this option.
You need to have Docker installed to be able to set dockerizePip: true or dockerizePip: non-linux. Alternatively, you can set dockerizePip: false, and the plugin will not use Docker packaging. However, Docker packaging is essential if your dependencies include native packages such as Psycopg2, NumPy, or Pandas.
The plugins section registers the plugin with the Framework. In the custom section, we tell the plugin to use Docker when installing packages with pip. It will use a Docker container that's similar to the Lambda environment so the compiled extensions will be compatible.
The plugin works by hooking into the Framework on a deploy command. Before your package is zipped, it uses Docker to install the packages listed in your requirements.txt file and save them to a .requirements/ directory. It then symlinks the contents of .requirements/ into your top-level directory so that Python imports work as expected. After the deploy is finished, it cleans up the symlinks to keep your directory clean.
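The link-then-clean-up idea can be sketched in a few lines of Python. This is only an illustration of the mechanism described above, not the plugin's actual code, and the function names link_requirements and unlink_requirements are made up for the example.

```python
import os


def link_requirements(requirements_dir, project_dir):
    """Symlink each entry of requirements_dir into project_dir.

    Returns the list of links created so they can be removed later.
    """
    created = []
    for name in sorted(os.listdir(requirements_dir)):
        target = os.path.join(requirements_dir, name)
        link = os.path.join(project_dir, name)
        if os.path.lexists(link):
            # Never overwrite a file that already lives in the project.
            continue
        os.symlink(target, link)
        created.append(link)
    return created


def unlink_requirements(created):
    """Remove only the symlinks that link_requirements created."""
    for link in created:
        if os.path.islink(link):
            os.unlink(link)
```

Keeping track of exactly which links were created is what makes the cleanup safe: the installed packages stay in the requirements directory, and the project's own files are never touched.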
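Great. With everything configured, run serverless deploy to ship the service. Once the deploy finishes, run serverless invoke -f numpy --log to invoke our numpy function and read the logs.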
And there it is. You’ve got NumPy in your Lambda!
Be sure to check out the repo for additional functionality, including automatic compression of libraries before deploying, which can be a huge help with the larger numerical libraries in Python.
Many thanks to the United Income team and Daniel Schep in particular for creating the serverless-python-requirements package. If you want to work on serverless full-time, check out United Income. They use a 100% serverless architecture for everything from serving up their web application to running millions of financial simulations, and they are always looking for talented engineers to join their growing team in Washington, DC.