Deploy Jupyter Notebook to AWS Lambda

May 22, 2018

3 min read


More and more often, I find myself opening Jupyter Notebook when facing a math or algorithmic problem. Maybe the same is true for you. At first, you think: "I'll do only the research/visualization part in a notebook and then move everything to plain Python." Yet after some time, you end up finishing the whole algorithm in Jupyter Notebook. A few years ago, when there were no "cloud lambdas," you would have ended up moving the code somewhere else. Fortunately, these days it is possible to deploy a function written in a Jupyter Notebook in less than a minute.

AWS and Terraform

We will use Terraform to define the infrastructure as code, so we end up with less DevOps routine. And of course, you need to have an AWS account.

Steps

  1. Make the Lambda function ready to deploy.
  2. Create the infrastructure with AWS Lambda and API Gateway.
  3. Test our function.
  4. Create a bash script for deployment.

Getting the Function Ready to Deploy

First, we need to add a handler function to our notebook that will receive a request and return the result of the calculations.

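Here is a minimal sketch of what the handler could look like, assuming a Lambda proxy integration behind API Gateway; algorithm_on_steroids is just a stand-in for whatever the notebook actually computes:

import json

def algorithm_on_steroids(a, b):
    # Stand-in for the notebook's real computation.
    return a + b

def handler(event, context):
    # With a proxy integration, API Gateway passes the POST body
    # as a JSON string in event["body"].
    body = json.loads(event["body"])
    result = algorithm_on_steroids(body["a"], body["b"])
    # The proxy integration expects a status code and a string body back.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"result": result}),
    }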

As you can see, the function parses the JSON payload, runs algorithm_on_steroids, and returns the result in JSON format. Next, we need to create a file named libs.txt, where we add the names of all the third-party libs we use (numpy, matplotlib, pandas, …). Then we add a bash script that will create a .zip file ready to be deployed to Lambda; see the sketch after this list. What the script does:

  1. Create a directory.
  2. Convert the notebook to a Python file and move it into the directory.
  3. Run pip install for each lib specified in libs.txt.
  4. Zip the folder.
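
A minimal sketch of what such a cook_notebook.sh could look like, assuming the notebook is named main.ipynb and libs.txt lists one package per line; the function.zip name matches the deploy call at the end of the post:

#!/bin/bash
# Sketch of cook_notebook.sh: notebook -> python file -> deps -> zip.

mkdir -p build

# Convert the notebook to a plain Python file inside the build directory.
jupyter nbconvert --to script main.ipynb --output-dir build

# Install every third-party lib listed in libs.txt into the same
# directory, since Lambda needs all dependencies bundled in the zip.
while read -r lib; do
  pip install "$lib" -t build
done < libs.txt

# Zip the folder so it is ready for upload.
cd build && zip -r ../function.zip . && cd ..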

AWS Lambda + API Gateway

To use AWS Lambda, we need to create some infrastructure. It includes:

  1. S3 to store the function.
  2. Lambda that will run the function.
  3. API Gateway to communicate with the function.

For simplicity, we will omit best practices and put everything in one Terraform file. But before we can run it, we need to specify credentials in the file. Pay special attention to this piece of the file:

variable "aws_access_key" {
default = "<AWS_ACCESS_KEY>"
}
variable "aws_secret_key" {
default = "<AWS_SECRET_KEY>"
}
variable "region" {
default = "<DEFAULT_REGION>"
}
variable "account_id" {
default = "<AWS_ACCOUNT_ID>"
}
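A rough sketch of what the rest of this single Terraform file could contain. The names tf-lambda and tf-lambdas mirror the deploy command later in the post; the role, runtime, stage name, and the /function path are assumptions for illustration, not the exact configuration:

provider "aws" {
  access_key = "${var.aws_access_key}"
  secret_key = "${var.aws_secret_key}"
  region     = "${var.region}"
}

# S3 bucket that stores the zipped function.
resource "aws_s3_bucket" "lambda_bucket" {
  bucket = "tf-lambdas"
}

resource "aws_s3_bucket_object" "function_zip" {
  bucket = "${aws_s3_bucket.lambda_bucket.bucket}"
  key    = "function.zip"
  source = "function.zip"
}

# Execution role the Lambda runs under.
resource "aws_iam_role" "lambda_role" {
  name = "tf-lambda-role"

  assume_role_policy = <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "lambda.amazonaws.com"},
    "Action": "sts:AssumeRole"
  }]
}
EOF
}

# The Lambda itself; "main.handler" matches the converted main.py.
resource "aws_lambda_function" "function" {
  function_name = "tf-lambda"
  s3_bucket     = "${aws_s3_bucket.lambda_bucket.bucket}"
  s3_key        = "${aws_s3_bucket_object.function_zip.key}"
  handler       = "main.handler"
  runtime       = "python3.6"
  role          = "${aws_iam_role.lambda_role.arn}"
}

# API Gateway: POST /function proxies to the Lambda.
resource "aws_api_gateway_rest_api" "api" {
  name = "tf-lambda-api"
}

resource "aws_api_gateway_resource" "function" {
  rest_api_id = "${aws_api_gateway_rest_api.api.id}"
  parent_id   = "${aws_api_gateway_rest_api.api.root_resource_id}"
  path_part   = "function"
}

resource "aws_api_gateway_method" "post" {
  rest_api_id   = "${aws_api_gateway_rest_api.api.id}"
  resource_id   = "${aws_api_gateway_resource.function.id}"
  http_method   = "POST"
  authorization = "NONE"
}

resource "aws_api_gateway_integration" "lambda" {
  rest_api_id             = "${aws_api_gateway_rest_api.api.id}"
  resource_id             = "${aws_api_gateway_resource.function.id}"
  http_method             = "${aws_api_gateway_method.post.http_method}"
  integration_http_method = "POST"
  type                    = "AWS_PROXY"
  uri                     = "${aws_lambda_function.function.invoke_arn}"
}

resource "aws_api_gateway_deployment" "deployment" {
  depends_on  = ["aws_api_gateway_integration.lambda"]
  rest_api_id = "${aws_api_gateway_rest_api.api.id}"
  stage_name  = "prod"
}

# Let API Gateway invoke the Lambda.
resource "aws_lambda_permission" "apigw" {
  statement_id  = "AllowAPIGatewayInvoke"
  action        = "lambda:InvokeFunction"
  function_name = "${aws_lambda_function.function.function_name}"
  principal     = "apigateway.amazonaws.com"
  source_arn    = "arn:aws:execute-api:${var.region}:${var.account_id}:${aws_api_gateway_rest_api.api.id}/*/*"
}

# This produces the green line with the URL mentioned below.
output "url" {
  value = "${aws_api_gateway_deployment.deployment.invoke_url}"
}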

Create Lambda and Test It!

Now we can open a terminal in the directory with the lambda and all the files mentioned earlier.

$ . ./cook_notebook.sh
$ terraform init
$ terraform apply

Isn't it magic? Three commands, and we have a deployed function. At the end of the terraform apply output, we will see a green line with a URL. That is the URL we can use to call the function. Let's test it by making a POST request.

curl --request POST --data '{"a": 3, "b": 4}' <URL_FROM_OUTPUT>/function
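
With the sketch handler above, where algorithm_on_steroids simply adds its inputs, this request would return something like {"result": 7}.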

OK, but what if we make changes to the function and want to deploy a new version? Let's write a script for deployment.

# $1 = function name
# $2 = bucket name
# $3 = bucket object name (zipped folder)
. ./cook_notebook.sh
# Upload the fresh zip to S3, then point the Lambda at the new object.
aws s3 cp "$3" "s3://$2/$3"
aws lambda update-function-code --function-name "$1" --s3-bucket "$2" --s3-key "$3"

Now we can run it by typing:

. ./deploy.sh tf-lambda tf-lambdas function.zip

Conclusion

In this post, we've built an automation tool for deploying a Jupyter Notebook function to AWS Lambda. It is not ideal. For example, it would be good to add a script that automatically finds the third-party libs used in the notebook and adds them to libs.txt, but that is out of the scope of this post. To delete everything we created from AWS, we can run terraform destroy, and poof, everything is gone. I hope this post was useful to you!