This article explains how to deploy a WSGI python web app + a celery worker.
A recommended reading before moving forward with Heroku is The Process Model.
The Application
The application that we are going to deploy to heroku has the following components:
- WSGI App, it can be a Django or Flask application, the important part is that you provide a WSGI app as entry point.
- Celery, we will be using a Celery worker to run background tasks, for example image resizing.
- Database, well ... you know, we usually want to save data permanently.
- RabbitMQ as message broker.
The Procfile
!
Procfile
let the developer declare the commands each dyno has to run,
there are several process' types:
Process Type | Command Example |
---|---|
web | gunicorn app.wsgi -b 0.0.0.0:$PORT --debug |
worker | celery -A tasks worker --loglevel=info --concurrency=1 |
clock | python foo.py |
This file is the key part to instruct Heroku what to run, for the WSGI app
we'll use a web
process and to run the celery worker daemon we use a
worker
process. Each of these processes will scale independently according
to your configuration (see
Scaling Your Dyno Formation).
For our application we are going to use the following Procfile
:
web: gunicorn app.wsgi -b 0.0.0.0:$PORT --debug worker: celery -A app.tasks worker --loglevel=info --concurrency=1
Testing your Procfile
One of the nice things about the Procfile
is that can be interpreted by
Foreman, so you can test your configuration running
the following command:
$ foreman start 17:43:11 web.1 | started with pid 31923 17:43:11 worker.1 | started with pid 31926 17:43:12 web.1 | 2014-02-19 17:43:12 [31925] [INFO] Starting gunicorn 18.0 17:43:12 web.1 | 2014-02-19 17:43:12 [31925] [INFO] Listening at: http://0.0.0.0:5000 (31925) 17:43:12 web.1 | 2014-02-19 17:43:12 [31925] [INFO] Using worker: sync 17:43:12 web.1 | 2014-02-19 17:43:12 [31938] [INFO] Booting worker with pid: 31938 17:43:14 worker.1 | [2014-02-19 17:43:14,119: WARNING/MainProcess] /home/freyes/.virtualenvs/django-todo/local/lib/python2.7/site-packages/celery/apps/worker.py:161: CDeprecationWarning: 17:43:14 worker.1 | Starting from version 3.2 Celery will refuse to accept pickle by default. 17:43:14 worker.1 | 17:43:14 worker.1 | The pickle serializer is a security concern as it may give attackers 17:43:14 worker.1 | the ability to execute any command. It's important to secure 17:43:14 worker.1 | your broker from unauthorized access when using pickle, so we think 17:43:14 worker.1 | that enabling pickle should require a deliberate action and not be 17:43:14 worker.1 | the default choice. 17:43:14 worker.1 | 17:43:14 worker.1 | If you depend on pickle then you should set a setting to disable this 17:43:14 worker.1 | warning and to be sure that everything will continue working 17:43:14 worker.1 | when you upgrade to Celery 3.2:: 17:43:14 worker.1 | 17:43:14 worker.1 | CELERY_ACCEPT_CONTENT = ['pickle', 'json', 'msgpack', 'yaml'] 17:43:14 worker.1 | 17:43:14 worker.1 | You must only enable the serializers that you will actually use. 17:43:14 worker.1 | 17:43:14 worker.1 | 17:43:14 worker.1 | warnings.warn(CDeprecationWarning(W_PICKLE_DEPRECATED)) 17:43:15 web.1 | 2014-02-19 17:43:15 [31925] [INFO] Handling signal: winch 17:43:15 web.1 | 2014-02-19 17:43:15 [31925] [INFO] SIGWINCH ignored. Not daemonized 17:43:16 worker.1 | [2014-02-19 17:43:16,683: INFO/MainProcess] Connected to amqp://guest@127.0.0.1:5672// 17:43:17 worker.1 | [2014-02-19 17:43:17,323: INFO/MainProcess] mingle: searching for neighbors 17:43:19 worker.1 | [2014-02-19 17:43:19,151: INFO/MainProcess] mingle: all alone 17:43:19 worker.1 | [2014-02-19 17:43:19,750: WARNING/MainProcess] celery@tiefighter ready.
If foreman start
works, then your Procfile
is OK.
Create the app
Once you have configured the Profile
we'll create an application in Heroku
(remember to install heroku-toolbelt) running
heroku create
.
Example output:
$ heroku create Creating murmuring-ravine-4093... done, stack is cedar http://murmuring-ravine-4093.herokuapp.com/ | git@heroku.com:murmuring-ravine-4093.git
This command creates a new heroku app and it adds a new remote to push your code.
.git/config
example:
[remote "heroku"] url = git@heroku.com:desolate-fortress-2758.git fetch = +refs/heads/*:refs/remotes/heroku/*
Adding a Database
Heroku provides databases via addons, there are several providers, but for this article we are going to use heroku-postgres.
To add it to the recently created application run heroku addons:add
heroku-postgresql
. Here is an example output:
$ heroku addons:add heroku-postgresql Adding heroku-postgresql on murmuring-ravine-4093... done, v3 (free) Attached as HEROKU_POSTGRESQL_YELLOW_URL Database has been created and is available ! This database is empty. If upgrading, you can transfer ! data from another database with pgbackups:restore. Use `heroku addons:docs heroku-postgresql` to view documentation.
Adding RabbitMQ
We are using RabbitMQ as message broker for Celery, for this I choose to use
CloudAMQP, to add it to your app run
heroku addons:add cloudamqp
. Here is an example output:
$ heroku addons:add cloudamqp Adding cloudamqp on murmuring-ravine-4093... done, v4 (free) Use `heroku addons:docs cloudamqp` to view documentation.
Adjusting Settings
Heroku passes to your app environment variables with the configuration, this a common practice in the PaaS ecosystem, so you need to modify your app to read config values from it.
Here are examples to setup Celery and Django:
import os
import dj_database_url
DEFAULT_AMQP = "amqp://guest:guest@localhost//"
DEFAULT_DB = "postgres://localhost"
db_url = os.environ.get("DATABASE_URL", DEFAULT_DB)
# Django settings
DATABASES = {"default": dj_database_url.config(default=DEFAULT_DB)}
# Celery app configuration
app = Celery("tasks", backend=db_url.replace("postgres://", "db+postgresql://"),
broker=os.environ.get("CLOUDAMQP_URL", DEFAULT_AMQP))
app.BROKER_POOL_LIMIT = 1
Launch the app!
Now we have everything configured, it's time to launch push our code and let Heroku do the magic running git push heroku master
.
$ git push heroku master Initializing repository, done. Counting objects: 365, done. Delta compression using up to 2 threads. Compressing objects: 100% (148/148), done. Writing objects: 100% (365/365), 39.08 KiB | 0 bytes/s, done. Total 365 (delta 206), reused 319 (delta 185) -----> Python app detected -----> No runtime.txt provided; assuming python-2.7.6. -----> Preparing Python runtime (python-2.7.6) -----> Installing Setuptools (2.1) -----> Installing Pip (1.5.2) -----> Installing dependencies using Pip (1.5.2) Downloading/unpacking Django==1.5.5 (from -r requirements.txt (line 1)) Running setup.py (path:/tmp/pip_build_u26118/Django/setup.py) egg_info for package Django Downloading/unpacking dj-database-url==0.2.1 (from -r requirements.txt (line 2)) Downloading dj-database-url-0.2.1.tar.gz Running setup.py (path:/tmp/pip_build_u26118/dj-database-url/setup.py) egg_info for package dj-database-url Downloading/unpacking django-extensions==0.9 (from -r requirements.txt (line 3)) Running setup.py (path:/tmp/pip_build_u26118/django-extensions/setup.py) egg_info for package django-extensions [...SNIP...] Running setup.py install for pytz warning: no files found matching '*.pot' under directory 'pytz' Running setup.py install for billiard building '_billiard' extension gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DHAVE_SEM_OPEN=1 -DHAVE_FD_TRANSFER=1 -DHAVE_SEM_TIMEDWAIT=1 -IModules/_billiard -I/app/.heroku/python/include/python2.7 -c Modules/_billiard/multiprocessing.c -o build/temp.linux-x86_64-2.7/Modules/_billiard/multiprocessing.o gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DHAVE_SEM_OPEN=1 -DHAVE_FD_TRANSFER=1 -DHAVE_SEM_TIMEDWAIT=1 -IModules/_billiard -I/app/.heroku/python/include/python2.7 -c Modules/_billiard/socket_connection.c -o build/temp.linux-x86_64-2.7/Modules/_billiard/socket_connection.o gcc -pthread -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -DHAVE_SEM_OPEN=1 -DHAVE_FD_TRANSFER=1 -DHAVE_SEM_TIMEDWAIT=1 -IModules/_billiard -I/app/.heroku/python/include/python2.7 -c Modules/_billiard/semaphore.c -o build/temp.linux-x86_64-2.7/Modules/_billiard/semaphore.o gcc -pthread -shared build/temp.linux-x86_64-2.7/Modules/_billiard/multiprocessing.o build/temp.linux-x86_64-2.7/Modules/_billiard/socket_connection.o build/temp.linux-x86_64-2.7/Modules/_billiard/semaphore.o -lrt -o build/lib.linux-x86_64-2.7/_billiard.so warning: no files found matching '*.py' under directory 'Lib' Running setup.py install for anyjson Successfully installed Django dj-database-url django-extensions django-respite django-widget-tweaks gunicorn ipdb ipython psycopg2 readline dj-static static Celery SQLAlchemy pystache kombu pytz billiard amqp anyjson Cleaning up... -----> Collecting static files /app/.heroku/python/lib/python2.7/site-packages/django/conf/urls/defaults.py:3: DeprecationWarning: django.conf.urls.defaults is deprecated; use django.conf.urls instead 0 static files copied. -----> Discovering process types Procfile declares types -> web, worker -----> Compressing... done, 44.2MB -----> Launching... done, v6 http://murmuring-ravine-4093.herokuapp.com deployed to Heroku To git@heroku.com:murmuring-ravine-4093.git * [new branch] master -> master
Initialize the Database
We have our app running ... but the database wasn't initialized. We are using
Django in this example and we are going to run python manage.py syncdb
--noinput
, but for other frameworks you just replace the command with the
one you use to initialize/bootstrap the database.
$ heroku run python manage.py syncdb --noinput Running `python manage.py syncdb --noinput` attached to terminal... up, run.5257 /app/.heroku/python/lib/python2.7/site-packages/django/conf/urls/defaults.py:3: DeprecationWarning: django.conf.urls.defaults is deprecated; use django.conf.urls instead DeprecationWarning) Creating tables ... Creating table auth_permission Creating table auth_group_permissions Creating table auth_group Creating table auth_user_groups Creating table auth_user_user_permissions Creating table auth_user Creating table django_content_type Creating table django_session Creating table django_site Creating table tasks_task Installing custom SQL ... Installing indexes ... Installed 0 object(s) from 0 fixture(s)
And now we are ready to take a look to our app running heroku open
which will open your browser pointing to your app.
What now?
We have the app running and a celery worker, now it's moment to assign a decent domain and other minor settings, heroku provides an option called Production Check, you should use it before going live with your site.