A Brief Introduction to Django Channels

There’s a new updated version of this article here: http://masnun.rocks/2016/09/25/introduction-to-django-channels/


Django has long been an excellent web framework. It has helped many developers and numerous businesses succeed over the years. But before the introduction of Channels, Django only supported the HTTP protocol well. With the gradual evolution of web technologies, standing here in 2016, supporting only HTTP is simply not enough. Today, we are using websockets for real time communication, WebRTC is getting popular for real time collaboration and video calling, and HTTP/2 is being adopted by many. In the current state of the web, any modern web framework needs to be able to support more and more protocols. This is where Django Channels comes into play. Channels aims to add new capabilities to Django, including support for modern web technologies like websockets and HTTP/2.

How does “Channels” work?

The idea behind Channels is quite simple. To understand the concept, let’s walk through an example scenario and see how Channels would process a request.

An HTTP/websocket request hits the reverse proxy (e.g., nginx). This step is not compulsory, but we’re conscientious developers and always make sure our requests first go through a hardened, battle-tested reverse proxy before they hit our application server.

Nginx passes the request to an application server. Since we’re dealing with multiple protocols now, instead of application server, let’s call it an “Interface Server”. This interface server knows how to handle requests using different protocols. The interface server accepts the request and transforms it into a message. It then passes the message on to a channel.

We have to write consumers which listen on specific channels. When new messages arrive on those channels, the consumers process them and, if needed, send a response back to a reply/response channel. The interface server listens on these response channels, and when we write back to them, it reads the message and transmits it to the outside world (in this case, our user). The consumers run in background worker processes. We can spawn as many workers as we like to scale up.

So as you can see, the concept is really simple – an interface server accepts requests and queues them as messages on channels. Consumers process these queues and write back responses on response channels. The interface server sends back the responses. Plain, simple yet effective!
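To make the flow concrete, here is a toy model of it in plain Python. It is only a sketch: the queues stand in for a channel layer, and names such as interface_server_receive and run_worker_once are made up for illustration and are not part of the Channels API.

```python
import queue

# Named in-memory queues standing in for channels (a toy "channel layer").
channels = {}


def get_channel(name):
    """Return the queue for a channel name, creating it on first use."""
    if name not in channels:
        channels[name] = queue.Queue()
    return channels[name]


def interface_server_receive(text, reply_channel):
    """Interface server: turn an incoming frame into a message and queue it."""
    message = {"text": text, "reply_channel": reply_channel}
    get_channel("websocket.receive").put(message)


def echo_consumer(message):
    """Consumer: process one message and write the response to its reply channel."""
    get_channel(message["reply_channel"]).put({"text": message["text"]})


def run_worker_once(routing):
    """Worker: drain every routed channel and hand each message to its consumer."""
    for channel_name, consumer in routing.items():
        channel = get_channel(channel_name)
        while not channel.empty():
            consumer(channel.get())


def interface_server_flush(reply_channel):
    """Interface server: collect what consumers wrote to a reply channel."""
    sent = []
    channel = get_channel(reply_channel)
    while not channel.empty():
        sent.append(channel.get()["text"])
    return sent


# One request travelling through the whole loop.
routing = {"websocket.receive": echo_consumer}
interface_server_receive("hello", reply_channel="reply.client-1")
run_worker_once(routing)
replies = interface_server_flush("reply.client-1")
```

In the real project the interface server and the workers are separate processes, which is why the channel layer has to live somewhere they can all reach.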

Some channels are already available for us. For example, the http.request channel can be listened on if we want to handle incoming HTTP messages, and websocket.receive can be used to process incoming websocket messages. In reality, we would probably be less interested in handling http.request ourselves and would rather let Django handle it. We would be more interested in adding our custom logic for websocket connections or other protocols. Besides the channels which are already available, we can also create our own custom channels for different purposes.

Since the project works by passing messages to channels and handling them with background workers, we can actually use it for managing our background tasks too. For example, instead of generating thumbnails on the fly, we can pass the image information as a message to a channel and let a worker do the thumbnailing in the background. By default Channels ships with a management command, runworker, which runs background workers that listen to the channels. However, so far there is no retry mechanism if message delivery somehow fails. In this regard, Celery can be an excellent choice for writing, running and managing background workers that need such guarantees.

Daphne is now the de facto interface server that works well with Channels. The channels and message passing work through a “channel layer” which supports multiple backends. The popular ones are In Memory, Redis and IPC. As you can guess, these backends and the channel layer are used to abstract away the process of maintaining different channels/queues and allowing workers to listen to them. The In Memory backend maintains the channels in memory and is a good fit for local development, while a Redis cluster would be more suitable in a production environment for scaling up.

Let’s Build a WebSocket Echo Server

Enough talk. Let’s build a simple echo server. But before we can do that, we first have to install the package, which is available from PyPI as channels (pip install channels).

That should install Django (as it’s a dependency of channels) and channels along with the necessary packages. Start a Django project with django-admin and create an app.

Now add channels to the INSTALLED_APPS list in your settings.py. For local development, we are fine with the in memory channel layer, so we need to put these lines in settings.py to define the default channel layer:
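A minimal sketch of that configuration, assuming the Channels 1.x API this article describes and that the routing list inside realtime/routing.py is called channel_routing (the variable name is an assumption):

```python
# settings.py (fragment)
CHANNEL_LAYERS = {
    "default": {
        # In-memory layer: fine for local development, not for production.
        "BACKEND": "asgiref.inmemory.ChannelLayer",
        # Dotted path to the routing list; "channel_routing" is an assumed name.
        "ROUTING": "realtime.routing.channel_routing",
    },
}
```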

In the above code, please note the ROUTING key. As the value of this key, we have to pass the path to our channel routing. In my case, I have an app named realtime and there’s a module named routing.py which has the channel routing.

In the channel routing list, we define our routes, which look very similar to Django’s url patterns. When we receive a message through a websocket connection, the message is passed on to the websocket.receive channel. So we defined a consumer to consume messages from that channel. We also defined a path to indicate that websocket connections to /chat/ should be handled by this particular route. If we omit the path, the clients can connect to any url on the host and we can catch them all! But defining a path helps us namespace things, and it serves another purpose which we will see later in this article.

And here’s the consumers.py:
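A minimal sketch of such a consumer, assuming the Channels 1.x message API. The function name websocket_receive is only illustrative:

```python
# consumers.py
def websocket_receive(message):
    """Echo the text of an incoming websocket frame back to the sender."""
    # The frame payload is exposed as a dict on message.content.
    text = message.content.get("text")
    # Whatever is sent to reply_channel is relayed over the same connection.
    message.reply_channel.send({"text": text})
```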

The consumer is very basic. It retrieves the text we received via websocket and replies back. Note that the websocket content is available on the content attribute of the message, and the reply_channel is the response channel here (the interface server is listening on this channel). Whatever we send to this channel is passed back to the websocket connection.

We have defined our channel layer, created our consumer and mapped a route to it. Now we just need to launch the interface server and the background workers (which run the consumers). In a local environment, we can just run python manage.py runserver as usual. Channels will make sure the interface server and the workers are running in the background. But this should not be used in production: there we must run Daphne separately and launch the workers individually, as described in the deployment section of the Channels documentation.

Once our dev server starts up, let’s open up the web app. If you haven’t added any django views, no worries, you should still see the “It Worked!” welcome page of Django and that should be fine for now. We need to test our websocket and we are smart enough to do that from the dev console. Open up your Chrome Devtools (or Firefox | Safari | any other browser’s dev tools) and navigate to the JS console. Paste the following JS code:

If everything worked, you should get an alert with the message we sent. Since we defined a path, the websocket connection works only on /chat/. Try modifying the JS code to connect to some other url and see how it doesn’t work. Also remove the path from our route and see how you can catch all websocket messages from all the websocket connections regardless of which url they were connected to. Cool, no?

Our websocket example was very short and we just tried to demonstrate how things work in general. But Django Channels provides some really cool features to work with websockets. It integrates with the Django Auth system and authenticates the websocket users for you. Using the Group concept, it is very easy to create group chats or live blogs or any sort of real time communication in groups. Love Django’s generic views? We have generic consumers to help you get started fast. The Channels docs are quite nice, and I suggest you read through them and try the concepts.

Using our own channels

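We can create our own channels and add consumers to them. Then we can simply add messages to those channels by using the channel name. For example, with Channel imported from the channels package, Channel("thumbnails").send({"image_id": 42}) queues a message on a channel named thumbnails (the channel name and payload here are only illustrative).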

WSGI or ASGI?

Since Daphne and ASGI are still new, some people still prefer to handle their HTTP requests via WSGI. In such cases, we can configure nginx to route the requests to different servers (WSGI / ASGI) based on url, domain or upgrade header. Having the real time end points under a particular namespace then makes it easy to configure nginx to send the requests under that namespace to Daphne while sending all others to the WSGI server.

Django: Running management commands inside a Docker container

Okay, so we have dockerized our Django app and we need to run a manage.py command for some task. How do we do that? Simple, we have to locate the container that runs the Django app, log in and then run the command.

Locate The Container

It’s very likely that our app uses multiple containers to compose the entire system. For example, I have one container running MySQL, one container running Redis and another running the actual Django app. If we want to run manage.py commands, we have to log in to the one that runs Django.

While our app is running, we can find the running docker containers using the docker ps command.

In my case, I am using Docker Compose and I know my Django app runs using the crawler_web image. So we note the name of the container that uses this image in the docker ps output, which in my case is crawler_web_1.

Nice, now we know which container we have to log in to.

Logging Into The Container

We use the name of the container to log in to it with docker exec -it crawler_web_1 bash.

The command above will connect us to the container and land us on a bash shell. Now we’re ready to run our command.

Running the command

We cd into the directory if necessary and then run the management command.

Summary

  • docker ps to list running containers and locate the one running Django
  • docker exec -it [container_name] bash to log in to the bash shell on that container
  • cd to the Django project and run python manage.py [command]

Django REST Framework: Remember to disable Web Browsable API in Production

So this is what happened: I built a url shortening service at work for internal use. It’s a very basic app that shortens urls and tracks clicks. There are two models, URL and URLVisit. The URL model contains the full url, the slug for the short url, the created time etc. URLVisit has information related to the click, like user IP, browser data, click time etc, and a ForeignKey to URL as expected.

Two different apps were using this service, one from me, another from a different team. I kept the Web Browsable API so the developers from other teams can try it out easily and they were very happy about it. The only job of this app was url shortening so I didn’t bother building a different home page. When people requested the / page on the domain, I would redirect them directly to /api/.

Things were going really great initially. The load on the service was not very heavy, roughly 50-100 requests per second, which I would call minimal. The server had decent hardware and was running on an EC2 instance from AWS. nginx was at the front while the app was run with uwsgi. Everything was smooth until it happened. After a month and a half, we started noticing very poor performance from the server. Sometimes it was taking up to 40 seconds to respond. I started investigating.

It took me some time to find out what actually happened. By then, we had shortened more than a million urls. So when someone visited /api/url-visit/, the web browsable API tried to render the HTML form. The form allows the user to choose one of the entries from the URL model inside a select (dropdown). Rendering that page with over a million options was driving CPU usage to 100% and blocking or slowing down other requests. It’s not really DRF’s fault. If I tried to load a million entries into a select like that myself, it would crash the app too.

Even worse, remember I added a redirect from the home page directly to the /api/ url? Search engine bots started crawling the urls. As a result the app became extremely slow and often unavailable to nginx. I initially thought I could stop the search engine crawls by adding a robots.txt or simply by adding authentication to the API. But developers from other teams would still visit the API from time to time to try things out and make the app unresponsive. So I did what I had to: I disabled the web browsable API and added separate documentation demonstrating the use of the API with curl, PHP and Python.

I added the following snippet in my production settings file to only enable JSONRenderer for the API:
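A sketch of that setting. DEFAULT_RENDERER_CLASSES is DRF’s standard key, and leaving BrowsableAPIRenderer out of the list is what turns the HTML interface off:

```python
# production settings (fragment)
REST_FRAMEWORK = {
    # Only render JSON; the browsable HTML renderer is deliberately not listed.
    "DEFAULT_RENDERER_CLASSES": (
        "rest_framework.renderers.JSONRenderer",
    ),
}
```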

Things have been pretty smooth since then. I can still enjoy the nice HTML interface locally, where there are far fewer items, while on my production servers there is no web browsable API to cause any bottlenecks.

Why I love Django & Django REST Framework

If you’re reading this article, it is very likely you are a serious Python developer or at least dabble with the language. If you happen to be a web developer, you probably have already used or at least heard about Django. Django is a pretty awesome web framework. And in this post, I am going to highlight a few key features that I absolutely love about Django. Along with Django, I will also discuss why I love DRF, the very popular REST API development framework for Django. Lately, these two have become indispensable parts of my life as a web developer and I feel it’s very important that I credit them for their awesomeness and spread the word!

Why I love Django?

  • I can quickly generate a project skeleton using the django-admin command. There are management commands to generate apps too.
  • I absolutely love Django’s project structure. It is very well organized and meaningful. A project is composed of reusable apps and the apps can self contain the code and resources they need/use. The idea of reusable apps is very handy. There are plenty of open source apps that you can plug into your project and extend as necessary. This allows us to reuse code and get things done faster.
  • Django’s way of handling database access makes perfect sense to me. I define models, then automatically generate migrations from those. The migrations are written in Python, so I can adjust or tweak if necessary, programmatically. The Model definitions can be introspected by other parts of the framework to generate Admin UIs or API end points automagically – this is a huge win as we will see.
  • Django has a very powerful DB routing mechanism that allows us to use multiple databases and programmatically control the access to them. We can totally control which database is read from, which database is written to, for each app and each model.
  • The URL routing portion also makes great sense. There’s a main route configuration per project. We can add app specific routes to this configuration by include-ing them. You can add namespaces to these included ones. The namespacing gets rid of naming conflicts when importing routes from 3rd party apps.
  • For simple views, the function based views are pretty efficient. But there are also class based views for more control and dealing with complex needs.
  • The Generic Class Views are awesome! You can define ListView, CreateView, UpdateView, DetailView or a DeleteView. You pass them the Model to use and some configuration options, they will handle your CRUD operations for you while you can focus your time and concentration on preparing the templates. These have allowed me to build custom CRUD views very very fast. The generic views are very customizable, so you can alter the database queries, pagination, form fields, values passed to the template. If you’re still writing your own business logic for CRUD views, it’s time you took generic class views for a ride. What I like most is that generic views are generated dynamically. It’s not like scaffolding or code generation. In the case of scaffolding, when you need to add an extra field, you need to modify your model, controller, template, form etc. In class based generic views, you just update your model and everything is updated dynamically.
  • Talking about CRUD views, often you won’t even need to build them. The Django Admin interface takes in your models and generates a beautiful (and powerful) admin interface from them. The admin interface is very customizable. You can add custom actions, filtering, templates and what not! Once you master Django’s admin interface, you will probably never need to build CRUD views from scratch. Many of my projects use Django Admin with high client satisfaction. And remember what we said about reusable apps and open source packages? There are several packages which provide alterations or replacements for the built in admin interface. You can plug in one of those for a different taste!
  • Management commands are easy to write and come in very handy for simple tasks. When you install apps, they can provide useful management commands too. One of my favourite management commands is “dbshell”, which allows me to drop into the database’s command line without remembering the login details.
  • Django Templates are pretty awesome too. The syntax is quite similar to Jinja2 (which inspired many templating languages across different platforms). You can extend the templates by writing your own tags and filters. And there’s of course a backend for direct Jinja2 support.
  • Django is quite mature and the eco system is thriving. There’s a Django package for almost everything you would need. There are many 3rd party packages for extending the different parts of Django. There are blog, forum, CMS and other apps which you can plug into your current project to add that functionality. Python also has a vast collection of libraries for numerous purposes. It’s easy to plug them in when we need them.
  • Oh, and did I forget to mention the documentation? Django documentation is superb. Very well written and covers almost everything in the framework. I would say Django Docs is one of the best out there.

The above mentioned features are just a few that I can remember right now. Django has been very pleasant to develop in. I have always felt a productivity boost as well as mental satisfaction when working with Django. Every time I faced a challenge, I found a very simple solution to it.

Why I love Django REST Framework?

  • Browsable APIs on the web – this is one of the best things about DRF. DRF provides very nice web browsable views for your API end points. These views allow HTML forms to submit data to the APIs. At the same time, they also show sample JSON payload for those requests. You can submit a request through the forms/json input and get back a response. So this is pretty much self documentation of the APIs. I have worked with front end and mobile developers (who consumed my APIs) and they absolutely loved the web browsable interface. In most cases, I didn’t even need to provide any API docs to them, they managed to figure things out for themselves. And if you would still want Swagger, there are 3rd party extensions for that too.
  • The ModelViewSet is like those admin views for Django. You can quickly generate CRUD based REST APIs out of your models using this class.
  • There is APIView which provides get, post, put, delete etc methods representing the HTTP verbs. You can extend APIView and provide business logic for these methods to quickly craft your APIs. There are mixins and some preset class based views to generate these methods based on querysets.
  • But don’t be satisfied just yet, ViewSets take things one step further. When you use APIView, you need to create two routes (for example one for /users and one for /users/1), but ViewSets allow us to focus more on the business logic while forgetting about the route management. A ViewSet is a class based view that has methods like list, retrieve, create, update and destroy instead of the HTTP verbs. The ViewSet methods allow us to cover all the REST operations in just one class. We add the view set to a router and the router generates all the necessary underlying Django routes, ready to be included in a urlconf. And as mentioned earlier, the ModelViewSet is a ViewSet that generates these methods from a Model (queryset).
  • Authentication is baked in the framework. It’s very easy to setup different authentications for your APIs. The permission management is also very simple. The framework integrates nicely with some third party packages to support authentication methods which are not supported by the framework by default.
  • The framework is very customizable. We can easily extend the functionality to suit our needs. It’s all very simple, quite like the simplicity of Django!
  • The documentation is excellent. Adequate examples and very clearly described.

The only thing I am not completely happy with in DRF is that creating nested routes (/user/1/comment/2) is still pretty difficult/cumbersome. But given how quickly DRF adds features, I hope this will be resolved soon in some upcoming release!

Helping These Projects Grow

Django and DRF are both open source projects and are the result of many hours of dedication from kind hearted OSS enthusiasts. Today, we can build and maintain awesome projects because Django and DRF exist. We should all consider giving something back to these projects, in return for the benefits they have brought us!

We can help these projects by contributing code, helping with translations, answering questions, promoting the projects or making donations.

Here you can learn how to contribute to Django — https://docs.djangoproject.com/en/dev/internals/contributing/

Django Software Foundation accepts donations here — https://www.djangoproject.com/fundraising/

DRF Contribution guide here — http://www.django-rest-framework.org/topics/contributing/

Django: Limiting User Access to Views

In this post, we would like to see how we can limit user access to our Django views.

Login Required & Permission Required Decorators

If you have worked with Django, you probably have used the login_required decorator already. Adding the decorator to a view limits access only to the logged in users. If the user is not logged in, s/he is redirected to the default login page. Or we can pass a custom login url to the decorator for that purpose.

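For example, we place @login_required directly above a view function, or @login_required(login_url='/custom-login/') if we want a custom login page (the url here is only illustrative).

There’s another nice decorator, permission_required, which works in a similar fashion but takes the name of the permission the user must have, in the form app_label.codename.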

Awesome, but let’s learn how they work internally.

How do they work?

We saw the magic of the login_required and permission_required decorators. But we’re the men of science and we don’t like to believe in magic. So let’s unravel the mystery of these useful decorators.

The code for the login_required decorator lives in django.contrib.auth.decorators.

By reading the code, we can see that the login_required decorator uses another decorator, user_passes_test, which takes a callable to determine whether the user should have access to this view. The callable must accept a user instance and return a boolean value. user_passes_test returns a decorator which is applied to our view.

If we see the source of permission_required, we would see something quite similar. It also uses the same user_passes_test decorator.

Building Our Own Decorators

Now that we know how to limit access to a view based on whether the logged in user passes a test, it’s quite simple for us to build our own decorators for various purposes. Let’s say we want to allow access only to those users who have verified their emails.

Now we can apply the decorator to a view just like the built-in ones.

Users who have verified their email addresses will be able to access this view. If they haven’t, they will be redirected to the login view. Using the reason query string, we can display a nice message explaining what’s happening.

Please note, we have used two decorators on the same view. We can use multiple decorators like this to make sure the user passes all the tests we require them to.
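The whole mechanism can be sketched without Django at all. The code below is a simplified, framework-free imitation of the pattern, not Django’s actual source: it returns a string where Django would return an HTTP redirect, and the is_email_verified attribute, the reason value and the billing view are hypothetical names chosen for illustration.

```python
from functools import wraps
from urllib.parse import urlencode


def user_passes_test(test_func, login_url="/accounts/login/", redirect_field_name="next"):
    """Return a decorator that runs the view only if test_func(request.user) is true."""

    def decorator(view_func):
        @wraps(view_func)
        def wrapped_view(request, *args, **kwargs):
            if test_func(request.user):
                return view_func(request, *args, **kwargs)
            # Django would issue a redirect; this sketch just returns the target.
            separator = "&" if "?" in login_url else "?"
            query = urlencode({redirect_field_name: request.path})
            return "redirect:" + login_url + separator + query

        return wrapped_view

    return decorator


# login_required is user_passes_test with an "is the user logged in" check.
login_required = user_passes_test(lambda user: user.is_authenticated)

# A custom decorator: same helper, different test and a reason in the login url.
email_verified_required = user_passes_test(
    lambda user: getattr(user, "is_email_verified", False),  # hypothetical field
    login_url="/accounts/login/?reason=email_not_verified",
)


@login_required
@email_verified_required
def billing(request):
    return "billing page"
```

The outermost decorator runs first, so an anonymous user is stopped by login_required before the email check is ever evaluated. In a real project you would import user_passes_test from django.contrib.auth.decorators instead of defining it.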

Dockerizing a Django Application

I assume you are already familiar with Docker and its use cases. If you haven’t yet started using Docker, I strongly recommend you do soon.

I have a Django application that I want to dockerize for local development. I am also new to Docker, so not everything I do in this post may be suitable for your production environment. So please do check Docker best practices for production apps. This tutorial is meant to be a basic introduction to Docker. In this post, I am going to use Docker Machine and Docker Compose. You can get them by installing the awesome Docker Toolbox.

Components Breakdown

Before we start, we need to break down our requirements so we can individually build the required components. For my particular application, we need these:

  1. Django App Server
  2. MySQL Database Server
  3. Redis Server

We will build images for these separately so we can create individual containers and link them together to compose our ultimate application. We shall build our Django App server and use pre-built images for MySQL and Redis.

Building the Django App Server

Before we begin, let’s talk Dockerfiles. Dockerfiles are scripts to customize our docker builds. They give us control and flexibility over how we build the images for our applications. We will use our custom Dockerfile to build the Django app server.

To build an image for a Django application we need to go through the following steps:

  • Select a Linux image, we choose Ubuntu
  • Install required packages for the distro.
  • Install Python packages which are required for the app
  • Provide a default command to run and ports to expose

Here’s the Dockerfile we shall use:

So what are we doing here:

  • We’re choosing phusion/baseimage as our base image. It’s a barebone image based on Ubuntu. Ubuntu by default comes with many packages which we don’t need to run inside docker. This base image gets rid of those and provides a very lean and clean image to start with.
  • We just provide a Maintainer name
  • We set DEBIAN_FRONTEND to be non interactive. This will not display any interactive prompts during the build process. Since the docker build process is automated, we really don’t have any way to interact during it. So we disable interaction. And as you might have guessed already ENV sets an environment variable.
  • We install some packages we shall need.
  • We copy our requirements.txt file to /app/src/requirements.txt, change the work directory and install the packages using pip. ADD is used to copy any files or directories to the container while it builds. You might wonder why we didn’t copy over our entire project. That’s because we want to use docker for our development. We will use a nice feature of Docker which allows us to mount our local directories directly inside the container. Doing this, we do not need to copy files every time they change. More on this will come later.
  • We change directory to /app/src/lisp and run the runall management command. This command runs the Django default server along with some other services my application needs. But usually we would want to just do runserver.
  • We EXPOSE port 8000

If you go through the Dockerfile References you will notice – we can do a lot more with Dockerfiles.

Docker Compose and Linking Services

As we mentioned earlier, we shall use pre-built images for MySQL and Redis. We could build them ourselves too but why not take advantage of the well maintained images from the generous folks in the docker community?

We can link multiple docker containers to compose a final application. We can do that using the docker command manually. But Docker Compose is a very nice tool which allows us to define the services we need in a very easy to read syntax. With docker compose, we don’t need to run them manually, we can just use simple commands to do complex docker magic! Here’s our docker-compose.yml file:

In our docker-compose file, we define 3 components:

  • For web, we pass the path to the Dockerfile in the build key. We ask it to restart always and define volumes to mount. .:/app/src means: mount the current directory on my OS X machine as /app/src/ in the container. We also define which ports to expose and which containers should be linked with it.
  • We also define the mysql and redis components with respective configurations. Note that we define the pre-built image name in the image key. Please make sure the volume paths exist and are accessible.

You can consult the Compose File Reference for more details.

Running The Services

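To run the application, we can use docker-compose up.

Please note, the Django server might throw errors if the MySQL / Redis server takes time to initialize. So I usually start them separately, bringing up mysql and redis first (docker-compose up mysql redis) and web afterwards (docker-compose up web).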

Database Configuration for Django

Our MySQL server is running on the IP of the Docker Machine. You need to use this IP address in your Django settings file. To get the IP of a docker machine, run docker-machine ip followed by the machine name (Docker Toolbox usually names it default).
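With that address in hand, the database settings might look roughly like the sketch below. The address 192.168.99.100, the database name and the credentials are placeholders rather than values from my setup, and reading them from environment variables is just one option.

```python
# settings.py (fragment)
import os

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.mysql",
        # Every default below is a placeholder; override via the environment.
        "NAME": os.environ.get("DB_NAME", "myapp"),
        "USER": os.environ.get("DB_USER", "root"),
        "PASSWORD": os.environ.get("DB_PASSWORD", ""),
        # The IP printed by docker-machine goes here.
        "HOST": os.environ.get("DB_HOST", "192.168.99.100"),
        "PORT": os.environ.get("DB_PORT", "3306"),
    }
}
```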

Creating Initial Databases

We can pass a MYSQL_DATABASE environment value to the mysql image so the database is created when creating the service. Or we can also connect to the docker machine manually and create our databases.

Django REST Framework: Custom Exception Handler

While using DRF, do you need to handle the exceptions yourself? Here’s how to do it. First, we need to create an exception handler function like this:

In the exception handler, we get exc which is the exception raised and the context contains the context of the request. However, we didn’t do anything fancy with these. We just called the default exception handler and joined the errors in a flat string. We also added the exception to our response.
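The flattening step can be sketched independently of DRF. The helper below is one possible way to join error data into a single string; the name flatten_errors and the exact output format are assumptions, not part of DRF.

```python
def flatten_errors(data):
    """Join DRF-style error data (dict, list or plain value) into one string."""
    if isinstance(data, dict):
        parts = []
        for field, errors in data.items():
            # Field errors usually arrive as a list of messages.
            if isinstance(errors, (list, tuple)):
                errors = " ".join(str(error) for error in errors)
            parts.append("{}: {}".format(field, errors))
        return " ".join(parts)
    if isinstance(data, (list, tuple)):
        return " ".join(str(error) for error in data)
    return str(data)


example = flatten_errors({"email": ["This field is required."], "detail": "Not found."})
```

Inside the handler, this would be applied to response.data after calling DRF’s default exception_handler(exc, context) from rest_framework.views, and only when that call actually returns a response.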

Now, we need to tell DRF to call our custom exception handler when it comes across an exception. That is easy:

Open up your settings.py and look for the REST_FRAMEWORK dictionary. Here you need to add the exception handler to the EXCEPTION_HANDLER key. So it should look something like this:
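For instance, assuming the handler lives in a hypothetical myproject/utils.py and is named custom_exception_handler:

```python
# settings.py (fragment)
REST_FRAMEWORK = {
    # Dotted path to the handler function; module and name are placeholders.
    "EXCEPTION_HANDLER": "myproject.utils.custom_exception_handler",
}
```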

Django: Handling broken migrations

Often for one reason or another, migrations don’t apply properly and we have to fix them manually. If there’s a faulty migration that throws errors when run, we first have to identify what went wrong. Very often on MySQL, we see half applied migrations, because MySQL does not support transactions around schema changes (DDL statements), so a failed migration is not rolled back. In this case we need to do the other half ourselves.

We can easily connect to the database prompt by running the dbshell management command (python manage.py dbshell).

The command opens the shell for the respective database engine using the configuration provided in the settings file. That is, you don’t have to remember the username or password for the database connection.

Now you have to fix the issues and then you can try running the migration again. In case it fails again, you should alter the database to match the state the migration would have created. For example, if the migration was supposed to alter a column from store_type to store_type_id (from a char field to foreign keys), you have to manually run the query, something like:

Then you have to fake the migration. When a migration is run, Django stores the name of the migration in a table (django_migrations). It helps track which migrations have already run and which still need to be run. When we fake a migration, Django stores the faked migration name in that table without actually running it. If we don’t do this, when we next run the migrate command, Django will try to run this migration again and fail.

We fake it by passing the --fake flag to the migrate command, for example python manage.py migrate --fake.

You can also name the app and one particular migration after the flag (python manage.py migrate --fake app_label migration_name) when there are several pending migrations.

Django REST Framework: Displaying full URL for ImageField or FileField

If you have any ImageField or FileField in your model, you can easily display the full URL (including the hostname/domain) for the file/image. Django REST Framework’s model serializers would do it for you. However, to get the hostname/domain name, the serializer needs a “request context” so it can infer the necessary parts and build a full url.

So if you’re manually invoking a serializer, please pass the request in the context argument when you instantiate it, for example context={'request': request}.

If you use ModelViewSet, DRF will automatically pass the request context while initializing the serializer. So in that case you don’t need to do anything. You need to pass the context only when you’re manually creating a serializer instance.
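Why the context matters can be sketched without DRF. The snippet below is a simplified imitation of what a file field does when it is serialized, not DRF’s source; FakeRequest and example.com are stand-ins for Django’s request object and your own host.

```python
def file_url_representation(file_url, context):
    """Return an absolute URL when a request is in the context, else the bare path."""
    request = context.get("request")
    if request is None:
        return file_url
    return request.build_absolute_uri(file_url)


class FakeRequest:
    """Stand-in for Django's HttpRequest, assuming the site is at example.com."""

    def build_absolute_uri(self, location):
        return "https://example.com" + location


without_context = file_url_representation("/media/avatars/me.png", {})
with_context = file_url_representation(
    "/media/avatars/me.png", {"request": FakeRequest()}
)
```

Without the request the serializer can only give back the path, which is why a manually created serializer, say a hypothetical ProfileSerializer(profile, context={"request": request}), needs the context passed in.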