Categories
General Python

Auto incrementing IDs for MongoDB

If you’re familiar with relational databases like MySQL or PostgreSQL, you’re probably also familiar with auto incrementing IDs. You select a primary key for a table and make it auto incrementing. Every row you insert afterwards, each of them gets a new ID, automatically incremented from the last one. We don’t have to keep track of what number comes next or ensure the atomic nature of this operation (what happens if two different client wants to insert a new row at the very same time? do they both get the same id?). This can be very useful where sequential, numeric IDs are essential. For example, let’s say we’re building a url shortener. We can base62 encode the ID of the url id to quickly generate a short slug for that long url.

Fast forward to MongoDB, the popular NoSQL database doesn’t have any equivalent to sequential IDs. It’s true that you can insert anything unique as the required _id field of a mongodb document, so you can take things to your hand and try to insert unique ids yourselves. But you have to ensure the uniqueness and atomicity of the operation.

A very popular work around to this is to create a separate mongodb collection. Then maintain documents with a numeric value to keep track of your auto incrementing IDs. Now, every time we want to insert a new document that needs a unique ID, we come back to this collection, use the $inc operator to atomically increment this number and then use the incremented number as the unique id for our new document.

Let me give an example, say we have an messages collection. Each new message needs a new, sequential ID. We create a new collection named sequences. Each document in this sequences collection will hold the last used ID for a collection. So, for tracking the unique ID in the messages collection, we create a new document in the sequences collection like this:

Next, we will write a function that can give us the next sequential ID for a collection by it’s name. The code is in Python, using PyMongo library.

If we need the next auto incrementing ID for the messages collection, we can call it like this:

Find and Modify – Deprecated

If you have searched on Google, you might have come across many StackOverflow answers as well as individual blog posts which refer to findAndModify() call (find_and_modify in Pymongo). This was the way to do things. But it’s deprecated now, so please use the new find_one_and_update function now.

(How) Does this scale?

We would only call the get_sequence function before inserting a new mongo document. The function uses the $inc operator which is atomic in nature. Mongo guarantees this. So even if 100s of different clients trying to increment the value for the same document, they will be all applied one after one. So each value they get will be unique, new IDs.

I personally haven’t been able to test this strategy at a larger scale but according to people on StackOverflow and other forums, people have scaled this to thousands and millions of users. So I guess it’s pretty safe.

Categories
General

Understanding JWT (JSON Web Tokens)

In the end of our last post (which was about Securing REST APIs) we mentioned about JWT. I made a promise that in the next post, we would discuss more about JWT and how we can secure our REST APIs using it. However, when I started drafting the post and writing the code, I realized the underlying concepts of JWT themselves deserve a dedicated blog post. So in this blog post, we will focus solely on JWT and how it works.

What is JWT?

We will ignore the text book definitions and try to explain the concepts in our own words. Don’t be afraid of the serious looking acronym, the concepts are rather simple to understand and comprehend. First let’s break down the term – “JSON Web Tokens”, so it has to do something with JSON, the web and of course tokens. Right? Let’s see.

Yes, a JWT mostly concerns with a Token that is actually a hashed / signed form of a JSON payload. The JSON payload is signed using a hashing algorithm along with a secret to produce a single (slightly long) string that works as a token. So a JWT is basically a string / token generated by processing a JSON payload in a certain way.

So how does JWT help? If you followed our last article, you now know why http basic auth is bad. You have to pass your username and password with every request. That is kind of bad, right? The more you send your username and password over the internet, the more likely it is to get compromised, no? Instead, on the first login, we can accept the username and password and return a token back to the client. The client passes that token with every request. We verify that token to see if it’s a logged in user or not. This is the idea behind Token based authentication.

Random Tokens  vs JWT

How would you generate such token? You could generate a nice random string and store it in database against that user. Right? This is how cookie based session works too, btw. Now what if your application is scaled across multiple servers and all requests are load balanced? One server will not recognize a token / session generated by another server. Unless of course you also have one central database active all the time, serving all the incoming requests from all the servers. That setup is tricky and difficult, no?

There is another work around using sticky sessions where the requests from one particular user is always directed to the same server by the load balancer. This work around is also not as simple as JWT. Even if all these work nicely, we still have to make database queries to validate the token / session. What if we want to provide single sign on (users from one service wants to access resources on a different service all together)? How does that work? We will need a central auth server and all services will have to talk to it to verify the user token.

The benefit of JWT is that it’s lightweight but at the same time it’s a self contained JSON payload. You can store user identity in the JSON, sign it and send the token to the clients. Since it’s signed we can verify and validate it with just our secret key. No database overhead. No need for sticky sessions. Just share the secret key privately and all your services can read the data stored inside the JWT. Others can’t tamper or forge a new, valid token for an user without that secret key. Single sign on just becomes a breeze and less complicated. Sounds good? Let’s see how JWTs are constructed.

Anatomy of JWT

A JSON Web Token consists of three major parts:

  • Header
  • Payload
  • Signature

These 3 parts are separated by dots (.). So a JWT looks like the following

If you look closely, there are 3 parts here:

  • Header: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
  • PayloadeyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9
  • SignatureTJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ

Okay, so we get the three parts, but what do they really do? Also those strings look like meaningless characters to me. What are they? How are they generated? Well, they are encoded in certain ways as we will see in the following sections.

Header

The header is a simple key value pair (dictionary / hashmap) data structure. It usually has two keys typ and alg short for type and algorithm. The standard is to have the keys at best 3 character long, so the generated token does not get too large.

Example:

The typ value is JWT since this is JWT we’re using. The HS256 is the most common and most popular hashing algorithm used with JWT.

Now we just need to base64 encode this part and we get the header string. You were wondering why the strings didn’t make sense. That’s because the data is base64 encoded.

Payload

Here comes our favorite part – the JSON payload. In this part, we put the data we want to store in the JWT. As usual, we should keep the keys and the overall structure as small as possible.

We can add any data we see fit. These fields / keys are called “claims”. There are some reserved claims – keys which can be interpreted in a certain way by the libraries which decode the JWT. For example, if we pass the exp  (expiry) claim with a timestamp, the decoding library will check this value and throw an exception if the time has passed (the token has expired). These can often be helpful in many cases. You can find the common standard fields on Wikipedia.

As usual, we base64 encode the payload to get the payload string.

Signature

The signature part itself is a hashed string. We concatenate the header and the payload strings (base 64 encoded header and payload) with a dot (.) between them. Then we use the hashing algorithm to hash this string with our secret key.

In pseudocode:

That would give us the last part of the JWT, the signature.

Glue it all together

As we discussed before, the JWT is the dot separated form of the three components. So the final JWT would be:

 Using a library

Hey! JSON Web Tokens sounded great but looks like there’s a lot of work involved! Well, it would seem that way since we tried to understand how a JSON Web Token is actually constructed. In our day to day use cases, we would just use a suitable library for the language / platform of our choice and be done with it.

If you are wondering what library you can use with your language / platform, here’s a comprehensive list of libraries – JSON Web Token Libraries.

Real Life Example with PyJWT

Enough talk, time to see some codes. Excited? Let’s go!

We will be using Python with the excellent PyJWT package to encode and decode our JSON Web Tokens in this example. Before we can use the library, we have to install it first. Let’s do that using pip.

Now we can start generating our tokens. Here’s an example code snippet:

If we run the code, we will see:

So it worked – we encoded a payload and then decoded it back. All we needed to do is to call jwt.encode and jwt.decode with our secret key and the payload / token. So simple, no? parties.

Bonus Example – Expiry

In the following example, we will set the expiry to only 2 seconds. Then we will wait 10 seconds (so the token expires by then) and try to decode the token.

What happens after we run it? This happens:

Cool, so we get an error mentioning that the signature has expired by now. This is because we used the standard exp claim and our library knew how to process it. This is how we use the standard claims to ease our job!

Using JWT for REST API Authentication

Now that we’re all convinced of the good sides of JSON Web Tokens, the question comes into mind – how can we use it in our REST APIs?

The idea is simple and straightforward. When the user logs in the first time, we verify his/her credentials and generate a JSON Web Token with necessary details. Then we return this token back to the user/client. The client will now send the token with every request, as part of the authorization header.

The server will decode this token and read the user data. It won’t have to access the database or contact another auth server to verify the user details, it’s all inside the decoded payload.  And since the token is signed and the secret key is “secret”, we can trust the payload.

But please make sure the secret key is not compromised. And of course use SSL (https) so that men in the middle can not hijack the token anyway.

What’s next?

JSON Web Token is not only about authentication. You can use it to securely transmit data from one party to another. However, it’s mostly used for authenticating REST APIs. In our next blog post, we shall go through that use case. We will see how we can authenticate our api using JWT.

In the mean time, you can subscribe to the mailing list so you can stay up to date with this blog. If you liked the article and/or learned something new, please don’t forget to share it with your friends.

Categories
General

REST APIs: The concepts and applications

If you are into web programming you might have come across the terms “REST API” or “RESTful” and you might have felt curious as to what they really are. In this post, we shall take our time to go through some of the concepts / ideas behind REST and it’s applications.

The REST-less Developer

You might have noticed that the technology world is changing so fast. Back in the days, we used to love using Desktop applications. For example, we would probably want to check our emails on Outlook Express or Thunderbird then. Soon the web became popular and web applications started becoming extremely popular for the conveniences they offered. We now love Gmail, don’t we? But then we realized webmails are great and all but that’s not good enough. I want my emails on my phone and tablets. I for example, like to check my emails on my phone – I do that a lot because I can not be in front of a laptop all day. If you think carefully, in a few more years, we would want those emails on our wrist watches.

Now if you were the developer of such a popular email service, how would you serve these different types of devices? For example, the webmail can use HTML fine, mobile phones can browse HTML too but what about desktop and the smart watch apps? Also on phone, people would like native apps more than mobile web apps. So how to feed them data? Don’t be RESTless, some REST always helps! 😉

The RESTful Web

The good old web was working very well for us, for a time being. But the hypertext / HTML is not the native tongue for many devices that we have to accommodate into our services. Also the web is no longer about just “documents”, it is so much more. The REST architecture can shape the modern web in a way that provides uniform access to all our clients (devices). The core idea behind REST is that the client does not need to know anything before hand, it will connect to a server and the server will provide the client with available options via an agreed upon medium. For the web the medium is “HTML”.

For example, when your browser connected to this website, it didn’t know anything beforehand. The server served it a HTML page that has links to various posts and the search form to search content. By reading the source code (HTML), the client (browser) now knows what actions are available. Now, if you click a link or enter a search keyword and perform search, the browser would perform that action. But it has no idea what would happen next. When it performs the action, the server supplies new html telling it what it can do next. So the server is supplying information that the client can use to further interact with the server.

Hypertext or Hypermedia

But hey, not every device can understand HTML, no? Yes, you are absolutely right. That is why the server is no way just confined to HTML. It can provide other responses too, for example XML and JSON are also valid (and two popular) medium of communication. This is why when we describe REST, we usually say “hypermedia” instead of “hypertext”.

The principle that the client does not need to know anything before hand and the server dynamically generates hypermedia responses through which the client interacts with the server – this principle is aptly named “Hypermedia as the engine of application state” aka “HATEOAS“. That is one big name but if you read and think about it, it makes perfect sense. In this principle, the hypermedia generated by the server works as the “engine” of the application’s state. Cool, eh? HATEOAS is a key driving principle of the RESTful web but there’s more. Are you ready to dive in?

Fitting REST into HTTP and APIs

We now understand that in a REST like architecture, there will be a client and a server. The server will provide dynamically generated hypermedia on which the client will act upon. It all makes sense but how do we make our web APIs RESTful?

The idea of communicating over HTTP very often involves Verbs and Resources. Did you notice how very often the same URL can output different responses depending on which http method (GET or POST) we used? The URL can be considered as a resource and the http methods are the verbs. There’s more to just GET and POST. There are PUT, PATCH, DELETE etc.

The purpose/intent of the common http verbs are:

  • GET: The purpose is to literally get the data.
  • POST: This method translates to “create“.
  • PUT / PATCH: We use these methods to update data.
  • DELETE: Come on, do I even need to explain what this one does? 😀

Now while building our APIs, we can map these verbs to our resources. For example, we have User resources. You can access it on http://api.example.com/user. Now when someone makes a GET request, we can send them a list of available users. But when they send new user data via POST, we create a new user. What if they want to view / update / delete a single user instance?

Resources: Collections vs Elements

We can broadly classify the resources into two categories – “collections” and “elements” and apply the http verbs to them. Now we have two different kinds of resources – “user collection” as a whole and “individual users”. How do we map the different http verbs to them? Wikipedia has a nice chart for us.

For Collections: (/user)

  • GET – List all users
  • POST – Create a new user
  • PUT – Replace all users with these new users
  • DELETE – Delete all users

For Elements: (/user/123)

  • GET – Retrieve data about user with ID 123
  • POST – Generally not used and throws errors but can be used if the resource itself is a nested collection. In that case creates new element within that collection.
  • PUT – Replace the user data
  • DELETE – Delete the user

Is this Official?

Everything makes sense and sounds good. So I guess everyone on the web follows this standard? Well, no. Technically, REST is an architecture or architectural style/pattern. It is not a protocol or a standard itself (although it does use other standards like XML or JSON). The sad fact is that nobody has to follow the principles. This is why we often would come across APIs which would not adhere to these principles and design things their own way. And that is kind of alright given REST is not engraved in a holy stone.

But should we follow the principle? Of course we should, we want to be the good citizens of the RESTful web, don’t we?

How can I consume  / create REST APIs?

You do have a curious mind, don’t you? What good is our knowledge of REST if we are not using it ourselves? Of course we shall. This blog post just scratches the surface of it. There is so much more to learn about REST and we shall learn those things in time. I have plans to write detailed posts on consuming and creating REST APIs in the future. You just have to stay in touch! If you haven’t yet, it would be a good idea to subscribe to the mailing list and I will let you know when I write the next piece, deal?

If you have any feedback on this post, please feel free to let me know in the comments. I hope you liked reading it. Also don’t forget to tell your friends about the wonderful world of REST, will you? 🙂