
Understanding JWT (JSON Web Tokens)

At the end of our last post (which was about securing REST APIs), we mentioned JWT. I promised that in the next post we would discuss JWT in more detail and show how we can secure our REST APIs using it. However, when I started drafting the post and writing the code, I realized the underlying concepts of JWT deserve a dedicated blog post. So in this blog post, we will focus solely on JWT and how it works.

What is JWT?

We will ignore the textbook definitions and try to explain the concepts in our own words. Don’t be afraid of the serious looking acronym; the concepts are rather simple to understand. First let’s break down the term – “JSON Web Tokens”. So it has something to do with JSON, the web and of course tokens. Right? Let’s see.

Yes, a JWT is essentially a token built from a JSON payload. The payload is encoded and then signed using a hashing algorithm along with a secret, producing a single (slightly long) string that works as a token. So a JWT is basically a string / token generated by processing a JSON payload in a certain way.

So how does JWT help? If you followed our last article, you now know why HTTP basic auth is bad. You have to pass your username and password with every request. That is kind of bad, right? The more often you send your username and password over the internet, the more likely they are to get compromised, no? Instead, on the first login, we can accept the username and password and return a token to the client. The client passes that token with every request. We verify that token to see whether it belongs to a logged-in user. This is the idea behind token based authentication.

Random Tokens vs JWT

How would you generate such a token? You could generate a nice random string and store it in a database against that user. Right? This is how cookie based sessions work too, btw. Now what if your application is scaled across multiple servers and all requests are load balanced? One server will not recognize a token / session generated by another server. Unless of course you also have one central database active all the time, serving all the incoming requests from all the servers. That setup is tricky and difficult, no?

There is another workaround using sticky sessions, where the requests from one particular user are always directed to the same server by the load balancer. This workaround is also not as simple as JWT. Even if all of this works nicely, we still have to make database queries to validate the token / session. What if we want to provide single sign on (users from one service want to access resources on a different service altogether)? How does that work? We will need a central auth server and all services will have to talk to it to verify the user token.

The benefit of JWT is that it’s lightweight but at the same time it’s a self contained JSON payload. You can store user identity in the JSON, sign it and send the token to the clients. Since it’s signed, we can verify and validate it with just our secret key. No database overhead. No need for sticky sessions. Just share the secret key privately and all your services can verify the token and trust the data stored inside it. Others can’t tamper with a token or forge a new, valid token for a user without that secret key. Single sign on becomes much less complicated. Sounds good? Let’s see how JWTs are constructed.

Anatomy of JWT

A JSON Web Token consists of three major parts:

  • Header
  • Payload
  • Signature

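These 3 parts are separated by dots (.). So a JWT looks like the following:

eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9.TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ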

If you look closely, there are 3 parts here:

  • Header: eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9
  • Payload: eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9
  • Signature: TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ

Okay, so we get the three parts, but what do they really do? Also those strings look like meaningless characters to me. What are they? How are they generated? Well, they are encoded in certain ways as we will see in the following sections.
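To take some of the mystery out of those strings, here is a small sketch that splits the sample token and decodes its first two parts using only the Python standard library. The helper name b64url_decode is our own, not part of any JWT library. JWT strips the trailing "=" padding from each part, so the helper adds it back before decoding.

```python
import base64
import json

# The sample token, assembled from the three parts listed above.
TOKEN = (
    "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
    ".eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9"
    ".TJVA95OrM7E2cBab30RMHrHDcEfxjoYZgeFONFh7HgQ"
)


def b64url_decode(segment: str) -> bytes:
    """Decode a URL-safe base64 segment, restoring the stripped padding."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


header_b64, payload_b64, signature_b64 = TOKEN.split(".")

header = json.loads(b64url_decode(header_b64))    # plain JSON once decoded
payload = json.loads(b64url_decode(payload_b64))  # plain JSON once decoded
signature = b64url_decode(signature_b64)          # raw bytes, not JSON

print(header)
print(payload)
print(len(signature), "signature bytes")
```

Note that no secret was needed to read the header and the payload. They are only encoded, not encrypted, so a JWT should not carry data that the client must not see.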

Header

The header is a simple key-value (dictionary / hashmap) data structure. It usually has two keys, typ and alg, short for type and algorithm. The convention is to keep the keys at most 3 characters long, so the generated token does not get too large.

Example:

{"alg": "HS256", "typ": "JWT"}

The typ value is JWT since this is JWT we’re using. HS256 (HMAC with SHA-256) is one of the most commonly used signing algorithms with JWT.

Now we just need to base64 encode this part (JWT uses the URL-safe variant of base64, without the trailing padding) and we get the header string. You were wondering why the strings didn’t make sense. That’s because the data is base64 encoded.

Payload

Here comes our favorite part – the JSON payload. In this part, we put the data we want to store in the JWT. As usual, we should keep the keys and the overall structure as small as possible.

We can add any data we see fit. These fields / keys are called “claims”. There are some reserved (registered) claims – keys which can be interpreted in a certain way by the libraries which decode the JWT. For example, if we pass the exp (expiration time) claim with a timestamp, the decoding library will check this value and throw an exception if the time has passed (the token has expired). These standard claims save us from writing such checks ourselves. You can find the common standard fields on Wikipedia.

As usual, we base64 encode the payload to get the payload string.

Signature

The signature part itself is a hashed string. We concatenate the header and the payload strings (base64 encoded header and payload) with a dot (.) between them. Then we use the hashing algorithm (HMAC with SHA-256 in the case of HS256) to hash this string with our secret key, and base64 encode the result.

In pseudocode:

signature = HMACSHA256(base64UrlEncode(header) + "." + base64UrlEncode(payload), secret)

That would give us the last part of the JWT, the signature.
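The signing step can be sketched in a few lines of Python using the standard hmac and hashlib modules. The header and payload strings are the ones from the sample token above. The secret is a made-up value for illustration, so the resulting signature is not expected to match the sample's. The function names sign and verify are our own.

```python
import base64
import hashlib
import hmac

# Encoded header and payload from the sample token.
header_b64 = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"
payload_b64 = "eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiYWRtaW4iOnRydWV9"

# Hypothetical secret, for illustration only.
SECRET = b"a-hypothetical-secret-key"


def sign(signing_input: str, secret: bytes) -> str:
    """Return the base64url encoded HMAC-SHA256 of the signing input."""
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def verify(signing_input: str, signature: str, secret: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    return hmac.compare_digest(sign(signing_input, secret), signature)


signing_input = header_b64 + "." + payload_b64
signature = sign(signing_input, SECRET)

print(signature)
print(verify(signing_input, signature, SECRET))         # same key, same input
print(verify(signing_input, signature, b"another-key"))  # wrong key
```

Verification is simply signing again and comparing. Anyone who changes the header or the payload without knowing the secret cannot produce a matching signature.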

Glue it all together

As we discussed before, the JWT is the dot separated form of the three components. So the final JWT would be:

encoded header + "." + encoded payload + "." + signature
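Putting the three steps together, a hand-rolled token builder might look like the sketch below. It only illustrates the construction and is not a replacement for a real library. The function names and the secret are hypothetical. The JSON is serialized compactly (no spaces) because the exact bytes matter for both the encoding and the signature.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """URL-safe base64 without the trailing '=' padding, as JWT uses."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def make_jwt(payload: dict, secret: bytes) -> str:
    """Build an HS256 signed token from a payload dictionary."""
    header = {"alg": "HS256", "typ": "JWT"}
    segments = [
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    ]
    signing_input = ".".join(segments)
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return signing_input + "." + b64url(digest)


# Same payload as the sample token; the secret is a made-up value.
token = make_jwt(
    {"sub": "1234567890", "name": "John Doe", "admin": True},
    b"a-hypothetical-secret-key",
)
print(token)
```

With this payload and key order, the first two parts of the printed token are identical to the header and payload strings of the sample token shown earlier. Only the signature depends on the secret.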

Using a library

Hey! JSON Web Tokens sounded great but looks like there’s a lot of work involved! Well, it would seem that way since we tried to understand how a JSON Web Token is actually constructed. In our day to day use cases, we would just use a suitable library for the language / platform of our choice and be done with it.

If you are wondering what library you can use with your language / platform, here’s a comprehensive list of libraries – JSON Web Token Libraries.

Real Life Example with PyJWT

Enough talk, time to see some code. Excited? Let’s go!

We will be using Python with the excellent PyJWT package to encode and decode our JSON Web Tokens in this example. Before we can use the library, we have to install it first. Let’s do that using pip.

pip install PyJWT

Now we can start generating our tokens. Here’s an example code snippet:

If we run the code, we will see:

So it worked – we encoded a payload and then decoded it back. All we needed to do is to call jwt.encode and jwt.decode with our secret key and the payload / token. So simple, no?
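The two calls described above can be sketched as follows. This assumes PyJWT 2.x, where jwt.encode returns a str (older 1.x releases returned bytes). The secret and the payload are made-up values, not the ones from the original snippet.

```python
import jwt  # provided by the PyJWT package

# Hypothetical secret and payload, for illustration only.
SECRET = "a-hypothetical-secret-of-at-least-32-bytes"
payload = {"user_id": 42, "name": "Jane"}

# Sign the payload and get the compact token string.
token = jwt.encode(payload, SECRET, algorithm="HS256")
print(token)

# Verify the signature and get the payload back.
# Passing the list of accepted algorithms is required in PyJWT 2.x.
decoded = jwt.decode(token, SECRET, algorithms=["HS256"])
print(decoded)

# Decoding with a different key fails the signature check.
try:
    jwt.decode(token, "some-other-key-that-is-also-32-bytes-long", algorithms=["HS256"])
    wrong_key_accepted = True
except jwt.InvalidSignatureError:
    wrong_key_accepted = False
print("wrong key accepted:", wrong_key_accepted)
```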

Bonus Example – Expiry

In the following example, we will set the expiry to only 2 seconds. Then we will wait 10 seconds (so the token expires by then) and try to decode the token.

What happens after we run it? This happens:

Cool, so we get an error mentioning that the signature has expired by now. This is because we used the standard exp claim and our library knew how to process it. This is how we use the standard claims to ease our job!
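A variation on that experiment which does not need any waiting: instead of sleeping past a short expiry, the sketch below issues one token whose exp is already in the past and one that is still valid. It assumes PyJWT 2.x. The secret and the user_id claim are made-up values.

```python
import datetime

import jwt  # provided by the PyJWT package

SECRET = "a-hypothetical-secret-of-at-least-32-bytes"
now = datetime.datetime.now(datetime.timezone.utc)

# One token that expired 10 seconds ago and one that is valid for an hour.
expired_token = jwt.encode(
    {"user_id": 42, "exp": now - datetime.timedelta(seconds=10)},
    SECRET,
    algorithm="HS256",
)
fresh_token = jwt.encode(
    {"user_id": 42, "exp": now + datetime.timedelta(hours=1)},
    SECRET,
    algorithm="HS256",
)

# The library checks the exp claim for us while decoding.
try:
    jwt.decode(expired_token, SECRET, algorithms=["HS256"])
    outcome = "accepted"
except jwt.ExpiredSignatureError as error:
    outcome = "rejected"
    print("Expired token:", error)

claims = jwt.decode(fresh_token, SECRET, algorithms=["HS256"])
print("Fresh token belongs to user", claims["user_id"])
```

Catching jwt.ExpiredSignatureError separately from other decoding errors lets an API tell the client to log in again, rather than treating the token as forged.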

Using JWT for REST API Authentication

Now that we’re all convinced of the good sides of JSON Web Tokens, the question comes into mind – how can we use it in our REST APIs?

The idea is simple and straightforward. When the user logs in the first time, we verify their credentials and generate a JSON Web Token with the necessary details. Then we return this token to the user / client. The client will now send the token with every request, as part of the Authorization header.

The server will decode this token and read the user data. It won’t have to access the database or contact another auth server to verify the user details, because it’s all inside the decoded payload. And since the token is signed and the secret key is “secret”, we can trust the payload.
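On the server side, the first step is pulling the token out of the header. A common convention (assumed here, the text above does not name a scheme) is to send it as "Authorization: Bearer <token>". The helper below is a hypothetical sketch of that extraction using only the standard library. The string it returns is what would then be handed to the JWT library's decode function.

```python
from typing import Optional


def extract_bearer_token(authorization_header: Optional[str]) -> Optional[str]:
    """Return the token from an 'Authorization: Bearer <token>' header value.

    Returns None when the header is missing, uses another scheme,
    or carries no token.
    """
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(" ")
    token = token.strip()
    if scheme.lower() != "bearer" or not token:
        return None
    return token


print(extract_bearer_token("Bearer abc.def.ghi"))
print(extract_bearer_token("Basic YWRtaW46c2VjcmV0"))
print(extract_bearer_token(None))
```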

But please make sure the secret key is not compromised. And of course use SSL (https) so that a man-in-the-middle attacker cannot intercept the token.

What’s next?

JSON Web Token is not only about authentication. You can also use it to pass data from one party to another in a way the receiver can verify. However, it’s mostly used for authenticating REST APIs. In our next blog post, we shall go through that use case. We will see how we can authenticate our API using JWT.

In the meantime, you can subscribe to the mailing list so you can stay up to date with this blog. If you liked the article and/or learned something new, please don’t forget to share it with your friends.


Securing REST APIs: Basic HTTP Authentication with Python / Flask

In our last tutorial on REST API Best Practices, we designed and implemented a very simple RESTful mailing list API. However, our API (and the data) was open to the public: anyone could read / add / delete subscribers from our mailing list. In serious projects, we definitely do not want that to happen. In this post, we will discuss how we can use HTTP basic auth to authenticate our users and secure our APIs.

PS: If you are new to REST APIs, please check out REST APIs: Concepts and Applications to understand the fundamentals.

Setup API and Private Resource

Before we can move on to authentication, we first need to create some resources which we want to secure. For demonstration purposes, we will keep things simple. We will have a very simple endpoint like below:

If we launch the server and access the endpoint, we will get the expected output:

Our API is for now public. Anyone can access it. Let’s secure it so it’s no longer publicly accessible.

Basic HTTP Authentication

The idea of Basic HTTP Authentication is pretty simple. When we request a protected resource without credentials, the server responds with a 401 status and a header that looks something like this: WWW-Authenticate: Basic realm="Authentication Required". Generally, when we try to access such a resource from a browser, the browser shows us a prompt to enter a username and password. The browser then base64 encodes the username and password (joined by a colon) and sends them back in an Authorization header. The server parses the data and verifies the user. If the user is legit, the resource is accessible; otherwise we are not granted permission to access it.
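To make the encoding step concrete, here is a small sketch of how the Authorization header value is built and parsed, using only the standard library. The function names and the admin / secret credentials are made up for illustration.

```python
import base64
from typing import Tuple


def build_basic_auth_header(username: str, password: str) -> str:
    """Build the value of the Authorization header for basic auth."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


def parse_basic_auth_header(value: str) -> Tuple[str, str]:
    """Recover (username, password) from a basic auth header value."""
    scheme, _, encoded = value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a basic auth header")
    decoded = base64.b64decode(encoded).decode("utf-8")
    # Only the first colon separates the two; passwords may contain colons.
    username, _, password = decoded.partition(":")
    return username, password


header_value = build_basic_auth_header("admin", "secret")
print(header_value)                           # Basic YWRtaW46c2VjcmV0
print(parse_basic_auth_header(header_value))  # ('admin', 'secret')
```

The parsing side needs no key at all. Base64 is an encoding, not encryption, so anyone who can see the header can recover the password. This is why basic auth should only ever travel over HTTPS.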

While using a REST client, we would very often need to pass the credentials beforehand, while we make the request. For example, if we’re using curl, we need to pass the --user option while running the command.

Basic HTTP Authentication is a very old method but quite easy to set up. Flask-HTTPAuth is a nice extension that will help us with that.

Install Dependencies

Before we can start writing code, we need to have the necessary packages installed. We can install the package using pip:

pip install Flask-HTTPAuth

Once the package is installed, we can use it to add authentication to our API endpoints.

Require Login

We will import the HTTPBasicAuth class and create a new instance named auth. It’s important to note that name because we will be using methods on this auth instance as decorators for various purposes. For example, we will use the @auth.login_required decorator to make sure only logged-in users can access the resource.

In our resource, we added the above mentioned decorator to our get method. So if anyone wants to GET that resource, they need to log in first. The code looks like this:

If we try to access the resource without logging in, we will get an error telling us we’re not authorized. Let’s send a quick request using curl.

So it worked. Our API endpoint is no longer public. We need to log in before we can access it. And from the API developer’s perspective, we need to let the users log in before they can access our API. How do we do that?

Handling User Logins

We would generally store our users in a database. Well, a secured database. And of course, we would never store user passwords in plain text. But for this tutorial, we will store the user credentials in a dictionary. The password will be in plain text.

Flask-HTTPAuth will handle the authentication process for us. We just need to tell it how to verify the user with their username and password. The @auth.verify_password decorator can be used to register a function that will receive the username and password. This function will verify whether the credentials are correct, and based on its return value, the extension will handle the user auth.

In the following code snippet, we register the verify function as the callback for verifying user credentials. When the user passes the credentials, this function will be called. If the function returns True, the user will be accepted as authorized. If it returns False, the user will be rejected. We have kept our data in the USER_DATA dictionary.

Once we have added the above code, we can now test if the auth works.

But if we omit the auth credentials, does it work?

It doesn’t work without the login. Perfect! We now have a secured API endpoint that uses basic HTTP auth. But in all seriousness, it’s not recommended. That’s right, do not use it on the public internet. It’s perhaps okay to use inside a private network. Why? Please read this thread.

Wrapping Up

With the changes made, here’s the full code for this tutorial:

As discussed in the last section, it’s not recommended to use basic HTTP authentication in open / public systems. However, it is good to know how HTTP basic auth works, and its simplicity helps beginners grasp the concept of authentication / API security quite easily.

You might be wondering – “If we don’t use HTTP auth, then what do we use instead to secure our REST APIs?”. In our next tutorial on REST APIs, we will demonstrate how we can use JSON Web Tokens aka JWT to secure our APIs. Can’t wait that long? Go ahead and read the introduction.

And don’t forget to subscribe to the mailing list so when I write the next post, you get a notification!


Golang / The Go Programming Language

The Go programming language (often written as Golang) has become quite popular recently. Google is actively backing the project, but Golang has seen usage, contributions and success stories from many other popular brands and enterprises across the internet. Go promises a very simple, easy to learn syntax that allows us to build robust, reliable, efficient software. Once we have invested some time writing production grade code in the language, we would agree that Go delivers on its promises. It’s indeed a fantastic language – easy to learn, easy to read, reason about and of course maintain. You get superb performance without sacrificing much on productivity. Don’t just trust my words, give it a Go! (See what I did there?)

A little history

Work on Golang was started back in 2007 by Robert Griesemer, Rob Pike, and Ken Thompson at Google. So that kind of makes Golang 10 years old in 2017. Although work started in 2007, the language was not announced until 2009. It reached version 1.0 in 2012.

A little about the creators – Rob Pike was a member of the Unix team and he is known for his work on Plan 9. Ken Thompson designed and implemented Unix. He also created the B language (the predecessor of C). He was also involved in the Plan 9 project.

Why did they start working on a new language? Because they were frustrated with the ones that existed. You could choose a dynamic language like Python / JavaScript and enjoy the ease of programming in them. Or you could choose something like C/C++ to get performance, but then you lose the ease of programming, productivity drops and the compilation time can sometimes get too long. More and more developers were choosing dynamic languages in their projects, essentially favouring the ease of use over the safety and performance offered by the statically typed, compiled languages. There wasn’t any popular, easy to use, mainstream language that could ease these problems for the developers. You couldn’t get ease of programming, safety, efficiency, fast compilation – all from a single language.

The creators of Golang saw this problem as an opportunity to create a better language that could solve these problems at hand.

Why did Golang become so popular?

Go came with better offerings, especially with solutions to problems many of us (including the Googlers) faced. When I read (or write) Go code, I feel like the following equation makes perfect sense => C + Python = Golang. Go is very fast. Not just the language but also the compiler. Go compiles fast, runs fast. And you still feel quite productive, much more productive than you would feel in C++ or Java.

The syntax is simple. You won’t have to remember many keywords. The static typing provides safety to a great extent. IDEs can provide better code completion and refactoring assistance. The compiler helps you reduce bugs by catching many potential errors before the program even starts to run (which is applicable to all statically typed languages, nothing Golang specific, but with Go’s “light on keywords” design, it’s just more productive and enjoyable).

Go provides a nice, extensive standard library with all the batteries you might need for day to day system or network programming. You want to build an awesome web app? Go standard library has you (mostly) covered.

The major win for Go is perhaps the concurrency primitives. We can create very lightweight threads called goroutines, which are multiplexed across all available CPU cores. We can easily communicate between the goroutines using channels. I personally found the goroutine and channel based way of writing efficient concurrent programs very easy, elegant and pleasant. Don’t be afraid of writing highly concurrent programs any more!

Golang also compiles everything and generates a single binary that includes everything that you need to run the program. You do not need to have anything installed on your target machine to run the binary. This is a huge win in terms of deployment. Writing and distributing command line tools has never been easier!

How popular is Golang?

It’s quite popular. Golang is currently ranked 16th on the TIOBE Index. TIOBE also declared Go the language of the year in 2016. Interestingly, they also chose Go as the language of the year back in 2009, the year it was released. It stands 15th in the RedMonk language rankings. If you compare active repositories using GitHut, Go is the 14th most active language on GitHub. For a new programming language that released its first major version in 2012, that ranking is quite impressive. Judging by its growth, the language has been gaining a lot of traction.

In the latest StackOverflow Survey (2017), Go is 3rd on the most wanted list and 5th on the most loved list. That tells us something, doesn’t it? 😀

Gopher with a Gun!

How’s the job market, you ask? Don’t worry, Go has jobs for you!

Also check out the HackerNews “Who is hiring?” threads. I see a lot of Golang related job postings there. And do not forget, Golang pays well.

I want to learn Golang

I am sure you do. So where can we learn Golang by ourselves? These are my recommendations:

  • The Official Tour is excellent. I recommend it to all beginners. You will have a very good grasp of the basics once you complete the tour. The tour is also available offline in case you want to learn some Golang while on vacation or on the go.
  • Go By Example is another excellent step by step tour. I always refer to this site when I need to lookup a syntax or need a quick reference.
  • Effective Go is a nice read if you want to learn the best practices and want to write idiomatic Go code.
  • The Little Go Book is a great free book for learning Go.
  • Official Documentation is something we all should keep close to us while writing some Go code.
  • The Go Blog – This is something I really love. The official Go blog has many interesting and detailed posts. Trust me, there’s a lot to learn from there. Highly recommended.

If you need some help, please feel free to ask in Gophers Slack or the Go Forum maintained by the GoBridge people.