Categories
General Python

Auto incrementing IDs for MongoDB

If you’re familiar with relational databases like MySQL or PostgreSQL, you’re probably also familiar with auto incrementing IDs. You select a primary key for a table and make it auto incrementing. Every row you insert afterwards, each of them gets a new ID, automatically incremented from the last one. We don’t have to keep track of what number comes next or ensure the atomic nature of this operation (what happens if two different client wants to insert a new row at the very same time? do they both get the same id?). This can be very useful where sequential, numeric IDs are essential. For example, let’s say we’re building a url shortener. We can base62 encode the ID of the url id to quickly generate a short slug for that long url.

Fast forward to MongoDB, the popular NoSQL database doesn’t have any equivalent to sequential IDs. It’s true that you can insert anything unique as the required _id field of a mongodb document, so you can take things to your hand and try to insert unique ids yourselves. But you have to ensure the uniqueness and atomicity of the operation.

A very popular work around to this is to create a separate mongodb collection. Then maintain documents with a numeric value to keep track of your auto incrementing IDs. Now, every time we want to insert a new document that needs a unique ID, we come back to this collection, use the $inc operator to atomically increment this number and then use the incremented number as the unique id for our new document.

Let me give an example, say we have an messages collection. Each new message needs a new, sequential ID. We create a new collection named sequences. Each document in this sequences collection will hold the last used ID for a collection. So, for tracking the unique ID in the messages collection, we create a new document in the sequences collection like this:

{
    "_id" : "messages",
    "value" : 0
}

Next, we will write a function that can give us the next sequential ID for a collection by it’s name. The code is in Python, using PyMongo library.

def get_sequence(name):
    collection = db.sequences
    document = collection.find_one_and_update({"_id": name}, {"$inc": {"value": 1}}, return_document=True)

    return document["value"]

If we need the next auto incrementing ID for the messages collection, we can call it like this:

{"_id": get_sequence("messages")}
Find and Modify – Deprecated

If you have searched on Google, you might have come across many StackOverflow answers as well as individual blog posts which refer to findAndModify() call (find_and_modify in Pymongo). This was the way to do things. But it’s deprecated now, so please use the new find_one_and_update function now.

(How) Does this scale?

We would only call the get_sequence function before inserting a new mongo document. The function uses the $inc operator which is atomic in nature. Mongo guarantees this. So even if 100s of different clients trying to increment the value for the same document, they will be all applied one after one. So each value they get will be unique, new IDs.

I personally haven’t been able to test this strategy at a larger scale but according to people on StackOverflow and other forums, people have scaled this to thousands and millions of users. So I guess it’s pretty safe.

Categories
Python

API Star: Python 3 API Framework

For building quick APIs in Python, I have mostly depended on Flask. Recently I came across a new API framework for Python 3 named “API Star” which seemed really interesting to me for several reasons. Firstly the framework embraces modern Python features like type hints and asyncio. And then it goes ahead and uses these features to provide awesome development experience for us, the developers. We will get into those features soon but before we begin, I would like to thank Tom Christie for all the work he has put into Django REST Framework and now API Star.

Now back to API Star – I feel very productive in the framework. I can choose to write async codes based on asyncio or I can choose a traditional backend like WSGI. It comes with a command line tool – apistar to help us get things done faster. There’s (optional) support for both Django ORM and SQLAlchemy.  There’s a brilliant type system that enables us to define constraints on our input and output and from these, API Star can auto generate api schemas (and docs), provide validation and serialization feature and a lot more. Although API Star is heavily focused on building APIs, you can also build web applications on top of it fairly easily. All these might not make proper sense until we build something all by ourselves.

Getting Started

We will start by installing API Star. It would be a good idea to create a virtual environment for this exercise. If you don’t know how to create a virtualenv, don’t worry and go ahead.

pip install apistar

If you’re not using a virtual environment or the pip command for your Python 3 is called pip3, then please use pip3 install apistar instead.

Once we have the package installed, we should have access to the apistar command line tool. We can create a new project with it. Let’s create a new project in our current directory.

apistar new .

Now we should have two files created – app.py – which contains the main application and then test.py for our tests. Let’s examine our app.py file:

from apistar import Include, Route
from apistar.frameworks.wsgi import WSGIApp as App
from apistar.handlers import docs_urls, static_urls


def welcome(name=None):
    if name is None:
        return {'message': 'Welcome to API Star!'}
    return {'message': 'Welcome to API Star, %s!' % name}


routes = [
    Route('/', 'GET', welcome),
    Include('/docs', docs_urls),
    Include('/static', static_urls)
]

app = App(routes=routes)


if __name__ == '__main__':
    app.main()

Before we dive into the code, let’s run the app and see if it works. If we navigate to http://127.0.0.1:8080/ we will get this following response:

{"message": "Welcome to API Star!"}

And if we navigate to: http://127.0.0.1:8080/?name=masnun

{"message": "Welcome to API Star, masnun!"}

Similarly if we navigate to: http://127.0.0.1:8080/docs/, we will see auto generated docs for our API.

Now let’s look at the code. We have a welcome function that takes a parameter named name which has a default value of None. API Star is a smart api framework. It will try to find the name key in the url path or query string and pass it to our function. It also generates the API docs based on it. Pretty nice, no?

We then create a list of Route and Include instances and pass the list to the App instance. Route objects are used to define custom user routing. Include , as the name suggests, includes/embeds other routes under the path provided to it.

Routing

Routing is simple. When constructing the App instance, we need to pass a list as the routes argument. This list should comprise of Route or Include objects as we just saw above. For Routes, we pass a url path, http method name and the request handler callable (function or otherwise). For the Include instances, we pass a url path and a list of Routes instance.

Path Parameters

We can put a name inside curly braces to declare a url path parameter. For example /user/{user_id} defines a path where the user_id is a path parameter or a  variable which will be injected into the handler function (actually callable). Here’s a quick example:

from apistar import Route
from apistar.frameworks.wsgi import WSGIApp as App


def user_profile(user_id: int):
    return {'message': 'Your profile id is: {}'.format(user_id)}


routes = [
    Route('/user/{user_id}', 'GET', user_profile),
]

app = App(routes=routes)

if __name__ == '__main__':
    app.main()

If we visit http://127.0.0.1:8080/user/23 we will get a response like this:

{"message": "Your profile id is: 23"}

But if we try to visit http://127.0.0.1:8080/user/some_string – it will not match. Because the user_profile function we defined, we added a type hint for the user_id parameter. If it’s not integer, the path doesn’t match. But if we go ahead and delete the type hint and just use user_profile(user_id), it will match this url. This is again API Star is being smart and taking advantages of typing.

Including / Grouping Routes

Sometimes it might make sense to group certain urls together. Say we have a user module that deals with user related functionality. It might be better to group all the user related endpoints under the /user path. For example – /user/new, /user/1, /user/1/update and what not. We can easily create our handlers and routes in a separate module or package even and then include them in our own routes.

Let’s create a new module named user, the file name would be user.py. Let’s put these codes in this file:

from apistar import Route


def user_new():
    return {"message": "Create a new user"}


def user_update(user_id: int):
    return {"message": "Update user #{}".format(user_id)}


def user_profile(user_id: int):
    return {"message": "User Profile for: {}".format(user_id)}


user_routes = [
    Route("/new", "GET", user_new),
    Route("/{user_id}/update", "GET", user_update),
    Route("/{user_id}/profile", "GET", user_profile),
]

Now we can import our user_routes from within our main app file and use it like this:

from apistar import Include
from apistar.frameworks.wsgi import WSGIApp as App

from user import user_routes

routes = [
    Include("/user", user_routes)
]

app = App(routes=routes)

if __name__ == '__main__':
    app.main()

Now /user/new will delegate to user_new function.

Accessing Query String / Query Parameters

Any parameters passed in the query parameters can be injected directly into handler function. Say for the url /call?phone=1234, the handler function can define a phone parameter and it will receive the value from the query string / query parameters. If the url query string doesn’t include a value for phone, it will get None instead. We can also set a default value to the parameter like this:

def welcome(name=None):
    if name is None:
        return {'message': 'Welcome to API Star!'}
    return {'message': 'Welcome to API Star, %s!' % name}

In the above example, we set a default value to name which is None anyway.

Injecting Objects

By type hinting a request handler, we can have different objects injected into our views. Injecting request related objects can be helpful for accessing them directly from inside the handler. There are several built in objects in the http package from API Star itself. We can also use it’s type system to create our own custom objects and have them injected into our functions. API Star also does data validation based on the constraints specified.

Let’s define our own User type and have it injected in our request handler:

from apistar import Include, Route
from apistar.frameworks.wsgi import WSGIApp as App
from apistar import typesystem


class User(typesystem.Object):
    properties = {
        'name': typesystem.string(max_length=100),
        'email': typesystem.string(max_length=100),
        'age': typesystem.integer(maximum=100, minimum=18)
    }

    required = ["name", "age", "email"]


def new_user(user: User):
    return user


routes = [
    Route('/', 'POST', new_user),
]

app = App(routes=routes)

if __name__ == '__main__':
    app.main()

Now if we send this request:

curl -X POST \
  http://127.0.0.1:8080/ \
  -H 'Cache-Control: no-cache' \
  -H 'Content-Type: application/json' \
  -d '{"name": "masnun", "email": "masnun@gmail.com", "age": 12}'

Guess what happens? We get an error saying age must be equal to or greater than 18. The type system is allowing us intelligent data validation as well. If we enable the docs url, we will also get these parameters automatically documented there.

Sending a Response

If you have noticed so far, we can just pass a dictionary and it will be JSON encoded and returned by default. However, we can set the status code and any additional headers by using the Response class from apistar. Here’s a quick example:

from apistar import Route, Response
from apistar.frameworks.wsgi import WSGIApp as App


def hello():
    return Response(
        content="Hello".encode("utf-8"),
        status=200,
        headers={"X-API-Framework": "API Star"},
        content_type="text/plain"
    )


routes = [
    Route('/', 'GET', hello),
]

app = App(routes=routes)

if __name__ == '__main__':
    app.main()

It should send a plain text response along with a custom header. Please note that the content should be bytes, not string. That’s why I encoded it.

Moving On

I just walked through some of the features of API Star. There’s a lot more of cool stuff in API Star. I do recommend going through the Github Readme for learning more about different features offered by this excellent framework. I shall also try to cover short, focused tutorials on API Star in the coming days.

Categories
Python

Getting Started with Pipenv

If you’re a Python developer, you probably know about pip and the different environment management solutions like virtualenv or venv. The pip tool is currently the standard way to install a Python package. Virtualenv has been a popular way of isolating Python environments for a long time. Pipenv combines the very best of these tools and brings us the one true way to install packages while keeping the dependencies from each project isolated. It claims to have brought the very best from all other packaging worlds (the package manager for other languages / runtimes / frameworks) to the Python world. From what I have seen so far, that claim is quite valid. And it does support Windows pretty well too.

How does Pipenv work?

Pipenv works by creating and managing a virtualenv and a Pipfile for your project. When we install / remove Python packages, the changes are reflected in the Pipfile. It also generates a lock file named Pipfile.lock which is used to lock the version of dependencies and help us produce deterministic builds when needed. In a typical virtualenv setup, we usually create and manage the virtualenv ourselves. Then we activate the environment and pip just installs / uninstalls from that particular virtual environment. Packages like virtualenvwrapper helps us easily create and activate virtualenv with some of it’s handy features. But pipenv takes things further by automating a large part of that. The Pipfile would also make more sense if you have used other packaging systems like Composer, npm, bundler etc.

Getting Started

We need to start by installing pipenv globally. We can install it using pip from PyPi:

pip install pipenv

Now let’s switch to our project directory and try installing a package:

pipenv install flask

When you first run the pipenv install command, you will notice it creates a virtualenv, Pipfile and Pipfile.lock for you. Feel free to go ahead inspect their contents. If you’re using an IDE like PyCharm and want to configure your project interpreter, it would be a good idea to note down the virtualenv path.

Since we have installed Flask, let’s try and running a sample app. Here’s my super simple REST API built with Flask:

from flask import Flask, jsonify

app = Flask(__name__)


@app.route('/')
def hello_world():
    return jsonify({"message": "Hello World!"})

Assuming that you have the FLASK_APP environment variable set to app.py (which contains the above code), we can just run the app like this:

pipenv run flask run

Any executables in the current environment can be run using the pipenv run command. But I know what you might be thinking – we want to do just flask run, not use the entire, long command. That’s easy too. We just need to activate the virtualenv with this command:

pipenv shell

Now you can just do flask run or in fact run any executables in the way we’re used to doing.

Handling Dependencies

We can install and uninstall packages using the install and uninstall commands. When we install a new package, it’s added to our Pipfile and the lock file is updated as well. When we uninstall a package, the Pipfile and the lock files are again updated to reflect the change. The update command uninstalls the packages and installs them again so we have the latest updates.

If you would like to check your dependency graph, just use the graph command which will print out the dependencies in a nice format, kind of like this:

PS C:\Users\Masnun\Documents\Python\pipenvtest> pipenv graph
celery==4.1.0
  - billiard [required: >=3.5.0.2,<3.6.0, installed: 3.5.0.3]
  - kombu [required: >=4.0.2,<5.0, installed: 4.1.0]
    - amqp [required: >=2.1.4,<3.0, installed: 2.2.2]
      - vine [required: >=1.1.3, installed: 1.1.4]
  - pytz [required: >dev, installed: 2017.3]
Flask==0.12.2
  - click [required: >=2.0, installed: 6.7]
  - itsdangerous [required: >=0.21, installed: 0.24]
  - Jinja2 [required: >=2.4, installed: 2.10]
    - MarkupSafe [required: >=0.23, installed: 1.0]
  - Werkzeug [required: >=0.7, installed: 0.12.2]

Pipenv is Awesome!

Trust me, it is! It packs a lot of cool and useful features that can help gear up your Python development workflow. There are just too many to cover in a blog post. I would recommend checking out the Pipenv Docs to get familiar with it more.