
Fetching changed files (diff) between two Git commits in Python

First we import the git module and create a Repo object with the physical path to the Git repo. We then call the iter_commits() method, which returns an iterator over the commits, and pass it to list() to get a list we can index.

To compare one commit to another we use the diff() method of a commit object and pass another commit object as the parameter.

The returned list has one entry per changed file. Each of these entries has two blobs: a_blob and b_blob. a_blob is the file's blob in the first commit and b_blob is its blob in the last commit of the selected range.

Each of these blob objects has several properties: name, path, data_stream and so on. We're interested only in the path. We should check both a_blob and b_blob because a renamed file involves two changes: the old path is deleted and the new path is added. a_blob points to the earlier file and b_blob to the latest one. Note that a_blob is None for a newly added file and b_blob is None for a deleted one, so each side should be checked before reading its path.
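Collecting the paths could be sketched as follows. The helper changed_paths is a name invented for this example, and diffs is assumed to be whatever diff() returned; the function only relies on each entry having a_blob and b_blob attributes.

```python
def changed_paths(diffs):
    """Collect unique paths from both sides of each diff, in first-seen order."""
    paths = []
    for diff in diffs:
        # a_blob is None for an added file and b_blob is None for a deleted
        # one, so each side is checked before its path is read.
        for blob in (diff.a_blob, diff.b_blob):
            if blob is not None and blob.path not in paths:
                paths.append(blob.path)
    return paths
```

A rename then contributes both its old and its new path, while an ordinary modification contributes its path only once.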

Finally, we print the list of changed files. Easy, is it not? 🙂

5 replies on “Fetching changed files (diff) between two Git commits in Python”

Dependency for:   import git

–> “pip install GitPython”

This was mentioned in another post, but when coming here from Google it was not obvious.

How can I get the content of a diff? The “diff” attribute of class “Diff” seems to return an empty string.

You can make it a bit easier using a set:
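One way to read this suggestion, as a sketch only: a set drops duplicates by itself, so the membership checks are no longer needed. The name changed_path_set is hypothetical, and diffs is assumed to be the result of the diff() call from the post.

```python
def changed_path_set(diffs):
    """Return the unique changed paths as a set (order is not preserved)."""
    return {
        blob.path
        for diff in diffs
        for blob in (diff.a_blob, diff.b_blob)
        if blob is not None  # skip the missing side of added/deleted files
    }
```

The trade-off is that a set does not keep the order in which the files were seen; sorted() can be applied to the result if a stable order matters.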

 
