First we import the git module and create a Repo object from the path to the Git repository on disk. We then call the iter_commits() method, which returns an iterator over the commits, and pass it to list() to turn it into a list. The commits come back newest first, so commits_list[0] is the most recent commit.
import git

repo = git.Repo("/var/www/2deal.de")
# iter_commits() walks back from HEAD, so index 0 is the most recent commit
commits_list = list(repo.iter_commits())

print("Latest commit:", commits_list[0])
To compare one commit to another we use the diff() method of a commit object and pass another commit object as the parameter.
The returned list has one entry per changed file. Each entry carries two blobs, a_blob and b_blob: a_blob is the file's blob in the first commit of the comparison (the one diff() is called on), and b_blob is its blob in the second (the one passed as the argument).
Each blob object has several properties, such as name, path and data_stream. We are only interested in the path. We check both a_blob and b_blob because the two paths can differ: for a renamed file, a_blob holds the file's path in the first commit and b_blob its path in the second. Either blob can also be None when the file exists on only one side of the comparison, that is, when it was added or deleted.
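changed_files = []

for x in commits_list[0].diff(commits_list[-1]):
    # a_blob is None when the file does not exist on the "a" side
    if x.a_blob is not None and x.a_blob.path not in changed_files:
        changed_files.append(x.a_blob.path)

    # b_blob is None when the file does not exist on the "b" side
    if x.b_blob is not None and x.b_blob.path not in changed_files:
        changed_files.append(x.b_blob.path)

print(changed_files)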
Finally, we print the list of changed files. Easy, is it not? 🙂
5 replies on “Fetching changed files (diff) between two Git commits in Python”
dependencies for: import git
–> “pip install git-python”
..this was mentioned in another post, but when coming here from google.. it was not obvious
not available via pip..
but can be found here: https://www.gitorious.org/git-python
PS: pls remove full name in previous post. thank you! maybe even remove both posts, and just add dependency information to your blog entry 😉
It’s in pip as GitPython, not git-python.
How can I get the content of a diff? The “diff” attribute of class “Diff” seems to return an empty string.
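The diff attribute is only filled in when the patch text is requested: GitPython's diff() accepts a create_patch=True keyword, and each entry's diff attribute then holds the patch as bytes. Below is a rough sketch under that assumption. The helper names patch_texts and patches_between are made up for illustration, the repository path and revisions are whatever you pass in, and a_path / b_path are attributes of Diff objects in recent GitPython releases, so check them against the version you have installed.

```python
def patch_texts(diffs):
    """Map each changed path to its patch text, decoded for printing.

    Each entry is expected to expose a_path, b_path and diff, as
    GitPython's Diff objects do when create_patch=True was used.
    """
    patches = {}
    for entry in diffs:
        # Prefer the path on the "b" side; fall back to the "a" side.
        path = entry.b_path or entry.a_path
        raw = entry.diff
        # The patch arrives as bytes; decode defensively for display.
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8", errors="replace")
        patches[path] = raw
    return patches


def patches_between(repo_path, rev_a, rev_b):
    """Return {path: patch text} for the changes between two revisions."""
    import git  # GitPython; imported lazily so patch_texts works without it

    repo = git.Repo(repo_path)
    # create_patch=True asks git for the full patch, not just the file list.
    diffs = repo.commit(rev_a).diff(repo.commit(rev_b), create_patch=True)
    return patch_texts(diffs)
```

Something like patches_between("/var/www/2deal.de", "HEAD~1", "HEAD") would then give the patch text per file for the last commit.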
You can make it a bit easier using a set:
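A minimal sketch of that idea, assuming each diff entry exposes a_blob and b_blob as in the post above. The helper name changed_paths is made up for illustration, and the None checks are an addition, since either blob is missing when the file exists on only one side of the comparison.

```python
def changed_paths(diffs):
    """Collect the unique paths touched by an iterable of diff entries.

    Each entry is expected to have a_blob and b_blob attributes that are
    either None or objects with a .path attribute.
    """
    paths = set()
    for entry in diffs:
        for blob in (entry.a_blob, entry.b_blob):
            # A blob is None when the file is absent on that side
            # (added or deleted files).
            if blob is not None:
                paths.add(blob.path)
    return paths
```

With the objects from the post this would be called as changed_paths(commits_list[0].diff(commits_list[-1])). A set drops duplicates on its own, so the "not in changed_files" checks go away; wrap the result in sorted() if a stable order matters.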