Interacting with Git using Python is a very common use case in the DevOps field: very often it is necessary to check out applications or scripts along with their configuration, or even just check out versioned configurations. Although less common, it is sometimes necessary to update the checked out contents and push the committed version back to the “origin” remote repository. In the "Git With Python HowTo GitPython Tutorial And PyGit2 Tutorial" post we play with the two most commonly used Python libraries for working with Git: gitpython and pygit2.
As a companion of this post you may consider reading:
- Git Tutorial – A Thorough Version Control With Git HowTo
- Git Tutorial – A Thorough Git HowTo About Using Remotes
since both posts provide a lot of useful information on Git.
GitPython
Probably the most used library and of course one of the earliest, it is a wrapper around the "git" command line utility providing an object model representing Git concepts and offering error handling. For this reason GitPython needs the "git" executable to be installed on the system and available in your "PATH" environment variable or in the "GIT_PYTHON_GIT_EXECUTABLE" environment variable.
The official documentation is available here.
This project is in maintenance mode, which means that:
- there will be no feature development, unless these are contributed
- there will be no bug fixes, unless they are relevant to the safety of users, or contributed
- issues will be responded to with waiting times of up to a month
Of course I'm not going to describe everything about GitPython: that would just mean rewriting its documentation, which is pointless. I just want to provide some handy examples for the most common use cases.
Installing
GitPython can be easily installed using PIP as follows:
python3 -m pip install gitpython
Handling Exceptions
Most of the GitPython exceptions can be managed by capturing the "git.exc.GitCommandError" exception as shown by the following snippet:
import sys
import git

try:
    git.somefunction(...)
except git.exc.GitCommandError as e:
    print(e, file=sys.stderr)
    sys.exit(1)
Cloning a Remote Repository
If the remote repository is a public one, cloning it requires just a very trivial statement like the following one:
repo = git.Repo.clone_from("https://git0.p1.carcano.corp/mcarcano/myrepo.git", "myrepo")
the function returns a Python Object referring to the cloned repository.
Cloning a Remote Private Repository
If the remote repository is a private one, things differ depending on whether it uses the "HTTP" or the "SSH" transport.
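Since the two transports are handled differently, a script first has to tell them apart. The following is just a minimal sketch of how that could be done: the "detect_transport" name is my own invention, not part of GitPython, and it only knows about the URL styles used in this post.

```python
from urllib.parse import urlparse


def detect_transport(url):
    """Return "http" or "ssh" for a Git remote URL (hypothetical helper)."""
    scheme = urlparse(url).scheme.lower()
    if scheme in ("http", "https"):
        return "http"
    if scheme == "ssh":
        return "ssh"
    # scp-like syntax such as git@host:path has no scheme but still means SSH
    if "@" in url and ":" in url.split("@", 1)[1]:
        return "ssh"
    raise ValueError("unsupported Git URL: {}".format(url))


print(detect_transport("https://git0.p1.carcano.corp/mcarcano/myrepo.git"))
print(detect_transport("git@git0.p1.carcano.corp:mcarcano/myprivaterepo.git"))
```

Raising an exception for anything else (a local path, for example) keeps the caller from silently picking the wrong authentication method.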
HTTP Transport
When dealing with HTTP transport, authentication is password-based: in this case the "git" command line utility needs the path to an external helper script able to supply the authentication credentials.
Since GitPython relies on the "git" command line utility, it inherits the same requirement, so to work with private Git repositories requiring password authentication you must create the "askpass.py" file with the following contents:
#!/usr/bin/python3
from sys import argv
from os import environ

if 'username' in argv[1].lower():
    print(environ['GIT_USERNAME'])
    exit()
if 'password' in argv[1].lower():
    print(environ['GIT_PASSWORD'])
    exit()
exit(1)
Although technically "askpass.py" can be put wherever you prefer, to keep things simple both for running and for deployment my suggestion is to put it in the same directory as your script using GitPython.
Once done, grant it execution rights:
chmod 755 askpass.py
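If you want to unit test the helper's decision logic without running Git at all, it can be moved into a small function. This is only a sketch: "pick_credential" is a name I made up, and I'm assuming that Git's prompts begin with "Username for" and "Password for".

```python
def pick_credential(prompt, env):
    """Return the value the askpass helper should print, or None if the prompt is unknown."""
    lowered = prompt.lower()
    # match on the start of the prompt: the URL that follows may itself contain
    # the word "username" (a login such as "myusername", for instance)
    if lowered.startswith("username"):
        return env.get("GIT_USERNAME")
    if lowered.startswith("password"):
        return env.get("GIT_PASSWORD")
    return None


env = {"GIT_USERNAME": "myusername", "GIT_PASSWORD": "thepassword"}
print(pick_credential("Username for 'https://git0.p1.carcano.corp': ", env))
print(pick_credential("Password for 'https://myusername@git0.p1.carcano.corp': ", env))
```

Checking only the beginning of the prompt is a deliberate choice: a plain substring test would answer the second prompt with the username, because the URL embedded in it contains "myusername".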
As for the Python code to put in your script for cloning the repo, you must export the environment variables:
- GIT_ASKPASS - the path to the "askpass.py" helper script
- GIT_USERNAME - the environment variable with the username expected by the "askpass.py" helper script
- GIT_PASSWORD - the environment variable with the password expected by the "askpass.py" helper script
right before the clone statement.
For example:
working_dir=os.path.dirname(os.path.realpath(__file__))
os.environ["GIT_ASKPASS"] = os.path.join(working_dir, "askpass.py")
os.environ["GIT_USERNAME"] = "myusername"
os.environ["GIT_PASSWORD"] = "thepassword"
repo = git.Repo.clone_from("https://git0.p1.carcano.corp/mcarcano/myprivaterepo.git", "myprivaterepo")
SSH Transport
SSH transport is a different matter, since it relies on authorized SSH public keys: in this case it is necessary to set the "GIT_SSH_COMMAND" environment variable to the ssh command line, including the option pointing to the SSH private key to use while connecting.
For example, to use the private key from the "sshkey" file (the relative path is resolved against the current working directory, so either run the script from the directory containing the key or use an absolute path):
ssh_cmd = "ssh -i sshkey"
repo = git.Repo.clone_from("git@git0.p1.carcano.corp:mcarcano/myprivaterepo.git", "myprivaterepo", env={"GIT_SSH_COMMAND": ssh_cmd})
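A relative key path or a path containing spaces can easily break the ssh command line. Here is a small sketch of a safer way to build it: "build_ssh_command" is a hypothetical helper of mine, and adding the "IdentitiesOnly=yes" ssh option is my own choice, so that only the given key is offered to the server.

```python
import os
import shlex


def build_ssh_command(key_path):
    """Return a GIT_SSH_COMMAND value that uses only the given private key."""
    parts = ["ssh", "-i", os.path.abspath(key_path), "-o", "IdentitiesOnly=yes"]
    # Git hands this string to a shell, so every token is quoted
    return " ".join(shlex.quote(part) for part in parts)


ssh_cmd = build_ssh_command("sshkey")
print(ssh_cmd)
```

The resulting string can then be passed to the clone exactly as in the snippet above, using env={"GIT_SSH_COMMAND": ssh_cmd}.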
Configuring User Attributes
Properly committing changes to a repository requires setting a few user attributes, at least the author's name and email address, so that it is easy to track in the Git log who made each change.
Once you have cloned the repository (or opened it from the local filesystem) and got the repository Python object, you can easily set these attributes as follows:
repo.config_writer().set_value("user", "name", "Marco Carcano").release()
repo.config_writer().set_value("user", "email", "marco.carcano@carcano.corp").release()
Creating A New Branch
If the cloned branch is a protected one, then you are not allowed to commit changes to it and push it back to the remote repository. In such a scenario you must checkout a new branch from the cloned one and operate the changes into that branch.
This can be achieved as follows:
new_branch = repo.create_head("my-new-branch")
When dealing with protected branches, changes are merged into the protected branch by means of a Merge Request (aka Pull Request). When creating the Merge Request you must specify the current branch (so the freshly created one) as the source, and the protected one (the one you cloned) as the target, so it is handy to get the name of the current branch before creating the new one, as follows:
initial_branch_name=repo.active_branch.name
If you need to start the new branch from a different one than the one you cloned, you can use the following syntax:
new_branch = repo.create_head("my-new-branch", repo.remotes.origin.refs.master)
of course specifying the reference of the branch you want to use.
Once the new branch has been created, you probably want to check it out as follows:
new_branch.checkout()
Committing Changes
Once you have cloned the repository (or opened it from the local filesystem) and got the repository Python object, you can easily add contents to the staging area:
repo.git.add("changedfile.txt")
The above statement of course adds just a single changed item. You may instead prefer to add every change as a whole:
repo.git.add(all=True)
Once you have added everything you want to the staging area, just commit the changes as follows:
repo.index.commit("My commit message")
Creating Tags
Git supports two types of tags: lightweight and annotated. GitPython of course supports both kinds of tags.
If you need to create a lightweight tag:
lightweight_tag = repo.create_tag("my_lw_tag")
If instead you need to create an annotated tag:
annotated_tag = repo.create_tag("my_tag", message="the descriptive message")
Pushing Changes
A very common use case is pushing to the remote Git repository the committed changes.
Branches
If the changes have been committed directly into the branch that was previously cloned, you can just define the remote repository as a Python object and push as follows:
remote_name="origin"
repo.remote(remote_name).push()
If instead the branch to push is a new locally created one (a quite common use case when dealing with protected branches), it is necessary to pass it as an argument of the push, for example:
remote_name="origin"
new_local_branch="mybranch"
repo.remote(remote_name).push(new_local_branch)
Even easier, if you just want to push the current branch, no matter whether it is a new or an existing one:
remote_name="origin"
repo.remote(remote_name).push(repo.active_branch.name)
Tags
In the same way as with the git command line tool, tags are not included in the push: when dealing with tags, you must explicitly pass them as follows:
remote_name="origin"
repo.remote(remote_name).push("lightweight_tag")
or:
remote_name="origin"
repo.remote(remote_name).push("annotated_tag_name")
You can have the same behaviour of the "--follow-tags" command line git option by adding "push.followTags = true" to the repository configuration as follows:
repo.config_writer().set_value("push", "followTags", "true").release()
After doing that, every annotated tag reachable from the pushed commits is pushed even when just calling the push() method with no arguments.
PyGit2
It is a set of Python bindings to libgit2, a linkable C library: libgit2 is a dependency-free implementation of Git, with a focus on having a nice API for use within other programs. Besides pygit2, bindings are available for other programming languages such as C#, Objective-C, Golang and so on.
Of course I'm not going to describe everything about pygit2: that would just mean rewriting its documentation, which is pointless. I just want to provide some handy examples for the most common use cases.
Checking For SSH transport support
Before using pygit2, it is best to check that:
- pygit2 has SSH support enabled
- libgit2 has SSH support enabled
Making sure that pygit2 has been built with SSH transport is very easy: just launch Python and run the following statements:
import pygit2 as pg
bool(pg.features & pg.GIT_FEATURE_SSH)
if the output is "True", then pygit2 has been built with SSH transport enabled.
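The features value is a bit mask, so the operator used in the test matters. The sketch below shows the difference using made-up flag values (the "Feature" class here is purely illustrative and does not reproduce pygit2's real constants):

```python
from enum import IntFlag


class Feature(IntFlag):
    """Hypothetical flag values, only for illustration (not pygit2's own)."""
    THREADS = 1
    HTTPS = 2
    SSH = 4


# pretend this is a build without SSH support
features = Feature.THREADS | Feature.HTTPS

# logical "and" only checks that both operands are non-zero
print(bool(features and Feature.SSH))
# bitwise "&" checks whether the SSH bit itself is set
print(bool(features & Feature.SSH))
```

With the logical operator the check reports SSH support on any build having at least one feature enabled, which is why the bitwise one is the one to use.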
Checking if "libgit2" has been built with SSH support enabled requires instead building a small C program that tries to instantiate the SSH transport.
Create the "/tmp/libgit2-SSH-support-test.c" file with the following contents:
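#include <git2.h>
#include <git2/sys/transport.h>
#include <stdio.h>

int main(void)
{
    git_transport *transport = NULL;
    /* libgit2 must be initialized before any other call */
    git_libgit2_init();
    int err = git_transport_new(&transport, NULL, "ssh://github.com/libgit2/pygit2.git");
    printf("%d\n", err);
    git_libgit2_shutdown();
    return 0;
}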
then, build and run it as follows:
gcc /tmp/libgit2-SSH-support-test.c -lgit2 && ./a.out
If the output is "0", then SSH support has been enabled in the "libgit2" library, otherwise it has not.
Installing
On Red Hat family Linux distributions, pygit2 can be easily installed using DNF as follows:
sudo dnf install -y python3-pygit2
if instead you prefer to use PIP, type:
python3 -m pip install pygit2
Handling Exceptions
Most of the pygit2 exceptions can be managed by capturing the "GitError" exception as shown by the following snippet:
import sys
import pygit2
from pygit2._pygit2 import GitError

try:
    pygit2.somefunction(...)
except GitError as e:
    print(e, file=sys.stderr)
    sys.exit(1)
Please note how we explicitly imported the GitError name using the "from" directive.
Opening An Existing Local Repository
Opening an existing local repository is trivial and can be easily accomplished by using the "discover_repository" function as follows:
repository_path = pygit2.discover_repository("path/to/the/repository/directory")
repo = pygit2.Repository(repository_path)
The outcome is a Python object describing the repository.
Cloning a Remote Repository
If the remote repository is a public one, cloning it requires just a very trivial statement like the following one:
repo = pygit2.clone_repository("https://git0.p1.carcano.corp/mcarcano/myrepo.git", "myrepo")
the function returns a Python Object referring to the cloned repository.
Cloning a Remote Private Repository
If the remote repository is a private one, we must instruct pygit2 about how to handle the authentication process. This can be easily achieved by subclassing the "RemoteCallbacks" Object:
class GitCallbacks(pygit2.RemoteCallbacks):
    def __init__(self, user=None, token=None, pub_key=None, priv_key=None, passphrase=None):
        self.user = user
        self.token = token
        self.pub_key = pub_key
        self.priv_key = priv_key
        self.passphrase = passphrase

    def credentials(self, url, username_from_url, allowed_types):
        if allowed_types & pygit2.enums.CredentialType.USERNAME:
            return pygit2.Username(self.user)
        elif allowed_types & pygit2.enums.CredentialType.USERPASS_PLAINTEXT:
            return pygit2.UserPass(self.user, self.token)
        elif allowed_types & pygit2.enums.CredentialType.SSH_KEY:
            return pygit2.Keypair(username_from_url, self.pub_key, self.priv_key, self.passphrase)
        else:
            return None

    def push_update_reference(self, refname, message):
        if message is not None:
            raise GitError("Push of {} failed - error message is: {}".format(refname, message))

    def certificate_check(self, certificate, valid, host):
        return True

    def transfer_progress(self, stats):
        print("Retrieved objects: {}/{}".format(stats.indexed_objects, stats.total_objects), end="\r")
in the above snippet:
- the "__init__" method has every parameter as optional (lines 2-7)
- the "credentials" method selects and returns the correct kind of credentials (USERNAME, USERPASS_PLAINTEXT, SSH_KEY) along with its values (lines 9-17)
- the "push_update_reference" method is used to raise a GitError exception if any (error) message is returned during a push (lines 19-21)
- the "certificate_check" method overrides the checking of the server's certificate: it always returns True, so the certificate is never checked. This is just a sample: don't implement it in production (lines 23-24)
- the "transfer_progress" method is used to print information about the progress of the clone (lines 26-27)
HTTP Transport
When dealing with HTTP transport, authentication is password-based: in this case we initialize the callback by passing the username and the password:
repo = pygit2.clone_repository("https://git0.p1.carcano.corp/mcarcano/myprivaterepo.git", "myprivaterepo",
callbacks=GitCallbacks(user="myusername", token="thepassword"))
SSH Transport
SSH transport is a different matter, since it relies on authorized SSH public keys: in this case we initialize the callback by passing the path to the private key file, the path to the public key file and the passphrase to decrypt the private key. Mind that the "credentials" method above takes the SSH user from the URL ("username_from_url"), so the URL must carry it ("git" in this example):
repo = pygit2.clone_repository("ssh://git@git0.p1.carcano.corp/mcarcano/myprivaterepo.git", "myprivaterepo",
    callbacks=GitCallbacks(priv_key="sshkey", pub_key="sshkey.pub", passphrase="thepassphrase"))
Getting Configurations From An Opened Repository
Once a repository is opened, either with "pygit2.Repository" (possibly after "discover_repository") or with "clone_repository", a repository object is returned: among its nested objects, there's the "config" object containing all the settings for that repository (the ones stored in the ".git/config" file inside the repository directory).
These properties can be accessed by using the "__getitem__" method, which is what the subscript syntax (repo.config['user.name']) calls under the hood.
For example, if they have already been set in ".git/config", we can easily get the username and email address as follows:
repo = pygit2.Repository("path/to/the/repository/directory")
repo.config.__getitem__('user.name')
repo.config.__getitem__('user.email')
Creating A New Branch
If the cloned branch is a protected one, then you are not allowed to commit changes to it and push it back to the remote repository. In such a scenario you must checkout a new branch from the cloned one and operate the changes into that branch.
This can be achieved as follows:
last_commit= repository.revparse_single("HEAD")
new_branch = repository.branches.local.create("my-new-branch",last_commit)
repository.checkout(repository.branches.local["my-new-branch"])
When dealing with protected branches, changes are merged into the protected branch by means of a Merge Request (aka Pull Request). When creating the Merge Request you must specify the current branch (so the freshly created one) as the source, and the protected one (the one you cloned) as the target, so it is handy to get the name of the current branch before creating the new one, as follows:
initial_branch_name=repo.head.shorthand
Committing Changes
Once you have cloned the repository (or opened it from the local filesystem) and got the repository Python object, you can easily add contents to the staging area.
First, create the Signature objects for the author of the changes and for their committer:
author = pygit2.Signature("Marco Carcano", "marco.carcano@carcano.corp")
committer = pygit2.Signature("Marco Carcano", "marco.carcano@carcano.corp")
Although most of the time the author and the committer are the same person, that is not always true: think for example of a pull request, where the committer is somebody committing changes on behalf of someone else.
Mind that if the repository configuration already contains the user's information, you can get both the author and the committer information from it using the "__getitem__" method of the repository's "config" object, as follows:
author = pygit2.Signature(repository.config.__getitem__('user.name'), repository.config.__getitem__('user.email'))
committer = pygit2.Signature(repository.config.__getitem__('user.name'), repository.config.__getitem__('user.email'))
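Looking up a key that has not been set fails, so it can be handy to fall back to default values. The following sketch works on any mapping-like object: I'm assuming the "config" object supports the "in" operator and subscript access the same way a plain dict does, and both the "resolve_identity" name and the default values are my own.

```python
def resolve_identity(config, default_name="Unknown", default_email="unknown@localhost"):
    """Return (name, email) from a mapping-like config, falling back to defaults."""
    name = config["user.name"] if "user.name" in config else default_name
    email = config["user.email"] if "user.email" in config else default_email
    return name, email


# a plain dict stands in for the repository configuration here
sample_config = {"user.name": "Marco Carcano"}
name, email = resolve_identity(sample_config)
print(name, email)
```

The returned pair can then be passed straight to the "pygit2.Signature" constructor shown above.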
If you instead need to access global (system account-wide) information, you must use the "get_global_config()" method.
For example:
global_config = pygit2.Config.get_global_config()
We must then get the index holding the staging area's files, add contents to it and create the tree object.
index = repo.index
index.add("changedfile.txt")
index.write()
tree = index.write_tree()
if the commit is not the initial commit of a new git repository, we must also assign:
- the current head's name
- the list of parents
ref = repository.head.name
parents = [ repository.head.target ]
The "index.add" statement above of course adds just a single changed item. You may instead prefer to add every change as a whole, still before calling "index.write()" and "index.write_tree()":
index.add_all()
Once defined all of the above, you can perform the actual commit as follows:
message = "My commit message"
commit=repository.create_commit(ref, author, committer, message, tree, parents)
Creating Tags
Git supports two types of tags: lightweight and annotated. Unlike GitPython, pygit2 does not have a specific method for creating lightweight tags: you must create the reference yourself as follows:
from pygit2._pygit2 import GIT_OBJ_COMMIT
tag_name = "my_lw_tag"
last_commit = repository.revparse_single("HEAD")
if last_commit.type == GIT_OBJ_COMMIT:
    repository.create_reference("refs/tags/{}".format(tag_name), last_commit.id)
It instead provides the "create_tag" method for creating annotated tags. For example:
from pygit2._pygit2 import GIT_OBJ_COMMIT
last_commit = repository.revparse_single("HEAD")
if last_commit.type == GIT_OBJ_COMMIT:
    committer = pygit2.Signature("Marco Carcano", "marco.carcano@carcano.corp")
    oid = repository.create_tag("my_tag", str(last_commit.id), GIT_OBJ_COMMIT, committer, "the descriptive message")
Pushing Changes
A very common use case is pushing to the remote Git repository the committed changes.
Branches
The statement for pushing branches changes a little bit depending on the underlying transport.
The following snippet is an example for the HTTP transport:
branch="mybranch"
remote_name="origin"
repository.remotes[remote_name].push(["+refs/heads/{}".format(branch)], callbacks=GitCallbacks(user="username", token="password"))
if instead you are dealing with the SSH transport, the above code changes as follows:
branch="mybranch"
remote_name="origin"
repository.remotes[remote_name].push(["+refs/heads/{}".format(branch)], callbacks=GitCallbacks(priv_key="sshkey", pub_key="sshkey.pub", passphrase="password"))
Tags
Pushing tags is pretty similar to pushing branches. The snippet below pushes a tag using the HTTP transport:
tag="mytag"
remote_name="origin"
repository.remotes[remote_name].push(["+refs/tags/{}".format(tag)], callbacks=GitCallbacks(user="username", token="password"))
whereas the one below pushes a tag using the SSH transport:
tag="mytag"
remote_name="origin"
repository.remotes[remote_name].push(["+refs/tags/{}".format(tag)], callbacks=GitCallbacks(priv_key="sshkey", pub_key="sshkey.pub", passphrase="password"))
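All the push snippets of this section build the refspec by hand, and all of them start it with "+", which in Git's refspec syntax requests a forced (non-fast-forward) update. If you prefer to make that choice explicit, a tiny helper like the following can do it. Take it as a sketch: "build_refspec" is a hypothetical name of mine, not a pygit2 function.

```python
def build_refspec(name, kind="branch", force=False):
    """Return the refspec for pushing a local branch or tag (hypothetical helper)."""
    prefixes = {"branch": "refs/heads/", "tag": "refs/tags/"}
    if kind not in prefixes:
        raise ValueError("kind must be 'branch' or 'tag'")
    # a leading "+" asks the remote to accept a non-fast-forward update
    return ("+" if force else "") + prefixes[kind] + name


print(build_refspec("mybranch"))
print(build_refspec("mytag", kind="tag", force=True))
```

The returned string goes inside the list passed to the "push" method, in place of the formatted string used in the snippets above.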
Example Scripts
My affectionate readers know I always show things in action: theory is important, but worth little without some good practical examples.
So I wrote two sample scripts, one using gitpython and the other using pygit2: you can use them as a starting point to write your own, improving and adapting them as needed.
Create a directory for this project and change into it:
mkdir -m 755 git-with-python
cd git-with-python
Both scripts read configuration settings from the "conf" directory, and are stored in the "bin" directory, so create these directories as follows:
mkdir -m 755 conf bin
Several parts of both scripts use the same logic, for example for reading configuration files or for initialising Python's logging facility. For this reason, no matter which script you want to use, you must also create the following configuration files.
Configuration Files
Let's start by setting the configuration files - create the "conf/secrets.ini" with the following contents:
[git]
username=fooapp
password=A6.Ur30-Ne
user=Foo Application
email=foo@carcano.corp
it contains:
- the username and password to connect to the remote git repository
- the username and email address to be used for authoring and committing changes
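Before wiring the file into the scripts, it may help to see how such a file is read back. This is a minimal sketch that parses the same layout from a string: the password is a made-up one, chosen to contain a "%" sign, and disabling interpolation is my own suggestion, since with the default settings a literal "%" in a value makes the lookup fail.

```python
import configparser

SAMPLE = """\
[git]
username=fooapp
password=s3cr%t
user=Foo Application
email=foo@carcano.corp
"""

# interpolation=None keeps a "%" inside a value literal instead of treating it as a placeholder
parser = configparser.ConfigParser(interpolation=None)
parser.read_string(SAMPLE)

username = parser.get("git", "username", fallback=None)
password = parser.get("git", "password", fallback=None)
# a missing option (or section) yields the fallback rather than raising an exception
token = parser.get("git", "token", fallback=None)

print(username, token)
```

Using "fallback" saves you from catching "NoSectionError" and "NoOptionError" every time an optional setting is looked up.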
Since - at least with GitPython - we are also going to test the SSH transport, also create the "conf/sshkey" file containing the private key for connecting. Its contents should look like the following snippet:
-----BEGIN OPENSSH PRIVATE KEY-----
b3BlbnNzaC1rZXktdjEAAAAABG5vbmUAAAAEbm9uZQAAAAAAAAABAAABlwAAAAdzc2gtcn
...
dAZ0qapX/6Ypa/AAAAJnZhZ3JhbnRAZ2l0LWNhLXVwMWEwMDEucDEuY2FyY2Fuby5jb3Jw
AQIDBA==
-----END OPENSSH PRIVATE KEY-----
and the "conf/sshkey.pub" file with the related public key. Remember to authorise it on the remote git repository, or it won't be valid for connecting.
Then set up the "conf/logging.conf" file with all the settings to be used by Python's logging facility:
[loggers]
keys=root,foo
[handlers]
keys=console,syslog,file
[formatters]
keys=jsonFormatter
[logger_root]
level=DEBUG
handlers=file
propagate=0
[logger_foo]
level=DEBUG
handlers=file
qualname=__main__
propagate=0
[handler_null]
class = logging.NullHandler
formatter = default
args = ()
[handler_console]
class=StreamHandler
level=DEBUG
formatter=jsonFormatter
args=(sys.stderr,)
[handler_syslog]
class = handlers.SysLogHandler
args = ('/dev/log', handlers.SysLogHandler.LOG_USER)
level=DEBUG
formatter=jsonFormatter
[handler_rotatingfile]
class=handlers.RotatingFileHandler
level=DEBUG
formatter=jsonFormatter
args=('foo.log','a',1024,50)
[handler_file]
class=FileHandler
level=DEBUG
formatter=jsonFormatter
args=('foo.log','a')
[formatter_jsonFormatter]
format={"time": "%(asctime)s", "logger": "%(name)s", "level": "%(levelname)s", "pid": "%(process)d", "src": "%(pathname)s", "line": "%(lineno)d", "msg": "%(message)s"}
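Python's logging facility builds every handler listed in the "keys" of the "[handlers]" section, so a name without its matching "[handler_...]" section makes the configuration fail at load time. The following sketch checks for that before the file is used: the "undefined_handlers" function and the shortened sample are mine, written only for illustration.

```python
import configparser

SAMPLE = """\
[loggers]
keys=root

[handlers]
keys=console,file

[formatters]
keys=plain

[logger_root]
level=DEBUG
handlers=file

[handler_file]
class=FileHandler
formatter=plain
args=('foo.log','a')

[formatter_plain]
format=%(asctime)s %(levelname)s %(message)s
"""


def undefined_handlers(config_text):
    """Return handler names listed under [handlers] that lack a [handler_<name>] section."""
    # RawConfigParser leaves the %(...)s placeholders of the formatters untouched
    parser = configparser.RawConfigParser()
    parser.read_string(config_text)
    listed = [k.strip() for k in parser.get("handlers", "keys").split(",") if k.strip()]
    return [k for k in listed if not parser.has_section("handler_" + k)]


print(undefined_handlers(SAMPLE))
```

In the sample the "console" handler is listed but never defined, so it is the one reported.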
GitPython Sample Script
The following script implements the use case using GitPython: create the "bin/bump-version-with-gitpython.py" file with the following contents and grant it execution permission:
#!/usr/bin/python3
import errno, os, sys, argparse, configparser, logging, logging.config
import re
import git
credentials_file = "secrets.ini"
log_config_file = "logging.conf"
# repo_url="git@git0.p1.carcano.corp:fooapp/fooapp.git"
repo_url = "https://git0.p1.carcano.corp:3000/fooapp/fooapp.git"
branch = "master"
repo_dir = "fooapp"
# https://docs.python.org/3/library/logging.html#levels
logging_loglevels = {"NOTSET": 0, "DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}
def abort(logger_reason, exit_reason=None):
    """
    Exit printing a message on stderr and logging the exception using the logging facility
    Args:
        logger_reason: string or exception object with the reason for exiting to be logged using the logging facility
        exit_reason: string or exception object with the reason for exiting to print on stderr - if omitted it is
            automatically set with the value of the logger_reason parameter
    Returns: nothing
    """
    if exit_reason is None:
        exit_reason = logger_reason
    logger.error(str(logger_reason).replace("'", "\'").replace('"', "'").replace("\n", " ").replace("\r", ""))
    logger.error("Script aborted")
    print(exit_reason, file=sys.stderr)
    exit(1)
def load_credentials(file):
    """
    Load the credentials from a file
    Args:
        file: the path to the credentials' file
    Returns:
        an object with the contents of the credentials file
    """
    try:
        contents = configparser.ConfigParser()
        if not os.path.isfile(file):
            raise FileNotFoundError(
                errno.ENOENT, os.strerror(errno.ENOENT), file)
        if not os.access(file, os.R_OK):
            raise FileNotFoundError(
                errno.EPERM, "Unable to open file for reading", file)
        contents.read(file)
        return contents
    except FileNotFoundError as e:
        abort(e)
    except:
        abort("{} is not an INI file".format(file))
def init_logging():
    """
    Initialize the Python's standard logging facility
    Formatters are automatically set by this function, verbosity can be set by exporting the LOGLEVEL
    environment variable as defined by https://docs.python.org/3/library/logging.html#levels.
    If necessary, it is possible to alter these formats by creating the logging.conf file in the configuration
    directory - the format is the one of the Python's standard logging facility
    https://docs.python.org/3/library/logging.config.html#configuration-file-format
    Returns: the logger object
    """
    if os.path.isfile("{}/{}".format(config_dir, log_config_file)):
        logging.config.fileConfig(fname="{}/{}".format(config_dir, log_config_file), disable_existing_loggers=False)
        log = logging.getLogger(__name__)
        print("logging using '{}' file".format("{}/{}".format(config_dir, log_config_file)))
    else:
        log = logging.getLogger(__name__)
        handler = logging.StreamHandler()
        formatter = logging.Formatter("%(asctime)s %(name)-12s %(levelname)-8s %(message)s")
        handler.setFormatter(formatter)
        log.addHandler(handler)
        log.setLevel(logging.INFO)
    log.setLevel(logging_loglevels[os.getenv("LOGLEVEL", "INFO")])
    return log
def git_clone(url, repo_branch="master", dest_dir=None, single_branch=True, ssl_verify=True):
    """
    Clone a Git repository
    Args:
        url: Git repository's URL - both HTTP and SSH transports are supported
        repo_branch: name of the branch to check out while cloning the repository
        dest_dir: directory on the filesystem where to store the cloned repository
        single_branch: if True, it clones only the specified repository branch
        ssl_verify: if True, enable the checking of the TLS certificate - if any
    Returns:
        an object representing the cloned repository
    """
    repo_name = re.sub(r"\.git$", "", re.sub("/$", "", url).rsplit("/", 1)[-1])
    logger.info("Cloning '{}', branch '{}'".format(url, repo_branch))
    if dest_dir is None:
        dest_dir = os.path.join("/tmp", repo_name)
    working_dir = os.path.dirname(os.path.realpath(__file__))
    if url.startswith("http"):
        try:
            if credentials.get("git", "username") and credentials.get("git", "password"):
                logger.debug("Logging in to git as '{}'".format(credentials["git"]["username"]))
                os.environ["GIT_ASKPASS"] = os.path.join(working_dir, "askpass.py")
                os.environ["GIT_USERNAME"] = credentials["git"]["username"]
                os.environ["GIT_PASSWORD"] = credentials["git"]["password"]
        except (configparser.NoSectionError, configparser.NoOptionError):
            # no (or incomplete) credentials in the INI file
            logger.debug("Logging in to git anonymously")
        try:
            logger.debug("Cloning using HTTP transport")
            repo = git.Repo.clone_from(url, dest_dir, branch=repo_branch, single_branch=single_branch, depth=1,
                                       config="http.sslVerify={}".format(str(ssl_verify)), allow_unsafe_options=True)
        except FileNotFoundError as e:
            abort(e)
        except git.exc.GitCommandError as e:
            abort("Error while checking out '{}' git repository: {}".format(repo_name, str(e)))
    else:
        logger.debug("Cloning using SSH transport")
        try:
            if os.path.isfile("{}/sshkey".format(config_dir)):
                logger.debug("Logging in to git using SSH key '{}/sshkey'".format(config_dir))
                if not os.access("{}/sshkey".format(config_dir), os.R_OK):
                    raise FileNotFoundError(
                        errno.EPERM, "Unable to open file for reading", "{}/sshkey".format(config_dir))
                ssh_cmd = "ssh -i {}/sshkey".format(config_dir)
                repo = git.Repo.clone_from(url, dest_dir, branch=repo_branch, single_branch=single_branch, depth=1,
                                           env={"GIT_SSH_COMMAND": ssh_cmd})
            else:
                logger.debug("Logging in to git anonymously")
                repo = git.Repo.clone_from(url, dest_dir, branch=repo_branch, single_branch=single_branch, depth=1)
        except FileNotFoundError as e:
            abort(e)
        except git.exc.GitCommandError as e:
            abort("Error while checking out '{}' git repository: {}".format(repo_name, str(e)))
    try:
        if credentials.get("git", "user"):
            repo.config_writer().set_value("user", "name", credentials["git"]["user"]).release()
    except (configparser.NoSectionError, configparser.NoOptionError):
        pass
    try:
        if credentials.get("git", "email"):
            repo.config_writer().set_value("user", "email", credentials["git"]["email"]).release()
    except (configparser.NoSectionError, configparser.NoOptionError):
        pass
    return repo
def update_metadata(changelog_file):
    """
    Update the metadata file, bumping the release version
    Args:
        changelog_file: path to the metadata file within the cloned Git repository
    Returns: Nothing
    """
    try:
        with open(os.path.join(repo_dir, changelog_file), "r") as file:
            contents = file.read()
        # re.search returns None when nothing matches, so check before reading the group
        version_match = re.search(r"release:\s+(\d+\.\d+\.\d+)", contents)
        if not version_match:
            abort("Unable to guess the project's current version")
        project_version = version_match.group(1)
        major, minor, patchlevel = map(int, project_version.split("."))
        logger.debug("Project version tokens: MAJOR={}, MINOR={}, PATCH={}".format(major, minor, patchlevel))
        new_version = "{}.{}.{}".format(major, minor, str(patchlevel + 1))
        logger.info("Bumping from version '{}' to version '{}'".format(project_version, new_version))
        with open(os.path.join(repo_dir, changelog_file), "r+") as file:
            contents = file.readlines()
            file.seek(0)
            file.truncate()
            for line in contents:
                match = re.sub(r"^release:.*$", "release: {}".format(new_version), line)
                if match != line:
                    line = match
                file.write(line)
        logger.debug("File '{}' successfully updated".format(changelog_file))
    except FileNotFoundError as e:
        abort(e)
    try:
        repository.git.add(changelog_file)
        repository.index.commit("Bumping to version '{}'".format(new_version))
        logger.debug("Changes successfully committed")
    except git.exc.GitCommandError as e:
        abort(e)
def git_push(branch_or_tag_name="", remote="origin"):
    """
    Push to the specified remote
    Args:
        branch_or_tag_name: name of the branch or tag to push
        remote: name of the remote to push
    Returns: nothing
    """
    try:
        logger.info("Pushing changes to git")
        if branch_or_tag_name == "":
            branch_or_tag_name = repository.active_branch.name
        repository.remote(remote).push(branch_or_tag_name)
    except (git.exc.GitCommandError, ValueError) as e:
        abort(e)
default_config_dir = "{}/conf".format(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
parser = argparse.ArgumentParser(description="Example script showing how to use git-python")
parser.add_argument("-c", "--config-dir", dest="config_dir", default=default_config_dir,
help="path to the configuration directory")
args = parser.parse_args()
config_dir = args.config_dir
logger = init_logging()
logger.info("Script started")
credentials = load_credentials("{}/{}".format(config_dir, credentials_file))
repository = git_clone(repo_url, branch, repo_dir, True, False)
new_branch = repository.create_head('bump')
new_branch.checkout()
update_metadata("meta.yml")
git_push()
logger.info("Script ended")
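The version bump performed by the "update_metadata" function can also be written as a pure function, which is easier to test since it touches neither the filesystem nor Git. This is just a sketch under my own assumptions: the "bump_patch" name is hypothetical, and I'm assuming the metadata file has a single "release: X.Y.Z" line, as the regular expression in the script suggests.

```python
import re

# [ \t] instead of \s so that the match never swallows the end of the line
RELEASE_RE = re.compile(r"^release:[ \t]+(\d+)\.(\d+)\.(\d+)[ \t]*$", re.MULTILINE)


def bump_patch(contents):
    """Return (new_version, updated_contents) with the patch level increased by one."""
    match = RELEASE_RE.search(contents)
    if match is None:
        raise ValueError("no 'release: X.Y.Z' line found")
    major, minor, patch = (int(group) for group in match.groups())
    new_version = "{}.{}.{}".format(major, minor, patch + 1)
    updated = contents[:match.start()] + "release: " + new_version + contents[match.end():]
    return new_version, updated


new_version, updated = bump_patch("name: fooapp\nrelease: 1.4.9\n")
print(new_version)
```

Raising an exception when no version line is found leaves the caller free to decide whether to abort or to carry on.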
PyGit2 Sample Script
The following script implements the use case using pygit2: create the "bin/bump-version-with-pygit2.py" file with the following contents and grant it execution permission:
#!/usr/bin/python3
import errno, os, sys, argparse, configparser, logging, logging.config
import re
import pygit2
from pygit2 import GitError
credentials_file = "secrets.ini"
log_config_file = "logging.conf"
# repo_url="git@git0.p1.carcano.corp:fooapp/fooapp.git"
repo_url = "https://git0.p1.carcano.corp:3000/fooapp/fooapp.git"
branch = "master"
repo_dir = "fooapp"
# https://docs.python.org/3/library/logging.html#levels
logging_loglevels = {"NOTSET": 0, "DEBUG": 10, "INFO": 20, "WARNING": 30, "ERROR": 40, "CRITICAL": 50}
class GitCallbacks(pygit2.RemoteCallbacks):
    def __init__(self, user=None, token=None, pub_key=None, priv_key=None, passphrase=None):
        self.user = user
        self.token = token
        self.pub_key = pub_key
        self.priv_key = priv_key
        self.passphrase = passphrase

    def credentials(self, url, username_from_url, allowed_types):
        # Return the credential object matching the type requested by the remote
        if allowed_types & pygit2.enums.CredentialType.USERNAME:
            return pygit2.Username(self.user)
        elif allowed_types & pygit2.enums.CredentialType.USERPASS_PLAINTEXT:
            return pygit2.UserPass(self.user, self.token)
        elif allowed_types & pygit2.enums.CredentialType.SSH_KEY:
            return pygit2.Keypair(username_from_url, self.pub_key, self.priv_key, self.passphrase)
        else:
            return None

    def certificate_check(self, certificate, valid, host):
        # Accept any certificate, even an invalid one: handy with self-signed
        # certificates in a lab, but insecure anywhere else
        return True

    def transfer_progress(self, stats):
        print("Retrieved objects: {}/{}".format(stats.indexed_objects, stats.total_objects), end="\r")
def abort(logger_reason, exit_reason=None):
    """
    Exit printing a message on stderr and logging the exception using the logging facility
    Args:
        logger_reason: string or exception object with the reason for exiting to be logged using the logging facility
        exit_reason: string or exception object with the reason for exiting to print on stderr - if omitted it is
                     automatically set with the value of the logger_reason parameter
    Returns: nothing
    """
    if exit_reason is None:
        exit_reason = logger_reason
    # Keep the log entry on a single line and free from double quotes
    logger.error(str(logger_reason).replace('"', "'").replace("\n", " ").replace("\r", ""))
    logger.error("Script aborted")
    print(exit_reason, file=sys.stderr)
    sys.exit(1)
def load_credentials(file):
    """
    Load the credentials from a file
    Args:
        file: the path to the credentials' file
    Returns:
        an object with the contents of the credentials file
    """
    try:
        contents = configparser.ConfigParser()
        if not os.path.isfile(file):
            raise FileNotFoundError(
                errno.ENOENT, os.strerror(errno.ENOENT), file)
        if not os.access(file, os.R_OK):
            raise PermissionError(
                errno.EACCES, "Unable to open file for reading", file)
        contents.read(file)
        return contents
    except (FileNotFoundError, PermissionError) as e:
        abort(e)
    except configparser.Error:
        abort("{} is not an INI file".format(file))
def init_logging():
    """
    Initialize the Python's standard logging facility
    Formatters are automatically set by this function, verbosity can be set by exporting the LOGLEVEL
    environment variable as defined by https://docs.python.org/3/library/logging.html#levels.
    If necessary, it is possible to alter these formats by creating the logging.conf file in the configuration
    directory - the format is the one of the Python's standard logging facility
    https://docs.python.org/3/library/logging.config.html#configuration-file-format
    Returns: the configured logger
    """
    if os.path.isfile("{}/{}".format(config_dir, log_config_file)):
        logging.config.fileConfig(fname="{}/{}".format(config_dir, log_config_file), disable_existing_loggers=False)
        log = logging.getLogger(__name__)
        print("logging using '{}' file".format("{}/{}".format(config_dir, log_config_file)))
    else:
        log = logging.getLogger(__name__)
        handler = logging.StreamHandler()
        formatter = logging.Formatter("%(asctime)s %(name)-12s %(levelname)-8s %(message)s")
        handler.setFormatter(formatter)
        log.addHandler(handler)
        log.setLevel(logging.INFO)
    log.setLevel(logging_loglevels[os.getenv("LOGLEVEL", "INFO")])
    return log
def git_clone(url, repo_branch="master", dest_dir=None):
    """
    Clone a Git repository
    Args:
        url: Git repository's URL - both HTTP and SSH transports are supported
        repo_branch: name of the branch to check out while cloning the repository
        dest_dir: directory on the filesystem where to store the cloned repository
    Returns:
        an object representing the cloned repository
    """
    repo_name = re.sub(r"\.git$", "", re.sub("/$", "", url).rsplit("/", 1)[-1])
    logger.info("Cloning '{}', branch '{}'".format(url, repo_branch))
    if dest_dir is None:
        dest_dir = os.path.join("/tmp", repo_name)
    try:
        if url.startswith("http"):
            logger.debug("Cloning using HTTP transport")
            repo = pygit2.clone_repository(url, dest_dir, checkout_branch=repo_branch, depth=1,
                                           callbacks=GitCallbacks(user=credentials["git"]["username"],
                                                                  token=credentials["git"]["password"]))
        else:
            logger.debug("Cloning using SSH transport")
            repo = pygit2.clone_repository(url, dest_dir, checkout_branch=repo_branch, depth=1,
                                           callbacks=GitCallbacks(priv_key="sshkey", pub_key="sshkey.pub",
                                                                  passphrase=credentials["git"]["password"]))
    except (GitError, ValueError) as e:
        abort(e)
    return repo
def update_metadata(changelog_file):
    """
    Update the metadata file, bumping the release version, and commit the change
    Args:
        changelog_file: path to the metadata file within the cloned Git repository
    Returns: the ID of the newly created commit
    """
    try:
        with open(os.path.join(repo_dir, changelog_file), "r") as file:
            contents = file.read()
        version_match = re.search(r"release:\s+(\d+\.\d+\.\d+)", contents)
        if version_match is None:
            abort("Unable to guess the project's current version")
        project_version = version_match.group(1)
        major, minor, patchlevel = map(int, project_version.split("."))
        logger.debug("Project version tokens: MAJOR={}, MINOR={}, PATCH={}".format(major, minor, patchlevel))
        new_version = "{}.{}.{}".format(major, minor, patchlevel + 1)
        logger.info("Bumping from version '{}' to version '{}'".format(project_version, new_version))
        with open(os.path.join(repo_dir, changelog_file), "r+") as file:
            contents = file.readlines()
            file.seek(0)
            file.truncate()
            for line in contents:
                # Only the "release:" line is rewritten, any other line is written back unchanged
                file.write(re.sub(r"^release:.*$", "release: {}".format(new_version), line))
        logger.debug("File '{}' successfully updated".format(changelog_file))
    except FileNotFoundError as e:
        abort(e)
    try:
        # Stage the modified file
        index = repository.index
        index.add(changelog_file)
        index.write()
        author = pygit2.Signature(credentials["git"]["user"], credentials["git"]["email"])
        committer = pygit2.Signature(credentials["git"]["user"], credentials["git"]["email"])
        message = "Bumping to version '{}'".format(new_version)
        # Create the commit on top of the current HEAD
        tree = index.write_tree()
        ref = repository.head.name
        parents = [repository.head.target]
        return repository.create_commit(ref, author, committer, message, tree, parents)
    except (GitError, OSError) as e:
        abort(e)
def git_push(branch="", tag="", remote="origin"):
    """
    Push a branch or a tag to the specified remote
    Args:
        branch: name of the branch to push
        tag: name of the tag to push
        remote: name of the remote to push to
    Returns: nothing
    """
    try:
        # If neither a branch nor a tag is specified, push the reference HEAD points to
        if branch == "" and tag == "":
            what = repository.head.name
        elif branch != "":
            what = "refs/heads/{}".format(branch)
        else:
            what = "refs/tags/{}".format(tag)
        logger.info("Pushing changes to git")
        # The leading "+" forces the update on the remote, even when it is not a fast-forward
        if credentials["git"]["username"] and credentials["git"]["password"]:
            repository.remotes[remote].push(["+" + what],
                                            callbacks=GitCallbacks(user=credentials["git"]["username"],
                                                                   token=credentials["git"]["password"]))
        else:
            repository.remotes[remote].push(["+" + what],
                                            callbacks=GitCallbacks(priv_key="sshkey", pub_key="sshkey.pub",
                                                                   passphrase=credentials["git"]["password"]))
    except (GitError, ValueError) as e:
        abort(e)
default_config_dir = "{}/conf".format(os.path.dirname(os.path.dirname(os.path.realpath(__file__))))
parser = argparse.ArgumentParser(description="Example script showing how to use pygit2")
parser.add_argument("-c", "--config-dir", dest="config_dir", default=default_config_dir,
help="path to the configuration directory")
args = parser.parse_args()
config_dir = args.config_dir
logger = init_logging()
logger.info("Script started")
credentials = load_credentials("{}/{}".format(config_dir, credentials_file))
repository = git_clone(repo_url, branch, repo_dir)
last_commit = repository.revparse_single("HEAD")
new_branch = repository.branches.local.create("bump", last_commit)
repository.checkout(repository.branches.local["bump"])
cmit = update_metadata("meta.yml")
git_push()
logger.info("Script ended")
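Note that git_push in the script above always prefixes the refspec with "+", which asks the remote to accept the update even when it is not a fast-forward. A minimal sketch of how the refspec selection could be isolated, making the forced update opt-in, is the following. The `build_refspec` function and its `head_name` and `force` parameters are hypothetical names introduced for illustration: in the script, `head_name` would be the value of `repository.head.name`.

```python
def build_refspec(branch="", tag="", head_name="refs/heads/master", force=False):
    """Build the refspec string to pass to the push method of a remote."""
    if branch and tag:
        raise ValueError("specify either a branch or a tag, not both")
    if branch:
        ref = "refs/heads/{}".format(branch)
    elif tag:
        ref = "refs/tags/{}".format(tag)
    else:
        # Neither a branch nor a tag: fall back to the current HEAD reference
        ref = head_name
    # A leading "+" requests a forced (non fast-forward) update of the remote ref
    return "+" + ref if force else ref
```

The returned string can then be wrapped in a list and passed to `repository.remotes[remote].push(...)` exactly as done in git_push.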
Footnotes
This concludes our tutorial on Git with Python - I hope you enjoyed it. As we saw, GitPython and PyGit2 are the most commonly used libraries: the first wraps the git command line tool, exposing its data as Python objects. The latter is instead a set of Python bindings to the libgit2 library, with the downside of an inherited dependency on libssh2, which brings the drawback of having the SSH transport disabled on recent Linux distributions.
Last but not least, we saw the syntax for the most common use cases, along with a full-featured example script for each library.
If you appreciate this effort and you like this post and the other ones, please share them on LinkedIn - sharing and comments are an inexpensive way to encourage me to keep writing - this blog makes sense only if it gets visited.