Adding GitHub mirror for SVN repository

I store the code of my projects in Subversion repositories on this server (the one that hosts this website). I thought about switching to GitHub, but I don’t like the idea that the main code repository of my project would be stored on a third-party server. Not sure why. Also, I didn’t want to lose the ability to count downloads and checkouts (I use this data to gauge the popularity of projects), but I have already lost it, as many of my projects have been forked. Anyway, the reason is not that important for this article, as you may have your own reasons for doing the same.

GitHub is the most popular project hosting service nowadays. Because of this, many developers who want to make changes to my projects upload their forks to this service. Actually, these projects include not only my code: if someone submits a patch that fixes some issue or implements some feature and that is good enough (in my opinion), I usually merge it (making a note in the CREDITS file). Forks on GitHub can have such changes too, but, because they are not linked to the original projects in any way, I often do not know about their existence. So, to include such changes I need to search for forks (which I did recently). Needless to say, this would be much easier, in particular thanks to GitHub’s social networking capabilities, if these forks were linked to my projects. But for this, I would need to host my code repositories on GitHub.

So, I found a workaround: I decided to create mirrors of my projects on GitHub. In this article I’ll share how I do this.

Creating a mirror repository

Thanks to Git-SVN, each Subversion repository can also be represented as a Git repository. Such a Git copy preserves the link to the original repository, so Git commits to the copy can be exported to SVN, and new Subversion commits can be imported into the Git copy. For mirrors I need only the latter (as I’m not going to make or allow new commits through Git).

I decided to store my Git mirrors under the /var/lib/git directory (considering that I store Subversion repositories under /var/lib/svn). My Subversion repositories are owned and managed by the user www-data, so the Git mirrors should be owned and managed by the same user (as we are going to call git from the Subversion post-commit hook, which is described below). Therefore, the directory /var/lib/git should be owned by www-data too.

So, let’s create a new Git mirror (run this from /var/lib/git):

git$ git svn init http://svn.s-andy.com/issue-id issue_id

Some people recommend using the -s argument here, but this did not work for me. As I understand it, this argument is needed to correctly import the standard Subversion layout (trunk, branches and tags), but I do not use any branches in my SVN repositories.

This creates an empty Git repository issue_id, so it’s not a mirror yet. I could have used git svn clone to create the repository and import all Subversion commits in one go, but before importing I need to configure the mirror repository, e.g., to associate Subversion committers with GitHub users (otherwise, GitHub will show only the username, without the link to the user’s GitHub profile). That’s the reason why I used git svn init instead. As all my Subversion repositories share the same (Redmine) user base and, therefore, the same committer base will be used for all Git repositories, I created the committers list file in the root directory, i.e., in /var/lib/git. The name I chose for the file is committers.lst, but it can be anything. Here is what’s inside it:

s-andy = Andriy Lesyuk <s-andy@andriylesyuk.com> 
(no author) = Andriy Lesyuk <s-andy@andriylesyuk.com>

You may need the second entry if the committer of some commits is unknown. Without this entry, Git-SVN may halt when importing such commits.
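If a repository has many committers, the left-hand side of these entries can be collected from the Subversion log. The following is only a minimal sketch: the function name authors_skeleton and the placeholder "Full Name <user@example.com>" are made up for illustration, and it assumes the usual svn log --quiet output, where each revision line looks like "r12 | s-andy | 2015-...". The real names and e-mail addresses still have to be filled in by hand.

```shell
#!/bin/sh
# Hypothetical helper (not part of Git or Subversion): reads the output of
# `svn log --quiet` on stdin and prints one skeleton line per distinct committer.
authors_skeleton() {
    # Revision lines look like "r12 | s-andy | <date>"; field 2 is the committer.
    awk -F ' [|] ' '/^r[0-9]+ [|] / {
        printf "%s = Full Name <user@example.com>\n", $2
    }' | LC_ALL=C sort -u
}

# Usage (edit the result by hand afterwards):
#   svn log --quiet http://svn.s-andy.com/issue-id | authors_skeleton > committers.lst
```

Committers without a name show up as (no author), so the second entry above gets a skeleton line too.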

When the committers file is ready, you need to register it in your mirror Git repository. To do this, execute the following command (from the root directory of the Git repository):

issue_id$ git config svn.authorsfile ../committers.lst

Now everything is ready to import the Subversion commits, which can be done by running:

issue_id$ git svn fetch

Uploading the mirror repository

This should work not only for GitHub, but for other Git servers as well.

After that, the Git mirror is ready, but it’s not yet associated with GitHub. If you have not created an empty repository on GitHub yet, now is the time to do it. When ready, add the GitHub repository as the origin remote of your local mirror as follows:

issue_id$ git remote add origin git@github.com:s-andy/issue_id.git

Normally, after that everything is ready to run git push, but not in our case, as we use a system user (www-data). So, first we need to make this user able to push with Git.

The following steps can be skipped for further mirrors. After you create the github-update.sh script, as described in the next section, you will be able to run that script to upload mirrors to GitHub.

To upload data to GitHub, Git will use SSH, which, in turn, will need to store configuration data somewhere. Normally, it does this under the .ssh subdirectory of the user’s home directory. But for www-data the home is /var/www, and we should not store any such data there! In this particular case, SSH will need to save known hosts and to store the authentication key. Let’s see how we can keep this data under the .ssh subdirectory of /var/lib/git.

To make SSH save known hosts in /var/lib/git/.ssh/known_hosts we can use the special environment variable GIT_SSH_COMMAND as follows:
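That directory has to exist first, and, as it will hold a private key, it is best readable by its owner only. A minimal sketch, assuming the commands are run as www-data (the function name make_ssh_dir is hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: creates <base>/.ssh and restricts it to its owner.
make_ssh_dir() {
    base=$1
    mkdir -p "$base/.ssh" && chmod 700 "$base/.ssh"
}

# Usage (as www-data, the owner of /var/lib/git):
#   make_ssh_dir /var/lib/git
```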

export GIT_SSH_COMMAND="ssh -o UserKnownHostsFile=/var/lib/git/.ssh/known_hosts"

However, this works only with Git 2.3 or above. If your Git version is older, as in my case currently, the simplest option I know of is to store the file in /var/www/.ssh/known_hosts (for this, you should allow SSH to create this file as the www-data user, after which you can make it read-only). Generally, this should be safe, as the file will contain only the public host key of the GitHub server (which is publicly known anyway).

To store the authentication key, you need to get one first. To create such a key, issue the following command inside the /var/lib/git/.ssh directory (leave the passphrase empty, as the key will be used unattended):
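For older Git versions, the GIT_SSH environment variable may be an alternative worth trying: it takes the path of a program to run instead of ssh (not a command line), so the option has to go into a small wrapper script. Below is a minimal sketch under these assumptions; the function names git_has_ssh_command and write_ssh_wrapper and the wrapper path /var/lib/git/ssh-wrapper.sh are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given `git --version` string
# reports Git 2.3 or newer (the versions that understand GIT_SSH_COMMAND).
git_has_ssh_command() {
    version=${1#git version }   # "git version 2.1.4" -> "2.1.4"
    major=${version%%.*}        # "2"
    rest=${version#*.}          # "1.4"
    minor=${rest%%.*}           # "1"
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 3 ]; }
}

# Hypothetical helper: writes a tiny executable wrapper around ssh that adds
# the UserKnownHostsFile option. The path must not contain spaces.
write_ssh_wrapper() {
    wrapper=$1
    known_hosts=$2
    printf '#!/bin/sh\nexec ssh -o UserKnownHostsFile=%s "$@"\n' "$known_hosts" > "$wrapper"
    chmod 755 "$wrapper"
}

# Usage:
#   if git_has_ssh_command "$(git --version)"; then
#       export GIT_SSH_COMMAND="ssh -o UserKnownHostsFile=/var/lib/git/.ssh/known_hosts"
#   else
#       write_ssh_wrapper /var/lib/git/ssh-wrapper.sh /var/lib/git/.ssh/known_hosts
#       export GIT_SSH=/var/lib/git/ssh-wrapper.sh
#   fi
```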

.ssh$ ssh-keygen -t rsa -b 4096 -C '<your_email@example.com>' -f id_rsa

After that, add the content of the generated id_rsa.pub file (the public key) as an SSH key to your GitHub account (be sure that the file you copy the content from has the .pub extension!). This can be done using (profile) Settings → SSH and GPG keys:

[Screenshot: New SSH key on GitHub]

Now, to use the generated key to access GitHub, you need to register it in your local SSH agent (the tool that provides such keys to the SSH client). So, first start the SSH agent as follows:

$ eval $(ssh-agent -s)

Here, I assume that you do this in a Bourne-style shell such as Bash or Dash (the default shells on most Linux distributions). If you use a csh-style shell, you may need to replace -s with -c.

Now, register the new key in the SSH agent by running the following command:

$ ssh-add /var/lib/git/.ssh/id_rsa

Finally, we can upload all the imported commits to GitHub by running:

issue_id$ git push origin master

After that, you can check the mirror repository on GitHub.

All exported commits will have git-svn-id in their messages. This is the ID that associates Git commits with Subversion ones. Its addition to commit messages can be avoided by creating Git repositories using git svn init with the --no-metadata argument, but, as I understand it, in that case you won’t be able to fetch new commits from the Subversion repository (I did not check this).

Finally, it’s worth stopping the SSH agent, which otherwise will run forever (or until the next reboot). This can be done by issuing:
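The git-svn-id line has the form "git-svn-id: URL@REVISION UUID", so the Subversion revision of a mirrored commit can be read back from its message. A minimal sketch (the function name svn_revision_of is hypothetical; the UUID in the usage note is a made-up example):

```shell
#!/bin/sh
# Hypothetical helper: reads a commit message on stdin and prints the
# Subversion revision number found in its git-svn-id line, if any.
svn_revision_of() {
    sed -n 's/^git-svn-id: .*@\([0-9][0-9]*\) [^ ]*$/\1/p'
}

# Usage (inside the mirror repository):
#   git log -1 --format=%B | svn_revision_of
```

For a message ending with "git-svn-id: http://svn.s-andy.com/issue-id@42 1a2b3c4d-0000-0000-0000-000000000000" this prints 42.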

$ ssh-agent -k

Syncing repositories

Now the Git repository that was uploaded to GitHub is a full mirror of the Subversion repository, but, of course, it won’t automatically include new changes that are committed to the original repository. So, that’s what we are going to fix right now.

Let’s start by creating a script that will fetch new commits from the Subversion repository and upload them to the Git repository. I put this script into the file /usr/local/bin/github-update.sh. Here it is:

#!/bin/sh
 
MIRRORS_PATH=/var/lib/git
REPOSITORY=$1
 
[ -z "$REPOSITORY" ] && exit 0
[ ! -d "$MIRRORS_PATH/$REPOSITORY" ] && exit 0
 
export GIT_SSH_COMMAND="ssh -o UserKnownHostsFile=/var/lib/git/.ssh/known_hosts"
 
cd "$MIRRORS_PATH/$REPOSITORY"
 
/usr/bin/ssh-add "$MIRRORS_PATH/.ssh/id_rsa" 2> /dev/null
 
/usr/bin/git svn rebase > /dev/null 2>&1
/usr/bin/git push origin master 2> /dev/null

Most commands in this script are already known to you. The only difference is that instead of git svn fetch we call git svn rebase here. Both commands fetch new commits from the Subversion repository, but rebase also applies them to the current branch, which is why it is the recommended way to update the mirror. Besides, as you can see, the script requires one argument: the name of the Git repository (in my case, it can be different from the name of the Subversion repository).

To make sure that all new commits in the Subversion repository will be automatically pushed to the Git mirror, we will call this script from the Subversion repository’s post-commit hook, which is executed (if present) right after a new commit arrives. If your Subversion repository does not have such a hook yet, now is the time to add it. So, go to your repository, which in my case is located in /var/lib/svn/issue-id, find the hooks subdirectory there and create the file post-commit (without any extension, as Subversion looks for exactly this name), if it’s not there yet, of course. Also, make sure that this file is executable by the user that manages the repository (www-data in my case).

In the newly created file we need to start the SSH agent and call the script that we created before. This can be done with a single command in the following way:
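Note that the script above discards all output and pushes even if the rebase failed. If you prefer to notice failures, the sync step could be written as a function that stops at the first error. This is only a sketch of such a variation, not the script I use; the function name sync_mirror and the exit codes are hypothetical.

```shell
#!/bin/sh
# Hypothetical variant of the sync step: pushes only if the rebase succeeded
# and reports what failed through the exit status:
#   1 - mirror directory missing, 2 - git svn rebase failed, 3 - git push failed.
sync_mirror() {
    mirror_dir=$1
    [ -d "$mirror_dir" ] || return 1
    (
        # The subshell keeps the caller's working directory untouched.
        cd "$mirror_dir" || exit 1
        git svn rebase > /dev/null 2>&1 || exit 2
        git push origin master > /dev/null 2>&1 || exit 3
    )
}

# Usage:
#   sync_mirror /var/lib/git/issue_id || echo "mirror sync failed: $?" >&2
```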

#!/bin/sh
 
ssh-agent /usr/local/bin/github-update.sh issue_id

By passing /usr/local/bin/github-update.sh as an argument to ssh-agent we make sure that the SSH agent will stop right after the script finishes its job. In this way we avoid calling ssh-agent -k or something like this.

Now everything is ready to test whether new commits arrive in the Git repository. So, go and add some useful changes and commit them to SVN.
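Subversion passes the repository path and the new revision number to the post-commit hook as its first and second arguments, so the same hook file could be shared between repositories instead of hardcoding the mirror name. A minimal sketch, assuming each mirror is named after the Subversion repository directory with hyphens replaced by underscores (as with issue-id and issue_id here; this naming rule and the function name mirror_name_for are hypothetical):

```shell
#!/bin/sh
# Hypothetical helper: derives the Git mirror name from the Subversion
# repository path, e.g. /var/lib/svn/issue-id -> issue_id.
mirror_name_for() {
    basename "$1" | sed 's/-/_/g'
}

# Usage inside hooks/post-commit ($1 is the repository path):
#   ssh-agent /usr/local/bin/github-update.sh "$(mirror_name_for "$1")"
```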

Showing mirrors in Redmine

This is an optional step that I personally don’t do currently.

Since version 2.x, Redmine supports multiple repositories in one project. So, if you have registered your Subversion repository in Redmine, you can also register its Git mirror.

If your primary Subversion repository does not have an identifier in Redmine, you can start by specifying it. This can be done in Settings → Repositories → Edit of the project. For example, you can use main-repository as the identifier. You should also make sure that the Git SCM is enabled in Administration → Settings → Repositories (without this you won’t be able to register Git repositories).

While this step can be skipped, I highly recommend specifying a regular expression for allowed repository paths for the Git SCM. Otherwise, some users can get the ability, e.g., to read your SSH keys, which are stored in /var/lib/git/.ssh. Such a regular expression can be specified by setting the scm_git_path_regexp option in Redmine’s configuration.yml as follows (an example):

scm_git_path_regexp: /var/lib/git/[a-z][a-z_]+
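To get a feeling for which paths such an expression lets through, it can be tried against a few candidates with grep. This is only an approximation: it assumes that Redmine requires the whole path to match the expression (hence the -x option), and the function name path_allowed is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the whole path matches the example
# expression from configuration.yml.
path_allowed() {
    printf '%s\n' "$1" | grep -Eqx '/var/lib/git/[a-z][a-z_]+'
}

# Usage:
#   path_allowed /var/lib/git/issue_id && echo allowed
#   path_allowed /var/lib/git/.ssh     || echo rejected
```

Note that this example expression also rejects mirror names containing digits or hyphens, so adjust it to your own naming.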

If you use my SCM Creator plugin, you should also make sure that allow_add_local is enabled (you can read more about this option here).

After that, you can create a new Git repository in Settings → Repositories of your Redmine project. For Identifier specify, e.g., github-mirror, and for URL specify /var/lib/git/issue_id.

 
