Reposurgeon

From Freephile Wiki

Reposurgeon is a tool by Eric Raymond for editing version-control repositories. It can also migrate a repository from one system to another, for example from Subversion to Git. Many large software projects[1] have migrated from Subversion to Git, and reposurgeon is one of the few tools that can handle a migration of that size.

Many other tools can import repositories, and some even provide bi-directional interfaces between version-control systems. However, reposurgeon is probably the best tool for large, complex migrations.

From January 2012:

One of my requirements was that the user should not have to declare the branch structure! You'll be able to read the detailed rules on reposurgeon 2.0's manual page; the short version is that if trunk is present, then trunk, branches/*, and tags/* are treated as candidate branches, and so is every other directory immediately under the repository root. But: a candidate branch is turned into a tag if there are no commits after the copy that created it.
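
As a rough illustration of the short version of that rule, here is a minimal shell sketch. It is not reposurgeon's actual code: the function names is_candidate_branch and classify, and the commits-after-copy count passed in as an argument, are assumptions made for this example. The manual page remains the reference for the detailed rules.

```shell
#!/bin/bash
# Sketch of the quoted "short version" only; NOT reposurgeon's implementation.
# Assumes the repository has a trunk directory at its root.

# is_candidate_branch PATH
# PATH is relative to the repository root, without leading or trailing slash.
is_candidate_branch() {
    case "$1" in
        trunk)                    return 0 ;;  # trunk itself
        branches/*/* | tags/*/*)  return 1 ;;  # deeper than branches/X or tags/X
        branches/?* | tags/?*)    return 0 ;;  # branches/X and tags/X
        branches | tags | "")     return 1 ;;  # the container directories
        */*)                      return 1 ;;  # anything else below the top level
        *)                        return 0 ;;  # every other top-level directory
    esac
}

# classify PATH COMMITS_AFTER_COPY
# COMMITS_AFTER_COPY is a hypothetical input: the number of commits made to
# PATH after the copy that created it. Prints "branch", "tag", or "ignored".
classify() {
    if ! is_candidate_branch "$1"; then
        echo ignored
    elif [ "$2" -eq 0 ]; then
        echo tag       # no commits after the creating copy
    else
        echo branch
    fi
}
```

With these definitions, classify tags/v1.0 0 prints tag, while classify branches/feature 3 prints branch.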

Docs

  1. http://www.catb.org/~esr/reposurgeon
  2. http://www.catb.org/~esr/reposurgeon/features.html
  3. http://www.catb.org/~esr/reposurgeon/reposurgeon.html
  4. http://www.catb.org/~esr/reposurgeon/dvcs-migration-guide.html


Usage

Reposurgeon works in two modes: interactive and scripted. The sample file below illustrates the scripted approach. You want to work on a complete Subversion repository, not a working copy, so first either make a filesystem copy of the repository or create a dump of it with svnadmin dump.
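
The dump route can be sketched as follows. This is a minimal example, not part of the sample script: the function names looks_like_svn_repo and dump_svn_repo are made up for illustration, and the check is only a heuristic based on the usual on-disk layout (a repository has a format file and a db directory, a working copy has a .svn directory).

```shell
#!/bin/bash
# looks_like_svn_repo DIR
# Heuristic: a repository created by svnadmin has a "format" file and a "db"
# directory; a working copy has a ".svn" directory instead.
looks_like_svn_repo() {
    [ -f "$1/format" ] && [ -d "$1/db" ] && [ ! -d "$1/.svn" ]
}

# dump_svn_repo REPO_DIR DUMP_FILE
# Writes a full dump of the repository in REPO_DIR to DUMP_FILE.
dump_svn_repo() {
    if ! looks_like_svn_repo "$1"; then
        echo "error: $1 does not look like a Subversion repository" >&2
        return 1
    fi
    svnadmin dump --quiet "$1" > "$2"
}
```

For example, dump_svn_repo /mount/data/repositories/myheart /tmp/myheart.svn would write the dump next to the other working files used in the script below.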


grundlett@hq-1:~$ cat bin/reposurgeon-myheart.sh

#!/bin/bash
PROJECT=myheart
svnrepo=file:///mount/data/repositories/$PROJECT
# or something like svnrepo=https://svn.example.com/$PROJECT

gitrepo=/tmp/$PROJECT-git

cd /tmp || exit 1

# start over with:
#rm -rf "$PROJECT-mirror/" "$PROJECT-git/"

echo
echo pull/copy the repository...
# repopuller $svnrepo
# or copy it if it is on the same server:
#cp -av /mount/data/repositories/$PROJECT /tmp/$PROJECT-mirror
echo
echo start conversion...

reposurgeon <<EOF
read /tmp/$PROJECT-mirror
prefer git
edit
references lift
rebuild $gitrepo
EOF
echo ...finished

# now filter out all falsely generated .gitignore files
# (the find runs once, in the current checkout, before filter-branch starts):
cd "$gitrepo" || exit 1
git filter-branch --force --index-filter      \
 "git rm --cached --ignore-unmatch $(find . -name .gitignore|xargs )"  \
 --prune-empty --tag-name-filter cat -- --all

Why does reposurgeon generate .gitignore files[2]? Partly because it converts svn:ignore properties, and partly because some tools (git-svn) add .gitignore files to the Subversion repository. I also believe it may have to do with empty-directory commits.[3]
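
In the sample script, the list of .gitignore files is produced by find in the current checkout, so a .gitignore that exists only in older commits or on other branches is not removed. A possible variant, sketched here under the assumption that every .gitignore in the history should go, uses a glob pathspec so that each commit is matched on its own. The function name strip_gitignore_history is made up for this example.

```shell
#!/bin/bash
# strip_gitignore_history
# Run inside the converted Git repository. Removes .gitignore at any depth
# from every commit on every ref, and drops commits that become empty.
strip_gitignore_history() {
    # The single quotes keep the pathspec away from the outer shell, so it is
    # evaluated by git for each rewritten commit.
    git filter-branch --force --index-filter \
        'git rm --cached --quiet --ignore-unmatch -- ":(glob)**/.gitignore"' \
        --prune-empty --tag-name-filter cat -- --all
}
```

Recent Git versions print a warning and pause before filter-branch starts; setting the environment variable FILTER_BRANCH_SQUELCH_WARNING=1 skips that pause.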

See below for a description of the git filter-branch command.

Examples

Re-writing history

This content is probably best associated with Git, but it deserves mention here since many migrations involve manipulating history. With Git, you can rewrite history using the filter-branch command:

  • drop all empty changesets
    git filter-branch --commit-filter 'git_commit_non_empty_tree "$@"'
    

Verification

List branches sorted by committer date, most recent first:

git for-each-ref --sort=-committerdate --format='%(refname:short)' refs/heads/

Other

  • svncutter is another tool by Eric Raymond, written in Python, for stream surgery on SVN dump files.
  • repopuller comes with reposurgeon. It is a bash script that uses svnsync to create a repository 'mirror', so that reposurgery can be done without hitting the production repository.
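
For readers who want to see what an svnsync mirror involves, here is a minimal sketch of the general technique. It is not repopuller's actual code; the function name mirror_svn_repo and its arguments are assumptions made for this example.

```shell
#!/bin/bash
# mirror_svn_repo SOURCE_URL MIRROR_DIR
# Creates a new local repository in MIRROR_DIR and copies every revision of
# SOURCE_URL into it with svnsync.
mirror_svn_repo() {
    local source_url=$1 mirror_dir=$2 mirror_url
    # Create an empty local repository to receive the mirror.
    svnadmin create "$mirror_dir" || return 1
    # svnsync records its state in revision properties, so the mirror needs a
    # pre-revprop-change hook that permits those changes.
    printf '#!/bin/sh\nexit 0\n' > "$mirror_dir/hooks/pre-revprop-change"
    chmod +x "$mirror_dir/hooks/pre-revprop-change"
    # Build a file:// URL for the mirror (assumes a path without characters
    # that would need URL escaping).
    mirror_url="file://$(cd "$mirror_dir" && pwd)"
    svnsync init --non-interactive "$mirror_url" "$source_url" || return 1
    svnsync sync --non-interactive "$mirror_url"
}
```

A call such as mirror_svn_repo https://svn.example.com/myheart /tmp/myheart-mirror would leave a local copy in /tmp/myheart-mirror that can be read without further load on the production server.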

References