Reposurgeon
Reposurgeon is a tool by Eric Raymond for editing version-control repositories. It can also migrate a repository from one system to another, for example from Subversion to Git. Many large software projects[1] have migrated from Subversion to Git, and reposurgeon is one of the few tools that can handle large migrations.
There are many other import tools, and even bi-directional bridges between version-control systems. However, reposurgeon is probably the best tool for large, complex migrations.
One of my requirements was that the user should not have to declare the branch structure! You'll be able to read the detailed rules on reposurgeon 2.0's manual page; the short version is that if trunk is present, then trunk, branches/*, and tags/* are treated as candidate branches, and so is every other directory immediately under the repository root. But: a candidate branch is turned into a tag if there are no commits after the copy that created it.
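The rule can be illustrated with a small shell sketch. It is only a simplified reading of the description above, not reposurgeon's implementation: the function names candidate_branch and final_kind are hypothetical, paths are assumed to be relative to the repository root and to lie inside a directory, and a top-level trunk directory is assumed to exist.

```shell
#!/bin/bash
# Illustration only: a simplified reading of the branch-detection rule,
# not reposurgeon's actual code. Function names are hypothetical.

# candidate_branch PATH
# Print the candidate branch that a repository path belongs to.
# Returns 1 for an empty path or a bare branches/ or tags/ container.
candidate_branch() {
    local path first rest second
    path=${1#/}                 # drop a leading slash, if any
    first=${path%%/*}           # first path component
    case $first in
        '')
            return 1
            ;;
        branches|tags)
            rest=${path#*/}     # everything after the first component
            if [ "$rest" = "$path" ] || [ -z "$rest" ]; then
                return 1        # nothing below the container directory
            fi
            second=${rest%%/*}
            printf '%s/%s\n' "$first" "$second"
            ;;
        *)
            # trunk and every other top-level directory
            printf '%s\n' "$first"
            ;;
    esac
}

# final_kind COMMITS_AFTER_COPY
# A candidate with no commits after the copy that created it becomes a tag.
final_kind() {
    if [ "$1" -eq 0 ]; then
        echo tag
    else
        echo branch
    fi
}
```

For example, candidate_branch branches/release-1/x.c prints branches/release-1, and final_kind 0 prints tag.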
Docs
- http://www.catb.org/~esr/reposurgeon
- http://www.catb.org/~esr/reposurgeon/features.html
- http://www.catb.org/~esr/reposurgeon/reposurgeon.html
- http://www.catb.org/~esr/reposurgeon/dvcs-migration-guide.html
Usage
Reposurgeon works in two modes: interactive and scripted. The sample file below illustrates the scripted approach. You want to work on a complete Subversion repository, not a working copy, so first either make a filesystem copy of the repository or create a dump with svnadmin dump.
grundlett@hq-1:~$ cat bin/reposurgeon-myheart.sh
#!/bin/bash
PROJECT=myheart
svnrepo=file:///mount/data/repositories/$PROJECT
# or something like svnrepo=https://svn.example.com/$PROJECT
gitrepo=/tmp/$PROJECT-git
cd /tmp
# start over with:
#rm -rf $PROJECT-mirror/ $PROJECT-git/
echo
echo pull/copy the repository...
# repopuller $svnrepo
# or copy it if it is on the same server:
#cp -av /mount/data/repositories/$PROJECT /tmp/$PROJECT-mirror
echo
echo start conversion...
reposurgeon <<EOF
read /tmp/$PROJECT-mirror
prefer git
edit
references lift
rebuild $gitrepo
EOF
echo ...finished
# now filter out all falsely generated .gitignore files:
cd $gitrepo/
git filter-branch --force --index-filter \
"git rm --cached --ignore-unmatch $(find . -name .gitignore|xargs )" \
--prune-empty --tag-name-filter cat -- --all
Why does reposurgeon generate .gitignore files[2]? Partly because it converts svn:ignore properties, and partly because some tools (git-svn) introduce .gitignore files into the Subversion repository. I also believe it may have to do with empty directory commits.[3]
See below for a description of the git filter-branch command.
Examples
- http://blog.runtux.com/2014/04/18/233/
- https://github.com/cmusatyalab/coda-git-conversion/blob/master/lwp.lift
Re-writing history
This content is probably best associated with Git, but it deserves mention here since many migrations involve manipulating history. With Git, you can rewrite history using the filter-branch command:
- drop all empty changesets
git filter-branch --commit-filter 'git_commit_non_empty_tree "$@"'
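As written, the command rewrites only the current branch. The sketch below wraps it so that it runs inside a given repository and covers every ref, by passing -- --all as the conversion script above does. The function name drop_empty_commits is hypothetical. FILTER_BRANCH_SQUELCH_WARNING is an environment variable that recent Git versions check to skip the warning and pause they otherwise print before filter-branch starts.

```shell
#!/bin/bash
# Sketch only: drop_empty_commits is a hypothetical helper name.

# drop_empty_commits REPO_DIR
# Rewrite every ref in REPO_DIR, skipping commits whose tree is identical
# to the tree of their single parent (that is, commits that change nothing).
drop_empty_commits() {
    (
        cd "$1" || exit 1
        # --force overwrites a leftover refs/original backup from an earlier run.
        # --tag-name-filter cat keeps tag names and points them at the rewritten commits.
        FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch --force \
            --commit-filter 'git_commit_non_empty_tree "$@"' \
            --tag-name-filter cat -- --all
    )
}
```

Run it on a scratch clone first, for example drop_empty_commits /tmp/myheart-git, since filter-branch replaces the existing history and keeps only a backup under refs/original.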
Verification
List branches sorted by the date of their most recent commit, newest first:
git for-each-ref --sort=-committerdate --format='%(refname:short)' refs/heads/
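A slightly extended variant is sketched below. It prints the commit date next to each branch name and adds a total commit count that can be noted before and after a history rewrite. The function names list_branches_by_date and count_commits are hypothetical; the Git options and format fields are standard ones.

```shell
#!/bin/bash
# Sketch only: both function names are hypothetical helpers.

# list_branches_by_date [REPO_DIR]
# Print "YYYY-MM-DD branch" for each local branch, newest commit first.
list_branches_by_date() {
    git -C "${1:-.}" for-each-ref --sort=-committerdate \
        --format='%(committerdate:short) %(refname:short)' refs/heads/
}

# count_commits [REPO_DIR]
# Print the number of commits reachable from any ref.
count_commits() {
    git -C "${1:-.}" rev-list --all --count
}
```

Both functions default to the current directory, so inside the converted repository they can be called without arguments.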
Other
- svncutter is another tool, written in Python, by Eric Raymond. svncutter is for stream surgery on SVN dump files.
- repopuller comes with reposurgeon, and is a bash script that uses svnsync to create a repository 'mirror' that can be used for reposurgery without having to hit the production repository.
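For orientation, the general svnsync mirroring procedure can be sketched as follows. This is not repopuller's own code, only the usual sequence of svnadmin and svnsync commands. The function names run and mirror_svn, the DRY_RUN switch, and the example URL and path are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch only: not repopuller's code. run, mirror_svn and DRY_RUN are
# hypothetical names introduced for this illustration.

# run CMD...
# Execute a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# mirror_svn SOURCE_URL MIRROR_DIR
# Create a local mirror of SOURCE_URL in MIRROR_DIR (an absolute path).
mirror_svn() {
    local source_url mirror hook
    source_url=$1
    mirror=$2
    hook=$mirror/hooks/pre-revprop-change

    run svnadmin create "$mirror" || return 1
    if [ "${DRY_RUN:-0}" != 1 ]; then
        # svnsync sets revision properties on the mirror, which Subversion
        # only allows when this hook exists and exits successfully.
        printf '#!/bin/sh\nexit 0\n' > "$hook" || return 1
        chmod +x "$hook" || return 1
    fi
    run svnsync init "file://$mirror" "$source_url" || return 1
    run svnsync sync "file://$mirror"
}
```

Calling DRY_RUN=1 mirror_svn https://svn.example.com/myheart /tmp/myheart-mirror prints the three commands without touching anything, which is a cheap way to check the paths before a long synchronization.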