Difference between revisions of "Git/migrating to git"
(remove CVS mention) |
|||
(10 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
+ | == Simple == | ||
+ | For simple, small migrations, follow the process described by CollabNet in [http://blogs.collab.net/subversion/migrating-subversion-repositories-to-git their blog] (and elsewhere). For anything else, don't do it! git svn clone is not a migration tool <ref>https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools</ref>. | ||
+ | |||
+ | == Summary == | ||
A summary of the steps for migrating your version control system to git from subversion | A summary of the steps for migrating your version control system to git from subversion | ||
Line 26: | Line 30: | ||
<li> Importing your SVN history into Git | <li> Importing your SVN history into Git | ||
<ol> | <ol> | ||
− | <li> | + | <li> If you can migrate using <code>git svn clone</code><ref>https://git-scm.com/docs/git-svn</ref> which is a tool providing a bi-directional conduit of changesets between subversion and git, then good for you! Your project is small and uncomplicated. For larger, more complicated migrations, this tool is not suited for the job. It will take too long, and simply will fail to produce a git repository. I don't understand why Atlassian recommends this approach in their "tutorial" without telling you that it will fail; or at least providing the major caveats. Still, you can read up on the simplistic scenario <ref>https://www.atlassian.com/git/tutorials/migrating-overview</ref>,<ref>https://www.atlassian.com/git/tutorials/migrating-convert/</ref> |
− | <li> Using svn2git on | + | <li> Using svn2git <ref>There are 2 pieces of software by the same name. The one you want was [https://techbase.kde.org/Projects/MoveToGit/UsingSvn2Git#Getting_the_tools created by the KDE team]. You could use the [https://github.com/nirvdrum/svn2git ruby gem by nirvdrum], but it's going to be slower. Unfortunately the KDE code lived on gitorious.org which was bought out by gitlab. They say they're going to put the code up on archive.org, but it's not there and I wouldn't hold my breath. The good news is that the code can be found and is also referred to as [https://github.com/svn-all-fast-export/svn2git svn-all-fast-export]</ref> |
+ | <li>Using [[reposurgeon]] - a tool by Eric Raymond | ||
</ol> | </ol> | ||
<li> Convert svn:ignore properties to .gitignore file (example of why you need to later delete empty commits which reflect properties not code changes) | <li> Convert svn:ignore properties to .gitignore file (example of why you need to later delete empty commits which reflect properties not code changes) | ||
+ | <li>Verification | ||
+ | <source lang="bash"> | ||
+ | mkdir -p /tmp/verify | ||
+ | cd /tmp/verify | ||
+ | for v in 2.0.0 2.1.0 2.2.0 ; do | ||
+ | svn co $v | ||
+ | done | ||
+ | |||
+ | # export git tags | ||
+ | cd /tmp/conv/myproj-git | ||
+ | for v in 2.0.0 2.1.0 2.2.0 ; do | ||
+ | git archive --format=tar --prefix myproj-$v/ v$v | gzip \ | ||
+ | /tmp/verify/git/myproj-$v-git.tar.gz | ||
+ | |||
+ | done | ||
+ | cd /tmp/verify/git | ||
+ | for v in 2.0.0 2.1.0 2.2.0 ; do | ||
+ | tar xvzf myproj-$v-git.tar.gz | ||
+ | done | ||
+ | |||
+ | # compare them | ||
+ | diff -ur /tmp/verify /tmp/verify/git | ||
+ | |||
+ | </source> | ||
+ | |||
+ | <li>Build | ||
+ | |||
+ | <li> Service integrations | ||
+ | <ol> | ||
+ | <li>JIRA / Issue Tracker | ||
+ | <li>ReviewBoard / Code Review | ||
+ | <li>Jenkins / Build system | ||
+ | <li>SQA | ||
+ | <li>OpenGrok / Code indexing browser | ||
+ | </ol> | ||
<li> Client and End User configurations | <li> Client and End User configurations | ||
<ol> | <ol> | ||
Line 42: | Line 82: | ||
<li> Update references in your literature, project documents, websites, systems, reference materials and procedure documents to reference the new systems. This step can be ameliorated if in the beginning you reference code in a generic way such as "code.example.com" where you can then link to various aspects and implementations of your code systems; rather than naming them specifically based on technology or implementation. | <li> Update references in your literature, project documents, websites, systems, reference materials and procedure documents to reference the new systems. This step can be ameliorated if in the beginning you reference code in a generic way such as "code.example.com" where you can then link to various aspects and implementations of your code systems; rather than naming them specifically based on technology or implementation. | ||
</ol> | </ol> | ||
+ | == Post-processing == | ||
+ | |||
+ | * Use the [https://rtyley.github.io/bfg-repo-cleaner/ BFG Repo cleaner] to remove large files, passwords, unwanted paths | ||
+ | * clone it to reduce size <code>git clone file:///path/to/repo</code> | ||
+ | |||
+ | == Lessons Learned == | ||
+ | * https://techbase.kde.org/Projects/MovetoGit | ||
+ | * http://blog.smartbear.com/software-quality/migrating-from-subversion-to-git-lessons-learned/ | ||
+ | |||
+ | * http://www.midwesternmac.com/blogs/jeff-geerling/switching-svn-repository-svn2git | ||
== Caveats == | == Caveats == | ||
Line 49: | Line 99: | ||
Once you've established a git infrastructure for version control, git to git migrations are incredibly easy... at least for the core git repository functions. Just add another remote to push/pull. This means that if you wish to change your git infrastructure to use a different system, the work involved will mostly be about the extra features bundled with the system (e.g. web viewer, code review, etc.) and integrations. | Once you've established a git infrastructure for version control, git to git migrations are incredibly easy... at least for the core git repository functions. Just add another remote to push/pull. This means that if you wish to change your git infrastructure to use a different system, the work involved will mostly be about the extra features bundled with the system (e.g. web viewer, code review, etc.) and integrations. | ||
+ | == Combining git repos == | ||
+ | You might desire to reorganize your code in the migration process. There are several tools which allow you to merge git repositories together. | ||
+ | |||
+ | * [http://search.cpan.org/dist/Git-FastExport/script/git-stitch-repo git-stitch-repo] is made for linear repos | ||
+ | * This other one [http://search.cpan.org/~book/Git-FastExport-0.105/script/git-stitch-repo by the same name], authored by Philippe Bruhat (BooK) is nonetheless capable of [http://www.ifup.org/posts/the-right-tool-for-the-job-git-stitch-repo/ merging two (or more) repositories] | ||
+ | * This one, [https://github.com/robinst/git-merge-repos git-merge-repos] is interesting because it talks about taking multiple repositories with more or less the same branches or tags, and merging them at the tag | ||
+ | * [https://stackoverflow.com/questions/277029/combining-multiple-git-repositories This post on Stackoverflow about combining multiple git repositories] mentions git-stitch-repo, and also how it gained the capability to work with non-linear merge histories. It also explains how to do repo merges with git-filter-branch. Note that git-filter-branch requires you to rewrite your history (breaking SHA1 sums). | ||
+ | * The [http://www.kernel.org/pub/software/scm/git/docs/howto/using-merge-subtree.html subtree merge strategy page on kernel.org] shows you how to do this. | ||
+ | * https://westmarch.j5int.com/2014/06/splicing-git-repositories-together/ [https://github.com/j5int/jbosstools-gitmigration/blob/master/git_fast_filter/testcases/splice_repos.py Splice Repos] is a python script. It's more recent than some others and a better tool because it's based on fast-export/fast-import <ref>https://git-scm.com/docs/git-fast-import</ref>. It grew out of the [https://github.com/j5int/jbosstools-gitmigration JBossTools git migration] (which itself has some useful info on procedures). | ||
+ | * Also, the [[reposurgeon]] tool itself can assist you with the re-organization of your sources. | ||
+ | |||
+ | == Submodules == | ||
+ | Sometimes, 'combining' your work with other work is best accomplished through '''[https://git-scm.com/book/en/v2/Git-Tools-Submodules submodules]'''. Git submodules are a way for you to store other repositories in directories of your project. This is most often used to handle 'vendor' code, or libraries. However, submodules can be used whenever you want to combine repositories, yet maintain them independently. | ||
{{References}} | {{References}} |
Latest revision as of 09:52, 22 December 2015
Simple
For simple, small migrations, follow the process described by CollabNet in their blog (and elsewhere). For anything else, don't do it! git svn clone is not a migration tool [1].
Summary
A summary of the steps for migrating your version control system from Subversion to Git:
- Discussions with stakeholders, project lead
- Leverage the resources and expertise of prior large migrations, including Pro Git (2nd ed.), the Eclipse Foundation, Atlassian, the Wikimedia Foundation, a proposed EclipseCon talk by Max Anderson, Drupal, PostgreSQL, and Pentaho. Be sure to cover the "before", the migration itself, and the "after" migration work.
- Understand the concept of Git repos
- Know the caveats
- Plan and structure your Git space
- Decide what to do with your existing code
- Archive your current SVN repository?
- Import your history into git?
- Do the migration.
- Map users
- Migrations must include at least the following details
- Migration timeline
- mapping of current code to new Git repos
- decision regarding existing code (archive or import)
- A description for each repository (which will be visible in the web view)
- Use scripted recipes for LARGE migrations [2]
- Importing your SVN history into Git
- If you can migrate using git svn clone [3], which is a tool providing a bi-directional conduit of changesets between Subversion and Git, then good for you! Your project is small and uncomplicated. For larger, more complicated migrations, this tool is not suited for the job. It will take too long, and will simply fail to produce a Git repository. I don't understand why Atlassian recommends this approach in their "tutorial" without telling you that it will fail, or at least providing the major caveats. Still, you can read up on the simplistic scenario [4],[5].
- Using svn2git [6]
- Using reposurgeon, a tool by Eric Raymond
- Convert svn:ignore properties to a .gitignore file (an example of why you later need to delete empty commits, which reflect property changes rather than code changes)
- Verification
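    # SVN_TAGS_URL is a placeholder: set it to the URL of your project's svn tags directory
    mkdir -p /tmp/verify/svn /tmp/verify/git

    # check out the svn tags
    cd /tmp/verify/svn
    for v in 2.0.0 2.1.0 2.2.0 ; do
        svn co "$SVN_TAGS_URL/$v" "myproj-$v"
    done

    # export git tags
    cd /tmp/conv/myproj-git
    for v in 2.0.0 2.1.0 2.2.0 ; do
        git archive --format=tar --prefix="myproj-$v/" "v$v" | gzip \
            > "/tmp/verify/git/myproj-$v-git.tar.gz"
    done
    cd /tmp/verify/git
    for v in 2.0.0 2.1.0 2.2.0 ; do
        tar xzf "myproj-$v-git.tar.gz"
    done

    # compare them, ignoring svn metadata and the tarballs themselves
    diff -ur --exclude=.svn --exclude='*.tar.gz' /tmp/verify/svn /tmp/verify/git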
- Build
- Service integrations
- JIRA / Issue Tracker
- ReviewBoard / Code Review
- Jenkins / Build system
- SQA
- OpenGrok / Code indexing browser
- Client and End User configurations
- Create keys, add each to client and server
- Install / setup TortoiseGit for Windows
- Add EGit to Eclipse
- NetBeans has supported Git natively since v7.1
- Repository permissions and group definitions, key imports
- Establish Git Resources
- Create Git Task Force
- Update references in your literature, project documents, websites, systems, reference materials and procedure documents to point to the new systems. This step is easier if, from the beginning, you refer to your code in a generic way such as "code.example.com", from which you can link to the various aspects and implementations of your code systems, rather than naming them after a specific technology or implementation.
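Two of the steps in the list above, mapping users and converting svn:ignore properties, lend themselves to small scripts. The following is a minimal sketch, not a tested migration recipe: the function names and the example.com mail domain are made up for illustration, and the generated files are only a starting point to be edited by hand. The authors file uses the "svnuser = Full Name <email>" form that git svn reads through its --authors-file option.

```shell
#!/usr/bin/env bash
# Sketch only: both function names and the example.com domain are
# hypothetical; edit the generated files by hand before migrating.

# Read `svn log -q` output on stdin and print one authors-file line per
# committer, in the "svnuser = Full Name <email>" form.
svn_log_to_authors() {
    awk -F'|' '/^r[0-9]+ \|/ {
        user = $2
        gsub(/^ +| +$/, "", user)   # trim the padding around the login
        if (user != "") print user " = " user " <" user "@example.com>"
    }' | sort -u
}

# Read one directory's svn:ignore patterns on stdin and print .gitignore
# lines. svn:ignore is not recursive, so each pattern is anchored to its
# directory (given as $1, relative to the repository root).
svn_ignore_to_gitignore() {
    local dir="${1#/}" pattern
    dir="${dir%/}"
    [ "$dir" = "." ] && dir=""
    while IFS= read -r pattern || [ -n "$pattern" ]; do
        [ -z "$pattern" ] && continue
        printf '/%s%s\n' "${dir:+$dir/}" "$pattern"
    done
}
```

In practice the first function would be fed from svn log -q run against your repository, and the second from svn propget svn:ignore run on each directory that carries the property, with the output appended to the top-level .gitignore.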
Post-processing
- Use the BFG Repo-Cleaner to remove large files, passwords, and unwanted paths
- Clone it to reduce size: git clone file:///path/to/repo
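The clone step works because the file:// transport copies only the objects reachable from branches and tags, whereas a plain local-path clone may simply hardlink the whole object store. A minimal sketch, assuming a local repository; the function name and both paths are hypothetical:

```shell
#!/usr/bin/env bash
# Sketch only: the function name and both paths are hypothetical.

# Clone over the file:// transport, which copies only the objects that are
# reachable from branches and tags and leaves unreferenced leftovers behind.
clone_to_shrink() {
    local src dest="$2"
    src=$(cd "$1" && pwd) || return 1    # file:// needs an absolute path
    git clone --quiet "file://$src" "$dest"
}
```

For example, clone_to_shrink /tmp/conv/myproj-git /tmp/conv/myproj-small would leave the cleaned history in the second directory; comparing the two with du -sh shows how much was dropped.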
Lessons Learned
- https://techbase.kde.org/Projects/MovetoGit
- http://blog.smartbear.com/software-quality/migrating-from-subversion-to-git-lessons-learned/
Caveats
A single Subversion repository almost always contains multiple "projects", each with its own 'trunk', 'branches' and 'tags'. One thing you'll find moving from SVN to Git is that Git repositories aren't designed to hold multiple projects the way SVN repositories are used. All the separate projects (from a single SVN repository) should be migrated to separate Git repositories. A tag or branch in a Git repository is always global to the repository, so split the code into repositories along boundaries that make sense semantically. (Note that Git has much better support for including library code in a project. The feature is called "submodules". Library code can thus be split out into its own repository, and that library can be re-used across multiple Git repositories. This is like svn "externals", only better.)
Upside
Once you've established a git infrastructure for version control, git to git migrations are incredibly easy... at least for the core git repository functions. Just add another remote to push/pull. This means that if you wish to change your git infrastructure to use a different system, the work involved will mostly be about the extra features bundled with the system (e.g. web viewer, code review, etc.) and integrations.
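The "add another remote" step can be sketched as follows. This is only an illustration for the core repository data: the function name and the remote name "newhost" are made up, and the URL is whatever your new Git server gives you.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the remote name "newhost" are hypothetical.

# Point an existing clone at a new Git server and push all history to it.
push_to_new_host() {
    local repo="$1" new_url="$2"
    git -C "$repo" remote add newhost "$new_url" &&
    git -C "$repo" push --quiet newhost --all &&    # every local branch
    git -C "$repo" push --quiet newhost --tags      # every tag
}
```

Only local branches and tags are pushed this way, so the clone used for the move should first have every branch you care about checked out locally (or be a mirror clone).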
Combining git repos
You might want to reorganize your code during the migration. Several tools let you merge Git repositories together.
- git-stitch-repo (http://search.cpan.org/dist/Git-FastExport/script/git-stitch-repo) is made for linear repos
- This other one by the same name, authored by Philippe Bruhat (BooK), is nonetheless capable of merging two (or more) repositories
- git-merge-repos is interesting because it takes multiple repositories with more or less the same branches or tags and merges them at the tag
- This post on Stack Overflow about combining multiple Git repositories mentions git-stitch-repo, and also how it gained the capability to work with non-linear merge histories. It also explains how to do repo merges with git-filter-branch. Note that git-filter-branch rewrites your history, which changes the SHA-1 of every rewritten commit.
- The subtree merge strategy page on kernel.org shows how to merge another repository into a subdirectory of your own.
- Splice Repos (https://westmarch.j5int.com/2014/06/splicing-git-repositories-together/) is a Python script. It's more recent than some others and a better tool because it's based on fast-export/fast-import [7]. It grew out of the JBossTools git migration (which itself has some useful info on procedures).
- Also, the reposurgeon tool itself can assist you with the re-organization of your sources.
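Of the approaches listed above, the subtree merge is the one that needs nothing beyond Git itself. The sketch below follows the steps of the subtree merge howto, wrapped in a function whose name and remote name ("other_repo") are made up for illustration. It assumes Git 2.9 or later (for --allow-unrelated-histories) and a configured committer identity.

```shell
#!/usr/bin/env bash
# Sketch only: the function name and the remote name "other_repo" are hypothetical.

# Merge branch $4 of repository $2 into repository $1 under the directory $3,
# keeping the full history of both.
merge_repo_into_subdir() {
    local target="$1" other="$2" subdir="$3" branch="$4"
    # fetch the other repository's history into the target
    git -C "$target" remote add -f other_repo "$other" &&
    # record a merge of the two histories without touching any files yet
    git -C "$target" merge -s ours --no-commit --allow-unrelated-histories \
        "other_repo/$branch" &&
    # place the other project's files under the chosen subdirectory
    git -C "$target" read-tree --prefix="$subdir/" -u "other_repo/$branch" &&
    git -C "$target" commit --quiet -m "Merge $other into $subdir/"
}
```

Unlike git-filter-branch, this leaves the existing commits of both repositories unchanged and only adds one merge commit on top.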
Submodules
Sometimes, 'combining' your work with other work is best accomplished through submodules. Git submodules are a way for you to store other repositories in directories of your project. This is most often used to handle 'vendor' code, or libraries. However, submodules can be used whenever you want to combine repositories, yet maintain them independently.
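A minimal sketch of adding a library as a submodule, assuming the library already lives in its own repository and that a committer identity is configured. The function name and the vendor/lib path in the usage note are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: the function name is hypothetical.

# Add the repository at $2 as a submodule of project $1 under the path $3,
# then commit the new .gitmodules entry and the pinned submodule revision.
add_submodule() {
    local project="$1" url="$2" path="$3"
    git -C "$project" submodule --quiet add "$url" "$path" &&
    git -C "$project" commit --quiet -m "Add $path as a submodule"
}
```

For example, add_submodule . https://code.example.com/lib.git vendor/lib pins the library at its current commit. Anyone cloning the project afterwards runs git submodule update --init (or clones with --recurse-submodules) to fetch the library's files.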
References
- [1] https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools
- [2] Max Anderson's recipe for migration of the JBoss Tools repos
- [3] https://git-scm.com/docs/git-svn
- [4] https://www.atlassian.com/git/tutorials/migrating-overview
- [5] https://www.atlassian.com/git/tutorials/migrating-convert/
- [6] There are two pieces of software by the same name. The one you want was created by the KDE team. You could use the Ruby gem by nirvdrum, but it's going to be slower. Unfortunately, the KDE code lived on gitorious.org, which was bought out by GitLab. They say they're going to put the code up on archive.org, but it's not there and I wouldn't hold my breath. The good news is that the code can be found and is also referred to as svn-all-fast-export.
- [7] https://git-scm.com/docs/git-fast-import