Once upon a time there was Subversion (SVN) to manage changes during code development. Then it was overtaken and replaced by Git. I will not tell you why; the Internet is full of articles and discussions about it. Instead, I will write down a summary of what I found and did when one of my customers decided to migrate their super-huge-longliving-rescue-the-world codebase from Subversion to Git.

Use git svn to clone the SVN repository including all its history (change the URL accordingly)

git svn clone --stdlayout http://<repository_url>

The --stdlayout option (short form: -s) is a shorthand way to specify that the SVN structure is the standard one: trunk is in /trunk, branches are in /branches, tags are in /tags.

If the repository does not follow this structure, the actual paths can be given explicitly with the --trunk, --branches and --tags options. See the git svn documentation.
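As a rough sketch of the non-standard case, the helper below only prints the clone command for a custom layout, so it can be reviewed before being run. The function name and the paths in the example call (main, dev-branches, releases) are made up for illustration; only the --trunk, --branches and --tags options come from git svn itself.

```shell
#!/bin/sh
# Print (not run) a git svn clone command for a custom SVN layout.
# Arguments: repository URL, trunk path, branches path, tags path.
# The paths are relative to the repository URL.
custom_layout_clone_cmd() {
    printf 'git svn clone --trunk=%s --branches=%s --tags=%s %s\n' \
        "$2" "$3" "$4" "$1"
}

# Example with a hypothetical layout: trunk lives in /main,
# branches in /dev-branches and tags in /releases.
custom_layout_clone_cmd "http://<repository_url>" main dev-branches releases
```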

The command creates a folder named after the repository.

If the execution is interrupted, it can be resumed by running

git svn fetch

inside the created folder, or it can be restarted from scratch. Repeating the “clone” command on an existing directory resumes the import, but later causes errors because the git config file will contain duplicated lines.

Please consider that if the process is interrupted and resumed too many times, the final result might be corrupted.
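Since too many resumes are risky, one option is to cap the number of attempts instead of retrying forever. What follows is a minimal sketch of that idea; the function name retry_limited and the limit of 3 are my own choices, not something git svn provides.

```shell
#!/bin/sh
# Run a command up to MAX times, stopping at the first success.
# Usage: retry_limited MAX command [args...]
retry_limited() {
    max=$1
    shift
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $attempt of $max failed" >&2
        attempt=$((attempt + 1))
    done
    # Every attempt failed: better to restart the clone from scratch.
    return 1
}

# Intended use, from inside the folder created by the clone:
#   retry_limited 3 git svn fetch
```

If the function returns a failure, I would delete the folder and start the clone again rather than keep resuming.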

It’s also possible to migrate the user details. See the section below.

Migrating User Details

You can create a mapping file between SVN users and Git users – this means that you will continue to track who did what, even after migration.

To do this, add the --authors-file parameter to the git svn clone command above:

--authors-file=C:\dev\git-migration\users.txt

This file is in the format:

svnUser = Full Name <email@address.com>

The Git server (GitLab, in this case) looks for the user by email address and links the commit to that account if available. If not, it displays the full name. I believe it also works retroactively, e.g. if you add a user on the server after the svn clone, it will still map their check-ins.

Please keep in mind that “If this option is specified and git svn encounters an SVN committer name that does not exist in the authors-file, git svn will abort operation”. So even developers who left the project long ago must be kept in this list.
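Because every committer must be listed, it helps to generate a first draft of the file from the SVN history itself. This is a sketch that assumes the usual "svn log --quiet" output format (lines like "r42 | name | date"); the function name and the example.com addresses are placeholders, and the real names and emails still have to be filled in by hand.

```shell
#!/bin/sh
# Read `svn log --quiet` output on stdin and print one authors-file
# line per distinct committer, in the "svnUser = Full Name <email>" format.
# Full name and email are placeholders derived from the SVN user name.
authors_template() {
    awk -F'|' '/^r[0-9]+ \|/ {
        name = $2
        gsub(/^ +| +$/, "", name)   # trim the padding around the name
        print name " = " name " <" name "@example.com>"
    }' | sort -u
}

# Intended use:
#   svn log --quiet http://<repository_url> | authors_template > users.txt
```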

Create a fresh empty project in GitLab, e.g. “fabric/uc-messaging.git” or “ref-data/common”

Keep in mind the limitations of namespacing in GitLab. Copy the “project url” (in git or http format) to use it later.

Create remote origin in local checkout dir

cd <project>
git remote add origin  gitlab:<group>/<project>.git

Replace the URL with the one copied from GitLab. This “sets” the name origin to point to that URL, but it does not actually transfer anything yet.

GENERAL GIT USAGE NOTE: You can use Git-style (SSH) URLs. This requires your keys to be registered with the Git server; no credentials will be requested, as they are implied by the keys. Or, you can use the HTTP URL style. In that case username and password will be requested, or must be stored in the configuration using the usual git config commands.

What git-svn copies, and how

git svn clones the whole SVN repository, with all the branches and tags, and prepares a complete local Git repository in which:

  • The SVN trunk is remapped as “master” and created as a local branch
  • All the other branches found on SVN are created as “remote branches” in the local Git repository with the prefix “origin”
  • All the SVN tags are created as Git remote branches with the prefix “origin/tags”

Remap remote branches as local

“Remote branches” are not pushed to origin. So if a push is issued at this stage, only master is created in GitLab.

Almost always this is not what is intended. To be pushed to GitLab, each “remote branch” must first be copied as a “local branch”, using the command

git checkout -b <branchname>  origin/<branchname>

where “<branchname>” is the branch name.

The importer creates the remote branches with the origin prefix, so it’s logical to map those names to local branches by simply removing the origin prefix, as above. This way the branches end up with the same names they had on SVN.

This must be done manually for all the branches. The list of remote branches can be obtained via:

git branch -r

Luckily, with IDEA in “column edit mode” and a bit of “cut&paste-fu” and “search&replace-fu”, a script can be created quickly.

The “origin/trunk” entry listed by the above command can be skipped. It’s already “master”.

At the end of the operation a call to

git branch

will show all the branches that are ready to be pushed.
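For those who prefer a script to column editing, the remapping can be sketched as a small filter. It assumes the remote branches carry the origin/ prefix as described above; the function name branch_commands is made up, and it only prints the git checkout commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Read `git branch -r` output on stdin and print one
# `git checkout -b` command per SVN branch.
branch_commands() {
    sed 's/^[[:space:]]*//' |
    while IFS= read -r ref; do
        case $ref in
            *' -> '*) continue ;;                    # symbolic refs such as origin/HEAD
            origin/trunk|origin/tags/*) continue ;;  # trunk is already master; tags are handled separately
            origin/*)
                name=${ref#origin/}                  # strip the origin/ prefix
                printf 'git checkout -b %s %s\n' "$name" "$ref" ;;
        esac
    done
}

# Intended use, inside the cloned repository:
#   git branch -r | branch_commands > remap-branches.sh
#   sh remap-branches.sh
```

The sketch assumes branch names without spaces or shell special characters; anything more exotic should be handled by hand.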

Map the SVN tags as GIT tags

The git svn clone creates every “SVN tag” as a “remote branch”.

It is possible to remap those remote branches as Git branches. This is easily done as above. The result will be several branches called “tags/<svntagname>”.

Or, it’s possible to load those tags as Git tags. This requires two more steps.

For each of the former SVN tags, a Git tag should be created with

git tag -a <tagname>  -m "tag message, maybe the tagname itself"  refs/remotes/origin/tags/<tagname>

That must be followed by

git push origin tag <tagname>

(Yes, the git push --all that we will apply later does NOT push the tags!)

Once again, this must be done manually tag by tag, but once again IDEA in Column Edit Mode and the usual “cut&paste-fu” and “search&replace-fu” can help. The commands can be grouped, so you can run all the git tag commands first and then all the git push commands.
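The two tag steps can also be generated by a filter in the same spirit. This is only a sketch: it assumes the SVN tags show up as origin/tags/<tagname> in the output of git branch -r, the function name tag_commands is made up, and the tag message is simply the tag name.

```shell
#!/bin/sh
# Read `git branch -r` output on stdin and print, for every SVN tag,
# the `git tag` command followed by the matching `git push` command.
tag_commands() {
    sed 's/^[[:space:]]*//' |
    while IFS= read -r ref; do
        case $ref in
            origin/tags/*)
                name=${ref#origin/tags/}   # keep only the tag name
                printf 'git tag -a %s -m %s refs/remotes/%s\n' "$name" "$name" "$ref"
                printf 'git push origin tag %s\n' "$name" ;;
        esac
    done
}

# Intended use, inside the cloned repository:
#   git branch -r | tag_commands > remap-tags.sh
#   sh remap-tags.sh
```

As an alternative to the individual pushes, git push origin --tags sends all the local tags in one go.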

An example script

What follows is an example of a script that puts together the steps above about branches and tags. It is of course just an example for an example project. It was created with IDEA column edit mode, cut and paste, and search and replace. It takes a few minutes.

In the googliterature you can find shell/perl scripts that, once the remote branches have been extracted by git svn clone, generate the git commands described above. I preferred to do it manually; I found it quicker!

 

git checkout -b branchA origin/branchA
git checkout -b branchB origin/branchB
git checkout -b branchC origin/branchC
...

git tag -a reference-data-service-common-3.0.0       -m reference-data-service-common-3.0.0       refs/remotes/origin/tags/reference-data-service-common-3.0.0
git tag -a reference-data-service-common-3.0.0@19218 -m reference-data-service-common-3.0.0@19218 refs/remotes/origin/tags/reference-data-service-common-3.0.0@19218
git tag -a reference-data-service-common-3.0.0@19474 -m reference-data-service-common-3.0.0@19474 refs/remotes/origin/tags/reference-data-service-common-3.0.0@19474
...
git push origin tag reference-data-service-common-3.0.0
git push origin tag reference-data-service-common-3.0.0@19218
git push origin tag reference-data-service-common-3.0.0@19474
...
git push --all

 

Push everything to the GitLab server

git push --all

This transfers master and all the branches created above to the “origin” remote, which points to the GitLab URL.

Note that the tags are not pushed by this command. See the section on tags above.

Verify the integrity of the import

Verifying the import means checking that the same version of the same item is identical in SVN and Git.

Simply pick a branch at random, check it out from SVN and from Git into two different folders, and compare them with Meld or WinCompare. They are supposed to differ only in the version control tools’ own service files and folders (which the comparison tool is usually able to recognize and filter out).
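Without a graphical tool, the same comparison can be sketched on the command line. This assumes a diff that supports -r and -x (GNU and BSD diff do); the function name compare_checkouts and the folder names in the usage line are hypothetical.

```shell
#!/bin/sh
# Recursively compare an SVN working copy with a Git checkout,
# ignoring the .svn and .git service folders.
# Exit status is 0 when the two trees are identical.
compare_checkouts() {
    diff -r -x .svn -x .git "$1" "$2"
}

# Intended use, with the same branch checked out in both folders:
#   compare_checkouts project-svn project-git && echo "identical"
```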

Known Strangenesses

  • The git importer sometimes creates branches/tags named <tagname>@<revnum> (see the script example above!). This happens when it’s not possible to identify the forking point of the two branches. It’s a documented possibility in special cases; it does not mean that the import is broken or incomplete, only that the import tool was not able to rebuild the history completely.
  • Empty folders are NOT created in the Git checkout, which therefore differs from the SVN one wherever a folder is empty
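To know in advance which folders will be missing on the Git side, the empty ones can be listed in the SVN working copy. This is a sketch that relies on the -empty test of find (available in GNU and BSD find, not in strict POSIX); the function name list_empty_dirs is made up. As far as I know, git svn clone also offers a --preserve-empty-dirs option that creates a placeholder file in each empty folder, but I did not use it in this migration.

```shell
#!/bin/sh
# Print the empty directories of an SVN working copy,
# skipping the .svn service folder.
list_empty_dirs() {
    find "$1" -name .svn -prune -o -type d -empty -print
}

# Intended use:
#   list_empty_dirs path/to/svn-checkout
```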

 

 

Migrating SVN to GIT: the ultimate guide