Mirroring Repositories with svnsync
Terminology
To best discuss svnsync without getting confused, we should establish some common terminology before going any further:
- Master: The live read/write repository that will be mirrored via svnsync.
- Mirror: The read-only repository that is synchronized with the master via svnsync.
Overview
svnsync is a utility that became part of the standard Subversion distribution with the 1.4 release and is described as a program that "provides all the functionality required for maintaining a read-only mirror of a Subversion repository." The purpose of svnsync is easy to understand from its documentation, but why would maintaining a mirror repository matter in the enterprise? Every Subversion implementation is different, so there can be many reasons, but a couple of common ones are:
- Provide a backup repository: This can be beneficial for failover, soft-upgrade, etc.
- Provide a simple read-only repository: Some people want a simple way to provide read-only access to a repository. With svnsync, you can easily achieve this without maintaining authorization files and such. (For example: To maintain a community access point to a repository while using a different repository for the actual developer actions.)
These are just a couple of examples, but they should give you an idea of the value svnsync can provide. (For a more detailed explanation, please refer to the "Repository Maintenance" chapter of the "Version Control with Subversion" book.) While I could jump right into script suggestions and examples, doing so would be a shame. To understand why we are doing what we are doing, we should first understand how svnsync works. I will keep the explanation brief, and then we will go into example scripts and suggestions you can apply in your Subversion implementation.
Understanding svnsync
The way svnsync works is actually pretty simple: Take revisions from one repository and "replay" them to another. This means that the mirror repository plays by the same rules as the master repository, so the user account performing the actions against the mirror repository must have write access to that mirror repository. The "secret sauce" that makes svnsync work is that Subversion keeps the metadata it needs to know what has to be synchronized in special revision properties on revision 0 of the mirror repository. That is it. That is how svnsync works. Although it is easy to understand, there are a few "rules" you need to be aware of to make svnsync work as designed. The following is a list of rules and/or best practices for using svnsync:
- The synchronizing user needs read/write access to the complete mirror repository.
- The synchronizing user needs to be able to modify certain revision properties.
- The mirror repository needs to be read-only for all users except the synchronizing user.
- Before you can synchronize a mirror repository with the master, the mirror repository needs to be at revision 0.
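To make the revision 0 bookkeeping concrete, here is a minimal sketch that prints the properties svnsync stores on a mirror once it has been initialized. The helper name show_sync_state is hypothetical, and the sketch assumes the svn client is on your PATH and that the account running it can read the mirror.

```shell
#!/bin/sh
# Sketch: print the svnsync bookkeeping stored on revision 0 of a mirror.
# show_sync_state is a hypothetical helper name; pass the mirror URL as $1.
show_sync_state() {
    mirror_url="$1"
    for prop in svn:sync-from-url svn:sync-from-uuid svn:sync-last-merged-rev; do
        # --revprop -r 0 reads an unversioned property attached to revision 0.
        value=$(svn propget "$prop" --revprop -r 0 "$mirror_url") || return 1
        printf '%s = %s\n' "$prop" "$value"
    done
}
```

Calling show_sync_state URL_TO_MIRROR_REPO on an initialized mirror should list the master's URL, the master's UUID, and the last revision that was copied over.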
Now that we know what svnsync is and how it works and why it might be useful, let's learn how we can start synchronizing a mirror repository with our master repository using svnsync.
Implementing svnsync
The only real prerequisite for implementing svnsync is to have a repository that you want to mirror already created prior to starting this process. Once that is complete, you can follow the steps outlined below:
Step 1: Create Mirror Repository
svnadmin create MIRROR_REPO_PATH
Step 2: Make Mirror Repository Only Writable by Synchronizing User
To make the mirror repository only writable by the synchronizing user, which in our example will be "svnsync", we have a few options. One option is to use the authz functionality of Subversion with a default access rule like this:
[/]
* = r
svnsync = rw
The other option is to use the start-commit hook to check for the svnsync user. Here is an example, as a shell script:
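#!/bin/sh
# start-commit hook: Subversion passes the repository path as $1 and the user as $2
USER="$2"
# Allow only the synchronizing user ("svnsync" in our example) to commit
if [ "$USER" = "svnsync" ]; then exit 0; fi
echo "Only the svnsync user may commit new revisions as this is a read-only, mirror repository." >&2
exit 1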
Step 3: Make Mirror Repository Revision Properties Modifiable by Synchronizing User
To do this, we need to create a pre-revprop-change hook with something similar to the following example, as a shell script:
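#!/bin/sh
# pre-revprop-change hook: Subversion passes the repository path as $1, the revision as $2 and the user as $3
USER="$3"
# Allow only the synchronizing user ("svnsync" in our example) to change revision properties
if [ "$USER" = "svnsync" ]; then exit 0; fi
echo "Only the svnsync user may change revision properties as this is a read-only, mirror repository." >&2
exit 1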
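Subversion only runs a hook if it sits in the repository's hooks directory under the exact hook name and is executable. A minimal sketch for putting the two hooks in place follows; install_hook and the script file names in the usage note are hypothetical.

```shell
#!/bin/sh
# install_hook is a hypothetical helper: copy a script into a repository's
# hooks directory under the given hook name and mark it executable.
install_hook() {
    script="$1"      # path to the hook script you wrote
    repo_path="$2"   # on-disk path of the mirror repository
    hook_name="$3"   # e.g. start-commit or pre-revprop-change
    cp "$script" "$repo_path/hooks/$hook_name" &&
        chmod +x "$repo_path/hooks/$hook_name"
}
```

For example, install_hook ./start-commit.sh MIRROR_REPO_PATH start-commit followed by install_hook ./pre-revprop-change.sh MIRROR_REPO_PATH pre-revprop-change.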
Step 4: Register Mirror Repository for Synchronization
Perform the following svnsync command on any system:
svnsync initialize URL_TO_MIRROR_REPO URL_TO_MASTER_REPO --username=svnsync --password=svnsyncpassword
If everything is configured properly, you should see some output like this:
Copied properties for revision 0.
Now that you have registered your mirror repository for synchronization with the master repository, we should go ahead and perform the initial synchronization so that the mirror and the master repository are synchronized.
Step 5: Perform Initial Synchronization
To make sure everything is ready and to perform the initial synchronization, on any system, perform the following:
svnsync synchronize URL_TO_MIRROR_REPO --username=svnsync --password=svnsyncpassword
If everything synchronized properly, you should see some output similar to this:
Committed revision 1.
Copied properties for revision 1.
Committed revision 2.
Copied properties for revision 2.
Committed revision 3.
Copied properties for revision 3.
…
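One way to confirm that the mirror has caught up is to compare the youngest revision of both repositories. The sketch below is one possible check, not part of svnsync itself; the helper names are hypothetical, and it assumes svn prints its output in English (the "Revision:" label is localized).

```shell
#!/bin/sh
# head_revision and mirror_is_current are hypothetical helper names.
head_revision() {
    # "svn info URL" prints a "Revision: N" line for the youngest revision.
    svn info "$1" | sed -n 's/^Revision: //p'
}

mirror_is_current() {
    master_rev=$(head_revision "$1")
    mirror_rev=$(head_revision "$2")
    echo "master=$master_rev mirror=$mirror_rev"
    # Succeed only when both lookups worked and the revisions match.
    [ -n "$master_rev" ] && [ "$master_rev" = "$mirror_rev" ]
}
```

Running mirror_is_current URL_TO_MASTER_REPO URL_TO_MIRROR_REPO returns success only when both repositories report the same revision.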
Step 6: Automate Synchronization with post-commit Hook
With the initial synchronization out of the way, all that remains is to write a script, run either as a scheduled process or as a post-commit hook, that synchronizes your mirror repository with the master repository. I suggest the post-commit option as it gives you the best chance of keeping the mirror repository as up to date as possible. Here is an example hook that might be used on the master repository to synchronize a mirror repository as part of the post-commit hook. As a shell script:
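#!/bin/sh
# Example for synchronizing one repository from the post-commit hook
SVNSYNC=/usr/local/bin/svnsync
# Run in the background and detach output so the committing client does not wait on the sync
$SVNSYNC synchronize URL_TO_MIRROR_REPO --username=svnsync --password=svnsyncpassword >/dev/null 2>&1 &
exit 0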
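If you go with the scheduled option instead, or want failures recorded somewhere an administrator will see them, a small wrapper can help. The following is a sketch under assumptions: sync_mirror and the LOG_FILE default are hypothetical, and it assumes credentials for the svnsync user are already cached for the account that runs the script, which also keeps the password off the command line.

```shell
#!/bin/sh
# sync_mirror is a hypothetical wrapper; LOG_FILE and the URL are placeholders.
SVNSYNC="${SVNSYNC:-svnsync}"
LOG_FILE="${LOG_FILE:-/tmp/svnsync-mirror.log}"

sync_mirror() {
    mirror_url="$1"
    # --non-interactive makes svnsync fail instead of prompting, and relies on
    # credentials already cached for the account that runs this script.
    if "$SVNSYNC" synchronize --non-interactive "$mirror_url" >>"$LOG_FILE" 2>&1; then
        return 0
    fi
    echo "$(date) svnsync failed for $mirror_url" >>"$LOG_FILE"
    return 1
}
```

A cron job could call sync_mirror URL_TO_MIRROR_REPO on whatever schedule suits you. Because svnsync resumes from the last revision it copied, a failed run is picked up by the next one.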
That is it. Once you have followed the steps outlined above, you should have a mirror repository that is kept up to date automatically when someone modifies the master repository. This also concludes our introduction to svnsync and how to implement it.
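If you allow revision property changes on the master, the mirror will not see them unless they are copied separately with svnsync copy-revprops. A minimal sketch of a helper for a post-revprop-change hook on the master follows; copy_revprops_to_mirror is a hypothetical name, and the sketch again assumes cached credentials for the synchronizing user.

```shell
#!/bin/sh
# copy_revprops_to_mirror is a hypothetical helper for a post-revprop-change
# hook on the master. Subversion passes the changed revision to that hook as $2.
SVNSYNC="${SVNSYNC:-svnsync}"

copy_revprops_to_mirror() {
    mirror_url="$1"
    rev="$2"
    # Re-copy all revision properties of the given revision to the mirror.
    "$SVNSYNC" copy-revprops --non-interactive "$mirror_url" "$rev"
}
```

Inside the hook, the call would look like copy_revprops_to_mirror URL_TO_MIRROR_REPO "$2".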


One additional thing to consider: if you allow revprop changes on the master, you may also wish to create a post-revprop-change hook to re-sync the updated revprops to the mirror (via 'svnsync copy-revprops').
Malcolm Rowe | August 06, 2007 at 01:00 PM
Malcolm,
Great idea. Thanks for contributing.
Take care,
Jeremy
Jeremy Whitlock | August 06, 2007 at 01:39 PM
One more thing for the security-conscious: the command line arguments to svnsync (including username and password) may be visible to other users on the system.
Vlad Skvortsov | August 06, 2007 at 01:55 PM
What happens when the connection between the master and mirror is occasionally down? I presume that the next svnsync will pick up from the last successful sync, but does the non-zero exit code produce user-visible warnings in the post-commit hook?
Handy article, thanks.
Matt Doar | August 07, 2007 at 11:38 AM
You probably would not want to rely on returning a message in the hook as that would go to the programmer doing the commit. They could easily be some junior programmer that has no clue what the message means, or they could just not report it.
If doing the synch immediately was a priority, you would want to do something like send an email or post an alert.
But yes, the next synch will just pick up the revisions.
Mark
Mark Phippard | August 08, 2007 at 02:26 PM
Thanks for the article. It would be really nice if you made some mention of rsync, and why svnsync is superior for the job. I think a lot of people default to using rsync for mirroring in general. What I've gathered is simply that svnsync might be more efficient since it's transactional and understands svn's internals. That was really hard to google, and I don't know about the source I found: http://mail-index.netbsd.org/tech-repository/2007/07/30/0000.html
I need to be able to convince my group that we should take the steps to set up svnsync with some justification that it is better than our existing rsync solution (presently done with CVS).
Micah Elliott | October 18, 2007 at 09:41 AM
Micah,
Thanks for the feedback. I do think you've found the right information. With svnsync, you have a Subversion tool doing your Subversion mirroring.
Take care,
Jeremy
Jeremy Whitlock | October 18, 2007 at 09:58 AM
Jeremy,
First, great article. I would like to point out the hook scripts didn't work until I made them executable. Maybe add that nugget to the article.
I'm actually not doing a real-time push. I'm doing a pull every hour, set up with a cron job. The reason is that the network our active server is on is not allowed to initiate a connection to our mirror network over VPN.
Jon Dahl | January 18, 2008 at 08:39 AM
Jeremy,
Thanks for the article!
We are thinking about migrating to SVN. One thing that's not clear to us yet is multi-site repositories. As I understand svnsync, its purpose is only to provide a redundant read-only repository. What about having a copy of a repository which also allows users to commit? This might be needed for performance reasons across a large geographical area.
The right pointers or any comments would be appreciated.
Cheers,
Oli
Oliver Frank | February 07, 2008 at 12:28 AM
Oli,
You are right. svnsync is only for creating read-only mirrors of a master repository. Since Subversion uses a centralized repository model, there isn't anything you can do with native Subversion to turn it into a multi-master setup, but there is svk, http://svk.bestpractical.com/, which is a tool built on top of Subversion to allow for multi-master repositories. It might be worth a look.
Take care,
Jeremy
Jeremy Whitlock | February 07, 2008 at 07:22 AM
Hi Jeremy,
I'm actually now doing a real-time push. I'm doing a pull every hour network our active server allowed to initiate a connection to our mirror network over VPN.
I have changed the mirror network IP. How do I sync now? Do I need to initialize again?
I tried the sync command from the command line but it is not working.
Please reply.
I am waiting for your message.
Madhu | February 11, 2008 at 01:49 AM
Madhu,
From my understanding, to "move" the slave server, all you have to do is update the mirror URL in your synchronizing script on the master server. The "svn:sync-from-url" revision property lives on revision 0 of the mirror and points back at the master, so it only needs updating if the master's URL changes.
Take care,
Jeremy
Jeremy Whitlock | February 11, 2008 at 08:40 AM
Hi Jeremy,
How can I automate synchronization? I want to write a service on Windows.
Can you tell me the process? Can you mail me, please?
thanks
Madhu.
Madhu | February 14, 2008 at 10:06 PM
Madhu,
If you read this article, start to finish, you'll end up with a solution that synchronizes automatically. That is, every time your repository changes, those changes are mirrored to your slave(s).
Take care,
Jeremy
Jeremy Whitlock | February 15, 2008 at 08:13 AM
To answer Oliver Frank's question from February 7, 2008, one solution would be to look into the WebDAV write-thru proxy feature which will become available in Subversion 1.5:
http://blogs.open.collab.net/svn/2007/10/yesterday-at-th.html
Ryan Schmidt | March 14, 2008 at 12:52 PM
Hi Jeremy,
Can you suggest anything for this scenario?
Can there be two masters of independent "repositories"?
Where SVN #1 is the master for Source Code #1 and a mirror for Source Code #2 and SVN #2 is the master for SC #2 and a mirror for SC #1
rajesh | April 11, 2008 at 12:04 PM
Rajesh,
I do not think you would get the results you seek even if Subversion did allow this. Since Subversion's svnsync just copies revisions from one place to another, copying two separate repositories' worth of changes into one single repository doesn't make much sense. My suggestion would be to have master1 and mirror2 on the same Subversion server and master2 and mirror1 on the same Subversion server. This way, even if one box goes down, you still have valid working versions of both repositories. Since a Subversion server can serve as many repositories as you see fit, you could use this scenario with any number of repositories. Does this help?
Take care,
Jeremy
Jeremy Whitlock | April 13, 2008 at 07:52 PM
I think the above might not be clear, but neither was the question I was answering. Basically, you cannot have a repository be a master and a slave at the same time but you can have a Subversion server host both master and mirror repositories. (The reason I mention that the question wasn't clear was because SVN in Rajesh's request could be a server or a repository. There was no mention of what it was so I took a stab at it.)
Jeremy Whitlock | April 13, 2008 at 07:54 PM
I apologize for the late response and for not being clear. Yes, I wanted a server to be able to host both a master and a slave repository, not a single repository acting as both master and slave. Thanks for the response and for clearing things up.
Regards,
Rajesh
rajesh | April 15, 2008 at 11:51 AM