CollabNet
Submerged - CollabNet's Subversion Blog
CollabNet Community

CollabNet Links

  • Submerged Blog
  • On CollabNet Blog
  • CollabNet Home
  • openCollabNet

Categories

  • Administration (8)
  • Client Tools (9)
  • General (35)
  • Subversion Client (23)
  • Subversion Events (2)
  • Subversion in the Enterprise (26)
  • Subversion Server (14)

Past 6 Months

  • June 2008 (4)
  • May 2008 (5)
  • April 2008 (2)
  • March 2008 (3)
  • February 2008 (3)
  • January 2008 (4)

Archives

All Archives...

Subversion 1.5 Beta1 Released

The Subversion project has issued a beta1 release on the path to the final GA. The next release will likely be rc1 but, as always, that depends on the feedback we get from testers.

CollabNet has created binaries for Linux, Windows and Solaris (with OS X coming soon) from this beta1.  A noteworthy change as we get closer to RC1 is that we are now providing our CollabNet Subversion packaging for all three platforms. Solaris and Windows are new with this beta release. In addition, for all three of these platforms the server installation can automatically install and configure ViewVC as part of the server installation.

You can download these binaries from openCollabNet. There are also updated versions of the CollabNet Merge Client, Subclipse and TortoiseSVN available on this site.

We have a forum where you can post questions, comments or problems that you encounter using this version. If after some discussion a report is indeed deemed a defect, we forward the bug to the Subversion community using their normal processes. Feedback on the GUI merge clients will go to CollabNet Engineering.

For those that missed the announcement of the Alpha2 release, here are a few things to be aware of:

  • Subversion 1.5 client has an updated working copy format. Once an SVN 1.5 client updates a working copy it will be converted to the new format. This means that older SVN clients will no longer be able to read these working copies. So either be careful, or be sure to update all your clients to the same version. There is a Python script that you can use to downgrade your working copy to the 1.4 format.
  • Subversion 1.5 server can be used to serve an existing repository, but you will not be able to use the new merge tracking features with this repository until it is upgraded. You have a few options here:
    1. Dump existing repository, then create a new repository using the 1.5 binaries and load the dump file.
    2. Create a new repository using 1.5 binaries and use svnsync 1.5 binary to transfer the contents to the new repository.
    3. Run a new command svnadmin upgrade to upgrade an existing repository. This runs fast as it just updates the internal repository format. There are some downsides to this approach though:
      • SVN 1.5 repositories maintain a new index called the node-origins index  This is needed to speed up certain merge operations. The index will be lazy-populated on upgraded repositories, but can be updated on demand by running a new tool named svn-populate-node-origins-index(.exe). If you have a large repository you will likely want to populate the index. New repositories, or ones that are dumped/loaded or svnsync'ed will not need this.
      • For fsfs repositories, the node-origins index will be slightly faster and take up less disk space if the repository is reloaded or synced. For existing repositories, the index has to be maintained in a separate set of files that takes up extra disk space and I/O.
      • For new 1.5 fsfs repositories, there is a new sharding mechanism for storing revisions on disk. Instead of storing all revisions in one single directory, they are now spread across multiple directories which will be better on certain filesystems.  There is a Python script you can run to shard an exising fsfs repository though.
    4. If you were using an earlier version of the 1.5 binaries, note that the on-disk format of the repository has been changed in the most recent builds. You should dump your repository with your current binaries, then install the new binaries, create a new repository and load the repository.

Please take the opportunity to give these beta versions on the path to GA a try. Your feedback will lead to a better final release and likely also help us deliver Subversion 1.5 as soon as possible.

Posted by Mark Phippard | Date: Mar 21, 2008 | Permalink | Comments (3) | TrackBack (0)

Subversion 1.5 alpha2 released

The Subversion project has issued an alpha2 release on the path to the final GA. The next release might be alpha3, it might be beta1, that depends on the feedback we get from testers. One thing is certain, and that is the sooner we can find any remaining bugs, the sooner the GA release will be available.  Likewise, hearing that you have used this version with some success will also help us make decisions.

CollabNet has created binaries for Linux and Windows (with OS X coming soon) from this alpha2. You can download them from openCollabNet. There are also updated versions of the CollabNet Merge Client, Subclipse and TortoiseSVN available on this site.

We have a forum where you can post questions, comments or problems that you encounter using this version. If after some discussion a report is indeed deemed a defect, we forward the bug to the Subversion community using their normal processes. Feedback on the GUI merge clients will go to CollabNet Engineering.

A few things to be aware of:

  • Subversion 1.5 client has an updated working copy format.  Once an SVN 1.5 client updates a working copy it will be converted to the new format. This means that older SVN clients will no longer be able to read these working copies. So either be careful, or be sure to update all your clients to the same version. There is a Python script that you can use to downgrade your working copy to the 1.4 format.
  • Subversion 1.5 server can be used to serve an existing repository, but you will not be able to use the new merge tracking features with this repository until it is upgraded. You have a few options here:
    1. Dump existing repository, then create a new repository using the 1.5 binaries and load the dump file.
    2. Create a new repository using 1.5 binaries and use svnsync 1.5 binary to transfer the contents to the new repository.
    3. Run a new command svnadmin upgrade to upgrade an existing repository. This runs fast as it just updates the internal repository format. There are some downsides to this approach though:
      • SVN 1.5 repositories maintain a new index called the node-origins index  This is needed to speed up certain merge operations. The index will be lazy-populated on upgraded repositories, but can be updated on demand by running a new tool named svn-populate-node-origins-index(.exe). If you have a large repository you will likely want to populate the index. New repositories, or ones that are dumped/loaded or svnsync'ed will not need this.
      • For fsfs repositories, the node-origins index will be slightly faster and take up less disk space if the repository is reloaded or synced. For existing repositories, the index has to be maintained in a separate set of files that takes up extra disk space and I/O.
      • For new 1.5 fsfs repositories, there is a new sharding mechanism for storing revisions on disk. Instead of storing all revisions in one single directory, they are now spread across multiple directories which will be better on certain filesystems.  There is a Python script you can run to shard an exising fsfs repository though.
    4. If you were using an earlier version of the 1.5 binaries, note that the on-disk format of the repository has been changed in the most recent builds. You should dump your repository with your current binaries, then install the new binaries, create a new repository and load the repository.

Please take the opportunity to give these alpha versions on the path to GA a try. Your feedback will lead to a better final release and likely also help us deliver Subversion 1.5 as soon as possible.

Posted by Mark Phippard | Date: Mar 4, 2008 | Permalink | Comments (0) | TrackBack (0)

Subversion 1.5 WebDAV Write-Thru Proxies

Yesterday at the Pre-SubConf Subversion Workshop in Munich, I presented a short bit about a new Subversion 1.5 feature — WebDAV write-thru proxy support. This feature allows for a Subversion server deployment based around Apache 2.2.x and mod_proxy in which there is a single master server and associated repository, and one or more slave servers which handle read operations while passing through (proxying) write operations to that single master server. In this deployment scenario, each slave server has its own copy of the master repository which is kept in sync by some process, typically one driven by hook scripts on the master server itself.

Why might someone want to use such an arrangement? Well, users are far more likely to be doing read operations (such as checkouts, updates, status checks, log history requests, diff calculations, etc.) than write operations (such as commit, revision property changes, and lock/unlock requests). You might wish to employ several servers to share the burden of handling those read requests (a load-balancing scenario). Or perhaps you have a worldwide organization (such as CollabNet's) with offices in, say, the United States, Eastern Europe, and India. If you deploy a single centralized server across the whole organization, you will almost certainly wind up favoring some users over others in terms of the performance of the network links between them and the Subversion server. WebDAV write-thru proxies would allow you to keep geographically local slave servers for each of your global regions, universally optimizing the performance of all of your users' read requests.

For the purposes of my presentation, I whipped up a simple WebDAV write-thru proxy scenario with a single slave server, and using svnsync as the mechanism for propogating changes from the master server to the slave server. The following describes how you can do the same.

Read More »

Posted by C. Michael Pilato | Date: Oct 16, 2007 | Permalink | Comments (16) | TrackBack (0)

Subversion 1.5 Merge Tracking and svnmerge.py

Two of the questions most asked during the webinar “Branching and Merging with Subversion 1.5” were:

  • “What are the differences between Subversion 1.5's merge tracking features and svnmerge.py?”
  • “Will there be a migration utility?”

I’m in Munich at the moment and today attended CollabNet’s Subversion Power Workshop that precedes SubConf, the first conference dedicated to Subversion. Answers to both questions were covered during the merge tracking sessions.

Before talking about the advantages of Subversion 1.5 merge tracking over svnmerge.py, I think it is important to say that svnmerge.py has been a very useful tool contributed and maintained by Subversion community members. We are all thankful for their contributions. That said, Subversion 1.5 does offer several advantages over svnmerge.py:

  • SVN 1.5 tracks merge history at a per-path level of granularity and tracks file level merges. svnmerge.py only tracks merge info for the entire branch, it does not track finer levels (e.g.: files). This allows SVN to implement cherry picking, which svnmerge.py does not support.
  • svnmerge.py doesn’t always properly merge merge-properties (like svnmerge-integrated and svnmerge-blocked), leading to property conflicts when merging across multiple branches. SVN 1.5 solves this.
  • SVN 1.5 includes support for third-party tools, such as call-backs for conflict resolution. If a merge conflict arises, Subversion can call your GUI client and the client can then fire up a conflict resolution tool. This improves workflow and usability. You can see the integration between SVN 1.5 merging and GUI clients in action by trying CollabNet’s new GUI Merge Client.
  • SVN’s merge tracking includes more comprehensive auditing, like querying the merge history to answer questions about what was merged when and where.

A utility to migrate svnmerge.py merge tracking history from a previous version to Subversion 1.5 was committed to the Subversion trunk just a few days ago. What can it do?

  • Convert the latest merge history for multiple branches. In terms of history, information about the latest merges for each of your branches is the most important information because it allows for an easier merge the next time you merge. If you want to keep things simple and only migrate the latest merge history, you can.
  • The migration utility can also scan your entire repo for merge history, or accept a series of path prefixes to look at: svnmerge-migrate-history.py /path/to/repos trunk tasks branches.

We’ve uploaded a copy of the script to the download section of the Merge Tracking Early Adopter Program. You can download it and test it out on your own repository (not for production please, Subversion 1.5 is still under development). It should work well but tell us if you run into any problems. We know of one limitation: don’t run it on a Subversion 1.5 repository that already has merge history, this has not been tested yet (then again: why would you want to do that?). Right now, only the .py script is available but the Subversion developers plan to release a Windows executable later.

Posted by Guido Haarmans | Date: Oct 15, 2007 | Permalink | Comments (0) | TrackBack (0)

Considerations when upgrading to Subversion 1.5

One of the most common questions that occurs whenever a new release of Subversion (or any product) comes out, has to do with the considerations you need to take into account before upgrading.  I will try to cover some of those in this post.

Subversion's compatibility guidelines guarantee that any 1.x client can talk to any 1.x server.  So you do not have to upgrade the client and server at the same time.  The 1.5 client can talk to 1.0-1.4 servers and any 1.0-1.4 clients can talk to a 1.5 server.

Those same guidelines say that you never have to reload a repository when upgrading the server.  You can upgrade your server from 1.0-1.4 to 1.5 and you do not need to do anything to your repositories.  The new server will silently make any updates that are needed, when they are needed.  In the case of 1.5, a new SQLite database file will be added to the repository to store indexes that speed up the new merge tracking function.  Even though you do not have to reload your repository, there are often small efficiencies you can gain by doing a dump/load of your repository.  In the case of this release, there is a new storage mechanism for fsfs repositories.  Instead of storing all of the revisions in a single folder, we now "shard" the repository by creating a sub-directory for each 1000 revisions. This does not improve performance of Subversion, but it might improve performance of things like backup utilities or other tools you use to work with the file system on the server.  In this case, you do not have to dump/load to get this feature, there is a Python script you can run that will do the sharding for you.  This script will run relatively quickly, certainly orders of magnitude faster than a dump/load.

Some features of Subversion 1.5 will require that you have a 1.5 server and a 1.5 client to enjoy the full benefits of the feature.  For example, the merge tracking feature requires that both the client and server are running 1.5.  If one of them is running a lower version than you get the merge functionality that existed in that version of Subversion.  If you want to take advantage of merge tracking you really need to update your server and all of the clients that are doing merges.  Doing a merge with an old client will work, but will not set any of the merge tracking information when the change is committed.  This simply means that someone using a 1.5 client later will attempt to merge the same info again and get conflicts.  The svn merge --record-only command can be used to fix this after the fact.

When using the new sparse-checkouts functionality it really helps to have a 1.5 server.  The feature will work with an older server, but in most cases the server will still send the extra information (file contents) to the client and the client has to discard what it does not want.

There is an important client-side compatibility feature to take into account.  Some of the new features of 1.5 such as sparse-checkouts and changelists necessitate a new client working copy format number.  So this means as soon as you perform an operation that needs to re-write your working copy (checkout/update/switch/commit/propset etc.) then the internal format of the working copy will be upgraded to the new format for Subversion 1.5.  Once this happens, you will no longer be able to use a 1.4 client with that same working copy.  The same sort of change occurred with the 1.3 to 1.4 upgrade process.  We know that this really impacted many users that were stuck using an older client, so we are including a Python script that you can run to downgrade a working copy back to the 1.4 format.  We will distribute this as an EXE for Windows users.  Just to make this perfectly clear, you can have a 1.4 client and 1.5 client on the same system.  You just need to be careful to not mix their usage on the same working copy.  Also, pure-read operations like (status/info/ls) do not update the working copy.  So simply viewing a working copy with a new version of TortoiseSVN or SCPlugin installed does not modify the working copy format.  Conversely, however, an old version of one of these tools will not be able to read a 1.5 working copy.

Hopefully this answers the questions about upgrading to Subversion 1.5.  It is going to be a great release and I certainly hope people choose to upgrade when it comes out.  If you have additional questions that I did not cover, please post them in the comments or on the Subversion Users forum on openCollabNet.

Posted by Mark Phippard | Date: Oct 14, 2007 | Permalink | Comments (2) | TrackBack (0)

CollabNet Subversion 1.4.4 Update

As you probably know, Subversion 1.4.4 was recently released.  I wanted to let you know that the CollabNet Subversion team is hard at work putting together our packages for 1.4.4.  We will release them in a staged order: Linux first, Solaris second and Windows third.  Most of the time and effort in our release process is spent in QA and validation and the reason we do the Linux and Solaris packages first is that we can get those versions in the hands of our QA team the quickest.  Eventually, I plan on making the order Linux, Windows, Solaris, as there are more Windows than Solaris users who are seeking the packages.

The Linux package is in the final stage of our validation process and I expect it to be available on openCollabNet later this week.  The Solaris package should be available next week and Windows the following week.

It’s worth pointing out that the most significant fix in 1.4.4 is a potential (though very unlikely) repository corruption with BDB repositories.  Since CollabNet recommends FSFS and CollabNet Subversion comes pre-configured with that back-end, this BDB fix does not affect CollabNet Subversion users.

Posted by Mark Phippard | Date: Jun 25, 2007 | Permalink | Comments (5) | TrackBack (0)

Merge Auditing in Subversion 1.5

We had a really nice addition to the Subversion merge tracking feature committed to trunk last week.  Hyrum Wright is a student participating in the Google Summer of Code program.  He has signed up to provide "Merge Auditing" features for merge tracking.  You can see some details on what this means in the functional specifications, but essentially it is about adding intelligence and awareness of the merge tracking information to the svn log and blame output.  In the merge tracking planning, this has been slated as a post-1.5 feature.  It is a feature everyone really wants to see in the code-base, but it was just felt that shipping the release and the core functionality had to be the priority.

Anyway, in this case it appears we will be able to have our cake and eat it too.  Hyrum, whom I should add is already a full Subversion committer, has made great progress on adding this support to the svn log command and has committed those changes to trunk.  Additionally, it turns out that the sample repository we created for our Merge Tracking Early Adopter program really helped to work out the kinks in this feature.  Since we had a nicely documented sample repository, and matching graphic, that Hyrum could refer to, it made it easier for him to refine the feature and get it working properly.

Sample Repo

The above picture is the history of our sample repository. If you take a close look, the r14 commit to trunk is an interesting example where this feature comes into play.  This commit is a merge from a branch, and the merge also contains additional merges from another branch.  If I run this command (note the new -g option):

svn log -g -r 14 $REPOS/trunk

Here is the new output:

------------------------------------------------------------------------
r14 | merger | 2007-05-30 15:48:11 -0400 (Wed, 30 May 2007) | 3 lines

Merge branch b - product roadmap and update about page

Command executed: svn merge $REPOS/branches/b
------------------------------------------------------------------------
r13 | buser | 2007-05-30 15:46:48 -0400 (Wed, 30 May 2007) | 1 line
Result of a merge from: r14

Update info about our company
------------------------------------------------------------------------
r12 | merger | 2007-05-30 15:45:19 -0400 (Wed, 30 May 2007) | 3 lines
Result of a merge from: r14

Merge branch a - product roadmap

Command executed: svn merge $REPOS/branches/a
------------------------------------------------------------------------
r11 | auser | 2007-05-30 15:43:00 -0400 (Wed, 30 May 2007) | 1 line
Result of a merge from: r14, r12

Add product roadmap
------------------------------------------------------------------------

Note how it includes information on the revisions that were merged in r14, and additionally how one of those revisions was the result of another merge.  Graphical clients like our CollabNet Desktop - Eclipse Edition and TortoiseSVN should be able to do really useful and interesting presentations of this information in their UI.

This is fantastic work Hyrum.  Keep it up.  I cannot wait to see the blame implementation come together.

Posted by Mark Phippard | Date: Jun 13, 2007 | Permalink | Comments (4) | TrackBack (0)

Subversion 1.4.4 Released

Those of you who diligently watch your e-mail Inbox for those little blessings that come via announce@subversion.tigris.org recently got a pleasant surprise, though perhaps not the one you'd been wishing for. It's been nearly five months since Subversion's last release — nine months since the introduction of the 1.4 version line. "Where's the next big Subversion release?" "What's taking so long!?" "I want Merge Tracking, and I want it yesterday!"

Don't worry, Subversion 1.5 remains on schedule. Merge Tracking development is still ongoing, as is support for sparse directories and a number of other improvements and features. Those are, of course, in addition to all of the other goodies that 1.5 promises to bring which are already completed. But along the way of developing Subversion 1.5, we found enough important issues in 1.4.3 to justify another patch release on this line.

Subversion 1.4.4 delivers a number of bug fixes for various issues, including (but not limited to):

  • an extremely rare directory property dataloss bug which, for BDB-backed repositories, also unfortunately results in repository corruption
  • a race condition when changing revision properties in FSFS-backed repositories
  • a low-risk security flaw in the 'svn prop* --revprop' subcommands
  • problems with 'svn diff' when writing large diff hunks to the Windows console
  • a path sorting bug in the output of 'svnadmin dump' for non-ASCII paths which can cause invalid dumpfile generation
  • a hang bug triggered during character translation

Now, some of those items look sorta scary. Data loss, repository corruption, race conditions — that's the stuff a technologist's nightmares are made of! But without downplaying the significance of the fixes to those problems, allow me to assure you that the instances of them have thus far been extremely rare. Still, you are encouraged (as always) to upgrade to this latest release of Subversion.  You can see the full set of changes made in 1.4.4 in the Subversion CHANGES file.

And if you're one of those folks dying to get your hands on Merge Tracking functionality, let me encourage you to join the openCollabNet Merge Tracking Early Adopter Program. The more feedback the Subversion developers get on their work, the better — and more timely — the feature will be overall.

Posted by C. Michael Pilato | Date: Jun 9, 2007 | Permalink | Comments (0) | TrackBack (0)

CollabNet Subversion update

I thought I would take a break from my previous technical posts and give a general update about what has been going on in the land of CollabNet Subversion (our Subversion binaries).

We have been busy the last several weeks and the fruits of our labor are starting to see the light of day. First, a couple of weeks ago we released our Subversion 1.4.3 client and server RPM's for Red Hat Enterprise Linux. This was big news for us because it was our first 1.4.3 release and it had been bugging us that we were lagging behind. Producing the binaries is easy, but we created new packaging as well as additional scripts that we provide with the RPM. This took some extra time but now our build environment is complete and going forward it will be easier for us to stay in synch with the open source community.

Earlier this week, we released our Subversion 1.4.3 client and server packages for Solaris 10 SPARC. This was our first Solaris release and again it felt good to get it out the door. We spent some extra time making sure the build works the way we want it to but we sorted out any issues and the release sailed through our QA process. I know that I was pleased when I tested the Solaris version, installing and getting a server up and running is very easy. The extra time we spent paid off and like Linux we can now produce new releases fast.

Also this week, we helped kick-off a new area on openCollabNet that is providing "Community Binaries" for Subversion. The first binary release is for Mac OS X 10.4 PPC and Intel. Jeremy Whitlock, a contributor to this blog, deserves the credit for picking up the ball on this and running with it. He did a great job in both producing the binaries and in putting a lot of quality touches into the packaging. We will have more information on our plans and aspirations for the binaries project when we get the project launched on openCollabNet (within the next two weeks or so). For now, you OS X users, go download the binaries and let us know how it goes for you.

Finally, there is the Windows installer for CollabNet Subversion. Our current version is 1.4.2 and uses InstallAnywhere, a Java-based installer that adds a lot of heft to the download size because it includes a JRE. We are moving away from that to a Windows-specific installer. We are also adding a lot of new features to the installer such as installing the server as a Windows service as part of the install process. It is taking some time to get these features implemented correctly and QA-ed but in the end it will be worth it. We should have the final version out in a couple of weeks. Most likely it will be for Subversion 1.4.4 as that should be released between now and then and creating the binaries themselves is not the issue for us.

Posted by Mark Phippard | Date: May 18, 2007 | Permalink | Comments (4) | TrackBack (0)

Single Repository or Many?

My previous blog entry discussed the issue of repository layout. This entry will try to answer the question of whether you should have one repository per project or a single repository that houses all your projects. There is not going to be a single right answer to this question. Hopefully this post will help you understand the tradeoffs so you can make the right decision that suits your requirements. These are some of the advantages of the single repository approach.

  1. Simplified administration. One set of hooks to deploy. One repository to backup. etc.
  2. Branch/tag flexibility. With the code all in one repository it makes it easier to create a branch or tag involving multiple projects.
  3. Move code easily. Perhaps you want to take a section of code from one project and use it in another, or turn it into a library for several projects. It is easy to move the code within the same repository and retain the history of the code in the process.

Here are some of the drawbacks to the single repository approach, advantages to the multiple repository approach.

  1. Size. It might be easier to deal with many smaller repositories than one large one. For example, if you retire a project you can just archive the repository to media and remove it from the disk and free up the storage. Maybe you need to dump/load a repository for some reason, such as to take advantage of a new Subversion feature. This is easier to do and with less impact if it is a smaller repository. Even if you eventually want to do it to all of your repositories, it will have less impact to do them one at a time, assuming there is not a pressing need to do them all at once.
  2. Global revision number. Even though this should not be an issue, some people perceive it to be one and do not like to see the revision number advance on the repository and for inactive projects to have large gaps in their revision history.
  3. Access control. While Subversion's authz mechanism allows you to restrict access as needed to parts of the repository, it is still easier to do this at the repository level. If you have a project that only a select few individuals should access, this is easier to do with a single repository for that project.
  4. Administrative flexibility. If you have multiple repositories, then it is easier to implement different hook scripts based on the needs of the repository/projects. If you want uniform hook scripts, then a single repository might be better, but if each project wants its own commit email style then it is easier to have those projects in separate repositories

This is just a sampling of the pros and cons of each approach. Hopefully it gives you something to go on to make a decision. I tend to prefer the one repository per project approach with the caveat that if I have many projects that are related to each other, I would put those all in the same repository. I also tend to break up repositories by group or team, although in reality this is just a variation of the project concept.

For example, I had one repository for the Documentation department to use for their projects. Of course in the case of on-line help that was often located in the same project as the application code, but the Documentation team also had other materials they produced and I gave them a repository for that. Likewise, the Marketing department had a repository to store the things they worked on, including the company web site. As was the case with the layout of the repository, this is really just a decision of what will work best for you. That  said, it is a little more difficult to change your repository setup after the fact.

So, it is worth it to take some time to understand your requirements and which approach best suits them.

Posted by Mark Phippard | Date: Apr 26, 2007 | Permalink | Comments (3) | TrackBack (0)

Subversion Repository Layout

I see a lot of questions asked about "What is the recommended repository layout?", "What does trunk mean?", or: "What is the significance of trunk?". This post will try to answer those questions and more.

A Subversion repository implements the metaphor of a versioned filesystem. The repository is just a filesystem with folders and files. It so happens that modifications to this filesystem are versioned and there are implementation enhancements like "cheap" copies that make certain operations less expensive than they are in a traditional filesystem, but the repository itself still behaves like a filesystem: there are no special folders or names and Subversion itself has no knowledge of trunk or branches, they are just folders within the filesystem. It is up to you as the user to give those folders names and structure that are meaningful to you.

That said, there are several common layouts that have been adopted by the community as best practices and therefore one could think of these as recommendations. If your repository is accessible to the public, following these conventions might make it easier for users that have accessed other Subversion repositories to find what they are looking for.

There are two commonly used layouts:

trunk
branches
tags

This first layout is the best option for a repository that contains a single project or a set of projects that are tightly related to each other. This layout is useful because it is simple to branch or tag the entire project or a set of projects with a single command:

svn copy url://repos/trunk url://repos/tags/tagname -m "Create tagname"

This is probably the most commonly used repository layout and is used by many open source projects, like Subversion itself and Subclipse. This is the layout that most hosting sites like Tigris.org, SourceForge.net and Google Code follow as each project at these sites is given its own repository.

The next layout is the best option for a repository that contains unrelated or loosely related projects.

ProjectA
   trunk
   branches
   tags
ProjectB
   trunk
   branches
   tags

In this layout, each project receives a top-level folder and then the trunk/branches/tags folders are created beneath it. This is really the same layout as the first layout, it is just that instead of putting each project in its own repository, they are all in a single repository. The Apache Software Foundation uses this layout for their repository which contains all of their projects in one single repository.

With this layout, each project has its own branches and tags and it is easy to create them for the files in that project using one command, similar to the one previously shown:

svn copy url://repos/ProjectA/trunk url://repos/ProjectA/tags/tagname -m "Create tagname"

What you cannot easily do in this layout is create a branch or tag that contains files from both ProjectA and ProjectB.  You can still do it, but it requires multiple commands and you also have to decide if you are going to make a special folder for the branches and tags that involve multiple projects. If you are going to need to do this a lot, you might want to consider the first layout.

As for the names of the folders within the repository, again: they are just a convention. They have no special meaning to Subversion.

"trunk" is supposed to signify the main development line for the project. You could call this "main" or "mainline" or "production" or whatever you like.

"branches" is obviously supposed to be a place to create branches. People use branches for a lot of purposes. You might want to separate your release or maintenance branches from your feature branches or your customer modification branches etc. In this case, you could create a layer of folders beneath branches, or just create multiple branch folders at the top-level.

"tags" are not treated as special by Subversion either. They are a convention, perhaps enforced by hook script or authz rules, that indicate you are creating a point in time snapshot. Typically the difference between tags and branches is that the former are not modified once they are created. You might call your tag folders "releases" or "snapshots" or "baselines" or whatever term you prefer.

Remember, the significance of the name is for your benefit, not for Subversion. Finally, the architecture of Subversion, with its global revision number can often make the need for tags unnecessary. I do not think there is any reason to create tags just for the sake of creating them. If you find a need to recreate the software at a specific point in time, you can always do so by using svn log to determine the relevant revision number. Tags are best when there are "external" consumers of the repository. Maybe it is a QA/Release team that needs to perform builds, maybe it is an internal development team that wants to use releases of your code in another product, or maybe it is literally external users or customers who need to grab release snapshots from your repository. In these scenarios, creating tags is both a convenient way to be sure they get the right code, as well as a good communication mechanism to indicate the presence of release snapshots.

Hopefully this post clarifies some of these issues for you and makes it easier for you to understand how Subversion works.

I would like to finish by pointing out that a Subversion repository layout can be changed. You can always reorganize and restructure your layout after the fact. At worst, it just might create some short term pain as users adjust their working copies. It is not like you need to start over though. Just change names, move folders or do whatever to get the filesystem looking the way you want it.

Posted by Mark Phippard | Date: Apr 19, 2007 | Permalink | Comments (4) | TrackBack (0)

How Subversion conserves disk space

I wanted to share something from our March openCollabNet Technical Newsletter. If you do not get our newsletter yet, sign up for openCollabNet. It only takes a minute.

To keep the size of the repository as small as possible, Subversion uses deltification, also called "deltified storage". Deltification is the encoding of a chunk of data as a collection of differences against some other data. If two files are very similar, deltification results in storage savings because only the changes are stored, not the entire file.

This works differently depending on what filesystem back-end you use. In BDB (Berkeley Database) fulltexts are found at the tips of each distinct line of a file's history. When a change occurs, the new version is stored as fulltext, then the previous version is rewritten as a delta against that new version. FSFS stores deltas in the opposite direction so that old versions never need to be rewritten.  When a file is changed, the new version is stored as a delta against an older version.

Most source code files change frequently and Subversion's performance would degrade if it had to use every individual delta to re-create a file that has changed many times. Subversion uses "skip-deltas" to improve performance. Skip-deltas are deltas that are calculated not against the immediate next or previous version, but against a version that's closer in the chain of deltas to a fulltext representation of the file. This way the version of a file can be re-created using less deltas than when a delta for each individual change would be needed.

For repositories created with Subversion 1.4 or later, space savings increase further because the delta chunks are stored using a compression algorithm.

Posted by Guido Haarmans | Date: Mar 30, 2007 | Permalink | Comments (0) | TrackBack (0)

Authz and Anon Authn Agony

A recent first-time attempt at using Subversion's path-based authorization module turned out to be less trivial than I'd planned because I was trying to use it with a repository that allowed anonymous read access. Things went well at first — I did some copying and pasting of sample httpd.conf directives and authz file contents from Version Control with Subversion, tweaking as necessary to suit my needs. In a short time, I had what I thought was the perfect setup. I was wrong.

Say, like me, you wish to configure a repository such that it permits anonymous reads to most of it, authenticated reads to the rest of it, and authenticated writes to the whole thing. You already have an Apache htpasswd file with your writers' usernames and password hashes, and you've configured Apache to use that htpasswd file for authentication, and an authz file for authorization. You then make the obvious additions to your authz file:

[groups]
writers = someuser1, someuser2, …

[repository:/]
* = r
@writers = rw

[repository:/trunk/private-area]
* =
@writers = rw

There's a group with your writers' usernames. There's a rule which grants anonymous read to the world, and write access to just the writers. And there's an override rule which removes read access from unauthenticated users in the repository's private area. Looks great.

Then you start testing.

Upon checking out your repository's /trunk directory, anonymous users get what you'd expect — the tree, minus the /trunk/private-area directory.

But what about your authenticated would-be writers? Ah, therein lies the rub! There are no authenticated users. Since anonymous users can checkout the tree, Apache never bothers to query you for authentication credentials. And you can't force Subversion to transmit authentication credentials when Apache hasn't asked for them.

So what's the workaround?

First, you could disable anonymous access altogether, and force non-writers to share a username like "anonymous" and a publicized password. In your authz rules, the user "anonymous" would have only read permission, and only on the public portion of the repository. This works fine, but at some discomfort to non-writers. They now have to supply a password which, though not secret, might still be non-obvious and/or unknown to them.

Secondly, you could just leave things the way they are, and force writers to checkout just the private area of the repository separately. They won't have the luxury of both the public and private areas being connected inside a single working copy, but that might be okay.

Thirdly, you could keep the private stuff in its own repository. For writers, this is very similar to the second workaround. But your writers won't be able to make a private thing public without breaking the history across repositories.

Finally, you could setup a second <Location> block in your httpd.conf file which points to the same repository but with a slightly different URL (for example, with "-no-anon" appended to it). In this block, disallow anonymous access. Then add a matching redundant entry in your authz file, too:

[repository-no-anon:/]
* =
@writers = rw

Now, anonymous non-writers can checkout from the original repository URL without prompting, and won't see the private area. Non-anonymous writers can checkout from the alternate repository URL with prompting, and will see the private area.  (Thanks to Max Bowsher for this great hybrid workaround idea.)

Posted by C. Michael Pilato | Date: Mar 22, 2007 | Permalink | Comments (6) | TrackBack (0)

Run svnserve as a Windows Service

If you are going to run a Subversion server on Windows, you want it to run as a service. Doing so allows you to ensure the server is started automatically when the server reboots, and let’s face it, we have all had to reboot a Windows server once or twice. Running as a service has other benefits though too. For example, you can see if the service is running by using the Windows Management Console, even if you are working on a remote client. Likewise, you can stop and start the service from the console, again even if you are working remotely. In addition, it is a lot easier to script the stop and start of the server as part of your backup process; if that is something you want to do.

I recently wrote an article that was posted on openCollabNet that explains how to setup svnserve to run as a Windows service. This article describes how to do it using a new feature that was included in Subversion 1.4.

Posted by Mark Phippard | Date: Mar 12, 2007 | Permalink | Comments (0) | TrackBack (0)

RSS Syndicate this blog

Recent Posts

  • CollabNet Subversion 1.5.0 binaries available…
    Posted by Mark Phippard
  • Subversion 1.5.0 Released!…
    Posted by Mark Phippard
  • Subversion 1.5 - Release Candidate 9 Available…
    Posted by Guido Haarmans

Recent Site Comments

  • "Good afternoon: I've been trying to get a grip on SharpSVN…"

    Sky
  • "Another vote for 64 bit versions of the subversion client/s…"

    Matt Block
  • "svnadmin, svnlook etc. are only provided with the Server pa…"

    Mark Phippard
  • "The Windows binaries have been released: http://subversion.…"

    René Leonhardt
  • "Does CollabNet provide svnadmin.exe? It's not in the comman…"

    Stefan
  • "What ever happened to the binaries for Solaris v1.5? Is th…"

    Pat Podenski
  • "Joel, I also recall the server requires that the LSB Debia…"

    Mark Phippard
  • ©2008 CollabNet Corporation
    • Site Feedback
    • Terms of Use
    • Privacy Policy
    • Copyright & Trademark