Submerged - CollabNet's Subversion Blog

March 2007

How Subversion conserves disk space

I wanted to share something from our March openCollabNet Technical Newsletter. If you do not get our newsletter yet, sign up for openCollabNet. It only takes a minute.

To keep the size of the repository as small as possible, Subversion uses deltification, also called "deltified storage". Deltification is the encoding of a chunk of data as a collection of differences against some other data. If two files are very similar, deltification results in storage savings because only the changes are stored, not the entire file.
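To make the idea concrete, here is a minimal sketch of deltification in Python. It is not Subversion's actual delta format; the `make_delta` and `apply_delta` names and the instruction layout are assumptions made up for this illustration. It only shows the principle that a new version can be stored as "copy these ranges from the old version, plus these few new bytes".

```python
from difflib import SequenceMatcher


def make_delta(base, target):
    """Encode target as a list of instructions against base (illustrative format)."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged data: only remember where it lives in the base.
            ops.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            # New data has to be stored literally.
            ops.append(("insert", target[j1:j2]))
        # "delete" needs no instruction: that base range is simply never copied.
    return ops


def apply_delta(base, ops):
    """Rebuild the target text from the base text and the instructions."""
    pieces = []
    for op in ops:
        if op[0] == "copy":
            pieces.append(base[op[1]:op[2]])
        else:
            pieces.append(op[1])
    return "".join(pieces)


old_version = "int main() { return 0; }\n" * 50
new_version = old_version + "// one new comment line\n"

delta = make_delta(old_version, new_version)
literal_bytes = sum(len(op[1]) for op in delta if op[0] == "insert")
rebuilt = apply_delta(old_version, delta)
```

In this sketch only the text carried by the "insert" instructions has to be stored alongside the base; everything else is a reference into data the repository already holds.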

This works differently depending on what filesystem back-end you use. In BDB (Berkeley Database) fulltexts are found at the tips of each distinct line of a file's history. When a change occurs, the new version is stored as fulltext, then the previous version is rewritten as a delta against that new version. FSFS stores deltas in the opposite direction so that old versions never need to be rewritten.  When a file is changed, the new version is stored as a delta against an older version.

Most source code files change frequently, and Subversion's performance would degrade if it had to apply every individual delta to re-create a file that has changed many times. Subversion uses "skip-deltas" to improve performance. Skip-deltas are deltas that are calculated not against the immediate next or previous version, but against a version that is closer in the chain of deltas to a fulltext representation of the file. This way a version of a file can be re-created using fewer deltas than if a delta were needed for each individual change.
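One way to picture skip-deltas is the rule below, which is how I understand the scheme described in Subversion's own design notes for FSFS: number the changes to a file 1, 2, 3, ... and pick the delta base for change N by clearing the lowest set bit of N. Treat this as an illustration under that assumption, not a statement of what every back-end or version does. The function names are made up for the sketch, and 0 stands for "no base", i.e. a fulltext.

```python
def skip_delta_base(n):
    """Return the change number used as delta base for change n.

    Assumed rule: clear the lowest set bit of n (0 means fulltext).
    """
    return n & (n - 1)


def delta_chain(n):
    """List the versions that must be read to rebuild change n."""
    chain = [n]
    while n > 0:
        n = skip_delta_base(n)
        chain.append(n)
    return chain


chain_54 = delta_chain(54)
hops_for_1000 = len(delta_chain(1000)) - 1
```

With this rule the number of deltas needed grows with the number of 1 bits in N rather than with N itself, so a file changed a thousand times is still only a handful of deltas away from a fulltext.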

For repositories created with Subversion 1.4 or later, space savings increase further because the delta chunks are stored using a compression algorithm.

Posted by Guido Haarmans | Date: Mar 30, 2007 | Permalink | Comments (0) | TrackBack (0)

Subversion LDAP Authentication with Apache

More and more companies are using directory services for housing their user credentials and information.  Example directory services are Active Directory, eDirectory and OpenLDAP.  How does this relate to Subversion?  Well, in the enterprise deployments I've been involved with, most clients wanted to harness their existing directory services for their Subversion authentication.  This blog post will explain the simplicity of hooking up Apache to your directory service using mod_auth_ldap, giving you the ability to authenticate against your existing user data store.

As of now, the only way to utilize your directory service for authentication is by using Apache as your network layer.  This allows you to use any of the available authentication options to Apache for your Subversion authentication and with mod_auth_ldap, Apache can authenticate against your directory service for Subversion.

Before we get started modifying our Apache configuration file, let's look at the simplest Location directive possible for exposing a Subversion repository via Apache:

<Location /repos>
  # Enable Subversion
  DAV svn

  # Directory containing all repositories for this path
  SVNParentPath /absolute/path/to/directory/containing/your/repositories
</Location>

Now let's modify this to add mod_auth_ldap support for the authentication portion of the Location directive above:

<Location /repos>
  # Enable Subversion
  DAV svn

  # Directory containing all repositories for this path
  SVNParentPath /absolute/path/to/directory/containing/your/repositories

  # LDAP Authentication & Authorization is final; do not check other databases
  AuthLDAPAuthoritative on

  # Do basic password authentication in the clear
  AuthType Basic

  # The name of the protected area or "realm"
  AuthName "Your Subversion Repository"

  # Active Directory requires an authenticating DN to access records
  # This is the DN used to bind to the directory service
  # This is an Active Directory user account
  AuthLDAPBindDN "CN=someuser,CN=Users,DC=your,DC=domain"

  # This is the password for the AuthLDAPBindDN user in Active Directory
  AuthLDAPBindPassword somepassword

  # The LDAP query URL
  # Format: scheme://host:port/basedn?attribute?scope?filter
  # The URL below will search for all objects recursively below the basedn
  # and validate against the sAMAccountName attribute
  AuthLDAPURL "ldap://your.domain:389/DC=your,DC=domain?sAMAccountName?sub?(objectClass=*)"

  # Require authentication for this Location
  Require valid-user
</Location>

Use the in-line comments in the code above to better understand the Apache configuration directives for mod_auth_ldap.  With the above example (which you need to modify for your environment) you can have Apache authenticate your Subversion users against your Active Directory directory service.  The above will also work for other directory services but with minor modifications in the AuthLDAPURL.  For more information, you can consult the mod_auth_ldap documentation linked to in the first paragraph.  Although this post is short, I hope it adds value to those who read it.
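Since the AuthLDAPURL is the part you are most likely to change for another directory service, it can help to see how the `scheme://host:port/basedn?attribute?scope?filter` format from the comment above breaks apart. The following Python sketch is only an aid for reading such URLs; it is not how Apache parses them, the `parse_ldap_url` helper is hypothetical, and the fallback to port 389 when no port is given is an assumption of the sketch.

```python
def parse_ldap_url(url):
    """Split scheme://host:port/basedn?attribute?scope?filter into its parts."""
    scheme, _, rest = url.partition("://")
    hostport, _, rest = rest.partition("/")
    host, _, port = hostport.partition(":")
    fields = rest.split("?")
    # Pad so that missing trailing fields come back as empty strings.
    fields += [""] * (4 - len(fields))
    basedn, attribute, scope, search_filter = fields[:4]
    return {
        "scheme": scheme,
        "host": host,
        "port": int(port) if port else 389,  # assumption: default LDAP port
        "basedn": basedn,
        "attribute": attribute,
        "scope": scope,
        "filter": search_filter,
    }


EXAMPLE_URL = (
    "ldap://your.domain:389/DC=your,DC=domain"
    "?sAMAccountName?sub?(objectClass=*)"
)
parsed = parse_ldap_url(EXAMPLE_URL)
for key, value in parsed.items():
    print(f"{key:10} {value}")
```

For the example URL, the attribute is the field the username is matched against (sAMAccountName for Active Directory), and that is typically the piece you would swap when pointing at a different directory service.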

Posted by Jeremy Whitlock | Date: Mar 28, 2007 | Permalink | Comments (27) | TrackBack (0)

Multiple Subversion repositories?

On Wednesday CM Crossroads and CollabNet hosted a webinar: Subversion in the Enterprise, presented by C. Michael Pilato and Bob Jenkins from CollabNet plus Terrence Cordes, SCM manager at Reuters. Terry gave some great insights into deploying Subversion across global teams; I’ll post a link to a recording of this webinar here soon. Because the presenters only got to a few of the questions that were asked, we will answer some of the remaining ones in this blog over the next few weeks. Here is the first one, asked by a couple of people:

Do you recommend multiple repositories for distributed teams due to WAN performance? 

Subversion itself does not support synchronized repositories that are concurrently used for read and write. Subversion uses one central server. It was designed with the WAN in mind, and networking with low bandwidth requirements is built into the system.

Subversion’s working copy model means that the developer works on his or her code without needing to be connected to the server. You only need a connection with the server in a few cases, for instance for a commit or when updating your local working copy with changes from the repository. When data is sent back and forth, only differences are sent.

Mike Pilato actually touched on this during the webinar. If you make a small change to a large file and commit that change, only the change is sent across the network, not the entire file. This minimizes bandwidth requirements. Subversion only needs to send entire files across the wire the first time a developer checks out the repository.

The conclusion is that WAN performance is not an issue when considering Subversion, assuming your network is reliable.

Subversion does have some support for multiple repositories. With version 1.4 svnsync was introduced. This utility lets you replicate your repositories into any number of read-only copies. There are several usage models for this, with back-ups being the most common.

But there are other usages as well. For example, at EclipseCon I met some people from the Philippines who were asking about using multiple repositories to get around network downtime (we’ve all heard about the recent big internet outage in Asia). Their development team is in Los Angeles but build and test happens in the Philippines. This company can set up a main repository with read-write access in the US and use svnsync to make a remote copy for the build and test team. Should the international network go down, they can access the local read-only copy to make a build.

You can find out more about svnsync by typing “svnsync help” at the command prompt or by checking the online version of Version Control with Subversion. The authors are updating this book for release 1.4 and have a chapter on svnsync. (I cannot give you the exact URL of the svnsync chapter because the URL changes with the daily builds.)
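As a rough sketch of the build-and-test scenario above, the Python snippet below assembles the two svnsync invocations a mirror typically needs: one `initialize` to tie the mirror to its source, then `synchronize` whenever you want to pull new revisions. The URLs are hypothetical placeholders, and the snippet only builds and prints the commands rather than running them. As far as I know, the mirror repository must already exist and must allow revision property changes (via a pre-revprop-change hook) before initialization will succeed, so check “svnsync help” for your version first.

```python
def svnsync_commands(source_url, mirror_url):
    """Return the argument lists for setting up and refreshing a read-only mirror."""
    return [
        # One-time step: register source_url as the origin of the mirror.
        ["svnsync", "initialize", mirror_url, source_url],
        # Repeatable step: copy any revisions the mirror does not have yet.
        ["svnsync", "synchronize", mirror_url],
    ]


# Hypothetical URLs for a main repository and a remote read-only copy.
SOURCE = "http://svn.example.com/repos/project"
MIRROR = "file:///var/svn/mirror/project"

commands = svnsync_commands(SOURCE, MIRROR)
for argv in commands:
    print(" ".join(argv))
```

The synchronize step is the one you would schedule, for example from a job on the mirror host or from a post-commit hook on the main server.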

If you want to use Subversion and really need multiple read-write repositories, there is a solution: svk (its primary author is Chia-liang Kao). svk is a decentralized version control system built on top of Subversion. It supports things like repository mirroring and disconnected operation. Some people will prefer this, but before you choose a distributed repository solution, make sure you really need it. It does have advantages for developers who often work disconnected from the network: as with Subversion they can work offline, but they can also commit to a local repository. However, it can come at the cost of higher administration overhead, higher bandwidth requirements and more server infrastructure. Subversion’s centralized model is easier to deploy and maintain and, if you don’t need a distributed model, will have a lower cost of ownership.

Posted by Guido Haarmans | Date: Mar 23, 2007 | Permalink | Comments (3) | TrackBack (0)

Authz and Anon Authn Agony

A recent first-time attempt at using Subversion's path-based authorization module turned out to be less trivial than I'd planned because I was trying to use it with a repository that allowed anonymous read access. Things went well at first — I did some copying and pasting of sample httpd.conf directives and authz file contents from Version Control with Subversion, tweaking as necessary to suit my needs. In a short time, I had what I thought was the perfect setup. I was wrong.

Say, like me, you wish to configure a repository such that it permits anonymous reads to most of it, authenticated reads to the rest of it, and authenticated writes to the whole thing. You already have an Apache htpasswd file with your writers' usernames and password hashes, and you've configured Apache to use that htpasswd file for authentication, and an authz file for authorization. You then make the obvious additions to your authz file:

[groups]
writers = someuser1, someuser2, …

[repository:/]
* = r
@writers = rw

[repository:/trunk/private-area]
* =
@writers = rw

There's a group with your writers' usernames. There's a rule which grants anonymous read to the world, and write access to just the writers. And there's an override rule which removes read access from unauthenticated users in the repository's private area. Looks great.
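To check your reading of rules like these before you start testing against a live server, a small model can help. The Python sketch below is a deliberately simplified stand-in for the authorization module, not its real implementation. It assumes that the most specific path section containing an entry for the user wins, and that within one section a grant from any matching line beats an empty entry. The group holds only the two usernames spelled out above, and `access_for` is a hypothetical helper.

```python
GROUPS = {"writers": {"someuser1", "someuser2"}}

# (repository name, path) -> list of (who, permissions) entries
SECTIONS = {
    ("repository", "/"): [("*", "r"), ("@writers", "rw")],
    ("repository", "/trunk/private-area"): [("*", ""), ("@writers", "rw")],
}


def entry_matches(who, user):
    """True if an authz entry applies to user (None means anonymous)."""
    if who == "*":
        return True
    if who.startswith("@"):
        return user is not None and user in GROUPS.get(who[1:], set())
    return user is not None and who == user


def access_for(repo, path, user):
    """Return '', 'r' or 'rw' for user on path, walking up towards the root."""
    path = path.rstrip("/") or "/"
    while True:
        entries = SECTIONS.get((repo, path))
        if entries is not None:
            granted = set()
            matched = False
            for who, perms in entries:
                if entry_matches(who, user):
                    matched = True
                    granted |= set(perms)
            if matched:
                return "".join(flag for flag in "rw" if flag in granted)
        if path == "/":
            return ""  # nothing matched anywhere: no access
        path = path.rsplit("/", 1)[0] or "/"


anon_public = access_for("repository", "/trunk/README", None)
anon_private = access_for("repository", "/trunk/private-area/plan.txt", None)
writer_private = access_for("repository", "/trunk/private-area/plan.txt", "someuser1")
```

Note that the model takes the username as given. It says nothing about whether the server ever learns who the user is, which turns out to be the real problem.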

Then you start testing.

Upon checking out your repository's /trunk directory, anonymous users get what you'd expect — the tree, minus the /trunk/private-area directory.

But what about your authenticated would-be writers? Ah, therein lies the rub! There are no authenticated users. Since anonymous users can check out the tree, Apache never bothers to query you for authentication credentials. And you can't force Subversion to transmit authentication credentials when Apache hasn't asked for them.

So what's the workaround?

First, you could disable anonymous access altogether, and force non-writers to share a username like "anonymous" and a publicized password. In your authz rules, the user "anonymous" would have only read permission, and only on the public portion of the repository. This works fine, but at some discomfort to non-writers. They now have to supply a password which, though not secret, might still be non-obvious and/or unknown to them.

Secondly, you could just leave things the way they are, and force writers to check out just the private area of the repository separately. They won't have the luxury of both the public and private areas being connected inside a single working copy, but that might be okay.

Thirdly, you could keep the private stuff in its own repository. For writers, this is very similar to the second workaround. But your writers won't be able to make a private thing public without breaking the history across repositories.

Finally, you could set up a second <Location> block in your httpd.conf file which points to the same repository but with a slightly different URL (for example, with "-no-anon" appended to it). In this block, disallow anonymous access. Then add a matching redundant entry in your authz file, too:

[repository-no-anon:/]
* =
@writers = rw

Now, anonymous non-writers can check out from the original repository URL without prompting, and won't see the private area. Non-anonymous writers can check out from the alternate repository URL with prompting, and will see the private area. (Thanks to Max Bowsher for this great hybrid workaround idea.)

Posted by C. Michael Pilato | Date: Mar 22, 2007 | Permalink | Comments (6) | TrackBack (0)

Computing the differences between tags

A very common question asked on the Subversion mailing list is "How can I see the differences between two tags?" Of course, there are a lot of variants of this question, such as "What are the differences between trunk and a branch?" or between two branches. The person asking this question is almost always aware that they can run the svn diff command to get the differences, but usually they just want the list of differences at the file name level, not the complete line-level diff. Prior to Subversion 1.4, the answer to this question was usually that you had to parse the diff output and extract the file names.

With Subversion 1.4, however, a new option, --summarize, has been added to the diff command.  When this option is provided, the diff command just outputs the differences at the file level.  For example, to see the differences between the Subversion 1.3.1 and 1.3.2 tags I can run this command:

svn diff --summarize http://svn.collab.net/repos/svn/tags/1.3.1 http://svn.collab.net/repos/svn/tags/1.3.2
M      http://svn.collab.net/repos/svn/tags/1.3.1/STATUS
M      http://svn.collab.net/repos/svn/tags/1.3.1/build.conf
M      http://svn.collab.net/repos/svn/tags/1.3.1/configure.in
M      http://svn.collab.net/repos/svn/tags/1.3.1/build/ac-macros/aprutil.m4
M      http://svn.collab.net/repos/svn/tags/1.3.1/build/ac-macros/apr.m4
M      http://svn.collab.net/repos/svn/tags/1.3.1/build/ac-macros/swig.m4
M      http://svn.collab.net/repos/svn/tags/1.3.1/build/get-py-info.py
M      http://svn.collab.net/repos/svn/tags/1.3.1/build/generator/swig/external_runtime.py
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/include/svn_version.h
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/libsvn_wc/status.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/libsvn_wc/lock.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/bindings/swig/ruby/libsvn_swig_ruby/swigutil_rb.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/bindings/swig/INSTALL
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/bindings/swig/NOTES
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/mod_dav_svn/version.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/mod_dav_svn/repos.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/tests/clients/cmdline/stat_tests.py
A      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/tests/clients/cmdline/authz_tests.py
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/tests/clients/cmdline/svntest/actions.py
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/tests/clients/cmdline/svntest/main.py
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/tests/libsvn_repos/repos-test.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/libsvn_repos/commit.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/svnserve/serve.c
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/po/ja.po
M      http://svn.collab.net/repos/svn/tags/1.3.1/subversion/po/zh_TW.po
M      http://svn.collab.net/repos/svn/tags/1.3.1/contrib/client-side/svn_load_dirs.pl.in
M      http://svn.collab.net/repos/svn/tags/1.3.1/tools/hook-scripts/mailer/mailer.py
A      http://svn.collab.net/repos/svn/tags/1.3.1/tools/server-side/svnauthz-validate.c
A      http://svn.collab.net/repos/svn/tags/1.3.1/tools/server-side
M      http://svn.collab.net/repos/svn/tags/1.3.1/CHANGES
M      http://svn.collab.net/repos/svn/tags/1.3.1/packages/rpm/rhel-3/apr.patch
M      http://svn.collab.net/repos/svn/tags/1.3.1/packages/rpm/rhel-4/apr.patch
 M     http://svn.collab.net/repos/svn/tags/1.3.1

With the --summarize option provided to the diff command, the output shows the changes at the file level. The output is similar to what you would see with many of the other Subversion commands. The first column has a value to indicate whether the file was Added, Modified or Deleted; the second column does the same for properties. This makes it easy to parse the output in scripts. The --summarize option was a great enhancement to the diff command and solves this use case really well.
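As an example of the kind of script this enables, here is a small Python sketch that splits summarize output into its columns. It assumes the layout described above (one status character for the item, one for its properties, then the path) and works on text you have already captured; the `parse_summarize` helper and the shortened sample are made up for the illustration.

```python
def parse_summarize(output):
    """Turn `svn diff --summarize` text into (item, props, path) tuples."""
    changes = []
    for line in output.splitlines():
        if not line.strip():
            continue  # skip blank lines
        item_status = line[0]                             # A, M, D or a space
        prop_status = line[1] if len(line) > 1 else " "   # M or a space
        path = line[2:].strip()
        changes.append((item_status, prop_status, path))
    return changes


SAMPLE = (
    "M      http://svn.collab.net/repos/svn/tags/1.3.1/STATUS\n"
    "A      http://svn.collab.net/repos/svn/tags/1.3.1/tools/server-side\n"
    " M     http://svn.collab.net/repos/svn/tags/1.3.1\n"
)

changes = parse_summarize(SAMPLE)
added = [path for item, props, path in changes if item == "A"]
props_only = [path for item, props, path in changes if item == " " and props == "M"]
```

Keep the leading whitespace when you capture the output: a line that changes only properties starts with a space, so stripping lines before parsing would shift the columns.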

Posted by Mark Phippard | Date: Mar 16, 2007 | Permalink | Comments (8) | TrackBack (0)

Run svnserve as a Windows Service

If you are going to run a Subversion server on Windows, you want it to run as a service. Doing so ensures that the server is started automatically when the machine reboots, and let’s face it, we have all had to reboot a Windows server once or twice. Running as a service has other benefits too. For example, you can see if the service is running by using the Windows Management Console, even if you are working on a remote client. Likewise, you can stop and start the service from the console, again even if you are working remotely. In addition, it is a lot easier to script the stop and start of the server as part of your backup process, if that is something you want to do.

I recently wrote an article that was posted on openCollabNet that explains how to set up svnserve to run as a Windows service. This article describes how to do it using a new feature that was included in Subversion 1.4.

Posted by Mark Phippard | Date: Mar 12, 2007 | Permalink | Comments (0) | TrackBack (0)

Daylight Saving Time and Subversion

People have asked CollabNet about the implications for Subversion of the changes to daylight saving time. This year DST will start a few weeks earlier in many countries, and older operating systems are not aware of the change. The Subversion Development Team has posted a message on Tigris.org to answer the question. CollabNet customers on a hosted Subversion solution do not need to take any action; we will take care of patching the systems.

Posted by Guido Haarmans | Date: Mar 5, 2007 | Permalink | Comments (0) | TrackBack (0)

Welcome to Submerged, the Subversion blog

OpenCollabNet has been live for over 4 months now and has drawn quite a crowd. We have over 3,000 members and 60,000 unique visitors. What we were missing was a blog about Subversion. Here it is.

I have invited a number of my colleagues to blog here and we will focus on technical topics, especially about deploying Subversion for globally distributed development and large teams. Blog posts will come from our Subversion committers who work in the open source community, people like C. Michael Pilato, and you’ll see Mark Phippard who recently joined us; he is the lead on the Subclipse project. And some of our technical consultants can’t wait to share experiences about their work at customers who are doing large-scale deployments of Subversion.

If you have ideas about what you want us to blog about, send me an email or make suggestions in the comments.

Posted by Guido Haarmans | Date: Mar 4, 2007 | Permalink | Comments (0) | TrackBack (0)

  • ©2008 CollabNet Corporation