CollabNet
Submerged - CollabNet's Subversion Blog

Energizing Subversion

Today marks the ninth anniversary of my first commit to Subversion. Back then, the project was less than a year old, with folks actively coding on it for only six months so far. There was a command-line client. It could do commits and updates, but only to an XML storage format. There was no repository concept. No server to speak of. Pretty bare-bones. My first commit was a fix for some path handling issues – using the correct directory separator character on Windows platforms. Simple stuff. In retrospect, everything seemed relatively simple then. The goals were clear – to create a compelling replacement for CVS, where "compelling" meant "better, but similar enough to be easily embraced".

That was a long time ago. I've had two children since then, and one of them is already in school. "Time flies," and all that. But in software years, that was eons ago. Subversion has long since surpassed being a compelling replacement for CVS. Today, it's a compelling replacement for most things and a decent replacement for still more. Even statements like that are misleading – Subversion doesn't have to fuss as much with unseating other version control systems these days because it's become the default choice of version control for many segments of the software development market.

Unfortunately, time doesn't stop flying simply because you taste a little success. And looking backward while the world moves forward is only fun until you slam your skull into some of those harsh realities that inevitably lie in the path ahead of you. So I have a confession to make on behalf of the Subversion development community. We've been moving forward with no clear vision for some time now. This shouldn't come as any amount of surprise to my fellow developers. I'll fancy a guess that it's no surprise to many of our loyal users, either. That's not to say that we've been inactive – far from it. Our recent releases have simply been primarily focused around time-consuming features and tasks (merge tracking? a working copy library rewrite?) that are, for the most part, things we'd been hoping to get around to since our 1.0 release and just couldn't delay any further. The release process hasn't stalled. It's just been a while since we (the developers) were able to look any meaningful distance into the future, stake a claim to some set of features and enhancements that we were united in achieving, and confidently assert to our waiting consumers that, "Yes, that Thing You Wanted? You'll have it in 10 weeks' time."

There are many possible reasons for this: the easy stuff is finished, leaving only hard stuff to do; software maturity yields developer attrition; our millions of users keep us busy servicing the existing features; and so on. You might call them "excuses" if you have a particularly negative bent about you. You could call them "par for the course" if you've ever worked on the same software for a long period of time. I'm choosing to call them "motivation".

This year I and my CollabNet peers will be trying to address Subversion's lack of vision. Now obviously, we can't do this ourselves – even in its CollabNet-bolstered infancy, Subversion was a true open source project, governed internally and driven only as far and as fast as its individual human contributors afforded. But we can certainly strive to encourage a common interest in roadmap development amongst our peers. To that end, I've already begun working with some of my fellow developers (including our friends at WANdisco and Elego) to develop a realistic, achievable, living project roadmap for Subversion; I hope more of the Subversion workforce will be of like mind. With the project's recent move to the Apache Software Foundation, I'm optimistic – the ASF is a high-profile open source organization overflowing with talented developers who themselves are Subversion users.

Where do you fit in? Well, certainly the Subversion community wants to know what you need from Subversion. (If you've seen our issue tracker lately, you know that our users have never been shy about filing bug reports and feature requests.) But more than that, we need able programmers willing to help us chart and navigate this course. You don't have to be a stellar software developer – many of us wouldn't regard ourselves that highly, and anyway I've always found that interaction with an open-source community like this is highly conducive to improving your skill set while still serving the public need. You don't have to have oodles of free time on your hands, either – we'll take what you can give, and every little bit helps. What do you say? Visit our new "Getting Involved" web page, and learn how you can contribute to Subversion's success. Help us restore momentum and energy to Subversion.

Posted by C. Michael Pilato | Date: Jan 30, 2010 | Permalink | Comments (1) | TrackBack (0)

Subversion's Operational Logging: What It Is, and What It Ain't

Subversion's operational logging feature was introduced in Subversion 1.3 as a way to allow mod_dav_svn to record one-line summaries of the high-level Subversion operations that it serviced. Similar functionality was added to svnserve in Subversion 1.6. Originally, the feature was desired not because there was an absence of server-side logging, but because the logging provided by Apache was just too low-level for most folks to understand. The underlying protocols in use by mod_dav_svn (HTTP+WebDAV) are not Subversion-specific, or even Subversion-esque. The protocol is stateless, generic, chatty, and more-or-less incomprehensible to a person whose title tends more towards "Subversion Administrator" than "HTTP/WebDAV Server Administrator". Operational logging was intended to help that person make sense of Subversion's WebDAV traffic.

Over the past few years, I've witnessed no small amount of confusion about what this feature really does and does not provide. The primary cause of the confusion often boils down to some simple misunderstandings about how the feature accommodates Subversion's WebDAV interactions, and the limitations of its design. I hope to clarify some of that in this post.

A Subversion client (just as any other WebDAV client) typically communicates with Apache/mod_dav_svn not by using some magic, single request that represents "a high-level operation", but via a whole series of requests, each designed to accomplish some portion of that high-level operation. Those requests might all arrive on a single Apache TCP/IP connection, or they may use many connections. They may arrive in serial, or portions of them may arrive in parallel. They may arrive in rapid succession, or they may arrive occasionally over the course of an unbounded period of time. As a Subversion user, you're content if they arrive at all and the tool does what you've asked it to do, but the details of these arrivals do play a role in the server-side operational logging feature.

Many of the request types (especially the read-only ones) are common to several (or all) of the typical Subversion client operations. An OPTIONS request, for example, is used in pretty much all Subversion client/server connections to negotiate features. But the request type and target URL alone generally aren't enough information to discern which high-level operation the user has invoked. For example, PROPFINDs come in all shapes and sizes and are used all over the place, sometimes for what might be considered sub-tasks (calculating resource URLs and such), but sometimes to fulfill the principal task initiated by the user. REPORT requests also come in different flavors, some of them used for relatively simple information queries on the server, but some of them for what is arguably the bulk of a high-level operation. It is via REPORT requests (perhaps with some supplemental, smaller requests) that a Subversion client fetches the directory layout, file contents, and versioned properties needed to service a checkout, update, diff, or merge. Another type of REPORT request services a log operation by providing the entirety of the requested revision metadata.

Apache logs all of these requests via its standard logging mechanisms. The challenge for mod_dav_svn is to find a way to distill a series of requests into a concise description of an operation that a human could (and would care to) comprehend. As originally proposed, this feature would involve Subversion clients simply providing this high-level operation information to the Apache server as part of the requests sent to handle that operation. There would be APIs for clients to use to share with the underlying Repository Access (RA) layer exactly what the client was trying to do, and the RA layer would pass that information to the server, and the server would log it while servicing the requests. This idea was shot down for a couple of reasons, though. First, a modified Subversion client could be tweaked to lie about what it was and wasn't doing, leaving administrators with unreliable logs (which they might not even know are unreliable). Secondly, because WebDAV is a stateless protocol, there would be additional work required to avoid logging the same high-level operation multiple times, once per low-level WebDAV request. So the feature was re-proposed with the design that was eventually implemented—mod_dav_svn would only attempt to log a high-level operation summary for certain request types, namely those that can be unambiguously recognized. In other words, rather than logging what the client claimed that the user was trying to do, the server would log what the server positively knew it had already done.

Now, generally speaking, there are one or two key requests in most of the Subversion client operations which can be identified meaningfully by the server and logged. A MERGE request is the final step in any operation that seeks to create a new revision in the repository, so when one of those requests succeeds, mod_dav_svn will log the fact that a commit has occurred. The REPORT request types previously mentioned are pretty meaty, and tend to be the central request used to service the most common, high-level, get-data-from-the-server types of client operations. These generate an operation log entry because the server can say with reliability, "I just transmitted a response to a request for an update report" or "I just transmitted a response to a request for a revision log report". Maybe the client is actually using those responses to perform update or log operations, or maybe it isn't—the server can't know.

Unfortunately, because the protocol is often driven in small steps that aren't in themselves unique to a particular operation and, again, because what is logged is the server's operation and not the client's, the logged results aren't always as precise as you might like. A client action might result in more than one operation being logged, or none at all. Or, the server log might say (effectively, not literally), "I sent the properties for a file", but whether that was the result of the user running 'svn propget', 'svn propedit', 'svn info', or something else cannot be determined. Or the client might run 'svn export' and the server log says, "Well, what I sent could have been used for checkout or export, but I'm not sure which." Depending on how the server is driven by the client (details I'll not dive into here), a checkout might log a single line ("Yup, I did a checkout or export!") or it might log hundreds ("Yup, I did a checkout or export! And I sent a file's contents. And another file's contents. And another file's…. "). Likewise, except in very specific cases, those somewhat generic PROPFINDs are not logged via the operational logging mechanism because the server can't reliably know what high-level request they might be part of (if any at all) and because, lacking that information, logging these requests would just be redundant with what is already recorded in the primary Apache logs.
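To get a feel for how chatty (or silent) a given client action was on the server side, you can tally the operational log by action. The following is only a minimal Python sketch. It assumes the log was written with a CustomLog format along the lines of "%t %u %{SVN-ACTION}e", and the sample lines, user names, and paths are made up for illustration; your own log format and action strings may differ.

```python
from collections import Counter

# Hypothetical sample lines, assuming a log format of: timestamp, user, action.
# A "-" in the user column stands for an unauthenticated request.
SAMPLE_LOG = """\
[09/Dec/2009:10:15:02 -0500] sally checkout-or-export /trunk r34866 depth=infinity
[09/Dec/2009:10:15:03 -0500] sally get-file /trunk/README r34866 text
[09/Dec/2009:10:15:03 -0500] sally get-file /trunk/INSTALL r34866 text
[09/Dec/2009:10:17:40 -0500] harry commit r34867
[09/Dec/2009:10:18:11 -0500] - log (/trunk) r34867:0 revprops=all
"""


def parse_line(line):
    """Split one log line into (timestamp, user, action, detail).

    Returns None for lines that do not match the assumed format.
    """
    if not line.startswith("["):
        return None
    close = line.find("]")
    if close == -1:
        return None
    timestamp = line[1:close]
    fields = line[close + 1:].split()
    if len(fields) < 2:
        return None
    user, action = fields[0], fields[1]
    detail = " ".join(fields[2:])
    return timestamp, user, action, detail


def tally_actions(text):
    """Count how many times each logged action appears."""
    counts = Counter()
    for line in text.splitlines():
        parsed = parse_line(line)
        if parsed is not None:
            counts[parsed[2]] += 1
    return counts


if __name__ == "__main__":
    for action, count in tally_actions(SAMPLE_LOG).most_common():
        print(f"{count:4d}  {action}")
```

Note that the tally counts what the server did, not what users ran: in the sample above, one checkout accounts for three of the five lines.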

Someone reading this might be tempted to assume that the subliminal message here is that Subversion's operational logging is without value. That's not the case. At a minimum, it offers something that Apache's primary logging mechanism cannot: the ability to disambiguate requests based on the parameters of the request found in its body. To Apache, a REPORT against a default VCC URL is a REPORT against a default VCC URL; as mod_dav_svn processes the body of that REPORT request, though, it can tell the difference between an update-style report request, a log request, a mergeinfo request, and so on. Apache also can't tell you the Subversion-specific bits about what it has done: the access_log will note that a MERGE completed successfully, but only the Subversion operational log can say that this resulted in r34867 being committed in the repository.
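The idea of disambiguating requests by their body can be illustrated with a toy classifier. This is a rough Python sketch and not mod_dav_svn's actual code: the function name, the returned labels, and the sample body are hypothetical, and the report element names (update-report, log-report, mergeinfo-report) are written from memory of the protocol, so treat them as assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed root element names of REPORT request bodies, mapped to a
# human-readable label for the operation the server performed.
REPORT_KINDS = {
    "update-report": "update-style report",
    "log-report": "log",
    "mergeinfo-report": "mergeinfo",
}


def classify_request(method, body=""):
    """Return a label for what the server can positively say it did.

    Returns None when the request cannot be tied to a high-level
    operation (a generic PROPFIND, for example).
    """
    if method == "MERGE":
        # A successful MERGE is the final step of creating a revision.
        return "commit"
    if method == "REPORT":
        root = ET.fromstring(body)
        # Drop the "{namespace}" prefix that ElementTree adds to tags.
        local_name = root.tag.rsplit("}", 1)[-1]
        return REPORT_KINDS.get(local_name)
    return None


if __name__ == "__main__":
    sample_body = (
        '<S:log-report xmlns:S="svn:">'
        "<S:start-revision>12</S:start-revision>"
        "</S:log-report>"
    )
    print(classify_request("REPORT", sample_body))
    print(classify_request("MERGE"))
    print(classify_request("PROPFIND"))
```

Two REPORT requests against the same URL look identical in the access_log, but the classifier above can tell them apart because it looks inside the body.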

Is there room for improvement here? Absolutely. I'd like to see the idea of a client-transmitted operation revisited in earnest, but implemented in a way that allows for easy validation against the server's records of what it actually did. In the meantime, though, I hope that a healthy understanding of the benefits and limitations of the feature as it exists today will prove useful when swimming through logs and attempting to "go all CSI on" events of the past.

Posted by C. Michael Pilato | Date: Dec 9, 2009 | Permalink | Comments (0) | TrackBack (0)

Where Did That Mergeinfo Come From?

One area of merge tracking that has caused confusion is how svn:mergeinfo properties are set.  In many common use cases only mergeinfo on the merge target is set.  There are, however, many scenarios where mergeinfo under the target, so-called "subtree mergeinfo", is created or updated.  These situations often leave users wondering if something went wrong during the merge and if they should commit these subtree mergeinfo changes.  That confusion is compounded by the fact that a given subtree may have no changes other than those made to the svn:mergeinfo property itself.

This post describes some common cases where subtree mergeinfo is created or updated during a merge.

Pre-existing Subtree Mergeinfo

By far the most common cause of subtree mergeinfo changes is when your working copy merge target has subtree mergeinfo prior to a merge.

For example, we have a branch B1.0, created from trunk, that has some subtree mergeinfo:

1.6.6>svn propget svn:mergeinfo branches/B1.0 -vR
Properties on 'branches/B1.0/A/D/H/psi':
  svn:mergeinfo
    /trunk/A/D/H/psi:4-10

We merge revision 12 from trunk to B1.0 and the output of the merge shows that a new file is added:

1.6.6>svn merge ^/trunk branches/B1.0 -c12
--- Merging r12 into 'branches/B1.0':
A    branches/B1.0/A/C/nu

But when we check the status of the working copy we see some property changes:

1.6.6>svn status
 M      branches/B1.0
A  +    branches/B1.0/A/C/nu
 M      branches/B1.0/A/D/H/psi

Checking the diff on B1.0 alone we see that mergeinfo has been added which describes the merge we just performed, namely r12 from trunk.  That makes sense, it's what we just did!

1.6.6>svn diff --depth empty branches/B1.0

Property changes on: branches/B1.0
___________________________________________________________________
Added: svn:mergeinfo
   Merged /trunk:r12

But what about the property change on A/D/H/psi?

1.6.6>svn diff branches/B1.0/A/D/H/psi

Property changes on: branches/B1.0/A/D/H/psi
___________________________________________________________________
Modified: svn:mergeinfo
   Merged /trunk/A/D/H/psi:r12

Oh, it's a mergeinfo change.  But wait, we happen to know that r12 didn't affect psi and we check the log to confirm this:

1.6.6>svn log -v -r12
------------------------------------------------------------------------
r12 | pburba | 2009-11-17 18:21:47 -0500 (Tue, 17 Nov 2009) | 1 line
Changed paths:
   A /trunk/A/C/nu.c

So why did the mergeinfo on psi change?  The short answer is that for all versions of Subversion up to 1.6.x, subtree mergeinfo is always updated to describe the merge, regardless of whether the subtree was affected by the merge.  This was done to support easier mergeinfo elision* and for consistency in the resulting mergeinfo between single merges of a range and multiple individual merges amounting to the same thing (e.g. svn merge ^/trunk branch -r 100:103 or svn merge ^/trunk branch -c101, svn merge ^/trunk branch -c102, and svn merge ^/trunk branch -c103 would both result in the same mergeinfo).
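The consistency argument is easier to see with a toy model of how merged revisions collapse into ranges.  This is a minimal Python sketch and not Subversion's actual implementation; the helper name to_rangelist is made up for illustration.

```python
def to_rangelist(revisions):
    """Collapse merged revision numbers into a mergeinfo-style range string.

    Consecutive revisions are joined into "start-end"; isolated revisions
    stand alone, and ranges are separated by commas.
    """
    ranges = []
    for rev in sorted(set(revisions)):
        if ranges and rev == ranges[-1][1] + 1:
            ranges[-1][1] = rev          # extend the current range
        else:
            ranges.append([rev, rev])    # start a new range
    return ",".join(
        str(start) if start == end else f"{start}-{end}"
        for start, end in ranges
    )


if __name__ == "__main__":
    # One merge of -r 100:103 applies the changes made in r101, r102 and r103.
    one_merge = to_rangelist(range(101, 104))

    # Three cherry picks: -c101, then -c102, then -c103.
    merged = []
    for rev in (101, 102, 103):
        merged.append(rev)
    three_merges = to_rangelist(merged)

    print(one_merge, three_merges)
```

Either way the recorded revisions are the same set, so the resulting range string is the same.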

Early on, we in the development community realized this behavior was causing more confusion than was justified by the somewhat minor benefits.  Surely only subtrees affected by a given merge should have their mergeinfo updated.  There was little dissent on that point, and from a coding perspective the change was almost laughably simple.  Unfortunately making this change had some unforeseen consequences on subsequent merges.  I won't go into the details here, but one of these consequences was the potential for dramatically decreased merge performance.

The good news is that ultimately we were able to preserve (and in many cases improve) merge performance while stopping the recording of mergeinfo changes on subtrees unaffected by a merge.  The bad news is that these changes were extensive and will not be backported to the 1.5.x or 1.6.x releases but rather will be available only in 1.7.

In the meantime, if you are wondering whether you should commit these subtree mergeinfo changes, you should.  If you elect to revert them before committing your merge, you won't be able to use the merge --reintegrate option any more and, worse, for long-lived branches you will suffer a performance hit on subsequent merges that can easily reach intolerable levels.

Property Diffs

The svn:mergeinfo property is a versioned property and just like any other versioned property the difference applied by a merge may include changes to or additions of that property.

For example, we might have a branch working copy with no mergeinfo at all:

1.6.6>svn propget svn:mergeinfo branches/B3.0 -vR

We cherry pick a single revision from our trunk:

1.6.6>svn merge ^/trunk branches/B3.0 -c21
--- Merging r21 into 'branches/B3.0':
UU   branches/B3.0/A/D/gamma

Notice the 'U' in the second column?  A property change was part of r21, which we can see when checking the working copy's status:

1.6.6>svn status
 M      branches/B3.0
MM      branches/B3.0/A/D/gamma

Given the topic of this blog, it's no surprise that the incoming property change was to the svn:mergeinfo property:

1.6.6>svn propget svn:mergeinfo branches/B3.0 -vR
Properties on 'branches/B3.0':
  svn:mergeinfo
    /trunk:21
Properties on 'branches/B3.0/A/D/gamma':
  svn:mergeinfo
    /branches/B1.0/A/D/gamma:20
    /trunk/A/D/gamma:21

If we check the diff of the merge source we can see that a svn:mergeinfo property was added to trunk/A/D/gamma in r21:

1.6.6>svn diff -r20:21 ^/trunk
Index: A/D/gamma
===================================================================
--- A/D/gamma   (revision 20)
+++ A/D/gamma   (revision 21)
@@ -1 +1 @@
-the new gamma file
+nc

Property changes on: A/D/gamma
___________________________________________________________________
Added: svn:mergeinfo
   Merged /branches/B1.0/A/D/gamma:r20

Situations like this can happen in any number of ways, but typically involve subtree merges (i.e. merges targeting some target below the root of a branch) which are later merged as whole branch merges (i.e. merges targeting the root of the branch) to some other branch.

Missing Subtrees

The remaining cases all have one thing in common: subtree mergeinfo is recorded because parts of the merge target are "missing".  They can be missing for several different reasons, but the resulting mergeinfo is quite similar.

Shallow Working Copies

If you merge to shallow working copies (i.e. those set to a depth other than infinity) then you will see subtree mergeinfo created.

For example, say we check out a branch working copy at depth immediates:

1.6.6>svn checkout %ROOT_URL%/branches/B2.0 --depth immediates
A    B2.0/A
Checked out revision 13.

In this case the branch has no mergeinfo whatsoever:

1.6.6>svn propget svn:mergeinfo -vR B2.0

Now we merge all available revisions from trunk.  Since part of the working copy is missing due to the shallow checkout we get several tree conflicts:

1.6.6>svn merge ^/trunk B2.0
--- Merging r4 through r13 into 'B2.0/A':
   C B2.0/A/mu
   C B2.0/A/C
   C B2.0/A/D
Summary of conflicts:
  Tree conflicts: 3

We also get new mergeinfo added to B2.0/A:

1.6.6>svn status B2.0
 M      B2.0
 M      B2.0/A
!     C B2.0/A/mu
      >   local missing, incoming edit upon merge
!     C B2.0/A/C
      >   local delete, incoming edit upon merge
!     C B2.0/A/D
      >   local delete, incoming edit upon merge

1.6.6>svn propget svn:mergeinfo -vR B2.0
Properties on 'B2.0':
  svn:mergeinfo
    /trunk:4-13
Properties on 'B2.0/A':
  svn:mergeinfo
    /trunk/A:4-13*

This happens because the svn:mergeinfo property is inheritable.  Since we don't actually have any of the children of B2.0/A in our working copy we haven't actually merged r3:13 to them from trunk.  The non-inheritable mergeinfo range on B2.0/A, "/trunk/A:4-13*", means effectively that "we have merged r3:13 from trunk to this depth, but no further".
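If you want to inspect property values like these in a script, the format is simple enough to pick apart by hand.  The following is a minimal Python sketch under a few assumptions: the function name parse_mergeinfo and the tuple layout are made up for illustration, and the input is the raw property value (one "path:ranges" entry per line) rather than the indented output of svn propget -v.

```python
def parse_mergeinfo(value):
    """Parse an svn:mergeinfo property value.

    Returns a dict mapping each merge source path to a list of
    (start, end, inheritable) tuples.  A trailing "*" on a range marks
    it as non-inheritable.
    """
    result = {}
    for line in value.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon, since the ranges never contain one.
        path, _, rangelist = line.rpartition(":")
        ranges = []
        for item in rangelist.split(","):
            inheritable = not item.endswith("*")
            item = item.rstrip("*")
            start, _, end = item.partition("-")
            # A single revision such as "21" is a range of one.
            ranges.append((int(start), int(end or start), inheritable))
        result[path] = ranges
    return result


if __name__ == "__main__":
    print(parse_mergeinfo("/trunk/A:4-13*"))
    print(parse_mergeinfo("/branches/B1.0/A/D/gamma:20\n/trunk/A/D/gamma:21"))
```

For "/trunk/A:4-13*" the inheritable flag comes back False, which is the "merged to this depth, but no further" case described above.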

Without this non-inheritable mergeinfo on B2.0/A, if we resolved the tree conflicts and committed this merge and then another user checked out a full depth working copy of the branch it would appear to them that r3:13 was completely merged to the branch, which is obviously not the case.

This type of subtree mergeinfo is easily avoided if you always merge into full depth working copies.  If you need to merge into shallow working copies just keep this behavior in mind and be sure to commit all the subtree mergeinfo.

Shallow Merges

Closely related to shallow working copies are shallow merges.  Here it is not the depth of the merge target that is shallow, but rather the depth of the merge itself as specified with the --depth option to the merge command (you can even do shallow merges into shallow working copies).

For example, we check out a full depth working copy of a branch:

1.6.6>svn checkout %ROOT_URL%/branches/B2.0 B2
A    B2/A
A    B2/A/B
A    B2/A/B/lambda
A    B2/A/B/E
A    B2/A/B/E/alpha
A    B2/A/B/E/beta
A    B2/A/B/F
A    B2/A/mu
A    B2/A/C
A    B2/A/D
A    B2/A/D/gamma
A    B2/A/D/G
A    B2/A/D/G/pi
A    B2/A/D/G/rho
A    B2/A/D/G/tau
A    B2/A/D/H
A    B2/A/D/H/chi
A    B2/A/D/H/omega
A    B2/A/D/H/psi
Checked out revision 13.

A shallow merge "descends" only as far as the requested depth, in this case to the target's immediate children:

1.6.6>svn merge ^/trunk/A B2/A --depth immediates
--- Merging r4 through r13 into 'B2/A/mu':
U    B2/A/mu

Like a merge into a shallow working copy, non-inheritable mergeinfo is set on the limits of what was merged to:

1.6.6>svn status B2
 M      B2
 M      B2/A/B
M       B2/A/mu
 M      B2/A/C
 M      B2/A/D

1.6.6>svn propget svn:mergeinfo -vR B2
Properties on 'B2':
  svn:mergeinfo
    /trunk:4-13
Properties on 'B2/A/B':
  svn:mergeinfo
    /trunk/A/B:4-13*
Properties on 'B2/A/C':
  svn:mergeinfo
    /trunk/A/C:4-13*
Properties on 'B2/A/D':
  svn:mergeinfo
    /trunk/A/D:4-13*

Again, this can be avoided by doing full --depth infinity merges.

Switched Subtrees

If you have switched subtrees in your merge target these are treated almost exactly the same as if those subtrees were at depth empty.  The main difference is that the merge will record normal inheritable mergeinfo on the root of the switched subtree, since from that point downward you do have the full working copy (barring switched subtrees within switched subtrees, or shallow switched subtrees, etc).

Authorization Restrictions

Another case of "missing subtrees" is when you merge into a working copy which you don't have full read access to.  Again you will see non-inheritable mergeinfo added around the missing subtree.  I suspect this is a rare occurrence, but if you suddenly see subtree mergeinfo appearing after a merge, and none of the preceding reasons apply, this may be the cause.

* If you are unfamiliar with mergeinfo elision you can read about it here http://www.open.collab.net/community/subversion/articles/merge-info.html.

Posted by pburba | Date: Nov 19, 2009 | Permalink | Comments (0) | TrackBack (0)

SubConf 2009 – A Report

SubConf is the annual conference of the Subversion version control system's project community. SubConf 2009, the third such event, was held in Munich, Germany, from 27 to 29 October 2009. Although SubConf is a user conference that draws Subversion users from various parts of the world, it also hosts developer hackathons in which Subversion core developers come together to discuss the Subversion roadmap, hack on code, and so on. The developers also meet users to gather feedback and study their needs so that future releases can cater to them. This year's three-day conference was a great success.

Ten core developers of the Subversion project were at the conference venue: Stephen Butler (Elego), Stefan Sperling (Elego), Neels Hofmeyr (Elego), Julian Foad (WANdisco), Greg Stein (popular open source developer), Hyrum K. Wright (Subversion Corp), Lieven Govaerts, Bert Huijben (The Competence Group), Senthil Kumaran (CollabNet, Inc.), and C. Michael Pilato (CollabNet, Inc.). The core developers were locked up in a room in the conference hotel for all three days (the hackathon), where they discussed various topics related to Subversion development, such as the Working Copy Next Generation (WC-NG) library, the usage of scratch pools and iterpools in the Subversion code base, the release roadmap, and interesting issues to work on. Of course, the hackathon was not just discussion. We also got some real programming done: there were approximately 70 commits to the Subversion repository, with close to +46696/-36666 lines of change!

The first day of the conference officially started in the evening, around 7:00 pm, with the Subversion Round Table, where Subversion users from various organizations posed their queries and gave feedback about Subversion. They also explored the possibility of getting features introduced in future releases of Subversion. This fruitful discussion brings new requirements to the Subversion open source project every year, directly from its actual users.

On the second day of the conference, many talks on version control systems were scheduled. The keynote was delivered by C. Michael Pilato, a long-term Subversion developer (since January 2001). He spoke about the history of Subversion, the way the community works, why CollabNet chose to make Subversion an open source project, and more. It was refreshing to see the legacy of the Subversion community and the advances it has made through the years! The developers' message to users at the conference was a request to do real testing of pre-release versions of Subversion (though not on production data), both to catch bugs early and because developers find it difficult, mainly for lack of computing resources, to mimic the varied environments in which Subversion is deployed in organizations. The developers would like to hear from organizations interested in offering resources for testing Subversion. Users asked for pre-release Subversion binaries; the Subversion community currently provides only source tarballs, but the developers took note of the request and will work on some mechanism to provide binaries in the future. FWIW, the Subversion project has recently started providing nightly tarballs of the latest trunk development sources: http://orac.ece.utexas.edu/pub/svn/nightly/

Some of the talks given on the second and third days of the conference were as follows (there were more talks, but they were not in English):

  • Subversion Release Process by Hyrum Wright (Release manager of Subversion project) and Stefan Sperling

  • Bringing Subversion to the Java (TM) World by Alexander Kitaev and Alexander Sinyushkin

  • WC-NG (Subversion's new working copy management library) by Hyrum Wright

  • Comparing Apples to Oranges – Subversion, git and Mercurial by Stefan Sperling and Stephen Butler

  • Moving from SVN to Mercurial by Zsolt Koppany and Janos Koppany

  • Server Side Java bindings for Subversion by Dave Brown

  • SVN Obliterate by Julian Foad

  • Coding Control by Tony Smith from Perforce Software

See http://2009.subconf.de/vortraege/1-konferenztag/ , http://2009.subconf.de/vortraege/2-konferenztag/ for presentation slides.

Another interesting takeaway from the conference was the Subversion community's feeling about distributed version control systems (DVCS). The community is excited about DVCS, since we are part of advancing the “state of the art”, and we are happy that we finally have competitors in the version control world :) With the latest improvements to the WC-NG library, Subversion will be able to gain features like offline commits and shelving. It is premature to talk about them now, but they are possible in the foreseeable future.

It was a nice experience for me to hang around with the Subversion developers at the conference, whom I've known for the past two years via email. We also had a surprise the week after the conference: the announcement at ApacheCon 2009 that the Subversion project is finding a new home at the Apache Software Foundation! With announcements and user conferences like these, the Subversion community advances at a faster pace to make this extraordinary piece of version control software even better!

Related links

Detailed Report - Day1

Detailed Report - Day2

Detailed Report - Day3

Pictures - Day1

Pictures - Day2

Pictures - Day3

SubConf website

Posted by Senthil Kumaran S | Date: Nov 9, 2009 | Permalink | Comments (0) | TrackBack (0)

Subversion - As Strong As Ever

Yesterday at ApacheCon I witnessed a significant milestone for CollabNet, the Apache Software Foundation (ASF) and Subversion. The CollabNet-sponsored Subversion project and The Apache Software Foundation (ASF) announced that the Subversion project has formally submitted itself to the Apache Incubator in order to become part of the Foundation's efforts. The announcement was greeted with a crescendo of applause at the conference and with comments like ‘what a nice 10th birthday gift’. This logical progression for Subversion comes as Apache and Subversion are completing their first 10 years as open source communities. From a people perspective, many of the same people founded and continue to work on both projects. From a technology perspective, both projects utilize capabilities of the other. In return this move is expected to benefit Subversion and CollabNet by providing outreach into the large ASF committer base and from their semi-annual developer events like today’s ApacheCon that attracted an estimated 500 guests.

The transition for Subversion comes at a time when CollabNet’s sponsorship has established Subversion as the market-leading SCM product. The ASF transition should help Subversion extend its position, that’s right, extend its position. However, it crossed my mind that this move might be perceived by some as an attempt to resurrect Subversion from a downward activity trend. CollabNet and Subversion folks would say that this is just not true, that Subversion is as strong as ever, but is there independent data to validate that? I decided to do some research of my own by checking in at www.ohloh.net to gather some Subversion activity metrics (thanks to the people over at Ohloh). Oh, and I added Git and Mercurial to provide some additional color to my analysis. Here are some interesting charts:

Code Commit Activity:

SVN-GIT-Mercurial commits
 

Line of Code Growth:

SVN-GIT-Mercurial codebase comparison

I’ll let you form your own conclusion but I believe that Subversion is as strong as ever.

Next - my perspective on CollabNet and Subversion in 2010.

Posted by Richard Murray | Date: Nov 5, 2009 | Permalink | Comments (0) | TrackBack (0)

Using client certificate with Apache and Subversion



This is not a typical use case for anyone who uses client certificates with Apache and Subversion.  In general, the client certificate is used for all Apache requests, including the SVN-related ones. This use case is a bit different: it uses client certificates for all Apache requests except Subversion requests. This sounds like a straightforward change to the Apache configuration file, but it is not.

Usual workaround

The SSLVerifyClient optional directive is used to request a client certificate for authentication. If it is specified inside the <Location /> block, all requests go through client certificate based authentication by default. The SSLVerifyClient none directive turns this authentication off. If it is specified inside the <Location /svn> block, Subversion requests are exempted from it.
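
A minimal sketch of this arrangement, assuming an SSL-enabled virtual host and a repository served under /svn. The SSLVerifyDepth value is borrowed from the snippet later in this post, and the rest of the virtual host configuration is omitted:

```apache
# Request a client certificate for every request by default.
<Location />
  SSLVerifyClient optional
  SSLVerifyDepth 2
</Location>

# Subversion requests skip client certificate based authentication.
<Location /svn>
  SSLVerifyClient none
</Location>
```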

413 -- Request Entity Too Large

If we use the above workaround, we get a 413 Request Entity Too Large error while uploading large files using the POST method. This is due to bug 12355. According to the bug report, the issue occurs when the SSLVerifyClient optional directive is specified inside the <Location /> block. The bug report claims that it is fixed in Apache 2.0.55, but I faced this issue even in Apache 2.2.11.

The workaround is to specify SSLVerifyClient optional at the virtual host level. But then this setting can be overridden only using a <Directory> directive. In our case, it cannot be overridden using the <Location /svn> directive, so client certificate based authentication is enforced even for SVN requests.

SSLRenegBufferSize directive in Apache 2.2.12

The 413 Request Entity Too Large error occurs when an SSL renegotiation is attempted, which happens because we specified SSLVerifyClient optional inside the <Location /> block. The request body has to be buffered during the renegotiation, and the default buffer size, which the mod_ssl documentation for SSLRenegBufferSize lists as 131072 bytes, is not sufficient for large uploads. In Apache 2.2.12, the SSLRenegBufferSize directive is introduced precisely to configure the buffer size. I have not tried this in Apache 2.2.12 yet.

Snippet from Apache 2.2.12 changelog file.

*) mod_ssl: Add SSLRenegBufferSize directive to allow changing the
   size of the buffer used for the request-body where necessary
   during a per-dir renegotiation. PR 39243. [Joe Orton]
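
For reference, here is a sketch of how the directive might be used once on Apache 2.2.12 or later. It is untested, and the 10 MB value is an arbitrary assumption that would need to match the largest expected upload:

```apache
<Location />
  SSLVerifyClient optional
  SSLVerifyDepth 2
  # Assumed value: buffer request bodies of up to 10 MB (10 * 1024 * 1024 bytes)
  # while the per-directory renegotiation takes place.
  SSLRenegBufferSize 10485760
</Location>
```

Keep in mind that the buffer is held in memory for each affected request, so a large value trades memory for the ability to accept bigger uploads.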

The Hack to overcome this issue

We cannot use SSLVerifyClient optional at the virtual host level, and we cannot let SVN requests go through client certificate based authentication either.

As far as Apache is concerned, we skipped client certificate based authentication for the specific servlets that support file upload. We modified the application code to still authenticate using the client certificate, but only for these servlets. The following directive fixed the issue, and it also lets us avoid specifying SSLVerifyClient optional inside the <Location /> block.

<LocationMatch "^/servlets/(?!(fileUpload1|fileUpload2))">
  SSLVerifyClient optional
  SSLVerifyDepth 2
</LocationMatch>
 
This is not a perfect solution, but it solves the problem at hand. We should upgrade to Apache 2.2.12 and verify whether the SSLRenegBufferSize directive fixes the problem cleanly.

Posted by Bhuvaneswaran A | Date: Sep 3, 2009 | Permalink | Comments (0) | TrackBack (0)

Subversion Path-Based Permissions in CollabNet TeamForge

CollabNet TeamForge (CTF) 5.2 has been out for a while now, but I thought I would help introduce the most significant feature added in that release: Subversion path-based permissions.  Prior to CTF 5.2, when you wanted to manage access control for your Subversion repositories, you were able to provide only "blanket level" permissions.  This means that you had either no access, read access or read/write access, and that access applied to the whole repository.  There was no way to open up or lock down subsets of the repository tree.  For many, this was fine, but it often resulted in some process of maintaining multiple repositories to suit their needs.  The problem was even bigger for those who came from other solutions where they were used to full control.

Well, with the release of CTF 5.2, you now have the ability to use path-based permissions to take full control of the repository access for your project's repositories.  Once you create a Subversion repository, path-based permissions are available just like the rest of the CTF tools.  For those of you that do not want or need path-based permissions, CTF still works with blanket-level access.

(Note: This article does not teach you how to create a CTF integration, project or repository. It assumes that you've got a Subversion repository already created and ready to work with.)

As with previous releases of CTF, to start we need to get to the project Permissions by performing the following steps:

  1. Click the "Project Admin" tab
  2. Click the "Permissions" menu item to the left

Now we can create a new role or modify an existing role so that we can create some new path-based permissions for our repository.  To provide a full example, we'll create a few roles:

  • Developer: Has complete access to the repository except the /tags directory, which is read-only.
  • Manager: Has complete access to the repository
  • Contractor: Has no access anywhere except /trunk/contractor and /branches/b1/contractor

(As with any CTF role, you can add individual users or you can create groups of users and then add those groups to the role.)  To get started, let's click the "Create" button on the Permissions page.  Fill out the form with the following values:

  • Role Name: Developer
  • Description: This role is for developers.

Once you've done this, click the "Create" button once more.  At this point, the role is created but it has no real permissions set.  Since we're here for path-based permissions, click the "Source Code" menu item to the left.  This is where the magic happens for path-based permissions.  To set/manipulate path-based permissions, look down in the "Permissions for Specific Repositories" section of this page to see a list of repositories.  (Only Subversion repositories have the ability to have path-based permissions set for them.)  To get started, click the "Path-based Permissions" radio button and you should see that a sub-form is displayed.  This is where you will add these path-based permissions.

Let's go ahead and set up the Developer role.  To do this, follow these steps:

  1. Change the default permission for the "/" path to be "View and Commit" by selecting the "View and Commit" radio button
  2. Click the "Add" button
  3. In the newly displayed text box, type in "/tags"
  4. Beside the "/tags" row, select the "View" radio button.
  5. Click the "Save" button

At this point, we have fulfilled the requirements for the Developer role.  Based on our permissions model, any user with the Developer role will have full read/write access to the repository except for the "/tags" directory and everything below it, where the user will have read access only.  Below is a screenshot of what you might expect to see:


Developer Role

Now that we know how to create a role in CTF and modify its "Source Code" permissions to use path-based permissions for Subversion, let's quickly go through configuring the other two roles, starting with the Manager role.

To configure the "Source Code" permissions for the Manager role, follow these steps:

  1. Create the role using the same steps above (The "Role Name" should be "Manager" and the description should be "This role is for managers.")
  2. Get to the "Source Code" permissions for this role using the same steps above
  3. Enable path-based permissions using the same steps above
  4. Click the "View and Commit" radio button for the default repository path
  5. Click "Save"

As with the Developer role, here is an example screenshot:

Manager Role

The last role we have to configure is the Contractor role.  To configure it, follow these steps:

  1. Create the role using the same steps above (The "Role Name" should be "Contractor" and the description should be "This role is for contractors.")
  2. Get to the "Source Code" permissions for this role using the same steps above
  3. Enable path-based permissions using the same steps above (Since we'll not be giving access to any parts of the repository by default, we will not be updating the default path permissions.)
  4. Click the "Add" button
  5. In the newly displayed text box, type "/branches/b1/contractor"
  6. Beside the "/branches/b1/contractor" row, click the "View and Commit" radio button
  7. Click the "Add" button
  8. In the newly displayed text box, type "/trunk/contractor"
  9. Beside the "/trunk/contractor" row, click the "View and Commit" radio button
  10. Click the "Save" button

Here is an example of the configuration:

Contractor Role

At this point you have all of your roles created and you could start adding project members with their respective roles, which is outside the scope of this article.  One more thing before we move on is to explain how these permissions are used to give you access.

When it comes to generating the internal "model" of what you have access to, CTF plays by the same rules as Subversion's authorization model.  The idea here is that you can give or restrict access at any path, and the permission for that path applies to that path and all paths below it.  Sounds simple enough.  To "override" a higher-level permission, all you have to do is create a path-based permission for the path you want to enable or restrict access for.  You saw an example of this in the Contractor role.  While we initially said that the Contractor role would have no access at "/", we then enabled access at "/branches/b1/contractor" and "/trunk/contractor" by creating more specific rules.  Each of those rules applies at its path and everything below that path unless overridden by a lower-level rule.  So to summarize: Permissions for a path are inherited from its parent unless you create a new path-based permission for the path in question, overriding its parent's permission.
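
Since CTF follows Subversion's authorization rules, the three roles from this article can be pictured in the format of a plain Subversion authz file. This is only a sketch to illustrate inheritance and overriding. CTF manages these permissions through its own interface, the group and user names below are hypothetical, and the file assumes a single repository so that paths need no repository prefix:

```ini
# Illustrative only: hypothetical groups standing in for the CTF roles.
[groups]
developers = alice
managers = bob
contractors = carol

# Default permissions at the repository root.
[/]
@developers = rw
@managers = rw
# An empty value means no access.
@contractors =

# Overrides the root rule for developers: read-only under /tags.
# Managers are not listed here, so they inherit rw from [/].
[/tags]
@developers = r

# More specific rules re-open access for contractors.
[/trunk/contractor]
@contractors = rw

[/branches/b1/contractor]
@contractors = rw
```

Read this way, a developer committing under /tags is refused because the nearest rule is the one for /tags, while a contractor can commit anywhere below /trunk/contractor because the nearest rule there grants read/write access.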

So what else is there to learn about path-based permissions in CTF?  Well, you should know which CTF tools path-based permissions are enforced on.  Sure, Subversion access is a given, but here is the full list of CTF tools whose access is affected by path-based permissions:

  • Subversion access
  • Source code browsing (This includes the enablement/display of the "Source Code" toolbar button, the listing of the repositories when the "Source Code" button is clicked and the actual content rendered when you click a repository and start browsing its content.)
  • Commit viewer (Much like the source code browsing, when you view commit objects in the Commit Viewer, the content rendered depends on your access permissions.)
  • Repository monitoring

Well, that pretty much sums up path-based permissions in CTF.  As you can tell from the information above, path-based permissions are a very simple yet very powerful way to restrict access to your source code and the CTF tooling related to source code.  If you have any questions about path-based permissions or anything in CollabNet TeamForge, please visit our community site for mailing lists, forums, articles and help documentation.

Posted by Jeremy Whitlock | Date: Aug 24, 2009 | Permalink | Comments (0) | TrackBack (0)
