Tuesday, August 28, 2007

Cat Content

Yesterday we welcomed the latest members of the family. The mother is called Gandalf (OK, that's a male name, but anyway, you can guess why, can't you?); the kittens haven't got any names yet. As you can see in the picture, they are neatly ordered from light to dark, so perhaps the names should match that order. Suggestions welcome. :-)

Monday, August 20, 2007

History Repeating

In the nineties, I used to work as a freelancer, typically on small one-man projects. At that time, Perl was by far my favourite programming language. Some relics can still be found. Perl was the proper tool for the job: I had the overview. I knew every single line of code. If changes were required, I knew exactly what to do.

When I started to work on larger projects, with two or more members, I felt the weaknesses of Perl immediately. Even as the lead developer, I began to lose control. Even if I knew what the sources had looked like, I didn't know exactly what they looked like now. For example, I could no longer change a method's signature without risking breaking things. The larger the project, the more overwhelming the problem. Java, with its type safety and strictness, was much more appropriate. I was again able to respond quickly to changed requirements. (Today, I really wonder how large PHP projects are managed and how much time is lost dealing with these things.)

Of course, that wasn't the end of the story. Code is more than Java, Perl, or any other programming language. I became aware of that in my first XML projects, around 1999. The initial version was always based on DOM. The application programmers were fumbling around from node to node, reimplementing the same algorithms again and again. Of course, DOM (or SAX, or pull parsing) was, and still is, the proper tool for generic XML processing. But it is poorly suited to application programming: as soon as the data model changes, you are lost. Once more, the concept works for small projects. But in a large project, things are quite different: there's absolutely no reliable way to find all the locations in the code that need to be changed.
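To make the problem concrete, here is a minimal sketch of the node-to-node style using the standard JAXP DOM API. The document structure and the element names (order, customer, name) are made up for illustration. The point is that every element name is a plain string the compiler cannot check.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class DomNavigation {
    // Hypothetical sample document; the element names are illustrative only.
    static final String XML =
        "<order><customer><name>Jane Doe</name></customer></order>";

    /** Walks from node to node; returns null if the expected path is missing. */
    static String customerName(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element order = doc.getDocumentElement();
            for (Node c = order.getFirstChild(); c != null; c = c.getNextSibling()) {
                // "customer" is only a string: a renamed element still compiles.
                if (c instanceof Element && "customer".equals(c.getNodeName())) {
                    for (Node n = c.getFirstChild(); n != null; n = n.getNextSibling()) {
                        if (n instanceof Element && "name".equals(n.getNodeName())) {
                            return n.getTextContent();
                        }
                    }
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("Failed to parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(customerName(XML));
    }
}
```

If the data model renames customer to client, this code still compiles and simply returns null at runtime. Multiply that by every place in a large code base that walks the tree, and you have the situation described above.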

My answer was the predecessor of JaxMe 1, an early Java/XML binding tool. The binding compiler converts the DTD, or XML schema, into Java beans. By using these beans, you let the Java compiler check your use of the data model. If the data model changes, the compiler tells you exactly what breaks. Ideally, the application programmers will be able to implement about 80% of the project's code by working on the beans, and the project becomes manageable again. Of course, the JaxMe predecessor, JaxMe 1, and even JaxMe 2 are very limited tools compared to modern, full-blown Java/XML binding suites like JAXB. But that's not the point. On the contrary: it's quite telling that even my self-made tool could boost a project. The fact that there have been so many similar projects (Castor, Zeus, XML Beans, to name a few) also goes to show the concept's value.
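As a rough illustration of the idea, the following sketch uses hand-written beans as stand-ins for what a binding compiler might generate from a schema. The class and property names (Order, Customer, name) are assumptions for this example and are not taken from JaxMe's actual output.

```java
public class BoundBeans {
    // Hand-written stand-in for a generated bean (hypothetical names).
    static class Customer {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    // Hand-written stand-in for a generated bean (hypothetical names).
    static class Order {
        private Customer customer;
        public Customer getCustomer() { return customer; }
        public void setCustomer(Customer customer) { this.customer = customer; }
    }

    /** Application code: every step is checked by the Java compiler. */
    static String customerName(Order order) {
        return order.getCustomer().getName();
    }

    /** Builds a small sample object graph. */
    static Order sampleOrder() {
        Customer customer = new Customer();
        customer.setName("Jane Doe");
        Order order = new Order();
        order.setCustomer(customer);
        return order;
    }

    public static void main(String[] args) {
        System.out.println(customerName(sampleOrder()));
    }
}
```

If the schema changes and the beans are regenerated without getCustomer(), then customerName no longer compiles, so the compiler lists every affected location instead of leaving the search to the programmers.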

Two or three years later, I had a feeling of déjà vu: for the first time, I was working on a large SQL/JDBC project. It was the same story again: as soon as we had implemented the first 50 or so queries, some of them quite complex, we felt that the growth was getting out of control. An action as simple as changing a column type from boolean to integer could break the project and result in hours of bug hunting. As Hibernate, OJB, and other object-relational mappers (not to mention JDO) were either not available or not really to be recommended at the time, I began to implement my own nonsense again. It was sufficient as the base of two large projects. A similar tool, not implemented by me, but by another very capable programmer (Hello, Winfried, should you read this. :-), was the foundation of a third project. The success, in relation to the simplicity, proves again how important it can be to map loosely coupled entities (tables) to structures controlled by the Java compiler.
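One way to picture such a mapping is a set of typed column descriptors. This is only a sketch under assumptions: the table and column names are invented, a plain Map stands in for a JDBC ResultSet row so that the example runs without a database, and it is not a reconstruction of the tools mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

public class TypedColumns {
    /** A column descriptor that carries the column's Java type. */
    static final class Column<T> {
        final String name;
        final Class<T> type;
        Column(String name, Class<T> type) {
            this.name = name;
            this.type = type;
        }
    }

    /** Hypothetical table definition, kept in exactly one place. */
    static final class CustomerTable {
        static final Column<String> NAME = new Column<>("NAME", String.class);
        // Changing Boolean to Integer here turns stale uses into compile errors.
        static final Column<Boolean> ACTIVE = new Column<>("ACTIVE", Boolean.class);
    }

    /** Stand-in for one row of a JDBC ResultSet (assumption for this sketch). */
    static final class Row {
        private final Map<String, Object> values = new HashMap<>();
        <T> Row set(Column<T> column, T value) {
            values.put(column.name, value);
            return this;
        }
        <T> T get(Column<T> column) {
            return column.type.cast(values.get(column.name));
        }
    }

    /** Application code: the column types are checked at compile time. */
    static String describe(Row row) {
        boolean active = row.get(CustomerTable.ACTIVE);
        return row.get(CustomerTable.NAME) + (active ? " (active)" : " (inactive)");
    }

    public static void main(String[] args) {
        Row row = new Row()
            .set(CustomerTable.NAME, "Jane Doe")
            .set(CustomerTable.ACTIVE, true);
        System.out.println(describe(row));
    }
}
```

With this layout, changing ACTIVE to Column<Integer> makes the assignment to a boolean in describe fail to compile, which is the kind of early warning that plain string-based queries cannot give.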

Well, history's repeating. Nowadays, I am working on a large project again, feeling my own inability to keep the project's parts together. This time the entities are called registry objects, classifications, or associations; they are stored in a registry and accessed via JAXR. The project is getting larger, and the pain caused by data model changes is growing. This time, we do not even have a schema language, not to mention a binding compiler that translates structures into accessors.

But maybe we can learn from history. I promise that I will search for other people's solutions much longer this time. :-)

Thursday, August 16, 2007

Poor Man's GForge

I consider myself an extremely experienced person when it comes to software installation, configuration and stuff like that. Having installed server software like Apache httpd, Bugzilla, Perl, Sendmail, MySQL, innd, snmpd, and radiusd, to name a few, on Linux and mainstream Unixes (Solaris, AIX, HP-UX) as well as exotic server operating systems like SGI, ConvexOS, Dec, or Windows, I find there's little chance that an installation will surprise me. The most notable example used to be Hylafax, which really wasn't easy to handle in 1996, or thereabouts. (I wonder what it's like today. The project lives, so things might have changed.)

However, there's another example, which can easily drive me crazy: GForge, a collaboration server for developers that basically enables management of a lot of projects via a comfortable web interface. If you have it installed, that is.

It's not that I am unable to do it. I have running GForge installations. But my first installation required no less than three days of work until everything was complete, and the fun didn't diminish with the second or third installation. It's simply that I am wondering whether it's worth the effort, in particular because I find myself missing Bugzilla anyway when working with the integrated issue tracker.

So when I recently had to set up a new development server, I decided to try something different: if I am installing Bugzilla (which I would end up doing anyway), then I have my preferred issue tracker and a comfortable, multitenant user management tool ready. Why not simply attach Subversion and WebDAV (as a Maven repository) to the Bugzilla database?

The result of my attempt can be found at Bugzilla's own Bugzilla in Bug 392482. (Of course, it wasn't an immediate result. There have been a few evolutionary steps.) It consists of two parts: mod_authn_bugzilla is an authentication module for the Apache httpd. Basically, it's simply mod_authn_dbd with some minor changes (setting of environment variables). The Apache module authenticates users against the Bugzilla database. Like its ancestor, it's database agnostic, so you can use it regardless of the type of the Bugzilla database.

The second part enables management of user IDs with Bugzilla. By default, Bugzilla uses email addresses instead of classical user IDs, which is of course fine within Bugzilla. However, you wouldn't want to use me@foo.com as part of a Subversion URL.

I am quite happy with the system as it is now. I am using standard components, which come with the Linux distro, with almost no configuration changes. For example, my Subversion repositories are where CentOS wants them, and not in GForge's preferred location. The WebDAV directories are below /var/www, as they should be. OK, I cannot create new projects via the web interface, but project administration is a rare task. User management is what costs the most time and is typically presumed urgent, so that's what matters.

If you are interested in the details, please contact me.