Thursday, October 14, 2004

Preparing for OOPSLA and GPCE

Next week we're going to OOPSLA and GPCE! I'm going to present our paper Concrete syntax for Objects at OOPSLA. I'm really looking forward to the conference and the talk, although it is a little bit scary to have your first conference talk ever at OOPSLA. I'm going to prepare this talk in the coming week, so get ready for MetaBorg posts ;).

We're also going to the Software Transformation Systems workshop. I expect that this is going to be very interesting: quite a few transformation system researchers will be there.

Tuesday, October 12, 2004

Software Engineering Lectures

Today, I gave a lecture on the concepts and techniques of component-based software engineering as part of our software engineering course. The lecture is divided into two parts: concepts of component software and the implementation of these concepts in current platforms (Java and .NET). I hope the students liked the lecture; at least I had a lot of fun myself ;). Preparing a lecture from scratch (which was the case) takes a lot of time, but it usually provides some fresh ideas for stuff that we could do for our 'real work'.

For lectures like these, you get to read material that you wouldn't consider if you only focused on academic stuff (Essential .NET and Component Development for the Java Platform are great books!). About two weeks ago, I gave a lecture on testing tools and techniques. The preparation for this lecture (and today's one as well) provided me with some more insight into the use of reflection for meta-programming. Concerning testing, it's particularly interesting to see how people work around the poor support for implementing tests in current programming languages. In particular, mock objects (EasyMock is impressive) and dynamic proxies were useful to learn more about.

Friday, October 08, 2004

Data-flow Components are the Magic Bullet

Sean McGrath has posted a nice set of slides (ppt) that he used for his presentation on XML pipelining at XML 2004. Sean is more or less the pipe guy of the XML community, which is an honorable position. Most interesting is his slide (nr 4) on API components versus data-flow components. Unfortunately, I don't know exactly what he said about this slide (I wasn't there), but he must have raised some interesting points.

The last few decades have made clear that data-flow components never go out of fashion. In particular, the only really successful component model is the idea of pipes and filters of Unix. Unix components can easily be composed; they can be implemented in any language and they are suitable for anticipated (e.g. shell scripts) as well as ad-hoc (the command line) composition. API-based component models tend to be specific to a language, or at least a language family. Hence, components relying on such a component model are coupled to a language or platform. As a result, they go out of fashion. This doesn't make API-based component models useless, but components based on such a mechanism just don't last. Over time, components will have to survive in a heterogeneous environment, since fashion changes while time passes by. In this way, the Internet is comparable to time. If you want to connect components in a heterogeneous environment, then language-specific solutions just don't work. Your current environment is the Internet, which is quite heterogeneous.
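
To make the composition property concrete, here is a minimal in-process sketch of pipes and filters. It is only an analogy: real Unix filters are separate processes, written in any language, that exchange byte streams. The filter names (grep, upper, head) and the pipeline helper are made up for this illustration.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

# A filter consumes a stream of lines and produces a stream of lines.
Filter = Callable[[Iterable[str]], Iterator[str]]

def grep(word: str) -> Filter:
    """Keep only the lines that contain `word`."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if word in line)
    return run

def upper(lines: Iterable[str]) -> Iterator[str]:
    """Upper-case every line."""
    return (line.upper() for line in lines)

def head(n: int) -> Filter:
    """Pass through at most the first `n` lines."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        for index, line in enumerate(lines):
            if index >= n:
                break
            yield line
    return run

def pipeline(*filters: Filter) -> Filter:
    """Compose filters left to right, like `a | b | c` on the command line."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        return reduce(lambda stream, f: f(stream), filters, iter(lines))
    return run

find_pipes = pipeline(grep("pipe"), upper, head(2))
result = list(find_pipes(["pipes and filters", "api components", "a pipe", "more pipes"]))
```

Each filter only knows about the stream it reads and the stream it writes, so any filter can be swapped out or reordered without touching the others.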

Tool Hell and Encapsulation

After more than 7 years of StrategoXT development (to be honest, I joined the project only about 3 years ago) we have developed a huge number of tools: the bin directory of my StrategoXT installation contains 96 executables and the libexec directory adds 93 tools to that. That's quite a pile of tools, and this pile is a big problem for new users: how to learn all these tools? What command-line arguments do all these tools need?

To make things worse, many of these tools do not only take input and produce output, but they are also generic and need to be specialized with some configuration for a specific programming language. This idea has been explained in the article Grammars as Contracts. The syntax definition of a programming language can be used to configure these generic tools in the typical pipeline of a program transformation. In this way, the grammar is used to specialize these tools to language-specific tools. The disadvantage of this approach is that all these tools need to be configured and that this configuration is not just the syntax definition, but for example a parse table, a pretty-print table, an abstract syntax definition, and so on. Users need to know all these file types, tools, and their configuration.

We are all aware of this issue, but for some reason we never really tried to solve it, except by adding more abstract and easy-to-use tools. In a discussion with Karl Trygve Kalleberg we realized that this really needs to be improved, since StrategoXT users now need to apply far too many tools for generating all the configuration files required for a basic source-to-source program transformation:

  • pack-sdf, for collecting a set of SDF modules into a single syntax definition.
  • sdf2table, for creating a parse table.
  • sdf2rtg, for creating an abstract syntax definition.
  • rtg2sig, for creating a Stratego signature (abstract data type declarations for Stratego).
  • ppgen, for generating a pretty-printer.
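
To show how much a user has to keep in their head, the list above can be written down as a small dependency table. This is a sketch only: the input and output file type of each tool is my reading of the list and of the usual StrategoXT workflow, not something taken from the tools' documentation, and tools_needed is a made-up helper. Nothing here actually invokes the tools.

```python
from typing import Dict, List, Tuple

# tool -> (input file type, output file type); an assumed mapping, see above.
STEPS: Dict[str, Tuple[str, str]] = {
    "pack-sdf":  (".sdf", ".def"),
    "sdf2table": (".def", ".tbl"),
    "sdf2rtg":   (".def", ".rtg"),
    "rtg2sig":   (".rtg", ".str"),
    "ppgen":     (".def", ".pp"),
}

def tools_needed(target: str, source: str = ".sdf") -> List[str]:
    """Return the tools to run, in order, to get from `source` to `target`."""
    # Index the table by output type so we can walk backwards from the target.
    producers = {out: (tool, inp) for tool, (inp, out) in STEPS.items()}
    chain: List[str] = []
    current = target
    while current != source:
        if current not in producers:
            raise ValueError(f"no tool produces {current}")
        tool, current = producers[current]
        chain.append(tool)
    return list(reversed(chain))

signature_chain = tools_needed(".str")
table_chain = tools_needed(".tbl")
```

Even getting a Stratego signature takes three tools and two intermediate file types under this reading.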

Obviously, this situation cannot be explained to new users. They have to know all these tools and the file types they operate on (.sdf, .def, .tbl, .rtg, .str, .pp). Having all these tools and file types is not really a bad thing, but we provide no mechanism to abstract over these tools and files. Karl came up with a very practical solution to this problem (he also blogged about it). We need a single tool, xtar, that applies all these programs and produces a single file: an XT archive (.xtar). This archive can be passed to all the tools in StrategoXT and they just take out all the configuration files that they need to do their work. So, the user no longer needs to know all the generators of configuration files, nor all these file types. This is a huge improvement for new (but also experienced) users. We hope to implement this as soon as possible! A little bit more information is available at the Stratego Wiki.
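
The archive idea can be sketched in a few lines. Everything concrete here is an assumption for illustration: the actual .xtar format has not been designed yet, so this sketch simply uses a tar archive, and build_archive, read_config, the language name and the placeholder file contents are all made up.

```python
import io
import tarfile
from typing import Dict

def build_archive(language: str, artifacts: Dict[str, bytes]) -> bytes:
    """Bundle generated configuration files into one in-memory tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for extension, content in artifacts.items():
            info = tarfile.TarInfo(name=f"{language}{extension}")
            info.size = len(content)
            archive.addfile(info, io.BytesIO(content))
    return buffer.getvalue()

def read_config(archive_bytes: bytes, extension: str) -> bytes:
    """What a tool would do: take out only the file type it needs."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        for member in archive.getmembers():
            if member.name.endswith(extension):
                return archive.extractfile(member).read()
    raise KeyError(f"archive has no {extension} file")

# Placeholder contents; in reality these would be the generators' outputs.
xtar = build_archive("Java", {
    ".tbl": b"<parse table>",
    ".rtg": b"<abstract syntax definition>",
    ".str": b"<Stratego signature>",
    ".pp":  b"<pretty-print table>",
})
parse_table = read_config(xtar, ".tbl")
```

The user only ever handles the one archive; which files are inside, and which tool reads which of them, stays an implementation detail.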

The xtar file and tool are a good example of encapsulation: instead of exposing the internal implementation of our tools, we now just expose the concept of an XT archive for a language. In this way, we can also change the implementation details more easily, which used to be quite difficult in the past. It is interesting to see that this idea corresponds exactly to widely applied object-oriented design techniques. Why haven't we learned from this earlier? Maybe we should apply more design patterns to our set of command-line tools?

Wednesday, October 06, 2004

Research and Blogging

Yesterday Karl Trygve Kalleberg returned from his trip to the LASER Summer School on Software Engineering (Elba!) and Norway. Having Karl in Utrecht is really great. He is (over?)loaded with fresh ideas for improving our software and he's always ready for a good discussion.

Yesterday, he convinced me to start blogging. That wasn't too difficult, since I love weblogs. It's truly amazing how many bright people are communicating their ideas and thoughts in their weblogs. This is not limited to the famous guys in the computer industry and blogging scene (such as Patrick Logan, John Lam and Sean McGrath). Less well-known bloggers write very interesting and well-phrased stuff as well. For example, Zef Hemel (a student at the University of Groningen, The Netherlands) maintains a very good weblog. Almost every day he adds an interesting story, straight from his mind. Isn't that great?

This network of bloggers is incredibly powerful. You are no longer thinking for yourself: the blogging scene is a hive mind that spreads knowledge at a faster pace than ever before (at least for software development). I'm a PhD student specialized in software technology, so I'm part of what you might call the research community. What surprises (and maybe disappoints) me is the limited number of researchers that are blogging. If blogging is such an excellent medium for spreading knowledge, then why aren't we part of it?