Friday, October 08, 2004

Tool Hell and Encapsulation

After more than 7 years of StrategoXT development (to be honest, I joined the project only about 3 years ago) we have developed a huge number of tools: the bin directory of my StrategoXT installation contains 96 executables and the libexec directory adds 93 tools to that. That's quite a pile of tools, and this pile is a big problem for new users: how to learn all these tools? What command-line arguments do all these tools need?

To make things worse, many of these tools do not just take input and produce output: they are also generic and need to be specialized with some configuration for a specific programming language. This idea has been explained in the article Grammars as Contracts. The syntax definition of a programming language can be used to configure these generic tools in the typical pipeline of a program transformation. In this way, the grammar is used to specialize these tools into language-specific tools. The disadvantage of this approach is that all these tools need to be configured, and that this configuration is not just the syntax definition but, for example, a parse table, a pretty-print table, an abstract syntax definition, and so on. Users need to know all these file types, tools, and their configuration.

We are all aware of this issue, but for some reason we never really tried to solve it, except by adding more abstract and easy-to-use tools. In a discussion with Karl Trygve Kalleberg we realized that this really needs to be improved, since StrategoXT users now need to apply far too many tools to generate all the configuration files required for a basic source-to-source program transformation:

  • pack-sdf, for collecting a set of SDF modules into a single syntax definition.
  • sdf2table, for creating a parse table.
  • sdf2rtg, for creating an abstract syntax definition.
  • rtg2sig, for creating a Stratego signature (abstract data type declarations for Stratego).
  • ppgen, for generating a pretty-printer.
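To make the size of this chain concrete, here is a minimal sketch in Python that only builds the command lines for one language; it does not run anything. The tool names and file extensions come from this post, but the pairing of each tool with its input and output extension is my own reading, and the -i/-o flags are an assumption. Real invocations may need further options, such as the name of the main module.

```python
from typing import List, Tuple

# (tool, input extension, output extension).
# Tool names and extensions are taken from the post; the pairing and the
# -i/-o flags used below are assumptions made for illustration.
STEPS: List[Tuple[str, str, str]] = [
    ("pack-sdf", ".sdf", ".def"),   # collect SDF modules into one definition
    ("sdf2table", ".def", ".tbl"),  # parse table
    ("sdf2rtg", ".def", ".rtg"),    # abstract syntax definition
    ("rtg2sig", ".rtg", ".str"),    # Stratego signature
    ("ppgen", ".def", ".pp"),       # pretty-print table
]


def build_commands(language: str) -> List[List[str]]:
    """Return one argument list per generator step for the given language."""
    return [
        [tool, "-i", language + src, "-o", language + dst]
        for tool, src, dst in STEPS
    ]


if __name__ == "__main__":
    for command in build_commands("Foo"):
        print(" ".join(command))
```

Even in this stripped-down form, a user has to know five tools and six file types before any transformation has been written.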

Obviously, this situation cannot be explained to new users. They have to know all these tools and the file types they operate on (.sdf, .def, .tbl, .rtg, .str, .pp). Having all these tools and file types is not really a bad thing, but we provide no mechanism to abstract over them. Karl came up with a very practical solution to this problem (he also blogged about it). We need a single tool, xtar, that applies all these programs and produces a single file: an XT archive (.xtar). This archive can be passed to all the tools in StrategoXT, and each tool just takes out the configuration files it needs to do its work. So, the user no longer needs to know all the generators of configuration files, and he doesn't need to know all these file types either. This is a huge improvement for new (but also experienced) users. We hope to implement this as soon as possible! A little more information is available at the Stratego Wiki.
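The archive idea can be sketched in a few lines of Python. This is only an illustration: the actual .xtar format has not been defined here, so an ordinary tar archive stands in for it, and the function names pack_archive and take_config are hypothetical.

```python
import io
import tarfile
from typing import Dict


def pack_archive(files: Dict[str, bytes]) -> bytes:
    """Bundle named configuration files into one in-memory tar archive.

    A plain tar archive is an assumption; it stands in for the .xtar format.
    """
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def take_config(archive_bytes: bytes, extension: str) -> bytes:
    """Return the contents of the first member with the given extension."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        for member in archive.getmembers():
            if member.name.endswith(extension):
                extracted = archive.extractfile(member)
                if extracted is not None:
                    return extracted.read()
    raise KeyError(f"no {extension} file in archive")


if __name__ == "__main__":
    blob = pack_archive({"Foo.tbl": b"parse table", "Foo.pp": b"pp table"})
    # A parser only asks for the parse table and ignores everything else.
    print(take_config(blob, ".tbl"))
```

The point of the sketch is the shape of the interface: a tool asks the archive for the kind of configuration it needs, and the user only ever handles one file.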

The xtar file and tool are a good example of encapsulation: instead of exposing the internal implementation of our tools, we now just expose the concept of an XT archive for a language. In this way, we can also change the implementation details more easily, which used to be quite difficult. It is interesting to see that this idea corresponds exactly to widely applied object-oriented design techniques. Why haven't we learned from this earlier? Maybe we should apply more design patterns to our set of command-line tools?
