Recently, there have been a few interesting developments in standard support for open compilers and program transformation. For example, Sun released the annotation processing tool (APT) as part of JDK 5, which opens up Sun's Java compiler a bit. Also, there is Jackpot, a plugin for NetBeans for transforming Java code. The obvious question is how this relates to work that has been done in research on open compilers and program transformation.
Olivier Lefevre sent me an email to ask how the tree for Java provided by Jackpot and javac compares to our support for parsing and transforming Java in Java-front. The answer is probably useful in general, so I'll quote it here. Feel free to share your opinion in the comments!
As you may know, starting with Java 6 the Sun JDK will ship with an API to the AST: see jackpot api
Yes, Jackpot and APT are great projects. However, there is not yet a full API to the AST in the standard JDK, afaik. The compiler will be more 'open' in two different ways.
First, the current annotation processing tool (APT) is going to be combined with javac, but APT only provides access to the global structure of a Java source file and does not include the statement and expression level. Also, this API does not allow modification of the Java representation. APT is read-only: you can only generate new code.
Second, there is Jackpot, which is a rule-based language for transforming Java code. For Jackpot, the representation of Java used by javac has been opened and cleaned up a bit to make it more usable in external tools. However, this representation is not standardized and Sun recommends against using anything from com.sun.*. Afaik, Jackpot will be shipped as part of NetBeans and not as part of the JDK.
How does this compare to Java-front?
That's a good question. The answer depends on the application.
If you just need an AST for Java, then the advantage of the com.sun.source.tree AST is that you are absolutely sure that the AST conforms to javac, since the implementation is exactly the same. Of course, the same holds for ecj and the AST of Java that is provided by Eclipse (org.eclipse.jdt.core.dom.*). However, the grammar provided by Java-front is very good, so I don't expect any parsing problems. It has been tested and used heavily in the last few years and the development of this grammar has even resulted in a number of fixes in the JLS.
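To give an impression of what working with the com.sun.source.tree AST looks like, here is a minimal sketch. It assumes the javax.tools and com.sun.source APIs as they exist in recent JDKs, and a full JDK rather than a JRE (ToolProvider.getSystemJavaCompiler() returns null otherwise). The names JavacTreeSketch, StringSource and methodInvocations are made up for this example. It only parses; no name or type analysis is performed.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodInvocationTree;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreeScanner;

import javax.tools.JavaCompiler;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class JavacTreeSketch {
    /** Hypothetical helper: a source file held in memory, so nothing is read from disk. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Parses the given source with javac and collects the selector of every method call. */
    static List<String> methodInvocations(String className, String code) {
        // Assumption: running on a JDK, so the system compiler is available.
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        JavacTask task = (JavacTask) compiler.getTask(
                null, null, null, null, null,
                Collections.singletonList(new StringSource(className, code)));
        List<String> found = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {
                // The visitor-style traversal that javac offers for its trees.
                new TreeScanner<Void, Void>() {
                    @Override
                    public Void visitMethodInvocation(MethodInvocationTree node, Void p) {
                        found.add(node.getMethodSelect().toString());
                        return super.visitMethodInvocation(node, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

Calling methodInvocations("A", "class A { void m() { System.out.println(1); } }") should return a list with the single selector System.out.println. Note that at this stage the selector is just syntax: nothing has decided yet what System or out refer to.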
An advantage of Java-front is that it is a bit more language independent. Obviously, the Eclipse and Javac ASTs are to be used in Java. If you want to implement a transformation of Java in a different language, then you have to write an exporter. Java-front outputs ASTs in a language independent exchange format (ATerms), which can also be converted to XML. Of course, Java-front is most useful if you combine it with a language that is designed for program transformation and operates on ATerms, such as Stratego. One of the biggest advantages of Stratego is that it is very easy to do traversals over the AST: no tiresome visitors.
If you need more information about Java than can be defined in a context-free grammar, then you need more than just a parser. For more complex transformations (which includes simple refactorings), you'll probably need an implementation of disambiguation (reclassification) and qualification of names. A simple statement like System.out.println is already highly ambiguous without an analysis: is System a variable? a class? a package? Is out an inner class? a field? Most likely, you'll need type information as well. Javac and Eclipse have the major advantage that you can safely assume that their type checkers are pretty good. For Jackpot, I suppose that there is some way to get type information (since type information can be used in Jackpot), but from a quick scan I cannot figure out how to do this from the public API. For Java-front, there is an extension (Dryad) that supports type-checking and disambiguation, but this work is not yet complete. Using an existing compiler is of course a safer alternative. For experiments, the stuff provided by Dryad should be ok (we use it in our course on program transformation).
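The ambiguity of System.out.println is easy to demonstrate with plain Java. In the sketch below, the classes Recorder and FakeSystem and the method viaLocalVariable are invented for illustration. A local variable named System obscures the class java.lang.System, so the same sequence of tokens means something entirely different, and a parser alone cannot tell the difference.

```java
class AmbiguousNames {
    /** Hypothetical class whose only purpose is to offer a println method. */
    static final class Recorder {
        final StringBuilder log = new StringBuilder();

        void println(String s) {
            log.append("recorded: ").append(s);
        }
    }

    /** Hypothetical class with a field called out. */
    static final class FakeSystem {
        final Recorder out = new Recorder();
    }

    static String viaLocalVariable() {
        // A local variable called System: variables take precedence over types
        // when a simple name could denote either one.
        FakeSystem System = new FakeSystem();

        // Same tokens as the usual print statement, but this calls
        // Recorder.println on the field FakeSystem.out; nothing is printed.
        System.out.println("hello");

        return System.out.log.toString();
    }
}
```

Here viaLocalVariable() returns "recorded: hello" and writes nothing to standard output. A transformation that assumes System.out.println always refers to java.lang.System would handle this method incorrectly, which is why reclassification of names is needed.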
A different application is the implementation of Java language extensions. Javac and ECJ do not support this. The Java representation is open, but not extensible. Java-front uses a modular syntax definition formalism (SDF) that allows you to extend the grammar of Java in an almost trivial way. The strength of this approach is illustrated by the embedding of the Java syntax in Stratego (GPCE '02) and Java (GPCE '05), the applications of the grammar in MetaBorg (OOPSLA '04), and the modular extension of the grammar for the definition of the AspectJ syntax (OOPSLA '06). Of course, these applications are not really interesting if you are just interested in a Java program transformation tool, but it illustrates the reusability of such a syntax definition (as opposed to the grammars used by ecj, javac and most other parser generators).
You'll need tools for pretty-printing as well. Outside of Eclipse, pretty-printing the JDT Core DOM is troublesome and mostly useful for debugging the output only. Inside Eclipse, the support for pretty-printing and preserving the layout of a program is of course excellent (see the existing implementations of refactoring). Jackpot provides a pretty-printer as well, but I don't know if it can be used outside NetBeans. Java-front provides the tool pp-java, which has been heavily tested and can insert parentheses in exactly the right places.
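To show why inserting parentheses matters when pretty-printing an AST, here is a small sketch in Java. It is not how pp-java works (that tool is not written in Java); it is just the common precedence-based approach, applied to a made-up expression type with the four arithmetic operators. All names (ParenSketch, Num, Bin, prec, print) are hypothetical.

```java
class ParenSketch {
    interface Expr {}

    static final class Num implements Expr {
        final int value;

        Num(int value) {
            this.value = value;
        }
    }

    static final class Bin implements Expr {
        final Expr left;
        final char op;
        final Expr right;

        Bin(Expr left, char op, Expr right) {
            this.left = left;
            this.op = op;
            this.right = right;
        }
    }

    /** Precedence of the supported operators: * and / bind tighter than + and -. */
    static int prec(char op) {
        return (op == '*' || op == '/') ? 2 : 1;
    }

    static String print(Expr e) {
        return print(e, 0, false);
    }

    /**
     * Prints e as an operand of an operator with precedence parentPrec.
     * Parentheses are added when e binds weaker than its parent, or equally
     * strong while sitting on the right of a (left-associative) parent.
     */
    static String print(Expr e, int parentPrec, boolean rightOperand) {
        if (e instanceof Num) {
            return Integer.toString(((Num) e).value);
        }
        Bin b = (Bin) e;
        int p = prec(b.op);
        String s = print(b.left, p, false) + " " + b.op + " " + print(b.right, p, true);
        boolean needParens = p < parentPrec || (p == parentPrec && rightOperand);
        return needParens ? "(" + s + ")" : s;
    }
}
```

For example, the tree for a product whose left operand is the sum of 1 and 2 prints as (1 + 2) * 3, while the tree for 1 plus the product of 2 and 3 prints as 1 + 2 * 3 without any parentheses. A pretty-printer that skips this step silently changes the meaning of transformed code; one that parenthesizes everything produces unreadable output.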
I am interested in implementing small refactorings.
If you want to implement solid refactorings that could eventually even be deployed, then I would suggest using an existing framework for refactoring, since there is much more to do than just getting an AST. A few years ago, I implemented an extract method refactoring in JRefactory, which was quite a useful experience. I suppose it's a bit obsolete now, since the refactoring market is dominated by refactorings directly supported by IDEs. You could consider Eclipse or NetBeans.
If your objective is to play a bit with program transformations and maybe even be a bit more adventurous by using real program transformation languages, then it might be nice to use Java-front and Stratego. Using Stratego is a major advantage over the tiresome implementation of traversals in Java (and most other languages).
Hope this helps :) Feel free to ask more questions if anything is unclear :)