2007-02-10

Informal specifications are not so super

I'm trying to get back to the good habit of blogging about our work. I'm not very fond of dumping random links or remarks, so the most challenging part of blogging is to find a good topic to write a decent story about. This time, I fallback to a topic I've actually been working on about two years ago, but is still most relevant. At that time, I was actively developing a type checker for Java, as part of the Dryad project. This story is about a bug in the Java Language Specification, that for whatever bizarre reason has never been reported (afaik).

Super field access

Java supports access to fields of super classes using the super keyword, even if this field is hidden by a declaration of another field with the same name. For example, the following sample will print super.x = 1 and x = 2.

  class S {
    int x = 1;
  }

  class C extends S {
    int x = 2;

    void print() {
      System.out.println("super.x = " + super.x + " and x = " + x);
    }

    public static void main(String[] ps) {
      new C().print(); 
    }
  }

To allow access from inner classes to hidden fields of enclosing instances, Java also supports qualified super field accesses. In this case, the super keyword is prefixed with the name of a lexically enclosing class. This feature is related to the qualified this expression, which allows you to refer to an enclosing instance.

Current specification

We all have a reasonable, though informal, idea what the semantics of this language feature is. Of course for a real specification the semantics has to be defined more precisely. For example, two things that need to be define are what the type of such an expression is and if the field is accessible at all. The specification concisely defines the semantics of this language feature by forwarding the semantic rules to existing, more basic language features. For super.name, the JLS specifies:

"Suppose that a field access expression super.name appears within class C, and the immediate super class of C is class S. Then super.name is treated exactly as if it had been the expression ((S)this).name."

So, in the example I gave, super.x would be exactly equivalent to ((S)this).x. Obviously, the emphasis of exactly is on purpose. Why would they use this word? Does this suggest that there is also a notion of being treated almost exactly in the same way? ;) .

For qualified field access, the specification is almost the same, but this time using a qualified this instead of this.

"Suppose that a field access expression T.super.name appears within class C, and the immediate super class of the class denoted by T is a class whose fully qualified name is S. Then T.super.name is treated exactly as if it had been the expression ((S)T.this).name."

This specification looks very reasonable, considering that for a field access only the compile-time type of the subject expression is used to determine which field is to be used. By casting the subject expression (this) to the right type, the expected field of the super class is accessed.

Oops

Of course, it is always nice to have your type checker as compact as possible, so I was very happy with this specification. I could just forward everything related to super field accesses to the corresponding expression with a cast and a this expression. The Dryad typing rules looked something like this:

  attributes:
    |[ super.x ]| -> <attributes> |[ ((reft) this).x ]|
    where
      reft is the superclass of the current class

  attributes:
    |[ cname.super.x ]| ->  <attributes> |[ ((reft) cname.this).x ]|
    where
      reft is the superclass of the class cname

This implementation looks very attractive, but ... it didn't work. The reason for this is that super.name is in fact not exactly the same as ((S)this).name. The reason for this are the details of protected access, which I've previously written about on my blog. I'm not going to redo that, so let me just give an example (based on the example in the previous post) where this assumed equivalence is invalid. First, the following two classes are valid and can be compiled without any problems:

  package a;
  public class A {
    protected int secret;
  }

  package b;
  public class B extends a.A {
    void f() {
      super.secret = 5;
    }
  }

Next, let's now change the assignment to the expression ((a.A) this).secret, which is equivalent to super.secret according to the specification.

  package b;

  public class B extends a.A {
    void f() {
      ((a.A) this).secret = 5;
    }
  }

Unfortunately, this won't compile, due to the details of protected access:

b/B.java:5: secret has protected access in a.A
    ((a.A) this).secret = 5;
    ^
1 error

This example shows that the two expressions are not treated in the same way, so this looks like a problem in the Java Language Specification to me. Also, this shows that in semantics of languages likje Java the devil is really in the detail. What surprises me is that nobody has mentioned this before. Several major Java compilers have been implemented, right? Shouldn't the programmers responsible for these compilers have encountered this problem?

Java Virtual Machine

Another interesting thing is how the Java Virtual Machine specification deals with this. There is no special bytecode operator for accessing fields of super classes: all field assignments are performed using the putfield operator. Assuming that the source compiler would ignore the protected access problem, the two unequal examples I just gave would compile to exactly the same bytecode. So how can the JVM report an error about an illegal access for the expression ((a.A) this).secret? Well, it turns out that it doesn't.

We can show this by first making secret public, compile B, then make secret protected, and only recompile A. This works like a charm: doing this trick for the following example prints secret = 5.

  package a;
  public class A {
    protected int secret;
    public void print() {
      System.out.println("secret = " + secret);
    }
  }

  package b;
  public class B extends a.A {
    void f() {
      ((a.A) this).secret = 5;
    }
    public static void main(String[] ps) {
      B b = new B();
      b.f();
      b.print();
    }
  }

However, if this would be allowed by bytecode in general, then this would mean that the security vulnerability that was fixed with the details of protected access, would actually only give source level protection. Obviously, that would be no protection at all: you can safely assume that attackers are capable of writing bytecode. So let's try to make the example a bit more adventurous by passing the subject expression to the f method:

  package b;
  public class B extends a.A {
    void f(a.A a) {
      a.secret = 5;
    }
    public static void main(String[] ps) {
      B b = new B();
      b.f(b);
      b.print();
    }
  }

This time, the verifier reports an error:

  Exception in thread "main" java.lang.VerifyError:
    (class: b/B, method: f signature: (La/A;)V)
    Bad access to protected data

This error report is correct, so apparently the verifier does check for illegal protected access. In the first case, it was just a bit more liberal than the source language. The question is, how is this specified in the Java Virtual Machine specif cation? My first impression was that there might be some special handling of accesses to this. However, this would require the verifier to trace which local variables might have the value of this, which is rather unlikely. Then, Dick Eimers (who did lots of scary bytecode stuff for his master thesis) pointed me to a paper that exactly covers this subject: Checking Access to Protected Members in the Java Virtual Machine by Alessandro Coglio. Strange enough, this paper is not cited anywhere, while I think that the discussion of this issue is pretty good.

It turns out that the difference in accessibility between super field accesses and ordinary field accesses is handled implicitly thanks to the type inferencer used by Java Virtual Machine. The inferred type of the operand of the field access will be more specific than the type in the corresponding source code, which makes the access to the protected field valid in bytecode. I don't think that this implicit handling of the observed difference is a very good idea.

No comments: