Friday, April 24, 2009

Is Java String really immutable...?

Many texts cite String as the textbook example of an immutable class in Java. Well, I'm sure you will think otherwise once you have read this article.

To start with, let's go back to the books and read the definition of immutability. Wikipedia defines it as follows -

'In object-oriented and functional programming, an immutable object is an object whose state cannot be modified after it is created.'

I personally find this definition good, as it says that an immutable instance's state must not be modifiable after its construction.

Now keeping this in the back of our minds, let's decompile Java's standard String implementation and peep into the hashCode() method -

    public int hashCode() {
        int h = hash;                 // cached value; 0 means "not computed yet"
        if (h == 0) {
            int off = offset;
            char val[] = value;
            int len = count;
            for (int i = 0; i < len; i++) {
                h = 31*h + val[off++];
            }
            hash = h;                 // write to an instance field after construction
        }
        return h;
    }

A closer look at the code above shows it to be a classic implementation of the 'lazy evaluation' pattern. Instead of [1] computing the hash value during an instance's construction, or [2] computing it each time hashCode() is called, the String class computes it once, on the first call to hashCode(), and saves the result in the hash field.
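The same pattern can be sketched outside the JDK. The example below is only an illustration: Text is a hypothetical class invented for this post, not part of any library, and its computations counter exists purely to show how often the loop runs.

```java
import java.util.Arrays;

public class LazyHashDemo {

    /** Hypothetical wrapper around a char array; not a JDK class. */
    static final class Text {
        private final char[] value;
        private int hash;        // cache; 0 means "not computed yet"
        int computations;        // demo-only counter, not part of the pattern

        Text(String s) {
            this.value = s.toCharArray();
        }

        @Override
        public int hashCode() {
            int h = hash;
            if (h == 0) {
                computations++;
                for (char c : value) {
                    h = 31 * h + c;
                }
                hash = h;        // the field changes after construction
            }
            return h;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Text && Arrays.equals(value, ((Text) o).value);
        }
    }

    public static void main(String[] args) {
        Text t = new Text("ab");
        System.out.println(t.hashCode());     // 3105, i.e. 97*31 + 98
        System.out.println(t.hashCode());     // 3105 again, served from the cache
        System.out.println(t.computations);   // 1
    }
}
```

Because 0 doubles as the "not computed" marker in this sketch, a value whose hash really is 0 (an empty Text, for example) would be recomputed on every call.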

Now, here is a test to demonstrate this -

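    import java.lang.reflect.Field;

    public class StringHashTest {
        // On JDK 16 and later, run with --add-opens java.base/java.lang=ALL-UNNAMED
        public static void main(String[] args) throws Exception {
            String string = "MyDearString";

            // Reflectively expose String's private hash field
            Field field = String.class.getDeclaredField("hash");
            field.setAccessible(true);

            System.out.println("Before: " + field.getInt(string));

            string.hashCode();

            System.out.println("After: " + field.getInt(string));
        }
    }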

Anxious to know what the output looks like? Here it is...

Before: 0
After: -1554135584

So this implies that for a String instance which we create, but on which we never call hashCode(), the private hash field stays at zero. It changes only when we call hashCode().

Well, don't you find this at odds with the basic definition of an object's immutability?

One might argue that, without a reflective read of String's hash field, we can never observe a String instance in a different state, because the only public call that retrieves this state, hashCode(), is the very call that modifies it. And if we can't observe the state changing, the object is as good as immutable.

Well, if one doesn't observe a change does that mean it never happened...?

This reminds me of the other day when I scratched the underside of my new car on an irregular speed-breaker, and my friend Manish said I should not worry about it, because if a scratch is not visible it was never there. Well, was it really never there...?

So it seems that String is not strictly immutable. Ignoring reflective access, it is safe to treat it as immutable, but measured against the definition above the label definitely looks incorrect.
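To make the "safe, ignoring reflective access" part concrete, here is a small sketch that compares two equal strings through public methods only, after hashCode() has already been called on one of them. The class name ObservableStateDemo and the helper publiclyEqual are made up for this illustration, and the helper checks only a handful of public queries, not all of them.

```java
public class ObservableStateDemo {

    /** Returns true if a and b agree on each public query checked here. */
    static boolean publiclyEqual(String a, String b) {
        return a.equals(b)
                && a.length() == b.length()
                && a.compareTo(b) == 0
                && a.hashCode() == b.hashCode();
    }

    public static void main(String[] args) {
        char[] chars = {'M', 'y', 'D', 'e', 'a', 'r'};
        String touched = new String(chars);     // distinct instance
        String untouched = new String(chars);   // another distinct instance

        touched.hashCode();   // may write the private cache inside `touched`

        // Through the public API the two instances cannot be told apart.
        System.out.println(publiclyEqual(touched, untouched));   // true
    }
}
```

Whatever happens to the private field, every public answer stays the same before and after the call, which is exactly why the change is so hard to notice.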

Post your views on this...

2 comments:

vivi said...

The private field "hash" does not hold any state information, it is merely a cache of the hashCode() return value. No method return value or behavior will be affected by the fact that hash is set or not.
Also, using reflection to access private fields is a bit cheating ;-)

Rahul Roy said...

Hi, thanks for sharing your views. I do agree that using reflection in this case is a cheat. However, it does make one thing clear: going by the core definition, the Java String is just not a perfectly immutable class. :) Ideally, if one declares a class to be immutable then, no matter what, its state (even that of private attributes) should not be changed/altered once it is constructed.