Wednesday, September 9, 2009

StringPool and GarbageCollection

It is interesting and important to understand how strings are handled in Java. The way Java manages strings is a little different from the way it manages other objects.

Case 1:

String a = "java";
String b = "java";

Do these two variables refer to the same object? The answer is yes, they do.

Case 2:
String c = new String("java");
String d = new String("java");
Do c and d point to the same object? The answer is no, they don't.
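
The two cases can be checked directly with the == operator, which compares references rather than contents. The following is a minimal sketch; the class and method names are made up for illustration.

```java
class StringIdentityDemo {
    // Case 1: two identical literals refer to one pooled instance.
    static boolean literalsShareInstance() {
        String a = "java";
        String b = "java";
        return a == b;
    }

    // Case 2: each use of new creates a separate instance.
    static boolean newStringsShareInstance() {
        String c = new String("java");
        String d = new String("java");
        return c == d;
    }

    // The contents are still equal even though the references differ.
    static boolean newStringsHaveEqualContent() {
        String c = new String("java");
        String d = new String("java");
        return c.equals(d);
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());      // true
        System.out.println(newStringsShareInstance());    // false
        System.out.println(newStringsHaveEqualContent()); // true
    }
}
```

This is also why string contents should be compared with equals() and not with ==.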

Now let us see how this works.

First, Java maintains a pool of string constants called the literal pool. In fact, this pool holds references to String instances on the heap. Since strings are objects too, they are also created on the heap.

Whenever you put a string literal in your code (the value assigned to the variable a above is an example), it gets special treatment. When the JVM loads a class that contains a string literal, it checks whether the string pool already has an entry for that string value. If it does, the symbolic reference in the class is replaced with the actual reference to that String object.
If there is no such string in the pool, a String instance is created on the heap and a new entry pointing to it is added to the pool. Again, the symbolic reference in your class is replaced with the actual reference to the String object on the heap.

Now look again at Case 2 at the top. Here the strings are created with the new operator, which tells the VM to create a new String instance on the heap. The VM does not care whether the string pool already has an entry for this string value: new always creates a new instance.
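
The pool lookup described above can also be triggered by hand with String.intern(), a standard method not mentioned so far in this post. The sketch below, with a made-up class name, shows that a string created with new is a separate instance, while interning it returns the pooled instance that the literal refers to.

```java
class InternDemo {
    // True when new yields a fresh instance that is not the pooled one.
    static boolean copyIsSeparateInstance() {
        String literal = "java";
        String copy = new String("java");
        return copy != literal;
    }

    // intern() looks the value up in the pool and returns the pooled instance.
    static boolean internReturnsPooledInstance() {
        String literal = "java";
        String copy = new String("java");
        return copy.intern() == literal;
    }

    public static void main(String[] args) {
        System.out.println(copyIsSeparateInstance());      // true
        System.out.println(internReturnsPooledInstance()); // true
    }
}
```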

String objects created with the new operator become eligible for garbage collection once no other part of the code holds a reference to them.

But what about the String objects that the VM creates on the heap in response to a string literal such as String a = "java"? These objects are never garbage collected, because the string pool's constant table holds a reference to them.

This string pool constant table is part of the class data of the String class, and since the String class is never unloaded by its classloader, these references exist in the method area and the GC can never collect them.
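
The difference in reachability can be observed with weak references, which do not keep their target alive. This is only a rough sketch with made-up names. System.gc() is merely a request, so the result for the copy created with new may vary between runs and JVMs. The literal stays reachable while this class is loaded.

```java
import java.lang.ref.WeakReference;

class StringGcDemo {
    // Weak reference to the pooled literal; the pool still refers to it.
    static WeakReference<String> weakLiteral() {
        return new WeakReference<>("java");
    }

    // Weak reference to a fresh copy; nothing else refers to it afterwards.
    static WeakReference<String> weakCopy() {
        return new WeakReference<>(new String("java"));
    }

    public static void main(String[] args) {
        WeakReference<String> literal = weakLiteral();
        WeakReference<String> copy = weakCopy();
        System.gc(); // a hint only; the JVM may or may not collect right away
        System.out.println("literal still reachable: " + (literal.get() != null));
        System.out.println("copy still reachable:    " + (copy.get() != null));
    }
}
```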





Wednesday, September 2, 2009

ClassNotFoundException

Every programmer has probably run into a ClassNotFoundException somewhere in the development lifecycle. I have often seen people spend long hours trying to work out why it happens. It is particularly common with web applications deployed on a web server like Tomcat or an app server like WebLogic or JBoss.

In the case of a normal desktop Java application, it can be fixed by adding the required class, or the library that contains the class, to the classpath. From Java 1.2 onward, the classpath classloader will then load the class and the issue is fixed.
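
To see the exception in isolation, the sketch below asks for a class by name with Class.forName. The class name com.example.DoesNotExist is a made-up placeholder that is assumed to be absent from the classpath, and the helper name is also just for illustration.

```java
class MissingClassDemo {
    // Returns true if the named class can be loaded, false if it cannot be found.
    static boolean isLoadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // No classloader in the chain could find the class.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isLoadable("java.lang.String"));         // true
        System.out.println(isLoadable("com.example.DoesNotExist")); // false
    }
}
```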

For a web application this may not be as straightforward. Sometimes I have seen people add the class to the classpath of the web server and still get the same error.

To identify and fix these issues, we need to understand that Java supports several types of classloaders. The main one is the bootstrap classloader, responsible for loading the Java API. Another is the classpath classloader, responsible for loading the classes referred to in Java's classpath. In the case of a web application, the web server uses its own user-defined classloader to load the classes from the WAR file.

Based on the Java security architecture, a class loaded by a classloader can only see classes loaded by that classloader. You might then wonder how a user-defined classloader can access classes loaded by the bootstrap classloader. The answer is that this is not the whole story: classloaders follow a parent-child delegation model. When a user-defined classloader (in this case, the one that loads classes from the WAR) is asked to load a class, it first asks its parent to load it, the parent asks its own parent, and so on up the chain. If none of the parents can load the class, it is up to the user-defined classloader to load it, and if that also fails, a ClassNotFoundException occurs.
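
The delegation order can be observed with a small custom classloader. The sketch below is not how any particular web server implements its loader. RecordingLoader is a made-up class that cannot define anything itself and only records the requests that reach its findClass method. The inherited loadClass calls findClass only after the parent chain has failed.

```java
import java.util.ArrayList;
import java.util.List;

class DelegationDemo {
    /** Hypothetical child loader: it finds nothing, it only records what it was asked for. */
    static class RecordingLoader extends ClassLoader {
        final List<String> askedToFind = new ArrayList<>();

        RecordingLoader(ClassLoader parent) {
            super(parent);
        }

        // Reached only when no parent could load the class.
        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            askedToFind.add(name);
            throw new ClassNotFoundException(name);
        }
    }

    /** Tries to load a class through the given loader and reports the outcome as text. */
    static String tryLoad(ClassLoader loader, String className) {
        try {
            Class<?> type = loader.loadClass(className);
            return "loaded " + type.getName();
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException";
        }
    }

    public static void main(String[] args) {
        RecordingLoader child = new RecordingLoader(ClassLoader.getSystemClassLoader());
        // A parent handles this one, so the child is never asked to find it.
        System.out.println(tryLoad(child, "java.lang.String"));         // loaded java.lang.String
        // No parent can load this placeholder name, so the child is asked and fails too.
        System.out.println(tryLoad(child, "com.example.DoesNotExist")); // ClassNotFoundException
        System.out.println(child.askedToFind);                          // [com.example.DoesNotExist]
    }
}
```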

I will try to explain this with an example.

Consider a WAR file that includes Class A. Class A references Class B, but Class B is not included in the WAR file. In fact, someone has added it to the classpath of the web server for some reason. When the web classloader loads Class A, it finds the reference to Class B, and it tries to load Class B when a method of Class A that uses Class B is invoked. To do so it asks its immediate parent, and Class B is finally loaded by the classpath classloader and returned. No issues here, and no ClassNotFoundException.

Now consider the case in which Class A is on the web server's classpath and Class B is included in the WAR file. When the web classloader (the classloader responsible for loading classes from the WAR file) is asked to load Class A, it asks its immediate parent, the chain continues, and Class A is finally loaded by the classpath classloader and returned. Class A was loaded by the classpath classloader, but the classpath classloader cannot load Class B. So when a method of Class A that uses Class B is invoked, a ClassNotFoundException is thrown (in practice it often surfaces as a NoClassDefFoundError caused by the ClassNotFoundException).

This is because the classpath classloader can only ask its parent to load a class, never its child classloader. The classpath classloader therefore knows nothing about Class B, which only the web classloader can load. The reverse does work: the web classloader has access to classes loaded by its parent.
To solve the above ClassNotFoundException, put both classes in the WAR file. Another option is to put Class A in the WAR and Class B on the classpath.
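
The two scenarios can be replayed with a toy model. ToyLoader below is not a real java.lang.ClassLoader. It is a made-up stand-in that knows a fixed set of class names and always asks its parent first. A reference from Class A to Class B is resolved through whichever loader defined Class A.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class ToyLoaderDemo {
    /** Toy stand-in for a classloader: a name, an optional parent, and the classes it can define. */
    static class ToyLoader {
        final String name;
        final ToyLoader parent;
        final Set<String> ownClasses;

        ToyLoader(String name, ToyLoader parent, String... ownClasses) {
            this.name = name;
            this.parent = parent;
            this.ownClasses = new HashSet<>(Arrays.asList(ownClasses));
        }

        /** Parent-first lookup; returns the loader that ends up defining the class. */
        ToyLoader load(String className) throws ClassNotFoundException {
            if (parent != null) {
                try {
                    return parent.load(className);
                } catch (ClassNotFoundException ignored) {
                    // The parents could not load it, so try this loader's own classes.
                }
            }
            if (ownClasses.contains(className)) {
                return this;
            }
            throw new ClassNotFoundException(className + " not visible from " + name);
        }
    }

    /** Loads A through the web loader, then resolves B through the loader that defined A. */
    static String resolve(ToyLoader web) {
        try {
            ToyLoader definerOfA = web.load("A");
            ToyLoader definerOfB = definerOfA.load("B");
            return "A by " + definerOfA.name + ", B by " + definerOfB.name;
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException: " + e.getMessage();
        }
    }

    /** Class A in the WAR, Class B on the server classpath. */
    static String scenarioOne() {
        ToyLoader classpath = new ToyLoader("classpath", null, "B");
        return resolve(new ToyLoader("web", classpath, "A"));
    }

    /** Class A on the server classpath, Class B in the WAR. */
    static String scenarioTwo() {
        ToyLoader classpath = new ToyLoader("classpath", null, "A");
        return resolve(new ToyLoader("web", classpath, "B"));
    }

    public static void main(String[] args) {
        System.out.println(scenarioOne()); // A by web, B by classpath
        System.out.println(scenarioTwo()); // ClassNotFoundException: B not visible from classpath
    }
}
```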

Pointer: Include all classes in the WAR file. Don't put any application-specific classes on the web server's classpath. The example above was used only to explain how classloaders work.