NIO - efficient IO’s granular bits
Jan 15

The Java NIO package is all about performance. That’s why the word performance is going to be repeated often throughout this post, showing where the NIO package increases your application’s throughput. It’s important to note that the “old IO”, the java.io package, is not bad to use, and some of its features are not covered by NIO at all. However, in many situations involving data flow, the NIO package provides significantly faster IO operations.

It all starts with buffers

Buffers are the most granular unit in the NIO package. There is an abstract Buffer class which defines the common features, and a type-specific Buffer extension for each of the primitive types except boolean (IntBuffer, ByteBuffer, etc.).

Each buffer implements the Comparable interface, and that’s a point worth remembering: if you want to use a buffer in a Map or Set, prefer the TreeMap or TreeSet implementations, since a comparison only checks the buffers’ elements up to the first non-matching element. The hash-code implementation, on the other hand, has to go through all the remaining elements in order to produce a hash-code, and the result changes whenever the buffer’s content changes, making HashMap and HashSet a really bad fit for buffers.
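To make the difference concrete, here is a minimal sketch; the class and method names are made up for illustration. It puts a few IntBuffers in a TreeSet and shows that the hash-code of a buffer follows its content.

```java
import java.nio.IntBuffer;
import java.util.TreeSet;

class BufferKeysDemo {
    // Buffers are ordered by comparing their remaining elements one by one.
    static TreeSet<IntBuffer> sortedSet() {
        TreeSet<IntBuffer> set = new TreeSet<IntBuffer>();
        set.add(IntBuffer.wrap(new int[] {1, 9, 3}));
        set.add(IntBuffer.wrap(new int[] {1, 2, 3}));
        set.add(IntBuffer.wrap(new int[] {1, 2, 3})); // same content: treated as a duplicate
        return set;
    }

    // The hash-code is computed from the remaining elements, so it changes
    // when the content changes. That is a problem for a key in a HashMap.
    static boolean hashChangesWithContent() {
        IntBuffer buf = IntBuffer.wrap(new int[] {1, 2, 3});
        int before = buf.hashCode();
        buf.put(0, 42); // absolute put, position is untouched
        return before != buf.hashCode();
    }

    public static void main(String[] args) {
        System.out.println(sortedSet().size());
        System.out.println(hashChangesWithContent());
    }
}
```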

In addition to offering an array-like interface that allows read and write operations at specific indices, buffers also work in a serial way for increased performance. They hold a pointer to the current position in the buffer, and when a put or a get operation is requested it is performed at that position, after which the position is advanced. They also hold a pointer to the end of the buffer (the limit), so that when a read or a write is attempted after the current position has reached the end of the buffer, an exception is thrown.
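A small sketch of the two access styles, with an illustrative class name of my own choosing: relative put calls advance the position, an absolute get leaves it alone, and writing past the end throws a BufferOverflowException.

```java
import java.nio.BufferOverflowException;
import java.nio.IntBuffer;

class PositionDemo {
    // Returns {position after one put, position after two puts,
    //          value read at index 0, 1 if the third put overflowed}.
    static int[] run() {
        IntBuffer buf = IntBuffer.allocate(2); // position 0, limit 2
        buf.put(10);                           // relative put: written at the position
        int afterOne = buf.position();
        buf.put(20);
        int afterTwo = buf.position();         // the position has reached the limit
        int first = buf.get(0);                // absolute get: position is not moved
        int overflowed = 0;
        try {
            buf.put(30);                       // no room left
        } catch (BufferOverflowException e) {
            overflowed = 1;
        }
        return new int[] {afterOne, afterTwo, first, overflowed};
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(run()));
    }
}
```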

This simple mechanism is great for performance, but poses a problem when trying to read data from the buffer after writing to it. Since the position of the buffer is left one past the last written element, a read operation at that point will produce unexpected values. To solve this, buffers offer a flipping model, which allows you to read data immediately after writing it. The flip method performs a simple task: it sets the end-of-buffer pointer to where the current position was, and then sets the current position pointer back to the beginning of the buffer.

[Figure: buffer flipping]
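In code, the write-flip-read cycle could look like the sketch below. The class name and the buffer size of 16 are arbitrary choices for the example, so the text passed in has to fit in 16 characters.

```java
import java.nio.CharBuffer;

class FlipDemo {
    // Writes the text into a buffer, flips it, and reads it back.
    static String writeThenRead(String text) {
        CharBuffer buf = CharBuffer.allocate(16);
        buf.put(text);  // position is now text.length(), limit is still 16
        buf.flip();     // limit = old position, position = 0
        StringBuilder out = new StringBuilder();
        while (buf.hasRemaining()) {
            out.append(buf.get()); // relative get, stops at the limit
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead("hello"));
    }
}
```

Without the call to flip, the loop would read the 11 untouched characters after "hello" instead of the text itself.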

Of the seven buffer implementations, two are noteworthy: CharBuffer and ByteBuffer. CharBuffer is interesting because it implements the CharSequence interface (much like String), making it a valid input for a Writer and, more importantly, a valid target for the classes in the regular expressions package. In fact, the regex package has been slightly modified to provide better performance when working with buffers. In addition, CharBuffer is used by the Readable interface, a new interface that the Reader abstract class implements; therefore, all Reader subclasses can now read into a CharBuffer as their target instead of just character arrays. When used with direct buffers, this could be a significant performance improvement as well. Another interface CharBuffer implements is Appendable, putting it on equal footing with StringBuilder and StringBuffer when it comes to string concatenation.
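Here is a hedged sketch of how these pieces fit together: a Reader fills a CharBuffer, and the buffer is then handed straight to a regex Matcher as a CharSequence. The class name, the 64-character capacity and the pattern are assumptions made for the example.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.CharBuffer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharBufferDemo {
    // Reads up to 64 characters of input into a CharBuffer and returns
    // the first run of digits found in it, or null if there is none.
    static String firstNumber(String input) {
        CharBuffer buf = CharBuffer.allocate(64);
        try (Reader reader = new StringReader(input)) {
            // Reader.read(CharBuffer) comes from the Readable interface.
            while (buf.hasRemaining() && reader.read(buf) != -1) {
                // keep reading until the buffer is full or the input ends
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buf.flip(); // switch from writing to reading
        Matcher m = Pattern.compile("\\d+").matcher(buf); // CharBuffer is a CharSequence
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("order 42 shipped"));
    }
}
```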

ByteBuffer has many added features. First and foremost, it is the only buffer that can be allocated as a direct buffer, meaning that its data lives in native memory outside the VM’s garbage-collected heap. A memory-mapped buffer, whose data is mapped to a region of a file, is one kind of direct buffer. Either way, the contents do not take any memory from the VM’s heap and, more importantly, do not bother the garbage collector. This matters because large objects, such as big byte arrays, may be allocated directly in the old generation, which poses a problem for two reasons: first, garbage collection of the old generation typically occurs only when it runs out of allocation space, and second, collecting the old generation is significantly slower than collecting the young generation. By constantly creating large byte arrays for IO operations you might trigger collection of the old generation too frequently, reducing application performance significantly.
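For illustration, this is roughly how the two kinds of allocation differ in code. The class name and the 1024-byte size are arbitrary.

```java
import java.nio.ByteBuffer;

class DirectBufferDemo {
    // A regular buffer, backed by a byte array on the VM's heap.
    static ByteBuffer heapBuffer() {
        return ByteBuffer.allocate(1024);
    }

    // A direct buffer: its contents live outside the garbage-collected heap.
    static ByteBuffer directBuffer() {
        return ByteBuffer.allocateDirect(1024);
    }

    public static void main(String[] args) {
        System.out.println(heapBuffer().isDirect());
        System.out.println(directBuffer().isDirect());
    }
}
```

Direct buffers are usually more expensive to allocate than heap buffers, so they pay off mostly when they are large and long-lived, for example a single buffer reused across many IO operations.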

Having read the two previous paragraphs, you might wonder how I could have mentioned a direct character buffer if only byte buffers can be allocated as direct buffers. Well, this is where the second important feature of ByteBuffer comes into view: a ByteBuffer can be exposed as a view of any of the other primitive buffer types using one of its as-methods (e.g. asCharBuffer). It’s important to note that any change made through the view is reflected in the byte buffer, and vice versa. It is also the only way buffers of the other primitive types can be made into direct buffers. Note that if a buffer is indeed a direct buffer, a call to the array method will throw an exception. This is because the array method retrieves the backing array, and will not create one from memory that lives outside the heap.
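A short sketch of a view buffer in action, again with a made-up class name. Characters written through the view show up in the byte buffer, and a change made through the byte buffer shows up in the view.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;

class ViewBufferDemo {
    // Writes "Hi" through the char view, overwrites the second char
    // through the byte buffer, and returns what the view now contains.
    static String sharedContent() {
        ByteBuffer bytes = ByteBuffer.allocateDirect(8); // room for 4 chars
        CharBuffer chars = bytes.asCharBuffer();         // view over the same memory
        chars.put("Hi");                                 // moves only the view's position
        if (bytes.getChar(0) != 'H') {                   // the write is visible in the byte buffer
            throw new IllegalStateException("view and byte buffer are out of sync");
        }
        bytes.putChar(2, '!');                           // chars are 2 bytes, so this is char index 1
        chars.flip();
        return chars.toString();                         // content between position and limit
    }

    // A view of a direct byte buffer is itself direct.
    static boolean viewIsDirect() {
        return ByteBuffer.allocateDirect(8).asCharBuffer().isDirect();
    }

    public static void main(String[] args) {
        System.out.println(sharedContent());
        System.out.println(viewIsDirect());
    }
}
```

If you are not sure whether a buffer has an accessible backing array, checking hasArray before calling array avoids the exception.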

The following diagram summarizes the above:

[Figure: buffers diagram]

JIT performance

Buffers have one more benefit: they receive special treatment from the Java VM. According to Sun’s documentation of the Java HotSpot VM, the put and get operations are compiled to produce high-quality machine code, and in their words: “combined with the large performance and scalability improvements in network and file I/O offered by the New I/O APIs, Java programming language applications can now achieve similar throughput as applications coded in C and C++”.

Next up

Again, there is only so much I can write about in a single post. The next post will be all about data flow, using channels to transfer data from and to buffers, and using selectors to optimize data flow from and to multiple channels.
