Jan 02

It seems to me that the importance of garbage collection in Java (and other garbage collected languages) is disproportional to the explanations given about it. While there would be millions of articles when you look for JavaFX examples, there would be only a couple if you tried to search about the “Parallel Compacting Collector” mentioned in Sun’s memory management whitepaper.

Since I wanted to understand better how the garbage collection is currently implemented in the Java VM and to see what’s ahead, I’ve scourged the internet a bit and found many interesting articles and slides. With the searches, it became like the old saying: there is a lot of stuff you don’t know, but there’s even more stuff you don’t even know you don’t know. This post is my summary of what I’ve found, but this is just the way I understand the explained material in publications and whitepapers; if I’m wrong somewhere, please do correct me.

The basics of garbage collection

The garbage collector first performs a task called marking. The garbage collector traverses the application graph, starting with the root objects; those are objects that are represented by all active stack frames and all the static variables loaded into the system. Each object the garbage collector meets is marked as being used, and will not be deleted in the sweeping stage.

The sweeping stage is where the deletion of objects take place. There are many ways to delete an object: The traditional C way was to mark the space as free, and let the allocator methods use complex data structures to search the memory for the required free space. This was later improved by providing a defragmenting system which compacted memory by moving objects closer to each other, removing any fragments of free space and therefore allowing allocation to be much faster:

memory collection

For the last trick to be possible a new idea was introduced in garbage collected languages: even though objects are represented by references, much like in C, they don’t really reference their real memory location. Instead, they refer to a location in a dictionary which keeps track of where the object is at any moment.

Fortunately for us – but unfortunately for these garbage collection algorithms – our servers and personal computers got faster (and multiple) processors and bigger memory capacities. Compacting memory areas this large often was very taxing on the application, especially considering that when doing that, the whole application had to freeze due to the changes in the virtual memory map. Fortunately for us though, some smart people improved those algorithms in three ways: concurrency, parallelization and generational collection.

Generational garbage collection

In any application, objects could be categorized according to their life-line. Some objects are short-lived, such as most local variables, and some are long-lived such as the backbone of the application. The thought about generational garbage collection was made possible with the understanding that in an application’s lifetime, most instantiated objects are short-lived, and that there are few connections between long-lived objects to short-lived objects.

In order to take advantage of this information, the memory space is divided to two sections: young generation and old generation. In Java, the long-lived objects are further divided again to permanent objects and old generation objects. Permanent objects are usually objects the Java VM itself created for caching like code, reflection information etc. Old generation objects are objects that survived a few collections in the young generation area.

Since we know that objects in the young generation memory space become garbage early, we collect that area frequently while leaving the old generation’s memory space to be collected in larger intervals. The young generation memory space is much smaller, thus having shorter collection times.

An additional advantage to the knowledge that objects die quickly in this area, we can also skip the compacting step and do something else called copying. This means that instead of seeking free areas (by seeking the areas marked as unused after the marking step), we copy the live objects from one young generation area to another young generation area. The originating area is called the From area, and the target area is called the To area, and after the copying is completed the roles switch: the From becomes the To, and the To becomes the From.

In addition, the Java VM splits the young generation to three areas, by adding an area called Eden which is where all objects are allocated into. To my understanding this is done to make allocation faster by always having the allocator reference to the beginning of Eden after a collection.

generational collection

By using the copying method, garbage collection achieves defragmentation without seeking for dead memory blocks. However, this method proves itself to be more efficient in areas where most objects are garbage, so it is not a good approach to take on the old generation memory area. Indeed, that area is still collected using the compacting algorithm – but now, thanks to the separation of young and old generations, it is done in much larger intervals.

Next up

I didn’t expect the amount of information I’ve found. I especially didn’t expect the amount of information I’ve found regarding how the garbage collector makes use of the multiple processors platforms available today in almost all new computers. I’m not a big believer in extremely long, 3,000 words posts, so all the information regarding parallel and concurrent garbage collectors can be found on the next post, allowing me to upload this one now and have a couple of days to edit the next one before sending it online.

Related Posts with Thumbnails

63 Responses to “How does garbage collection work?”

  1. Ricky Clarkson Says:

    Where have you seen that Java’s garbage collector can move objects around? I’ve only heard of Common Lisp garbage collectors doing that, though I expect others do, and I didn’t think Java’s did.

    Cheers, a good read.

  2. Maris Orbidans Says:

    good read

  3. Avah Says:

    Rick: When the garbage collector does the collection, it moves objects around. From the young generation’s From to the To area, from the young generation area to the old area, and within the old generation area when compacting.

    I got this information from the memory management whitepaper Sun published (now appears as a link at the top of the post) and from other resources such as JavaOne slides (publicly available for non-SDN members as well, even though SDN membership is free).

    Thanks for your comment!

  4. Avah Says:

    Let me correct my drawing a bit, though: in the generation collection, I made it look as if the marking marked the objects as unused. It’s obviously not true if you read the text, as it marks the used objects, and discards the rest.

  5. Web 2.0 Announcer Says:

    How does garbage collection work?

    […]Since I wanted to understand better how the garbage collection is currently implemented in the Java VM and to see what’s ahead, I’ve scourged the internet a bit and found many interesting articles and slides. With the searches, it became like …

  6. alan Says:

    this is interesting but how does this knowledge help you write better code?

  7. roScripts - Webmaster resources and websites Says:

    Chaotic Java » Blog Archive » How does garbage collection work?

    Chaotic Java » Blog Archive » How does garbage collection work?

  8. Avah Says:

    Alan: This is not knowledge to help you write better code yet. This is primarily knowledge out of interest.

  9. LadyCoder Says:

    While knowledge of the garbage collector may not help you write better code, it does help you understand how things work underneath.

    Recently, my work ran into some issues with our system. We had not been very careful with our memory management over the many years that the system has been developed (and maintained & added to…) and so we were actually running out of memory even with garbage collection.

    Understanding how the garbage collection works will help me to keep in mind that even though Java will clean up after me, it is VERY good practice to make sure to clean up after yourself as well. If you are done with an object set it to null, don’t just let it sit there, because eventually, you may very well run into issues where you eat up all of your memory and the garbage collector just can’t handle it.

    One more point to bring up about the garbage collector, which you mentioned briefly, is that after an object has been moved back and forth in the young generation a few times, it gets moved to the old generation and it will stick around there much longer. So, this is why it is important for you to clean up after yourself. You don’t want an object sticking around long enough to get bounced back and forth and then end up in the old generation area, since it will be even longer before it gets cleaned up. Just mark it as null when you are done with it and it will get cleaned out quickly.

  10. Avah Says:

    LadyCoder: I agree with you regarding being careful with your object management even with garbage collector. the garbage collector collects garbage, and if an object is still in use (even if not intentionally) it will created memory leaks even in a garbage collected language like Java.

    However, I don’t tend to agree on the “null” setting. There are very specific cases where you actually need to do that, and in most cases the scope system takes care of the problem for you. More than that, the Java compiler creates “mini-scopes” where it can to optimize your usage of objects even further, and sometimes setting an object to null confuses those optimizations and makes them produce less optimal bytecode, prolonging the life of an object instead of making it shorter.

    I might make a post about known anti-patterns that a lot of people use with garbage collected languages. Setting to null is not extremely bad, but it might result in less than optimal performance too.

    In any case, thanks for your comment!

  11. Daniel Pitts Says:

    I would think that you can optimize even more by adding a specialized reference counter. Obviously that won’t work in all circumstances, and if applied incorrectly it might even reduce performance.

    I wonder if it could be used as a “hint” to the GC that this object is only going to be short lived. (eg, it exists only on the stack frame, and will not be stored somewhere else ever)

  12. Avah Says:

    Daniel: Remember that for short-lived objects, the garbage collector does not look for the dead objects, it looks for the live objects and just abandons the ones unmarked. Therefore, knowing which objects are dead is not going to help it.

    In the case of the old generation the story is different though, but even here, since the entire graph of objects needs to be traversed in order to mark all the live objects, having another counter would not save any time, as far as I understand the process.

  13. double density design » Blog Archive » Bookmarking the web - w01/2008 Says:

    […] A post with information found on the internet about how Garbage Collection works. […]

  14. Daniel Pitts Says:

    Ah, that makes sense. It only copies active objects, so it doesn’t matter the instant that an object become unreachable…

    I’m curious how finalize() and WeakReference, etc.. affects the young generation.

  15. Avah Says:

    Daniel: you’re right to be curious. In fact, one of the biggest pitfalls of applications working with a non-concurrent garbage collector.

    Since the garbage collectors “stop the world” when they perform their act, a too-long finalize method can really make everything go awry in terms of performance.

    In the case of the concurrent collector, to the best of my understanding, if the garbage collector’s collection time is too long, the application could receive an OutOfMemoryException since the collector just can’t keep up.

    WeakReferences are easy though, and don’t affect the young generation much – they are just not followed when the collector traverses the object graph. In fact, you could say that by using weak references, you’re helping the garbage collector work faster! :)

  16. Krzysztof Grajek Says:

    Very nice post, I hope you will never stop writing at this address!
    Happy New Year!

  17. Pascal Says:

    > Even though objects are represented by references
    > … they don’t really reference their real memory
    > location. …

    This is not true anymore in a recent JVM.
    The cost to is too high (every read has an indirection!)

    This was used in Old JVM (and in the Classic MacOS, known as “handle”).

    Moderm VM “know” where the pointers are and
    update all references (you pay only at GC time).

    Normal use of reference incur no cost,
    like in C: just a simple pointer.


  18. Avah Says:

    Pascal: That’s very interesting! Is there a reference document for this?

  19. Chaotic Java » Blog Archive » GC Tips and Memory Leaks Says:

    […] the two posts about garbage collection (basic and advanced) I started receiving questions regarding the works of the GC, and tips on how to use […]

  20. Chaotic Java » Blog Archive » Parallel and concurrent garbage collectors Says:

    […] Java The interweb, design patterns, frameworks and Java « How does garbage collection work? NIO – efficient IO’s granular bits […]

  21. Garbage Collection - The comic panel Says:

    […] Garbage Collection set of posts (Generations, Parallel and Concurrent, Tips and Memory Leaks) are ones that I am personally very proud of. […]

  22. lavnish Says:

    Can you pls elaborate more on the “application graph”…

    The garbage collector first performs a task called marking. The garbage collector traverses the ‘application graph’, starting with the root objects;

  23. Garbage collectors « Java Village Says:

    […] commencer en douceur, une petite intro, on accélère le rythme avec un article un peu plus ardu, pour finir sur un best practice. Pour le […]

  24. Aviad Says:

    @lavnish: Of course! Or at least, to the best of my ability. Historically, the application graph contained all objects directly referenced from your application’s threads’ run() method (or something similar), your static instances of classes, and all objects reachable from these “root” objects.

    As far as I know, this has been optimized; unfortunately, I am unfamiliar with the specifics on this optimization.

  25. Java: Analyze Java Performance Issue | Programming Resources Says:

    […] How Garbage Collection Works […]

  26. Jose Luis Montes de Oca Says:

    Any insight about garbage collector and anonymous objects?

  27. Aviad Says:

    Jose: I assume you mean instances of anonymous classes. Since anonymous classes are the same as non-static inner classes, there is not a lot of difference at all in how the garbage collection is done.

    The one “catch” is that non-static inner classes contain a pointer to their parent class – this is why you can access its members. Because of that, the parent class will not be garbage collected even if all visible pointers to it are gone if you still have a reference to the inner class.

    I hope my answer was clear enough.. :)

  28. Keerthana Says:

    Hi..it was a gud read..esp fa an amateur..btw can u pls lemme know abt d feasibility n complexity levels of implementing garbage collection fa C language using C??? it wud really v helpful if u cud gimme d apt ans at d earliest :)

  29. Karthik Gopal Says:

    HI Guys.. It is a very good article to know about GC. Thanks for it, Yael.

    I am working on a Java application where everytime it goes out of the memory. something like, The UI will be frozen. can you guys gimme few hints where I can make my application more stable. This is an applet running on IE 5.0.

    Please, let me know.

  30. Ramaswamy Ratnala Says:

    for begineers it is very helpful.experienced candidates need not enough.

  31. techgal Says:

    Great info! I was asked in one of my interviews about GC. I could hardly tell them much because books or most of the pages on the internet give restricted info. Now I know how GC really works. Thanks a lot!

  32. Aviad Says:

    @techgal: Thanks for the comment. I’m glad it helped you! :)

  33. Ahishek Banerjee Says:

    The way you explained it was very smooth. Thanks to you for all the information you have gathered and provided us with such a brilliant flow.

  34. swathikah Says:

    thank you..this post was very useful for my research.but i wonder if we could find the memory requirement of an application by offline profiling of its gc data?

  35. Ann Says:

    Really helpful post.

  36. Regalla Naresh Reddy Says:

    This is ok … But the some where it is complex to understand ….!

    The terms which you have used are standard or not ….?

  37. Aviad Says:

    @Regalla: Which terms are you speaking of? I made sure as much as I can to use the technical terms (for example, Eden, From, To, etc are technical terms).

    If there’s something you’re uncertain about, please ask!

  38. Kelvin Njenga Says:

    i would like to know whether garbage collection really guarantees that a program will not run out of memory.
    kind regards.

  39. Aviad Says:

    @Kelvin: Of course not. You could have memory leaks if you retain objects in a “living” state for longer they are required, or if you use more memory than can be allocated by the JVM. It does, however, prevent a lot of “memory leak” situations found in non-GC languages such as not deleting objects for reasons such as mishandling passed references to objects. If you’re trying to avoid memory leaks and GC-related problems in Java, please read the GC Tips and Memory Leaks post.

  40. Peter B. Kessler Says:

    One correction: In your picture of scavenging the Eden into the survivor spaces, you make it look as though the “Just allocated” object exists outside of the Eden, and “will go to Eden”. In fact, new objects are allocate *in* the Eden, so we don’t have to copy it to the Eden.

    Nice write-up. And some nice comments and patient replies to them.

  41. Aviad Says:

    @Peter: You’re correct. I just had to draw it somewhere, and thought that outside of the memory scope will show that this is a new object better. But obviously, the object is allocated inside Eden and not copied into it.

    Also, thanks for the compliment. Glad to have a pleased reader! :)

  42. Muhammad Ali Says:

    I want to know that when jvm calls the gc

  43. Aviad Says:

    @Muhammad: Whenever it feels like it. More specifically, it depends on which generation you would be talking about. The objects in eden and the to and from areas are constantly GC’d, as only the living objects are moved around. The older generation is GC’d according to some tuning specifications, but generally it would occur when the memory hits the upper limit of the currently allocated memory space, just before allocating more memory (if it didn’t reach the maximum yet) or throwing an OOME. I’m not sure about a periodic run of the GC though.

  44. uberVU - social comments Says:

    Social comments and analytics for this post…

    This post was mentioned on Twitter by r3d: Java Garbage collection http://npm55.tk

  45. Ahsan Javed Says:

    Really nice article. However it misses other garbage collection algorithm. Are you planning to write them as well, Looking forward to it.


  46. IrisWang Says:

    very detail, thank you.

  47. manish Says:

    plzzzzzzzzz explain that can we get back the reference of an object which is already lost?
    if yes then how?

  48. Nishwanth Says:

    Good read

  49. Radhika Says:

    Very nice article

  50. Zanyking Says:

    Good Article!

    And about the discussion of set ref to null, as what I can remember, you only need to use it in some cases like set ThreadLocal’s value to null.

    I did this especially when I’m inside a container enviroment which might has a thread pool and played Classloader tricks a lot.

    If you forget to set it to null at the end of each action lifecycle, you’ll probably get memory leak of the old ClassLoader.

  51. siri Says:

    Y don’t add some coding part in this to make clear how it works

  52. Rajasekaran Says:

    You may please refer to the following link, and do let me know if it really helped http://freejavaclass.com/articles/java/garbagecollectionconcepts.jsp

  53. memory leaks in Java - Java Forums Says:

    […] memory leaks. Garbage collection removes unused blocks of memory so that memory does NOT leak. How does garbage collection work? __________________ Zack "There's more to a heart than just anger or […]

  54. Javin @ Tibco RV Tutorial Says:

    This is really good article and you explained in very clean way.

    Deciding space for Eden area and other heap area is very crucial for getting minimal Java Garbage collection which pauses application thread from running.

    FIX Protocol tutorial
    Why String is immutable in Java

  55. Dave Howes Says:

    Nice article, and it’s nice to see such patient responses to the comments.

  56. garbage collection in Java Says:

    Nice article your pictures made the understanding very easy.

    5 tips on writing equals method in Java

  57. Anonymous Says:

    something which I have been looking for. its easy and has some good information.
    I also found this link something close to good http://javarevisited.blogspot.com/2011/04/garbage-collection-in-java.html

  58. Android application architecture. Part II – architectural styles and patterns « Vlad Nevzorov on software development Says:

    […] only if an object does not have any reference to it from other “live” objects (see more details here). In Android it is not so. If a certain GUI gets hidden (it is not visible on the screen), there is […]

  59. sahil bucha Says:

    Nice article.. :)

  60. javapapo Says:

    Nice read – simple enough and well structured! Well done!

  61. Garbage collection για αρχάριους! « Java Hellenic User Group Says:

    […] άρθρο για τα βασικά του garbage collection. Must read! Θα το βρεις εδώ. Share and […]

  62. kwlim Says:

    Good read indeed. 😀

  63. pranav Says:

    Very nice article, very much detailed explanation of garbage collection