Diagnosing Java memory leaks...

Maximilian

Lifer
Feb 8, 2004
12,603
9
81
So this is fun... I've inherited an application with a potential memory leak.

It uses Java 6, Jetty 6.x and the Java Persistence API. Its purpose is to display the status of backups made by backup software; it doesn't actually do any backups itself. Pretty sure it runs on 32-bit Java, on 32-bit CentOS 5. It uses the Java Service Wrapper from Tanuki Software.

Unfortunately it sometimes spews "java.lang.OutOfMemoryError: Java heap space" errors. I have narrowed it down to one particular page that causes the issue. This page refreshes via an AJAX call every 30 seconds and brings back a hefty bit of data; it communicates with a servlet that returns JSON. I've poked through the code and looked at various things, like resources that need to be closed... the EntityManager gets closed in a finally clause, etc. It all looks fine. One weird thing is that the EntityManager object contains a transaction variable which contains another EntityManager which contains another transaction, etc. This chain of EntityManagers seems to be extremely deep. I don't know if this is normal or not...
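For reference, the closing pattern in the servlet code looks roughly like this (emf is the app's EntityManagerFactory; the entity and query names are made up for illustration):

Code:
EntityManager em = emf.createEntityManager();
try {
    // read-only query; the results get serialized to JSON further up
    List<?> statuses = em.createQuery("SELECT b FROM BackupStatus b")
                         .getResultList();
    // ... build the JSON response from statuses ...
} finally {
    // runs even if the query throws, so the persistence context is released
    if (em.isOpen()) {
        em.close();
    }
}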

I have run the VisualVM memory profiler on my development version, and it looks like T4CStatement, GregorianCalendar and ZoneInfo objects never get garbage collected. They sit around for 300+ generations (which I gather is bad). But I'm not 100% sure; maybe they only get garbage collected after they take up a certain amount of space, and it doesn't matter how many instances there are.

This may not even be a memory leak; it might just need more memory, and I'm seeing the limits of a 32-bit app on a 32-bit OS.

My half-baked solution is possibly to call System.gc() on every invocation of the troublesome servlet (or every 10th invocation; I really need to extensively google the implications of calling System.gc() before I do this) and to set the page to refresh every 60 seconds instead of 30. I'm going to try manually forcing a garbage collection by clicking the button in VisualVM on Monday to see if it picks up those objects that sit around for 300+ generations.


tl;dr Given a 32-bit Java 6 application using Jetty 6 and the Java Service Wrapper on 32-bit CentOS 5, is there any surefire way to know if there is a memory leak? Some objects sit around for 300+ (potentially more) generations in the VisualVM memory profiler. These are: T4CStatement, GregorianCalendar and ZoneInfo. JPA is also used; the EntityManager object (although it is closed) contains a transaction variable which contains another EntityManager, etc. This chain of EntityManagers seems to be extremely deep. I don't know if this is normal or not...
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
Don't worry about stuff living for a long time. That doesn't really matter (it may affect throughput but it isn't really an indicator of a memory leak problem.)

I suggest turning on heap dumps on memory errors.

-XX:+HeapDumpOnOutOfMemoryError
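Since you're running under the Tanuki wrapper, the flag goes into wrapper.conf as an additional JVM argument; something like this (use the next free indexes for your config, and the dump path is just a placeholder):

Code:
# extra JVM flags passed through by the Java Service Wrapper
wrapper.java.additional.3=-XX:+HeapDumpOnOutOfMemoryError
wrapper.java.additional.4=-XX:HeapDumpPath=/var/log/backup-monitor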

And then I suggest using the Eclipse Memory Analyzer (MAT) on the output heap dump. That will quickly give you a good idea of where your problem is (look at the leak suspects report and the dominator tree).

I would throw away VisualVM for memory analysis, it is pretty garbage. I only really use it to initiate heap dumps. Otherwise I use MAT for everything.

As for "system.gc" Just pull that out all together. You only get OOM exceptions when java has attempted to perform a full GC, 5?, times in a row. Explicitly calling GC accomplishes nothing.

I will say, if you hit "GC" in VisualVM and the memory utilization is still very high (75-90%), then that would be a good time to take a heap dump to see what is going on.
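(If you'd rather not attach VisualVM to the live box, jmap can take the same kind of dump from the command line; the PID and output path here are placeholders.)

Code:
# binary heap dump of live (reachable) objects, readable by MAT
jmap -dump:live,format=b,file=/tmp/heap.hprof 12345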
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
Also, your description of the long chain of entity managers definitely sounds fishy. If they are surviving across requests, that is something to look at.
 

Mr Evil

Senior member
Jul 24, 2015
464
187
116
mrevil.asvachin.com
...One weird thing is the EntityManager object contains a transaction variable which contains another EntityManager which contains another transaction etc etc. This chain of EntityManagers seems to be extremely deep...
Are you sure it's a long chain and not circular references?
 

Maximilian

Lifer
Feb 8, 2004
12,603
9
81
Are you sure it's a long chain and not circular references?

Ah yeah, you're right! I am a bonehead. According to IntelliJ, EntityTransactionWrapper@5154 refers to EntityTransactionWrapper@5153, which refers back to EntityTransactionWrapper@5154, and so on.

Don't worry about stuff living for a long time. That doesn't really matter (it may affect throughput but it isn't really an indicator of a memory leak problem.)

I suggest turning on heap dumps on memory errors.

-XX:+HeapDumpOnOutOfMemoryError

And then I suggest using the Eclipse Memory Analyzer (MAT) on the output heap dump. That will quickly give you a good idea of where your problem is (look at the leak suspects report and the dominator tree).

I would throw away VisualVM for memory analysis, it is pretty garbage. I only really use it to initiate heap dumps. Otherwise I use MAT for everything.

As for "system.gc" Just pull that out all together. You only get OOM exceptions when java has attempted to perform a full GC, 5?, times in a row. Explicitly calling GC accomplishes nothing.

I will say, if you hit "GC" in VisualVM and the memory utilization is still very high (75-90%), then that would be a good time to take a heap dump to see what is going on.

Cheers man, thanks for the awesome advice as usual!

The MAT you suggested seems to be quite a bit more useful than VisualVM. I tested it out by analyzing a heap dump (forced from VisualVM) of my development copy and checked out the leak suspects. These are the leak suspects I got from my development copy:

https://www.dropbox.com/s/zdkhx25b4diry30/heapdump-1516031060475_Leak_Suspects.zip?dl=0

I think getting one of those from the live application would be more helpful since I cannot really replicate the OOME on my development copy. I will look into adding that argument so I can analyze a dump from the live application.

One thing I will attempt first is to increase the max heap size to 1024 MB; it is currently at 512 MB.
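Looks like that is a one-line change in wrapper.conf, assuming the stock property names (values are in MB):

Code:
# JVM heap settings (maps to -Xms / -Xmx), in MB
wrapper.java.initmemory=256
wrapper.java.maxmemory=1024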

Okay, I'll forget about using System.gc() then.
 

Maximilian

Lifer
Feb 8, 2004
12,603
9
81
So I increased the heap size from 512 MB to 1024 MB and the application has been silent so far today.

I also added that -XX:+HeapDumpOnOutOfMemoryError parameter, so if it starts kicking out errors again I can analyze them with MAT. Doesn't look like it was a memory leak; I think it simply needed more heap space.

Also, the problem page was returning 38 MB of JSON and having jQuery DataTables deal with it every 30 seconds. Slow as hell. I disabled that auto-update, which should help things.

All seems to be well so far
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
Yeah, a 38 MB JSON blob wouldn't have too hard a time killing a 512 MB heap, especially if you build the whole thing in memory before sending it out. It would take just 14 simultaneous requests to that endpoint to make the app die (maybe even 7 or fewer, depending on how the data is marshaled).
 

Merad

Platinum Member
May 31, 2010
2,586
19
81
DataTables supports a relatively easy-to-implement server-side processing feature. Sounds like it's time to switch over to using it so you aren't pushing 38 MB of data back to the client.
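With server-side processing, DataTables sends paging parameters on every AJAX call and only expects the current page back. A rough sketch of the servlet side, assuming the DataTables 1.10 parameter names (fetchPage and countRows are hypothetical DAO methods, and toJson stands in for whatever JSON library the app already uses):

Code:
protected void doGet(HttpServletRequest req, HttpServletResponse resp)
        throws IOException {
    // paging parameters DataTables sends with each request
    int draw   = Integer.parseInt(req.getParameter("draw"));
    int start  = Integer.parseInt(req.getParameter("start"));
    int length = Integer.parseInt(req.getParameter("length"));

    long total = dao.countRows();                // SELECT COUNT(*)
    List<?> page = dao.fetchPage(start, length); // e.g. JPA setFirstResult/setMaxResults

    // small response envelope instead of the whole result set
    resp.setContentType("application/json");
    resp.getWriter().print("{\"draw\":" + draw
            + ",\"recordsTotal\":" + total
            + ",\"recordsFiltered\":" + total
            + ",\"data\":" + toJson(page) + "}");
}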
 
Sep 29, 2004
18,665
67
91
I had a large app that ran out of memory. Calling System.gc() helped. I did it every so often (like every 5 seconds).

Writing your own pool for things like String objects can help save memory. String pools cut my memory use down by like 100 MB.
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
I had a large app that ran out of memory. Calling System.gc() helped. I did it every so often (like every 5 seconds).


Calling System.gc() to solve memory leaks is a placebo, plain and simple. Java will do a full garbage collection (actually, I believe it does 5) before it will throw an out-of-memory exception. Really, the only reason to call System.gc() is if your whole app has very natural completion boundaries, is single-threaded, and a GC during the actual processing is not tolerable. If those conditions don't apply, System.gc() isn't the right tool.

Writing your own pool for things like String objects can help save memory. String pools cut my memory use down by like 100 MB.

You really shouldn't do this either. Object pools generally make your application slower, because anything you put in the pool is likely to end up in old gen (since the pool has a long lifetime). The consequence of that is more major GCs, which means a longer runtime.

I am curious, how were you pooling your strings? For example, did you do something like

Code:
String[] stringPool = new String[]{};

and then pass back and modify references to the Strings? Or did you instead do something like this:

Code:
Map<String, String> map;

and then take a string and grab the first instance of that string from the map?

The first method would yield no memory savings (only pain, as mentioned earlier), particularly because Java strings are immutable anyway, so any modification through the reference creates a brand-new string.

The second could be done through String.intern().
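For example (with the caveat that on older JVMs like Java 6 the intern pool lives in PermGen, which has its own size limit):

Code:
// interned strings come from the JVM-wide pool, so duplicates collapse
String a = new String("ABCDEF").intern();
String b = new String("ABCDEF").intern();
// a == b is true: both references point at the same pooled instance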

Was there a 3rd different thing that you did?
 
Sep 29, 2004
18,665
67
91
Want to get confused? Spend a day reading about Java garbage collection. And that won't be enough. Unless you write the actual code for a GC (like 5 people on the planet?) there is no real way to truly understand it other than at a top level.

Actually, Java categorizes memory differently over time. There are 3 tiers, I think. I don't care to look up their names. If data is not cleaned up soon enough, it gets moved into "long term memory". I forget what Java does with long-term memory storage. I know that it runs slower. I'm pretty sure there is a limit to what it will attempt to clean. Normal GCs do not clean up any data in long-term memory. I think it tries cleaning long-term memory for a set time limit, then gives up, and eventually you get an OOM exception!

Thus the problem. The notion that Java cleans up all unreferenced objects during GC is 100% false. It won't even do it to prevent an OOM exception. The only way to prevent OOM exceptions is to garbage collect frequently. If I GCed every 20 seconds, my app would complete but use like 2 GB of memory. If I GCed every 5 seconds, it would use about 1.4 GB of memory. If I did nothing, I'd eventually get an OOM exception.

String pools were just a HashMap<String, String> in a StringPool class that checked to see if a string was already stored. If it was, it returned the already-existing version. Java can do string pooling for you via a command-line argument. The problem is that pooled strings will not be collected even once nothing else references them, so in my case I could flush the pool as needed (application-requirement dependent). HashMaps were EXTREMELY FAST (obviously). My app just crunched data. It ran for 40 minutes, and I added code to measure the time it took to deal with pooling: it was like 4 seconds total. The thing is, the String "ABCDEF" could literally exist 100,000 times. I actually reported "string pool reuse" at the end of the application for every string in the pool.

I'm pretty sure that the Java that runs on servers is more robust. I forget the trickery it does, but it obviously has to be more reliable than a client application.

Oh, my app was a Java application for clients that took a bunch of proprietary data and blended it with existing PDF files to generate new PDF files that made them interactive (linking-type functionality).

The code was essentially this:
Code:
private final HashMap<String, String> stringMap = new HashMap<>();

public synchronized String fetch(String str) {
    if (str == null) {
        return null;
    }

    if (stringMap.containsKey(str)) {
        return stringMap.get(str);
    }

    stringMap.put(str, str);
    return str;
}
 

Cogman

Lifer
Sep 19, 2000
10,278
126
106
Want to get confused? Spend a day reading about Java garbage collection. And that won't be enough. Unless you write the actual code for a GC (like 5 people on the planet?) there is no real way to truly understand it other than at a top level.

I've spent maybe too much time reading about, watching lectures about, and learning about Java's GCs. While I'm not at the point where I could jump in and start writing GC code, I am at the point where I have a pretty firm understanding about how Java's GCs and their collection phases work. It is a hobby of mine.

Actually, Java categorizes memory differently over time. There are 3 tiers, I think. I don't care to look up their names. If data is not cleaned up soon enough, it gets moved into "long term memory". I forget what Java does with long-term memory storage. I know that it runs slower. I'm pretty sure there is a limit to what it will attempt to clean. Normal GCs do not clean up any data in long-term memory. I think it tries cleaning long-term memory for a set time limit, then gives up, and eventually you get an OOM exception!

The tiers are: Eden/new gen/survivor (all the same space), old gen, and Metaspace (formerly PermGen, and no longer part of the main heap).

Objects too big to fit into Eden are allocated directly in old gen.
Objects that stick around through N minor collections are promoted to old gen.
Objects that are referenced by old-gen objects are moved into old gen.

Those are the general rules.

Minor collections happen whenever new gen is filled up.

All of the minor collectors in mainline Java do a moving collection (not talking about ZGC, Shenandoah, or whatever the Azul GC is called).
Major collections are initiated in different ways depending on the collector used. CMS and G1GC kick off a cycle when a set occupancy is hit: for G1GC it is something like 40% utilization by default, and for CMS something like 70%. For CMS, a stop-the-world full collection can additionally be triggered by hitting a fragmentation overhead limit (it takes too long to find a place to promote an object) or when it can't find enough space to allocate an object. For the parallel collector, a full collection happens simply when there isn't enough space to push something into old gen; for G1GC, a stop-the-world full collection likewise happens when it runs out of space.

G1GC has the added behavior that minor collections also collect a portion of old gen, based on the target pause time.

Metaspace and PermGen play by different rules.

Thus the problem. The notion that Java cleans up all unreferenced objects during GC is 100% false. It won't even do it to prevent an OOM exception. The only way to prevent OOM exceptions is to garbage collect frequently. If I GCed every 20 seconds, my app would complete but use like 2 GB of memory. If I GCed every 5 seconds, it would use about 1.4 GB of memory. If I did nothing, I'd eventually get an OOM exception.

You can read in the docs about when an OOM is thrown:
https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/memleaks002.html

There are a lot of reasons for OOMs to trigger and pretty much none of them would be solved by calling System.gc() (unless you are pausing the execution to call System.gc(), in which case you are avoiding the OOM by virtue of skewing the stats it uses to determine that you are spending too much time GCing).

In pretty much all cases, OOMs indicate either a heap that is too small, a GC that is misconfigured, or that you legitimately have a memory leak and it needs to be fixed.

My suggestion, if you haven't employed it yet, is to increase your max heap size (set it at 2 GB if that is what is needed to complete, or higher). Generally, prefer the parallel collector; if your heap is < 6 GB, there isn't a good reason to use anything else, and CMS and G1GC will perform worse with higher overhead. The one exception is if your box/VM has only one processor available to it, in which case you should use the serial collector. And really, before making any decisions or tweaks to any of the GC settings, you should be collecting and looking at your GC logs to determine what should be tuned. Anything else is just blind guessing.
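On Java 8 and earlier, turning the logs on is just a few flags (the log path is a placeholder; under the Tanuki wrapper each of these would be a wrapper.java.additional.<n> entry):

Code:
# pre-Java 9 HotSpot GC logging flags
-verbose:gc
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-Xloggc:/var/log/myapp/gc.log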

String pools were just a HashMap<String, String> in a StringPool class that checked to see if a string was already stored. If it was, it returned the already-existing version. Java can do string pooling for you via a command-line argument. The problem is that pooled strings will not be collected even once nothing else references them, so in my case I could flush the pool as needed (application-requirement dependent). HashMaps were EXTREMELY FAST (obviously). My app just crunched data. It ran for 40 minutes, and I added code to measure the time it took to deal with pooling: it was like 4 seconds total. The thing is, the String "ABCDEF" could literally exist 100,000 times. I actually reported "string pool reuse" at the end of the application for every string in the pool.

I'm pretty sure that the Java that runs on servers is more robust. I forget the trickery it does, but it obviously has to be more reliable than a client application.

Oh, my app was a Java application for clients that took a bunch of proprietary data and blended it with existing PDF files to generate new PDF files that made them interactive (linking-type functionality).

The code was essentially this:
Code:
private final HashMap<String, String> stringMap = new HashMap<>();

public synchronized String fetch(String str) {
    if (str == null) {
        return null;
    }

    if (stringMap.containsKey(str)) {
        return stringMap.get(str);
    }

    stringMap.put(str, str);
    return str;
}

Fair enough. You might consider an LRU using a TreeMap. That would remove the need to do any manual pruning (though it is slower than a HashMap).
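For what it's worth, the JDK also gives you an LRU almost for free via LinkedHashMap with access ordering (a different structure than the TreeMap idea, but the same effect; MAX_ENTRIES is an arbitrary cap):

Code:
private static final int MAX_ENTRIES = 100000;

// accessOrder=true keeps iteration order least-recently-used first
private final Map<String, String> pool =
    new LinkedHashMap<String, String>(16, 0.75f, true) {
        @Override
        protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
            // evict the least-recently-used entry once the cap is exceeded
            return size() > MAX_ENTRIES;
        }
    };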

But if you still want to use a hash map, I would suggest you instead switch over to a ConcurrentHashMap and change your code to look something like this (assuming Java 8):

Code:
private final ConcurrentHashMap<String, String> stringMap = new ConcurrentHashMap<>();

public String fetch(String s) {
  if (s == null)
    return null;

  // putIfAbsent returns the existing value, or null if we just inserted s
  String existing = stringMap.putIfAbsent(s, s);
  return existing != null ? existing : s;
}

Same thread safety with higher speed, because ConcurrentHashMap does some tricks with locking to avoid excessive synchronization. Bonus: it doesn't do a double lookup (containsKey, then get/put).
 