Title: Java Memory Analysis: Problems and Solutions
1. Java Memory Analysis: Problems and Solutions
2. How Java apps (ab)use memory
3. Assume that at the high level your data is represented efficiently
- Data doesn't sit in memory for longer than needed
- No unnecessary duplicate data structures
  - E.g. don't keep the same objects in both a List and a Set
- Data structures are appropriate
  - E.g. don't use ConcurrentHashMap when there is no concurrency
- Data format is appropriate
  - E.g. don't use Strings for int/double numbers
4. Main sources of memory waste (from bottom to top level)
- JVM internal object implementation
- Inefficient common data structures
  - Collections
  - Boxed numbers
- Data duplication - often the biggest overhead
- Memory leaks
5. Internal Object Format in HotSpot JVM
6. Internal Object Format: Alignment
- To enable 4-byte pointers (compressedOops) with >4G heap, objects are 8-byte aligned
- Thus, for example:
  - java.lang.Integer effective size is 16 bytes (12b header + 4b int)
  - java.lang.Long effective size is 24 bytes - not 20! (12b header + 8b long + 4b padding)
7. Summary: small objects are bad
- A small object's overhead is up to 400% of its workload
- There are apps with up to 40% of the heap wasted due to this
- See if you can change your code to consolidate objects or put their contents into flat arrays (see the sketch after this list)
- Avoid heap size > 32G! (really 30G)
  - Unless your data is mostly int, byte etc.
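
A minimal sketch of the flat-array idea above, using a hypothetical Point class (names are illustrative; the per-object numbers assume a 12-byte header and 4-byte compressed references):

    // Object-per-element: each Point costs 12b header + 2*4b ints + 4b padding = 24b,
    // plus a 4b reference in whatever array or collection holds it (~28b per point).
    class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // Flattened: the same data in two parallel int[] arrays, ~8b per point
    // plus two array headers, and no per-point object headers at all.
    class PointBuffer {
        private final int[] xs;
        private final int[] ys;

        PointBuffer(int capacity) {
            xs = new int[capacity];
            ys = new int[capacity];
        }

        void set(int i, int x, int y) { xs[i] = x; ys[i] = y; }
        int x(int i) { return xs[i]; }
        int y(int i) { return ys[i]; }
    }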
8. Common Collections
- JDK: java.util.ArrayList, java.util.HashMap, java.util.concurrent.ConcurrentHashMap, etc.
- Third-party - mainly Google: com.google.common.collect
- Scala has its own equivalent of JDK collections
- JDK collections are nothing magical
  - Written in Java, easy to load and read in an IDE
9. ArrayList Internals
10. HashMap Internals
11. Memory Footprint of JDK Collections
- The JDK pays little attention to memory footprint
  - Just some optimizations for empty ArrayLists and HashMaps
  - ConcurrentHashMap and some Google collections are the worst memory hogs
- Memory is wasted due to:
  - Default size of the internal array (10 for ArrayList, 16 for HashMap) - too high for small maps. Never shrinks after initialization.
  - Entry objects used by all Maps take at least 32b each!
  - Sets just reuse the Map structure, no footprint optimization
12. Boxed numbers - related to collections
- java.lang.Integer, java.lang.Double, etc.
- Were introduced mainly to avoid creating specialized classes like IntToObjectHashMap
- However, proven to be extremely wasteful (worked example after this list):
  - A single int takes 4b; java.lang.Integer effective size is 16b (12b header + 4b int), plus a 4b pointer to it
  - A single long takes 8b; java.lang.Long effective size is 24b (12b header + 8b long + 4b padding), plus a 4b pointer to it
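
A worked example of the arithmetic above, assuming 4-byte compressed references; the class name is hypothetical and the sizes are approximate:

    public class BoxedVsPrimitive {
        public static void main(String[] args) {
            // int[1_000_000]: ~16b array header + 4,000,000b of data ~= 4 MB total.
            int[] primitives = new int[1_000_000];

            // Integer[1_000_000]: ~16b header + 4,000,000b of pointers,
            // plus ~1,000,000 * 16b Integer objects ~= 20 MB total.
            Integer[] boxed = new Integer[1_000_000];
            for (int i = 0; i < boxed.length; i++) {
                boxed[i] = i;  // autoboxing allocates a 16-byte Integer for most values
            }
            System.out.println(primitives.length + " " + boxed.length);
        }
    }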
13. JDK Collections Summary
- Initialized, but empty collections waste memory
- Things like HashMap<Object, Integer> are bad
- HashMap$Entry etc. may take up to 30% of memory
- Some third-party libraries provide alternatives (see the sketch after this list)
  - In particular, fastutil (fastutil.di.unimi.it, University of Milan, Italy)
    - Has Object2IntHashMap, Long2ObjectHashMap, Int2DoubleHashMap, etc. - no boxed numbers
    - Has Object2ObjectOpenHashMap - no Entry objects
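
A sketch of the fastutil-style alternative, assuming the fastutil library is on the classpath; the word-count use case and the WordCounts class name are illustrative:

    import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;

    public class WordCounts {
        public static void main(String[] args) {
            // Values live in an internal int[]: no Entry objects, no Integer boxes.
            Object2IntOpenHashMap<String> counts = new Object2IntOpenHashMap<>();
            counts.defaultReturnValue(0);   // what getInt() returns for a missing key

            for (String w : new String[] {"a", "b", "a"}) {
                counts.put(w, counts.getInt(w) + 1);
            }
            System.out.println(counts.getInt("a"));   // 2

            // The equivalent HashMap<String, Integer> would pay for an Entry object
            // and a boxed Integer per mapping.
        }
    }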
14. Data Duplication
- Can happen for many reasons:
  - s = s1 + s2 or s = s.toUpperCase() etc. always generates a new String object
  - intObj = new Integer(intScalar) always generates a new Integer object
  - Duplicate byte buffers in I/O, serialization, etc.
- Very hard to detect without tooling
  - A small amount of duplication is inevitable
  - 20-40% waste is not uncommon in unoptimized apps
- Duplicate Strings are the most common and the easiest to fix
15. Dealing with String duplication
- Use tooling to determine where dup strings are either:
  - generated, e.g. s = s.toUpperCase()
  - permanently attached, e.g. this.name = name
- Use String.intern() to de-duplicate (see the sketch after this list)
  - Uses a JVM-internal fast, scalable canonicalization hashtable
  - Table is fixed and preallocated - no extra memory overhead
  - Small CPU overhead is normally offset by reduced GC time and improved cache locality
- s = s.toUpperCase().intern(); this.name = name.intern()
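
A minimal sketch of both fixes named above; the Name class and normalize() method are hypothetical examples, not code from the talk:

    public class Name {
        private final String name;

        // Permanently attached copy: all Name instances carrying the same
        // character sequence now point at one canonical String.
        Name(String name) {
            this.name = name.intern();
        }

        // Freshly generated String, e.g. the result of toUpperCase().
        static String normalize(String s) {
            return s.toUpperCase().intern();
        }

        String value() { return name; }
    }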
16. Other duplicate data
- Can be almost anything. Examples:
  - Timestamp objects
  - Partitions (with HashMaps and ArrayLists) in Apache Hive
  - Various byte[], char[] etc. data buffers everywhere
- So far, no convenient tooling for automatic detection of arbitrary duplicate objects
- But one can often guess correctly
  - Just look at the classes that take most memory
17. Dealing with non-string duplicates
- Use a WeakHashMap to store canonicalized objects
  - com.google.common.collect.Interner wraps a (Weak)HashMap (see the sketch after this list)
- For big data structures, interning may cause some CPU performance impact
  - Interning calls hashCode() and equals()
  - GC time reduction would likely offset this
- If duplicate objects are mutable, like HashMap
  - May need CopyOnFirstChangeHashMap, etc.
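
A sketch of canonicalization with Guava's weak interner. The Stamp value class is hypothetical; the pattern only works safely for immutable objects with well-defined equals()/hashCode():

    import com.google.common.collect.Interner;
    import com.google.common.collect.Interners;

    // Hypothetical immutable value class that tends to be heavily duplicated.
    final class Stamp {
        final long millis;
        Stamp(long millis) { this.millis = millis; }
        @Override public boolean equals(Object o) {
            return o instanceof Stamp && ((Stamp) o).millis == millis;
        }
        @Override public int hashCode() { return Long.hashCode(millis); }
    }

    class StampCanonicalizer {
        // Weak references: canonical copies can still be GC'ed once unused.
        private static final Interner<Stamp> INTERNER = Interners.newWeakInterner();

        // Call wherever a new Stamp is created or deserialized; costs one
        // hashCode() + equals() per call, usually repaid by reduced GC time.
        static Stamp canonicalize(Stamp s) {
            return INTERNER.intern(s);
        }
    }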
18. Duplicate Data Summary
- Duplicate data may cause huge memory waste
  - Observed up to 40% overhead in unoptimized apps
- Duplicate Strings are easy to:
  - Detect (but you need tooling to analyze a heap dump)
  - Get rid of - just use String.intern()
- Other kinds of duplicate data are more difficult to find
  - But it's worth the effort!
  - Mutable duplicate data is more difficult to deal with
19. Memory Leaks
- Unlike C, Java doesn't have real leaks
  - Data that's not used anymore, but not released
  - Too much persistent data cached in memory
- No reliable way to distinguish leaked data
  - But any data structure that just keeps growing is bad
- So, just pay attention to the biggest (and growing) data structures
  - Heap dump: see which GC root(s) hold most memory
  - Runtime profiling can be more accurate, but is more expensive
20. JXRay Memory Analysis Tool
21. What is it
- Offline heap analysis tool
  - Runs once on a given heap dump, produces a text report
- Simple command-line interface
  - Just one jar + .sh script
  - No complex installation
  - Can run anywhere (laptop or remote headless machine)
  - Needs JDK 8
- See http://www.jxray.com for more info
22. JXRay main features
- Shows you what occupies the heap
  - Object histogram: which objects take most memory
  - Reference chains: which GC roots/data structures keep the biggest object lumps in memory
- Shows you where memory is wasted
  - Object headers
  - Duplicate Strings
  - Bad collections (empty / 1-element / small (2-4 elements))
  - Bad object arrays (empty (all nulls) / length 0 or 1 / 1-element)
  - Boxed numbers
  - Duplicate primitive arrays (e.g. byte buffers)
23. Keeping results succinct
- No GUI - generates a plain text report
  - Easy to save and exchange
  - Small (~50K) regardless of the dump size
  - Details a given problem only once its overhead is above a threshold (by default 0.1% of used heap)
- Knows about the internals of most standard collections
  - More compact/informative representation
- Aggregates reference chains from GC roots to problematic objects
24. Reference chain aggregation: assumptions
- A problem is important if many objects have it
  - E.g. 1000s/1,000,000s of duplicate strings
- Usually there are not too many places in the code responsible for such a problem:
  - Foo(String s) { this.s = s.toUpperCase(); }
  - Bar(String s1, String s2) { this.s = s1 + s2; }
25. Reference chain aggregation: what is it
- In the heap, we may have e.g.
    Baz.stat1 -> HashMap@243 -> ArrayList@650 -> Foo.s = "xyz"
    Baz.stat2 -> LinkedList@798 -> HashSet@134 -> Bar.s = "0"
    Baz.stat1 -> HashMap@529 -> ArrayList@351 -> Foo.s = "abc"
    Baz.stat2 -> LinkedList@284 -> HashSet@960 -> Bar.s = "1"
    ... 1000s more chains like this
- JXRay aggregates them all into just two lines:
    Baz.stat1 -> HashMap -> ArrayList -> Foo.s ("abc", "xyz" and 3567 more dup strings)
    Baz.stat2 -> LinkedList -> HashSet -> Bar.s ("0", "1" and ...)
26. Treating collections specially
- Object histogram: standard vs JXRay view
    HashMap$Entry     21500 objs   430K
    HashMap$Entry[]    3200 objs   180K
    HashMap            3200 objs   150K
  vs
    HashMap            3200 objs   760K
- Reference chains:
    Foo <- HashMap$Entry.value <- HashMap$Entry <- HashMap$Entry[] <- HashMap <- Object[] <- ArrayList <- rootX
  vs
    Foo <- HashMap.values <- ArrayList <- rootX
27. Bad collections
- Empty: no elements at all
  - Is it used at all? If yes, allocate lazily (see the sketch after this list)
- 1-element:
  - Always has only 1 element - replace with an object
  - Almost always has 1 element - solution is more complex: switch between an Object and a collection/array lazily
- Small: 2..4 elements
  - Consider a smaller initial capacity
  - Consider replacing with a plain array
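
A minimal sketch of the lazy-allocation fix for the empty case; the Listeners holder class is a hypothetical example:

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    public class Listeners<T> {
        // Stays null until the first element arrives, so instances that never get
        // a listener pay for one reference field instead of an empty ArrayList.
        private List<T> listeners;

        void add(T listener) {
            if (listeners == null) {
                listeners = new ArrayList<>(4);  // small initial capacity, not the default 10
            }
            listeners.add(listener);
        }

        List<T> all() {
            return listeners == null ? Collections.emptyList() : listeners;
        }
    }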
28. Bad object arrays
- Empty: only nulls
  - Same as empty collections - delete or allocate lazily
- Length 0:
  - Replace with a singleton zero-length array (see the sketch after this list)
- Length 1:
  - Replace with an object?
- Single non-null element:
  - Replace with an object? Reduce length?
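
For the length-0 case, a single shared zero-length array can be reused safely because it can never be modified; a minimal sketch with a hypothetical Tags class:

    public class Tags {
        // One shared instance instead of a fresh zero-length array per object.
        private static final String[] NO_TAGS = new String[0];

        private String[] tags = NO_TAGS;

        void setTags(String[] newTags) {
            tags = (newTags == null || newTags.length == 0) ? NO_TAGS : newTags;
        }

        String[] getTags() { return tags; }
    }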
29. Memory Analysis and Reducing Footprint: concrete cases
30. A Monitoring app
- Scalability wasn't great
  - Some users had to increase -Xmx again and again
  - Unclear how to choose the correct size
- Big heap -> long full GC pauses -> frozen UI
- Some OOMs in small clusters
  - Not a scale problem - a bug?
31. Investigation, part 1
- Started with the smaller dumps with OOMs
  - Immediately found duplicate strings
  - One string repeated 1000s of times used 90% of the heap
  - A long SQL query saved in the DB many times, then retrieved
  - Adding two String.intern() calls solved the problem... almost
- Duplicate byte buffers in third-party library code
  - That still caused noticeable overhead
  - Ended up limiting the saved query size at a high level
  - Library/auto-generated code may be difficult to change
32. Investigation, part 2
- Next, looked into heap dumps with scalability problems
  - Both real and artificial benchmark setups
- Found all the usual issues:
  - String duplication
  - Empty or small (1-4 elements) collections
  - Tons of small objects (object headers used 31% of the heap!)
  - Boxed numbers
33. Standard solutions applied
- Duplicate strings: add more String.intern() calls
  - Easy: check the jxray report, find which data structures reference bad strings, edit the code
  - Non-trivial when a String object is mostly managed by auto-generated code
- Bad collections: less trivial
  - Sometimes it's enough to replace new HashMap() with new HashMap(expectedSize) (see the sizing note after this list)
  - Found ArrayLists that almost always have size 0/1
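
One caveat with the HashMap(expectedSize) fix: the constructor argument is the initial table capacity, and the map resizes once size exceeds capacity * 0.75. A small sketch of a sizing helper (the Maps2 name is hypothetical; Guava's Maps.newHashMapWithExpectedSize does a similar computation):

    import java.util.HashMap;
    import java.util.Map;

    public class Maps2 {
        // Capacity big enough to hold expectedSize entries at the default 0.75
        // load factor, so the table neither starts oversized nor resizes later.
        static <K, V> Map<K, V> newHashMap(int expectedSize) {
            return new HashMap<>((int) (expectedSize / 0.75f) + 1);
        }
    }

    // Usage: Maps2.newHashMap(3) ends up with an internal table of 8 slots
    // instead of the default 16.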
34. Dealing with mostly 0/1-size ArrayLists
- Replaced ArrayList list => Object valueOrArray (see the sketch after this list)
- Depending on the situation, valueOrArray may:
  - be null
  - point to a single object (element)
  - point to an array of objects (elements)
- 70 LOC hand-written for this
  - But the memory savings were worth the effort
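
A simplified sketch of the valueOrArray idea (the talk mentions 70 hand-written lines for the full version; this cut-down CompactList is hypothetical and, for brevity, assumes elements are never themselves Object[] instances):

    import java.util.Arrays;

    public class CompactList {
        // null         -> no elements
        // plain Object -> exactly one element
        // Object[]     -> two or more elements
        private Object valueOrArray;

        void add(Object element) {
            if (valueOrArray == null) {
                valueOrArray = element;                    // 0 -> 1: no array needed
            } else if (!(valueOrArray instanceof Object[])) {
                valueOrArray = new Object[] { valueOrArray, element };  // 1 -> 2
            } else {
                Object[] old = (Object[]) valueOrArray;
                Object[] grown = Arrays.copyOf(old, old.length + 1);
                grown[old.length] = element;
                valueOrArray = grown;
            }
        }

        int size() {
            if (valueOrArray == null) return 0;
            return valueOrArray instanceof Object[] ? ((Object[]) valueOrArray).length : 1;
        }
    }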
35. Dealing with non-string duplicate data
- The heap contained a lot of small objects:
    class TimestampAndData { long timestamp; long value; }
- Guessed that there may be many duplicates
  - E.g. many values are just 0/1
- Added a simple canonicalization cache (see the sketch after this list). Result:
  - 8x fewer TimestampAndData objects
  - 16% memory savings
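
A sketch of what such a canonicalization cache could look like; the equals()/hashCode() implementations and the unbounded synchronized HashMap are assumptions, not the app's actual code:

    import java.util.HashMap;
    import java.util.Map;

    final class TimestampAndData {
        final long timestamp;
        final long value;

        TimestampAndData(long timestamp, long value) {
            this.timestamp = timestamp;
            this.value = value;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof TimestampAndData)) return false;
            TimestampAndData other = (TimestampAndData) o;
            return other.timestamp == timestamp && other.value == value;
        }

        @Override public int hashCode() {
            return Long.hashCode(timestamp) * 31 + Long.hashCode(value);
        }

        // Returns one shared instance per distinct (timestamp, value) pair.
        // Use a WeakHashMap or a size cap if cached entries must be reclaimable.
        private static final Map<TimestampAndData, TimestampAndData> CACHE = new HashMap<>();

        static synchronized TimestampAndData canonicalize(TimestampAndData t) {
            TimestampAndData cached = CACHE.putIfAbsent(t, t);
            return cached != null ? cached : t;
        }
    }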
36. A Monitoring app: conclusions
- Fixing string/other data duplication, boxed numbers, and small/empty collections together saved 50%
  - Depends on the workload
  - Scalability improved: more data - higher savings
- Can still save more - replace standard HashMaps with more memory-friendly maps
  - HashMap$Entry objects may take a lot of memory!
37. Apache Hive: Hive Server 2 (HS2)
- HS2 may run out of memory
- Most scenarios involve 1000s of partitions and 10s of concurrent queries
- Not many heap dumps from real users
- Create a benchmark which reproduces the problem, measure where memory goes, optimize
38. Experimental setup
- Created a Hive table with 2000 small partitions
- Running 50 concurrent queries like "select count(myfield_1) from mytable" crashes an HS2 server with -Xmx500m
- More partitions or concurrent queries - more memory needed
39. HS2: Investigation
- Looked into the heap dump generated after OOM
- Not too many different problems:
  - Duplicate strings: 23%
  - java.util.Properties objects take 20% of memory
  - Various bad collections: 18%
- Apparently, many Properties are duplicates
  - A separate copy per partition per query
  - For a read-only partition, all per-query copies are identical
40. HS2: Fixing duplicate strings
- Some String.intern() calls added
- Some strings come from HDFS code
  - Need separate changes in Hadoop code
- Most interesting: String fields of java.net.URI
  - Private fields initialized internally - no access
  - But they can still be read/written using Java Reflection
  - Wrote a StringInternUtils.internStringsInURI(URI) method (see the sketch after this list)
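
A sketch of what such a reflection-based helper might look like. The list of private field names is an assumption here - those names are a JDK implementation detail not spelled out in the talk - and on JDK 9+ setAccessible() may require extra --add-opens flags:

    import java.lang.reflect.Field;
    import java.net.URI;

    public class UriInterner {
        // Assumed private String fields of java.net.URI; adjust for your JDK.
        private static final String[] FIELD_NAMES =
                { "string", "scheme", "host", "path", "query", "fragment" };

        static void internStringsInURI(URI uri) {
            for (String name : FIELD_NAMES) {
                try {
                    Field f = URI.class.getDeclaredField(name);
                    f.setAccessible(true);
                    String value = (String) f.get(uri);
                    if (value != null) {
                        f.set(uri, value.intern());   // replace with the canonical copy
                    }
                } catch (ReflectiveOperationException | RuntimeException e) {
                    // Best effort only: skip fields we cannot reach on this JDK.
                }
            }
        }
    }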
41. HS2: Fixing duplicate java.util.Properties objects
- Main problem: a Properties object is mutable
  - All PartitionDesc objects representing the same partition cannot simply share one canonicalized Properties object
  - If one is changed, the others should not change!
- Had to implement a new class (see the sketch after this list):
    class CopyOnFirstWriteProperties extends Properties {
        Properties interned;  // Used until/unless a mutator is called
        // Inherited table is filled and used after the first mutation
    }
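
A simplified, runnable sketch of the idea; the real Hive class overrides many more read and write methods, so treat this as an illustration of the copy-on-first-write pattern rather than the actual implementation:

    import java.util.Properties;

    public class CopyOnFirstWriteProperties extends Properties {
        // Shared, canonicalized Properties; serves all reads until the first mutation.
        private Properties interned;

        public CopyOnFirstWriteProperties(Properties interned) {
            this.interned = interned;
        }

        @Override
        public String getProperty(String key) {
            return interned != null ? interned.getProperty(key) : super.getProperty(key);
        }

        @Override
        public synchronized Object setProperty(String key, String value) {
            copyOnWrite();                 // first mutation: detach from the shared copy
            return super.setProperty(key, value);
        }

        private synchronized void copyOnWrite() {
            if (interned != null) {
                super.putAll(interned);    // fill the inherited table with a private copy
                interned = null;
            }
        }
    }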
42. HS2: Improvements based on a simple read-only benchmark
- Fixing duplicate strings and properties together saved 37% of memory
- Another 5% can be saved by deduplicating strings in HDFS
- Another 10% can be saved by dealing with bad collections
43. Investigating/fixing concrete apps: conclusions
- Any app can develop memory problems over time
  - Check and optimize periodically
- Many such problems are easy enough to fix
  - Intern strings, initialize collections lazily, etc.
- Duplication other than strings is frequent
  - More difficult to fix, but may be well worth the effort
  - Need to improve tooling to detect it automatically