Here’s one java interview question which seemes to come up a lot – when would you use an ArrayList, and when would you use a LinkedList?
Well, there are various reasons to prefer each, but one thing which probably wouldn’t leap to mind initially is memory efficiency. If you do consider memory efficiency, it might first seem that ArrayList should be more efficient – in a LinkedList, each node has to retain a pointer to the next (and previous) nodes, and this is not required when you store the data in an array.
However, things re not so simple. Firstly, there are tradeoffs made by the ArrayList implementation which tend to mean that the internal array is larger than it actually needs to be to hold the number of objects required by list.size(). If this wasn’t the case, a new larger internal array would have to be created whenever you add an item to the list, which is a very expensive operation if performed frequently. So the tradeoff is to initialize the internal array with extra empty elements. Only when these empty indexes are filled will the internal array be resized. When this occurs, the new size is generally calculated using the following formula – (oldCapacity * 3)/2 + 1 – which will most likely mean that once more there are a significant number of unused indexes. So all your ArrayList instances are probably holding on to a lot of wasted memory taken up by their sparsely populated internal arrays.
This might not seem all that significant at first glance, and it seems as if the effect of the empty indexes would be outweighed by the extra node references in a LinkedList. But there are some other more surprising drawbacks with an ArrayList:
Firstly, an ArrayList is generally created with an initial internal array length of 10. (You can change this by passing an alternative length into the constructor – but few people generally bother). Lets say in a hypothetical system, you have many lists which are empty, or have very small size. Perhaps, for example, your system has lots of observable classes (bean classes for instance), and each bean stores listener references in an ArrayList. Let’s imagine your system has ten thousand bean instances, and most beans have no listeners. Even with no listeners the ten thousand ArrayList instances would still have 100,000 empty references from their internal arrays. This is starting to look like a significant waste of memory. It’s certainly worth considering initializing your ArrayLists with a default capacity of zero.
But there is an even more significant drawback. That is, the internal array of an ArrayList never shrinks, unless you specifically request this to happen by calling the trimToSize() method. Perhaps surprisingly, this does not happen automatically if you call ArrayList clear(). This means that whenever you add a large number of items to an ArrayList, you may be losing memory forever – even if you subsequently remove the items again or clear the list. Unless you remember to call trimToSize() or the list itself becomes free for garbage collection, the empty references in the internal array will be retained for the life of the app. And how often have you seen trimToSize() called in java applications?
This is not a purely academic problem. An application I have been working on recently used ArrayList as a mechanism to buffer incoming messages for future processing. In general, a large burst of 100,000 messages was received on startup, after which the rate slowed to tens or hundreds. The initial messages would cause the ArrayList’s internal array to be sized to over 100,000 elements. Unfortunately, although this amount of buffering would never again be required, these 100,000 references effectively became leaked memory, since although ArrayList clear() was called when processing the messages, trimToSize() was never invoked. There was a separate buffer of this type for each message type, which cost several megabytes in lost memory overall. Using a LinkedList would not have caused this problem, since there is no internal array in a LinkedList implementation which can leak references.
This problem with sparse arrays is not limited to ArrayList. It is the case for Maps too, and even more worryingly so, since in many Map implementations there is not even a method to call to shrink the internal array. HashMap, for example, contains an array internally, sized to a power of two, which is expanded as required as the map grows. The internal array never shrinks if you remove items or clear the map. There is no trimToSize() method. The only way you can get around this, and free up the memory held by the map’s internal array, is to create a new Map instance (passing in any current mappings which are still required) and throw the old Map instance away. This is the approach I took with a recent performance enhancement to an application which uses IdentityHashMap heavily – and this resulted in a decrease in memory usage of over 10%, or 50MB in a 500MB application. This is a huge impact for a problem which I was largely unaware of, and has gone pretty much under the radar until now.
I think if there is one thing which could be changed in the jdk to improve this, it would be for the clear methods of collections to shrink the internal arrays back to their initial size by default.