Collections
Introduction#
The collections framework in java.util
provides a number of generic classes for sets of data with functionality that can’t be provided by regular arrays.
Collections framework contains interfaces for Collection<O>
, with main sub-interfaces List<O>
and Set<O>
, and mapping collection Map<K,V>
. Collections are the root interface and are being implemented by many other collection frameworks.
Remarks#
Collections are objects that can store collections of other objects inside of them. You can specify the type of data stored in a collection using Generics.
Collections generally use the java.util
or java.util.concurrent
namespaces.
Java 1.4.2 and below do not support generics. As such, you can not specify the type parameters that a collection contains. In addition to not having type safety, you must also use casts to get the correct type back from a collection.
In addition to Collection<E>
, there are multiple major types of collections, some of which have subtypes.
List<E>
is an ordered collection of objects. It is similar to an array, but does not define a size limit. Implementations will usually grow in size internally to accomodate new elements.Set<E>
is a collection of objects that does not allow duplicates.SortedSet<E>
is aSet<E>
that also specifies element ordering.
Map<K,V>
is a collection of key/value pairs.SortedMap<K,V>
is aMap<K,V>
that also specifies element ordering.
Java 5 adds in a new collection type:
Queue<E>
is a collection of elements meant to be processed in a specific order. The implementation specifies whether this is FIFO or LIFO. This obsoletes theStack
class.
Java 6 adds in some new subtypes of collections.
NavigableSet<E>
is aSet<E>
with special navigation methods built in.NavigableMap<K,V>
is aMap<K,V>
with special navigation methods built in.Deque<E>
is aQueue<E>
that can be read from either end.
Note that the above items are all interfaces. In order to use them, you must find the appropriate implementing classes, such as ArrayList
, HashSet
, HashMap
, or PriorityQueue
.
Each type of collection has multiple implementations that have different performance metrics and use cases.
Note that the Liskov Substitution Principle applies to the collection subtypes. That is, a SortedSet<E>
can be passed to a function expecting a Set<E>
. It is also useful to read about Bounded Parameters in the Generics section for more information on how to use collections with class inheritance.
If you want to create your own collections, it may be easier to inherit one of the abstract classes (such as AbstractList
) instead of implementing the interface.
Prior to 1.2, you had to use the following classes/interfaces instead:
Vector
instead ofArrayList
Dictionary
instead ofMap
. Note that Dictionary is also an abstract class rather than an interface.Hashtable
instead ofHashMap
These classes are obsolete and should not be used in modern code.
Declaring an ArrayList and adding objects
We can create an ArrayList
(following the List
interface):
List aListOfFruits = new ArrayList();
List<String> aListOfFruits = new ArrayList<String>();
List<String> aListOfFruits = new ArrayList<>();
Now, use the method add
to add a String
:
aListOfFruits.add("Melon");
aListOfFruits.add("Strawberry");
In the above example, the ArrayList
will contain the String
“Melon” at index 0 and the String
“Strawberry” at index 1.
Also we can add multiple elements with addAll(Collection<? extends E> c)
method
List<String> aListOfFruitsAndVeggies = new ArrayList<String>();
aListOfFruitsAndVeggies.add("Onion");
aListOfFruitsAndVeggies.addAll(aListOfFruits);
Now “Onion” is placed at 0 index in aListOfFruitsAndVeggies
, “Melon” is at index 1 and “Strawberry” is at index 2.
Constructing collections from existing data
Standard Collections
Java Collections framework
A simple way to construct a List
from individual data values is to use java.utils.Arrays
method Arrays.asList
:
List<String> data = Arrays.asList("ab", "bc", "cd", "ab", "bc", "cd");
All standard collection implementations provide constructors that take another collection as an argument adding all elements to the new collection at the time of construction:
List<String> list = new ArrayList<>(data); // will add data as is
Set<String> set1 = new HashSet<>(data); // will add data keeping only unique values
SortedSet<String> set2 = new TreeSet<>(data); // will add data keeping unique values and sorting
Set<String> set3 = new LinkedHashSet<>(data); // will add data keeping only unique values and preserving the original order
Google Guava Collections framework
Another great framework is Google Guava
that is amazing utility class (providing convenience static methods) for construction of different types of standard collections Lists
and Sets
:
import com.google.common.collect.Lists;
import com.google.common.collect.Sets;
...
List<String> list1 = Lists.newArrayList("ab", "bc", "cd");
List<String> list2 = Lists.newArrayList(data);
Set<String> set4 = Sets.newHashSet(data);
SortedSet<String> set5 = Sets.newTreeSet("bc", "cd", "ab", "bc", "cd");
Mapping Collections
Java Collections framework
Similarly for maps, given a Map<String, Object> map
a new map can be constructed with all elements as follows:
Map<String, Object> map1 = new HashMap<>(map);
SortedMap<String, Object> map2 = new TreeMap<>(map);
Apache Commons Collections framework
Using Apache Commons
you can create Map using array in ArrayUtils.toMap
as well as MapUtils.toMap
:
import org.apache.commons.lang3.ArrayUtils;
...
// Taken from org.apache.commons.lang.ArrayUtils#toMap JavaDoc
// Create a Map mapping colors.
Map colorMap = MapUtils.toMap(new String[][] {{
{"RED", "#FF0000"},
{"GREEN", "#00FF00"},
{"BLUE", "#0000FF"}});
Each element of the array must be either a Map.Entry or an Array, containing at least two elements, where the first element is used as key and the second as value.
Google Guava Collections framework
Utility class from Google Guava
framework is named Maps
:
import com.google.common.collect.Maps;
...
void howToCreateMapsMethod(Function<? super K,V> valueFunction,
Iterable<K> keys1,
Set<K> keys2,
SortedSet<K> keys3) {
ImmutableMap<K, V> map1 = toMap(keys1, valueFunction); // Immutable copy
Map<K, V> map2 = asMap(keys2, valueFunction); // Live Map view
SortedMap<K, V> map3 = toMap(keys3, valueFunction); // Live Map view
}
Using Stream
,
Stream.of("xyz", "abc").collect(Collectors.toList());
or
Arrays.stream("xyz", "abc").collect(Collectors.toList());
Join lists
Following ways can be used for joining lists without modifying source list(s).
First approach. Has more lines but easy to understand
List<String> newList = new ArrayList<String>();
newList.addAll(listOne);
newList.addAll(listTwo);
Second approach. Has one less line but less readable.
List<String> newList = new ArrayList<String>(listOne);
newList.addAll(listTwo);
Third approach. Requires third party Apache commons-collections library.
ListUtils.union(listOne,listTwo);
Using Streams the same can be achieved by
List<String> newList = Stream.concat(listOne.stream(), listTwo.stream()).collect(Collectors.toList());
References. Interface List
Removing items from a List within a loop
It is tricky to remove items from a list while within a loop, this is due to the fact that the index and length of the list gets changed.
Given the following list, here are some examples that will give an unexpected result and some that will give the correct result.
List<String> fruits = new ArrayList<String>();
fruits.add("Apple");
fruits.add("Banana");
fruits.add("Strawberry");
INCORRECT
Removing in iteration of for
statement Skips “Banana”:
The code sample will only print Apple
and Strawberry
. Banana
is skipped because it moves to index 0
once Apple
is deleted, but at the same time i
gets incremented to 1
.
for (int i = 0; i < fruits.size(); i++) {
System.out.println (fruits.get(i));
if ("Apple".equals(fruits.get(i))) {
fruits.remove(i);
}
}
Removing in the enhanced for
statement Throws Exception:
Because of iterating over collection and modifying it at the same time.
Throws:
java.util.ConcurrentModificationException
for (String fruit : fruits) {
System.out.println(fruit);
if ("Apple".equals(fruit)) {
fruits.remove(fruit);
}
}
CORRECT
Removing in while loop using an Iterator
Iterator<String> fruitIterator = fruits.iterator();
while(fruitIterator.hasNext()) {
String fruit = fruitIterator.next();
System.out.println(fruit);
if ("Apple".equals(fruit)) {
fruitIterator.remove();
}
}
The Iterator
interface has a remove()
method built in just for this case. However, this method is marked as “optional” in the documentation, and it might throw an UnsupportedOperationException
.
Throws:
UnsupportedOperationException - if the remove operation is not supported by this iterator
Therefore, it is advisable to check the documentation to make sure this operation is supported (in practice, unless the collection is an immutable one obtained through a 3rd party library or the use of one of the Collections.unmodifiable...()
method, the operation is almost always supported).
While using an Iterator
a ConcurrentModificationException
is thrown when the modCount
of the List
is changed from when the Iterator
was created. This could have happened in the same thread or in a multi-threaded application sharing the same list.
A modCount
is an int
variable which counts the number of times this list has been structurally modified. A structural change essentially means an add()
or remove()
operation being invoked on Collection
object (changes made by Iterator
are not counted). When the Iterator
is created, it stores this modCount
and on every iteration of the List
checks if the current modCount
is same as and when the Iterator
was created. If there is a change in the modCount
value it throws a ConcurrentModificationException
.
Hence for the above-declared list, an operation like below will not throw any exception:
Iterator<String> fruitIterator = fruits.iterator();
fruits.set(0, "Watermelon");
while(fruitIterator.hasNext()){
System.out.println(fruitIterator.next());
}
But adding a new element to the List
after initializing an Iterator
will throw a ConcurrentModificationException
:
Iterator<String> fruitIterator = fruits.iterator();
fruits.add("Watermelon");
while(fruitIterator.hasNext()){
System.out.println(fruitIterator.next()); //ConcurrentModificationException here
}
Iterating backwards
for (int i = (fruits.size() - 1); i >=0; i--) {
System.out.println (fruits.get(i));
if ("Apple".equals(fruits.get(i))) {
fruits.remove(i);
}
}
This does not skip anything. The downside of this approach is that the output is reverse. However, in most cases where you remove items that will not matter. You should never do this with LinkedList
.
Iterating forward, adjusting the loop index
for (int i = 0; i < fruits.size(); i++) {
System.out.println (fruits.get(i));
if ("Apple".equals(fruits.get(i))) {
fruits.remove(i);
i--;
}
}
This does not skip anything. When the i
th element is removed from the List
, the element originally positioned at index i+1
becomes the new i
th element. Therefore, the loop can decrement i
in order for the next iteration to process the next element, without skipping.
Using a “should-be-removed” list
ArrayList shouldBeRemoved = new ArrayList();
for (String str : currentArrayList) {
if (condition) {
shouldBeRemoved.add(str);
}
}
currentArrayList.removeAll(shouldBeRemoved);
This solution enables the developer to check if the correct elements are removed in a cleaner way.
In Java 8 the following alternatives are possible. These are cleaner and more straight forward if the removing does not have to happen in a loop.
Filtering a Stream
A List
can be streamed and filtered. A proper filter can be used to remove all undesired elements.
List<String> filteredList =
fruits.stream().filter(p -> !"Apple".equals(p)).collect(Collectors.toList());
Note that unlike all the other examples here, this example produces a new List
instance and keeps the original List
unchanged.
Using removeIf
Saves the overhead of constructing a stream if all that is needed is to remove a set of items.
fruits.removeIf(p -> "Apple".equals(p));
Unmodifiable Collection
Sometimes it’s not a good practice expose an internal collection since it can lead to a malicious code vulnerability due to it’s mutable characteristic. In order to provide “read-only” collections java provides its unmodifiable versions.
An unmodifiable collection is often a copy of a modifiable collection which guarantees that the collection itself cannot be altered. Attempts to modify it will result in an UnsupportedOperationException exception.
It is important to notice that objects which are present inside the collection can still be altered.
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
public class MyPojoClass {
private List<Integer> intList = new ArrayList<>();
public void addValueToIntList(Integer value){
intList.add(value);
}
public List<Integer> getIntList() {
return Collections.unmodifiableList(intList);
}
}
The following attempt to modify an unmodifiable collection will throw an exception:
import java.util.List;
public class App {
public static void main(String[] args) {
MyPojoClass pojo = new MyPojoClass();
pojo.addValueToIntList(42);
List<Integer> list = pojo.getIntList();
list.add(69);
}
}
output:
Exception in thread "main" java.lang.UnsupportedOperationException
at java.util.Collections$UnmodifiableCollection.add(Collections.java:1055)
at App.main(App.java:12)
Iterating over Collections
Iterating over List
List<String> names = new ArrayList<>(Arrays.asList("Clementine", "Duran", "Mike"));
names.forEach(System.out::println);
If we need parallelism use
names.parallelStream().forEach(System.out::println);
for (String name : names) {
System.out.println(name);
}
for (int i = 0; i < names.size(); i++) {
System.out.println(names.get(i));
}
//Creates ListIterator which supports both forward as well as backward traversel
ListIterator<String> listIterator = names.listIterator();
//Iterates list in forward direction
while(listIterator.hasNext()){
System.out.println(listIterator.next());
}
//Iterates list in backward direction once reaches the last element from above iterator in forward direction
while(listIterator.hasPrevious()){
System.out.println(listIterator.previous());
}
Iterating over Set
Set<String> names = new HashSet<>(Arrays.asList("Clementine", "Duran", "Mike"));
names.forEach(System.out::println);
for (Iterator<String> iterator = names.iterator(); iterator.hasNext(); ) {
System.out.println(iterator.next());
}
for (String name : names) {
System.out.println(name);
}
Iterator iterator = names.iterator();
while (iterator.hasNext()) {
System.out.println(iterator.next());
}
Iterating over Map
Map<Integer, String> names = new HashMap<>();
names.put(1, "Clementine");
names.put(2, "Duran");
names.put(3, "Mike");
names.forEach((key, value) -> System.out.println("Key: " + key + " Value: " + value));
for (Map.Entry<Integer, String> entry : names.entrySet()) {
System.out.println(entry.getKey());
System.out.println(entry.getValue());
}
// Iterating over only keys
for (Integer key : names.keySet()) {
System.out.println(key);
}
// Iterating over only values
for (String value : names.values()) {
System.out.println(value);
}
Iterator entries = names.entrySet().iterator();
while (entries.hasNext()) {
Map.Entry entry = (Map.Entry) entries.next();
System.out.println(entry.getKey());
System.out.println(entry.getValue());
}
Immutable Empty Collections
Sometimes it is appropriate to use an immutable empty collection. The Collections
class provides methods to get such collections in an efficient way:
List<String> anEmptyList = Collections.emptyList();
Map<Integer, Date> anEmptyMap = Collections.emptyMap();
Set<Number> anEmptySet = Collections.emptySet();
These methods are generic and will automatically convert the returned collection to the type it is assigned to. That is, an invocation of e.g. emptyList()
can be assigned to any type of List
and likewise for emptySet()
and emptyMap()
.
The collections returned by these methods are immutable in that they will throw UnsupportedOperationException
if you attempt to call methods which would change their contents (add
, put
, etc.). These collections are primarily useful as substitutes for empty method results or other default values, instead of using null
or creating objects with new
.
Collections and Primitive Values
Collections in Java only work for objects. I.e. there is no Map<int, int>
in Java. Instead, primitive values need to be boxed into objects, as in Map<Integer, Integer>
. Java auto-boxing will enable transparent use of these collections:
Map<Integer, Integer> map = new HashMap<>();
map.put(1, 17); // Automatic boxing of int to Integer objects
int a = map.get(1); // Automatic unboxing.
Unfortunately, the overhead of this is substantial. A HashMap<Integer, Integer>
will require about 72 bytes per entry (e.g. on 64-bit JVM with compressed pointers, and assuming integers larger than 256, and assuming 50% load of the map). Because the actual data is only 8 bytes, this yields a massive overhead. Furthermore, it requires two level of indirection (Map -> Entry -> Value) it is unnecessarily slow.
There exist several libraries with optimized collections for primitive data types (that require only ~16 bytes per entry at 50% load, i.e. 4x less memory, and one level of indirection less), that can yield substantial performance benefits when using large collections of primitive values in Java.
Removing matching items from Lists using Iterator.
Above I noticed an example to remove items from a List within a Loop and I thought of another example that may come in handy this time using the Iterator
interface.
This is a demonstration of a trick that might come in handy when dealing with duplicate items in lists that you want to get rid of.
Note: This is only adding on to the Removing items from a List within a loop example:
So let’s define our lists as usual
String[] names = {"James","Smith","Sonny","Huckle","Berry","Finn","Allan"};
List<String> nameList = new ArrayList<>();
//Create a List from an Array
nameList.addAll(Arrays.asList(names));
String[] removeNames = {"Sonny","Huckle","Berry"};
List<String> removeNameList = new ArrayList<>();
//Create a List from an Array
removeNameList.addAll(Arrays.asList(removeNames));
The following method takes in two Collection objects and performs the magic of removing the elements in our removeNameList
that match with elements in nameList
.
private static void removeNames(Collection<String> collection1, Collection<String> collection2) {
//get Iterator.
Iterator<String> iterator = collection1.iterator();
//Loop while collection has items
while(iterator.hasNext()){
if (collection2.contains(iterator.next()))
iterator.remove(); //remove the current Name or Item
}
}
Calling the method and passing in the nameList
and the removeNameList
as follows removeNames(nameList,removeNameList);
Will produce the following output:
Array List before removing names:
James Smith Sonny Huckle Berry Finn Allan
Array List after removing names:
James Smith Finn Allan
A simple neat use for Collections that may come in handy to remove repeating elements within lists.
Creating your own Iterable structure for use with Iterator or for-each loop.
To ensure that our collection can be iterated using iterator or for-each loop, we have to take care of following steps:
- The stuff we want to iterate upon has to be
Iterable
and expose
iterator()
.
2. Design a java.util.Iterator
by overriding hasNext()
, next()
and remove()
.
I have added a simple generic linked list implementation below that uses above entities to make the linked list iterable.
package org.algorithms.linkedlist;
import java.util.Iterator;
import java.util.NoSuchElementException;
public class LinkedList<T> implements Iterable<T> {
Node<T> head, current;
private static class Node<T> {
T data;
Node<T> next;
Node(T data) {
this.data = data;
}
}
public LinkedList(T data) {
head = new Node<>(data);
}
public Iterator<T> iterator() {
return new LinkedListIterator();
}
private class LinkedListIterator implements Iterator<T> {
Node<T> node = head;
@Override
public boolean hasNext() {
return node != null;
}
@Override
public T next() {
if (!hasNext())
throw new NoSuchElementException();
Node<T> prevNode = node;
node = node.next;
return prevNode.data;
}
@Override
public void remove() {
throw new UnsupportedOperationException("Removal logic not implemented.");
}
}
public void add(T data) {
Node current = head;
while (current.next != null)
current = current.next;
current.next = new Node<>(data);
}
}
class App {
public static void main(String[] args) {
LinkedList<Integer> list = new LinkedList<>(1);
list.add(2);
list.add(4);
list.add(3);
//Test #1
System.out.println("using Iterator:");
Iterator<Integer> itr = list.iterator();
while (itr.hasNext()) {
Integer i = itr.next();
System.out.print(i + " ");
}
//Test #2
System.out.println("\n\nusing for-each:");
for (Integer data : list) {
System.out.print(data + " ");
}
}
}
Output
using Iterator:
1 2 4 3
using for-each:
1 2 4 3
This will run in Java 7+. You can make it run on Java 5 and Java 6 also by substituting:
LinkedList<Integer> list = new LinkedList<>(1);
with
LinkedList<Integer> list = new LinkedList<Integer>(1);
or just any other version by incorporating the compatible changes.
Pitfall: concurrent modification exceptions
This exception occurs when a collection is modified while iterating over it using methods other than those provided by the iterator object. For example, we have a list of hats and we want to remove all those that have ear flaps:
List<IHat> hats = new ArrayList<>();
hats.add(new Ushanka()); // that one has ear flaps
hats.add(new Fedora());
hats.add(new Sombrero());
for (IHat hat : hats) {
if (hat.hasEarFlaps()) {
hats.remove(hat);
}
}
If we run this code, ConcurrentModificationException will be raised since the code modifies the collection while iterating it. The same exception may occur if one of the multiple threads working with the same list is trying to modify the collection while others iterate over it. Concurrent modification of collections in multiple threads is a natural thing, but should be treated with usual tools from the concurrent programming toolbox such as synchronization locks, special collections adopted for concurrent modification, modifying the cloned collection from initial etc.
Sub Collections
List subList(int fromIndex, int toIndex)
Here fromIndex is inclusive and toIndex is exclusive.
List list = new ArrayList();
List list1 = list.subList(fromIndex,toIndex);
- If the list doesn’t exist in the give range, it throws IndexOutofBoundException.
- What ever changes made on the list1 will impact the same changes in the list.This is called backed collections.
- If the fromnIndex is greater than the toIndex (fromIndex > toIndex) it throws IllegalArgumentException.
Example:
List<String> list = new ArrayList<String>();
List<String> list = new ArrayList<String>();
list.add("Hello1");
list.add("Hello2");
System.out.println("Before Sublist "+list);
List<String> list2 = list.subList(0, 1);
list2.add("Hello3");
System.out.println("After sublist changes "+list);
Output:
Before Sublist [Hello1, Hello2]
After sublist changes [Hello1, Hello3, Hello2]
Set subSet(fromIndex,toIndex)
Here fromIndex is inclusive and toIndex is exclusive.
Set set = new TreeSet();
Set set1 = set.subSet(fromIndex,toIndex);
The returned set will throw an IllegalArgumentException on an attempt to insert an element outside its range.
Map subMap(fromKey,toKey)
fromKey is inclusive and toKey is exclusive
Map map = new TreeMap();
Map map1 = map.get(fromKey,toKey);
If fromKey is greater than toKey or if this map itself has a restricted range, and fromKey or toKey lies outside the bounds of the range then it throws IllegalArgumentException.
All the collections support backed collections means changes made on the sub collection will have same change on the main collection.