Core Java Part 2 : Collections Framework (HashMap & HashSet)
- ✅ 1. HashMap
- ⚙️ 1.1 Internal Structure (Java 8+)
- 🔍 1.2 How Does
HashMapWork?- 🧪 1.2.1 How are keys and values stored :
put()operation? - 🪣 1.2.1 What is a “Bucket” in
HashMap(In Depth Explaination) ? - 🧪 1.2.2 What about
get()operation?- 🔍 Internal Steps of
get(key)inHashMap(In Depth Explanation) - Step 1️⃣. Compute the Hash
- Step 2️⃣. Find the Bucket Index
- Step 3️⃣. Look Inside That Bucket
- Step 4️⃣. Traverse the Chain (Linked List or Tree)
- 🚨 Important: Uses
hashCode()andequals() - 🌳 If It’s a TreeNode
- 🧠 Visual Example
- 🧮 Time Complexity for
get()operation - 🧪 Summary of
get(key)
- 🔍 Internal Steps of
- 🧪 1.2.1 How are keys and values stored :
- 🔄 1.3 Resize / Rehash
- ⏱️ 1.4 Operational Time Complexity in HashMap
- ⚠️ 1.5 Important Notes
- 📌 1.7 Summary
- ✅ 2. HashSet
✅ 1. HashMap
HashMap<K, V>is a key-value pair data structure that provides:
- Fast lookup (O(1)) for get/put operations on average
- Allows null keys and values (but only one null key)
- Unordered — doesn’t preserve insertion order
- Not synchronized
⚙️ 1.1 Internal Structure (Java 8+)
- Internally,
HashMapis an array of buckets:
transient Node<K, V>[] table;
Each bucket is a linked list (or a balanced tree if it gets too long).
- The Node structure:
static class Node<K, V> implements Map.Entry<K, V> {
final int hash;
final K key;
V value;
Node<K, V> next;
}
🔍 1.2 How Does HashMap Work?
🧪 1.2.1 How are keys and values stored : put() operation?
map.put("language", "Java");
🧠 Internally, following steps are performed:
🔸 a. Hashing the key:
int hash = hash("language"); // spreads hash bits
The hash function ensures even distribution and reduces collisions.
🔸 b. Bucket Index:
int index = (n - 1) & hash; // where n = table.length
This computes which bucket to use for storing this <key, value> node.
🔸 c. Insert Node:
- If bucket is empty → just insert node
-
If bucket has nodes:
- Check keys using
.equals() - If same key → overwrite value
- If different key → append to list or treeify (see below)
- Check keys using
🌳 Treeification (Java 8 feature)
If a bucket has >8 nodes, and total map size is >=64, the linked list becomes a balanced Red-Black Tree to avoid O(n) lookup time.
Node → TreeNode (extends Node)
➡️ This makes worst-case lookup time O(log n), not O(n).
🪣 1.2.1 What is a “Bucket” in HashMap (In Depth Explaination) ?
Think of a bucket as a container at a specific index in an internal array.
Each element we add to a HashMap ends up in one of these buckets based on its hash value.
Internally, HashMap has:
Node<K, V>[] table; // an array of buckets
Each index in this array is a bucket.
✅ Let’s Take an Example
HashMap<String, String> map = new HashMap<>();
map.put("dog", "bark");
map.put("cat", "meow");
map.put("lion", "roar");
Here’s how Java stores these:
Step 1️⃣: Hash the Key
For each key ("dog", "cat", "lion"), Java generates a hash code:
int hash = key.hashCode();
Then it calculates the bucket index like this:
int index = (hash & (table.length - 1));
So every key is mapped to a particular index in the array.
Step 2️⃣: Find the Right Bucket
Suppose table.length = 16. After applying the above formula:
| Key | hashCode | index | Bucket |
|---|---|---|---|
| “dog” | 99532 | 4 | table[4] |
| “cat” | 98239 | 9 | table[9] |
| “lion” | 183532 | 4 | table[4] |
Uh-oh! Collision!
Notice both "dog" and "lion" go to the same bucket table[4].
This is where chaining comes in.
🔗 Chaining in Buckets
If multiple entries go into the same bucket, they are chained as a linked list.
table[4] → Node("dog", "bark") → Node("lion", "roar")
Each node holds:
keyvaluenextpointer (for the next node in the same bucket)
🤖 So What Happens Internally?
Here’s what Java does on put():
- Calculates the index (bucket) from the key’s hash
- Checks if any node exists at that index
-
If yes:
- Compares keys using
equals() - If same → updates value
- If different → appends new node
- Compares keys using
-
If no:
- Simply inserts a new node at that bucket
📦 Visual Representation
Index | Bucket Contents
------+---------------------------
0 | null
1 | null
2 | null
3 | null
4 | [dog → bark] → [lion → roar]
5 | null
6 | null
7 | null
8 | null
9 | [cat → meow]
...
🧠 Summary
- A bucket is just a slot (index) in an array.
- Each bucket holds a linked list (or tree) of entries that hash to that index.
- If multiple keys hash to the same bucket → Java chains them using
Node.next. - Java 8+: if more than 8 entries collide in one bucket → switches to a Red-Black Tree for faster lookup.
🧪 1.2.2 What about get() operation?
map.get("language");
- Hash the key
- Locate bucket
- Search the bucket (linear or binary search depending on list/tree)
- Compare keys using
.equals() - Return value if found
🔍 Internal Steps of get(key) in HashMap (In Depth Explanation)
Step 1️⃣. Compute the Hash
int hash = hash("lion"); // Java computes a spread hash
Java uses
Objects.hashCode(key)+ a bit-spreading method to get a uniform distribution.
Step 2️⃣. Find the Bucket Index
int index = (n - 1) & hash; // n = table.length
This gives us the bucket index (say, index = 4).
Step 3️⃣. Look Inside That Bucket
Node<K,V> first = table[index];
That first node may be:
[dog → bark] → [lion → roar] → null
Step 4️⃣. Traverse the Chain (Linked List or Tree)
Java traverses the nodes one by one, comparing each node’s key:
if (first.hash == hash && (first.key.equals("lion"))) {
return first.value;
}
If not, go to first.next, and repeat.
"dog"≠"lion"→ skip"lion"=="lion"→ return"roar"
🚨 Important: Uses hashCode() and equals()
To identify the correct key, Java checks:
hash→ for fast lookupequals()→ for exact match
🌳 If It’s a TreeNode
If the bucket has been treeified (i.e. has a Red-Black Tree instead of a list), then it uses binary search on the tree nodes:
TreeNode.getTreeNode(hash, key)
Which is O(log n) instead of O(n)
🧠 Visual Example
Say the internal table looks like this:
table[4] →
┌─────────────────────────────┐
│ [hash1, "dog", "bark"] │
│ ↓ │
│ [hash2, "lion", "roar"] │
└─────────────────────────────┘
Doing map.get("lion") will:
- Go to
table[4] - First node:
"dog"≠"lion"→ move to next - Second node:
"lion"=="lion"→ return"roar"
🧮 Time Complexity for get() operation
| Case | Time |
|---|---|
| Best/Average | O(1) |
| Worst (list) | O(n) |
| Worst (tree) | O(log n) |
Java uses treeification to avoid O(n) degradation due to too many collisions.
🧪 Summary of get(key)
| Step | What Happens |
|---|---|
hash(key) |
Java generates a spread hash |
| Index calc | (n - 1) & hash gives bucket index |
| Search | Traverses linked list or tree at that bucket |
| Match | Uses .equals() and hash to find the exact key match |
| Return | Returns value if found, null if not |
🔄 1.3 Resize / Rehash
If the map exceeds its load factor (default 0.75), it resizes:
newCapacity = oldCapacity * 2;
All entries are rehashed and redistributed across the new table.
➡️ Resizing is expensive: O(n) ✅ But it happens infrequently
⏱️ 1.4 Operational Time Complexity in HashMap
| Operation | Best / Average | Worst Case (before Java 8) | Worst Case (Java 8+)(Treeification Introduced) |
|---|---|---|---|
put() |
O(1) | O(n) | O(log n) |
get() |
O(1) | O(n) | O(log n) |
remove() |
O(1) | O(n) | O(log n) |
⚠️ 1.5 Important Notes
- HashMap Allows one null key, multiple null values
- Not thread-safe (use
ConcurrentHashMapfor concurrency) - Uses
equals()andhashCode()to manage key uniqueness
####🧠 1.6 When to Use HashMap
✅ Fast access by key ✅ No need for order ✅ Frequent lookups ❌ Avoid when:
- Need ordering → use
LinkedHashMaporTreeMap - Need thread safety → use
ConcurrentHashMap
📌 1.7 Summary
| Feature | Value |
|---|---|
| Backed by | Array + Linked List / Tree |
| Lookup Time | O(1) avg, O(log n) worst (Java 8+) |
| Allows null key | ✅ Yes (only one) |
| Thread-safe | ❌ No |
| Maintains order | ❌ No (use LinkedHashMap if needed) |
✅ 2. HashSet
🔷 2.1 What is a HashSet?
A HashSet is a collection that:
- Stores unique elements
- Has no guaranteed order
- Is backed internally by a
HashMap
🔧 2.2 Internal Mechanics
Internally, a HashSet<E> is implemented as:
private transient HashMap<E, Object> map;
In other words:
HashSetuses aHashMap, but only the keys matter. The values are just dummy placeholders.
private static final Object PRESENT = new Object();
- When we call:
set.add("apple");
Java does this internally:
map.put("apple", PRESENT);
So every item in the HashSet becomes a key in the backing HashMap.
🔁 Example
Set<String> set = new HashSet<>();
set.add("apple");
set.add("banana");
set.add("apple"); // duplicate
Internally behaves like:
map.put("apple", PRESENT);
map.put("banana", PRESENT);
// 2nd put("apple", ...) will just overwrite, not add
The second "apple" is ignored because HashMap does not allow duplicate keys.
✅ Uniqueness Guarantee
Uniqueness in HashSet depends on:
- Correct
hashCode()andequals()in the key type - If these are improperly implemented, we may get duplicates or incorrect behavior
📦 Storage
- Elements are stored as keys in the backing
HashMap - The dummy object
PRESENTtakes up minimal memory - Hash collisions are handled just like in a normal
HashMap(linked list / tree structure)
🚀 2.3 Performance
| Operation | Big-O Time |
|---|---|
add() |
O(1) avg |
contains() |
O(1) avg |
remove() |
O(1) avg |
In worst cases (lots of collisions): O(n)
📌 2.4 Summary
HashSetis a thin wrapper overHashMap- Every element is stored as a key in the map
- Ensures uniqueness via
hashCode()andequals() - Fast access, but unordered
Next post in this series : Core Java Part 2 : Collections Framework (TreeMap & TreeSet)