- Introduction
- Internal Structure
- Hashing Mechanism
- Collision Resolution
- Put Operation
- Get Operation
- Resizing and Rehashing
- Performance Analysis
HashMap is a hash table-based implementation of the Map interface in Java. It stores key-value pairs and provides constant-time performance O(1) for basic operations (get and put), assuming the hash function disperses elements properly.
- Allows null: One null key and multiple null values
- Not synchronized: Not thread-safe (use ConcurrentHashMap for thread safety)
- No ordering: Does not maintain insertion order (use LinkedHashMap)
- Initial capacity: Default 16
- Load factor: Default 0.75
HashMap internally uses an array of Node<K,V> objects. Each node represents a key-value pair.
static class Node<K,V> implements Map.Entry<K,V> {
final int hash; // Cached hash value
final K key; // The key
V value; // The value
Node<K,V> next; // Reference to next node (for collision handling)
}HashMap Internal Structure:
┌─────────────────────────────────────────┐
│ Node<K,V>[] table │
├─────────────────────────────────────────┤
│ Index 0 │ → [Node] → null │
├──────────┼──────────────────────────────┤
│ Index 1 │ → [Node] → [Node] → null │ (Collision chain)
├──────────┼──────────────────────────────┤
│ Index 2 │ → null │
├──────────┼──────────────────────────────┤
│ Index 3 │ → [Node] → null │
├──────────┼──────────────────────────────┤
│ ... │ ... │
├──────────┼──────────────────────────────┤
│ Index 15 │ → [Node] → [Node] → [Node] │ (Longer chain)
└──────────┴──────────────────────────────┘
Each Node contains:
┌──────────────────┐
│ hash: int │
│ key: K │
│ value: V │
│ next: Node<K,V> │
└──────────────────┘
HashMap uses a two-step hashing process to determine the bucket index.
// Get hashCode from the key
int hashCode = key.hashCode();
// Spread bits (Java 8+)
static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}// Calculate bucket index
int index = (n - 1) & hash; // n is the array length (always power of 2)Hashing Process:
─────────────────────────────────────────────────────────────
Key: "Apple"
│
▼
┌─────────────────────┐
│ key.hashCode() │ → hashCode = 63476538
└─────────────────────┘
│
▼
┌─────────────────────┐
│ hash(key) │ → Spread bits to reduce collision
│ h ^ (h >>> 16) │ → finalHash = 63479898
└─────────────────────┘
│
▼
┌─────────────────────┐
│ (n-1) & hash │ → index = 10 (if n=16, then 15 & hash)
└─────────────────────┘
│
▼
Place in bucket[10]
Why spread bits?
Original: 1111 0001 1100 0101 0110 1010 (high 16 bits)
0011 1000 1101 1010 (low 16 bits)
XOR: ──────────────────────────────
Result: Better distribution in lower bits
When two keys hash to the same index, HashMap uses separate chaining (linked list) and tree-based collision resolution (Java 8+).
Collision Resolution Evolution:
Before Java 8:
─────────────────────────────────────────
Bucket[5] → [Node1] → [Node2] → [Node3] → null
(Linked List only)
Java 8+ (Treeification):
─────────────────────────────────────────
If chain length > 8 AND table size >= 64:
Convert to Red-Black Tree
Bucket[5] → TreeNode (Root)
├── Left subtree
└── Right subtree
Benefits:
- Linked List: O(n) worst case
- Red-Black Tree: O(log n) worst case
Example: Keys "cat", "dog", "rat" all hash to index 3
┌────────────────────────────────────────────────────────┐
│ Index 3 │
└────────────────────────────────────────────────────────┘
│
▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ hash: 12345 │ │ hash: 12389 │ │ hash: 12367 │
│ key: "cat" │ ───► │ key: "dog" │ ───► │ key: "rat" │
│ value: 1 │ │ value: 2 │ │ value: 3 │
│ next: ───────────┘ │ next: ───────────┘ │ next: null │
└──────────────────┘ └──────────────────┘ └──────────────────┘
The put operation inserts or updates a key-value pair.
put(K key, V value) Flow:
──────────────────────────────────────────
1. Calculate hash
├── hash = hash(key)
└── index = (n-1) & hash
2. Check if bucket is empty
├── Yes: Create new node → Done
└── No: Handle collision
3. Collision handling
├── Key exists?
│ ├── Yes: Update value → Return old value
│ └── No: Continue
│
├── Is TreeNode?
│ ├── Yes: Insert in tree
│ └── No: Traverse linked list
│
└── Add to end of chain
└── Check if treeify needed (size > 8)
4. Increment size counter
5. Check if resize needed
└── if (size > threshold) resize()
Example: map.put("key1", "value1")
Step 1: Hash Calculation
────────────────────────
hash("key1") → 1234
index = 1234 & 15 = 10
Step 2: Check Bucket[10]
────────────────────────
┌─────────────────┐
│ Bucket[10]: null │ ← Empty
└─────────────────┘
Step 3: Insert New Node
────────────────────────
┌─────────────────┐
│ Bucket[10]: │
└─────────────────┘
│
▼
┌──────────────────┐
│ hash: 1234 │
│ key: "key1" │
│ value: "value1" │
│ next: null │
└──────────────────┘
Pseudocode:
───────────
if (bucket[index] == null) {
bucket[index] = new Node(hash, key, value, null);
} else {
// Handle collision (traverse chain or tree)
for (Node e = bucket[index]; e != null; e = e.next) {
if (e.hash == hash && e.key.equals(key)) {
e.value = value; // Update
return;
}
}
// Add to end of chain
addToChain(bucket[index], new Node(hash, key, value));
}
Retrieves the value associated with a key.
get(K key) Flow:
──────────────────────────────────────────
1. Calculate hash
├── hash = hash(key)
└── index = (n-1) & hash
2. Get first node at bucket[index]
└── if (first == null) return null
3. Check first node
└── if (first.key.equals(key)) return first.value
4. Traverse chain/tree
├── Is TreeNode?
│ └── Search in Red-Black Tree: O(log n)
│
└── Is LinkedList?
└── Linear search: O(n)
for (Node e = first.next; e != null; e = e.next)
if (e.key.equals(key)) return e.value
5. Not found
└── return null
Example: map.get("key2")
Step 1: Calculate Index
────────────────────────
hash("key2") → 5678
index = 5678 & 15 = 6
Step 2: Access Bucket[6]
────────────────────────
┌─────────────────┐
│ Bucket[6]: │
└─────────────────┘
│
▼
┌──────────────────┐ ┌──────────────────┐ ┌──────────────────┐
│ hash: 5600 │ │ hash: 5678 │ │ hash: 5690 │
│ key: "key1" │ ───► │ key: "key2" │ ───► │ key: "key3" │
│ value: "val1" │ │ value: "val2" │ ✓ │ value: "val3" │
│ next: ───────────┘ │ next: ───────────┘ │ next: null │
└──────────────────┘ └──────────────────┘ └──────────────────┘
│
└── Found! Return "val2"
Step 3: Compare Keys
────────────────────────
1. Check node1: "key1" != "key2" → Continue
2. Check node2: "key2" == "key2" → Match! Return "val2"
When HashMap reaches its threshold (capacity × load factor), it doubles in size.
Resize Trigger:
───────────────────────────────────────────
if (size > capacity * loadFactor) {
resize();
}
Default: capacity = 16, loadFactor = 0.75
Threshold = 16 * 0.75 = 12
When 13th element added → resize to 32
Before Resize (capacity = 4, size = 3):
────────────────────────────────────────
┌─────────┬───────────────────────┐
│ Index 0 │ → [A] → null │
├─────────┼───────────────────────┤
│ Index 1 │ → null │
├─────────┼───────────────────────┤
│ Index 2 │ → [B] → [C] → null │
├─────────┼───────────────────────┤
│ Index 3 │ → null │
└─────────┴───────────────────────┘
After Resize (capacity = 8):
────────────────────────────────────────
┌─────────┬───────────────────────┐
│ Index 0 │ → [A] → null │ (stayed)
├─────────┼───────────────────────┤
│ Index 1 │ → null │
├─────────┼───────────────────────┤
│ Index 2 │ → [B] → null │ (stayed)
├─────────┼───────────────────────┤
│ Index 3 │ → null │
├─────────┼───────────────────────┤
│ Index 4 │ → null │
├─────────┼───────────────────────┤
│ Index 5 │ → null │
├─────────┼───────────────────────┤
│ Index 6 │ → [C] → null │ (moved)
├─────────┼───────────────────────┤
│ Index 7 │ → null │
└─────────┴───────────────────────┘
Rehashing Logic:
────────────────────────────────────────
Old capacity: n
New capacity: 2n
For each node:
newIndex = hash & (2n - 1)
Because n is power of 2:
Node either stays at same index
OR moves to (oldIndex + oldCapacity)
Bit Optimization:
─────────────────────────────────────────
Old capacity = 16 (10000 in binary)
New capacity = 32 (100000 in binary)
For a hash value:
Old index: hash & 15 (0b01111)
New index: hash & 31 (0b11111)
▲
Extra bit
If extra bit is 0: Stay at old index
If extra bit is 1: Move to old index + 16
Example:
hash = 0b10110101 (181)
Old: 181 & 15 = 5
New: 181 & 31 = 21 (5 + 16)
hash = 0b10100101 (165)
Old: 165 & 15 = 5
New: 165 & 31 = 5 (stays)
| Operation | Average Case | Worst Case (before Java 8) | Worst Case (Java 8+) |
|---|---|---|---|
| get() | O(1) | O(n) | O(log n) |
| put() | O(1) | O(n) | O(log n) |
| remove() | O(1) | O(n) | O(log n) |
| containsKey() | O(1) | O(n) | O(log n) |
Space = O(n) where n is number of entries
Actual space = capacity * sizeof(Node)
Minimum capacity ≥ n / loadFactor
Example:
1000 entries, loadFactor = 0.75
Required capacity ≥ 1000 / 0.75 ≈ 1334
Actual capacity = 2048 (next power of 2)
Load Factor vs Performance:
────────────────────────────────────────
Load Factor = 0.5
├── More memory usage
├── Fewer collisions
└── Faster operations
Load Factor = 0.75 (Default)
├── Balanced trade-off
└── Good for most cases
Load Factor = 1.0
├── Less memory usage
├── More collisions
└── Slower operations
Collision probability increases exponentially
as load factor approaches 1.0
Collision Chain Distribution (Good Hash):
──────────────────────────────────────────
Bucket 0: ████ (4 items)
Bucket 1: ███ (3 items)
Bucket 2: █████ (5 items)
Bucket 3: ██ (2 items)
...
Average chain length ≈ loadFactor
Collision Chain Distribution (Bad Hash):
──────────────────────────────────────────
Bucket 0:
Bucket 1: ████████████████████ (20 items)
Bucket 2:
Bucket 3: ███████████████████████ (23 items)
...
Performance degrades significantly
- Power of 2 Capacity: Enables fast bitwise operations for index calculation
- Separate Chaining: Simple and effective collision resolution
- Treeification: Prevents worst-case O(n) performance in Java 8+
- Load Factor 0.75: Optimal balance between time and space
// 1. Initialize with expected size
Map<String, Integer> map = new HashMap<>(expectedSize);
// 2. Override hashCode() and equals() properly
@Override
public int hashCode() {
return Objects.hash(field1, field2);
}
@Override
public boolean equals(Object obj) {
// Implement proper equality check
}
// 3. Use immutable keys
final class Key {
private final String id;
private final int code;
// ... immutable implementation
}
// 4. Avoid excessive resizing
int initialCapacity = (int) (expectedSize / 0.75f) + 1;
Map<K, V> map = new HashMap<>(initialCapacity);❌ Mutable keys
└── hashCode changes → can't find entry
❌ Poor hashCode implementation
└── All keys hash to same bucket → O(n) performance
❌ Not overriding equals with hashCode
└── Violates contract → unpredictable behavior
❌ Using default capacity for large datasets
└── Multiple resize operations → performance hit
✓ Use immutable keys
✓ Implement good hash function
✓ Override both hashCode and equals
✓ Size appropriately from start
HashMap is a highly optimized data structure that balances performance and memory usage. Understanding its internals helps in writing efficient code and debugging performance issues. The evolution from linked lists to red-black trees in Java 8 showcases continuous optimization for real-world scenarios.
Key Points:
- Uses array + linked list/tree hybrid structure
- O(1) average case performance for basic operations
- Automatic resizing maintains performance as size grows
- Proper hash function and key design are critical for performance