Master the most versatile data structure in coding interviews — O(1) lookups, frequency counting, complement patterns, and more.
A hash map (also called a hash table) is a data structure that stores key-value pairs. It allows you to associate a value with a unique key and retrieve that value later in near-constant time.
dict (dictionary)Object or MapHashMapunordered_mapUnder the hood a hash map is backed by an array. When you insert a key, a hash function converts the key into an integer, and that integer is mapped to an index in the array (usually via modulo). The value is then stored at that index.
Suppose our internal array has 8 slots and we want to store "alice" -> 90.
hash("alice") returns, say, 75728391837572839183 % 8 = 7("alice", 90) at slot 7Later, to look up "alice", we repeat the same hash and go directly to slot 7 — O(1).
/* Hash Table with 8 buckets */
Key hash(key) % 8 Bucket
------- -------------- ----------------------------
"alice" 7 [0] ->
"bob" 3 [1] ->
"charlie" 1 [2] ->
"diana" 3 (collision!) [3] -> ("bob",85) -> ("diana",92)
"eve" 5 [4] ->
[5] -> ("eve",78)
[6] ->
[7] -> ("alice",90)
Bucket 1: ("charlie", 88)
Bucket 3: ("bob", 85) -> ("diana", 92) // chained
A collision occurs when two different keys hash to the same index. There are two main strategies for resolving collisions:
Each bucket holds a linked list (or another collection). When a collision happens, the new key-value pair is appended to the list at that bucket.
# Chaining: bucket 3 has two entries
bucket[3] -> [("bob", 85)] -> [("diana", 92)] -> None
All entries live directly in the array. When a collision occurs, the algorithm probes forward to the next empty slot.
# Open Addressing: "diana" finds slot 3 taken, probes to slot 4
[0] [1] charlie [2] [3] bob [4] diana [5] eve [6] [7] alice
The load factor = (number of entries) / (number of buckets). When it exceeds a threshold (often 0.75), the hash map resizes (typically doubles) and rehashes all entries. This keeps operations near O(1) on average.
| Operation | Average | Worst Case |
|---|---|---|
| Insert | O(1) | O(n) |
| Lookup | O(1) | O(n) |
| Delete | O(1) | O(n) |
| Space | O(n) | O(n) |
If every single key hashes to the same bucket, the hash map degrades into a linked list. Every lookup would need to traverse all n elements — hence O(n).
# All keys hash to bucket 0:
bucket[0] -> ("a", 1) -> ("b", 2) -> ("c", 3) -> ... -> ("z", 26)
# Looking up "z" requires traversing all 26 entries
In practice, a good hash function distributes keys uniformly across buckets. With a reasonable load factor, the expected number of items per bucket is a small constant. This is called amortized O(1) — on average, each operation takes constant time.
For coding interviews, always state hash map operations as O(1) average, O(n) worst case. Most interviewers accept O(1) unless they specifically ask about worst case.
dictPython
# --- Creating ---
d = {} # empty dict
d = {"name": "Alice", "age": 25} # with initial values
d = dict(name="Alice", age=25) # using dict() constructor
# --- Accessing ---
d["name"] # "Alice" (KeyError if missing)
d.get("name") # "Alice" (None if missing)
d.get("gpa", 0.0) # 0.0 (default if missing)
# --- Inserting / Updating ---
d["grade"] = "A" # add new key
d["age"] = 26 # update existing key
# --- Deleting ---
del d["grade"] # remove key (KeyError if missing)
d.pop("grade", None) # remove key (no error if missing)
# --- Checking membership ---
"name" in d # True
"gpa" not in d # True
# --- Iterating ---
for key in d:
print(key) # "name", "age"
for key, value in d.items():
print(key, value) # "name" "Alice", "age" 26
for val in d.values():
print(val) # "Alice", 26
for key in d.keys():
print(key) # "name", "age"
# --- Useful methods ---
len(d) # 2
d.update({"gpa": 3.9}) # merge another dict
defaultdict and CounterPython
from collections import defaultdict, Counter
# --- defaultdict: auto-creates default values for missing keys ---
freq = defaultdict(int) # default value is 0
for char in "hello":
freq[char] += 1
# freq = {'h': 1, 'e': 1, 'l': 2, 'o': 1}
groups = defaultdict(list) # default value is []
groups["fruits"].append("apple") # no need to check if key exists
groups["fruits"].append("banana")
# --- Counter: count elements in one line ---
count = Counter("abracadabra")
# Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
count.most_common(2) # [('a', 5), ('b', 2)]
# Counter from a list
nums = [1, 2, 2, 3, 3, 3]
count = Counter(nums)
# Counter({3: 3, 2: 2, 1: 1})
Map when keys are not strings (numbers, objects, etc.), when you need to preserve insertion order, or when you frequently add/remove keys.Object for simple string-keyed config, JSON-compatible data, or when you need object destructuring.JavaScript
// --- Creating ---
const obj = {};
const obj2 = { "name": "Alice", "age": 25 };
// --- Accessing ---
obj2["name"]; // "Alice"
obj2.name; // "Alice" (dot notation)
// --- Inserting / Updating ---
obj2["grade"] = "A";
obj2.grade = "A";
// --- Deleting ---
delete obj2.grade;
// --- Checking ---
"name" in obj2; // true
obj2.hasOwnProperty("name"); // true
// --- Iterating ---
for (const key in obj2) {
console.log(key, obj2[key]);
}
Object.keys(obj2); // ["name", "age"]
Object.values(obj2); // ["Alice", 25]
Object.entries(obj2); // [["name","Alice"], ["age",25]]
JavaScript
// --- Creating ---
const map = new Map();
const map2 = new Map([
["name", "Alice"],
["age", 25]
]);
// --- Operations ---
map2.set("grade", "A"); // insert / update
map2.get("name"); // "Alice"
map2.has("name"); // true
map2.delete("grade"); // remove
map2.size; // 2
// --- Iterating ---
map2.forEach((value, key) => {
console.log(key, value);
});
for (const [key, value] of map2) {
console.log(key, value);
}
// --- Keys can be ANY type ---
const objKey = { id: 1 };
map.set(objKey, "object as key"); // works!
map.set(42, "number as key"); // works!
// --- Convert to/from array ---
const arr = [...map2]; // [["name","Alice"], ["age",25]]
const fromArr = new Map(arr); // back to Map
A hash set is like a hash map but it stores only keys — no associated values. It answers one question fast: "Is this element in the collection?"
setPython
# --- Creating ---
s = set() # empty set (NOT {} which is a dict)
s = {1, 2, 3} # set with values
s = set([1, 2, 2, 3]) # from list (duplicates removed)
# --- Operations ---
s.add(4) # add element
s.remove(4) # remove (KeyError if missing)
s.discard(4) # remove (no error if missing)
3 in s # True (O(1) lookup)
len(s) # 3
# --- Set Operations ---
a = {1, 2, 3}
b = {2, 3, 4}
a | b # Union: {1, 2, 3, 4}
a.union(b) # Union: {1, 2, 3, 4}
a & b # Intersection: {2, 3}
a.intersection(b) # Intersection: {2, 3}
a - b # Difference: {1}
a.difference(b) # Difference: {1}
a ^ b # Symmetric difference: {1, 4}
a.symmetric_difference(b) # Symmetric difference: {1, 4}
a.issubset(b) # False
a.issuperset(b) # False
SetJavaScript
// --- Creating ---
const s = new Set();
const s2 = new Set([1, 2, 2, 3]); // {1, 2, 3} (duplicates removed)
// --- Operations ---
s2.add(4); // add element
s2.delete(4); // remove element (returns boolean)
s2.has(3); // true (O(1) lookup)
s2.size; // 3
// --- Iterating ---
s2.forEach(val => console.log(val));
for (const val of s2) { console.log(val); }
// --- Set Operations (manual in JS) ---
const a = new Set([1, 2, 3]);
const b = new Set([2, 3, 4]);
// Union
const union = new Set([...a, ...b]); // {1, 2, 3, 4}
// Intersection
const inter = new Set([...a].filter(x => b.has(x))); // {2, 3}
// Difference (a - b)
const diff = new Set([...a].filter(x => !b.has(x))); // {1}
// Symmetric Difference
const symDiff = new Set(
[...a].filter(x => !b.has(x)).concat([...b].filter(x => !a.has(x)))
); // {1, 4}
// Convert to array
const arr = [...s2]; // [1, 2, 3]
const arr2 = Array.from(s2); // [1, 2, 3]
Count how often each element appears. This is the single most common hash map pattern in interviews.
Python
def frequency_count(nums):
freq = {}
for num in nums:
freq[num] = freq.get(num, 0) + 1
return freq
# Or with Counter:
from collections import Counter
freq = Counter(nums)
# Example
print(frequency_count([1, 2, 2, 3, 3, 3]))
# {1: 1, 2: 2, 3: 3}
JavaScript
function frequencyCount(nums) {
const freq = new Map();
for (const num of nums) {
freq.set(num, (freq.get(num) || 0) + 1);
}
return freq;
}
// Example
console.log(frequencyCount([1, 2, 2, 3, 3, 3]));
// Map { 1 => 1, 2 => 2, 3 => 3 }
Given an array of integers and a target, return the indices of the two numbers that add up to the target.
num, check if target - num already exists in the hash map.
Brute force: Check every pair — O(n^2).
Hash map: One pass, store complements — O(n).
Python
def two_sum(nums, target):
# Map: number -> its index
seen = {}
for i, num in enumerate(nums):
complement = target - num # what we need to find
if complement in seen: # have we seen it before?
return [seen[complement], i] # return both indices
seen[num] = i # store current number and index
return [] # no solution found
# --- Walk-through ---
# nums = [2, 7, 11, 15], target = 9
#
# i=0: num=2, comp=7, seen={} -> not found, store {2: 0}
# i=1: num=7, comp=2, seen={2: 0} -> FOUND! return [0, 1]
print(two_sum([2, 7, 11, 15], 9)) # [0, 1]
JavaScript
function twoSum(nums, target) {
const seen = new Map(); // number -> index
for (let i = 0; i < nums.length; i++) {
const complement = target - nums[i]; // what we need
if (seen.has(complement)) { // have we seen it?
return [seen.get(complement), i]; // return both indices
}
seen.set(nums[i], i); // store current number
}
return []; // no solution
}
// Walk-through:
// nums = [2, 7, 11, 15], target = 9
//
// i=0: num=2, comp=7, seen=Map{} -> not found, store {2->0}
// i=1: num=7, comp=2, seen=Map{2->0} -> FOUND! return [0, 1]
console.log(twoSum([2, 7, 11, 15], 9)); // [0, 1]
Given an array of strings, group anagrams together. Two strings are anagrams if they contain the same characters in any order.
Python
from collections import defaultdict
def group_anagrams(strs):
groups = defaultdict(list)
for s in strs:
# Sort the string to create a canonical key
key = tuple(sorted(s)) # tuple because lists can't be dict keys
groups[key].append(s)
return list(groups.values())
# --- Walk-through ---
# strs = ["eat", "tea", "tan", "ate", "nat", "bat"]
#
# "eat" -> sorted = ('a','e','t') -> groups[('a','e','t')] = ["eat"]
# "tea" -> sorted = ('a','e','t') -> groups[('a','e','t')] = ["eat","tea"]
# "tan" -> sorted = ('a','n','t') -> groups[('a','n','t')] = ["tan"]
# "ate" -> sorted = ('a','e','t') -> groups[('a','e','t')] = ["eat","tea","ate"]
# "nat" -> sorted = ('a','n','t') -> groups[('a','n','t')] = ["tan","nat"]
# "bat" -> sorted = ('a','b','t') -> groups[('a','b','t')] = ["bat"]
#
# Result: [["eat","tea","ate"], ["tan","nat"], ["bat"]]
print(group_anagrams(["eat","tea","tan","ate","nat","bat"]))
JavaScript
function groupAnagrams(strs) {
const groups = new Map();
for (const s of strs) {
// Sort the string to create a canonical key
const key = s.split("").sort().join("");
if (!groups.has(key)) {
groups.set(key, []);
}
groups.get(key).push(s);
}
return [...groups.values()];
}
// Walk-through:
// "eat" -> sorted = "aet" -> groups{"aet": ["eat"]}
// "tea" -> sorted = "aet" -> groups{"aet": ["eat","tea"]}
// "tan" -> sorted = "ant" -> groups{"ant": ["tan"]}
// "ate" -> sorted = "aet" -> groups{"aet": ["eat","tea","ate"]}
// "nat" -> sorted = "ant" -> groups{"ant": ["tan","nat"]}
// "bat" -> sorted = "abt" -> groups{"abt": ["bat"]}
//
// Result: [["eat","tea","ate"], ["tan","nat"], ["bat"]]
console.log(groupAnagrams(["eat","tea","tan","ate","nat","bat"]));
Python
def contains_duplicate(nums):
seen = set()
for num in nums:
if num in seen:
return True
seen.add(num)
return False
# One-liner alternative:
def contains_duplicate(nums):
return len(nums) != len(set(nums))
JavaScript
function containsDuplicate(nums) {
const seen = new Set();
for (const num of nums) {
if (seen.has(num)) return true;
seen.add(num);
}
return false;
}
// One-liner alternative:
const containsDuplicate = nums => new Set(nums).size !== nums.length;
Reach for a hash map (or hash set) when you see these patterns in a problem:
| Pattern | Use | Example |
|---|---|---|
| O(1) lookups | Hash map / set | "Have we seen this element before?" |
| Counting frequencies | Hash map | "How many times does each char appear?" |
| Caching / memoization | Hash map | "Store computed results to avoid recalculation" |
| Mapping relationships | Hash map | "Map each node to its parent" |
| Removing duplicates | Hash set | "Return unique elements only" |
| Complement problems | Hash map | "Find two numbers that sum to target" |
| Grouping | Hash map of lists | "Group anagrams", "group by category" |
| Prefix sums | Hash map | "Subarray sum equals K" |
If the brute force is O(n^2) because of nested "find" operations, a hash map can almost always bring it down to O(n).
Essential hash map problems ordered from easy to medium. Practice these to build pattern recognition.
| Problem | Difficulty | Pattern | Key Idea |
|---|---|---|---|
| Two Sum | Easy | Complement lookup | Store num->index, check if target-num exists |
| Valid Anagram | Easy | Frequency counter | Compare character frequencies of both strings |
| Contains Duplicate | Easy | Set membership | Add to set; if already present, duplicate found |
| Group Anagrams | Medium | Sorted key grouping | Sort each word as key, group values in a list |
| Top K Frequent Elements | Medium | Freq map + bucket sort | Count frequencies, then use bucket sort or heap for top K |
| Longest Consecutive Sequence | Medium | Set + sequence start | Put all in set; only start counting from sequence beginnings |
| Subarray Sum Equals K | Medium | Prefix sum + hash map | Store prefix_sum->count; check if prefix_sum - k exists |
| LRU Cache | Medium | Hash map + doubly linked list | O(1) access via map, O(1) eviction via linked list |
Given an array of integers nums and an integer target, return indices of the two numbers such that they add up to target. You may assume each input has exactly one solution.
Python
def twoSum(nums, target):
seen = {} # value -> index
for i, num in enumerate(nums):
complement = target - num # the number we need
if complement in seen: # O(1) lookup
return [seen[complement], i]
seen[num] = i # store for future lookups
return []
# Time: O(n) - single pass through array
# Space: O(n) - hash map stores up to n entries
JavaScript
var twoSum = function(nums, target) {
const seen = new Map(); // value -> index
for (let i = 0; i < nums.length; i++) {
const complement = target - nums[i]; // the number we need
if (seen.has(complement)) { // O(1) lookup
return [seen.get(complement), i];
}
seen.set(nums[i], i); // store for future lookups
}
return [];
};
// Time: O(n) - single pass through array
// Space: O(n) - Map stores up to n entries
Given an array of strings strs, group the anagrams together. You can return the answer in any order.
Example: ["eat","tea","tan","ate","nat","bat"] returns [["eat","tea","ate"],["tan","nat"],["bat"]]
Python
from collections import defaultdict
def groupAnagrams(strs):
groups = defaultdict(list) # sorted_key -> [original strings]
for s in strs:
key = tuple(sorted(s)) # O(k log k) where k = len(s)
groups[key].append(s)
return list(groups.values())
# Time: O(n * k log k) where n = number of strings, k = max string length
# Space: O(n * k) for storing all strings in groups
# --- Alternative: character count key (avoids sorting) ---
def groupAnagrams(strs):
groups = defaultdict(list)
for s in strs:
# Count frequency of each letter as key
count = [0] * 26
for c in s:
count[ord(c) - ord('a')] += 1
groups[tuple(count)].append(s)
return list(groups.values())
# Time: O(n * k) - no sorting needed
# Space: O(n * k)
JavaScript
var groupAnagrams = function(strs) {
const groups = new Map(); // sorted_key -> [original strings]
for (const s of strs) {
const key = s.split("").sort().join(""); // O(k log k)
if (!groups.has(key)) {
groups.set(key, []);
}
groups.get(key).push(s);
}
return [...groups.values()];
};
// Time: O(n * k log k)
// Space: O(n * k)
// --- Alternative: character count key ---
var groupAnagrams = function(strs) {
const groups = new Map();
for (const s of strs) {
const count = new Array(26).fill(0);
for (const c of s) {
count[c.charCodeAt(0) - 97]++;
}
const key = count.join(","); // "1,0,0,...,1,0" as string key
if (!groups.has(key)) {
groups.set(key, []);
}
groups.get(key).push(s);
}
return [...groups.values()];
};
// Time: O(n * k)
// Space: O(n * k)
Building your own hash map helps you understand the internals. Here is a simplified but educational implementation using chaining (array of linked lists).
Python
class HashMap:
def __init__(self, capacity=16):
self.capacity = capacity
self.size = 0
self.buckets = [[] for _ in range(capacity)]
def _hash(self, key):
"""Convert key to bucket index."""
return hash(key) % self.capacity
def put(self, key, value):
"""Insert or update a key-value pair."""
index = self._hash(key)
bucket = self.buckets[index]
# Check if key already exists in this bucket
for i, (k, v) in enumerate(bucket):
if k == key:
bucket[i] = (key, value) # update existing
return
# Key not found, add new entry
bucket.append((key, value))
self.size += 1
# Resize if load factor exceeds 0.75
if self.size / self.capacity > 0.75:
self._resize()
def get(self, key, default=None):
"""Retrieve value by key. Returns default if not found."""
index = self._hash(key)
bucket = self.buckets[index]
for k, v in bucket:
if k == key:
return v
return default
def remove(self, key):
"""Remove a key-value pair. Returns True if removed."""
index = self._hash(key)
bucket = self.buckets[index]
for i, (k, v) in enumerate(bucket):
if k == key:
bucket.pop(i)
self.size -= 1
return True
return False
def contains(self, key):
"""Check if key exists."""
return self.get(key) is not None
def _resize(self):
"""Double capacity and rehash all entries."""
old_buckets = self.buckets
self.capacity *= 2
self.size = 0
self.buckets = [[] for _ in range(self.capacity)]
for bucket in old_buckets:
for key, value in bucket:
self.put(key, value)
def __repr__(self):
items = []
for bucket in self.buckets:
for k, v in bucket:
items.append(f"{k}: {v}")
return "HashMap({" + ", ".join(items) + "})"
# --- Test it out ---
hm = HashMap()
hm.put("alice", 90)
hm.put("bob", 85)
hm.put("charlie", 88)
print(hm.get("alice")) # 90
print(hm.get("bob")) # 85
print(hm.get("unknown", -1)) # -1
print(hm.contains("charlie")) # True
hm.put("alice", 95) # update Alice's score
print(hm.get("alice")) # 95
hm.remove("bob")
print(hm.contains("bob")) # False
print(hm) # HashMap({alice: 95, charlie: 88})
hash(key) % capacity maps any key to a bucket indexThis implementation is for learning. Real hash maps (like Python's dict) use open addressing, better hash functions, and many optimizations. But the core ideas are the same, and this is the level of detail interviewers expect.
d.get("key", default) returns the value if the key exists, or default if it does not. Option A would raise a KeyError if the key is missing (the or only helps if the value is falsy, not if the key is absent). Options C and D are not valid dict methods.