Hash Tables

Contents[Show]

We missed hash table in C++ for a long time. They promise to have constant access time. C++11 has hash tables in four variations. The official name is unordered associative containers. Unofficially, they are called dictionaries or just simple associative arrays. 

Classical C++ has four different associative containers. With C++11 we get four additional ones. First, we need a systematic.

Associative container

All associative container have in common that the associated a key with a value. Therefore, you can get the value by using the key. The classical associative containers are called ordered associative containers; the new ones unordered associative containers.

Ordered associative containers

The small difference is that the keys of the classical associative containers are ordered. By default, the smaller relation (<) is used. Therefore, the container are sorted in an increasing order.

This ordering has interesting consequences for the ordered associative containers.

  • The key has to support an ordering relation.
  • Associative containers are typically implemented in balanced binary search trees.
  • The access time to the key and therefore to the value is logarithmic.

The most often used ordered associative container is std::map:

std::map<int,std::string> int2String{ {3,"three"},{2,"two"},{1,"one"},{5,"five"},{6,"six"},{4,"four"},{7,"seven"} };

 

The balanced binary search tree may have the following structure.

geordneteAssoziativeArrays

Unordered associative containers

The key idea of the unordered associative containers is that the key is mapped with the help of the hash function onto the bucket. You can find the key/value pair in the bucket.

I have to introduce a few terms before I describe the characteristics of unordered associative containers.

  • Hash value: The value you will get if you apply the hash function onto the key.
  • Collision: If different keys are mapped to the same has value, we will have a collision. The unordered associative containers have to deal with this situation.

The usage of the hash function has very important consequences for the unordered associative container.

  • The keys have to support the equal comparison in order to deal with collisions.
  • The hash value of a key has to be available.
  • The execution of a hash function is a constant. Therefore, the access time to the keys of an unordered associative container is constant. For simplicity reasons I ignored collisions.

 

According to std::map that is the most often used ordered associative container std::unordered_map will become the most often used unordered associative container.

std::unordered_map<std::string,int> str2Int{ {"Grimm",491634333356},{"Grimm-Jaud",49160123335}, {"Schmidt",4913333318},{"Huber",490001326} };
 

The graphic show the mapping of the keys to their bucket by using the hash function.

UngeordnetesAssoziativesArrayEng

The similarities

The similar names of std::map and std::unordered_map is not by accident. Both have a similar in interface. To be more precise. The interface of std::map is a subset of the interface of std::unorederd_map. Have a look:

 

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
// mapHashCompare.cpp

#include <iostream>
#include <map>
#include <unordered_map>

int main(){

  std::cout << std::endl;

  std::cout << "C++ map: " << std::endl;
  std::map<std::string,int> m { {"Dijkstra",1972},{"Scott",1976} };
  m["Ritchie"] = 1983;
  std::cout << "    m[Ritchie]: " <<  m["Ritchie"] << "\n    ";
  for(auto p : m) std::cout << '{' << p.first << ',' << p.second << '}';
  m.erase("Scott");
  std::cout << "\n    ";
  for(auto p : m) std::cout << '{' << p.first << ',' << p.second << '}';
  m.clear();
  std::cout << std::endl;
  std::cout << "    m.size(): " << m.size() << std::endl;


  std::cout << std::endl;

  std::cout << "C++11 unordered_map: " << std::endl;
  std::unordered_map<std::string,int> um { {"Dijkstra",1972},{"Scott",1976} };
  um["Ritchie"] = 1983;
  std::cout << "    um[Ritchie]: " <<  um["Ritchie"] << "\n    ";
  for(auto p : um) std::cout << '{' << p.first << ',' << p.second << '}';
  um.erase("Scott");
  std::cout << "\n    ";
  for(auto p : um) std::cout << '{' << p.first << ',' << p.second << '}';
  um.clear();
  std::cout << std::endl;
  std::cout << "    um.size(): " << um.size() << std::endl;

  std::cout << std::endl;

}
 
A typical case of copy and past. The difference between the lines 11 - 21 and 26 - 36 is only that I used in the first case std::map; that I used in the second case std::unordered_map. Therefore, I will only explain the second case. I initialized in line 27 the std::unordered_map with the help of an initializer list. Then I update the value of the key "Richie" and return the associated value. In the lines 30 and 33 I display the key/values pairs by using a range-based for-loop. p.first is the first, p.second the second element of the pair. By using um.clear() I can clear the unordered associative container and by using um.size() I can determine its size.
 
The second glance reveals a small difference.
 .
mapHashCompare

 

The keys of the std::maps are ordered and therefore the pairs. Surprised? Of course not!. That is the visible difference between the ordered and unordered associative containers.

The eight variations

To get an ordering into the eight variations I start with the classical ordered associative containers. You can easily apply the ordering to the unordered associative containers.

The answers to two questions is the key to get the systematic into to ordered associative containers:

  • Does the key have an associated value?
  • Are several identical keys possible?

By answering the two questions you get the four different ordered associative containers std::set, std::multiset, std::map, and std::multimap.

Of course, you can apply the two questions to the unordered associative containers. Now you get the containers std::unordered_set, std::unordered_multiset, std::unordered_map, and std::unordered_multimap.

 All what is left is to write down the systematic. If the name of container has the component

  • map, it has an associated value.
  • multi, it can has more than one identical key. 
  • unordered, it's keys are non-sorted.

But the analogy goes on. If two container names only differ in the name component unordered, they will have a similar interface. The container without the name component unordered supports an interface that is a subset of the unordered container. You saw it already in the listing mapHashCompare.cpp.

The table shows once more the entire systematic. The systematic includes the access time for the ordered containers (logarithmic) and the unordered container (constant).

OrderedVersusUnorderedContainers

What's next?

My first plan was it to show you the performance difference between ordered and unordered container. So, I will do it in the next post.

 


 

 

 

 

 

 

 

 

 

title page smalltitle page small Go to Leanpub/cpplibrary "What every professional C++ programmer should know about the C++ standard library".   Get your e-book. Support my blog.

 

Comments   

0 #1 Alexey Komnin 2016-11-24 16:48
unordered containers have amortised constant time complexity for access and insert operations.
It may take up to O(n) in worst case.
Quote
0 #2 Rainer Grimm 2016-11-24 17:35
Quoting Alexey Komnin:
unordered containers have amortised constant time complexity for access and insert operations.
It may take up to O(n) in worst case.

Alexey, you are totally right. But the details will follow in a later post. I argued in my post: "For simplicity reasons I ignored collisions."
Quote
0 #3 Siwen Zhang 2016-11-27 06:55
Just buy your book: The C++ Standard Library, keep up with your great posts, thanks!
Quote
0 #4 Rainer Grimm 2016-11-27 16:37
Quoting Siwen Zhang:
Just buy your book: The C++ Standard Library, keep up with your great posts, thanks!
Siwen, thanks a lot.
Quote

Add comment


My Newest E-Book

Latest comments

Subscribe to the newsletter (+ pdf bundle)

Blog archive

Source Code

Visitors

Today 559

All 255864

Currently are 332 guests and no members online