JavaScript Hash Table

What is a Hash Table in JavaScript?

In JavaScript, a hash table (also called a hash map) is a data structure that stores key-value pairs efficiently. A hash function maps each key to an index in an array, which makes insertion and retrieval fast.

In short, a hash table holds a collection of key-value pairs, and any stored value can be retrieved through its key.

The hash table is a fundamental and widely used data structure. JavaScript's built-in Map object serves as one: it has a set method for inserting a new key-value pair and a get method for retrieving the value associated with a key. It also provides a has method to check whether a key is present and a delete method to remove a key-value pair.
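The four Map methods mentioned above can be tried out in a few lines. This is only a small sketch; the capitals variable and its contents are made up for illustration.

```javascript
// A built-in Map used as a hash table of country -> capital.
const capitals = new Map();

capitals.set("France", "Paris");
capitals.set("Japan", "Tokyo");
// Setting an existing key replaces its value instead of adding a second entry.
capitals.set("France", "Paris (updated)");

console.log(capitals.get("Japan")); // "Tokyo"
console.log(capitals.has("Peru"));  // false

capitals.delete("Japan");
console.log(capitals.size);         // 1
```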

Hash tables are made up of two parts:

Object

The table is the structure that stores the data. It contains an array that holds all the key-value pairs. The size of the array should be chosen based on the expected volume of data.

Hash function

This function determines the index at which a key-value pair is stored. It should be deterministic (the same key always yields the same index), operate as a one-way function, and spread keys as evenly as possible across the array. Ideally every key would receive a distinct index, although in practice different keys can end up with the same one.

How Does a Hash Table Work?

A hash table in JavaScript consists of an array and a hash function. The hash function takes a key as input and returns the index at which the value for that key is stored or looked up.

Let’s consider a simple illustration of the idea: Imagine you are in a vast library containing thousands of books. To find a particular book, you would have to search through all the shelves until you came across it, which could take a long time.

Now imagine having a magic function that tells you exactly which shelf holds a specific book. You would no longer need to search every shelf; you could go straight to the right one and find the book with ease. This function is like the hash function of a hash table, and the library's shelves are like the array.
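As a rough sketch of this idea, the snippet below turns a string key into an array index and uses that index for both storing and reading a value. The hashToIndex helper and the slots array are hypothetical names chosen for this illustration, and the sketch ignores the case where two keys land on the same index.

```javascript
// Hypothetical helper: maps a string key to an index in [0, tableSize).
function hashToIndex(key, tableSize) {
  let total = 0;
  for (let i = 0; i < key.length; i++) {
    // Mix in each character code; the remainder keeps the total inside the table.
    total = (total * 31 + key.charCodeAt(i)) % tableSize;
  }
  return total;
}

const slots = new Array(11);

// Store: the hash function decides where the value goes.
const index = hashToIndex("apple", slots.length);
slots[index] = "red";

// Look up: hashing the same key again leads to the same slot.
console.log(slots[hashToIndex("apple", slots.length)]); // "red"
```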

Why are Hash Tables Useful?

In JavaScript, hash tables are efficient data structures for several reasons:

Fast Lookups

With a good hash function, lookups take constant time, O(1), on average. In other words, the time needed to retrieve a value does not grow with the number of elements in the table.

Flexible keys

A hash table can accept many kinds of keys, which makes it highly versatile. JavaScript's Map, for example, allows a value of any type as a key, whereas the HashTable class shown later in this article expects string keys.
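The short sketch below uses the built-in Map to show keys of several types side by side. The registry and user names are invented for the example. Note that object keys are matched by identity, so a different object with the same contents is a different key.

```javascript
const user = { id: 1 };
const registry = new Map();

registry.set(user, "object key");
registry.set(42, "number key");
registry.set("42", "string key"); // distinct from the number 42

console.log(registry.get(user));      // "object key"
console.log(registry.get(42));        // "number key"
console.log(registry.get("42"));      // "string key"
console.log(registry.get({ id: 1 })); // undefined: a different object
```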

No duplicate keys

Each key appears at most once in a hash table, so setting a key that already exists replaces its value rather than creating a second entry.

These characteristics make hash tables ideal for applications such as storing configuration parameters, caching data, counting word occurrences in a document, or building features such as autocomplete.
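As one example of these applications, here is a minimal sketch of counting word occurrences with a Map. The countWords function is a hypothetical helper, and its definition of a word (a run of letters or apostrophes) is a simplifying assumption.

```javascript
// Hypothetical helper: returns a Map of word -> number of occurrences.
function countWords(text) {
  const counts = new Map();
  // Simplifying assumption: a word is a run of letters or apostrophes.
  const words = text.toLowerCase().match(/[a-z']+/g) || [];
  for (const word of words) {
    // Read the current count (0 if the word is new) and store it incremented.
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

const wordCounts = countWords("The cat and the hat");
console.log(wordCounts.get("the")); // 2
console.log(wordCounts.get("cat")); // 1
```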

Example


class HashTable {
  // Create a table with a fixed number of buckets.
  constructor(size = 53) {
    this.keyMap = new Array(size);
  }

  // Map a string key to a bucket index between 0 and size - 1.
  _hash(key) {
    let total = 0;
    const WEIRD_PRIME = 31;
    // Only the first 100 characters are hashed, to bound the cost for long keys.
    for (let i = 0; i < Math.min(key.length, 100); i++) {
      // Use the raw character code so the index can never become negative.
      const value = key.charCodeAt(i);
      total = (total * WEIRD_PRIME + value) % this.keyMap.length;
    }
    return total;
  }

  // Store a key-value pair, replacing the value if the key already exists.
  set(key, value) {
    const index = this._hash(key);
    if (!this.keyMap[index]) {
      this.keyMap[index] = [];
    }
    const bucket = this.keyMap[index];
    for (let i = 0; i < bucket.length; i++) {
      if (bucket[i][0] === key) {
        bucket[i][1] = value;
        return;
      }
    }
    bucket.push([key, value]);
  }

  // Return the value stored for the key, or undefined if the key is absent.
  get(key) {
    const bucket = this.keyMap[this._hash(key)];
    if (bucket) {
      for (let i = 0; i < bucket.length; i++) {
        if (bucket[i][0] === key) {
          return bucket[i][1];
        }
      }
    }
    return undefined;
  }

  // Return an array of all keys in the table.
  keys() {
    const keysArr = [];
    for (let i = 0; i < this.keyMap.length; i++) {
      if (this.keyMap[i]) {
        for (let j = 0; j < this.keyMap[i].length; j++) {
          if (!keysArr.includes(this.keyMap[i][j][0])) {
            keysArr.push(this.keyMap[i][j][0]);
          }
        }
      }
    }
    return keysArr;
  }

  // Return an array of the distinct values in the table.
  values() {
    const valuesArr = [];
    for (let i = 0; i < this.keyMap.length; i++) {
      if (this.keyMap[i]) {
        for (let j = 0; j < this.keyMap[i].length; j++) {
          if (!valuesArr.includes(this.keyMap[i][j][1])) {
            valuesArr.push(this.keyMap[i][j][1]);
          }
        }
      }
    }
    return valuesArr;
  }
}

This implementation includes the following methods:

✅ constructor(size): Initialises a new hash table with the given size. The default size is 53.
✅ _hash(key): Hashes the given key and returns an integer index at which the key-value pair should be stored in the keyMap array.
✅ set(key, value): Stores a key-value pair in the hash table.
✅ get(key): Returns the value associated with the given key, or undefined if the key is not found in the hash table.
✅ keys(): Returns an array of all keys in the hash table.
✅ values(): Returns an array of the distinct values in the hash table.

Common hash functions

Different kinds of hash functions suit different uses, for example:

✅ Arithmetic Modular: We take the key modulo the list/array size: index = key MOD tableSize. The index therefore always stays between 0 and tableSize - 1.
✅ Truncation: We select part of the key as the index rather than the whole key. A mod operation can be used for this, although it does not need to be based on the array size.
✅ Folding: We divide the key into small chunks and apply an arithmetic operation to them, for example adding the chunks together.
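The three approaches listed above might look like this for non-negative integer keys. The function names, the choice of decimal digits for truncation, and the use of addition for folding are all assumptions made for this sketch; other variants exist.

```javascript
// Arithmetic modular: the remainder of the key divided by the table size.
function modularHash(key, tableSize) {
  return key % tableSize;
}

// Truncation: keep only part of the key, here its last `digits` decimal digits.
function truncationHash(key, digits) {
  return key % 10 ** digits;
}

// Folding: split the key into chunks of `chunkSize` digits, add the chunks,
// then reduce the sum so that it fits the table.
function foldingHash(key, chunkSize, tableSize) {
  const text = String(key);
  let sum = 0;
  for (let i = 0; i < text.length; i += chunkSize) {
    sum += Number(text.slice(i, i + chunkSize));
  }
  return sum % tableSize;
}

console.log(modularHash(1234567, 53));   // 38
console.log(truncationHash(1234567, 2)); // 67
console.log(foldingHash(123456, 2, 53)); // 49, from (12 + 34 + 56) % 53
```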

Hash Table collisions

Sometimes a hash function produces the same index for more than one key. This is called a hash collision. Collisions are a problem because each slot in a hash table is meant to hold only one element.

There are four common ways to handle hash collisions:

Linear probing

Linear probing works by skipping over an index that is already occupied. An increment value is added to the index that was already calculated. If the resulting index is also occupied, the increment is applied again, and so on until a free slot is found.

One drawback is that a poorly chosen offset can bring us back to our starting position while skipping many possible locations in the array. This happens, for example, when the offset and the array size share a common factor.
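A minimal sketch of linear probing with an increment of 1 is shown below. The LinearProbingTable class and its method names are hypothetical, keys are assumed to be strings, and deletion is left out because it needs extra bookkeeping in a probing table.

```javascript
// Sketch of a fixed-size table that resolves collisions by linear probing.
class LinearProbingTable {
  constructor(size = 7) {
    // Each slot holds either null or a single [key, value] pair.
    this.slots = new Array(size).fill(null);
  }

  _hash(key) {
    let total = 0;
    for (let i = 0; i < key.length; i++) {
      total = (total * 31 + key.charCodeAt(i)) % this.slots.length;
    }
    return total;
  }

  // Returns true if the pair was stored, false if the table is full.
  set(key, value) {
    const size = this.slots.length;
    let index = this._hash(key);
    for (let step = 0; step < size; step++) {
      const slot = this.slots[index];
      if (slot === null || slot[0] === key) {
        this.slots[index] = [key, value];
        return true;
      }
      // Occupied by another key: move on to the next slot, wrapping around.
      index = (index + 1) % size;
    }
    return false;
  }

  get(key) {
    const size = this.slots.length;
    let index = this._hash(key);
    for (let step = 0; step < size; step++) {
      const slot = this.slots[index];
      if (slot === null) return undefined; // an empty slot ends the search
      if (slot[0] === key) return slot[1];
      index = (index + 1) % size;
    }
    return undefined;
  }
}

const colors = new LinearProbingTable(5);
colors.set("pink", "#ffc0cb");
colors.set("cyan", "#00ffff");
console.log(colors.get("cyan")); // "#00ffff"
console.log(colors.get("lime")); // undefined
```

With an increment of 1 every slot is eventually visited. An increment of 2 in a table of size 4, by contrast, would only ever reach two of the four slots.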

Chaining

In the chaining approach, every position in the hash table holds a reference to another data structure, such as a linked list or a tree. Any entry that hashes to that index is added to the structure stored there.

Chaining lets several key-value pairs live at a single index, and appending a pair to a bucket takes constant time. The HashTable example above takes this approach, using arrays as the buckets. Performance remains good as long as the chains stay short, but the extra data structures cost additional space.

Resizing the Array or List

Another way to reduce the likelihood of collisions is to resize the array or list. We set a threshold and, once it is exceeded, create a new table with twice the capacity of the original. The elements of the old table must then be rehashed into the new one, because their indices depend on the table size.

Resizing can substantially reduce the number of collisions, but the operation is expensive, so the threshold must be chosen with care. One common convention is to set the threshold at 0.6, meaning that a resize occurs once 60% of the table is occupied.
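The sketch below shows one way the resizing step could work for a table that uses chaining. The ResizingHashTable class, its method names, and the small starting size of 4 are assumptions made for illustration; the 0.6 threshold comes from the text above.

```javascript
// Sketch of a chained hash table that doubles in size past a load threshold.
class ResizingHashTable {
  constructor(size = 4, threshold = 0.6) {
    this.buckets = new Array(size);
    this.count = 0; // number of stored key-value pairs
    this.threshold = threshold;
  }

  _hash(key, size) {
    let total = 0;
    for (let i = 0; i < key.length; i++) {
      total = (total * 31 + key.charCodeAt(i)) % size;
    }
    return total;
  }

  // Inserts into the given bucket array; returns true if the key was new.
  _insert(buckets, key, value) {
    const index = this._hash(key, buckets.length);
    if (!buckets[index]) buckets[index] = [];
    const entry = buckets[index].find(([k]) => k === key);
    if (entry) {
      entry[1] = value;
      return false;
    }
    buckets[index].push([key, value]);
    return true;
  }

  // Creates a table twice as large and rehashes every pair into it.
  _resize() {
    const bigger = new Array(this.buckets.length * 2);
    for (const bucket of this.buckets) {
      if (!bucket) continue;
      for (const [key, value] of bucket) this._insert(bigger, key, value);
    }
    this.buckets = bigger;
  }

  set(key, value) {
    if (this._insert(this.buckets, key, value)) this.count++;
    if (this.count / this.buckets.length > this.threshold) this._resize();
  }

  get(key) {
    const bucket = this.buckets[this._hash(key, this.buckets.length)];
    const entry = bucket && bucket.find(([k]) => k === key);
    return entry ? entry[1] : undefined;
  }
}

const scores = new ResizingHashTable(4, 0.6);
["a", "b", "c"].forEach((key, i) => scores.set(key, i));
// The third insert pushed the load to 3 / 4 = 0.75, above 0.6, so the table doubled.
console.log(scores.buckets.length); // 8
console.log(scores.get("c"));       // 2
```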

Double hashing

Double hashing uses two hash functions. The second one produces an offset that is applied when the first results in a collision. Because the offset depends on the key, double hashing tends to find the next free slot faster than linear probing, and it is particularly useful for smaller hash tables.

The following expression illustrates double hashing, where i counts the collisions encountered so far for the key, starting at 0:

Example


(firstHash(key) + i * secondHash(key)) % tableSize
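To make the expression concrete, here is a small sketch of a table built around it. The firstHash and secondHash names follow the expression above, but their bodies, the DoubleHashingTable class, and the prime table size of 7 are assumptions chosen for illustration. The second hash is written so that it never returns 0, since an offset of 0 would probe the same slot forever.

```javascript
// Sketch of open addressing with double hashing; keys are assumed to be strings.
class DoubleHashingTable {
  constructor(size = 7) {
    // With a prime size, any offset from 1 to size - 1 reaches every slot.
    this.slots = new Array(size).fill(null);
  }

  firstHash(key) {
    let total = 0;
    for (let i = 0; i < key.length; i++) {
      total = (total * 31 + key.charCodeAt(i)) % this.slots.length;
    }
    return total;
  }

  // Returns an offset between 1 and size - 1.
  secondHash(key) {
    let sum = 0;
    for (let i = 0; i < key.length; i++) {
      sum += key.charCodeAt(i);
    }
    return 1 + (sum % (this.slots.length - 1));
  }

  // Index of the i-th probe for this key.
  _probe(key, i) {
    return (this.firstHash(key) + i * this.secondHash(key)) % this.slots.length;
  }

  // Returns true if the pair was stored, false if no slot could be found.
  set(key, value) {
    for (let i = 0; i < this.slots.length; i++) {
      const index = this._probe(key, i);
      const slot = this.slots[index];
      if (slot === null || slot[0] === key) {
        this.slots[index] = [key, value];
        return true;
      }
    }
    return false;
  }

  get(key) {
    for (let i = 0; i < this.slots.length; i++) {
      const slot = this.slots[this._probe(key, i)];
      if (slot === null) return undefined; // an empty slot ends the search
      if (slot[0] === key) return slot[1];
    }
    return undefined;
  }
}

const ages = new DoubleHashingTable(7);
ages.set("ann", 31);
ages.set("bob", 27);
console.log(ages.get("bob")); // 27
console.log(ages.get("eve")); // undefined
```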

Conclusion

A hash table is a data structure that associates keys with values and supports efficient search, insertion, and deletion.

It consists of an array that holds the data and a hash function that determines the index for each key-value pair.

Hash tables are well suited to large data collections and typically offer an average time complexity of O(1) for search operations.

Commonly used hash functions include Arithmetic Modular, Truncation, and Folding.

Hash collisions occur when two distinct keys are assigned the same index in a hash table. Various strategies are employed to manage these collisions, including:

✅ Linear Probing: This method involves searching for the next available index in a sequential manner.
✅ Chaining: In this approach, each index in the hash table points to a linked list that contains all the keys that hash to that index.
✅ Resizing: This technique entails altering the size of the hash table when the load factor exceeds a certain threshold, thereby reducing the likelihood of collisions.
✅ Double Hashing: This strategy utilizes a secondary hash function to determine the next index when a collision occurs.