High Performance Self Organizing Dynamic Dictionaries

Author: Ziaul Choudhury
Date: 2016-07-18
Report no: IIIT/TH/2016/36
Advisor:Suresh Purini

Abstract

A dictionary data structure is a key-value store which supports insert, search and delete operations. Conventional data structures such as balanced binary search trees and hash tables are good both in theory and practice. However, these dictionaries are not adaptive as they do not exploit the temporal locality in key access patterns over time. On the other hand dictionaries like splay trees and working-set trees adapt to the changing key access patterns by reorganizing themselves with every key access. Consequently, the set of most frequently accessed key, i.e. the working-set, can be retrieved quickly in these dictionaries. With the gaining popularity of GPU based accelerator platforms for general purpose computing, several data structures such as B-tees and hash tables have been successfully ported to these platforms. In this thesis, we propose heterogeneous (CPU+GPU) hash tables, that optimizes operations for the frequently accessed keys. The basic idea is to maintain a dynamic set of most frequently accessed keys in the GPU memory and the rest of the keys in the CPU main memory. Further, queries are processed in batches of fixed size. We measured the query throughput of our hash tables using Millions of Queries Processed per Second (MQPS) as a metric, on different key access distributions. On distributions, where some keys are queried more frequently than others, we achieved on an average 10x higher MQPS when compared to a highly tuned serial hash table in the C++ Boost library; and 5x higher MQPS against a state of the art concurrent lock free hash table. The maximum load factor on the hash tables was set to 0.9. On uniform random query distributions, as expected our hash tables do not out perform concurrent lock free hash tables, nevertheless matches their performance. The second main contribution of this thesis is the design and implementation of self organizing Cache Oblivious B-trees (CO B-tree) which use the memory hierarchy effectively to reduce the number of cache misses incurred while processing a query. A cache oblivious B-tree (CO B-tree) incurs optimal number of cache block transfers across any two levels in a multi level memory hierarchy. In this work, we propose two variants of the dynamic CO B-tree, a working set CO B-tree and a hybrid CO B-tree. They reorganize themselves with every key access facilitating faster retrieval of the frequently accessed keys with the minimum number of cache misses. Theoretically, we prove that a self organizing CO B-tree can be designed with a small additive factor overhead to the already proven bounds for the data structure. In empirical experiments we compare both our data structures to a B-tree, an AVL tree, a top down splay tree and a static/dynamic CO B-tree. Our CO B-trees, on average, have 1.5 to 2 times lesser cache misses compared to the B-tree and the top down splay tree respectively. For higher degree of temporal locality in access patterns, the hybrid CO B-tree has better query throughput (queries processed per unit time) and lesser number of cache misses compared to the rest of the data structures including a static CO B-tree.

Full thesis: pdf

Centre for Software Engineering Research Lab

IIIT Hyderabad Publications

High Performance Self Organizing Dynamic Dictionaries

Abstract