redis-improve

2021-11-23

redis

回忆下redis的查询，大概就是判断下是否是空、是否在rehash，然后计算出哈希值，到两个哈希表中都找找。

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h, idx, table;

    if (dictSize(d) == 0) return NULL; /* dict is empty */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    for (table = 0; table <= 1; table++) {
        idx = h & DICTHT_SIZE_MASK(d->ht_size_exp[table]);
        he = d->ht_table[table][idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}

dict的数据结构

struct dict {
    dictType *type;

    dictEntry **ht_table[2];
    unsigned long ht_used[2];

    long rehashidx; /* rehashing not in progress if rehashidx == -1 */

    /* Keep small vars at end for optimal (minimal) struct padding */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
    signed char ht_size_exp[2]; /* exponent of size. (size = 1<<exp) */
};

发现和之前看的dict很不一样

原来的dict和dictht
/* This is our hash table structure. Every dictionary has two of this as we
 * implement incremental rehashing, for the old to the new table. */
typedef struct dictht {
    dictEntry **table;
    unsigned long size;
    unsigned long sizemask;
    unsigned long used;
} dictht;

typedef struct dict {
    dictType *type;
    void *privdata;
    dictht ht[2];
    long rehashidx; /* rehashing not in progress if rehashidx == -1 */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
} dict;

现在的：
struct dict {
    dictType *type;

    dictEntry **ht_table[2];
    unsigned long ht_used[2];

    long rehashidx; /* rehashing not in progress if rehashidx == -1 */

    /* Keep small vars at end for optimal (minimal) struct padding */
    int16_t pauserehash; /* If >0 rehashing is paused (<0 indicates coding error) */
    char ht_size_exp[2]; /* exponent of size. (size = 1<<exp) */
};

看下issue：https://github.com/redis/redis/pull/9228

总结一下就是省内存吧hashtable直接放到dict中去了。另开一篇来讲下这个issue吧，改动量还是很大的。

再看下rehash的一个过程

/* Performs N steps of incremental rehashing. Returns 1 if there are still
 * keys to move from the old to the new hash table, otherwise 0 is returned.
 *
 * Note that a rehashing step consists in moving a bucket (that may have more
 * than one key as we use chaining) from the old to the new hash table, however
 * since part of the hash table may be composed of empty spaces, it is not
 * guaranteed that this function will rehash even a single bucket, since it
 * will visit at max N*10 empty buckets in total, otherwise the amount of
 * work it does would be unbound and the function may block for a long time. */
 翻译：渐进式rehash的N步骤。返回1如果dict依然有key从老的哈希表到新的哈希表，否则返回0.
 注意一个rehash的过程，包含移动一个bucket（处理的有可能比一个key更多）。然而一个哈希表可能有空的组成，这个不保证这个方法会rehash到每一个bucket，自从它会访问最大N*10个空bucket，否则这个方法会无限制，有可能会阻塞很长一段时间。
int dictRehash(dict *d, int n) {
    int empty_visits = n*10; /* Max number of empty buckets to visit. */
    if (!dictIsRehashing(d)) return 0;

    while(n-- && d->ht_used[0] != 0) {	//table0中还有元素
        dictEntry *de, *nextde;

        /* Note that rehashidx can't overflow as we are sure there are more
         * elements because ht[0].used != 0 */
        assert(DICTHT_SIZE(d->ht_size_exp[0]) > (unsigned long)d->rehashidx);
        while(d->ht_table[0][d->rehashidx] == NULL) {	//如果table0的这个entry为空
            d->rehashidx++;	
            if (--empty_visits == 0) return 1;	//空的到数了也要退出，不能阻塞
        }
        de = d->ht_table[0][d->rehashidx];	//这个de是有值的
        /* Move all the keys in this bucket from the old to the new hash HT */
        while(de) {
            uint64_t h;

            nextde = de->next;
            /* Get the index in the new hash table */
            h = dictHashKey(d, de->key) & DICTHT_SIZE_MASK(d->ht_size_exp[1]);
            de->next = d->ht_table[1][h];//接起来
            d->ht_table[1][h] = de;	//放到entry的头
            d->ht_used[0]--;	
            d->ht_used[1]++;
            de = nextde;	
        }
        d->ht_table[0][d->rehashidx] = NULL;
        d->rehashidx++;
    }

    /* Check if we already rehashed the whole table... */
    if (d->ht_used[0] == 0) {	//处理完后判断下是否还需要rehash
        zfree(d->ht_table[0]);	//释放ht0
        /* Copy the new ht onto the old one */
        d->ht_table[0] = d->ht_table[1];	
        d->ht_used[0] = d->ht_used[1];
        d->ht_size_exp[0] = d->ht_size_exp[1];
        _dictReset(d, 1);
        d->rehashidx = -1;
        return 0;
    }

    /* More to rehash... */
    return 1;
}

我们注意到rehash是有一个empty_visits，值是10*N，函数内有两个地方调用，传参分别是1,和100，1是增删改查中加入的，100是下面这个方法中加入的

/* Rehash in ms+"delta" milliseconds. The value of "delta" is larger 
 * than 0, and is smaller than 1 in most cases. The exact upper bound 
 * depends on the running time of dictRehash(d,100).*/
int dictRehashMilliseconds(dict *d, int ms) {
    if (d->pauserehash > 0) return 0;

    long long start = timeInMilliseconds();
    int rehashes = 0;

    while(dictRehash(d,100)) {
        rehashes += 100;
        if (timeInMilliseconds()-start > ms) break;
    }
    return rehashes;
}

回到开头的dictfind函数的分析，就有一个优化的想法：

我们之前都是需要到两个哈希表中都找找，但如果用到rehashidx这个参数，如果dict不在rehash那么rehashidx=-1，如果在rehash并且我们要找的key处理后的位置idx_0要小于rehashidx，那就去table0中去寻找，如果大于等于rehashidx的话就肯定实在表格二中去寻找。

于是就提了一个issue：https://github.com/redis/redis/issues/9837

修改的代码如下：

原代码：
dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h, idx, table;

    if (dictSize(d) == 0) return NULL; /* dict is empty */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    for (table = 0; table <= 1; table++) {
        idx = h & DICTHT_SIZE_MASK(d->ht_size_exp[table]);
        he = d->ht_table[table][idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
        if (!dictIsRehashing(d)) return NULL;
    }
    return NULL;
}

修改后：
dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h;
    long idx_0, idx_1;

    if (dictSize(d) == 0) return NULL; /* dict is empty */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    idx_0 = h & DICTHT_SIZE_MASK(d->ht_size_exp[0]);	先算出这个key在idx_0中的位置
    if (likely(d->rehashidx < idx_0)){	//采用likely进行优化 如果大于rehashidx 说明没有在rehash或者rehash的还没有到这里来
        he = d->ht_table[0][idx_0];		
    } else {	//则说明已经rehash并且key保存的数据已经到table1中了
        idx_1 = h & DICTHT_SIZE_MASK(d->ht_size_exp[1]);
        he = d->ht_table[1][idx_1];
    }
    while(he) {
        if (key==he->key || dictCompareKeys(d, key, he->key))
            return he;
        he = he->next;
    }
    return NULL;
}

初看代码，感觉是有进行性能的优化的，毕竟之前是要查找两个表现在只要查找一个表。但仔细分析：

1.如果dict有key对应的value，且在table0。那我优化后的代码和之前比，性能并无差异，甚至还多了一个rehashidx与idx_0的判断，所以为了优化后面我在判断上加上了likely。

2.如果dict有key，且已rehash到table1了。源代码中看似要去找table0会消耗无用的查找，但仔细分析其实这个时候的table0中对应下标的entry已经是空了，也不会在消耗太多时间，而优化的代码依然是多了一个判断，还多了一个h & DICTHT_SIZE_MASK(d->ht_size_exp[1])的运算，这里我有想过队医idx_0和idx_1的关系，并有对其进行优化的想法，就引出了C语言中算术符号的速度的了解。

3.如果dict中没有这个key，那么原来代码就需要遍历两个table下面的entry，而我这里确只要遍历一个，看似有性能的提升。但是源代码中的两个table中的entry最多只会有一个是有值的（如果没在rehash那就只有在table0中有值，如果在rehash，那么也是只有table0或者table1中会有值），而优化后的其实也是回去找一个table中的entry，而且这个table的entry的有值概率和修改前的概率是一样的。

总结来说：优化了个寂寞，在各种场景来看都没有进行优化，最终的测试结果也说明了这一点。不过整个过程还是对dict更加熟悉了，并且对redis还是更有‘敬畏之心’了，毕竟这么成熟的一个项目，对于dictfind这么大的一个方法可以是磨练过很多次了。还有一些coding和测试中的一些疑惑后续的文章会展开来讲下。

12/5新进展：https://github.com/redis/redis/pull/9838

在社区里面有一些回复：

这个叫madolson的网友，说我的修改替代在原来的代码上加一行：

if (dictIsRehashing(d) && d->rehashidx >= idx) continue;

那就改为了

dictEntry *dictFind(dict *d, const void *key)
{
    dictEntry *he;
    uint64_t h, idx, table;

    if (dictSize(d) == 0) return NULL; /* dict is empty */
    if (dictIsRehashing(d)) _dictRehashStep(d);
    h = dictHashKey(d, key);
    for (table = 0; table <= 1; table++) {
        idx = h & DICTHT_SIZE_MASK(d->ht_size_exp[table]);
        if (dictIsRehashing(d) && d->rehashidx >= idx) continue;
        he = d->ht_table[table][idx];
        while(he) {
            if (key==he->key || dictCompareKeys(d, key, he->key))
                return he;
            he = he->next;
        }
    }
    return NULL;
}

看样子也还不错，但是我觉得有点不好的地方在于这里没有likely的使用，这个判断在正常的没有hash的情况下也会做，影响性能。还有一个缺点下面的评论中也提及了。

这个叫oranagra的网友是在redis公司工作，并且有合入权限，所以他的建议比较重要，要顺着他心意才有合入的可能。大致翻译下他的内容：

让我们检查下到底改变了什么：
他阻止了在错误的哈希表上查询的动作，但链表总有一个是空的，所以我们没有改变这个。

另一个改变就是他阻止了从获取table的空指针，这个将会是一个缓存未命中，所以这里是有一些价值的

最后，即使我们找的bucket我们还没有rehash的，如果key不存在在dict中，原来的代码会继续在第二个table找：做了一个多余的哈希index（由于ht_size_exp有可能已经在CPU的cache中，性能不会影响太多），但是他还会继续尝试获取ht_table[1][idx]，这个就肯定会造成一个缓存未命中

这个就是madolson的建议是不太好的原因，也许我们可以添加continue和break，但是重要看起来做了过多的操作去保护loop

最后提下我不确定压测会告诉我们这个改变有啥结果，因为我们需要在rehashing的时候压测。有可能当dict变大的时候我们会看到变化，但这个改进会在压测数据没有在rehashing的时候被冲淡。

总的来说他还是认可这个pr中的一些价值，但是对于改进的效率不太好测也不会很明显。