Skip to content

bpo-34751: improved hash function for tuples#9534

Closed
jdemeyer wants to merge 1 commit into
python:masterfrom
jdemeyer:bpo34751-fnv
Closed

bpo-34751: improved hash function for tuples#9534
jdemeyer wants to merge 1 commit into
python:masterfrom
jdemeyer:bpo34751-fnv

Conversation

@jdemeyer

@jdemeyer jdemeyer commented Sep 24, 2018

Copy link
Copy Markdown
Contributor

This patch improves the hash code for tuples to avoid the obvious hash collision

hash((3, 3)) == hash((-3, -3))

Pseudo-code of the new hash:

def tuplehash(t):
    h = INITIALVALUE
    for x in t:
        y = hash(x)
        y = mangle(y)
        h = (h ^ y) * MULTIPLIER
    return h + FINALVALUE

def mangle(y):
    return y ^ (2 * y)

This has the structure of a standard FNV-1a hash. The line y = mangle(y) was added to avoid hash collisions for nested tuples and to work around collisions due to x ^ -2 = -x for odd x.

The constants were chosen as follows:

  • INITIALVALUE = 3430008: kept from old algorithm
  • FINALVALUE = 97531: kept from old algorithm
  • MULTIPLIER = 3**41 (truncated to the platform bitsize): a sufficiently big odd number without obvious bit structure. The standard FNV multipliers tended to create more collisions, probably due to the high number of 0 bits.

https://bugs.python.org/issue34751

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants