What's the logic behind Python's hash function order? -
as know, of python's data structures use hash tables storing items set
or dictionary
. there no order in these objects. seems that, sequences of numbers that's not true.
for example consider following examples :
>>> set([7,2,5,3,6]) set([2, 3, 5, 6, 7]) >>> set([4,5,3,0,1,2]) set([0, 1, 2, 3, 4, 5])
but isn't sorted if make small change :
>>> set([8,2,5,3,6]) set([8, 2, 3, 5, 6])
so question is: how python's hash function work on integer sequences?
although there lot of questions in hash
, order,but no 1 of them explains algorithm of hash function.
so need here know how python calculate indices in hash table.
if go through hashtable.c
file in cpython source code you'll see following lines in _py_hashtable_set
function shows way python calculate index of hash table keys :
key_hash = ht->hash_func(key); index = key_hash & (ht->num_buckets - 1);
so hash value of integers integer * (except -1) index based on number , length of data structure (ht->num_buckets - 1
) , calculated bitwise-and between (ht->num_buckets - 1)
, number.
now consider following example set
use hash-table :
>>> set([0,1919,2000,3,45,33,333,5]) set([0, 33, 3, 5, 45, 333, 2000, 1919])
for number 33
have :
33 & (ht->num_buckets - 1) = 1
that it's :
'0b100001' & '0b111'= '0b1' # 1 index of 33
note in case (ht->num_buckets - 1)
8-1=7
or 0b111
.
and 1919
:
'0b11101111111' & '0b111' = '0b111' # 7 index of 1919
and 333
:
'0b101001101' & '0b111' = '0b101' # 5 index of 333
and preceding examples in question :
>>> set([8,2,5,3,6]) set([8, 2, 3, 5, 6]) '0b1000' & '0b100'='0b0' # 8 '0b110' & '0b100'='0b100' # 8
* hash function class int
:
class int: def __hash__(self): value = self if value == -1: value = -2 return value
Comments
Post a Comment