What's the logic behind Python's hash function order? -


as know, of python's data structures use hash tables storing items set or dictionary. there no order in these objects. seems that, sequences of numbers that's not true.

for example consider following examples :

>>> set([7,2,5,3,6]) set([2, 3, 5, 6, 7])  >>> set([4,5,3,0,1,2]) set([0, 1, 2, 3, 4, 5]) 

but isn't sorted if make small change :

>>> set([8,2,5,3,6]) set([8, 2, 3, 5, 6]) 

so question is: how python's hash function work on integer sequences?

although there lot of questions in hash , order,but no 1 of them explains algorithm of hash function.

so need here know how python calculate indices in hash table.

if go through hashtable.c file in cpython source code you'll see following lines in _py_hashtable_set function shows way python calculate index of hash table keys :

key_hash = ht->hash_func(key); index = key_hash & (ht->num_buckets - 1); 

so hash value of integers integer * (except -1) index based on number , length of data structure (ht->num_buckets - 1) , calculated bitwise-and between (ht->num_buckets - 1) , number.

now consider following example set use hash-table :

>>> set([0,1919,2000,3,45,33,333,5]) set([0, 33, 3, 5, 45, 333, 2000, 1919]) 

for number 33 have :

33 & (ht->num_buckets - 1) = 1 

that it's :

'0b100001' & '0b111'= '0b1' # 1 index of 33 

note in case (ht->num_buckets - 1) 8-1=7 or 0b111.

and 1919 :

'0b11101111111' & '0b111' = '0b111' # 7 index of 1919 

and 333 :

'0b101001101' & '0b111' = '0b101' # 5 index of 333 

and preceding examples in question :

>>> set([8,2,5,3,6]) set([8, 2, 3, 5, 6])  '0b1000' & '0b100'='0b0' # 8 '0b110' & '0b100'='0b100' # 8 

* hash function class int :

class int:     def __hash__(self):         value = self         if value == -1:             value = -2         return value 


Comments

Popular posts from this blog

Java 3D LWJGL collision -

spring - SubProtocolWebSocketHandler - No handlers -

methods - python can't use function in submodule -