Skip to content

Instantly share code, notes, and snippets.

@joeldrapper
Created May 13, 2024 12:09
Show Gist options
  • Select an option

  • Save joeldrapper/e5e345f8904f93d7a602677ccba4722c to your computer and use it in GitHub Desktop.

Select an option

Save joeldrapper/e5e345f8904f93d7a602677ccba4722c to your computer and use it in GitHub Desktop.

Revisions

  1. joeldrapper created this gist May 13, 2024.
    63 changes: 63 additions & 0 deletions murmur.rb
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,63 @@
    module Murmur
    MASK32 = 0xffffffff

    def self.hash(str, seed = 0)
    # Initialize variables
    # h1: The main hash value that will be iteratively updated
    # k1: A temporary value used to process chunks of 4 bytes
    # i: Counter to keep track of the number of bytes processed
    h1 = seed
    k1 = 0
    i = 0

    # Iterate over each byte of the input string
    str.each_byte do |byte|
    # Combine the current byte with the existing k1 value
    # This is done to build a chunk of 4 bytes
    k1 |= byte << (i * 8)
    i += 1

    # Process the combined bytes when i reaches 4
    # This ensures that the hash is computed in chunks of 4 bytes
    next unless i == 4

    # Perform bitwise operations on k1
    # These operations are designed to mix and scramble the bits of k1
    # The constants 0xcc9e2d51 and 0x1b873593 are chosen for optimal mixing
    k1 = (k1 * 0xcc9e2d51) & MASK32
    k1 = ((k1 << 15) | (k1 >> (32 - 15))) & MASK32
    k1 = (k1 * 0x1b873593) & MASK32

    # Update the hash value h1 by XORing it with the scrambled k1
    # This combines the current chunk's hash with the overall hash
    # The shift and addition operations further mix the bits of h1
    h1 ^= k1
    h1 = ((h1 << 13) | (h1 >> (32 - 13))) & MASK32
    h1 = ((h1 * 5) + 0xe6546b64) & MASK32

    # Reset i and k1 for the next iteration
    i = 0
    k1 = 0
    end

    # Process any remaining bytes if the string length is not a multiple of 4
    # This ensures that all bytes contribute to the final hash value
    if i > 0
    k1 = (k1 * 0xcc9e2d51) & MASK32
    k1 = ((k1 << 15) | (k1 >> (32 - 15))) & MASK32
    k1 = (k1 * 0x1b873593) & MASK32

    h1 ^= k1
    end

    # Finalize the hash value
    # These operations mix the bits of h1 with the length of the input string
    # The constants used are chosen for optimal mixing and avalanche effect
    h1 ^= str.bytesize
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85ebca6b) & MASK32
    h1 ^= h1 >> 13
    h1 = (h1 * 0xc2b2ae35) & MASK32
    h1 ^ (h1 >> 16)
    end
    end