Skip to content

Instantly share code, notes, and snippets.

@artagnon
Created April 28, 2025 18:48
Show Gist options
  • Select an option

  • Save artagnon/b9524935f7f9d355eff082469bc72d03 to your computer and use it in GitHub Desktop.

Select an option

Save artagnon/b9524935f7f9d355eff082469bc72d03 to your computer and use it in GitHub Desktop.
//===----------------------------------------------------------------------===//
//
// The HashRecognize analysis recognizes unoptimized polynomial hash functions
// with operations over a Galois field of characteristic 2, also called binary
// fields, or GF(2^n): this class of hash functions can be optimized using a
// lookup-table-driven implementation, or with target-specific instructions.
// Examples:
//
// 1. Cyclic redundancy check (CRC), which is a polynomial division in GF(2).
// 2. Rabin fingerprint, a component of the Rabin-Karp algorithm, which is a
// rolling hash polynomial division in GF(2).
// 3. Rijndael MixColumns, a step in AES computation, which is a polynomial
// multiplication in GF(2^3).
// 4. GHASH, the authentication mechanism in AES Galois/Counter Mode (GCM),
// which is a polynomial evaluation in GF(2^128).
//
// All of them use an irreducible generating polynomial of degree m,
//
// c_m * x^m + c_(m-1) * x^(m-1) + ... + c_0 * x^0
//
// where each coefficient c is can take values in GF(2^n), where 2^n is termed
// the order of the Galois field. For GF(2), each coefficient can take values
// either 0 or 1, and the polynomial is simply represented by m+1 bits,
// corresponding to the coefficients. The different variants of CRC are named by
// degree of generating polynomial used: so CRC-32 would use a polynomial of
// degree 32.
//
// The reason algorithms on GF(2^n) can be optimized with a lookup-table is the
// following: in such fields, polynomial addition and subtraction are identical
// and equivalent to XOR, polynomial multiplication is an AND, and polynomial
// division is identity: the XOR and AND operations in unoptimized
// implmentations are performed bit-wise, and can be optimized to be performed
// chunk-wise, by interleaving copies of the generating polynomial, and storing
// the pre-computed values in a table.
//
// A generating polynomial of m bits always has the MSB set, so we usually
// omit it. An example of a 16-bit polynomial is the CRC-16-CCITT polynomial:
//
// (x^16) + x^12 + x^5 + 1 = (1) 0001 0000 0010 0001 = 0x1021
//
// Transmissions are either in big-endian or little-endian form, and hash
// algorithms are written according to this. For example, IEEE 802 and RS-232
// specify little-endian transmission.
//
//===----------------------------------------------------------------------===//
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment