Hamming Numbers

lua-users home
wiki

Hamming Numbers are numbers whose only prime factors are 2, 3 and 5. They are named after [Richard Hamming] but became famous (or notorious) after [Edsger Dijkstra] posed the question of how to efficiently enumerate them in numeric order.

The problem is frequently trotted out to explain why "x Programming" is better (particularly for the values of x: functional and logic). An O(n) solution (where n is the size of the sequence of numbers) is possible if you ignore the cost of manipulating large numbers. Buried deep in an interesting [thread] on [Lambda the Ultimate] is an analysis which indicates that the cost might be O(n^(4/3)).

The Haskell solution to this problem is delightfully brief:


-- Merges two infinite lists

merge :: (Ord a) => [a] -> [a] -> [a]

merge (x:xs)(y:ys)

  | x == y    = x : merge xs ys

  | x <  y    = x : merge xs (y:ys)

  | otherwise = y : merge (x:xs) ys



-- Lazily produce the hamming sequence

hamming :: [Integer]

hamming 

  = 1 : merge (map (2*) hamming) (merge (map (3*) hamming) (map (5*) hamming))

although even shorter ones exist. The essence of the above solution is that Haskell evaluates lazily, so the infinite sequence hamming can be self-referential. That's possible to do in Lua, using a sort of promise. The code below implements just enough of a bignum implementation to be able to do the multiplications by 2, 3 and 5, and to print out the results. (Hamming number 7305 is 2^52, so you'll need Bignum support if you want to compute more Hamming numbers than that. In practice, the bignum interface only adds about 33% to the cost of the computation; the interesting thing is that all of the code to implement Hamming is completely oblivious to the actual implementation of the numbers, provided that the implementation supports <, ==, and multiply-by-a-small-integer.

I checked the performance of the program by computing hamming(k) for values of k from 50,000 to 1,000,000; the performance does appear to be roughly O(n) in both space and time, although I made no attempt to optimise and the space consumption is somewhat porcine.


                             usec    RSS

       n     sec.       Mb     /n   kb/n   value

 -------     ----    -----   ----   ----   --------------------------------------------------------------------

   50000     1.99     3768   39.8   75.4   2379126835648701693226200858624

  100000     4.41     5524   44.1   55.2   290142196707511001929482240000000000000

  150000     6.87     6784   45.8   45.2   136551478823710021067381144334863695872000000

  200000     9.42     7932   47.1   39.7   4479571262811807241115438439905203543080960000000

  250000    11.99     8932   48.0   35.7   29168801439715713801701411811093381120000000000000000

  300000    14.57     9944   48.6   33.1   62864273786544799804687500000000000000000000000000000000

  350000    17.18    10768   49.1   30.8   60133943158606419031489628585123616917982648729600000000000

  400000    20.06    14240   50.1   35.6   30774090693237851027531250000000000000000000000000000000000000

  450000    23.10    16104   51.3   35.8   9544028161703913537712243143807801346335324481500000000000000000

  500000    25.98    17392   52.0   34.8   1962938367679548095642112423564462631020433036610484123229980468750

  550000    28.98    18724   52.7   34.0   286188649932207038880389529600000000000000000000000000000000000000000

  600000    32.05    19256   53.4   32.1   31118367413797724237140050340541118674764220486395061016082763671875000

  650000    35.27    20332   54.3   31.3   2624748786803793305341077210881955201024000000000000000000000000000000000

  700000    38.13    21372   54.5   30.5   177288702931066945536000000000000000000000000000000000000000000000000000000

  750000    41.34    21536   55.1   28.7   9844116074857088479400896102109753641824661421606444941755765750304761446400

  800000    44.51    22280   55.6   27.9   458936503790258814279745536000000000000000000000000000000000000000000000000000

  850000    47.59    23896   56.0   28.1   18286806033409508387421738007105886936187744140625000000000000000000000000000000

  900000    50.64    23952   56.3   26.6   632306706993546185913146473467350006103515625000000000000000000000000000000000000

  950000    54.02    26552   56.9   27.9   19221158232427643481897048859403926759149694825267200000000000000000000000000000000

 1000000    57.36    26800   57.4   26.8   519312780448388736089589843750000000000000000000000000000000000000000000000000000000

So here's the code:

do 

  local meta = {}

  function meta:__index(k)

    if k == "tail" then

      local rv = self.gen(self)

      self.tail, self.gen = rv, nil

      return rv

    end

  end

  function Seq(h, gen)

    return setmetatable({head = h, gen = gen}, meta)

  end

end



function SMap(func, seq)

  return Seq(func(seq.head),

             function() return SMap(func, seq.tail) end)

end



function SMerge(seq1, seq2)

  local head1, head2 = seq1.head, seq2.head

  local step

  if head1 < head2 then

    function step() return SMerge(seq1.tail, seq2) end

  elseif head2 < head1 then

    function step() return SMerge(seq1, seq2.tail) end

    head1 = head2

  else

    function step() return SMerge(seq1.tail, seq2.tail) end

  end

  return Seq(head1, step)

end



function Times(k)

  if k then

    return function(x) return x * k end

  else

    return function(x, y) return x * y end

  end

end



function Hamming()

  local init = 1

  if Bignum then init = Bignum(init) end

  local seq = Seq(init)

  local seq2, seq3, seq5 =

        SMap(Times(2), seq), SMap(Times(3), seq), SMap(Times(5), seq)  

  function seq.gen() return SMerge(seq2, SMerge(seq3, seq5)) end

  return seq

end



function SeqTail(seq, k)

  for i = 1, k do

    seq = seq.tail

  end

  return seq

end



if arg then

  local start, finish, inc = tonumber(arg[1]), tonumber(arg[2]), tonumber(arg[3])

  if not start then start, finish, inc = 1, 20, 1 end

  local h = SeqTail(Hamming(), start-1)

  print("hamming", start, h.head)

  if finish then

    while start + inc <= finish do

      start = start + inc

      h = SeqTail(h, inc)

      print("hamming", start, h.head)

    end

  end

end

Here are a few more interesting infinite sequence functions, including a Fibonacci generator (although I wouldn't recommend it in practice, it does run in constant space and more-or-less linear time.)

function SAlways(val)

  return Seq(val, function(seq) return seq end)

end



function SDist(func, init, seq)

  return Seq(init, function(self) return SDist(func, func(self.head, seq.head), seq.tail) end)

end



function SMap2(func, seq1, seq2)

  return Seq(func(seq1.head, seq2.head),

             function() return SMap2(func, seq1.tail, seq2.tail) end)

end



function Plus(k)

  if k then

    return function(x) return x + k end

  else

    return function(x, y) return x + y end

  end

end



Iota = SDist(Plus(), 1, SAlways(1))



function Fib(i, j)

  i, j = i or 1, j or 1

  local seq = Seq(Bignum(i))

  seq.tail = SDist(Plus(), Bignum(j), seq)

  return seq

end

and here's the simple Bignum implementation (define this first if you want to test the above code):

do

  -- very limited bignum stuff; just enough for the examples here.

  -- Please feel free to improve it.

  local base = 1e15

  local fmt = "%015.0f"

  local meta = {}

  function meta:__lt(other)

    if self.n ~= other.n then return self.n < other.n end

    for i = 1, self.n do

      if self[i] ~= other[i] then return self[i] < other[i] end

    end

  end

  function meta:__eq(other)

    if self.n == other.n then

      for i = 1, self.n do

        if self[i] ~= other[i] then return false end

      end

      return true

    end

  end

  function meta:__mul(k)

    -- If the base where a multiple of all possible multipliers, then

    -- we could figure out the length of the result directly from the

    -- first "digit". On the other hand, printing the numbers would be

    -- difficult. So we accept the occasional overflow.

    local offset = 0

    if self[1] * k >= base then offset = 1 end

    local carry = 0

    local result = {}

    local n = self.n

    for i = n, 1, -1 do

      local tmp = self[i] * k + carry

      local digit = tmp % base

      carry = (tmp - digit) / base

      result[offset + i] = digit

    end

    if carry ~= 0 then

      n = n + 1

      if offset == 0 then

        for i = n, 2, -1 do

          result[i] = result[i - 1]

        end

      end

      result[1] = carry

    end

    result.n = n

    return setmetatable(result, meta)

  end

  -- Added so that Fib would work; other must be a Bignum

  function meta:__add(other)

    local result = {}

    if self.n < other.n then self, other = other, self end

    local n, m = self.n, other.n

    local diff = n - m

    local carry = 0

    local result = {}

    for i = m, 1, -1 do

      local tmp = self[i + diff] + other[i] + carry

      if tmp < base then

        carry = 0

      else

        tmp = tmp - base

        carry = 1

      end

      result[i + diff] = tmp

    end

    for i = diff, 1, -1 do

      local tmp = self[i] + carry

      if tmp < base then

        carry = 0

      else

        tmp = tmp - base

        carry = 1

      end

      result[i] = tmp

    end

    if carry > 0 then

      n = n + 1

      for i = n, 2, -1 do

        result[i] = result[i - 1]

      end

      result[1] = carry

    end

    result.n = n

    return setmetatable(result, meta)

  end



  function meta:__tostring()

    local tmp = {}

    tmp[1] = ("%.0f"):format(self[1])

    for i = 2, self.n do

      tmp[i] = fmt:format(self[i])

    end

    return table.concat(tmp)

  end

  function Bignum(k)

    return setmetatable({k, n = 1}, meta)

  end

end

For reference, the time/space chart was created (except for the heading) with the following (which will probably only work on FreeBSD or similar):


for ((i = 50000; $i<=1000000; i = $i + 50000)) do /usr/bin/time -apl -o hamming.log ./hamming.lua $i >> hamming.log; done

egrep '(hamming|user|maximum)' hamming.log | \

 lua -e 'local a = io.read"*a"; \

         for n, res, time, size in string.gfind(a, "(%d+)%s+(%d+)%D+([%d.]+)%s+(%d+)") do \

           print(string.format("%8i %8.2f %8i %6.1f %6.1f   %s", \

                               n, time, size, (time*1e6)/n, (size*1e3)/n, res)) \

         end' > hamming.res

--RiciLake

See Also


RecentChanges · preferences
edit · history
Last edited May 28, 2007 8:42 pm GMT (diff)