Lua Type Checking

lua-users home
wiki

Many programming languages provide some form of static (compile-time) or dynamic (run-time) type checking, with each form having its own merits [1]. Lua performs run-time type checking on its built-in operations. For example, this code triggers a run-time error:

> x = 5 + "ok"

stdin:1: attempt to perform arithmetic on a string value

However, unlike in languages like C, there is no built-in mechanism in Lua for type checking the parameters and return values of function calls. The types are left unspecified:

function abs(x)

  return x >= 0 and x or -x

end

This allows a lot of flexibility. For example, a function like print can accept values of many types. However, it can leave functions underspecified and more prone to usage errors. You are allowed to call this function on something other than a number, though the operations inside the function can will trigger a somewhat cryptic error at run-time (where it would be a compile-time error in C).

> print(abs("hello"))

stdin:2: attempt to compare number with string

stack traceback:

        stdin:2: in function 'abs'

        stdin:1: in main chunk

        [C]: ?

Solution: Assertions at Top of Function

For improved error reporting, one is typically told to do something like this:

function abs(x)

  assert(type(x) == "number", "abs expects a number")

  return x >= 0 and x or -x

end

> print(abs("hello"))

stdin:2: abs expects a number

stack traceback:

        [C]: in function 'assert'

        stdin:2: in function 'abs'

        stdin:1: in main chunk

        [C]: ?

That's a good suggestion, but one might complain that this imposes additional run-time overhead, it will only detect program errors for code that gets executed instead of all code that is compiled, the types of the function values are entirely in the function's implementation (not available by introspection), and it can involve a lot of repetitive code (especially when the named arguments are passed via a table).

Here is one ways to proactively check named parameters:

local Box = {}

local is_key = {x=true,y=true,width=true,height=true,color=true}

function create_box(t)

  local x = t.x or 0

  local y = t.y or 0

  local width = t.width or 0

  local height = t.height or 0

  local color = t.color

  assert(type(x) == "number", "x must be number or nil")

  assert(type(y) == "number", "y must be number or nil")

  assert(type(width) == "number", "width must be number be number or nil")

  assert(type(height) == "number", "height must be number or nil")

  assert(color == "red" or color == "blue", "color must be 'red' or 'blue'")

  for k,v in pairs(t) do

    assert(is_key[k], tostring(k) .. " not valid key")

  end

  return setmetatable({x1=x,y1=y,x2=x+width,y2=y+width,color=color}, Box)

end

It is relatively a lot of code. In fact, we might want to use error rather than assert to provide an appropriate level parameter for the stack traceback.

Solution: Function Decorators

Another approach is to place the type check code outside of the original function, potentially with a "function decorator" (see DecoratorsAndDocstrings for background).

is_typecheck = true



function typecheck(...)

  return function(f)

    return function(...)

      assert(false, "FIX-TODO: ADD SOME GENERIC TYPE CHECK CODE HERE")

      return f(...)

    end

  end

end



function notypecheck()

  return function(f) return f end

end



typecheck = is_typecheck and typecheck or notypecheck



sum = typecheck("number", "number", "->", "number")(

  function(a,b)

    return a + b

  end

)

The advantage is that the type information is outside of the function implementation. We can disable all the type checking by switching a single variable, and no added overhead would remain when the functions are executed (though there is some slight added overhead when the functions are built). The typecheck function could also store away the type info for later introspection.

This approach is similar to the one described in LuaList:/2002-07/msg00209.html (warning: Lua 4).

Solution: the checks library

The solutions above present some limitations:

* they are rather verbose, and non-trivial verifications can be hard to read;

* the error messages aren't as clear as those returned by Lua's primitives. Moreover, they indicate an error in the called function, at the place where assert() fails, rather than in the calling function which passed invalid arguments.

The checks library offers a terse, flexible and readable way to produce good error messages. Types are described by strings, which can of course be Lua type names, but can also be stored in an object's metatable, under the __type field. Additional, arbitrary type-checking functions can also be registered in a dedicated checkers table. For instance, if one wants to check for an IP port number, which must be between 0 and 0xffff, one can define a port type as follows:

function checkers.port(x)

    return type(x)=='number' and 0<=x and x<=0xffff and math.floor(x)==0 

end

To remove useless boilerplate code, checks() retrieves the parameters directly from the stack frame, no need to repeat them; for instance, if function f(num, str) needs a number and a string, it can be implemented as follows:

function f(num, str)

    checks('number', 'string')

    --actual function body

end

Types can be combined:

* the vertical bar allows to accept several types, e.g. checks('string|number') accepts both strings and numbers as first parameter.

* a prefix "?" makes a type optional, i.e. also accepts nil. Functionally, it's equivalent to a "nil|" prefix, although it's more readable and faster to check at runtime.

* the question mark can be combined with union bars, e.g. checks('?string|number') accepts strings, numbers and nil.

* finally, a special "!" type accepts anything but nil.

A more detailed description of how the library works can be found in its source's header (https://github.com/fab13n/checks/blob/master/checks.c). The library is part of Sierra Wireless' application framework, accessible here: https://github.com/SierraWireless/luasched. For convinience, it's also available here as a standalone rock: https://github.com/fab13n/checks

Hack: Boxed Values + Possible Values

As mentioned, run-time type checking will not detect program errors that don't get executed. Extensive test suites are particularly essential for programs in dynamically typed languages so that all branches of the code are executed with all conceivable data sets (or at least a good representation of them) so that the run-time assertions are are sufficiently hit. You can't rely as much on the compiler to do some of this checking for you.

Perhaps we can mitigate this by carrying more complete type information with the values. Below is one approach, though is it more a novel proof-of-concept rather than anything for production use at this point.

-- ExTypeCheck.lua ("ExTypeCheck")

-- Type checking for Lua.

--

-- In this type model, types are associated with values at run-time.

-- A type consists of the set of values the value could have

-- at run-time.  This set can be large or infinite, so we

-- store only a small representative subset of those values.

-- Typically one would want to include at least the boundary

-- values (e.g. max and min) in this set.

-- Type checking is performed by verifying that all values

-- in that set are accepted by a predicate function for that type.

-- This predicate function takes a values and returns true or false

-- whether the value is a member of that type.

--

-- As an alternative to representing types as a set of representative

-- values, we could represent types more formally, such as with

-- first-order logic, but then we get into theorem proving,

-- which is more involved.

--

-- DavidManura, 2007, licensed under the same terms as Lua itself.



local M = {}



-- Stored Expression design pattern

-- ( http://lua-users.org/wiki/StatementsInExpressions )

local StoredExpression

do

  local function call(self, ...)

    self.__index = {n = select('#', ...), ...}

    return ...

  end

  function StoredExpression()

    local self = {__call = call}

    return setmetatable(self, self)

  end

end

 

-- Whether to enable type checking (true/false).  Default true.

local is_typecheck = true



-- TypeValue is an abstract type for values that are typed

-- This holds the both the actual value and a subset of possible

-- values the value could assume at runtime.  That set should at least

-- include the min and max values (for bounds checking).

local TypedValue = {}



-- Check that value x satisfies type predicate function f.

function M.check_type(x, f)

  for _,v in ipairs(x) do

    assert(f(v))

  end

  return x.v

end





-- Type check function that decorates functions.

-- Example:

--   abs = typecheck(ranged_real'(-inf,inf)', '->', ranged_real'[0,inf)')(

--     function(x) return x >= 0 and x or -x end

--   )

function M.typecheck(...)

  local types = {...}

  return function(f)

    local function check(i, ...)

      -- Check types of return values.

      if types[i] == "->" then i = i + 1 end

      local j = i

      while types[i] ~= nil do

        M.check_type(select(i - j + 1, ...), types[i])

        i = i + 1

      end

      return ...

    end

    return function(...)

      -- Check types of input parameters.

      local i = 1

      while types[i] ~= nil and types[i] ~= "->" do

        M.check_type(select(i, ...), types[i])

        i = i + 1

      end

      return check(i, f(...))  -- call function

    end

  end

end





function M.notypecheck() return function(f) return f end end

function M.nounbox(x) return x end



M.typecheck = is_typecheck and M.typecheck or M.notypecheck

M.unbox = is_typecheck and M.unbox or M.nounbox



-- Return a boxed version of a binary operation function.

-- For the returned function,

--   Zero, one, or two of the arguments may be boxed.

--   The result value is boxed.

-- Example:

--   __add = boxed_op(function(a,b) return a+b end)

function M.boxed_op(op)

  return function(a, b)

    if getmetatable(a) ~= TypedValue then a = M.box(a) end

    if getmetatable(b) ~= TypedValue then b = M.box(b) end

    local t = M.box(op(M.unbox(a), M.unbox(b)))

    local seen = {[t[1]] = true}

    for _,a2 in ipairs(a) do

      for _,b2 in ipairs(b) do

        local c2 = op(a2, b2)

        if not seen[c2] then

          t[#t + 1] = op(a2, b2)

          seen[c2] = true

        end

      end

    end

    return t

  end

end



-- Return a boxed version of a unary operation function.

-- For the returned function,

--   The argument may optionally be boxed.

--   The result value is boxed.

-- Example:

--   __unm = boxed_uop(function(a) return -a end)

function M.boxed_uop(op)

  return function(a)

    if getmetatable(a) ~= TypedValue then a = M.box(a) end

    local t = M.box(op(M.unbox(a)))

    local seen = {[t[1]] = true}

    for _,a2 in ipairs(a) do

      local c2 = op(a2)

      if not seen[c2] then

        t[#t + 1] = op(a2)

        seen[c2] = true

      end

    end

    return t

  end

end



TypedValue.__index = TypedValue

TypedValue.__add = M.boxed_op(function(a, b) return a + b end)

TypedValue.__sub = M.boxed_op(function(a, b) return a - b end)

TypedValue.__mul = M.boxed_op(function(a, b) return a * b end)

TypedValue.__div = M.boxed_op(function(a, b) return a / b end)

TypedValue.__pow = M.boxed_op(function(a, b) return a ^ b end)

TypedValue.__mod = M.boxed_op(function(a, b) return a % b end)

TypedValue.__concat = M.boxed_op(function(a, b) return a .. b end)

-- TypedValue.__le -- not going to work? (metafunction returns Boolean)

-- TypedValue.__lt -- not going to work? (metafunction returns Boolean)

-- TypedValue.__eq -- not going to work? (metafunction returns Boolean)

TypedValue.__tostring = function(self)

  local str = "[" .. tostring(self.v) .. " in "

  for i,v in ipairs(self) do

    if i ~= 1 then str = str .. ", " end

    str = str .. v

  end

  str = str .. "]"

  return str 

end

-- Convert a regular value into a TypedValue.  We call this "boxing".

function M.box(v, ...)

  local t = setmetatable({v = v, ...}, TypedValue)

  if #t == 0 then t[1] = v end

  return t

end

-- Convert a TypedValue into a regular value.  We call this "unboxing".

function M.unbox(x)

  assert(getmetatable(x) == TypedValue)

  return x.v

end





-- Returns a type predicate function for a given interval over the reals.

-- Example: ranged_real'[0,inf)'

-- Note: this function could be memoized.

function M.ranged_real(name, a, b)

  local ex = StoredExpression()



  if name == "(a,b)" then

    return function(x) return type(x) == "number" and x > a and x < b end

  elseif name == "(a,b]" then

    return function(x) return type(x) == "number" and x > a and x <= b end

  elseif name == "[a,b)" then

    return function(x) return type(x) == "number" and x >= a and x < b end

  elseif name == "[a,b]" then

    return function(x) return type(x) == "number" and x >= a and x <= b end

  elseif name == "(inf,inf)" then

    return function(x) return type(x) == "number" end

  elseif name == "[a,inf)" then

    return function(x) return type(x) == "number" and x >= a end

  elseif name == "(a,inf)" then

    return function(x) return type(x) == "number" and x > a end

  elseif name == "(-inf,a]" then

    return function(x) return type(x) == "number" and x <= a end

  elseif name == "(-inf,a)" then

    return function(x) return type(x) == "number" and x < a end

  elseif name == "[0,inf)" then

    return function(x) return type(x) == "number" and x >= 0 end

  elseif name == "(0,inf)" then

    return function(x) return type(x) == "number" and x > 0 end

  elseif name == "(-inf,0]" then

    return function(x) return type(x) == "number" and x <= 0 end

  elseif name == "(-inf,0)" then

    return function(x) return type(x) == "number" and x < 0 end

  elseif ex(name:match("^([%[%(])(%d+%.?%d*),(%d+%.?%d*)([%]%)])$")) then

    local left, a, b, right = ex[1], tonumber(ex[2]), tonumber(ex[3]), ex[4]

    if left == "(" and right == ")" then

      return function(x) return type(x) == "number" and x > a and x < b end

    elseif left == "(" and right == "]" then

      return function(x) return type(x) == "number" and x > a and x <= b end

    elseif left == "[" and right == ")" then

      return function(x) return type(x) == "number" and x >= a and x < b end

    elseif left == "[" and right == "]" then

      return function(x) return type(x) == "number" and x >= a and x <= b end

    else assert(false)

    end

  else

    error("invalid arg " .. name, 2)

  end

end





return M

Example usage:

-- type_example.lua

-- Test of ExTypeCheck.lua.



local TC = require "ExTypeCheck"

local typecheck = TC.typecheck

local ranged_real = TC.ranged_real

local boxed_uop = TC.boxed_uop

local box = TC.box



-- Checked sqrt function.

local sqrt = typecheck(ranged_real'[0,inf)', '->', ranged_real'[0,inf)')(

  function(x)

    return boxed_uop(math.sqrt)(x)

  end

)



-- Checked random function.

local random = typecheck('->', ranged_real'[0,1)')(

  function()

    return box(math.random(), 0, 0.999999)

  end

)



print(box("a", "a", "b") .. "z")

print(box(3, 3,4,5) % 4)



math.randomseed(os.time())

local x = 0 + random() * 10 - 1 + (random()+1) * 0

print(x + 1); print(sqrt(x + 1)) -- ok

print(x); print(sqrt(x)) -- always asserts! (since x might be negative)

Example output:


[az in az, bz]

[3 in 3, 0, 1]

[8.7835848325787 in 8.7835848325787, 0, 9.99999]

[2.9637113274708 in 2.9637113274708, 0, 3.1622760790292]

[7.7835848325787 in 7.7835848325787, -1, 8.99999]

lua: ./ExTypeCheck.lua:50: assertion failed!

stack traceback:

        [C]: in function 'assert'

        ./ExTypeCheck.lua:50: in function 'check_type'

        ./ExTypeCheck.lua:78: in function 'sqrt'

        testt.lua:30: in main chunk

        [C]: ?

Note: this approach of values holding multiple values does have some similarities to Perl6 junctions (originally termed "quantum superpositions").

Solution: Metalua Runtime Type Checking

There is a Runtime type checking example in Metalua [2].

Solution: Dao

The Dao language, partly based on Lua, has built-in support for optional typing [3].


--DavidManura

I'd argue that, except for very simple programs or those that always have the same input, disabling type checks in a script is a bad idea. --JohnBelmonte

The comment above was anonymously deleted with "This isn't a forum for drive-by opinions". On the contrary, this is established wiki style. People can make comments on pages, and for points of contention that's considered courteous versus summarily changing the original text. The original author can decide to integrate such comments into the original text or not. --JohnBelmonte


RecentChanges · preferences
edit · history
Last edited October 4, 2012 3:08 pm GMT (diff)