Simpler For Iterator

The current model of iterator "for" loops could possibly be simplified. This page is both an attempt to explain the present model and an introduction to a possibly simpler alternative. It seems the complication is due to:

The current syntax reads:

for A, data in iterator_func, X, Y do block end

Here, data is the actual value returned by the function and later used in the block. A, X and Y are explained further below. Here is a possible implementation of both a collection iterator and a generator iterator, based on the tutorial example (written to be very explicit, and starting at 1 for a change):

-- collection iterator --
numbers = {1,3,5,7,9,11,13}

function coll_squares(coll)
    local function next_square(coll, index)
        if index > #coll then
            return nil
        end
        local n = coll[index]     -- local, so that no global n is created
        return index+1, n*n
    end
    return next_square, coll, 1
end

for i, square in coll_squares(numbers) do print (square) end     --> OK



-- generator iterator --
function gen_squares(limit)
    local function next_square(limit, number)
        if number > limit then
            return nil
        end
        return number+1, number*number
    end
    return next_square, limit, 1
end

for n, square in gen_squares(7) do print (square) end     --> OK

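So, what are A, X and Y? In case of a collection, judging from the code above: X is the collection itself, Y is the starting index, and A is the index handed back by each call (here, the index of the next item). In case of a generator: X is the upper limit, Y is the starting number, and A is the number handed back by each call (here, the next number to square).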

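It is difficult to find common ground in order to explain and name A, X and Y meaningfully. X is called 's' in the reference manual, and 'state' in the tutorial. In the reference manual, A is called var_1, while Y is called var. Here is an attempt to make sense of that: call A the "mark" (the index or number that advances at each step), X the "range" (the collection or the limit), and Y the start value of the mark.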

[If anyone finds better names...] In addition to their use in yielding the next data, the mark and the range are also used together to know when to stop iterating. It is not trivial to guess what the iterator and the iterator func are supposed to return, what the func implicitly receives from Lua, and the proper order of all these values.
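For reference, the Lua 5.1 manual describes the generic "for" as equivalent to a plain while loop. Here is a minimal sketch of that expansion, written as an ordinary function so that it can be run. The names generic_for and body are made up for this illustration, and only two loop variables are handled:

```lua
-- Sketch of how "for A, data in iterator_func, X, Y do block end" behaves,
-- following the expansion given in the Lua 5.1 reference manual.
-- generic_for and body are hypothetical names, not part of Lua.
function generic_for(f, s, var, body)
    while true do
        local var_1, var_2 = f(s, var)   -- the func receives X, then the previous A
        var = var_1                      -- A becomes the second argument of the next call
        if var == nil then break end     -- a nil first value ends the loop
        body(var_1, var_2)
    end
end

-- The collection iterator from above.
function coll_squares(coll)
    local function next_square(coll, index)
        if index > #coll then
            return nil
        end
        local n = coll[index]
        return index+1, n*n
    end
    return next_square, coll, 1
end

squares = {}
local f, s, var = coll_squares({1,3,5})     -- what "in coll_squares(...)" evaluates to
generic_for(f, s, var, function(i, square)
    squares[#squares+1] = square
end)
print(table.concat(squares, " "))     --> 1 9 25
```

Seen this way, X is simply passed back unchanged at every call, and the first returned value is the only thing the loop itself inspects.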

The code above may be rewritten as follows:

-- collection iterator --
function coll_squares(coll)
    local index = 1
    local coll = coll       -- just to make things clear
    local function next_square()
        if index > #coll then
            return nil
        end
        local n = coll[index]     -- local, so that no global n is created
        index = index+1
        return n*n
    end
    return next_square
end

for square in coll_squares(numbers) do print (square) end     --> OK



-- generator iterator --
function gen_squares(limit)
    local number = 1
    local limit = limit     -- ditto
    local function next_square()
        if number > limit then
            return nil
        end
        local n = number
        number = number+1
        return n*n
    end
    return next_square
end

for square in gen_squares(7) do print (square) end     --> OK

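There are a few small differences, all of which are simplifications except for the last one: the "for" loop names only the data variable; the iterator returns only the function; the function takes no parameters and returns only the data; and the mark is updated by the function itself instead of being returned and passed back.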

The last point makes the mark (index or number) a local var in the iterator, which is reachable from the nested func (a closure) as an upvalue (right?). The "range" is likewise a local var in the iterator, so there is no need to pass it explicitly as an argument to the function. (please correct if anything is wrong here, including terminology)
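A small sketch of what "upvalue" means in practice, assuming the gen_squares generator from above: each call to the iterator creates a fresh number variable, so two iterators advance independently. The variable names first, second, a1 and so on are only for this illustration:

```lua
function gen_squares(limit)
    local number = 1                    -- the mark: an upvalue of next_square
    local function next_square()
        if number > limit then          -- limit (the range) is an upvalue too
            return nil
        end
        local n = number
        number = number+1
        return n*n
    end
    return next_square
end

first = gen_squares(3)
second = gen_squares(3)

a1 = first()      -- 1
a2 = first()      -- 4
b1 = second()     -- 1, because second has its own number
a3 = first()      -- 9
a4 = first()      -- nil, first is exhausted
print(a1, a2, b1, a3, a4)
```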

We can imagine more complex cases, e.g. specifying the generator interval. Additional data become iterator parameters:

-- generator iterator --
function gen_squares(start, stop, step)
    local number = start
    local function next_square()
        if number > stop then     -- assumes a positive step
            return nil
        end
        local n = number
        number = number+step
        return n*n
    end
    return next_square
end

for square in gen_squares(3,9,2) do print (square) end     --> OK

The same holds if we make a collection iterator more complex (here rather artificially):

-- collection iterator --
-- (math is a standard library, already available without require)
numbers = {1,3,5,7,9,11,13,15,17}

function coll_squares(coll, modulo)
    local index = 1
    local function number_filter()
        -- return next number in coll multiple of modulo, else nil
        while (index <= #coll) do     -- <= so that the last item is tested too
            local number = coll[index]
            if math.fmod(number, modulo) == 0 then
                return number
            end
            index = index+1
        end
        return nil
    end
    local function next_square()
        -- yield squares of multiples of modulo in coll
        local n = number_filter()
        if not n then
            return nil
        end
        index = index+1
        return n*n
    end
    return next_square
end

for square in coll_squares(numbers, 3) do print (square) end     --> OK
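Since the iterator func is now a plain closure without parameters, the filtering and squaring steps above could also be split into small reusable pieces. This is only a sketch of one possible decomposition; values, filter and map are hypothetical helper names, not standard Lua functions:

```lua
-- Closure iterator over the items of a list-like table.
function values(coll)
    local index = 0
    return function()
        index = index+1
        return coll[index]       -- nil past the end stops the loop
    end
end

-- Keep only the items for which keep(item) is true.
function filter(iter, keep)
    return function()
        local item = iter()
        while item ~= nil and not keep(item) do
            item = iter()
        end
        return item
    end
end

-- Apply transform to every item.
function map(iter, transform)
    return function()
        local item = iter()
        if item == nil then
            return nil
        end
        return transform(item)
    end
end

numbers = {1,3,5,7,9,11,13,15,17}
multiples = filter(values(numbers), function(n) return n % 3 == 0 end)

result = {}
for square in map(multiples, function(n) return n*n end) do
    result[#result+1] = square
end
print(table.concat(result, " "))     --> 9 81 225
```

Each helper only needs to know that calling the inner iterator yields the next item or nil; no mark or range crosses the boundary between them.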

In all cases, it seems A, X and Y are not needed. This way of implementing iterators makes good use of Lua's basic features: funcs as values, nested funcs, closures/upvalues. So, a question is: can we simplify the interface between "for" syntax, iterator, and iterator func by getting rid of A, X and Y? If yes, a new syntax could be:

for data in iterator_func do block end

While the present one is:

for A, data in iterator_func, X, Y do block end

As a consequence, the variety of iterators would no longer be covered by the syntax itself, in a rather complicated manner, but left to the user's implementation instead. It would certainly be easier to learn and explain both the syntax and the proper way to write an iterator for a given task.

The reference manual states:

<< f, s, and var are invisible variables. The names are here for explanatory purposes only. >>
In the present proposal, they do not exist at all. The necessary data are passed as parameters to the iterator, as is done now: collection, bounds or whatever.
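As an illustration of how an iterator written for the present model could be carried over, here is a sketch of an adapter that hides f, s and var inside a closure. The name as_closure is hypothetical, and only two returned values are forwarded to keep it short:

```lua
-- Hypothetical adapter: turns the (func, X, Y) triple of the present model
-- into a single closure that keeps X and the mark as upvalues.
function as_closure(f, s, var)
    return function()
        local a, data = f(s, var)
        var = a              -- remember the mark for the next call
        return a, data       -- a nil first value still ends the loop
    end
end

collected = {}
for i, v in as_closure(ipairs({"a", "b", "c"})) do
    collected[#collected+1] = i .. "=" .. v
end
print(table.concat(collected, " "))     --> 1=a 2=b 3=c
```

The loop only ever sees one function; the collection and the index live in the closure, as in the rewritten examples above.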

(first page formulation by DeniSpir)


Last edited November 13, 2009 2:41 pm GMT