- [!] VersionNotice: The problem described in this article has been addressed. Starting in version 5.0, the value n is no longer referenced in size calculations of the list component of a table.
The problem
Tables contain the member "n" as an optimisation for table insertions, i.e., from the manual:
- getn (table)
- Returns the size of a table, when seen as a list. If the table has an n field with a numeric value, this value is the size of the table. Otherwise, the size is the largest numerical index with a non-nil value in the table. This function could be defined in Lua:
function getn (t)
  if type(t.n) == "number" then return t.n end
  local max = 0
  for i, _ in t do
    if type(i) == "number" and i>max then max=i end
  end
  return max
end
Due to the dual nature of tables, i.e. they can be lists and dictionaries at the same time, "n" can conflict with user data in a table. The section below lists a number of solutions to the problem. Please feel free to comment next to a solution or put forward your own. Please leave your name or initials so "votes" can be counted.
This could be defined better. The n field is not the "table size" and is not for "table insertions". (Granted the Lua API function names and documentation confuse this matter.) Tables are a pure data type in Lua, while lists are not. There are several ways to implement lists on top of the table data type. Since lists are an important data type and required even within the VM itself (for varargs), the standard library provides an implementation including use of an n field for the list size and an algorithm for determining the "list component" of an arbitrary table (that's what getn does when there is no n field).
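The clash can be reproduced with a short sketch. It is written in Lua 5.1+ syntax rather than the Lua 4.0 this page discusses (an assumption, made so that it runs on current interpreters), and manual_getn is a hypothetical name for a port of the manual's definition, not the library function itself.

```lua
-- Port of the manual's getn definition to Lua 5.1+ syntax (pairs replaces
-- the Lua 4.0 "for i, _ in t" form). manual_getn is a hypothetical name.
local function manual_getn(t)
  if type(t.n) == "number" then return t.n end
  local max = 0
  for i in pairs(t) do
    if type(i) == "number" and i > max then max = i end
  end
  return max
end

-- Used purely as a list: the size is the largest numeric index.
local list = { "a", "b", "c" }
local list_size = manual_getn(list)

-- Used as a list and a dictionary at once: here n is user data (say, a
-- sample count), but it is mistaken for the list size.
local mixed = { "a", "b", "c", n = 1 }
local mixed_size = manual_getn(mixed)

print(list_size, mixed_size)
```

Both tables hold the same three list elements, yet the second reports a different size only because a numeric value happens to be stored under the key n.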
Alternative solutions
It's not a problem, leave it.
Your programming style may not necessitate the mixing of lists and dictionaries in the same table so it may not be a problem.
- It's not that my programming style doesn't require the mixing of vectors and dictionaries in the same table; it is rather that I don't think that mixing uncontrolled keys with controlled keys in the same table is a good idea. Sooner or later it will bite you, whether the controlled keys are called n or __n__ or ____NUMBER_OF_NUMERIC_KEYS_____. I dislike names like __n__ both from an esthetic view and from the viewpoint that "avoidance by unlikeliness" just creates harder-to-find bugs. Mixing controlled keys (such as slot names) is another issue, but I have no problem not calling my slots "n". Furthermore, embarking on a change which will create backwards incompatibility is not something to take lightly. -- RiciLake
- So, basically you don't want to break backwards compatibility, you find that because the table size is named "n" any conflict bugs are easy to find, and you don't mind always having your table size called "n". Why not remove the problem, as you state, "sooner or later it will bite you", and you can be flexible about naming? I don't remember seeing any code which sets a table size using n (so no backward problem?). --NDT
- I think that summarises it :-). If I'm using a table as a vector, I don't use it as a dictionary although I might put my own keys in it. In that case, I simply don't use the key n or any numeric key. That's no different from the case of stashing keys in any object implemented as a table; you have to avoid using the object's defined keys, which hopefully are documented. As it happens, I do have code which sets table size using n -- unless you have seen every line of Lua code in existence, I don't think you can blithely make the claim that there is no backward problem. In any event, I think it's pretty common to retrieve the table size using vec.n rather than getn(vec) because that is significantly faster if you can be assured that the key exists (or even vec.n or getn(vec)). -- RiciLake
- "which hopefully are documented" :-) It's this kind of confusion that can be avoided. The fact that tables are multipurpose and can be used as lists or dictionaries means that this type of restriction should not be applied. Personally I'd rather put up with fixing code which may be broken by this. I consider Lua still in its infancy as a language, with its roots in convenient embedding and configuration. It still has to have little quirks, such as this, ironed out before it is considered a serious scripting language. I'm not sure becoming a "Python-beater" is one of its design goals, but I'm sure the authors are keen to see the language and its user base develop. This will continue to happen as Lua improves. :-) --NDT
- If you don't like getn, define your own version of getn, tinsert and tremove. That's all. There's only one place else where an 'n' is used: in the arg table of vararg functions; just say that arg is not a list but a table ;-) But see below for another idea (len[t]). -- ET
- To be fair, there are a couple of other places, one being call. However, it is certainly easy enough to write replacement functions and no-one is forcing you to use tinsert and tremove -- RiciLake
- My opinion is more like "It is a problem, leave it". I have yet to see a good solution. The setn idea is a hack on top of a hack. --JohnBelmonte
It should be renamed
The "n" variable should be renamed to something less likely to clash, e.g. "__n__".
- "__n__" would be fine with me, this would not break any code in uCore. --ErikHougaard
- Any direct namespace incursion is ugly and non-general to a degree that it may hurt real world functionality. If someone wanted to use the premapping domain of a table as the environment variables for a shell for an OS, for example, then you would somehow have to restrict the names to exclude whatever name is chosen to be used for the table size. Internally there is no reason for this name to be a legal Lua variable name. So in fact a NULL pointer or other sentinel would be preferable, so that it does not invade the Lua variable namespace. This may lead to some exceptions in the code, however since they overlap with illegal possible values, those exceptions should be being checked today anyway. --Paul Hsieh
- Use table[0]. Whatever arbitrary non-numeric name is chosen has the potential to intrude upon the table's namespace for non-list usage. However, since the table already includes a list, it is already denying access to all positive integers as indices (table[1], table[2], etc) and so adding zero (0) to that is not much of an extra intrusion; far less so than 'n'. -- PeterHill
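The table[0] suggestion can be sketched as follows, again in Lua 5.1+ syntax as an assumption. The append helper is hypothetical and stands in for a tinsert that keeps the size at index 0 instead of in n.

```lua
-- Hypothetical append: the list size lives at index 0, not in the field n.
local function append(list, value)
  local size = (list[0] or 0) + 1
  list[size] = value
  list[0] = size
end

local env = {}
append(env, "first")
append(env, "second")
env.n = "user data"   -- ordinary dictionary entry, no clash with the size

print(env[0], env.n)
```

A numeric loop from 1 to env[0] skips the size slot, but a generic traversal of all keys would still visit index 0, so the intrusion is reduced rather than removed.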
setn()
setn() would be a better solution to complement getn().
- I think this is the best choice, it removes any possibility for a conflict and allows for some variation in the actual underlying implementation --Tom Wrensch
- This is my favorite - make it a value that is not accessible via normal indexing... this could be realized internally by a weak table - maybe even expose that table even though I think a setn() function would be better for readability --PeterPrade
- This seems like a better method. I don't like the idea of something arbitrary being forced on me, regardless of how easy it may be to code around it. --Terence Martin
- This is best. They should probably be tag methods too. --DavidJones
- Probably the best method. --LavergneThomas
- I like this approach as well. I'm tired of working around 'n' in my foreach() loops. It makes sense to complement getn(). The approach below is very clever, and I would certainly be okay with it, but I think it might seem rather unintuitive to new users.
- This is IMHO the best solution, as I really see the addition of an "internal" field as very dangerous. Imagine having a table including all the space separated strings found in a file, for example. Such "hidden" fields are very dangerous, even when named in an "improbable" way. -- Vincent Penquerc'h
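One way to realise setn() without any field inside the table, along the lines of the weak-table remark above, is to keep the sizes in a private weak-keyed table. This is only a sketch in Lua 5.1+ syntax (weak tables via the __mode metatable field); sizes, getn and setn here are local stand-ins chosen for illustration, not the standard library functions.

```lua
-- Private side table: keys are the list tables themselves, held weakly so
-- that recording a size does not keep a list alive.
local sizes = setmetatable({}, { __mode = "k" })

local function setn(t, n)
  sizes[t] = n
end

local function getn(t)
  local n = sizes[t]
  if n then return n end
  -- No recorded size: fall back to the largest numeric index.
  local max = 0
  for i in pairs(t) do
    if type(i) == "number" and i > max then max = i end
  end
  return max
end

local words = { "n", "is", "just", "a", "word" }
words.n = "north"           -- ordinary user data, never consulted by getn
local before = getn(words)  -- computed from the indices
setn(words, 2)
local after = getn(words)   -- the size recorded by setn
```

Because the size is never stored under a key of the list itself, a traversal of words sees only the user's own entries.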
len[t]
- Ok, this would be for Lua 4.1 where you have weak tables (basic idea from Luiz Carlos Silveira): store the length of a list (or any other object) in a global table, e.g. called len. To access the length for an object x simply write len[x] instead of x.n or getn(x)/setn(x,n). You could even set the index tag method on len to calculate the length if it's not already present. And, if you fear you might type () instead of [] set the call method too ;-) -- ET
- I like the cleverness of this idea, although I think setn() rocks the boat the least. --JamesHearn
- I like the cleverness of this idea, too. And it has a number of advantages: it does not leave any clutter in either the internal or visible implementation of a list, so you could iterate over a pure list without worrying about ignoring the n, __n__ or whatever keys. Table lookups are faster than procedure calls, so it's probably faster than the current implementation. And furthermore, it's backward compatible for tables without an index tag method:
settagmethod(tag({}), "index",
  function(v, k) if k == "n" then return len[v] end end)
- So I'm willing to change my vote --RiciLake
- This is also compatible with getn and setn; if that's what you want, just include:
function getn(v) return len[v] end
function setn(v, n) len[v] = n end
- I'd like to change my vote to this. A weak keyed length table would detach n from the table and leave the table "pure". As John points out, lists are a usage of tables and this implementation detaches the list usage from the table. getn()/setn() can still be used. Perhaps len[x] should be changed to tlen[x] for consistency with tinsert()/tremove(). -- Nick Trout
- Does len[] map global names to an integer? Or does it map from the domain of objects to an integer? I'm new to Lua, so if I am misunderstanding, feel free to edit this or my other comments with wild abandon. --Paul Hsieh
- Paul: len[] is a map from table objects (as weak keys) to their "list-length" (if they have been treated as a list before, i.e. getn() or tinsert() etc. has been called with this table object as argument) -- PeterPrade
- The very fact of adding a global variable for this strikes me as being the same kind of thing as adding a hidden field in a table, albeit less intrusive. Sure, adding setn would also introduce a global, but I see it as less of a hack, since it's an API entry, not a place to get rid of an embarrassing thing you'd like to put somewhere. -- Vincent Penquerc'h
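The len[x] idea, including the suggestion to compute a missing length on demand and to accept len(x) as well, can be sketched with the metatables that later replaced tag methods. The code assumes Lua 5.1 or later, not the Lua 4.1 the proposal was written for, and count is a hypothetical helper.

```lua
-- Hypothetical helper: largest numeric index, used when no length is stored.
local function count(t)
  local max = 0
  for i in pairs(t) do
    if type(i) == "number" and i > max then max = i end
  end
  return max
end

-- Length table with weak keys. __index computes and caches a missing
-- length; __call lets len(x) work as well as len[x].
local len = setmetatable({}, {
  __mode = "k",
  __index = function(self, t)
    local n = count(t)
    rawset(self, t, n)
    return n
  end,
  __call = function(self, t) return self[t] end,
})

local x = { "a", "b", "c" }
local first = len[x]    -- computed on first access, then cached
len[x] = 2              -- plays the role of setn(x, 2)
local second = len(x)   -- call form returns the stored value
```

Since len is keyed weakly by the list tables, an entry does not by itself prevent a list from being collected, and the lists carry no extra field at all.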
Please add any other solutions...
Implement a non-overridable getnEx() function
- I do not understand the desire to be able to set the size of a table to something other than what it really is. I would instead propose that an unexposed true max integer domain member for each table be tracked by the implementation in a function called, say, getnEx(). Then for backwards compatibility getn() could be implemented as:
function getn (t)
  if type(t.n) == "number" then return t.n end
  return getnEx(t) -- internal function
end
and people could ignore the getn() function if it causes them grief, and just use the getnEx() function which doesn't require workarounds. getnEx() would be essentially equivalent to:
function getnEx (t)
  local max = 0
  for i, _ in t do
    if type(i) == "number" and i>max then max=i end
  end
  return max
end
--Paul Hsieh
Votes cast
Please update the list below. If you would prefer to vote anonymously just add a vote below (but it would be nice to hear your opinion :-).
- Leave as is : 1 vote.
- Change "n" : 1 vote
- Implement a non-overridable getnEx() function : 1 vote
- Add setn() : 11 votes
- Add len[v] : 4 votes
Last edited October 10, 2009 1:32 pm GMT