Lua Locales


Here are a few notes on [locales] in Lua.

Lua is heavily influenced by Standard C, so much of the C treatment of locales carries over into Lua.

The current locale can be queried or set using the [os.setlocale] function.
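
As a rough illustration of both uses: called without a locale argument, os.setlocale only queries, and the optional second argument restricts the call to one category. The category name used here comes from the Lua reference manual; which locale names other than "C" are accepted depends on the system, so treat this as a sketch rather than portable behavior.

```lua
-- Query the current locale for all categories (nothing is changed).
local current = os.setlocale()
print(current)

-- Set only the numeric category to the portable "C" locale.
-- os.setlocale returns the new locale name, or nil if the request failed.
print(os.setlocale("C", "numeric"))   --> C

-- The empty string asks for the implementation-defined native locale.
-- Always check the result: nil means the request could not be honored.
local native = os.setlocale("", "time")
if not native then
  print("native locale not available")
end
```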

Warning: os.setlocale internally calls the C [setlocale] function, which globally sets the locale for all Lua states and all OS threads in the current process. However, some C implementations provide non-standard functions (e.g. Microsoft's [_configthreadlocale]) so that only the current OS thread is affected.
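
Because the setting is global, one common precaution is to change the locale only for as long as it is needed and then restore the previous value. The sketch below assumes a hypothetical helper named with_locale (it is not part of Lua), and it does not make the change thread-safe: other threads and Lua states still see the temporary locale while fn runs.

```lua
-- Hypothetical helper, not part of Lua: run fn under a temporary locale
-- and restore the previous one afterwards, even if fn raises an error.
local unpack = table.unpack or unpack            -- Lua 5.2+ or Lua 5.1

local function pack(...)
  return { n = select("#", ...), ... }
end

local function with_locale(locale, category, fn, ...)
  category = category or "all"
  local saved = os.setlocale(nil, category)      -- nil locale: query only
  if not os.setlocale(locale, category) then
    return nil, "locale not available: " .. tostring(locale)
  end
  local results = pack(pcall(fn, ...))
  os.setlocale(saved, category)                  -- restore before returning
  if not results[1] then
    error(results[2], 0)                         -- re-raise fn's error
  end
  return unpack(results, 2, results.n)
end

-- Usage: format a number with the "C" numeric conventions.
print(with_locale("C", "numeric", tostring, 1.5))   --> 1.5
```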

A brief background on locales can be found in [PiL 22.2] (warning: see Historical Notes below).

Lua syntax

In Lua 5.1, identifiers in the Lua language were locale dependent [1]. In 5.2, "Lua identifiers cannot use locale dependent characters" [2].

Numbers in Lua code must be written in the C locale style:

print(1.23) -- always valid

print(1,23) -- always interpreted as two numbers: "1" and "23".

Tests on Numbers

> assert(os.setlocale('fr_FR'))
> = 1,23 , 1.23
1	23	1,23
> =loadstring("return 1,23 , 1.23")()
1	23	1,23
> print(1.23, tostring(1.23), string.format("%0.2f",1.23))
1,23	1,23	1,23
> ="1.23" + 0
stdin:1: attempt to perform arithmetic on a string value
stack traceback:
	stdin:1: in main chunk
	[C]: in ?
> ="1,23" + 0
1,23
> =tonumber("1.23"), tonumber("1,23")
nil	1,23

The results above depend on the settings in luaconf.h, which in 5.2.0rc2 contains the following:


/*
@@ lua_str2number converts a decimal numeric string to a number.
@@ lua_strx2number converts an hexadecimal numeric string to a number.
** In C99, 'strtod' do both conversions. C89, however, has no function
** to convert floating hexadecimal strings to numbers. For these
** systems, you can leave 'lua_strx2number' undefined and Lua will
** provide its own implementation.
*/
#define lua_str2number(s,p)	strtod((s), (p))

#if defined(LUA_USE_STRTODHEX)
#define lua_strx2number(s,p)	strtod((s), (p))
#endif

If you disable LUA_USE_STRTODHEX, then Lua's own lua_strx2number implementation is used. This relies on locale-independent functions (e.g. lisspace from lctype.c). Observe the curious behavior:

Lua 5.2.0  Copyright (C) 1994-2011 Lua.org, PUC-Rio
> assert(os.setlocale'fr_FR')
> return tonumber'1.5', tonumber'1,5'
nil	1,5
> return tonumber'0x1.5', tonumber'0x1,5'
1,3125	nil
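
One way to sidestep the locale-dependent conversions shown in this section is to force the "C" numeric locale around each conversion. The following is a minimal sketch, assuming hypothetical helper names (in_c_numeric, parse_c_number, format_c_number) that are not part of Lua. It still changes the locale process-wide for the duration of the call.

```lua
-- Hypothetical helpers, not part of Lua: perform a conversion under the
-- "C" numeric locale, then restore whatever numeric locale was set before.
local function in_c_numeric(fn, ...)
  local saved = os.setlocale(nil, "numeric")   -- query only
  os.setlocale("C", "numeric")                 -- "C" is always available
  local ok, result = pcall(fn, ...)
  os.setlocale(saved, "numeric")
  if not ok then error(result, 0) end
  return result
end

-- Parse a string that uses "." as the decimal separator.
local function parse_c_number(s)
  return in_c_numeric(tonumber, s)
end

-- Format a number with "." as the decimal separator.
local function format_c_number(n)
  return in_c_numeric(string.format, "%.14g", n)
end

print(parse_c_number("1.23") == 1.23)   --> true
print(format_c_number(1.5))             --> 1.5
```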

Tests on Character Classes

function findrange(pat)
  for i=0,255 do
    if string.char(i):match(pat) then
      print(i, string.char(i))
    end
  end
end

assert(os.setlocale'C')
findrange'%a' --> A-Z,a-z

assert(os.setlocale'en_US.ISO-8859-1')
findrange'%a' --> A-Z,a-z,\170,\181,\186,\192-\255
findrange'%l' --> a-z,\181,\223-\255

Other classes such as %s (isspace) and %d (isdigit) may likewise match more characters than they do in the C locale [3].

string.lower and string.upper are also locale dependent.
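
When only ASCII letters should be recognized regardless of the current locale, explicit ranges such as [A-Za-z] can be used instead of %a, since ranges in Lua patterns are compared by byte value. A minimal sketch, with hypothetical helper names (is_ascii_alpha, ascii_lower) that are not part of Lua:

```lua
-- Hypothetical helpers, not part of Lua. Ranges like [A-Z] compare raw
-- byte values, so they do not consult the current locale.
local function is_ascii_alpha(s)
  return s:match("^[A-Za-z]+$") ~= nil
end

-- Lower-case only the ASCII letters A-Z; every other byte is left as is.
local function ascii_lower(s)
  return (s:gsub("[A-Z]", function(c)
    return string.char(c:byte() + 32)
  end))
end

print(is_ascii_alpha("Lua"))        --> true
print(is_ascii_alpha("caf\233"))    --> false
print(ascii_lower("HeLLo"))         --> hello
```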

String Comparisons

> assert(os.setlocale'C')
> return "é" < "e"
false
> assert(os.setlocale('fr_FR'))
> return "é" < "e"
true
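
If a sort order must stay the same regardless of the locale, strings can be compared byte by byte instead of with the < operator. The following sketch assumes a hypothetical function named bytewise_less (not part of Lua). It gives the ordering by raw byte value, which is not a linguistically meaningful order for non-ASCII text.

```lua
-- Hypothetical comparison function, not part of Lua: orders strings by
-- raw byte values, ignoring the collation rules of the current locale.
local function bytewise_less(a, b)
  local n = math.min(#a, #b)
  for i = 1, n do
    local x, y = a:byte(i), b:byte(i)
    if x ~= y then
      return x < y
    end
  end
  return #a < #b      -- a proper prefix sorts before the longer string
end

-- Usage: a sort whose result does not depend on os.setlocale.
local words = { "b", "ab", "a" }
table.sort(words, bytewise_less)
print(table.concat(words, " "))   --> a ab b
```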

Tests on Dates

> assert(os.setlocale('C'))
> =os.date()
Sat Nov 26 13:22:56 2011
> assert(os.setlocale('fr_FR'))
> =os.date()
sam. 26 nov. 2011 13:22:56 EST
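
For timestamps that other programs will parse, an explicit format made only of numeric fields avoids the localized day and month names seen above. A minimal sketch, assuming a hypothetical helper named iso_timestamp (not part of Lua); the leading "!" in the format asks os.date for UTC:

```lua
-- Hypothetical helper, not part of Lua: format a time value as an
-- ISO 8601 style UTC timestamp using numeric fields only.
local function iso_timestamp(t)
  return os.date("!%Y-%m-%dT%H:%M:%SZ", t)
end

print(iso_timestamp(0))           --> 1970-01-01T00:00:00Z
print(iso_timestamp(os.time()))   -- current time, in the same layout
```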

Historical Notes

-- This error occurred in 5.0 (but not 5.1 or above)
-- http://www.lua.org/pil/22.2.html
print(os.setlocale('pt_BR'))    --> pt_BR (Portuguese-Brazil)
print(3,4)                      --> 3    4
print(3.4)       --> stdin:1: malformed number near `3.4'



Last edited November 26, 2011 9:13 pm GMT