Lexically scoped variables

Integral · Post by **Integral** » September 15th, 2004, 3:46 pm

In a recent discussion, it was pointed out to me that Wesnoth doesn't have local variables at all, and in particular, it doesn't have lexically scoped local variables. In the context of Wesnoth, if a (local) variable is lexically scoped, that would mean that, for instance, an event defined within its scope would always use the value of the variable at the time the event was defined, rather than at the time it fires.

To be a little more concrete, the patch attached to this message adds a "let" tag which can be used within events. Its syntax looks like this:

Code: Select all

[event]
	name=moveto
	first_time_only=no
		[filter]
			description=Konrad
		[/filter]

		[let]
		[var]
			name=deathx
			value=$x1
		[/var]
		[var]
			name=deathy
			value=$y1
		[/var]

		[in]
			[message]
				description=Konrad
				message="Next person who steps at ($deathx, $deathy) dies!"
			[/message]

			[event]
			name=moveto
				[filter]
				side=2,3,4
				x=$deathx
				y=$deathy
				[/filter]

				[message]
				description=Konrad
				message="I stepped at ($deathx, $deathy), and now you shall die!"
				[/message]

				[kill]
				x=$deathx
				y=$deathy

				animate=yes
				fire_event=yes
				[/kill]
			[/event]
		[/in]
		[/let]
	[/event]

This is test code I've been using; it gives Konrad godlike powers in "Elves Besieged" (any enemies that step on a hex that he previously stepped on are instantly killed). $deathx and $deathy in the inner event are replaced with the values of $x1 and $y1 *at the time the outer event fired*. A less silly example where it would be useful is the idea of creating randomly placed chests: you could define a chest's location using [var] tags, and then define events relating to the chest in the [in] tag.

The syntax of [var] is identical to the syntax of [set_variable]: you can add to variables, multiply them, etc (up to the limitations of [set_variable] -- it doesn't appear to be possible to directly add two variables). [in] just contains a list of event-action tags. Variables created by [let] are immutable and cannot be modified after they are created (so we should call them "constants", but who expects terminology to make sense?); set_variable will always modify a global variable, even if a local variable with the same name exists. Local variables can be used pretty much anywhere that global variables can be (although it's quite possible that I missed some cases), with the limitation that local variables must be simple integer/string variables -- no structures allowed for now.

Internal changes

This patch is quite a bit more invasive than my multiplayer recall patch. To implement lexical scope, the typical approach is to thread an "environment" through your interpreter, which stores the name -> value bindings that are active in the local scope; when you look up a name, you do it through the environment. The problem is that a lot of code assumes that the syntax object (config) will do variable substitution transparently; since all variables are global, there's no provision for passing an environment into the various places where substitution occurs; and the substitution is done via operator[], which (IIRC) can only take a single argument.

What I did was to first define a name_scope type, which is basically just a refcounted wrapper around a string_map (refcounted because environments may have unlimited extent and I didn't want to be constantly copying them); I then [0] built a scoped_config class that pairs a name_scope with a config& object, and has an operator[] that does name lookup with correct replacement of lexical variables. In the parts of the interpreter that might occur within a local scope, I changed the config& argument to methods and functions to a scoped_config&. (note: the config -> scoped_config constructor is explicit so that it's obvious when the code enters a potentially scoped method, and in order to reduce the risk of scope inappropriately being lost).

These changes, with appropriate code to thread scopes through method calls, weren't quite enough to get everything converted, so two more changes had to be made to the internal API. First, scoped_config gained a deep_subst() method call. This recursively does simple name replacement on all the attributes contained within a scoped_config, which allows some syntax to be "lifted" out of its context *IF* (a) its contents only need simple replacement and (b) simple replacement should be performed on all of its contents. (this is a bit of a hack) Second, I modified name interpolation: it previously assumed it always would get variable bindings from a string_map, which is too simplistic; now, instead of passing in a string_map, you pass in an instance of a class that does variable lookup (simple canned classes are available for the common cases).

Oh, and the reason that lexical variables are immutable is mainly that, because scopes have unlimited extent, making variables mutable would mean that I'd have to explicitly store the graph structure of scopes (which I currently collapse) and that I'd have to figure out how to write it to save files and restore it. If there's demand for mutable local variables, they could be added.

In the absence of [let] tags, the program should behave just as it did before; all local scopes will be empty, so code that looks up names will get them from the global scope.

Daniel

[0] actually, this is an idealized (not chronological) picture of what happened; I'm ignoring the fact that, for instance, name_scope was typedef'd to string_map until long after scoped_config was created.

P.S. - I changed my editor settings to use 8-column indentation and I've tried to be more consistent with Wesnoth's coding style, but I'm afraid Dave's last message about coding guidelines was sent after most of this patch was complete...I've adjusted things as I notice them, but it probably isn't perfect.

Elvish_Pillager · September 15th, 2004, 6:48 pm

Really nice implementation, IMO.

Better than my suggestion, anyway.

Invisible Philosopher · September 15th, 2004, 7:10 pm

I can't download it! (http://www.wesnoth.org/forum/download.php?id=4232)

Battle for Wesnoth Forum wrote:General Error

Ã‚Â

The selected Attachment does not exist anymore

404 File Not Found: The File files/wesnoth-lexical.patch.gz does not exist.

Integral · Post by **Integral** » September 15th, 2004, 8:42 pm

Invisible Philosopher wrote:I can't download it!

Gah. That might be thanks to my double-post. Can you download this one?

Daniel

Invisible Philosopher · September 15th, 2004, 8:58 pm

Yes, successful download.

Integral · Post by **Integral** » September 23rd, 2004, 12:15 pm

Can I safely assume that this patch has been rejected? (if so, I'll delete the patched files and revert my CVS tree)

Daniel

PS: updated patch against current CVS attached.

Post by **Boucman** » September 24th, 2004, 5:12 am

no you can't...

patches are long to get in too... did you post it to savannah ?

Post by **Dave** » September 24th, 2004, 11:48 am

Well I haven't really looked over this patch properly yet.

It being substantial size, and me not really understanding exactly what it does doesn't help its chances though...

David

Integral · Post by **Integral** » September 24th, 2004, 3:40 pm

Hm, ok. I don't know exactly what you aren't understanding, so I'll start from the beginning.

What is lexical scope?

This is a brief explanation...it might be a bid obscure if you aren't familiar with the technique of functional programming.

In a language that's lexically scoped, a variable name always refers to the syntactically "closest" definition of that variable. This becomes important once you can do things like defining functions within other functions; for instance:

Code: Select all

f=lambda x: lambda y:x+y

I'm using lambda notation here -- "lambda x:BLAH" is a function of one argument; you invoke it by executing BLAH with x bound to the argument. So the function "f" above is a function of one argument that returns another function of one argument. You could use it like this:

Code: Select all

f(1)(2)          ; returns 3
f(1)(5)          ; returns 5
g=f(1)          ;  g is now "lambda y:1+y", the function "add 1 to y"
g(3)              ; returns 4
x=100
g(3)              ; still returns 4!

The important thing here is that the variable "x" within the function g always refers to the variable at the moment the function was created, and never to any other variable named "x". This sounds simple, but it's not the most obvious way of implementing functions-within-functions, and so a lot of "first programming languages" get this wrong.

How is lexical scope usually implemented?

Well, I'm sure there are a lot of approaches, but the implementation I used is basically the most naive implementation possible (which also makes it the simplest). Every expression that you encounter while evaluating the program has a "local scope" that represents the value of any variable accessed by that expression. For instance (abstractly), the local scope of the body of "g" above looks like this (it says that "x" has the value "1"):

Code: Select all

x -> 1

Normally, the local scope is just "threaded" through your interpreter's evaluation function as an extra parameter. When a new function is created (for instance, using "lambda"), you store the function as a "lexical closure" which combines the body of the function with the local scope in which it was created. To evaluate a function call, you extract the body of the function and its scope from the lexical closure, and evaluate the body in that scope
(actually, in a scope where the formal arguments are bound to their values).

Incidentally, lexical closures are why it's convenient to refcount (or otherwise memory-manage) scopes: you don't know how long a lexical closure will hang around, and it needs to have a valid reference to its scope for that whole time, so you pretty much need to either use automatic memory management, or copy the local scope every time you create a closure. (and copying won't work with mutable variables, see below)

The thing about mutable vs immutable local variables that I mentioned has to do with code like this:

Code: Select all

let x=1, y=2 in
   let x=2 in
       x:=5
       y:=7
   endlet
endlet

Under the principle of lexical scope, only the inner definition of x should be changed by "x:=5", but the outer definition of y should be changed by "y:=7". Because of this sort of thing, you end up having to maintain a stack of local scopes (each inner scope has a pointer to its "parent"). If you can't change local variables, it's sufficient to "collapse" this stack: when you create a new local scope, you just copy the old one, modifying it to create or change variable bindings.

As far as substitution goes: if variables can't be modified, you can eliminate references to variables within a scope by just replacing them by their values. For instance,

Code: Select all

let x=5 in
   x+4
endlet

gives the same result as

Code: Select all

5+4

This becomes slightly more complicated if there's an inner scope that binds the same variable:

Code: Select all

let x=5 in
  let x=6 in
     x+4
   endlet
endlet

is NOT the same as "5+4". Where I used substitution, I didn't take inner scopes into account, so it's just a hack (see below).

What does this have to do with Wesnoth?

As I pointed out earlier, lexical scoping is most useful when you have the ability to create procedures that can be accessed after the program has left the scope. Now, WML doesn't have procedures that can be invoked explicitly, but it does have the event system: events are anonymous procedures that are invoked by the Wesnoth game engine. Because you can define events within events, you run into scoping issues: maybe, for instance, the inner event needs to access the values of $x1 and $y1 that were used by the outer event. Unfortunately, there's no reliable way to get at them if the outer event is triggered multiple times (see the thread in Scenario and Campaign development on treasure chests).

The implementation basically just adds local scopes that look a lot like config objects and threads them into every method that can be called from an event; that accounts for most of the changes. In a few places, where a bunch of syntax (that can't contain "let"s) is being moved from one place to another, it replaces all local variables with their values in order to eliminate the need for a local scope. This is the only sensible way I could see to deal with those cases within the framework of the WML interpreter.

I'm not sure how helpful that will be...I could probably answer specific questions better.

Daniel

Post by **Dave** » September 25th, 2004, 12:08 am

I am familiar with functional languages and know what lexical scoping is. I simply mean that I don't understand exactly what the patch does to the game's internal mechanics.

I guess I will have to look at it more closely...

David

Post by **Dave** » September 26th, 2004, 7:13 pm

I think that this patch will have to be rejected/held off until at least version 1.0.

We don't really want to add any more feature-adding patches that affect multiple areas of the game at this stage in development.

David