WML semantic structure -- parsing

solsword
Code Contributor
Posts: 291
Joined: January 12th, 2009, 10:21 pm
Location: Santa Cruz, CA

Re: WML semantic structure -- parsing

Post by solsword » October 13th, 2009, 7:49 am

Yeah... I figured it might not be as simple as what I wrote (heck, I knew it wasn't *that* simple, but I thought it might not be that much more difficult)...

In any case, my script has many flaws. Right now, it looks for WML_HANDLER_FUNCTION functions, and then scans their bodies for any use of cfg["<something>"] and just prints out those values. This picks up extra parameters in some places, because the cfg being referenced isn't the one that the function is using, and possibly omits others. And like AI said, it only works for a subset of WML. It completely misses anything to do with subtags, as well, I think. Now, I'm not convinced that it's *impossible* to write an extended script that searched for a whole bunch of things and extracted the right stuff, but I admit that it would probably be pretty difficult, and it might be impossible. Clever use of cscope would probably be involved.
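A minimal sketch of the kind of scan described above, in Python rather than whatever the original script used. The macro's argument list and the C++ sample are made up for illustration, and the brace matching is deliberately naive: braces inside strings or comments would fool it, and any local variable that happens to be called cfg is counted too, which is exactly the sort of flaw mentioned.

```python
import re

# Assumed shape of a handler definition: WML_HANDLER_FUNCTION(<tag>, ...)
HANDLER_RE = re.compile(r'WML_HANDLER_FUNCTION\s*\(\s*(\w+)')
# Any lookup of the form cfg["key"]; \b keeps other_cfg["key"] from matching.
KEY_RE = re.compile(r'\bcfg\s*\[\s*"([^"]+)"\s*\]')


def handler_body(source, start):
    """Return the first brace-balanced block found at or after `start`."""
    open_pos = source.find("{", start)
    if open_pos == -1:
        return ""
    depth = 0
    for i in range(open_pos, len(source)):
        if source[i] == "{":
            depth += 1
        elif source[i] == "}":
            depth -= 1
            if depth == 0:
                return source[open_pos:i + 1]
    return source[open_pos:]  # unbalanced input: return what is there


def scan_handlers(source):
    """Map each handler name to the sorted cfg keys used in its body."""
    found = {}
    for match in HANDLER_RE.finditer(source):
        body = handler_body(source, match.end())
        found[match.group(1)] = sorted(set(KEY_RE.findall(body)))
    return found


# Hypothetical C++ fragment, not copied from the Wesnoth source.
SAMPLE = '''
WML_HANDLER_FUNCTION(teleport, event_info, cfg)
{
    const std::string x = cfg["x"];
    if (cfg["check_passability"] != "no") { other_cfg["id"]; }
}
'''

print(scan_handlers(SAMPLE))
```

Nothing here knows about subtags, so the result is at best a list of candidate attribute names per handler.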

@spir: my worry about a semantics file is that that's basically the system we have now. It isn't formalized, but we have the wiki, which is *supposed* to detail the semantics of WML (and which does a reasonable job in some places) and which is *supposed* to be maintained by UMC dev people (I think... I don't think there's actually any concrete responsibility for it). And of course, the wiki is horribly out-of-date, and likely always will be. The people who actually know what needs to get changed are the people submitting patches to the C++ code. My rationale is that, while it's a lot of work (and an easily forgotten task) to add their changes to the wiki (it can be a tough problem just to find *where* to add the changes, in some cases, and it's easy to miss something), it's a much easier task to add specifically-structured comments, right next to the code that the developer just wrote, that explain basic things about how the WML side of it works. Heck, I'd be happy to contribute a lot of the initial comments, and after that work is done, the load on the devs really isn't that big, in my opinion. If you've got a comment sitting there with the proper format, explaining how the WML function that you just changed works, and your change is minor (which most changes to WML are these days, I presume), it's really not a big deal to change the comment a bit. I mean, you already have to write an entry in the changelog...
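To make the structured-comment idea concrete, here is a rough sketch of how such comments could be harvested. The `//@wml [tag] key (type): description` format is purely hypothetical (no such convention exists in the code as far as this thread shows), and the sample lines are invented.

```python
import re

# Hypothetical comment format, invented for this sketch:
#   //@wml [tag] key (type): description
DOC_RE = re.compile(r'//@wml\s+\[(\w+)\]\s+(\w+)\s+\((\w+)\):\s*(.+)')


def extract_docs(source):
    """Collect {tag: {key: (type, description)}} from structured comments."""
    docs = {}
    for line in source.splitlines():
        match = DOC_RE.search(line)
        if match:
            tag, key, type_name, text = match.groups()
            docs.setdefault(tag, {})[key] = (type_name, text.strip())
    return docs


# Invented example of comments sitting right above a handler.
SAMPLE = '''
//@wml [teleport] x (int): target column
//@wml [teleport] y (int): target row
WML_HANDLER_FUNCTION(teleport, event_info, cfg)
'''

for tag, keys in extract_docs(SAMPLE).items():
    for key, (type_name, text) in keys.items():
        print(f"[{tag}] {key} ({type_name}): {text}")
```

Because the comments live in the source tree, whatever is extracted this way is versioned together with the code it describes.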

To me, it's slightly more work (and easier to forget) to have to remember where the WML semantics file is, go into that file, find where your tag is, and edit its entry. But that's also not a bad system, IMO.

One good thing about moving documentation into the code (or at least near it) would be that the documentation becomes versioned automatically. That's one of the big problems with the wiki: it gets cluttered with "Developer version only" stuff, and then some of that stuff doesn't get updated, etc. Generating WML docs from some authoritative, versioned source would be better. Ideally, we'd have a docs.wesnoth.org that worked just like units.wesnoth.org.
The Knights of the Silver Spire campaign.

http://www.cs.hmc.edu/~pmawhorter - my website.

Teamcolors for everyone! PM me for a teamcolored version of your sprite, or you can do it yourself. If you just happen to like magenta, no hard feelings?

spir
Posts: 97
Joined: September 15th, 2009, 9:31 am

Re: WML semantic structure -- parsing

Post by spir » October 14th, 2009, 8:22 am

solsword wrote: My rationale is that, while it's a lot of work (and an easily forgotten task) to add their changes to the wiki (it can be a tough problem just to find *where* to add the changes, in some cases, and it's easy to miss something), it's a much easier task to add specifically-structured comments, right next to the code that the developer just wrote, that explain basic things about how the WML side of it works.
Well, I can only repeat that the work is the same if we apply the same quality requirements in terms of accuracy and completeness (and pedagogy!): both to first write the huge amount of metadata, in either C++ or WML code, and then to keep it up to date. Now, I agree that in the best of all possible worlds your solution is the simplest and most straightforward one; but IMO the code would need to have been designed with this in mind from the start, and the devs taught from the start, too, to regard this as of the highest importance (after all, the use of their work is, or should be, its raison d'être, no?). And, before you reply this yourself ;-), I agree too that in the best of all possible worlds we would not be talking about this, for there would already be a wonderful reference, constantly maintained & improved by a community bringing together most of the dev team members and nearly all WML users. I agree that in so-called "reality" a majority of users won't contribute. People are like that.
A side note: in any case the work can be done bit by bit (section type by section type), because proper schemas are "closed under union", meaning that concatenating schemas produces a bigger schema (under a new top-level section). Concretely, this means we can independently handle side, objectives, story..., and only then build scenario out of those. Hopefully, the first achievements will be an incentive for some to help & carry on. The same is indeed true for the reference guide.
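A small sketch of what "closed under union" could look like in practice, assuming a schema is just a mapping from section name to its allowed keys and subsections. The representation, the helper names, and the key lists are placeholders for illustration, and the nesting of real WML sections is simplified.

```python
def make_section(name, keys, subsections=()):
    """Build a one-section schema: {name: {"keys": ..., "subsections": ...}}."""
    return {name: {"keys": set(keys), "subsections": set(subsections)}}


def union(*schemas):
    """Merge schemas; the result is again a schema of the same shape."""
    merged = {}
    for schema in schemas:
        for name, spec in schema.items():
            if name in merged and merged[name] != spec:
                raise ValueError(f"conflicting definitions for [{name}]")
            merged[name] = spec
    return merged


def undefined_subsections(schema):
    """Names referenced as subsections but not described anywhere."""
    referenced = set()
    for spec in schema.values():
        referenced |= spec["subsections"]
    return referenced - set(schema)


# Each piece can be written and reviewed on its own...
side = make_section("side", ["side", "controller"])
objectives = make_section("objectives", ["summary"])
story = make_section("story", [])

# ...and the bigger schema is just the union plus a new top-level section.
scenario = union(
    side, objectives, story,
    make_section("scenario", ["id", "name"], ["side", "objectives", "story"]),
)

print(sorted(scenario))
print(undefined_subsections(scenario))
```

The `undefined_subsections` check shows which section types still need a schema of their own before the top-level one is complete.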
Also, as you mention yourself, the information must be published anyway by the dev team, if only in change logs. I guess this may be precisely the way to get the data we need, in the form we need. A small team that knows the WML-related code well enough may write, review, or comment on the relevant change logs, so that the "canonical" WML code gets properly updated (actually, once semantic schemas exist, it may be as easy to update them directly -- or, because a schema is a formal description of their work, the devs may even wish/demand to maintain it themselves).

Anyway. I propose that we collaborate on this because we have the same goals in mind. Together (with whoever wants to join) we can design the target, namely the clearest possible reference guide template. We can also help each other design the means for each solution (note that once data is extracted from the C++ code, we may use a single path to produce the schema and/or the reference, provided the data is stored using the same or compatible type(s)).
Denis
life is strange

various stuff about BfW (rules, stats, alternatives) and WML (parser, semantic schema, evolution)

AI
Developer
Posts: 2394
Joined: January 31st, 2008, 8:38 pm

Re: WML semantic structure -- parsing

Post by AI » October 14th, 2009, 11:44 am

One *is* supposed to update the wiki after changing WML, but people are forgetful sometimes.

Re: WML semantic structure -- project

Post by spir » October 14th, 2009, 3:24 pm

Ok, done a draft version.

The WML parser is now able to cope with metadata "notes". In short, while parsing a section (that's all it does!), it parses and stores notes as pseudo-items; then, when the section node is created, they are attached to the "real" items that follow them, and removed from the item list.
The section nodes can output themselves as a semantic schema, including separate schemas for subsections.
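The note-attachment step can be sketched roughly as follows. This is not the actual parser: it only handles flat key=value items, and the `@ text` note syntax and the dictionary layout are assumptions made for the example.

```python
def parse_section(lines):
    """Parse flat 'key=value' items, attaching preceding '@' notes to each.

    Returns (items, leftover), where leftover holds notes that were not
    followed by any real item.
    """
    items = []
    pending = []  # notes seen since the last real item, in source order
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("@"):
            pending.append(line[1:].strip())
        elif "=" in line:
            key, value = line.split("=", 1)
            items.append(
                {"key": key.strip(), "value": value.strip(), "notes": pending}
            )
            pending = []  # start a fresh list for the next item
    return items, pending


source = [
    "@ unique identifier",
    "@ required",
    "id=first",
    "name=Intro",
    "@ dangling",
]
items, leftover = parse_section(source)
for item in items:
    print(item["key"], "=", item["value"], item["notes"])
print("unattached notes:", leftover)
```

Storing the notes in a list per item keeps them in source order, and notes with no following item are returned separately so they can be reported rather than silently dropped.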

Showcase: the test script first outputs the source, then builds and outputs (in tree view) a parse tree, and finally writes the schema. The source doc is supposed to be a formatted and (meta-)annotated article. (It would be crazy to use WML grammar for such a document!)
semantic schema test output:
Note: the output formats are purely arbitrary. I gave notes a distinct format, using ':' instead of '=', just to distinguish them. The *input* format for notes in the source, using '@', is also just an example. Also, there are 2 incorrect notes in the source (the parser just reports them, then ignores them).

I'd like feedback on this, as much as you like.

What remains is to define a reader-friendly template for reference guide sections, try this on a real WML section type (small-scale, & without infinitely nested sub-sections), then... start using it by writing "canonical" code.


PS: Just found an issue: notes should be kept in order...

Re: WML semantic structure -- parsing

Post by AI » October 14th, 2009, 9:40 pm

If you can wait a week or two, I'll rewrite wmlgrammar into something similar. It'll also include all the schemas I wrote in late 1.5

Re: WML semantic structure -- parsing

Post by spir » October 16th, 2009, 6:14 pm

AI wrote: If you can wait a week or two, I'll rewrite wmlgrammar into something similar. It'll also include all the schemas I wrote in late 1.5
Great, I'm waiting for this!
But in the meantime, I'll carry on my way... see the post below.

Re: WML semantic structure -- parsing

Post by spir » October 16th, 2009, 6:39 pm

Right, I have a user reference template generator and a draft template ready.
To be able to view it styled, I'll add BBCode and post an example right here... Step by step. So this post will keep changing.

Tell me what you think of it. When it looks good, it'll be easy to change the BBCode to html, I guess.

EDIT: Ok, this doesn't work well, 'cause BBCode does not preserve start-of-line whitespace. And there is no other way to preserve it than 'code', which escapes all markup altogether (a real plague in all structured text markup -- they know better than the author what's good for them). Anyway... I'll see if the wiki allows enough formatting to do that. Tomorrow.

EDIT BIS: Changed it directly to xhtml instead. Easier (for me, than learning MediaWiki's oddities), and it displays everywhere. Just copy the code below into an .xhtml file, and click ;-)
Hope you like (the trend of) the result; it's a first try, anyway. There should be more fields for "meaning", "use", or similar verbal things intended for humans...
ref manual template trial:

Re: WML semantic structure -- parsing

Post by spir » October 22nd, 2009, 7:56 pm

Holà,

Right, I now have what seems to be a decent toolset to generate both a semantic schema and a user manual page from (what I decided to call) "reference code", as described in earlier exchanges in this thread. Including reasonably clear and nice-looking formatting for the user reference.
I have put online some explanation, how-to, comments, an example, and a pointer to the module.
You may also have a look at (much) more stuff around BfW & WML.

EDIT: Just found an issue I really do not understand: when viewing the result page (user manual) online, the indentation of sub-sections (margins, actually) is not respected, while everything is fine on the local page. Both hold exactly the same xhtml code; I have only one version anyway. First time I've seen that. If you have any clue… help welcome.
What I'd like to know, too, is whether you see in your browser the same thing as me: namely that all section *definitions* are left-aligned, only items are indented.

Soliton
Site Administrator
Posts: 1597
Joined: April 5th, 2005, 3:25 pm
Location: #wesnoth-mp

Re: WML semantic structure -- parsing

Post by Soliton » October 22nd, 2009, 8:15 pm

spir wrote: What I'd like to know, too, is whether you see in your browser the same thing as me: namely that all section *definitions* are left-aligned, only items are indented.
Same here. Looks quite nice anyway.
"If gameplay requires it, they can be made to live on Venus." -- scott

Re: WML semantic structure -- parsing

Post by spir » October 25th, 2009, 6:52 pm

Hello,

Finally had time to solve this issue. Took the opportunity to rewrite the "styler", to properly export the style sheet. Here is the resulting reference manual page online. Criticism welcome!
[Note: it's not BfW game code, indeed ;-)]

Feedback welcome, too, about the whole reference generation stuff.
