WeSpell - A Python .po spellchecker
- Elvish_Hunter
- Posts: 1575
- Joined: September 4th, 2009, 2:39 pm
- Location: Lintanir Forest...
WeSpell - A Python .po spellchecker
As some of you may already know, I'm also registered on the Wesnoth Italian Forum. There, RockScorpion recently finished translating After the Storm and asked for help in catching spelling errors. (If you know Italian, or want to take a look with Google Translate, here is the topic.) Clearly, checking the .po file of such a campaign by hand is no easy task, so I looked for an existing spellchecker for .po files, and to my great surprise I was unable to find one.
So, I started writing one in Python, relying on the Enchant library. Unfortunately, my first version printed its output to the command line, and it quickly became clear that the Windows Command Prompt is unable to handle Unicode, except perhaps by using some black magic.
So, I had to implement a GUI sooner than I expected, and as usual my toolkit of choice is Tkinter with the ttk library, which by the way supports Unicode like a charm.
Finally, a first version of my .po spellchecker is here. To run it, you'll need to install Python 2.7, Enchant, and the dictionary for your language, as explained on the Enchant website.
Its current features are:
- support for Unicode
- support for ignore files: these files are simple text files, with one word on each line. If one of them is supplied, such file is scanned, and all its words are added to Enchant's ignore list for the current session
- support for copying the output to the clipboard
- support for saving the output as text, HTML, PDF file as well as printing
- released as GPL v3 or any later version
- got rid of add_to_ignore function, now it relies on an Enchant function
- about button
- title for the main window and the output window
- a better regexp to split strings, and some preliminary character substitutions
- a better formatted title in the main screen
- menu bar, with Preferences option disabled
- the output window locks the main window until it is destroyed
- added a sizegrip in the output window
- moved buttonbox to top in output window
- make it so that Text, Entries and Comboboxes resize if the main window is resized
- on Linux ttk's Clam style is used
- added Separators and padding for a better layout
- moved buttonbox in main window to its own frame
- added GPL v3 logo (in the main screen) and text (in the zip file)
- added imagePath function, to let the program find the images based on where the script is placed
- added Bluecurve icons
- added Bluecurve CC-BY-SA copyright note
- added Help buttons
- added persistence file that keeps track of the last used dictionary, po file and ignore file
- better styling on Linux
- added support for xgettext
- added a Choose Language dialog
- added support for multiline msgstr; however, the price to pay was losing the line counter
- Pango markup, macros and variable names are removed from spellchecking
- added back line numbering in output
- Use of polib and PySide libraries
- Crystal, Gnome and Oxygen icons
- Output as HTML, PDF and on paper
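The ignore-file feature listed above (one word per line, applied for the current session only) can be sketched in a few lines. This is only an illustration and not the code shipped in the zip files: `parse_ignore_words`, `unknown_words` and the `is_known` callback are hypothetical names, with `is_known` standing in for whatever check the dictionary object provides.

```python
def parse_ignore_words(lines):
    """Return the set of words found in an ignore file, one word per line.

    Blank lines and surrounding whitespace are skipped.
    """
    return {line.strip() for line in lines if line.strip()}


def unknown_words(words, is_known, ignored):
    """Return the words that are neither ignored nor accepted by is_known.

    is_known is a hypothetical callback standing in for the dictionary check.
    """
    return [word for word in words if word not in ignored and not is_known(word)]
```

With a real file, the lines would come from something like `io.open(path, encoding="utf-8")`, so that accented words are read correctly on both Python 2.7 and 3.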
Attachments:
- Wespell.zip, Version 1.0 (956.53 KiB), downloaded 376 times
- po spellchecker 4.zip, Version 4 (48.16 KiB), downloaded 440 times
- po spellchecker 3.zip, Version 3 (47.74 KiB), downloaded 366 times
- po spellchecker 2.zip, Version 2 (15.64 KiB), downloaded 463 times
- po spellchecker 1.zip (2.12 KiB), downloaded 439 times
Last edited by Elvish_Hunter on August 28th, 2013, 8:52 am, edited 6 times in total.
Reason: Uploaded version 1.0
Current maintainer of these add-ons, all on 1.16:
The Sojournings of Grog, Children of Dragons, A Rough Life, Wesnoth Lua Pack, The White Troll (co-author)
Re: A Python .po spellchecker
Hi Elvish_Hunter, and thanks for this good tool. I find it really useful (I tend to make a lot of mistakes when writing at a decent speed, damn notebook keyboard...).
I want to add that it doesn't seem to work with Python 2.6, while all is OK with 2.7. The problem seems to be re.split(), which doesn't support "flags=" in 2.6 (and after removing it there are problems with the accents). So you probably need at least Python 2.7 to make it work.
Add a preferences file?
Yes, this would be useful to keep the correct dictionary and exception file selected.
It would be really useful if you could also add a way to scan the add-on folder (or that of a mainline campaign) for unit names and exclude them automatically. This would drastically reduce the false positives.
Keep up the good work.
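To illustrate the re.split() issue: the `flags` argument of `re.split()` was only added in Python 2.7, but a precompiled pattern accepts flags on older versions too. The sketch below is not taken from the spellchecker, and `WORD_BREAK` and `split_words` are made-up names; it only shows a Unicode-aware split that does not depend on `flags=`.

```python
import re

# Compiling the pattern lets us pass re.UNICODE without the flags= keyword
# of re.split(), which Python 2.6 does not accept.
WORD_BREAK = re.compile(r"\W+", re.UNICODE)


def split_words(text):
    """Split text on runs of non-word characters, keeping accented letters."""
    return [word for word in WORD_BREAK.split(text) if word]


print(split_words(u"citt\u00e0 \u00e8 bella"))
```

On Python 2 the text must be a unicode string for the accents to be treated as letters.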
- Elvish_Hunter
Re: A Python .po spellchecker
mich wrote: Hi Elvish_Hunter, and thanks for this good tool. I find it really useful (I tend to make a lot of mistakes when writing at a decent speed, damn notebook keyboard...).
No, thank you for using it so early.
mich wrote: I want to add that it doesn't seem to work with Python 2.6, while all is OK with 2.7. The problem seems to be re.split(), which doesn't support "flags=" in 2.6 (and after removing it there are problems with the accents). So you probably need at least Python 2.7 to make it work.
Right. I corrected my initial post.
mich wrote: Yes, this would be useful to keep the correct dictionary and exception file selected.
There is a library for this (ConfigParser), although this isn't a high-priority modification.
mich wrote: It would be really useful if you could also add a way to scan the add-on folder (or that of a mainline campaign) for unit names and exclude them automatically.
I'm not so sure that collecting a bunch of untranslated unit names would be that useful: for them there is already wmllint's spellchecker.
mich wrote: Keep up the good work.
I will!
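For the curious, a preferences file handled with ConfigParser could look roughly like the sketch below. It uses the Python 3 module name (`configparser`), and the section name `last_used` and the helpers `save_prefs` and `load_prefs` are assumptions made for this example, not something the program is known to use.

```python
import configparser
import io


def save_prefs(dictionary, po_file, ignore_file):
    """Serialize the last used settings as INI text."""
    parser = configparser.ConfigParser()
    parser["last_used"] = {
        "dictionary": dictionary,
        "po_file": po_file,
        "ignore_file": ignore_file,
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


def load_prefs(text):
    """Read the settings back from INI text into a plain dict."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return dict(parser["last_used"])
```

A real program would write the text to a file next to the script and read it back at startup.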
- Elvish_Hunter
Re: A Python .po spellchecker
Version 0.2 of my spellchecker is available for download in the first post. I switched to the GPL v3 license, as it grants better protection than GPL v2. A full list of all changes is in the first post, which also contains the first version: I'll keep it for historical/backup purposes.
- Elvish_Hunter
Re: A Python .po spellchecker
I just made the 0.3 version of my .po spellchecker available for download in the first post. Finally, I managed to add support for multiline msgstr; but unfortunately, reporting the line number of spelling mistakes proved to be quite complex, so I had to drop this function. After all, not even GNU msgexec yields the line numbers in its output.
The program can now be translated, thanks to Python's gettext library, and I added some .po files for this purpose.
I added some icons taken from the Bluecurve set. According to this page, I also added the required copyright notices, so everything should be fine.
I'll try to add back support for line numbering, should I find a good solution for it.
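One possible way to combine multiline msgstr support with line numbers is to remember the line on which each msgstr starts while joining its continuation lines. The sketch below is not the parser used in the program: `iter_msgstr` and the two patterns are hypothetical, escape sequences such as \n are left untouched, and comments and obsolete entries are ignored.

```python
import re

# A msgstr line (optionally a plural form such as msgstr[0]) and a
# continuation line consisting only of a quoted string.
MSGSTR = re.compile(r'^msgstr(?:\[\d+\])?\s+"(.*)"$')
CONTINUATION = re.compile(r'^"(.*)"$')


def iter_msgstr(lines):
    """Yield (line_number, text) pairs, one for each msgstr in a .po file."""
    start, parts = None, []
    for number, raw in enumerate(lines, 1):
        line = raw.strip()
        first = MSGSTR.match(line)
        more = CONTINUATION.match(line)
        if first:
            if start is not None:
                yield start, "".join(parts)
            start, parts = number, [first.group(1)]
        elif more and start is not None:
            # Continuation of the msgstr currently being collected.
            parts.append(more.group(1))
        elif start is not None:
            # Anything else (blank line, msgid, comment) ends the msgstr.
            yield start, "".join(parts)
            start, parts = None, []
    if start is not None:
        yield start, "".join(parts)
```

Each spelling mistake found in a joined string can then be reported against the starting line of its msgstr.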
- Elvish_Hunter
Re: A Python .po spellchecker
I just added version 0.4 to the first post. This time, the main changes are that now Pango markup, variable names and macro calls are not spellchecked; also, I added back line numbering in the output.
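As an illustration of the idea, markup, macro calls and variable names can be blanked out with regular expressions before the text is split into words. The patterns below are guesses written for this example and are not the ones used in the program; `strip_unspellable` is a hypothetical name.

```python
import re

# Hypothetical patterns: Pango tags such as <b> or <span color='red'>,
# WML macro calls such as {MACRO_NAME arg}, and variables such as $unit.name|
PANGO_TAG = re.compile(r"</?[A-Za-z][^<>]*>")
MACRO_CALL = re.compile(r"\{[^{}]*\}")
VARIABLE = re.compile(r"\$[A-Za-z_][\w.\[\]]*\|?")


def strip_unspellable(text):
    """Replace markup, macro calls and variable names with spaces."""
    # Innermost macros are removed first, so repeat until none are left.
    while MACRO_CALL.search(text):
        text = MACRO_CALL.sub(" ", text)
    text = PANGO_TAG.sub(" ", text)
    text = VARIABLE.sub(" ", text)
    # Collapse the whitespace left behind by the substitutions.
    return " ".join(text.split())
```

Replacing with a space rather than an empty string avoids gluing together the words on either side of a tag.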
I tried packaging the program with py2exe, but the result was a monster of about 17 MB. So, I decided to scrap this idea...
Anyway, the application is pretty much finished, and its future versions will probably receive only translations and bugfixes.
- Elvish_Hunter
WeSpell - A Python .po spellchecker
First of all, excuse me for my sudden disappearance from both the forums and IRC, but real life has been pretty hectic over the last month.
Anyway, I'm announcing the release of a new version of my spellchecker (which I decided to call WeSpell, a pun on Wesnoth and spell). You can find it in the first post.
A lot of changes were implemented, to the point that I decided to mark this release as 1.0: almost all of the old code was discarded.
The first change that you can see is that I abandoned Tkinter as the GUI toolkit, and replaced it with Qt/PySide. The downside of this change is that a plain Python installation isn't enough to run the program: you'll need to install PySide as well. On the other hand, this toolkit is much faster than Tk and it has a lot of functions that Tk lacks: printing, direct support for several image formats, a lot of widgets, sound support... For example, moving to Qt allowed me to implement printing to paper or PDF.
The second notable change is that I removed my parser, and replaced it with polib. You'll need to install this library as well to run the program, but it works much better than my old parser. However, it doesn't support line numbers, so spellcheck outputs now point to msgids, which, as I remember caslav.ilic saying once, are the only way to point to a certain entry in a .po file.
Speaking of output: after spellchecking a file, you can now save the output as text, HTML, PDF, or print it on paper; you can also change font and paper size.
Another interesting change is that, instead of placing the whole program in a single file, I packaged it in several modules: each window now has its own module.
You can also choose among four icon sets: Bluecurve, Crystal, Gnome and Oxygen; every set is available in five sizes (16, 24, 32, 48, 64), so you can choose the one that best suits your display. All of them are correctly credited in the About section. In this case, the GPL v3 license applies only to my own code: every other component, library or artwork is still released under its own license.
And now, time for some questions:
How do I run it?
First of all, you need to install Python. This program should run on both 2.7 and 3.3, so you can pick whichever you prefer.
Next, you need to install PySide (http://qt-project.org/wiki/Category:Lan ... :Downloads), PyEnchant (http://pythonhosted.org/pyenchant/download.html) and polib (https://pypi.python.org/pypi/polib, read the installation guide at http://polib.readthedocs.org/en/latest/ ... stallation).
Finally, extract the contents of the zip file and double-click on "Po spellchecker Qt.py".
Why didn't you package it as a .exe file?
Oh, I tried using cx_Freeze (py2exe wasn't a viable solution, because it's not available for Python 3). Let's say that PyEnchant refused to be properly packaged no matter what, so I had to scrap the idea.
How do I install a dictionary?
You'll need a dictionary in MySpell/Hunspell format. Two good sources of them are OpenOffice.org's archive (http://extensions.openoffice.org/ or http://wiki.openoffice.org/w/index.php? ... did=229123) and LibreOffice (http://extensions.libreoffice.org/extension-center; alternatively http://download.documentfoundation.org/libreoffice/src/, select the latest version's directory and download the file named libreoffice-dictionaries-X.X.X.X.tar.xz; you need 7-Zip to open it).
Once you have extracted the archive and got your dictionary, put its .aff and .dic files into the dictionaries folder that you can find inside this program's directory. Restart the program if it was open. That's it.
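If a dictionary doesn't show up, a quick check is whether both files of the pair are really there: a Hunspell dictionary consists of an .aff and a .dic file with the same base name. The helper below is only a sketch for that check and is not part of the program; `complete_dictionaries` is a made-up name.

```python
import os


def complete_dictionaries(file_names):
    """Return the language codes that have both an .aff and a .dic file."""
    aff = {os.path.splitext(name)[0] for name in file_names if name.endswith(".aff")}
    dic = {os.path.splitext(name)[0] for name in file_names if name.endswith(".dic")}
    return sorted(aff & dic)
```

For the real folder, pass it the result of `os.listdir("dictionaries")`, run from the program's directory.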