PyWordGen
About PyWordGen
PyWordGen started life as a simple random name generator PHP
script on MiddleEarth.net,
was reincarnated as a set of Perl scripts, then as a custom product for
Zope, and now lives its life as a stand-alone Python class.
The true origins of the name generator (now PyWordGen) were in cryptography, or rather, the history of cryptography. Several years ago, we
were playing with letter frequency analysis in different languages and at
different historical times. (It was very interesting to see these trends
over time...) We were wondering if character frequencies could indicate
source language in old encryption styles (one-to-one mappings).
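As a rough illustration of the kind of letter frequency analysis described above, here is a minimal Python sketch. It is not the original scripts; the function names (`letter_frequencies`, `frequency_profile`, `frequency_distance`) are hypothetical and chosen only for this example. It relies on one well-known property: a one-to-one letter mapping relabels the letters but leaves the sorted list of frequencies unchanged.

```python
from collections import Counter


def letter_frequencies(text):
    """Return the relative frequency of each alphabetic character in text."""
    letters = [ch.lower() for ch in text if ch.isalpha()]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {ch: n / total for ch, n in counts.items()}


def frequency_profile(text):
    """Return the frequencies sorted from most to least common.

    The letters themselves are discarded, so a one-to-one substitution
    of the text yields the same profile as the original text.
    """
    return sorted(letter_frequencies(text).values(), reverse=True)


def frequency_distance(freq_a, freq_b):
    """Sum of absolute differences between two frequency tables (0 means identical)."""
    keys = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(k, 0.0) - freq_b.get(k, 0.0)) for k in keys)
```

One could build frequency tables from sample texts in several languages and compare an unknown text against each with `frequency_distance`, or compare profiles when the text has been through a one-to-one mapping. Whether that is enough to identify a source language is exactly the question that stayed open.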
The information we collected was inconclusive, but the Perl scripts
we wrote gave birth to another idea: a character name generator for
games. Since then, we have used it not only to create names for people
in games and on Web sites, but to assist in coming up with vocabulary
while developing whole new languages.
The Future
We'd like to weave more linguistically oriented material into the code (we're open to suggestions). PyWordGen has delusions of the
linguistic sort: it wants to simulate phonological and morphological
evolution. How far this goes is anyone's guess, but right now it works
and is tons of fun to play with ;-) Honestly, though, this started as a
game and is currently used in games -- there are no genuine aspirations
of academic accomplishment or the heady vapours of linguistic science.
Example Usage
Example usage has been moved to the wiki.
News
New Branch [2007.09.15]
More work is continuing on PyWordGen, with a focus on adding support for setuptools, unifying the Python modules in a single namespace, adding support
for other languages, and improving test coverage. There is no timeline set
currently, but this work has been a long time waiting and I'm glad to be
playing with it again.
Reorg Branch Merged to Trunk [2006.07.26]
After a long hiatus, PyWordGen is back and completely rewritten. The initial intent of the reorg was to make use of NLTK, but this actually
turned out to be overkill for the purposes of PyWordGen. However, a
much-needed makeover has just been completed and merged back to trunk.
The new version of PyWordGen includes a much-improved API, package and
module organization, and more efficient algorithms. There are also
new source languages, including Gaelic, Hebrew and Sanskrit.
NLTK and Code Re-Org [2004.12.28]
After setting up various projects in a multiple-project Trac configuration, I did a couple of informal code reviews and stopped
short once I got to PyWordGen. Terrible, just terrible. I wanted
more rigorous math, a clear separation of application code and
library code/utilities, and I wanted to get rid of all the aesthetic
issues I have with it.
I started reorganizing the code, breaking out the mini-corpora,
generated stats, actual code, and the tool/app that have all
been conglomerated as PyWordGen. Then, vaguely remembering that
NLTK (Natural Language Tool Kit) had the capability to do some
form of syntagmata, I started experimenting. Right now, things
are looking good. I've done a couple of code experiments and
prototypes with NLTK that may end up replacing all the stuff I
did originally. Which is good, since NLTK is a hard-core,
rigorous computational linguistics toolset.
So, if everything goes well, the next release of PyWordGen will
include well-organized and separated code and an NLTK wrapper
for the consonant and vowel games I am playing.
New Project Web Site [2003.11.21]
A special thanks goes out to the PHP.net folks, since we totally stole their look and feel ;-) Their site has been one of the
nicest and cleanest on the net for some time; it was a natural source
of inspiration. We've used a combination of SSI (server-side includes)
and Python CGI to develop a "site in a box" that only requires edits
to flat text files (shell access). This combination of technologies
is very common in hosted environments and is as simple to implement
as uploading and unpacking a gzip'ed tarball or doing a CVS
checkout.
PyWordGen Launched on SF.net [2003.11.16]
PyWordGen has an official home on SourceForge.net. Although the first version of the code is written, the project site is yet
to be completed and there is nothing in CVS at the moment.