_Web.log

tag: english

Hackpact 2009/09/#8: A comprehensive and un-noisy list of English noun inflections

http://www.erase.net/dump/noun-inflections.txt

Small hack today: cleaning up and unifying the list of noun inflections used by t+7 (info). A decent, un-noisy yet comprehensive list of English nouns is a surprisingly hard thing to find. This one, derived from the 2of12 list, contains around 25,000 entries.

Hackpact 2009/09/#6: Pluralizing English nouns in Python

Attachment: plurals_en.txt

As part of the Django/Twitter-based hackpact project mentioned yesterday, I need to be able to generically pluralize English nouns. This is a distinctly non-trivial job, given the vast array of irregularities and unusual inflections in the English language: think tooth/teeth, vertex/vertices, stimulus/stimuli, wolf/wolves, starfish/starfish, mother-in-law/mothers-in-law. The linguistics and algorithms behind this process have been written about by Damian Conway for a related Perl module. Today, I have been porting the same process to Python, based on a simpler example from the Dive Into Python reference.
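To give a flavour of the approach, here is a minimal sketch of rule-based pluralization in the spirit of the Dive Into Python example. It is not the ported code itself: the names IRREGULAR, SUFFIX_RULES and pluralize are made up for illustration, the rules are hard-coded rather than read from the datafile, and only a handful of cases are covered.

```python
import re

# Illustrative irregular forms, looked up before any suffix rule.
# A real list would be far longer and live in a datafile.
IRREGULAR = {
    "tooth": "teeth",
    "vertex": "vertices",
    "stimulus": "stimuli",
    "wolf": "wolves",
    "starfish": "starfish",
    "mother-in-law": "mothers-in-law",
}

# (pattern, characters to strip from the end, suffix to append).
# Rules are tried in order; the first matching pattern wins.
SUFFIX_RULES = [
    (re.compile(r"[sxz]$"), 0, "es"),             # box -> boxes
    (re.compile(r"[^aeioudgkprt]h$"), 0, "es"),   # coach -> coaches
    (re.compile(r"[^aeiou]y$"), 1, "ies"),        # baby -> babies
]


def pluralize(noun):
    """Return a plural form of a lowercase English noun (rough sketch)."""
    if noun in IRREGULAR:
        return IRREGULAR[noun]
    for pattern, strip, suffix in SUFFIX_RULES:
        if pattern.search(noun):
            stem = noun[:len(noun) - strip] if strip else noun
            return stem + suffix
    # Fallback: the regular plural.
    return noun + "s"


if __name__ == "__main__":
    for word in ["cat", "box", "coach", "baby", "day", "tooth", "starfish"]:
        print(word, "->", pluralize(word))
```

Keeping the rules as data rather than as a chain of if statements means new inflection patterns can be added without touching the function, which is what makes moving them into an external file attractive.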

The datafile in its current format is attached. I'll publish the rest of the code (and reveal the underlying plan!) when the project is complete, hopefully in tomorrow's session...