Tuesday, November 21, 2017

Unicode-Preeti-English Transliteration

A few days ago I wrote Python scripts for converting text between Unicode, Preeti and English (transliteration). The initial motivation was to convert Unicode Nepali names in a database to their English transliterations, to make SQL queries easier to carry out. That led me to put together the first script. A day or two later, a friend of mine asked if I could write something for converting between Preeti and Unicode, so I gave it a try and ended up with Preeti to Unicode and Unicode to Preeti scripts.
At their core, the above scripts are mapping dictionaries complemented with rules for the outliers/exceptions that occur rather frequently during conversion. As a result, they probably aren't perfect, but they can certainly be refined to get close.
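To give a rough idea of what "mapping dictionary plus rules" looks like, here is a minimal sketch in Python. It is not the actual script: the tables are deliberately tiny, the romanization choices are assumptions, and the function name `transliterate` is made up for illustration. The one rule shown is that a Devanagari consonant carries an inherent "a" unless a vowel sign or a halant follows it.

```python
# Hypothetical, deliberately tiny tables; a real converter needs the whole script.
CONSONANTS = {
    "\u0915": "k",  # ka
    "\u092e": "m",  # ma
    "\u0928": "n",  # na
    "\u0930": "r",  # ra
    "\u0932": "l",  # la
    "\u092a": "p",  # pa
}
VOWEL_SIGNS = {
    "\u093e": "aa",  # aa matra
    "\u093f": "i",   # i matra
    "\u0940": "ee",  # ii matra
    "\u0941": "u",   # u matra
    "\u0947": "e",   # e matra
    "\u094b": "o",   # o matra
}
HALANT = "\u094d"  # virama: suppresses the inherent vowel


def transliterate(text):
    """Map Devanagari to Latin letters using the tables above plus one rule."""
    out = []
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            # Rule: add the inherent 'a' unless a vowel sign or halant follows.
            if nxt not in VOWEL_SIGNS and nxt != HALANT:
                out.append("a")
        elif ch in VOWEL_SIGNS:
            out.append(VOWEL_SIGNS[ch])
        elif ch == HALANT:
            continue  # already handled when the preceding consonant was emitted
        else:
            out.append(ch)  # pass through anything the tables don't cover
    return "".join(out)


print(transliterate("\u0928\u0947\u092a\u093e\u0932"))  # the word "Nepal" -> nepaala
```

The trailing "a" in `nepaala` is exactly the kind of outlier mentioned above: dropping the inherent vowel at the end of a word would need one more rule on top of the plain lookup.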
Now, the only thing missing is conversion from English to Unicode, which I don't think is necessary because Google Input Tools, among other alternatives, already handles it.
Oh, and while I was pushing revision commits for the Unicode to Preeti script to GitHub, I wondered whether a better, more reliable program for it already existed. So I searched GitHub for an existing 'Unicode to Preeti converter' and, among others, landed on this, which is pretty good at its job.