A Garbled Story
Have you ever thought about making up your own language?
What would it sound like? How would you form words?
When I decided to give it a try, my first thought went to large language models, the backbone of software like ChatGPT and other text-based "AI" assistants. This led to the creation of two scripts: one that would turn a source text (or corpus) into a dataset, and another that could use the dataset to create words that seem to follow the rules of the source material's language.
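For the curious, here is a minimal sketch of how a pair of scripts like that might work. It is not the original code: it swaps in a character-level Markov chain, which is far simpler than a real language model, and the names buildDataset and makeWord are made up for illustration. The first function records which letter tends to follow each two-letter context in the corpus; the second walks that table to spell out a new word.

```javascript
// Hypothetical sketch: a character-level Markov chain word generator.
// buildDataset and makeWord are illustrative names, not the original scripts.

// Step 1: turn a corpus into a dataset that maps each context (the previous
// `order` characters) to the list of characters seen right after it.
// "^" pads the start of a word and "$" marks its end.
function buildDataset(corpus, order = 2) {
  const dataset = new Map();
  const words = corpus.toLowerCase().match(/[a-z]+/g) || [];
  for (const word of words) {
    const padded = "^".repeat(order) + word + "$";
    for (let i = order; i < padded.length; i++) {
      const context = padded.slice(i - order, i);
      if (!dataset.has(context)) dataset.set(context, []);
      dataset.get(context).push(padded[i]);
    }
  }
  return dataset;
}

// Step 2: start from the padding context and sample one character at a time
// until the end marker comes up or the word reaches maxLength.
function makeWord(dataset, order = 2, maxLength = 12, random = Math.random) {
  let context = "^".repeat(order);
  let word = "";
  while (word.length < maxLength) {
    const choices = dataset.get(context);
    if (!choices) break; // empty dataset or unknown context
    const next = choices[Math.floor(random() * choices.length)];
    if (next === "$") break;
    word += next;
    context = (context + next).slice(-order);
  }
  return word;
}

const dataset = buildDataset("the quick brown fox jumps over the lazy dog");
console.log(makeWord(dataset));
```

With a corpus this small the output mostly echoes the source words; a longer text gives the chain more contexts to mix, which is where the invented-but-plausible words come from.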
With a method for creating plausible words in hand, I began to look for a way to create a grammar. This led me to visit the library, where I picked up some books on linguistics. While reading them, I learned that linguists had already "cracked the code" of English. There's no need for LLMs or fancy scripts: words in English are, in theory, created by a simple set of rules.
Realizing this, I recreated the word-generating script using JavaScript, these rules, and some HTML. More recently, I expanded it with some additional ways to create new "words". The entire code base is less than a single megabyte, so I've shared it online where everyone can play with it for free.
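To give a rough idea of what rule-based word building can look like, here is a small sketch. It is an assumption, not the shared code: it uses the common textbook description of a syllable as an optional onset, a vowel nucleus, and an optional coda. The letter lists are a tiny hand-picked sample, and makeSyllable and makeRuleWord are hypothetical names.

```javascript
// Hypothetical sketch of rule-based word building. The lists below are a
// small illustrative sample, not the rule set used by the real script.
const ONSETS = ["", "b", "d", "k", "m", "s", "t", "bl", "br", "st", "tr", "str"];
const NUCLEI = ["a", "e", "i", "o", "u", "ai", "ee", "oo"];
const CODAS = ["", "d", "k", "m", "n", "t", "nd", "nt", "st"];

// Pick one element from a list using the supplied random source.
function pick(list, random) {
  return list[Math.floor(random() * list.length)];
}

// A syllable is onset + nucleus + coda; the onset and coda may be empty.
function makeSyllable(random = Math.random) {
  return pick(ONSETS, random) + pick(NUCLEI, random) + pick(CODAS, random);
}

// A word is one or more syllables joined together.
function makeRuleWord(syllables = 2, random = Math.random) {
  let word = "";
  for (let i = 0; i < syllables; i++) word += makeSyllable(random);
  return word;
}

console.log(makeRuleWord());
```

No corpus or dataset is needed here: the rules themselves decide which letter combinations are allowed.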
As for creating my own fictional language, that goal ended up getting lost in the shuffle. But if I ever want to try it again, I won't be starting from scratch!