mafuchi/language_generator

Ok so: classes should inherit from methodselector.MethodSelector. Classes that are responsible for creating syllables should also inherit from morphono.MorphemeGeneratorMixin. The way I have it set up, all classes are set up like this:

  • self.phonological_inventory (dict): as generated by Phonology.get_inventory. I strongly recommend passing all the classes the same inventory. We'll take care of this when we build the master class.
  • self.base_likelihoods (dict): keys are labels for groups of methods; values are dictionaries whose keys are class method names (as strings) and whose values are likelihoods, i.e. percentages expressed as integers (e.g., 20 for 20%). Each group of methods represents a parameter (in the linguistic sense, more or less), the potential settings of which are represented by its methods. As an example, morphono.Nominal has a group case, whose methods represent total lack of case, nominative-accusative only, nominative-accusative-dative, and nominative-accusative-dative-genitive. The likelihoods associated with each method (e.g., 'case_nom_acc_dat': 20) vaguely represent cross-linguistic trends, based on a combination of my own gut feelings and the slightly more informed opinions of Corbin. I'm not sure there would be any major problems otherwise, but I personally feel safer prefixing the names of parameter methods with some version of their group label, e.g., case_nom_acc, gen_pers_nonpers, &c. I have it set up so that base_likelihoods is updated by both the contents of the likelihoods_dict parameter (if it's a dictionary) and by likelihood kwargs, which are turned into a dictionary internally; so you can pass init {'case': {'case_method1': likelihood1, 'case_method2': likelihood2, ...}, 'number': {'number_method1': likelihood1, ...}} or case={'case_method1': likelihood1, ...}, number={'number_method1': likelihood1, ...}. (NB: it's called base_likelihoods so it doesn't override the likelihoods property of MethodSelector.)
  • self.adjustments (dict): keys are group labels + '_adjustment' (for clarity); values are positive or negative integers. These represent 'weights' we can assign to the random selection of each parameter method, to make things more interesting. They all default to 0 (i.e., totally random given the likelihoods assigned in base_likelihoods). Updated in the same ways as base_likelihoods, i.e. via a whole dictionary or via keyword arguments (see the sketch after this list for how likelihoods and adjustments fit together).
  • self.flags (set): I tried to make things as simple as possible, but this is linguistics, so there still wound up being a significant number of interdependencies. flags is how I'm solving this: certain methods that have far-reaching consequences, e.g. Nominal.case_none, should .add a flag to the set of flags, which will be checked by other methods. The flags system is largely the reason why an instance of Nominal gets passed to Verbal and Other at init. As an example, the poss_prep (possessive preposition) methods in Other are largely dependent on the presence of case. We'll check case-related flags in self.nominal_instance.flags and 'fall back' to appropriate methods (e.g., if the language is nom-acc only, poss_none_before_dat will just call poss_none_before_acc.)
  • self.inventory (dict): this is where generated morphemes will be stored, e.g. 'fem': 'ba'.
  • self.syllables (set): every time a morpheme is generated, it'll be added to this set; if it's already present, it'll be 're-rolled'. (If you remember to inherit from MorphemeGeneratorMixin, its gen_morpheme method will take care of this for you.) I realized last night that our CV syllable structure limits us to (# of consonants × # of vowels) syllables per category, which likely won't be enough. We should discuss this.
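To make that concrete, here's a minimal, standalone sketch of how one parameter group, its likelihoods, and an adjustment could drive a weighted random pick. This is not the actual MethodSelector code; the weights and the exact way the adjustment gets applied are assumptions for illustration only.

```python
import random

# Standalone sketch (NOT the repo's MethodSelector): one parameter group,
# its likelihoods, an adjustment, and a flag, wired into a weighted pick.
base_likelihoods = {
    'case': {
        'case_none': 40,
        'case_nom_acc': 30,
        'case_nom_acc_dat': 20,
        'case_nom_acc_dat_gen': 10,
    },
}
adjustments = {'case_adjustment': 0}   # 0 = use the base likelihoods as-is
flags = set()                          # e.g. case_none would add a flag here

def pick_method(group):
    """Pick one parameter-method name from a group, weighted by likelihood."""
    likelihoods = base_likelihoods[group]
    bump = adjustments.get(group + '_adjustment', 0)
    names = list(likelihoods)
    # One plausible reading of 'adjustment': shift every weight by the bump,
    # flooring at 1 so no setting becomes impossible.
    weights = [max(likelihoods[name] + bump, 1) for name in names]
    return random.choices(names, weights=weights, k=1)[0]

chosen = pick_method('case')
if chosen == 'case_none':
    flags.add('no_case')   # downstream classes (Verbal, Other) would check this
print(chosen)
```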

The way I envision it, syntax/word order will be handled by template strings that take the same keyword arguments, so that the difference between SVO and OSV is '{subj} {verb} {obj}'.format(subj='ba', verb='gu', obj='pi') vs. '{obj} {subj} {verb}'.format(subj='ba', verb='gu', obj='pi'). We can do recursive formatting ('{noun}'.format(noun='{root}{suffix}'), then formatting the result again) to handle things like affixation, possessor placement, etc.
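As a throwaway illustration of that idea (same keyword names and morphemes as above):

```python
# Word order as template strings: same keyword arguments, different slots.
words = dict(subj='ba', verb='gu', obj='pi')

svo = '{subj} {verb} {obj}'.format(**words)   # 'ba gu pi'
osv = '{obj} {subj} {verb}'.format(**words)   # 'pi ba gu'

# 'Recursive' formatting: fill a structural slot with another template first,
# then format again to fill the inner slots (affixation, possessors, etc.).
clause = '{noun} {verb}'.format(noun='{root}{suffix}', verb='gu')
inflected = clause.format(root='ba', suffix='na')   # 'bana gu'
```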

There's probably more to it, but I'm at work and should actually do some work. Good luck, and hmu via email or gchat if you have questions.

Hey doobers: here's what you should do after you check out this repo for the first time:

  1. Install Python 3 if you haven't already.

  2. Run C:> pyvenv lang_gen_venv. This will set up a virtual environment for this project. DO IT. You can call it something besides lang_gen_venv, but just fucking do it, ok?

  3. Run C:> lang_gen_venv/Scripts/activate.bat. This will 'activate' your virtual environment, so you can use the proper packages and Python version without having to think about it.

  4. Run C:> python3 nltk/setup.py install. This will install NLTK 3.0a3, the latest version of NLTK 3, which is Py3k compatible. pip3 install nltk won't cut it, since they haven't released a stable version of NLTK 3 on PyPI yet.

If you guys could have this done by Saturday, that'd be great! GO TEAM!

Here's what to do if you use Windows.

  1. Install Python 3 like normal.

  2. Add Python to the environment variables: right-click on 'This Computer', click Advanced system settings, and on the Advanced tab click Environment Variables... Under System Variables, scroll down to Path, click Edit, and add ;C:\python33;C:\python33\Scripts (or wherever you put Python; this is the default). This will allow the command line to use python and any added script as a command.

  3. Install pip, easy_install, nltk-3.0a3, and setuptools. I put the files in the GitHub repo for you. At the command line, go to the folder where the files are located (cd C:\Documents\GitHub\language_generator) and run python get-pip.py. Do the same with the setup.py for each of the others (python setup.py install); for the zips, you must unzip them first.

  4. Install virtualenv: at the command line, run easy_install virtualenv==1.01.1. This should install the correct version for you to have it work.

  5. Run virtualenv at the command line: virtualenv lang_gen_env (or whatever you want to call it).
