Lexicon Management in GLIF
GLIF is a framework for describing the translation of natural language into logical expressions. This requires the specification of a grammar, a target logic, a domain theory in that logic, and a semantics construction (mapping of parse trees into the domain theory). If you participated in the LBS lecture, you should be familiar with the setup.
Adding a new word (like "woman") to a GLIF pipeline requires the following additions:
- abstract syntax:
woman_N : N;
- concrete syntax:
woman_N = mkN "woman" "women";
(if e.g. German is also supported: woman_N = mkN "Frau" feminine;)
- domain theory:
woman : i -> o
- semantics construction:
woman_N = woman
There clearly is a lot of repetition here. Your task would be to improve this by designing a lexicon format from which the necessary files can be generated automatically. A naive attempt to write a lexicon entry could look like this:
woman
noun
eng: "woman" "women"
ger: "Frau" feminine
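One way to make this concrete is a small generator that reads such an entry and emits the four fragments listed earlier. The following Python sketch is only an illustration under stated assumptions: the names LexEntry, parse_entry, generate and the CATEGORIES table are hypothetical, every noun is assumed to get the type i -> o, and the concrete syntax is assumed to use an operation named "mk" followed by the category, as in the example.

```python
from dataclasses import dataclass, field

# Hypothetical table: lexicon category keyword -> GF category and logical type.
# The type "i -> o" for nouns is taken from the example; other categories
# would need their own rows.
CATEGORIES = {"noun": {"gf_cat": "N", "type": "i -> o"}}


@dataclass
class LexEntry:
    lemma: str                                  # e.g. "woman"
    category: str                               # e.g. "noun"
    forms: dict = field(default_factory=dict)   # language code -> raw arguments


def parse_entry(text: str) -> LexEntry:
    """Parse the naive format: lemma, category, then 'lang: arguments' lines."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    lemma, category, rest = lines[0], lines[1], lines[2:]
    forms = {}
    for ln in rest:
        lang, _, args = ln.partition(":")
        forms[lang.strip()] = args.strip()
    return LexEntry(lemma, category, forms)


def generate(entry: LexEntry) -> dict:
    """Produce one line per target file for a single lexicon entry."""
    info = CATEGORIES[entry.category]
    cat = info["gf_cat"]
    ident = f"{entry.lemma}_{cat}"
    out = {
        "abstract": f"{ident} : {cat};",
        "domain": f"{entry.lemma} : {info['type']}",
        "semantics": f"{ident} = {entry.lemma}",
    }
    # One concrete-syntax line per language listed in the entry.
    for lang, args in entry.forms.items():
        out[f"concrete_{lang}"] = f"{ident} = mk{cat} {args};"
    return out
```

Calling generate(parse_entry(...)) on the entry above yields a dictionary whose values are the lines shown in the list of required additions, keyed by the file they belong to. Hard-coding the type and the operation name is exactly the kind of rigidity a real format would have to avoid.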
We probably want customization in different places. For example, the semantics construction for the name John might be
john_PN = john
or
john_PN = [P] P john
depending on the context. Also, not every project uses the resource grammar library, so the operations used in the concrete syntax might vary.
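Such customization could be expressed as a per-project profile that the generator consults. The Python sketch below is a hypothetical illustration, not part of GLIF: Profile, render_proper_name, and the profile names are invented here, and the templates simply reproduce the two variants of the semantics construction mentioned above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """Hypothetical per-project settings controlling the generated code."""
    pn_semantics: str      # template for the semantics of proper names
    pn_concrete_op: str    # operation used for proper names in the concrete syntax


# Names denote individuals directly.
INDIVIDUAL = Profile(pn_semantics="{ident} = {const}", pn_concrete_op="mkPN")

# Names are type-raised: they take a property P and apply it to the individual.
TYPE_RAISED = Profile(pn_semantics="{ident} = [P] P {const}", pn_concrete_op="mkPN")


def render_proper_name(profile: Profile, const: str, surface: str) -> dict:
    """Generate the semantics and concrete-syntax lines for one proper name."""
    ident = f"{const}_PN"
    return {
        "semantics": profile.pn_semantics.format(ident=ident, const=const),
        "concrete": f'{ident} = {profile.pn_concrete_op} "{surface}";',
    }
```

A project that does not use the resource grammar library could supply its own profile, for instance with pn_concrete_op set to whatever operation its concrete syntax defines, without touching the lexicon entries themselves.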
The lexicon management should also be supported in GLIF's Jupyter front-end.
It would also be interesting to take existing lexica and generate a generic lexicon from them, which could then be imported into any project (with the customization required to make it fit).
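As a rough starting point for importing an existing lexicon, one could scan the abstract syntax for declarations that follow the naming pattern used above (lemma, underscore, category). The Python sketch below assumes that convention and one declaration per line; FUN_DECL and extract_entries are hypothetical names, and a real importer would need a proper parser rather than a regular expression.

```python
import re

# Matches lines such as "woman_N : N;" and captures lemma, suffix, and category.
FUN_DECL = re.compile(r"^\s*(\w+)_(\w+)\s*:\s*(\w+)\s*;")


def extract_entries(abstract_source: str) -> list:
    """Collect lexical declarations whose identifier suffix equals their category."""
    entries = []
    for line in abstract_source.splitlines():
        m = FUN_DECL.match(line)
        # Skip non-lexical functions and names that do not follow the convention.
        if m and m.group(2) == m.group(3):
            entries.append({"lemma": m.group(1), "category": m.group(3)})
    return entries
```

The concrete syntaxes and the semantics construction would have to be matched against these entries in a second step, which is where the project-specific customization comes in.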