diff --git a/research/semantization.md b/research/semantization.md
index e3d4aaf76a9ad69441eda776aead633c86372181..1be74228532a2e739d403f09aa4604ed7db2652a 100644
--- a/research/semantization.md
+++ b/research/semantization.md
@@ -4,4 +4,22 @@ title: Semi-Automated Semantization
 menu_title: Semantization
 menu_order: 105
 ---
-...more to be written ...
+is the process of making the knowledge and structure in informal representations explicit,
+so that they can be acted upon by machines.
+
+Currently, 99% of the available mathematical knowledge is encoded in informal mathematical
+documents: journal articles, books, preprints, handwritten course notes or recordings of
+lectures. To make these accessible to
+[semantic services and knowledge management systems](kminteract), we must semantize them.
+
+The KWARC group engages in multiple projects to help along semantization. In the
+[sTeX](/systems/sTeX/) format, we enable authors to semantically preload LaTeX documents
+so that we can generate [OMDoc](/systems/OMDoc) representations from them (via
+[LaTeXML](http://dlmf.nist.gov/LaTeXML)).
+
+In the [arXMLiv project](/systems/arXMLiv) we transform the
+[Cornell ePrint arXiv](http://arxiv.org) into XML with MathML and explicit document
+structure via [LaTeXML](http://dlmf.nist.gov/LaTeXML). In the [LLaMaPuN](/systems/lamapun)
+project we develop libraries for automatically identifying meaning structures in arXMLiv
+documents so that we will eventually be able to harvest OMDoc from the results.
+