SIGMathLing provides its members and the research community with a set of services to meet its [objectives](objectives). These are jointly funded and maintained by SIGMathLing members (see the [technical concerns](technical/)).
In particular, SIGMathLing maintains
1. a system of repositories for math linguistics datasets and resources.
2. the math analysis blackboard, i.e. an information system, where analysis results
4. a public citable reference page of all the math linguistic resources (just descriptions; not necessarily public access for non-members).
5. a suite of systems and libraries
6. internal and outreach communication channels.
Recall that SIGMathLing maintains [a bouquet of services](services/); here we air some
technical concerns and ideas.
1. a system of **resource repositories**. MK: I would just create a GitHub/GitLab organization and somehow pay for their services, or use our KWARC GitLab. Git LFS should help us deal with the large files involved, and Git would take care of permission management.
2. the **math analysis blackboard**. MK: I would develop and publish an annotation schema
(using the KAT schema as a starting point) and establish a math result triple store
that manages all of these. How best to do this is still open, but I am sure Deyan has some ideas.
3. a **web site**. MK: I would go via GitHub/GitLab Pages and Jekyll, that makes communal
development
4. a **resource reference page**. MK: this is just a page on the web site, probably automatically generated from an internal database of resources and/or harvested from the repositories. Licensing should be made transparent.
5. a **suite of systems and libraries**: Initially, this will be a page on the web site with links to their repositories (the LLaMaPUn library, CorTeX, KAT, ...), mostly by reference to public resources.
6. **communication channels**: we start out with a members mailing list and a public Atom feed for announcements (from the web site); later there may even be a regular newsletter that digests these.
### Resource Repositories
We have a [SIGMathLing group](http://gl.kwarc.info/SIGMathLing) on the [GitLab](https://en.wikipedia.org/wiki/GitLab) server [gl.kwarc.info](http://gl.kwarc.info), where we will start creating repositories.
This allows us to use Git permissions for access control and the GitLab permission UI for management.
We estimate that for the first two years SIGMathLing will have fewer than 25 members (which limits traffic) and less than 5 TB of data sets.
gl.kwarc.info should be able to serve that, given that most data sets will be served via [Git LFS](https://git-lfs.github.com/).
Should space or traffic become a problem for the KWARC servers to handle, we will try to raise money for a more scalable solution.
We will also have a close look at [Zenodo](http://zenodo.org) and see whether we can delegate hosting to them.
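
To make the Git LFS plan a bit more concrete, here is a minimal sketch of how one might decide which file types in a repository to hand over to Git LFS. The function name `lfs_patterns` and the size threshold are assumptions made for illustration; the attribute string is the one that `git lfs track` writes into `.gitattributes`.

```python
from pathlib import Path
from typing import List

# Attribute string that `git lfs track` writes into .gitattributes.
LFS_ATTRIBUTES = "filter=lfs diff=lfs merge=lfs -text"


def lfs_patterns(repo_root: Path, threshold_bytes: int) -> List[str]:
    """Return .gitattributes lines for every file extension that has
    at least one file larger than threshold_bytes under repo_root.

    Hypothetical helper: the name and the threshold policy are
    illustrative assumptions, not an agreed SIGMathLing convention.
    """
    extensions = set()
    for path in repo_root.rglob("*"):
        # Skip Git's own bookkeeping directory.
        if ".git" in path.relative_to(repo_root).parts:
            continue
        if path.is_file() and path.suffix and path.stat().st_size > threshold_bytes:
            extensions.add(path.suffix)
    return [f"*{ext} {LFS_ATTRIBUTES}" for ext in sorted(extensions)]
```

The returned lines could be appended to a repository's `.gitattributes` file; files without an extension are ignored in this sketch and would need explicit patterns.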
### Standardizing Datasets and Resources
We will need to develop standards for representing, classifying, describing, and citing data sets and resources.
1. *Representation*: file formats, repository layout, data models.
2. *Classification/description*: is the dataset
   * a corpus (raw, processed, ...),
   * a set of annotations to a corpus,
   * manually/automatically created, and by which process/system?
   * an evaluation data set (gold standard)?
   * what is the quality (e.g. F-measure)?
   * what is the license?
3. *Citation*: how to cite them.
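
As a starting point for discussion, the criteria above could be captured in a small metadata record per resource. The following is only a sketch under stated assumptions: the class `ResourceDescription`, its field names, the example values, and the citation format are all hypothetical and not an agreed standard.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ResourceKind(Enum):
    """Classification of a resource (illustrative, not exhaustive)."""
    CORPUS = "corpus"
    ANNOTATIONS = "annotations"
    EVALUATION = "evaluation data set"


@dataclass(frozen=True)
class ResourceDescription:
    """Hypothetical metadata record; all field names are assumptions."""
    name: str
    kind: ResourceKind
    file_format: str             # representation, e.g. "HTML5" or "JSON"
    created_automatically: bool  # False means manually created
    created_by: str              # process or system that produced the data
    license: str
    url: str
    year: int
    f_measure: Optional[float] = None  # quality, if it has been measured

    def citation(self) -> str:
        """Render a plain-text citation line (the format is an assumption)."""
        return (
            f"{self.created_by} ({self.year}). "
            f"{self.name} [{self.kind.value}]. {self.url}"
        )


# Made-up example record, for illustration only.
example = ResourceDescription(
    name="Example Corpus",
    kind=ResourceKind.CORPUS,
    file_format="HTML5",
    created_automatically=True,
    created_by="Example Converter",
    license="CC-BY-4.0",
    url="https://example.org/example-corpus",
    year=2017,
)
print(example.citation())
```

Keeping the record flat like this would make it easy to store one description file per repository; whether quality is best expressed as a single F-measure is left open here, as it is in the list above.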
### Resource Reference Page
Currently, this is just a manually curated [page on the SIGMathLing web site](/resources/). Eventually we will statically generate it from an internal database of resources and/or from descriptions harvested from the repositories. Licensing should be made transparent.
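
The static generation step could look roughly like the sketch below, which turns a list of resource records into a Markdown list that Jekyll can render. The record keys (`name`, `url`, `description`, `license`) and the function name `render_resource_page` are assumptions for illustration; the real database layout is still undecided.

```python
from typing import Dict, List


def render_resource_page(resources: List[Dict[str, str]]) -> str:
    """Render resource records as a Markdown bullet list.

    Hypothetical sketch: the record keys are assumed, not specified.
    Entries are sorted by name, and a missing license is shown
    explicitly so that licensing stays transparent.
    """
    entries = []
    for res in sorted(resources, key=lambda r: r["name"].lower()):
        license_name = res.get("license") or "unspecified"
        entries.append(
            f"* [{res['name']}]({res['url']}): "
            f"{res['description']} (license: {license_name})"
        )
    return "\n".join(entries) + "\n"


# Made-up records, for illustration only.
records = [
    {"name": "b-set", "url": "https://example.org/b", "description": "second"},
    {"name": "A-corpus", "url": "https://example.org/a",
     "description": "first", "license": "CC0-1.0"},
]
print(render_resource_page(records), end="")
```

The same function could be fed either from an internal database export or from description files harvested from the repositories.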
### Suite of Systems and Libraries
Currently, this is just a manually curated [page on the SIGMathLing web site](/systems/). Eventually we will statically generate it from an internal database of resources and/or from descriptions harvested from the repositories. Licensing should be made transparent.
Initially, this will be a page on the web site with links to their repositories (the LLaMaPUn library, CorTeX, KAT, ...), mostly by reference to public resources.
### Communication Channels
We start out with a members mailing list and a public Atom feed for announcements (from the web site); later there may even be a regular newsletter that digests these.
### Math Analysis Blackboard
MK would like to develop and publish an annotation schema (using the KAT schema as a starting point) and establish a math result triple store that manages all of these. How best to do this is still open, and Deyan is quite skeptical.
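
To illustrate what a triple store for analysis results means at its simplest, here is a toy in-memory sketch. It is not a proposal for the actual implementation (a real deployment would more likely use an existing RDF store), and the class `TripleStore` as well as the identifiers and predicate names in the example are hypothetical.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Tiny in-memory store of (subject, predicate, object) triples."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Record one statement; duplicates are stored only once."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> List[Triple]:
        """Return sorted triples matching the pattern; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


# Made-up analysis results, for illustration only.
store = TripleStore()
store.add("doc1#formula3", "annotatedAs", "declaration")
store.add("doc1#formula3", "producedBy", "example-tool")
store.add("doc2#formula1", "annotatedAs", "definition")

for triple in store.match(predicate="annotatedAs"):
    print(triple)
```

Queries by pattern, such as "all results produced by a given tool", are the kind of access the blackboard would need; the annotation schema would then fix which subjects and predicates are allowed.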