Commit cec32f4e authored by Michael Kohlhase

new

parent c49e368e
---
layout: post
title: First Data Set on SIGMathLing
---
SIGMathLing has published its first data set, which also serves as a template for future data
sets. The content of this data set is licensed to [SIGMathLing members](/member/) for research
and tool development purposes, subject to the [SIGMathLing Non-Disclosure-Agreement](/nda/).
This collection of 1.1 million HTML5 documents
was developed as part of the [arXMLiv](https://kwarc.info/systems/arXMLiv/) project at
the [KWARC](https://kwarc.info/) research group. It was created by converting the
[arXiv collection of scientific preprints up to August 2017](http://arxiv.org) via
[LaTeXML](https://github.com/brucemiller/LaTeXML) using the
[CorTeX corpus management system](https://github.com/dginev/CorTeX).
Details can be found on the [SIGMathLing Resource page](/resources/arxmliv/).