diff --git a/_posts/2018-01-08-dataset.md b/_posts/2018-01-08-dataset.md
new file mode 100644
index 0000000000000000000000000000000000000000..87219945035b26f2936bf52c7813fa46e9b7b051
--- /dev/null
+++ b/_posts/2018-01-08-dataset.md
@@ -0,0 +1,18 @@
+---
+layout: post
+title: First Data Set on SIGMathLing
+---
+SIGMathLing has published its first data set, which also serves as a template for future data
+sets. The content of this data set is licensed to [SIGMathLing members](/member/) for research
+and tool development purposes, subject to the [SIGMathLing Non-Disclosure Agreement](/nda/).
+
+This collection of 1.1 million HTML5 documents
+was developed as part of the [arXMLiv](https://kwarc.info/systems/arXMLiv/) project at
+the [KWARC](https://kwarc.info/) research group. It was created by converting the
+[arXiv collection of scientific preprints up to August 2017](http://arxiv.org) via
+[LaTeXML](https://github.com/brucemiller/LaTeXML) using the
+[CorTeX corpus management system](https://github.com/dginev/CorTeX).
+
+Details can be found on the [SIGMathLing Resource page](/resources/arxmliv/).
+
+