Commit 7f961f50 authored by Deyan Ginev

mock page stub for arxmliv

parent a20aeee7
1 merge request: !1 Document 08.2017 arxlmiv dataset release
An HTML dataset of arXiv.org
### Current release
- 08.2017
### License
TODO: Official SIGMathLing license link
### Generated by
- [LaTeXML 0.8.2](https://github.com/brucemiller/LaTeXML/releases/tag/v0.8.2),
- [CorTeX 0.2](https://github.com/dginev/CorTeX/releases/tag/0.2.0)
### Details:
- Size: `todo-add-size`GB archived,
- MD5: `todo-add-hash` arXMLiv_08_2017.zip
- Contents:
  - 1,088,375 HTML5 documents
  - By conversion severity: 112,088 `no_problem`, 574,642 `warning`, 401,645 `error`
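
Once the MD5 value above is filled in, a downloaded archive can be checked against it. The following is a minimal sketch, not part of the release tooling: the helper names `md5_of_file` and `matches_published_md5` are hypothetical, and the published hash is still a placeholder (`todo-add-hash`) at the time of this commit.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so that a multi-gigabyte archive
    does not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops once read() returns empty bytes.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, published_md5):
    """Compare a local file against a published hex digest.

    Hypothetical helper: `published_md5` is whatever value the
    release page eventually lists; whitespace and case are ignored.
    """
    return md5_of_file(path) == published_md5.strip().lower()
```

For example, `matches_published_md5("arXMLiv_08_2017.zip", published)` would return `True` only if the local archive is byte-identical to the one that was hashed for the release.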
### Description:
This is the first public release of the arXMLiv dataset generated by the [KWARC](https://kwarc.info/) research group. Redistribution is confined to the [SIGMathLing] interest group, and access is members-only.
We welcome community feedback on all of: data quality, need for auxiliary resources (e.g. figures, token models), representation issues, as well as organization and archival best practices.
The next release is planned for mid-2018, with an up-to-date arXiv dataset and community feedback incorporated. We anticipate annual dataset releases going forward.
### Citing this Resource
TODO: Bibtex
### Download
[Download link (password-protected)](https://gl.kwarc.info/SIGMathLing/dataset-arXMLiv-08-2017)
@@ -3,4 +3,6 @@ layout: page
title: SIGMathLing - Datasets and Resources
---
-none yet, but see the [plan](/technical/)
+1. [arXMLiv corpus, 08.2017 release](/arxmliv/)
+Additional resources are en route, see the [plan](/technical/) for details.