Commit 87593b42 authored by Michael Kohlhase

adding subset id and subset explanation text.

parent 42652c89
1 merge request: !1 Document 08.2017 arxlmiv dataset release
......@@ -25,11 +25,11 @@ articles, the right of distribution was only given (or assumed) to arXiv itself.
- 1,088,370 HTML5 documents
- Three separate archive bundles separated by LaTeXML conversion severity
-| subset | MD5 | number of documents | size archived | size unpacked |
+| subset ID | file name | MD5 | number of documents | size archived | size unpacked |
 | --- | --- | --- | --- | --- |
-| arXMLiv_08_2017_no_problem.zip | `036945755c7cc75ea1577cf04ca4fead` | 112,088 | 5 GB | 37 GB |
-| arXMLiv_08_2017_warning.zip | `c0d5c1baf626225b48264510ac4c6bd5` | 574,638 | 71 GB | 595 GB |
-| arXMLiv_08_2017_error.zip | `2f4e60b993d85d30523b064c19e45733` | 401,644 | 50 GB | 421 GB |
+| no_problems| arXMLiv_08_2017_no_problem.zip | `036945755c7cc75ea1577cf04ca4fead` | 112,088 | 5 GB | 37 GB |
+| warning| arXMLiv_08_2017_warning.zip | `c0d5c1baf626225b48264510ac4c6bd5` | 574,638 | 71 GB | 595 GB |
+| error| arXMLiv_08_2017_error.zip | `2f4e60b993d85d30523b064c19e45733` | 401,644 | 50 GB | 421 GB |
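
Since the table lists an MD5 digest for each archive, a download can be checked against it before unpacking. The following is a minimal sketch, not part of the release itself: the names `EXPECTED_MD5`, `md5_of_file`, and `verify_archive` are hypothetical, and the sketch assumes the archives keep the file names given in the table.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the table, keyed by archive file name.
EXPECTED_MD5 = {
    "arXMLiv_08_2017_no_problem.zip": "036945755c7cc75ea1577cf04ca4fead",
    "arXMLiv_08_2017_warning.zip": "c0d5c1baf626225b48264510ac4c6bd5",
    "arXMLiv_08_2017_error.zip": "2f4e60b993d85d30523b064c19e45733",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks.

    Chunked reading keeps memory use flat, which matters for
    archives in the 5 GB to 71 GB range.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_archive(path):
    """Return True if the file's MD5 matches the digest listed for its name.

    Raises KeyError when the file name is not one of the three archives
    (hypothetical behaviour chosen for this sketch).
    """
    path = Path(path)
    if path.name not in EXPECTED_MD5:
        raise KeyError(f"no MD5 listed for {path.name}")
    return md5_of_file(path) == EXPECTED_MD5[path.name]
```

Calling `verify_archive("arXMLiv_08_2017_no_problem.zip")` on a completed download should return `True`; a `False` result suggests a truncated or corrupted transfer.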
### Description
......@@ -52,7 +52,10 @@ A following release is planned for mid-2018, with an up-to-date arXiv dataset an
The dataset should be referenced in all academic publications that present results
obtained with its help. The reference should contain the identifier `arXMLiv:08.2017` in
the title, the author, year, a reference to SIGMathLing, and the URL of the resource
-description page. For convenience, we supply some records for bibTeX and EndNote below.
+description page. For convenience, we supply some records for bibTeX and EndNote below. To
+cite a particular part of the dataset, use the subset identifier in the citation, e.g.
+`\cite[no_problem subset]{arXMLiv:08.2017}`, or just explain it in the text using the
+concrete identifier.
#### pure bibTeX
```
......