Commit d62121df authored by Michael Kohlhase

Merge branch 'grounding-dataset-v1' into 'master'

Grounding dataset v1

See merge request !12
parents 628a2ae7 49cef0de
Pipeline #2080 passed with stage in 2 minutes and 25 seconds
layout: page
title: Dataset for Grounding of Formulae
### Basic Information
* Authors: Takuto Asakura, André Greiner-Petter, Akiko Aizawa, and Yusuke Miyao
* Updated: 2020-03-26
### Accessibility and License
The content of this dataset is licensed to [SIGMathLing members](/member/) for
research and tool development purposes.
Access is restricted to [SIGMathLing members](/member/) under the [SIGMathLing
Non-Disclosure-Agreement](/nda/) because, for most [arXiv](
articles, the right of distribution was only given (or assumed) to arXiv.
### Description
This project aims to create a dataset for the grounding of formulae.
As a trial work, this dataset consists of an annotated long paper (20 pages in
* Simeone, O.: A Very Brief Introduction to Machine Learning with Applications
to Communication Systems. IEEE Transactions on Cognitive Communications and
Networking 4(4) (2018)
The original XHTML file of the paper was taken from the [arXMLiv:08.2018
dataset](/resources/arxmliv-dataset-082018/), and we manually annotated all
937 identifiers (i.e., `<mi>` elements) in the document with their
corresponding mathematical objects (meanings).
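
To make the notion of an identifier concrete, the following minimal sketch collects MathML `<mi>` elements from an XHTML string using only the Python standard library. The helper name `collect_identifiers` and the tiny `SAMPLE` document are hypothetical and are not part of the dataset. The sketch also assumes well-formed XML input, so a real arXMLiv file that uses named HTML entities may need a more tolerant parser.

```python
import xml.etree.ElementTree as ET

# Standard MathML namespace; <mi> elements in XHTML live in this namespace.
MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def collect_identifiers(xhtml_text):
    """Return the text of every MathML <mi> element, in document order."""
    root = ET.fromstring(xhtml_text)
    return [(mi.text or "").strip() for mi in root.iter(f"{{{MATHML_NS}}}mi")]


# Hypothetical miniature document standing in for the annotated paper.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>The loss is
      <math xmlns="http://www.w3.org/1998/Math/MathML">
        <mrow><mi>L</mi><mo>(</mo><mi>x</mi><mo>)</mo></mrow>
      </math>
    </p>
  </body>
</html>"""

identifiers = collect_identifiers(SAMPLE)
print(len(identifiers), identifiers)
```

Each string returned this way corresponds to one occurrence that would receive an annotation. The same symbol can appear several times and refer to different mathematical objects depending on context.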
### Download
[Download link](
([SIGMathLing members](/member/) only)
@@ -12,6 +12,9 @@ title: SIGMathLing - Datasets and Resources
1. [arXMLiv word embeddings, 08.2017 release](/resources/arxmliv-embeddings-082017)
1. [arXMLiv corpus, 08.2017 release](/resources/arxmliv-dataset-082017/)
## Work-In-Progress Resources hosted on the SIGMathLing Repository
1. [Dataset for Grounding of Formulae](/resources/grounding-dataset)
## Resources hosted externally
1. [ACL-math-annotation](