Commit 611906ff authored by Takuto Asakura

grounding-dataset: now the number of annotated paper is 15

parent 3242371a
Pipeline #4050 passed
@@ -5,8 +5,8 @@ title: Dataset for Grounding of Formulae
 ### Basic Information
-* Author: Takuto Asakura, André Greiner-Petter, Akiko Aizawa, and Yusuke Miyao
-* Updated: 2021-04-01
+* Author: Takuto Asakura, Yusuke Miyao, and Akiko Aizawa
+* Updated: 2022-01-20
 ### Accessibility and License
@@ -20,19 +20,12 @@ itself.
 ### Description
-This is the project to create a dataset for grounding of formulae.
-As a trial work, this dataset consists of an annotated long paper (20 pages in
-PDF):
-* Simeone, O.: A Very Brief Introduction to Machine Learning with Applications
-to Communication Systems. IEEE Transactions on Cognitive Communications and
-Networking 4(4) (2018)
-The original XHTML file of the paper was taken from the [arXMLiv:08.2018
-dataset](/resources/arxmliv-dataset-082018/), and we manually annotated all
-937 identifiers (i.e., `<mi>` tags) in the document to the corresponding
-mathematical objects (meanings).
+This dataset is a ground truth of formula grounding annotation data for 15
+scientific papers. More specifically, a total of 12,352 math identifiers were
+annotated with their referring mathematical concepts, explicitly indicating
+coreference relations within each article. A total of 938 text spans, called
+grounding sources, that were used as the basis for human grounding were
+labeled.
 The annotation is performed with our open-source annotation tool
 [MioGatto](https://github.com/wtsnjp/MioGatto). The tool is also suitable for
@@ -40,5 +33,5 @@ viewing the data. Please refer to its documentation for the details.
 ### Download
-[Download link](https://gl.kwarc.info/SIGMathLing/grounding-dataset-v1)
+[Download link](https://gl.kwarc.info/SIGMathLing/grounding-dataset)
 ([SIGMathLing members](/member/) only)