From 611906ff69da22f0115797773fa482ce4669c5fa Mon Sep 17 00:00:00 2001
From: Takuto ASAKURA <wtsnjp@gmail.com>
Date: Thu, 20 Jan 2022 23:12:23 +0900
Subject: [PATCH] grounding-dataset: now the number of annotated paper is 15

---
 resources/grounding-dataset.md | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/resources/grounding-dataset.md b/resources/grounding-dataset.md
index 4daaa56..aa80e24 100644
--- a/resources/grounding-dataset.md
+++ b/resources/grounding-dataset.md
@@ -5,8 +5,8 @@ title: Dataset for Grounding of Formulae
 
 ### Basic Information
 
-* Author: Takuto Asakura, André Greiner-Petter, Akiko Aizawa, and Yusuke Miyao
-* Updated: 2021-04-01
+* Author: Takuto Asakura, Yusuke Miyao, and Akiko Aizawa
+* Updated: 2022-01-20
 
 ### Accessibility and License
 
@@ -20,19 +20,12 @@ itself.
 
 ### Description
 
-This is the project to create a dataset for grounding of formulae.
-
-As a trial work, this dataset consists of an annotated long paper (20 pages in
-PDF):
-
-* Simeone, O.: A Very Brief Introduction to Machine Learning with Applications
-to Communication Systems. IEEE Transactions on Cognitive Communications and
-Networking 4(4) (2018)
-
-The original XHTML file of the paper was taken from the [arXMLiv:08.2018
-dataset](/resources/arxmliv-dataset-082018/), and we manually annotated all
-937 identifiers (i.e., `<mi>` tags) in the document to the corresponding
-mathematical objects (meanings).
+This dataset is a ground truth of formula grounding annotation data for 15
+scientific papers. More specifically, a total of 12,352 math identifiers were
+annotated with their referring mathematical concepts, explicitly indicating
+coreference relations within each article. A total of 938 text spans, called
+grounding sources, that were used as the basis for human grounding were
+labeled.
 
 The annotation is performed with our open-source annotation tool
 [MioGatto](https://github.com/wtsnjp/MioGatto). The tool is also suitable for
@@ -40,5 +33,5 @@ viewing the data. Please refer to its documentation for the details.
 
 ### Download
 
-[Download link](https://gl.kwarc.info/SIGMathLing/grounding-dataset-v1)
+[Download link](https://gl.kwarc.info/SIGMathLing/grounding-dataset)
 ([SIGMathLing members](/member/) only)
-- 
GitLab