Michael Kohlhase authored
arXMLiv.md 559 B
layout: system
title: arXMLiv
teaser: Translating the arXiv to XML/HTML5
start_date: '2006'
people:
- mkohlhase
- dginev
website: http://cortex.mathweb.info
repository: https://github.com/dginev/CorTeX
The Cornell e-print arXiv contains one of the largest corpora of scientific literature in the world. Unfortunately, its contents are locked up in the TeX/LaTeX format, which makes them nearly inaccessible to knowledge management techniques. We translate the corpus to XML to provide a basis for uncovering its structural semantics.