---
layout: system
menu_title: arXMLiv
title: arXMLiv
start: 2006
pillar: semantization
people: mkohlhase,dginev
---

The Cornell e-print arXiv contains one of the largest corpora of scientific literature in the world. Unfortunately, its contents are locked up in TeX/LaTeX sources, a format designed for typesetting rather than machine processing, which makes them nearly useless for knowledge management techniques. We translate the corpus to XML to provide a basis for uncovering its structural semantics.