Skip to content
Snippets Groups Projects
introduction.tex 3.68 KiB
Newer Older
  • Learn to ignore specific revisions
  • \section{Introduction}\label{sec:introduction}
    
    To tackle the vast array of mathematical publications, various ways of
    \emph{computerizing} mathematical knowledge have been experimented
    
    with. Already it is difficult for human mathematicians to keep even a
    subset of all mathematical knowledge in their mind, a problem referred
    to as the ``one brain barrier'' in literature~\cite{onebrain}.  A hope
    is that computerization will yield great improvement to formal
    research by making the results of all collected publications readily
    available and easy to search.
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
    One topic of research in this field is the idea of a \emph{tetrapodal
      search} that combines four distinct areas of mathematical knowledge.
    
    These four kinds being (1)~the actual formulae as \emph{symbolic
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
      knowledge}, (2)~examples and concrete objects as \emph{concrete
    
      knowledge}, (3)~prose, names and comments as \emph{narrative knowledge} and
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
    finally (4)~identifiers, references and their relationships, referred
    to as \emph{organizational knowledge}~\cite{tetra}.
    
    
    Tetrapodal search aims to provide a unified search engine that indexes
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
    each of these four different subsets of mathematical knowledge.
    Because all four kinds of knowledge are inherently unique in their
    
    structure, tetrapodal search proposes that each subset of mathematical
    
    knowledge should be made available in a storage backend that fits the
    kind of data it is providing. With all four areas available for
    querying, tetrapodal search intends to then combine the four indexes
    
    into a single query interface. Currently, research aims at providing
    schemas, storage backends and indexes for the four different kinds of
    mathematical knowledge. In this context, the focus of
    \emph{ulo-storage} is the area of organizational knowledge.
    
    A previously proposed way to structure organizational knowledge is the
    
    \emph{upper level ontology} (ULO)~\cite{ulo}. ULO takes the form of an
    
    OWL~ontology~\cite{uloonto} and as such all organization knowledge is
    stored as RDF~triplets with a unified schema of
    
    ULO~predicates~\cite{owl}.  Some effort has been made to export
    
    existing collections of formal mathematical knowledge to {ULO}. In
    
    particular, exports from Isabelle and Coq-based libraries are
    
    available~\cite{afpexport, uloisabelle, ulocoq}. The resulting data
    set is already quite large, the Isabelle export alone containing more
    than 200~million triplets.
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
    Existing exports from Isabelle and Coq result in sets of XML~files
    
    that contain RDF~triplets. This is a convenient format for exchange
    
    and easily tracked using version control systems such as
    Git~\cite{gitpaper} as employed by MathHub~\cite{mh}. However,
    considering the vast number of triplets, it is impossible to query
    easily and efficiently in this state. This is what \emph{ulo-storage}
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
    is focused on, that is making ULO data sets accessible for querying and
    
    analysis. We collected RDF files spread over different Git
    repositories, imported them into a database and then experimented with
    APIs for accessing that data set.
    
    The contribution of \emph{ulo-storage} is twofold. First, (1)~we built
    up various infrastructure components that make organizational
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
    knowledge easy to query.  These components can make up building blocks
    of a larger tetrapodal search system. Their design and implementation
    are discussed in Section~\ref{sec:implementation}.  Second, (2)~we ran
    prototype applications and queries on top of this
    
    Andreas Schärtl's avatar
    Andreas Schärtl committed
    infrastructure. While the applications are not useful in itself, they
    can give us insight about future development of the upper level
    ontology and related schemas. These applications and queries are the
    focus of Section~\ref{sec:applications}. A summary of encountered
    problems and suggestions for next steps concludes this report in
    
    Section~\ref{sec:conclusion}.