    \section{Components}\label{sec:components}
    
    With ULO/RDF files available for download from
    MathHub~\cite{uloisabelle, ulocoq} as Git repositories, our aim is
    to make the underlying data available for use in applications.
    We first identify the components that such a system might
    involve. They are illustrated in
    figure~\ref{fig:components}.
    
    \begin{figure}[]\begin{center}
        \includegraphics{figs/components}
        \caption{Components involved in the \emph{ulo-storage} system.}\label{fig:components}
    \end{center}\end{figure}
    
    \begin{itemize}
    \item ULO data is present in various locations, be it in Git
      repositories, on web servers or on the local disk.
      Regardless of where this ULO data is stored, it is the job of a
      \emph{Collecter} to assemble these {RDF}~files and forward them
      for further processing. This may involve cloning a Git repository or
      crawling a file system.
    
    \item With streams of ULO files assembled by the Collecter, this
      data then gets passed to an \emph{Importer}. An Importer uploads
      received RDF~streams into some kind of permanent storage. For
      use in this project, the GraphDB~\cite{graphdb} triplet store was
      a natural fit.
    
      In this project, both Collecter and Importer ended up being one piece
      of software, but this does not have to be the case.
    
    \item Finally, with all triplets stored in a database, an
      \emph{Endpoint} is where applications access the underlying
      knowledge base. This does not need to be a dedicated piece of
      software; the programming interface of the underlying
      database itself could be understood as an endpoint of its
      own.
    
      Regardless, it is worth considering an Endpoint designed as a
      layer between application and database that offers a more
      convenient interface than the one provided by the database itself.
    \end{itemize}
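
    To make the division of labor between these components more concrete,
    the following Python sketch models them in a few lines. It is only an
    illustration under stated assumptions and not the implementation used in
    \emph{ulo-storage}: all names (\texttt{collect}, \texttt{import\_files},
    \texttt{InMemoryStore}, \texttt{Endpoint}, \texttt{demo}) are
    hypothetical, the Collecter only crawls a local directory for
    \texttt{*.rdf} files, and a dictionary stands in for the permanent
    storage.

\begin{verbatim}
import tempfile
from pathlib import Path
from typing import Dict, Iterable, Iterator, List


def collect(root: Path) -> Iterator[Path]:
    """Collecter: crawl a directory tree and yield every *.rdf file."""
    for path in sorted(root.rglob("*.rdf")):
        if path.is_file():
            yield path


class InMemoryStore:
    """Stand-in for permanent storage (assumption: a real system
    would talk to a database here instead of a dictionary)."""

    def __init__(self) -> None:
        self.files: Dict[str, bytes] = {}

    def upload(self, name: str, data: bytes) -> None:
        self.files[name] = data


def import_files(paths: Iterable[Path], root: Path,
                 store: InMemoryStore) -> int:
    """Importer: upload each collected file, keyed by its path
    relative to root. Returns the number of uploaded files."""
    count = 0
    for path in paths:
        store.upload(path.relative_to(root).as_posix(), path.read_bytes())
        count += 1
    return count


class Endpoint:
    """Endpoint: thin convenience layer between application and store."""

    def __init__(self, store: InMemoryStore) -> None:
        self.store = store

    def list_files(self) -> List[str]:
        return sorted(self.store.files)


def demo() -> List[str]:
    """Run the whole flow on a temporary directory with made-up files."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "sub").mkdir()
        (root / "a.rdf").write_text("<rdf:RDF/>")
        (root / "sub" / "b.rdf").write_text("<rdf:RDF/>")
        (root / "notes.txt").write_text("not RDF, skipped by collect")
        store = InMemoryStore()
        import_files(collect(root), root, store)
        return Endpoint(store).list_files()
\end{verbatim}

    Because \texttt{collect} is a generator, the Importer consumes files as
    they are found, which mirrors the idea of passing a stream from one
    component to the next. Keeping the two as separate functions also shows
    that they can live in one program without being the same component.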
    
    These are the components realized for \emph{ulo-storage}. In addition
    to these components, one could think of a \emph{Harvester} component.
    So far we have assumed that the ULO triplets are already available in
    RDF~format. For this project this was indeed the case, as we work with
    already exported triplets from the Isabelle and Coq libraries. However,
    this is not necessarily true in general.
    
    It might be desirable to automate the export from third-party formats
    to ULO, and we think this is what a Harvester should do. It fetches
    mathematical knowledge from some remote source and then provides a
    volatile stream of ULO data to the Collecter, which then passes it
    to the Importer and so on. The big advantage of such an approach would
    be that exports from third-party libraries are always up to date
    and do not have to be initiated manually. We did not implement a Harvester
    for \emph{ulo-storage}, but we suggest that it is an idea to keep in mind.
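
    As a rough illustration of what ``volatile stream'' could mean in
    practice, the following Python sketch models a Harvester as a generator
    that converts items on demand. This is an assumption about one possible
    design, not an existing part of \emph{ulo-storage}: \texttt{harvest},
    \texttt{fake\_fetch} and \texttt{fake\_export} are hypothetical names,
    and the export step is a placeholder that does not produce real ULO data.

\begin{verbatim}
from typing import Callable, Iterable, Iterator


def harvest(fetch: Callable[[], Iterable[str]],
            export: Callable[[str], str]) -> Iterator[str]:
    """Harvester: lazily turn third-party items into export documents.

    Both arguments are hypothetical placeholders: fetch retrieves raw
    items from a remote source, export converts one item to a string.
    """
    for item in fetch():
        # Nothing is written to disk; each result exists only in the stream.
        yield export(item)


def fake_fetch() -> Iterable[str]:
    """Stand-in for a remote library; returns two made-up item names."""
    return ["lemma_a", "lemma_b"]


def fake_export(item: str) -> str:
    """Stand-in for a real exporter; emits a stub instead of ULO/RDF."""
    return f"<!-- ULO export of {item} -->"


stream = harvest(fake_fetch, fake_export)
first = next(stream)  # produced only when the consumer asks for it
\end{verbatim}

    A Collecter could consume such a stream directly, so that every import
    run triggers a fresh export instead of relying on files that were
    exported manually at some earlier point.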
    
    Each component we did implement (Collecter, Importer, Endpoint) will
    now be discussed in a separate section. Here we only wanted to give
    the reader a general understanding of the infrastructure that makes
    up \emph{ulo-storage}.