Skip to content
Snippets Groups Projects
components.tex 2.82 KiB
Newer Older
  • Learn to ignore specific revisions
  • \section{Components}\label{sec:components}
    
    With ULO/RDF files in place and available for download on
    MathHub~\cite{uloisabelle, ulocoq} as Git repositories, we have the
    aim of making the underlying data available for use in applications.
    It makes sense to first identify out the various components that might
    be involved in such a system. They are illustrated in
    figure~\ref{fig:components}.
    
    
    \begin{figure}[]\begin{center}
        \includegraphics{figs/components}
        \caption{Components involved in the \emph{ulo-storage} system.}\label{fig:components}
    \end{center}\end{figure}
    
    
    \begin{itemize}
    \item ULO/RDF data is present on various locations, be it Git
    
      repositories, available on web servers or on the local disk.
      Regardless where this ULO/RDF data is stored, it is the job of a
      \emph{Collecter} to assemble these {ULO/RDF}~files and forward them
      for further processing. This may involve cloning a Git repository or
      crawling a file system.
    
    \item With streams of ULO/RDF files assembled bye the Collecter, this
      data then gets passed to an \emph{Importer}. The Importer uploads
      the received RDF~streams into some kind of permanent storage. For
      use in this project, the GraphDB~\cite{graphdb} triplet store was
      natural fit.
    
      In practice, both Collecter and Importer end up being one piece of
      software, but this does not have to be the case.
    
    
    \item Finally, with all triplets stored in a database, an
      \emph{Endpoint} is where applications access the underlying
      knowledge base. This does not necessarily need to be any specific
      software, rather the programming API of the database could be
    
      understood as an endpoint of its own. However, some thought can be
      put into designing an Endpoint that is more convenient to use than
      the one provided by the database.
    
    \end{itemize}
    
    
    Additionally to these components, one could think of a \emph{Harvester} component.
    We assumed that the ULO/RDF triplets are already available in RDF~format.
    Indeed for this project this is the case as we work with already
    
    exported triplets from the Isabelle and Coq libraries. However, this
    
    does not need to be the case.
    
    It might be desirable to automate the export from third party formats
    to ULO/RDF and indeed this is what a Harvester would do.  It fetches
    mathematical knowledge from some remote source and then provides a
    volatile stream of ULO/RDF data to the Collecter, which then passes it
    to the Importer and so on. The big advantage of such an approach would
    be that exports from third party libraries can always be up to date
    and do not have to be initiated manually. However, due to time
    constraints this idea was not followed up upon in this project.
    
    Each component (Collecter, Importer, Endpoint) will later be discussed
    in more detail. This section serves only for the reader to get a
    general understanding of the developed infrastructure and its
    topology.