report: rework structure

The contribution of this project is twofold, so we now have two big chapters, one for each contribution: (1) Implementation and (2) Applications.

Commit c44d4786 authored 4 years ago by Andreas Schärtl
--- a/doc/report/collecter.tex
+++ b/doc/report/collecter.tex
-\section{Collecter and Importer}\label{sec:collecter}
-
-\emph{TODO}
-
--- a/doc/report/conclusion.tex
+++ b/doc/report/conclusion.tex
-\section{Conclusion and Future Work}\label{sec:conclusion}
+\section{Conclusions and Next Steps}\label{sec:conclusion}
+
+\subsection{An Additional Harvester Component} % copied from introduction.tex
+
+These are the three components realized for
+\emph{ulo-storage}. However, in addition to these components, one
+could think of a \emph{Harvester} component.  Above we assumed that
+the ULO triplets are already available in RDF~format.  This is not
+necessarily true.  It might be desirable to automate the export from
+third party formats to ULO and we think this should be the job of a
+Harvester component.  It fetches mathematical knowledge from some
+remote source and then provides a volatile stream of ULO data to the
+Collecter, which then passes it to the Importer and so on. The big
+advantage of such an approach would be that exports from third party
+libraries can always be up to date and do not have to be initiated
+manually. Another advantage of this hypothetical component is that
+running exports through the Harvester exercises the whole import chain
+of Collecter and Importer, which includes syntax~checking of the
+exported RDF data. Bugs in exporters that produce faulty XML would be
+found earlier in development.
--- a/doc/report/endpoints.tex
+++ b/doc/report/endpoints.tex
-\section{Endpoints}\label{sec:endpoints}
+\section{Implementation}\label{sec:implementation}
+
+\subsection{Components Implemented for \emph{ulo-storage}}\label{sec:components}
+
+With RDF files exported and available for download as Git repositories
+on MathHub, we have the goal of making the underlying data available
+for use in applications.  It makes sense to first identify the various
+components that might be involved in such a system.
+Figure~\ref{fig:components} illustrates all components and their
+relationships.
+
+\begin{figure}[]\begin{center}
+    \includegraphics{figs/components}
+    \caption{Components involved in the \emph{ulo-storage} system.}\label{fig:components}
+\end{center}\end{figure}
+
+\begin{itemize}
+\item ULO triplets are present in various locations, be it Git
+  repositories, on web servers or the local disk.  It is the job of a
+  \emph{Collecter} to assemble these {RDF}~files and forward them for further
+  processing. This may involve cloning a Git repository or crawling
+  the file system.
+
+  \item With streams of ULO files assembled by the Collecter, this
+  data then gets passed to an \emph{Importer}. An Importer uploads
+  RDF~streams into some kind of permanent storage. For
+  use in this project, the GraphDB~\cite{graphdb} triplet store was
+  a natural fit.
+
+  For this project, both Collecter and Importer ended up being one
+  piece of monolithic software, but this does not have to be the case.
+
+\item Finally, with all triplets stored in a database, an
+  \emph{Endpoint} is where applications access the underlying
+  knowledge base. This does not necessarily need to be any custom
+  software, rather the programming interface of the underlying
+  database itself could be understood as an endpoint of its own.
+
+  Regardless, some thought can be put into designing an Endpoint as a
+  layer that lives between application and database that is more
+  convenient to use than the one provided by the database. It comes
+  down to the programming interface we wish to provide to a developer
+  using this system.
+\end{itemize}
+
+Collecter, Importer and Endpoint provide us with an easy and automated
+way of making RDF files ready for use with applications.  In this
+introduction we only wanted to give the reader a general understanding
+of the infrastructure that makes up \emph{ulo-storage}; the following
+sections will explain each component in more detail.
+
+\subsection{Endpoints}\label{sec:endpoints}

 With ULO triplets imported into the GraphDB triplet store by Collecter
 and Importer, we now have all data available necessary for querying.

--- a/doc/report/intro.tex
+++ b/doc/report/intro.tex
-\section{Introduction}\label{sec:introduction}
+\section{Introduction to the \emph{ulo-storage} Project}\label{sec:introduction}

 To tackle the vast array of mathematical
 publications, various ways of \emph{computerizing} mathematical
@@ -26,8 +26,6 @@ kind of data it is providing. With all four areas available for
 querying, tetrapodal search intends to then combine the four indexes
 into a single query interface.

-\subsection{Focus on Organizational Knowledge}
-
 Currently, research is focused on providing schemas, storage backends
 and indexes for the four different kinds of mathematical
 knowledge. The focus of \emph{ulo-storage} is the area of
@@ -59,73 +57,3 @@ blocks in a larger tetrapodal search system. Second, (2)~we ran sample
 prototype applications and queries on top of this interface. While the
 applications themselves are admittedly not very interesting, they can
 give us insight about future development of the upper level ontology.
-
-\subsection{Components Implemented for \emph{ulo-storage}}\label{sec:components}
-
-With RDF files exported and available for download as Git repositories
-on MathHub, we have the goal of making the underlying data available
-for use in applications.  It makes sense to first identify the various
-components that might be involved in such a system.
-Figure~\ref{fig:components} illustrates all components and their
-relationships.
-
-\begin{figure}[]\begin{center}
-    \includegraphics{figs/components}
-    \caption{Components involved in the \emph{ulo-storage} system.}\label{fig:components}
-\end{center}\end{figure}
-
-\begin{itemize}
-\item ULO triplets are present in various locations, be it Git
-  repositories, on web servers or the local disk.  It is the job of a
-  \emph{Collecter} to assemble these {RDF}~files and forward them for further
-  processing. This may involve cloning a Git repository or crawling
-  the file system.
-
-  \item With streams of ULO files assembled by the Collecter, this
-  data then gets passed to an \emph{Importer}. An Importer uploads
-  RDF~streams into some kind of permanent storage. For
-  use in this project, the GraphDB~\cite{graphdb} triplet store was
-  a natural fit.
-
-  For this project, both Collecter and Importer ended up being one
-  piece of monolithic software, but this does not have to be the case.
-
-\item Finally, with all triplets stored in a database, an
-  \emph{Endpoint} is where applications access the underlying
-  knowledge base. This does not necessarily need to be any custom
-  software, rather the programming interface of the underlying
-  database itself could be understood as an endpoint of its own.
-
-  Regardless, some thought can be put into designing an Endpoint as a
-  layer that lives between application and database that is more
-  convenient to use than the one provided by the database. It comes
-  down to the programming interface we wish to provide to a developer
-  using this system.
-\end{itemize}
-
-\subsection{An Additional Harvester Component}
-
-These are the three components realized for
-\emph{ulo-storage}. However, additionally to these components, one
-could think of a \emph{Harvester} component.  Above we assumed that
-the ULO triplets are already available in RDF~format.  This is not
-necessarily true.  It might be desirable to automate the export from
-third party formats to ULO and we think this should be the job of a
-Harvester component.  It fetches mathematical knowledge from some
-remote source and then provides a volatile stream of ULO data to the
-Collecter, which then passes it to the Importer and so on. The big
-advantage of such an approach would be that exports from third party
-libraries can always be up to date and do not have to be initiated
-manually. Another advantage of this hypothetical component is that
-running exports through the Harvester involves the whole import chain
-of Collecter and Importer which involves syntax~checking for the
-exported RDF data. Bugs in exporters that produce faulty XML would be
-found earlier in development.
-
-We did not implement a Harvester for \emph{ulo-storage} but we suggest
-that it is an idea to keep in mind. The components we did implement
-(Collecter, Importer and Endpoint) provide us with an easy and
-automated way of making RDF files ready for use with applications.  In
-this introduction we only wanted to give the reader a general
-understanding in the infrastructure that makes up \emph{ulo-storage},
-the following sections will explain each component in more detail.
--- a/doc/report/report.tex
+++ b/doc/report/report.tex
@@ -50,11 +50,9 @@
 \tableofcontents

 \newpage
-\input{intro.tex}
+\input{introduction.tex}
 \newpage
-\input{collecter.tex}
-\newpage
-\input{endpoints.tex}
+\input{implementation.tex}
 \newpage
 \input{applications.tex}
 \newpage
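
The Collecter, Importer and Endpoint roles described in the moved \emph{Components} subsection can be illustrated with a minimal sketch. It is not the code of \emph{ulo-storage}: the function names, the in-memory store standing in for GraphDB, and the `ex:` identifiers are all assumptions made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Optional, Set, Tuple

# A triple is (subject, predicate, object); plain strings keep the sketch simple.
Triple = Tuple[str, str, str]


def collect(sources: Iterable[Iterable[Triple]]) -> Iterator[Triple]:
    """Collecter role: assemble triples from several sources into one stream."""
    for source in sources:
        yield from source


@dataclass
class InMemoryStore:
    """Hypothetical stand-in for permanent storage such as a triple store."""
    triples: Set[Triple] = field(default_factory=set)


def import_stream(stream: Iterable[Triple], store: InMemoryStore) -> int:
    """Importer role: upload a stream into storage, return how many triples were new."""
    before = len(store.triples)
    store.triples.update(stream)
    return len(store.triples) - before


def query(
    store: InMemoryStore,
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Endpoint role: match triples against a pattern where None is a wildcard."""
    pattern = (subject, predicate, obj)
    return sorted(
        t for t in store.triples
        if all(p is None or p == v for p, v in zip(pattern, t))
    )


if __name__ == "__main__":
    # Two made-up sources, e.g. a cloned repository and a local directory.
    repo = [("ex:pythagoras", "ex:type", "ex:theorem")]
    disk = [
        ("ex:pythagoras", "ex:type", "ex:theorem"),
        ("ex:group", "ex:type", "ex:definition"),
    ]
    store = InMemoryStore()
    print(import_stream(collect([repo, disk]), store))
    print(query(store, obj="ex:theorem"))
```

Keeping the three roles as separate functions mirrors the remark in the text that Collecter and Importer do not have to be one monolithic piece of software, even though they ended up that way in this project.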