diff --git a/doc/report/introduction.tex b/doc/report/introduction.tex index e80811f67601fa6ea2d6f7ab06a765aa355f23cb..09831480116364a1c0336df6852ee2f7da40c6c2 100644 --- a/doc/report/introduction.tex +++ b/doc/report/introduction.tex @@ -9,17 +9,17 @@ is that computerization will yield great improvement to formal research by making the results of all collected publications readily available and easy to search. -One research topic in this field is the idea of a \emph{tetrapodal -search} that combines four distinct areas of mathematical knowledge. +One topic of research in this field is the idea of a \emph{tetrapodal + search} that combines four distinct areas of mathematical knowledge. These four kinds being (1)~the actual formulae as \emph{symbolic -knowledge}, (2)~examples and concrete objects as \emph{concrete knowledge}, -(3)~names and comments as \emph{narrative knowledge} and finally -(4)~identifiers, references and their relationships, referred to as -\emph{organizational knowledge}~\cite{tetra}. + knowledge}, (2)~examples and concrete objects as \emph{concrete + knowledge}, (3)~names and comments as \emph{narrative knowledge} and +finally (4)~identifiers, references and their relationships, referred +to as \emph{organizational knowledge}~\cite{tetra}. Tetrapodal search aims to provide a unified search engine that indexes -each of the four different subsets of mathematical knowledge. Because -all four kinds of knowledge are inherently different in their +each of these four different subsets of mathematical knowledge. +Because all four kinds of knowledge are inherently unique in their structure, tetrapodal search proposes that each kind of mathematical knowledge should be made available in a storage backend that fits the kind of data it is providing. With all four areas available for @@ -40,26 +40,27 @@ available~\cite{afpexport, uloisabelle, ulocoq}. The resulting data set is already quite large, the Isabelle export alone containing more than 200~million triplets. -Existing exports from Isabelle and Coq result in a set of XML~files +Existing exports from Isabelle and Coq result in sets of XML~files that contain RDF~triplets. This is a convenient format for exchange and easily tracked using version control systems such as Git~\cite{gitpaper} as employed by MathHub~\cite{mh}. However, considering the vast number of triplets, it is impossible to query easily and efficiently in this state. This is what \emph{ulo-storage} -is focused on: Making ULO data sets accessible for querying and +is focused on, that is making ULO data sets accessible for querying and analysis. We collected RDF files spread over different Git repositories, imported them into a database and then experimented with APIs for accessing that data set. The main contribution of \emph{ulo-storage} is twofold. First, (1)~we -built up various infrastructure components for making organizational -knowledge queryable. These components can make up building blocks of -a larger tetrapodal search system. Their design and implementation are -discussed in Section~\ref{sec:implementation}. Second, (2)~we ran -sample prototype applications and queries on top of this -interface. While the applications themselves are admittedly not very -useful in itself, they can give us insight about future development of -the upper level ontology. These applications and queries are the focus -of Section~\ref{sec:applications}. A summary of encountered problems -and suggestions for next steps concludes this report in +built up various infrastructure components that make organizational +knowledge easy to query. These components can make up building blocks +of a larger tetrapodal search system. Their design and implementation +are discussed in Section~\ref{sec:implementation}. Second, (2)~we ran +prototype applications and queries on top of this +infrastructure. While the applications themselves are admittedly not +very useful in itself, they can give us insight about future +development of the upper level ontology and related schemas. These +applications and queries are the focus of +Section~\ref{sec:applications}. A summary of encountered problems and +suggestions for next steps concludes this report in Section~\ref{sec:conclusion}.