diff --git a/doc/report/applications.tex b/doc/report/applications.tex
index 0e179c27cac58beba1a1deaa16849196838a6279..4549842506d504cc6e14236898b71decad7b648e 100644
--- a/doc/report/applications.tex
+++ b/doc/report/applications.tex
@@ -79,12 +79,14 @@ increases.
 The second application we want to discuss illustrates how to browse
 an ULO data set interactively. Here, we developed a web front end
 that allows users to browse contributions to the knowledge base sorted by
-author. In this particular case, we used the RDF4J Endpoint, the
-application itself is implemented in Java. Figure~\ref{fig:appss}
-shows four screenshots of the current version. Once the user has
-selected a given contribution, we list some basic information about
-the selected object, such as type (e.g.\ lemma or corollary) and name
-as well as references to other objects in the knowledge base.
+author. Figure~\ref{fig:appss} shows four screenshots of the current
+version; a public demo is available at \url{https://ulordf4j.mathhub.info}.
+
+In this particular case, we used the RDF4J Endpoint; the application
+itself is implemented in Java. Once the user has selected a given
+contribution, we list some basic information about the selected
+object, such as type (e.g.\ lemma or corollary) and name as well as
+references to other objects in the knowledge base.

 \input{applications-screenshots.tex}

@@ -99,15 +101,15 @@ original source code. In particular, we took a look at the
 \texttt{ulo:sourceref} predicate defined for many objects in the
 knowledge base. \texttt{ulo:sourceref} is supposed to contain ``the
 URI of the physical location (e.g., file/URI, line, column) of the
-source code that introduced the subject''~\cite{uloonto}; the Isabelle
-export. But while making the connection from exported ULO~object to
+source code that introduced the subject''~\cite{uloonto}.
+But while making the connection from exported ULO~object to
 original Isabelle document is convenient for the user, this feature
 required some extra work from us as application implementors. The
 reason for this is that the format of the \texttt{ulo:sourceref}
 property appears to be not well-defined, rather it is up to
 implementors of library exporters how to format these source
-references. For example, in the Isabelle export, we have to translate
-source references such as
+references. For example, in the Isabelle export~\cite{uloisabelle}, we
+have to translate source references such as
 \begin{verbatim}
 https://isabelle.in.tum.de/source/HOL/HOL-Algebra/
 HOL-Algebra.Group.theory#17829.576.8:17836.576.15
@@ -162,16 +164,15 @@ formal proofs for difficulty. On the other hand, there is very
 specific research on rating proofs for concrete problems such as the
 satisfiability of a given CNF formula~\cite{proofsat}.

-\emph{Organizational Aspect.} A first working hypothesis
-might be to assume that elementary proofs are short. In that case, the
-size, that is the number of bytes to store the proof, is our first
-indicator of proof complexity. This is by no means perfect, as even
-identical proofs can be represented in different ways that have vastly
-different size in bytes. It might be tempting to imagine a unique
-normal form for each proof, but finding such a normal form may very
-well be impossible. As it is very difficult to find a generalized
-definition of proof difficulty, we will accept proof size as a first
-working hypothesis.
+\emph{Organizational Aspect.} A working hypothesis might be to assume
+that elementary proofs are short. In that case, the size, that is, the
+number of bytes to store the proof, is our first indicator of proof
+complexity. This is by no means perfect, as even identical proofs can
+be represented in different ways that have vastly different sizes in
+bytes. It might be tempting to imagine a unique normal form for each
+proof, but finding such a normal form may very well be impossible. As
+it is very difficult to find a generalized definition of proof
+difficulty, we will accept proof size as a first working hypothesis.

 {ULO} offers the \texttt{ulo:external-size} predicate which will allow
 us to sort by file size. Maybe proof complexity also leads to quick

@@ -223,7 +224,7 @@ need to compute scores for all relevant objects first.
 In \emph{ulo-storage}, all scores we experimented with were easy
 enough and the data sets small enough such that this did not become a
 concrete problem. But in a larger tetrapodal search system, caching
-lazily or ahead of time computed results will probably we a
+results that were computed lazily or ahead of time will probably be a
 necessity. Which component takes care of keeping this cache is not
 clear right now, but we advocate for keeping caches of previously
 computed scores separate from the core \emph{ulo-storage} Endpoint
diff --git a/doc/report/conclusion.tex b/doc/report/conclusion.tex
index 933caf3f8f11813d49448f73d50f8bf481f92b55..76da74d07d9e30906fef6eaf65c464f7806551e2 100644
--- a/doc/report/conclusion.tex
+++ b/doc/report/conclusion.tex
@@ -42,11 +42,11 @@
 Finally, perhaps the most interesting but also most challenging
 question is the discussion about how to develop ULO, the universal
 ontology for organizational knowledge. On one hand, ULO boasts a
 total of almost 80~predicates, yet only a third of them are actually
-used by existing Coq and Isabelle exports. A developer writing queries
-that take advantage of the full ULO~vocabulary might be surprised that
-not data is coming back. On the other hand, we found it difficult to
-model algorithms in terms of ULO predicates and argued that it might
-be necessary to further extend the upper level ontology. It will be
+used by existing exports. A developer writing queries that take
+advantage of the full ULO~vocabulary might be surprised that no data
+is coming back. On the other hand, we found it difficult to model
+algorithms in terms of ULO predicates and argued that it might be
+necessary to further extend the upper level ontology. It will be
 necessary to look at the properties of future exports to make a more
 well-funded decision.
diff --git a/doc/report/implementation.tex b/doc/report/implementation.tex
index 434d52365820671d16711075b75ee3500a0315f5..ad4dd3c1901bceb55887ea16809499dfa31c1911 100644
--- a/doc/report/implementation.tex
+++ b/doc/report/implementation.tex
@@ -26,7 +26,7 @@ up the infrastructure provided by \emph{ulo-storage}.
   the file system.

 \item With streams of ULO files assembled by the Collector, these
-  streams then gets passed to an \emph{Importer}. The Importer then
+  streams then get passed to an \emph{Importer}. The Importer then
   uploads RDF~streams into some kind of permanent storage. As we
   will see, the GraphDB~\cite{graphdb} triple store was a natural fit.

@@ -88,7 +88,7 @@ passed to the Importer which imports the encoded RDF triplets into
 some kind of persistent storage. In theory, multiple implementations
 of this Importer are possible, namely different implementations for
 different database backends. As we will see in
-Section~\ref{sec:endpoints}, for our projected we selected the GraphDB
+Section~\ref{sec:endpoints}, for our project we selected the GraphDB
 triple store alone. The Importer merely needs to make the necessary
 API~calls to import the RDF stream into the database. As such the
 import itself is straight-forward, our software only needs to upload

@@ -206,8 +206,8 @@ recent versions of the SPARQL query language~\cite{graphdbsparql} and
 OWL~Reasoning~\cite{owlspec, graphdbreason}. In particular, this means
 that GraphDB offers support for transitive queries as described in
 previous work on~ULO~\cite{ulo}. A transitive query is one that,
-given a relation~$R$, asks for the transitive closure~$S$
-(Figure~\ref{fig:tc}) of~$R$.
+given a relation~$R$, asks for the transitive closure~$S$ of~$R$
+(Figure~\ref{fig:tc}).

 \input{implementation-transitive-closure.tex}

diff --git a/doc/report/introduction.tex b/doc/report/introduction.tex
index e198187f43a2f44adb87dfaf9d550cac6142cf4d..99d02a5b5c95a92f4e0dbfbaf8e1edc22b23574e 100644
--- a/doc/report/introduction.tex
+++ b/doc/report/introduction.tex
@@ -57,10 +57,9 @@ knowledge easy to query. These components can make up building
 blocks of a larger tetrapodal search system. Their design and
 implementation are discussed in Section~\ref{sec:implementation}.
 Second, (2)~we ran prototype applications and queries on top of this
-infrastructure. While the applications themselves are admittedly not
-very useful in itself, they can give us insight about future
-development of the upper level ontology and related schemas. These
-applications and queries are the focus of
-Section~\ref{sec:applications}. A summary of encountered problems and
-suggestions for next steps concludes this report in
+infrastructure. While the applications are not useful in themselves,
+they can give us insight into future development of the upper level
+ontology and related schemas. These applications and queries are the
+focus of Section~\ref{sec:applications}. A summary of encountered
+problems and suggestions for next steps concludes this report in
 Section~\ref{sec:conclusion}.
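
The queries touched on in the patched passages (ordering objects by
\texttt{ulo:external-size}, asking for the transitive closure of a relation)
can be issued against the Endpoint through the same RDF4J API that the Java
front end already uses. The following is a minimal sketch, not part of the
report's code base: the repository URL, the \texttt{ulo:} namespace IRI, and
the \texttt{ulo:contains} relation in the property-path query are
illustrative assumptions; only the \texttt{ulo:external-size} predicate is
taken from the text.
\begin{verbatim}
// Sketch only: the endpoint URL, the ulo: namespace IRI, and the
// ulo:contains relation are illustrative assumptions.
import org.eclipse.rdf4j.query.BindingSet;
import org.eclipse.rdf4j.query.TupleQueryResult;
import org.eclipse.rdf4j.repository.RepositoryConnection;
import org.eclipse.rdf4j.repository.sparql.SPARQLRepository;

public class EndpointQueries {
    // Placeholder GraphDB repository endpoint and ULO prefix.
    private static final String ENDPOINT =
        "http://localhost:7200/repositories/ulo";
    private static final String PREFIX =
        "PREFIX ulo: <https://mathhub.info/ulo#>\n";

    public static void main(String[] args) {
        SPARQLRepository repo = new SPARQLRepository(ENDPOINT);
        repo.init();

        // (1) Organizational aspect: order objects by ulo:external-size,
        // smallest first, as a crude proxy for "elementary" proofs.
        String bySize = PREFIX
            + "SELECT ?obj ?size WHERE { ?obj ulo:external-size ?size }\n"
            + "ORDER BY ASC(?size) LIMIT 10";

        // (2) Transitive query: the closure of a relation R is the SPARQL 1.1
        // property path R+; ulo:contains is only a stand-in relation here.
        String closure = PREFIX
            + "SELECT ?x ?y WHERE { ?x ulo:contains+ ?y } LIMIT 10";

        try (RepositoryConnection conn = repo.getConnection()) {
            for (String query : new String[] { bySize, closure }) {
                try (TupleQueryResult result =
                         conn.prepareTupleQuery(query).evaluate()) {
                    while (result.hasNext()) {
                        BindingSet row = result.next();
                        System.out.println(row);
                    }
                }
            }
        }
        repo.shutDown();
    }
}
\end{verbatim}
Because the closure is written as a SPARQL~1.1 property path, this form of
the transitive query runs on any SPARQL~1.1 endpoint and does not depend on
GraphDB's OWL reasoning being enabled.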