% Commit f2e58d63, authored by Andreas Schärtl (parent 09c4f6dc)
% report: conclusion: open questions
\section{Conclusions and Next Steps}\label{sec:conclusion}
\subsection{An Additional Harvester Component} % copied from introduction.tex
Using the software stack explained in
Section~\ref{sec:implementation}, we were able to take existing
RDF exports and import them into a GraphDB database. This made it
possible to create the applications and examples of
Section~\ref{sec:applications}. We showed that organizational
knowledge formulated as ULO triplets can already provide some
insight: in particular, it is possible to formulate queries for meta
information such as authorship and contribution and to resolve the
interlinks between proofs and theorems. These examples also showed
that existing ULO~exports take advantage of only a subset of
ULO~predicates, something to keep in mind for future exports and, in
particular, something developers of applications built on top of ULO
have to be aware of.
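
To make such queries concrete, the following sketch models the two
lookups mentioned above over plain Python tuples standing in for RDF
triplets. The predicate IRIs (a \texttt{justifies} link between proof
and theorem, a Dublin Core \texttt{creator} predicate) and all data
are illustrative assumptions for this sketch, not a definitive
rendering of the actual export vocabulary.

```python
# Toy in-memory triple store; tuples stand in for RDF triplets.
# Namespaces and data below are illustrative, not taken from real exports.
ULO = "http://example.org/ulo#"   # placeholder namespace
DCT = "http://purl.org/dc/terms/"

triples = [
    ("ex:proof1",   ULO + "justifies", "ex:theorem1"),
    ("ex:theorem1", DCT + "creator",   "Jane Doe"),
    ("ex:proof1",   DCT + "creator",   "John Roe"),
]

def query(store, s=None, p=None, o=None):
    """Return all triples matching the given pattern (None = wildcard)."""
    return [(ts, tp, to) for (ts, tp, to) in store
            if s in (None, ts) and p in (None, tp) and o in (None, to)]

# Resolve the interlink: which theorem does ex:proof1 justify,
# and who authored that theorem?
(_, _, theorem), = query(triples, s="ex:proof1", p=ULO + "justifies")
(_, _, author),  = query(triples, s=theorem, p=DCT + "creator")
print(theorem, author)  # → ex:theorem1 Jane Doe
```

In a real deployment the same pattern would be expressed as a SPARQL
query against the GraphDB endpoint; the wildcard-matching semantics
are the same.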

These are the three components realized for
\emph{ulo-storage}. In addition to them, one could think of a fourth,
a \emph{Harvester} component. Above we assumed that the ULO triplets
are already available in RDF~format. This is not necessarily the
case. It might be desirable to automate the export from third party
formats to ULO, and we think this should be the job of a Harvester
component. It would fetch mathematical knowledge from some remote
source and then provide a volatile stream of ULO data to the
Collecter, which passes it on to the Importer and so on. The big
advantage of such an approach is that exports from third party
libraries would always be up to date and would not have to be
initiated manually. Another advantage of this hypothetical component
is that running exports through the Harvester exercises the whole
import chain of Collecter and Importer, which includes
syntax~checking of the exported RDF data. Bugs in exporters that
produce faulty XML would thus be found earlier in development.
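
The chain of hypothetical Harvester, Collecter, and Importer can be
sketched as a pipeline of generators. The component names follow the
report; the function bodies and the crude IRI check are illustrative
assumptions, not the \emph{ulo-storage} implementation.

```python
# Sketch of the hypothetical Harvester -> Collecter -> Importer chain.
# Function bodies are illustrative stand-ins for the real components.
import re

def harvest(remote_exports):
    """Hypothetical Harvester: stream ULO triples from remote sources."""
    for export in remote_exports:
        yield from export  # in reality: fetch, convert to ULO, stream

def collect(triple_stream):
    """Collecter stage: basic sanity check before import."""
    iri = re.compile(r"^\w+:")  # crude check, stands in for real RDF parsing
    for s, p, o in triple_stream:
        if not (iri.match(s) and iri.match(p)):
            raise ValueError(f"malformed triple: {(s, p, o)}")
        yield (s, p, o)

def import_into_store(triple_stream, store):
    """Importer stage: load checked triples into the database."""
    store.extend(triple_stream)
    return store

# One hypothetical remote export with a single triple.
exports = [[("ex:a", "ulo:defines", "ex:b")]]
store = import_into_store(collect(harvest(exports)), [])
print(len(store))  # → 1
```

Because every triple passes through \texttt{collect} before it reaches
the store, a faulty export fails loudly at import time rather than
silently producing an incomplete data set.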

\subsection{Open Questions}

For this conclusion, we want to recap four different problems we
encountered while working on \emph{ulo-storage}.

The first problem was that of malformed RDF~exports: various exports
contained invalid URIs and wrong namespaces. As a workaround, we
provided on-the-fly correction, but of course this does not solve the
problem itself. A proposed long term solution is to automate the
export from third party libraries (e.g.\ Coq, Isabelle) to ULO
triplets in a database, eliminating the intermediate step of XML files
on disk. During import into the database, the data is thoroughly
checked and mistakes are reported right away, so bugs in exporters
that produce faulty XML would be caught early in development.
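
The kind of on-the-fly correction described above can be sketched as
follows. The concrete rewrite rule and the percent-escaping are
illustrative assumptions for this sketch; they are not the exact fixes
applied by \emph{ulo-storage}.

```python
# Minimal sketch of on-the-fly URI repair: rewrite a known-bad
# namespace and percent-escape characters that make a URI invalid.
# The rewrite table below is a hypothetical example.
from urllib.parse import quote

BAD_TO_GOOD_NS = {
    "http://example.org/wrong#": "http://example.org/ulo#",
}

def repair_uri(uri: str) -> str:
    # Step 1: replace wrong namespaces with the intended ones.
    for bad, good in BAD_TO_GOOD_NS.items():
        if uri.startswith(bad):
            uri = good + uri[len(bad):]
    # Step 2: percent-escape invalid characters (e.g. spaces) in the
    # fragment, leaving the namespace part untouched.
    head, sep, tail = uri.rpartition("#")
    return head + sep + quote(tail) if sep else quote(uri, safe=":/")

print(repair_uri("http://example.org/wrong#some term"))
# → http://example.org/ulo#some%20term
```

The drawback, as noted above, is that such rules only paper over the
symptoms; the faulty exporter itself remains unfixed.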

The second problem is that of versioning triplets in the GraphDB
store. While adding new triplets to an existing GraphDB store is not a
problem, updating existing triplets is difficult. \emph{ulo-storage}
solves this by simply re-creating the GraphDB data set at regular
intervals. This does work, but raises questions about future
scalability. That said, it might be difficult to find an efficient
alternative. Tagging each triplet with a version number doubles the
number of triplets that need to be stored and will undoubtedly make
imports into the database more costly. Perhaps re-creating the index
is in fact the best solution.
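
To illustrate the versioning alternative discussed above, the
following sketch tags every stored statement with a version number and
answers queries from the newest version only. This is a hypothetical
design for illustration, not part of \emph{ulo-storage}; note that
each logical triplet now carries an extra version tag, which is the
storage overhead the text refers to.

```python
# Hypothetical versioned store: every statement carries a version
# number, and queries return the object from the newest version.
versioned = [
    # (subject, predicate, object, version)
    ("ex:thm1", "dcterms:creator", "Jane Doe",    1),
    ("ex:thm1", "dcterms:creator", "Jane A. Doe", 2),  # later correction
]

def latest(store, s, p):
    """Return the object of (s, p) in the newest version, if any."""
    matches = [(v, o) for (ts, tp, o, v) in store if ts == s and tp == p]
    return max(matches)[1] if matches else None

print(latest(versioned, "ex:thm1", "dcterms:creator"))  # → Jane A. Doe
```

Every update adds rather than replaces a statement, so the store grows
with each correction, whereas periodic re-creation keeps only the
current state.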

The third problem is that of missing predicates in existing ULO
exports. The upper level ontology boasts a total of almost
80~predicates, yet only a third of them are actually used by the Coq
and Isabelle exports. A developer writing queries that take advantage
of the full ULO~vocabulary might be surprised that no data comes
back. This illustrates the difficulty of designing an ontology that is
both concise and expressive. While it is all well and good to
recommend that writers of exports use the full set of predicates,
using the full set might simply not make sense for a given third party
library. We think it is a bit too early to argue for the removal of
particular predicates; rather, it might be better to look at future
export projects and then evaluate which predicates are used and which
are not.
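
The predicate-usage evaluation suggested above amounts to a simple
counting pass over an export. In the sketch below, both the
three-predicate vocabulary excerpt and the sample export are made-up
placeholders; a real run would load the full ULO vocabulary and an
actual export.

```python
# Count how often each predicate occurs in an export and report
# which vocabulary predicates go unused. Data below is illustrative.
from collections import Counter

ULO_VOCABULARY = {"ulo:justifies", "ulo:defines", "ulo:uses"}  # excerpt only

export = [
    ("ex:p1", "ulo:justifies", "ex:t1"),
    ("ex:d1", "ulo:defines",   "ex:c1"),
    ("ex:p2", "ulo:justifies", "ex:t2"),
]

used = Counter(p for (_, p, _) in export)
unused = ULO_VOCABULARY - used.keys()
print(sorted(used.items()))  # usage count per predicate
print(sorted(unused))        # → ['ulo:uses']
```

Run over several future export projects, such counts would give an
empirical basis for deciding which predicates to keep, document more
prominently, or eventually retire.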

Finally, we encountered problems regarding how knowledge should be
represented at all. Our example showed that a concept such as an
algorithm might be representable using existing logic concepts. This
is surely tempting for the theorist, but it might not be the most
practical choice. The question here is what ULO is supposed to be. Is
it supposed to be a kind of \emph{machine language} for representing
knowledge? If that is the case, it may well be reasonable to represent
algorithms and other advanced concepts in terms of basic logic. This,
however, would require, we conjecture, a language on top of ULO that
makes this machine language representation available as a high level
language understood by the majority of users. If, on the other hand,
ULO~already is that high level language, it is not unreasonable to
extend the ontology with the concept of algorithms and so on.