Commit 2c60b424 authored by Andreas Schärtl's avatar Andreas Schärtl
report: towards: review

\section{Towards Manageable Ontologies}
Before finishing up this report with a general conclusion, we want to
first dedicate a section to thoughts on the upper level ontology and
ontology design in general. The contribution of this section is
primarily potential for future work; at this point, the ideas
formulated here lack concrete implementations.
\subsection{Automatic Processing}
Let us first look at some system level concerns. The upper level
ontology and tetrapodal search are certainly in their infancy. As
such it is easy to dismiss concerns about scalability as premature.
Regardless, we encourage research on ULO and tetrapodal search to keep
the greater picture in mind. We believe that a greater tetrapodal
search system can only succeed if the process of export and indexing
is completely automated. Automation, we believe, comes down to two
things: (1)~automation of the export infrastructure and (2)~enabling
automation through machine readability.

\emph{Automation of Exports.} First of all, we believe that the
export of third party libraries into Endpoint storage needs to be
fully automated. We believe this is the case for two major reasons.
First, syntax errors and invalid predicates need to be avoided; at the
very least, automated validators~\cite{rval, rcmd} should be used to
check the validity of ULO~exports. Second, it is unreasonable to
expect a systems administrator to fix each ULO~export in its own
particular way.
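The kind of fully automated check meant here can be sketched in a few
lines. This is only an illustration of the idea behind the cited
validators, not a replacement for them; the ULO namespace URI and the
predicate allowlist below are hypothetical stand-ins:

```python
import re

# Illustration only: the cited validators (rval, rcmd) are the real
# tools for this job. Namespace and predicate list are hypothetical.
ULO = "https://mathhub.info/ulo#"
KNOWN_PREDICATES = {ULO + p for p in ("sourceref", "name", "theorem")}

# One N-Triples-style statement: <subject> <predicate> <object|"literal"> .
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+(<[^>]+>|"[^"]*")\s*\.$')

def check_export(lines):
    """Return (line_number, error) pairs; an empty list means the
    export passed both the syntax and the predicate check."""
    errors = []
    for n, line in enumerate(lines, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        m = TRIPLE.match(line)
        if m is None:
            errors.append((n, "syntax error"))
        elif m.group(2) not in KNOWN_PREDICATES:
            errors.append((n, "unknown predicate " + m.group(2)))
    return errors
```

Running such a check as part of the export pipeline would reject
malformed exports before they ever reach Endpoint storage, instead of
leaving a systems administrator to debug each one by hand.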

\emph{Enabling Automation Through Machine Readability.} The second
problem is one of normalization. The goal of RDF and related
technologies was to make universal machine readable knowledge
available for querying. As such it is necessary to ensure that the
ULO exports we create are machine readable, that is, easy for programs
to interpret. We want to remind the reader of the previously
discussed \texttt{ulo:sourceref} dilemma (Section~\ref{sec:exp}). It
required special domain knowledge about the specific export for us to
resolve a source reference to actual source code. A machine readable
approach would instead decide on a fixed format for fields such
as \texttt{ulo:sourceref}. This would make it easy for application
implementors to take full advantage of any ULO knowledge base.
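To make this concrete, suppose a fixed format for
\texttt{ulo:sourceref} had been agreed upon, say
\texttt{<file-URI>\#L<start>-L<end>}. The format is entirely made up
for this sketch, but it shows the payoff: one parser, written once,
resolves source references from every export:

```python
import re
from typing import NamedTuple, Optional

class SourceRef(NamedTuple):
    file_uri: str
    start_line: int
    end_line: int

# Hypothetical fixed format for ulo:sourceref: <file-URI>#L<start>-L<end>
SOURCEREF = re.compile(r"^(?P<uri>[^#]+)#L(?P<start>\d+)-L(?P<end>\d+)$")

def parse_sourceref(ref: str) -> Optional[SourceRef]:
    """Resolve a source reference without export-specific knowledge;
    returns None for references that ignore the fixed format."""
    m = SOURCEREF.match(ref)
    if m is None:
        return None
    return SourceRef(m.group("uri"), int(m.group("start")), int(m.group("end")))
```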

Infrastructure that runs without the need for outside intervention and
a machine readable knowledge base can lay the groundwork for a
greater tetrapodal search system.
\subsection{The Challenge of Universality}
While system level concerns must not be discarded, we believe they are
a small problem compared to the challenge of ontology design as a
whole. Remember that ULO aims to be a universal language for
formulating organizational mathematical knowledge. This is an
outstandingly grand task: ULO aims at nothing less than a universal
schema on top of all collected (organizational) mathematical
knowledge.
The current version of ULO already yields worthwhile results when
formal libraries are exported to ULO~triplets. Especially when it
comes to metadata, querying such data sets proved to be easy. But an
ontology such as~ULO can only be a real game changer when it is truly
universal, that is, when it is easy to formulate any kind of
organizational knowledge in the form of a ULO data set.
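The ease of metadata queries mentioned above can be illustrated with a
small sketch. The triples and predicates below are invented; an actual
deployment would run SPARQL against a triple store, but the pattern
matching is the same in spirit:

```python
# Invented ULO-style triples; a real export is queried with SPARQL
# against a triple store rather than with Python lists.
TRIPLES = [
    ("ex:thm1", "rdf:type", "ulo:theorem"),
    ("ex:thm1", "ulo:name", "binomial_theorem"),
    ("ex:def1", "rdf:type", "ulo:definition"),
    ("ex:def1", "ulo:name", "binomial_coefficient"),
]

def query(triples, s=None, p=None, o=None):
    """Match (s, p, o) patterns; None acts as a wildcard, much like a
    variable in a SPARQL basic graph pattern."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Names of all theorems": join the type pattern with the name pattern.
theorems = {s for s, _, _ in query(TRIPLES, p="rdf:type", o="ulo:theorem")}
theorem_names = [o for s, _, o in query(TRIPLES, p="ulo:name") if s in theorems]
```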
As such it should not be terribly surprising that ULO forgoes the
requirement of being absolutely correct. For example, what

While that is not the hardest pill to swallow, it would be preferable
to maintain organizational knowledge in a format that is both (1)~as
correct as possible and (2)~easy to generalize and search. Future
development of the upper level ontology first needs to be very clear
on where it wants to position itself on this spectrum between accuracy
and generalizability.

As an upper level ontology, we believe that ULO is best positioned to
favor generality at the cost of accuracy. It can serve as a
generalized way of indexing vast amounts of formal knowledge, making
it easy to discover and connect.
\subsection{A Layered Knowledge Architecture}

both. We can have our cake and eat it too.
Current exports investigated in this report take the approach of
taking some library of formal knowledge and then converting that
library directly into ULO triplets. Perhaps a better approach would be
to use a \emph{layered architecture} instead. The idea is sketched out
in Figure~\ref{fig:love}. In this layered architecture, we would first
convert a given third party library into triplets defined by an
intermediate ontology. These triplets could then be compiled to
ULO~triplets for search. It is an approach not unlike intermediate
byte codes used in compiler construction~\cite[pp.~357]{dragon}. While
lower layers preserve more detail, higher levels are more general and
easier to search.
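The compilation step of such a layered architecture can be sketched as
a predicate mapping. The intermediate \texttt{coq:} vocabulary below
is invented for illustration; the point is that several precise
lower-layer notions deliberately collapse into one coarser ULO notion,
trading detail for searchability:

```python
# Hypothetical mapping from a detailed intermediate ontology down to
# ULO; several precise notions collapse into one general one.
INTERMEDIATE_TO_ULO = {
    "coq:Lemma": "ulo:theorem",
    "coq:Theorem": "ulo:theorem",
    "coq:Definition": "ulo:definition",
}

def compile_to_ulo(triples):
    """Compile intermediate-layer triples to the ULO layer. Triples
    the upper ontology cannot express are dropped here; the lower
    layer still preserves them in full detail."""
    out = []
    for s, p, o in triples:
        if p == "rdf:type" and o in INTERMEDIATE_TO_ULO:
            out.append((s, "rdf:type", INTERMEDIATE_TO_ULO[o]))
    return out
```

As with intermediate byte code, the lossy step is intentional: queries
against the upper layer stay simple precisely because the fine
distinctions of each source library have been compiled away.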

A valid criticism of this would be that we can understand the base
library as an ontology of its own. In practice, the only difference
is the file format. While formal libraries are formulated in some
domain specific formal language, when we talk about ontologies, our
understanding is that of OWL~ontologies, that is, RDF~predicates with
which knowledge is formulated. But RDF is easier to index using triple
store databases such as GraphDB, and it should be easier to architect
a search system based around a unified format~(RDF) rather than a zoo
of formats and languages.

But a final judgment requires further investigation. Either way, we
find it necessary to take the accuracy-generalizability spectrum into
account and investigate how this spectrum can be served by different
layers of ontologies.