report: towards: review

2c60b424 · Andreas Schärtl · b11197ea · 2c60b424
Commit 2c60b424 authored 4 years ago by Andreas Schärtl
--- a/doc/report/towards.tex
+++ b/doc/report/towards.tex
@@ -2,43 +2,45 @@
 \section{Towards Manageable Ontologies}
 Before finishing up this report with a general conclusion, we want to
-dedicate a section to our thoughts on the upper level ontology and
+first dedicate a section to thoughts on the upper level ontology and
-ontology design in general. Primarily we offer the potential for
+ontology design in general. The contribution of this section is
-interesting future work.
+primarily that of potential for future work. At this point in time,
+the ideas formulated here lack concrete implementations.
 \subsection{Automatic Processing}
-ULO and tetrapodal search are most certainly in their infancy.  As
+Let us first look at some system level concerns.  It is true that the
-such it is easy to dismiss concerns about scalability as premature.
+upper level ontology and tetrapodal search are most certainly in their
-Regardless, we encourage research on ULO and tetrapodal search to keep
+infancy.  As such it is easy to dismiss concerns about scalability as
-the greater picture in mind. We believe that a greater tetrapodal
+premature.  Regardless, we encourage research on ULO and tetrapodal
-search system can only succeed if the process of export and indexing
+search to keep the greater picture in mind. We believe that a greater
-is completely automated. Automation, we believe, come downs to two
+tetrapodal search system can only succeed if the process of export and
-things, (1)~automation of the export infrastructure and (2)~enabling
+indexing is completely automated. Automation, we believe, comes down
-automation through machine readability.
+to two things, (1)~automation of the export infrastructure and
+(2)~enabling automation through machine readability.
-\begin{description}
-    \item[Fully Automated Checks] First of all, we believe that the
+\emph{Automation of Exports.} First of all, we believe that the
-    export of third party library into Endpoint storage needs to be
+export of third party libraries into Endpoint storage needs to be fully
-    fully automated. We believe this is the for two major
+automated. We believe this is the case for two major reasons. First of all,
-    reason. First of all, syntax errors and invalid predicates need to
+syntax errors and invalid predicates need to be avoided. It is
-    be avoided. It is unreasonable to expect a systems administrator
+unreasonable to expect a systems administrator to fix each ULO~export
-    to fix each ULO~export in its one particular way. At the very
+in its own particular way. At the very least, automated
-    least, automated validators~\cite{rval, rcmd} should be used to
+validators~\cite{rval, rcmd} should be used to check the validity of
-    check the validity of ULO~exports.
+ULO~exports.
-    \item[Well Defined Formats] The second problem is one of ontology
+\emph{Enabling Automation Through Machine Readability.} The second
-    design. The goal of RDF and related technologies was to have
+problem is one of normalization. The goal of RDF and related
-    universal machine readable knowledge available for querying. As
+technologies was to have universal machine readable knowledge
-    such it is necessary to make efforts that the ULO exports we
+available for querying. As such it is necessary to make efforts such
-    create are machine readable. Here we want to remind the reader of
+that ULO exports we create are machine readable, that is, it is easy
-    the previously discussed \texttt{ulo:sourceref} dilemma
+for programs to interpret the encoded knowledge. We want to remind the
-    (Section~\ref{sec:exp}). It required special domain knowledge
+reader of the previously discussed \texttt{ulo:sourceref} dilemma
-    about the specific export for us to resolve a source reference to
+(Section~\ref{sec:exp}). It required special domain knowledge about
-    actual source code. A machine readable approach would be to
+the specific export for us to resolve a source reference to actual
-    instead decide on a fixed format for field such
+source code. A machine readable approach would be to instead decide on
-    as \texttt{ulo:sourceref}.
+a fixed format for fields such as \texttt{ulo:sourceref}. This makes it
-\end{description}
+easy for application implementors to take full advantage of any
+ULO knowledge base.
 Infrastructure that runs without the need of outside intervention and
 a machine readable knowledge base can lay out the groundwork for a
@@ -46,18 +48,19 @@ greater tetrapodal search system.
 \subsection{The Challenge of Universality}
-We should remember that ULO aims to be a universal format for
+While system level concerns must not be discarded, we believe they are
-formulating organizational mathematical knowledge.  Maybe it is time
+a small problem compared to the challenge of ontology design as a
-to reflect on what an outstandingly grand task this actually is. With
+whole.  Remember that ULO aims to be a universal language for
-ULO, we are aiming for nothing less than an universal schema on top of
+formulating organizational mathematical knowledge.  An outstandingly
-all collected (organizational) mathematical knowledge.
+grand task.  ULO aims at nothing less than a universal schema on top
+of all collected (organizational) mathematical knowledge.
 The current version of ULO already yields worthwhile results when
 formal libraries are exported to ULO~triplets. Especially when it
 comes to metadata, querying such data sets proved to be easy. But an
 ontology such as~ULO can only be a real game changer when it is truly
 universal, that is, when it is easy to formulate any kind of
-organizational knowledge in form of an ULO data set.
+organizational knowledge in the form of a ULO data set.
 As such it should not be terribly surprising that ULO forgoes the
 requirement of being absolutely correct. For example, what
@@ -72,11 +75,13 @@ While that is not the hardest pill to swallow, it would be preferable
 to maintain organizational knowledge in a format that is both (1)~as
 correct as possible and (2)~easy to generalize and search. Future
 development of the upper level ontology first needs to be very clear
-on where it wants to be on this spectrum between accuracy and
+on where it wants to position itself on this spectrum between accuracy
-generalizability.  We believe that ULO is best positioned as an
+and generalizability.
-ontology that is more general, at the cost of accuracy. It can serve
-as a generalized way of indexing vast amounts of formal knowledge,
+As an upper level ontology, ULO is, we believe, best
-making it easy to discover and connect.
+positioned as an ontology that favors generality at the cost of
+accuracy. It can serve as a generalized way of indexing vast amounts
+of formal knowledge, making it easy to discover and connect.
 \subsection{A Layered Knowledge Architecture}
@@ -98,27 +103,26 @@ both. We can have our cake and eat it it too.
 Current exports investigated in this report take the approach of
 taking some library of formal knowledge and then converting that
 library directly into ULO triplets. Perhaps a better approach would be
-to use a \emph{layered architecture} instead. In this layered
+to use a \emph{layered architecture} instead. The idea is sketched out
-architecture, we would first convert a given third party library into
+in Figure~\ref{fig:love}. In this layered architecture, we would first
-triplets defined by an intermediate ontology. These triplets could
+convert a given third party library into triplets defined by an
-then be compiled to ULO~triplets for search. It is an approach not
+intermediate ontology. These triplets could then be compiled to
-unlike intermediate byte codes used in compiler
+ULO~triplets for search. It is an approach not unlike intermediate
-construction~\cite[pp.~357]{dragon}. While lower layers preserve more
+byte codes used in compiler construction~\cite[pp.~357]{dragon}. While
-detail, higher levels are more general and easier to search.
+lower layers preserve more detail, higher levels are more general and
+easier to search.
-A valid criticism of this proposed approach for future experiments is
-that we could understand the base library as an ontology of its own.
+A valid criticism of this would be that we can understand the base
-In practice, the only difference is the file format. While formal
+library as an ontology of its own.  In practice, the only difference
-libraries are formulated in some domain specific formal language, when
+is the file format. While formal libraries are formulated in some
-we talk about ontologies, our canonical understanding is that of
+domain specific formal language, when we talk about ontologies, our
-OWL~ontologies, that is RDF~predicates with which knowledge is
+understanding is that of OWL~ontologies, that is, RDF~predicates with
-formulated.
+which knowledge is formulated. But RDF is easier to index using triple
+store databases such as {GraphDB}. And it should be easier to
-A first retort to this must be that RDF is easier to index using
+architect a search system based around a unified format~(RDF)
-triple store databases such as GraphDB and that it will probably e
+rather than a zoo of formats and languages.
-easier to architecture a system based around a unified format~(RDF)
-rather than a zoo of formats and languages. But a final judgment
+But a final judgment requires further investigation. Either way, we
-requires further investigation. Either way, we believe it to be
+find it is necessary to take the accuracy-generalizability spectrum
-worthwhile to consider the accuracy-generalizability spectrum into
+into account and investigate how this spectrum can be serviced with
-account and investigate how this spectrum can be serviced with
 different layers of ontologies.
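
Review note: the layered architecture proposed in the last hunk could be sketched roughly as follows. This is only an illustration under stated assumptions, not an implementation from the report. Triples are modeled as plain string tuples, the `ex:` predicates stand in for a hypothetical intermediate ontology, and the mapping table to `ulo:uses` is invented for this example.

```python
from typing import Dict, Iterable, Iterator, Tuple

# A triple is (subject, predicate, object), all plain strings here.
Triple = Tuple[str, str, str]

# Hypothetical mapping from detailed intermediate predicates to more
# general upper level predicates. All names are illustrative assumptions.
GENERALIZE: Dict[str, str] = {
    "ex:appliesLemma": "ulo:uses",
    "ex:unfoldsDefinition": "ulo:uses",
}


def compile_to_upper(triples: Iterable[Triple]) -> Iterator[Triple]:
    """Compile lower layer triples into upper layer triples.

    Predicates without an upper level counterpart are dropped, and
    triples that become identical after generalization are emitted once.
    """
    seen = set()
    for subject, predicate, obj in triples:
        general = GENERALIZE.get(predicate)
        if general is None:
            continue  # detail that only exists in the lower layer
        upper = (subject, general, obj)
        if upper not in seen:
            seen.add(upper)
            yield upper


lower_layer = [
    ("thm:a", "ex:appliesLemma", "lem:b"),
    ("thm:a", "ex:unfoldsDefinition", "lem:b"),
    ("thm:a", "ex:tacticCount", "12"),
]
upper_layer = list(compile_to_upper(lower_layer))
```

In this sketch, two distinct lower layer statements collapse into a single, more general upper layer triple, and the third has no upper level counterpart at all. That is the accuracy versus generalizability trade-off the section describes: the lower layer keeps the detail, the upper layer is easier to search.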