    \section{Applications}\label{sec:applications}
    
    With programming endpoints in place, we can now query the data set containing
    both Isabelle and Coq exports stored in {GraphDB}. We experimented with
    various queries and applications:
    
    \begin{itemize}
        \item Exploring which ULO predicates are actually used in the
          existing Coq and Isabelle exports.  We find that more than two
          thirds of existing ULO predicates were not taken advantage of
          (Section~\ref{sec:expl}).
    
        \item We investigated queries that could be used to extend the
          system into a larger tetrapodal search system. While some
          organizational queries have obvious canonical solutions, others
          raise questions about how organizational knowledge should be
          structured (Section~\ref{sec:tetraq}).
    
        \item We also experimented with various other more general queries
          for organizational data recommended in literature
          (Section~\ref{sec:miscq}).
    
        \item Finally, we built a small web front end that visualizes
          the ULO data set (Section~\ref{sec:webq}).
    \end{itemize}
    
    \noindent Each application will now be discussed in a dedicated section.
    
    \subsection{Exploring Existing Data Sets}\label{sec:expl}
    
    For our first application, we looked at which ULO predicates are
    actually used by the respective data sets. With more than 250~million
    triplets in the store, we hoped that this would give us some insight
    into the kind of knowledge we are dealing with.
    
    Implementing a query for this job is not very difficult. In SPARQL,
    this can be achieved with the \texttt{COUNT} aggregate; the full query
    is given verbatim in Figure~\ref{fig:preds-query}.  This yields a
    list of all used predicates with \texttt{?count} being the number of
    occurrences (Figure~\ref{fig:preds-result}). Looking at the results,
    we find that both the Isabelle and the Coq data sets only use subsets
    of the predicates provided by the ULO ontology. The full results are
    listed in Appendix~\ref{sec:used}. In both cases, what stands out is
    that each export uses less than a third of the available predicates.
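
    For orientation, a query of this shape can be sketched as follows. This
    is only an illustration of the \texttt{COUNT} aggregate combined with
    grouping, not necessarily the exact query used; the variable names
    \texttt{?s}, \texttt{?predicate} and \texttt{?o} as well as the
    descending sort order are assumptions, and only \texttt{?count} is
    taken from the text above.

    \begin{verbatim}
    # Count how often each predicate occurs across all triples.
    SELECT ?predicate (COUNT(?predicate) AS ?count)
    WHERE {
        ?s ?predicate ?o .
    }
    GROUP BY ?predicate
    ORDER BY DESC(?count)
    \end{verbatim}

    Because the pattern matches every triple in the store, such a query
    also counts predicates from outside the ULO namespace; those rows
    would have to be filtered out or ignored when comparing against the
    ontology.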
    
    We also see that the Isabelle and Coq exports use different
    predicates.  For example, the Isabelle export contains organizational meta
    information such as information about paragraphs and sections in the
    source document while the Coq export only tells us about the filename
    of the Coq source. That is not particularly problematic as long as we
    can trace a given object back to the original source.  Regardless, our
    results do show that both exports have their own particularities, and
    with more and more third-party libraries exported to ULO, one has to
    assume that this heterogeneity will only grow. In particular, we want
    to point to the large number of predicates which remain unused in both
    Isabelle and Coq exports. A user formulating queries for ULO might be
    oblivious to the fact that only subsets of exports support given
    predicates.
    
    While not a problem for \emph{ulo-storage} per se, we do expect this
    to be a challenge when building a tetrapodal search
    system. Recommended ways around this ``missing fields'' problem in
    database literature include the clever use of default values or
    inference of missing values~\cite{kdisc, aidisc}, neither of which
    feels particularly applicable to an ULO data set.
    
    \input{applications-preds.tex}
    
    \subsection{Querying for Tetrapodal Search}\label{sec:tetraq}