Commit 2127bd46
Authored 4 years ago by Andreas Schärtl
report: review stats query
Parent: b27054b8

Showing 2 changed files with 70 additions and 28 deletions:
doc/report/applications.tex (+53 −28)
doc/report/references.bib (+17 −0)

doc/report/applications.tex (+53 −28)
 \section{Applications}\label{sec:applications}
-With endpoints in place, we can now query the ULO/RDF data set. This
-section describes some experiments with the \emph{ulo-endpoint}
-Endpoint {API}. In particular, we query the storage backend for some
-statistics, implement some queries suggested for tetrapodal search
-\subsection{Exploring Existing Data Sets}
-As previously stated, there already exist exports to ULO for both
-Isabelle and Coq libraries~\cite{uloisabelle, ulocoq}. As a very first
-application, we simply look at what ULO predicates are actually used
-by the respective data sets. Implementing such a query is not very
-difficult. In SPARQL, this can be achieved with the \texttt{COUNT}
-aggregate.
+With programming endpoints in place, we can now query the data set containing
+both Isabelle and Coq exports stored in {GraphDB}. We experimented with
+various queries and applications:
+\begin{itemize}
+    \item Exploring which ULO predicates are actually used and which
+      remain unused (Section~\ref{sec:expl}).
+    \item We ran some queries that were suggested as building blocks
+      of a larger tetrapodal search system (Section~\ref{sec:tetraq}).
+    \item We also experimented with various other more general queries
+      for organizational data recommended in literature
+      (Section~\ref{sec:miscq}).
+    \item Finally, we built a small web front end that visualizes
+      the ULO data set (Section~\ref{sec:webq}).
+\end{itemize}
+For each example query or application, we try to describe how to
+implement it, what results we observed and if possible we conclude
+with some recommendations for future development of {ULO}.
+\subsection{Exploring Existing Data Sets}\label{sec:expl}
+As a very first application, we simply looked at what ULO predicates
+are actually used by the respective data sets. With more than
+250~million triplets in the store, we hoped that this would give us
+some insight into the kind of knowledge we are dealing with.
+Implementing a query for this job is not very difficult. In SPARQL,
+this can be achieved with the \texttt{COUNT} aggregate.
 \begin{lstlisting}
 PREFIX ulo: <https://mathhub.info/ulo#>
...
@@ -27,7 +46,7 @@ This yields a list of all used predicates with \texttt{?count} being
 the number of occurrences. Looking at the results, we find that both
 the Isabelle and the Coq data sets only use subsets of the predicates
 provided by the ULO ontology. The results are listed in
-figure~\ref{fig:used}. In both cases, the exports use less than a
+Figure~\ref{fig:used}. In both cases, the exports use less than a
 third of the available predicates.
 \input{applications-ulo-table.tex}
...
@@ -37,19 +56,23 @@ predicates. For example, the Isabelle contains organizational meta
 information such as information about paragraphs and sections in the
 source document while the Coq export only tells us about the filename
 of the Coq source. That is not particularly problematic as long as we
-can trace a given object back to the original Isabelle/Coq source.
-However, our results do show that both exports have their own
-particularities and with more and more third party libraries exported
-to ULO one can assume that this heterogeneity only grows. In particular
-we want to point to the large number of predicates which remain unused
-in both Isabelle and Coq exports. A user formulating queries for ULO
-might be oblivious to the fact that only subsets of exports support
-given predicates. While not a problem for \emph{ulo-storage} per se,
-we expect this to be a major challenge when building a system of
-tetrapodal search.
-\subsection{Querying for Tetrapodal Search}
+can trace a given object back to the original source. Regardless, our
+results do show that both exports have their own particularities and
+with more and more third party libraries exported to ULO one has to
+assume that this heterogeneity will only grow. In particular, we want
+to point to the large number of predicates which remain unused in both
+Isabelle and Coq exports. A user formulating queries for ULO might be
+oblivious to the fact that only subsets of exports support given
+predicates.
+While not a problem for \emph{ulo-storage} per se, we do expect this
+to be a challenge when building a tetrapodal search
+system. Recommended ways around this ``missing fields'' problem in
+database literature include the clever use of default values or
+inference of missing values~\cite{kdisc, aidisc}, neither of which
+feels particularly applicable to a ULO data set.
+\subsection{Querying for Tetrapodal Search}\label{sec:tetraq}
 \emph{ulo-storage} was started with the goal of making organizational
 knowledge available for tetrapodal search. We will first take a look
...
@@ -259,6 +282,8 @@ proof of concept implementations.
     handled by the database access should be quick.
 \end{itemize}
-\subsection{Other Queries}
+\subsection{Organizational Queries}\label{sec:miscq}
 \emph{{TODO}: SPARQL Queries references in ULO paper}
+\subsection{Experience with Building a Web Frontend}\label{sec:webq}
doc/report/references.bib (+17 −0)
...
@@ -154,3 +154,20 @@
   year={2013},
   publisher={" O'Reilly Media, Inc."}
 }
+
+@article{kdisc,
+  title={Knowledge discovery in databases: An overview},
+  author={Frawley, William J and Piatetsky-Shapiro, Gregory and Matheus, Christopher J},
+  journal={AI magazine},
+  volume={13},
+  number={3},
+  pages={57--57},
+  year={1992}
+}
+
+@inproceedings{aidisc,
+  title={Discovering Missing Values in Semi-Structured Databases.},
+  author={Yi, Xing and Allan, James and Lavrenko, Victor},
+  booktitle={RIAO},
+  year={2007}
+}
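
Note on the predicate statistics query: the paragraph added to doc/report/applications.tex says that counting predicate usage can be done with the SPARQL COUNT aggregate, but the listing itself is elided in this diff. The following is a minimal Python sketch of what such a query could look like, not the query from the report. The query text, the FILTER on the ULO namespace, the helper names build_request and count_predicates, the sample predicate names and any endpoint URL are assumptions made for illustration. count_predicates performs the same grouping in memory, so the logic can be checked without a running GraphDB instance.

```python
from collections import Counter
from urllib.parse import urlencode
from urllib.request import Request

# Namespace taken from the PREFIX line visible in the diff.
ULO = "https://mathhub.info/ulo#"

# Assumed query shape: the real listing is elided in the diff.
# It groups all triples by predicate and counts each group.
PREDICATE_COUNT_QUERY = """
PREFIX ulo: <https://mathhub.info/ulo#>
SELECT ?predicate (COUNT(?predicate) AS ?count)
WHERE {
    ?s ?predicate ?o .
    FILTER(STRSTARTS(STR(?predicate), STR(ulo:)))
}
GROUP BY ?predicate
ORDER BY DESC(?count)
"""


def build_request(endpoint_url, query=PREDICATE_COUNT_QUERY):
    """Build (but do not send) a SPARQL protocol POST request."""
    body = urlencode({"query": query}).encode("utf-8")
    headers = {
        "Content-Type": "application/x-www-form-urlencoded",
        "Accept": "application/sparql-results+json",
    }
    return Request(endpoint_url, data=body, headers=headers, method="POST")


def count_predicates(triples, namespace=ULO):
    """In-memory equivalent of the GROUP BY / COUNT query above.

    `triples` is an iterable of (subject, predicate, object) strings.
    Returns (predicate, count) pairs, most frequent first.
    """
    counts = Counter(p for (_s, p, _o) in triples if p.startswith(namespace))
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the Isabelle or Coq exports.
    sample = [
        ("a", ULO + "uses", "b"),
        ("a", ULO + "uses", "c"),
        ("b", ULO + "sourceref", "file.thy"),
        ("a", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "t"),
    ]
    for predicate, count in count_predicates(sample):
        print(predicate, count)
```

To run the query against a real store, the request returned by build_request can be passed to urllib.request.urlopen. The endpoint URL depends on the installation; for a local GraphDB instance it would typically have the form http://localhost:7200/repositories/ followed by the repository name.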