Commit 243f1296 authored by Andreas Schärtl

report: set up for tetra applications

parent 400fc60a
@@ -5,102 +5,75 @@ With endpoints in place, we can now query the ULO/RDF
 data set. Depending on the kind of application, different interfaces
 and approaches to querying the database might make sense.
-\subsection{Kinds of Applications}
-Storing information in RDF triplets allows for any kind of query,
-meaning that it is not optimized for any particular kind of
-application. For the sake of this project, we tried out three
-categories of applications.
-\begin{itemize}
-\item Of course the initial starting point for this project was
-the idea of tetrapodal search. Our first application
-\emph{ulosearch} tries to offer an easy way of searching in the
-ULO/RDF data set.
-\item With lots of data in a database, it appears attractive to
-visualize the data set in some kind of graphical way in the
-\emph{ulovisualize} application.
-\item Finally, we want to experiment a bit. The available ULO/RDF
-data sets are about proofs and theorems and should include links
-between them. It might be interesting to find out which proofs and
-definitions are more important than others such that we can
-create a kind of ranking of them. This is explored in the
-\emph{ulorate} application.
-\end{itemize}
-\subsection{Database Interface}
-For integrating the ULO/RDF data set into an existing application, it
-probably is reasonable to directly query the data set using RDF4J.
-That is, of course, assuming the existing code base is based on the
-{JVM}. If that is not the case, generating SPARQL queries is the
-obvious choice.
-The advantage of this approach is that connecting and interacting
-with the database is straightforward. The disadvantage is that this
-approach requires a deep understanding of the structure of the
-underlying ULO triplets.
-\subsection{A Language for Organizational Data}
-ULO/RDF is a subset of RDF. While it can be queried as just standard
-RDF data, it may be helpful to design a query language only for
-ULO/RDF triplets. Expressions in this particular query language could
-then be converted to SPARQL or RDF4J expressions. Ideally this means
-that (1)~the query language is intuitive and easy to use for this
-specific use case and (2)~execution is still fast as the underlying
-SPARQL database is already highly optimized.
-% This does not really fit, in general this entire section is kind
-% of a mess and contains more stuff about things that do not even
-% exist yet than actual information.
 \subsection{Querying for Tetrapodal Search}
-The first introduction of tetrapodal search contains various queries
-that such a system should answer~\cite{tetra}. For each of the
-suggested queries, we tested how well ULO/RDF can answer it. As
-\emph{ulo-storage} provides only one of the four kinds of
-mathematical knowledge required by tetrapodal search, not all
-queries could be answered. They are replicated here verbatim.
-\begin{enumerate}
-\item Find theorems with non-elementary proofs.
-\item Find algorithms that solve $NP$-complete graph problems.
-\item Find integer sequences whose generating function is a rational
-polynomial in $\sin(x)$ that has a Maple implementation not affected
-by the bug in module~$x$.
-\item $CAS$ implementation of Groebner bases that conform to a
-definition in AFP.
-\item Find all group representations that are good for~$X$ (say a
-software engineer working on something and doesn't know group
-theory), maybe ``computing with in/finite groups''.
-\item Math software systems that implement algorithms from MSC48CXX
-(or that compute a particular thing).
-\item All areas of math that {Nicolas G.\ de Bruijn} has worked in and
-his main contributions.
-\item All the researchers that have worked on problem~$X$ (where~$X$
-does not have a good name, maybe connected to ``Go'').
-\item Areas of mathematics that immediate descendants of~$X$ worked
-on.
-\item All graphs whose order is larger than the publication record of
-its ``inventor'' (name patron).
-\item Integer sequences that grow sub-exponentially.
-\item Published integer sequences not listed in the OEIS.
-\item Find all polynomials whose list of coefficients occurs as a
-subsequence of a specific OEIS sequence.
-\end{enumerate}
+\emph{ulo-storage} was started with the goal of making organizational
+knowledge available for tetrapodal search. We will first take a look
+at how ULO/RDF performs at this task. Conveniently, various queries
+for a tetrapodal search system were suggested in~\cite{tetra}; we will
+investigate how well each of the suggested queries~$\mathcal{Q}_{1}$
+to~$\mathcal{Q}_{13}$ can be realized with ULO/RDF data sets. Where
+possible, we evaluate proof-of-concept implementations.
+\subsubsection*{$\mathcal{Q}_{1}$ Find theorems with non-elementary proofs.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{2}$ Find algorithms that solve $NP$-complete graph problems.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{3}$ Find integer sequences whose generating function is a rational
+polynomial in $\sin(x)$ that has a Maple implementation not affected
+by the bug in module~$x$.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{4}$ $CAS$ implementation of Groebner bases that conform to a
+definition in AFP.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{5}$ Find all group representations that are good for~$X$ (say a
+software engineer working on something and doesn't know group
+theory), maybe ``computing with in/finite groups''.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{6}$ Math software systems that implement algorithms from MSC48CXX
+(or that compute a particular thing).}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{7}$ All areas of math that {Nicolas G.\ de Bruijn} has worked in and
+his main contributions.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{8}$ All the researchers that have worked on problem~$X$ (where~$X$
+does not have a good name, maybe connected to ``Go'').}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{9}$ Areas of mathematics that immediate descendants of~$X$ worked
+on.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{10}$ All graphs whose order is larger than the publication record of
+its ``inventor'' (name patron).}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{11}$ Integer sequences that grow sub-exponentially.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{12}$ Published integer sequences not listed in the OEIS.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{13}$ Find all polynomials whose list of coefficients occurs as a
+subsequence of a specific OEIS sequence.}
+Here be dragons
+% This file contains sections and paragraphs cut from the report.
+% I'll delete them eventually, but for now I'd like to have them
+% available for referencing.
+\subsection{Kinds of Applications}
+Storing information in RDF triplets allows for any kind of query,
+meaning that it is not optimized for any particular kind of
+application. For the sake of this project, we tried out three
+categories of applications.
+\begin{itemize}
+\item Of course the initial starting point for this project was
+the idea of tetrapodal search. Our first application
+\emph{ulosearch} tries to offer an easy way of searching in the
+ULO/RDF data set.
+\item With lots of data in a database, it appears attractive to
+visualize the data set in some kind of graphical way in the
+\emph{ulovisualize} application.
+\item Finally, we want to experiment a bit. The available ULO/RDF
+data sets are about proofs and theorems and should include links
+between them. It might be interesting to find out which proofs and
+definitions are more important than others such that we can
+create a kind of ranking of them. This is explored in the
+\emph{ulorate} application.
+\end{itemize}
+\subsection{Database Interface}
+For integrating the ULO/RDF data set into an existing application, it
+probably is reasonable to directly query the data set using RDF4J.
+That is, of course, assuming the existing code base is based on the
+{JVM}. If that is not the case, generating SPARQL queries is the
+obvious choice.
+The advantage of this approach is that connecting and interacting
+with the database is straightforward. The disadvantage is that this
+approach requires a deep understanding of the structure of the
+underlying ULO triplets.
+\subsection{A Language for Organizational Data}
+ULO/RDF is a subset of RDF. While it can be queried as just standard
+RDF data, it may be helpful to design a query language only for
+ULO/RDF triplets. Expressions in this particular query language could
+then be converted to SPARQL or RDF4J expressions. Ideally this means
+that (1)~the query language is intuitive and easy to use for this
+specific use case and (2)~execution is still fast as the underlying
+SPARQL database is already highly optimized.
+% This does not really fit, in general this entire section is kind
+% of a mess and contains more stuff about things that do not even
+% exist yet than actual information.