report: set up for tetra applications

Commit 243f1296 authored 4 years ago by Andreas Schärtl
--- a/doc/report/applications.tex
+++ b/doc/report/applications.tex
@@ -5,102 +5,75 @@ With endpoints in place, we can now query the ULO/RDF
 data set. Depending on the kind of application, different interfaces
 and approaches to querying the database might make sense.
-\subsection{Kinds of Applications}
-Storing information in RDF triplets allows for any kind of queries,
-meaning it is not optimized for any kind of application. For the sake
-of this project, we tried out three categories of applications.
-\begin{itemize}
-    \item Of course the initial starting point for this project was
-      the idea of tetrapodal search. Our first application
-      \emph{ulosearch} tires to offer an easy way of searching in the
-      ULO/RDF data set.
-    \item With lots of data in a database, it appears attractive to
-      visualize the data set in some kind graphical way in the
-      \emph{ulovisualize} application.
-    \item Finally, we want to experiment a bit. The available ULO/RDF
-      data sets are about proofs and theorems and should include links
-      between. It might be interesting to find out which proofs and
-      definitions are more important than others such that we can
-      create a kind of ranking of them. This is explored in the
-      \emph{ulorate} application.
-\end{itemize}
-\subsection{Database Interface}
-For integrating the ULO/RDF data set into an existing application, it
-probably is reasonable to directly query the data set using RDF4J.
-That is, of course, assuming the existing co debase is based on the
-{JVM}.  If that is not the case, generating SPARQL queries is the
-obvious choice.
-The advantage of this approach is that connecting and interacting
-with the database is straightforward. The disadvantage is that this
-approach requires a deep understanding of structure of the underlying
-ULO triplets.
-\subsection{A Language for Organizational Data}
-ULO/RDF is a subset of RDF. While it can be queried as just standard
-RDF data, maybe it is helpful to design a query language only for
-ULO/RDF triplets. Expressions in this particular query language could
-then be converted to SPARQL or RDF4J expressions. Ideally this means
-that (1)~the query language is intuitive and easy to use for this
-specific use case and (2)~execution is still fast as the underlying
-SPARQL database is already very optimized.
-% This does not really fit, in general this entire section is kind
-% of a mess and contains more stuff about things that do not even
-% exist yet than actual information.
 \subsection{Querying for Tetrapodal Search}
-The first introduction of tetrapodal search contains various queries
-that such a system should answer~\cite{tetra}. For each of the
-suggested queries, we experimented how well ULO/RDF can answer this
-queries. As \emph{ulo-storage} provides only one of the four required
-kinds of mathematical knowledge required by tetrapodal search, not all
-queries could be answered. They are replicated here in verbatim.
+\emph{ulo-storage} was started with the goal of making organizational
+knowledge available for tetrapodal search. We will first take a look
+at how ULO/RDF performs at this task. Conveniently, various queries
+for a tetrapodal search system were suggested in~\cite{tetra}; we will
+investigate how well each of the suggested queries~$\mathcal{Q}_{1}$
+to~$\mathcal{Q}_{13}$ can be realized with ULO/RDF datasets. Where
+possible, we evaluate proof of concept implementations.
-\begin{enumerate}
-\item Find theorems with non-elementary proofs.
-\item Find algorithms that solve $NP$-complete graph problems.
-\item Find integer sequences whose generating function is a rational
+\subsubsection*{$\mathcal{Q}_{1}$ Find theorems with non-elementary proofs.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{2}$ Find algorithms that solve $NP$-complete graph problems.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{3}$ Find integer sequences whose generating function is a rational
   polynomial in $\sin(x)$ that has a Maple implementation not affected
-  by the bug in module~$x$.
-\item $CAS$ implementation of Groebner bases that conform to a
-  definition in AFP.
-\item Find all group representations that are good for~$X$ (say a
+  by the bug in module~$x$.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{4}$ $CAS$ implementation of Groebner bases that conform to a
+  definition in AFP.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{5}$ Find all group representations that are good for~$X$ (say a
   software engineer working on something and doesn't know group
-  theory), maybe ``computing with in/finite groups''.
-\item Math software systems that implement algorithms from MSC48CXX
-  (or that compute a particular thing).
-\item All areas of math that {Nicolas G.\ de Bruijn} has worked in and
-  his main contributions.
-\item All the researchers that have worked on problem~$X$ (where~$X$
-  does not have a good name, maybe connected to ``Go'').
-\item Areas of mathematics that immediate descendants of~$X$ worked
-  on.
-\item All graphs whose order is larger than the publication record of
-  its ``inventor'' (name patron).
-\item Integer sequences that grow sub-exponentially.
-\item Published integer sequences not listed in the OEIS.
-\item Find all polynomials whose list of coefficients occurs as a
-  subsequence of a specific OEIS sequence.
-\end{enumerate}
+  theory), maybe ``computing with in/finite groups''.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{6}$ Math software systems that implement algorithms from MSC48CXX
+  (or that compute a particular thing).}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{7}$ All areas of math that {Nicolas G.\ de Bruijn} has worked in and
+  his main contributions.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{8}$ All the researchers that have worked on problem~$X$ (where~$X$
+  does not have a good name, maybe connected to ``Go'').}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{9}$ Areas of mathematics that immediate descendants of~$X$ worked
+  on.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{10}$ All graphs whose order is larger than the publication record of
+  its ``inventor'' (name patron).}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{11}$ Integer sequences that grow sub-exponentially.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{12}$ Published integer sequences not listed in the OEIS.}
+Here be dragons
+\subsubsection*{$\mathcal{Q}_{13}$ Find all polynomials whose list of coefficients occurs as a
+  subsequence of a specific OEIS sequence.}
+Here be dragons
--- a/doc/report/cut.tex
+++ b/doc/report/cut.tex
+% This file contains sections and paragraphs cut from the report.
+% I'll delete them eventually, but for now I'd like to have them
+% available for referencing.
+\subsection{Kinds of Applications}
+Storing information in RDF triplets allows for any kind of queries,
+meaning it is not optimized for any kind of application. For the sake
+of this project, we tried out three categories of applications.
+\begin{itemize}
+    \item Of course the initial starting point for this project was
+      the idea of tetrapodal search. Our first application
+      \emph{ulosearch} tries to offer an easy way of searching in the
+      ULO/RDF data set.
+    \item With lots of data in a database, it appears attractive to
+      visualize the data set in some kind of graphical way in the
+      \emph{ulovisualize} application.
+    \item Finally, we want to experiment a bit. The available ULO/RDF
+      data sets are about proofs and theorems and should include links
+      between them. It might be interesting to find out which proofs and
+      definitions are more important than others such that we can
+      create a kind of ranking of them. This is explored in the
+      \emph{ulorate} application.
+\end{itemize}
+\subsection{Database Interface}
+For integrating the ULO/RDF data set into an existing application, it
+probably is reasonable to directly query the data set using RDF4J.
+That is, of course, assuming the existing codebase is based on the
+{JVM}.  If that is not the case, generating SPARQL queries is the
+obvious choice.
+The advantage of this approach is that connecting and interacting
+with the database is straightforward. The disadvantage is that this
+approach requires a deep understanding of the structure of the underlying
+ULO triplets.
+\subsection{A Language for Organizational Data}
+ULO/RDF is a subset of RDF. While it can be queried as just standard
+RDF data, maybe it is helpful to design a query language only for
+ULO/RDF triplets. Expressions in this particular query language could
+then be converted to SPARQL or RDF4J expressions. Ideally this means
+that (1)~the query language is intuitive and easy to use for this
+specific use case and (2)~execution is still fast as the underlying
+SPARQL database is already very optimized.
+% This does not really fit, in general this entire section is kind
+% of a mess and contains more stuff about things that do not even
+% exist yet than actual information.