diff --git a/doc/report/applications.tex b/doc/report/applications.tex index aec7f415afeced144cd41991adcdce6b93b6f9be..e4097b82b34da820d21b78c32b3e4fc7f20104e4 100644 --- a/doc/report/applications.tex +++ b/doc/report/applications.tex @@ -5,102 +5,75 @@ With endpoints in place, we can now query the ULO/RDF data set. Depending on the kind of application, different interfaces and approaches to querying the database might make sense. -\subsection{Kinds of Applications} - -Storing information in RDF triplets allows for any kind of queries, -meaning it is not optimized for any kind of application. For the sake -of this project, we tried out three categories of applications. - -\begin{itemize} - \item Of course the initial starting point for this project was - the idea of tetrapodal search. Our first application - \emph{ulosearch} tires to offer an easy way of searching in the - ULO/RDF data set. - - \item With lots of data in a database, it appears attractive to - visualize the data set in some kind graphical way in the - \emph{ulovisualize} application. - - \item Finally, we want to experiment a bit. The available ULO/RDF - data sets are about proofs and theorems and should include links - between. It might be interesting to find out which proofs and - definitions are more important than others such that we can - create a kind of ranking of them. This is explored in the - \emph{ulorate} application. -\end{itemize} - -\subsection{Database Interface} - -For integrating the ULO/RDF data set into an existing application, it -probably is reasonable to directly query the data set using RDF4J. -That is, of course, assuming the existing co debase is based on the -{JVM}. If that is not the case, generating SPARQL queries is the -obvious choice. - -The advantage of this approach is that connecting and interacting -with the database is straightforward. The disadvantage is that this -approach requires a deep understanding of structure of the underlying -ULO triplets. 
- -\subsection{A Language for Organizational Data} - -ULO/RDF is a subset of RDF. While it can be queried as just standard -RDF data, maybe it is helpful to design a query language only for -ULO/RDF triplets. Expressions in this particular query language could -then be converted to SPARQL or RDF4J expressions. Ideally this means -that (1)~the query language is intuitive and easy to use for this -specific use case and (2)~execution is still fast as the underlying -SPARQL database is already very optimized. - -% This does not really fit, in general this entire section is kind -% of a mess and contains more stuff about things that do not even -% exist yet than actual information. - \subsection{Querying for Tetrapodal Search} -The first introduction of tetrapodal search contains various queries -that such a system should answer~\cite{tetra}. For each of the -suggested queries, we experimented how well ULO/RDF can answer this -queries. As \emph{ulo-storage} provides only one of the four required -kinds of mathematical knowledge required by tetrapodal search, not all -queries could be answered. They are replicated here in verbatim. +\emph{ulo-storage} was started with the goal of making organizational +knowledge available for tetrapodal search. We will first take a look +at how ULO/RDF performs at this task. Conveniently, various queries +for a tetrapodal search system were suggested in~\cite{tetra}; we will +investigate how well each of the suggested queries~$\mathcal{Q}_{1}$ +to~$\mathcal{Q}_{13}$ can be realized with ULO/RDF datasets. Where +possible, we evaluate proof-of-concept implementations. + +\subsubsection*{$\mathcal{Q}_{1}$ Find theorems with non-elementary proofs.} -\begin{enumerate} +Here be dragons -\item Find theorems with non-elementary proofs. +\subsubsection*{$\mathcal{Q}_{2}$ Find algorithms that solve $NP$-complete graph problems.} -\item Find algorithms that solve $NP$-complete graph problems.
+Here be dragons -\item Find integer sequences whose generating function is a rational +\subsubsection*{$\mathcal{Q}_{3}$ Find integer sequences whose generating function is a rational polynomial in $\sin(x)$ that has a Maple implementation not affected - by the bug in module~$x$. + by the bug in module~$x$.} + +Here be dragons + +\subsubsection*{$\mathcal{Q}_{4}$ $CAS$ implementation of Groebner bases that conform to a + definition in AFP.} -\item $CAS$ implementation of Groebner bases that conform to a - definition in AFP. +Here be dragons -\item Find all group representations that are good for~$X$ (say a +\subsubsection*{$\mathcal{Q}_{5}$ Find all group representations that are good for~$X$ (say a software engineer working on something and doesn't know group - theory), maybe ``computing with in/finite groups''. + theory), maybe ``computing with in/finite groups''.} + +Here be dragons + +\subsubsection*{$\mathcal{Q}_{6}$ Math software systems that implement algorithms from MSC48CXX + (or that compute a particular thing).} + +Here be dragons + +\subsubsection*{$\mathcal{Q}_{7}$ All areas of math that {Nicolas G.\ de Bruijn} has worked in and + his main contributions.} + +Here be dragons + +\subsubsection*{$\mathcal{Q}_{8}$ All the researchers that have worked on problem~$X$ (where~$X$ + does not have a good name, maybe connected to ``Go'').} + +Here be dragons + +\subsubsection*{$\mathcal{Q}_{9}$ Areas of mathematics that immediate descendants of~$X$ worked + on.} + +Here be dragons -\item Math software systems that implement algorithms from MSC48CXX - (or that compute a particular thing). +\subsubsection*{$\mathcal{Q}_{10}$ All graphs whose order is larger than the publication record of + its ``inventor'' (name patron).} -\item All areas of math that {Nicolas G.\ de Bruijn} has worked in and - his main contributions. +Here be dragons -\item All the researchers that have worked on problem~$X$ (where~$X$ - does not have a good name, maybe connected to ``Go''). 
+\subsubsection*{$\mathcal{Q}_{11}$ Integer sequences that grow sub-exponentially.} -\item Areas of mathematics that immediate descendants of~$X$ worked - on. +Here be dragons -\item All graphs whose order is larger than the publication record of - its ``inventor'' (name patron). +\subsubsection*{$\mathcal{Q}_{12}$ Published integer sequences not listed in the OEIS.} -\item Integer sequences that grow sub-exponentially. +Here be dragons -\item Published integer sequences not listed in the OEIS. +\subsubsection*{$\mathcal{Q}_{13}$ Find all polynomials whose list of coefficients occurs as a + subsequence of a specific OEIS sequence.} -\item Find all polynomials whose list of coefficients occurs as a - subsequence of a specific OEIS sequence. -\end{enumerate} +Here be dragons diff --git a/doc/report/cut.tex b/doc/report/cut.tex new file mode 100644 index 0000000000000000000000000000000000000000..e97311199cee13e40ed2588be2d9c5b163acddd2 --- /dev/null +++ b/doc/report/cut.tex @@ -0,0 +1,55 @@ +% This file contains sections and paragraphs cut from the report. +% I'll delete them eventually, but for now I'd like to have them +% available for referencing. + +\subsection{Kinds of Applications} + +Storing information in RDF triplets allows for any kind of queries, +meaning it is not optimized for any kind of application. For the sake +of this project, we tried out three categories of applications. + +\begin{itemize} + \item Of course the initial starting point for this project was + the idea of tetrapodal search. Our first application + \emph{ulosearch} tries to offer an easy way of searching in the + ULO/RDF data set. + + \item With lots of data in a database, it appears attractive to + visualize the data set in some kind of graphical way in the + \emph{ulovisualize} application. + + \item Finally, we want to experiment a bit. The available ULO/RDF + data sets are about proofs and theorems and should include links + between.
It might be interesting to find out which proofs and + definitions are more important than others such that we can + create a kind of ranking of them. This is explored in the + \emph{ulorate} application. +\end{itemize} + +\subsection{Database Interface} + +For integrating the ULO/RDF data set into an existing application, it +probably is reasonable to directly query the data set using RDF4J. +That is, of course, assuming the existing codebase is based on the +{JVM}. If that is not the case, generating SPARQL queries is the +obvious choice. + +The advantage of this approach is that connecting and interacting +with the database is straightforward. The disadvantage is that this +approach requires a deep understanding of the structure of the underlying +ULO triplets. + +\subsection{A Language for Organizational Data} + +ULO/RDF is a subset of RDF. While it can be queried as just standard +RDF data, maybe it is helpful to design a query language only for +ULO/RDF triplets. Expressions in this particular query language could +then be converted to SPARQL or RDF4J expressions. Ideally this means +that (1)~the query language is intuitive and easy to use for this +specific use case and (2)~execution is still fast as the underlying +SPARQL database is already very optimized. + +% This does not really fit, in general this entire section is kind +% of a mess and contains more stuff about things that do not even +% exist yet than actual information. +