Commit 12539924 authored by Andreas Schärtl
report: applications: rework section on queries Q_i to not use \itemize

Too many indents for my taste.
parent 098f2497
@@ -18,7 +18,8 @@ endpoint.
structured (Section~\ref{sec:tetraq}).
\end{itemize}
\noindent These applications will now be discussed in dedicated
sections.
\subsection{Exploring Existing Data Sets}\label{sec:expl}
@@ -80,87 +81,92 @@ can be realized with ULO data sets and where other data sources are
required. Where possible, we evaluate proof of concept
implementations.
\subsubsection{Elementary Proofs and Computed Scores}
Our first query~$\mathcal{Q}_1$ illustrates how we can compute
arithmetic scores for some nodes in our knowledge
graph. Query~$\mathcal{Q}_1$ asks us to ``[f]ind theorems with
non-elementary proofs''. Elementary proofs are those that are
considered easy and obvious~\cite{elempro}. Consequently,
$\mathcal{Q}_1$ has to search for all proofs which are not
trivial. Of course, just as any distinction between ``theorem'' and
``corollary'' is somewhat arbitrary, so is any judgment about whether
a proof is easy or not.
Existing research on proof difficulty is either very broad or specific
to one problem. For example, some experiments showed that students and
prospective school teachers have problems with notation, term
rewriting and required prerequisites~\cite{proofund, proofteach}, none
of which seems applicable for grading individual proofs for
difficulty. On the other hand, there is research on rating proofs for
individual subsets of problems, e.g.\ on the satisfiability of a given
CNF formula. A particular example focuses on heuristics for how long a
SAT solver will take to find a solution for a given~{CNF}. Here,
solutions that take longer to find are considered
harder~\cite{proofsat}.
\noindent\textbf{Organizational Aspect} A first working hypothesis
might be to assume that elementary proofs are short. In that case, the
size, that is, the number of bytes needed to store the proof, is our first
indicator of proof complexity. This is by no means perfect, as even
identical proofs can be represented in different ways that might have
vastly different sizes in bytes. It might be tempting to imagine a
unique normal form for each proof, but finding such a normal form
might very well be impossible. As it is very difficult to find a
generalized definition of proof difficulty, we accept proof size as a
first working hypothesis.
{ULO} offers the \texttt{ulo:external-size} predicate, which allows
us to sort by file size. Perhaps low proof complexity also leads to
quick check times in proof assistants and automatic theorem provers;
with this assumption in mind, we could also use the
\texttt{ulo:check-time} predicate. Approximating proof complexity by
file size (or check time) gives us one indicator of proof complexity
based on organizational knowledge alone.
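To illustrate, the following minimal sketch orders all proofs by
their file size. It is an assumption for illustration that proof
objects are typed as \texttt{ulo:proof} and carry a numeric
\texttt{ulo:external-size} annotation; the prefix declarations follow
the listings used elsewhere in this report.
\begin{lstlisting}
PREFIX ulo: <https://mathhub.info/ulo#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>

# Sketch: order all proofs by their size in bytes, smallest
# first. Assumes proofs are typed as ulo:proof and carry an
# ulo:external-size annotation with a numeric value.
SELECT ?proof ?size
WHERE {
    ?proof rdf:type ulo:proof .
    ?proof ulo:external-size ?size .
}
ORDER BY ASC(?size)
\end{lstlisting}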
\noindent\textbf{Other Aspects} A tetrapodal search system should
probably also take symbolic knowledge into account. Based on some kind
of measure of formula complexity, different proofs could be
rated. Similarly, with narrative knowledge available to us, we could
count the number of words, references and so on to rate the narrative
complexity of a proof. Combining symbolic knowledge, narrative
knowledge and organizational knowledge allows us to find proofs which
are probably straightforward.
\input{applications-q1.tex}
\noindent\textbf{Implementation} Implementing a naive version of the
organizational aspect can be as simple as querying for all theorems
justified by proofs, ordered by size (or check time).
Figure~\ref{fig:q1short} illustrates how this can be achieved with a
SPARQL query. Maybe we want to go one step further and calculate a
rating that assigns each proof some numeric score of complexity based
on a number of properties. We can achieve this in SPARQL, as recent
versions of the specification support arithmetic;
Figure~\ref{fig:q1long} shows an example. Finding a reasonable rating
is its own topic of research, but as long as it is based on standard
arithmetic, it can be formulated as a SPARQL query.
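As a sketch of what such arithmetic can look like, the query below
combines size and check time into a single score. The weighting is an
arbitrary placeholder and not the rating Figure~\ref{fig:q1long}
actually uses.
\begin{lstlisting}
PREFIX ulo: <https://mathhub.info/ulo#>

# Sketch: a made-up complexity score computed as a weighted
# sum of file size and check time. The weight of 10 is an
# arbitrary placeholder, as is the assumption that both
# predicates hold numeric literals.
SELECT ?proof ((?size + 10 * ?time) AS ?score)
WHERE {
    ?proof ulo:external-size ?size .
    ?proof ulo:check-time ?time .
}
ORDER BY DESC(?score)
\end{lstlisting}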
The queries in Figure~\ref{fig:q1full} return a list of all theorems
and associated proofs; naturally, this list is bound to be very
long. One way to solve this problem is to introduce some kind of
cutoff value for our complexity score. Another potential solution is
to only list the first~$n$ results, something a user interface would
do anyway. Either way, this is not so much an issue for the
organizational storage engine and more one that a tetrapodal search
aggregator has to account for.
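Both mitigations are easy to express in SPARQL, as the following
sketch shows; the cutoff of 10000 bytes and the page size of 100
results are arbitrary placeholders.
\begin{lstlisting}
PREFIX ulo: <https://mathhub.info/ulo#>

# Sketch: apply a cutoff to the score (here simply the file
# size) and return only the first 100 results.
SELECT ?proof ?size
WHERE {
    ?proof ulo:external-size ?size .
    FILTER (?size < 10000)
}
ORDER BY ASC(?size)
LIMIT 100
\end{lstlisting}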
\subsubsection{Categorizing Algorithms and Algorithmic Problems}
The second query~$\mathcal{Q}_2$ we decided to focus on asks us to
``[f]ind algorithms that solve $NP$-complete graph problems.'' Here we
want the tetrapodal search system to return a listing of algorithms
that solve (graph) problems with a given property (runtime
complexity). We need to consider where each of these three components
might be stored.
\noindent\textbf{Symbolic and Concrete Aspects} First, let us consider
algorithms. Algorithms can be formulated as computer code which
can be understood as symbolic knowledge (code represented as a
syntax tree) or as concrete knowledge (code as text
@@ -171,7 +177,7 @@ implementations.
stored in a separate index for organizational knowledge, it being
the only fit.
\noindent\textbf{Organizational Aspect} If we wish to look up properties
about algorithms from organizational knowledge, we first have to
think about how to represent this information. We propose two
approaches, one using the existing ULO ontology and one that
@@ -202,12 +208,15 @@ implementations.
dcowl}) and keeping concepts separate is not entirely
unattractive in itself.
\subsubsection{Contributors and Number of References}
The final query~$\mathcal{Q}_3$ from the literature~\cite{tetra} we
wish to look at asks for ``[a]ll areas of math that {Nicolas G.\ de
Bruijn} has worked in and his main contributions.'' This query asks
about the works of a given author~$A$. It also asks for their main
contributions, e.g.\ what paragraphs or code~$A$ has authored.
\noindent\textbf{Organizational Aspect} ULO has no concept of authors,
contributors, dates and so on. Rather, the idea is to take
advantage of the Dublin Core project which provides an ontology
for such metadata~\cite{dcreport, dcowl}. For example, Dublin Core
@@ -227,7 +236,7 @@ implementations.
important. Importance is a quality measure; simply sorting the
results by number of references might be a good start.
\noindent\textbf{Implementation} A search for contributions by a given author
can easily be formulated in {SPARQL}.
\begin{lstlisting}
PREFIX ulo: <https://mathhub.info/ulo#>
@@ -259,7 +268,6 @@ implementations.
We can formulate~$\mathcal{Q}_3$ with just one SPARQL
query. Because everything is handled by the database, access
should be about as quick as we can hope it to be.
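While the body of the listing is elided in this excerpt, the core of
such a query is small. The following sketch assumes that works are
linked to their authors with Dublin Core's \texttt{dcterms:creator}
and that cross-references are recorded with \texttt{ulo:uses}; both
the predicate choices and the plain-literal author name are
assumptions for illustration, not the elided listing itself.
\begin{lstlisting}
PREFIX ulo: <https://mathhub.info/ulo#>
PREFIX dcterms: <http://purl.org/dc/terms/>

# Sketch: all works by de Bruijn, sorted by the number of
# objects that reference them, as a crude measure of how
# important each contribution is.
SELECT ?work (COUNT(?other) AS ?references)
WHERE {
    ?work dcterms:creator "Nicolas G. de Bruijn" .
    OPTIONAL { ?other ulo:uses ?work . }
}
GROUP BY ?work
ORDER BY DESC(?references)
\end{lstlisting}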
Experimenting with $\mathcal{Q}_1$ to $\mathcal{Q}_3$ provided us with
some insight into ULO and existing ULO exports. $\mathcal{Q}_1$ shows