From 400fc60a5c05a7999fdf1661f26dfdec72c66aa6 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Andreas=20Sch=C3=A4rtl?= <andreas@schaertl.me>
Date: Mon, 29 Jun 2020 10:49:09 +0200
Subject: [PATCH] applications: formulate thoughts on applications

Next up: Move to report, formalize.
---
 .../tetrapodal-applications/applications.txt  | 232 +++++++++++++++++-
 1 file changed, 225 insertions(+), 7 deletions(-)

diff --git a/experimental/tetrapodal-applications/applications.txt b/experimental/tetrapodal-applications/applications.txt
index 0cd2931..8adc484 100644
--- a/experimental/tetrapodal-applications/applications.txt
+++ b/experimental/tetrapodal-applications/applications.txt
@@ -1,35 +1,253 @@
 01. Find theorems with non-elementary proofs.
+=========================================================================
+
+- elementary proofs are easy
+
+- just like distinction between *theorem* and *corollary* is
+  arbitrary, it is not really defined what an *elementary* proof
+  really has to be
+
+-> proofs could be rated by `check-time`; nothing in literature really
+   mentions the assumption that elementary proofs are easy to check
+
+-> more of a job of (a) symbolic knowledge (complexity for formulas)
+   and (b) narrative knowledge (count the number of references, count
+   the number of narrative elements, e.g. prose, itself)
+
+- if we think of elementary proofs as proofs that support elementary
+  theorems, i.e. theorems that make up the foundations of lots of
+  other theories, we could look into the number of links for a given
+  object
+
+-> look at *inspired-by*, *alternative-for*
 
 02. Find algorithms that solve NP-complete graph problems.
+=========================================================================
+
+- Here we are looking for algorithms. An algorithm should have a
+  (a) specific purpose and (b) ideally input and output parameters of a given
+  type.
+
+-> algorithms aren't really part of ULO; it might make sense to add the
+  concept of an algorithm to ULO, including input parameters and
+  results (types)
+
+-> a theorem should be able to state that a given algorithm is in a
+   given complexity class (space and time) with a proof associated to
+   it
+
+-> with part of ULO, we could search for algorithms of a given `name`
+
+-> even better, it should be possible to link the algorithm to a given
+   "problem"; "problems" don't really fit with the idea of theorems, they
+   are their own thing (TSP is not a theorem or algorithm, it is a problem
+   for which we can state algorithms and about which we can state theorems)
+
+-> of course at that point we are asking ULO to be an ontology for
+   algorithms as well which might be too far fetched
 
 03. Find integer sequences whose generating function is a rational
     polynomial in sin(x) that has a Maple implementation not affected by
     the bug in module x.
+=========================================================================
+
+- This query is about finding concrete integer sequences with some
+  properties.
+
+-> This is a case where information would be split between many
+   sources.  The `name` of the sequences might be part of the
+   organizational data set.  The generating function is part of
+   symbolic knowledge, the Maple implementation could be part of
+   concrete knowledge.
+
+-> Handling this query would probably start by filtering for all
+   integer sequences. It is not clear how this should be achieved with ULO
+   as certainly there is no unified concept of a sequences.
+
+-> It might be possible to take advantage of `aligned-with` or some
+   similar concept to find all such sequences. If this succeeds, ULO
+   can indeed provide the first step in servicing this query.
+
+-> However, finding concrete algorithms and in particular looking
+   inside of them is not organizational data.
+
+04. CAS implementation of Groebner bases that conform to a definition
+    in AFP.
+=========================================================================
+
+- Gröbner Bases are an idea from mathematics particular attractive for
+  use in computer algebra systems (CAS).
+
+- This query is asking for concrete implementations of Gröbner Bases
+  that match the definition in the Archive of Formal Proofs (AFP).
+
+-> Stated like this, it is reasonable to assume that the user already
+   has a specific definition in mind. First a translation from AFP
+   identifier (whatever that is) to an ULO URI is necessary.
+
+-> Armed with the ULO URI, we would need links to symbolic
+   knowledge/concrete knowledge that implements the given
+   concept. Here, we require explicit links to the organizational data
+   set (tags), something that is probably unreasonable.
+
+-> Automatic matching from symbolic knowledge (definition from AFP) to
+   concrete knowledge (implementations) would be preferable.
+
+05. Find all group representations that are good for X (say a software
+    engineer working on something and doesn't know group theory),
+    maybe "computing with in/finite groups".
+=========================================================================
+
+- Here the query is asking for ideas/concepts/data structures for a
+  given problem X.
+
+-> As previously stated, there currently is no concept of a problem
+   class in ULO (e.g. TSP is a problem) which might be exactly what X
+   is about.
 
-04. CAS implementation of Groebner bases that conform to a
-    definition in AFP.
+-> As a fallback, we could look into narrative knowledge and search
+   for keywords related to X. This would yield certain references we
+   could follow up upon.
 
-05. Find all group representations that are good for X (say a
-    software engineer working on something and doesn't know group
-    theory), maybe ``computing with in/finite groups''.
+-> Group representations I would except as part of symbolic knowledge.
+   With the references either from (a) narrative knowledge (keywords)
+   or (b) from organizational knowledge ULO (URIs), look up the
+   symbolic information with these
 
 06. Math software systems that implement algorithms from MSC48CXX (or
     that compute a particular thing).
+=========================================================================
+
+- Again, this is a question about algorithms A that solve a given
+  problem P. In this case, P is part of some specific resource or
+  publication.
+
+-> ULO should allow us to list all objects in a given resource.
+
+-> If we have a concept of "problems" as discussed before as well as
+   links from algorithms (concrete knowledge) to problems, this query
+   is easy to service.
 
 07. All areas of math that {Nicolas G.\ de Bruijn} has worked in and
     his main contributions.
+=========================================================================
+
+- This is a query that has to be answerable by ULO as it is asking
+  about organizational metadata. In particular it is asking by works
+  of a given author A. It also ask for his main contributions,
+  e.g. what paragraphs of a publications A has authored.
+
+-> While we expected this to be a task for organizational data, it
+   might not be the case. The organizational data is more concerned
+   with the "what" and not so much about the "by who". Indeed the
+   tetrapodal search paper states that it is narrative knowledge which
+   contains names (of theorems, proofs, ...) and comments. Maybe this
+   is where we should look instead.
+
+-> However, as ULO contains things like `file` and `para` (paragraph)
+   maybe it does belong to ULO after all. In this case, ULO should add
+   an `authored-by` property that states that a given object was
+   authored by some authors X,Y,Z...
 
 08. All the researchers that have worked on problem X (where X does not
-    have a good name, maybe connected to ``Go'').
+    have a good name, maybe connected to "Go").
+=========================================================================
+
+- This query is similar to 07 in that it is a lookup from author to
+  their works. What is different is that we don't have a formal
+  identifier for X.
+
+-> Mapping a search term such as "Go" to searchable URIs seems like a
+   job for a narrative index. Armed with these URIs lookup is like
+   described in 07.
 
 09. Areas of mathematics that immediate descendants of X worked on.
+=========================================================================
+
+- Again, this is similar to query 07. The difference is that instead
+  of asking for works authored by X it asks for descendants Y of
+  X. Alternatively it could ask for "all colleagues Y of X" or "all
+  researches Y housed at the same department as X.
+
+-> What we need here is knowledge about the connections between
+   different authors. It is not clear where this information should be
+   situated.
+
+   It isn't symbolic nor concrete knowledge. It should be possible to
+   gain some of this information from narrative knowledge. As for
+   organizational knowledge, ULO doesn't really have an elaborate
+   concept of authors and their relationships. At least information
+   such as "Y is professor at university U or research group R" might
+   make sense in ULO.  However it does feel like a stretch to include
+   this information in ULO which right now is more about the
+   organization of mathematical knowledge and not so much about its
+   historic and "earthly" contexts.
 
 10. All graphs whose order is larger than the publication record of
-    its ``inventor'' (name patron).
+    its "inventor" (name patron).
+=========================================================================
+
+- This query is asking about two things: (a) The number of published
+  works by a given author and (b) graphs constructed by a given
+  author. If we can query both of these factoids, merging them to
+  service the query should not be difficult.
+
+-> With a concept of authorship, it should be easy to count the
+   publications for a given author.
+
+-> Counting all vertices is sufficient to gain the order of a given
+   graph G. Of course, such an operation can be quite time consuming
+   and should be cached as part of the graph as part of the index for
+   concrete knowledge.
 
 11. Integer sequences that grow sub-exponentially.
+=========================================================================
+
+- This query is asking about all integer sequences that have a
+  specific property.
+
+-> Query 03 already focused on what needs to be done to search for
+   integer sequences.
+
+-> The growth of an integer sequence is a property that could be
+   stored explicitly (with an associated proof). Then it is easy to
+   serve this query.
+
+-> If we do not have this property stored explicitly, things get more
+   difficult. One could imagine a heuristic on the first n members of
+   a sequence. But doing this for every query is not feasible. It would
+   need to be computed ahead of time from the given concrete knowledge.
 
 12. Published integer sequences not listed in the OEIS.
+=========================================================================
+
+- This query is asking for a match between two indices, namely our own
+  and the OEIS.
+
+-> If we (a) can extract integer sequences from collected concrete
+   knowledge and symbolic knowledge and (b) have an index of the OEIS, this
+   should not be very difficult.
+
+-> As a starter, it might be sufficient to compare some first n
+   members of a sequence and check whether such a sequence is part of
+   OEIS.
+
+-> Of course, if a sequence is generated by some symbolic formula F,
+   this might proof difficult for various reasons. F might be hard to
+   compute and the representation of the OEIS different from ours.
 
 13. Find all polynomials whose list of coefficients occurs as a
     subsequence of a specific OEIS sequence.
+=========================================================================
+
+- This query is asking about (a) coefficients from polynomials and (b)
+  integer sequences.
+
+-> We've already discussed integer sequences in previous queries.
+
+-> Querying coefficients of polynomials might be tricky. It could be
+   extracted from symbolic knowledge, but easy this is not.
+
+-> One could imagine a particular schema for concrete knowledge that
+   stores specific polynomials. But at that point we are building a dedicated
+   index for polynomials, something that might be out of scope.
-- 
GitLab