From 400fc60a5c05a7999fdf1661f26dfdec72c66aa6 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Andreas=20Sch=C3=A4rtl?= <andreas@schaertl.me> Date: Mon, 29 Jun 2020 10:49:09 +0200 Subject: [PATCH] applications: formulate thoughts on applications Next up: Move to report, formalize. --- .../tetrapodal-applications/applications.txt | 232 +++++++++++++++++- 1 file changed, 225 insertions(+), 7 deletions(-) diff --git a/experimental/tetrapodal-applications/applications.txt b/experimental/tetrapodal-applications/applications.txt index 0cd2931..8adc484 100644 --- a/experimental/tetrapodal-applications/applications.txt +++ b/experimental/tetrapodal-applications/applications.txt @@ -1,35 +1,253 @@ 01. Find theorems with non-elementary proofs. +========================================================================= + +- elementary proofs are easy + +- just like distinction between *theorem* and *corollary* is + arbitrary, it is not really defined what an *elementary* proof + really has to be + +-> proofs could be rated by `check-time`; nothing in literature really + mentions the assumption that elementary proofs are easy to check + +-> more of a job of (a) symbolic knowledge (complexity for formulas) + and (b) narrative knowledge (count the number of references, count + the number of narrative elements, e.g. prose, itself) + +- if we think of elementary proofs as proofs that support elementary + theorems, i.e. theorems that make up the foundations of lots of + other theories, we could look into the number of links for a given + object + +-> look at *inspired-by*, *alternative-for* 02. Find algorithms that solve NP-complete graph problems. +========================================================================= + +- Here we are looking for algorithms. An algorithm should have a + (a) specific purpose and (b) ideally input and output parameters of a given + type. + +-> algorithms aren't really part of ULO; it might make sense to add the + concept of an algorithm to ULO, including input parameters and + results (types) + +-> a theorem should be able to state that a given algorithm is in a + given complexity class (space and time) with a proof associated to + it + +-> with part of ULO, we could search for algorithms of a given `name` + +-> even better, it should be possible to link the algorithm to a given + "problem"; "problems" don't really fit with the idea of theorems, they + are their own thing (TSP is not a theorem or algorithm, it is a problem + for which we can state algorithms and about which we can state theorems) + +-> of course at that point we are asking ULO to be an ontology for + algorithms as well which might be too far fetched 03. Find integer sequences whose generating function is a rational polynomial in sin(x) that has a Maple implementation not affected by the bug in module x. +========================================================================= + +- This query is about finding concrete integer sequences with some + properties. + +-> This is a case where information would be split between many + sources. The `name` of the sequences might be part of the + organizational data set. The generating function is part of + symbolic knowledge, the Maple implementation could be part of + concrete knowledge. + +-> Handling this query would probably start by filtering for all + integer sequences. It is not clear how this should be achieved with ULO + as certainly there is no unified concept of a sequences. + +-> It might be possible to take advantage of `aligned-with` or some + similar concept to find all such sequences. If this succeeds, ULO + can indeed provide the first step in servicing this query. + +-> However, finding concrete algorithms and in particular looking + inside of them is not organizational data. + +04. CAS implementation of Groebner bases that conform to a definition + in AFP. +========================================================================= + +- Gröbner Bases are an idea from mathematics particular attractive for + use in computer algebra systems (CAS). + +- This query is asking for concrete implementations of Gröbner Bases + that match the definition in the Archive of Formal Proofs (AFP). + +-> Stated like this, it is reasonable to assume that the user already + has a specific definition in mind. First a translation from AFP + identifier (whatever that is) to an ULO URI is necessary. + +-> Armed with the ULO URI, we would need links to symbolic + knowledge/concrete knowledge that implements the given + concept. Here, we require explicit links to the organizational data + set (tags), something that is probably unreasonable. + +-> Automatic matching from symbolic knowledge (definition from AFP) to + concrete knowledge (implementations) would be preferable. + +05. Find all group representations that are good for X (say a software + engineer working on something and doesn't know group theory), + maybe "computing with in/finite groups". +========================================================================= + +- Here the query is asking for ideas/concepts/data structures for a + given problem X. + +-> As previously stated, there currently is no concept of a problem + class in ULO (e.g. TSP is a problem) which might be exactly what X + is about. -04. CAS implementation of Groebner bases that conform to a - definition in AFP. +-> As a fallback, we could look into narrative knowledge and search + for keywords related to X. This would yield certain references we + could follow up upon. -05. Find all group representations that are good for X (say a - software engineer working on something and doesn't know group - theory), maybe ``computing with in/finite groups''. +-> Group representations I would except as part of symbolic knowledge. + With the references either from (a) narrative knowledge (keywords) + or (b) from organizational knowledge ULO (URIs), look up the + symbolic information with these 06. Math software systems that implement algorithms from MSC48CXX (or that compute a particular thing). +========================================================================= + +- Again, this is a question about algorithms A that solve a given + problem P. In this case, P is part of some specific resource or + publication. + +-> ULO should allow us to list all objects in a given resource. + +-> If we have a concept of "problems" as discussed before as well as + links from algorithms (concrete knowledge) to problems, this query + is easy to service. 07. All areas of math that {Nicolas G.\ de Bruijn} has worked in and his main contributions. +========================================================================= + +- This is a query that has to be answerable by ULO as it is asking + about organizational metadata. In particular it is asking by works + of a given author A. It also ask for his main contributions, + e.g. what paragraphs of a publications A has authored. + +-> While we expected this to be a task for organizational data, it + might not be the case. The organizational data is more concerned + with the "what" and not so much about the "by who". Indeed the + tetrapodal search paper states that it is narrative knowledge which + contains names (of theorems, proofs, ...) and comments. Maybe this + is where we should look instead. + +-> However, as ULO contains things like `file` and `para` (paragraph) + maybe it does belong to ULO after all. In this case, ULO should add + an `authored-by` property that states that a given object was + authored by some authors X,Y,Z... 08. All the researchers that have worked on problem X (where X does not - have a good name, maybe connected to ``Go''). + have a good name, maybe connected to "Go"). +========================================================================= + +- This query is similar to 07 in that it is a lookup from author to + their works. What is different is that we don't have a formal + identifier for X. + +-> Mapping a search term such as "Go" to searchable URIs seems like a + job for a narrative index. Armed with these URIs lookup is like + described in 07. 09. Areas of mathematics that immediate descendants of X worked on. +========================================================================= + +- Again, this is similar to query 07. The difference is that instead + of asking for works authored by X it asks for descendants Y of + X. Alternatively it could ask for "all colleagues Y of X" or "all + researches Y housed at the same department as X. + +-> What we need here is knowledge about the connections between + different authors. It is not clear where this information should be + situated. + + It isn't symbolic nor concrete knowledge. It should be possible to + gain some of this information from narrative knowledge. As for + organizational knowledge, ULO doesn't really have an elaborate + concept of authors and their relationships. At least information + such as "Y is professor at university U or research group R" might + make sense in ULO. However it does feel like a stretch to include + this information in ULO which right now is more about the + organization of mathematical knowledge and not so much about its + historic and "earthly" contexts. 10. All graphs whose order is larger than the publication record of - its ``inventor'' (name patron). + its "inventor" (name patron). +========================================================================= + +- This query is asking about two things: (a) The number of published + works by a given author and (b) graphs constructed by a given + author. If we can query both of these factoids, merging them to + service the query should not be difficult. + +-> With a concept of authorship, it should be easy to count the + publications for a given author. + +-> Counting all vertices is sufficient to gain the order of a given + graph G. Of course, such an operation can be quite time consuming + and should be cached as part of the graph as part of the index for + concrete knowledge. 11. Integer sequences that grow sub-exponentially. +========================================================================= + +- This query is asking about all integer sequences that have a + specific property. + +-> Query 03 already focused on what needs to be done to search for + integer sequences. + +-> The growth of an integer sequence is a property that could be + stored explicitly (with an associated proof). Then it is easy to + serve this query. + +-> If we do not have this property stored explicitly, things get more + difficult. One could imagine a heuristic on the first n members of + a sequence. But doing this for every query is not feasible. It would + need to be computed ahead of time from the given concrete knowledge. 12. Published integer sequences not listed in the OEIS. +========================================================================= + +- This query is asking for a match between two indices, namely our own + and the OEIS. + +-> If we (a) can extract integer sequences from collected concrete + knowledge and symbolic knowledge and (b) have an index of the OEIS, this + should not be very difficult. + +-> As a starter, it might be sufficient to compare some first n + members of a sequence and check whether such a sequence is part of + OEIS. + +-> Of course, if a sequence is generated by some symbolic formula F, + this might proof difficult for various reasons. F might be hard to + compute and the representation of the OEIS different from ours. 13. Find all polynomials whose list of coefficients occurs as a subsequence of a specific OEIS sequence. +========================================================================= + +- This query is asking about (a) coefficients from polynomials and (b) + integer sequences. + +-> We've already discussed integer sequences in previous queries. + +-> Querying coefficients of polynomials might be tricky. It could be + extracted from symbolic knowledge, but easy this is not. + +-> One could imagine a particular schema for concrete knowledge that + stores specific polynomials. But at that point we are building a dedicated + index for polynomials, something that might be out of scope. -- GitLab