Commit 862fdfbd authored by Dennis Müller's avatar Dennis Müller
Browse files

standardized "morphisms" and "view finder"

parent ed72c382
\begin{newpart}{DM}
We have seen how a viewfinder can be used for theory \emph{discovery} and finding constants with specific desired properties, but many other potential use cases are imaginable. The main problems to solve with respect to these is less about the algorithm or software design challenges, but user interfaces.
We have seen how a view finder can be used for theory \emph{discovery} and finding constants with specific desired properties, but many other potential use cases are imaginable. The main problems to solve with respect to these is less about the algorithm or software design challenges, but user interfaces.
The theory discovery use case described in Sec. \ref{sec:usecase} is mostly desirable in a setting where a user is actively writing or editing a theory, so the integration in jEdit is sensible. However, the across-library use case in Sec. \ref{sec:pvs} already would be a lot more useful in a theory exploration setting, such as when browsing available archives on MathHub~\cite{mathhub} or in the graph viewer integrated in \mmt ~\cite{RupKohMue:fitgv17}. Additional specialized user interfaces would enable or improve the following use cases:
\end{newpart}
\begin{itemize}
\item \textbf{Model-/Countermodel Finding:} If the codomain of a view is a theory representing a specific model, it would tell her that those
\item \textbf{Model-/Countermodel Finding:} If the codomain of a morphism is a theory representing a specific model, it would tell her that those
are \emph{examples} of her abstract theory.
Furthermore, partial views -- especially those that are total on some included theory -- could
Furthermore, partial morphisms -- especially those that are total on some included theory -- could
be insightful \emph{counterexamples}.
\item \textbf{Library Refactoring:} Given that the Viewfinder looks for \emph{partial} views, we can use it to find natural
\item \textbf{Library Refactoring:} Given that the view finder looks for \emph{partial} morphisms, we can use it to find natural
extensions of a starting theory. Imagine Jane removing the last of her axioms for ``beautiful sets'' --
the other axioms (disregarding finitude of her sets) would allow her to find e.g. both Matroids and
\emph{Ideals}, which would suggest to her to refactor her library accordingly.
Additionally, \emph{surjective} partial views would inform her, that her theory would probably better
Additionally, \emph{surjective} partial morphisms would inform her, that her theory would probably better
be refactored as an extension of the codomain, which would allow her to use all theorems and definitions
therein.
\item \textbf{Theory Generalization:} If we additionally consider views into and out of the theories found, this can make theory discovery even
more attractive. For example, a view from a theory of vector spaces intro matroids could inform Jane additionally,
\item \textbf{Theory Generalization:} If we additionally consider morphisms into and out of the theories found, this can make theory discovery even
more attractive. For example, a morphism from a theory of vector spaces intro matroids could inform Jane additionally,
that her beautiful sets, being matroids, form a generalization of the notion of linear independence in linear algebra.
\item \textbf{Folklore-based Conjecture:} If we were to keep book on our transfomations during preprocessing and normalization, we could use the found
views for translating both into the codomain as well as back from there into our starting theory.
morphisms for translating both into the codomain as well as back from there into our starting theory.
This would allow for e.g. discovering and importing theorems and useful definitions from some other library --
which on the basis of our encodings can be done directly by the viewfinder.
which on the basis of our encodings can be done directly by the view finder.
A useful interface might specifically prioritize views into theories on top of which there are many
A useful interface might specifically prioritize morphisms into theories on top of which there are many
theorems and definitions that have been discovered.
\end{itemize}
For some of these use cases it would be advantageous to look for views \emph{into} our working theory instead.
For some of these use cases it would be advantageous to look for morphisms \emph{into} our working theory instead.
Note that even though the algorithm is in principle symmetric, some aspects often depend on the direction -- e.g. how we preprocess the theories, which constants we use as starting points or how we treat and evaluate the resulting (partial) views (see Sections \ref{sec:algparams}, \ref{sec:normalizeintra} and \ref{sec:normalizeinter}).
Note that even though the algorithm is in principle symmetric, some aspects often depend on the direction -- e.g. how we preprocess the theories, which constants we use as starting points or how we aggregate and evaluate the resulting (partial) morphisms (see Sections \ref{sec:algparams}, \ref{sec:normalizeintra} and \ref{sec:normalizeinter}).
%%% Local Variables:
%%% mode: latex
......
We present a general MKM utility that given a \MMT theory and an \MMT library $\cL$ finds
partial and total views into $\cL$.
Such a view-finder can be used to drive various MKM applications ranging from theory classification to library merging and refactoring.
partial and total morphisms into $\cL$.
Such a view finder can be used to drive various MKM applications ranging from theory classification to library merging and refactoring.
We have presented the first and last of these and show that they are feasible. For the applications discussed but unrealized in this paper, we mainly need to determine the right application context and user interface.
\paragraph{Future Work}
The current view-finder is already efficient enough for the limited libraries we used for testing.
The current view finder is already efficient enough for the limited libraries we used for testing.
To increase efficiency, we plan to explore term indexing techniques~\cite{Graf:ti96} that support $1:n$ and even $n:m$ matching and unification queries.
The latter will be important for the library refactoring and merging applications which look for all possible (partial and total) morphisms in one or between two libraries.
As such library-scale operations will have to be run together with theory flattening to a fixed point and re-run upon every addition to the library, it will be important to integrate them with the \MMT build system and change management processes~\cite{am:doceng10,iancu:msc}.
......
\paragraph{Motivation}
``Semantic Search'' -- a very suggestive term, which is alas seriously under-defined --
has often been touted as the ``killer application'' of semantic technologies. With a view
finder, we can add another possible interpretation: searching mathematical ontologies
has often been touted as the ``killer application'' of semantic technologies. With a view finder,
we can add another possible interpretation: searching mathematical ontologies
(here modular theorem prover libraries) at the level of theories -- we call this \defemph{theory
classification}.
......@@ -57,7 +57,7 @@ of the fourth condition -- and maybe rethink the name.
% \end{figure}
We reduce the theory classification problem to that of finding theory morphisms (views)
between \MMT theories. $\cL$: given a query theory $Q$, the viewfinder computes all
between \MMT theories. $\cL$: given a query theory $Q$, the view finder computes all
(total) views from $Q$ into a library $\cL$, and returns presentations of target theories
and the assignments made by the views.
......
......@@ -84,7 +84,7 @@
practice requires authors who are familiar with source and target theories at the same
time, which limits the scalability of the manual approach.
To remedy this problem, we have developed a view-finder algorithm that automates theory
To remedy this problem, we have developed a view finder algorithm that automates theory
morphism discovery. In this paper we present an implementation in the MMT system and
show specific use cases. We focus on an application of \emph{theory discovery}, where a user can
check whether a (part of a) formal theory already exists in some library, potentially
......
......@@ -21,17 +21,17 @@ The common logical framework used for all the libraries at our disposal -- name
\end{newpart}
\subsection{Implementation}\label{sec:usecase}
The Viewfinder algorithm is implemented in the MMT system and exposed within the jEdit-IDE, allowing us to realize the use case
The view finder algorithm is implemented in the MMT system and exposed within the jEdit-IDE, allowing us to realize the use case
stated in the introduction. A screenshot of Jane's theory of beautiful sets is given in Figure \ref{fig:use:source}; it
is based on the (basic higher-order logic) foundation of the Math-in-the-Middle library (\cite{ODK:mitm:16} developed natively in MMT.
\begin{figure}[ht]\centering
\fbox{\includegraphics[width=\textwidth]{beautysource}}
\fbox{\includegraphics[width=\textwidth]{results}}
\caption{A Theory of ``Beautiful Sets'' in MMT Surface Syntax and Results of the Viewfinder}\label{fig:use:source}
\caption{A Theory of ``Beautiful Sets'' in MMT Surface Syntax and Results of the View Finder}\label{fig:use:source}
\end{figure}
Right-clicking anywhere within the theory allows Jane to select \cn{MMT} $\to$ \cn{Find\ Views\ to...} $\to$ \cn{MitM/smglom} (the main Math-in-the-Middle library), telling her (within less than one second) that two views have been found, the most promising of which points to the theory
Right-clicking anywhere within the theory allows Jane to select \cn{MMT} $\to$ \cn{Find\ Views\ to...} $\to$ \cn{MitM/smglom} (the main Math-in-the-Middle library), telling her (within less than one second) that two morphisms have been found, the most promising of which points to the theory
\cn{matroid\_theory} (see Figure \ref{fig:use:target}) in the library.
Note that the latter makes use of predefined propositions in its axioms and uses a type \cn{coll} for the collection of sets, while the former has the statements of the axioms directly in the theory and uses a predicate \cn{beautiful}. Additionally, the implication that beautiful sets (or sets in a matroid) are finite is stated as a logical formula in the former, while the latter uses the curry-howard correspondence.
......@@ -41,7 +41,7 @@ Note that the latter makes use of predefined propositions in its axioms and uses
\caption{The Theory of Matroids in the MitM Library}\label{fig:use:target}
\end{figure}
\section{Inter-Library Viewfinding}\label{sec:across}
\section{Inter-Library View Finding}\label{sec:across}
We have so far assumed one fixed meta-theory for all theories involved; we will now discuss the situation when looking for morphisms between theories in different libraries (and built on different foundations).
......@@ -64,7 +64,7 @@ PVS~\cite{pvs} is a proof assistant under active development, based on a higher-
In practice, theory parameters are often used in PVS for the signature of an abstract theory. For example, the theory of groups \cn{group\_def} in the NASA library has three theory parameters $(\cn T,\ast,\cn{one})$ for the signature, and includes the theory \cn{monoid\_def} with the same parameters; the axioms for a group are then formalized as a predicate on the theory parameters.
Given that the same practice is used in few other systems (if any), finding views without treating theory parameters in some way will not give us any useful results on these theories. We offer three approaches to handling these situations:
Given that the same practice is used in few other systems (if any), searching for morphisms without treating theory parameters in some way will not give us any useful results on these theories. We offer three approaches to handling these situations:
\begin{enumerate}
\item \emph{Simple treatment:} We can interpret references to theory parameters as free variables and turn them into holes. Includes of parametric theories with arguments are turned into simple includes.
......@@ -82,7 +82,7 @@ Given that the same practice is used in few other systems (if any), finding view
We conjecture that the second approach is most useful for inter-library search, since it most closely corresponds to formalizations of abstract theories in other systems. A problem here is that the newly introduced constants are not passed on to includes with the same arguments without additional tranformations.
The third approach would turn every occurence of e.g. group-related symbols into function applications, which is a rather rare practice in most other systems. However, since this treatment of theory parameters comes closest to the semantics of the parameters, we conjecture that it is the most useful approach for intra-library viewfinding between PVS theories.
The third approach would turn every occurence of e.g. group-related symbols into function applications, which is a rather rare practice in most other systems. However, since this treatment of theory parameters comes closest to the semantics of the parameters, we conjecture that it is the most useful approach for intra-library view finding between PVS theories.
\paragraph{} Additionally to the theory parameter related normalization, we use the following techniques:
......@@ -102,7 +102,7 @@ Given that the same practice is used in few other systems (if any), finding view
\subsection{Implementation}\label{sec:pvs}
We have tried the first two approaches regarding theory parameters -- i.e. the simple and covariant treatments -- to both inter- and intra-library viewfinding problems. Concretely, we have used a simple theory of Monoids in the Math-in-the-Middle foundation and the theory of Monoids in the NASA library as domains with the whole NASA library as target. The results are summarized in Figure \ref{fig:pvsresults}.
We have tried the first two approaches regarding theory parameters -- i.e. the simple and covariant treatments -- to both inter- and intra-library view finding problems. Concretely, we have used a simple theory of Monoids in the Math-in-the-Middle foundation and the theory of Monoids in the NASA library as domains with the whole NASA library as target. The results are summarized in Figure \ref{fig:pvsresults}.
\begin{figure}[ht]\centering
\begin{tabular}{l | l || c | c}
......@@ -112,7 +112,7 @@ We have tried the first two approaches regarding theory parameters -- i.e. the s
NASA/monoid & covariant & 1077 & 542 \\
MitM/monoid & covariant & & \\
\end{tabular}
\caption{Results of Inter- and Intra-Library Viewfinding in the PVS NASA Library}\label{fig:pvsresults}
\caption{Results of Inter- and Intra-Library View Finding in the PVS NASA Library}\label{fig:pvsresults}
\end{figure}
Most of the results in the simple MitM$\to$NASA case are artifacts of the theory parameter treatments -- in fact only two of the 17 results seem accurate. In the covariant case, the first $n$ results look meaningful, the rest are spurious small morphisms that can easily be automatically identified as such.
......@@ -126,7 +126,7 @@ With a theory from the NASA library as domain, the results are already too many
\fbox{\includegraphics[width=\textwidth]{pvs}}
\caption{Searching for Commutative Operators in PVS}\label{fig:use:pvs}
\end{figure}
This example also hints at a way to iteratively improve the results of the viewfinder: since we can find properties like commutativity and associativity, we can use the results to in turn inform a better normalization of the theory by exploiting these properties. This in turn would potentially allow for finding more views.
This example also hints at a way to iteratively improve the results of the view finder: since we can find properties like commutativity and associativity, we can use the results to in turn inform a better normalization of the theory by exploiting these properties. This in turn would potentially allow for finding more morphisms.
%%% Local Variables:
%%% mode: latex
......
......@@ -2,7 +2,7 @@ Let $C$ be a corpus of theories with the same fixed meta-theory $M$.
We call the problem of finding theory morphisms between theories of $C$ the \defemph{view finding problem} and an algorithm that solves it a \defemph{view finder}.
Note that a view finder is sufficient to solve the theory classification use case from the introduction:
Jane provides a $M$-theory $Q$ of beautiful sets, the view finder computes all (total) views from $Q$ into $C$, and returns presentations of target theories and the assignments made by the morphisms.
Jane provides a $M$-theory $Q$ of beautiful sets, the view finder computes all (total) morphisms from $Q$ into $C$, and returns presentations of target theories and the assignments made by the morphisms.
\paragraph{Efficiency Considerations}
The cost of this problem quickly explodes.
......@@ -178,9 +178,9 @@ The symbols will hence be encoded in the component $t$ instead of the list $s$.
\item \emph{The length of $s$ in the short abstract syntax tree:} By choosing a minimal length for $s$, we can guarantee that only morphisms will be produced that relate a minimal number of distinct constants.
\end{itemize}
\paragraph{Picking starting theories:} If we try to find views between whole collections of theories, we can obviously disregard all theories that are already included in some other theory in our collections, since we work with a normalized (and dependency-closed) version of a theory. Consequently, by only using maximal theories we do not find any fewer views but speed up the search significantly.
\paragraph{Picking starting theories:} If we try to find morphisms between whole collections of theories, we can obviously disregard all theories that are already included in some other theory in our collections, since we work with a normalized (and dependency-closed) version of a theory. Consequently, by only using maximal theories we do not find any fewer morphisms but speed up the search significantly.
\paragraph{Storing Encodings} Finally, for computing the encodings of a theory we only need to know the symbols to be fixed in the component $t$ of an abstract syntax tree, for which only the meta-theories of the theories are relevant. They also determine the specific preprocessings and translations we want to likely use. Since there is only a small number of meta-theories incolved that are relevant in practice, we can store and retrieve the encodings for the most important situations. Since computing the encodings (as well as sorting the theories in a library by their dependencies) is the most expensive part of the algorithm, this -- once computed and stored -- makes the viewfinding process itself rather efficent.
\paragraph{Storing Encodings} Finally, for computing the encodings of a theory we only need to know the symbols to be fixed in the component $t$ of an abstract syntax tree, for which only the meta-theories of the theories are relevant. They also determine the specific preprocessings and translations we want to likely use. Since there is only a small number of meta-theories incolved that are relevant in practice, we can store and retrieve the encodings for the most important situations. Since computing the encodings (as well as sorting the theories in a library by their dependencies) is the most expensive part of the algorithm, this -- once computed and stored -- makes the view finding process itself rather efficent.
%%% Local Variables:
%%% mode: latex
......
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment