viewfinder.tex 2.15 KB
Newer Older
Dennis Müller's avatar
Dennis Müller committed
The aim is to find typing-preserving morphisms between theories, i.e. given a constant $C:t$ in a theory $T_1$, we want to find a view $V:T_1\to T_2$ such that if $V(C)=C'$ and $C':T'\in T_2$, we have $V(t)=t'$. Correspondingly, we need to consider the types of the constants in our two theories, which we assume to be flat.

To run a proper unficiation algorithm is in our situation infeasible, since the flat version of a theory can become prohibitively large (and obviously finding two unifiable types in two theories is a search problem quadratic in the number of constants). To solve that problem, we first preprocess our theories such that pre-selecting plausibly ``unifiable'' constants becomes as fast as possible.

\paragraph{} We do so by considering the syntax tree of the type $t$ of a constant $C$ and replacing the leaves by an abstract representation $\cn{tree}(C)$. We can eliminate variables by replacing them by their De Bruijn index. We can take care of symbol references by considering them like a separate set of variables, replacing them by their De Brujin indices and storing the symbol's names in a list $\cn{syms}(C)$. 

As a result, we get a pair $(\cn{tree}(C),\cn{syms}(C))$. Now an assignment $V(C)=D$ is valid, iff $\cn{tree}(C)=\cn{tree}(D)$, the lists $\cn{syms}(C)$ and $\cn{syms}(D)$ have the same length and their pairwise assignments $V(\cn{syms}(C)_i)=\cn{syms}(D)_i$ are all valid.

Furthermore, since we only need $\cn{tree}(C)$ for an equality check, we can immediately replace $\cn{tree}(C)$ by an integer hashcode $\cn{h}(C)$.

\paragraph{} Now given a constant $C\in T_1$, we can find valid matches in $T_2$ by computing the pairs for each constant in $T_1$ and $T_2$, taking the list of constants $C'\in T_2$ with $\cn{h}(C')=\cn{h}(C)$ and recursing into their respective lists of symbols.

\ednote{talk about: additional preprocessing/normalization, using alignments/arbitrary translations before hashing, preselecting starting points (e.g. axioms only, length of symbol-list), using multiple representations for the same constant (e.g. with all definitions expanded), hashing fixed symbols (meta-theory) along with the syntax trees, storing/reusing hashes}