Since our approach to identifying candidates is large based on set theory, the algorithms for finding are also making heavy use of set operations, using essentially the same construction method for creating the set of optimization candidates as the definitions presented in \autoref{chap:candidates}.

\subsection{Graph Optimization}

Optimization of the theory graph happens one theory at a time, beginning at the outer most edges of the theory graphs, i.e. those theories that are not included by any other theory.

It may seem counter intuitive, since we are changing the theories before modifying the theories they depend upon, however our use of future lite code (see \autoref{sec:futurelite}) means that any change that might affect the outer theories should be discarded anyway, as it may be a desired part of the theory, even if it is not referenced in the theory itself, resulting in no effective gain for resolving inner theories first.

This is not true for the other direction, as removing unnecessary inclusions may significantly reduce the amount of dependencies that might limit our pruning of the inner theories.

As a result of these observations, it is a strictly better option to always prioritize outer theories and since the runtime will already will already be dominating the time needed for a sort, we can do so with little additional cost (see \autoref{sec:runtime}).

\subsection{Theory Optimization}

Optimization of a theory itself happens in two separate passes. In the first pass superfluous inclusions are detected and eliminated. In the second pass cleans up any remaining simply redundant inclusions. Both of these algorithms can be found below.

The reason for separating these passes is to avoid the removal of simply redundant inclusions, that will later have to be reinserted when eliminating a partially superfluous inclusion. An example for such a hazard is the example of an overlap between the two types of candidates in \autoref{fig:redundantoverlap}

\subsection{Future Lite Code}

\label{sec:futurelite}

In \autoref{sec:viability} we already mentioned the problem of future lite code. The obvious method for aquiring the future lite code is traversing the entire theory graph and whenever we find a theory inclusion, we make a note for the included theory that the theory we are currently traversing is part of its future code. Unfortunately this requires traversing the graph in its entirety.

For the actual algorithm we skip this part and instead put create a map of used theories in the future code, since this is the part we actually require. Since this means potentially going over every theory for every theory (or at least reasonably close to it), our runtime worst case for this part is quadratic in the number of theories in the graph.

\section{Pseudo Code}

\subsection{Finding simply redundant inclusions}

\label{sec:alg_sri}

The following pseudo code is for finding simply redundant inclusions (see: \autoref{sec:redinc}).

\begin{algorithm}[H]

\KwData{$theory$ = theory from our theory graph\\

pass changes that were already applied to the graph}

\KwResult{$redundantIncludes$ = set of simply redundant inclusions}

The basic idea behind this algorithm is to collect all those theories that are included indirectly by the current theory and throwing away all of those direct inclusions that are part of this set. This will leave us exactly with those inclusions that are not simply redundant, without changing the flattened theory graph.

Proof:\\

Simply redundant inclusions are those, that are both directly and indirectly included.\\

We get rid of those direct inclusions that are also indirect, so clearly all simply redundant inclusions must be gone.\\

Since theories are partially ordered by inclusion and inclusion paths are of finite depth, all indirect inclusions must ultimately be the result of a direct inclusion, so all relevant direct inclusions are still in the set.

\subsection{Finding superfluous inclusions}

\label{sec:alg_si}

The following pseudo code is for finding superfluous inclusions (see: \autoref{sec:superinc}).

\begin{algorithm}[H]

\KwData{theory = theory from our theory graph\\

futureUse = set of theories used by theories including $theory$ in future lite code\\

pass changes that were already applied to the graph}

\KwResult{replacements = map from theory inclusions to their replacement}

add candidate $\rightarrow$ neededCandidateIncludes to replacements\;

}

\KwRet superfluousIncludes\;

\caption{superfluousIncludes(theory, futureUse)}

}

\end{algorithm}

Note that in \autoref{sec:superinc} we made a destinction between purely and partially superfluous inclusions. However we do not need to make this distinction while searching for them, as we can search for them by using the criteria for generally superfluous inclusion. Since they only differ in the set of inclusions that needs to be retained and we write that set in our result anyway, both can be accomplished by the same routine.

\subsection{Optimizing theory}

The following pseudo code combines the two earlier routines to apply optimizations to a given theory.

\begin{algorithm}[H]

\KwData{theory = theory from our theory graph\\

futureUse = set of theories used by theories including $theory$ in future lite code\\

pass changes that were already applied to the graph}

\KwResult{replacements = map from theory inclusions to their replacement}

In order to optimize the theory graph we must first identify cases where it is possible to apply any optimizations. We call such theories \underline{optimization candidates} or just \underline{candidates}.

All candidates have in common that they are theories where inclusions can be removed or replaced with weaker inclusions, while still yielding a valid theory as a result without any undeclared symbols (assuming of course a valid theory graph at the start).

...

...

@@ -7,6 +8,7 @@ For something to be declared a candidates we will only require that the theory i

\section{Types of optimization candidates}

\subsection{Simply redundant inclusion}

\label{sec:redinc}

An inclusion of a theory $A\hookrightarrow{}C$ is \underline{simply redundant}, if $C$ includes a theory $B$, such that $B\hookrightarrow{}C$ (see \autoref{fig:redundantbasic}).

\providecommand\myxscale{3.9}

...

...

@@ -62,9 +64,21 @@ Simply redundant inclusions can be safely optimized by simply removing the redun

\label{fig:redundantoptimized}

\end{figure}

The simply redundant inclusions in a theory T are the set $I_R(T)=$\\$(\bigcup_{I \in DirectIncludes(T)}Includes(I))\cap DirectIncludes(T)$,\\

where $Includes(I)$ are the transitively included theories in I and $DirectIncludes(T)$ are the theories included directly in the theory T.

It is immediately obvious that removing a simply redundant inclusion I from a theory T also removes it from the above set via the intersection with $DirectIncludes(T)$.\\

Slightly less obvious is that doing so neither adds nor removes any other inclusions. This is because by design the set $\bigcup_{I \in DirectIncludes(T)}Includes(I))$ still includes I and any of its children and thus remains unchanged.

\subsection{Superfluous Inclusion}

\label{sec:superinc}

An inclusion of a theory $A\hookrightarrow{}B$ is \underline{superfluous}, if $B$ uses none of the constants declared in $A$. Such an inclusion can be \underline{purely superfluous} (\autoref{sec:puresi}) if it can be entirely removed, or \underline{partially superfluous} if it can be reduced to a subset of the theory inclusions in $A$\autoref{sec:partiallysi}.\\

In both of these cases the inclusion can be replaced or removed entirely, while still yielding a well formed theory. However this changes the resulting flattened theory graph and can invalidate theories that include the optimized theory.\\

The superfluous inclusions in a theory T are the set $I_{S}(T)=$\\$\{ I \in DirectInludes(T) | I \notin UsedTheories(T)\}$,\\

where $DirectIncludes(T)$ are the theories included directly in the theory T and UsedTheories(T) are those theories whose terms are used in T.

\subsubsection{Purely Superfluous Inclusion}

An inclusion of a theory $A\hookrightarrow{}B$ is \underline{purely superfluous}, if $B$ uses none of the constants in $A$, not even if they were declared in a theory $C$ such that $C\hookrightarrow{}A$.\\

\label{sec:puresi}

...

...

@@ -104,9 +118,12 @@ Purely superfluous inclusions can be removed while still retaining a valid theor

\label{fig:purelysuperfluousoptimized}

\end{figure}

The purely superfluous inclusions in a theory T are the set $I_{PuS}(T)=$\\$\{ I \in DirectInludes(T) | I \notin UsedTheories(T)\land Includes(I)\cap UsedTheories(T)\setminus Includes(UsedTheories(T)\setminus\{ T \})=\emptyset\}$,\\

where $Includes(I)$ are the transitively included theories in I, $DirectIncludes(T)$ are the theories included directly in the theory T and $UsedTheories(T)$ are those theories whose terms are used in T.

\subsubsection{Partially Superfluous Inclusion}

\label{sec:partiallysi}

An inclusion of a theory $A\hookrightarrow{}B$ is \underline{partially superfluous}, if $B$ uses none of the constants in $A$, but uses some declarations that were declared in one or more theory $C$ such that C is (transitively) included in A.\\

An inclusion of a theory $A\hookrightarrow{}B$ is \underline{partially superfluous}, if $B$ uses none of the constants in $A$, but uses some declarations that were declared in one or more theory $C$ such that C is (transitively) included in A. We describe theories that have declarations being used in a theory A, which includes them, as \underline{theories used by A}.\\

@@ -172,7 +189,11 @@ Partially superfluous inclusions can be optimized by identifying the used subset

\label{fig:partiallysuperfluousoptimized}

\end{figure}

The partially superfluous inclusions in a theory T are the set $I_{PaS}(T)=$\\$\{ I \in DirectInludes(T) | I \notin UsedTheories(T)\land Includes(I)\cap UsedTheories(T)\setminus Includes(UsedTheories(T)\setminus\{ T \})\neq\emptyset\}$,\\

where $Includes(I)$ are the transitively included theories in I, $DirectIncludes(T)$ are the theories included directly in the theory T and $UsedTheories(T)$ are those theories whose terms are used in T.

\section{Viability and dangers of candidate optimizations}

\label{sec:viability}

The candidates discussed earlier are cases in which inclusions can be changed, not necessarily where they should be changed, as the only criterion that makes a theory an optimization candidate is that an optimization can be performed while still retaining a valid theory. There is nothing that guarantees that this theory is also preferable to the original or even desirable at all to have.\\

The danger of performing the proposed optimizations varies between the type of candidate and the context, but overall application without at least some user guidance is not recommended.

...

...

@@ -270,4 +291,4 @@ It should be noted that all the cases discussed in this chapter assume the relat

As a rule of thumb we can assume that anything that was declared as a named structure is there for a reason and should therefore not be touched by our optimizations. Obviously this reasoning does not apply in all cases, but it is reason enough to avoid them in an automated tool.

In addition structures have proven to be especially difficult to consider for our future lite code, for reasons that will be discussed in \autoref{sec:futurelite}.

\ No newline at end of file

In addition structures have proven to be especially difficult to consider for our future lite code, for reasons that will be discussed in \autoref{sec:futurelite}.