algorithm.tex 18.3 KB
Newer Older
Michael Banken's avatar
Michael Banken committed
1
\chapter{Algorithm}
Michael Banken's avatar
Michael Banken committed
2

Michael Banken's avatar
Michael Banken committed
3
\section{Idea}
Michael Banken's avatar
Michael Banken committed
4
5
6
7
8
9
10
11
12
13
14
15
Since our approach to identifying candidates is large based on set theory, the algorithms for finding are also making heavy use of set operations, using essentially the same construction method for creating the set of optimization candidates as the definitions presented in \autoref{chap:candidates}.

\subsection{Graph Optimization}
Optimization of the theory graph happens one theory at a time, beginning at the outer most edges of the theory graphs, i.e. those theories that are not included by any other theory.

It may seem counter intuitive, since we are changing the theories before modifying the theories they depend upon, however our use of future lite code (see \autoref{sec:futurelite}) means that any change that might affect the outer theories should be discarded anyway, as it may be a desired part of the theory, even if it is not referenced in the theory itself, resulting in no effective gain for resolving inner theories first.

This is not true for the other direction, as removing unnecessary inclusions may significantly reduce the amount of dependencies that might limit our pruning of the inner theories.

As a result of these observations, it is a strictly better option to always prioritize outer theories and since the runtime will already will already be dominating the time needed for a sort, we can do so with little additional cost (see \autoref{sec:runtime}).

\subsection{Theory Optimization}
Michael Banken's avatar
Michael Banken committed
16
Optimization of a theory itself happens in two separate passes. The first pass detects and eliminates superfluous inclusions. The second pass cleans up any remaining simply redundant inclusions.
Michael Banken's avatar
Michael Banken committed
17
18
19

The reason for separating these passes is to avoid the removal of simply redundant inclusions, that will later have to be reinserted when eliminating a partially superfluous inclusion. An example for such a hazard is the example of an overlap between the two types of candidates in \autoref{fig:redundantoverlap}

Michael Banken's avatar
Michael Banken committed
20
21
\subsubsection{First pass - Eliminating superfluous inclusions}

22
23
24
In the first pass we first take a selection of optimization candidates, namely those theories that are directly included, but not used in the current theory. These are the theories that need to be replaced.

The algorithm then computes all the necessary replacements by looking at all indirect inclusions that result   from the inclusion to be replaced. The necessary replacements are those that are used by the theory or its future and are not already included by the theory or by one of its necessary dependencies. 
Michael Banken's avatar
Michael Banken committed
25
26
27

For this algorithm it is not necessary to make a distinction between purely and partially superfluous inclusions, as the pure case is entirely subsumed by the partial case with an empty set of necessary inclusions.

28
29
30
31
32
33
34
35
36
As we discussed in \autoref{sec:puresivia} this pass cannot be expected to remove all the inclusions that are unneeded for retaining a valid graph, but what we do expect it to leave a valid graph valid.

Proof sketch:
\begin{itemize}
\item Any theory used by the current theory or its future is either directly included by the current theory or indirectly.
\item Any directly included theory that is necessary isn't a candidate and will therefore remain.
\item Any indirectly included theory is either included by a theory that is retained or by one that is being replaced. Since the replacement of a theory inclusion is the necessary subset of its included theories, all necessary dependencies will remain included.
\end{itemize}

Michael Banken's avatar
Michael Banken committed
37
38
\subsubsection{Second pass - Eliminating simply redundant inclusions}

39
The idea behind the second pass is to collect all those theories that are included indirectly by the current theory and throwing away all of those direct inclusions that are part of this set. This will leave us exactly with those inclusions that are not simply redundant, without changing the flattened theory graph.
Michael Banken's avatar
Michael Banken committed
40

41
42
43
Proof sketch:
\begin{itemize}
\item Simply redundant inclusions are those, that are both directly and indirectly included.
Michael Banken's avatar
Michael Banken committed
44
We get rid of those direct inclusions that are also indirect, so clearly all simply redundant inclusions must be gone.
45
46
47
48
\item For the invariance of the flattened theory graph, we must first assume that theories are partially ordered by inclusion and inclusion paths are of finite depth, ie. that cyclic inclusions are not present.\\
If we do so, all indirect inclusions must ultimately be the result of a direct inclusion, so all relevant direct inclusions are still in the set.\\
If these assumptions are wrong however, not only could we run into problems, but we will inevitably do so if the current theory is part of the cycle. It is therefore best to only perform this operation on a graph that satisfies our assumption.
\end{itemize}
Michael Banken's avatar
Michael Banken committed
49

Michael Banken's avatar
Michael Banken committed
50
51
52
53
\subsection{Future Lite Code}
\label{sec:futurelite}
In \autoref{sec:viability} we already mentioned the problem of future lite code. The obvious method for aquiring the future lite code is traversing the entire theory graph and whenever we find a theory inclusion, we make a note for the included theory that the theory we are currently traversing is part of its future code. Unfortunately this requires traversing the graph in its entirety.

54
55
56
57
For the actual algorithm we skip this part and instead put create a map of used theories in the future code, since this is the part we actually require. Since this means potentially going over every theory for every theory (or at least reasonably close to it), our worst case runtime for this part of the algorithm is quadratic in the number of theories in the graph.

This runtime is especially problematic, since we need to update the the information after every step of the optimization to make proper use of any improvements of the graph. We can demonstrate this by considering the following example.

58
59
\begin{figure}[!htb]
\centering
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
\begin{tikzpicture}[node distance=2cm]\footnotesize
\node[thy] (bottom) {\begin{tabular}{l}
                             \textsf{bottom}\\\hline
                             X\\\hline
                             ...
                           \end{tabular}};
\node[thy,  above of = bottom] (middle) {\begin{tabular}{l}
                             \textsf{middle}\\\hline
                             Y\\\hline
                             not X
                           \end{tabular}};
\node[thy, above of = middle] (top) {\begin{tabular}{l}
                             \textsf{top}\\\hline
                             ...\\\hline
                             X, but not Y
                           \end{tabular}};
\draw[include] (bottom) -- (middle);                 
\draw[include] (middle)  -- (top);
\end{tikzpicture}
\caption{Example of a graph where optimizing changes the future.}
\label{fig:changeablefuture}
\end{figure}

As we can immediately see in \autoref{fig:changeablefuture}, it is possible to replace the partially superfluous inclusion of middle in top with bottom. The result is the changed graph in \autoref{fig:changedfuture}.

85
86
\begin{figure}[!htb]
\centering
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
\begin{tikzpicture}[node distance=2cm]\footnotesize
\node[thy] (bottom) {\begin{tabular}{l}
                             \textsf{bottom}\\\hline
                             X\\\hline
                             ...
                           \end{tabular}};
\node[thy,  above of = bottom] (middle) {\begin{tabular}{l}
                             \textsf{middle}\\\hline
                             Y\\\hline
                             not X
                           \end{tabular}};
\node[thy, above of = middle] (top) {\begin{tabular}{l}
                             \textsf{top}\\\hline
                             ...\\\hline
                             X, but not Y
                           \end{tabular}};
\draw[include] (bottom) -- (middle); 
\draw[include, bend left] (bottom) edge (top);
\end{tikzpicture}
\caption{Example of a graph where optimizing changes the future.}
\label{fig:changedfuture}
\end{figure}

In the changed graph we can now optimize the purely superfluous inclusion of bottom in the theory middle. We could not have done this in the earlier graph, since X and therefore bottom was used in middle's future. Thus we need an updated version of the future to make full use of earlier optimizations.

To somewhat mitigate the effect this has on the performance, the future is created step by step, as we work our way from the outer edges of the graph towards the core.
Michael Banken's avatar
Michael Banken committed
113

Michael Banken's avatar
Michael Banken committed
114
\section{Pseudo Code}
Michael Banken's avatar
Michael Banken committed
115
116
117

\subsection{Optimizing graph}

118
The following code applies the optimizations to the entire theory graph.
Michael Banken's avatar
Michael Banken committed
119
120

\begin{algorithm}[H]
Michael Banken's avatar
Michael Banken committed
121
122
123
\label{alg:graph}
 \KwData{theoryGraph = theory graph to be optimized}
 \KwResult{replacements = map from theories to maps of theory inclusions to their replacement}
Michael Banken's avatar
Michael Banken committed
124
 \Begin{
Michael Banken's avatar
Michael Banken committed
125
126
127
128
129
 	futureUses := empty map\;
 	replacements := empty map\;
 	theoryGraph := sort(theoryGraph)\;
 	theoryGraphRev := reverse(theoryGraph)\;
 	\For{theory $\leftarrow$ theoryGraph}{
Michael Banken's avatar
Michael Banken committed
130
 		add theory $\rightarrow$ usedTheories(theory) $\cup$ futureUses(theory) to futureUses\;
Michael Banken's avatar
Michael Banken committed
131
132
 		add theory $\rightarrow$ optimize(theory, futureUses(theory)) to replacements\;
 		\For{include $\leftarrow$ includes(theory)}{
Michael Banken's avatar
Michael Banken committed
133
 			add include $\rightarrow$ futureUses(include)$\cup$usedTheories(include)$\cup$futureUses(theory) to futureUses\;
Michael Banken's avatar
Michael Banken committed
134
135
136
 		}
 		\KwRet replacements \;
 	}
Michael Banken's avatar
Michael Banken committed
137
 }
Michael Banken's avatar
Michael Banken committed
138
 \caption{optimizeGraph(theorygraph)}
Michael Banken's avatar
Michael Banken committed
139
140
\end{algorithm}

Michael Banken's avatar
Michael Banken committed
141
\subsection{Optimizing theory}
Michael Banken's avatar
Michael Banken committed
142

143
The following pseudo code applies optimizations to a given theory.
Michael Banken's avatar
Michael Banken committed
144

Michael Banken's avatar
Michael Banken committed
145
146
147
148
149
150
151
152
153
154
155
156
157
158
\begin{algorithm}[H]
 \KwData{theory = theory from our theory graph\\
 futureUse = set of theories used by theories including $theory$ in future lite code\\
 pass changes that were already applied to the graph}
 \KwResult{replacements = map from theory inclusions to their replacement}
 \Begin{
 	replacements = superfluousIncludes(theory, futureUse)\;
 	\For{removal $\leftarrow$ redundantIncludes(theory, futureUse)}{
 		add removal $\rightarrow$ $\emptyset$ to replacements\;
 	}
 	\KwRet replacements\;
 } 
 \caption{optimize(theory, futureUse)}
\end{algorithm}
Michael Banken's avatar
Michael Banken committed
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185

\subsection{Finding superfluous inclusions}
\label{sec:alg_si}
The following pseudo code is for finding superfluous inclusions (see: \autoref{sec:superinc}).

\begin{algorithm}[H]
 \KwData{theory = theory from our theory graph\\
 futureUse = set of theories used by theories including $theory$ in future lite code\\
 pass changes that were already applied to the graph}
 \KwResult{replacements = map from theory inclusions to their replacement}
 \Begin{
 replacements := empty map\;
 futureUsedTheories := usedTheories(theory) $\cup$ futureUse\;
 candidates := DirectInludes(theory) $\setminus$ futureUsedTheories\;
 \For{candidate $\leftarrow$ candidates}{
 		neededCandidateIncludes := Includes(candidate) $\cap$ futureUsedTheories \;
		remainingIncludes := Includes((directIncludes(theory) $\cap$ futureUsedTheories))$\cup$ (directIncludes(theory) $\cap$ futureUsedTheories) \;
		neededCandidateIncludes := neededCandidateIncludes $\setminus$ remainingIncludes\;
		add candidate $\rightarrow$ neededCandidateIncludes to replacements\;
 }
 \KwRet superfluousIncludes\;
 \caption{superfluousIncludes(theory, futureUse)}
 }
\end{algorithm}

Note that in \autoref{sec:superinc} we made a destinction between purely and partially superfluous inclusions. However we do not need to make this distinction while searching for them, as we can search for them by using the criteria for generally superfluous inclusion. Since they only differ in the set of inclusions that needs to be retained and we write that set in our result anyway, both can be accomplished by the same routine.

Michael Banken's avatar
Michael Banken committed
186
187
188
\subsection{Finding simply redundant inclusions}
\label{sec:alg_sri}
The following pseudo code is for finding simply redundant inclusions (see: \autoref{sec:redinc}).
Michael Banken's avatar
Michael Banken committed
189
190

\begin{algorithm}[H]
Michael Banken's avatar
Michael Banken committed
191
 \KwData{$theory$ = theory from our theory graph\\
Michael Banken's avatar
Michael Banken committed
192
 pass changes that were already applied to the graph}
Michael Banken's avatar
Michael Banken committed
193
 \KwResult{$redundantIncludes$ = set of simply redundant inclusions}
Michael Banken's avatar
Michael Banken committed
194
 \Begin{
Michael Banken's avatar
Michael Banken committed
195
 $subIncludes := \emptyset$\;
Michael Banken's avatar
Michael Banken committed
196
197
198
 $redundantIncludes := \emptyset$\;
 \For{$i \leftarrow directIncludes(theory)$}{
	$subIncludes := subIncludes \cup includes(i)$\; 
Michael Banken's avatar
Michael Banken committed
199
 }
Michael Banken's avatar
Michael Banken committed
200
201
202
203
204
205
206
 \For{$i \leftarrow directIncludes(theory)$}{
 	\If{$i \in subIncludes$} {
		$redundantIncludes := redundantIncludes \cup \{i\}$\;
	}
 }
 \KwRet $redundantIncludes$\;
 }
Michael Banken's avatar
Michael Banken committed
207
 \caption{simplyRedundantIncludes(theory)}
Michael Banken's avatar
Michael Banken committed
208
209
\end{algorithm}

Michael Banken's avatar
Michael Banken committed
210
211
212
\section{Performance analysis}
Since the performance will be heavily dependent on the size of the graph to be optimized, we will measure runtime and memory requirements depending on the variable t, which denotes the number of theories in the graph.

Michael Banken's avatar
Michael Banken committed
213
\subsection{Runtime}
Michael Banken's avatar
Michael Banken committed
214
\label{sec:runtime}
Michael Banken's avatar
Michael Banken committed
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
Exact runtime is specific to the library implementations of used functions, but after looking at the relevant Scala documentation we can make a few helpful assumptions \cite{scaladoccolperf}.

With effective constant time to lookup, add and remove in a hashset, we can deduce that likely runtimes for the set-operations $\cup$, $\cap$ and $\setminus$ are O(n), where n is the number of elements in the sets involved. 

\subsubsection{First pass}
\begin{algorithm}[H]
 \Begin{
 replacements := empty map \tcp*{1}
 futureUsedTheories := usedTheories(theory) $\cup$ futureUse \tcp*{t}
 candidates := DirectInludes(theory) $\setminus$ futureUsedTheories \tcp*{t}
 \For{candidate $\leftarrow$ candidates\tcp*{$\times t$}}{
 		neededCandidateIncludes := Includes(candidate) $\cap$ futureUsedTheories \tcp*{t}
		remainingIncludes := Includes((directIncludes(theory) $\cap$ futureUsedTheories))$\cup$ (directIncludes(theory) $\cap$ futureUsedTheories) \tcp*{5$\cdot{}$t}
		neededCandidateIncludes := neededCandidateIncludes $\setminus$ remainingIncludes\tcp*{t}
		add candidate $\rightarrow$ neededCandidateIncludes to replacements\tcp{1}
 }
 \KwRet superfluousIncludes\tcp{1}
 \caption{superfluousIncludes(theory, futureUse) - runtime}
 }
\end{algorithm}

The loop runs up to $t$ times over up to $7 \cdot t+1$.\\
This results in an overall worst case performance of $7\cdot t^2+t+3 = O(t^2)$.

\subsubsection{Second pass}

\begin{algorithm}[H]
 \Begin{
 $subIncludes := \emptyset$\tcp*{1}
 $redundantIncludes := \emptyset$\tcp*{1}
 \For{$i \leftarrow directIncludes(theory)$\tcp*{$\times t +t$}}{
	$subIncludes := subIncludes \cup includes(i)$\tcp*{t}
 }
 \For{$i \leftarrow directIncludes(theory)$\tcp*{$\times t +t$}}{
 	\If{$i \in subIncludes$\tcp*{1}} {
		$redundantIncludes := redundantIncludes \cup \{i\}$\tcp*{t}
	}
 }
 \KwRet $redundantIncludes$\tcp*{1}
 \caption{simplyredundantIncludes(theory) - runtime}
 }
\end{algorithm}

The first loop runs up to $t$ times over up to $t$.\\
The second loop runs up to $t$ times over up to $t+1$.\\
This results in an overall worst case performance of $2\cdot t^2+t+3 = O(t^2)$.

\subsubsection{Theory optimization}

\begin{algorithm}[H]
 \Begin{
 	replacements = superfluousIncludes(theory, futureUse)\tcp*{$t^2$}
 	\For{removal $\leftarrow$ redundantIncludes(theory, futureUse)\tcp*{$\times t + t^2$}}{
 		add removal $\rightarrow$ $\emptyset$ to replacements\tcp*{$1$}
 	}
 	\KwRet replacements\tcp*{$1$}
 } 
 \caption{optimize(theory, futureUse)}
\end{algorithm}

The loop runs up to $t$ times over 1.\\
This results in an overall worst case performance of $2\cdot O(t^2)+t+1 = O(t^2)$.

\subsubsection{Graph optimization}

\begin{algorithm}[H]
 \Begin{
 	futureUses := empty map\tcp*{1}
 	replacements := empty map\tcp*{1}
 	theoryGraph := sort(theoryGraph)\tcp*{$t^2$}
 	theoryGraphRev := reverse(theoryGraph)\tcp*{$t$}
 	\For{theory $\leftarrow$ theoryGraph\tcp*{$\times t$}}{
 		add theory $\rightarrow$ usedTheories(theory) $\cup$ futureUses(theory) to futureUses\tcp*{$t$}
 		add theory $\rightarrow$ optimize(theory, futureUses(theory)) to replacements\tcp*{$O(t^2)$}
 		\For{include $\leftarrow$ includes(theory)\tcp*{$\times t$}}{
 			add include $\rightarrow$ futureUses(include)$\cup$usedTheories(include)$\cup$futureUses(theory) to futureUses\tcp*{$t$}
 		}
 		\KwRet replacements \tcp*{1}
 	}
 }
 \caption{optimizeGraph(theorygraph) - runtime}
\end{algorithm}

The inner loop runs up to $t$ times over up to $t$.\\
The outer loop runs up to $t$ times over up to $t+O(t^2+)+t^2$.\\
This results in an overall worst case performance of $t^2+t+t\cdot (t+O(t^2)+t^2) +3= O(t^3)$.
301

Michael Banken's avatar
Michael Banken committed
302
303
304
Since the total number of theories can be quite large a cubic runtime is hardly ideal. However it should be noted that the worst case requires the average theory to include most of the other theories.

In graphs where no theory includes more than the square root of the number of theories, the runtime should remain within quadratic bounds.
305

Michael Banken's avatar
Michael Banken committed
306
\subsection{Memory}
Michael Banken's avatar
Michael Banken committed
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
Making precise assumptions about the memory usage is even harder, as the documentation is insufficiently specific. We will therefore assume that the memory required by a HashSet or HashMap is in the same order of magnitude as it's elements.

\subsubsection{First pass}
\begin{algorithm}[H]
 \KwData{\\
 theory = O(1)\\
 futureUse = O(t)\\
 replacements = O($t^2$)\\
 futureUsedTheories = O(t)\\
 candidates = O(t)\\
 neededCandidateIncludes = O(t)\\
 remainingIncludes = O(t)\\
 neededCandidateIncludes = O(t)
 }
 \caption{superfluousIncludes(theory, futureUse) - memory}
\end{algorithm}

Assuming none of the library functions need larger memory, the overall memory usage lies in O($t^2$).

\subsubsection{Second pass}
\begin{algorithm}[H]
 \KwData{\\
 theory = O(1)\\
 subIncludes = O(t)\\
 redundantIncludes = O(t)\\
 $i_{first\ loop}$ = O(1)\\
 $i_{second\ loop}$ = O(1)
 }
 \caption{simplyredundantIncludes(theory) - memory}
\end{algorithm}

Assuming none of the library functions need larger memory, the overall memory usage lies in O(t).

\subsubsection{Theory optimization}
\begin{algorithm}[H]
 \KwData{theory = theory from our theory graph\\
 futureUse = set of theories used by theories including $theory$ in future lite code\\
 pass changes that were already applied to the graph}
 \KwResult{\\
 	replacements = O($t^2$)\\
 	superfluousIncludes(theory, futureUse) = O($t^2$)\\
 	removal = O(t)\\
 	redundantIncludes(theory, futureUse) = O(t)
 } 
 \caption{optimize(theory, futureUse) - memory}
\end{algorithm}

Assuming none of the library functions need larger memory, the overall memory usage lies in O($t^2$).

\subsubsection{Graph optimization}
\begin{algorithm}[H]
\label{alg:graph}
 \KwData{\\
 	theoryGraph = O(t)\\
 	futureUses = O($t^2$)\\
 	replacements = O($t^2$)\\
 	theoryGraphRev = O(t)\\
 	theory = O(1)\\
 	optimize(theory, futureUse) = O($t^2$)
 }
 \caption{optimizeGraph(theorygraph) - memory}
\end{algorithm}

Assuming none of the library functions need larger memory, the overall memory usage lies in O($t^2$).