% TeX root = main.tex
This chapter presents the implementation of partial evaluation with
localised property-based abstraction as introduced in \Cref{ch:theory}. We
extend \gls{koat}'s analysis of non-probabilistic programs with a new
implementation of control-flow refinement via partial evaluation, and we
implement partial evaluation for probabilistic programs. In
\Sref{sec:implementation} we describe the implementation using pseudo-code,
focusing mostly on the practical aspects of partial evaluation.
\Sref{sec:technical_details} briefly describes the technical details of the
implementation. Finally, in \Sref{sec:performance} we analyse the performance
of the new implementation.

\section{Implementation}\label{sec:implementation}
First, we present two variants for selecting locations on the loops such that
the property from Lemma \ref{lem:localized_pba_finite} for the localised
property-based abstraction is satisfied. Second, we introduce the heuristic
for property generation that was put forward by \citeauthor{Domenech19}
\cite{Domenech19}. Third, we describe the technique used during the analysis
of a program to select the components to which control-flow refinement via
partial evaluation is then applied, similar to the technique presented by
\citeauthor{giesl2022lncs} \cite{giesl2022lncs}. Finally, an implementation of
unfolding and partial evaluation on subsets of transitions is presented as a
pseudo-code algorithm.

\subsection{Selecting loops for abstraction}\label{sec:loop_heads}
Recall Lemma \ref{lem:localized_pba_finite}: as long as the localised
abstraction function has, on every loop, one location at which every version
is abstracted onto a finite abstraction space, the partial evaluation graph is
finite. The naive approach to ensure this property is to use a global
abstraction function with a finite abstraction space. However, when using
property-based abstraction, every abstraction involves many entailment checks,
each of which requires a call to an \gls{smt}-solver in \gls{koat}. These calls
are expensive and, as demonstrated in Example \ref{ex:global_abstr}, a global
abstraction function does not yield the smallest partial evaluation graphs,
which makes the subsequent analysis unnecessarily slow.

In our implementation, loops are detected with the algorithm described by
\citeauthor{johnson1975} \cite{johnson1975}. The algorithm returns loops in
which the first location is the smallest location with respect to some
arbitrary ordering. By ordering the locations topologically from the start
location, the first location on every loop is the one closest to the start
location, which is the natural choice for the loop-head.
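
To illustrate this choice, the following \gls{ocaml} sketch selects, for every
loop returned by the cycle enumeration, the location that comes first in a
precomputed topological order. The integer location type and the
\texttt{topo\_index} function are hypothetical simplifications of
\gls{koat}'s internal graph representation.

\begin{verbatim}
(* A minimal sketch: locations are plain integers and a loop is the list of
   locations on one cycle, as returned by Johnson's algorithm. *)
type location = int

(* [loop_heads topo_index loops] picks for every loop the location that comes
   earliest in the topological order [topo_index] (a lower index means closer
   to the start location). *)
let loop_heads (topo_index : location -> int) (loops : location list list) :
    location list =
  List.filter_map
    (fun loop ->
      match loop with
      | [] -> None
      | l :: ls ->
          Some
            (List.fold_left
               (fun best l' ->
                 if topo_index l' < topo_index best then l' else best)
               l ls))
    loops
\end{verbatim}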

The loops and loop-heads found by this method can be used for localised
property-based abstraction. We propose a second variant that minimises the
number of loop-heads by computing a feedback vertex set of the graph. Reducing
the number of loop-heads makes the subsequent partial evaluation faster, and it
makes the partial evaluation graph more precise in the mostly linear parts of
the program, where abstraction is not necessary.

The problem of finding a minimal \textit{Feedback Vertex Set} (also called a
\textit{Feedback Node Set}) is one of Karp's 21 NP-complete problems
\cite{karp2010reducibility}.

\begin{definition}[Feedback Vertex Set Problem]
    Given a directed graph $G = (V, E)$, find a subset $R \subseteq V$ of
    minimum size such that every (directed) cycle of $G$ contains a node in $R$.
\end{definition}

Since the loops have already been detected with \citeauthor{johnson1975}'s
algorithm, the problem can instead be formulated as a hitting set problem.

\begin{definition}[Hitting Set Problem]
    For a given finite collection of finite sets $S = \{S_1, \dots,S_n\}$, where
    each $S_i \subseteq U$ for some finite set $U$, find a minimal set $R$ such
    that $R\intersect S_i \neq \emptyset$ for all $i = 1,\dots,n$.
\end{definition}

An instance of the hitting set problem can be solved by encoding it as an
\gls{ilp} and solving it with an \gls{smt}-solver. For loops $L_1, \dots, L_N$,
we formulate the following \gls{ilp} instance, where $x_{\ell_0},
\dots,x_{\ell_n}$ are variables, one for every location in the program.

\begin{mini!}
    {}{\sum_{i=0}^{n} x_{\ell_i}\label{eq:ilp1-goal}}{\label{eq:ilp-vfs}}{}
    \addConstraint{\sum_{\ell_i \in L_j} x_{\ell_i}}{>
    0,\protect\quad\label{eq:lip1-hit}}{j = 1,\dots,N}
    \addConstraint{ x_{\ell_i} }{\geq 0,\protect\quad\label{eq:ilp-geq0}}{i = 0,\dots,n}
    \addConstraint{ x_{\ell_i} }{\leq 1,\protect\quad\label{eq:ilp-leq1}}{i = 0,\dots,n}
\end{mini!}

The location variables are constrained to $0 \leq x_{\ell_j}\leq 1$ for all
$\ell_j \in \Loc_\Prog$ by \eqref{eq:ilp-geq0} and \eqref{eq:ilp-leq1}. For a
variable assignment $\sigma$, a value $\sigma(x_{\ell_j}) = 1$ indicates that
the location $\ell_j$ is part of the hitting set, whereas $\sigma(x_{\ell_j}) =
0$ indicates that it is not. The constraints \eqref{eq:lip1-hit} guarantee that
at least one location on every loop is selected, and the optimisation goal
\eqref{eq:ilp1-goal} minimises the number of selected locations.
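
For illustration, the following \gls{ocaml} sketch solves such a hitting set
instance by brute force over candidate sets of increasing size. It is only a
naive stand-in for the \gls{ilp} above, which the actual implementation
discharges to an \gls{smt}-solver; the types are again hypothetical
simplifications.

\begin{verbatim}
type location = int

(* [hits r loop] checks whether the candidate set [r] contains at least one
   location of [loop], i.e. one hitting-set constraint. *)
let hits (r : location list) (loop : location list) : bool =
  List.exists (fun l -> List.mem l r) loop

(* All subsets of [locs] that have exactly [k] elements. *)
let rec subsets_of_size k locs =
  match (k, locs) with
  | 0, _ -> [ [] ]
  | _, [] -> []
  | k, l :: rest ->
      List.map (fun s -> l :: s) (subsets_of_size (k - 1) rest)
      @ subsets_of_size k rest

(* [min_hitting_set locs loops] returns a smallest subset of [locs] hitting
   every loop in [loops]; exponential, but sufficient for illustration. *)
let min_hitting_set (locs : location list) (loops : location list list) :
    location list =
  let n = List.length locs in
  let rec search k =
    if k > n then locs
    else
      match
        List.find_opt
          (fun r -> List.for_all (hits r) loops)
          (subsets_of_size k locs)
      with
      | Some r -> r
      | None -> search (k + 1)
  in
  search 0
\end{verbatim}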

Finding the loops and the feedback vertex set is independent of the partial
evaluation and can be done a priori.

\subsection{Property-based abstraction}
Once the loops $L_1, \dots, L_N \subseteq \Loc_\Prog$ of a \gls{pip} $\Prog$
have been detected and the loop-heads, i.e. the locations marked for
abstraction $L \subseteq \Union_{i=1}^N L_i$, have been selected, one needs to
choose an appropriate localised abstraction function. \citeauthor{Domenech19}
proposed a heuristic which identifies the properties that are relevant for the
iterations of a loop \cite{Domenech19}. We briefly describe this heuristic.
For a location $\ell \in L$, the following heuristics are defined.

\begin{align*}
    \PRh(\ell) &= \set{\psi}{(\ell, \_, \tau, \_, \_) \in
    \TPout(\ell), \psi \in \tau|_\PV}\\
    \PRhv(\ell) &= \set{x \diamond c}{(\ell, \_, \tau, \_, \_) \in
    \TPout(\ell), x \in \PV, \diamond \in \{\leq, \geq\}, \tau \models x \diamond c}\\
    \PRc(\ell) &= \set{\psi}{(\_, \_, \tau, \eta, \ell) \in
    \TPin(\ell), \psi \in \tilde{\eta}(\tau \land \lfloor\eta\rceil)}\\
    \PRcv(\ell) &= \set{x \diamond c}{(\_, \_, \tau, \eta, \ell)
    \in \TPin(\ell), x \in \PV, \diamond \in \{\leq, \geq\}, \tilde{\eta}(\tau \land \lfloor\eta\rceil) \models x \diamond c}
\end{align*}

$\PRh$ captures properties that must be met on any outgoing transition. $\PRhv$
computes bounds for the program variables that are only implicitly implied by
the guard and might otherwise not be captured. $\PRc$ and $\PRcv$ do the same
for all incoming transitions. In addition, a heuristic $\PRd$ is defined that
propagates properties backwards through a loop, capturing the conditions
required to iterate the loop without leaving it. The loops are first
transformed into transition loops $T_{L_1},\dots, T_{L_N} \subseteq \TP$, as
described in \Sref{sec:graph_operations}. Then, for every transition loop and
every loop-head on it, the transition loop is rotated so that the loop-head
comes first, and the guards are propagated backwards starting at the end of the
loop.
For a transition loop $T = \{t_1, \dots, t_m\}$ rotated for a location $\ell
\in L_j$ on that loop, a backwards propagation function $\back : \T \times \C
\rightarrow \C$ and the heuristic $\PRd$ are defined as follows:
\begin{align*}
    \back(t, \varphi) &= \varphi [x / \tilde{\eta}] \land \lfloor\eta\rceil \land
    \tau \text{ where } t = (\_, \_, \tau, \eta, \_) \\
    \PRd(T, \ell) &= \back(t_1, \dots \back(t_m, \texttt{true})\dots)
\end{align*}

All properties for a location $\ell \in L$ are combined into the set
\begin{equation*}
    \Psi_\ell = \PRh(\ell) \union \PRhv(\ell) \union \PRc(\ell) \union
    \PRcv(\ell) \union \Union_{\substack{L_i\;:\;\ell \in L_i\\ T \in T_{L_i}}}
    \PRd(T, \ell).
\end{equation*}
The localised property-based abstraction then uses the properties $\Psi_\ell$
for every location $\ell \in L$. For all other locations $\ell \not\in L$, we
set $\Psi_\ell = \bot$.
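
To make the use of these properties concrete, the following \gls{ocaml} sketch
shows the $\PRd$ propagation as a fold over the rotated transition loop and
the resulting abstraction as a filter over $\Psi_\ell$. The constraint and
transition types as well as the functions \texttt{back}, \texttt{mk\_true} and
\texttt{entails} are assumptions of this sketch and only stand in for
\gls{koat}'s internal representations.

\begin{verbatim}
(* Constraints and transitions are kept abstract in this sketch; in the
   implementation a constraint is a conjunction of polynomial inequalities
   over the program variables. *)
type constr
type transition

(* Backwards propagation of guards through a rotated transition loop
   (heuristic PRd): fold the transitions from the back, starting from the
   trivially true constraint.  [back t phi] is assumed to substitute the
   update of [t] into [phi] and to conjoin the guard and the update
   constraint of [t], as in the definition above. *)
let prd (back : transition -> constr -> constr) (mk_true : constr)
    (rotated_loop : transition list) : constr =
  List.fold_right back rotated_loop mk_true

(* Localised property-based abstraction at a marked location: the version
   constraint [phi] is replaced by the subset of properties [psi_l] entailed
   by it.  [entails phi psi] would be discharged to an SMT-solver by checking
   that phi /\ not psi is unsatisfiable, which is why every abstraction is
   comparatively expensive. *)
let abstract (entails : constr -> constr -> bool) (psi_l : constr list)
    (phi : constr) : constr list =
  List.filter (fun psi -> entails phi psi) psi_l
\end{verbatim}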

\subsection{Selecting transitions for partial evaluation}
During \gls{koat}'s analysis of a program, size and runtime complexity bounds
for every transition are computed. Whenever the analysis fails to find linear
runtime complexity bounds for a transition, the transition is added to a set of
transitions $\T_\text{cfr}$ that needs further refinement. The set of
transitions $T$ used for control-flow refinement is generated by selecting, for
every transition in $\T_\text{cfr}$, the smallest loop containing it and
expanding the result with all parallel transitions. The reasoning behind this
selection was explained in detail by \citeauthor{giesl2022lncs}
\cite{giesl2022lncs} for classical programs. The algorithm is displayed in
Algorithm \ref{alg:selectT}.

\begin{algorithm}[ht]
    \caption{Selecting transitions $T$ for partial evaluation for a set of
    non-trivial transitions $\T_\text{cfr}$
    \cite{giesl2022lncs}.\label{alg:selectT}}
    \KwData{A \gls{pip} $\Prog = (\PV, \Loc_\Prog, \ell_0, \GTP)$, and a non-empty
    subset of transitions $\T_\text{cfr} \subseteq \TP$.} 
    \KwResult{A subset of transitions $T \subseteq \TP$.}

    \SetKwFunction{FSelect}{select}
    \SetKwProg{Fn}{Function}{:}{end}
    \Fn{\FSelect{$\T_\text{cfr}$}}{
        $T = \emptyset$\;
        \For{$t = (\ell, \_, \_, \_, \ell') \in \T_\text{cfr}$}{
            $T_t \leftarrow \text{a shortest path from } \ell' \text{ to }
            \ell$\label{alg:line:shortest_path}\;
            $T_t \leftarrow T_t \union \{t\}$\;
            $T_t \leftarrow T_t \union \set{(\ell_1, \_, \_, \_, \ell_2) \in
            \T_\Prog}{(\ell_1, \_, \_, \_, \ell_2) \in T_t}$\;
            $T \leftarrow T \union T_t$\;
        }
        \Return{$T$}
    }
\end{algorithm}

The shortest path on line \ref{alg:line:shortest_path} of Algorithm
\ref{alg:selectT} can be computed using one of the many variants of Dijkstra's
algorithm \cite{dijkstra1959}, which are readily available in most graph
libraries. The shortest path is closed into a transition loop by $t$, and then
all parallel transitions are selected.
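
A minimal sketch of this selection is given below. Breadth-first search stands
in for Dijkstra's algorithm, since all transitions have weight one here, and
the transition type is a hypothetical simplification that only keeps source
and target locations.

\begin{verbatim}
type location = int

(* A transition is reduced to its source and target location. *)
type transition = { src : location; dst : location }

(* Unweighted shortest path from [start] to [goal] by breadth-first search,
   returned as a list of transitions (empty if [goal] equals [start] or is
   unreachable). *)
let shortest_path (ts : transition list) (start : location) (goal : location) :
    transition list =
  let rec bfs visited = function
    | [] -> []
    | (l, path) :: rest ->
        if l = goal then List.rev path
        else if List.mem l visited then bfs visited rest
        else
          let succs =
            List.filter_map
              (fun t -> if t.src = l then Some (t.dst, t :: path) else None)
              ts
          in
          bfs (l :: visited) (rest @ succs)
  in
  bfs [] [ (start, []) ]

(* All program transitions parallel to some transition in [tt], i.e. with the
   same source and target location. *)
let parallel (all : transition list) (tt : transition list) : transition list =
  List.filter
    (fun t -> List.exists (fun t' -> t.src = t'.src && t.dst = t'.dst) tt)
    all

(* Selection as in Algorithm alg:selectT: close every non-trivial transition
   into a loop via a shortest path back to its source, then add all parallel
   transitions. *)
let select (all : transition list) (t_cfr : transition list) : transition list =
  List.fold_left
    (fun acc t ->
      let tt = t :: shortest_path all t.dst t.src in
      let tt = parallel all tt in
      List.fold_left
        (fun acc t' -> if List.mem t' acc then acc else t' :: acc)
        acc tt)
    [] t_cfr
\end{verbatim}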

\subsection{Partial evaluation}
The unfolding is implemented using the abstract domain of convex polyhedra. In
practice, any other abstract domain can be used, but since the set of transitions
is kept small in general and precision is favoured over speed, polyhedra allow
exact projections of linear constraints. Let $\texttt{proj}: 2^\V \times \Poly
\rightarrow \Poly$ be the projection, and $\texttt{update}: (\PV \rightarrow
\Z[\V]) \times \Poly \rightarrow \Poly$ be an update in the abstract domain. We
can convert to and from polyhedra with the $\texttt{polyh} : \C \rightarrow
\Poly$ and $\texttt{constr}: \Poly \rightarrow \C$ function. An implementation
of the unfold operation is shown in Algorithm \ref{alg:unfold}.

\begin{algorithm}
    \caption{Unfolding\label{alg:unfold}}
    \KwData{ A \gls{pip} $\Prog = (\PV, \Loc_\Prog, \ell_0, \GTP)$ } 
    \SetKwFunction{FUnfold}{unfold}
    \SetKwProg{Fn}{Function}{:}{end}
    \Fn{\FUnfold{$\langle\ell,\varphi\rangle \in \vers_\Prog$, $(\ell_1, p,
    \tau, \eta,\ell_2) \in \T_\Prog$}}{
        \textbf{assert} $\ell = \ell_1$\;
        $P \leftarrow \texttt{polyh}(\varphi \land \tau \land \lfloor\eta\rceil)$\;
        $P \leftarrow \texttt{update}(\tilde{\eta}, P)$\;
        $P \leftarrow \texttt{proj}(\PV_\Prog, P)$\;
        $\varphi' \leftarrow \texttt{constr}(P)$\;
        $t' \leftarrow (\langle \ell_1, \varphi\rangle, p, \varphi \land \tau,
        \eta, \langle\ell_2,\varphi'\rangle)$\;
        \Return{$t'$}
    }
\end{algorithm}
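
The following \gls{ocaml} sketch mirrors Algorithm \ref{alg:unfold}. The
polyhedra operations are exactly the assumed interface named above
(\texttt{polyh}, \texttt{update}, \texttt{proj}, \texttt{constr}), which in the
actual implementation are realised with \gls{apron} and the \gls{ppl}; the
record types are hypothetical simplifications of \gls{koat}'s transitions and
versions.

\begin{verbatim}
(* Abstract types standing in for constraints, updates, polyhedra and
   variables. *)
type constr
type update
type polyhedron
type var
type location = int
type version = location * constr

(* The assumed interface from the text: conversion to and from polyhedra,
   the abstract update, and exact projection onto the program variables. *)
module type POLY_OPS = sig
  val polyh : constr -> polyhedron
  val update : update -> polyhedron -> polyhedron
  val proj : var list -> polyhedron -> polyhedron
  val constr : polyhedron -> constr
  val conj : constr -> constr -> constr          (* conjunction *)
  val update_constr : update -> constr           (* the constraint of eta *)
end

(* Simplified transitions between locations and between versions. *)
type transition =
  { src : location; prob : float; guard : constr; upd : update; dst : location }
type vtransition =
  { vsrc : version; vprob : float; vguard : constr; vupd : update;
    vdst : version }

(* Unfolding as in Algorithm alg:unfold: intersect the version constraint
   with the guard and the update constraint, apply the update in the abstract
   domain, project onto the program variables, and convert back. *)
let unfold (module P : POLY_OPS) (pv : var list) ((l, phi) : version)
    (t : transition) : vtransition =
  assert (l = t.src);
  let p = P.polyh (P.conj phi (P.conj t.guard (P.update_constr t.upd))) in
  let p = P.update t.upd p in
  let p = P.proj pv p in
  let phi' = P.constr p in
  { vsrc = (l, phi); vprob = t.prob; vguard = P.conj phi t.guard;
    vupd = t.upd; vdst = (t.dst, phi') }
\end{verbatim}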

Finally, we present the partial evaluation of a subset of transitions $T
\subseteq \TP$, selected as described in the previous section, with a localised
abstraction function $S$. The full algorithm is displayed in Algorithm
\ref{alg:pe}.

First, the program is converted into its graph $G_\Prog$, then the transitions
from $T$ are removed and the graph is lifted to a partial evaluation graph. An
initially empty set of versions $V$ records the versions that have already been
evaluated, in order to avoid unnecessary recomputation. Additionally, a new set
of general transitions $\GTPn$ is initialised, to which all newly found general
transitions are added. A single evaluation starts from the target location
$\ell \in \Loc_\Prog$ of an entry transition $t = (\_, \_, \_, \_, \ell) \in
\TPin(T)$ of the subset of transitions $T$. It recursively evaluates all newly
encountered versions until the evaluation would leave $T$ or a version is
reached that has already been evaluated. The evaluation is repeated for every
entry transition, expanding the partial evaluation graph every time. The only
special case to consider is a subset of transitions that contains a transition
starting in the initial location, in which case an evaluation is also started
from $\ell_0$. The satisfiability check on line \ref{alg:line:check_sat} of
Algorithm \ref{alg:pe} cuts the evaluation short and avoids unnecessary
computations for unreachable versions.

\begin{algorithm}
    \caption{Partial evaluation of a subset of transitions $T$ of a \gls{pip}
    $\Prog$.\label{alg:pe}}

    \KwData{A \gls{pip} $\Prog = (\PV, \Loc_\Prog, \ell_0, \GTP)$, a localised
    abstraction function $S$, and a subset of transitions $T \subseteq \TP$.}

    \KwResult{A \gls{pip} $\Prog'$ with equal runtime complexity bounds.}

    \SetKw{Continue}{continue}
    \SetKwFunction{FEvaluate}{evaluate}
    \SetKwProg{Fn}{Function}{:}{end}
    $G \leftarrow (G_\Prog - T)\lift$\;
    $V \leftarrow \emptyset$\;
    $\GTPn \leftarrow \set{g \in \GTP}{\text{there is no } t \in T \text{ with }
    t \in g}\lift$\;

    \Fn{\FEvaluate{$G$, $v$}}{
        \lIf{$v \in V$}{\Return $G$}
        $V \leftarrow V \uplus \{v\}$\;
        $\langle \ell, \varphi\rangle \leftarrow v$\;
        \For{$g \in \GTPout(\ell)$}{
            \lIf{$\tau_g \land \varphi$ is unsatisfiable\label{alg:line:check_sat}}{\Continue}
            $g' \leftarrow \emptyset$\;
            \For{$t \in g$}{
                \uIf{$t \in T$}{
                    $(\langle\ell,\varphi\rangle, p, \tau, \eta, \langle\ell',
                    \varphi'\rangle) \leftarrow \unfold(\varphi, t)$\;
                    $t' \leftarrow (\langle\ell,\varphi\rangle, p, \tau, \eta,
                    S(\langle\ell',\varphi'\rangle))$\;
                    $g' \leftarrow g' \union \{t'\}$\;
                    $G \leftarrow \FEvaluate{$G$, $S(\langle\ell',\varphi'\rangle)$}$\;
                }\uElse{
                    $t' \leftarrow (\langle\ell,\varphi\rangle, p, \tau, \eta,
                    \langle\ell',\texttt{true}\rangle)$\;
                    $g' \leftarrow g' \union \{t'\}$\;
                }
            }
            $G \leftarrow G + g'$\;
            $\GTPn \leftarrow \GTPn \union \{g'\}$\;
        }
        \Return {$G$}
    }
    \lIf{$(\ell_0, \_, \_, \_,\_) \in T$} {
        $G \leftarrow \FEvaluate{$G$, $\langle\ell_0,\texttt{true}\rangle$}$
    }
    \For{$(\_, \_, \_, \_, \ell) \in \TPin(T)$} {
        $G \leftarrow \FEvaluate{$G$, $\langle\ell,\texttt{true}\rangle$}$\;
    }
    $\GTPn \leftarrow \text{general transitions of }E(G)$\;
    $\Prog' \leftarrow (\PV, V(G), \langle\ell_0,\texttt{true}\rangle, \GTPn)$\;
    \Return $\Prog'$\; 
\end{algorithm}

Consider the case where a transition of a general transition $g \in \GTP$ is
removed because it is part of the subset $T$, making $g$ an incomplete general
transition. The remaining transitions of $g$ are nevertheless lifted into the
initial partial evaluation graph. However, if there exists an entry transition
of $T$ into the start location $\ell_g$ of $g$, then the version $\langle
\ell_g, \texttt{true}\rangle$ and all transitions of $g$ are unfolded by the
first evaluation step for this entry transition, making $g$ whole again in the
partial evaluation graph. If $\langle \ell_g, \texttt{true}\rangle$ is
reachable through the partial evaluation, the version is evaluated at its
corresponding step. If it is not reachable, the incomplete general transition
is removed at the end as part of the clean-up.
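
To summarise the structure of Algorithm \ref{alg:pe}, the following
\gls{ocaml} sketch reduces it to its recursion scheme: already evaluated
versions are memoised, and every new version is expanded through its
successors inside $T$. The \texttt{successors} function, which would perform
the unfolding and the localised abstraction, is an assumption of this sketch.

\begin{verbatim}
type constr
type location = int
type version = location * constr

(* [evaluate successors visited v] returns the set of versions reachable from
   [v], where [successors v] is assumed to unfold and abstract all outgoing
   transitions of [v] that belong to the selected subset T.  Versions that
   have already been evaluated are skipped, which guarantees termination on
   the finite abstraction space. *)
let rec evaluate (successors : version -> version list)
    (visited : version list) (v : version) : version list =
  if List.mem v visited then visited
  else List.fold_left (evaluate successors) (v :: visited) (successors v)
\end{verbatim}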

\section{Technical Details}\label{sec:technical_details}
This section describes the technical details of the implementation of the
evaluation algorithms for \glspl{pip} from \Cref{ch:theory}. The whole
implementation is done in \gls{ocaml}. For checking the satisfiability of
constraints, Z3 \cite{z3} is used, as it is already employed in other parts of
\gls{koat}. For projections of constraints and for handling polyhedra, we
resort to \gls{apron} \cite{apron} and the \gls{ppl} \cite{bagnara2008socp},
which can be used as an abstract domain in \gls{apron}.

Algorithm \ref{alg:pe}, which applies control-flow refinement via partial
evaluation to a subset of transitions, was implemented as described. A variant
of the implementation that refines an entire program is provided as well.

The evaluation of a whole program is not used in \gls{koat}'s analysis for
performance reasons, but it is made available through a new sub-command. It
might be used in the future by other tools that want to profit from partial
evaluation of probabilistic programs.

The evaluation on sub-\gls{scc} level was integrated into \gls{koat}'s analysis
of classical programs, similarly to the integration of iRankFinder described in
\cite{giesl2022lncs}, and might replace the old implementation in the future.
The analysis command was extended with new settings that allow selecting the
various parameters of the evaluation, such as the selection of locations for
abstraction. Unfortunately, for probabilistic programs we were not able to
integrate the sub-\gls{scc} level implementation into the analysis due to
technical problems. The current implementation uses an extensive typing system
that statically rules out the use of partial evaluation on probabilistic
programs, and we were not able to lift those constraints in \gls{koat}'s source
code without rewriting and restructuring major parts of the analysis, which
went beyond the scope of this thesis. Furthermore, the probabilistic analysis
underwent a full rewrite by its developers during the writing of this thesis,
making the task at hand even harder to fulfil. However, the implementation we
provide should be reasonably easy to integrate for someone who has experience
with \gls{koat}'s typing system.

\section{Performance}\label{sec:performance}

This section evaluates the performance and correctness of the new
implementation. The performance is analysed for classical programs using
benchmarks on test sets from the \gls{tpdb} \cite{tpdb}. The results are
compared to benchmarks on the same test sets using the latest version of
\gls{koat}. In addition to the performance comparison, correctness is checked
by comparing the upper bounds found by the new implementation to lower bounds
proven by the analysis tool \gls{loat}.

All of the code was compiled using OCaml 4.14.0. Among the more important
dependencies are the \gls{smt}-solver Z3 in version 4.11.0.0, Apron in version
0.9.13, and the \gls{ppl} in version 1.2. \gls{koat} and all dependencies were
built inside a Docker container. All benchmarks were executed on an AMD Ryzen 7
5800X 8-core processor with 32 GB RAM. Eight benchmarks are executed in
parallel, each in its own Ubuntu 20.04 Docker container. A single analysis
times out after 240 seconds. The measured computation time for each test
includes the start-up and tear-down time of the Docker container, but the
timeout is measured internally in \gls{koat}. For each benchmark configuration,
we report the number of programs for which the analysis found a constant,
linear, quadratic, higher-order polynomial, or exponential runtime complexity
bound, as well as the average analysis time $\text{AVG}$ across all programs.
Since the timeout skews this average, we also include the average analysis time
$\text{AVG}^+$ over only those programs for which \gls{koat} found finite
complexity bounds. The counts are exclusive: programs with linear bounds are
not included in the count for quadratic bounds, and so on.

\subsection{Performance on classical programs}\label{ssec:perf_classic}
As a baseline, four configurations are used that have been used in the past to
measure \gls{koat}'s performance.\cite{giesl2022lncs} At the time of writing,
the latest version of the master branch is used.\footnote{commit
\texttt{a2eb156ac31db0f4daaa89822786a413632e637f}} The old version of \gls{koat}
is evaluated with four different settings: 
\begin{itemize}
    \item \enquote{\acrshort{koat}} is the simplest configuration. It uses
        \gls{mprf} with $\textit{mdepth} = 1$, i.e., it computes only linear
        ranking functions \cite{giesl2022lncs}. No control-flow refinement is
        used.
    \item \enquote{\acrshort{koat} + \acrshort{mprf}5} uses \gls{mprf} with
        $\textit{mdepth} = 5$, the typical maximum depth used in competitions
        \cite{giesl2022lncs}. Still, no control-flow refinement is used.
    \item \enquote{\acrshort{koat} + iRF} uses sub-\gls{scc} control-flow
        refinement but limits the maximum depth of multiphase ranking
        functions to $\textit{mdepth} = 1$. The control-flow refinement is
        done by calling iRankFinder as a black box.
    \item \enquote{\acrshort{koat} + \acrshort{mprf}5 + iRF} activates
        sub-\gls{scc} control-flow refinement with iRankFinder and \gls{mprf}
        with $\textit{mdepth}=5$.
\end{itemize}

The new implementation is evaluated on classical integer programs using two 
configurations:
\begin{itemize}
    \item \enquote{new-\acrshort{koat} + nat.-\acrshort{cfr}} uses the new
        control-flow refinement described in Algorithm \ref{alg:pe} with
        computation of the \gls{fvs}.
    \item \enquote{new-\acrshort{koat} + \acrshort{mprf}5 + nat.-\acrshort{cfr}}
        uses the new control-flow refinement described in Algorithm
        \ref{alg:pe} with computation of the \gls{fvs}, and additionally
        \gls{mprf} with a maximum depth of $\textit{mdepth}=5$.
\end{itemize}

The benchmarking for classical integer programs is done on two test sets from
the \gls{tpdb}. The first test set is called \enquote{Complexity\_ITS} and
comprises 801 integer programs from various studies. The results of this
benchmark are displayed in \fref{fig:results_complexity_its}.

\begin{figure}[h]
    \centering
    \setlength{\tabcolsep}{0.1em}
    \begin{tabular}{l|c|c|c|c|c|c||c|c}
        & $\bigo(1)$ & $\bigo(n)$ & $\bigo(n^2)$ & $\bigo(n^{>2})$ &
        $\bigo(\text{EXP})$ & $< \infty$ & $\text{AVG}^+ [s]$ & $\text{AVG}
        [s]$ \\ \hline
        \acrshort{koat} & 128 & 218 & 95 & 10 & 6 & 457 & 3.17 & 8.84 \\ \hline
        \acrshort{koat} + \acrshort{mprf}5 & 128 & 235 & 98 & 14 & 6 & 481 &
        2.74 & 12.36 \\ \hline
        \acrshort{koat} + iRF & 133 & 244 & 98 & 9 & 6 & 490 & 4.48 &
        20.84 \\ \hline
        \acrshort{koat} + \acrshort{mprf}5 + iRF & 133 & 254 & 99 & 13
        & 6 & 505 & 4.25 & 25.05 \\ \hline
        \hline
        % new-\acrshort{koat} & & & & & & & \\ \hline
        % new-\acrshort{koat} + \acrshort{mprf}5 & & & & & & & \\ \hline
        new-\acrshort{koat} + nat.-\acrshort{cfr} & 134 & 239 & 97 & 8 & 6 & 484
        & 3.22 & 21.67 \\ \hline
        new-\acrshort{koat} + \acrshort{mprf}5 + nat.-\acrshort{cfr} & 134 &
        250 & 98 & 12 & 6 & 500 & 3.02 & 26.28 \\ \hline
    \end{tabular}
    \caption{Performance results on the \gls{tpdb} test set
    \enquote{Complexity\_ITS}.\label{fig:results_complexity_its}}
\end{figure}

We observe a noticeable improvement in computational performance when using the
native implementation instead of iRankFinder. Without \gls{mprf}, the average
analysis time for programs where a finite complexity bound was found shortens
by 1.26 seconds, from 4.48 seconds to 3.22 seconds; with \gls{mprf}, it
shortens by 1.23 seconds, from 4.25 seconds to 3.02 seconds. Unfortunately,
overall fewer finite bounds are found for this test set. Where \gls{koat} found
finite runtime complexity bounds for 490 programs with iRankFinder, it finds
finite bounds for only 484 programs with the new implementation. Similarly,
with \gls{mprf}, the new implementation finds finite complexity bounds for 500
programs, compared to 505 with iRankFinder.

The second test set is called \enquote{Complexity\_C\_Integer} and contains C
programs that operate only on integers. Because \gls{koat} works only on
integer programs, the C programs are transformed into integer programs using
the tool \texttt{llvm2kittel} \cite{llvm2kittel}. The results of this benchmark
are displayed in \fref{fig:results_complexity_c_integer}.

\begin{figure}[h]
    \centering
    \setlength{\tabcolsep}{0.1em}
    \begin{tabular}{l|c|c|c|c|c|c||c|c}
        & $\bigo(1)$ & $\bigo(n)$ & $\bigo(n^2)$ & $\bigo(n^{>2})$ &
        $\bigo(\text{EXP})$ & $< \infty$ & $\text{AVG}^+ [s]$ & $\text{AVG}
        [s]$ \\ \hline
        \acrshort{koat} & 23 & 175 & 68 & 10 & 0 & 276 & 3.40 & 4.69 \\ \hline
        \acrshort{koat} + \acrshort{mprf}5 & 23 & 203 & 96 & 10 & 0 & 305 &
        3.40 & 6.32 \\ \hline
        \acrshort{koat} + iRF & 25 & 215 & 68 & 9 & 0 & 317 & 5.10 &
        13.85 \\ \hline
        \acrshort{koat} + \acrshort{mprf}5 + iRF & 24 & 228 & 69 & 8 &
        0 & 329 & 5.74 & 17.24 \\ \hline
        \hline
        new-\acrshort{koat} + nat.-\acrshort{cfr} & 25 & 202 & 72 & 6 & 0 & 305
        & 4.14 & 10.06 \\ \hline
        new-\acrshort{koat} + \acrshort{mprf}5 + nat.-\acrshort{cfr} & 24 & 222
        & 71 & 6 & 0 & 323 & 3.81 & 13.66 \\ \hline
    \end{tabular}
    \caption{Performance results on the \gls{tpdb} test set
    \enquote{Complexity\_C\_Integer}.\label{fig:results_complexity_c_integer}}
\end{figure}

For the second test set, we observe a similar improvement in analysis time. The
average analysis time for programs where a finite runtime complexity bound is
found shortens by 0.96 seconds, from 5.10 seconds to 4.14 seconds, without
\glspl{mprf}; and by 1.93 seconds, from 5.74 seconds to 3.81 seconds, with
\glspl{mprf}. However, the new implementation finds finite complexity bounds
for only 305 instead of 317 programs without \glspl{mprf}, and for 323 instead
of 329 programs with \glspl{mprf}.

This decrease in the number of solved programs needs further investigation.

\subsection{Probabilistic programs}

Due to the missing integration into the full analysis of probabilistic programs,
the implementation for probabilistic programs was evaluated on two cases.
Instead of using sub-\gls{scc} level control-flow refinement, we partially
evaluate the entire program and then feed the results to the probabilistic
analysis. 

Our first test uses the example program $\Prog_3$ studied in
\Cref{ch:introduction} and \Cref{ch:theory}. The full partial evaluation yields
the same program as the partial evaluation constructed in Example
\ref{ex:localized_pba}. For the original program, \gls{koat} fails to find
finite expected runtime complexity bounds. After the partial evaluation,
\gls{koat} successfully finds the quadratic expected runtime complexity bound
$18y^2 + 524y + 543 \in \mathcal{O}(n^2)$.

Our second test addresses multiphase loops in probabilistic programs.
\glspl{mprf} offer an advantage similar to partial evaluation when determining
the runtime of loops that implicitly have multiple phases. However,
probabilistic ranking functions lack this capability, which renders partial
evaluation on probabilistic programs even more powerful. We demonstrate this
with our second test program $\Prog_5$, displayed in
\fref{fig:splitting-loop}. The loop at $\ell_1$ has two phases: in the first
phase, $y$ is incremented up to $z$, and in the second phase, $x$ is
decremented until it reaches a value less than zero. In a non-probabilistic
program, this behaviour could be bounded using a single \gls{mprf} at $\ell_1$.
Probabilistic ranking functions cannot bound the runtime complexity of this
loop, and \gls{koat} cannot find a finite expected runtime complexity bound for
this program. After a full partial evaluation with our implementation, the
program is transformed into an equivalent program, and \gls{koat} finds the
linear expected runtime complexity bound $12y + 12z + 24x + 47/2 \in
\mathcal{O}(n)$.

\begin{figure}
    \centering
    \input{figures/ch4_p5_program}
    \caption{The \gls{pip} $\Prog_5$ with a multiphase loop at $\ell_1$ and a
    hard-to-detect expected runtime complexity bound.
    \label{fig:splitting-loop}}
\end{figure}

We have successfully demonstrated that partial evaluation of \glspl{pip} works
and can be used to find tighter expected runtime complexity bounds with
\gls{koat}.

% \begin{figure}
%     \centering
%     \input{figures/ch4_p5_pe}
%     \caption{The partial evaluation of the entire \gls{pip} $\Prog_5$. \label{fig:splitting-loop-pe}}
% \end{figure}