@Article{dijkstra1959,
author = {Edsger W. Dijkstra},
journal = {Numerische Mathematik},
title = {A note on two problems in connexion with graphs},
year = {1959},
pages = {269--271},
volume = {1},
bibsource = {dblp computer science bibliography, https://dblp.org},
biburl = {https://dblp.org/rec/journals/nm/Dijkstra59.bib},
doi = {10.1007/BF01386390},
timestamp = {Mon, 29 Jul 2019 15:59:06 +0200},
url = {https://doi.org/10.1007/BF01386390},
}
@InProceedings{kremer2016lncs,
author = {Gereon Kremer and Florian Corzilius and Erika {\'{A}}brah{\'{a}}m},
booktitle = {Computer Algebra in Scientific Computing - 18th International Workshop, {CASC} 2016, Bucharest, Romania, September 19-23, 2016, Proceedings},
title = {A Generalised Branch-and-Bound Approach and Its Application in {SAT} Modulo Nonlinear Integer Arithmetic},
year = {2016},
editor = {Vladimir P. Gerdt and Wolfram Koepf and Werner M. Seiler and Evgenii V. Vorozhtsov},
pages = {315--335},
publisher = {Springer},
series = {Lecture Notes in Computer Science},
volume = {9890},
bibsource = {dblp computer science bibliography, https://dblp.org},
biburl = {https://dblp.org/rec/conf/casc/KremerCA16.bib},
doi = {10.1007/978-3-319-45641-6\_21},
timestamp = {Thu, 23 Sep 2021 11:48:04 +0200},
url = {https://doi.org/10.1007/978-3-319-45641-6\_21},
}
(l2) edge[bend left=30] node {$t_{3}: \eta_3(y) = y -
\textsc{Bern}(0.5)$} (l1)
\section{Future Work}\label{sec:future_work}
In \Cref{ch:preliminaries} and \Cref{ch:theory} we have skipped over the concept
of costs, which are closely related to runtime complexity. It is common to
assign costs to a transition, which can vary depending on the values of program
variables. During a run of the program, those costs are added up. The runtime
complexity of a program is a special case of cost analysis where every
transition has a cost equal to 1. Because the costs would be copied unchanged
from a transition to the unfolded transition, we conjecture that the
partial evaluation preserves worst-case expected costs similarly to the
worst-case expected runtime complexity. However, answering this question
requires additional research.
In \Sref{sec:performance}, we have seen that the new implementation is not as
good as iRankFinder on a small number of programs in the test sets. Since we
tried to mirror the behaviour of iRankFinder as closely as possible, mistakes
were most probably made during implementation. For example, our implementation
of the heuristics deviates from iRankFinder's implementation, or the selection
of the feedback vertex set for locations had an unexpected impact on the
analysis. In any case, the worsened analysis results warrant further
investigation. Until then, \gls{koat} can keep using iRankFinder for
non-probabilistic integer programs.
During implementation, the constraints of versions are converted back and forth
between Apron's representation of polyhedra and our own. There is probably room
for improvement by reducing the number of conversions to and from Apron.
Furthermore, the satisfiability check for the constraint of a version is done
twice: once before the unfolding, and once after the unfolding during
abstraction, even though satisfiability after the unfolding is already implied.
We can probably remove one of the satisfiability checks and in turn reduce the
number of calls to the \gls{smt}-solver.
The heuristics used for property generation were taken over from
\citeauthor{Domenech19}; however, in our implementation we found that the
heuristics resulted in a large number of properties and thus in a probably
larger than necessary evaluation graph. With regard to sub-\gls{scc} level
evaluation, one could search for better heuristics which take the context of the
previous analysis better into account.
A last idea that came up during the writing of this thesis is some sort of
online abstraction. The localised property-based abstraction requires locations
to be selected for abstraction before the evaluation. Instead, one can imagine
an evaluation algorithm that abstracts on-the-fly only when traversing a loop,
possibly even ignoring the first $k$ iterations and thus abstracting only when
really necessary. Such an online abstraction could possibly increase the
precision of the partial evaluation even further. It is not immediately obvious
if this would improve the sub-\gls{scc} level analysis of \gls{koat}, but it
might be interesting for some other class of programs we have not considered in
this thesis.
\section{Conclusion}\label{sec:conclusion}
In \Cref{ch:theory} the method of control-flow refinement via partial evaluation
was expanded to \glspl{pip}. The method is proven to be sound for \glspl{pip}
and can be used to refine a \gls{pip} when classical analysis fails at finding
tight runtime bounds. In \Cref{ch:implementation} we have presented a new
implementation of partial evaluation on probabilistic programs and successfully
inferred tighter expected runtime complexity bounds for an example \gls{pip}. To
the best of our knowledge, we applied partial evaluation for the first time in a
complexity analysis framework for probabilistic programs.
This chapter will present the implementation of partial evaluation with
localised property-based abstraction as introduced in \Cref{ch:theory}. In
\Sref{sec:implementation} we describe the implementation using pseudo-code. We
will mostly focus on the practical aspects of the partial evaluation.
\Sref{sec:technical_details} will briefly describe the technical details of the
implementation. Finally, in \Sref{sec:performance} we will analyse the
performance of the new implementation.
In this section, we present a pseudo-code implementation for the $\unfold$
operation, and the partial evaluation of an entire program as described in
\Sref{sec:partialevaluation} using localised property-based abstraction.
First, we present two variants for selecting locations on all the loops such
that the property from Lemma \ref{lem:localized_pba_finite} for the localised
property-based abstraction is satisfied. Second, we will introduce the heuristic
for property generation, that was put forward by \citeauthor{Domenech19}
\cite{Domenech19}. Third, we will describe the technique for selecting
a minimal number of loop-heads by computing the feedback-vertex set of the
graph. Reducing the number of loop-heads makes the successive partial evaluation
faster, and the
partial-evaluation graph more precise in the mostly linear parts, where
abstraction is not necessary.
The location variables are constrained to $0 \leq x_{\ell_j}\leq 1$ for all
$\ell_j \in \Loc_\Prog$ by \eqref{eq:ilp-geq0} and \eqref{eq:ilp-leq1}. For an
assignment $\sigma : \Loc_\Prog \rightarrow \N$, a value $\sigma(x_{\ell_j}) = 1$
indicates that the location $\ell_j$ is part of the hitting set; a value
$\sigma(x_{\ell_j}) = 0$ indicates that it is not. The constraints
\eqref{eq:lip1-hit} guarantee that at least one location on every loop is
selected, and the optimisation goal \eqref{eq:ilp-vfs} minimises the number of
selected locations.
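To illustrate the semantics of this integer linear program, the following Python
sketch computes a minimal hitting set of the loops by brute force. It is purely
illustrative (our implementation solves the ILP with a solver; the function name
and the representation of loops as sets of locations are hypothetical):

```python
from itertools import combinations

def min_feedback_vertex_set(loops):
    """Smallest set of locations hitting every loop (brute force).

    `loops` is a list of sets of locations, one set per loop of the
    program graph.  This mirrors the ILP: one 0/1 variable per
    location, at least one selected location per loop, and the total
    number of selected locations is minimised.
    """
    locations = sorted(set().union(*loops)) if loops else []
    for k in range(len(locations) + 1):
        for cand in combinations(locations, k):
            chosen = set(cand)
            if all(chosen & loop for loop in loops):
                return chosen
    return set()

# Two nested loops sharing location "l1": picking "l1" alone hits both.
loops = [{"l1", "l2"}, {"l1", "l3"}]
assert min_feedback_vertex_set(loops) == {"l1"}
```

The brute force is exponential in the number of locations; the ILP formulation
delegates this NP-hard minimisation to a dedicated solver instead.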
computes bounds for all the program variables that might not be captured
otherwise, because they are implicitly implied by the guard. $\PRc$ and $\PRcv$
do the same for all the incoming transitions. In addition, a heuristic $\PRd$ is
defined that propagates properties backwards through a loop, capturing the
conditions that are required to iterate through the loop without leaving the
loop. The loops are first transformed into transition loops $T_{L_1},\dots,
T_{L_n} \subseteq \TP$, as described in \Sref{sec:graph_operations}. Then for
every transition loop and loop-head on the loop, the transition loop is rotated
to have the loop-head first, and the guards are propagated backwards starting at
the end of the loop.
For a transition loop $T = \{t_0, \dots, t_m\}$ and a location $\ell \in L_j$ on
that loop, a backwards propagation function $\back : \T \times \C \rightarrow
\C$ and the heuristic $\PRd$ are defined as follows:
Our implementation uses the localised property-based abstraction with those
properties, taking $\Psi_\ell$ as the generated property set for every location
$\ell \in L$ marked for abstraction. For all other locations $\ell \not\in L$,
we set $\Psi_\ell = \bot$.
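The effect of this abstraction function can be sketched as follows. This is a
hedged, self-contained Python illustration: constraints are toy interval facts,
and `entails` is a stand-in for the \gls{smt}-based entailment check; all names
are hypothetical.

```python
def abstract(version, marked, properties, entails):
    """Localised property-based abstraction.

    For a version (location, phi) with the location in `marked`, keep
    exactly those properties of that location that are entailed by phi;
    versions at unmarked locations are left unabstracted.
    `entails(phi, prop)` is an oracle deciding phi |= prop (an SMT
    query in the real implementation).
    """
    loc, phi = version
    if loc not in marked:
        return version
    kept = frozenset(p for p in properties.get(loc, ()) if entails(phi, p))
    return (loc, kept)

# Toy constraints: a frozenset of (var, low, high) interval facts.
# Entailment holds when a fact's interval lies inside the property's.
def entails(phi, prop):
    v, lo, hi = prop
    return any(w == v and lo <= a and b <= hi for (w, a, b) in phi)

phi = frozenset({("x", 0, 5)})
props = {"l1": [("x", 0, 10), ("x", -1, 3)]}
# At the marked location only the entailed property x in [0, 10] survives.
assert abstract(("l1", phi), {"l1"}, props, entails) == ("l1", frozenset({("x", 0, 10)}))
# An unmarked location keeps its exact constraint.
assert abstract(("l2", phi), {"l1"}, props, entails) == ("l2", phi)
```

Because the abstracted version only depends on which properties are entailed,
the number of distinct versions per marked location is bounded by $2^{|\Psi_\ell|}$.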
every transition are computed. Whenever the analysis fails to find linear
runtime-complexity bounds for a transition, the transition is added to a set of
transitions $\T_\text{cfr}$ that will need further refinement. The set of
transitions $T$ used for control-flow-refinement is generated by selecting the
smallest loop for all transitions in $\T_\text{cfr}$ and expanding it to all
parallel transitions. The reasoning behind the selection was explained in
detail by \citeauthor{giesl2022lncs} \cite{giesl2022lncs} for classical
programs. The algorithm is displayed in Algorithm \ref{alg:selectT}.
\begin{algorithm}[ht]
\caption{Selecting transitions $T$ for partial evaluation for a set of
non-trivial transitions $\T_\text{cfr}$
\cite{giesl2022lncs}.\label{alg:selectT}}
\KwData{ A PIP $\Prog = (\PV, \Loc_\Prog, \ell_0, \GTP)$, and a non-empty
subset of transitions $\T_\text{cfr} \subseteq \TP$.}
\KwResult{A subset of transitions $T \subseteq \TP$.}
\SetKwFunction{FSelect}{select}
\SetKwProg{Fn}{Function}{:}{end}
\Fn{\FSelect{$\T_\text{cfr}$}}{
$T = \emptyset$\;
\For{$t = (\ell, \_, \_, \_, \ell') \in \T_\text{cfr}$}{
$T_t \leftarrow \text{a shortest path from } \ell \text{ to }
\ell'$\label{alg:line:shortest_path}\;
$T_t \leftarrow T_t \union \{t\}$\;
$T_t \leftarrow T_t \union \set{(\ell, \_, \_, \_, \ell') \in
\T_\Prog}{(\ell, \_, \_, \_, \ell') \in T_t}$\;
$T \leftarrow T \union T_t$\;
}
\Return{$T$}
}
\end{algorithm}
The shortest path on line \ref{alg:line:shortest_path} of Algorithm
\ref{alg:selectT} can be computed using one of the many variants of Dijkstra's
algorithm \cite{dijkstra1959}, which is readily available in most graph
libraries. The shortest path is closed into a transition loop and then all
parallel edges are selected.
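Since all edges have equal weight here, a breadth-first search suffices as a
special case of Dijkstra's algorithm. The following Python sketch is
illustrative only: transitions are identified by their endpoints, so the
parallel-transition closure is implicit; a real implementation would expand each
selected edge to all transitions sharing its endpoints.

```python
from collections import deque

def select_transitions(graph, t_cfr):
    """For every transition (src, dst) needing refinement, a shortest
    path from dst back to src closes the smallest loop through it.

    `graph` maps a location to its outgoing edges (src, dst);
    `t_cfr` is the set of transitions that need refinement.
    """
    def shortest_path(start, goal):
        queue, seen = deque([(start, ())]), {start}
        while queue:                       # BFS: unweighted shortest path
            loc, path = queue.popleft()
            if loc == goal:
                return path
            for edge in graph.get(loc, ()):
                if edge[1] not in seen:
                    seen.add(edge[1])
                    queue.append((edge[1], path + (edge,)))
        return ()

    selected = set()
    for (src, dst) in t_cfr:
        selected.add((src, dst))
        selected.update(shortest_path(dst, src))  # close the loop
    return selected

graph = {"l1": [("l1", "l2")], "l2": [("l2", "l1"), ("l2", "l3")]}
assert select_transitions(graph, {("l1", "l2")}) == {("l1", "l2"), ("l2", "l1")}
```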
\subsection{Partial evaluation}
The unfolding is implemented using the abstract domain of convex polyhedra. In
practice, any other abstract domain can be used, but since the set of transitions
is kept small in general and precision is favoured over speed, polyhedra allow
exact projections of linear constraints. Let $\texttt{proj}: 2^\V \times \Poly
\rightarrow \Poly$ be the projection, and $\texttt{update}: (\PV \rightarrow
\Z[\V]) \times \Poly \rightarrow \Poly$ be an update in the abstract domain. We
can convert to and from polyhedra with the $\texttt{polyh} : \C \rightarrow
\Poly$ and $\texttt{constr}: \Poly \rightarrow \C$ functions. An implementation
of the unfold operation is shown in Algorithm \ref{alg:unfold}.
\begin{algorithm}[ht]
\caption{Implementation of the $\unfold$ operation.\label{alg:unfold}}
\uIf{$\varphi \Land \tau$ is satisfiable} {
$P \leftarrow \texttt{polyh}(\varphi \land \tau \Land \lfloor\eta\rceil)$\;
$P \leftarrow \texttt{update}(\tilde{\eta}, P)$\;
$P \leftarrow \texttt{proj}(\PV_\Prog, P)$\;
$\varphi' \leftarrow \texttt{constr}(P)$\;
} \Else{
$\varphi' \leftarrow \texttt{false}$\;
}
}
\end{algorithm}
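The projection step of the unfold operation can be illustrated without a
polyhedra library. The following Python sketch uses a naive Fourier--Motzkin
elimination; it is hedged, not the Apron-based implementation, and far less
efficient and general. It unfolds the guard $y > 0$ with the update
$y' = y - d$, $0 \leq d \leq 1$, from the running example:

```python
def eliminate(constraints, var):
    """Fourier-Motzkin elimination of `var`.

    A constraint (coeffs, b) stands for sum(coeffs[x] * x) <= b.  This
    plays the role of the projection of the abstract domain, without
    the exactness guarantees and performance of a polyhedra library.
    """
    upper, lower, rest = [], [], []
    for coeffs, b in constraints:
        c = coeffs.get(var, 0)
        (upper if c > 0 else lower if c < 0 else rest).append((coeffs, b))
    for cu, bu in upper:
        for cl, bl in lower:
            ku, kl = cu[var], -cl[var]          # both positive
            merged = {x: kl * cu.get(x, 0) + ku * cl.get(x, 0)
                      for x in set(cu) | set(cl) if x != var}
            rest.append(({x: c for x, c in merged.items() if c},
                         kl * bu + ku * bl))
    return rest

# Unfold of phi = {y > 0} with the update y' = y - Bern(0.5), encoded
# as y' = y - d together with 0 <= d <= 1, projecting away d and y.
phi_and_update = [
    ({"y": -1}, -1),                  # y >= 1, i.e. y > 0 over integers
    ({"d": 1}, 1), ({"d": -1}, 0),    # 0 <= d <= 1
    ({"y'": 1, "y": -1, "d": 1}, 0),  # y' = y - d, as two inequalities
    ({"y'": -1, "y": 1, "d": -1}, 0),
]
result = eliminate(eliminate(phi_and_update, "d"), "y")
assert any(c == {"y'": -1} and b == 0 for c, b in result)  # y' >= 0
```

As expected, the projection onto the primed program variable yields $y' \geq 0$,
matching the worked unfolding later in this chapter.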
Finally, we present the partial evaluation on subsets of transitions $T \subseteq
\TP$ that were selected as shown in the previous section using a localised
abstraction function $S$. The full algorithm is displayed in Algorithm
\ref{alg:pe}.
First, the program is converted into its graph $G_\Prog$, then the transitions
from $T$ are removed and the graph is lifted to a partial evaluation graph. An
empty set of versions $V$ is created that remembers the versions that have
already been evaluated in order to avoid unnecessary computations. Additionally,
a new set of general transitions $\GTPn$ is initialised, to which all newly
found general transitions are added. A single evaluation starts from the target
location $\ell \in \Loc_\Prog$ of an entry transition $t = (\_, \_, \_, \_,
\ell) \in \TPin(T)$ of the subset of transitions $T$. It evaluates recursively
all the newly encountered versions until the evaluation would leave $T$ or the
version has already been encountered. The evaluation is repeated for every entry
transition, expanding the partial evaluation every time. We just have to
consider the special case when the subset of transitions contains the starting
location, in which case an evaluation is also started from $\ell_0$. The
satisfiability check on line \ref{alg:line:check_sat} of Algorithm \ref{alg:pe}
cuts the evaluation short and avoids unnecessary computations for unreachable
versions.
\begin{algorithm}
\caption{Partial evaluation of a subset of transitions $T$ of a \gls{pip}
$\Prog$.\label{alg:pe}}
\KwData{ A \gls{pip} $\Prog = (\PV, \Loc_\Prog, \ell_0, \GTP)$, localised
abstraction function $S$ and a subset of transitions $T \subseteq \TP$.}
\KwResult{A PIP $\Prog'$ with equal runtime-complexity bounds.}
\SetKw{Continue}{continue}
\SetKwFunction{FEvaluate}{evaluate}
\SetKwProg{Fn}{Function}{:}{end}
$G \leftarrow (G_\Prog - T)\lift$\;
$V \leftarrow \emptyset$\;
$\GTPn \leftarrow \emptyset$\;
\Fn{\FEvaluate{$G$, $v$}}{
\lIf{$v \in V$}{\Return $G$}
$V \leftarrow V \uplus \{v\}$\;
$\langle \ell, \varphi\rangle \leftarrow v$\;
\For{$g \in \GTPout(\ell)$}{
\lIf{$\tau_g \land \varphi$ is not satisfiable\label{alg:line:check_sat}}{\Continue}
$g' \leftarrow \emptyset$\;
\For{$t \in g$}{
\uIf{$t \in T$}{
$(\langle\ell,\varphi\rangle, p, \tau, \eta, \langle\ell',
\varphi'\rangle) \leftarrow \unfold(\varphi, t)$\;
$t' \leftarrow (\langle\ell,\varphi\rangle, p, \tau, \eta,
S(\langle\ell',\varphi'\rangle))$\;
$g' \leftarrow g' \union \{t'\}$\;
$G \leftarrow \FEvaluate{$G$, $S(\langle\ell',\varphi'\rangle)$}$\;
}\uElse{
$t' \leftarrow (\langle\ell,\varphi\rangle, p, \tau, \eta,
\langle\ell',\texttt{true}\rangle)$\;
$g' \leftarrow g' \union \{t'\}$\;
}
}
$G \leftarrow G + g'$\;
$\GTPn \leftarrow \GTPn \union \{g'\}$\;
}
\Return {$G$}\;
}
\lIf{$(\ell_0, \_, \_, \_,\_) \in T$} {
$G \leftarrow \FEvaluate{$G$, $\langle\ell_0,\texttt{true}\rangle$}$
}
\For{$(\_, \_, \_, \_, \ell) \in \TPin(T)$} {
$G \leftarrow \FEvaluate{$G$, $\langle\ell,\texttt{true}\rangle$}$\;
}
$\GTPn \leftarrow \text{general transitions of }E(G)$\;
$\Prog'\leftarrow (\PV, V(G), \langle\ell_0,\texttt{true}\rangle, \GTPn)$\;
\Return $\Prog'$\;
\end{algorithm}
Consider the case where a transition $t$ of a general transition $g \in \GTP$ is
removed because it is part of the subset $T$, making $g$ an illegal general
transition. Still, the general transition is lifted into the initial partial
evaluation graph. However, if there exists an entry of $T$ to the location
$\ell_g$, then the version $\langle \ell_g, \texttt{true}\rangle$ and all
transitions in $g$ are unfolded by the first evaluation step for this entry
transition, making $g$ whole again in the partial evaluation graph. If $\langle
\ell_g, \texttt{true}\rangle$ is reachable through the partial evaluation, the
version is then evaluated at its corresponding step. If it is not reachable, the
invalid general transition is removed at the end as part of the clean-up.
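To make the control flow of Algorithm \ref{alg:pe} concrete, the following
Python sketch mirrors the recursion with memoisation over versions. It is purely
illustrative: constraints are plain strings, and \texttt{unfold} and
\texttt{abstract} are hypothetical callbacks standing in for the polyhedra
operations and the localised abstraction.

```python
from collections import namedtuple

Transition = namedtuple("Transition", "src dst guard")

def evaluate(version, transitions, in_T, unfold, abstract, visited, edges):
    """Recursive evaluation with memoisation over versions.

    `version` is a pair (location, constraint).  `visited` memoises
    evaluated versions so every version is unfolded at most once;
    `edges` collects the transitions of the partial evaluation graph.
    """
    if version in visited:
        return
    visited.add(version)
    loc, phi = version
    for t in (t for t in transitions if t.src == loc):
        if t in in_T:
            phi2 = unfold(phi, t)
            if phi2 is None:          # unsatisfiable: version unreachable
                continue
            target = abstract((t.dst, phi2))
            edges.append((version, t, target))
            evaluate(target, transitions, in_T, unfold, abstract,
                     visited, edges)
        else:
            # leaving the selected subset: fall back to the trivial
            # version (dst, "true") of the original program
            edges.append((version, t, (t.dst, "true")))

# Toy run: unfolding conjoins the guard; the abstraction collapses every
# version at location "l1" to the single property "x>=0".
ts = [Transition("l0", "l1", "x>=0"), Transition("l1", "l1", "x>0")]
unfold = lambda phi, t: phi + " & " + t.guard
abstract = lambda v: ("l1", "x>=0") if v[0] == "l1" else v
visited, edges = set(), []
evaluate(("l0", "true"), ts, set(ts), unfold, abstract, visited, edges)
assert ("l1", "x>=0") in visited
assert len(edges) == 2   # the entry edge and the self-loop, each once
```

The memoisation set plays the role of $V$ in Algorithm \ref{alg:pe}: without it,
the self-loop at \texttt{l1} would be unfolded forever.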
\gls{ocaml}. For solving satisfiability of constraints, Z3 \cite{z3} was used,
as it is already employed in other parts of \gls{koat}. For projections of
constraints and handling of polyhedra, we resorted to \gls{apron} \cite{apron}
and the \gls{ppl} \cite{bagnara2008socp}, which can be used as an abstract
domain in Apron.
implementation. The performance is analysed for classical programs using
benchmarks on test sets from the \gls{tpdb}\cite{tpdb}. The results are compared
to benchmarks on the same data sets using the latest version of \gls{koat}. In
addition to the performance comparison, the correctness is asserted by comparing
the upper bounds found by the new implementation to lower bounds proven by the
analysis tool \gls{loat}.
All of the code was compiled using OCaml 4.14.0. Among the more important
dependencies is the SMT library Z3 in version 4.11.0.0, Apron in version 0.9.13,
and the \gls{ppl} in version 1.2. \gls{koat} and all dependencies were built
inside a docker container. All the benchmarks were done on an AMD Ryzen 7 5800X
8-Core Processor with 32 GB RAM. Eight benchmarks are executed in parallel, each
in their own Ubuntu 20.04 docker container. A single analysis timeouts after 240
seconds. The measured computation time for each test includes the upstart and
take down time of the docker container, but the timeout is measured internally
in \gls{koat}. For each benchmark configuration, we obtain a number of programs
for which the analysis found a constant, linear, quadratic, higher-order
polynomial and exponential runtime-complexity bounds, as well as the average
time $AVG$ of the analysis across all programs. Since the timeout is not very
representative, we also include the average analysis time $AVG^+$ of only those
programs for which \gls{koat} found finite complexity bounds. Besides, the
number of programs with linear bounds is not included in the number with
quadratic bounds, and so on: every program is counted only once, for its
tightest bound class.
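The two averages can be sketched as follows; this is a hedged illustration with
hypothetical names, where timeouts count towards $AVG$ but not $AVG^+$:

```python
def summarise(times, finite):
    """Average analysis time over all programs (AVG) and over only the
    programs with a finite complexity bound (AVG+).

    `times[i]` is the analysis time of program i, and `finite[i]`
    whether a finite bound was found for it.
    """
    avg = sum(times) / len(times)
    finite_times = [t for t, f in zip(times, finite) if f]
    avg_plus = sum(finite_times) / len(finite_times)
    return avg, avg_plus

# Two quick successes and one 240 s timeout without a finite bound.
avg, avg_plus = summarise([1.0, 2.0, 240.0], [True, True, False])
assert avg == 81.0 and avg_plus == 1.5
```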
measure \gls{koat}'s performance \cite{giesl2022lncs}. At the time of writing,
the latest version on the master branch is used.\footnote{commit
\texttt{a2eb156ac31db0f4daaa89822786a413632e637f}} The old version of \gls{koat}
is evaluated with four different settings.
The new implementation is tested with four different settings:
\item \enquote{new-\acrshort{koat}}: as a correctness check, this
configuration uses the same settings as \enquote{\acrshort{koat}} and
is expected to have similar results.
\item \enquote{new-\acrshort{koat} + \acrshort{mprf}5}: this configuration
has the same settings as \enquote{\acrshort{koat} + \acrshort{mprf}5}.
the \gls{tpdb}. The first test set is called \enquote{Complexity\_ITS} and it
comprises 801 integer programs from various studies. The results of this
\acrshort{koat} & 128 & 218 & 95 & 10 & 6 & 457 & 3.17 & 8.84 \\ \hline
\acrshort{koat} + \acrshort{mprf}5 & 128 & 235 & 98 & 14 & 6 & 481 &
2.74 & 12.36 \\ \hline
\acrshort{koat} + iRF & 133 & 244 & 98 & 9 & 6 & 490 & 4.48 &
new-\acrshort{koat} & & & & & & & \\ \hline
new-\acrshort{koat} + \acrshort{mprf}5 & & & & & & & \\ \hline
new-\acrshort{koat} + nat.-\acrshort{cfr} & 134 & 239 & 97 & 8 & 6 & 484
& 3.22 & 21.67 \\ \hline
new-\acrshort{koat} + \acrshort{mprf}5 + nat.-\acrshort{cfr} & 134 &
250 & 98 & 12 & 6 & 500 & 3.02 & 26.28 \\ \hline
We observe a noticeable improvement in computational performance when using the
native implementation over iRankFinder. Without \gls{mprf}, the average analysis
time for programs where a finite complexity bound was found shortens by 1.26
seconds, from 4.48 seconds to 3.22 seconds; with \gls{mprf}, by 1.23 seconds,
from 4.25 seconds to 3.02 seconds. Unfortunately, overall fewer finite bounds
can be found for the given test set. Where \gls{koat} found finite runtime
bounds for 490 programs with iRankFinder, it manages to find finite complexity
bounds for only 484 programs with the new implementation. Similarly, compared to
the old implementation with \gls{mprf} and iRankFinder, the new implementation
found finite complexity bounds for only 500 programs in the test set.
programs, C programs are transformed into integer programs using the tool
\texttt{llvm2kittel}\cite{llvm2kittel}. The results of this benchmark are
\gls{koat} & 23 & 175 & 68 & 10 & 0 & 276 & 3.40 & 4.69 \\ \hline
\gls{koat} + \acrshort{mprf}5 & 23 & 203 & 96 & 10 & 0 & 305 & 3.40 &
new-\acrshort{koat} & & & & & & & \\ \hline
new-\acrshort{koat} + \acrshort{mprf}5 & & & & & & & \\ \hline
new-\acrshort{koat} + nat.-\acrshort{cfr} & 25 & 202 & 72 & 6 & 0 & 305
& 4.14 & 10.06 \\ \hline
new-\acrshort{koat} + \acrshort{mprf}5 + nat.-\acrshort{cfr} & 24 & 222
& 71 & 6 & 0 & 323 & 3.81 & 13.66 \\ \hline
For the second test set, we observe a similar improvement in analysis time. The
average analysis time of programs where a finite runtime bound is found shortens
by 0.96 seconds, from 5.10 seconds to 4.14 seconds, without
\glspl{mprf}; and by 1.93 seconds from 5.74 seconds to 3.81 seconds with
\glspl{mprf}. However, the new implementation can only find complexity bounds
for 305 instead of 317 programs without \glspl{mprf} and 323 instead of 329 when
using \glspl{mprf}.
This worsened behaviour needs further investigation. We suspect there to be
either a bug in the implementation or that the properties are not optimally
chosen. Besides, there is certainly room for performance optimisation in the new
implementation.
\subsection{Probabilistic programs}
The goal of \gls{koat}'s analysis is to find tight upper bounds for \glspl{pip}.
The analysis techniques used, such as \acrfullpl{mprf} and the methods of
\cite{twn}, have
their limits and cannot always find tight bounds for every program. This
motivates control-flow refinement in order to find a program that is equivalent
to the original one, which opens the possibility for the analysis to find
tighter runtime complexity bounds.
and evaluation will be adapted multiple times in order to find finite partial
evaluation graphs, and to evaluate only the relevant parts of the program.
\{x,y,u,w\}$ and has the program variables $\PV_{\Prog_3} = \{x,y\}$. $u$
and $w$ are temporary variables whose values are reassigned by the scheduler
on every transition and whose values do not propagate throughout the
program.
not change the expected runtime complexity of the program. On the general
transition $g_1 = \{t_{1a}, t_{1b}\}$ at location $\ell_1$ the program
branches probabilistically, with a probability of $p=0.5$. On the transition
$t_3$, the value with which $y$ is decremented is sampled probabilistically
from a Bernoulli distribution, where both possible outcomes have the same
probability $p=0.5$.
When entering either of the two loops at $t_{1a}$ or $t_{1b}$, the value of $x$
is unbounded due to the non-deterministic sampling of $x$ in $t_0$. Hence,
\gls{koat} cannot find an \gls{mprf} bounding the value of $x$ throughout the
execution, and thus cannot compute runtime bounds for those loops.
complex, since the transitions taken do not only depend on the current state but
also on the answer of the scheduler and some random event. Instead of simulating
every single possible run, it would be more efficient to treat \emph{similar}
states as one. Instead of simulating a single assignment, the set of all
possible assignments after a step is over-approximated by a set of constraints.
formalised for \glspl{chc} by \citeauthor{gallagher2019eptcs}
\cite{gallagher2019eptcs}. In this thesis, the definitions are adapted to better
fit the algorithm and the formalism of \Glspl{pip}.
The versions are the vertices/locations of the partial evaluation graph. The
edges of the partial evaluation graph are transitions $\VTP$ (\enquote{$\VT$}
stands for \enquote{version transitions}) defined analogously to Definition
An unfolding operation $\unfold : 2^{\C_\PV} \times (\PV \rightarrow
\Z[\V\union\D]) \rightarrow 2^{\C_\PV}$ of a set of constraints $\varphi
\in 2^{\C_\PV}$ with an update $\eta: \PV\rightarrow\Z[\V \union \D]$ is
defined as
&\begin{aligned}
\;=(&y > 0 \Land x' = x \Land y' = y - d \Land 0 \leq d \leq 1 \\&\Land x'' =
x' \Land y'' = y')|_{\PV''}[x''/x]_{x\in\PV}\end{aligned}\\
For all $n \in \N$, $V(G_n) \subseteq V(G_{n+1})$ and $E(G_n) \subseteq
E(G_{n+1})$. Moreover, let $G^* = \evaluate_\Prog^*$ be the limit that the
evaluation approaches with an increasing number of repetitions.
As one can see in Example \ref{ex:infinite_evaluation}, the evaluation graph of
a program can be infinite, especially when the program does not terminate on
\textit{abstraction space}. A function $\alpha: 2^{\C_\PV} \rightarrow \A$
is called an \textit{abstraction function} if for every $\varphi \in
2^{\C_\PV}$, $\llbracket \varphi \rrbracket \subseteq \llbracket
\alpha(\varphi)\rrbracket$.
above are satisfied. One common abstraction method is the so-called
property-based abstraction \cite{Domenech19}, which selects a subset of
properties from a finite set of properties that are all entailed by the
original constraint set.
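A hedged sketch of this selection step, over a single hypothetical variable $x$: the abstraction keeps exactly those properties entailed by the constraint set. Entailment is checked naively over a bounded box of integers here, as a toy stand-in for a real entailment check (e.g. via an SMT solver):

```python
# Toy property-based abstraction: alpha(phi, Psi) keeps the properties of the
# finite set Psi that are entailed by phi. The bounded box makes the naive
# entailment check terminate; it is an assumption of this sketch only.

BOX = range(-20, 21)   # assumed finite search space for the toy check

def entails(phi, psi):
    # phi |= psi iff every model of phi (inside the box) is a model of psi
    return all(psi({"x": v}) for v in BOX if phi({"x": v}))

def alpha(phi, properties):
    """Property-based abstraction: the subset of Psi entailed by phi."""
    return {name for name, psi in properties.items() if entails(phi, psi)}

phi = lambda s: s["x"] >= 2          # original constraint set: 2 <= x
properties = {                       # finite property set Psi (invented)
    "1 <= x": lambda s: s["x"] >= 1,
    "3 <= x": lambda s: s["x"] >= 3,
    "x <= 3": lambda s: s["x"] <= 3,
}
print(alpha(phi, properties))        # {'1 <= x'}
```

Since every kept property is entailed by $\varphi$, the result is weaker than $\varphi$, i.e. $\llbracket \varphi \rrbracket \subseteq \llbracket \alpha(\varphi)\rrbracket$, which is exactly the soundness condition of an abstraction function.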
A property-based abstraction function is a valid abstraction function. For
all $\varphi \in 2^{\C_\PV}$ and every finite set of properties $\Psi
\subset \C$, the following holds.
% \begin{lemma}
% Let $\alpha: \Phi \rightarrow \A$ be an abstraction function over a finite
% abstraction space $\A\subset\C$. The partial evaluation graph $G_\alpha^*$
% exists and is finite.
% \begin{proof}
% The abstraction $\alpha$ can map to only finitely many values in the
% finite abstraction space $\A$. Every evaluation step adds only versions
% from a reduced and finite set of version $\Loc_\Prog \times \A \subset
% \vers_\Prog$. Hence the set of vertices of partial evaluation graph
% equally finite. A transition is added after a version $\langle \ell,
% \_\rangle$ for every outgoing transitions $t \in \TPout(\ell)$ only.
% Since $\TPout(\ell) \subseteq \T_\Prog$ is finite, only finitely many
% transitions can exist between any two versions.
% The graph grows with every evaluation step, hence the limit exists and
% it must be reached in a finite number of evaluation steps.
% \end{proof}
% \end{lemma}
For better readability, we denote abstracted constraints with the set notation
using curly braces. We write $\langle \ell, \{\psi_1, \dots, \psi_m\}\rangle$
instead of $\langle \ell, \psi_1 \land \dots \land \psi_m\rangle$, in order to
highlight the condensing characteristic of the abstraction.
Consider program $\Prog_3$ from \fref{fig:ex_prob} on \pref{fig:ex_prob}
again. We define $\psi_1,\psi_2,\psi_3 \in \C$ and an abstraction space
$\A_\Psi$ where
In the second step, the transitions $t_{1a}$, $t_{1b}$, and $t_2$ are unfolded
from $\langle \ell_1, \texttt{true}\rangle$, yielding three new transitions,
Note that $t_{1a,2}, t_{1b,2}, t_{1a,3}$, and $t_{1b,3}$ transition to
versions $\langle \ell_1,~\{\psi_1,~\psi_3,~\psi_4\}\rangle$ and
$\langle \ell_1, \{\psi_1,\psi_2,\psi_4\}\rangle$ that already exist in the
previous graph $G_2$.
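The stepwise construction above can be mimicked by a small worklist loop. The following Python sketch is a toy under invented assumptions: a hypothetical one-variable loop at a location `l1`, with intervals standing in for constraint sets and interval widening over a finite threshold set playing the role of the abstraction. Without the widening, the loop below would produce a fresh version per iteration and never terminate:

```python
# Toy worklist construction of a partial evaluation graph. Versions are
# (location, interval) pairs; the interval stands in for the constraint set.
# Hypothetical program: a single loop at l1 with update x := x + 1.

INF = float("inf")

def unfold(interval, delta=1):
    # unfolding of the loop body x := x + delta on an interval
    lo, hi = interval
    return (lo + delta, hi + delta)

def abstract(interval, thresholds=(0, 1)):
    # widen each bound to the nearest threshold (or +-infinity);
    # the finite threshold set keeps the set of possible versions finite
    lo, hi = interval
    lo = max((t for t in thresholds if t <= lo), default=-INF)
    hi = min((t for t in thresholds if t >= hi), default=INF)
    return (lo, hi)

def evaluate(start=("l1", (0, 0))):
    graph, worklist = set(), [start]
    while worklist:
        loc, iv = worklist.pop()
        if (loc, iv) in graph:
            continue              # version evaluated before: fixpoint reached
        graph.add((loc, iv))
        worklist.append((loc, abstract(unfold(iv))))
    return graph

print(sorted(evaluate()))  # three versions; without abstract() this diverges
```

The loop stops as soon as an unfolding hits an already existing version, mirroring how $t_{1a,2}$, $t_{1b,2}$, $t_{1a,3}$, and $t_{1b,3}$ above lead back into $G_2$ instead of creating new vertices.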
\begin{definition}[Localised abstraction]
A \emph{localised} abstraction function is a function $S : \vers_\Prog
\rightarrow \vers_\Prog$ that \emph{can} abstract the constraint of a
version, but is not required to do so. For any version $\langle
\ell,\varphi\rangle \in \vers_\Prog$ the abstraction is required to be of
the form $S (\langle \ell, \varphi' \rangle)$ with
$\llbracket\varphi\rrbracket \subseteq \llbracket \varphi'\rrbracket$.
We have dropped the requirement for a finite abstraction space for now and
will reintroduce it in a slightly altered form later on, in order to still
\fref{fig:ex_localized_pba}. It looks very similar to before, but we saved
some entailment checks on location $\ell_2$, in return for which the
versions at $\ell_2$ are no longer collapsed.
practice the abstraction is expensive and abstracting fewer locations can
speed up the partial evaluation. The real strength of localised abstraction
will be seen when only evaluating a subset of transitions of $\Prog_3$.
The property-based abstraction in Example \ref{ex:localized_pba} was
carefully crafted to guarantee a finite abstraction graph. In fact, we will see
that the localised abstraction function must abstract on at least one location
per loop in the original program. This is formalised in Lemma
\ref{lem:localized_pba_finite}. But before that, we will adapt the evaluation one
only a subset of transitions $T \subseteq \T_\Prog$. This way the evaluation can
be focused on only the parts of the program where the classical analysis failed.
The subset is not required to be connected. How to select the transitions for
the best evaluation results is discussed in \Cref{ch:implementation}.
\begin{definition}[Evaluation of a subset of transitions]
Let $\Prog$ be a \gls{pip} and $T \subseteq \T_\Prog$ be a subset of
transitions of $\Prog$. For a localised abstraction $S : \vers_\Prog
\rightarrow \C$ an evaluation step $\evaluate_{\Prog,S,T} : \G_\Prog
\rightarrow \G_\Prog$ is defined as
subset of transitions $T = \{t_{1a}, t_{1b}\}$, because the two loops are
the only problematic transitions during analysis with \gls{mprf}s. In
addition, we reduce the number of properties. The evaluation uses the
localised property-based abstraction $S_{\bot,\Psi_{\ell_1},\bot}$ with
$\Psi_1 = \{\psi_1, \psi_2, \psi_3, \psi_4\}$ where:
\psi_1 = 1 \leq x, \hspace{1cm} \neg\psi_1 = 1 > x,\\
\psi_2 = 2 \leq x, \hspace{1cm} \neg\psi_2 = 2 > x,\\
\psi_3 = x \leq 3, \hspace{1cm} \neg\psi_3 = x > 3,\\
\psi_4 = x \leq 4, \hspace{1cm} \neg\psi_4 = x > 4,\\
Note that the transitions $t_{2,1}$ and $t_{2,2}$ are new transitions but
they transition to the trivial version instead of being unfolded due to $t_2
\caption{Partial evaluation graph of $\Prog_3$ from \fref{fig:ex_prob}
on a subset of transitions $T = \{t_{1a},t_{1b}\}$ and with a localised
property-based abstraction
In practice, versions that have no incoming transitions in $G_0$ are
removed before the evaluation to avoid unnecessary unfoldings. Nonetheless,
if such a version is reachable in the partial evaluation, it is
re-added by the evaluation and evaluated in a later step.
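This pruning step amounts to a simple filter over the version graph. The sketch below uses hypothetical version names and models transitions as plain (source, target) pairs:

```python
# Toy pruning of start versions: keep the designated start version and every
# version with at least one incoming transition; anything else would only
# trigger unnecessary unfoldings. A pruned-but-reachable version simply
# reappears during later evaluation steps.

def prune_unreachable_starts(versions, transitions, start):
    has_incoming = {tgt for _, tgt in transitions}
    return {v for v in versions if v == start or v in has_incoming}

versions = {"v0", "v1", "v2"}
transitions = {("v0", "v1")}          # nothing ever enters v2
print(prune_unreachable_starts(versions, transitions, "v0"))  # {'v0', 'v1'}
```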
The partial evaluation over the entire program is a special case of the partial
evaluation with a subset $T = \T_\Prog$. All the following lemmas and theorems
hold equally for a (global) abstraction and partial evaluations of the entire
program.
\ref{ex:component_pe}, the localised property-based abstraction was carefully
crafted so that the evaluation graph is finite. As was already predicted, the
partial evaluation graph for a localised abstraction function is finite when
at least one location per loop in the original program is abstracted to a finite
abstraction space.
$\Prog$ with a localised abstraction $S$, and let
$\set{S(\langle\ell,\varphi\rangle)}{\varphi \in 2^{\C_\PV}}$ be finite for
some $\ell \in L$ on any loop $L$ of $\Prog$.
By assumption, there must exist a location $\ell^* \in L$ for which
$\set{S(\langle\ell^*,\varphi\rangle)}{\varphi \in 2^{\C_\PV}}$ is finite. At
the same time, versions $\langle\ell^*,\_\rangle$ must appear an infinite
number of times on $p$, and they can only be abstracted to finitely
many possible versions, which is a contradiction to the versions on $p$
We will proceed with proving that the partial evaluation graph restricted to a
subset of transitions can be used for constructing a new program $\Prog'$ with
equal expected runtime complexity. All the results hold equally for partial
evaluation of the entire program, independently of the choice of the (global
or localised) abstraction function, as long as the localised evaluation function
\item there exists a unique general transition $\bar{g}_i \in \GTP$ and
some $\varphi \in 2^{\C_\PV}$ for which the following property holds:
Theorem \ref{thm:correctness}, which shows that any runtime-complexity bounds for
the partial evaluation are runtime-complexity bounds of the original program,
demonstrating the correctness of the construction. Lemma
\ref{lem:constrfgeq} and Theorem \ref{thm:thightness} show that the worst-case
expected runtime-complexity does not worsen with partial evaluation either.
admissible finite prefixes, since non-admissible ones do not count in the
expected runtime-complexity as they have a probability of zero.
Let $\Prog$ be a \gls{pip}, $s_0 \in \Sigma$ an arbitrary initial state,
$\scheduler \in \MDS_\Prog$ a Markovian scheduler, and $\Prog'$ the partial
evaluation as described in Definition \ref{def:partial_evaluation} with a
localised abstraction function $S$ and the subset of transitions $T
\subseteq \TP$. There exists a Markovian scheduler $\scheduler' \in
\MDS_{\Prog'}$, such that for all admissible finite prefixes
$f\in\fpath_\Prog$ with $\prSs(\varf) > 0$ there exists a finite prefix
$\varf' =
\simu_{\Prog,S,T}(\varf)
\in g, \varphi' \in 2^{\C_\PV} \text{ and } t \not\in T}$. $g'$ is the
general transition that contains all the unfolded and copied transitions for
$t \in g$. For better readability we define a function
$\texttt{eval}_{\Prog,S,T} : \GTP \rightarrow \GTPn$ where
2^{\C_\PV}$. $s' \models \varphi$ is not guaranteed. $s' \not\models\varphi
\Land \tau_g$ if and only if $s \not\models \varphi$. In that case, the
$\scheduler'$ is indeed a valid scheduler: Case 1, let $s' \models \varphi$.
Case 1.1, let $\scheduler(\ell, \bar{t'},s')$ select $(g_\bot, s')$. In
that case, $\scheduler(\ell', t', s') = (g_\bot, s')$ as well. This is
correct because if no $(g, \tilde{s}) \in \GTPnout(\ell) \times \Sigma$ with
$\tilde{s}|_\PV = s'$ and $\tilde{s} \models \tau_g$ exists, then neither
does any $(g', \tilde{s'}) \in \GTPout(\ell') \times \Sigma$, because all
general transitions in $\GTPout(\ell')$ have more restrictive guards than
their original counterparts.
\rem{If unreachable transitions and versions are removed from the
program, the original scheduler might select a general transition that has
At this point it becomes clear why filtering unreachable transitions
and versions can be done in practice. The newly constructed prefix can
For a \gls{pip} $\Prog$, the partial evaluation $\Prog'$ of $\Prog$ as
described in Definition \ref{def:partial_evaluation}, and an initial state
$s_0 \in\Sigma$ the following holds:
Taking the supremum over all schedulers on both sides of Equation
\eqref{eq:esleq} does not change the relation, since we just found for
The following Theorem \ref{thm:thightness} shows that the partial evaluation
does not worsen the worst-case expected runtime complexity either. Note that
this does not imply that a perfectly tight bound for the worst-case expected
runtime can be found. Similarly to Lemma \ref{lem:constrfleq}, we first
construct a new scheduler $\scheduler \in \HDS_\Prog$ and show that, when using
this scheduler, a finite path of equal probability and runtime-complexity
exists in $\Prog$ for every finite path in $\Prog'$.
By Lemma \ref{lem:hdsvsmds}, the fact that we used a history-dependent
scheduler does not matter in the supremum, and we obtain the following
For a given finite set of variables $A \subseteq \V$, the set of
\textit{linear non-strict inequality constraints} with variables $A$ is
defined as the set
% Analogously, the set of \textit{linear strict inequality} constraints is
% defined as the set
% \begin{align*}
% \Lin^{<}=\set{a_0 + \sum_{i=1}^{n} a_i v_i < 0}{a_i \in \R}.
% \end{align*}
% The union of strict and non-strict linear constraints is called
% \textit{mixed linear inequality} constraints denoted as $\Lin = \Lin^{\leq}
% \union \Lin^{<}$.
The set of \textit{non-strict polynomial inequality constraints} with
variables $A$ is defined as the set
% whereas the set of \textit{strict polynomial inequality} constraints is
% defined as the set
% \begin{equation*}
% \C^< = \set{e < 0}{e \in \Z[\V]},
% \end{equation*}
% and the set of \textit{mixed polynomial inequality} constraints is the set
% $\C = \C^{\leq} \union \C^{<}$.
Let $\Lin$ and $\C$ denote the linear and polynomial constraints with
variables $A = \V$.
The set of non-strict linear constraints is a subset of the set of
non-strict polynomial constraints. For the most part, we will just refer to
\enquote{constraints} and mean linear constraints. Whenever we encounter
polynomial constraints, we will mention it explicitly.
For foundations on \gls{fo}-logic we refer the reader to \cite{barwise1977}.
\gls{fo} logic \cite{barwise1977}. In this thesis, we will only consider integer
arithmetic. Let $\Zs := (\Z, +, \cdot, 0, 1, \leq)$ be the structure of standard
integer arithmetic. For an assignment $\beta : \V \rightarrow \Z$ and a formula
$\varphi \in \Phi$ we write $\beta \models \varphi$ instead of $\Zs, \beta
\models \varphi$. Finally, we define two trivial formulas $\texttt{true} = (0 =
0)$ and $\texttt{false} = (1 < 0)$.
Two constraints, or sets of constraints, $\varphi_1$ and $\varphi_2$ are
considered \textit{equivalent}, denoted as $\varphi_1 \equiv \varphi_2$, when
$\llbracket \varphi_1 \rrbracket = \llbracket \varphi_2 \rrbracket$.
The satisfiability of sets of polynomial constraints is known to be
undecidable \cite{kremer2016lncs}.
Thus, whenever we encounter polynomial constraints, they need to be
over-approximated by linear constraints. The theory relevant for this thesis is
quantifier-free linear integer arithmetic (usually called \texttt{QF\_LIA}).
Let $A \subseteq \V$ and $\varphi \in 2^\C$. The projection of $\varphi$ onto
$A$, denoted as $\varphi|_A$, is a set of constraints $\varphi|_A\in
2^{\C_A}$, with $\llbracket \varphi|_A \rrbracket = \set{\beta|_A}{\beta \in
\Sigma_A$. It computes the assignment after an update $u$ for a given
assignment $s \in \Sigma$ before the update:
Let $A, B \subseteq \V$. Given an assignment $s\in\Sigma$, a set of
constraints $\varphi \in 2^\C$, and an update $u: A \rightarrow \Z[B]$,
let, without loss of generality, $A' = \set{x'}{x \in A} \subset \V$ be a
set of fresh variables with $A' \cap (A \union B) = \emptyset$. Under slight
misuse of notation, we write
used abstract domains are Polyhedra \cite{bagnara2005convexpolyhedron},
Octagons \cite{mine2001wcre, mine2007arxiv, mine2006hosc, truchet2010synacs},
and Boxes \cite{gurfinkel2010}.
special case of polyhedra. Octagons are used in \gls{koat} during the generation
of invariants. They are faster to work with than polyhedra but less precise,
since they cannot represent all linear constraints and are forced to
over-approximate some of them.
probability two subsets of outcomes (e.g. $A = \{\omega \in \Omega\mid\text{
coin shows heads}\}$ and $B = \{\omega \in \Omega\mid\text{coin shows
tails}\}$) so you can measure the probability of the union of both subsets ($A
\min (\support(D)) \leq d \leq \max (\support(D)) & \begin{aligned}&\text{if }
\min(\support(D)) \text{ and } \\&\max (\support(D)) \text{ are
known},\end{aligned} \\
\begin{comment}
The syntax and semantics of probabilistic programs has been extensively
studied:
- [Supermartingales,
2017](https://arxiv.org/pdf/1709.04037.pdf#cite.Billingsley%3Abook)
- [Probabilistic Koat2, 2021](https://arxiv.org/pdf/2010.06367.pdf)
\end{comment}
accumulated rewards over a whole process. In the literature
\cite{puterman1994markov, puterman1990markov}, the properties of a given Markov
decision process with regard to the highest achievable rewards have been
\item The transition probability, which will be defined later in
\Sref{ssec:probabilityspace}.
\item The reward is 1 for every non-terminating transition and 0 for the
terminating transition $t_\bot$.
$\scheduler_\text{opt} \in \MDS_{F_i} \subset \MDS$ exists that
maximises the expected reward when limited to the finite set of
configurations. For all remaining configurations $\scheduler_\text{opt}$
may behave arbitrarily. Let $\nu_i$ be the reward of this optimal
scheduler.
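The idea of an optimal scheduler maximising the expected accumulated reward on a finite set of configurations can be illustrated with standard value iteration. The two-state process, its actions, and its rewards below are invented for illustration; this is not KoAT's machinery:

```python
# Hedged illustration of the highest achievable expected reward in a tiny
# hypothetical Markov decision process. In state "run" the scheduler chooses
# an action; each action carries a reward and a distribution over successors.
MDP = {
    "run": {
        "a": (1.0, [("run", 0.5), ("stop", 0.5)]),  # reward 1, may loop
        "b": (0.0, [("stop", 1.0)]),                # reward 0, terminates
    },
    "stop": {},  # terminal: no actions, no further reward
}

def value_iteration(mdp, sweeps=100):
    # v[s] converges to the optimal expected accumulated reward from s
    v = {s: 0.0 for s in mdp}
    for _ in range(sweeps):
        v = {
            s: max(
                (r + sum(p * v[t] for t, p in succ) for r, succ in acts.values()),
                default=0.0,
            )
            for s, acts in mdp.items()
        }
    return v

print(value_iteration(MDP)["run"])  # approaches 2.0, the optimal reward
```

Here the optimal scheduler always picks action `a`, yielding the fixpoint $V(\text{run}) = 1 + 0.5 \cdot V(\text{run}) = 2$; this value plays the role of the reward $\nu_i$ of the optimal scheduler above.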
All locations in a subset of transitions are denoted by a function $\Loc :
2^{\TP} \rightarrow 2^{\Loc_\Prog}$ with
For a subset of transitions $T \subset \T_\Prog$ the entry transitions are
described by a function $\TPin : 2^{\TP} \rightarrow 2^{\TP}$ and the exit
transitions are described by a function $\TPout : 2^{\TP} \rightarrow 2^{\TP}$
with
of subsets of transitions $T_L$ which we call the \emph{transition loops} of $L$
with
\begin{equation*}
\begin{gathered}
T_L = \set{\{t_0,\dots,t_n\}}{t_i = (\ell_i, \_, \_, \_, \ell_{i+1}) \in
\T_\Prog, 0 \leq i < n, t_n = (\ell_n, \_, \_, \_, \ell_0) \in \T_\Prog}.
\end{gathered}
A subset of transitions $T \subseteq \T_\Prog$ is called a \emph{strongly
connected} component when for every two locations $\ell, \ell' \in V(T)$ there
exists a path between the two using only the transitions from $T$.
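The helper notions for a subset of transitions (its locations, entry and exit transitions, and the strong-connectivity check) can be sketched as follows, with a transition modelled as a plain (source, target) pair of locations; guards, updates, and probabilities are omitted, and the example program is invented:

```python
# Toy versions of Loc(T), T_in(T), T_out(T) and the strong-connectivity check
# for a subset T of the transitions of a program.

def loc(T):
    """All locations touched by the transitions in T."""
    return {l for src, tgt in T for l in (src, tgt)}

def t_in(all_transitions, T):
    """Entry transitions: lead into a location of T but are not in T."""
    return {t for t in all_transitions if t not in T and t[1] in loc(T)}

def t_out(all_transitions, T):
    """Exit transitions: leave a location of T but are not in T."""
    return {t for t in all_transitions if t not in T and t[0] in loc(T)}

def reachable(start, T):
    """Locations reachable from `start` using only transitions in T."""
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for src, tgt in T:
            if src == node and tgt not in seen:
                seen.add(tgt)
                stack.append(tgt)
    return seen

def strongly_connected(T):
    """True iff every location of T reaches every other one within T."""
    return all(loc(T) <= reachable(l, T) for l in loc(T))

# hypothetical program: l0 -> l1, loop l1 -> l1, l1 -> l2
transitions = {("l0", "l1"), ("l1", "l1"), ("l1", "l2")}
T = {("l1", "l1")}                       # the loop as the analysed subset
print(t_in(transitions, T))              # {('l0', 'l1')}
print(t_out(transitions, T))             # {('l1', 'l2')}
print(strongly_connected(T))             # True
```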
programming paradigms as tightly and quickly as possible \cite{giesl2017aprove,
montoya2014aplas,alarcon2017muterm,irankfinder2018wst}.
two program variables $x$ and $y$, and one temporary variable $u$, which is used
to non-deterministically sample $x$ on the first line of the program. The
\gls{cfr} on sub-\gls{scc} level, the program can be transformed into an
equivalent program (see \fref{fig:classic_pe_c}) where the analysis with
\gls{mprf} succeeds at finding a finite runtime-complexity bound, because now
the variable $x$ is constrained when entering the loop.
Consider the very similar probabilistic integer program $\Prog_2$ displayed in
\fref{fig:prob_c}. The variables $x$ and $y$ are now only incremented or
decremented with 50\% probability in every iteration. The program will probably
always terminate, since a longer runtime becomes increasingly unlikely. The
current version of \gls{koat} fails to find finite runtime and expected
complexity bounds for this program. It would be useful to transform $\Prog_2$
into an equivalent program like the one shown in \fref{fig:prob_pe_c}, similar
to the \gls{cfr} technique presented by \citeauthor{giesl2022lncs}
\cite{giesl2022lncs} and \citeauthor{Domenech19} \cite{Domenech19}.
\gls{koat} with regard to non-probabilistic integer programs by reimplementing
the control flow refinement technique of iRankFinder \cite{irankfinder2018wst}
natively in \gls{koat} instead of calling the external library. By adapting the
method of partial evaluation to the paradigm of probabilistic programs and
proving its correctness, partial evaluation is made available for probabilistic
programs for the first time (to the best of our knowledge).
\Cref{ch:preliminaries}, focusing especially on constraints, probability theory,
and the formalism behind (probabilistic) integer programs.
In \Cref{ch:theory}, the \gls{cfr} techniques from \citeauthor{Domenech19} are
adapted to \Glspl{pip}. We prove the theoretical correctness of the
construction. Furthermore, the technique is extended to refine only a part of
the original program.
\Cref{ch:implementation} focuses on the implementation and its integration into
the existing analysis. The performance of the new implementation is analysed.
Finally, in \Cref{ch:conclusion}, we recapitulate the results of this thesis and
give an outlook on future research.
\begin{vplace}
\textit{
Hanna and Thibaud, I would like to express my deepest gratitude for your
continuous encouragement and assistance. Your unwavering belief in my
abilities and your willingness to lend a helping hand during challenging
times have been instrumental in the successful completion of this thesis.
Thank you, from the bottom of my heart.
}
\end{vplace}
\newcommand{\PRh}[0]{\texttt{PR}_\mathrm{h}}
\newcommand{\PRhv}[0]{\texttt{PR}_\mathrm{hv}}
\newcommand{\PRc}[0]{\texttt{PR}_\mathrm{c}}
\newcommand{\PRcv}[0]{\texttt{PR}_\mathrm{cv}}
\newcommand{\PRd}[0]{\texttt{PR}_\mathrm{d}}