QIVGMR6INKBVT5IFCO2UJJQEO4PVW3ZM7IDTCTCOKOX7VKG7UPFAC 6PTHERE26NXSC3ZXSI5UYKWTMPYZD4WMYFBJKZE3MRRTKC7MIU6QC JCK77YLTPL6VV5ZOTMGUJSFZ5DNMS3MZGP5UQLXYW6WSMEK6GAQAC 567NUQ633R5ESTE2LWXVUGM7XEZGJ65IRKPOCDZJS3JE5RAMR5KAC X7SPUV4CXRX7D5KKL6UVGHC7JWRWCA4QAJVVQISE64UEU7GXJNRQC DDNESFMD4LGNTDGTCW27FQPCOR5Y5MJWN2TWCSIRNBK6QGBITBRQC QCVDPRXH57UFLU4YV3P2I5RQZGCBUKROYUZHDWOCHL43KDYN7MPAC RVA3DOCNBWFBWA2MLC2TVQOEY25AUCYGEJSWMD72QRQ26AFLVZUAC MQJMCLGFSBSFJ3JPCIHQSBZS4XZPGJI7NGGRASX5UVGJO6HES25AC B5T3HFKSQF3T4T6Z2KZOA5ZDMI6TVON7BS7KKB3EAGGYAS7RJPBAC KUEZAMEH2M3K6EDHWMYXQMHZ2QTTH4KBM5CO7A7EB64ABEWRE6WQC UVN5HOW3H2OVESJ62UFNQNS6VXHI5TRV22HQ44EOYU42453SDD2QC EIPHTX56WTRFBMBNA4VYMKK4L7GWHA4Q2AQ3BCG2ZGZB3DBRENBQC QMVJ2ZKX5ALAMMIL7E47ZQS67QFZJ7Z3HAY5G7X72KPEQEHWRGCQC }@InProceedings{gulwani2009,author = {Sumit Gulwani and Sagar Jain and Eric Koskinen},booktitle = {Proceedings of the 2009 {ACM} {SIGPLAN} Conference on Programming Language Design and Implementation, {PLDI} 2009, Dublin, Ireland, June 15-21, 2009},title = {Control-flow refinement and progress invariants for bound analysis},year = {2009},editor = {Michael Hind and Amer Diwan},pages = {375--385},publisher = {{ACM}},bibsource = {dblp computer science bibliography, https://dblp.org},biburl = {https://dblp.org/rec/conf/pldi/GulwaniJK09.bib},doi = {10.1145/1542476.1542518},file = {:/home/paradyx/MA/literatur/gulwani2009.pdf:PDF},groups = {Partial Evaluation},timestamp = {Fri, 25 Jun 2021 14:48:54 +0200},url = {https://doi.org/10.1145/1542476.1542518},
@InProceedings{gallagher2019eptcs,author = {John P. Gallagher},booktitle = {Proceedings Seventh International Workshop on Verification and Program Transformation, VPT@Programming 2019, Genova, Italy, 2nd April 2019},title = {Polyvariant Program Specialisation with Property-based Abstraction},year = {2019},editor = {Alexei Lisitsa and Andrei P. Nemytykh},pages = {34--48},series = {{EPTCS}},volume = {299},bibsource = {dblp computer science bibliography, https://dblp.org},biburl = {https://dblp.org/rec/journals/corr/abs-1908-07189.bib},doi = {10.4204/EPTCS.299.6},file = {:/home/paradyx/MA/literatur/gallagher2019eptcs.pdf:PDF},groups = {Partial Evaluation},timestamp = {Fri, 03 Jun 2022 08:18:12 +0200},url = {https://doi.org/10.4204/EPTCS.299.6},}
% TeX root = ../main.tex\begin{tikzpicture}[program,initial left]\node[state,initial] (l0) {$l_0$};\node[state] (l1) [right=3cm of l0] {$l_1$};\node[state] (l2) [right=3cm of l1] {$l_2$};\path[->] (l0) edge node {$t_1:\textit{true}$} (l1)(l1) edge node {$t_3:x \leq 0$} (l2)edge[loop above] node {$t_2: \begin{aligned}x &> 0 \\x &:= x - \textsc{Bern}(0.5)\end{aligned}$}();\end{tikzpicture}
% TeX root = ../main.tex\begin{tikzpicture}[program,initial left]\node[state,initial] (l0) {$l_0$};\node[state] (l1) [right=3cm of l0] {$l_1$};\node[state] (l2) [right=3cm of l1] {$l_2$};\path[->] (l0) edge node {$t_1:\textit{true}$} (l1)(l1) edge node {$t_5:x \leq 0$} (l2)edge[in=100,out=145,looseness=15,draw=rot-75] node[aboveleft] {$t_2:\begin{aligned} p&=0.5 \\ x &> 0 \\ x&:=x+1\end{aligned}$} (l1)edge[in=35,out=80,looseness=15,draw=rot-75] node[above right] {$t_3:\begin{aligned}p&=0.5 \\x &> 0 \\ x&:=x+2\end{aligned}$} (l1)edge[loop below] node {$t_4:\begin{aligned} x &> 0 \\ x&:=x-1\end{aligned}$} ();\end{tikzpicture}
% TeX root = ../main.tex\begin{tikzpicture}[program,initial left]\node[state,initial] (l0) {$l_0$};\node[state] (l1) [right=3cm of l0] {$l_1$};\node[state] (l2) [right=3cm of l1] {$l_2$};\path[->] (l0) edge node {$t_1:\textit{true}$} (l1)(l1) edge[loop above] node {$t_2: \\begin{aligned}x & > 0 \\1 &\leq t \leq 3 \\x &:= x - t\end{aligned}$} ()(l1) edge node {$t_3:x \leq 0$} (l2);\end{tikzpicture}
% TeX root = ../main.tex\begin{tikzpicture}[program,initial left]\node[state,initial] (l0) {$l_0$};\node[state] (l1) [right=3cm of l0] {$l_1$};\node[state] (l2) [right=3cm of l1] {$l_2$};\path[->] (l0) edge node {$t_1:\textit{true}$} (l1)(l1) edge node {$t_4:x \leq 0$} (l2)edge[loop above] node {$t_2:\begin{aligned} x &> 1 \\ x&:=x+1\end{aligned}$} ()edge[loop below] node {$t_3:\begin{aligned} x &> 0 \\ x&:=x-1\end{aligned}$} ();\end{tikzpicture}
% TeX root = ../main.tex\begin{tikzpicture}[scale=0.5]\begin{axis}[axis lines = center,axis on top,xmin=0,xmax=6.3,ymin=0,ymax=5.3,y=1cm,x=1cm,ytick={0,...,5},xtick={0,...,6},xlabel = x,ylabel = y,font=\tiny]% Octagon\addplot[mark=none,draw=black,fill=rwth-50] coordinates {(1,2) (1,3)(3,4) (4,4) (5,2) (4,1) (2,1)} -- cycle;% Constraints% -x + 2y = 5\addplot[mark=none,dashed] {(x +5)/2};% y = 4\addplot[mark=none,dashed] {4};% 2x + y = 12\addplot[mark=none,dashed] {-2*x + 12};% x - y = 3\addplot[mark=none,dashed] {x - 3};% y = 1\addplot[mark=none,dashed] {1};% x + y = 3\addplot[mark=none,dashed] {-x + 3};% x = 1\addplot[mark=none,dashed] coordinates {(1, 0) (1,5.3)};% Points\addplot[only marks,mark=o] coordinates {(3,4) (4,4)(1,3) (2,3) (3,3) (4,3)(1,2) (2,2) (3,2) (4,2) (5,2)(2,1) (3,1) (4,1)};\end{axis}\end{tikzpicture}
% TeX root = ../main.tex\begin{tikzpicture}[scale=0.5]\begin{axis}[axis lines = center,axis on top,xmin=0,xmax=6.3,ymin=0,ymax=5.3,y=1cm,x=1cm,ytick={0,...,5},xtick={0,...,6},xlabel = x,ylabel = y,font=\tiny]% Octagon\addplot[mark=none,draw=black,fill=rwth-50] coordinates {(1,2) (1,3)(2,4) (4,4) (5,3) (3,1) (2,1)} -- cycle;% Constraints% x = 1\addplot[mark=none,dashed] coordinates {(1, 0) (1,5.3)};% -x + y = 2\addplot[mark=none,dashed] {x + 2};% y = 4\addplot[mark=none,dashed] {4};% x + y = 8\addplot[mark=none,dashed] {-x + 8};% -x + y = -2\addplot[mark=none,dashed] {x - 2};% y = 1\addplot[mark=none,dashed] {1};% x + y = 3\addplot[mark=none,dashed] {-x + 3};% Points\addplot[only marks,mark=o] coordinates {(2,4) (3,4) (4,4)(1,3) (2,3) (3,3) (4,3) (5,3)(1,2) (2,2) (3,2) (4,2)(2,1) (3,1)};\end{axis}\end{tikzpicture}
% TeX root = ../main.tex\begin{tikzpicture}[scale=0.5]\begin{axis}[axis lines = center,axis on top,xmin=0,xmax=6.3,ymin=0,ymax=5.3,y=1cm,x=1cm,ytick={0,...,5},xtick={0,...,6},xlabel = x,ylabel = y,font=\tiny]% Box\addplot[mark=none,draw=black,fill=rwth-50] coordinates {(1,2) (1,4) (5,4) (5,2)} -- cycle;% Intervals\addplot[mark=none,dashed] coordinates {(1,0) (1,2)};\addplot[mark=none,dashed] coordinates {(5,0) (5,2)};\addplot[mark=none,dashed] coordinates {(0,2) (1,2)};\addplot[mark=none,dashed] coordinates {(0,4) (1,4)};% Points\addplot[only marks,mark=o] coordinates {(1,4) (2,4) (3,4) (4,4) (5,4)(1,3) (2,3) (3,3) (4,3) (5,3)(1,2) (2,2) (3,2) (4,2) (5,2)};\end{axis}\end{tikzpicture}
\end{comment}\section{Modules}The code specific to this thesis is split into a couple of modules. The code isfully commented. Nonetheless, this section gives a short overview over themodules to ease the understanding of the code.\subsection{Loops}\subsection{PartialEvaluation}\subsection{Abstraction}\section{Command-Line Interface}The techniques described above were implemented natively in Koat2. They areavailable as part of the normal analysis via the flags \texttt{--cfr-native}.The old option for control-flow refinement \texttt{--cfr} was renamed to\texttt{--cfr-ir}, and can probably be removed in the future.During normal analysis the control-flow refinement is done on every \gls{scc}and the bounds of the resulting scc is compared to the previous bounds.A full program can be refined using the native control-flow refinement with aseparate command \texttt{cfr}. For the new control-flow refinement the option isadded to switch between the various method to select location for abstractionand the abstraction method itself (see \Sref{sec:abstraction}). An summary ofthe compatible options is given in \tref{tab:options}.As discussed in \Sref{sec:updateinvariants} the polyhedra computed duringevaluation are tighter than the invariants computed during preprocessing. Inorder to update the invariants after the partial evaluation a new option\texttt{--update-invariants} is added and activated by default.% \begin{table}% \center% \begin{tabular}[ccc]% \hline% \begin{comment}% analysis --cfr=native,update-invariants, --o% --cfr=irankfinder,% --cfr=off% prob-analysis --cfr=native% cfr --native% --irankfinder% \end{comment}% \hline% \end{tabular}% \caption{}% \end{table}\begin{comment}cfr --
The project was compiled with OCaml 4.14.0.
\addConstraint{ x_i }{\leq 1,\protect\quad\label{eq:lip1-leq1}}{i = 1,\dots,n}\addConstraint{ x_i }{\geq 0,\protect\quad\label{eq:lip1-geq0}}{i = 1,\dots,n}
\addConstraint{ x_i }{\leq 1,\protect\quad\label{eq:ilp-leq1}}{i = 1,\dots,n}\addConstraint{ x_i }{\geq 0,\protect\quad\label{eq:ilp-geq0}}{i = 1,\dots,n}
\ref{eq:lip1-hit} guarantees that every loop is \enquote{hit}, while\ref{eq:lip1-leq1} and \ref{eq:lip1-geq0} enforce that every location is onlyselected at most once. The optimization goal \ref{eq:lip-vfs} being to select
\ref{eq:ilp-hit} guarantees that every loop is \enquote{hit}, while\ref{eq:ilp-leq1} and \ref{eq:ilp-geq0} enforce that every location is onlyselected at most once. The optimization goal \ref{eq:ilp-vfs} being to select
\begin{comment}# Textbautsteine---For random program variables in this thesis, $\Omega$ will be the set ofintegers and $\mathcal{F}$ the maximal $\sigma$-field in $\Omega$, thepower-set $2^\Omega$. However since the program variables are chosen atrandom, a specific path over a given program can also be considered a randomevent in the probability space and hence be considered as a ransom variable.In this case $\Omega$ is the set of all runs $\mathcal{Runs}$. The exactdefinition will be introduced later on in \Sref{}.---\begin{definition}[Linear Constraints]Let $V$ be a fixed set of integerVariables $\{x_1, x_2, \dots, n\} $. A linear constraint is defined as $\psi= a_0 + a_1 x_1 + \dots + a_n x_n \diamond 0$ where $a_i \in \mathbb{N}$ areinteger coefficients, $x_i \in V$ are variables and $\diamond \in \{>, <,\geq, \leq, =\}$. \end{definition}\begin{definition}[Assignment]An assignment $\sigma : V \mapsto \mathbb{Z}$ assigns an integer value toeach variable in $V$. $\sigma$ is a solution for a linear constraint $\psi$\end{definition} \Common operations can be defined over probability distributions as well, whichwill be very handy in probabilistic integer programs. Given two probabilitydistributions $p$, and $q$, the following operations are defined:\begin{equation*}p \diamond q (x) = \sum_{\substack{i,j \in \Z \\ i \diamond j = x}} p(i)\cdot q(i) \hspace{1cm} \with \diamond \in \{+, \cdot\}\end{equation*}\todo{more operators?}In the context of this thesis the discrete bounded distributions will be used todescribe the possible values of a variable. They can be used as if they werenon-probabilistic values, for exymple andIt can be used inside theA note non-probabilistic (fixed) value $k$ is equivalent to a distribution whichassignes the probability $1$ to this fixed value $k$ and $0$ to every othervalue.program itself during assignments when setting a new value to a variable orduring analysis to describe the possible values of the variables at a certainconfiguration.\[p_k(x) = \begin{cases}1, & x = k \\0, & \text{otherwise}\end{cases}\]\end{comment}
\begin{comment}- Term Rewrite Systems (TRS)- Transition Systems- A set of States S- A start location S_0 \in S- A set of transitions T \in S x C x S- Integer Transition Systems (ITS)Koat2 can work on integer programs with polynomial integerarithmetic.\cite{Brockschmidt16} To represent states and invariants in such aprogram, one needs polynomial constraints. As an extended concept Koat2 usesboxes and we will use polyhedra for the partial evaluation. Since those conceptsare closely related we will introduce them together.\end{comment}
\section{Preliminaries}
\newcommand{\confs}[0]{\textbf{\textsf{Conf}}}\newcommand{\runs}[0]{\textbf{\textsf{Runs}}}\newcommand{\fpath}[0]{\textbf{\textsf{FPath}}}\section{Preliminaries}\label{sec:preliminaries}\subsection{Constraints}\label{ssec:constraints}
\section{Constraints}In the following $\V$ denotes a countable finite set of variables $\V = \{v_1,\dots, v_n\}$.\begin{definition}[Constraints]
\begin{definition}[Constraints]\label{def:constraints}
\begin{definition}[Satisfiability]For a domain $D$ and a structure $\mathfrak{I} = (D, +, \cdot, 0, 1, \leq,<, =)$, a constraint $\psi \in \L$ is \textit{satisfiable} in when the formula$\psi$ is $\mathfrak{I} \models \psi$. We say a variable assignment $\beta: \V\rightarrow D$ \textit{satisfies} the constraint $\psi$ when $(\mathfrak{I},
\begin{definition}[Satisfiability]\label{def:satisfiability}For a domain $A$ and a structure $\mathfrak{A} = (A, +, \cdot, 0, 1, \leq,<, =)$, a constraint $\psi \in \L$ is \textit{satisfiable} in when$\mathfrak{A} \models \psi$. We say a (total) variable assignment $\beta: \V\rightarrow A$ \textit{satisfies} the constraint $\psi$ when $(\mathfrak{A},
In practice constraints are often provided in a more general form. Restrictingthe constraints to a specific form does not limit it's practicality, since they
Note that the constraints are restricted to (strict) inequality against zero. Inpractice constraints are often provided in a more general form. Restricting theconstraints to the specific form does not limit it's practicality, since they
When talking about integer arithmetic we will thus only refer to linearconstraints.\section{Abstract Domains}\label{sec:abstract_domains}Another way of reasoning about constraints is to see them in an $n$-dimensionalspace. Constraints partition the space excluding one half. In case of integerarithmetic the solutions are discrete points in this space. Linear constraintsdescribe half-spaces and a set of linear constraints describes a (possiblyopen) convex polyhedron. More about polyhedron in \Sref{sec:polyhedron}.\begin{example}\label{ex:1}\begin{figure}[h]\label{fig:ex1}\centering\begin{subcaptiongroup}\subcaptionlistentry{Formula $\varphi_1$}\input{figures/ch1_example1_phi1}\subcaptionlistentry{Formula $\varphi_2$}\input{figures/ch1_example1_phi2}\subcaptionlistentry{Formula $\varphi_3 = \varphi_1 \land \varphi_2$}\input{figures/ch1_example1_phi3}\end{subcaptiongroup}% \subbottom[Formula $\varphi_1$]{% \centering% \input{figures/ch1_example1_phi1}% }% \subbottom[Formula $\varphi_2$]{% \centering% \input{figures/ch1_example1_phi2}% }% \subbottom[Formula $\varphi_3 = \varphi_1 \land \varphi_2$]{% \centering% \input{figures/ch1_example1_phi3}% }\caption{Graphical representation of the set of constraints $\varphi_1$,$\varphi_2$ and $\varphi_3$ from Example \ref{ex:1}. Every constraint isrepresented by a line and the excluded side is marked by hatchings. }\end{figure}Consider the following constraints over the variables $\V = \{x, y\}$:\begin{align}\varphi_1 &= 2y \leq x \\\varphi_2 &= 2y > x^2 - 4\end{align}They are graphically represented in \fref{fig:ex1} using a 2D-coordinatesystem. Since the space is two-dimensional the linear constraint $\varphi_1$can be represented as a straight line and $\varphi_2$ as a parabola.The solutions are marked as circles. For example, the assignment $\sigma: x\mapsto 0, y \mapsto -2$ (colored red) satisfies $\varphi_1$ but not$\varphi_2$. The conjunction $\varphi_3 = \varphi_1 \land \varphi_2$ isclearly equal to the intersection of both sets of solutions. While$\varphi_1$ and $\varphi_2$ have infinitely many solutions, the conjunction$\varphi_3$ has not.\end{example}The problem of determining, if a set of constraints is satisfiable or findingvalid variable assignments for a given formula is called \gls{smt}-solving andis well researched. The theories relevant for this thesis the arequantifier-free linear integer arithmetic (usually called \texttt{QF\_LIA}) andquantifier-free polynomial integer arithmetic (\texttt{QF\_PIA}).
When talking about integer arithmetic on only refers to linear constraints.
Sometimes one want's to remove a variable from a formula or a set ofconstraints. It is especially useful, when having a formula with temporaryvariables that are only relevant to a certain transition but should notinfluence the program after wards.
The problem of determining, if a set of constraints is satisfiable or findingvalid variable assignments for a given formula is part of \gls{smt} and is wellresearched. The theories relevant for this thesis the are quantifier-free linearinteger arithmetic (usually called \texttt{QF\_LIA}) and quantifier-freepolynomial integer arithmetic (\texttt{QF\_PIA}).
Formally, removing a variable is equivalent to existentially quantifying it.However, existential quantification is not available in quantifier free integerarithmetic. Instead one searches for a formula that describes all originalsolutions but restricted to the smaller domain.
Sometimes one is not interested in the result to all variables but only to asubset of variables. It will later become relevant when handling temporaryvariables that are only relevant to the scope of a transition. In FO-logic onewould just existentially quantify the variable. However the new FO-formulawouldn't be quantifier-free anymore. Instead a new formula is searched that hasjust the same solutions when ignoring the values for the irrelevant variables.
Let $X \subseteq V$ and $\varphi$ be an FO-formula over the variables $V$and let $\sigma|_X$ denote the restriction of $\sigma$ to the smaller todomain $X$.
Let $X \subseteq V$ and $\varphi$ be a quantifier-free FO-formula over thevariables $V$ and let $\beta|_X$ denote the restriction of the assignment$\beta$ to the smaller to domain $X$.
a formula $\varphi'$, with $\llbracket \varphi' \rrbracket =\set{\sigma|_X}{\sigma \in \llbracket \varphi \rrbracket}$.\todo{use $\sigma$ and $\beta$ consistently.}
a quantifier-free FO-formula $\varphi'$ over the variables $X$, with$\llbracket \varphi' \rrbracket = \set{\beta|_X}{\sigma \in \llbracket\varphi \rrbracket}$.
Geometrically, the projection can be seen as set of solutions projected(casting a shadow) onto the remaining dimensions.
\section{Abstract Domains}\label{sec:abstract_domains}Another way of reasoning about solutions is to see them as points in an$m$-dimensional space for variables $X \subset \V$ and $|X| = m$. Everyconstraint partitions the space into one included and one excluded part. In caseof integer arithmetic the solutions are discrete points in this space.An abstract domains is a representation of those sets of points. Usually theyover-approximate the sets in order to facilitate operations. Commonly usedabstract domains are Polyhedra\cite{bagnara2005convexpolyhedron}Octagons\cite{mine2001wcre, mine2007arxiv, mine2006hosc,truchet2010synacs}, andthe Interval domain\cite{}\subsection{Operations}\label{ssec:ad_operations}All abstract domains support the following operations with varying computationaland memory requirements: emptiness, intersection, inclusion test, orthogonalprojection, upper, and lower bounds. There are many more but we will focus onlyon the ones important for this thesis. They allThe test for emptiness is equivalent to asking if a formula is satisfiable asevery point represents a solution to the formula. Intersecting two or moreabstract values is equivalent to the conjunction of one or more formulas.However, the intersection of two abstract values might include more values thanthe solutions of the formula conjunction due to the approximative nature ofabstract domains.\begin{example}\label{ex:intersection}\begin{figure}[h]\centering\begin{subcaptionblock}[t]{0.3\textwidth}\input{figures/ch1_example1_phi1}\caption{Formula $\varphi_1$}\end{subcaptionblock}\begin{subcaptionblock}[t]{0.3\textwidth}\input{figures/ch1_example1_phi2}\caption{Formula $\varphi_2$}\end{subcaptionblock}\begin{subcaptionblock}[t]{0.3\textwidth}\input{figures/ch1_example1_phi3}\caption{Formula $\varphi_3 = \varphi_1 \land \varphi_2$}\end{subcaptionblock}\caption{Graphical representation of the set of constraints $\varphi_1$,$\varphi_2$ and $\varphi_3$ from Example \ref{ex:intersection}. Every constraint isrepresented by a line and the excluded side is marked byhatchings.\label{fig:ex_intersection} }\end{figure}Consider the following constraints over the variables $\V = \{x, y\}$:\begin{align}\varphi_1 &= 2y \leq x \\\varphi_2 &= 2y > x^2 - 4\end{align}They are graphically represented in \fref{fig:ex_intersection} using a2D-coordinate system. Since the space is two-dimensional the linearconstraint $\varphi_1$ can be represented as a straight line and $\varphi_2$as a parabola.The solutions are marked as circles. For example, the assignment $\sigma: x\mapsto 0, y \mapsto -2$ (colored red) satisfies $\varphi_1$ but not$\varphi_2$. The conjunction $\varphi_3 = \varphi_1 \land \varphi_2$ isclearly equal to the intersection of both sets of solutions. While$\varphi_1$ and $\varphi_2$ have infinitely many solutions, the conjunction$\varphi_3$ has not.\end{example}The orthogonal projection computes the projection of an FO-formula, possiblyloosing some precision in the process. Geometrically, the projection can be seenas set of solutions projected (casting a shadow) onto the remaining dimensions.
Among others the following operations are important to our use-case.1. Emptiness2. Intersection3. Inclusion test3. Orthogonal projection4. Upper and lower bounds\subsection{Not necessarily closed convex polyhedra}\label{ssec:nncpolyhedra}
In the following we will give a short introduction to the abstract domains. Avisual example for the three abstract domains is given in\fref{fig:abstract_domains}.\begin{figure}\centering\begin{subcaptionblock}[t]{0.3\textwidth}\input{figures/ch2_abstract_domains_box}\caption{Interval domain (Box)\label{fig:abstract_domains_box}}\end{subcaptionblock}\begin{subcaptionblock}[t]{0.3\textwidth}\input{figures/ch2_abstract_domains_oct}\caption{Octagon domain\label{fig:abstract_domains_oct}}\end{subcaptionblock}\begin{subcaptionblock}[t]{0.3\textwidth}\input{figures/ch2_abstract_domains_poly}\caption{Polyhedra domain\label{fig:abstract_domains_poly}}\end{subcaptionblock}\caption{Visual comparison of the Interval domain, Octagons andPolyhedra\label{fig:abstract_domains}}\end{figure}\subsection{Not necessarily closed convex polyhedra}\label{ssec:polyhedra}
As discussed in \Sref{sec:constraints} strictness of linear constraints and inturn openness of half-spaces are not relevant to the integer domain. Hence whentalking about integer arithmetic one refers only to polyhedra.\todo{more about tightness of operations}\todo{say something aboutoverapproximation of polynomial constraints}
An octagon can be described by a set of constraints of the form $\pm v_i \pm v_j\leq b$\cite{mine2001wcre}. Obviously, such constraints are linear constraintsand hence octagons are a specialization of polyhedra.
\begin{definition}\label{def:octagons}$\mathcal{P} \subseteq \R^n$ is an octagon if and only if it can beexpressed by an a set of constraints of the form $\pm v_i \pmv_j \leq b$\cite{mine2001wcre}.\end{definition}
Octagons are used in \gls{koat} during the computation of invariants. They arefaster to work with than polyhedra but less precise. \todo{cite runtime forvarious operations} As shown in \Sref{sec:invariant}, too wide constraints arenot a problem for invariants.
Obviously, such constraints are linear constraints and hence octagons are aspecialization of polyhedra. Octagons are used in \gls{koat} during thegeneration of invariants. They are faster to work with than polyhedra but lessprecise.
By construction every Box is an octagon and in turn a polyhedron. For everypolyhedron $P$ one can find a box $B$ that includes the polyhedron. Wheneverit's acceptable to arbitrarily widen the polyhedron, one can work with boxesinstead. Boxes are computationally very efficient, although rather imprecise,for most applications.\todo{beautify}\todo{example picture}
By construction every box is an octagon and in turn a polyhedron. For everypolyhedron $P$ one can find a box $P'$ that contains the polyhedron$P$.\todo{citation or proof}Whenever it's acceptable to arbitrarily widen thepolyhedron, one can work with boxes instead. Boxes are computationally veryefficient, although rather imprecise, for most applications.
\begin{definition}[Probability space]
Imagine probabilistic event with some possible outcomes $\Omega$. For example acoin-toss can have arbitrary many outcomes when taking the position on the tableinto account, the fact that the coin might get lost in the process etc.A probability measure describes the probabilities of sets of random outcomes.For example the coin-toss on the table has a sub-set of outcomes where thecoin's head is above. The probability measure would assign a probability closeto 50\% for a somewhat normal coin-toss.Formally a probability measure is defined on a $\sigma$-algebra which assertsthe intuitive property of random event: if you can measure two subsets ofoutcomes (e.g. $A = \set{ \omega \in \Omega} {\text{ coin shows head}}$, and $B= \set{\omega \in \Omega} {\text{coin shows tail}}$) so you can for both subsetstogether ($A \union B = \set{\omega \in \Omega} {\text{coin shows head ortails}}$) and the probability of the combined outcomes is equal to the sum ofthe individual outcomes.\begin{definition}[Probability Measure]\label{def:measure}A set function $\mu$ on a $\sigma$-field $\F$ is a probabilitymeasure if it satisfies the following conditions:\begin{enumerate}\item $\mu(A)\in [0,1]$ for $A \in \F$\item $\mu(\emptyset) = 0$\item if $A_1, A_2, ...$ is a disjoint sequence of $\F$-sets then\begin{equation*}\mu\left(\Union_{k=1}^\infty A_k \right) = \sum_{k=1}^\infty\mu(A_k)\end{equation*}\item $\mu(\Omega) = 1$\end{enumerate}\end{definition}A probability space describes the set of all possible outcomes $\Omega$, themeasurable sub-sets of outcomes $\F$ and the probability $P$ of those measurableoutcomes.\begin{definition}[Probability space]\label{def:probability_space}
\end{enumerate}\item $P$ assigns a probability to every event and satisfies thefollowing conditions:\begin{enumerate}\item $0 \leq P(A) \leq 1$ for $A \in \F$,\item $P(\emptyset) = 0$, and $P(\Omega) = 1$,\item $ $\todo{invariance of permutation.}
\end{enumerate}\todo{Refer to a definition of a $\sigma$-algebra.}\end{definition}\begin{definition}[Measure]A set function $\mu$ on a $\sigma$-field $\F$ in $\Omega$ is a probabilitymeasure if it satisfies the following conditions:\begin{enumerate}\item $\mu(A)\in [0,1]$ for $A \in \F$\item $\mu(\emptyset) = 0$\item if $A_1, A_2, ...$ is a disjoint sequence of $\F$-sets then\begin{equation*}\mu\left(\Union_{k=1}^\infty A_k \right) = \sum_{k=1}^\infty\mu(A_k)\end{equation*}\item $\mu(\Omega) = 1$
\item $P$ assigns a probability to every set in $\F$ and is aprobability measure.
In the following, we will only consider discrete random variables $X : \Omega\rightarrow \Z$ with countably (but possibly infinitely) many possible results,since only discrete random variables are relevant for this thesis.
In the following, we will encounter discrete random variables $X : \Omega\rightarrow \Z$ with countably (but possibly infinitely) many possible results.
between those locations. A run of the program is a sequence of configurations --that is a location and a state -- where every configuration is reachable via atransition from the previous configuration.Non-deterministic programs have mutually non-exclusive transitions leaving thesame location. Hence it is unclear which transition is taken at runtime. Thedecision is made by a scheduler. Also guards and assignments of the transitionmight depend on temporary variables that are not set by the state.
between those locations. A run of a program is a sequence of configurations,where every configuration contains the location at which the program is and astate that holds the current values of the program variables. A transitioncontains conditions, called \textit{guard}, that must be true for the currentstate to be taken. Whenever a transition is taken it can modifies the state withsome updates. An update assigns a new value to a program variable depending onthe previous state. Ever inter program starts at a unique location usuallycalled $l_0$. Whenever no transition guard is satisfied the program terminates.
Probabilistic programs are similar to non-determistic ones, but the transitiontaken is decied on a probabilistic basis. Imagine an oracle that throws somedice to determin the taken transition.In order to allow both concepts in a program, we use the concept of\textit{general transitions}: they are sets of (conventional) transitions withthe probabilities adding up to 1, acting as a single transition for theresolution of non-determinism. Non-determinism can only occur between generaltransitions and it is resolved first before probabilities are resolved.
\begin{example}\label{ex:nd_branching}\begin{figure}[h]\centering\input{figures/ch2_ex_pip_nd.tex}\caption{A non-deterministic integer program over the variables $\V =\{x\}$.\label{fig:ex_pip_nd}}\end{figure}
Deterministic and non-probabilistic programs are a special case of probabilisticprograms, where every general transition consist of a single transition and allgeneral transitions are mutually exclusive.
Take the program displayed in \fref{fig:ex_pip_nd} for example. At location$l_1$ with the state $\sigma(x) = 3$ both transitions $t_2$ and $t_3$ arepossible, since both of their guards $x > 0$ and $x > 1$ are satisfied. Atruntime the decision which transition is taken is arbitrary(non-determistic).\end{example}
This thesis will closely follow the definitions of F. Meyer \cite{https://link.springer.com/chapter/10.1007/978-3-319-66167-4_8}.For the following the set of all variables is split into the program variables$\PV$ and temporary variables $TV$. Program variables are assigned at the startof a program and can get values assigned during the run of a program, wherastemporary variables are only existentially quantified for every transition.Hence, they do not \enquote{carry over} between states.
A similar concept to non-determinism (where the decision is fully arbitrary), isto give a probability to each transition. Instead of picking the transitionfully arbitrarily, the transitions are picked with the respective probability.Although similar, non-deterministic and probabilistic branching treatedseparately. In order to allow both concepts in a program, one introduces theconcept of \textit{general transitions}: they are sets of (conventional)transitions with the probabilities adding up to 1, acting as a single transitionfor the resolution of non-determinism. Non-determinism can only occur betweengeneral transitions and it is resolved first before probabilities are resolved,and every transition in a general transition must have the same guard.
\begin{definition}[Probabilistic Integer Program]
\begin{example}\label{ex:prob_branching}\begin{figure}[h]\centering\input{figures/ch2_ex_pip_prob.tex}\caption{A probabilistic (and non-deterministic) integer program overthe variables $\V= \{x\}$. \label{fig:ex_pip_prob}}\end{figure}Take the program displayed in \fref{fig:ex_pip_prob} for example. Thetransitions $t_2$ and $t_3$ are part of a general transition (marked inred). For simplicity, all other transitions are unique members of their owngeneral transition and the trivial probability of 1 is omitted.At location $l_1$ with a variable assignment $\sigma(x) = 3$ threetransitions $t_2$, $t_3$ and $t_4$ are possible. The non-determinism isresolved first\end{example}Besides probabilistic branching, variable assignments can be non-deterministicor probabilistic, which is also called non-deterministic or probabilisticsampling. Non-deterministic sampling occurs when on a transition a variable isupdated to an expression that contains unbound variables (variables that are notassigned a value in the previous state). The unbound variables are sampledarbitrarily from all the integers, but can be constraint by the guard, in whichcase the transition is only taken if the non-deterministic sampling fulfills theguard. If the sampled value fulfills no transition's guard it means the programterminates. Non-deterministic sampling is resolved before the non-deterministicbranching.\begin{example}\label{ex:nd_sampling}\begin{figure}[h]\centering\input{figures/ch2_ex_pip_nds.tex}\caption{A integer program with non-deterministic sampling overthe variables $\V= \{x, t\}$ with program variables $\PV=\{x\}$.\label{fig:ex_pip_nds}}\end{figure}Take the program displayed in \fref{fig:ex_pip_nds} for example. Thetransition $t_2$ contains an update with a temporary $t$ variable.Furthermore, the temporary variable is constraint to $1 \leq t \leq 3$.Since the temporary variable $t$ is not bound in the state $\sigma: \PV\rightarrow \Z$, the variable can take any arbitrary value.Whenever temporary variable is sampled to a value not satisfying theguard of $t_2$ and x is larger then zero, the program would terminate.\end{example}Probabilistic sampling is done using probability distributions. Coincidentally,only the probability distributions from \Sref{ssec:distributions} are supported.Updates are restricted to integer polynomials with variables and additionallydistributions when probabilistic sampling is allowed. The value of theindeterminate in the polynomial is equal to the (possibly temporary) variable orsampled from the distribution with the corresponding probability.\begin{example}\label{ex:prob_sampling}\begin{figure}[h]\centering\input{figures/ch2_ex_pip_probs.tex}\caption{A integer program with probabilistic sampling overthe variables $\V= \{x\}$ with program variables $\PV=\{x\}$.\label{fig:ex_pip_probs}}\end{figure}Take the program depicted in \fref{fig:ex_pip_probs} for example. Intransition $t_2$, the variable x is assigned to the polynomial $x -\textsc{Bern}(0.5)$. The value of the Bernoulli distribution can take thevalue of 1 or 2 with a probability of 0.5 in both cases. In case that thedistribution is sampled to 1, the variable x is decremented, and stays thesame otherwise.\end{example}Deterministic non-probabilistic programs are a special case of probabilisticprograms, where every general transition consist of a single transition, allgeneral transitions are mutually exclusive, and no non-deterministic orprobabilistic sampling is done in the transitions. Similarly non-deterministicnon-probabilistic programs are a special case of probabilistic programs whereevery general transition contains only a single transition and no probabilisticsampling is allowed.This thesis will closely follow the formal definitions of E. Meyer\cite{meyer2021tacas}. First the formal definition for a probabilistic programis given, then the semantics are introduced.\begin{definition}[Probabilistic Integer Program]\label{def:pip}
\item a probability $p \geq 0$, that the transition is taken whenthe correspondig general transition $g$ is executed.\item a guard $τ \in C$\item an update function $η : \PV \mapsto \Z[\PV] \cup D$,mapping every program variable to a polynomial or a boundeddistribution function. All $t \in g$ must share the samestart location $l$, guard $τ$.
\item a probability $p \geq 0$, that the transition is takenwhen the corresponding general transition $g$ is executed.\item a guard $\tau \in C$\item an update function $\eta : \PV \mapsto \Z[\V \cup D]$,mapping every program variable to an integer polynomialcontaining variables or a distribution functions. All $t \ing$ must share the same start location $l$, guard $τ$.
\end{definition}\todo{examples}\begin{figure}\label{fig:pip1}\caption{A simple probabilistic integer program.}\end{figure}Let's recap the semantics of a probabilistic integer program.\fref{fig:pip1} gives an example for a very simple probabilistic integerprogram.\todo{give an example for temporary variables}Unsurprisingly, a PIP is executed starting at the start location with a givenassignment $\sigma_0 : \PV \rightarrow \N$. A pair of location and assignment$(l, \sigma) \in \L \times \Sigma$ is called a configuration and representsspecific state of a program during an execution.Given a configurationassignment for the program variables (the input to the program), the programis evaluated starting from the start location $l_0$.An execution path of a program is a sequence of configurations, where the firstconfiguration is the starting configuration.The curent configuration is represented as a tuple $(l_0, c_0)$.An execution path is a sequence of configurations.\subsection{Non-deterministic and probabilistic branching and assignments}\begin{comment}- Probabilistic branching- Non-determinstic branching- probabilistic assignments (via branching)- non-deterministic assignments (via temporary variables)\end{comment}\subsection{Runtime}\begin{comment}for a transition:- constant cost- most simple case cost = 1 for every transition- combining transitions and adding costs- costs depending on variablesfor a prefix:- finite sequence of configurations- runtime = sum of all costs over all steps in the prefix- probability = product over all transition probabilities in the prefix- legal prefix = non-zero probabilityfor a path:- infinite sequence of configurations- runtime = runtime of the largest non-zero probability prefix- prolegal paths = paths with a non-zerofor a program:- supremum over runtime of all paths\end{comment}\begin{definition}[Runtime of a run]
\begin{definition}[Worst-case runtime of a Program]\end{definition}\begin{definition}[Expected runtime of a Program]\end{definition}Various languages exist to represent integer programs in a computer readableform; namely Koat2\cite{}, IntC, Some weird XML Onthology% \cite{https://raw.githubusercontent.com/TermCOMP/TPDB/master/xml/xtc.xsd}% - CINT% - Koat Format% For probabilistic: PIPDuring this thesis we will transform programs. The goal is to find equivalentprograms that are easier to analyize by Koat2. But what does equivalence mean?On non probabilistic programs one would expect, that while beeing structurallydifferent, the outcome of the new program is the same as the old one.In addition the new program should take the same time as the old one. Sincecosts are represented as a variable, this is part of the outcome. But theruntime of a program is calculated by ranking-functions and they count thenumber of visits to a location. In order for the runtime to be the same, boundsshall remain the same. \todo{doesn't realy make sence!}In probabilistic programs, a the run is non-deterministic and hence calculatingthe runtime is impossible. Instead we have expected runtime-bounds. So the newprogram should have the same expected runtime-bounds and the same expectedcosts, as the old program. Let's give a more formal definition.
\begin{definition}[Program equivalence]Given two PIP $P = (\V, L, \GT, l_0)$ and $P' = (\V, L', \GT', l_0)$are equivalent if and only if the following properties hold:\begin{enumerate}\item \todo{continue}\end{enumerate}\end{definition}
\subsection{Strongly connected components}A given \gls{pip} induces two relations. The first one being the relation if atransition exists from one location to another and the other one describes if apath exists from one location to another.
\subsection{Semantics}\todo{open question regarding temporary variables}% the original definition alled states defining variables, but this let's% temporary variables carry-over to the next transition. Is this what we% want?Before defining the semantics of a \gls{pip} a new special location $l_\bot$ isintroduced that will be used to indicate the termination of a program as well asa virtual transition $t_\bot$ and general transition $gt_\bot = \{t_\bot\}$ withthe same purpose. Instead of just vaguely \enquote{terminating} a program takesthe transition $gt_\bot$ to the location $l_\bot$ if and only if no othertransition is walkable. Since no normal transition leaves the location $l_\bot$the run loops indefinitely on the location $l_\bot$ after termination. Inaddition a new transition $t_\text{in}$ will be used for the start of theprogram.A state $s$ is a variable assignment for the variables of a program $s: \V\rightarrow \Z$. \Sigma is the set of all states. The set of configuration isdefined as $\confs = L \union \{l_\bot\} \times \T \union \{t_{\text{in}},t_\bot\} \times \Sigma$.Note that additionally to the location and state, the last transition taken tothe current location is captured as well. This won't be relevant for the partialevaluation but is important to define the probabilistic properties of a\gls{pip}.A run of the program is an \textit{infinite} sequence of configurations, and theset of all runs is defined as $\runs = \confs^\omega$. In contrast a finiteprefix is a finite sequence of configurations. The set of all finite prefixes isdefined as $\fpath = \confs^*$.\begin{rem}The set of configurations does not exclude invalid combinations oflocations and transitions, that is a configuration with a location that isnot the target of the incoming transition, or a state that is impossible toreach via the incoming transition. Conversely, the set of runs and finiteprefixes does not exclude invalid paths, e.g. two consecutive configurationsthat cannot be reached via the given transition, etc. The validity of a pathwill be described by a probability measure where invalid paths have a zeroprobability.\end{rem}For a \gls{pip} one can define a $(\runs, \F, \textit{pr})$. The outcomes ofthe probabilistic event in this space are the runs on the program. Everydistinct run is measurable hence the $\sigma$-algebra $\F$ is defined to containevery set ${r} \in \F$ for $r \in \runs$. The probability measure $pr$ describesthe probability of the given runs. Formally the probability space is defined bya cylindrical construction \todo{elaborate}. Since the construction is ratherinvolved we will only sum up and the properties relevant to this thesis.First the probability of a run is separate from the non-determinism. So for allthe considerations the non determinism is assumed to be decided by a schedulerand the probability measure depends on the scheduler.The probability of specific run in the probability space is equal to theprobability that every transition in the run is taken. Naturally, invalidtransitions (either because they do not exist in the program, or because theguard is not satisfied) have a zero probability.The probability space will be used to define the semantics of a \gls{pip}\subsection{Notes about the probability measure}Probability that a variable $x$ takes the value $s(x)$ when taking thetransition $t$ from an (non-deterministically resolved) assignment $\tilde{s}$:\todo{redefine with polynomials over indeterminates}\begin{align}q_{\tilde{s},t,s(x)}(x)\end{align}(Trivial) probability that the temporary variables got assigned there expectedvalues, given a (non-deterministically resolved) assignment $\tilde{s}$ and theresulting assignment $s$\begin{align}\delta_{\tilde{s}(u),s(u)})\end{align}Probability of a transition after a finite prefix: For $F = \dots (l', t', s') \in \fpath$, $(l, t,x) \in \confs$, $t = (l', p, \tau, \eta, l)$.\begin{align}pr_{\mathfrak{S},s_0}((l', t',s') \rightarrow (l, t, s)) :=pr_{\mathfrak{S},s_0}(f \rightarrow (l,t,s))= p \cdot \prod_{x\in\PV}q_{\tilde{s},t,s(x)}(x) \cdot \prod_{u \in \TV} \delta_{\tilde{s}(u),s(u)})\end{align}\begin{rem}The probability depends on the scheduler $\mathfrak{S}$ and the startingstate $s_0$.\end{rem}Probability of a finite prefix. For $f = c_0, \dots, c_n \in \fpath$:\begin{align}pr_{\mathfrak{S},s_0}(f) = pr_{\mathfrak{S},s_0}(c_0) \cdot \prod_{i=1}^n pr_{\mathfrak{A},s_0}(c_{i-1} \rightarrow c_i)\end{align}Cylindrical construction yields the unique probability measure$\mathbf{P}_{\mathfrak{S},s_0} : \F \rightarrow [0,1]$ for infinite runs\subsection{Runtime}\begin{comment}for a transition:- constant cost- most simple case cost = 1 for every transition- combining transitions and adding costs- costs depending on variablesfor a prefix:- finite sequence of configurations- runtime = sum of all costs over all steps in the prefix- probability = product over all transition probabilities in the prefix- legal prefix = non-zero probabilityfor a path:- infinite sequence of configurations- runtime = runtime of the largest non-zero probability prefix- prolegal paths = paths with a non-zerofor a program:- supremum over runtime of all paths\end{comment}\begin{definition}[Runtime of a run]\end{definition}\begin{definition}[Worst-case runtime of a Program]\end{definition}\begin{definition}[Expected runtime of a Program]\end{definition}% Various languages exist to represent integer programs in a computer readable% form; namely Koat2\cite{}, IntC, Some weird XML Onthology% \cite{https://raw.githubusercontent.com/TermCOMP/TPDB/master/xml/xtc.xsd}% - CINT% - Koat Format% For probabilistic: PIP% During this thesis we will transform programs. The goal is to find equivalent% programs that are easier to analyze by Koat2. But what does equivalence mean?% On non probabilistic programs one would expect, that while beeing structurally% different, the outcome of the new program is the same as the old one.% In addition the new program should take the same time as the old one. Since% costs are represented as a variable, this is part of the outcome. But the% runtime of a program is calculated by ranking-functions and they count the% number of visits to a location. In order for the runtime to be the same, bounds% shall remain the same. \todo{doesn't realy make sence!}% In probabilistic programs, a the run is non-deterministic and hence calculating% the runtime is impossible. Instead we have expected runtime-bounds. So the new% program should have the same expected runtime-bounds and the same expected% costs, as the old program. Let's give a more formal definition.
Ranking functions are a widely use tool to prove termination and runtime-bounds of integer transition systems. The simplest variant of ranking functions are \gls{lrf}
% Ranking functions are a widely use tool to prove termination and runtime-bounds of integer transition systems. The simplest variant of ranking functions are \gls{lrf}
We seek a function $f(x_1,\dots,x_n) = a_1x_1 + \cdots + a_nx_n + a_0$ with the rationals as a co-domain, such that\begin{enumerate}\item $f(\bar{x}) \leq 0$ for any valuation $\bar{x}$ that satisfies the loop constraints;\item $f(\bar{x}) - f(\bar{x}') \geq 1$ for any transition leadtion from $\bar{x}$ to $\bar{x}'$\end{enumerate}\cite{benamram2017}\todo{plagiat}
% We seek a function $f(x_1,\dots,x_n) = a_1x_1 + \cdots + a_nx_n + a_0$ with the rationals as a co-domain, such that% \begin{enumerate}% \item $f(\bar{x}) \leq 0$ for any valuation $\bar{x}$ that satisfies the loop constraints;% \item $f(\bar{x}) - f(\bar{x}') \geq 1$ for any transition leadtion from $\bar{x}$ to $\bar{x}'$% \end{enumerate}% \cite{benamram2017}\todo{plagiat}% Intuition: The ranking function decreases for every walkable transition. If such a ranking function exists, the loop terminates. \todo{not sure about this, guess}% \section{Lexicographic ranking functions}% We expand the \gls{lrf} to a tuple of \gls{lrf}. Where whenever lexicograpically higher function decreases lower ones can increase again.
\section{Goal}Transform a program into an equivalent program. \todo{define equivalence ofprograms}
\begin{comment}- Term Rewrite Systems (TRS)- Transition Systems- A set of States S- A start location S_0 \in S- A set of transitions T \in S x C x S
Koat2 can work on integer programs with polynomial integerarithmetic.\cite{Brockschmidt16} To represent states and invariants in such aprogram, one needs polynomial constraints. As an extended concept Koat2 usesboxes and we will use polyhedra for the partial evaluation. Since those conceptsare closely related we will introduce them together.\end{comment}
is strongly related to the question upper and lower runtime bounds, or theruntime complexity of a program, is equally non-computable. Nevertheless, theanswers to those questions are very important, since proven termination or smallruntime-complexities are generally desirable in practice. For example a compilermight want to warn the programmer about faulty code sections that were notmarked explicitly to run indefinitely, or the developer could be forced to provethe efficiency of his program in critical scenarios. In an ever more complexworld, with ever more complex algorithms, the need for automatic tools arised.Even though the question for runtime-complexity can not be answered in a generalcase, many tools were developed to automatically analyze the complexity ofvarious programming paradigms as tight and fast as possible.
is strongly related to the search for upper and lower runtime bounds, or theruntime complexity of a program, is equally undecidable. Nevertheless, theanswers to those questions are very important in practice. For example acompiler might want to warn the programmer about faulty code sections that werenot marked explicitly to run indefinitely, or the developer could be forced toprove the efficiency of his program in critical scenarios. In an ever morecomplex world, with ever more complex algorithms, the need for automatic toolsarised. Even though the question for runtime-complexity can not be answered in ageneral case, many tools were developed to automatically analyze the complexityof various programming paradigms as tight and fast as possible.
runtime-complexity and size bounds on integer transition systems. At its core,KoAT iteratively computes ranking \todo{be more precise}functions to prove upperbounds, alternating between size and runtime-bounds and thus leveraging therelation between both to get tight results. Optionally to improve the analysis,KoAT resorts to refinement techniques, like \gls{mprf} and\gls{cfr}\cite{giesl2022lncs}. Recently, KoAT was expanded to work onprobabilistic integer programs as well\cite{meyer2021tacas}.
runtime-complexity and size bounds on transition systems. At its core,KoAT computes bounds for subprograms using \gls{mprf} or \gls{twn}, and thenputting the subprograms together generate upper runtime-complexity and sizebounds for the whole program.Optionally to improve the analysis, Koat can resort to partial evaluationas a control-flow refinement techniques\cite{giesl2022lncs}.At first \gls{koat} was aimed at linear integer programs, but it was recentlyextended to probabilistic integer programs\cite{meyer2021tacas} andnon-linear arithmetic\todo{citation needed}.
The tool iRankFinder is another tool taking part in \gls{termcomp}. It converts\gls{its} into \gls{chc} and uses% Hier weiter machen
The tool iRankFinder\cite{irankfinder2018wst} is another tool taking part in\gls{termcomp}. It uses linear and lexicographic ranking functions for theanalysis and the program is internally represented as \gls{chc} program, whichis very similar to integer transition systems in the regard that integertransition systems can be transformed into equivalent \gls{chc} systems.Control-flow refinement is a technique where a program is transformed into asemantically equivalent program, which is easier to analyse with the existinganalysis techniques (e.g. linear, lexicographic, and multiphase rankingfunctions). The technique was put forward by \citeauthor{gulwani2009}\cite{gulwani2009}. A typical example of such a program can be seenin example \ref{ex:splitterloop}, where a single loop is replaced by twoindependent consecutive loops. iRankFinder successfully implemented thecontrol-flow refinement technique by partial evaluation proposed by\cite{gallagher2019eptcs,domenech2019arxiv}.
This thesis improves on the latest version of \gls{koat} by reimplementing thecontrol flow refinement technique of iRankFinder\cite{irankfinder2018wst}natively in KoAT instead of calling an external library and adapting it to theparadigm of probabilistic programs. The well-known basics required to understandthis thesis are recapitulated in \Sref{sec:preliminaries}, focusing especiallyon constraints, probability, and (probabilistic) integer programs. In\Sref{sec:theory} the \gls{cfr} techniques from iRankFinder are adapted to\Gls{pip}, showing the proving the theoretical correctness of the construction.\Sref{sec:implementation} will focus on the changes made to \gls{koat} and thechallenges met during implementation of the technique in Ocaml. Finally in\Cref{sec:conclusion} the new version of \gls{koat} will be compared to itspredecessor with regard to the found runtime-complexity and size bounds as wellas the overall computational performance of the tool using the problems from the\gls{tpdb}.
\gls{koat} borrowed the implementation of iRankFinder as a black-box which camewith a number of limitations\cite{giesl2022wst}: Besides code-quality andperformance issues, the external library wouldn't be able to profit from futureimprovements to the algorithms. This thesis improves on the latest version of\gls{koat} by reimplementing the control flow refinement technique ofiRankFinder\cite{irankfinder2018wst} natively in KoAT instead of calling theexternal library and adapting it to the paradigm of probabilistic programs.The well-known basics required to understand this thesis are recapitulated in\Sref{sec:preliminaries}, focusing especially on constraints, probability, and(probabilistic) integer programs. In \Sref{sec:theory} the \gls{cfr} techniquesfrom iRankFinder are adapted to \Gls{pip}, showing the proving the theoreticalcorrectness of the construction. \Sref{sec:implementation} will focus on thechanges made to \gls{koat} and the challenges met during implementation of thetechnique in OCaml. Finally in \Cref{sec:conclusion} the new version of\gls{koat} will be compared to its predecessor with regard to the foundruntime-complexity and size bounds as well as the overall computationalperformance of the tool using the problems from the \gls{tpdb}.
\newacronym{twn}{TWN}{triangular weakly non-linear loop}\newacronym{scc}{SCC}{strongly connected component}