\section{Property-based abstraction}\label{sec:abstraction}
In \Sref{sec:partialevaluation} the details of the abstraction were omitted.
In this section we present the property-based abstraction proposed by
\citeauthor{Domenech19}~\cite{Domenech19} for non-probabilistic programs,
which can be applied equally to probabilistic programs. Recall Definition
\ref{def:abstraction}: an abstraction space must be finite and the abstraction
function must map any constraint to elements of this abstraction space.
The property-based abstraction achieves a finite abstraction space by
generating a finite number of so-called \emph{properties}, which are single
constraints. An abstracted value is a subset of the predetermined properties.
Selecting appropriate properties is crucial for a good partial evaluation.
In practice, the abstraction function $\alpha$ can take the current location
or even the whole evaluation history into account. Moreover, the requirement
can be relaxed: a finite abstraction space is only needed where an abstraction
is actually performed, e.g. at one location per loop, and one can select
different properties when abstracting constraints at different locations. As
long as the set of properties is finite for every location, the partial
evaluation graph is guaranteed to be finite.
\begin{comment}
pro: easy to implement
contra:
- abstractions are expensive
- abstractions loose precision
open question:
- how to select properties
\end{comment}
The naive approach would be to abstract at every location, as also stated in
\cite{Domenech19}. The advantage is that the implementation is simple, and the
evaluated program is small and its analysis fast. The disadvantage is that a
lot of evaluated control-flow is lost to the abstraction.
The property-based abstraction presented in
\Sref{sec:propertybasedabstraction} requires $\Theta(n)$ calls to the
\gls{smt}-solver, where $n$ is the number of properties, which is
computationally expensive.
The primary goal of the presented \gls{cfr} via partial evaluation is to
evaluate overlapping loops in order to find smaller loops that are easier to
analyse. In practice, the control-flow of a loop is mostly linear. Abstracting
the linear control-flow could cause two linear sections to merge that would
otherwise stay separate, making it harder to find tight bounds.
\subsection{Abstract loop heads}\label{ssec:loopheads}
The second method, proposed by \citeauthor{Domenech19}~\cite{Domenech19}, is
to abstract only on loop heads. Termination is guaranteed since every infinite
path in the evaluation tree of a finite program must visit some loop (and in
turn its head) an infinite number of times. With only finitely many possible
abstraction results, it is guaranteed that at some point some version repeats
and the evaluation backtracks.
The exact definition of a loop head is unclear. For the sake of termination it
is sufficient that every loop contains at least one (but possibly arbitrary)
location marked for abstraction. The intuitive candidate is the head of the
loop, since in practice that is the location where most branching occurs. For
this implementation a loop head is defined as follows.
\begin{definition}
A \textit{loop head} of a loop is the first location of the loop encountered
during a depth-first search starting at the start location $l_0$.
\end{definition}
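To illustrate this definition, the following minimal sketch (in Python, on an
adjacency-list representation of the program graph; all names are
hypothetical) picks as loop head the location of a loop that is discovered
first by a depth-first search from $l_0$:
\begin{verbatim}
def dfs_order(graph, l0):
    # Discovery index of every location in a DFS preorder from l0.
    order, seen, stack = {}, set(), [l0]
    while stack:
        l = stack.pop()
        if l in seen:
            continue
        seen.add(l)
        order[l] = len(order)
        stack.extend(reversed(graph.get(l, [])))
    return order

def loop_head(loop, order):
    # First location of the loop encountered during the DFS.
    return min(loop, key=order.__getitem__)

graph = {"l0": ["l1"], "l1": ["l2"], "l2": ["l1"]}
print(loop_head(["l2", "l1"], dfs_order(graph, "l0")))  # -> l1
\end{verbatim}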
When all loop heads are selected for abstraction, multiple loop heads can end
up on a single loop, for instance when the loop overlaps with another loop.
\begin{figure}
\input{figures/ch3_loopheads}
\end{figure}
(see examples \todo{} \todo{}). Detecting loops and loop heads is further
discussed in \Sref{sec:findingloops}.
\todo{prove upper bound on program size}
% Since
% In addition control-flow refinement was introduced to unroll loops in such a way
% that hidden branching is resolved. The heuristics presented by \cite{Domenech19}
% are focused on loops and the properties are selected for the loop heads.
% Constructing a heuristic, that is well suited for every location in the program
% is not obvious is left open for future work.
\subsection{Property-based abstraction}\label{sec:propertybasedabstraction}
\section{Sub-\gls{scc} level evaluation}
For the sub-\gls{scc} level evaluation, Algorithm \ref{alg:abstr_eval} is
adapted to stop the evaluation when leaving the component, and instead of
starting only from the initial location of the program, the evaluation starts
from every entry transition to the component. The exact algorithm is displayed
in Algorithm \ref{alg:eval_component}.
\begin{algorithm}
\caption{Evaluate the component $S \subseteq \T_\Prog$ of a Program $\Prog$
with abstraction $\alpha$.\label{alg:eval_component}}
\KwData{ A \gls{pip} $\Prog$, component $S$, abstraction function $\alpha$,
and abstraction oracle $S_\alpha$}
\KwResult{A graph $G$, and a set of pairwise disjoint general transitions
$\GTG \subset \GT_\text{PE}$}
$\ET \leftarrow \Tout(S)$\;
\SetKwFunction{FEvaluate}{evaluate}
\SetKwProg{Fn}{Function}{:}{end}
$\GTG \leftarrow \emptyset$\;
\Fn{\FEvaluate{$G$, $v$}}{
$\langle l, \varphi\rangle \leftarrow v$\;
\For{$g \in \GTout(l)$} {
$g' \leftarrow \unfold^*(v, g)$\label{alg:abstr_line_unfold_g}\;
\If {$g' \neq \emptyset$ \label{alg:abstr_line_filteremptyg}}{
$g^* \leftarrow \emptyset$\;
\For{$t = (v, p, \tau, \eta, \langle l', \varphi'\rangle) \in g'$}{
\uIf {$(l, p, \tau, \eta, l') \in \ET$} {
$t \leftarrow (v, p, \tau, \eta, \langle l',
\texttt{true}\rangle)$\label{alg:abstr_line_exit}\;
}
\ElseIf {$S_\alpha(l')$} {
$t \leftarrow (v, p, \tau, \eta, \langle l',
\alpha(\varphi')\rangle)$\label{alg:abstr_line_abstraction}\;
}
$g^* \leftarrow g^* + t$\label{alg:abstr_line_addtog}\;
$v' \leftarrow \text{target version of } t$\;
\uIf {$v' \not\in V(G)$}{
$G \leftarrow G + t$\;
$G \leftarrow{}$ \FEvaluate{$G$, $v'$}\;
}
\Else {
$G \leftarrow G + t$\label{alg:abstr_line_addt}\;
}
}
$\GTG \leftarrow \GTG + g^*$\;
}
\Return {$G$}
}
$G \leftarrow \text{ graph of }\Prog \text{ lifted to a partial evaluation
graph}$\;
$G \leftarrow G - S$\;
\For{$t = (\ell, p, \tau, \eta, \ell')\in \Tin(S)$} {
$G \leftarrow{}$ \FEvaluate{$G$, $\langle\ell',\texttt{true}\rangle$}\;
}
\Return $G, \GTG$
\end{algorithm}
The partial evaluation starts at every entry transition in $\Tin(S)$ of the
component. When unfolding an exit transition $t = (\ell, \_, \_, \_, \ell') \in
\Tout(S)$ from a version $\langle \ell, \varphi\rangle$ to $\unfold^*(\langle
\ell, \varphi\rangle, t) = \{(\langle\ell,\varphi\rangle, \_, \_, \_,
\langle\ell', \varphi'\rangle)\}$, the target version
$\langle\ell',\varphi'\rangle$ is replaced by the trivial overapproximation
$\langle\ell',\texttt{true}\rangle$. By construction, this target is always
already contained in the program, and the algorithm always backtracks.
Let us consider an admissible finite prefix $f \in \fpath_\Prog$ with
\begin{align*}
    f &= c_0c_1\dots{}c_k\dots{}c_m\dots{}c_n \\
    &=
    (\ell_0,t_\in,s_0)(\ell_1,t_1,s_1)\dots\underbrace{(\ell_k,t_k,s_k)\dots(\ell_m,t_m,s_m)}_{t_k,\dots,t_m
    \in S}\dots(\ell_n,t_n,s_n)
\end{align*}
with an infix $c_k\dots{}c_m$ going through the component.
By construction, up to the component the partial evaluation $\Prog'$ contains
a similar prefix
$(\langle\ell_0,\texttt{true}\rangle,t_\in,s_0)(\langle\ell_1,\texttt{true}\rangle,t_1,s_1)\dots(\langle\ell_{k-1},\texttt{true}\rangle,t_{k-1},s_{k-1})
\in \fpath_{\Prog'}$. The component is entered via the transition
$t_k\in\Tin(S)$, which was used as a starting point for the evaluation with
the trivial version $\langle\ell_k, \texttt{true}\rangle$ in Algorithm
\ref{alg:eval_component}. By the same argument as in Theorem
\ref{thm:correctness}, a similar prefix
$(\langle\ell_0,\texttt{true}\rangle,t_\in,s_0)\dots(\langle\ell_{k-1},\texttt{true}\rangle,t_{k-1},s_{k-1})(\langle\ell_k,\varphi_k\rangle,t_k,s_k)\dots(\langle\ell_{m-1},\varphi_{m-1}\rangle,t_{m-1},s_{m-1})
\in \fpath_{\Prog'}$ through the component exists in $\Prog'$. The exit
transition $t_m$ must have been unfolded before backtracking, and afterwards
the similar prefix trivially follows the lifted versions again, yielding a
similar finite prefix
$(\langle\ell_0,\texttt{true}\rangle,t_\in,s_0)\dots(\langle\ell_{m-1},\varphi_{m-1}\rangle,t_{m-1},s_{m-1})(\langle\ell_m,\texttt{true}\rangle,t_m,s_m)\dots(\langle\ell_n,\texttt{true}\rangle,t_n,s_n)
\in \fpath_{\Prog'}$.
For any finite prefix visiting the component more than once, the similar
finite prefix of $\Prog'$ is found by applying the argument above repeatedly.
\section{Property-based abstraction and abstraction locations}
\citeauthor{Domenech19} describe the property-based abstraction that will be
used for this algorithm. The idea is the following: the solution space is cut
into pieces by a finite number of constraints, the so-called properties. The
abstraction then computes which of those properties are entailed by the
version; those properties are chosen as the abstract representation of the
version. Since the number of properties is finite, the powerset of properties
is finite as well, and hence so is the number of possible abstraction results.
Note also that the abstraction is always entailed by the version.
\begin{definition}[Property-based abstraction]
    Let $\Psi$ be a finite set of properties, i.e. single constraints. The
    \textit{property-based abstraction} $\alpha_\Psi$ maps a constraint
    $\varphi$ to the set of properties entailed by it:
    \begin{equation*}
        \alpha_\Psi(\varphi) = \set{\psi \in \Psi}{\varphi \models \psi}
    \end{equation*}
    Since $\Psi$ is finite, the abstraction space $2^\Psi$ is finite.
\end{definition}
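As an illustration, the following minimal sketch computes $\alpha_\Psi$ with
Z3's Python bindings, using that $\varphi \models \psi$ holds iff $\varphi
\land \lnot\psi$ is unsatisfiable; all names are illustrative:
\begin{verbatim}
from z3 import Ints, Solver, Not, And, unsat

def entails(phi, psi):
    # phi |= psi  iff  phi /\ not(psi) is unsatisfiable.
    s = Solver()
    s.add(And(phi, Not(psi)))
    return s.check() == unsat

def abstract(phi, properties):
    # Keep exactly those properties entailed by phi.
    return frozenset(p for p in properties if entails(phi, p))

x, y = Ints("x y")
props = [x >= 0, y >= 0, x >= y]
phi = And(x >= 1, y >= 0, x >= y + 1)
print(abstract(phi, props))  # all three properties are entailed
\end{verbatim}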
\begin{example}
\todo{picture with properties}
\end{example}
Every abstraction is expensive, because it requires one entailment check per
property, each of which requires a call to the \gls{smt}-solver. The more
properties are chosen, the more precise the abstraction gets, but the more
expensive its computation is. Choosing the relevant properties for the loops
is done heuristically and will be described in the following.
\begin{comment}
motivate abstraction
\end{comment}
Selecting the locations to abstract is crucial for a good implementation.
From a theoretical point of view, the selection of locations to abstract
guarantees termination of the control-flow refinement. Yet with every
abstraction the refined program loses precision compared to the evaluated
control-flow. In the following, different possibilities to select the
locations marked for abstraction are discussed.
Finding loops in a program can be done with the algorithm described by
\citeauthor{johnson1975}~\cite{johnson1975}. Its time complexity is
$\mathcal{O}((n+e)(c+1))$ for a graph with $n$ vertices, $e$ edges, and $c$
cycles, i.e. linear in the number of cycles.
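In practice one does not need to implement the algorithm by hand; assuming the
\texttt{networkx} Python library is available, its \texttt{simple\_cycles}
function implements this cycle enumeration:
\begin{verbatim}
import networkx as nx

# Program graph with two overlapping loops: l1->l2->l1 and l0->l1->l2->l0.
g = nx.DiGraph([("l0", "l1"), ("l1", "l2"), ("l2", "l1"), ("l2", "l0")])
print(list(nx.simple_cycles(g)))  # enumerates both loops
\end{verbatim}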
\subsection{Feedback Vertex Set}
As stated in the previous section \Sref{ssec:loopheads}, it is sufficient to
find only a set of locations that cover every loop in the program. The problem
is one of Karp's 21 NP-complete problems and
commonly called \textit{Feedback Vertex Set} or \textit{Feedback Node Set}\cite{karp2010reducibility}.
\begin{definition}[Feedback Vertex Set Problem]
    Given a directed graph $G = (V, E)$, find a minimal subset $R \subseteq
    V$ such that every (directed) cycle of $G$ contains a node in $R$.
\end{definition}
All loops (cycles) in a graph can be detected with a \gls{dfs}-based
algorithm, and the feedback vertex set problem can then be formulated as a
hitting set problem.
\begin{definition}[Hitting Set Problem]
    Given a finite collection of finite sets $S = \{S_1, \dots, S_n\}$, where
    each $S_i \subseteq U$ for some set $U$, find a minimal set $R \subseteq U$
    such that $R\intersect S_i \neq \emptyset$ for all $i = 1,\dots,n$.
\end{definition}
A hitting set problem can be formulated as an \gls{ilp} and solved by readily
available \gls{smt} solvers. Recall Definition \ref{def:loop}. In the
following, the \gls{ilp} used for finding the feedback vertex set of an
integer program with locations $L$ and a total of $N$ loops $a_1, \dots, a_N$
is constructed. It is assumed that all loops have been found by \gls{dfs} as
described in \Sref{sec:findingloops}.
\begin{mini!}
    {}{\sum_{i=1}^{n} x_i\label{eq:ilp-goal}}{\label{eq:ilp-vfs}}{}
    \addConstraint{\sum_{i : l_i \in \text{Loc}(a_j)} x_i}{\geq
    1,\protect\quad\label{eq:ilp-hit}}{j = 1,\dots,N}
    \addConstraint{ x_i }{\leq 1,\protect\quad\label{eq:ilp-leq1}}{i = 1,\dots,n}
    \addConstraint{ x_i }{\geq 0,\protect\quad\label{eq:ilp-geq0}}{i = 1,\dots,n}
\end{mini!}
Every variable $x_i$ of the \gls{ilp} \ref{eq:ilp-vfs} indicates whether the
location $l_i$ is part of the hitting set, in which case the variable is
assigned the value 1, or not, in which case it is assigned the value 0.
Constraint \ref{eq:ilp-hit} guarantees that every loop is \enquote{hit}, while
\ref{eq:ilp-leq1} and \ref{eq:ilp-geq0} restrict every $x_i$ to the values 0
and 1. The optimization goal \ref{eq:ilp-goal} is to select as few locations
as possible.
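The following minimal sketch solves this \gls{ilp} with the \texttt{Optimize}
engine of Z3's Python bindings; the loops are assumed to have been enumerated
already, and all names are illustrative:
\begin{verbatim}
from z3 import Int, Optimize, Sum, sat

def feedback_vertex_set(locations, loops):
    opt = Optimize()
    x = {l: Int(f"x_{l}") for l in locations}  # x_l = 1 iff l is selected
    for l in locations:
        opt.add(x[l] >= 0, x[l] <= 1)            # x_l is a 0/1 variable
    for loop in loops:
        opt.add(Sum([x[l] for l in loop]) >= 1)  # every loop is "hit"
    opt.minimize(Sum(list(x.values())))          # as few locations as possible
    assert opt.check() == sat                    # selecting all locations works
    m = opt.model()
    return {l for l in locations if m[x[l]].as_long() == 1}

# Two overlapping loops sharing "l1"; {"l1"} is a minimal solution.
print(feedback_vertex_set(["l0", "l1", "l2"], [["l0", "l1"], ["l1", "l2"]]))
\end{verbatim}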
Computing the feedback vertex set of a program is independent of the partial
evaluation and can be done a priori; it is thus considered an offline method.
It shall be noted that the locations in the feedback vertex set are not
necessarily the closest locations to the start location of a loop. For the
determination of properties it might be necessary to rotate the loop so that
the marked location becomes the first location of the loop.
\todo{prove upper bound on program size}
\subsection{$k$-encounters}
We have seen two offline strategies to select locations for abstraction. In the
following we will introduce a technique that decides on abstraction during
partial evaluation using the path evaluated. The technique counts the number of
occurences on the path for every location in the program. Instead of abstracting
on first re-occurrence, one can decide on a parameter $k \in \N$ resulting in
the first $k$ re-occurrences to be ignored before abstracting.
The idea is to unroll loops up to a constant number of times so that loops with
constant bounds $b \leq k$ are perfectly unrolled without any abstraction.
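A minimal sketch of this decision, assuming the path is kept as a list of
visited locations (all names are hypothetical):
\begin{verbatim}
def should_abstract(path, location, k):
    # The first visit is not a re-occurrence; abstract only once more
    # than k re-occurrences of the location lie on the current path.
    re_occurrences = path.count(location) - 1
    return re_occurrences > k

path = ["l0", "l1", "l2", "l1", "l2", "l1"]  # "l1" occurs three times
print(should_abstract(path, "l1", k=1))      # 2 re-occurrences > 1 -> True
\end{verbatim}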
\begin{example}
\todo{example with constantly bounded loop}
\end{example}
\todo{formalize, prove upper bound on program size}
\begin{comment}
technique:
- online
- daikstra during traversal
pro:
- takes unreachable loops into account
- allows for some constantly bounded loops to be perfectly unrolled.
contra:
- hard to implement
- hard to reason on
\end{comment}
\section{Choosing properties}
\section{Correctness}
\begin{definition}[Equivalent \glspl{pip}]
    Let $\Prog_1,\Prog_2$ be \glspl{pip} with $\Prog_1 = (\PV, \Loc_1, \GT_1,
    l_0)$ and $\Prog_2 = (\PV, \Loc_2, \GT_2, l_0)$, and with respective
    probability spaces $(\runs_1, \F_1, \PrSs^1)$ and $(\runs_2, \F_2,
    \PrSs^2)$. Moreover, $\T_1$ and $\T_2$ denote the sets of transitions of
    $\Prog_1$ and $\Prog_2$; $\fpath_1$ and $\fpath_2$ the sets of finite
    paths, etc.
    The two programs are equivalent if the two random variables $\Rt_{\Prog_1}$
    and $\Rt_{\Prog_2}$ have the same probability distribution, i.e. their
    distributions $\mu_1$ and $\mu_2$ satisfy $\mu_1(k) = \mu_2(k)$ for all
    $k\in\overline{\N}$.
\end{definition}
\begin{theorem}
If $\Prog_1$ and $\Prog_2$ are equivalent then:
\begin{enumerate}
\item A total runtime-complexity bound for $\Prog_1$ is a total
            runtime-complexity bound for $\Prog_2$, and vice versa.
\item $\Prog_1$ and $\Prog_2$ have the same expected runtime-complexity.
\end{enumerate}
\proof{
Let $\RB_P$
}
\end{theorem}
\begin{rem}
    The probability distribution does not depend on the underlying probability
    space, which makes it the perfect tool to compare the semantics of two
    different \glspl{pip}.
\end{rem}
We will show that the partial evaluation $\Prog'$ is equivalent to the
original \gls{pip} $\Prog$. In a first step we show that for every admissible
finite prefix in $\Prog$ there exists an admissible finite prefix in $\Prog'$
with the same probability and intermediate states, and vice versa. This is
then expanded to all admissible runs, and finally we show that this implies
equal probability distributions of the total runtime complexity.
\subsection{Considering probability}
So far the fact that programs are probabilistic has been completely ignored.
One could wonder whether loops with zero probability should be removed in
order to further shrink the problem. We assume that all transitions with zero
probability have been removed before the analysis. All other transitions have
a non-zero probability; they can in theory be taken and will be unfolded by
the partial evaluation. Hence they must be considered for the set of
abstracted locations in order to guarantee termination of the algorithm.
If a path turns out to have zero probability, it can be filtered out during
partial evaluation.
\section{Partial evaluation of \glspl{scc}}
During \gls{koat}'s analysis the program is split into \glspl{scc}. Bounds are
found for every \gls{scc} separately and then composed into an overall bound
at the end. Control-flow refinement is only needed for those \glspl{scc} where
the analysis fails to find tight bounds. In practice, linear bounds are
considered tight enough, whereas polynomial and exponential complexity bounds
warrant refinement in the hope of tighter bounds.
The previously presented control-flow refinement starts at the start location
and evaluates the whole program. The size of the refined program is limited by
the number of locations and transitions as well as the number of properties at
each abstracted location. The smaller the initial program, the faster the
control-flow refinement and the subsequent analysis of the refined program.
Instead of evaluating the whole program, one can adapt the partial evaluation
to evaluate only a single \gls{scc} of the program. This section presents the
partial evaluation of an \gls{scc}.
The result is again an entire program, where only the transitions of the
\gls{scc} and the incoming and outgoing transitions of this \gls{scc} are
changed.
In a first step the program is copied as a whole and lifted to a trivial
evaluation graph; the \gls{scc} and all its incoming and outgoing transitions
are removed from the copy.
Then the evaluation algorithm is executed from every incoming transition, but
with the additional condition to backtrack after unfolding an exiting
transition. The new transitions are added to the copied evaluation graph, and
therefore progress made by earlier evaluations of entry transitions can be
reused, shortening the evaluation of subsequent incoming transitions.
Finally, the evaluation graph is converted back to a \gls{pip} and returned as
the result of the partial evaluation of the \gls{scc}.
\begin{figure}[h]
\centering
\begin{subcaptionblock}[t]{0.45\textwidth}
\centering
\input{figures/ch4_scc_prog}
\caption{The original program}
\end{subcaptionblock}
\begin{subcaptionblock}[t]{0.45\textwidth}
\centering
\input{figures/ch4_scc_lift}
\caption{Copy and lift}
\end{subcaptionblock}
\begin{subcaptionblock}[t]{0.45\textwidth}
\centering
\input{figures/ch4_scc_remove}
\caption{Remove SCC}
\end{subcaptionblock}
\begin{subcaptionblock}[t]{0.45\textwidth}
\centering
\input{figures/ch4_scc_evaluate}
\caption{Evaluate entry transitions}
\end{subcaptionblock}
\caption{Visualization of partial evaluation for a single
SCC\label{fig:evaluate_scc}}
\end{figure}
\begin{algorithm}
\caption{$evaluate_{SCC}(P, A)$}
\KwData{ A \gls{pip} $P$, an \gls{scc} $A\subseteq\T$ }
\KwResult{A Graph $G$}
$\Tin \leftarrow \Tin(A)$\;
$\Tout \leftarrow \Tout(A)$\;
\SetKwFunction{FEvaluate}{evaluate}
\SetKwProg{Fn}{Function}{:}{end}
\Fn{\FEvaluate{$G$, $v$}}{
$N \leftarrow \unfold^*(v)$\;
\For{$t = (v, \tau, \eta, p, v') \in N $}{
\uIf {$(l_v, \tau, \eta, p, l_{v'}) \in \Tout$}{
$G \leftarrow G + (v, \tau, \eta, p, \langle l_{v'},
\textit{true}\rangle)$\;
}
\uElseIf {$v' \not\in V(G)$}{
$G \leftarrow G + t$\;
$G \leftarrow{}$ \FEvaluate{$G$, $v'$}\;
}
\Else {
$G \leftarrow G + t$\;
}
}
\Return {$G$}
}
$G \leftarrow \text{lift}(P - A - \Tin - \Tout)$\;
\For{$t_i = (l_i, \_, \_, \_, l_i') \in \Tin $}{
$ G \leftarrow{}$ \FEvaluate{$G$, $\langle l_i', \textit{true}\rangle$}\;
}
\Return {$G$}\;
\end{algorithm}
\begin{comment}
termination:
- assume the abstraction method cuts every cycle
- only finitely many abstractions exist
-> the partial evaluation graph is finite
-> the algorithm terminates
correctness:
- the abstraction is entailed by the state
- every walkable path in the program/scc exists in the partial evaluation
graph
\end{comment}
% -----
% In probabilistic programs not only the worst case is of interest, but also the
% probable case. For example a program might run indefinitely in the worst case
% but the worst-case has a probability approaching zero. The expected bounds are
% defined using the expected value of random variables in the probability space
% $(\runs, \F, \PrSs)$ defined in \Sref{ssec:probabilityspace}. Moreover, recall
% definition \ref{def:expected_value}.
% \begin{definition}[Expected Runtime Complexity, \acrshort{past}
% \cite{meyer2021tacas}\label{def:past}]
% For $g \in \GT$ its runtime is the random variable $\Rt(g)$ where $R: \GT
% \rightarrow \runs \rightarrow \overline{\N}$ with
% \begin{equation}
% \Rt(g)((\_,t_0,\_)(\_,t_i,\_)\dots) = |\set{i}{t_i \in g}|.
% \end{equation}
% For a scheduler $\scheduler$ and starting state $s_0\in\Sigma$, the expected
% runtime complexity of $g\in\GT$ is $\E_{\scheduler,s_0}(\Rt(g))$ and the
% expected runtime complexity of $\Prog$ is $\sum_{g\in\GT}\E_{\scheduler,
% s_0}(\Rt(g)))$.
% If $\Prog$'s expected runtime complexity is finite for every scheduler
% $\scheduler$ and every initial state $s_0$, then $\Prog$ is called
% \acrfull{past}
% \end{definition}
% The expected size bound is defined similarly
% \begin{definition}[Expected Size
% Complexity\cite{meyer2021tacas}\label{def:expectedcomplexity}]
% The set of general result variables is $\GRV = \set{(g,l,x)}{g\in\GT,
% x\in\PV, (\_,\_,\_,\_,l) \in g}$. The size of $\alpha = (g,l,x) \in \GRV$ is
% the random variable $\S(\alpha)$ where $\S : \GRV \rightarrow (\runs
% \rightarrow \overline{N})$ with
% \begin{equation}
% \S(g,l,x)((l_0,t_0,s_0)(l_1,t_1,s_1)\dots) = \sup \set{|s_i(x)|}{l_i =
% l, t_i \in g}.
% \end{equation}
% For a scheduler $\scheduler$ and starting state $s_0$, the expected size
% complexity of $\alpha \in \GRV$ is $\E_{\scheduler,s_0}(\S(\alpha))$.
% \end{definition}
% \begin{definition}[Expected Runtime and Size
% Bounds\cite{meyer2021tacas}\label{def:expectedbounds}]\phantom{ }
% \begin{itemize}
% \item $\RBe : \GT \rightarrow \B$ is an expected runtime bound if for
% all $g\in\GT$, all schedulers $\scheduler$, and all $s_0 \Sigma$, we
% have $|s_0|(\RBe(g)) \geq \ESs(\Rt(g))$.
% \item $\SBe : \GRV \rightarrow \B$ is an expected size bound if for
% all $\alpha\in\GT$, all schedulers $\scheduler$, and all $s_0 \Sigma$, we
% have $|s_0|(\SBe(g)) \geq \ESs(\S(g))$.
% \end{itemize}
% \end{definition}
% \begin{example}
% \end{example}
% \begin{definition}[Total worst-case runtime complexity \label{def:wc_rt_complexity}]
% For a \gls{pip} $\Prog$, the overall worst-case runtime complexity $\Rt_\Prog :
% \runs \rightarrow \overline{\N}$ is defined as the \textit{random variable}
% \begin{equation}
% \Rt_\Prog(\vartheta)= \sum_{g\in\GT}\Rt(g)(\vartheta)
% \end{equation}
% \end{definition}
% The following holds for any $\vartheta = (\_,t_0,\_)(\_,t_1,\_)\dots \in \runs$:
% \begin{align}
% \Rt_P(\vartheta) &= \sum_{g\in\GT}\Rt(g)(\vartheta) & \text{by Definition
% \ref{def:wc_rt_complexity}} \\
% &= \sum_{g\in\GT} |\set{i}{t_i \in g}| & \text{by Definition \ref{def:past}}
% \\
% &= | \set{i}{t_i \in g, g\in\GT} & \text{for all } g \neq g', g\intersect{}g' = \emptyset
% \end{align}
% \begin{rem}
% $t_\bot \in g_\bot \not\in \GT$, hence terminating runs have a finite
% overall runtime complexity.
% \end{rem}
% The worst-case total runtime complexity, for an initial state $s_0\in\Sigma$ and
% all scheduler $\scheduler$, is the supremum over all admissible runs:
% \begin{align}
% \sup \set{\Rt_\Prog (\vartheta)}{\vartheta \in \runs, \PrSs(\vartheta) > 0}
% \end{align}
% The expected total runtime complexity is the expected value of the random
% variable $\ESs(\Rt_\Prog)$
% \subsection{Costs}
% It is useful to add the notion of costs to a transition to reflect variable
% complexity of transitions. The accumulative costs of a run are the sum of costs
% for each transition taken. Accumulative costs can simulated by an additional
% program variable $\mathbf{c} \in \PV$ that is updated only by polynomials of the
% form $\eta(\mathbf{c}) = \mathbf{c} + c_t$, where $c_t \in
% \Z$ is the cost associated with the transition $t\in\T$.
% For a given size bound $\SB$ for the \gls{pip} $\Prog$ and a starting assignment
% $s_0$, the cost bound is defined as $|s_0|(\SB(t_\bot,\mathbf{c}))$, and given
% an expected size bound $\SBe$, the expected cost bound is defined as
% $|s_0|\SBe(t_\bot, mathbf{c})$.
\begin{comment}
\subsection{Graph operations}
A \gls{pip} contains a directed graph with vertices $\L$ and edges $\T$. We will
define some graph operations that will become useful during the thesis. The set
of directed graphs over locations $\L$ and transitions $\T$ is denoted by $\G$.
% The graph $G(P)$ denotes the graph contained within a \gls{pip} $P$ and is
% defined as $G(P) = (L, \T)$. The following graph operations will be defined on
% graphs only, but work equally well for \glspl{pip} by using the contained graph
% operation first. However, the result might not be a valid integer program, per-se.
A set of transitions $T \subseteq \T$ induces a sub-graph $G(T)$ in a program $P$,
with $\G: 2^\T \rightarrow \G$. Similarly a set of locations $L \subseteq \Loc$
induces a sub-graph $G(L)$ in $P$, with $G: 2^\Loc \rightarrow \G$.
\begin{rem}
The resulting graphs cannot allways be lifted to a valid \gls{pip}, since
cumulative probability of general transitions are not necessarily preserved.
\end{rem}
\begin{definition}[Transition relation]
A relation $\rightarrow_P \,\subseteq L \times L$ is defined for the
\gls{pip} $P$, with $l \rightarrow_P l'$ if and only if there exists a
general transition $g \in GT$ and a transition $t \in g$ with $t = (l, p,
\tau, \kappa, l)$.
\end{definition}
\begin{definition}[Path relation]
For $n \in \N$ and the \gls{pip} $P$, a relation $\npathto{n}_P\,\subseteq L
\times L$ is defined with $l \npathto{n}_P l'$ if and only if there exist a
(finite) sequence of locations $(l_{k_0}, \dots l_{k_n})\in L^n$, with
$l_{k_i}\rightarrow_P l_{k_{i+1}}$ for $i \in [0, n-1]$, $l_{k_0} = l$, and
$l_{k_n} = l'$. We say there exists a \textit{path} of length $n$ from $l$
to $l'$.
In addition, the relation $l \pathto_P l'$ denotes that there exists a path
of arbitrary length, $l \npathto{>n}_P l'$ that there exists a path of
length
at least $n$.
\end{definition}
\begin{rem}
\begin{itemize}
\item A path of length 0 and the relation $\npathto{0}$ are perfectly
legal. With a zero-length path being a sequence of length one
containing only a single location;
\item the relation $\rightarrow_P$ is neither \textit{reflexive},
\textit{transitive} nor \textit{symmetric} in the general case, but
might be depending on the program.
\item the relation $\npathto{0}_P$ is always \textit{reflexive},
\textit{transitive}, and \textit{symmetric};
\item the relation $\npathto{<n}_P$ is always \textit{reflexive};
\item the relation $\npathto{>n}_P$ is always \textit{transitive};
whereas
\item the relation $\npathto{>0}_P \, = \,\pathto_P$ is always
\textit{reflexive} and \textit{transitive}.
\end{itemize}
\todo{proof needed?}
\end{rem}
Notation-wise, whenever the program is obvious from the context, the subscript
$P$ can be dropped from the relation.
\begin{rem}
The transition and path relation describe the mere existens of transitions
and not if the transition is at all walkable.
\end{rem}
\begin{definition}[SCC]
A set of locations $A \subseteq L$ is \textit{strongly connected} in P when
for every two locations $l, l' \in A$ there exists a path between the two
locations $l \pathto_P l'$. And it is a \textit{strongly connected
component} when the set is maximal. That is, there exists no location $l'' \in
L\backslash A$ such that $A \union \{l''\}$ is also strongly connected.
\end{definition}
\begin{lemma}\label{lem:scc_expand}
Let $A \subseteq L$ be strongly connected set of locations in $P$. Given a
location $l \in L$, and a location $l_A \in A$, if $l \pathto l_A$ and $l_A
\pathto l$ then $A \union \{l\}$ is also strongly connected in $P$.
\begin{proof}
\todo{proof needed?}
\end{proof}
\end{lemma}
\begin{lemma}
\label{lem:scc_disjunct}
Two strongly connected components $A, B \subseteq L$ of $P$ are either equal
or disjunct.
\begin{proof}
Let two strongly connected components $A, B \subseteq L$ be not
disjunct, and $l_{AB} \in A\intersect B$.
Assume there is a location $l_A \in A \backslash B$. Then $l_A \pathto
l_{AB}$ and for all $l_B \in B$, $l_{AB}\pathto l_B$ due to $A$ and $B$
being strongly connected. By transitivity, $l_A \pathto l_B$.
Symmetrically, for any $l_B \in B$, $l_B \pathto l_A$. But then $B
\union \{l_A\}$ would be strongly connected, contradicting $B$
maximality. Hence such a location $l_AB$ cannot exist and $A \intersect
B = \emptyset$.
\end{proof}
\end{lemma}
\begin{lemma}
\label{lem:scc_exist}
For every location $l\in L$, there exists a strongly connected component $A$
such that $l\in A$.
\begin{proof}
Let $l_{k_0}\in L$ be an arbitrary location of a program $P$. The proof
will be given in an inductive fashin. Starting with a set $A_0 =
\{l_{k_0}\}$, by reflexivity $l \pathto l$. Hence, $A_0$ is strongly
connected.
For $i \in \N$, if $A_i$ is maximal and $A_i$ is the SCC containing $l$.
If not there exists another $l_{k_{i+1}} \notin A_i$, such that $A_{i+1}
:= A_i \union \{l_{k_{i+1}}\}$ is strongly connected.
Repeat the argument above until a maximal set is found. The induction
ends, because the set of locations is finite.
\end{proof}
\end{lemma}
\begin{definition}[Loop]
A loop in a \gls{pip} $P$ is a sequence of pairwise different locations
$(l_{k_0} \dots l_{k_n})$ with $l_{k_i} \rightarrow l_{k_{i+1}}$ for $i \in
[0, n-1]$ and $l_{k_n} \rightarrow l_{k_0}$.
\end{definition}
\todo{examples: 1. loops, 2. two loopheads}
\begin{lemma}\label{lem:loop_path}
Let $(l_{k_0} \dots l_{k_n})$ be a loop in the program $P$. For any two
locations in the loop $l_{k_i}, l_{k_j}$, $l_{k_i} \pathto l_{k_j}$ and
$l_{k_j} \pathto l_{k_i}$.
\todo{proof needed?}
\end{lemma}
\begin{lemma}
Let $(l_{k_0} \dots l_{k_n})$ be a loop in the program $P$. Then all
locations $l_{k_0} \dots l_{k_n}$ are contained in the same SCC.
\begin{proof}
Follows immediately from Lemma \ref{lem:scc_expand} and Lemma
\ref{lem:loop_path} \todo{fix ref}
\end{proof}
\end{lemma}
% Various languages exist to represent integer programs in a computer readable
% form; namely Koat2\cite{}, IntC, Some weird XML Onthology
% \cite{https://raw.githubusercontent.com/TermCOMP/TPDB/master/xml/xtc.xsd}
% - CINT
% - Koat Format
% For probabilistic: PIP
% During this thesis we will transform programs. The goal is to find equivalent
% programs that are easier to analyze by Koat2. But what does equivalence mean?
% On non probabilistic programs one would expect, that while beeing structurally
% different, the outcome of the new program is the same as the old one.
% In addition the new program should take the same time as the old one. Since
% costs are represented as a variable, this is part of the outcome. But the
% runtime of a program is calculated by ranking-functions and they count the
% number of visits to a location. In order for the runtime to be the same, bounds
% shall remain the same. \todo{doesn't realy make sence!}
% In probabilistic programs, a the run is non-deterministic and hence calculating
% the runtime is impossible. Instead we have expected runtime-bounds. So the new
% program should have the same expected runtime-bounds and the same expected
% costs, as the old program. Let's give a more formal definition.
% \section{Linear Ranking Funktions}
% Ranking functions are a widely use tool to prove termination and runtime-bounds of integer transition systems. The simplest variant of ranking functions are \gls{lrf}
% We seek a function $f(x_1,\dots,x_n) = a_1x_1 + \cdots + a_nx_n + a_0$ with the rationals as a co-domain, such that
% \begin{enumerate}
% \item $f(\bar{x}) \leq 0$ for any valuation $\bar{x}$ that satisfies the loop constraints;
% \item $f(\bar{x}) - f(\bar{x}') \geq 1$ for any transition leadtion from $\bar{x}$ to $\bar{x}'$
% \end{enumerate}
% \cite{benamram2017}\todo{plagiat}
% Intuition: The ranking function decreases for every walkable transition. If such a ranking function exists, the loop terminates. \todo{not sure about this, guess}
% \section{Lexicographic ranking functions}
% We expand the \gls{lrf} to a tuple of \gls{lrf}. Where whenever lexicograpically higher function decreases lower ones can increase again.
% Example for a tuple.
% $$
% \langle f_1, f_2\rangle \text{s. th.}
% $$
% \section{Multiphase-ranking Functions}
% \section{Goal}
% Transform a program into an equivalent program. \todo{define equivalence of
% programs}
\begin{comment}
- Term Rewrite Systems (TRS)
- Transition Systems
- A set of States S
- A start location S_0 \in S
- A set of transitions T \in S x C x S
- Integer Transition Systems (ITS)
Koat2 can work on integer programs with polynomial integer
arithmetic.\cite{Brockschmidt16} To represent states and invariants in such a
program, one needs polynomial constraints. As an extended concept Koat2 uses
boxes and we will use polyhedra for the partial evaluation. Since those concepts
are closely related we will introduce them together.
\end{comment}