WBLPXVTDO5DLSFRCAO7Q5AVZRUQUUOQKG4LLEQCDDGKJNL4TDVSAC
- Formally proving useful things about games may just turn out to be infeasible.
This would be disappointing, but is very possible.
- We fail to prove any useful lower bound.
This would be disappointing, but is very possible. It could just be too hard to prove anything.
In mathematics, you need to define your terms. If you want to talk about real numbers, you first need to define them.
(They're a ordered collection of things with addition, negation, multiplication, and reciprocation, saisfying thirteen axioms.)
If you want to talk about symmetry, you need to define the concept of symmetry group. And if you want to prove a lower bound on Dragster times, you need to precisely define how an Atari 2600 works.
So how does one formally verify code? When I tried directly searching for answers to this, I had no luck.
This basically amounts to writing an emulator.
We define the state of the processor, and how each instruction affects it.
We define the memory map; the behavior of reading and writing to different areas of memory.
I would suggest using the Coq theorem prover. When I tried learning it, it seemed like half the tutorials mentioned program verification!
Also, I found Coq easier to install than Agda, and less laggy than Isabelle.
The behavior does not need to be completely defined. We don't need to describe how illegal opcodes behave; we just need to prove they won't be executed. We probably don't need to precisely define the behavior of the peripherals; I don't think we care what value is read from them.
It's worth noting that proof will assume the Atari 2600 definition is correct. If it isn't, the proof won't quite prove what we think it does.
Next, we need to precisely describe what we're proving. To do this, we need to explain to Coq precisely how an Atari 2600 works.
This basically amounts to writing an emulator.
## Making the model usable
Note that the proof will have to trust that we get this part right.
Our proof will actually be talking about the behavior of Dragster *when run on this emulator*.
So if our emulator is buggy, the proof won't prove what we think it does.
### My own thoughts
Here's my thinking:
We want to specify, for each possible value of the program counter, a property of the Atari's state.
We want to prove that the property holds on startup, and that it will continue to hold as the program runs.
Finally, we want the property at the victory-handling code to ensure that the timer's value is greater than our lower bound.
There has been a lot of research into formally verifying code.
In particular, researchers have invented the concept of a *separation logic*,
which is good for reasoning about code involving mutable state.
And we need all this to not take an unreasonable amount of work. So we simplify.
1. For program counter values outside of ROM, we choose the property `False`.
This will require us to prove that the program counter never leaves ROM.
But in return, we get that each possible program counter value corresponds to a unique instruction,
irrespective of the state of the processor or memory.
2. For most possible values of the program counter, the property can be automatically generated.
For example, consider this code:
```
LABEL1: LDA #$00
LABEL2: ...
```
If the property at `LABEL2` is `P`, the property at `LABEL1` should default to
`state => P(state[accumulator = 0x00, sign_flag = False, zero_flag = True])`.
Property-preservation is automatic.
Separation logic, in its usual form, doesn't work well for assembly code.
But this can be fixed. see [this paper](https://jbj.github.io/research/hlsl.pdf) for details.
Suddenly, instead of having to define thousands of properties, and thousands of property-preservation theorems,
we only need a few. But I worry about two things:
That paper itself isn't perfect for our use-case, either.
It talks about proving that code won't fault, whereas we want a stronger property.
But this should be fixable, too.
- We still need to specify a property somewhere in each loop, to keep the automatic property generation from endlessly looping.
Even in the rendering code, which we really don't care about.
- The property-preservation theorems might be too complicated.
"If this property holds, and then we apply these ten instructions, then that property will hold."
In Coq, there is a library called [Iris](https://iris-project.org/),
which provides general-purpose tools for working with separation logic inside Coq.
## Using the model
Once we've recast the problem in Iris' specification logic, we try to prove it.
To do this, we try to understand why people think the game can't be beaten in less than 5.57 seconds.
Then we try to formalize that reasoning.
Once using the model is practical, we get to the main part of the project: proving a lower bound on the time.
This is where we analyze the actual mechanics of the game.
Either the proof will go through, or it won't.
In the latter case, we should try to understand *why* it doesn't go through.
Is our proof just not good enough, or is a 5.54 actually possible?