finegeometer/Ideas - Change WBLPXVTDO5DLSFRCAO7Q5AVZRUQUUOQKG4LLEQCDDGKJNL4TDVSAC

Update speedrunning_lower_bound.md

Created by finegeometer on January 19, 2022

WBLPXVTDO5DLSFRCAO7Q5AVZRUQUUOQKG4LLEQCDDGKJNL4TDVSAC

Dependencies

In channels

main

Change contents

Replacement in src/speedrunning_lower_bound.md at line 20 [3.52]

B:BD[3.1180] → [3.1180:1313]

- Formally proving useful things about games may just turn out to be infeasible. 
This would be disappointing, but is very possible.

[3.1180]

[3.1313]

- We fail to prove any useful lower bound.
This would be disappointing, but is very possible. It could just be too hard to prove anything.

Insertion in src/speedrunning_lower_bound.md at line 26 [3.52]
[3.1477]
[3.1477]

Replacement in src/speedrunning_lower_bound.md at line 29 [3.52]

B:BD[3.1496] → [3.1496:1940]

In mathematics, you need to define your terms. If you want to talk about real numbers, you first need to define them.
(They're an ordered collection of things with addition, negation, multiplication, and reciprocation, satisfying thirteen axioms.)
If you want to talk about symmetry, you need to define the concept of symmetry group. And if you want to prove a lower bound on Dragster times, you need to precisely define how an Atari 2600 works.

[3.1496]

[3.1940]

So how does one formally verify code? When I tried directly searching for answers to this, I had no luck.

Replacement in src/speedrunning_lower_bound.md at line 31 [3.52]

B:BD[3.1941] → [3.1941:2156]

This basically amounts to writing an emulator. 
We define the state of the processor, and how each instruction affects it.
We define the memory map; the behavior of reading and writing to different areas of memory.

[3.1941]

[3.2156]

I would suggest using the Coq theorem prover. When I tried learning it, it seemed like half the tutorials mentioned program verification!
Also, I found Coq easier to install than Agda, and less laggy than Isabelle.

Deletion in src/speedrunning_lower_bound.md at line 34 [3.52]

B:BD[3.2157] → [3.2157:2438]

The behavior does not need to be completely defined. We don't need to describe how illegal opcodes behave; we just need to prove they won't be executed. We probably don't need to precisely define the behavior of the peripherals; I don't think we care what value is read from them.

Replacement in src/speedrunning_lower_bound.md at line 35 [3.52]

B:BD[3.2439] → [3.2439:2582]

It's worth noting that the proof will assume the Atari 2600 definition is correct. If it isn't, the proof won't quite prove what we think it does.

[3.2439]

[3.2582]

Next, we need to precisely describe what we're proving. To do this, we need to explain to Coq precisely how an Atari 2600 works.
This basically amounts to writing an emulator.
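
To illustrate what "writing an emulator" means here, the following is a minimal sketch in Python rather than Coq. It models a processor state, a toy memory map, and a step function for just two 6502 instructions. The names `Cpu`, `read`, `step`, and `ROM_BASE` are hypothetical, and the ROM base address is an assumption made for the example; a real model would cover the full instruction set and the rest of the memory map.

```python
from dataclasses import dataclass, replace

ROM_BASE = 0xF000  # assumed cartridge ROM base address, for illustration only


@dataclass(frozen=True)
class Cpu:
    """Hypothetical, heavily reduced processor state."""
    pc: int = ROM_BASE      # program counter
    a: int = 0              # accumulator
    zero: bool = False      # zero flag
    negative: bool = False  # sign flag


def read(rom: bytes, addr: int) -> int:
    """Toy memory map: only ROM is modelled; other addresses are undefined."""
    if ROM_BASE <= addr < ROM_BASE + len(rom):
        return rom[addr - ROM_BASE]
    raise ValueError(f"read from unmodelled address {addr:#06x}")


def step(cpu: Cpu, rom: bytes) -> Cpu:
    """Execute one instruction and return the new state."""
    opcode = read(rom, cpu.pc)
    if opcode == 0xA9:  # LDA #imm: load accumulator, update zero and sign flags
        value = read(rom, cpu.pc + 1)
        return replace(cpu, pc=cpu.pc + 2, a=value,
                       zero=(value == 0), negative=bool(value & 0x80))
    if opcode == 0x4C:  # JMP abs: the target address is stored little-endian
        lo = read(rom, cpu.pc + 1)
        hi = read(rom, cpu.pc + 2)
        return replace(cpu, pc=lo | (hi << 8))
    # Everything else is left undefined in this sketch.
    raise ValueError(f"unmodelled opcode {opcode:#04x}")
```

Leaving unmodelled opcodes and addresses as errors mirrors the idea that behavior need not be completely defined, as long as one can show those cases are never reached.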

Replacement in src/speedrunning_lower_bound.md at line 38 [3.52]

B:BD[3.2583] → [3.2583:2610]

## Making the model usable

[3.2583]

[3.2610]

Note that the proof will have to trust that we get this part right.
Our proof will actually be talking about the behavior of Dragster *when run on this emulator*.
So if our emulator is buggy, the proof won't prove what we think it does.

Replacement in src/speedrunning_lower_bound.md at line 42 [3.52]

B:BD[3.2611] → [3.2611:2690]

Once we have our model of the Atari 2600, we can prove things about it. Right?

[3.2611]

[3.2690]

Despite this limitation, I think the proof will still be worthwhile.

Deletion in src/speedrunning_lower_bound.md at line 44 [3.52]

B:BD[3.2691] → [3.2691:2855]

Not so fast. In theory, we can. But it won't be practical.
It'll take dozens of lines even to prove the behavior of `LABEL: JMP LABEL`, let alone any real program.

Replacement in src/speedrunning_lower_bound.md at line 45 [3.52]

B:BD[3.2856] → [3.2856:2946]

So we need to simplify. We need most of the low-level detail to be handled automatically.

[3.2856]

[3.2946]

## Using the model

Replacement in src/speedrunning_lower_bound.md at line 47 [3.52]

B:BD[3.2947] → [3.2947:3009]

I'm not entirely sure how to do this. But I have a few ideas.

[3.2947]

[3.3009]

Once we've implemented the Atari 2600, we want to begin the proof. But how?

Replacement in src/speedrunning_lower_bound.md at line 49 [3.52]

B:BD[3.3010] → [2.0:20]

∅:D[2.20] → [3.3014:3372]

B:BD[3.3014] → [3.3014:3372]

### My own thoughts
Here's my thinking:
We want to specify, for each possible value of the program counter, a property of the Atari's state. 
We want to prove that the property holds on startup, and that it will continue to hold as the program runs.
Finally, we want the property at the victory-handling code to ensure that the timer's value is greater than our lower bound.

[3.3010]

[3.3372]

There has been a lot of research into formally verifying code.
In particular, researchers have invented the concept of a *separation logic*,
which is good for reasoning about code involving mutable state.
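
The per-program-counter scheme described in the removed text of this hunk (one property per program counter value, true at startup, preserved by every step, and implying the bound at the victory handler) can be sketched on a toy machine. This is a Python illustration, not a proof: it checks preservation by brute-force enumeration over a small range of timer values, whereas a Coq development would prove it for all states. The three-instruction program, the bound of 3, and all names are hypothetical.

```python
def step(state):
    """Toy machine: state is (pc, timer)."""
    pc, timer = state
    if pc == 0:                      # increment the timer
        return (1, timer + 1)
    if pc == 1:                      # loop until the timer reaches 3
        return (0, timer) if timer < 3 else (2, timer)
    return state                     # pc == 2: victory handler, halts


# One property of the timer per program counter value.
# The property at pc 2 is the lower bound we want: timer >= 3.
INVARIANT = {
    0: lambda t: 0 <= t < 3,
    1: lambda t: 1 <= t <= 3,
    2: lambda t: t >= 3,
}


def holds(state, invariant):
    """True if the property attached to this state's pc holds for it."""
    pc, timer = state
    # A pc with no property attached is treated as the property False.
    return pc in invariant and invariant[pc](timer)


def check(start, invariant, timers=range(16)):
    """Check the property at startup and its preservation by each step."""
    if not holds(start, invariant):
        return False
    for pc in invariant:
        for t in timers:
            if holds((pc, t), invariant) and not holds(step((pc, t)), invariant):
                return False
    return True
```

Under these assumptions `check((0, 0), INVARIANT)` succeeds, while strengthening the victory property to `t >= 4` makes the check fail, which corresponds to a claimed bound that the program does not actually satisfy.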

Replacement in src/speedrunning_lower_bound.md at line 53 [3.52]

B:BD[3.3373] → [3.3373:4163]

And we need all this to not take an unreasonable amount of work. So we simplify.
1. For program counter values outside of ROM, we choose the property `False`.
    This will require us to prove that the program counter never leaves ROM.
    But in return, we get that each possible program counter value corresponds to a unique instruction,
    irrespective of the state of the processor or memory.
2. For most possible values of the program counter, the property can be automatically generated.
    For example, consider this code:
    ```
    LABEL1: LDA #$00
    LABEL2: ...
    ```
    If the property at `LABEL2` is `P`, the property at `LABEL1` should default to
    `state => P(state[accumulator = 0x00, sign_flag = False, zero_flag = True])`.
    Property-preservation is automatic.

[3.3373]

[3.4163]

Separation logic, in its usual form, doesn't work well for assembly code.
But this can be fixed. See [this paper](https://jbj.github.io/research/hlsl.pdf) for details.
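
The automatically generated property for `LDA #$00` in the removed text of this hunk is a precondition computed backwards from the property at the next label. A minimal Python sketch of that idea follows, assuming states are dictionaries keyed by the same field names used in the removed text (`accumulator`, `sign_flag`, `zero_flag`); the function name `lda_immediate_pre` and the extra `x` field in the usage note are hypothetical.

```python
from typing import Callable, Dict

State = Dict[str, object]
Property = Callable[[State], bool]


def lda_immediate_pre(value: int, post: Property) -> Property:
    """Property that must hold before `LDA #value` so that `post` holds after it."""
    value &= 0xFF

    def pre(state: State) -> bool:
        # Apply the instruction's effect to a copy of the state, then ask `post`.
        after = {
            **state,
            "accumulator": value,
            "sign_flag": bool(value & 0x80),
            "zero_flag": value == 0,
        }
        return post(after)

    return pre
```

For example, if the property at `LABEL2` is `lambda s: s["accumulator"] == 0 and s["x"] == 3`, then `lda_immediate_pre(0x00, ...)` yields a property at `LABEL1` that only constrains `x`, since the load overwrites whatever the accumulator held before.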

Replacement in src/speedrunning_lower_bound.md at line 56 [3.52]

B:BD[3.4164] → [3.4164:4326]

Suddenly, instead of having to define thousands of properties, and thousands of property-preservation theorems,
we only need a few. But I worry about two things:

[3.4164]

[3.4326]

That paper itself isn't perfect for our use-case, either.
It talks about proving that code won't fault, whereas we want a stronger property.
But this should be fixable, too.

Replacement in src/speedrunning_lower_bound.md at line 60 [3.52]

B:BD[3.4327] → [3.4327:4678]

- We still need to specify a property somewhere in each loop, to keep the automatic property generation from endlessly looping.
Even in the rendering code, which we really don't care about.
- The property-preservation theorems might be too complicated.
"If this property holds, and then we apply these ten instructions, then that property will hold."

[3.4327]

[2.21]

In Coq, there is a library called [Iris](https://iris-project.org/),
which provides general-purpose tools for working with separation logic inside Coq.

Replacement in src/speedrunning_lower_bound.md at line 63 [3.52]

B:BD[2.22] → [2.22:56]

### Potentially useful prior work

[2.22]

[2.56]

Crucially, Iris is flexible enough that it should work with assembly code.
The parts of separation logic that don't apply are derived concepts in Iris,
rather than part of its core logic.

Deletion in src/speedrunning_lower_bound.md at line 67 [3.52]

B:BD[2.57] → [2.57:167]

I found this paper:
[High-Level Separation Logic for Low-Level Code](https://jbj.github.io/research/hlsl.pdf)

Replacement in src/speedrunning_lower_bound.md at line 68 [3.52]

B:BD[3.4679] → [2.168:364]

This seems useful. But it's not perfect. In particular, their `safe` primitive is not quite what we need,
because we want to prove something stronger than the fact that the program doesn't crash.

[3.4679]

[2.364]

# The proof

Replacement in src/speedrunning_lower_bound.md at line 70 [3.52]

∅:D[2.365] → [3.4679:4698]

B:BD[3.4679] → [3.4679:4698]

## Using the model

[2.365]

[3.4698]

Once we've recast the problem in Iris' specification logic, we try to prove it.
To do this, we try to understand why people think the game can't be beaten in less than 5.57 seconds.
Then we try to formalize that reasoning.

Replacement in src/speedrunning_lower_bound.md at line 74 [3.52]

B:BD[3.4699] → [3.4699:4868]

Once using the model is practical, we get to the main part of the project: proving a lower bound on the time.
This is where we analyze the actual mechanics of the game.

[3.4699]

[3.4868]

Either the proof will go through, or it won't.
In the latter case, we should try to understand *why* it doesn't go through.
Is our proof just not good enough, or is a 5.54 actually possible?

Deletion in src/speedrunning_lower_bound.md at line 78 [3.52]

B:BD[3.4869] → [3.4869:5091]

If all goes well, we can formalize players' reasoning about why one can't beat 5.57. Alternatively, we might find a subtle error in that reasoning, and discover that 5.57 can be beaten! Either result would be interesting.