README.md

AoC 2021 solutions in zig

After solving AoC in golang first, now I try to rewrite it in zig.

While doing it I try to improve the speed of my solutions. There will be no asm tweaks here, but I try to do my best.

Here are a couple similar repository:

Benchmark times below measured on a Raspberry Pi 4 with 8GB RAM. Zig version 0.9.1 with -Drelease-fast build option.

Optimizations (and fails)

Day 1: Sonar Sweep

We compare to array window (a and b) when a < b we increment our counter.

a: idx-3 + idx-2 + idx-1
b: idx-2 + idx-1 + idx

As idx-2 and idx-1 are in a and b too this can be simplified to idx-3 < idx. This way it fits in u13, using @intCast(u15) is not necessary.

Gain: 271 μs -> 270 μs :-)

Day 2: Dive!

Nothing smart here. I could check only the first character of each command, but it does not really matter.

Day 3: Binary Diagnostic

First version: day03_string.zig:

Using std.heap.FixedBufferAllocator instead of std.heap.GeneralPurposeAllocator helps. Gain: 4780 μs -> 2650 μs

Final version:

Using std.ArrayList instead of std.BufSet and bitwise operators instead of strings makes things faster. ~2200 μs with std.testing.allocator and ~950 μs with std.heap.FixedBufferAllocator.

Day 4: Giant Squid

First version: day04_map.zig:

I have tried to be smart whith the data structure here: Using an AutoHashMap to store the position of each number in the grid, so I do not have to loop through the whole grid when searching for a number.

This turns out quite slow: ~2x15 ms. Takeaway: When the grid is small (in this case 5x5) looping through it is quite fast anyway, it will compensate the HashMap overhead.

Final version:

I am using ArrayList for holding the numbers and boards just like the first version, but the Board representation is vastly simplified. Items are stored in a 5x5 array and I keep track of marked numbers in row and col which are [5]u5 so it is easier to check if we reached a bingo, do not have to loop through the board every time.

This version runs around 1200-1400 μs, so 10x faster than the first version.

Day 5: Hydrothermal Venture

First version

The first version used AutoHashMap to store vents data. With my input on a 1000x1000 grid there are 98435 points, so it does not make sense after all.

Runtime was 50 ms for the first part, so I just gave up that point.

Final version

Using an [1000][1000]u2 array for stroring vents. Nothing fancy. Runtime is around 5ms each.

Day 6: Lanternfish

The main idea here is that we do not try to keep track of the fishes, just following how many fish is in a given age. This way the solution is simple and fast.

Day 7: The Treachery of Whales

In first part we need the median of the numbers, in second part we need the mean of them. https://www.geeksforgeeks.org/program-for-mean-and-median-of-an-unsorted-array/

I get the median by sorting the array (which is O(n*log n) worst case), but there are methods to do it O(n) time.

The mean is computed while parsing the items. For fuel stepping I just use Gauss' trick: 1+2+3+...+n = n(n+1)/2. This saves me a loop.

Day 8: Seven Segment Search

Instead of brute forcing the result deduct the unknown numbers from the already known ones. I am also using a u7 for representing the seven segments. This way filling them up also takes care of sorting.

Day 9: Smoke Basin

This day does not need any fancy idea. I just walk through the items to get the lowest in first part.

In second part I start from every low point and collect every neighbour (and neighbour of neighbours) which is lowert than 9 recursively. Already visited points are stored in a std.AutoHashMap(point, void). I collect the basins to a std.PriorityQueue, so I do not have to sort them at the end. It is probably useless, but was fun to use this data type too.

TODO: Using a [100][100]bool array for stroring the seen items might speed things up.

Day 10: Syntax Scoring

In the first part I just store the opening pairs in a std.ArrayList(u8). I quit parsing as soon as a pair mismatch occures and count the correspoinding fail value.

In second part there is an other std.ArryList() for storing the completion scores. The score is computed reading the first array backward.

Day 11: Dumbo Octopus

I just used a [10][10]u4 array for stroring the grid items. Zig has a neat +|= operator, so the grid items will not overflow because of the continous flashing. I just flash all the neighbours in 8 direction recursively.

The second part is very similar. I just check if every item in the grid is 0 with fullFlash(). I break out from the infinite loop when it returns true.

Day 12: Passage Pathing

First version: (*_without_memo.zig)

I used a std.StringHashMap(std.ArrayList([]const u8)) for storing the routes from every cave to the adjacent ones and a std.StringHashMap(void) for storing the seen ones.

In first part I recursively check all the caves starting from start cave, trying all the paths increasing the counter on every end reach. I only collect small caves in seen HashMap.

In second part the seen is a std.StringHashMap(u2), so I can store how many times a small cave is seen. (Using +|= to avoid overflow).

These solutions are slow: 7ms for the first part and 272 ms for the second which is way too much.

Final version

Instead of a std.StringHashMap(void) I have used a bitmap for representing seen caves (this could be further simplified by only tracking lowercase caves. In the second part I use an other bit for marking if double lowercase cave is used or not.

The main optimization here is using memoization for storing the alredy generated states.

First part is 1.2 ms, second part is 2.5ms (more than 100x faster).

Day 13: Transparent Origami

I use an std.AutoArrayHashMap for storing the points on the map, it is a bit faster to iterate through than std.AutoHashMap. Nothing special.

Day 14: Extended Polymerization

I store how often a given character pair appears in a std.StringHashMap. In every step I just create a new map based on the previous one. The second part takes ~15ms, this could be improved.

Day 15: Chiton

This one is solved by a Dijksra algorithm. I use a std.PriorityQueue for storing grid points to visit. I can not use queue.update() because of some limitations in the std lib implementation, so I just keep adding the grid points. It turns out to be faster than updating with std.meta.eql(). This solution is ~240ms for the second part.

Day 16: Packet Decoder

I convert from hex to binary as string with switch. Not sure if this is good or bad. The rest if slicing and parsing. Nothing fancy.

Day 17: Trick Shot

This first part is a simple calculation. In second part I just brute force the solutions with the following optimizations:

  • I keep vx{min, max} and vy{min, max} limited to the interesting part.
  • I also keep the time (steps) within useful range.
  • I break the loop when it has no chance to come up with a hit.

Day 18: Snailfish

First solution: day18a_string.zig

My go solution uses strings to solve this with regexes and so. I have tried to replicate that in zig, but it needs a regex lib (which is not in the std) and turns out to be quite slow. The first part takes ~500ms on my computer, so I have to find a better way.

Final version

Instead of strings I use the following node structure for storing items:

const SnailItem = union(enum) {
    num: u6,
    pair: *SnailNumber,
}

const SnailNumber = struct {
    left: ?SnailItem,
    right: ?SnailItem,
    parent: ?SnailNumber,
}

The first step is parsing the string to this, every other manipulation is done on the SnailNumber. First part drops down to 10ms, second part is ~200ms.

Day 19: Beacon Scanner

Using a brute force solution here, looping through all the scanners and beacons. Comparing them to already known absolute beacon positions. At start any scanner can be the (0,0,0) and there beacons are the absolute ones.

We also have to rotate the coordinates (3*8 rotation) before subtracting the absolute ones. When 12 matching item found we add those beacons to the absolute ones.

We store the absolute beacon coordinates in an std.AutoArrayHashMap, so it is easier to loop through them.

This solution is quite slow, first and second part takes more than 5 seconds each.

TODO: Figure out how to avoid checking all scanner, beacon and rotation against the absolute ones. Can we rule out certain scanners based on distance or something else? (Sort of done, but I am not satisfied with the result, especially the hard coded magic numbers.)

Day 20: Trench Map

The image enhancement algorithm string converted to a std.StaticBitSet. I use a u9 for generating the index with bitmasking. This works fine.

Day 20b first solution: day20b_hashmap.zig ~1.6 seconds

I keep the lit items in a std.AutoHashMap, but generally the picture grid is just an array of coordinates.

Final version

I have replaced std.AutoHashMap with std.DynamicBitSet as each pixes is either lit or not. This makes the code much faster. First part is ~3792 us, second part is 190 ms.

Day 21: Dirac Dice

First part is very easy, not much to talk about. The second part is much harder. It takes forever without memoization. Using an std.AutoHashMap for the cache. There are 16k+ items in the cache and more than 255k hits for chache items with my input. Runs ~70ms on the rpi4.

Day 22: Reactor Reboot

The first part is fast, as we only have to check the items in -50..50 range.

Day 22b first solution: day22b_coord_comp.zig ~9 seconds

The second part is much harder: We use coordinate compression: Each cube switches on and off a range of coordinates, but we do not check them one-by-one, just keeep track of the start and end of each range. This makes this solution much faster, but it is still not fast enough.

Final version

We use create subcube only when two cubes are intersecting and switch the cubes on and off based on the new input. This way we keep the number of cubes low (~7000 cubes vs. the 840840840 items in the previous version). We keep the subcubes in std.AutoArrayHashMap so iterating is fast. Running time is ~95ms.

Day 23: Amphipod

I use Dijksra to solve this one. The various state representation is the following:

const Amphipod = enum { A, B, C, D, };

const hallway_length = 7;
const siderooms = 4;
const sideroom_depth = 2; // 4 int the second part

const State = struct {
    hallway: [hallway_length]?Amphipod,
    rooms: [siderooms][sideroom_depth]?Amphipod,
};

I do not think that I am doing any particularly smart with the moves. But I move the Amphipod to it's room whenever that is possible to lower state count.

I store state and it's energy in a std.PriorityQueue. I do not update queue energy values as adding new ones is faster and I do not drain the queue, just return when reached final state.

TODO:

  • According to profiling hashing the states for the std.AutoHashMap which strores the already visited states is the slow part.
  • Maybe try using BTreeMap
  • Use A* heuristics for choosing better states to check, do not wast time on useless states.

Day 24: Arithmetic Logic Unit

The hard part of this task figuring out which input lines matter (which I did not, so...) The solution is fast as it is.

Day 25: Sea Cucumber

I just use a [137][139]u8 for storing grid values and change them in place. To change them in place I had to take special care for capturing all the the row/col state before changing them, but it is bearable. Running time is ~250ms.

TODO:

  • I could try storing '>' and 'v' coordinates in separate arrays, that way I can check only those items, not the whole grid. This way the grid can be represented as a std.StaticBitSet(grid_rows*grid_cols) to store occupied or free state of a given grid point.
const Ocean = struct {
    easts: []usize,
    souths: []usize,
    grid: std.StaticBitSet(grid_rows*grid_cols)
}

This way every coordinate is comes as Coord{ .x = value / grid_cols, .y = value % grid_cols}. Might worth a try.

Benchmark results

Raspberry Pi 4 with 8GB RAM

⋊> ~/c/aoc2021-zig on master > zig build bench -Drelease-fast
Day 1a result: 1400             time: 297us
Day 1b result: 1429             time: 297us
Day 2a result: 1893605          time: 212us
Day 2b result: 2120734350       time: 216us
Day 3a result: 3847100          time: 174us
Day 3b result: 4105235          time: 912us
Day 4a result: 11536            time: 1054us
Day 4b result: 1284             time: 1519us
Day 5a result: 8060             time: 6568us
Day 5b result: 21577            time: 7945us
Day 6a result: 350149           time: 37us
Day 6b result: 1590327954513    time: 35us
Day 7a result: 335271           time: 420us
Day 7b result: 95851339         time: 156us
Day 8a result: 381              time: 248us
Day 8b result: 1023686          time: 645us
Day 9a result: 550              time: 514us
Day 9b result: 1100682          time: 4798us
Day 10a result: 411471          time: 557us
Day 10b result: 3122628974      time: 923us
Day 11a result: 1700            time: 913us
Day 11b result: 273             time: 2214us
Day 12a result: 5212            time: 1065us
Day 12b result: 134862          time: 2265us
Day 13a result: 802             time: 2173us
Day 13b result: 103             time: 5834us
Day 14a result: 3411            time: 3225us
Day 14b result: 7477815755570   time: 13915us
Day 15a result: 366             time: 7948us
Day 15b result: 2829            time: 241226us
Day 16a result: 967             time: 1373us
Day 16b result: 12883091136209  time: 1429us
Day 17a result: 5778            time: 1us
Day 17b result: 2576            time: 28285us
Day 18a result: 3675            time: 10034us
Day 18b result: 4650            time: 190114us
Day 19a result: 342             time: 1063723us
Day 19b result: 9668            time: 1066491us
Day 20a result: 5503            time: 3756us
Day 20b result: 19156           time: 188746us
Day 21a result: 679329          time: 4us
Day 21b result: 433315766324816 time: 75514us
Day 22a result: 607573          time: 14265us
Day 22b result: 1267133912086024time: 95714us
Day 23a result: 16244           time: 636858us
Day 23b result: 43226           time: 1388284us
Day 24a result: 99299513899971  time: 122us
Day 24b result: 93185111127911  time: 118us
Day 25a result: 432             time: 233175us
Total running time: 5357ms

Intel(R) Core(TM) i5-8365U CPU 8GB RAM

C:\Users\voroskoia\zig\aoc2021-zig>zig build bench -Drelease-fast
Day 1a result: 1400             time: 41us
Day 1b result: 1429             time: 43us
Day 2a result: 1893605          time: 44us
Day 2b result: 2120734350       time: 25us
Day 3a result: 3847100          time: 107us
Day 3b result: 4105235          time: 221us
Day 4a result: 11536            time: 136us
Day 4b result: 1284             time: 196us
Day 5a result: 8060             time: 846us
Day 5b result: 21577            time: 690us
Day 6a result: 350149           time: 1us
Day 6b result: 1590327954513    time: 1us
Day 7a result: 335271           time: 68us
Day 7b result: 95851339         time: 16us
Day 8a result: 381              time: 52us
Day 8b result: 1023686          time: 104us
Day 9a result: 550              time: 108us
Day 9b result: 1100682          time: 932us
Day 10a result: 411471          time: 96us
Day 10b result: 3122628974      time: 269us
Day 11a result: 1700            time: 109us
Day 11b result: 273             time: 244us
Day 12a result: 5212            time: 222us
Day 12b result: 134862          time: 499us
Day 13a result: 802             time: 235us
Day 13b result: 103             time: 958us
Day 14a result: 3411            time: 433us
Day 14b result: 7477815755570   time: 1514us
Day 15a result: 366             time: 1156us
Day 15b result: 2829            time: 28947us
Day 16a result: 967             time: 248us
Day 16b result: 12883091136209  time: 207us
Day 17a result: 5778            time: 0us
Day 17b result: 2576            time: 5552us
Day 18a result: 3675            time: 1661us
Day 18b result: 4650            time: 22754us
Day 19a result: 342             time: 133007us
Day 19b result: 9668            time: 146975us
Day 20a result: 5503            time: 1768us
Day 20b result: 19156           time: 64990us
Day 21a result: 679329          time: 0us
Day 21b result: 433315766324816 time: 9780us
Day 22a result: 607573          time: 2271us
Day 22b result: 1267133912086024time: 12200us
Day 23a result: 16244           time: 99033us
Day 23b result: 43226           time: 197525us
Day 24a result: 99299513899971  time: 46us
Day 24b result: 93185111127911  time: 42us
Day 25a result: 432             time: 44443us
Total running time: 731ms