Table of Contents
AOC 2016  Day 5
Problem statement on Advent of Code website.
Code for this post can be found on GitHub.
Background
 On Day 1, we located Easter Bunny headquarters.
 On Day 2, we broke the code to use the bathroom in Easter Bunny HQ.
 On Day 3, we helped the EBs' design department by removing invalid triangles from the list.
 On Day 4, we figured out which rooms in the information kiosk are valid.
On Day 5, we will figure out how to get past security doors by calculating the code.
Problem  Part 1
For Part 1, we are given two pieces of information.
 The door ID
 An increasing integer index, starting at
0
.
For each MD5 hash that starts with 00000
(five 0
's), the 6th digit indicates the next character of the password. If we find eight hashes that start with five 0
's, that provides the password.
Solution  Part 1
My strategy to solve this problem was twofold.
 Write an MD5 algorithm in F#.
 I outlined my solution for a functional MD5 algorithm in F# in a previous blog post.
 Write a function to collect the required hashes and construct the final answer.
The vast majority of the functionality for this solution is in the MD5 algorithm, which I won't describe here again.
Collection function
Once I had an MD5 algorithm, I used the Seq
module and its library functions to collect the required hashes and calculate the door password.
The strategy for the function is simple.
 Start by producing indices starting at
0
.  Map all the strings to hashes.
 Remove all hashes that don't match the criteria (i.e. hashes that don't start with
00000
).  Once we have eight hashes, stop producing indices and get the final answer.
let day5part1input = "cxdnnyjw"
let day5part1criteria = "00000"
let day5part1 input crit (md5sum:string > string) =
Seq.initInfinite (fun i > input + string i)
> Seq.map md5sum
> Seq.filter (fun m > m.StartsWith crit)
> Seq.take 8
> Seq.map (Seq.item 5 >> string)
> Seq.reduce ( + )
When I ran the function, I got F77A0E6E
, which is the correct answer.
Problem  Part 2
The problem statement for Part 2 still uses the MD5 function, but adds a twist to the password computation. We still have the same information from the problem.
 The door ID
 An increasing integer index, starting at
0
.
We are still only interested in hashes that start with five 0
's, i.e. 00000
. The sixth character of the hash now indicates the position in the solution and the seventh character is what goes into that position.
If we get multiple hashes that meet the criteria and indicate the same position, all but the first match are ignored. Hashes that meet the criteria but indicate an invalid position, i.e. anything other than 0
7
, are also ignored.
Solution  Part 2
To solve Part 2, I wanted to use a similar approach to Part 1. However, I needed to way to determine when the algorithm had collected enough values to form a complete answer.
My overall strategy was as follows.
 Start by producing indices starting at
0
.  Map all the strings to hashes.
 Remove all hashes that don't match the criteria.
 Once I have a character for each position of the answer, stop producing indices and get the final answer.
Storing the pieces of the answer
I chose to use a Dictionary<int, string>
collection to hold the values that would form the answer. The keys represent the position in the answer and the values are the singlecharacter strings that, when concatenated, would form the answer.
First, I created a value that would hold the relevant pieces.
let answer = Dictionary<int, string> 8
Second, I created an add
function that would only add values to the Dictionary
if a value for the key was not already present.
let add (d:Dictionary<_,_>) kv =
if not < d.ContainsKey(fst kv) then
d.Add(fst kv, snd kv)
else ()
Third, I created a notComplete
function to check whether I had collected enough values for an answer.
let notComplete (d:Dictionary<_,_>) = d.Count < 8
Collecting the pieces of the answer
Once I had decided on a storage strategy, I wrote a function to add values to the Dictionary
until I had enough to get an answer.
The overall strategy for this part of the logic depends on the Seq.takeWhile
function. This function will terminate the infinite sequence generator once the Dictionary
is "full".
Seq.initInfinite (fun i > input + string i)
> Seq.map md5sum
> Seq.filter
(fun m >
(m.StartsWith crit)
&& (m.[5] = '0'  m.[5] = '1'  m.[5] = '2'  m.[5] = '3'
 m.[5] = '4'  m.[5] = '5'  m.[5] = '6'  m.[5] = '7'))
> Seq.map (fun m > m.[5] > string > System.Int32.Parse, m.[6] > string)
> Seq.takeWhile (fun _ > notComplete answer)
> Seq.iter (fun m > add answer m)
Producing the answer
As the last step, I pulled the values from the Dictionary
and concatenated the strings to get the final answer.
I chose to explicitly pull the values based on key lookups because, while there are library functions to get a collection of the values, I did not see any guarantees about the order that the items would be in.
[
for i in 0 .. 7 do
yield answer.[i]
]
> List.reduce ( + )
When I ran this function, I got back 999828EC
which is, in fact, the correct answer.
Some side notes
Mutable vs. Immutable data structures
As you may have noticed, I am using a mutable data structure to store the needed values for the Part 2 solution. At the time, I struggled to find a "pure" functional way of solving the problem but couldn't come up with something off hand (and, to tell you the truth, I didn't try too hard since I had just finished the MD5 algorithm and wanted to be done with this problem).
However, the Part 2 algorithm can easily be rewritten using a tailrecursive function and immutable data structures. The tailrecursive part is important because, while I don't know how many hashes the algorithm has to go through, I do know that even the OOTB MD5 algorithm takes over 2 minutes to calculate the answer (and my functional MD5 algorithm takes over 7 minutes). From that, I assume that the stack would get very deep if the function was not tailrecursive.
Seq
vs. ParallelSeq
In order to speed up the calculation, I attempted to use the ParallelSeq
library to take advantage of my CPU's multiple cores. Using the ParallelSeq
library did speed up my calculations considerably.
However, one thing I learned is that the ParallelSeq
library does not take care of putting the calculated values back in their original order. Instead, it is the client's responsibility to track the order of values (if that is important  and in this case it was).
As an example, when I used the ParallelSeq library, the system produced C9E29889
as the answer for Part 2. However, the correct answer for Part 2 is 999828EC
, which are the same characters but in the correct order.
In the end, I removed ParallelSeq
from this solution but it is a library that I intend to investigate in the future.
Lessons learned
Here are the lessons I learned from this exercise.
 Take the time to implement a solution the way you want, so that you do not later have regrets about it. In this instance, I regret choosing a mutable data structure and imperative algorithm to solve Part 2 and it's something I hope to rectify in the near future.
 Just as when I implemented the MD5 algorithm in F#, it is critical to be extra vigilant when reading code. Much more so than any other technique, reading and understanding code is the fastest way to find and fix most bugs.
See you next time!
PS: Solution  Part 2 (Pure functional implementation)
After I wrote this blog post, but before posting it, I decided to rewrite the Part 2 solution using immutable data structures.
Rewriting the function was quite simple, and relied on the following strategy.
 Define a completion check to terminate the recursive function.
 Define a tailrecursive function to calculate the 8 characters of the answer.
 Combine the 8 characters to form the final answer.
Here is the code for the pure functional implementation of Part 2, broken into 3 chunks.
let day5part2alt input crit (md5sum : string > string) =
let complete l = l > List.filter ((=) "") > List.isEmpty
let rec collector i acc =
if complete acc then acc
else
let m = md5sum (input + string i)
if (m.StartsWith crit) then
match m.[5] with
 '0'  '1'  '2'  '3'  '4'  '5'  '6'  '7' >
List.mapi (fun idx elt >
if elt = "" && idx = (m.[5] > string > System.Int32.Parse) then
string m.[6]
else elt) acc
> collector (i + 1)
 _ > collector (i + 1) acc
else collector (i + 1) acc
[ ""; ""; ""; ""; ""; ""; ""; "" ]
> collector 0
> List.reduce (+)
Running this function provides the same answer as above, i.e. 999828EC
.
Performance comparison with day5part2
I compared day5part2
and day5part2alt
to see whether my new implementation matches up to the previous one.
On average, I received the following results.
Function  Real  CPU  Gen0  Gen1  Gen2 


174.758 
174.689 
21623 
18 
2 

166.528 
166.437 
20590 
17 
1 
And here are the same results, converted to percentages.
Function  Real  CPU  Gen0  Gen1  Gen2 


105% 
105% 
105% 
106% 
200% 

100% 
100% 
100% 
100% 
100% 
Ignoring the Gen2 result, the mutable version is actually 5% worse across the board, in both processing time and increased garbage collection. It pays to be immutable.
See you next time!
No comments:
Post a Comment