### Daily Coding Problem 3

Daily Coding Problem 3

# Daily Coding Problem 3¶

## Problem Statement¶

Given the root to a binary tree, implement serialize(root), which serializes the tree into a string, and deserialize(s), which deserializes the string back into the tree.

For example, given the following Node class

class Node:
def __init__(self, val, left=None, right=None):
self.val = val
self.left = left
self.right = right


The following test should pass:

node = Node('root', Node('left', Node('left.left')), Node('right'))
assert deserialize(serialize(node)).left.left.val == 'left.left'


## Solution¶

Each solution requires two algorithms, serialize and deserialize.

First, let's pull in the Nuget packages for the solution.

In [1]:
// Easy way to work with any Nuget packages. Standard packages to load, regardless of solution.
Paket.Version [
"FSharp.Core", "4.5.4"
"FParsec", "1.0.4-RC3"
]
Paket.Package [
"Expecto"
"Xplot.Plotly"
"Newtonsoft.Json"
"Chiron"
]

#r "packages/Expecto/lib/net461/Expecto.dll"
#r "packages/FParsec/lib/portable-net45+win8+wp8+wpa81/FParsec.dll"
#r "packages/FParsec/lib/portable-net45+win8+wp8+wpa81/FParsecCS.dll"
#r "packages/Newtonsoft.Json/lib/net45/Newtonsoft.Json.dll"
#r "packages/Chiron/lib/net452/Chiron.dll"

/// Quick debugging
let tee f x = f x |> ignore; x


### Strategy Part 1 - Pick the correct data model¶

The strategy is to provide behavior in F# that is similar to Python (at least, as much as possible given the only example). Since val is a keyword in F#, using it as a field name required wrapping it in 2 back-ticks, aka val. Or renaming it to something like value.

The key here, in my mind, is to pick the appropriate data type to hold the tree. Some choices include:

1. Discriminated union
2. Record type
3. Class
In [2]:
open System

module Model =
// Need to define static members to allow Chiron to work
open Chiron
open Chiron.Operators

/// Tree as discriminated union or choice type
type DUNode =
| Node of string * DUNode option * DUNode option
with
static member create(v, l, r) = Node(v, l, r)
static member create(v, l) = DUNode.create(v, l, None)
static member createOpt(v, l, r) = DUNode.create(v, l, r) |> Some
static member createOpt(v, l) = DUNode.create(v, l, None) |> Some
static member createOpt(v) = DUNode.create(v, None, None) |> Some
member x.val = match x with Node(v, l, r) -> v
member x.left = match x with Node(_, l, _) -> l
member x.right = match x with Node(_, _, r) -> r
// Chiron functions
static member ToJson (Node(v, l, r)) = Json.write "dunode" (v, l, r)
static member FromJson (_:DUNode) = Json.read "dunode" |> Json.map Node

/// Tree as record type
type RNode = {
val: string
left: RNode option
right: RNode option
} with
// Convenience methods to construct the record
static member create(v, l, r) = {val=v; left=l; right=r}
static member createOpt(v, l, r) = RNode.create(v, l, r) |> Some
static member createOpt(v, l) = RNode.create(v, l, None) |> Some
static member createOpt(v) = RNode.create(v, None, None) |> Some
// Chiron functions
static member ToJson (node:RNode) =
json {
do! Json.write "rval" node.val

match node.left with
| Some(l) -> do! Json.write "rleft" l
| None -> do! Json.writeNone "rleft"

match node.right with
| Some(r) -> do! Json.write "rright" r
| None -> do! Json.writeNone "rright"
}
static member FromJson (_:RNode) =
json {
return RNode.create(v, l, r)
}

/// Tree as class type
type CNode(val:string, left:CNode option, right:CNode option) =
// Save the values from the constructor
member this.val = val
member this.left = left
member this.right = right
// Short-hand constructors, but as static members
static member create(v, l, r) = CNode(v, l, r)
static member createOpt(v, l, r) = CNode.create(v, l, r) |> Some
static member createOpt(v, l) = CNode.create(v, l, None) |> Some
static member createOpt(v) = CNode.create(v, None, None) |> Some
// Chiron functions
static member ToJson (node:CNode) =
json {
do! Json.write "cval" node.val

match node.left with
| Some(l) -> do! Json.write "cleft" l
| None -> do! Json.writeNone "cleft"

match node.right with
| Some(r) -> do! Json.write "cright" r
| None -> do! Json.writeNone "cright"
}
static member FromJson (_:CNode) =
json {
return CNode(v, l, r)
}
// Expecto equals comparison requires overrides for classes
// Using trick from https://stackoverflow.com/a/38932144
// More on tuple equality is at https://fsharpforfunandprofit.com/posts/tuples/
override x.GetHashCode() = hash(x.val, x.left, x.right)
override x.Equals(b) =
match b with
| :? CNode as c -> x.GetHashCode() = c.GetHashCode()
| _ -> false


### Strategy Part 2 - Figure out how to Serialize and Deserialize¶

The next question is this - how to serialize and deserialize the object into a string. The problem doesn't specify the output string's format - so it could be XML, JSON, etc. or something custom.

Here are three ideas:

1. Use an auto-JSON serializer and deserializer, like Json.NET.
2. Use a manual JSON serializer and deserializer, like Chiron.
3. Use a custom serializer and deserializer, using something like FParsec.

#### Automatic Serialization using Json.NET¶

First, we write the functions for Automatic Serialization using Json.NET.

In [3]:
/// Uses Json.NET
module AutoSerializer =
open Newtonsoft.Json
// Convenience methods
let deserialize<'a> str = JsonConvert.DeserializeObject<'a> str

// DUNode
let serializeDUNode (n:Model.DUNode) = JsonConvert.SerializeObject n
let deserializeDUNode = deserialize<Model.DUNode>

// RNode
let serializeRNode (n:Model.RNode) = JsonConvert.SerializeObject n
let deserializeRNode = deserialize<Model.RNode>

// CNode
let serializeCNode (n:Model.CNode) = JsonConvert.SerializeObject n
let deserializeCNode = deserialize<Model.CNode>


#### Manual Serialization using Chiron¶

Second, we have the functions for Manual Serialization using Chiron. Please note that to use Chiron, the static functions ToJson and FromJson were added to each of the types (see above).

In [4]:
/// Uses Chiron
module ManualSerializer =
open Chiron

// DUNode
let serializeDUNode (n:Model.DUNode) =
Json.serialize n
|> Json.formatWith Chiron.Formatting.JsonFormattingOptions.Compact
let deserializeDUNode str : Model.DUNode = str |> Json.parse |> Json.deserialize

// RNode
let serializeRNode (n:Model.RNode) =
Json.serialize n
|> Json.formatWith Chiron.Formatting.JsonFormattingOptions.Compact
let deserializeRNode str : Model.RNode = str |> Json.parse |> Json.deserialize

// CNode
let serializeCNode (n:Model.CNode) =
Json.serialize n
|> Json.formatWith Chiron.Formatting.JsonFormattingOptions.Compact
let deserializeCNode str : Model.CNode = str |> Json.parse |> Json.deserialize


#### Custom Serialization using FParsec¶

Finally, we have the functions for Custom Serialization using FParsec. The strategy here is to serialize nodes with values as Some("val") Child1 Child2 or as None. We will also be using a pre-order traversal of the tree, i.e. process the parent node, then the left child, and finally the right child.

In [5]:
module CustomSerializer =
/// Pre-order traversal for serialization, not tail-recursive.
/// I am using SRTPs here to avoid having to write identical code for the 3 data types.
/// Advantage of SRTP over higher order function here is that the record's field accessor
/// matches the member constraint - not sure how to do that with functions without
/// having to declare a custom member function for each of the record's fields.
let inline serializeNode (node:^a) =
let rec helper (node:^a) =
let lans =
match (^a:(member left : ^a option) (node)) with
| Some l -> helper l
| None -> "None"
let rans =
match (^a:(member right : ^a option) (node)) with
| Some r -> helper r
| None -> "None"
sprintf "Some(\"%s\") %s %s" (^a:(member val : string) (node)) lans rans
helper node

// Why not go all out?! This is my first time trying FParsec.  FParsec will be used
// to deserialize the string back into the appropriate type.
open FParsec

/// Pre-order traversal for deserialization
/// Here, using a higher-order function to avoid SRTP.
let inline deserializeNode (ctor:string * 'a option * 'a option -> 'a) (str:string) : 'a =
// Basic parser based on our serialization strategy
// We don't allow double-quotes inside strings - KISS
let properString =
pipe3 (pstring "\"") (manySatisfy (fun x -> x <> '\"')) (pstring "\"")
(fun x y z -> x + y + z)
let someValue = spaces >>. pstring "Some(" >>. properString .>> (pstring ")") .>> spaces
let noneValue = spaces >>. pstring "None" .>> spaces
let fullParser = many (someValue <|> noneValue)

/// Parser will tokenize the string and extract the values we care about
/// If it is a real value, it will retain the quotes as part of the string: "value"
/// If it is not a value, it will look like: None
/// This is important because the value itself could be the string "None", and we
/// must distinguish this from an actual non-value.
let tokens =
run fullParser str
|> function
| Success(res,_,_) -> res
| Failure(err,_,_) -> failwith err

/// Recursive deserializer, not tail recursive. Returns the deserialized value
/// and the remaining tokens.
let rec deser (remTokens:string list) : 'a option * string list =
match remTokens with
| hd::tl when hd.StartsWith("N") ->
None, tl
| hd::tl when hd.StartsWith("\"") ->
// Remove the double-quotes from the value
let v = hd.[1 .. String.length hd - 2]
let ln, remAfterLeft = deser tl
let rn, remAfterRight = deser remAfterLeft
Some (ctor (v, ln, rn)), remAfterRight
| _ -> failwith "Tree structure is corrupted"

// Call the deserializer, then return the data structure
// NOTE: Option.get should never fail because, if you start from a tree object
// instead of some contrived string representation, it is impossible to
// create a tree without a root.
deser tokens
|> fst
|> Option.get

// DUNode
let serializeDUNode (n:Model.DUNode) = serializeNode n
let deserializeDUNode str = deserializeNode (Model.DUNode.create) str

// RNode
let serializeRNode (n:Model.RNode) = serializeNode n
let deserializeRNode str = deserializeNode (Model.RNode.create) str

// CNode
let serializeCNode (n:Model.CNode) = serializeNode n
let deserializeCNode str = deserializeNode (Model.CNode.create) str

Model.DUNode.create("root", Model.DUNode.createOpt("1 left"))
|> tee (printfn "I start as a DUNode:%s%O" Environment.NewLine)
|> CustomSerializer.serializeDUNode
|> tee (printfn "serialized: %O")
|> CustomSerializer.deserializeDUNode
|> tee (printfn "Back to a DUNode:%s%O" Environment.NewLine)
|> CustomSerializer.serializeDUNode
|> tee (printfn "serialized: %O")
|> CustomSerializer.deserializeRNode
|> tee (printfn "Now I'm a RNode:%s%O" Environment.NewLine)
|> CustomSerializer.serializeRNode
|> tee (printfn "serialized: %O")
|> CustomSerializer.deserializeCNode
|> tee (printfn "And now I'm a CNode:%s%O" Environment.NewLine)

Out[5]:
FSI_0009+Model+CNode
I start as a DUNode:
Node ("root",Some (Node ("1 left",None,None)),None)
serialized: Some("root") Some("1 left") None None None
Back to a DUNode:
Node ("root",Some (Node ("1 left",None,None)),None)
serialized: Some("root") Some("1 left") None None None
Now I'm a RNode:
{val = "root";
left = Some {val = "1 left";
left = None;
right = None;};
right = None;}
serialized: Some("root") Some("1 left") None None None
And now I'm a CNode:
FSI_0009+Model+CNode


### Strategy Part 3 - Make sure the De-/Serialization strategies work as expected¶

Now that we have the 3 sets of de-/serializers written for each of the 3 data types, let's write some basic unit tests to ensure that the functions are behaving as expected.

NOTE: I cannot seem to get FsCheck to run in IfSharp / Jupyter. If anyone has ideas on how to make it work, please let me know!

First, let's setup some common parameters.

In [6]:
// For random tree generation
let random = Random((int) DateTime.Now.Ticks &&& 0x0000FFFF)
let mutable randLock = Object()

/// Convenience method to generate random numbers safely.
let getNextRandLim (min, max) = lock randLock (fun () -> random.Next (min, max))
let getNextRand () = lock randLock (fun () -> random.Next ())

module Test =
// Common parameters for the tests
let numTests = 150 // How many test to run to check results
let minDepth = 5 // Some complexity
let maxDepth = 10 // To keep runtimes reasonable
let percentNones = 10 // 10% of the time (1 out of 10), we can get a None instead of a Some
// Allows for unbalanced trees
let minValueLength = 10 // Minimum length of val
let maxValueLength = 20 // Maximum length of val
let rootValue = "root" // val for root node.
let nameGen() : string = // Generates a val for the node
let len = getNextRandLim(minValueLength, maxValueLength)
seq {
for i in 1 .. len do
let isCapital = getNextRand() % 2
if isCapital = 0 then yield (getNextRandLim(65, 90) |> char |> string)
else yield (getNextRandLim(97, 122) |> char |> string)
}
|> Seq.fold (fun acc e -> acc + e) ""

// Generate test cases, limiting depth and complexity
let CustomTreeGenerator (ctor:string * 'a option * 'a option -> 'a option) : 'a =
let rec helper depth =
match depth with
| 0 -> None
| n when n>0 ->
let choice = getNextRand() % percentNones // None or Some
match choice with
| 0 -> None
| _ -> ctor(nameGen(), helper <| depth - 1, helper <| depth - 1)
| _ -> invalidArg "depth" (sprintf "Depth is %d but must be >= 0" depth)
ctor(rootValue,
helper <| getNextRandLim(minDepth, maxDepth),
helper <| getNextRandLim(minDepth, maxDepth))
|> Option.get


#### Equality Tests¶

Let's ensure that our equality tests are working correctly, especially for CNodes because we had to override Equals and GetHashCode.

In [7]:
module EqualityTest =
open Expecto

type DU = Model.DUNode
type R = Model.RNode
type C = Model.CNode

let tests =
testList "Equality tests" <|
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm a DUNode tree is equal to itself %i" i) {
let tree = Test.CustomTreeGenerator DU.createOpt
Expect.equal tree tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm a RNode tree is equal to itself %i" i) {
let tree = Test.CustomTreeGenerator R.createOpt
Expect.equal tree tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm a CNode tree is equal to itself %i" i) {
let tree = Test.CustomTreeGenerator C.createOpt
Expect.equal tree tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm 2 different DUNode trees are not equal %i" i) {
let tree =
Test.CustomTreeGenerator DU.createOpt
|> fun (DU.Node(v, l, r)) -> DU.create("1", l, r)
let tree2 =
Test.CustomTreeGenerator DU.createOpt
|> fun (DU.Node(v, l, r)) -> DU.create("2", l, r)
Expect.notEqual tree tree2 (sprintf "%O %O" tree tree2)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm 2 different RNode trees are not equal %i" i) {
let tree =
Test.CustomTreeGenerator R.createOpt
|> fun t -> { t with val = "1" }
let tree2 =
Test.CustomTreeGenerator R.createOpt
|> fun t -> { t with val = "2" }
Expect.notEqual tree tree2 (sprintf "%O %O" tree tree2)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm 2 different CNode trees are not equal %i" i) {
let tree =
Test.CustomTreeGenerator C.createOpt
|> fun t -> C.create("1", t.left, t.right)
let tree2 =
Test.CustomTreeGenerator C.createOpt
|> fun t -> C.create("2", t.left, t.right)
Expect.notEqual tree tree2 (sprintf "%O %O" tree tree2)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm 2 similar DUNode trees are not equal %i" i) {
let tr = Test.CustomTreeGenerator DU.createOpt
let tree = tr |> fun (DU.Node(v, l, r)) -> DU.create("1", l, r)
let tree2 = tr |> fun (DU.Node(v, l, r)) -> DU.create("2", l, r)
Expect.notEqual tree tree2 (sprintf "%O %O" tree tree2)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm 2 similar RNode trees are not equal %i" i) {
let tr = Test.CustomTreeGenerator R.createOpt
let tree = tr |> fun t -> { t with val = "1" }
let tree2 = tr |> fun t -> { t with val = "2" }
Expect.notEqual tree tree2 (sprintf "%O %O" tree tree2)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "Confirm 2 similar CNode trees are not equal %i" i) {
let tr = Test.CustomTreeGenerator C.createOpt
let tree = tr |> fun t -> C.create("1", t.left, t.right)
let tree2 = tr |> fun t -> C.create("2", t.left, t.right)
Expect.notEqual tree tree2 (sprintf "%O %O" tree tree2)
}
]

runTests { defaultConfig with parallel = true } tests

[03:47:23 INF] EXPECTO? Running tests... <Expecto>
[03:48:46 INF] EXPECTO! 1,350 tests run in 00:01:23.7584329 for Equality tests – 1,350 passed, 0 ignored, 0 failed, 0 errored. Success! <Expecto>


#### Automatic Serialization using Json.NET¶

For our tests, we need to ensure that the 1 original test case works, and that random test cases also work correctly.

In [8]:
module AutoSerializerTest =
open Expecto
open AutoSerializer

type DU = Model.DUNode
type R = Model.RNode
type C = Model.CNode

/// Basic brute-force, randomized tests to ensure the logic works correctly
let tests =
testList "Auto Serializer tests" <|
test "DU provided tests" {
let tree =
DU.create("root", DU.createOpt("left", DU.createOpt("left.left")), DU.createOpt("right"))
Expect.equal (tree |> serializeDUNode |> deserializeDUNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
test "R provided tests" {
let tree =
R.create("root", R.createOpt("left", R.createOpt("left.left")), R.createOpt("right"))
Expect.equal (tree |> serializeRNode |> deserializeRNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
test "C provided tests" {
let tree =
C.create("root", C.createOpt("left", C.createOpt("left.left")), C.createOpt("right"))
Expect.equal (tree |> serializeCNode |> deserializeCNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
[ for i in 1 .. Test.numTests do
yield test (sprintf "DUNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator DU.createOpt
Expect.equal (tree |> serializeDUNode |> deserializeDUNode) tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "RNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator R.createOpt
Expect.equal (tree |> serializeRNode |> deserializeRNode) tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "CNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator C.createOpt
Expect.equal (tree |> serializeCNode |> deserializeCNode) tree (sprintf "%O" tree)
}
]

runTests { defaultConfig with parallel = true } tests

[03:48:47 INF] EXPECTO? Running tests... <Expecto>
[03:49:10 INF] EXPECTO! 453 tests run in 00:00:23.9313535 for Auto Serializer tests – 453 passed, 0 ignored, 0 failed, 0 errored. Success! <Expecto>


#### Manual Serialization using Chiron¶

In [9]:
module ManualSerializerTest =
open Expecto
open ManualSerializer

type DU = Model.DUNode
type R = Model.RNode
type C = Model.CNode

/// Basic brute-force, randomized tests to ensure the logic works correctly
let tests =
testList "Auto Serializer tests" <|
test "DU provided tests" {
let tree =
DU.create("root", DU.createOpt("left", DU.createOpt("left.left")), DU.createOpt("right"))
Expect.equal (tree |> serializeDUNode |> deserializeDUNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
test "R provided tests" {
let tree =
R.create("root", R.createOpt("left", R.createOpt("left.left")), R.createOpt("right"))
Expect.equal (tree |> serializeRNode |> deserializeRNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
test "C provided tests" {
let tree =
C.create("root", C.createOpt("left", C.createOpt("left.left")), C.createOpt("right"))
Expect.equal (tree |> serializeCNode |> deserializeCNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
[ for i in 1 .. Test.numTests do
yield test (sprintf "DUNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator DU.createOpt
Expect.equal (tree |> serializeDUNode |> deserializeDUNode) tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "RNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator R.createOpt
Expect.equal (tree |> serializeRNode |> deserializeRNode) tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "CNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator C.createOpt
Expect.equal (tree |> serializeCNode |> deserializeCNode) tree (sprintf "%O" tree)
}
]

runTests { defaultConfig with parallel = true } tests

[03:49:11 INF] EXPECTO? Running tests... <Expecto>
[03:49:29 INF] EXPECTO! 453 tests run in 00:00:18.1177119 for Auto Serializer tests – 453 passed, 0 ignored, 0 failed, 0 errored. Success! <Expecto>


#### Custom Serialization using FParsec¶

In [10]:
module CustomSerializerTest =
open Expecto
open CustomSerializer

type DU = Model.DUNode
type R = Model.RNode
type C = Model.CNode

/// Basic brute-force, randomized tests to ensure the logic works correctly
let tests =
testList "Custom Serializer tests" <|
test "DU provided tests" {
let tree =
DU.create("root", DU.createOpt("left", DU.createOpt("left.left")), DU.createOpt("right"))
Expect.equal (tree |> serializeDUNode |> deserializeDUNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
test "R provided tests" {
let tree =
R.create("root", R.createOpt("left", R.createOpt("left.left")), R.createOpt("right"))
Expect.equal (tree |> serializeRNode |> deserializeRNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
test "C provided tests" {
let tree =
C.create("root", C.createOpt("left", C.createOpt("left.left")), C.createOpt("right"))
Expect.equal (tree |> serializeCNode |> deserializeCNode) tree (sprintf "%O" tree)
Expect.equal "left.left" tree.left.Value.left.Value.val "tree's left.left = left.left"
} ::
[ for i in 1 .. Test.numTests do
yield test (sprintf "DUNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator DU.createOpt
Expect.equal (tree |> serializeDUNode |> deserializeDUNode) tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "RNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator R.createOpt
Expect.equal (tree |> serializeRNode |> deserializeRNode) tree (sprintf "%O" tree)
}
] @
[ for i in 1 .. Test.numTests do
yield test (sprintf "CNode generated tests %i" i) {
let tree = Test.CustomTreeGenerator C.createOpt
Expect.equal (tree |> serializeCNode |> deserializeCNode) tree (sprintf "%O" tree)
}
]

runTests { defaultConfig with parallel = true } tests

[03:49:29 INF] EXPECTO? Running tests... <Expecto>
[03:49:47 INF] EXPECTO! 453 tests run in 00:00:17.7770251 for Custom Serializer tests – 453 passed, 0 ignored, 0 failed, 0 errored. Success! <Expecto>


Excellent, our logic works!

Now that we have it working, let's try some basic performance tests.

## Performance Testing¶

Here is the strategy for performance testing.

Using random, pre-generated trees, test the following:

1. Serialize trees (e.g. 1,000 of each model) using each of the 3 serialization methods.
2. De-serialize those trees using each of the 3 deserialization methods.

Hopefully, this will give us an indication of which combination of de-/serialization method and data structure is the "fastest".

### Setup the test data¶

The first step is to create the trees that will be used for serialization and de-serialization.

In [11]:
module PerformanceTestData =
type DU = Model.DUNode
type R = Model.RNode
type C = Model.CNode

// How many trees to generate
let dataStructureCount = 2000

// Generate the trees
let DUTrees = List.init dataStructureCount (fun _ -> Test.CustomTreeGenerator DU.createOpt)
let RTrees = List.init dataStructureCount (fun _ -> Test.CustomTreeGenerator R.createOpt)
let CTrees = List.init dataStructureCount (fun _ -> Test.CustomTreeGenerator C.createOpt)

// To measure each execution
let stopWatch = System.Diagnostics.Stopwatch()

// The rows and columns of the result table.
let rows = [
"Automatic"
"Manual"
"Custom"
]
let columns = ["Discriminated Union"; "Record"; "Class"]
let cornerText = [@"Algorithm vs. Data Structure"];

// The algorithms, for "somewhat easy" testing
let duSerializationAlgorithms =
[ AutoSerializer.serializeDUNode
ManualSerializer.serializeDUNode
CustomSerializer.serializeDUNode ]
let rSerializationAlgorithms =
[ AutoSerializer.serializeRNode
ManualSerializer.serializeRNode
CustomSerializer.serializeRNode ]
let cSerializationAlgorithms =
[ AutoSerializer.serializeCNode
ManualSerializer.serializeCNode
CustomSerializer.serializeCNode ]

let duDeserializationAlgorithms =
[ AutoSerializer.deserializeDUNode
ManualSerializer.deserializeDUNode
CustomSerializer.deserializeDUNode ]
let rDeserializationAlgorithms =
[ AutoSerializer.deserializeRNode
ManualSerializer.deserializeRNode
CustomSerializer.deserializeRNode ]
let cDeserializationAlgorithms =
[ AutoSerializer.deserializeCNode
ManualSerializer.deserializeCNode
CustomSerializer.deserializeCNode ]


### Run the Performance Tests¶

In [12]:
open PerformanceTestData

/// Returns an average measurement of 3 consecutive runs.
let measurement numRuns alg lst =
stopWatch.Reset()
stopWatch.Start()
List.init numRuns (fun i -> List.map alg lst)
|> ignore
stopWatch.Stop()

TimeSpan.FromTicks(stopWatch.ElapsedTicks / (int64 numRuns)).Ticks

In [13]:
/// Collect the serialization results.
let serializationResults =
[ yield
duSerializationAlgorithms
|> List.map (fun a -> measurement 3 a DUTrees)
|> List.zip rows ] @
[ yield
rSerializationAlgorithms
|> List.map (fun a -> measurement 3 a RTrees)
|> List.zip rows ] @
[ yield
cSerializationAlgorithms
|> List.map (fun a -> measurement 3 a CTrees)
|> List.zip rows ]

In [14]:
/// Collect the deserialization results.
let deserializationResults =
[ let ser = List.map (fun a -> List.map a DUTrees) duSerializationAlgorithms
yield
duDeserializationAlgorithms
|> List.mapi (fun i a -> measurement 3 a (List.item i ser))
|> List.zip rows ] @
[ let ser = List.map (fun a -> List.map a RTrees) rSerializationAlgorithms
yield
rDeserializationAlgorithms
|> List.mapi (fun i a -> measurement 3 a (List.item i ser))
|> List.zip rows ] @
[ let ser = List.map (fun a -> List.map a CTrees) cSerializationAlgorithms
yield
cDeserializationAlgorithms
|> List.mapi (fun i a -> measurement 3 a (List.item i ser))
|> List.zip rows ]


### Plot the Performance Tests¶

And finally, we can plot the results in a simple table.

#### Serialization Results¶

As a data table, showing as hh:mm:ss.ms.

In [15]:
open XPlot.GoogleCharts

serializationResults
|> List.map (fun l -> List.map (fun (x, y) -> (x, TimeSpan.FromTicks(y).ToString("c"))) l)
|> Chart.Table
|> Chart.WithOptions (Options(minColor="green", midColor="yellow", maxColor="red", showRowNumber=true))
|> Chart.WithLabels (cornerText @ columns)

Out[15]:

And as a chart, showing ticks.

In [16]:
serializationResults
|> Chart.Bar
|> Chart.WithOptions (Options())
|> Chart.WithLabels (cornerText @ columns)

Out[16]:

#### Deserialization Results¶

Again as a table, with the hh:mm:ss.ms format.

In [17]:
deserializationResults
|> List.map (fun l -> List.map (fun (x, y) -> (x, TimeSpan.FromTicks(y).ToString("c"))) l)
|> Chart.Table
|> Chart.WithOptions (Options(minColor="green", midColor="yellow", maxColor="red", showRowNumber=true))
|> Chart.WithLabels (cornerText @ columns)

Out[17]:

And as a chart, showing ticks.

In [18]:
deserializationResults
|> Chart.Bar
|> Chart.WithOptions (Options())
|> Chart.WithLabels (cornerText @ columns)

Out[18]:

## Conclusion¶

The results are very interesting. From a CPU perspective, FParsec wins hands down regardless of the chosen data structure. There is an order of magnitude difference between the three serialization techniques.

From an SLOC perspective, Json.NET is the clear winner. Serialization and deserialization are both 1 line, each. FParsec is the most complex, requiring the development of a custom de-/serialization format, then ensuring that all 3 data structures can be transformed to that format.

Chiron stays right in the middle, both in terms of performance and SLOC.

The one advantage that both Chiron and FParsec can provide is that, when created the right way, the serialized string can be deserialized into any of the 3 data structures (discriminated union, record, or class). The FParsec implementation above works that way.

Unfortunately, I do not know how to measure memory usage from within IfSharp / Jupyter. So, I can't really speak to that.

From a data structure perspective, records seems to provide very good performance. I was surprised that records performed even better than classes, in general. However, it appears that Json.NET had issues with both discriminated unions and classes, during deserialization and serialization, respectively.

Personally, I am not sure that creating a custom de-/serialization format (aka, going down the FParsec route) is worth the effort, especially given all the edge cases that must be handled. For example, my FParsec implementation cannot handle double-quotes within strings and, I believe (though I didn't test this), cannot handle strings with newline characters in them.

Json.NET and Chiron both use JSON, which is an extremely popular format and enjoys widespread support in development tooling. I believe this is a better route, especially because using such a common, human-readable format would make debugging easier. Maintainability is extremely important and something that is often overlooked.

After testing the system with JSON, it can be shifted to a binary format such as MessagePack, BSON, or Protocol Buffers, among others, if size becomes a concern. Most of these formats have well-tested libraries available in multiple languages.