Skip to main content

Identifying and Processing Flexible Structured Data

In writing a card game history parser, I came across an interesting problem. At first it seemed like it might be mostly straightforward, as these types of problems often do. When you consider a game like Texas Hold’em, there are several rounds of betting, referred to as streets. These streets as you expect are sequential. Preflop always leads to the flop, the flop always leads to the turn, the turn always leads to the river and if more than one player got that far, the player with the strongest hand takes the money. The problem I ran into is in the ‘if’ part of that last sentence. When a sufficient number of players during any part of the hand folds such that there is only 1 remaining, the hand is then short-circuited to the summary, which states some brief information about what happened. When you consider the possible combinations here:

Preflop -> Summary
Preflop -> Flop -> Summary
Preflop -> Flop -> Turn -> Summary
Preflop -> Flop -> Turn -> River -> Summary
Preflop -> Flop -> Turn -> River -> Showdown -> Summary

Each of these stages has an identifying string, so it’s easy to check where they begin. The first approach I took to this was not particularly efficient, and excessively complicated. In this implementation the following happened (warning, much code ahead):

 let GetLinesBetweenMatches (input: string list, startMatch: string, endMatch: string) =  
  let indexOfStart = (input.TakeWhile (fun x -> Regex.Matches(x, startMatch).Count = 0)).Count()  
  let indexOfEnd = (input.TakeWhile (fun x -> Regex.Matches(x, endMatch).Count = 0)).Count()  
  let numberOfLines = indexOfEnd - indexOfStart  
  if startMatch.Length = 0 then  
  input.Skip(indexOfStart + 1).Take(numberOfLines - 1)  
 let flopExists = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sFLOP\s\*\*\*").Count = 0) input) = false  
 let turnExists = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sTURN\s\*\*\*").Count = 0) input) = false  
 let riverExists = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sRIVER\s\*\*\*").Count = 0) input) = false  
 let gotToShowdown = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sSHOW\sDOWN\s\*\*\*").Count = 0) input) = false  
 // Process the preflop action  
 let preflopStrings = if flopExists then GetLinesBetweenMatches (input, "Dealt to", "\*\*\*\sFLOP\s\*\*\*")  
    else if gotToShowdown then GetLinesBetweenMatches (input, "Dealt to", "\*\*\*\sSHOW\sDOWN\s\*\*\*")  
    else GetLinesBetweenMatches (input, "Dealt to", "\*\*\*\sSUMMARY\s\*\*\*")  
 let pfact = [for x in (ProcessStreet (Seq.toList preflopStrings, players)) -> x] |> (fun x -> x)  
 let preflop = { actions = pfact.ToList(); streetType = Streets.Preflop }  
 // Process the flop action  
 let flopStrings = if turnExists then GetLinesBetweenMatches (input, "\*\*\*\sFLOP\s\*\*\*", "\*\*\*\sTURN\s\*\*\*")  
    else if gotToShowdown then GetLinesBetweenMatches (input, "\*\*\*\sFLOP\s\*\*\*", "\*\*\*\sSHOW\sDOWN\s\*\*\*")  
    else GetLinesBetweenMatches (input, "\*\*\*\sFLOP\s\*\*\*", "\*\*\*\sSUMMARY\s\*\*\*")  
 let flact = [for x in (ProcessStreet (Seq.toList flopStrings, players)) -> x] |> (fun x -> x)  
 let flop = if flopExists then Some({ actions = flact.ToList(); streetType = Streets.Flop })  
    else None  
 if flop = None then  
  if gotToShowdown then  
  { metaData = metaData; players = players.ToList(); streets = ([preflop; showdown.Value]).ToList() }  
  { metaData = metaData; players = players.ToList(); streets = ([preflop]).ToList() }  
 else if turn = None then  
  if gotToShowdown then   
   { metaData = metaData; players = players.ToList(); streets = ([preflop; flop.Value; showdown.Value]).ToList() }  
   { metaData = metaData; players = players.ToList(); streets = ([preflop; flop.Value]).ToList() }  
 /// ...skipping the turn and river parsing and processing code, you get the idea that it's a branching nightmare and it would be at least double this length.  

As you can see, it’s complicated, inelegant and difficult to modify. I was aware of this at the time, but wanted to sketch something out so that I understood the problem much better. I considered what the better approach would be for a little while, as it was evident that I was trying to solve the problem in an object oriented manner, when it wasn’t an object oriented problem. I mostly wrote this up to demonstrate the difference a small ‘a-ha!’ moment can make, and probably the first time I’d come to a much better answer through the use of this new paradigm.
The improved code is shown below:

 /// Maps a street identifier to each action.  
 let MapHand (input: List<string>) =  
  let street = ref Streets.MetaData  
  seq { for line in input do  
  if Regex.Matches(line, "Seat \d: (?<name>.+) \(\$(?<stack>\d+(\.\d{2})?) in chips\)").Count > 0 then street := Streets.Players  
  else if Regex.Matches(line, "posts small blind").Count > 0 then street := Streets.Blinds  
  else if !street = Streets.Hero then street := Streets.Preflop  
  else if Regex.Matches(line, "Dealt to (?<hero>.+)\s\[(?<leftcard>.{2})\s(?<rightcard>.{2})\]").Count > 0 then street := Streets.Hero  
  else if Regex.Matches(line, "\*\*\*\sFLOP\s\*\*\*").Count > 0 then street := Streets.Flop  
  else if Regex.Matches(line, "\*\*\*\sTURN\s\*\*\*").Count > 0 then street := Streets.Turn  
  else if Regex.Matches(line, "\*\*\*\sRIVER\s\*\*\*").Count > 0 then street := Streets.River  
  else if Regex.Matches(line, "\*\*\*\sSHOW\sDOWN\s\*\*\*").Count > 0 then street := Streets.ShowDown  
  else if Regex.Matches(line, "\*\*\*\sSUMMARY\s\*\*\*").Count > 0 then street := Streets.Summary  
  yield (!street, line) } |> Seq.toList  
 let ProcessHand (input: (Streets * string) list) =  
  let FilterByStreet lines filterStreet =  
  lines |> List.filter (fun (street, _) -> street = filterStreet)  
  |> (fun (_, action) -> action)  
  let result = ProcessResult (FilterByStreet input Streets.Summary)  
  let filtered = input |> List.filter (fun (_, action) -> FilterAction action)  
  let metaData = ProcessMetaData (FilterByStreet filtered Streets.MetaData)  
  let players = GetPlayers (FilterByStreet filtered Streets.Players)  
  let hero = GetHero ((FilterByStreet filtered Streets.Hero), players)  
  let blinds = { actions = ProcessStreet ((FilterByStreet filtered Streets.Blinds), players); streetType = Streets.Blinds }  
  let preflop = { actions = ProcessStreet ((FilterByStreet filtered Streets.Preflop), players); streetType = Streets.Preflop }  
  let flop = { actions = ProcessStreet ((FilterByStreet filtered Streets.Flop), players); streetType = Streets.Flop }  
  let turn = { actions = ProcessStreet ((FilterByStreet filtered Streets.Turn), players); streetType = Streets.Turn }  
  let river = { actions = ProcessStreet ((FilterByStreet filtered Streets.River), players); streetType = Streets.River }  
  let showdown = { actions = ProcessStreet ((FilterByStreet filtered Streets.ShowDown), players); streetType = Streets.ShowDown }  
  let summary = { actions = ProcessSummary ((FilterByStreet filtered Streets.Summary), players); streetType = Streets.Summary }  
  let streets = [blinds; preflop; flop; turn; river; showdown; summary] |> List.filter (fun x -> x.actions.Length <> 0)  
  { metaData = metaData; players = players; streets = streets; result = result }  
 /// ...back into the main processing function.  
 seq { for x in lines do yield MapHand x} |> Seq.toList  
       |> (fun hand -> try Some (ProcessHand hand) with | InvalidHand(_) -> None)  
       |> List.filter (fun processedHand -> processedHand.IsSome)  
       |> (fun hand -> hand.Value)  

In this solution, there’s a value that states the street, each street is paired to a regular expression string. It iterates through the lines of the history and yields a tuple of the line and the street. This sequence of tuples is then given to the process hand function which the code for is in the snippet above which creates values for all streets. If those streets don’t exist, the ProcessStreet function will give us an empty list, and all of the streets are added into a single list, then filtered for empty lists. It’s not purely functional, as there is a mutable variable used in processing the file, but with a recursive function the mutable value wouldn't be necessary anyway, as we’d just give the next use of the function the new value we want to use instead.
In this solution, the readability has improved and so has the extensibility, if a new street is invented in some sort of variant, it’s very simple to include without any branching.
If I’d written a solution to the above problem in C#, I’d imagine I’d have come to a similar solution eventually, but it was the awareness of the functional concepts of map and filter that were important here, and steered me back to describing what I wanted instead of writing the steps to get it myself.

Edit: I'm guessing I'm going to need to write an F# add-on to that syntax highlighter.


Popular posts from this blog

Constructing a Trie in F#

This post might get a bit more context next week, but essentially, in continuation of looking at data structures in F# and C# I picked prefix trees (tries) as the next step. A trie is similar to a BST except the search is cumulative. A simple example would be building a spellchecker. The spell checker can use the structure to see if a word exists quickly and efficiently.

It works by creating a new node for each part of the word, using existing nodes if they're already in place. You can see in this diagram an example structure after an insertion of several words.

In the above diagram the words ANT, AND, BATS have been added into the data structure. You'll notice that the word BATS can also be the word BAT and that becomes a part of the structure, setting a value at each node to state if this is a place that a word terminates.

1 2 3 4// Node type for storing a trie. typeTrieNode=|Nodeofchar*TrieNodelist*bool|RootofTrieNodelist
I defined the type to have 2 options. The root node,…

A Docker Experiment

Containers are a topic that have been rising in occurrence for me, for the last year or so. From using them as part of the architecture at work or for various pet projects friends have been working on. I figured it was time to experiment myself and get to grips with what people have been talking about. It seemed like a good idea to find some introductory material that would give me an overview of its uses without delving too far into the details so I could see what it actually was, so I found a PluralSight course about Docker, specifically “Docker and Containers:The Big Picture”. This gave a nice overview of what Docker is, and more importantly what it was trying to achieve and how to use it. With a bit more of an understanding, I wanted to use it for something, preferably something familiar. I decided to try setting up ElasticSearch and Kibana containers, where Kibana would visualize the ElasticSearch data. I used bits of this article along the way as a guide, if you'd prefer a more…

Self-Producing an EP Pt.5: Mixing

After some discussion we opted to mix it ourselves for a few reasons.

It's our first EP, it's not going to be heard by a huge amount of people and while it's important that it sounds good, it's expensive to get done properly and we might be able to manage 'good enough' on our own with a lot of work.Mixing as an online service seems to be the main way to go for small/starting out bands and that seems too creatively detached. There are many stories floating around of people who've sent things off for mixing and the mixer has a much different idea of how it sounds.
Maybe it's not the best choice, but it seemed an acceptable risk that we would at least attempt to have it 'look' how we want and it be a bit wonky than potentially end up with something we're not happy with in a different way. The benefit would have been being able to just ship the stems off and make it someone else's problem instead of weeks of being unsure while swimming in the…