Monday, 22 April 2013

C# Wish List: yield for Collections

I remembered a new C# feature I want today. If you're familiar with 'yield', it's a great keyword in C# used in iterator blocks. eg:

 // Ignoring that there would be much simpler ways to do this anyway.  
 public static IEnumerable<int> GetEvens(IEnumerable<int> numbers)  
   foreach (var number in numbers)  
     if (number % 2 == 0)  
       yield return number;  

Using yield will also result in lazy evaluation as demonstrated here, when using the method above:

 static void Main(string[] args)  
   var list = Enumerable.Range(1, 10).ToList();  
   var output = GetEvens(list);  
   foreach (var item in output)  

Which prints:


Which is an awesome feature. However, I often find cases where I'd like to yield the rest of a collection. For instance, in the Combine method of a MergeSort I was implementing, I had this block from where I wanted to get the rest of the right collection (because the left collection was now empty):

 if (leftIndex == left.Count)  
   foreach (var x in right.Skip(rightIndex))  
     yield return x;  

But it would be much nicer if I could yield the collection. In F# you can yield a collection with yield! eg:

 let FooOneToTen =   
   seq { yield! [1..10] }  

Which I suppose in C# would be more like this:

 if (leftIndex == left.Count)  
   yield! return right.Skip(rightIndex);  

That would improve readability, but would understandably be a very low priority feature.

Sunday, 21 April 2013

Different Problem, Same Effect

** Warning: Guitar post **
A couple of years ago I decided I wanted the pickups changed in my Ibanez RG350DX guitar, to save time and money I figured I'd do it myself. Swapping pickups, in general, should be pretty easy. It just requires un-soldering the existing connections, unscrewing the pickup rings, and doing the reverse with the new ones. The first time I tried this, the soldering iron didn't get hot enough, and as I was replacing 2 pickups from a 3 pickup configuration I got stuck. I relented and took that one into a local guitar shop and let them sort it.

Fast-forward to now, I bought a new electric a few months ago, and while really happy with it, thought it could benefit from the same pickup change the Ibanez got because of some subtleties in the high-end. I remembered having to take it into the shop and thought "not this time". When they eventually arrived (and that's another story entirely), I got to work on replacing them. I took some pictures of the existing wiring:

And went out and found a better soldering iron. I got the cover off the back and un-soldered the bridge pickup successfully. Time to get the pickup out from the front with the... oh... that screw on the pickup guard looks different... I don't think I have that size. Foiled again, but at least only temporarily.

Edit: I found a metal nail file that worked excellently as a screwdriver. Mission accomplished:

Wednesday, 17 April 2013

Irregular Expressions

To parse the input of my hand history parser, I thought about a couple of possible options for reading the text. Given the dynamic nature of the file format, from my limited use of regular expressions up to that point, I decided they would be the best tool for the job and would also give me an opportunity to learn about them a bit. You can start as simply as just checking a string will or won’t match:

 bool match = Regex.IsMatch("exampletext", ".+");  

But when you have lines of text that contain multiple sections of information you will have to stray into grouped matching. Let’s say a line contains some summary information, that a player has won some amount of money at the end of a hand and as such their cards are displayed. Eg.

 "Player1 won $1.90 with Ah Jh"  

Grouped matching allows you to take all the useful information at once:


Using that string we can now get the useful information out:

 string input = "Player1 won $1.90 with Ah Jh";  
 string matchString = @"(?<player>[\w\s\.]+)\swon\s\$(?<amount>\d+\.\d{2})\swith\s(?<leftcard>.{2})\s(?<rightcard>.{2})";  
 var matches = Regex.Matches(input, matchString);  
 string playerName = matches[0].Groups["player"].Value;  
 float amount = Convert.ToSingle(matches[0].Groups["amount"].Value);  
 string leftCard = matches[0].Groups["leftcard"].Value;  
 string rightCard = matches[0].Groups["rightcard"].Value;  

Let’s suppose the string gets a little more troublesome, and the log marks the players role at the table (dealer button, small blind, big blind), and nothing for the rest. Eg.

 "Player1 (dealer) won $1.90 with Ah Jh"  
 "Player1 (small blind) won $1.90 with Ah Jh"  
 "Player1 (big blind) won $1.90 with Ah Jh"  
 "Player1 won $1.90 with Ah Jh"  

That match string will fail for ¾ of the cases supplied above, so now we need to use optional matching. Let’s say we start with just the first and last possibilities, dealer and normal player:


Where we’ve added (dealer\s|), the | allowing matching of the text either side (‘dealer ‘, or nothing extra). Or if it was just the small and big blind, it could be modified to have (small|big)\sblind\s and if we want to catch all of these possibilities, (((big|small) blind|button)\s|) giving us the full string:

 "(?<player>[\w\s\.]+)\s(((big|small) blind|button)\s|)won\s\$(?<amount>\d+\.\d{2})\swith\s(?<leftcard>.{2})\s(?<rightcard>.{2})"  

Which will happily match all of my potential options.

Monday, 15 April 2013

Reading Grouped Data

Having multiple records of information stored as plain text in files is certainly neither uncommon or difficult to read back in. Grouped lines however, was a problem I came across a while ago, with data being split across multiple lines and being separated by a line of empty text. My first shot at this worked fine:

 let ProcessFile (allLines: string list) =   
   let list = new List<List<string>>()  
   let rec SplitFile (input: string list) =  
     if input.Length <> 0 then  
       list.Add(new List<string>(input.TakeWhile(fun x -> x <> "")))  
       let nextGroup = input.SkipWhile(fun x -> x <> "").SkipWhile(fun x -> x = "")  
       SplitFile (Seq.toList nextGroup)  
   SplitFile allLines |> ignore  

But I was unhappy with this solution. It doesn’t seem idiomatic to new List<List<string>> and store the data in there. So I took to StackOverflow in search of a better answer, which was soon provided:
Which directed me to what I wanted, a solution that retained the use of immutable data.

Wednesday, 10 April 2013

Identifying and Processing Flexible Structured Data

In writing a card game history parser, I came across an interesting problem. At first it seemed like it might be mostly straightforward, as these types of problems often do. When you consider a game like Texas Hold’em, there are several rounds of betting, referred to as streets. These streets as you expect are sequential. Preflop always leads to the flop, the flop always leads to the turn, the turn always leads to the river and if more than one player got that far, the player with the strongest hand takes the money. The problem I ran into is in the ‘if’ part of that last sentence. When a sufficient number of players during any part of the hand folds such that there is only 1 remaining, the hand is then short-circuited to the summary, which states some brief information about what happened. When you consider the possible combinations here:

Preflop -> Summary
Preflop -> Flop -> Summary
Preflop -> Flop -> Turn -> Summary
Preflop -> Flop -> Turn -> River -> Summary
Preflop -> Flop -> Turn -> River -> Showdown -> Summary

Each of these stages has an identifying string, so it’s easy to check where they begin. The first approach I took to this was not particularly efficient, and excessively complicated. In this implementation the following happened (warning, much code ahead):

 let GetLinesBetweenMatches (input: string list, startMatch: string, endMatch: string) =  
  let indexOfStart = (input.TakeWhile (fun x -> Regex.Matches(x, startMatch).Count = 0)).Count()  
  let indexOfEnd = (input.TakeWhile (fun x -> Regex.Matches(x, endMatch).Count = 0)).Count()  
  let numberOfLines = indexOfEnd - indexOfStart  
  if startMatch.Length = 0 then  
  input.Skip(indexOfStart + 1).Take(numberOfLines - 1)  
 let flopExists = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sFLOP\s\*\*\*").Count = 0) input) = false  
 let turnExists = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sTURN\s\*\*\*").Count = 0) input) = false  
 let riverExists = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sRIVER\s\*\*\*").Count = 0) input) = false  
 let gotToShowdown = (List.forall (fun (x: string) -> Regex.Matches(x, "\*\*\*\sSHOW\sDOWN\s\*\*\*").Count = 0) input) = false  
 // Process the preflop action  
 let preflopStrings = if flopExists then GetLinesBetweenMatches (input, "Dealt to", "\*\*\*\sFLOP\s\*\*\*")  
    else if gotToShowdown then GetLinesBetweenMatches (input, "Dealt to", "\*\*\*\sSHOW\sDOWN\s\*\*\*")  
    else GetLinesBetweenMatches (input, "Dealt to", "\*\*\*\sSUMMARY\s\*\*\*")  
 let pfact = [for x in (ProcessStreet (Seq.toList preflopStrings, players)) -> x] |> (fun x -> x)  
 let preflop = { actions = pfact.ToList(); streetType = Streets.Preflop }  
 // Process the flop action  
 let flopStrings = if turnExists then GetLinesBetweenMatches (input, "\*\*\*\sFLOP\s\*\*\*", "\*\*\*\sTURN\s\*\*\*")  
    else if gotToShowdown then GetLinesBetweenMatches (input, "\*\*\*\sFLOP\s\*\*\*", "\*\*\*\sSHOW\sDOWN\s\*\*\*")  
    else GetLinesBetweenMatches (input, "\*\*\*\sFLOP\s\*\*\*", "\*\*\*\sSUMMARY\s\*\*\*")  
 let flact = [for x in (ProcessStreet (Seq.toList flopStrings, players)) -> x] |> (fun x -> x)  
 let flop = if flopExists then Some({ actions = flact.ToList(); streetType = Streets.Flop })  
    else None  
 if flop = None then  
  if gotToShowdown then  
  { metaData = metaData; players = players.ToList(); streets = ([preflop; showdown.Value]).ToList() }  
  { metaData = metaData; players = players.ToList(); streets = ([preflop]).ToList() }  
 else if turn = None then  
  if gotToShowdown then   
   { metaData = metaData; players = players.ToList(); streets = ([preflop; flop.Value; showdown.Value]).ToList() }  
   { metaData = metaData; players = players.ToList(); streets = ([preflop; flop.Value]).ToList() }  
 /// ...skipping the turn and river parsing and processing code, you get the idea that it's a branching nightmare and it would be at least double this length.  

As you can see, it’s complicated, inelegant and difficult to modify. I was aware of this at the time, but wanted to sketch something out so that I understood the problem much better. I considered what the better approach would be for a little while, as it was evident that I was trying to solve the problem in an object oriented manner, when it wasn’t an object oriented problem. I mostly wrote this up to demonstrate the difference a small ‘a-ha!’ moment can make, and probably the first time I’d come to a much better answer through the use of this new paradigm.
The improved code is shown below:

 /// Maps a street identifier to each action.  
 let MapHand (input: List<string>) =  
  let street = ref Streets.MetaData  
  seq { for line in input do  
  if Regex.Matches(line, "Seat \d: (?<name>.+) \(\$(?<stack>\d+(\.\d{2})?) in chips\)").Count > 0 then street := Streets.Players  
  else if Regex.Matches(line, "posts small blind").Count > 0 then street := Streets.Blinds  
  else if !street = Streets.Hero then street := Streets.Preflop  
  else if Regex.Matches(line, "Dealt to (?<hero>.+)\s\[(?<leftcard>.{2})\s(?<rightcard>.{2})\]").Count > 0 then street := Streets.Hero  
  else if Regex.Matches(line, "\*\*\*\sFLOP\s\*\*\*").Count > 0 then street := Streets.Flop  
  else if Regex.Matches(line, "\*\*\*\sTURN\s\*\*\*").Count > 0 then street := Streets.Turn  
  else if Regex.Matches(line, "\*\*\*\sRIVER\s\*\*\*").Count > 0 then street := Streets.River  
  else if Regex.Matches(line, "\*\*\*\sSHOW\sDOWN\s\*\*\*").Count > 0 then street := Streets.ShowDown  
  else if Regex.Matches(line, "\*\*\*\sSUMMARY\s\*\*\*").Count > 0 then street := Streets.Summary  
  yield (!street, line) } |> Seq.toList  
 let ProcessHand (input: (Streets * string) list) =  
  let FilterByStreet lines filterStreet =  
  lines |> List.filter (fun (street, _) -> street = filterStreet)  
  |> (fun (_, action) -> action)  
  let result = ProcessResult (FilterByStreet input Streets.Summary)  
  let filtered = input |> List.filter (fun (_, action) -> FilterAction action)  
  let metaData = ProcessMetaData (FilterByStreet filtered Streets.MetaData)  
  let players = GetPlayers (FilterByStreet filtered Streets.Players)  
  let hero = GetHero ((FilterByStreet filtered Streets.Hero), players)  
  let blinds = { actions = ProcessStreet ((FilterByStreet filtered Streets.Blinds), players); streetType = Streets.Blinds }  
  let preflop = { actions = ProcessStreet ((FilterByStreet filtered Streets.Preflop), players); streetType = Streets.Preflop }  
  let flop = { actions = ProcessStreet ((FilterByStreet filtered Streets.Flop), players); streetType = Streets.Flop }  
  let turn = { actions = ProcessStreet ((FilterByStreet filtered Streets.Turn), players); streetType = Streets.Turn }  
  let river = { actions = ProcessStreet ((FilterByStreet filtered Streets.River), players); streetType = Streets.River }  
  let showdown = { actions = ProcessStreet ((FilterByStreet filtered Streets.ShowDown), players); streetType = Streets.ShowDown }  
  let summary = { actions = ProcessSummary ((FilterByStreet filtered Streets.Summary), players); streetType = Streets.Summary }  
  let streets = [blinds; preflop; flop; turn; river; showdown; summary] |> List.filter (fun x -> x.actions.Length <> 0)  
  { metaData = metaData; players = players; streets = streets; result = result }  
 /// ...back into the main processing function.  
 seq { for x in lines do yield MapHand x} |> Seq.toList  
       |> (fun hand -> try Some (ProcessHand hand) with | InvalidHand(_) -> None)  
       |> List.filter (fun processedHand -> processedHand.IsSome)  
       |> (fun hand -> hand.Value)  

In this solution, there’s a value that states the street, each street is paired to a regular expression string. It iterates through the lines of the history and yields a tuple of the line and the street. This sequence of tuples is then given to the process hand function which the code for is in the snippet above which creates values for all streets. If those streets don’t exist, the ProcessStreet function will give us an empty list, and all of the streets are added into a single list, then filtered for empty lists. It’s not purely functional, as there is a mutable variable used in processing the file, but with a recursive function the mutable value wouldn't be necessary anyway, as we’d just give the next use of the function the new value we want to use instead.
In this solution, the readability has improved and so has the extensibility, if a new street is invented in some sort of variant, it’s very simple to include without any branching.
If I’d written a solution to the above problem in C#, I’d imagine I’d have come to a similar solution eventually, but it was the awareness of the functional concepts of map and filter that were important here, and steered me back to describing what I wanted instead of writing the steps to get it myself.

Edit: I'm guessing I'm going to need to write an F# add-on to that syntax highlighter.

Tuesday, 9 April 2013

Classes and Records

I recently started writing a text parser, a foundation stage in a larger project I’ve been planning to process large numbers of card game hands for a bit of big data experimentation in F#. I began in C#, I figured I’d just get the annoying bit out of the way here quickly in the language I was most familiar with, load the library in with the F# stuff and move on from there. After about 30 minutes of sketching out the classes for data storage, it started to feel rather clunky and weighed down. I’d already got a couple of hundred lines of code and I hadn’t even started processing anything yet, for example:

 using System.Collections.Generic;  
 namespace HHParser  
   /// <summary>  
   /// A class that represents a 'street' and all of its actions.  
   /// </summary>  
   public class StreetActions  
     /// <summary>  
     /// Initialises a new instance of the StreetActions class.  
     /// </summary>  
     public StreetActions()  
       OrderedActions = new LinkedList<playeraction>();  
     /// <summary>  
     /// All the actions taken on this particular street.  
     /// </summary>  
     public LinkedList<playeraction> OrderedActions { get; private set; }  
   /// <summary>  
   /// A single action a player has taken in the hand.  
   /// </summary>  
   public struct PlayerAction  
     /// <summary>  
     /// The players name.  
     /// </summary>  
     public string Player { get; set; }  
     /// <summary>  
     /// The action the player took.  
     /// </summary>  
     public Actions Action { get; set; }  
     /// <summary>  
     /// The cost to the player.  
     /// </summary>  
     public float? Amount { get; set; }  
   /// <summary>  
   /// The actions a player can through the process of a hand.  
   /// </summary>  
   public enum Actions  
I had figured it would be a little more concise overall, but I hadn’t realised I’d get this benefit out of the gate, when porting the above:
 namespace Parser  
 type Action =  
   | Fold of unit  
   | Check of unit  
   | Call of float  
   | Bet of float  
   | Raise of float * float  
 type Streets =  
   | Preflop  
   | Flop  
   | Turn  
   | River  
 type PlayerAction = { player: string; action: Action }  
 type Street = { actions: PlayerAction list; streetType: Streets }  
A couple of discriminated unions and records and I’m presented with a far clearer version of what I’d written in the C# version. It’s also nice to not have to worry about the accessibility in getting and assigning variables, since its immutable those properties are irrelevant.
Another point of note is more subtle but bugged me when it first came up, the nullable float in the C# PlayerAction class, provided for those actions that require it, and to be kept null for those that don’t. I felt I’d lost some neatness as soon I’d written that, and while I could have just had it as a base class and specialised for actions that required the float, that too adds its own complexity, so writing the Action type in F# was refreshingly simple, especially considering how simple it is to pattern match out the correct instance when you’re trying to do some work on it. For example:
 match action with  
   | Fold -> printfn "Player folded."  
   | Check -> printfn "Player checked."  
   | Call(size) -> printfn "Player called %f." size  
   | Bet(size) -> printfn "Player bet %f." size  
   | Raise(call, raise) -> printfn "Player raised from %f to %f." call raise  
Which is already neater than getting or setting a null value based on an enum, and no chance of them falling out of sync.