hubFS: THE place for F#

. . . are you on The Hub?
Welcome to hubFS: THE place for F# Sign in | Join | Help
in Search

Comprehension performance

Last post 05-13-2007, 5:20 by jdh30. 7 replies.
Sort Posts: Previous Next
  •  05-07-2007, 15:06 3085

    Comprehension performance

    I hope I'm doing something wrong, but I'm seeing some poor performance in comprehensions that's a little unexpected. I'm a long time Scheme and C# programmer, so I feel like I should be happy as a pig in slop in the F# world! My get-started example is the implementation of Norvig's toy spelling corrector. I figure it's a nice simple example perfect for seqs and comprehensions. Here is two of the basic routines: edits is given a word, it determines all the words which are one spelling error away from that word. edits2 determines all the words two spelling errors away

    let (+..) = Seq.append
    let edits (word: string) =
      { for i in {0 .. word.Length-1} -> word.Substring(0,i) + word.Substring(i+1) } +.. (* delete *)
      { for i in {0 .. word.Length-2} -> word.Substring(0,i) + word.Substring(i+1,1) + word.Substring(i,1) + word.Substring(i+2) } +.. (* transpose *)
      { for i in {0 .. word.Length-1} for c in {'a' .. 'z'} -> word.Substring(0,i) + (String.make 1 c) + word.Substring(i+1) } +..     (* alter *)
      { for i in {0 .. word.Length} for c in {'a' .. 'z'} -> word.Substring(0,i) + (String.make 1 c) + word.Substring(i) }             (* insert *)
      |> Set.of_seq
      |> Set.to_seq
    let edits2 word = Set.of_seq (Seq.fold Seq.append Seq.empty (edits word |..> edits))

    Works like a charm, reads gracefully, fun to program ,etc. It takes about 60 seconds to produce all the edit2 results for "something"... about 140k words. It thought it seemed a bit pokey, though. So I coded up what appears to be a nearly equivalent implementation in C# and it produces the answer in somewhere on the order of half a second. I am not clear on what the big performance drain is.

    private static IEnumerable EditsRaw(string word)
    {
    for(int i=0; i yield return word.Substring(0, i) + word.Substring(i + 1);
    for(int i=0; i yield return word.Substring(0, i) + word.Substring(i + 1, 1) + word.Substring(i, 1) + word.Substring(i + 2);
    for(int i=0; i for(char c='a'; c yield return word.Substring(0,i) + c.ToString() + word.Substring(i + 1);
    for(int i=0; i for(char c='a'; c yield return word.Substring(0, i) + c.ToString() + word.Substring(i);
    }
    private static IEnumerable Edits(string word)
    {
    return F.Distinct(EditsRaw(word));
    }
    private static IEnumerable Edits2Raw(string word)
    {
    foreach(string edit in Edits(word))
    foreach(string edit2 in Edits(edit))
    yield return edit2;
    }
    private static IEnumerable Edits2(string word)
    {
    return F.Distinct(Edits2Raw(word));
    }
    static void Main(string[] args)
    {
    Console.Out.WriteLine(F.Count(Edits2("something")));
    }

    (The F class contains some generic IEnumerable based idioms I've had in my back pocket the last couple of years. Their implementation looks pretty much exactly like the stuff in the FSharp.Collections library.) I know that earlier versions of F# discussed fast comprehensions on integer ranges, but I don't see much evidence of that in the generated IL. I wonder if there is some subtlety in the implementation that I am missing, because I keep hearing stories about excellent F# performance stats. Thanks for any advice you can give me!

  •  05-07-2007, 16:24 3086 in reply to 3085

    Re: Comprehension performance

    Hi Sebastian,

    It's a beautiful program!

    I ran the program and got 2.8s for "something". My laptop is fairly quick though. But I'm certain we could do a lot better. Could you send us the full C# and F# code to fsbugs AT microsoft.com?

    Sequence expressions are odd with regard to performance. The aim of the first release of the feature (in December 06) was absolutely to "get the feature in there in time for the books", with a focus on getting some (but not all) of the performance cases right.  Since then we've mainly been fixing correctness problems (e.g. with regard to Dispose of database connections).  But we'll certainly look into this particular issue.

    "yield" is quite magical in C#: it's imperative programming on steroids. It's not the right model for F#, but it is an amazing construct. While the two look similar, the compilation strategies are indeed wildly different, and the C# team has a 6 year lead over us in terms of implementation effort on this one. So this is one area where I expect the "natural equivalent" C# program to currently be faster.

    You can use Seq.map and Seq.filter over enormous sequences and things work out pretty well.  However the sequence expression model is harder to compile than C# iterators, so I think that in this case the C# iterator code will always be at least as fast as the F# code. Also, I suspect most F# programmers intuitively switch to working over concrete data structures (arrays and lists) when performance really matters. I'm not sure if that's possible here.

    Regards,

    Don

     

  •  05-07-2007, 16:43 3087 in reply to 3086

    Re: Comprehension performance

    Sorry - have no reproduced your perf problem - I was using

    let (|..>) x y = Seq.map_concat y x

    instead of

    let (|..>) x y = Seq.map y x

    We'll look into where the perf problems are.

    regards,

    don

     

  •  05-07-2007, 17:01 3088 in reply to 3087

    Re: Comprehension performance

    We'll look into this further over the next few days. For starters, Seq.map_concat is considerably faster than your fold/append/concat combination.  A very long chain of Seq.append's causes problems.  This is exactly the sort of situation where C# iterators do better, since they compile to a state machine.

    Also removing the "Set.of_seq" in "edits" makes things faster. But that's more of an algorithmic issue.

    Am I right in thinking this is logically equivalent code?

    let edits (word: string) = 
        { for i in {0 .. word.Length-1} -> word.Substring(0,i) + word.Substring(i+1) } +..                                               (* delete *) 
        { for i in {0 .. word.Length-2} -> word.Substring(0,i) + word.Substring(i+1,1) + word.Substring(i,1) + word.Substring(i+2) } +.. (* transpose *) 
        { for i in {0 .. word.Length-1} for c in {'a' .. 'z'} -> word.Substring(0,i) + (String.make 1 c) + word.Substring(i+1) } +..     (* alter *) 
        { for i in {0 .. word.Length} for c in {'a' .. 'z'} -> word.Substring(0,i) + (String.make 1 c) + word.Substring(i) }             (* insert *)  
        |> Set.of_seq 
    let edits2 word = Set.of_seq (Seq.map_concat edits (edits word))

    Regards,

    don

  •  05-07-2007, 17:34 3089 in reply to 3088

    Re: Comprehension performance

    Your changes to the edit function look right to me. It's always a fun tradeoff with these things knowing where to "stop the seq train". having edit produce a set rather than a stream of unique words makes as much sense as anything. I realized later I had not posted the |..> operator I toyed with. In my case it was
    let (|..>) a b = a |> Seq.map b

    I will email you the entire code separately, but I've pasted it in below for those playing along at home. I agree with the statement that yield is imperative programming on steroids. It's a wonderful construct and very intuitive for most imperative programmers, once they get over the shock of having the compiler write non-trivial code for them.

    #light
    let (|..>) a b = a |> Seq.map b
    let (+..) = Seq.append
    let edits (word: string) =
    { for i in {0 .. word.Length-1} -> word.Substring(0,i) + word.Substring(i+1) } +.. (* delete *)
    { for i in {0 .. word.Length-2} -> word.Substring(0,i) + word.Substring(i+1,1) + word.Substring(i,1) + word.Substring(i+2) } +.. (* transpose *)
    { for i in {0 .. word.Length-1} for c in {'a' .. 'z'} -> word.Substring(0,i) + (String.make 1 c) + word.Substring(i+1) } +.. (* alter *)
    { for i in {0 .. word.Length} for c in {'a' .. 'z'} -> word.Substring(0,i) + (String.make 1 c) + word.Substring(i) } (* insert *)
    |> Set.of_seq
    |> Set.to_seq
    let edits2 word = Set.of_seq (Seq.fold Seq.append Seq.empty (edits word |..> edits))
    printf "%d" (Seq.length (edits2 "something"))

    and the C# below, including the "F" class referenced within the code. With LINQ, much of the "F" class withers away in favor of the native implementations of the same concepts, but I haven't quite gotten there yet.

    class Program
    {
    // what a shame you cannot yield from an anonymous function
    // or this would be inside Edits
    private static IEnumerable<string> EditsRaw(string word)
    {
    for(int i=0; i < word.Length; ++i)
    yield return word.Substring(0, i) + word.Substring(i + 1);
    for(int i=0; i < word.Length-1; ++i)
    yield return word.Substring(0, i) + word.Substring(i + 1, 1) + word.Substring(i, 1) + word.Substring(i + 2);
    for(int i=0; i < word.Length; ++i)
    for(char c='a'; c <= 'z'; ++c)
    yield return word.Substring(0,i) + c.ToString() + word.Substring(i + 1);
    for(int i=0; i <= word.Length; ++i)
    for(char c='a'; c <= 'z'; ++c)
    yield return word.Substring(0, i) + c.ToString() + word.Substring(i);
    }
    private static IEnumerable<string> Edits(string word)
    {
    return F.Distinct(EditsRaw(word));
    }
    private static IEnumerable<string> Edits2Raw(string word)
    {
    foreach(string edit in Edits(word))
    foreach(string edit2 in Edits(edit))
    yield return edit2;
    }
    private static IEnumerable<string> Edits2(string word)
    {
    return F.Distinct(Edits2Raw(word));
    }

    static void Main(string[] args)
    {
    Console.Out.WriteLine(F.Count(Edits2("something")));
    }
    }
    public static class F
    {
    public static IEnumerable<T> Distinct<T>(IEnumerable<T> from)
    {
    if (from == null) yield break;
    Dictionary<T, byte> values = new Dictionary<T, byte>();
    foreach (T t in from)
    {
    if (!values.ContainsKey(t))
    {
    values.Add(t, 0);
    yield return t;
    }
    }
    }
    public static int Count<T>(IEnumerable<T> from)
    {
    int i = 0;
    foreach(T _ in from) ++i;
    return i;
    }
    }

  •  05-07-2007, 17:50 3090 in reply to 3088

    Re: Comprehension performance

    "A very long chain of Seq.append's causes problems"

    This is definitely true, as you must have found when you removed the thousands of appends implied by edit2. When I replace the guts of edit2 with a Seq.map_concat, the generation of edits2("something") takes a little under 5 seconds... something like 10 or 15 times faster. Certainly the state machine implementation of Seq.append is trivial in C# with the magical yield operator.

    public static IEnumerable<T> Sequence<T>(params IEnumerable<T>[] es)
    {
        int i = 0;
        foreach (IEnumerable<T> e in es)
        {
            foreach (T t in e) yield return t;
            es[i++] = null; // oh why not be nice to the garbage collector
        }
    }


    I see that the underlying implementation of most of the seq functions is inherently stateful, and perhaps I am glossing over too many details, but the F# doesn't look that different. The performance drop is surprising. Perhaps because it is creating so many more enumerator objects instead of surfing through the one it started with? Anyway, fascinating.

    let append (ie1: #IEnumerable<'a>) (ie2: #IEnumerable<'a>) =
      generate
        // the state holds two enumerators: one for the outer list and one for the current inner list
        (fun () -> (ref (ie1.GetEnumerator()), ref false))
        (fun (curr,right) ->
            let rec take () =
              // check the inner list
              match get !curr with
              | None ->
                // check the outer list
                if !right then None
                else
                    // don't forget to dispose of the enumerator for the inner list now we're done with it
                    (!curr).Dispose();
                    curr := ie2.GetEnumerator();
                    right := true;
                    take ()
              | res-> res
            take ())
         (fun (curr,_) -> (!curr).Dispose())

  •  05-11-2007, 17:44 3137 in reply to 3090

    Re: Comprehension performance

    Hi Sebastian,

    I took a bit more of a look at this. I came up with a way to optimize Seq.append under the hood - it reduces runtimes dramatically. The overhead then becomes the use of Sets to implement "distinct" - I think a HashSet would be better here. When the use of sets is removed I get a runtime of 0.8s, but I'm certain further optimizations are possible.

    The optimization is as follows. The problem with a naive implementation of a binary "append" operator on IEnumerables is that a long chain of appends creates a long chain of cascading IEnumerators - if you have a chain of length N and yur getting the last element then your still asking the IEnumerator for the top most "append", which asks teh IEnumerator for the next-top-most "append" etc. all the way down the chain. Thus iterating becomes quadratic.

    The trick is to represent the results of Seq.append symbolically, rather than by creating an object that represents a fixed strategy for how the resulting IEnumerable will be implemented. If a zillion Seq.appends are then concatenated then you can optimize the symbolic representation (by concatenating all the sub-elements to append) and when you actually come to iterate you just use an integer state index to record which node you're yielding from. Effectively we're working out the best way to record the state "behind the scenes". I'd say this can be generalized to further behind-the-scenes deforesting, for example "map f (map g E)" can become "map (f o g) E".  All very interesting. Here are some code snippets (note the use of active patterns to factor out some bits :-) ):



        type ScriptedEnumerator<'a>(script: Script<'a>) =
            class
                interface IEnumerable<'a> with
                    override x.GetEnumerator() = script.GetEnumeratorForScript()
                interface IEnumerable with
                    override x.GetEnumerator() = (script.GetEnumeratorForScript() :> IEnumerator)
                member x.Script = script
            end
                   
       
        let (|HasScript|) (x : #IEnumerable<'a>) =
            match (x :> IEnumerable<'a>) with
            | :? ScriptedEnumerator<'a> as s -> s.Script
            | ie -> Run(ie)

        let append (HasScript(s1)) (HasScript(s2)) =
            (new ScriptedEnumerator<'a>(Script.MkAppend(s1,s2)) :> seq<'a>)
           

    And the bit that interprets scripts (note: uses of "generate" are always a bit ugly - the downside of not having "yield", but then this goes in the library, right!)



        type Script<'a> =
            | Append of Script<'a> []
            | Run of IEnumerable<'a>
            with
                static member MkAppend(s1,s2) =
                    let (|Flat|) = function Append(arr) -> arr | x -> [| x |]
                    match s1,s2 with
                    | Flat(arr1),Flat(arr2) -> Append(Array.append arr1 arr2)

                // based on Seq.concat

                member script.GetEnumeratorForScript() =
                    match script with
                    | Append(arr) ->
                        IEnumerator.generate
                          // the state holds two enumerators: one for the outer list and one for the current inner list
                          (fun () -> (ref 0, ref (IEnumerator.empty())))
                          (fun (n, curr) ->
                              let rec take () =
                                // check the inner list
                                match get !curr with
                                | None ->
                                  // check the outer list
                                  if !n >= arr.Length then None
                                  else
                                      let subScript = arr.[!n]
                                      incr n;
                                      // don't forget to dispose of the enumerator for the inner list now we're done with it
                                      (!curr).Dispose();
                                      curr := subScript.GetEnumeratorForScript();
                                      take ()
                                | res-> res
                              take ())
                           (fun (iese,curr) -> (!curr).Dispose())
                     | Run(ie) -> ie.GetEnumerator()
            end

  •  05-13-2007, 5:20 3142 in reply to 3085

    Re: Comprehension performance

    sebastiang:
    My get-started example is the implementation of Norvig's toy spelling corrector.

    What a wonderful example program! This is so enticing that I can't help but have a go myself.

    My first impressions are that Peter's Python implementations are not a good starting point for several reasons. The most important of which is that it is very idiomatic Python, to the extent that the program appears to be heavily optimized for an interpreted language like Python. I think F# can do vastly better but you'll have to start from scratch with a completely different approach.

    My second concern is that I believe the algorithm could be better. Rather than searching for single or double errors, it should search for all errors but bias the search towards the highest probability errors. For example, the error from "ace" to "oace" is probably much less likely than the error "tomorrow" to "tommorow" despite the fact that the latter involves two separate errors.

    I'll let you know how I get on.

    Cheers,
    Jon.

     

View as RSS news feed in XML
Powered by Community Server, by Telligent Systems