hubFS: THE place for F#

. . . are you on The Hub?
Welcome to hubFS: THE place for F# Sign in | Join | Help
in Search

Sort Algorithms

Last post 10-26-2007, 0:50 by jhugard. 0 replies.
Sort Posts: Previous Next
  •  10-26-2007, 0:50 3876

    Sort Algorithms

    Dusted off my copy of Sedgwick's Algorithms and decided to implement some of the sorts described therein to dig in to F#.  These are heavy on imperitive features, but the intention here is to arrive at efficient library-quility code.

    On the other hand, I had a bit of functional fun with the test driver code!

    I hope these are worthy of posting here, since I finally managed to beat the standard .NET libraries with both quicksort and shellsort.  Man is quicksort hard to get right!

    Shellsort tends to perform very well with small arrays, running in as little as 10% of the time as the stock sort for 100 sorted elements and 86% of the time for 10,000 elements.  It tends to outperform even quicksort when the input is mostly sorted or reverse sorted.

    Quicksort is almost 2x faster than the .NET library sort above 1,000 elements in the random and forward ordered cases, but only a bit more than 10% faster in the reverse ordered case.  It uses a stack of about log2 N keys.  Note that it is *slower* than the .NET library for 100 random elements or less, but is still faster for sorted and reverse input at that level.

    See the attached project files for copyright, etc.

    Feedback most welcome!


    sort.fsi

    /// This stable in-place sort has the important characteristic that each element
    /// is only moved once. Therefore, it is useful for arrays with very large
    /// elements. Runtime tends to be stable regardless of the original order.
    ///
    /// Uses about N^2/2 comparisons and N exchanges. Complexity is O(N log N)
    /// on average, O(N^2) in the worst case, and linear for arrays with small keys
    /// and large elements.
     
    val selectionsort : ('a -> 'a -> int) -> 'a array -> unit
     
    /// Another in-place stable sort, this routine is ideal for arrays that are
    /// already mostly ordered. In fact, this sort is often used as a final step in
    /// quicksort, rather than having the latter fully order the array.
    /// Performs poorly when elements are in reverse order.
    ///
    /// Uses around N^2/4 comparisons and about N^2/8 exchanges, with N^2/4 in the
    /// worst case. Complexity is linear for "almost" sorted files, but quadratic
    /// for files in reverse order.
     
    val insertionsort : ('a -> 'a -> int) -> 'a array -> unit
     
    /// Added for comparison, this calls the F# default sort, which internally
    /// calls the .NET BCL Array.Sort method. Performs slightly better than
    /// this library's shellsort on random data with 10,000 or more elements,
    /// but worse on mostly ordered data, even if in reverse order.
    ///
    /// Complexity is unknown, but based on performance the implementation is
    /// suspected to be a form of shell sort.
     
    val stocksort : ('a -> 'a -> int) -> 'a array -> unit
     
    /// An unstable in-place sort with excellent performance characteristics. Based
    /// on the insertion sort, but modified to rapidly move elements over large
    /// distances. Tends to outperfom Array.Sort and Quicksort for small arrays
    /// (up to 1,000s of records), and outperforms Array.Sort most of the time.
    /// Not particularaly sensitive to initial ordering.
    ///
    /// This implementation never does more than N^3/2 comparisons. The exact
    /// complexity is not known, but suspected to be either N(log2 N)^2 or N^1.25.
     
    val shellsort : ('a -> 'a -> int) -> 'a array -> unit
     
    /// This is the fastest sort in the library, but is both unstable and requires a
    /// small supplementary stack of about log2 N indicies. Due to overhead, not
    /// recommended for 100s of records or less (use shellsort instead).
    /// Tends not to perform as well as shellsort when records are in reverse
    /// order.
    ///
    /// Uses 1.38N(log2 N) comparisons on average, with a very tight inner loop.
    /// Tends to avoid the worst case of O(N^2) and gains about 5% performance by
    /// using a "median-of-three partitioning" algorithm. For an additional savings
    /// of about 20%, only orders partitions with more than 21 elements; an insertion
    /// sort is used as a last step on the "almost" sorted result.
     
    val quicksort : ('a -> 'a -> int) -> 'a array -> unit


    sort.fs

    #light
     
    open System
     
    let selectionsort f a =
        let N = (Array.length a) - 1
        for i = 0 to N do
            let mutable minIx = i
            for j = i+1 to N do
                if (f a.[j] a.[minIx]) < 0 then minIx <- j
            if minIx <> i then
                let t = a.[ i ]
                a.[ i ] <- a.[minIx]
                a.[minIx] <- t
     
    let insertionsort f a =
        let N = (Array.length a) - 1
        for i = 1 to N do
            let v = a.[ i ]
            let mutable j = i
            while (j>0) && (f a.[j-1] v) > 0 do
                a.[j] <- a.[j-1]
                j <- j - 1
            a.[j] <- v
     
    let shellsort f a =
        let N = Array.length a - 1
     
        let mutable width = 1
        while width <= N do width <- 3*width+1
        width <- width / 3
     
        while width >= 1 do
            for i = width to N do
                let v = a.[ i ]
                let mutable j = i
                while j>=width && (f a.[j-width] v) > 0 do
                    a.[j] <- a.[j-width]
                    j <- j-width
                a.[j] <- v
            width <- width / 3
     
    let stocksort f a = Array.sort f a
     
    let quicksortM M fn a =
        let N = Array.length a - 1
        let mutable stack = [0,0]
        let mutable l = 0
        let mutable r = N
     
        let swap x y =
        let t = a.[ x ]
        a.[ x ] <- a.[ y ]
            a.[ y ] <- t
     
        let sort3 x y z =
            if a.[ x ] > a.[ y ] then swap x y
            if a.[ x ] > a.[ z ] then swap x z
            if a.[ y ] > a.[ z ] then swap y z
     
        while stack <> [] do
     
            let count = (r-l)+1
     
    #if DEBUG_QUICKSORT
            // Special case this for debugging to handle l=0 & r=1
            if count = 2 then
                if a.[l] > a.[r] then swap l r
     
            else if count = 3 then
                // Median-of-three
                let m = l + 1
                sort3 l m r
     
            else
    #endif
     
            if count > M then
     
                // Median-of-three
                let m = l + (r-l)/2
                sort3 l m r
                let rm1 = r-1
                swap m rm1
     
                let v = a.[rm1]
                let mutable i = l
                let mutable j = rm1
     
                while i <= j do
                    i <- i + 1
                    while (fn a.[ i ] v) < 0 do
                        i <- i + 1
                    j <- j - 1
                    while (fn a.[j] v) > 0 do
                        j <- j - 1
                    swap i j
     
                swap j i
                a.[rm1] <- a.[ i ]
                a.[ i ] <- v
     
                if j-l > r-i then
                    stack <- (l,j) :: stack
                    l <- i + 1
                else
                    stack <- (i+1,r) :: stack
                    r <- j
     
            else
                let tl,tr = List.hd stack
                stack <- List.tl stack
                l <- tl
                r <- tr
     
    #if DEBUG_QUICKSORT
    #else
         insertionsort fn a
    #endif
     
    // Emperical testing shows that 21 is a good all-around value for the quicksort
    // partition size on my laptop.
     
    let quicksort fn a = quicksortM 21 fn a


    sorttests.fs

    #light

    open Sort
    open System
    open System.Diagnostics
     
    let assertIsSorted a =
        let N = Array.length a - 1
        let mutable min = a.[0]
        for i = 1 to N do
            if a.[ i ] >= min then
                min <- a.[ i ]
            else
                failwithf "Array is not sorted at index %i (%i > %i)" i min a.[ i ]
        done
     
    let assertAreEqual a b =
        if Array.length a <> Array.length b then
            failwithf "Array lengths are different: %i <> %i" (Array.length a) (Array.length b)
        a |> Array.iteri (fun ix v -> if v <> b.[ix] then failwithf "Array not equal at index %i (%i <> %i)" ix v b.[ix])
     
     
    let rnd = new Random();
    let stopw = new Stopwatch();
     
    //--------------
     
    let sorts = [
                    ("Stock F# Sort", stocksort);
                    ("Quick Sort", quicksort);
                    ("Shell Sort", shellsort);
    #if I_HAVE_LOTS_OF_TIME_FOR_ON2
                    ("Insertion Sort", insertionsort);
                    ("Selection Sort", selectionsort);
    #endif
                ]
     
    let asc a b = (a-b)
    let dec a b = (b-a)
     
     
    //--------------
     
     
    let sortDriver a sortfn =
        let b = Array.copy a
     
        System.GC.Collect();
        stopw.Start()
        a |> sortfn asc
        stopw.Stop()
     
        b |> stocksort asc
        assertAreEqual a b
     
    let tests = [ yield! [for mag in 2 .. 7 ->>
                              [ (sprintf "Random x 10^%i" mag, (fun () -> let N = (int (10.0**(float mag))) in [| for i in 0..N -> rnd.Next() |]) )
                                (sprintf "Sorted x 10^%i" mag, (fun () -> let N = (int (10.0**(float mag))) in [| for i in 0..N -> i |]) )
                                (sprintf "Reverse x 10^%i" mag, (fun () -> let N = (int (10.0**(float mag))) in [| for i in 0..N -> N-i |]) )
                                // (sprintf "Alternating x 10^%i" mag, (fun () -> let N = (int (10.0**(float mag))) in [| for i in 0..N -> i % 2 |]) )
                                // (sprintf "Alternating2 x 10^%i" mag, (fun () -> let N = (int (10.0**(float mag))) in [| for i in 0..N -> (i+1) % 2 |]) )
                              ]
                         ]
                ]
     
    //--------------
     
    for test in tests do
        printfn "%s:" (fst test)
        let mutable defaultMS = -1.0
        for sort in sorts do
            printf "\t%s..." (fst sort)
            let N = 5
            stopw.Reset()
            for iteration = 1 to N do
                System.Console.Write(".");
                System.Console.Out.Flush()
                let ar = (snd test)()
                let fn = snd sort
                sortDriver ar fn
            let avg = (stopw.Elapsed.TotalMilliseconds / (float N))
            if defaultMS = -1.0 then
                defaultMS <- avg
            printfn "\tAvg. %fms (%i%%)" avg (int (avg*100.0 / defaultMS + 0.5 ))
        done
        printfn ""
    done
     
    Console.WriteLine( "Press <Enter> to continue..." );
    Console.ReadKey() |> ignore;


    "It seems that perfection is reached, not when there is nothing left to add, but rather when nothing more can be taken away." -- Antoine de St. Exupéry
View as RSS news feed in XML
Powered by Community Server, by Telligent Systems