hubFS: THE place for F#

. . . are you on The Hub?
Welcome to hubFS: THE place for F# Sign in | Join | Help
in Search

Hell Is Other Languages

Software Transactional Memory for F#

I have written a library for using software transactional memory in F#. The library exposes memory transactions as a monad using F#’s new computation expression feature. It can be downloaded here.

Software transactional memory (STM) is an approach to concurrent programming that avoids the use of traditional, error-prone locks to control access to shared memory. Instead of using locks a programmer specifies sections of code that will make atomic changes to shared memory.

The design and implementation of this F# library is blatantly copied from the paper Composable Memory Transactions. I am enormously grateful to the authors of this paper whose clear and precise explanation made the implementation of this complicated piece of software almost trivial.

The low-level STM machinery is implemented in the file stm.cs. It is written in C# mainly because I was working off existing code I’d written some time ago, but there’s no reason why it could not be written in F#. It is all managed code and internally uses monitors for synchronization. It currently uses a very coarse grained lock meaning that only one transaction can commit at once and every waiting thread is resumed when a transaction completes, but this can be improved with more work. The implementation follows the description in the paper.

The F# monadic interface is in the file stm.fs. It imports definitions from the C# assembly and defines the monad type Stm<'a> with corresponding builder class to enable computation expression syntax. An issue with the implementation of STM systems is how to access the transaction log to make reads and writes during a transaction. Two common approachs are to use thread-local storage which is slow, or implicitly pass the transaction log as an extra parameter to every function, which requires code generation. The advantage of using computation expressions (or monadic do notation) is that the transaction log is explicitly passed between functions by the programmer’s code, thereby avoiding compiler modification, but also avoiding the need for the programmer to manually manage the transaction log parameter passing scheme as well.

With these two layers in place one can write STM code in F#. Transacted memory locations have the type TVar<'a>. This is a reference cell like Ref<'a> in F# but its value can only be read and written from inside a transaction. A TVar is created using the newTVar function.

let x = newTVar 0

val x : TVar<int>

This code is similar to let x = ref 0, but being a TVar instead of a Ref, the location x can be safely shared between multiple threads.

A transaction is defined using a computation expression. A transaction is a value of type Stm<'a> which represents a computation that when executed will make an atomic change to the values of zero or more TVars and then return a value of type 'a. The type system ensures at compile-time that all reads and writes to a TVar only take place inside a transaction. The functions readTVar and writeTVar read and write a TVar respectively. They are analogous to let v = !x and x := v for regular reference cells.

val readTVar : TVar<'a> -> Stm<'a>
val writeTVar : TVar<'a> -> 'a -> Stm<unit>

Let’s define a function that increments the value stored in a TVar.

let incr x =
  stm {
    let! v = readTVar x
    do! writeTVar x (v+1)
    return v }

val incr : TVar<int> -> Stm<int>

The syntax stm { … } is an F# computation expression using the 'stm' monad, which means that this is a memory transaction (as opposed to an async work unit or something). The value of the location is read and stored in v. The incremented value is then written back to the location and v is returned as the result of the computation. When executed the read and write happen as an atomic block which means logically that an interleaving thread won’t access the location between the read and write.

A transaction is executed by the atomically function.

val : atomically : Stm<'a> -> 'a

incr x |> atomically

val it : int = 0

incr x |> atomically

val it : int = 1

The composability of STM can be seen by the ability to compose incr into a larger transaction that increments a location twice.

let incr2 x =
  stm {
    let! _ = incr x
    let! v = incr x
    return v }

val incr2 : TVar<int> -> Stm<int>

incr2 x |> atomically

val it : int = 3

Imagine that incr was part of the public interface of a module. If incr used a lock to perform the read and write atomically then it would be impossible for users of the library to write incr2 unless the module also exposed its locks through a public interface, which would make programming intricate for the user.

If you see any Haskell STM code, it is straightforward to translate this into F#. For example the paper gives the implementation of an MVar as a code example. The paper describes an MVar as

An MVar is a mutable location like a TVar, except that it may be either empty, or full with a value. The takeMVar function leaves a full MVar empty, and blocks on an empty MVar. A putMVar on an empty MVar leaves it full, and blocks on a full MVar. So MVars are, in effect, a one-place channel.

MVars can be implemented in F# STM like so.

type MVar<'a> = TVar<option<'a> >

let newEmptyMVar () = newTVar None
let newFullMVar v = newTVar (Some v)

let takeMVar mv =
  stm { let! v = readTVar mv
        return! match v with
                | Some a ->
                    stm { do! writeTVar mv None
                          return a }
                | None -> retry () }

let putMVar mv a =
  stm { let! v = readTVar mv
        return! match v with
                | None -> writeTVar mv (Some a)
                | Some _ -> retry () }

val newEmptyMVar : unit -> MVar<'a>
val newFullMVar : 'a -> MVar<'a>
val takeMVar : MVar<'a> -> Stm<'a>
val putMVar : MVar<'a> -> 'a -> Stm<unit>

The retry operation is used to re-execute a transaction from the beginning. There is no point re-executing the transaction immediately because no memory locations will have changed. So the retry operation blocks until some memory changes that might allow the transaction to make further progress. This is how blocking is implemented with STM.

Another feature of STM that makes it more composable than locks is how it handles blocking. The get/put MVar operations above are blocking operations. If an MVar library was written using locks and non-blocking versions of these operations were required then they would have to be implemented by the library developer and exposed as part of the library’s public interface. Not so with STM: if the MVar interface only exposes blocking get/put operations then it is possible for the user of the library to write the non-blocking versions, and vice-versa.

let tryTakeMVar mv =
  orElse
    (stm { let! v = takeMVar mv
           return Some v })
    (stm { return None })

val tryTakeMVar mv : MVar<'a> -> Stm<'a option>

The function tryTakeMVar is the non-blocking version of takeMVar; it returns an option value depending on whether a value was available. The key here is the orElse operation. This function takes two transactions and tries to execute the first. If the first fails with a retry/block then it executes the second. So in this example if takeMVar retries/blocks then it will return None.

Also included in the code download is the Santa program, a singly-linked list queue, and a fixed-length array queue.

Of course all these great benefits of STM come at a price, and that price is performance. An STM program is probably always going to be slower than a lock-based program. This STM implementation was engineered for correctness and design of the monadic interface and so probably has really bad performance.

Published Wednesday, January 16, 2008 12:11 AM by gneverov
Attachment(s): Stm.zip

Comments

 

Jason Haley said:

January 15, 2008 7:22 AM
 

d2bg said:

What happens if you define a new tvar as

let f = newRef (let x = ref 0 in fun () -> incr x; !x)

and run several copy of

stm { let! ff = readRef f
          let v = ff ()
          do! writeRef f ff
          return v }

will they return continuous int sequence?
January 15, 2008 2:02 PM
 

gneverov said:

No, because you are using a Ref to store the counter not a TVar -- only TVars are synchronized under concurrent access. The TVar, f, that you use does not in fact do anything because although you write to f the value of f never changes: it is always a closure that increments x.
January 15, 2008 9:57 PM
 

d2bg said:

Well, my point is: with the optimistic strategy used by STM, if a value has mutable state _inside_ (like the closure here: let x = ref 0 in fun () -> incr x; !x), even those failed commits (or even purly access) can still change its content.
January 16, 2008 2:55 AM
 

gneverov said:

The current implementation does not rollback side effects (I/O and non-transacted memory) when a transaction re-executes.
January 16, 2008 6:22 AM
 

Macker said:

Great post! Does your prior comment mean that unlike the Haskell STM, this F# one does not statically prevent a programmer from including IO or a side effect inside a transaction?
January 16, 2008 1:22 PM
 

gneverov said:

Yes, that's right. Haskell is able to do this because it is a pure functional language and all side effects are encapsulated by the IO monad which gives side effecting computations a different static type. Since F# is an impure language and by nature allows side effects anywhere, there is no easy way for the type system to prevent side effects in a transaction statically.
January 16, 2008 3:10 PM
 

Macker said:

I wonder if F# could be enhanced to have an unsafe mode, a bit like C#, with annotations for "unsafe" side-effect-allowing code. Code not thus annotated would be statically checked just as in Haskell.
January 17, 2008 4:52 AM
 

Don Syme's WebLog on F# and Other Research Projects said:

Greg Neverov (of active patterns fame) has placed an implementation of Software Transactional Memory
January 17, 2008 5:25 PM
 

Noticias externas said:

Greg Neverov (of active patterns fame) has placed an implementation of Software Transactional Memory
January 17, 2008 6:22 PM
 

MSDN Blog Postings » Greg Neverov: Software Transactional Memory for F# said:

January 17, 2008 6:50 PM
 

Development in a Blink » Blog Archive » Implementation of Software Transactional Memory for F# said:

January 17, 2008 7:33 PM
 

Matthew Podwysocki's Blog said:

Since I have a rather long commute to and from work, I have the opportunity to get caught up on all sorts
February 7, 2008 4:59 PM
 

Rick's Development Wonderland said:

Last night the wmassdevs Group hosted a Clojure presentation by Rich Hickey . Clojure was born of Rich's
March 21, 2008 7:50 AM
 

http://cs.hubfs.net/blogs/hell_is_other_languages/archive/2008/01/16/4565.aspx said:

March 26, 2008 1:55 AM
 

Rick Minerich's Development Wonderland said:

Synopsis I gave an hour long talk today, here at Atalasoft, on Concurrency in F# . It featured some slides
April 25, 2008 12:51 PM
 

Hell Is Other Languages said:

Introduction
F# has the async computation expression for writing parallel programs. Async achieves concurrency...
August 6, 2008 6:11 AM
Anonymous comments are disabled

This Blog

Post Calendar

<January 2008>
SuMoTuWeThFrSa
303112345
6789101112
13141516171819
20212223242526
272829303112
3456789

Syndication

Powered by Community Server, by Telligent Systems