I have written a library for using software transactional memory in F#. The library exposes memory transactions as a monad using F#’s new computation expression feature. It can be downloaded here.
Software transactional memory (STM) is an approach to concurrent programming that avoids the use of traditional, error-prone locks to control access to shared memory. Instead of using locks a programmer specifies sections of code that will make atomic changes to shared memory.
The design and implementation of this F# library is blatantly copied from the paper Composable Memory Transactions. I am enormously grateful to the authors of this paper whose clear and precise explanation made the implementation of this complicated piece of software almost trivial.
The low-level STM machinery is implemented in the file stm.cs. It is written in C# mainly because I was working off existing code I’d written some time ago, but there’s no reason why it could not be written in F#. It is all managed code and internally uses monitors for synchronization. It currently uses a very coarse grained lock meaning that only one transaction can commit at once and every waiting thread is resumed when a transaction completes, but this can be improved with more work. The implementation follows the description in the paper.
The F# monadic interface is in the file stm.fs. It imports definitions from the C# assembly and defines the monad type Stm<'a> with corresponding builder class to enable computation expression syntax. An issue with the implementation of STM systems is how to access the transaction log to make reads and writes during a transaction. Two common approachs are to use thread-local storage which is slow, or implicitly pass the transaction log as an extra parameter to every function, which requires code generation. The advantage of using computation expressions (or monadic do notation) is that the transaction log is explicitly passed between functions by the programmer’s code, thereby avoiding compiler modification, but also avoiding the need for the programmer to manually manage the transaction log parameter passing scheme as well.
With these two layers in place one can write STM code in F#. Transacted memory locations have the type TVar<'a>. This is a reference cell like Ref<'a> in F# but its value can only be read and written from inside a transaction. A TVar is created using the newTVar function.
let x = newTVar 0
val x : TVar<int>
This code is similar to let x = ref 0, but being a TVar instead of a Ref, the location x can be safely shared between multiple threads.
A transaction is defined using a computation expression. A transaction is a value of type Stm<'a> which represents a computation that when executed will make an atomic change to the values of zero or more TVars and then return a value of type 'a. The type system ensures at compile-time that all reads and writes to a TVar only take place inside a transaction. The functions readTVar and writeTVar read and write a TVar respectively. They are analogous to let v = !x and x := v for regular reference cells.
val readTVar : TVar<'a> -> Stm<'a>
val writeTVar : TVar<'a> -> 'a -> Stm<unit>
Let’s define a function that increments the value stored in a TVar.
let incr x =
stm {
let! v = readTVar x
do! writeTVar x (v+1)
return v }
val incr : TVar<int> -> Stm<int>
The syntax stm { … } is an F# computation expression using the 'stm' monad, which means that this is a memory transaction (as opposed to an async work unit or something). The value of the location is read and stored in v. The incremented value is then written back to the location and v is returned as the result of the computation. When executed the read and write happen as an atomic block which means logically that an interleaving thread won’t access the location between the read and write.
A transaction is executed by the atomically function.
val : atomically : Stm<'a> -> 'a
incr x |> atomically
val it : int = 0
incr x |> atomically
val it : int = 1
The composability of STM can be seen by the ability to compose incr into a larger transaction that increments a location twice.
let incr2 x =
stm {
let! _ = incr x
let! v = incr x
return v }
val incr2 : TVar<int> -> Stm<int>
incr2 x |> atomically
val it : int = 3
Imagine that incr was part of the public interface of a module. If incr used a lock to perform the read and write atomically then it would be impossible for users of the library to write incr2 unless the module also exposed its locks through a public interface, which would make programming intricate for the user.
If you see any Haskell STM code, it is straightforward to translate this into F#. For example the paper gives the implementation of an MVar as a code example. The paper describes an MVar as
An MVar is a mutable location like a TVar, except that it may be either empty, or full with a value. The takeMVar function leaves a full MVar empty, and blocks on an empty MVar. A putMVar on an empty MVar leaves it full, and blocks on a full MVar. So MVars are, in effect, a one-place channel.
MVars can be implemented in F# STM like so.
type MVar<'a> = TVar<option<'a> >
let newEmptyMVar () = newTVar None
let newFullMVar v = newTVar (Some v)
let takeMVar mv =
stm { let! v = readTVar mv
return! match v with
| Some a ->
stm { do! writeTVar mv None
return a }
| None -> retry () }
let putMVar mv a =
stm { let! v = readTVar mv
return! match v with
| None -> writeTVar mv (Some a)
| Some _ -> retry () }
val newEmptyMVar : unit -> MVar<'a>
val newFullMVar : 'a -> MVar<'a>
val takeMVar : MVar<'a> -> Stm<'a>
val putMVar : MVar<'a> -> 'a -> Stm<unit>
The retry operation is used to re-execute a transaction from the beginning. There is no point re-executing the transaction immediately because no memory locations will have changed. So the retry operation blocks until some memory changes that might allow the transaction to make further progress. This is how blocking is implemented with STM.
Another feature of STM that makes it more composable than locks is how it handles blocking. The get/put MVar operations above are blocking operations. If an MVar library was written using locks and non-blocking versions of these operations were required then they would have to be implemented by the library developer and exposed as part of the library’s public interface. Not so with STM: if the MVar interface only exposes blocking get/put operations then it is possible for the user of the library to write the non-blocking versions, and vice-versa.
let tryTakeMVar mv =
orElse
(stm { let! v = takeMVar mv
return Some v })
(stm { return None })
val tryTakeMVar mv : MVar<'a> -> Stm<'a option>
The function tryTakeMVar is the non-blocking version of takeMVar; it returns an option value depending on whether a value was available. The key here is the orElse operation. This function takes two transactions and tries to execute the first. If the first fails with a retry/block then it executes the second. So in this example if takeMVar retries/blocks then it will return None.
Also included in the code download is the Santa program, a singly-linked list queue, and a fixed-length array queue.
Of course all these great benefits of STM come at a price, and that price is performance. An STM program is probably always going to be slower than a lock-based program. This STM implementation was engineered for correctness and design of the monadic interface and so probably has really bad performance.