Using immutable data structures enables equational reasoning and assures that update operations are atomic. However, purely immutable interfaces are not always feasible. For instance a RESTful service typically needs to propagate the effects of update operations to other clients. In this post I describe a strategy for constructing mutable service interfaces on top of purely immutable data structures. I’m using F# to exemplify.
Building a Player Service
Consider the task of designing a service for managing player information for some online game. The core API needs to support methods for retrieving and updating a set of players given some search criterion; Here is simple interface capturing the requirements:
type PlayerService = {
FindPlayers : SearchConfiguration -> list<Player>
UpdatePlayers : SearchConfiguration -> (Player -> Player) -> unit
}
More specific operations such as adding or removing player credits can be implemented in terms of UpdatePlayers
and potentially exposed via a web-service API.
Also assume the following set of constraints:
- All invocations of
UpdatePlayers
are atomic and executed in sequence. - Invocations of
UpdatePlayers
do not blockFindPlayers
. FindPlayers
is non-blocking.FindPlayers
always returns data from a consistent state.
(1) is crucial in order to make sure that effects of update operations are predictable. If one update operation adds 100 credits to a player and another one removes 40 from the same player, the net result should be plus 60 credits once both operations have been completed. (2) assures that consumers will still be able to read data from the service without waiting for potentially slow update operations or any other FindPlayer
invocations to complete. (4) means that data from an incomplete update should never be accessible. For example, transferring credits from one player to another within one update operation must not allow a service consumer to observe an intermediate state where the total number of credits is any different from before and after the request was handled. Note that it is not required that update operations have an immediate effect on subsequent FindPlayers
requests, this is sometimes referred to as eventual consistency.
The above interface is apparently non-pure since UpdatePlayers
impacts future behavior of FindPlayers
as a side effect.
What is a good data-structure for this problem? It may seem intuitive to pick a traditional mutable dictionary (such a a HashTable
) for maintaining the set of players when realizing the interface. However, there are several challenges with such approach; Assuring atomicity of update operations would require a locking strategy, but blocking FindPlayers
while waiting for the lock to be released is not feasible because of constraint (2). Ignoring the lock (dirty reading) is also not an option since it may violate (4) by leaking data from an inconsistent state.
An immutable version
Rather than addressing these problems in a mutable setting, let’s start by designing a pure interface with the hope of later being able to convert it to a mutable version:
type PlayerServicePure =
{
FindPlayers : SearchConfiguration -> list<Player>
UpdatePlayers : SearchConfiguration -> (Player -> Player) -> PlayerServicPure
}
The only difference from the previous API is that UpdatePlayers
now returns the next state of the service.
The following example uses a standard FSharp Map for storing players. Simplistic definitions of SearchConfiguration
and Player
are also provided:
// Represent a player.
type Player = {Name : string ; Credit : int }
// Search configuration is simply a list of names.
type SearchConfiguration = { Names : list<string> }
// This is the pure interface for a player-service.
type PlayerServicePure = {
FindPlayers : SearchConfiguration -> list<Player>
UpdatePlayers : SearchConfiguration -> (Player -> Player) ->PlayerServicePure
}
// Builds a service object given a set of players.
let buildPlayerServicePure players =
let rec build playerMap =
{
FindPlayers = fun sc ->
sc.Names
|> List.choose (fun name -> Map.tryFind name playerMap)
UpdatePlayers = fun sc f ->
(playerMap, sc.Names)
||> Seq.fold (fun pMap name ->
match Map.tryFind name pMap with
| Some p -> Map.add name (f p) pMap
| None -> pMap
)
|> build
}
players
|> List.map (fun name -> (name, {Name = name; Credit = 0}))
|> Map.ofSeq
|> build
Given a sequence of players, buildPlayersService
constructs a service object. Here are some examples of how to program with the immutable service:
// Build a new service
let service = buildPlayerServicePure ["John"; "Jane"; "James"]
// Find all John players
let johns = service.FindPlayers ({Names = ["John"]})
// New service with richer Johns
let service = service.UpdatePlayers {Names = ["John"]} <| fun p ->
{p with Credit = p.Credit + 100}
Update operations are atomic by definition since the only way of noticing the effect of an update is via the next service state returned by UpdatePlayers
. Any other problems requiring locking or synchronization do not apply.
Designing and reasoning about the pure implementation is straight forward but we are not able to directly build a web-service interface on top of it. The fundamental problem is that any update operation requested by one user would not be visible to others users.
Deriving a service wrapper
To implement the original interface we wish to define a transformation from values of PlayerServicePure
to PlayerService
values also satisfying the constraints given above. The strategy is to capture the latest state of the service by a mutable reference and replace it with newer versions as update operations are processed. Here is an initial attempt:
let buildService players =
let service = ref buildPlayerServiceIM players
{
FindPlayers = fun sc -> service.Value.findPlayers sc
UpdatePlayers = fun sc ->
service := service.Value.UpdatePlayers sc
}
FindPlayers
always returns the values of the current service object. The problem with this implementation is the definition of UpdatePlayers
. If a second invocation of the function is performed before the first update operation terminates, the updates will be applied to the same service object and the first one to terminate will be discarded. This violates constraint (1).
To accommodate for this we must synchronize the updates so that subsequent operations will be placed in a queue in case the service is currently updating.
There is already built in support for this in F# via the MailBoxProccing
library providing message passing capabilities between concurrent computations.Here is the extended version based on a MailBox
process:
let buildPlayerService players =
let service = ref <| buildPlayerServicePure players
let updateProc = MailboxProcessor.Start <| fun inbox ->
let rec proc () =
async {
let! (sc, f) = inbox.Receive ()
service := service.Value.UpdatePlayers sc f
return! proc ()
}
proc ()
{
FindPlayers = fun sc ->
service.Value.FindPlayers sc
UpdatePlayers = fun sc f ->
updateProc.Post (sc,f)
}
Incoming update requests are now processed one at the time since the recursive call to loop is performed first after an update operation is completed. This addresses constraint number (1). All updates are processed within an asynchronous computation which ensures that invocations of FindPlayers
are not blocked, satisfying constraint (2). FindPlayers is non-blocking in accordance with (3). Thanks to the immutable underlying structure we are also guaranteed to always fetch player data from a consistent state as required by constraint (4).
Following is an example of programming with the interface:
// Build the service
let service =
buildPlayerService [ "John" ; "Jane" ; "James" ]
// Extracts credit from player with the given name.
let checkCredit name =
match service.FindPlayers {Names = [name]} with
| [p] -> p.Credit
| _ -> failwith "Player not found"
// Slowly adds some credit to a player.
let addCredit name amount =
service.UpdatePlayers {Names = [name]} <| fun player ->
System.Threading.Thread.Sleep 200
{player with Credit = player.Credit + amount}
// Runs 10 async computations each adding 1 credit to John 100 times.
// Total number of credits once completed should be 1000.
List.init 10 (fun _ ->
async {
List.replicate 100 1
|> List.iter (addCredit "John")
return ()
}
)
|> Async.Parallel
|> Async.Ignore
|> Async.Start
// Repeatedly prints the current credit of John
let rec trackJohn () =
async {
printfn "Current credit for John %A" <| checkCredit "John"
do! Async.Sleep 1000
return! trackJohn ()
}
trackJohn ()
|> Async.Start
The addCredit
function is deliberately made slower in order to test that the implementation can handle queued update operations. Ten async computations each invoking addCredit
100 times, increasing the amount of credit for the player John. In the meanwhile another thread repeatedly reports the current credit status. Here is some output of running the program:
Current credit for John is 48
Current credit for John is 95
...
Current credit for John is 918
Current credit for John is 967
Current credit for John is 1000
Current credit for John is 1000
Current credit for John is 1000
As seen, all update operations were accounted for. That’s not a guarantee but at least an indication that the implementation is sound.
By introducing a thin service layer on top of a pure immutable interface, much of the headache typically involved in designing thread-safe mutable data structures can be avoided. Using this strategy, business logic may be expressed solely in terms of pure functions. Although this example is may seem specific, my experience is that the pattern is general and applicable on many types or services providing CRUD interfaces.