Introduction

Over the last two and a half years, FunFair has been developing a complete end-to-end blockchain-based product with State Channels at its core, deploying the first prototype implementation in June 2017, and culminating in the release of one of the first, if not the first, commercial applications based on this technology in August 2018.

This article is the start of us revealing in detail how our core technology works, with a view to making more parts of it public, and eventually open-source.

The very core of the project is the Ethereum Solidity contracts that comprise our particular implementation of State Channels, called Fate Channels – due to the embedding of a deterministic-but-not-predictable RNG.  The code we wrote, and which is currently deployed, is complex, nebulous and hard to understand; we have spent some time rewriting this from scratch, refactoring with ease of understanding as a primary factor, but with the added benefit of making it much clearer how the platform could be extended – and it’s this code that we present in this article.

Up front, we should make it clear that this is very much a v1.0 implementation of the overall State Channel concept.  It’s bi-directional and two-party; it is feature complete and fully disputable, but we’ve not yet begun to address other active areas of research such as virtual channels and hub models.  It was very much our focus to get something practical and self-contained released into the wild; however, the refactor has been written as generically as possible to make it clearer how these things – and other parallel research such as Guardians/Watchtowers (eg PISA) – could be integrated.

We are also fully aware that other State Channel developers are starting to collaborate to try and standardise some aspects of this technology. This document serves two additional purposes – to cross reference against any standards as they emerge to see if we should adopt them, but also to see if any of the concepts here should be included in the standards themselves.

Finally, we stress that the code presented is a Reference Implementation.  It has not been thoroughly tested in the wild – this we will do in due course, and move our existing product over to use this code.  Do not use this code in production as is at this point!  We will update and refine the code and publish it when it is ready.

State Channels Recap

This article does assume some familiarity with the concept of State Channels.  There are many good papers and discussions about this which are worth reading first.

We define a State Machine as code which implements the function:

 advanceState(state, action) => newState

Which takes a state, an action on that state and returns a new state.  It has no side-effects.
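To make the definition concrete, here is a minimal sketch of a State Machine written as a side-effect-free Solidity function. The contract name and the choice of a counter as the state are purely illustrative assumptions, and the compiler version (0.8) is also an assumption made to keep the example short.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical example: the state is a single number and the action is
// the amount to add to it.
contract CounterStateMachine {
    // `pure` means the function reads no storage, writes nothing and
    // emits no events, so it has no side-effects.
    function advanceState(uint256 state, uint256 action)
        public
        pure
        returns (uint256 newState)
    {
        newState = state + action;
    }
}
```

Because the function is pure, both participants can evaluate it off-chain and are guaranteed to get the same answer the chain would give.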

A State Channel, then, is a protocol which allows two or more parties to advance a shared state using a State Machine; and the protocol ensures that this can happen cooperatively off-chain, but backed by the same security properties as if it were advanced entirely on-chain.

In a traditional setup, some funds (hereafter used generically to mean “something of value”) are committed on-chain to the State Channel by multiple parties.  The state is then advanced interactively off-chain, and at some future point a final state is mutually agreed, and this is used on-chain to release the funds back to the participants.

Much of the effort in developing a pragmatic implementation of State Channels goes into dealing with what happens when things go wrong.  State Channels require the participants to be co-operative and live – what happens when they are not?  We talk about Disputes and Dispute Resolution in depth in this article.  It should be noted that the use of the word “Dispute” itself is currently in dispute in the research community – a better terminology may evolve in due course!

There is various ongoing research into things like virtual channels – channels within channels – which mean that commitments can be made on-chain once in a more general, long-term way, and then individual channels can be opened and closed using these commitments off-chain.  This approach is not discussed here, but is an extension of this basic concept.

Other research includes not limiting a State Channel to a specific State Machine – again, this is not discussed here, although I’ll touch on where our system could be extended to include this.

Code

The code referred to in this article can be found here. There are some notes on the code, along with some additional notes about the rest of the article, at the end.


Opening a Channel (Part 1)

We have two participants that want to open a channel.  What happens next? Two things need to happen for the channel to be considered open:

  • The relevant funds have been locked, and
  • Both parties have signed a commitment to open the channel

Can you combine these two things?  Yes!  Can you do the whole lot in one transaction?  Also, yes!  However, for the sake of code simplicity our openStateChannel() function deals with the second part, and assumes that the first has already been done, to allow different implementations of locking funds – which is why the function is internal.  We’ll cover an implementation of funding the channel later.

Our openStateChannel() function in StateChannel.sol looks like this:

 function openStateChannel(bytes memory packedOpenChannelData) internal returns (bool)

which isn’t particularly revealing.  The (packed) OpenChannelData contains all the information necessary to define and open a State Channel, and the primary function of the Open Channel call is to validate that this information makes sense, and then to store a permanent record of the opening of the channel on the Chain.  Note that we are also assuming that it has been verified that both parties have signed this data in advance.

So, what information do we need?

    struct StateChannelParticipant {
        address participantAddress;
        address delegateAddress;
        address signingAddress;

        uint256 amount;
    }

    struct FunFairStateChannelOpenData {
        bytes32 channelID;
        address channelAddress;

        StateChannelParticipant[2] participants;
        address stateMachineAddress;
        uint256 timeStamp;
        bytes32 initialStateHash;
        Signature[2] initialStateSignatures;
        bytes packedStateMachineInitialisationData;
    }

The channelID uniquely identifies this State Channel.  The State Channel contract supports many concurrent channels – an alternative approach is to deploy a new contract each time, but that is very expensive in terms of gas for contract deployment.  To keep things general, we allow the two parties to just pick a random, unused ID here, rather than trying to come up with something unique in the contract itself.

We need to explicitly identify the address of the State Channel contract.  See the note at the end of this document about Preventing Replay Attacks for more detail.

We need to identify the two participants of the channel; here you’ll see something interesting – we actually keep three addresses per participant.  One is the actual main Ethereum account – one for which we assume the participant keeps the private key safe.  Secondly, we have an ephemeral signing key – this is generated on the fly at the time of channel opening by the clients, and means that, for example, we don’t need Metamask to pop up boxes to sign state transitions.  Finally, we have an address where funds should be sent at channel close time.  Note that throughout the code we allow things to be signed by either the ephemeral signing key or the participant’s main account, as a backstop in case ephemeral data is lost.

We need to know the value of the funds that each participant has committed to the channel.

We also include a timestamp here.  This may not seem very generic, but it’s very practical!  State Channels require participants to be live (ie online and responding) throughout the existence of the channel.  If you can get to a point where one participant signs everything necessary to open a channel, and the other party gets access to this, they could wait for a long time – weeks or months – and then use that data to open a channel, hoping that the other participant doesn’t notice, which would be bad.  So, we limit the time that these requests are valid.
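A minimal sketch of such a validity check follows. The contract name, the function name and the one-hour window are assumptions for illustration; the actual window used is not stated here.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract OpenRequestExpirySketch {
    // Assumed validity window, chosen only for illustration.
    uint256 public constant OPEN_REQUEST_VALIDITY = 1 hours;

    // Returns true if a request stamped with `timeStamp` may still be
    // used to open a channel.
    function isOpenRequestFresh(uint256 timeStamp) public view returns (bool) {
        // Reject requests that claim to come from the future.
        if (timeStamp > block.timestamp) {
            return false;
        }
        // Reject requests older than the validity window.
        return block.timestamp - timeStamp <= OPEN_REQUEST_VALIDITY;
    }
}
```

Block timestamps are only approximately accurate, so a window like this should be generous compared with the expected time to get a transaction mined.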

Next, we have the address of the State Machine to be used in the channel.  We have consciously gone down the road of having the code for these deployed to the chain ahead of time.  This has two primary advantages – the code only needs to be deployed once, and the code can be independently verified by the participants (or someone they trust) before a channel is opened.  It also makes the process of advancing the state very straightforward.

Finally, we need the signed hash of the initial state of the State Channel.  Eh?  Let’s have a look at state.

State (Part 1)

At the top of this article, I mentioned that a State Channel is something that uses a State Machine to advance the state of the channel.  It is highly desirable that the State Channel neither depends on, nor even needs to know the detailed contents of this state, so that different State Machines can be added to the system.  In addition, the State Machines themselves shouldn’t really need to know anything about the State Channel, or even that it’s a State Channel that’s calling them.

So, we now start to unpeel the nested layers of our state “Onion”.  The State Channel keeps the following running state:

    struct StateContents {
        bytes32 channelID;
        address channelAddress;
        uint256 nonce;
        uint256[2] balances;
        bytes packedStateMachineState;
    }

    struct State {
        StateContents contents;
        Signature[2] signatures;
    }

We have the Channel ID and Address, for identification and replay attack prevention.  We then have a nonce, balances and a packed State Machine State.

In Solidity 0.6, the ABIEncoderV2 is finally no longer experimental – this is a huge help for us.  What this basically means is that we can pass around data that we don’t understand for it to be decoded by something that does.

The State Machine might need to keep track of things like where the pieces are on a Chess board, or which cards have been dealt in Poker, and this is all part of the overall signed state.  Encoding it like this means that the outer layer – the State Channel – can validate the signatures (which the State Machine can’t), nonce progression etc, and pass on the packed state to the State Machine when it needs to.

Progressing the nonce is also something that needs to be done outside the State Machine, as it is core to the Dispute Resolution – see later.

Finally, we store the participants’ balances here.  We have spent some time debating whether or not this could be held in the State Machine and returned on demand, but I think it’s just cleaner that it lives here.
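As a rough sketch of how the outer layer can hash and check a signature on a state without understanding the packed State Machine state, consider the following. The layout of the Signature struct (v, r, s), the helper names and the use of a raw, unprefixed hash are all assumptions; the article does not show how Signature is defined or exactly what is signed.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract StateSigningSketch {
    // Assumed layout of the Signature type.
    struct Signature {
        uint8 v;
        bytes32 r;
        bytes32 s;
    }

    struct StateContents {
        bytes32 channelID;
        address channelAddress;
        uint256 nonce;
        uint256[2] balances;
        bytes packedStateMachineState;
    }

    // Hash the ABI-encoded contents. The packed State Machine state is
    // treated as opaque bytes and is never decoded here.
    function hashStateContents(StateContents memory contents)
        public
        pure
        returns (bytes32)
    {
        return keccak256(abi.encode(contents));
    }

    // True if `signature` over `stateHash` was produced by `expectedSigner`.
    function isSignedBy(
        bytes32 stateHash,
        Signature memory signature,
        address expectedSigner
    ) public pure returns (bool) {
        address recovered =
            ecrecover(stateHash, signature.v, signature.r, signature.s);
        // ecrecover returns address(0) on failure, so never accept it.
        return recovered != address(0) && recovered == expectedSigner;
    }
}
```

A caller that accepts either the ephemeral signing key or the main account would simply call isSignedBy twice, once per address.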

Opening a Channel (Part 2)

To complete the information needed to open a channel, we need the signatures of both participants of the initial state.  To close or to dispute a channel, you need to start from a point of a co-signed state, and requiring this here means that we can guarantee that it exists and has been signed.  How do we generate the initial state?

Generating the Initial State

It turns out this is quite straightforward.  We provide a function which takes the OpenChannelData and returns the initial state.  It:

  • Sets the channel ID, and the initial balances to those provided in the OpenChannelData
  • Sets the channel address to address(this)
  • Sets the nonce to zero, and
  • Calls the State Machine to get its initial state, packed by the ABI Encoder

You’ll notice that the OpenChannelData also contains information here to be passed into the State Machine at this point – this is used as initialisation data for the State Machine to generate its initial state.  We’ll revisit this later when we look at State Machines in more detail.

This function is public and has no side-effects.  The clients call this code off-chain to generate the initial state, which they sign.
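The steps listed above can be sketched roughly as follows. The interface name, the getInitialPackedState method and the flattened parameter list are hypothetical; the real function takes the OpenChannelData, and the real State Machine interface is not shown at this point in the article.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical interface: the method name and signature are assumptions.
interface IInitialisableStateMachine {
    function getInitialPackedState(bytes calldata packedInitialisationData)
        external
        view
        returns (bytes memory);
}

contract InitialStateSketch {
    struct StateContents {
        bytes32 channelID;
        address channelAddress;
        uint256 nonce;
        uint256[2] balances;
        bytes packedStateMachineState;
    }

    // Builds the initial state; a view function, so it can be called
    // off-chain by the clients before they sign.
    function getInitialStateContents(
        bytes32 channelID,
        uint256[2] memory amounts,
        IInitialisableStateMachine stateMachine,
        bytes memory packedStateMachineInitialisationData
    ) public view returns (StateContents memory contents) {
        contents.channelID = channelID;
        contents.balances = amounts;
        contents.channelAddress = address(this);
        contents.nonce = 0;
        // The State Machine produces its own initial state, already packed.
        contents.packedStateMachineState = stateMachine.getInitialPackedState(
            packedStateMachineInitialisationData
        );
    }
}
```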

Opening a Channel (Part 3)

Finally, we can verify the signatures – this is done on-chain by generating the initial state, hashing it, comparing this hash to the one provided in the OpenChannelData, and validating the signatures on the hash.

Note that it isn’t strictly necessary for the hash to be passed in, as it can be inferred, but this is a reference implementation and it makes the code look cleaner and more readable!

Finally – we’ve verified all the data that we needed to, so we can open the channel.  This is as simple as writing a flag to contract storage to say that the channel with ID #x is open.

We do, however, do a few other things.  In this reference implementation we store the total balance of the channel on chain.  This is entirely for safety – it can always be inferred – but this code is intended to be used for many simultaneous channels, and I get paranoid about funds leaking from one channel to another; storing this effectively prevents it from happening.

We emit an event so that the clients know what’s going on.  We emit events for every on-chain transaction.

Finally, we store the hash of the OpenChannelData.  This is an interesting optimisation.  Early in development we stored this on chain in its entirety at this point – we do need pretty much all of this data (except for the timestamp, and the State Machine initialisation data) down the line.  But SSTOREs are expensive and this is quite a large structure.  It turns out that it’s much cheaper and not much more complicated to store the hash of it instead.  Then, for any subsequent contract call that needs the data (which is pretty much all of them), we pass the OpenChannelData as a parameter to the call in its entirety.  All we then need to do is to check that the hash of this data matches the stored one, and we’re good to go.  You will see code to do this at the top of every externally callable function.
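A minimal sketch of this store-the-hash pattern is shown here. The mapping and function names are assumptions, and using a zero hash to mean "no such channel" is a simplification made for the example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract OpenDataHashSketch {
    // channelID => keccak256 of the packed OpenChannelData.
    // A zero value is taken to mean that the ID has not been used.
    mapping(bytes32 => bytes32) private openDataHashes;

    // Called once when the channel is opened: one storage write,
    // whatever the size of the data.
    function _recordOpenData(
        bytes32 channelID,
        bytes memory packedOpenChannelData
    ) internal {
        require(openDataHashes[channelID] == bytes32(0), "Channel ID in use");
        openDataHashes[channelID] = keccak256(packedOpenChannelData);
    }

    // Called at the top of every later function that is handed the data
    // again as a parameter.
    function _requireMatchingOpenData(
        bytes32 channelID,
        bytes memory packedOpenChannelData
    ) internal view {
        require(
            openDataHashes[channelID] == keccak256(packedOpenChannelData),
            "Open data does not match"
        );
    }
}
```

The trade-off is larger calldata on every later call in exchange for a single 32-byte storage slot per channel.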

At last, our channel is open!

Advancing the State (Part 1)

We have an open channel in the initial state.  For anything interesting to happen, we need to be able to advance the state, ideally without making any on-chain transactions, and to do this we need an Action.

Actions

In this model, we assume that an action is taken by only one participant.  We don’t enforce that they take turns at this point (that comes later), but it does mean that an action only has one signature on it.  Our Action structure looks like this:

    uint256 constant ACTION_TYPE_ADVANCE_STATE = 0x01;
    uint256 constant ACTION_TYPE_CLOSE_CHANNEL = 0xff;

    struct ActionContents {
        bytes32 channelID;
        address channelAddress;
        uint256 stateNonce;
        uint256 participant;
        uint256 actionType;
        bytes packedActionData;
    }

    struct Action {
        ActionContents contents;
        Signature signature;
    }

It has the channel ID and contract address again for identification and replay protection.  It has the nonce of the state to which it applies (this is very important), and it contains which participant made the action (simply represented as 0 or 1).

Then we have an action type – of which there are currently only two – Advance State and Close Channel.  We went down this route because I can imagine other types of action that apply to the channel but not the State Machine even though we’ve not implemented them at this point.

Finally, we have the data that needs to be passed to the State Machine, which might be to make a chess move, draw a card or place a bet, but which the State Channel doesn’t know anything about.

Advancing the State (Part 2)

Now we have everything we need.  The State Channel exposes a public, read-only function which takes a State, an Action and the address of a State Machine, and which returns a new State.  This is called off-chain by participants to advance the state in a deterministic way.  Let’s have a look at the code:

    function advanceState(StateContents memory stateContents, ActionContents memory actionContents,
                          IStateMachine stateMachine) public view returns
                          (FFR memory isValid, StateContents memory newStateContents) {
        // advance the state using the state machine
        int256 balanceChange;
        bytes memory packedNewCustomState;

        (isValid, packedNewCustomState, balanceChange) =
                 stateMachine.advanceState(stateContents.packedStateMachineState,
                                           actionContents.packedActionData,
                                           actionContents.participant,
                                           stateContents.balances);

        // was the action valid?
        if (!isValid.b) {
            return (FFR(false, "Invalid Action"), newStateContents);
        }

        // check that the balance change is acceptable
        // this must *never* trigger

        // Does Participant #0 have enough funds?
        assert((balanceChange >= 0) || (int256(stateContents.balances[0]) >= (-balanceChange)));
        // Does Participant #1 have enough funds?
        assert((balanceChange <= 0) || (int256(stateContents.balances[1]) >= ( balanceChange))); 

        newStateContents.channelID = stateContents.channelID;
        newStateContents.channelAddress = stateContents.channelAddress;
        newStateContents.nonce = stateContents.nonce + 1;
        newStateContents.balances[0] = uint256(int256(stateContents.balances[0]) + balanceChange);
        newStateContents.balances[1] = uint256(int256(stateContents.balances[1]) - balanceChange);
        newStateContents.packedStateMachineState = packedNewCustomState;
    }

It’s pretty straightforward – it calls the State Machine to update its state, and then creates a new top-level state, with a new nonce.  You’ll notice that the actual call is a little more complex than my original definition of a State Machine.  In practice, in our implementation, it looks like this:

 advanceState(state, action, participant, balances) => (isValid, newState, balanceChange)

The State Machine needs to check that the action is actually valid for the given state (in Chess, I can’t move the King into a Check position; in Roulette, I can’t bet on 37 under any circumstances).  The participant also needs to be known by the State Channel to validate signatures and other things when the channel is in dispute, so it’s also explicitly passed in here rather than being encoded in the action passed to the State Machine.  This means that (for example) enforcing a requirement for players to act in turns is deferred to the State Machine.  Finally, as the responsibility for maintaining participants’ balances is in the State Channel, the State Machine is sent the current balances, and returns the change in balance as a result of the Action.

We require that the actions are valid, and that neither balance has gone negative, and return the new state.
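To illustrate the State Machine side of this call, here is a sketch of a deliberately trivial State Machine in which an action is "pay the other participant some amount". Everything about it is an assumption made for illustration: the contract name, the encoding of the state (a transfer counter) and of the action (a single uint256), and the name of the second field of FFR, which is not shown in the article. The sign convention follows the calling code above, where a positive balanceChange credits participant 0.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract TransferStateMachineSketch {
    // Mirrors the FFR result type used by the State Channel; the field
    // name `message` is an assumption.
    struct FFR {
        bool b;
        string message;
    }

    function advanceState(
        bytes memory packedState,
        bytes memory packedActionData,
        uint256 participant,
        uint256[2] memory balances
    )
        public
        pure
        returns (FFR memory isValid, bytes memory packedNewState, int256 balanceChange)
    {
        // Reject malformed input instead of letting abi.decode revert.
        if (participant > 1 || packedState.length != 32 || packedActionData.length != 32) {
            return (FFR(false, "Malformed input"), packedState, 0);
        }

        uint256 amount = abi.decode(packedActionData, (uint256));
        if (amount > balances[participant]) {
            return (FFR(false, "Insufficient balance"), packedState, 0);
        }

        // The State Machine's own state: how many transfers have been made.
        uint256 transferCount = abi.decode(packedState, (uint256));
        packedNewState = abi.encode(transferCount + 1);

        // Positive values credit participant 0, negative values debit them.
        // Like the calling code, this assumes amounts fit in an int256.
        balanceChange = participant == 0 ? -int256(amount) : int256(amount);
        isValid = FFR(true, "");
    }
}
```

Note that this State Machine never touches the balances itself; it only reports the change and leaves the bookkeeping to the State Channel.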

Closing a Channel

In order for a State Channel to be closed, both participants must agree – there are a number of ways this can be approached, but we’ve gone for a semantically simple one.  Both players sign an action of type Close Channel against the nonce of the desired final state.  These are then passed to the chain, along with the co-signed final state.

The contract then needs to perform a number of checks:

  • Is the channel open?
  • Is the proposed final state valid?
    • Does it refer to this contract?
    • Does it refer to this State Channel?
    • Does the sum of the participant balances add up to the amount that the channel was opened with?
  • Are the actions valid?
    • Do they refer to this contract?
    • Do they refer to this State Channel?
    • Do they refer to the nonce of the proposed state?
  • Are the signatures on the State and Actions valid?
    • And is each action signature from the correct participant?

And finally – Is the State Finalisable?  This concept is added to prevent the closure of a channel at a point that doesn’t make any sense – halfway through a game etc.  It’s determined by the State Machine, which has an additional method to return whether or not a given state is finalisable (see note).

If all these checks are passed then the channel can be closed, and the funds redistributed.
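The structural part of these checks can be sketched as a single view function. This is an illustration only: the function name and parameter list are assumptions, and the signature checks, the channel-open check and the finalisable check are deliberately left out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract CloseChecksSketch {
    uint256 constant ACTION_TYPE_CLOSE_CHANNEL = 0xff;

    struct StateContents {
        bytes32 channelID;
        address channelAddress;
        uint256 nonce;
        uint256[2] balances;
        bytes packedStateMachineState;
    }

    struct ActionContents {
        bytes32 channelID;
        address channelAddress;
        uint256 stateNonce;
        uint256 participant;
        uint256 actionType;
        bytes packedActionData;
    }

    // `openingTotal` is the total balance recorded when the channel opened.
    function isCloseRequestConsistent(
        bytes32 channelID,
        uint256 openingTotal,
        StateContents memory finalState,
        ActionContents[2] memory closeActions
    ) public view returns (bool) {
        // The state must refer to this contract and this channel.
        if (finalState.channelAddress != address(this)) return false;
        if (finalState.channelID != channelID) return false;
        // No funds may appear or disappear.
        if (finalState.balances[0] + finalState.balances[1] != openingTotal) return false;

        for (uint256 i = 0; i < 2; i++) {
            ActionContents memory action = closeActions[i];
            if (action.channelAddress != address(this)) return false;
            if (action.channelID != channelID) return false;
            // Each action must target exactly the proposed final state.
            if (action.stateNonce != finalState.nonce) return false;
            if (action.actionType != ACTION_TYPE_CLOSE_CHANNEL) return false;
            // One close action from each participant, in order.
            if (action.participant != i) return false;
        }
        return true;
    }
}
```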


Using a State Channel

Now that we’ve gone through the core concepts, how would this work in practice?

We need two participants to the channel – they can be human beings, bots, automated software – whatever, really, as long as they are able to communicate with the blockchain, and with each other.  So they’ll need access to a node, and web3 (or equivalent), and some sort of off-chain comms channel, be it HTTP, Websockets, or a bespoke TCP/IP protocol (you could use email if you really wanted – the protocol doesn’t care at all).

These participants agree to open a channel; this is basically agreeing on the contents of the OpenChannelData structure – which includes their addresses, funding, the State Machine to use (with its initialisation data), a timestamp, and finally the signed hash of the initial state (which is retrieved by calling the State Channel code off-chain).

Once both participants have signed this data, the channel can be funded (see below), and opened.  Both participants wait until they see the transaction on the blockchain, and then the channel can be considered open.

To progress the state of the channel, one party proposes an action.  This might be constructed from a User pressing a button, or an automated algorithm; either way, the specific action message is constructed (and we’ll talk about this in more detail later).  Then the participant calls the advanceState() function off-chain, passing in the previous state and the action.  If this is all valid, the code will return a new state.

The participant then signs both the action and the new state and sends them to the counterparty.

On receipt of this, the counterparty needs to verify the data.  They must check:

  • That the proposed action does indeed advance the state from their own last state to the proposed new one, by calling advanceState() themselves
  • That the signatures of the new state and the action are valid

At that point, they should sign the new state themselves.  They could send it to the first participant at this point, or they could wait and send it along with their next action – it doesn’t really matter.  However, they can’t accept an action on a state that they don’t have the counterparty’s signature on.
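The first of those checks boils down to recomputing the new state and comparing it with the one received. A minimal sketch of the comparison is given here; the helper names are assumptions, and the signature checks are not included.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract TransitionCheckSketch {
    struct StateContents {
        bytes32 channelID;
        address channelAddress;
        uint256 nonce;
        uint256[2] balances;
        bytes packedStateMachineState;
    }

    // True if the state we computed ourselves matches the state we were
    // sent. Structs cannot be compared with ==, so we compare the hashes
    // of their ABI encodings instead.
    function isSameState(
        StateContents memory computed,
        StateContents memory proposed
    ) public pure returns (bool) {
        return keccak256(abi.encode(computed)) == keccak256(abi.encode(proposed));
    }

    // True if `proposed` directly follows `previous` in the same channel.
    function isNextState(
        StateContents memory previous,
        StateContents memory proposed
    ) public pure returns (bool) {
        return proposed.channelID == previous.channelID
            && proposed.channelAddress == previous.channelAddress
            && proposed.nonce == previous.nonce + 1;
    }
}
```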

This sequence of events then repeats until the participants agree that the channel should be closed.  Again, this could be one participant pressing a button, or a pre-determined point being reached.  At this point, both parties sign a Close Channel Action, and these are sent to the chain to close the channel.  Anyone can actually submit this message to the chain, as long as everything is signed correctly, though I would imagine it would usually be one of the participants, who has received the Action from the other party, generated and signed their own, and then submits them both.

It is very important to note here that, if one participant has signed a close channel action, and sent it to the counterparty, they should under no circumstances carry on advancing state past this point.  The counterparty could use this message at any subsequent point to close the channel here – reverting any future state.

All sounds straightforward, right? What could possibly go wrong?  The answer is, predictably, almost anything, and something like 80% of the effort in State Channel development goes into dealing with this.  As a first step, it’s important to understand that the core concept of a State Channel is that it’s a Protocol.

Protocol

According to a random definition I found on the internet, a Protocol is “The accepted or established code of procedure or behaviour in any group, organisation, or situation”.

State Channels work because, between opening and closing a channel, the participants follow a Protocol – this is basically a set of rules that describe what they need to do, and is essentially the sequence of events described in the “Using a State Channel” section above.  The core rules of the protocol are:

  • If it’s my turn to act, I must do so in a reasonable time frame
  • If I receive a new state, I must:
    • Check that the on-chain code agrees with the proposed state transition
    • Check that the new state and the action are correctly signed
    • Send my signature on the new state to the counterparty

Many, many other things can happen during the course of a channel.  I could lose my internet connection and not be able to talk to the counterparty.  My computer could break in a way that means I lose access to my signed states and/or ephemeral keys.  The counterparty could start sending me incorrect data, maliciously or otherwise.

We could write special cases to deal with individual types of this error, but it’s far cleaner to simply treat everything that can go wrong with the statement “One participant is not following the protocol”.  Thanks to Jeff Coleman for stopping me going down a rabbit hole here in the early stages of our development.

To deal with this, we add a process called “Dispute Resolution”.


Dispute Resolution

When one participant is unable to progress the state of the Channel because the other participant is no longer following the Protocol, they can enter a Dispute.  As mentioned above, this is perhaps not the best name for this process, however it’s the one we’ve got for now.

From a code point of view, this sits in DisputableStateChannel.sol – this contract is derived from StateChannel.sol and lives in a separate file to make reading the code clearer, and to cleanly differentiate disputes from the normal operation of the channel.

Essentially what we’re trying to do here is to use the blockchain to force the counterparty to behave and follow the protocol again. Tom Close – with whom I discussed this approach at length – calls his implementation “Force Move”, which is another way of thinking about it.

Initiating a Dispute

The most general situation we’ll find ourselves in, as a participant, is that we’ve got a co-signed state S, and we want to advance the state to S’ by using an action A.  We generate the state transition off-chain, sign S’ and A, and send this to the counterparty.  At this point, for whatever reason, the counterparty fails to send us their signature on S’ and so we’re stuck.

Despite the fact that we feel confident that S’ is valid, we can’t do anything with it off-chain, as the counterparty has not signed it.  However, we can validate this state transition on-chain to force the state to advance.

function disputeWithAction(bytes memory packedOpenChannelData, State memory state,
                           Action memory action, State memory proposedNewState)

This function takes a co-signed state, a signed action, and a partially signed proposed new state.  Almost all of the code for this function (and the others in this section) consists of validation checks on the incoming data.  In this case we check:

  • Is the channel open and not in dispute?
  • Are the State and Action valid in the context of this channel (as per the End Channel call)?
  • Is the State correctly co-signed?
  • Is the Action correctly signed by the participant raising the dispute?
  • Is the Action of type Advance State?

We then generate a new state on-chain using the State Machine and check:

  • That the new state is the same as the proposed new state submitted by the disputer
  • That it’s correctly signed by them

We also check that the nonce of the state is strictly greater than that of any previous dispute – this is needed to prevent an edge-case where someone can repeatedly dispute the same state to grief the counterparty.

We finally check that the message was actually sent to the chain by the person making the dispute.  Almost all of the other calls in this code can be made by a third-party – if they are correctly signed off-chain.  However, I’ve not found a way to convince myself that it’s the correct approach to allow a third-party to initiate a dispute on my behalf – it’s too easy for them to maliciously grief me if I do.  I’m still pondering this – any comments are welcome!

If any of these checks fail, we throw (which I use as a generic term for terminating the execution of a contract and reverting any changes), and the dispute is essentially rejected wholesale.

If the checks pass, we place the channel into dispute.  We store some data on-chain relating to the dispute (who initiated it, the hash of the new state and the action, the nonce of the new state, and the timestamp and the block number of the transaction), and we wait for the counterparty to respond.
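A rough sketch of the on-chain dispute record and the strictly-greater nonce rule follows. The struct layout, the names and the choice to keep the last disputed nonce in the same record are assumptions for illustration only.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

contract DisputeRecordSketch {
    // Assumed layout of the data stored when a dispute is opened.
    struct Dispute {
        bool active;
        address initiator;
        bytes32 newStateHash;
        bytes32 actionHash;
        uint256 newStateNonce;
        uint256 openedAtTimestamp;
        uint256 openedAtBlock;
    }

    mapping(bytes32 => Dispute) internal disputes;

    // Intended to be called only after all validation checks have passed.
    function _openDispute(
        bytes32 channelID,
        bytes32 newStateHash,
        bytes32 actionHash,
        uint256 newStateNonce
    ) internal {
        Dispute storage dispute = disputes[channelID];
        require(!dispute.active, "Channel already in dispute");
        // The record of the previous dispute is still in storage, so this
        // stops the same state being disputed again and again.
        require(
            newStateNonce > dispute.newStateNonce,
            "Nonce not later than previous dispute"
        );

        dispute.active = true;
        dispute.initiator = msg.sender;
        dispute.newStateHash = newStateHash;
        dispute.actionHash = actionHash;
        dispute.newStateNonce = newStateNonce;
        dispute.openedAtTimestamp = block.timestamp;
        dispute.openedAtBlock = block.number;
    }
}
```

Since the first new state after opening has a nonce of at least 1, the default stored value of zero never blocks a first dispute.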

Resolving a Dispute

In an ideal world, the counterparty realises what’s going on (eg their internet connection has just sprung into life) and attempts to put things right.

This introduces another part of the State Channel Protocol:

  • I must always watch the blockchain

Failure to notice that the other participant has initiated a dispute is going to get you into trouble (see below).

However, if you do notice, you are in a position to do something – you can pull the disputer’s Action, and partially signed new State from the blockchain itself, either from the event, or from the transaction record (note – this is why we send the full proposed state into the transaction rather than just the hash, as we have found tracking events to be notoriously unreliable.  It’s not strictly necessary, but it is very practical!).

With this information, you are able to advance the state yourself, and resolve the dispute.  In order to do this, you need to generate the next state transition, to demonstrate to the State Channel contract that you are back and following the protocol.

This may involve UI for the user, and will require someone – you or a delegated authority – to post a transaction to the chain.  The User Interface flow for this needs considerable thought and is far from straightforward.

However you get there, you need another Action – A’, your signature on S’ (which the disputer has already signed), the next proposed new state S’’ (which, remember, you generate off chain), and your signature on that.  You then call:

function resolveDispute_WithAction(bytes memory packedOpenChannelData, State memory state, 
                                   Action memory action, State memory proposedNewState)

This code does lots of checks, many the same as before:

  • Is the channel in dispute?
  • Is the hash of S’ the same as stored on the chain?
  • Is the Action A’ valid in the context of this channel?
    • (Note that we don’t need to validate S’ here, as it is inherently correct due to the way it’s been validated during the initial Dispute call)
  • Is A’ of type “Advance State”?
  • Is A’ being made by the counterparty to the dispute initiator?
  • Is A’ correctly signed?

Then we generate S’’ on-chain and check

  • That it’s the same as the one submitted by the counterparty
  • That their signature on S’’ is valid

If any of the checks fail, the resolution is rejected, and nothing happens.

If they pass, the resolution is successful.  And here the magic happens! (and thanks again to Tom Close for pointing me in the right direction here.)  Because S’’ has been signed by the counterparty, and both that and A’ are available on-chain for the dispute initiator to pick up, the channel can now continue off-chain.  This is really cool!  The original initiator is now in exactly the same position they would have been in had those two state transitions happened off-chain according to the protocol, and they have all the information they need to make their next state transition and send it to the counterparty using their off-chain comms channel.

Other Resolutions

That is, however, not the only way a dispute can be resolved.  The next three can result in penalties for participants of the channel (see note at the end).

Timeout

If the counterparty to the dispute does not respond in a reasonable amount of time, the initiator (or indeed anyone) can claim a Timeout on the channel.  It is a requirement of being in a channel that you are able to respond – we call this “being live”.  The call itself is pretty straightforward:

function resolveDispute_Timeout(bytes memory packedOpenChannelData, State memory state)

All we need to do here is to check that the appropriate amount of time has passed (see note at the end), and we use the balances of the State that was used to open the dispute as the basis for closing the channel, then penalise the non-acting counterparty and close the channel.  We do need to actually pass in the State itself, though, as only the hash of this is stored on chain, and this hash needs to be verified.
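As a rough illustration of those two checks, the sketch below assumes the contract has stored the hash of the disputed State and the time at which the dispute was opened.  The names DisputeRecord, TIMEOUT_PERIOD and canClaimTimeout, the two-hour value and the compiler version are all hypothetical and not taken from our contracts; the real code decodes packedOpenChannelData and works with the full State struct.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch: the names and storage layout are assumptions.
contract TimeoutSketch {
    struct DisputeRecord {
        bytes32 stateHash; // hash of the State used to open the dispute
        uint256 openedAt;  // block timestamp at which the dispute was opened
    }

    uint256 public constant TIMEOUT_PERIOD = 2 hours; // assumed value

    mapping(bytes32 => DisputeRecord) internal disputes; // keyed by channel id

    function canClaimTimeout(bytes32 channelId, bytes memory packedState)
        public
        view
        returns (bool ok, string memory reason)
    {
        DisputeRecord storage dispute = disputes[channelId];
        if (dispute.openedAt == 0) {
            return (false, "Channel is not in dispute");
        }
        // Only the hash lives on chain, so the caller resupplies the State
        // and we verify it against the stored hash.
        if (keccak256(packedState) != dispute.stateHash) {
            return (false, "State does not match the dispute");
        }
        if (block.timestamp < dispute.openedAt + TIMEOUT_PERIOD) {
            return (false, "Timeout period has not elapsed");
        }
        return (true, "");
    }
}
```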

Challenging with a Later State

State Channels have a concept of Immediate Finality – this means that when a State has been signed by both participants it is immediately irrevocable – there’s no waiting for block confirmations etc which you need on a traditional blockchain.

This means that it’s a violation of the protocol to attempt to revert a State or go backwards.  If you are, say, at State #10, and then open a dispute from State #5 (where, maybe, you were in a better position), this is a) not allowed, and b) instantly verifiable.

All the counterparty to the dispute initiator needs to do is call

function resolveDispute_ChallengeWithLaterState(bytes memory packedOpenChannelData, State memory state)

Passing in *any* later co-signed state, and we can demonstrate that the dispute initiator has violated the protocol.  We just need to validate the signatures, check that the State is valid in the context of the channel, and the nonce of this State is strictly greater than the one used to open the dispute.

If this is true, then we can close the channel and penalise the dispute initiator.  Note that this can be particularly bad for them, as the counterparty can choose *any* subsequent co-signed state to challenge with – they can look for the one that puts them in the best position even before the penalties are applied.  So – don’t dispute with an old state!!
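To make the "strictly greater nonce plus two signatures" test concrete, here is a minimal sketch.  The CoSignedState layout, the digest format and the function name are assumptions made for illustration only; the real State carries far more data, and the real signing scheme may differ.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch: struct layout and digest format are assumptions.
contract LaterStateSketch {
    struct Signature {
        uint8 v;
        bytes32 r;
        bytes32 s;
    }

    struct CoSignedState {
        uint256 nonce;
        bytes32 contentsHash;    // stand-in for the rest of the State
        Signature[2] signatures; // one per participant, in participant order
    }

    function isValidLaterState(
        uint256 disputedNonce,
        address[2] memory participants,
        CoSignedState memory later
    ) public view returns (bool) {
        // The challenger's state must be strictly later than the disputed one.
        if (later.nonce <= disputedNonce) {
            return false;
        }
        // Assumed digest: validating contract address, nonce and contents.
        bytes32 digest = keccak256(
            abi.encode(address(this), later.nonce, later.contentsHash)
        );
        // Both participants must have signed this exact state.
        for (uint256 i = 0; i < 2; i++) {
            Signature memory sig = later.signatures[i];
            address signer = ecrecover(digest, sig.v, sig.r, sig.s);
            if (signer == address(0) || signer != participants[i]) {
                return false;
            }
        }
        return true;
    }
}
```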

Challenging with a Different Action

This final one is a bit more subtle.  If I’ve signed a state transition and sent it to you with an Action A, but subsequently opened a dispute on chain with the same state, but a different action B, this is a problem.

The reason is that you may well have made your next move, but because I haven’t sent you my signature on it yet, you can’t use it as the basis of a subsequent dispute.  This means that I could see what your action was going to be, and then change my move using an on-chain dispute.

This is particularly relevant for the way our RNG works (see below), and would effectively allow participants to know the result of the RNG ahead of time.

Consequently, we consider it a protocol violation to sign more than one action against a given state.  If the counterparty sees a dispute, and realises they have a different signed action, they can call

function resolveDispute_ChallengeWithDifferentAction(bytes memory packedOpenChannelData, State memory state,
                                                     Action memory action)

Here, other than the usual checks, we check that the hash of the action is different to the one that was used to open the dispute.  We also need to check that the Action is actually valid (eg, it’s their turn to act), and would cause a valid state transition (eg not running out of funds) – so we run the State Machine here to advance the state and see if it returns an error.  In the past we have had a separate Validate Action function, but this is the only use case left for it, so it’s cleaner just to test advancing the state.

If these tests pass, we end the channel and penalise the dispute initiator.
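A sketch of the two tests described above (a different action hash, and a trial state advance) might look like the following.  The IStateMachineSketch interface, its advanceState signature and the use of packed bytes are assumptions for illustration, not the interface of the real State Machine.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical interface: the real State Machine interface may differ.
interface IStateMachineSketch {
    function advanceState(bytes calldata packedState, bytes calldata packedAction)
        external
        view
        returns (bool ok, string memory reason, bytes memory packedNewState);
}

contract DifferentActionSketch {
    function checkChallenge(
        IStateMachineSketch machine,
        bytes32 disputedActionHash,
        bytes calldata packedState,
        bytes calldata packedAction
    ) external view returns (bool ok, string memory reason) {
        // The challenger must present an action other than the disputed one.
        if (keccak256(packedAction) == disputedActionHash) {
            return (false, "Action is the same as the disputed action");
        }
        // Rather than a separate validation call, try to advance the state
        // and treat any error from the State Machine as an invalid action.
        (bool advanced, string memory why, ) = machine.advanceState(packedState, packedAction);
        if (!advanced) {
            return (false, why);
        }
        return (true, "");
    }
}
```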

Edge Cases

Finally, there are a couple of edge cases that we need to deal with

Dispute with No Action

This is more of an optimisation than an edge case.  If I’ve made an action and you’ve sent me your signature on the new state, but then not made a subsequent action, I can use your signature on this state to raise a dispute, rather than having to do the whole state transition on-chain.

Do note, however, that if you raise a dispute this way, and the only participant who can act next is you, then you’re in trouble, as you can’t resolve your own dispute!

Agree and Close Channel

This last one is necessary, however.  Given our definition of finalisability, we say that you can’t force a participant to play on through a finalisable state.  This would be equivalent to forcing them to start a new game of Chess, say, when they wanted to stop and close the channel.

Consequently, we have a variation on resolveDispute_WithAction() – if the dispute took the state to a point where the state was finalisable, the counterparty can simply agree with the state and request that the channel is closed at that point.  Given the initiator has already signed the state at this point, they don’t need to counter-sign the channel close request.  They call:

function resolveDispute_AgreeAndCloseChannel(bytes memory packedOpenChannelData, State memory state,
                                             Action memory action)

Passing in a signed Action which is of type “End Channel” (it seemed clean to use the existing Action paradigm here), and the State.  These are checked as usual, and, if the state is indeed finalisable, then the channel is closed normally with no penalties.


Handling Funds

Funding a Channel

As mentioned above, the openChannel() method is internal and assumes that funds have already been locked in the contract.   There are many ways this can be achieved.  Our existing implementation takes advantage of the fact that our use case is with an ERC20 token.  This means that we can extend the token contract to provide in-built multi-sig capability – which is what we’ve done.  In FUNTokenController.sol you can see a function multiSigTokenTransferAndContractCall() which allows two participants to co-sign a message that transfers tokens from both of their accounts to a third account, and then calls a method on the contract at that address, afterMultiSigTransfer().

We then derive a new class from DisputableStateChannel, the catchily-named TokenMultiSigStateChannel, which implements the afterMultiSigTransfer() method.  This code very much assumes that the token transfers have been made at the point of the method call, so it is strictly locked down to require whitelisted addresses to be able to call it.

Finally, the messiest bit of the code.  The multi-sig transfer method in the token contract takes its own set of data, along with the packed data required to open the channel.  We need to manually check that the data is consistent here – we can’t open a channel with different amounts of participant balances than those that were transferred by the token contract.  It’s a bit ugly, but splitting the code out like this means that only this contract needs to understand the data structures required to perform the multi-sig transfer.

Once this has been verified, this function can simply call the openChannel() method on its parent class, and the channel is now funded and open.

This is certainly not the cleanest or most generic way to fund a channel; however it has the distinct advantage that a channel can be opened in one transaction, and we continue to believe that keeping things cheap and fast for users is a good thing.

Releasing funds from a State Channel

After closeChannel() has been successfully called, or after a dispute has been resolved which results in the forced closing of a channel, we need to return funds to the participants.

The base State Channel class does know the current running balances of each participant, but it doesn’t know what these funds actually are (tokens, raw Ether etc).  It also doesn’t necessarily know that the running balances are the right way to actually distribute the funds.

The derived Disputable State Channel class can tell if one participant needs to be penalised, but doesn’t necessarily know what those penalties should be.

So, we push this logic down to the Token Multi-sig State Channel contract.  It knows that in this case the funds are Tokens, and the address of the token contract.   Then we have a decision to make.  We (FunFair) have specific rules about how penalties are applied.  In addition, we have rules about taking some of the funds and paying them to contributing third-parties (game developers, affiliates etc).  It seems appropriate that this logic should be in the State Machine, as it’s the code that is the most specific to us.

Consequently, we define a method distributeFunds() which has this prototype:

function DistributeFunds(bytes memory packedOpenChannelData, State memory state, uint256 penalties)
             internal returns (uint256[] memory finalBalances, address[] memory finalAddresses)

…in the base State Channel contract but leave it unimplemented.  We implement it in the Token Multi-Sig State Channel contract, which

  • Calls getPayouts() on the State Machine, passing in the channel balances, the final State Machine state, and if anyone needs to be penalised
  • Receives an array of payment amounts and addresses
  • Verifies that the sum of these payment amounts totals the channel balance
  • Makes the payments

The implementation here still needs some thought.  The State Machine doesn’t inherently know the payout addresses for the participants, and there’s other data that it might need – in our case we use the initial opening balance to calculate the penalty value, so extra data needs to be passed to handle both of these things.  I’m pretty sure this can be cleaned up a lot in the future.
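For what it’s worth, the balance verification step in the list above is simple to express.  This is a minimal sketch with a hypothetical function name; it assumes the payouts arrive as two parallel arrays, as in the prototype shown earlier.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of the payout sanity check.
contract PayoutCheckSketch {
    function payoutsMatchBalance(
        uint256[] memory finalBalances,
        address[] memory finalAddresses,
        uint256 channelBalance
    ) public pure returns (bool) {
        // Every amount needs a destination address.
        if (finalBalances.length != finalAddresses.length) {
            return false;
        }
        uint256 total = 0;
        for (uint256 i = 0; i < finalBalances.length; i++) {
            total += finalBalances[i]; // checked arithmetic in Solidity 0.8+
        }
        // No funds may be created or left stranded in the contract.
        return total == channelBalance;
    }
}
```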


FunFair’s State Machine – the Fate Machine

We have spent the last two years referring to this code as a whole by the name Fate Channels.  This refactor has pushed all of the RNG implementation into the State Machine itself, leaving the State Channel cleaner and much more agnostic as to what the State Machine actually gets up to.

FateMachine.sol shows a complete implementation of a State Machine in our system – we also show a method for this code to support many games by providing a clean interface for game rules.  The Fate Machine implements a specific type of RNG-based game, where one player is playing against the house who follows a pre-determined rule set.  Casino games such as Roulette, Blackjack and Baccarat are in this category, as are most slot machines.

Randomness is notoriously tricky to achieve on a traditional blockchain – each transaction is inherently deterministic, as it needs to be computed and verified by thousands of nodes.  Our approach uses a commit-reveal mechanism:

Say we wanted to roll a dice – we need a number between 1 and 6.  One player can’t just choose a random number as they could pick whatever they wanted.  If both people chose a random number and then combined them (eg by each picking a number between 1 and 6, adding them, taking the sum modulo 6, and adding 1), we would be in a better situation.  However, both people need to swap numbers to agree on the result, and in general, this means one person revealing their number first – at which point the second can change their mind to affect the result.

Commit-reveal provides a mechanism to prevent this.  Both players choose a large random number.  Then they both compute the hash of this.  Then the hashes are swapped (it doesn’t really matter in which order).  At this point the random number is committed but unknown, assuming the hashing algorithm is good enough and the source numbers are large enough (we use the standard Ethereum keccak256 hash function, and the random numbers are 256 bits long).

Then both players reveal the number that they hashed.  Both players can verify that the revealed numbers hash to the committed hashes, and can generate the final output of the RNG.  We do this by creating a 512 bit number by concatenating the two provided numbers, taking the hash of this, and then modulo-ing to our specific needs.
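In Solidity terms, the verify-and-combine step could be sketched roughly as below.  The contract and function names are hypothetical, and the order in which the two reveals are concatenated is an assumption; the real Fate Machine may order or encode them differently.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of a single commit-reveal round.
contract CommitRevealSketch {
    // A reveal is valid if it hashes to the previously swapped commit.
    function verifyReveal(bytes32 commit, bytes32 reveal) public pure returns (bool) {
        return keccak256(abi.encodePacked(reveal)) == commit;
    }

    // Concatenate the two 256-bit reveals into 512 bits and hash the result.
    function combine(bytes32 revealA, bytes32 revealB) public pure returns (uint256) {
        return uint256(keccak256(abi.encodePacked(revealA, revealB)));
    }

    // Reduce the combined value to a die roll between 1 and 6.
    function rollDie(bytes32 revealA, bytes32 revealB) public pure returns (uint256) {
        return (combine(revealA, revealB) % 6) + 1;
    }
}
```

Neither party can steer the result, because each reveal was fixed by its commit before the other reveal was known.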

This process is governed by the rules of the State Machine and the State Channel, and as such becomes part of the protocol – players are required to reveal their numbers in turn, and it is required that the numbers hash correctly.  Failure to do this (either failing to reveal, or revealing the wrong number) is a breach of protocol, and the dispute mechanism above kicks in.

Also, it is important to ensure that the logic of the games means that nothing significant can happen after the RNG has been revealed.  If a player is making an action (such as choosing the number of a dice to bet on) this has to have happened irrevocably beforehand.

As written above this system works but is fairly inefficient, and chatty.  The dice game might look like:

  • (Player) – I’d like to bet 100 tokens on the number 6
  • (House) – OK! (the House is never obliged to take a bet)
  • (Player) – my commit is “0x…….”
  • (House) – my commit is “0x…….”
  • (Player) – my reveal is “0x…….”
  • (House) – my reveal is “0x…….”

So that’s 6 messages – and six state transitions – to roll the dice.  We can do better.

The first part is by using a chain of hashes.  Each participant generates one random number and hashes it many times – we use 10,000 by default – keeping each of the intermediate hashes in a list.  Working on the assumption that the keccak256 hash of a 256-bit number is sufficiently random, we can commit-reveal lots of random numbers by making the commit of the next number the reveal of the previous one.
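A minimal sketch of the hash chain idea follows.  The function names are hypothetical; in practice the chain is generated off-chain by each participant, and chainTip is only written in Solidity here to keep the example in one language.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of a hash chain used for repeated commit-reveal.
contract HashChainSketch {
    // Hash `seed` `length` times; the result is the first commit in the channel.
    // Illustrative only: this would normally be computed off-chain.
    function chainTip(bytes32 seed, uint256 length) public pure returns (bytes32 h) {
        h = seed;
        for (uint256 i = 0; i < length; i++) {
            h = keccak256(abi.encodePacked(h));
        }
    }

    // Check a reveal against the current commit.  On success the reveal
    // itself becomes the commit for the next random number.
    function consumeReveal(bytes32 currentCommit, bytes32 reveal)
        public
        pure
        returns (bool ok, bytes32 nextCommit)
    {
        if (keccak256(abi.encodePacked(reveal)) != currentCommit) {
            return (false, currentCommit);
        }
        return (true, reveal);
    }
}
```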

In practice, this means that when the channel is opened, each participant supplies the last hash that they generated as the first commit in the channel.  Then each time they need a new number they just reveal the previous one in the list.  The counterparty can verify that this hashes to the commit, and we can do this any number of times until we run out of random numbers.  The sequence now looks like this:

  • Player and House each pre-commit the last hash in their list
  • (Player) – I’d like to bet 100 tokens on the number 6
  • (House) – Ok
  • (Player) – my reveal is “0x……”
  • (House) – my reveal is “0x……”

…which is a one-time setup and four steps – this is better.   If we didn’t have the requirement for the House to be able to refuse a bet, this could be two steps, as the player could simply say “I’d like to bet 100 tokens on the number 6 and my reveal is “0x…..” “.  At this point the House – knowing its own random number – would be able to determine the outcome of the bet, so the protocol would need to enforce that they did take the bet.

Some games have multiple stages, like blackjack.  In this case, whilst the house can refuse to start a hand of blackjack, once they have begun it, they can’t back out – so for each additional card that is dealt we do indeed do this with two steps, as above – eg “I’d like to Hit and my reveal is “0x…..” “.  We call each complete game a Round.

The final optimisation is to take the whole requesting and approval of a bet out of the State Machine completely.  What we do here is that the Player makes a request (eg to bet 100 tokens on 6).  If the House wants to take that bet, they sign the request and send the signature back to the player.  The state transition logic then looks like:

  • (Player) I’d like to bet 100 tokens on the number 6, here’s the House signature proving they’re ok with this, and my reveal is “0x……”
  • (House) my reveal is “0x……”

We finally have our full RNG and gameplay down to two steps, which, for a commit-reveal scheme, is as optimal as we think it can get.

State Transition and Game Rules

In order to implement games on this mechanism, we need a system to define game rules.  For a dice game, where a player picks a number between 1 and 6, wins 6x their bet if they guess correctly, and loses their bet if they don’t – we could write a simple function like:

function resolveBet(uint256 guess, uint256 RNG, uint256 betAmount) public pure returns (int256 winLoss) {
    if ((RNG % 6) == guess) { // guess is actually between 0-5
        return int256(betAmount * 6);
    } else {
        return -int256(betAmount);
    }
}

This looks like it would do the trick, although we’re clearly not checking for overflows here, nor that the guess is actually a number between 0 and 5.  In this game an out-of-range bet such as 7 simply always loses, but in other games an invalid bet is not always so harmless, and we need some additional work to get round this, which I’ll talk about below.

Our State Machine has, as you would expect, a State, which looks like this:

    struct FateMachineState {
        address gameContract;
        address P1SigningAddress;

        bytes32[2] previousSeeds;
        bool isFinalisable;
        bytes packedGameState;

        bool hasPendingP0Action;
        bytes packedP0Action;

        AdditionalData additionalData;
    }

In order to play a game, we need to advance the State twice.  The first transition is the player making their bet and revealing their half of the RNG.  The second transition is the house revealing their half of the RNG, the game rules being called, and the State (and participant balances) being updated as a result.

The State Machine needs to keep track of which phase of play it’s in, and needs to cache the player’s action and RNG seed so that they can be combined with the house RNG and the game rules called after the house reveal.  This code is in advanceState().

However, we also need to validate the Actions of each participant, and this is a little more complicated.  For the Player action, we need to validate:

  • That it’s their turn to act
  • That their revealed RNG seed correctly hashes to the previous one
  • If it’s the first action in a Round, that they have a valid signature from the House accepting the bet
  • That the game rules think this is a valid bet
  • That the player has enough funds to make the bet
  • That the house has enough funds to take the bet

These last three are very important.  Looking at each of them:

Every action the player makes needs to be validated.  This would seem to be self-evident – there are only certain hands you can double or split in Blackjack, for example.  But it’s too late to check this after the house has revealed its RNG, as the state has already progressed past the player making their action.  In order to avoid writing lots of code to void invalid bets, we prefer to use the protocol here.  So, the bet is validated when the player makes their action by calling a second function in the game rules, and the State Transition fails at this point if the bet is invalid.

An important aside is that neither the game rules nor the state transition logic can ever throw with a valid action, otherwise the State Channel will grind to a halt, and an innocent party will end up being penalised through the dispute process.

We need to check that the player has enough funds to make the bet.  This is a bit more subtle, though, as in a multi-stage game such as Blackjack, they might need more funds midway through the game round.  Given our State Channels implementation doesn’t allow funds to be added to an existing channel yet, we check that the player has enough funds to cover the maximum additional amount that a round of a game might ask them to commit.  For optimisation, this amount is returned by the game rules as part of the process of validating the bet.

We also need to check that the House has enough funds to pay out if the player wins.  Here we need the maximum that the player might win, including taking into account all the possible actions (eg doubling, splitting in Blackjack) they might do during the course of a game round.

It is important that the game rules calculate these two numbers accurately!
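For the simple dice game shown earlier, the bet validation and the two funds checks could be sketched as follows.  The function names and return shape are assumptions for illustration; the maximum win of six times the bet simply mirrors the resolveBet() example above.  Note that the validation returns an error rather than reverting, in keeping with the rule that game rules should never throw.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of bet validation for the dice example.
contract DiceValidationSketch {
    // Validates the bet and reports the most each side could need to pay.
    function validateBet(uint256 guess, uint256 betAmount)
        public
        pure
        returns (bool ok, string memory reason, uint256 maxPlayerLoss, uint256 maxPlayerWin)
    {
        if (guess > 5) {
            return (false, "Guess must be between 0 and 5", 0, 0);
        }
        if (betAmount == 0) {
            return (false, "Bet must be non-zero", 0, 0);
        }
        // Guard the multiplication so that validation itself never reverts.
        if (betAmount > type(uint256).max / 6) {
            return (false, "Bet is too large", 0, 0);
        }
        return (true, "", betAmount, betAmount * 6);
    }

    // Both sides must be able to cover the worst case for the whole round.
    function canCoverRound(
        uint256 playerBalance,
        uint256 houseBalance,
        uint256 maxPlayerLoss,
        uint256 maxPlayerWin
    ) public pure returns (bool ok, string memory reason) {
        if (playerBalance < maxPlayerLoss) {
            return (false, "Player cannot cover the maximum loss");
        }
        if (houseBalance < maxPlayerWin) {
            return (false, "House cannot cover the maximum win");
        }
        return (true, "");
    }
}
```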

The House action validations are much simpler.  They are only revealing their half of the RNG, so all we need to do is check that it hashes correctly.

Once both players have acted, the State Machine calls the game rules contract to determine the outcome of the bet, and updates its state appropriately, passing the change in the participant balances back to the State Channel.

Determining the Final Outcome

As discussed above, we’re still working on the best way to handle this.  For now, though, the game rules contract returns an isEndOfRound flag, which we interpret as isFinalisable for the purposes of being able to determine whether or not a channel can be closed cleanly.

The implementation of getPayouts() is an example of how you could approach this.  We show two asymmetric penalty techniques and demonstrate paying out a commission to a third party.  In reality, this code could be a lot more complicated.  I’ve not written the full implementation here to keep things simple.  You will notice an AdditionalData structure embedded in the Fate Machine state which I’ve not referred to.  In our live code, this is populated with data accumulated over the life of the channel, to allow more complex calculations to be done at the end of the channel (for example, being able to use the total amount bet or the number of bets).  This code is very bespoke to us, so I’ve not implemented it here.

Game Rules Contracts

The last step in the process is the definition of the Game Rules contracts.  We have designed these to be as straightforward as possible to write – as the intention is that there could be thousands of them, each for a different game.

The Fate Machine is passed in a game contract address in its initialisation data – this address is stored in the State.  As things currently stand, we don’t support changing game within a channel, but it would be relatively straightforward to write some new actions where the Player and the House agree to change to use a new game contract, and this would simply be updated in the state.

You will notice that the game rules contract has its own state – yet another packed set of binary data embedded in the state “onion”.  This is where it can store its own bespoke state data – eg the cards dealt in blackjack mid-round etc.  We also use this to store the results of the last bet in a more verbose way.  It might seem ok to know that a bet on roulette won or lost, but for multiple bets it would be very handy to know which won or lost, and pretty much essential to know where the ball landed, so it can be displayed to the player!

The implementation of the game rules contract here also has a provision for initialisation data to be processed.  I haven’t actually found a real-world use for this yet, but I imagine there is one out there waiting to be found.

Example Game Rules Contracts

I’ve included a few example game rules contracts to show how straightforward they can be, and how they are implemented. There’s a simple coin-flip which allows the player to bet on heads or tails. There’s a hi-lo game, to show a game with multiple actions per round, and finally a full implementation of European Roulette – which also demonstrates the technique of storing data in code rather than storage.


Jeremy Longley and the FunFair Team, August 2019


Notes – Protocol

On preventing replay attacks
A replay attack is, essentially, when a signed transaction (or data) is re-used after its initial intended use.  A simple example would be if I signed some data which said “please send Bob 10 Tokens from my account” to a contract that could interpret this, validate the signature, and transfer the tokens.  Without replay protection, Bob could then pull this signed data and the signature from the blockchain, and then call the transfer function repeatedly until he had all my tokens.  Native Ethereum transactions have replay attack prevention built in, but in State Channels, it’s signed data that’s sent, and it can often be sent by anyone – so we need to build our own replay attack prevention.
 
This is why you’ll see lots of checks that the message is for the correct State Channel, and that internal nonces, participants, and other data are all valid etc.  But beyond that, were we to upgrade these contracts at a later date, we need to prevent these messages being used against any old implementations.  Additionally, transactions made on test networks need to not be valid on MainNet.
 
The simplest, if perhaps most heavy handed, approach to this, we think, is to ensure that the address of the contract that is validating the signature is included in the signed data itself.  There may be other ways of doing this, but this seems like a straightforward solution.
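As a sketch of what this means in practice, the digest that participants sign can be built so that it commits to the validating contract’s address.  The function names and the exact fields hashed here are assumptions for illustration, not the layout used in our contracts.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of binding signed data to one validating contract.
contract ReplayProtectionSketch {
    // The digest a participant would sign for a given validating contract.
    function digestFor(
        address validatingContract,
        bytes32 channelId,
        uint256 nonce,
        bytes memory payload
    ) public pure returns (bytes32) {
        return keccak256(
            abi.encode(validatingContract, channelId, nonce, keccak256(payload))
        );
    }

    // The digest this particular deployment will accept signatures over.
    function digestForThisContract(
        bytes32 channelId,
        uint256 nonce,
        bytes memory payload
    ) public view returns (bytes32) {
        return digestFor(address(this), channelId, nonce, payload);
    }
}
```

A signature made over the digest for one deployment will not verify against another deployment, because the two digests differ even when the channel id, nonce and payload are identical.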
Finalisable vs Final
There is a mildly interesting discussion here about how to determine when a State Channel can be cleanly ended.  There are times when some thought is needed.  For example, in Chess, can a channel be closed halfway through a game when there are funds at stake on the outcome?  Can it be closed at the end of a game before another has started?
 
Our approach works by each State containing a flag isFinalisable.  This means that it’s valid for both parties to co-sign a Close Channel message on this state.  They can’t if this flag is not set.
 
Another popular approach is that the State contains a flag “isFinal”.  When this is set there can be no more state transitions, and any party able to act can unilaterally close the channel by sending a “Close Channel” action on this co-signed state without involving the counterparty.
 
These two approaches are, ultimately, versions of the same thing.  In both cases, to end a game of Chess in the middle, you’d need to implement a “Resign” function, to move the game to a finalisable, or final state.  However, I marginally prefer our approach, as it means that you can flag a state as clean to exit from without committing to it.  It’s a bit marginal – our approach takes one less state transition in general but requires one more signature.
State Transitions – view vs pure
You may have noticed that all the state transition functions here – which are required to have no side-effects – are marked as view as opposed to pure.  There is an interesting side-discussion here.  It is fundamental to this approach to State Channels that state transitions are deterministic – by which I mean that the same transition will always return the same result.  If these functions are marked view, then they can access the local storage of the State Machine (and any additional supplementary contracts), which might seem necessary if you have complex game rules that refer to look-up tables etc.  However, this means that the storage could be altered if the contract has additional functions that permit the owner (or someone else) to modify these tables after the contract has been deployed.
 
This, fairly obviously, could be very bad – if you sign a state transition, then someone modifies the state machine contract, the counterparty can challenge and your state transition would be considered invalid.  As this implementation doesn’t penalise someone for signing an invalid transition (earlier versions of our code did) – it just ignores them – the worst that can happen from a channel point of view is griefing, but the whole point of the exercise is that the rules are set in stone.
 
For this implementation, I have decided to leave them as view.  Each contract can be reviewed to see if any data in storage can be modified – if it can, don’t open a channel against that contract!  But I’m very much 50/50 on this point.  We don’t use storage for static data in our game contracts for a number of reasons, not least that it’s actually a lot cheaper in terms of gas usage to store data in code – SLOADs are expensive.
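To give a flavour of storing data in code, here is a small sketch that keeps a roulette colour table in a compile-time constant, so the lookup can be marked pure and nobody can modify the table after deployment.  The contract and constant names are hypothetical, and this is not how our Roulette contract is actually laid out.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch: a lookup table held in code rather than storage.
contract RouletteColourSketch {
    // Bit n is set when pocket n is red on a European wheel.
    uint256 private constant RED_MASK =
        (1 << 1) | (1 << 3) | (1 << 5) | (1 << 7) | (1 << 9) |
        (1 << 12) | (1 << 14) | (1 << 16) | (1 << 18) |
        (1 << 19) | (1 << 21) | (1 << 23) | (1 << 25) | (1 << 27) |
        (1 << 30) | (1 << 32) | (1 << 34) | (1 << 36);

    // Pure: reads no storage, so the result can never change after deployment.
    function isRed(uint256 pocket) public pure returns (bool ok, bool red) {
        if (pocket > 36) {
            return (false, false); // report invalid input rather than reverting
        }
        return (true, ((RED_MASK >> pocket) & 1) == 1);
    }
}
```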
Co-signatures vs single-signing
This implementation requires each state to be signed by both parties. In the specific case where the channel rules define a specific order of whose turn it is to act next, and that only one specific participant can act next, then it is possible to construct an alternative approach where only their signature needs to be validated against a specific state transition in the case of a dispute. Again, this doesn’t make a huge amount of difference; the only place it would is in a channel with multiple participants, where the dispute initiation code would need to check back multiple state transitions until it got to one where the disputer had signed it. This could end up being quite expensive. I’m not aware of any particular research (although I imagine there is some) into >2 participant channels (most of the work around this is focusing on virtual channels or similar).
 
This also has similar implications for the routine closing of a channel.
 
We’ve found implementing co-signing to be pretty trivial from an application point-of-view, and I’m very happy to keep it for simplicity.
On Penalties
The sizing of penalties probably warrants an article of its own. At a minimum, a misbehaving party should be penalised at least as much as they would have gained if the counterparty hadn’t successfully challenged. However they can be asymmetric depending on use case. At FunFair we require more of the House – due to the design of casino games, in the long term they should win overall. Consequently the price they pay for this is being available for players. We currently penalise the House the entire contents of the channel if a challenge is successful, and we penalise the players much less.
On Timeout Times
These can also be asymmetric; again we place stricter requirements on the House (ie a shorter timeout window). In determining what’s correct for you, you really need to understand how people will be interacting with channels. I personally think they need to be either short (a few hours), or long (several days) – there’s a spot in the middle – something like 24 hours – where it’s too long for short-lived channels, and too short for long-lived channels. Ask yourself the implications of someone’s phone battery running out (fixable in a few hours) vs someone going on a weekend break and forgetting their laptop (or that they’re even in an open channel).
Can you embed Actions in the State Machine State?
This was suggested to me recently. Yes you can. If S and A produce a new state S’, then if S’ actually contains A (ie how it got here from S), then your state validation only needs to check that S transitions to S’ as it can pull A from S’. This has the advantage of removing one signature from the process, and in fact this is how the House RNG reveal worked in our initial implementation. This is semantically identical to having the action separate, though, and in this reference implementation I’ve left it this way as I think it’s easier to understand.
Information Leakage
One property of blockchains that people seem to like is their semi-anonymity. Maybe State Channels don’t need to reveal any more information than the net flow of funds from one party to another. This implementation leaks a lot of information; however my current thinking is that, as every dispute process that goes to chain will reveal something about the deeper state of the channel, it’s really an all-or-nothing situation. We can’t do this leaking nothing, so I’m not particularly bothered about leaking more.
 
It may be that zero-knowledge proofs could provide this in due course – and this is certainly a thing to watch.

Notes – Code

Class Map

Why is everything 256-bits wide?
To keep the code clean for readability.  Bit-packing is an essential part of optimisation on Ethereum, as SSTOREs are expensive – but this is typically something you’d do at the end of development, once the functionality is relatively finalised, so this is left as an exercise for the reader!
Your Interfaces aren’t really Interfaces are they?
No… At least you’re reading the code, though. There are almost certainly more “correct” ways to do this, but the code structure isn’t that complicated at the moment, and maybe we’ll look at this later.
What’s this FFR thing?
It’s not that interesting.  I quite like all my require statements to be explicit at the first place they make sense, rather than guessing whether or not code I’m calling has all the correct requires implemented.
 
So I found myself writing code like:
 
require(complexFunction(), "Complex function failed");
 
which isn’t that helpful from a debugging point of view.  This technique enables the complex function to return an error and a string as to why it failed (which could be for many reasons), but keeps the actual require statement at the top level.
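A minimal sketch of the pattern, with hypothetical function names and an invented example check, might look like this:

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.0;

// Hypothetical sketch of returning an error and reason instead of reverting.
contract ErrorReturnSketch {
    // The "complex" function reports why it failed rather than reverting.
    function complexFunction(uint256 amount, uint256 balance)
        internal
        pure
        returns (bool ok, string memory reason)
    {
        if (amount == 0) {
            return (false, "Amount is zero");
        }
        if (amount > balance) {
            return (false, "Insufficient balance");
        }
        return (true, "");
    }

    // The single require stays at the top level, with a specific message.
    function withdraw(uint256 amount, uint256 balance) public pure returns (uint256) {
        (bool ok, string memory reason) = complexFunction(amount, balance);
        require(ok, reason);
        return balance - amount;
    }
}
```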
 
If traditional exceptions are ever added to Solidity then this could change.