Principles of designing Go APIs with channels

by Alan Shreve

Channels are concurrent-safe queues that are used to safely pass messages between Go’s lightweight processes (goroutines). Together, these primitives are some of the most popularly touted features of the Go programming language. The message-passing style they encourage permits the programmer to safely coordinate multiple concurrent tasks with easy-to-reason-about semantics and control flow that often trumps the use of callbacks or shared memory.

Despite their power, channels are rare to find in public APIs. I combed through the Go standard library for examples. As of Go 1.3, there are more than 6,000 public APIs across 145 packages. Among those thousands, there are only 5 unique uses of channels.

There is little guidance on the tradeoffs and decisions to make when using channels in a public API. By “public API” I mean “any programmatic interface whose implementer and user are two different humans”. This article will go in depth to provide a set of principles and rationale on how to use channels appropriately in public APIs. Some cases that break the rules are discussed at the end.

Principle #1

An API should declare the directionality of its channels.

Examples

#### time.After

func After(d Duration) <-chan Time

signal.Notify

func Notify(c chan<- os.Signal, sig ...os.Signal)

Although not commonly used, Go allows you to specify the directionality of a channel. The language spec says:

The optional <- operator specifies the channel direction, send or receive. If no direction is given, the channel is bidirectional.

The important part is that a directional operator in your API signature will be enforced by the compiler.

t := time.After(time.Second)
t <- time.Now()  // won't compile (send to receive-only type <-chan Time)

In addition to the safety granted by compiler enforcement, these directional operators help consumers of your API understand the direction of data flow just by looking at the type signature.

Principle #2

An API that sends an unbounded stream of values into a channel must document how it behaves for slow consumers.

Examples

#### time.NewTicker

// NewTicker returns a new Ticker containing a channel that will send the
// time with a period specified by the duration argument.
// It adjusts the intervals or drops ticks to make up for slow receivers.
// ...
func NewTicker(d Duration) *Ticker {
    ...
}

signal.Notify

// Notify causes package signal to relay incoming signals to c.
// ...
// Package signal will not block sending to c
// ...
func Notify(c chan<- os.Signal, sig ...os.Signal) {

ssh.Conn.OpenChannel

// OpenChannel tries to open an channel. 
// ...
// On success it returns the SSH Channel and a Go channel for
// incoming, out-of-band requests. The Go channel must be serviced, or
// the connection will hang.
OpenChannel(name string, data []byte) (Channel, <-chan *Request, error) 

Whenever an API sends an unbounded stream of values into a channel, the implementation will be faced with the decision about what to do if sending a value into the channel would block. This can occur either because the channel is full, or because it is unbuffered and no goroutine is ready to receive a new value. Choosing the appropriate behavior depends on the API, but an implementation must make a decision. For example, the ssh package chooses to block, and documents that if you don’t receive values that your connection will hang. signal.Notify and time.Tick choose not to block and instead drop values silently.

Unfortunately, the language does not provide a way to specify the intended behavior as part of a type or function signature. As a designer of an API, you must specify the behavior in your documentation, otherwise it is undefined. Since we are more often consumers of APIs than designers of them, it can be helpful to remember the converse rule, which is a warning:

You can never determine the behavior of an API that sends an unbounded stream of values over a channel to slow consumers without reading the documentation or the implementation.

Principle #3

An API that sends a bounded set of values into a channel it accepted as an argument must document how it behaves for slow consumers. ### BAD Example #### rpc.Client.Go

func (client *Client) Go(serviceMethod string,
                         args interface{}, 
                         reply interface{}, 
                         done chan *Call
                         ) *Call

This is similar to the second principle, except it’s for APIs sending a bounded series of values. Unfortunately, there’s not a good example of this in the standard library. The only API in the standard lib which does this is rpc.Client.Go and it violates this principle. The documentation says:

Go invokes the function asynchronously. It returns the Call structure representing the invocation. The done channel will signal when the call is complete by returning the same Call object. If done is nil, Go will allocate a new channel. If non-nil, done must be buffered or Go will deliberately crash.

Go sends a bounded number of values (just 1, when the remote call completes). But notice that because the channel is passed into the function, that it still suffers from the slow consumer problem. Even though you must pass a buffered channel to this API, sending on that channel could still block if the channel is full. The documentation does not define the behavior under this circumstance. Time to read the source code:

src/pkg/net/rpc/client.go

171 func (call *Call) done() {
172      select {
173      case call.Done <- call:
174          // ok
175      default:
176          // We don't want to block here.  It is the caller's responsibility to make
177          // sure the channel has enough buffer space. See comment in Go().
178          if debugLog {
179              log.Println("rpc: discarding Call reply due to insufficient Done chan capacity")
180          }
181      }
182  }

Uh oh! If the done channel isn’t appropriately buffered, your RPC replies may just disappear into the ether!

Principle #4

An API that sends an unbounded stream of values into a channel should accept the channel as an argument instead of returning a new channel.

Examples

#### signal.Notify

func Notify(c chan<- os.Signal, sig ...os.Signal)

ssh.NewClient

func NewClient(c Conn, chans <-chan NewChannel, reqs <-chan *Request) *Client

When I first saw the API for signal.Notify, I was confused. “Why does it take a channel as an input instead of returning a channel for my use?” I wondered. “Using the API requires the caller to allocate a channel, shouldn’t the API just do that for you, like this?”

func Notify(sig ...os.Signal) <-chan os.Signal

The documentation helps us understand why this is not a good choice:

Package signal will not block sending to c: the caller must ensure that c has sufficient buffer space to keep up with the expected signal rate.

signal.Notify takes the channel as an argument because it gives the caller control over the amount of buffer space. This allows the caller to choose how many signals it can safely miss while responding to a previous one by trading away memory to buffer those signals.

This control of buffer size also matters for high-throughput systems. Imagine this interface to a high-throughput publish/subscribe system:

func Subscribe(topic string, msgs chan<- Msg)

The more messages pushed through that channel, the greater the chance that the channel synchronization could become a performance bottleneck. Because the API allows the caller to create the channel, it delegates the decision about buffering, and thus performance tuning to the caller. This is a more flexible design.

If it’s just about controlling the size of the buffer, one could argue that an API like the following would suffice:

func Notify(sig ...os.Signal, bufSize int) <-chan os.Signal

A channel argument is still preferrable to this design because it allows the caller to wait for multiple types of signals dynamically with a single channel. This provides more flexibility to your callers both for the structure of the program and performance characteristics. As a thought experiment, let’s work with our Subscribe API to build code for the requirements, “subscribe to the ‘newcustomer’ channel, and for each message, subscribe to the topic for that customer.” If the API allows us to pass the receiving channel as an argument, we might write:

msgs := make(chan Msg, 128)
Subscribe("newcustomer", msgs)
for m := range msgs {
    switch m.Topic {
    case "newcustomer":
        Subscribe(msg.Payload, msgs)
    default:
        handleCustomerMessage(m)
}

But if the channel is returned, the caller is forced into a design with a separate goroutine for every subscription. This can cost additional memory and synchronization time in whatever piece is responsible for the demultiplexing:

for m := range Subscribe("newcustomer") {
    go subCustomer(m.Payload)
}

func subCustomer(topic string) {
    for m := range Subscribe(topic) {
        handleCustomerMessage(m)
    }
}

Principle #5

An API which sends a bounded number of values may do so safely by returning an appropriately buffered channel. ### Examples: #### http.CloseNotifier

type CloseNotifier interface {
        // CloseNotify returns a channel that receives a single value
        // when the client connection has gone away.
        CloseNotify() <-chan bool
}

time.After

func After(d Duration) <-chan Time

When an API sends a bounded number of values into a channel, it can return a buffered channel that has enough room for all the values it will send. The directionality indicator on the returned channel guarantees that the caller cannot break this contract. Channels returned by CloseNotify and After take advantage of this.

On the other hand, be aware that these calls could be more flexible by allowing the caller to pass in a channel to receive values, but then they would be forced to cope with cases where the channel is full (Principle #3). For example, this an alternative, more flexible CloseNotifier:

type CloseNotifier interface {
        // CloseNotify sends a single value with the ResponseWriter whose
        // underlying connection has gone away.
        CloseNotify(chan<- http.ResponseWriter)
}

But the cost of the extra flexibility doesn’t seem worth paying since it is unlikely that a single caller would ever want to wait on multiple close notifications. After all, close notifications only make sense within the context of a single connection, and connections are typically largely independent.

P-P-P-Principle Breakers

Some of the APIs we’ve examined break some of the principles. They warrant a closer look.

Breaking Principle #1

An API should declare the directionality of its channels. ### Example #### rpc.Client.Go

There’s no directionality indicator on the done channel you pass in:

func (client *Client) Go(serviceMethod string,
                         args interface{}, 
                         reply interface{}, 
                         done chan *Call
                         ) *Call

Without diving in too deeply, this happens because the done channel is returned to you as part of the Call struct.

type Call struct {
        // ...
        Done          chan *Call  // Strobes when call is complete.
}

This flexibility is required so that a done channel can be allocated for you if you pass nil. Fixing this would require removing Done from the Call struct and creating two functions:

func (c *Client) Go(method string, 
                    args interface{},
                    reply interface{}
                    ) (*Call, <-chan *Call)

func (c *Client) GoEx(method string,
                      args interface{},
                      reply interface{},
                      done chan<- *Call
                      ) *Call

Breaking Principle #4

An API that sends an unbounded stream of values into a channel should accept the channel as an argument instead of returning a new channel.

Examples

#### go.crypto/ssh

func NewClientConn(c net.Conn, addr string, config *ClientConfig)
    (Conn, <-chan NewChannel, <-chan *Request, error)

time.Tick

func Tick(d Duration) <-chan Time

The go.crypto/ssh package returns channels of unbounded streams nearly everywhere. ssh.NewClientConn is just one of those APIs. A better API that gives the callers more control and flexibility would instead be:

func NewClientConn(c net.Conn,
                   addr string,
                   config *ClientConfig,
                   channels chan<- NewChannel,
                   reqs chan<- *Request
                   ) (Conn, error)

time.Tick violates this principle as well, but it’s easy to forgive. It’s rare that you’ll ever be creating that many tickers, and you typically want to handle them independently anyways. Buffering doesn’t make much sense in this case either.

More and revisions

This material was eventually turned into a talk with some updates and changes, those start about half-way through.

Principles of designing Go APIs with channels talk from GopherCon India 2015

-

Thanks to Kyle Conroy, Jeff Lindsay, shazow, and Blake Mizerany for feedback on drafts.