Go concurrency

October 19, 2016
go

Go makes writing concurrent code easy. That is to say, it offers facilities to implement concurrency in our programs; it does not, however, guarantee parallel execution.

Parallel execution is dependent on the underlying system on which your code is running. A concurrent program on a single-core CPU will have its tasks interleaved on that one core, so no two of them ever run at the same instant, but put that code on a multi-core CPU and it can execute in parallel.

However, put a sequential program on a multi-core CPU and it will still run sequentially.
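To see what the underlying system offers, the standard runtime package can report how many logical CPUs are available and how many operating system threads may run Go code at the same time. The following is a minimal sketch; the helper name parallelismLimits is made up for illustration.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelismLimits is a hypothetical helper. It reports the number of
// logical CPUs visible to the process and the current GOMAXPROCS setting.
func parallelismLimits() (cpus, maxProcs int) {
	// Passing 0 to GOMAXPROCS only queries the value, it does not change it.
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, maxProcs := parallelismLimits()
	fmt.Println("logical CPUs:", cpus)
	fmt.Println("GOMAXPROCS:", maxProcs)
}
```

If GOMAXPROCS is 1, goroutines are still concurrent, but only one of them executes Go code at any given moment.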

Communicating Sequential Processes

Communicating Sequential Processes (CSP) was first described in 1978 by Tony Hoare. It is a formal language for describing concurrent programs. Go is heavily inspired by CSP, but with a modern approach.

The main point about CSP is the use of channels to pass information between concurrent routines, which are called goroutines in Go.

If our code can be split into individual parts that do not rely on each other, that can perform their task completely independently, then we don’t need to worry about channels. All we have to do is run them as goroutines.

It is more than likely, however, that our individual parts will have to stay in sync with each other.

Go concurrency model

The go keyword is the magic term that will spawn a goroutine.

package main

import "fmt"

func main() {
    hello("first")
    go hello("second")
}

func hello(message string) {
    fmt.Println("Hello, " + message)
}

In this example, we call the function hello() twice. The first call will put the function on the current stack, as in any other programming language. This is a standard function call.

The second case is a bit different. The go keyword placed before the call will create a goroutine, with its own stack (that grows dynamically), and run the function on that stack.

A goroutine is not a system process, nor a thread. It is a construct of Go and it exists in the Go runtime, which schedules goroutines onto operating system threads. We do not schedule them ourselves; it is up to the Go runtime to run them in parallel under the right conditions.
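To get a feel for how lightweight goroutines are, here is a minimal sketch that starts a large number of them and asks the runtime how many exist. The helper spawnBlocked is a made-up name, and the channel (a topic covered later in this article) is used only to keep the goroutines parked until we release them.

```go
package main

import (
	"fmt"
	"runtime"
)

// spawnBlocked is a hypothetical helper. It starts n goroutines that block
// until release is closed, then returns the goroutine count reported by the
// runtime (which also includes the caller's own goroutine).
func spawnBlocked(n int, release <-chan struct{}) int {
	for i := 0; i < n; i++ {
		go func() {
			<-release // park here until the channel is closed
		}()
	}
	return runtime.NumGoroutine()
}

func main() {
	release := make(chan struct{})
	count := spawnBlocked(10000, release)
	fmt.Println("goroutines alive:", count)

	// Closing the channel unblocks every parked goroutine at once.
	close(release)
}
```

Starting ten thousand operating system threads would be far more expensive on most systems, which is the practical reason goroutines are a separate construct.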

Running the example code reveals something interesting.

$ go run sample.go
Hello, first

There is no Hello, second message. The reason is straightforward. The main() function, which is itself a goroutine, spawned another goroutine. The two goroutines are independent and have no channel to communicate through. Once the main() goroutine has dispatched the call that creates the hello() goroutine, it has no more code to execute and returns. Because it is the main goroutine, the program exits, regardless of what the other goroutines are doing.

Try making both calls to hello() as goroutines and you will most likely see no output at all.

Just to illustrate that the hello() goroutine does indeed do its job, let’s make the main goroutine wait a bit.

package main

import (
    "fmt"
    "time"
)

func main() {
    hello("first")
    go hello("second")
    time.Sleep(time.Second)
}

func hello(message string) {
    fmt.Println("Hello, " + message)
}

time.Sleep() allows us to pause the goroutine for the time specified. Running this code prints two hello messages.

$ go run sample.go
Hello, first
Hello, second

In this example, we are forced to use a magic number. What if the hello() goroutine takes longer than a second? Let’s discover a better way to handle this problem, with channels.

Go channels

A channel is a way to send messages between goroutines. A channel by itself is useless unless we send messages and listen for messages on that channel.

A channel can be closed, which signals to receivers that no more values will be sent, or it can be left open; once nothing references it any longer, the garbage collector will dispose of it.
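As a minimal sketch of what closing means in practice, the following program sends a few values, closes the channel, and lets the receiver loop until the close is observed. The function names produce and drain are made up for illustration.

```go
package main

import "fmt"

// produce is a hypothetical helper. It sends the integers 0..n-1 on a new
// channel from its own goroutine and closes the channel when it is done.
func produce(n int) <-chan int {
	out := make(chan int)
	go func() {
		defer close(out) // tell the receiver that no more values will come
		for i := 0; i < n; i++ {
			out <- i
		}
	}()
	return out
}

// drain is a hypothetical helper. It sums everything it receives;
// the range loop ends once the channel has been closed and emptied.
func drain(in <-chan int) int {
	total := 0
	for v := range in {
		total += v
	}
	return total
}

func main() {
	fmt.Println(drain(produce(4))) // prints 6 (0+1+2+3)

	// Receiving from a closed channel never blocks: it yields the zero
	// value, and the second result reports that the channel is closed.
	done := make(chan bool)
	close(done)
	v, ok := <-done
	fmt.Println(v, ok) // prints "false false"
}
```

Only the sender should close a channel, since sending on a closed channel causes a panic.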

With that out of the way, let’s test some code.

package main

import "fmt"

func main() {
    // this will create a channel of type bool
    finished := make(chan bool)

    // create a goroutine and pass the channel
    go printMessage(finished)

    // wait until we receive data
    <-finished

    // then exit
}

// channel is received as a function parameter
func printMessage(finished chan bool) {
    fmt.Println("Hello, world")
    
    // send a boolean message across the channel
    finished <- true
}

Ok, let’s break things down. finished := make(chan bool) is how you create a new channel, in this case of type bool. We then pass the channel as a parameter to the function printMessage() in a new goroutine. If we did nothing else at this point, then the program would exit and we would most likely not see the message. The solution is to block on the channel until a value is received, which is done with <-finished. Note that the channel is never closed here; the receive simply waits for the single value that printMessage() sends.
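A small variation on the same program, sketched here as an option rather than a requirement, is to declare the direction of the channel in the function signatures. The type chan<- bool only allows sending and <-chan bool only allows receiving, so the compiler rejects a worker that tries to read from the channel it should only write to. The helper waitFor is a made-up name.

```go
package main

import "fmt"

// printMessage may only send on finished; receiving from it here
// would be a compile-time error.
func printMessage(finished chan<- bool) {
	fmt.Println("Hello, world")
	finished <- true
}

// waitFor is a hypothetical helper. It may only receive from finished
// and returns whatever value the worker sent.
func waitFor(finished <-chan bool) bool {
	return <-finished
}

func main() {
	// A bidirectional channel converts implicitly to either direction.
	finished := make(chan bool)

	go printMessage(finished)
	fmt.Println("worker reported:", waitFor(finished))
}
```

The message is always printed before the report line, because the worker prints first and only then sends, while main cannot continue until that send happens.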

But this design has a flaw. Say we wanted to make two calls to printMessage(). Our code would look like this.

go printMessage(finished)
go printMessage(finished)

<-finished
<-finished

If we did not write <-finished twice, then the program could exit as soon as the first goroutine finishes, possibly before the second one has printed its message. Writing one receive per goroutine by hand is not very practical and can be fixed easily with a loop.

package main

import "fmt"

func main() {
    finished := make(chan bool)

    // we specify how many workers we want
    numberOfGoroutines := 5

    // for each worker we want
    for i := 0; i < numberOfGoroutines; i++ {
        
        // create a new goroutine
        go printMessage(finished, i)

        // and wait until it is finished
        <-finished
    }
}

func printMessage(finished chan bool, num int) {
    fmt.Println("Hello, world", num)
    finished <- true
}

Here, we are creating what appears to be a solution, but the catch is hidden within the loop. We are creating a new goroutine and then explicitly waiting until it is finished. We are blocking. And this is not a practical solution because our code is sequential, not concurrent.

If we run the code, we will see that numbers are ascending from 0 to 4.

$ go run sample.go
Hello, world 0
Hello, world 1
Hello, world 2
Hello, world 3
Hello, world 4

The solution is to split the problem into two.

package main

import "fmt"

func main() {
    finished := make(chan bool)
    numberOfGoroutines := 5

    // create workers in advance
    for i := 0; i < numberOfGoroutines; i++ {
        go printMessage(finished, i)
    }

    // then listen for their responses
    for i := 0; i < numberOfGoroutines; i++ {
        <-finished
    }
}

func printMessage(finished chan bool, num int) {
    fmt.Println("Hello, world", num)
    finished <- true
}

In this design, we are essentially creating 6 goroutines, main plus 5 workers. We also cannot guarantee the order in which the workers run, which is evident when we run the code.

$ go run sample.go
Hello, world 3
Hello, world 4
Hello, world 1
Hello, world 0
Hello, world 2

Each successive run may produce different output. Our code is concurrent and can be run in parallel.
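When the only goal is to wait for a group of goroutines, and no data needs to travel back, the standard library also offers sync.WaitGroup. This is an alternative to the channel-based approach above, not what the article's examples use. The sketch below assumes a made-up helper, runWorkers, which counts completed workers under a mutex purely so the result can be checked.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers is a hypothetical helper. It starts n goroutines, waits for
// all of them with a WaitGroup and returns how many ran to completion.
func runWorkers(n int) int {
	var wg sync.WaitGroup
	var mu sync.Mutex // protects completed
	completed := 0

	for i := 0; i < n; i++ {
		wg.Add(1) // register the worker before starting it
		go func(num int) {
			defer wg.Done() // mark this worker as finished on return
			fmt.Println("Hello, world", num)

			mu.Lock()
			completed++
			mu.Unlock()
		}(i)
	}

	wg.Wait() // block until every registered worker has called Done
	return completed
}

func main() {
	fmt.Println("workers finished:", runWorkers(5))
}
```

As with the channel version, the order of the Hello, world lines is not guaranteed; only the final count is.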