Go's anonymous functions in loops

In May 2017 I wrote a post about some gotchas in Go from my first few years with the language.

In that post I included this example code:

var printers []func()
strings := []string{"a", "b", "c"}
for _, str := range strings {
	printers = append(printers, func() { print(str) })
}

for _, print := range printers {
	print()
}

When run (on Go versions before 1.22, where the loop variable is shared across iterations), this will always give the output:

c
c
c

This example is an interesting one because it’s a serial example of a deeper, often concurrent, problem. The following is a breakdown of the serial and concurrent versions of this problem and an explanation of why they happen.

Take this example, for instance, which is a modified version of the one in the Go common mistakes wiki:

for _, str := range []string{"a", "b", "c"} {
	go func() {
		fmt.Println(str)
	}()
}

This may look like exactly the same problem, in a comparable program that produces the same output. But it’s not, and the two are wildly different. Here’s why…

First, what do they have in common?

Both programs appear to produce the same unexpected output.

The difference:

The two programs do not always produce the same unexpected output.

Concurrent Program

Let’s start with a runnable example (Go playground link) to observe that the program produces the same output. I’ve included the body of the main function here:

for _, str := range []string{"a", "b", "c"} {
	go func() {
		fmt.Println(str)
	}()
}
<-time.After(1 * time.Second)

I’ve put “same” in italics because this is not strictly true, as we will see.

In the runnable example above, you probably observed the same output of c c c, but do not be fooled into thinking that it is always the same.

The important thing here is that this is no longer the same program that we were working with before. Consider the original program, which did not include the line <-time.After(1 * time.Second). On my machine and in the Go playground, the main goroutine returns just after finishing the for loop, so the program exits before the goroutines that print each of the values are scheduled, and we get no output. Receiving from time.After(...) blocks the main goroutine, giving the anonymous function goroutines a chance to run before main returns.

This introduces a hidden assumption: that the extra goroutines we started inside the body of the for loop will be scheduled, and will complete, while the main goroutine is blocked on time.After(...). This is not guaranteed, so the program is nondeterministic: we are not guaranteed to get the same output from every run. If the anonymous goroutines have not all finished within 1 second, we may get fewer than the 3 lines we were expecting.

So sleeping is a bad idea; we can use sync.WaitGroup to get a deterministic output, can’t we? Here’s the updated code and playground link:

var wg sync.WaitGroup
for _, str := range []string{"a", "b", "c"} {
	wg.Add(1)
	go func() {
		defer wg.Done()
		fmt.Println(str)
	}()
}
wg.Wait()

In this updated program we create a wait group, register each of the anonymous goroutines with the group inside the body of the for loop, start them, and, after the loop has exited, wait to ensure that all the goroutines have returned. This removes the temporal coupling between time.After(...) and the goroutine scheduler: we are now guaranteed to wait for as long as the anonymous goroutines take to return before exiting the program. But here’s the spoiler: we are still not guaranteed to get the same output as the original serial program.

The inner anonymous function that’s run as a new goroutine executes inside the same address space as its parent (in our case the main goroutine). This means that the anonymous goroutine does not receive a copy of str; it references the memory location of the original variable.

With a simplified example (runnable example) we can replicate this shared-variable behaviour:

var wg sync.WaitGroup
str := "hello"
wg.Add(1)
go func() {
	defer wg.Done()
	fmt.Println(str)
}()
str = "world"
wg.Wait()

Here we assign the variable str to hold the value hello, schedule a goroutine to print the variable, and then update str to be world. In this example, which again produces a nondeterministic output, we are guaranteed to get something printed out because we wait for the goroutine to return, but we cannot guarantee what will be printed. The options are “hello” or “world”. This is because the anonymous goroutine may be scheduled and run before or after the update to make str hold the value world.

The goroutine runs in the same address space, which means that it will print the updated value if it runs after the update has happened. It’s referencing the exact same variable.
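To see the contrast, here is a sketch of the simplified example with the value handed to the goroutine as an argument. The function name copiedValue and the variable seen are made up for illustration. Because the goroutine only touches its own copy, the later assignment to str cannot change what it observes.

```go
package main

import (
	"fmt"
	"sync"
)

// copiedValue is a hypothetical variant of the simplified example. It returns
// what the goroutine saw and what str holds at the end.
func copiedValue() (string, string) {
	var wg sync.WaitGroup
	var seen string
	str := "hello"
	wg.Add(1)
	go func(s string) { // s is a copy made when the go statement is evaluated
		defer wg.Done()
		seen = s
	}(str)
	str = "world" // the goroutine never reads str, so this does not affect it
	wg.Wait()     // makes the goroutine's write to seen visible here
	return seen, str
}

func main() {
	seen, current := copiedValue()
	fmt.Println(seen, current) // prints hello world
}
```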

So this makes sense for a concurrent program where we are not guaranteed what order the anonymous functions will execute in, because that’s up to the runtime scheduler of goroutines. It’s this lack of guaranteed ordering that makes the output of the concurrent program nondeterministic. But what of the serial program? In that we are guaranteed what order the anonymous functions are executed in, because we store them in a slice of printer funcs and execute them in the same order.

Serial Program

Looking again at the serial program from the start of this post:

var printers []func()
strings := []string{"a", "b", "c"}
for _, str := range strings {
	printers = append(printers, func() { print(str) })
}

for _, print := range printers {
	print()
}

We see here that we create a number of anonymous functions and execute them in the order that we created them. So we don’t get the scheduling issues that we saw with the concurrent version, but we do get the same unexpected c c c output.

As with the concurrent version, let’s consider a simplified example (runnable code):

str := "hello"
print := func() {
	fmt.Println(str)
}
str = "world"
print()

In this example, it’s much more obvious what will happen. The anonymous function that we create shares the same address space as the parent function, which means that when we come to execute print() we are operating on the updated value of str, so it prints world.

So, considering again the original example, with multiple printer funcs, why then do we see the output c c c?

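The range expression is evaluated once before the loop begins, and (in Go versions before 1.22) the variables declared by := in the for clause are created once and reused for each of the loop iterations, with every iteration assigning a new value to them. As we saw with the simplified example, anonymous functions run in the same address space and see the variable itself rather than a copy. Combine these two facts and we see how we end up with the unexpected c c c output: every printer func refers to the same str, which holds c by the time any of them runs.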
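The two facts can be reproduced in a way that does not depend on the Go version by declaring the shared variable explicitly. This is a sketch only: the function names sharedVariable and perIterationCopy, and the out slice used in place of printing, are assumptions made for the illustration. The first function mimics the single reused variable; the second declares a fresh variable in each iteration, which is the usual workaround.

```go
package main

import (
	"fmt"
	"strings"
)

// sharedVariable mimics a loop whose variable is created once: str lives
// outside the loop and every closure captures that one variable.
func sharedVariable(inputs []string) string {
	var out []string
	var printers []func()
	var str string
	for _, s := range inputs {
		str = s
		printers = append(printers, func() { out = append(out, str) })
	}
	for _, p := range printers {
		p() // by now str holds the last element
	}
	return strings.Join(out, " ")
}

// perIterationCopy declares a new variable on every iteration, so each
// closure captures a different variable.
func perIterationCopy(inputs []string) string {
	var out []string
	var printers []func()
	for _, s := range inputs {
		current := s // fresh variable for this iteration only
		printers = append(printers, func() { out = append(out, current) })
	}
	for _, p := range printers {
		p()
	}
	return strings.Join(out, " ")
}

func main() {
	inputs := []string{"a", "b", "c"}
	fmt.Println(sharedVariable(inputs))   // prints c c c
	fmt.Println(perIterationCopy(inputs)) // prints a b c
}
```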