go ast got visitor pattern wrong

what’s a visitor?

According to wikipedia:

“The visitor design pattern is a way of separating an algorithm from an object structure on which it operates. A practical result of this separation is the ability to add new operations to existent object structures without modifying the structures. It is one way to follow the open/closed principle.”

The Visitor pattern, as described in wikipedia allows you to “Visit” a structure in a way that doesn’t require the structure to change. This is particularly good in scenarios such as traversal of trees.

Essentially; a Visit(...) function is called for each of the elements in the structure.

This pattern is an alternative to java style instanceof or go style .(type) assertions and switches. The with these being that it is non obvious when adding an additional type where those queries for type are being made, you could be introducing unintended consequences in unrelated parts of the system. There could be many sites in the code that rely on switching / handling all the types that could implement an interface and if one is missed, perhaps the default behaviour is a runtime error.

Using a visitor is a compile time pattern, and the program will not compile if any given type is required to implement the visitor pattern, but visitors have no defined methods for handling that type.

what’s wrong?

The go/ast package defines a Visitor interface:

1
2
3
4
5
6
// A Visitor's Visit method is invoked for each node encountered by Walk.
// If the result visitor w is not nil, Walk visits each of the children
// of node with the visitor w, followed by a call of w.Visit(nil).
type Visitor interface {
	Visit(node Node) (w Visitor)
}

This interface (in idiomatic go style) defines a single method Visit(node Node) that takes an ast.Node. The problem here is that ast.Node is itself an interface.

1
2
3
4
5
// All node types implement the Node interface.
type Node interface {
	Pos() token.Pos // position of first character belonging to the node
	End() token.Pos // position of first character immediately after the node
}

This means you end up with code like:

1
2
3
4
5
6
7
8
func (v *visitor) Visit(n ast.Node) ast.Visitor {
    switch x := v.(type)
    case *ast.Ident:
        // do stuff with *ast.Ident

    case *ast.IfStmt:
    	// do stuff with *ast.IfStmt
}

Type switching on the values that implement ast.Node, and doing stuff based on the type. You have to ensure that you have a case statement for all the types (at least all the ones that you need), and you end up with a verbose switch statement. It’s kind of gross and doesn’t make the best use of the Visitor pattern. It also requires runtime checking of types, which could result in runtime errors if some of the required types are unhandled, or new types are added.

so what’s better?

The visitor pattern can be implemented in a way that retains all the typing information and separates the code logically. Let’s take a stock example:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
type Wheel struct {
	// omitted
}

type Door struct {
	// omitted
}

type CarVisitor interface {
	VisitWheel(w *Wheel)
	VisitDoor(d *Door)
}

type Component interface {
	Accept(v CarVisitor)
}

// make wheel a Component
func (w *Wheel) Accept(v CarVisitor) {
	v.VisitWheel(w)
}

// make door a Component
func (d *Door) Accept(v CarVisitor) {
	v.VisitDoor(d)
}

func main() {
    // v CarVisitor
    // w *Wheel
    // d *Door

    cs := []Component{w, d}
    
    for _, c := range cs {
        // c Component, accepts the visitor and delegates the call 
        // to the concrete implementation of Component, which calls 
        // the CarVisitor v with the correct method, ensuring type safety.
        c.Accept(v)
    }
}

With the Car and Wheel types we can make them Components by implementing the Accept(v CarVisitor) method. The interesting thing here, is that because we delegate the calls to CarVisitor::VisitX to the structs that are being visited, we get type safety, and control over which visit func we are calling. It also means that adding a new type such as Roof, and assuming that Roof needs to be a Component, we must implement the Accept method for Roof. Attempting to do this would show us that our CarVisitor has no method VisitRoof(...) and we would know at compile time that there was a problem here.

Going back to the ast.Visitor example with these types; ast.Visitor does the equivalent of:

1
2
3
type CarVisitor interface {
    Visit(c Component)
}

instead of:

1
2
3
4
5
type CarVisitor interface {
    VisitDoor(d Door)
    VisitWheel(w Wheel)
    ...
}

The difference here is that the Visit func of CarVisitor is not called with a concrete implementation, it’s called with the Component interface. Meaning that adding a new component does not have type safety, and we are not guaranteeing that we are handling the new Roof component.

We don’t pass the CarVisitor to the type and get the type to call us back with the correct method, instead we pass the interface Component that the type implements, and try and assert or query the underlying type.

so why not?

There are some smart programmers working on the go language (considerably smarter than most of us!) So why didn’t they do this?

Well, I can’t speak for them, but there are some things that our simplified examples have got wrong.

The number of nodes is a problem. In our example we only had a few Components of the car. The go/ast package has 68 implementations of ast.Node, that means that our interface needs to implement 68 methods. One for each type. That’s an unrealistic number of methods to need to implement to work with an ast.Node. Instead, we can just implement the switch cases for the node implementations we care about. Plus it contradicts one of the go proverbs:

“The bigger the interface, the weaker the abstraction.”

So, taking all things into account; ast.Visitor is a really nice, clean and idiomatic way of visiting all the nodes in an AST.