Your nil isn't nil
So I was reviewing some Go code the other day and ran into the classic. You know the one. The bug that has probably cost me a combined 6 hours of staring at the screen across my Go career.

package main

import "fmt"

type MyError struct{ msg string }

func (e *MyError) Error() string { return e.msg }

func doWork() error {
    var e *MyError // nil pointer, right?
    return e
}

func main() {
    err := doWork()
    if err != nil {
        fmt.Println("got error:", err) // this runs. somehow.
    }
}

The function returns a nil pointer. The check says err != nil and the check passes.
The handler runs. If you call .Error() on it you get a nil pointer dereference and the whole thing falls over.
First time you see this you think the compiler is broken. It isn't. Go is doing exactly what Go was told to do. The problem is the mental model in your head doesn't match what an interface actually is.
Let me explain it properly because I never really got this until somebody sat me down and drew it on a napkin.
An interface isn't a thing, it's two things
When you write var err error, Go doesn't reserve a slot for "an error." It reserves two slots side by side:

err = ( type slot , value slot )

The type slot says "what concrete type is in here." Stuff like *MyError, *os.PathError, whatever. The value slot says "where's the actual data."
Each slot is pointer sized. Every single interface variable in your Go program is two words. 16 bytes on a 64 bit machine.
If you ever wanna go look at the runtime source, the two structs are called iface (for interfaces with methods) and eface (for the empty interface). Same idea either way.
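If you'd rather not take my word on the two-word thing, here's a tiny sketch that measures it. The helper name interfaceWords is mine, and the result assumes the standard gc toolchain, which is what almost everybody runs.

```go
package main

import (
	"fmt"
	"unsafe"
)

// interfaceWords reports how many pointer-sized words an error
// interface value occupies on the current platform.
func interfaceWords() uintptr {
	var err error
	var p *int
	return unsafe.Sizeof(err) / unsafe.Sizeof(p)
}

func main() {
	// With the gc toolchain this prints 2: one word per slot.
	fmt.Println("words per interface value:", interfaceWords())
}
```

Dividing by the pointer size keeps the answer at 2 whether the machine is 32 bit or 64 bit.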
nil for each slot means different things
The two slots are independent. You can have one set without the other.
- The type slot is nil when you haven't assigned anything that has a type yet.
- The value slot is nil when the thing it points to is itself nil.
This is the bit that breaks brains. There's two different nils in play and they don't have to agree with each other.
The comparison rule
This is the only rule you have to remember:
err == nil is true ONLY when both slots are nil.

( nil      , nil    )  →  err == nil   yes
( *MyError , nil    )  →  err == nil   no   <- your bug lives here
( *MyError , &thing )  →  err == nil   no

Look at that middle case. The type slot is set. The value slot is empty. The interface knows it's holding "a *MyError that happens to be nil." From Go's perspective that's a thing. It's not nothing. So != nil.
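Here are the three rows as something you can actually run. This is a sketch: MyError is the type from the top of the post, and threeStates is a name I made up to hold the three comparisons.

```go
package main

import "fmt"

type MyError struct{ msg string }

func (e *MyError) Error() string { return e.msg }

// threeStates returns the result of `err == nil` for each row of the table.
func threeStates() (empty, typedNil, populated bool) {
	var err error // ( nil , nil )
	empty = err == nil

	var p *MyError // a nil pointer
	err = p        // ( *MyError , nil )
	typedNil = err == nil

	err = &MyError{msg: "boom"} // ( *MyError , &thing )
	populated = err == nil
	return
}

func main() {
	empty, typedNil, populated := threeStates()
	fmt.Println(empty, typedNil, populated) // true false false
}
```

Only the first row compares equal to nil. The middle one is the bug.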
Why returning the variable triggers this
Back to the example:

var e *MyError // e is a nil *MyError pointer
return e       // function signature says we return `error`

When you return e, Go has to convert e (a *MyError) into an error (an interface). That conversion happens silently, and both slots get filled.

type slot  = *MyError  // that's what e is typed as
value slot = nil       // that's what e points to

Now the caller gets this and does if err != nil. The type slot is set, so by the comparison rule it isn't nil. They walk into the error branch, touch any method that uses the receiver, and: boom.
Compare that with returning the literal nil:

return nil

Go fills both slots with nil:

type slot  = nil
value slot = nil

That's the real nil interface. The check goes the right way. Everyone's happy.
So the takeaway is that the literal nil and a typed nil variable returned through an interface are NOT the same thing, even though they both look nil when you read the code.
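To see the two returns side by side, here's a small sketch. returnTyped, returnBare and callError are names I invented for the demo; %T is the fmt verb that prints whatever is sitting in the type slot, and callError uses recover so the "boom" doesn't kill the program.

```go
package main

import "fmt"

type MyError struct{ msg string }

func (e *MyError) Error() string { return e.msg }

// returnTyped leaks a nil *MyError out through the error interface.
func returnTyped() error {
	var e *MyError
	return e
}

// returnBare returns the literal nil, so both slots stay empty.
func returnBare() error {
	return nil
}

// callError calls err.Error() and reports whether that call panicked.
func callError(err error) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	_ = err.Error()
	return false
}

func main() {
	typed, bare := returnTyped(), returnBare()
	fmt.Printf("typed: %T, nil? %v\n", typed, typed == nil) // typed: *main.MyError, nil? false
	fmt.Printf("bare:  %T, nil? %v\n", bare, bare == nil)   // bare:  <nil>, nil? true
	fmt.Println("Error() on typed nil panicked:", callError(typed))
}
```

The panic comes from e.msg inside Error(), which dereferences the nil receiver. A method that never touches its receiver would run fine, which is part of why this bug can hide for a while.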
Where this actually shows up
You're not gonna write the trivial example I started with. You'd catch it. Where it gets you in real life is more like this:

func validate(input string) error {
    var problems *ValidationErrors
    if len(input) == 0 {
        problems = newValidationErrors()
        problems.Add("empty input")
    }
    return problems // congrats this is the bug
}

When the input is fine, problems stays nil. You return that. But the return type is error, so Go wraps your nil *ValidationErrors into an interface with a non-nil type slot. Every caller does if err != nil and lights up the error path even though there's no actual error.
It's literally just "I declared a typed nil and let it leak out through an interface."
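If you want to poke at it yourself, here's a runnable reproduction. Big caveat: I never showed what ValidationErrors looks like, so the struct, newValidationErrors, Add and Error below are hypothetical stand-ins, just enough to make it compile. validateBuggy is the same function as above under a different name.

```go
package main

import (
	"fmt"
	"strings"
)

// ValidationErrors is a hypothetical stand-in for the type in the snippet.
type ValidationErrors struct{ messages []string }

func newValidationErrors() *ValidationErrors { return &ValidationErrors{} }

// Add records one more validation problem.
func (v *ValidationErrors) Add(msg string) { v.messages = append(v.messages, msg) }

// Error joins all recorded problems into one message.
func (v *ValidationErrors) Error() string { return strings.Join(v.messages, "; ") }

// validateBuggy returns the typed pointer directly, nil or not.
func validateBuggy(input string) error {
	var problems *ValidationErrors
	if len(input) == 0 {
		problems = newValidationErrors()
		problems.Add("empty input")
	}
	return problems
}

func main() {
	err := validateBuggy("hello")
	fmt.Println(err != nil) // true, even though nothing was wrong with the input
}
```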
The fix is annoying but easy
Return a bare nil when there's no error, not a typed one.

func validate(input string) error {
    if len(input) == 0 {
        return &ValidationErrors{messages: []string{"empty input"}}
    }
    return nil // bare. untyped. real nil.
}

If you already have the variable hanging around for some reason, guard before returning:

if problems == nil {
    return nil
}
return problems

Can a linter save you?
Sort of. Not really.
go vet (the default one that ships with the toolchain) does NOT catch this. There's a check called nilfunc but that's a different thing entirely (it catches comparing functions to nil). I assumed for a while that go vet shipped with a nilness pass. It doesn't. I was wrong.
The actual nilness analyzer lives at golang.org/x/tools/go/analysis/passes/nilness and most people run it through golangci-lint. It looks at the control flow inside one function at a time rather than across calls, so it catches some of these cases but not all. staticcheck flags some adjacent stuff, like comparing an interface to nil when it can be proven to always have a type.
Truth is none of these tools save you in general. The bug shows up when the value depends on runtime state, and static analysis can only see so far down that road. You have to know the rule.
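One thing you can do at runtime, mostly in tests, is ask reflect whether an error is one of these half-nil values. This is a sketch of a helper I'd call isTypedNil, not something from the standard library, and it's a safety net rather than a fix: if it ever returns true, the real bug is in whoever built that error.

```go
package main

import (
	"fmt"
	"reflect"
)

type MyError struct{ msg string }

func (e *MyError) Error() string { return e.msg }

// isTypedNil reports whether err has a type slot set but holds a nil
// value, for example a nil *MyError wrapped in an error.
func isTypedNil(err error) bool {
	if err == nil {
		return false // both slots empty: a real nil, not a typed one
	}
	v := reflect.ValueOf(err)
	switch v.Kind() {
	case reflect.Ptr, reflect.Map, reflect.Slice, reflect.Func, reflect.Chan, reflect.UnsafePointer:
		return v.IsNil()
	}
	return false // kinds like structs and strings can't be nil
}

func main() {
	var p *MyError
	fmt.Println(isTypedNil(nil), isTypedNil(p), isTypedNil(&MyError{msg: "x"}))
	// false true false
}
```

The Kind switch is there because reflect's IsNil panics on kinds that can't be nil, so the helper only asks when the question makes sense.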
What should I keep in my mind?
Forget everything else. Just keep this in your head:
An interface is (type, value). It only equals nil when both are nil. A typed-nil pointer wrapped in an interface has a type, so it isn't nil.