One of the most interesting Go gotchas is the nil-not-nil bug. This happens when a function declares an interface as its return type but returns a variable of a concrete pointer type. As a result, the returned interface value can never be nil, leading to unexpected behavior and, yes, panics with an anarchistic flair. A lot of the documentation on this type of bug leaves a lot to the imagination, often making generalizations that are not necessarily true. This post describes the issue in detail and in highly verbose mode.
Let’s examine this issue in depth until our eyes hurt and we begin to develop a hate for Go interfaces while appreciating their complexity. Let’s begin with a simple (but buggy) program:
package main

import (
	"errors"
	"fmt"
	"io/ioutil"
	"os"
)

func main() {
	fmt.Println("file path: ")
	var filePath string
	fmt.Scanln(&filePath)
	err := PrintFile(filePath)
	if err != nil {
		myErr := err.Error()
		fmt.Println(myErr)
	}
}

func PrintFile(filePath string) error {
	var pathError *os.PathError
	_, err := os.Stat(filePath)
	if err != nil {
		pathError = &os.PathError{
			Path: filePath,
			Err:  errors.New("File not found"),
		}
		return pathError
	}
	content, _ := ioutil.ReadFile(filePath)
	fmt.Println(string(content))
	return pathError // still a nil *os.PathError at this point
}
The above is only a slightly more realistic version of the type of the issues documented here and here. All the program does is ask the user for a file path and print the contents of that file to STDOUT. If the path specified by the user does not exist, we then return an error of type *os.PathError. In main we print the returned error to STDOUT whenever the error is not nil. If you have already checked out either of the links above then you already have an idea of what might happen when no error is returned (that is, when filePath exists and PrintFile returns pathError as nil):
$ go run clean.go
file path:
mitch-joke.txt
Every book is a children's book if the kid can read.
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x10a0ae6]
goroutine 1 [running]:
os.(*PathError).Error(0x0, 0xe, 0x10fcdc0)
/usr/local/go/src/os/error.go:56 +0x26
main.main()
/Users/au/Projects/ToB/not-going-anywhere/cmd/test/clean.go:17 +0x142
exit status 2
Our check for nil in main does not protect us: err != nil evaluates to true, and we get a segmentation fault when calling err.Error().
Go interfaces - beautiful and a bit complicated
The error above happens because the check if err != nil in main evaluates to true (that is, err is not nil) in all cases. To understand why err is never nil in the above case we must have a better understanding of Go interfaces. This is because error is an interface, not a concrete type.
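The behavior can be reproduced in a few lines. The following is a minimal sketch, not the program from this post; the function names typedNil and untypedNil are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// typedNil returns a nil *os.PathError through the error interface.
// The pointer is nil, but the interface still records its type.
func typedNil() error {
	var p *os.PathError
	return p
}

// untypedNil returns a plain nil, so the interface has no type and no value.
func untypedNil() error {
	return nil
}

func main() {
	fmt.Println(typedNil() == nil)   // false
	fmt.Println(untypedNil() == nil) // true
}
```

Only the second function produces an error that a caller's nil check will treat as "no error".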
Most documentation on Go interfaces will tell you that they consist of a Type and a Value, but let’s take that one step further by looking at the definition of the interface type in the Go source code (declared here):
type iface struct {
	tab  *itab
	data unsafe.Pointer
}
The first field is a pointer to an itab. The tab field holds both the type of the interface and the type of the concrete value, if there is one. On the other hand, the data field holds a pointer to the concrete value. The itab struct is defined here:
// layout of Itab known to compilers
// allocated in non-garbage-collected memory
// Needs to be in sync with
// ../cmd/compile/internal/gc/reflect.go:/^func.dumptabs.
type itab struct {
	inter *interfacetype
	_type *_type
	hash  uint32 // copy of _type.hash. Used for type switches.
	_     [4]byte
	fun   [1]uintptr // variable sized. fun[0]==0 means _type does not implement inter.
}
As we can see, the itab struct includes a pointer to an interfacetype and a pointer to a _type. The type of the interface is stored in inter, and the type of its concrete value is stored in _type (technically, interfacetype is a wrapper around _type). In fact, we can examine both the interface type and the concrete type returned by the PrintFile function by compiling our program like so:
$ go tool compile -S nilnotnil.go | grep -A 7 '^go.itab.\*os.PathError'
go.itab.*os.PathError,error SRODATA dupok size=32
0x0000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
0x0010 d8 aa 89 b7 00 00 00 00 00 00 00 00 00 00 00 00 ................
rel 0+8 t=1 type.error+0
rel 8+8 t=1 type.*os.PathError+0
rel 24+8 t=1 os.(*PathError).Error+0
The relocation at offset 0 (the first 8 bytes) points to the interface type (error), whereas the relocation at offset 8 (the next 8 bytes) points to the concrete type *os.PathError.
Now let’s simplify our PrintFile function so that we can easily examine what exactly gets returned when we don’t assign any value to pathError:
//go:noinline
func PrintFile(filePath string) error {
	var pathError *os.PathError
	return pathError
}
Let’s build our program and examine the compiler output for the PrintFile function:
$ go build
$ go tool objdump -S -s main.PrintFile -gnu interfaces
TEXT main.PrintFile(SB) /Users/au/Projects/ToB/not-going-anywhere/cmd/interfaces/nilnotnil.go
return pathError
0x10b4d00 488d05f9840400 LEAQ go.itab.*os.PathError,error(SB), AX // lea 0x484f9(%rip),%rax
0x10b4d07 4889442418 MOVQ AX, 0x18(SP) // mov %rax,0x18(%rsp)
0x10b4d0c 48c744242000000000 MOVQ $0x0, 0x20(SP) // movq $0x0,0x20(%rsp)
0x10b4d15 c3 RET // retq
We return a pointer to an itab, which in this case holds two types, *os.PathError and error. Additionally, a nil value is returned in the third instruction (at 0x10b4d0c). Here we can clearly see that we are not returning nil. In fact, we are returning an interface with a non-nil type (or itab) and a nil value. Let’s add some debugging print statements to main to confirm this is the case.
//go:noinline
func main() {
	fmt.Println("file path: ")
	var filePath string
	fmt.Scanln(&filePath)
	err := PrintFile(filePath)
	fmt.Println("DESC OF err FROM CALLER:")
	fmt.Println("err = ", err)
	fmt.Printf("Type: %T, Value: %v\n", err, err)
	fmt.Println("Is err nil? ", err == nil)
	if err != nil {
		myErr := err.Error()
		fmt.Println(myErr)
	}
}
When running our updated code, we get the following:
$ go run nilnotnil.go
file path:
mitch-joke.txt
DESC OF err FROM CALLER:
err = <nil>
Type: *os.PathError, Value: <nil>
Is err nil? false
Even though the fmt.Println("err = ", err) statement prints <nil>, the following print statement outputs a type *os.PathError and a nil value, which is why err is not nil as shown in the print statement following it. This is also why this type of check wouldn’t work either:
//go:noinline
func main() {
	fmt.Println("file path: ")
	var filePath string
	fmt.Scanln(&filePath)
	var emptyErr error
	err := PrintFile(filePath)
	if err != emptyErr {
		myErr := err.Error()
		fmt.Println(myErr)
	}
}
Although main only sees an error interface from PrintFile, the comparison takes the concrete type stored inside it into account: err carries *os.PathError, while emptyErr carries no type at all.
Checking Type against nil
Now, some documentation on this matter will tell you that when we do err != nil we are comparing <*os.PathError, nil> against <nil, nil>, and that is why err is never nil. However, the generated assembly tells us that only the type (itab) is being compared against nil, not the value (data).
Let’s update our code one more time so we include our comparison against nil and against our new, empty emptyErr interface:
//go:noinline
func main() {
	fmt.Println("file path: ")
	var filePath string
	fmt.Scanln(&filePath)
	var emptyErr error
	err := PrintFile(filePath)
	if err != nil {
		return
	}
	if err != emptyErr {
		fmt.Println("not equal")
	}
}
Now let’s compile the above and examine the generated assembly:
$ go build
$ go tool objdump -S -s main.main interfaces
TEXT main.main(SB) /Users/au/Projects/ToB/not-going-anywhere/cmd/interfaces/nilnotnil.go
0x10b2eda e8c1000000 CALL main.PrintFile(SB)
0x10b2edf 488b442410 MOVQ 0x10(SP), AX
0x10b2ee4 488b4c2418 MOVQ 0x18(SP), CX
if err != nil {
0x10b2ee9 4885c0 TESTQ AX, AX
0x10b2eec 0f8587000000 JNE 0x10b2f79
if err != emptyErr {
0x10b2ef2 7462 JE 0x10b2f56
The main function collects the two words of the returned interface from the CALL to main.PrintFile and stores them in AX and CX. The register AX contains the itab pointer (which holds the type), and CX holds the value. The TESTQ instruction only checks the tab. How can we be sure that AX holds the type or itab? If we jump to the interface comparison, we find the following:
if err != emptyErr {
0x10b2f56 48890424 MOVQ AX, 0(SP)
0x10b2f5a 48894c2408 MOVQ CX, 0x8(SP)
0x10b2f5f 48c744241000000000 MOVQ $0x0, 0x10(SP)
0x10b2f68 e8d307f5ff CALL runtime.ifaceeq(SB)
When comparing two interfaces, Go calls the runtime.ifaceeq function defined here. The first argument is a pointer to an itab, followed by two unsafe.Pointer arguments. Because AX has not been updated until now, we know AX holds the itab. Likewise, we know that CX holds a pointer to the value of err. The last value, $0x0, represents nil for the value of emptyErr.
Ok, so given what we have learned so far it would follow that the value of pathError in PrintFile can never be nil, as Go checks against the type in that function as well, right? Not quite (and you thought we were done). Let’s go ahead and add the same print statements to the simplified PrintFile function:
//go:noinline
func PrintFile(filePath string) error {
	var pathError *os.PathError
	fmt.Println("DESC OF pathError FROM CALLEE:")
	fmt.Println("pathErr = ", pathError)
	fmt.Printf("Type: %T, Value: %v\n", pathError, pathError)
	fmt.Println("Is pathErr nil? ", pathError == nil)
	fmt.Println("----------------------------------\n")
	return pathError
}
The output looks like this. This includes the output from the print statements we added to main earlier too. Just remember that main is the caller, and PrintFile is the callee:
$ go run nilnotnil.go
file path:
somepath
DESC OF pathError FROM CALLEE:
pathErr = <nil>
Type: *os.PathError, Value: <nil>
Is pathErr nil? true
----------------------------------
DESC OF err FROM CALLER:
err = <nil>
Type: *os.PathError, Value: <nil>
Is err nil? false
The callee, PrintFile, sees a Type of *os.PathError and a nil Value, which is the same thing that the caller sees when examining the returned error. At this point, both values look exactly the same. Yet, while fmt.Println("Is err nil? ", err == nil) printed false in main, fmt.Println("Is pathErr nil? ", pathError == nil) printed true in PrintFile. This indicates that, in the callee, Go does not perform the comparison to nil using the same logic we saw it use in main, where the comparison was done only against the interface’s concrete type, not its value.
Sometimes it is all about the value
To determine what happens in the callee, let’s update our PrintFile function once more:
//go:noinline
func PrintFile(filePath string) error {
	var pathError *os.PathError
	_, err := os.Stat(filePath)
	if err != nil {
		pathError = &os.PathError{
			Path: filePath,
			Err:  errors.New("File not found"),
		}
	}
	if pathError == nil {
		return nil
	}
	return pathError
}
Now let’s recompile the code and examine the resulting assembly. I found that in this case it was easier to see what happens by building the code without optimizations, but the resulting core logic will be the same:
$ go build -gcflags '-N -l'
$ go tool objdump -S -s main.PrintFile interfaces
pathError = &os.PathError{
0x10b5288 488b442438 MOVQ 0x38(SP), AX
0x10b528d 4889442430 MOVQ AX, 0x30(SP)
0x10b5292 eb00 JMP 0x10b5294
if pathError == nil {
0x10b5294 48837c243000 CMPQ $0x0, 0x30(SP)
0x10b529a 7502 JNE 0x10b529e
0x10b529c eb2b JMP 0x10b52c9
return pathError
0x10b529e 488b442430 MOVQ 0x30(SP), AX
0x10b52a3 4889442440 MOVQ AX, 0x40(SP)
0x10b52a8 488d0d51850400 LEAQ go.itab.*os.PathError,error(SB), CX
0x10b52af 48898c2488000000 MOVQ CX, 0x88(SP)
0x10b52b7 4889842490000000 MOVQ AX, 0x90(SP)
0x10b52bf 488b6c2468 MOVQ 0x68(SP), BP
0x10b52c4 4883c470 ADDQ $0x70, SP
0x10b52c8 c3 RET
The comparison of pathError to nil happens after we create a new os.PathError struct. In this case, whatever was copied from AX into 0x30(SP) is used to determine whether pathError == nil. The question is, does AX hold the Type or the Value of the interface? The answer to this is in the return statement shown above.
First, we load the address of the itab (which contains the Type) in CX. Then, CX is moved to the stack of the caller.
0x10b52a8 488d0d51850400 LEAQ go.itab.*os.PathError,error(SB), CX
0x10b52af 48898c2488000000 MOVQ CX, 0x88(SP)
The only thing left to do is to move the Value of pathError to the stack as well (so the caller can access it), which in this case is stored in AX. After that, we restore the stack and return to the caller:
0x10b52b7 4889842490000000 MOVQ AX, 0x90(SP)
0x10b52bf 488b6c2468 MOVQ 0x68(SP), BP
0x10b52c4 4883c470 ADDQ $0x70, SP
0x10b52c8 c3 RET
This confirms that AX holds the value, which tells us that the pathError == nil check is performed against the value of pathError, not its type. While this may seem strange (or fascinating, depending on how tired you are by now) it makes sense. The compiler is aware that pathError was declared locally with the type *os.PathError, so it does not need to compare against the type. The compiler knows that pathError has a type. Instead, it checks against the value. This is in contrast to what we saw in main, the caller, where Go cannot make assumptions regarding the type of the result of PrintFile (as the signature for the PrintFile function shows that it returns a generic error interface) so it performed the comparison against the Type instead.
Engineering fixes
All this chaos, destruction, and panics are fun. But we are engineers, so a solution to this is also in order.
So how do we fix this? One option is to return an explicit nil when you know you have to return nil:
//go:noinline
func PrintFile(filePath string) error {
	var pathError *os.PathError
	//...
	if pathError == nil {
		return nil // explicit nil: the returned interface has no type
	}
	return pathError
}
Another way, as suggested here, is to declare and return the base error interface rather than a concrete type.
//go:noinline
func PrintFile(filePath string) error {
	var pathError error
	_, err := os.Stat(filePath)
	if err != nil {
		pathError = &os.PathError{
			Path: filePath,
			Err:  errors.New("File not found"),
		}
	}
	return pathError
}
In both cases, PrintFile will return a nil Type (when the value returned is nil as well), and the returned error will evaluate to nil when it is actually nil.
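To see the explicit-nil fix end to end, here is a self-contained sketch. It is not the post's program: statError and CheckFile are hypothetical names, and the file printing is left out so that only the error path is exercised.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// statError is a hypothetical helper that works with the concrete type.
// It returns a nil pointer when the path exists.
func statError(path string) *os.PathError {
	if _, err := os.Stat(path); err != nil {
		return &os.PathError{Path: path, Err: errors.New("File not found")}
	}
	return nil
}

// CheckFile converts the pointer to an error only when it is non-nil,
// so callers can rely on the usual err != nil check.
func CheckFile(path string) error {
	if pe := statError(path); pe != nil {
		return pe
	}
	return nil
}

func main() {
	// "." should always exist; the second path is assumed not to.
	fmt.Println(CheckFile(".") == nil)
	fmt.Println(CheckFile("./no-such-file-nilnotnil-example") == nil)
}
```

The conversion from pointer to interface happens in exactly one place, guarded by a nil check on the pointer.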
Side note on reflection
I looked into the reflection package and thought that it was possible to extract the Type from a Value by doing this:
//go:noinline
func main() {
	var filePath string
	fmt.Println("file path: ")
	fmt.Scanln(&filePath)
	err := PrintFile(filePath)
	fmt.Println("\n\nREFLECTION OF err FROM CALLER:")
	v := reflect.ValueOf(err)
	fmt.Println("Value of err: ", v)
	fmt.Println("extracting type from value of err...:", v.Type())
}
The above works; however, when looking at the code for reflect.ValueOf here, it is clear that Go snags Type information when calling that function. ValueOf calls unpackEface(i), which extracts the type from the interface struct.