Statically detecting and mapping HTTP and RPC endpoints in Go code

TL;DR

For the past few months, I have been working on a static analysis tool to help me quickly map HTTP and RPC calls in complex codebases. I decided to call this tool Wally. I designed Wally to aid in code reviews and threat modeling by determining how HTTP or RPC endpoints are connected in large codebases.

Wally can be helpful, particularly when working with monorepos containing multiple microservices, as navigating and securing the intricate web of HTTP and gRPC communications can be a challenge.

Wally’s Features:

Automated Discovery: Wally scans the Go codebase to identify HTTP and gRPC calls and route listeners.
Interactive Visualization: Using the --server option for an interactive web UI, Wally enables you to navigate through call paths visually. This feature can be very helpful for code reviews and threat modeling, offering insights into service calls and their relationships.
Variable Resolution: Wally resolves compile-time constant values, including constants and global variables that hold HTTP routes and methods. Instead of hunting for paths in code, Wally gives you a nice list of all endpoints along with path and method strings.
Method Chain Mapping: Wally maps out the chains of HTTP and RPC calls necessary to reach any given endpoint.
Security and Connectivity Insights: Wally provides important insights during code reviews and threat modeling, addressing questions such as:
- Can users externally reach service Y through service A?
- Which service should initiate a call to transmit user input to service X?
- What are the intermediary functions between service A and service Y that might sanitize or modify the input sent to service A?

Here is a video showcasing how Wally helps you visualize call paths to HTTP and RPC endpoints:

I am actively working on Wally and hope to continue improving it as others use it. Give it a try, and let me know if you have any questions: https://github.com/hex0punk/wally

About Wally

The initial steps of threat modeling require that the modeler understand the attack surface of the application or system under test. This is a step that is required whether you are doing a formal threat model or a code review. When working with web applications or modern web services, this of course means mapping routes of services exposed by the application. That is, listing not only all the different (external or internal) HTTP and RPC endpoints from which attacks may be introduced, but also understanding how those endpoints map to each other. This allows us to answer questions such as:

If service A is exposed, can attackers use it to reach service D?
What are the different paths that user input can take from browser input to a database?

The above questions may be somewhat trivial to answer when researching small applications, but answering them can be a daunting task when working with monorepos containing dozens, even hundreds, of microservices.

Let me give you a more concrete example.

Use Case Example

You are analyzing a monorepo containing multiple microservices. Oftentimes, these sorts of projects rely heavily on gRPC, which generates code for setting up gRPC routes via functions that call Invoke. Other services can then use these functions to call each other.

One of the built-in indicators in Wally allows it to find functions that call Invoke for gRPC routes, so you can get a nice list of all gRPC method calls for all your microservices. Further, with --ssa you can also map the chains of methods and gRPC calls necessary to reach any given gRPC route. With Wally you can then answer:

Can users reach service Y hosted internally via service A hosted externally?
Which service would I have to use to initiate a call that sends user input to service X?
What functions are there between service A and service Y that might sanitize or modify the input sent to service A?

Why Not Just Grep Instead?

Grep can certainly help in listing (but not mapping) HTTP endpoints. However,

You’d need to parse through a lot of unnecessary strings.
You may end up with functions that are similar to those you are targeting but have nothing to do with HTTP or RPC.
Grep won’t solve constant values that indicate methods and route paths.

What About Semgrep?

There have been efforts to use Semgrep for listing HTTP endpoints in code, with limited results. You can also write custom rules for the same purpose. However, let’s say that you write a rule such as the following:

rules:
- id: route-finder
  patterns:
    - pattern-either:
        - pattern: $SVC.CreatePublicHandler($METHOD, $PATH, ...)
        - pattern: $SVC.CreateAdminHandler($METHOD, $PATH, ...)
  message: Semgrep found route `$PATH` for method `$METHOD`
  languages: [go]
  severity: WARNING

In the case where $METHOD or $PATH is anything but a string literal, Semgrep will find it difficult to resolve the value of the variable. So sure, you will get the location in the code where a route is set up, but you’d have to hunt for the value of variables and constants mapping to strings that indicate URL paths.
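
To illustrate what resolving such values involves, here is a minimal sketch that uses Go's standard go/types package to recover the strings behind named constants passed to a route-registration call. This is not Wally's implementation; the embedded source snippet, its constant names, and the resolveCallArgs helper are all made up for illustration.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/constant"
	"go/parser"
	"go/token"
	"go/types"
)

// src is a hypothetical service that registers a route using named
// constants instead of string literals.
const src = `package svc

const apiPrefix = "/v1"
const statusPath = apiPrefix + "/status"
const getMethod = "GET"

func CreatePublicHandler(method, path string) {}

func setup() {
	CreatePublicHandler(getMethod, statusPath)
}
`

// resolveCallArgs type-checks source and returns the constant string value
// of each argument passed to the plain function named funcName.
// Arguments that are not compile-time string constants are reported as
// "<unresolved>".
func resolveCallArgs(source, funcName string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "svc.go", source, 0)
	if err != nil {
		return nil, err
	}
	// info.Types records the type of every expression and, for constant
	// expressions, the computed value as well.
	info := &types.Info{Types: make(map[ast.Expr]types.TypeAndValue)}
	conf := types.Config{} // no importer needed: the snippet has no imports
	if _, err := conf.Check("svc", fset, []*ast.File{file}, info); err != nil {
		return nil, err
	}
	var values []string
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok {
			return true
		}
		ident, ok := call.Fun.(*ast.Ident)
		if !ok || ident.Name != funcName {
			return true
		}
		for _, arg := range call.Args {
			tv := info.Types[arg]
			if tv.Value != nil && tv.Value.Kind() == constant.String {
				values = append(values, constant.StringVal(tv.Value))
			} else {
				values = append(values, "<unresolved>")
			}
		}
		return true
	})
	return values, nil
}

func main() {
	values, err := resolveCallArgs(src, "CreatePublicHandler")
	if err != nil {
		panic(err)
	}
	fmt.Println(values)
}
```

The type checker folds the concatenation apiPrefix + "/status" at analysis time, which is the kind of information a purely pattern-based match does not give you.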

What Can Wally Do That Grep Can’t?

Wally currently supports the following features:

Discover HTTP client calls and route listeners in your code by looking at each function name, signature, and package to make sure it finds the functions that you actually care about.
Wally resolves compile-time constant values that are used in the functions of interest. Wally does a pretty good job at finding constants and global variables and resolving their values for you so you don’t have to chase those manually in code.
Wally will report the enclosing function where the function of interest is called.
Wally will also give you all possible call paths to your functions of interest. This can be useful when analyzing monorepos where service A calls service B via a client function declared in service B’s packages. This feature requires that the target codebase is buildable.
Wally will output a nice PNG graph of the call stacks for the different routes it finds.

How Does It Work?

At its core, Wally is a function mapper and tracer. In fact, you can define functions in configuration files that have nothing to do with HTTP or RPC routes to obtain call paths for various functions, regardless of their purpose. This is because Wally does the following:

Finds all functions that match details indicated by the user or defined as custom indicators. Indicators are simply strings that help guide Wally in locating functions of interest. This is accomplished by parsing the Abstract Syntax Tree (AST) of a program.
Determines all possible ways in which the function could be called. This is accomplished by using SSA Analysis.
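
The first of these steps can be sketched with Go's standard go/ast package. The snippet below is a simplified illustration, not Wally's code: the indicator and match types are hypothetical, and it matches purely on the package name and function name written at the call site, whereas Wally also takes the signature and full package path into account.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// indicator is a hypothetical, simplified stand-in for a Wally indicator.
type indicator struct {
	Package  string // package name as written at the call site, e.g. "http"
	Function string // function name, e.g. "HandleFunc"
}

// match records where a function of interest is called.
type match struct {
	Enclosing string // name of the function containing the call
	Line      int    // line of the call in the source
}

// sample is made-up source containing one call of interest.
const sample = `package main

import "net/http"

func registerHandlers() {
	http.HandleFunc("/health", nil)
}

func helper() {
	println("not a route")
}
`

// findMatches parses source and returns every call of the form
// ind.Package.ind.Function(...) together with its enclosing function.
func findMatches(source string, ind indicator) ([]match, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", source, 0)
	if err != nil {
		return nil, err
	}
	var matches []match
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Body == nil {
			continue
		}
		ast.Inspect(fn.Body, func(n ast.Node) bool {
			call, ok := n.(*ast.CallExpr)
			if !ok {
				return true
			}
			sel, ok := call.Fun.(*ast.SelectorExpr)
			if !ok {
				return true
			}
			pkg, ok := sel.X.(*ast.Ident)
			if ok && pkg.Name == ind.Package && sel.Sel.Name == ind.Function {
				matches = append(matches, match{
					Enclosing: fn.Name.Name,
					Line:      fset.Position(call.Pos()).Line,
				})
			}
			return true
		})
	}
	return matches, nil
}

func main() {
	matches, err := findMatches(sample, indicator{Package: "http", Function: "HandleFunc"})
	if err != nil {
		panic(err)
	}
	for _, m := range matches {
		fmt.Printf("call in %s at line %d\n", m.Enclosing, m.Line)
	}
}
```

A name-only match like this would also fire on an unrelated package imported under the alias http, which is one reason to check types and package paths as well.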

Wally Configurations

Wally needs a bit of hand-holding. Though it can also do a pretty good job at guessing paths, it helps a lot if you tell it the packages and functions to look for, along with the parameters that you are hoping to discover and map. So, to help Wally do the job, you can specify a configuration file in YAML that defines a set of indicators.

Wally runs a number of indicators which are basically clues as to whether a function in code may be related to a gRPC or HTTP route. By default, Wally has a number of built-in indicators that check for common ways to set up and call HTTP and RPC methods using standard and popular libraries. However, sometimes a code base may have custom methods for setting up HTTP routes or for calling HTTP and RPC services. For instance, when reviewing Nomad, you can give Wally the following configuration file with Nomad-specific indicators:

indicators:
  - package: "github.com/hashicorp/nomad/command/agent"
    type: ""
    function: "forward"
    indicatorType: 1
    params:
      - name: "method"
  - package: "github.com/hashicorp/nomad/nomad"
    type: ""
    function: "RPC"
    indicatorType: 1
    params:
      - name: "method"
  - package: "github.com/hashicorp/nomad/api"
    type: "s"
    function: "query"
    indicatorType: 1
    params:
      - name: "endpoint"
        pos: 0

Note that you can specify the parameter whose value you want Wally to attempt to resolve. If you don’t know the name of the parameter (per the function signature), you can give its position in the signature instead. You can then use the --config or -c flag along with the path to the configuration file.
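
As a rough illustration of how a name or a position can both identify the same call argument, the sketch below reads a function signature with go/ast and maps a parameter spec to an argument index. Everything here is an assumption made for the example: the paramSpec type, the helper names, and the forward signature are invented and are not taken from Wally or Nomad.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// paramSpec mirrors the name/pos fields of a params entry in the YAML above.
// The type itself is hypothetical.
type paramSpec struct {
	Name string // parameter name per the function signature, if known
	Pos  int    // zero-based position, used only when Name is empty
}

// sample declares a made-up function whose signature we inspect.
const sample = `package agent

func forward(region, method string, args interface{}) error { return nil }
`

// paramNames returns the parameter names of the function funcName in
// source, in signature order. Unnamed parameters are recorded as "".
func paramNames(source, funcName string) ([]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "sample.go", source, 0)
	if err != nil {
		return nil, err
	}
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Name.Name != funcName {
			continue
		}
		var names []string
		for _, field := range fn.Type.Params.List {
			if len(field.Names) == 0 {
				names = append(names, "")
			}
			// A field such as "region, method string" carries two names.
			for _, n := range field.Names {
				names = append(names, n.Name)
			}
		}
		return names, nil
	}
	return nil, fmt.Errorf("function %q not found", funcName)
}

// argIndex returns the call-argument index that spec refers to, and
// whether the spec matched a parameter at all.
func argIndex(names []string, spec paramSpec) (int, bool) {
	if spec.Name == "" {
		return spec.Pos, spec.Pos >= 0 && spec.Pos < len(names)
	}
	for i, n := range names {
		if n == spec.Name {
			return i, true
		}
	}
	return 0, false
}

func main() {
	names, err := paramNames(sample, "forward")
	if err != nil {
		panic(err)
	}
	byName, _ := argIndex(names, paramSpec{Name: "method"})
	byPos, _ := argIndex(names, paramSpec{Pos: 0})
	fmt.Println(byName, byPos)
}
```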

SSA Analysis in Wally

SSA stands for Static Single Assignment. Compilers, including Go’s, use SSA as an intermediate representation of the code before generating machine code. This allows the compiler to perform various optimization tasks such as dead code elimination. Similarly, static analysis tools can use SSA to answer questions that require an understanding of the flow of a program, not just its syntax. For Wally, SSA allows it to do the following:

Resolve the enclosing function more effectively.
Output all possible call paths to the functions where the routes are defined and/or called.

This is accomplished by using the --ssa flag, which generates output such as the following:

===========MATCH===============
Package:  net/http
Function:  Handle
Params:
    pattern: "/v1/client/metadata"
Enclosed by:  agent.registerHandlers
Position /Users/hex0punk/Tests/nomad/command/agent/http.go:444
Possible Paths: 6
    Path 1:
        n105973:(*github.com/hashicorp/nomad/command/agent.Command).Run --->
        n24048:(*github.com/hashicorp/nomad/command/agent.Command).setupAgent --->
        n24050:github.com/hashicorp/nomad/command/agent.NewHTTPServers --->
        n47976:(*github.com/hashicorp/nomad/command/agent.HTTPServer).registerHandlers --->
    Path 2:
        n104203:github.com/hashicorp/nomad/command/agent.NewTestAgent --->
        n92695:(*github.com/hashicorp/nomad/command/agent.TestAgent).Start --->
        n32861:(*github.com/hashicorp/nomad/command/agent.TestAgent).start --->
        n24050:github.com/hashicorp/nomad/command/agent.NewHTTPServers --->
        n47976:(*github.com/hashicorp/nomad/command/agent.HTTPServer).registerHandlers --->
    Path 3:
        n105973:(*github.com/hashicorp/nomad/command/agent.Command).Run --->
        n117415:(*github.com/hashicorp/nomad/command/agent.Command).handleSignals --->
        n79534:(*github.com/hashicorp/nomad/command/agent.Command).handleReload --->
        n79544:(*github.com/hashicorp/nomad/command/agent.Command).reloadHTTPServer --->
        n24050:github.com/hashicorp/nomad/command/agent.NewHTTPServers --->
        n47976:(*github.com/hashicorp/nomad/command/agent.HTTPServer).registerHandlers --->
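
The idea behind the "Possible Paths" listing can be sketched without any SSA machinery: given a call graph, walk backwards from the matched function through its callers until you reach functions that nothing else calls. The snippet below is a simplified model under stated assumptions. The graph is written by hand from the three paths shown above, with shortened function names, rather than built from SSA, and the callGraph type and pathsTo helper are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// callGraph maps a function to the functions that call it directly.
type callGraph map[string][]string

// nomadGraph is transcribed by hand from the three paths in the sample
// output, using shortened names.
var nomadGraph = callGraph{
	"HTTPServer.registerHandlers": {"NewHTTPServers"},
	"NewHTTPServers":              {"Command.setupAgent", "TestAgent.start", "Command.reloadHTTPServer"},
	"Command.setupAgent":          {"Command.Run"},
	"TestAgent.start":             {"TestAgent.Start"},
	"TestAgent.Start":             {"NewTestAgent"},
	"Command.reloadHTTPServer":    {"Command.handleReload"},
	"Command.handleReload":        {"Command.handleSignals"},
	"Command.handleSignals":       {"Command.Run"},
}

// pathsTo returns every call path that ends at target, ordered from the
// outermost caller to target. Branches that revisit a function already on
// the current path are dropped so that recursion cannot loop forever.
func pathsTo(g callGraph, target string) [][]string {
	var paths [][]string
	onPath := map[string]bool{}
	var walk func(node string, suffix []string)
	walk = func(node string, suffix []string) {
		if onPath[node] {
			return // cycle
		}
		// Build a fresh slice so sibling branches do not share storage.
		path := append([]string{node}, suffix...)
		callers := g[node]
		if len(callers) == 0 {
			paths = append(paths, path) // reached a root
			return
		}
		onPath[node] = true
		for _, caller := range callers {
			walk(caller, path)
		}
		delete(onPath, node)
	}
	walk(target, nil)
	return paths
}

func main() {
	for i, p := range pathsTo(nomadGraph, "HTTPServer.registerHandlers") {
		fmt.Printf("Path %d: %s\n", i+1, strings.Join(p, " ---> "))
	}
}
```

On a real codebase the number of paths can grow quickly, so a tool would likely want to cap the depth or the number of paths it reports.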

PNG and XDOT Graph Output

Wally can generate graphs of call paths. When using the --ssa flag, you can also use -g or --graph to indicate a path for a PNG or XDOT containing a Graphviz-based graph of the call stacks. For example, running:

$ wally map -p ./... --ssa -vvv -f "github.com/hashicorp/nomad/" -g ./mygraph.png

From nomad/command/agent will output this graph:

Specifying a filename with a .xdot extension will create an xdot file instead.
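
For readers who have not worked with Graphviz input before, here is a small sketch of how a set of call paths can be turned into DOT text, which Graphviz tools can then render. It only illustrates the general idea; the toDOT helper is hypothetical and Wally's own graph generation may work quite differently.

```go
package main

import (
	"fmt"
	"strings"
)

// toDOT renders call paths as a Graphviz digraph. Each path is a list of
// function names ordered from caller to callee. Edges shared by several
// paths are emitted once, in order of first appearance.
func toDOT(paths [][]string) string {
	var b strings.Builder
	b.WriteString("digraph callpaths {\n")
	seen := map[string]bool{}
	for _, p := range paths {
		for i := 0; i+1 < len(p); i++ {
			edge := fmt.Sprintf("  %q -> %q;\n", p[i], p[i+1])
			if !seen[edge] {
				seen[edge] = true
				b.WriteString(edge)
			}
		}
	}
	b.WriteString("}\n")
	return b.String()
}

func main() {
	// Two made-up paths that share their last edge.
	paths := [][]string{
		{"Command.Run", "NewHTTPServers", "registerHandlers"},
		{"NewTestAgent", "NewHTTPServers", "registerHandlers"},
	}
	fmt.Print(toDOT(paths))
}
```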

Future Work

There is still more work to do, which means more to learn.

My immediate need is to add a “guesser” mode so that Wally can be run without indicators. That way, it can make educated guesses about possible paths.
Ideally, Wally should create D2 graphs, as they are much nicer than Graphviz.
An additional list of TODOs can be found in the repo’s issues.

I will be writing blog posts on the different approaches for building static analysis tools in Go. So, if for whatever reason you found this blog post AND you are interested in static analysis techniques in Go, keep an eye on this site.

Check out Wally and let me know what you think (somehow).

January 9, 2024