As described in a previous article, Sourcegraph’s main source of truth for all code data is gitserver, which is a sharded RPC service wrapping git commands. Requests to this service specify a repository name, which indicates the directory to perform the git command in, as well as the arguments that should be passed to the command. The LSIF service needs to fetch commit ancestry data from gitserver, which it gets by formatting the output of git log. The request payload looks like the following.
|
|
This code path reproducibly returns a subprocess error in the HTTP response trailers on another developer’s machine:
|
|
After confirming that we were running the same version of the code, I checked that it still worked for me locally. And it did. It also worked as I expected on a second machine of mine…
This isn’t the first time I’ve encountered another developer’s setup that broke assumptions I’ve made in code. Postgres environment variables and the existence of additional tables have bitten me in the past. Passwords with special characters in them have bitten me in the past. Installing different versions of protobuf has bitten me in the past. The list goes on.
Looking closer at the error, it appears that git is trying to invoke the git command, which it doesn’t have, and tries to suggest the init command, which has the lowest edit distance. This is really strange, because this command totally works on my machines:
|
|
In fact, any number of leading gits work on my machine!
|
|
Oh, dang.
I remembered my old habit of beginning a git command, spacing out, and then resuming my train of thought from the beginning. I fixed this via a gitconfig alias (which is not original, see the prior art). This alias collapses multiple leading gits into a single git. This alias did not exist on other developer machines (or in production).
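To make the effect of that alias concrete, here is a small Python model of what it does to a command line. It only illustrates the behavior described above and is not the alias itself; `collapse_leading_git` is a hypothetical name.

```python
def collapse_leading_git(argv):
    """Model of the alias: keep the program name, drop any repeated leading "git"."""
    if not argv or argv[0] != "git":
        # Not a git invocation at all; leave it untouched.
        return list(argv)
    rest = argv[1:]
    # Each extra "git" would be resolved by the alias and re-dispatched to git.
    while rest and rest[0] == "git":
        rest = rest[1:]
    return ["git"] + rest


# With the alias, both of these end up running the same command.
print(collapse_leading_git(["git", "log"]))
print(collapse_leading_git(["git", "git", "git", "log"]))
```

On a machine without the alias, nothing performs this collapsing step, so the second `git` is treated as an unknown subcommand.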
Turns out that the args field of the gitserver request was named args and not command for a reason. The issue was trivially fixed here and here, and I later ensured that I can’t ever add it back.
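One way to guard against this class of mistake is to reject argument lists that already begin with `git` before the command is assembled. The following is a minimal sketch under stated assumptions, not the actual Sourcegraph fix: `build_git_argv` is a hypothetical helper, and the `log` format string is only an illustrative choice for fetching commit and parent hashes.

```python
def build_git_argv(args):
    """Build the argv for a git subprocess from the request's args field.

    Hypothetical helper: `args` holds only the arguments, so the program
    name is prepended here and must not appear in `args` itself.
    """
    if args and args[0] == "git":
        raise ValueError('args must not start with "git"; pass only the arguments')
    return ["git", *args]


# Correct usage: arguments only (%H is the commit hash, %P the parent hashes).
print(build_git_argv(["log", "--pretty=%H %P"]))

# Incorrect usage: fails everywhere, not just on machines without the alias.
try:
    build_git_argv(["git", "log", "--pretty=%H %P"])
except ValueError as err:
    print("rejected:", err)
```

Failing fast at this boundary makes the behavior independent of whatever aliases happen to exist in a developer's gitconfig.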