Born of the need to return all the results from a GitLab code search and run some simple summary stats on those results. This seems like something jq or a competing utility would do, but after searching the InterWebs, I came up empty-handed. The usual suggestion was to just use a for loop in a shell script.
Xidel probably has some kind of support for pagination, but the way Xidel works is sometimes difficult to reason about. Shell script, though, I can do.
Tested on macOS, with the *BSD version of sed, but I don't think I've done anything there that won't work on Linux. Feedback welcome.
Make sure you have jq available in your search path.
Create a personal access token in your GitLab settings with the read_api scope.
Then:
GIST=https://gist.github.com/ernstki/3707675c8a4ddb06d128154947c49e29
mkdir -p ~/bin
( curl -L $GIST/raw || wget -O - $GIST/raw ) > ~/bin/glapi
chmod a+x ~/bin/glapi
# define these in your login scripts, or the current shell session
export GITLAB_URL=https://url.to.your/gitlab
export GITLAB_TOKEN='personal access token for read-only API'
# make sure it works
glapi --help
Your ~/bin is typically already in your $PATH on most modern Unixes. You may need to log out and back in if your ~/.profile or similar checks for the existence of ~/bin on login, though.
When searching for an exact phrase, make sure to wrap it in literal double quotes, as shown below.
$ glapi search terms # do a code search for 'search' and 'terms'
$ glapi --all '"exact phrase"' # search for an exact phrase, all results
$ glapi -I '"search phrase"' # see HTTP headers for the above
$ glapi --count /projects # count how many projects
No rate-limiting is done when there are multiple pages of results; this wasn't an issue for me since I created it for use on an internal site, but you could find yourself blocked if you try this on a public or heavily-loaded instance.
There is no error handling if you mess up the GITLAB_URL (remember to include e.g. the /gitlab part of the URL if not served from the root) or your GITLAB_TOKEN is wrong. Here's how you can troubleshoot that, though:
TRACE=1 glapi -I [other options]
That is curl's -I / --head option. Other curl options like -f / --fail may work, too.
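Since the script itself does not validate these settings, a small pre-flight check can catch the most common mistakes before any request is made. This is a minimal sketch and not part of glapi: the function name check_glapi_env is hypothetical, and the rule that the URL must start with http:// or https:// is an assumption.

```shell
# Hypothetical helper (not part of glapi): sanity-check the environment
# before running any queries. Returns non-zero on the first problem found.
check_glapi_env() {
    if [ -z "${GITLAB_URL:-}" ]; then
        echo "GITLAB_URL is not set" >&2
        return 1
    fi
    if [ -z "${GITLAB_TOKEN:-}" ]; then
        echo "GITLAB_TOKEN is not set" >&2
        return 1
    fi
    # Assumption: the URL should be an absolute http(s) URL.
    case "$GITLAB_URL" in
        http://*|https://*) ;;
        *)
            echo "GITLAB_URL should start with http:// or https://" >&2
            return 1
            ;;
    esac
    return 0
}
```

It could be chained in front of the troubleshooting command, e.g. check_glapi_env && TRACE=1 glapi -I /projects. It cannot tell whether the token is actually valid; only a real request can do that.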
The way the script is currently implemented, it depends on the x-total: or
x-total-pages: headers (or their title-cased variants, on older GitLabs) to
perform these functions.
If there are a lot of results, GitLab won't return those headers at all. Worse, a breaking change to the API in the 13.5 release appears to have effectively removed these headers for good, although supposedly they can be re-enabled with a feature flag. I have not tried this.
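Because the header names may arrive lowercase or title-cased depending on the GitLab version, any code that reads them is safer matching the name case-insensitively. The following is a rough sketch of that idea, not the script's actual implementation; get_header is a hypothetical helper name.

```shell
# Hypothetical helper: print the value of one HTTP header, matching the
# header name case-insensitively. Reads raw response headers on stdin.
get_header() {
    awk -v name="$1" '
        BEGIN { name = tolower(name) }
        {
            sub(/\r$/, "")                 # strip the CR from CRLF line endings
            i = index($0, ":")
            if (i == 0) next               # skip the status line and blank lines
            if (tolower(substr($0, 1, i - 1)) == name) {
                value = substr($0, i + 1)
                sub(/^[ \t]+/, "", value)  # trim leading whitespace
                print value
                exit
            }
        }'
}

# Example use (assumes GITLAB_URL and GITLAB_TOKEN are set as described above):
#   curl -s -I --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
#       "$GITLAB_URL/api/v4/projects" | get_header x-total
```

If the header is absent, as described above for large result sets, the function prints nothing, so callers should treat empty output as "total unknown" rather than zero.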
In order to fix this for real, I'd have to loop through results one at a time in order to count them all, and that's not something I currently plan to do.
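For anyone who wants to try that approach anyway, here is one way it might look, fetching a page at a time rather than relying on the total headers. This is only a sketch and is not implemented in glapi: fetch_page_count, count_all_results, ENDPOINT, and PAUSE are all hypothetical names, the endpoint is assumed to return a JSON array and to carry no query string of its own, and the one-second default pause is an arbitrary guess at polite pacing.

```shell
# Hypothetical: print the number of items on page $1 of a paginated endpoint.
# Assumes GITLAB_URL, GITLAB_TOKEN, and ENDPOINT (e.g. /projects) are set,
# and that the endpoint returns a JSON array.
fetch_page_count() {
    curl -sf --header "PRIVATE-TOKEN: $GITLAB_TOKEN" \
        "$GITLAB_URL/api/v4$ENDPOINT?per_page=100&page=$1" | jq 'length'
}

# Walk the pages until one comes back empty, summing the item counts.
# PAUSE is the delay in seconds between requests.
count_all_results() {
    page=1
    total=0
    while :; do
        n=$(fetch_page_count "$page") || return 1
        # An empty or zero count is treated as the end of the results.
        [ "${n:-0}" -gt 0 ] || break
        total=$((total + n))
        page=$((page + 1))
        sleep "${PAUSE:-1}"
    done
    echo "$total"
}
```

With ENDPOINT=/projects set, count_all_results would print a single number. Note that a failed request also produces an empty count, so this sketch cannot distinguish "no more pages" from "request failed".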
See this Reddit thread, issue #264375 and MR #43159 for details.
The API parameter scope=blobs only works when you have a license that allows you to use the (non-free) "Advanced Search" functionality of GitLab.
This is not really a bug, but if you get output like:
[
"scope does not have a valid value"
]
…this is the reason why.
- Fetch multiple pages of json with jq - stackoverflow.com
- GitLab REST API - docs.gitlab.com
Your tool is very nice, thanks 🙏
I just had to change the capitalized headers to lowercase because this is what my GitLab (16.9) was sending in the HTTP headers. I don't know if it depends on the version.
E.g.,