ghProxy is a reverse proxy HTTP cache optimized for use with the GitHub API (https://api.github.com). It is essentially a reverse proxy wrapper around ghCache with Prometheus instrumentation to monitor disk usage.

ghProxy is designed to reduce API token usage by allowing many components to share a single ghCache.

Using ghProxy with Prow

While ghProxy can be used with any GitHub API client, it was designed for Prow. Prow’s GitHub client request throttling is optimized for use with ghProxy and doesn’t count requests that can be fulfilled with a cached response against the throttling limit.

Many Prow features (and soon components) require ghProxy in order to avoid rapidly consuming the API rate limit. Direct your Prow components that use the GitHub API (anything that requires the GH token secret) to use ghProxy and fall back to using the upstream API by adding the following flags:

--github-endpoint=http://ghproxy  # Replace this as needed to point to your ghProxy instance.
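
The "use ghProxy first, fall back to the upstream API" behavior the flag above configures can be sketched as follows. This is a minimal illustration, not Prow's actual client code: the `get_with_fallback` function, the endpoint list, and `fake_fetch` are all hypothetical stand-ins.

```python
# Hypothetical sketch: try endpoints in order, falling back to the next one
# when a connection fails (e.g. when the ghproxy instance is unreachable).
def get_with_fallback(path, endpoints, fetch):
    last_err = None
    for endpoint in endpoints:  # e.g. ["http://ghproxy", "https://api.github.com"]
        try:
            return fetch(endpoint + path)
        except ConnectionError as err:
            last_err = err      # this endpoint is down: try the next one
    raise last_err

# Simulated fetch in which ghproxy is down, so the upstream API serves the request.
def fake_fetch(url):
    if url.startswith("http://ghproxy"):
        raise ConnectionError("ghproxy unreachable")
    return f"200 OK from {url}"

body = get_with_fallback("/repos/k/t",
                         ["http://ghproxy", "https://api.github.com"],
                         fake_fetch)
assert body == "200 OK from https://api.github.com/repos/k/t"
```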


A new container image is automatically built and published whenever this directory is changed on the master branch. You can find a recent stable image tag and an example of how to deploy ghProxy to Kubernetes by checking out Prow’s ghProxy deployment.

Throttling algorithm

To prevent hitting GitHub API secondary rate limits, an additional ghProxy throttling algorithm can be configured and used. It is described below.

1 - ghCache


ghCache is an HTTP cache optimized for caching responses from the GitHub API (https://api.github.com). Specifically, it has the following non-standard caching behavior:

  • Every cache hit is revalidated with a conditional HTTP request to GitHub regardless of cache entry freshness (TTL). The ‘Cache-Control’ header is ignored and overwritten to achieve this.
  • Concurrent requests for the same resource are coalesced and share a single request/response from GitHub instead of each request resulting in a corresponding upstream request and response.

ghCache also provides Prometheus instrumentation to expose cache activity, request duration, and API token usage/savings.


The most important behavior of ghCache is the mandatory cache entry revalidation. While this property would cause most API caches to use tokens excessively, in the case of GitHub it actually saves API tokens, because conditional requests for unchanged resources don’t cost any API tokens. Free revalidation allows us to ensure that every request is satisfied with the most up-to-date resource without spending an API token unless the resource has been updated since we last checked it.
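
The revalidation flow can be sketched in a few lines. This is a simplified in-memory model, not ghCache's implementation: every cache hit triggers a conditional request, and a 304 (Not Modified) response serves the cached body, which for GitHub costs no API token. The `Upstream` and `RevalidatingCache` classes are hypothetical.

```python
import hashlib

class Upstream:
    """Stand-in for the GitHub API; returns (status, body, etag)."""
    def __init__(self, body):
        self.body = body
        self.requests = 0  # number of upstream round-trips

    def get(self, if_none_match=None):
        self.requests += 1
        etag = hashlib.sha256(self.body.encode()).hexdigest()
        if if_none_match == etag:
            return 304, None, etag   # unchanged: free for GitHub tokens
        return 200, self.body, etag  # first fetch or changed: costs a token

class RevalidatingCache:
    """Every hit is revalidated upstream, regardless of TTL."""
    def __init__(self, upstream):
        self.upstream = upstream
        self.entry = None  # (etag, body)

    def get(self):
        if_none_match = self.entry[0] if self.entry else None
        status, body, etag = self.upstream.get(if_none_match)
        if status == 304:
            return self.entry[1]     # revalidated cache hit
        self.entry = (etag, body)    # miss, or resource was updated
        return body

upstream = Upstream("issue list v1")
cache = RevalidatingCache(upstream)
assert cache.get() == "issue list v1"   # initial fetch (200)
assert cache.get() == "issue list v1"   # revalidated hit (304)
upstream.body = "issue list v2"
assert cache.get() == "issue list v2"   # resource changed (200)
```

Note that all three reads go upstream; only the middle one is answered with a 304 and would therefore be free.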

Request coalescing is beneficial for use cases in which the same resource is requested multiple times in rapid succession. Normally these requests would each result in an upstream request to GitHub, potentially costing API tokens, but with request coalescing at most one token is used. This particularly helps when many handlers react to the same event like in Prow’s hook component.
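
The coalescing idea can be illustrated with a small sketch. ghproxy itself is written in Go (where a singleflight-style mechanism is the natural fit); the `Coalescer` class below is a hypothetical Python model showing how five concurrent requests for the same resource can share one upstream fetch.

```python
import threading
import time

class Coalescer:
    """Concurrent get() calls for the same key share one fetch."""
    def __init__(self, fetch):
        self.fetch = fetch           # upstream call, e.g. a GitHub API GET
        self.lock = threading.Lock()
        self.inflight = {}           # key -> (done event, result holder)

    def get(self, key):
        with self.lock:
            pending = self.inflight.get(key)
            if pending is None:
                pending = (threading.Event(), [])
                self.inflight[key] = pending
                leader = True        # this caller performs the fetch
            else:
                leader = False       # piggyback on the in-flight fetch
        event, result = pending
        if leader:
            result.append(self.fetch(key))  # single upstream round-trip
            with self.lock:
                del self.inflight[key]
            event.set()
        else:
            event.wait()             # reuse the leader's response
        return result[0]

calls = []
def slow_fetch(key):
    time.sleep(0.1)                  # window in which requests pile up
    calls.append(key)
    return f"body of {key}"

c = Coalescer(slow_fetch)
threads = [threading.Thread(target=c.get, args=("/repos/k/t/pulls",))
           for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert len(calls) == 1               # five requests, one upstream fetch
```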

2 - Additional throttling algorithm


An additional throttling algorithm was introduced to ghproxy to prevent secondary rate limiting issues (HTTP 403 responses) in large Prow installations that span several organizations. Its purpose is to schedule incoming requests in line with GitHub’s general rate-limiting guidelines.


Each incoming request is analyzed to determine whether it targets GitHub API v3 or API v4. Separate queues are formed not only per API version but also per organization if the Prow installation uses GitHub Apps. If a bot user account is used instead, every request coming from that account is categorized as coming from the same organization, because such requests are identified not by App ID and organization name but by the SHA-256 hash of the token.
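
The bucketing described above could be keyed as in the following sketch. The `queue_key` function and its key format are hypothetical illustrations, not ghproxy's actual code; only the grouping rules (per API version and organization for GitHub Apps, per token hash for bot accounts) come from the description above.

```python
import hashlib

def queue_key(api_version, org=None, token=None):
    """Derive the throttling-queue identity for a request (illustrative)."""
    if org is not None:
        # GitHub App request: identified by organization (and API version).
        return (api_version, f"org:{org}")
    # Bot user account: identified only by the SHA-256 hash of its token,
    # so every request from that account shares a single queue.
    digest = hashlib.sha256(token.encode()).hexdigest()
    return (api_version, f"token:{digest}")

# Separate queues per organization and per API version for GitHub Apps:
assert queue_key("v3", org="kubernetes") != queue_key("v3", org="istio")
assert queue_key("v3", org="kubernetes") != queue_key("v4", org="kubernetes")

# All requests from one bot token land in the same queue:
assert queue_key("v3", token="ghp_secret") == queue_key("v3", token="ghp_secret")
```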

Different throttling times can be applied per API version.

Under very high load, the algorithm prefers hitting secondary rate limits over forming a massive queue of throttled requests, so a default maximum waiting time in a queue of 30 seconds is enforced.
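
A per-queue scheduler with this behavior can be sketched as follows. The `SpacingQueue` class is a hypothetical model: the parameter names mirror the --throttling-time-ms and --throttling-max-delay-duration-seconds flags, but the booking logic is an illustration of the described behavior, not ghproxy's implementation.

```python
class SpacingQueue:
    """Space requests spacing_ms apart; bypass the queue past max_delay_ms."""
    def __init__(self, spacing_ms, max_delay_ms):
        self.spacing_ms = spacing_ms
        self.max_delay_ms = max_delay_ms
        self.next_free_ms = 0        # earliest time the next slot opens

    def schedule(self, now_ms):
        """Return the wait in ms, or None to send immediately (and
        risk a secondary rate limit) rather than queue too long."""
        slot = max(now_ms, self.next_free_ms)
        wait = slot - now_ms
        if wait > self.max_delay_ms:
            return None              # queue too long: don't book a slot
        self.next_free_ms = slot + self.spacing_ms
        return wait

# A burst of 5 requests at t=0 with 900 ms spacing and a 3000 ms cap:
q = SpacingQueue(spacing_ms=900, max_delay_ms=3000)
waits = [q.schedule(0) for _ in range(5)]
assert waits == [0, 900, 1800, 2700, None]  # the 5th would wait 3600 ms
```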


The --throttling-time-ms and --get-throttling-time-ms flags must be set to non-zero values; otherwise the additional throttling mechanism is disabled.

All available flags:

  • throttling-time-ms enables the throttling mechanism, which imposes time spacing between outgoing requests, counted per organization. Has to be set together with --get-throttling-time-ms.
  • throttling-time-v4-ms is the same as the flag above, but when set it applies a separate time spacing for API v4.
  • get-throttling-time-ms allows setting a different time spacing for API v3 GET requests.
  • throttling-max-delay-duration-seconds and throttling-max-delay-duration-v4-seconds set the maximum throttling time for API v3 and API v4 respectively. The default value is 30. They exist so that, during periods of high load, hitting secondary rate limits is preferred over forming massive request queues.
  • request-timeout sets the request timeout, which also applies to paged requests. The default is 30 seconds. Consider increasing it if throttling-max-delay-duration-seconds or throttling-max-delay-duration-v4-seconds is raised.

Example configuration

Args from ghproxy configuration YAML file:

          - --cache-dir=/cache
          - --cache-sizeGB=10
          - --legacy-disable-disk-cache-partitions-by-auth-header=false
          - --get-throttling-time-ms=300
          - --throttling-time-ms=900
          - --throttling-time-v4-ms=850
          - --throttling-max-delay-duration-seconds=45
          - --throttling-max-delay-duration-v4-seconds=110
          - --request-timeout=120
          - --concurrency=1000 # rely only on additional throttling algorithm and "disable" the previous solution


The impact of applying the additional throttling can be inspected using two ghproxy Prometheus metrics:

  • github_request_duration to inspect returned status codes across user agents and paths.
  • github_request_wait_duration_seconds to inspect the status and waiting times of the requests handled by the throttling algorithm.

Both metrics are histograms.