Surendhar Reddy

Cloud Run: Learnings, hacks and tips

Over the past few weeks, I have been working on making our compute resources hosted on Cloud Run more efficient. The aim was to reduce cold-start latency and cloud spending.

In this blog post, I would like to share some of the learnings, hacks and tips that I have discovered during this exercise.

. . .


How to measure cold starts

Use container startup latency to measure cold starts.

It is the time between when an instance is started and when it’s ready to receive requests.

This latency is primarily influenced by what your code does at startup, so that is the place to optimize.
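To see how much of that latency your own code contributes, you can time it from inside the process. The following is a minimal sketch, not the Cloud Run metric itself: `load_heavy_dependencies` is a hypothetical stand-in for whatever your service does at startup, and the timing only covers the part that runs after the Python interpreter has started.

```python
import time

# Captured as early as possible; in a real service this line would sit at the
# very top of the entrypoint module, before any heavy imports.
PROCESS_START = time.monotonic()


def load_heavy_dependencies(delay_seconds: float = 0.05) -> dict:
    """Hypothetical stand-in for slow startup work (clients, models, caches)."""
    time.sleep(delay_seconds)
    return {"ready": True}


def start_up() -> float:
    """Run the startup work and return seconds elapsed since process start."""
    load_heavy_dependencies()
    elapsed = time.monotonic() - PROCESS_START
    # Logging this on every start makes slow startups visible in the logs.
    print(f"startup took {elapsed:.3f}s before ready to serve")
    return elapsed


if __name__ == "__main__":
    start_up()
```

Comparing this number before and after a change gives a quick signal on whether a startup optimization actually helped.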

Why minimum instances don’t always solve the problem

Minimum instances remove the cold start encountered when going from zero to one instance, but they aren’t a solution for all cold starts: every new instance added as traffic scales out still starts cold. Startup CPU boost can help speed up every one of those cold starts.

Why use startup CPU boost alongside min instances

Min instances are mainly for the “0 > 1” case, where they completely eliminate the cold start.

Startup CPU boost is for the “N > N+1” case: it speeds up cold starts but does not eliminate them.

Use both, notably because CPU boost doesn’t impact your bill much.
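As a rough illustration of enabling both settings together, here is a small sketch that assembles a `gcloud run deploy` command. The service name, image and region are hypothetical placeholders, and the flags (`--min-instances`, `--cpu-boost`) should be checked against the gcloud version you have installed.

```python
import shlex


def build_deploy_command(
    service: str, image: str, region: str, min_instances: int
) -> list[str]:
    """Build a gcloud command that sets min instances and startup CPU boost."""
    return [
        "gcloud", "run", "deploy", service,
        f"--image={image}",
        f"--region={region}",
        # Keeps this many instances warm, covering the "0 > 1" case.
        f"--min-instances={min_instances}",
        # Extra CPU during startup, helping the "N > N+1" case.
        "--cpu-boost",
    ]


if __name__ == "__main__":
    # All argument values below are made-up examples.
    cmd = build_deploy_command(
        "my-service", "gcr.io/my-project/my-image", "us-central1", 1
    )
    print(shlex.join(cmd))
```

Building the command as a list keeps it easy to review in code and to pass to `subprocess.run` without shell quoting problems.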

How request timeout setting works

For Cloud Run services, your container must send a response within the time specified in the request timeout setting. The clock starts when the request is received, so it includes the container startup time. If the response is not sent in time, the request is ended and a 504 error is returned.
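The practical consequence is that a cold start eats into the request's time budget. A toy model of the rule, with made-up numbers and a hypothetical function name, could look like this:

```python
def response_status(
    startup_seconds: float, handling_seconds: float, timeout_seconds: float
) -> int:
    """Return the status a client would see under the request timeout rule.

    Simplified model: startup time and handling time both count against
    the same timeout.
    """
    total_seconds = startup_seconds + handling_seconds
    return 200 if total_seconds <= timeout_seconds else 504


if __name__ == "__main__":
    # Illustrative values only: the same handler fits the budget on a warm
    # instance but not when a slow cold start is added on top.
    print(response_status(0, 250, 300))
    print(response_status(60, 250, 300))
```

A request that comfortably fits the timeout on a warm instance can still end in a 504 when it lands on an instance that is starting up.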

Container lifecycle

Instances must listen for requests within 4 minutes after being started and all containers within the instance need to be healthy.

A request waiting for an instance to start will be kept in a queue for a maximum of 10 seconds.

If one or more Cloud Run containers exceed the total container memory limit, the instance is terminated. All requests that are still processing on the instance end with an HTTP 500 error.
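Since an instance only counts as started once it is listening, it helps to open the port before doing anything optional. The sketch below uses only the standard library and assumes the port arrives in the `PORT` environment variable with a fallback of 8080; `Handler` and `make_server` are illustrative names, not part of any Cloud Run API.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional


class Handler(BaseHTTPRequestHandler):
    """Tiny handler that answers every GET with a plain-text 'ok'."""

    def do_GET(self) -> None:
        body = b"ok\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format: str, *args) -> None:
        # Silence the default per-request logging for this sketch.
        pass


def make_server(port: Optional[int] = None) -> HTTPServer:
    """Bind on all interfaces, reading the port from PORT when not given."""
    if port is None:
        port = int(os.environ.get("PORT", "8080"))
    return HTTPServer(("0.0.0.0", port), Handler)


if __name__ == "__main__":
    make_server().serve_forever()
```

Anything slow that is not needed for the first response can then be moved out of the path between process start and `serve_forever()`.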

Cloud Run serving errors

Code                 Reason
HTTP 401             Client is not authenticated properly
HTTP 403             Client is not authorized to invoke or call the service
HTTP 404             Not Found
HTTP 429             No available container instances
HTTP 500             Cloud Run couldn’t manage the rate of traffic
HTTP 500 / HTTP 503  Container instances are exceeding memory limits
HTTP 503             Malformed response or container instance connection issue
HTTP 503             Unable to process some requests due to high concurrency setting
HTTP 504             Gateway timeout error
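When triaging logs, it can be handy to have the table above as a lookup, since some status codes map to more than one cause. This is only a sketch: the dictionary mirrors the table as written, and `possible_reasons` is a hypothetical helper name.

```python
# Status code -> possible reasons, copied from the serving errors table.
SERVING_ERRORS: dict[int, list[str]] = {
    401: ["Client is not authenticated properly"],
    403: ["Client is not authorized to invoke or call the service"],
    404: ["Not Found"],
    429: ["No available container instances"],
    500: [
        "Cloud Run couldn't manage the rate of traffic",
        "Container instances are exceeding memory limits",
    ],
    503: [
        "Container instances are exceeding memory limits",
        "Malformed response or container instance connection issue",
        "Unable to process some requests due to high concurrency setting",
    ],
    504: ["Gateway timeout error"],
}


def possible_reasons(status: int) -> list[str]:
    """Return the listed reasons for a status code, or an empty list."""
    return SERVING_ERRORS.get(status, [])


if __name__ == "__main__":
    for reason in possible_reasons(503):
        print(f"503: {reason}")
```

For an ambiguous code such as 503, the list is a checklist of things to rule out rather than a diagnosis.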

How pricing works in Cloud Run

You pay for CPU, memory and the traffic sent to the client from your application (egress traffic).

Tier 2 pricing (in USD)

Resource  Free Tier         Charged Rate
CPU       100 milliseconds  $0.00003360 per vCPU-second
Memory    128 MB            $0.00000350 per GB-second
. . .
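To make the rates concrete, here is a small worked calculation using the two Tier 2 rates listed above. It is a simplified sketch: it ignores the free tier, egress and anything else that is not in the table, and the instance size and duration are made-up inputs.

```python
# Tier 2 rates in USD, taken from the table above.
CPU_RATE_PER_VCPU_SECOND = 0.00003360
MEMORY_RATE_PER_GB_SECOND = 0.00000350


def compute_cost(vcpus: float, memory_gb: float, billed_seconds: float) -> float:
    """Estimate the CPU plus memory cost in USD for the billed time."""
    cpu_cost = vcpus * billed_seconds * CPU_RATE_PER_VCPU_SECOND
    memory_cost = memory_gb * billed_seconds * MEMORY_RATE_PER_GB_SECOND
    return cpu_cost + memory_cost


if __name__ == "__main__":
    # Hypothetical instance: 1 vCPU and 0.5 GB of memory, billed for one hour.
    print(f"${compute_cost(1, 0.5, 3600):.5f}")
```

For that example, CPU comes to 1 × 3600 × 0.00003360 = 0.12096 USD and memory to 0.5 × 3600 × 0.00000350 = 0.0063 USD, so about 0.12726 USD for the hour. CPU dominates the bill at these rates.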

Hacks (vs impact on cost) to keep the instances warm

. . .

Tips for performance optimisation

. . .

Further reading