THAT 503 WAS NOT YOUR CODE

The alerts started coming in mid-afternoon. Intermittent 503s. Not a flat outage, not a clean failure. Just enough errors to be alarming and inconsistent enough to be confusing. The kind of thing that makes you refresh the dashboard three times hoping the number changes.

Our team builds a complaint intake tool for the FDA. It simplifies the reporting process for the public and agency employees, and it integrates with several upstream systems. When one of those integrations started returning 503s, the software engineering team did exactly what made sense from where they sat: they started looking at the API.

That was the right instinct. It just wasn’t the right layer.

The Assumption That Made Sense

When you spend most of your time writing application code, your mental model of a request is pretty direct. A user does something, the app handles it, a response comes back. If something goes wrong, you look at the app.

That model is correct for your layer of the stack. The problem is that in a cloud environment, there are layers between the client and your application that your code never touches. And those layers can fail too.

In our case, the team concluded the upstream API was degraded. That was a reasonable read. The 503s were real. But the API itself was healthy. The failure was happening before the request ever reached it.

What Actually Lives Between the Client and Your App

GCP’s external HTTP(S) load balancer sits in front of your backend services and manages how traffic gets routed to your application instances. Before it forwards a request anywhere, it checks whether each backend instance is healthy enough to receive traffic.

Those health checks are a separate configuration from your application. You define a path, a port, a protocol, and thresholds for how many consecutive passes or failures should flip an instance’s status. When an instance fails enough health checks in a row, the load balancer marks it unhealthy and stops sending it traffic.

Here is the part that trips people up: when a request comes in and gets routed toward an unhealthy instance, the load balancer returns a 503 to the client. Not your application. The load balancer.

From the client’s perspective, and from a basic API response, that 503 looks identical to a 503 your application would generate. Same status code. Same surface appearance. Completely different origin and completely different fix.

How to Tell the Difference

This is where GCP gives you specific tools, and knowing where to look changes everything.

Start with Cloud Logging. Filter your logs for httpRequest.status = 503 and look at the statusDetails field in the log entries. When the load balancer itself is the source of the 503, GCP populates that field with a value like backend_unhealthy. That is your signal. If you see backend_unhealthy in the logs, your application code is not the problem. Stop looking there.

Check the backend service health directly. In the GCP console, navigate to Network Services > Load Balancing and open the relevant backend service. You will see the health status of each backend instance listed in near real time. If instances are flipping between healthy and unhealthy, that pattern tells you something about what is triggering the health check failures, whether it is a slow startup, a misconfigured check path, resource pressure on the instance, or something upstream that the health check endpoint depends on.

Pay attention to response latency. 503s from the load balancer tend to come back very fast because the request never travels to your application. If your error rate is spiking while your p99 latency is dropping or staying unusually low, that asymmetry is a strong indicator the failures are happening at the load balancer layer, not inside your app.

If all three of those are clean, then it is your application. Not before.

Explaining It to the Team

The framing that seemed to click was this: think of the load balancer as a bouncer working the door. Your application is the bar inside. Before anyone gets in, the bouncer checks whether things are running smoothly back there. If the check comes back wrong, the bouncer turns the customer away at the door. The customer gets a 503. The bar never knew they were there.

The 503 the customer receives looks the same whether the bar is actually closed or the bouncer just decided not to let them in. But the fix is in completely different places. If the bouncer is the problem, changing something in the bar does nothing.

Once the team had that picture, the conversation shifted from “why is the API broken” to “why is the health check failing,” which is the right question. We traced it back to a backend instance that was experiencing intermittent latency on startup, causing it to miss the health check threshold and get marked unhealthy before it was ready to serve traffic.

What to Do Next Time

When you see intermittent 503s in a GCP-hosted service, run through these two checks before touching your application code:

Open Cloud Logging, filter for 503 responses, and look at the statusDetails field. If it says backend_unhealthy, the load balancer is the origin of the error.
Open Network Services > Load Balancing in the GCP console and check the health status of your backend instances directly.

If both of those are clean, then your application is worth investigating. But starting there without ruling out the load balancer layer first is how you spend an hour debugging code that was never broken.

The 503 that scared us that afternoon was not coming from anyone’s code. It was a bouncer doing its job, flagging an instance that was not quite ready. Knowing where to look made the difference between a long incident and a quick one.

Resources for Developers

Diagnose load balancer errors: Use Cloud Logging to filter for statusDetails and identify whether a 503 originated from your application or the load balancer.
Understand GCP health checks: Read the Health Check Overview to learn how check intervals, thresholds, and paths affect backend instance status.
Configure external load balancing: The External HTTP(S) Load Balancer documentation covers backend service setup, URL maps, and forwarding rules end to end.
Monitor backend health in real time: Explore Cloud Monitoring to set up alerting on backend health state changes before the next incident catches you off guard.

Written on June 3, 2026