
How to Monitor a REST API for Uptime and Response Time

June 19, 2026


Monitoring a REST API is not the same as monitoring a website. A successful HTTP 200 from your API does not guarantee the response is correct, the response time is acceptable, or the authentication is still working. This guide covers what an API monitor should actually check, the thresholds that catch problems early without being noisy, and the silent failure modes that simple ping checks miss.

The four things an API monitor should check

A useful API monitor checks four things on every probe. Each one catches a different class of failure that the others miss.

Status code: did the API return 200, or did it return 500, 502, or 503?
Response time: did the API respond within an acceptable window?
Body content: does the response contain the expected fields?
Authentication state: is the API still accepting the credentials it was accepting yesterday?

A monitor that checks all four catches misbehaviour that any single check would miss.
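
As a rough illustration, the four checks can be written as one function that inspects an already-captured response. This is a minimal sketch, not a definitive implementation: ProbeResult, evaluate_probe, and the field names in the usage note are hypothetical, and treating 401 or 403 as the authentication signal is an assumption about how your API rejects credentials.

```python
import json
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """What a single probe captured (hypothetical shape)."""
    status: int        # HTTP status code
    elapsed_ms: float  # wall-clock time for the request
    body: str          # raw response body


def evaluate_probe(result, max_ms, required_fields):
    """Return a list of failure descriptions; an empty list means healthy."""
    failures = []

    # 1. Status code
    if result.status != 200:
        failures.append(f"status: expected 200, got {result.status}")

    # 2. Response time
    if result.elapsed_ms > max_ms:
        failures.append(f"time: {result.elapsed_ms:.0f}ms exceeds {max_ms}ms")

    # 3. Body content: must be a JSON object containing the expected fields
    try:
        payload = json.loads(result.body)
    except ValueError:
        payload = None
        failures.append("body: not valid JSON")
    if isinstance(payload, dict):
        missing = [f for f in required_fields if f not in payload]
        if missing:
            failures.append(f"body: missing fields {missing}")
    elif payload is not None:
        failures.append("body: expected a JSON object")

    # 4. Authentication state (assumption: rejection shows up as 401 or 403)
    if result.status in (401, 403):
        failures.append("auth: credentials rejected")

    return failures
```

Calling evaluate_probe(ProbeResult(200, 120.0, '{"id": 1}'), 500, ["id"]) returns an empty list. A 401 with a slow response and a missing field reports all four failures at once, which is the point of checking them together.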

The endpoints worth monitoring

Not every API endpoint needs a monitor. Three categories cover most of the real value.

A health endpoint: usually /health or /status. It should return 200 with a small JSON payload confirming the database connection and any critical dependencies. Lightweight to probe, fast to fail when something is wrong.
The most-used real endpoint: the one that takes the largest fraction of your traffic. If your /users/me endpoint serves 80% of requests, monitor it directly. If it stops working, your customers notice immediately.
The highest-revenue real endpoint: the one that handles payments, subscriptions, or anything that translates directly into money. Same monitor, higher alerting priority.
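
For the health endpoint, the payload described above could be assembled along these lines. This is a sketch under assumptions: build_health_response and the check names are hypothetical, and returning 503 when a dependency fails is one common convention, not something every API follows.

```python
import json


def build_health_response(checks):
    """Run each dependency check and return (status_code, json_body).

    `checks` maps a dependency name to a zero-argument callable that
    returns True when the dependency is reachable.
    """
    results = {}
    for name, check in checks.items():
        try:
            results[name] = "ok" if check() else "failing"
        except Exception:
            # A check that raises counts as a failed dependency.
            results[name] = "failing"

    healthy = all(state == "ok" for state in results.values())
    body = json.dumps({
        "status": "ok" if healthy else "degraded",
        "checks": results,
    })
    # Assumption: signal an unhealthy dependency with 503 so that a
    # status-code check alone is enough to catch it.
    return (200 if healthy else 503), body
```

Keeping each check cheap (for example, a trivial query against the database) is what keeps the endpoint lightweight to probe.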

Setting response-time thresholds without false alarms

Response-time alerts are the easiest way to produce a noisy monitor. A single slow probe during a deploy is not signal; ten slow probes in a row is. The threshold strategy that works is based on percentiles of your own traffic, not on a fixed absolute number.

Measure your endpoint's p95 response time over a normal week. Set the alert threshold at 2x the p95. Alert only if three consecutive probes exceed it. The result is a monitor that ignores normal variance but catches real degradation. Tune the multiplier and the consecutive-probe count over the first month, then leave them alone.
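
The arithmetic is small enough to sketch directly. The function names here are hypothetical, and the nearest-rank percentile is just one of several common definitions; your monitoring tool may compute p95 slightly differently.

```python
def p95(samples_ms):
    """Nearest-rank 95th percentile of a list of response times."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    ordered = sorted(samples_ms)
    # Integer ceiling of 0.95 * n, avoiding floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def alert_threshold(samples_ms, multiplier=2):
    """Threshold = multiplier x p95 of a normal week's samples."""
    return multiplier * p95(samples_ms)


def should_alert(recent_ms, threshold_ms, consecutive=3):
    """True only if the last `consecutive` probes all exceeded the threshold."""
    if len(recent_ms) < consecutive:
        return False
    return all(ms > threshold_ms for ms in recent_ms[-consecutive:])
```

With a baseline p95 of 95ms the threshold is 190ms. A probe history ending in 200, 250, 300 triggers an alert; a history ending in 200, 250, 100 does not, because the most recent probe recovered.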

The silent failure modes simple ping checks miss

An API can return HTTP 200 and still be broken. Four common silent failures show up in real incidents.

Cached error responses: the API returns a 500 once, the CDN caches the error body, and from then on the edge serves that stale error body with a 200.
Wrong content type: the API returns HTML instead of JSON because a debug middleware was left enabled. Status is 200, body is unparseable.
Empty success: the API returns 200 with an empty result set, because the underlying query silently lost a join during a migration.
Authentication drift: the API returns 200 because the rate limiter handles the unauthenticated request gracefully, but the body says 'please log in'.
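
Each of these can be caught with an assertion on a 200 response. The sketch below is illustrative only: the "error" and "items" field names and the login phrases are assumptions about a hypothetical API, and you would replace them with whatever your own responses contain.

```python
import json

# Assumption: phrases a login prompt might contain in this hypothetical API.
LOGIN_PHRASES = ("please log in", "login required")


def silent_failures(status, content_type, body, list_field="items"):
    """Return problems hidden behind an HTTP 200; empty list if none found."""
    if status != 200:
        return []  # non-200 responses are already caught by the status check

    # Wrong content type: HTML (or anything else) where JSON is expected.
    if "json" not in content_type.lower():
        return ["wrong content type"]

    try:
        payload = json.loads(body)
    except ValueError:
        return ["unparseable body"]
    if not isinstance(payload, dict):
        return ["unexpected JSON shape"]

    problems = []
    # Cached error response: an error body served with a 200.
    if "error" in payload:
        problems.append("error body behind 200")
    # Empty success: the list we expect to be populated came back empty.
    if payload.get(list_field) == []:
        problems.append("empty result set")
    # Authentication drift: the body asks the caller to log in.
    if any(phrase in body.lower() for phrase in LOGIN_PHRASES):
        problems.append("login prompt in body")
    return problems
```

The empty-result check only makes sense on an endpoint where the monitor's account is known to have data, so point it at a fixture record that should always exist.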

Authenticated monitoring without leaking credentials

Many APIs require authentication on every endpoint. The monitor needs credentials. Two patterns work well; one common pattern is a serious security risk.

Good: a dedicated read-only API key with the narrowest possible scope, rotated quarterly, stored encrypted in the monitoring tool.
Better: a webhook signature or HMAC instead of a bearer token, so the monitor cannot impersonate a user.
Bad: a long-lived bearer token for the production admin user, pasted into the monitor config.

The third pattern produces incidents of its own when the monitor's credentials leak.
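
One way the HMAC pattern might look, using only Python's standard hmac and hashlib modules. The message layout (method, path, timestamp, body joined by newlines), the function names, and the five-minute clock-skew window are all assumptions made for illustration; a real API defines its own signing scheme.

```python
import hashlib
import hmac


def sign_request(secret, method, path, timestamp, body=b""):
    """Compute a hex HMAC-SHA256 signature over the request details."""
    message = b"\n".join([
        method.upper().encode(),
        path.encode(),
        str(timestamp).encode(),
        body,
    ])
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def verify_signature(secret, method, path, timestamp, body, signature,
                     now, max_skew_seconds=300):
    """Server-side check: reject stale timestamps, then compare signatures."""
    if abs(now - timestamp) > max_skew_seconds:
        return False  # too old or too far in the future: possible replay
    expected = sign_request(secret, method, path, timestamp, body)
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature)
```

The monitor would send the timestamp and signature in request headers alongside the probe. Because the signature covers the method and path, a leaked signature cannot be reused against a different endpoint, and the timestamp check limits how long it stays valid.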

The starter setup for an API

Add three monitors per API.

A /health monitor at 60-second cadence with a 500ms response-time threshold.
A monitor on the most-used real endpoint at 60-second cadence with a body-content assertion.
A monitor on the highest-revenue endpoint at 30-second cadence with the same body assertion plus a tighter response-time threshold.

Total monitor count: three. Total coverage: most of what matters. Add specific monitors for edge cases as they appear in real incidents, not pre-emptively.
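
Written down as data, the starter setup might look like the sketch below. The Monitor class is hypothetical and not tied to any monitoring tool's configuration format. The /payments/recent path, the "id" field, and the 300ms figure are placeholders: the guide only says the revenue threshold should be tighter, so measure your own endpoint before picking a number.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Monitor:
    """One monitor definition (hypothetical shape)."""
    name: str
    path: str
    interval_seconds: int
    max_response_ms: Optional[int] = None  # None: no response-time alert
    required_fields: Tuple[str, ...] = ()  # body-content assertion
    priority: str = "normal"


STARTER_MONITORS = [
    # Health endpoint: 60-second cadence, 500ms response-time threshold.
    Monitor("health", "/health", 60, max_response_ms=500),
    # Most-used endpoint: 60-second cadence with a body-content assertion.
    Monitor("most-used", "/users/me", 60, required_fields=("id",)),
    # Revenue endpoint (placeholder path): 30-second cadence, the same body
    # assertion, and a tighter threshold (300ms is a placeholder value).
    Monitor("revenue", "/payments/recent", 30, max_response_ms=300,
            required_fields=("id",), priority="high"),
]
```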