
Response Time Monitoring: Track TTFB, Latency & Performance

How to monitor and optimize website and API response times. Covers TTFB, latency percentiles, performance baselines, alerting thresholds, and practical optimization strategies.

2026-03-25 · 10 min · Guide

Your site responds. Status code 200. Uptime 100%. But users complain: "Everything is slow." The dashboard takes 4 seconds to load. Search takes 3. Checkout takes 5. Technically the site works. Practically users leave. Amazon found that every additional 100ms of delay reduces conversion by 1%. Google discovered that increasing load time from 0.4s to 0.9s cuts traffic by 20%.

Uptime monitoring answers "is the service running?" Response time monitoring answers the harder question: "how well does it run?" Both matter, but teams often ignore the second until degradation becomes obvious.

Anatomy of an HTTP Request: What Makes Up Response Time

When monitoring sends an HTTP request to your server, the request passes through several phases. Each adds time. Understanding these phases is key to diagnosis: if response time grew, you need to know which phase caused it.

DNS Lookup (5-50ms). Converting a domain name to an IP address. Usually cached, but the first request or request after TTL expiry requires a full round-trip to the DNS server. Anycast DNS (Cloudflare, Route53) minimizes this phase. DNS monitoring catches anomalies here.

TCP Connection (10-100ms). Three-way handshake: SYN → SYN-ACK → ACK. Time depends on distance between client and server. Frankfurt to Amsterdam is 10ms. Frankfurt to Singapore is 150ms. CDNs and edge servers reduce this phase.

TLS Handshake (20-100ms). Establishing encrypted connection. TLS 1.3 does this in 1 round-trip (vs 2 for TLS 1.2). OCSP Stapling saves another round-trip for certificate revocation checks.

TTFB — Time to First Byte. Time from sending request to receiving the first byte of response. This is server-side processing: routing, middleware, database queries, template rendering, serialization. TTFB is the metric you control. Everything before TTFB is network; TTFB is your code.

Content Transfer. Receiving the entire response body. For small JSON responses (1-5KB), milliseconds. For heavy HTML pages (500KB+), noticeable time, especially on slow connections.
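The phases above can be timed by hand with nothing but the standard library. This is a rough one-shot sketch, not a production probe: it assumes a reachable HTTPS host, ignores redirects and keep-alive, and the function names `measure_phases` and `dominant_phase` are illustrative.

```python
import socket
import ssl
import time

def measure_phases(host, port=443, path="/"):
    """Time each phase of a single HTTPS request, in milliseconds."""
    timings = {}

    t0 = time.perf_counter()
    ip = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)[0][4][0]
    timings["dns_ms"] = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    sock = socket.create_connection((ip, port), timeout=10)
    timings["tcp_ms"] = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    tls = ssl.create_default_context().wrap_socket(sock, server_hostname=host)
    timings["tls_ms"] = (time.perf_counter() - t0) * 1000

    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    t0 = time.perf_counter()
    tls.sendall(request.encode())
    tls.recv(1)  # blocks until the first byte of the response arrives
    timings["ttfb_ms"] = (time.perf_counter() - t0) * 1000

    t0 = time.perf_counter()
    while tls.recv(65536):  # drain the rest of the body
        pass
    timings["transfer_ms"] = (time.perf_counter() - t0) * 1000
    tls.close()
    return timings

def dominant_phase(timings):
    """Name the phase that contributed the most time."""
    return max(timings, key=timings.get)
```

Feeding the result to `dominant_phase` tells you immediately where to look: a large `ttfb_ms` points at your code, a large `tls_ms` at handshake configuration.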

TTFB: The Most Important Response Time Metric

TTFB isolates server performance from network factors. If TTFB grew from 80ms to 500ms, and DNS and TCP didn't change — the problem is inside your application: slow SQL query, memory leak, cold cache, overloaded worker pool.

AtomPing records TTFB separately from total response time for each HTTP check. This lets you see: "total time = 800ms, but TTFB = 600ms" → server problem; vs "total time = 800ms, but TTFB = 100ms" → response size or transfer speed problem.

TTFB under 100ms — excellent. Your backend is fast, caching works well.

TTFB 100-300ms — good. Typical for APIs that run database queries without heavy joins.

TTFB 300-800ms — acceptable for heavy pages with server-side rendering. Look for optimizations.

TTFB 800ms-2 seconds — problem. Users notice the wait. Google may reduce crawl frequency.

TTFB over 2 seconds — critical. Likely slow database queries, heavy computation, or external service issues.
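The bands above translate directly into a classification helper, useful for labeling checks in dashboards or alerts. A minimal sketch; the function name and labels are illustrative:

```python
def classify_ttfb(ttfb_ms):
    """Map a TTFB measurement (in ms) onto the bands described above."""
    if ttfb_ms < 100:
        return "excellent"
    if ttfb_ms < 300:
        return "good"
    if ttfb_ms < 800:
        return "acceptable"
    if ttfb_ms < 2000:
        return "problem"
    return "critical"
```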

Percentiles vs Average: Why the Average Lies

Say your API handled 10,000 requests in an hour. Average response time: 120ms. Looks great. But look at the distribution:

p50 (median): 90ms — half of requests answer in 90ms. Fast.

p95: 250ms — 5% of requests are slower than 250ms. Tolerable.

p99: 3,200ms — 1% of requests take longer than 3 seconds. At 10,000 requests, that's 100 users with a terrible experience.

p99.9: 12,000ms — one in a thousand requests hangs for 12 seconds. Probably an external service timeout.

The average (120ms) masks the tail of the distribution. This is why latency SLIs are defined with percentiles: "p95 response time under 500ms over a rolling 30 days".
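The gap between the average and the tail is easy to reproduce. This sketch uses a nearest-rank percentile and a synthetic distribution shaped like the example above (the exact sample counts are illustrative):

```python
from statistics import mean

def percentile(samples, p):
    """Nearest-rank percentile: the value below which ~p% of samples fall."""
    ordered = sorted(samples)
    idx = min(len(ordered) - 1, int(p / 100 * len(ordered)))
    return ordered[idx]

# 10,000 requests: mostly fast, with a slow 1% tail
latencies = [90] * 9500 + [250] * 400 + [3200] * 100

avg = mean(latencies)             # 127.5ms: the average looks healthy
p50 = percentile(latencies, 50)   # 90ms
p99 = percentile(latencies, 99)   # 3200ms: the tail the average hides
```

A 127ms average and a 3.2-second p99 describe the same hour of traffic, which is exactly why SLIs are written against percentiles.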

How to Set Up Response Time Monitoring

Step 1: Establish Your Baseline

Before setting thresholds, understand what "normal" response time is for each endpoint. Enable response time tracking in AtomPing, wait 24-48 hours, examine the distribution. A landing page, API health check, and heavy dashboard report all have different normal ranges. You can't set a single 500ms threshold for everything.

Step 2: Set Your Thresholds

Two-tier system:

Warning — 2x your baseline. If normal TTFB is 100ms, warn at 200ms. This is an early degradation signal. Slack notification, no on-call wake-up.

Critical — 5x baseline or an absolute ceiling. If normal TTFB is 100ms, critical at 500ms or an absolute cap of 2 seconds (whichever is lower). This opens an incident and sends an alert.
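The two-tier rule is small enough to express as a function. A sketch assuming the 2x/5x multipliers and 2-second cap described above; the names are illustrative:

```python
def thresholds(baseline_ms, absolute_cap_ms=2000):
    """Warning at 2x baseline; critical at 5x baseline or the absolute
    cap, whichever is lower."""
    return {
        "warning_ms": 2 * baseline_ms,
        "critical_ms": min(5 * baseline_ms, absolute_cap_ms),
    }
```

For a 100ms baseline this yields warning at 200ms and critical at 500ms; for a slow 600ms baseline the absolute cap wins and critical stays at 2 seconds.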

Step 3: Monitor from Multiple Regions

Response time depends on the distance between probe and server. A check from Frankfurt shows 80ms, from Helsinki 200ms for the same endpoint. Multi-region monitoring gives you the full picture: how users in different locations see your service.

Step 4: Track Trends, Not Just Alerts

Response time that grows 5ms per day will add 150ms over a month. Each day looks fine. A month later, noticeable degradation. Seven-day, 30-day, and 90-day graphs show trends that minute-by-minute checks can't.

Diagnosis: Response Time Increased — What to Do

Monitoring shows response time jumped. How do you find the cause? The approach depends on which phase grew.

TTFB grew, total time grew proportionally. Server-side problem. Check: slow queries in PostgreSQL (pg_stat_statements), CPU/memory utilization, active connection count, application logs for errors. Common cause: N+1 queries introduced in a recent deployment.

TTFB normal, total time grew. Response size increased or transfer slowed. Check: is gzip/brotli enabled? Did an unexpectedly large payload get added? Are there CDN or bandwidth issues?

Response time grew in one region. Network problem between that region and your server. BGP routing change, congestion at an intermediate hop. Usually temporary; it resolves within hours.

Response time grew on one endpoint. Isolated problem: slow database query, broken cache, heavy computation on that specific route. Useful when monitoring multiple endpoints separately.
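The first two branches of this diagnosis can be automated as a triage hint. A rough sketch comparing current values against baseline; the 1.5x growth threshold and the returned hints are illustrative choices, not AtomPing behavior:

```python
def diagnose(ttfb_ratio, total_ratio, growth=1.5):
    """First-guess triage from current/baseline ratios of TTFB and total time."""
    if ttfb_ratio >= growth and total_ratio >= growth:
        return "server-side: check slow queries, CPU/memory, recent deploys"
    if total_ratio >= growth:
        return "transfer: check compression, payload size, CDN/bandwidth"
    return "no significant degradation"
```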

Response Time and SEO

Google considers server speed in rankings. The official recommendation is TTFB under 800ms for crawlable pages. In practice: if your site responds in 200ms and a competitor's in 2 seconds, all else equal, you rank higher.

A more practical effect: Googlebot has a crawl budget for each site. If your pages respond slowly, Googlebot indexes fewer pages in one visit. New content appears in search results later. For a blog with dozens of articles, this is a direct loss of organic traffic.

Optimizing Response Time: Practical Steps

At the Application Level

Database queries. 80% of TTFB problems are in SQL. Enable slow query logs (>100ms). Add indexes on common WHERE conditions. Avoid N+1: if a list of 20 items makes 21 queries, use JOIN or prefetch. ORMs hide N+1 — count queries per request.
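The N+1 pattern and its fix fit in a few lines of SQLite. The schema and data here are invented for illustration; the point is that the loop issues one query per row while the JOIN answers the same question in a single round-trip:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users (name) VALUES ('alice'), ('bob');
    INSERT INTO posts (user_id, title) VALUES (1, 'a'), (1, 'b'), (2, 'c');
""")

# N+1: one query for the list, then one extra query per user
def post_counts_n_plus_one():
    counts = {}
    for user_id, name in conn.execute("SELECT id, name FROM users"):
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM posts WHERE user_id = ?", (user_id,)
        ).fetchone()
        counts[name] = n
    return counts

# Fixed: a single JOIN + GROUP BY answers the same question in one query
def post_counts_join():
    return dict(conn.execute("""
        SELECT u.name, COUNT(p.id)
        FROM users u LEFT JOIN posts p ON p.user_id = u.id
        GROUP BY u.id
    """))
```

With 20 users the first version runs 21 queries; the second always runs one. That ratio is what you see when counting queries per request behind an ORM.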

Caching. Redis/Memcached for frequently accessed data. HTTP cache (Cache-Control headers) for CDN and browser. Full-page cache for pages that don't change every second. Rule: if data hasn't changed in the last minute, cache it.

Async processing. Heavy operations (email sending, report generation, image processing) move to background jobs (Celery, Sidekiq, Bull). The API accepts the request and returns 202 Accepted instead of blocking the client for 5 seconds.
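The accept-and-return-202 pattern can be sketched with a queue and a worker thread. A minimal in-process sketch, not Celery or Sidekiq; `handle_request` and the fake "report" work are illustrative:

```python
import queue
import threading

jobs = queue.Queue()
results = {}

def worker():
    while True:
        job_id, payload = jobs.get()
        results[job_id] = f"report for {payload}"  # stand-in for heavy work
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

def handle_request(job_id, payload):
    """Enqueue the job and return immediately instead of blocking the client."""
    jobs.put((job_id, payload))
    return {"status": 202, "job_id": job_id}
```

The handler's latency is now a queue push, microseconds, regardless of how long the report takes; the client polls or gets a callback when the result lands.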

At the Infrastructure Level

CDN. Serve static assets (JS, CSS, images) through a CDN. This is not just speed, but also offloads the origin server. For dynamic content, use CDN edge workers (Cloudflare Workers, Lambda@Edge) to cache API responses.

Compression. gzip is minimum. Brotli is better (15-20% more efficient). Verify your nginx/Caddy returns Content-Encoding: br or gzip. For API responses, this can reduce transfer time 3-5x.
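The payoff of compression on JSON is easy to demonstrate, because API responses are highly repetitive. A sketch with the stdlib `gzip` module and an invented 500-record payload:

```python
import gzip
import json

# A repetitive JSON list, the typical shape of an API response
payload = json.dumps(
    [{"id": i, "status": "active", "role": "member"} for i in range(500)]
).encode()

compressed = gzip.compress(payload)
ratio = len(payload) / len(compressed)  # well above the 3-5x cited above
```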

Connection reuse. HTTP/2 multiplexes requests over one TCP connection. HTTP/3 (QUIC) eliminates head-of-line blocking. If your server is still on HTTP/1.1, upgrading gives visible gains for clients making multiple requests.

Response Time Monitoring for APIs

For APIs, response time is even more critical than for web pages. A page loads once; API calls fire dozens of times per user action. If one API call is slow, a dashboard composed of 15 API calls becomes unbearably slow.

Practice: set thresholds by endpoint type.

Read endpoints (GET /users, GET /products) — warning 200ms, critical 500ms.

Write endpoints (POST /orders, PUT /settings) — warning 500ms, critical 2 seconds.

Health check (GET /health) — warning 50ms, critical 200ms. Health checks should be instantaneous.

Heavy operations (GET /reports/monthly, POST /export) — warning 3 seconds, critical 10 seconds. If longer, move to async.
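The per-class thresholds above are natural to keep as configuration. A sketch; the class names and the lookup helper are illustrative, with values taken from the list above:

```python
# Thresholds per endpoint class, in milliseconds
THRESHOLDS_MS = {
    "read":   {"warning": 200,  "critical": 500},
    "write":  {"warning": 500,  "critical": 2000},
    "health": {"warning": 50,   "critical": 200},
    "heavy":  {"warning": 3000, "critical": 10000},
}

def alert_level(endpoint_class, response_ms):
    """Return 'critical', 'warning', or None for a single measurement."""
    limits = THRESHOLDS_MS[endpoint_class]
    if response_ms >= limits["critical"]:
        return "critical"
    if response_ms >= limits["warning"]:
        return "warning"
    return None
```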

Checklist: Minimum Response Time Monitoring

1. Enable response time tracking for each HTTP monitor. Record TTFB and total time separately.

2. Establish baseline for each endpoint (24-48 hours of data).

3. Set warning (2x baseline) and critical (5x baseline) thresholds.

4. Monitor from multiple regions — latency differences between regions reveal CDN or routing issues.

5. Set up graphs for 7-day, 30-day, and 90-day windows — trends matter more than point values.

6. Check business impact: if every 100ms costs 1% conversion, that's a concrete number to justify optimization investment.

FAQ

What is response time in monitoring?

Response time is the total time from sending a request to receiving the complete response. It includes DNS resolution, TCP connection, TLS handshake, server processing, and data transfer. In monitoring, response time is the primary performance indicator — it tells you not just 'is it up' but 'how fast is it.'

What is TTFB and why does it matter?

TTFB (Time to First Byte) measures the time from sending a request to receiving the first byte of the response. It isolates server-side processing time from network transfer. A high TTFB means your server is slow to start responding — usually due to slow database queries, heavy computation, or resource contention.

What is a good response time for a website?

Under 200ms is excellent, under 500ms is acceptable for most sites, and over 1 second is poor. For APIs, user-facing endpoints should aim for under 200ms (p95). These numbers are for server response time, not full page load — add 1-3 seconds for the browser to render the page with all assets.

Should I monitor average response time or percentiles?

Percentiles, always. Average hides problems: if 99 requests take 100ms and 1 takes 10 seconds, the average is 199ms — looks fine. But p99 is 10 seconds — one in a hundred users has a terrible experience. Monitor p50 (median), p95, and p99.

How do I reduce high response time?

First, identify the bottleneck: is TTFB high (server-side problem) or is the transfer time high (large response)? For TTFB: optimize database queries, add caching, increase server resources. For transfer: compress responses, reduce payload size, use a CDN for static assets.

Does response time affect SEO?

Yes. Google uses page speed as a ranking factor, and server response time is a major component. Google's recommendation is TTFB under 800ms for crawlable pages. Consistently slow response times lead to reduced crawl frequency, which means new content gets indexed slower.
