Cryptography in Your SRE Daily Routine — Are You Just Winging It?
Introduction
- You see multiple options in Cloudflare's TLS settings but don't understand the differences, so you leave the default
- After a deploy, health checks fail with certificate verify failed, even though the site loads fine in a browser
If either of these sounds familiar, this article is for you.
We'll start by looking at what happens behind the scenes in a single HTTPS request, then work through each of the questions above. As an SRE, I deal with cryptography daily, yet I used to struggle when asked to explain how it actually works. After systematically revisiting the fundamentals, the precision of my day-to-day decisions noticeably improved.
What Happens Behind an HTTPS Request
What actually happens when a browser accesses an https:// URL? This site (shinagawa-web.com) is deployed on Cloudflare Workers, so let's use it as an example and peek under the hood with curl -v.
$ curl -v https://shinagawa-web.com 2>&1 | grep -E "SSL|subject|issuer|expire|Cipher"
* SSL connection using TLSv1.3 / AEAD-CHACHA20-POLY1305-SHA256 / [blank] / UNDEF
* subject: CN=shinagawa-web.com
* expire date: Jun 15 04:47:03 2026 GMT
* issuer: C=US; O=Google Trust Services; CN=WE1
* SSL certificate verify ok.
Just a few lines of output, yet they contain five different cryptographic technologies: key exchange, symmetric encryption, tamper detection, digital signatures, and hash functions.
That said, SREs don't need a deep understanding of all five. For example, whether CHACHA20-POLY1305 or AES-256-GCM is used is automatically selected by Cloudflare based on the client environment — it's not a decision SREs make.
The two areas where SREs do need to make their own decisions are:
- TLS version selection: the TLSv1.3 on the first line. Which "Minimum TLS Version" to set in Cloudflare
- Certificate chain of trust: the issuer and subject fields. Directly relevant when troubleshooting certificate verify failed
This article focuses exclusively on these two topics.
Five cryptographic technologies in the curl output
- Key exchange (ECDH) — Allows the client and server to derive a shared key securely over a channel that may be eavesdropped
- Symmetric encryption (ChaCha20 / AES) — Encrypts actual HTTP data at high speed using the shared key
- Authenticated encryption (Poly1305 / GCM) — Simultaneously encrypts data and guarantees it hasn't been tampered with in transit
- Digital signatures — Verifies the server certificate is authentic using the Certificate Authority's (Google Trust Services) signature
- Hash functions (SHA-256) — Generates a fixed-length "fingerprint" of data, serving as the foundation for signatures and tamper detection
The output above selected CHACHA20-POLY1305, but depending on the client environment, TLS_AES_256_GCM_SHA384 may be chosen instead. ChaCha20 runs fast on mobile devices that lack dedicated AES hardware instructions (AES-NI), while AES-GCM is fast on servers and PCs with AES-NI. Regardless of which is selected, the underlying structure — symmetric encryption + tamper detection + hashing — is the same.
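The same fields that curl prints can also be read from code, which is handy when you want to log them from a health check. The following is a minimal sketch using only Python's standard ssl and socket modules; the function names describe_connection and flatten_name are illustrative, not part of any library.

```python
import socket
import ssl


def flatten_name(rdns) -> dict:
    """Turn getpeercert()'s nested tuples into a flat dict of name fields."""
    return {key: value for rdn in rdns for key, value in rdn}


def describe_connection(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Open a verified TLS connection and report what was negotiated."""
    # create_default_context() verifies the chain and the hostname,
    # so a broken certificate raises instead of returning data.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cipher_name, _protocol, secret_bits = tls.cipher()
            cert = tls.getpeercert()
            return {
                "tls_version": tls.version(),   # e.g. "TLSv1.3"
                "cipher": cipher_name,
                "secret_bits": secret_bits,
                "subject": flatten_name(cert["subject"]),
                "issuer": flatten_name(cert["issuer"]),
                "not_after": cert["notAfter"],
            }
```

Calling describe_connection("shinagawa-web.com") from a machine with network access should return the TLS version, cipher, subject, issuer, and expiry in one dict. The cipher name may differ from curl's because each client negotiates according to its own preferences.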
Which TLS Version Should You Choose?
Cloudflare's dashboard has a "Minimum TLS Version" setting. It's a simple screen where you pick from TLS 1.0 / 1.1 / 1.2 / 1.3, but the real question is: "Go with TLS 1.3 only, or keep 1.2 as well?"
When You Have to Keep TLS 1.2
If you integrate with external systems that use legacy HTTP client libraries or SDKs, you may need to keep TLS 1.2. As of 2026, all major browsers support 1.3, but embedded devices and partner systems running outdated SDKs may not have caught up.
Start by understanding your current situation. In Cloudflare, you can check the traffic breakdown by TLS version under Analytics > Security. Once you know what percentage of connections use TLS 1.2, you have data to decide when to cut it off. If the 1.2 share is minimal, you can identify the sources, request they upgrade, and draw up a migration schedule.
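Turning the dashboard numbers into a decision takes only a small calculation. The sketch below assumes you have copied the request counts per TLS version into a dict by hand. The dict shape, the function names, and the 1% cutoff are illustrative assumptions, not a Cloudflare API or an official threshold.

```python
def tls_version_shares(counts: dict) -> dict:
    """Convert request counts per TLS version into fractions of the total."""
    total = sum(counts.values())
    if total == 0:
        return {version: 0.0 for version in counts}
    return {version: n / total for version, n in counts.items()}


def recommend_minimum(counts: dict, cutoff: float = 0.01) -> str:
    """Suggest a Minimum TLS Version from the TLS 1.2 share.

    cutoff is a hypothetical tolerance: the largest share of TLS 1.2
    traffic you are willing to break. Pick your own value.
    """
    tls12_share = tls_version_shares(counts).get("TLSv1.2", 0.0)
    return "1.3" if tls12_share <= cutoff else "1.2"


if __name__ == "__main__":
    observed = {"TLSv1.3": 9990, "TLSv1.2": 10}  # made-up numbers
    print(tls_version_shares(observed))
    print(recommend_minimum(observed))
```

The share alone does not say who the remaining TLS 1.2 clients are, so the step of identifying the sources still has to happen before the cutoff is applied.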
If you do keep TLS 1.2, simply enabling it isn't enough. TLS 1.2 cipher suites include CBC mode — which has known vulnerabilities where attackers exploit padding in encrypted data — and RSA key exchange, where a leaked private key lets an attacker decrypt all past traffic. Cloudflare automatically optimizes these settings, but if you manage nginx yourself, you need to restrict ssl_ciphers to GCM-based ciphers with ECDHE only, explicitly excluding weak cipher suites.
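nginx's ssl_ciphers directive takes a cipher string in OpenSSL's format, and Python's ssl module accepts the same format. That makes it possible to preview what a candidate string expands to before deploying it. The sketch below is one way to do that; the cipher string is an illustrative assumption and not a recommendation from Cloudflare or nginx, and the result depends on the OpenSSL build Python is linked against.

```python
import ssl

# Hypothetical cipher string for illustration: ECDHE key exchange with
# AEAD ciphers only (AES-GCM or ChaCha20-Poly1305). No CBC, no RSA key exchange.
CIPHER_STRING = "ECDHE+AESGCM:ECDHE+CHACHA20"


def restricted_context() -> ssl.SSLContext:
    """Build a client context limited to the cipher string above."""
    ctx = ssl.create_default_context()
    ctx.set_ciphers(CIPHER_STRING)
    return ctx


def tls12_cipher_names(ctx: ssl.SSLContext) -> list:
    """List the enabled suites that apply to TLS 1.2 and below.

    TLS 1.3 suites (names starting with "TLS_") are not controlled by
    set_ciphers(), so they are filtered out here.
    """
    return [c["name"] for c in ctx.get_ciphers() if not c["name"].startswith("TLS_")]


if __name__ == "__main__":
    for name in tls12_cipher_names(restricted_context()):
        print(name)
```

If a name containing CBC or one that does not start with ECDHE shows up in the printed list, the string is looser than intended.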
TLS 1.3 Is Structurally Secure
TLS 1.3 narrows the cipher suites down to just five, and historically problematic options like CBC mode and RSA key exchange simply don't exist as choices. There's no risk of accidentally selecting a weak cipher suite, and no need to worry about whether things are "configured correctly." Every cipher suite provides forward secrecy — the property that even if the private key is later compromised, past communications remain unreadable.
Whether to keep TLS 1.2 should be decided based on the traffic analysis described above. If 1.2 connections are near zero, raise the Minimum TLS Version to 1.3. If they're still present, identify the sources, set a migration deadline, and phase out 1.2 gradually. Instead of "we'd prefer 1.3 if possible," the SRE's job is to look at the data and decide when to switch.
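After changing the Minimum TLS Version, it is worth confirming from the outside that the edge behaves as expected. One way is to attempt a handshake pinned to a single protocol version. This is a minimal sketch with Python's standard library; pinned_context and accepts are illustrative names.

```python
import socket
import ssl


def pinned_context(version: ssl.TLSVersion) -> ssl.SSLContext:
    """Client context that offers exactly one TLS version."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def accepts(host: str, version: ssl.TLSVersion, port: int = 443,
            timeout: float = 5.0) -> bool:
    """Return True if the server completes a handshake at this version."""
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host) as tls:
                return tls.version() is not None
    except ssl.SSLCertVerificationError:
        # A certificate problem is a different failure; do not report it
        # as "version rejected".
        raise
    except (ssl.SSLError, ConnectionResetError):
        # Handshake refused: the server does not accept this version.
        return False
```

With the minimum set to 1.3, accepts(host, ssl.TLSVersion.TLSv1_2) would be expected to return False and accepts(host, ssl.TLSVersion.TLSv1_3) to return True. DNS errors and timeouts are deliberately left uncaught so that a network problem is not mistaken for a rejected version.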
Quick glossary of cipher suite terms
A brief summary of the cryptographic terms that appeared in the sections above.
- AES (symmetric cipher) — The encryption engine. Uses the same key for both encryption and decryption. Key lengths of 128 / 192 / 256 bits are available; 256-bit is currently recommended
- CBC / GCM (block cipher modes) — Methods for processing multiple blocks with AES. CBC has known vulnerabilities; GCM performs encryption and tamper detection simultaneously, making it the current standard
- ChaCha20-Poly1305 — An authenticated encryption scheme with the same role as AES-GCM. Faster on mobile devices without dedicated AES hardware (AES-NI)
- ECDHE (key exchange) — A mechanism for securely deriving a shared key over an eavesdropped channel. Provides forward secrecy (past communications cannot be decrypted later)
- RSA (public-key cryptography): Uses a public/private key pair. RSA key exchange was available in TLS 1.2 but was removed in TLS 1.3 because it lacks forward secrecy. RSA is still used for certificate signatures
- SHA (hash function) — Generates a fixed-length "fingerprint" from data. The foundation of tamper detection and digital signatures
certificate verify failed — What Exactly Is Failing?
Even with TLS versions configured correctly, certificate-related issues show up in a different form. After a deploy, health checks start failing and an alert fires. The site loads fine in a browser, but the health check logs just say certificate verify failed. To triage this error, you need to understand how certificates are trusted.
What Is a Certificate, Anyway?
Recall the curl -v output from the first section. There was subject: CN=shinagawa-web.com and issuer: C=US; O=Google Trust Services; CN=WE1. This means a Certificate Authority (CA) called Google Trust Services is certifying that "this server is indeed shinagawa-web.com." A TLS certificate is a digital document where a trusted third party (the CA) vouches for a domain's authenticity.
The Certificate Chain of Trust
A certificate isn't trusted on its own. There's a chain — Root CA → Intermediate CA → Server certificate — and the client verifies it can walk this chain all the way up to a Root CA.
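The walk can be pictured as following issuer links until a trusted name is reached. The toy model below tracks names only and performs no signature checks, so it is a teaching aid rather than a verifier; walk_chain and its arguments are hypothetical.

```python
def walk_chain(leaf: str, issued_by: dict, trusted_roots: set) -> list:
    """Follow issuer links from the leaf until a trusted root is reached.

    issued_by maps a certificate's subject to its issuer, but only for
    certificates the client actually has (sent by the server or cached).
    Real clients also verify each signature; this model does not.
    """
    path = [leaf]
    current = leaf
    while current not in trusted_roots:
        issuer = issued_by.get(current)
        if issuer is None or issuer in path:
            raise LookupError(f"chain breaks at {current!r}: issuer certificate not available")
        path.append(issuer)
        current = issuer
    return path


if __name__ == "__main__":
    roots = {"GlobalSign Root CA"}
    full = {
        "shinagawa-web.com": "WE1",
        "WE1": "GTS Root R4",
        "GTS Root R4": "GlobalSign Root CA",
    }
    print(walk_chain("shinagawa-web.com", full, roots))
```

If the server sends only the leaf, issued_by contains a single entry, the walk stops at the intermediate's name, and LookupError is raised. That is the situation a client reports as a verification failure.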
Three Common Causes
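Missing intermediate CA certificate: The intermediate CA certificate isn't configured on the server. Browsers can often fill in the gap, either from cached intermediates or by fetching the issuer through the certificate's AIA (Authority Information Access) extension, so the site appears to work. Health checks and HTTP client libraries typically do neither and fail verification. This is why "it only works in a browser."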
Expired certificate: Let's Encrypt auto-renewal has stopped. Either certbot's cron isn't running, or the API credentials for DNS-01 challenges have expired.
Domain mismatch: A wildcard certificate like *.example.com covers subdomains under a single certificate, but it only matches one level deep. It matches api.example.com but not a.b.example.com.
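The one-level rule is easy to check mechanically. The function below is a simplified sketch of the matching logic. It assumes the wildcard is the entire leftmost label and ignores internationalized names; wildcard_matches is an illustrative name, not a library function.

```python
def wildcard_matches(pattern: str, hostname: str) -> bool:
    """Check a certificate name such as *.example.com against a hostname.

    Simplified: the wildcard must be the whole leftmost label and stands
    for exactly one label.
    """
    pattern_labels = pattern.lower().split(".")
    host_labels = hostname.lower().rstrip(".").split(".")
    # A wildcard never spans a dot, so the label counts must be equal.
    if len(pattern_labels) != len(host_labels):
        return False
    # Everything to the right of the first label must match exactly.
    if pattern_labels[1:] != host_labels[1:]:
        return False
    first = pattern_labels[0]
    return first == host_labels[0] or (first == "*" and host_labels[0] != "")


if __name__ == "__main__":
    for host in ("api.example.com", "a.b.example.com", "example.com"):
        print(host, wildcard_matches("*.example.com", host))
```

Note that the bare domain example.com is not covered by *.example.com either, which is why certificates usually list both names in the SAN.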
Verification Commands
$ openssl s_client -connect shinagawa-web.com:443 -showcerts < /dev/null 2>/dev/null
Certificate chain
0 s:CN=shinagawa-web.com
i:C=US, O=Google Trust Services, CN=WE1
1 s:C=US, O=Google Trust Services, CN=WE1
i:C=US, O=Google Trust Services LLC, CN=GTS Root R4
2 s:C=US, O=Google Trust Services LLC, CN=GTS Root R4
i:C=BE, O=GlobalSign nv-sa, OU=Root CA, CN=GlobalSign Root CA
---
Verify return code: 0 (ok)
s: is the certificate's subject, i: is the issuing CA. The chain connects 0 → 1 → 2, ultimately reaching GlobalSign Root CA. If Verify return code: 0 (ok), the chain of trust is intact.
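When the failure comes from an HTTP client rather than from openssl, the same verification result is available in code. In Python, ssl.SSLCertVerificationError carries OpenSSL's numeric verify code, which can be mapped to the three causes above. The mapping below is a rough sketch: code 20 can also mean the root is missing from the client's trust store, and the names classify and diagnose are illustrative.

```python
import socket
import ssl

# OpenSSL X509 verify codes mapped to the three causes in this article.
CAUSES = {
    10: "expired certificate",
    20: "missing intermediate CA (issuer not found locally)",
    21: "missing intermediate CA (cannot verify the leaf)",
    62: "domain mismatch",
}


def classify(verify_code: int) -> str:
    """Translate an OpenSSL verify code into one of the article's causes."""
    return CAUSES.get(verify_code, "other")


def diagnose(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a verified handshake and describe the outcome."""
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with ctx.wrap_socket(sock, server_hostname=host):
                return "ok"
    except ssl.SSLCertVerificationError as exc:
        return f"{classify(exc.verify_code)}: {exc.verify_message}"
```

Running diagnose from the same host as the health check matters, because the result depends on that machine's trust store rather than on your laptop's.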
To check the expiry and domain:
$ echo | openssl s_client -connect shinagawa-web.com:443 2>/dev/null | openssl x509 -noout -dates -subject
notBefore=Mar 17 03:47:05 2026 GMT
notAfter=Jun 15 04:47:03 2026 GMT
subject=CN=shinagawa-web.com
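The notAfter string can be turned into "days remaining" for alerting. Python's ssl.cert_time_to_seconds parses this exact date format. The sketch below takes the current time as a parameter so the result is reproducible; days_until_expiry is an illustrative name and any alert threshold is left to the reader.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between now and a certificate's notAfter timestamp.

    not_after uses the format printed by openssl and returned by
    getpeercert(), e.g. "Jun 15 04:47:03 2026 GMT".
    """
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


if __name__ == "__main__":
    remaining = days_until_expiry("Jun 15 04:47:03 2026 GMT", datetime.now(timezone.utc))
    print(f"{remaining:.1f} days remaining")
```

A negative result means the certificate has already expired.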
Once You've Identified the Cause
Missing intermediate CA: Configure the full chain (server certificate + intermediate CA certificate) on the server. For nginx, point ssl_certificate to the full-chain file. If you're using Cloudflare, edge certificates are managed by Cloudflare, so this issue doesn't arise.
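A quick sanity check before reloading the server is to count the certificates in the file that ssl_certificate points to. The sketch below only counts PEM blocks. It does not check their order or whether the second certificate really issued the first, and the function names are hypothetical.

```python
import re

# One PEM-encoded certificate, including its BEGIN/END markers.
PEM_CERT = re.compile(
    r"-----BEGIN CERTIFICATE-----.+?-----END CERTIFICATE-----", re.DOTALL
)


def count_certificates(pem_text: str) -> int:
    """Count the PEM certificate blocks in a file's contents."""
    return len(PEM_CERT.findall(pem_text))


def looks_like_full_chain(pem_text: str) -> bool:
    """A full-chain file holds the server certificate plus at least one intermediate."""
    return count_certificates(pem_text) >= 2
```

For example, looks_like_full_chain(open(path).read()) returning False for the configured file suggests that only the server certificate is being served.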
Expired certificate: With certbot, run certbot renew. If using DNS-01 challenges, also check the API token's expiry. Cloudflare auto-renews its edge certificates, but origin server certificates are managed separately — don't overlook them.
Domain mismatch: Verify that the certificate's SAN (Subject Alternative Name) includes the target domain. For wildcard certificates, remember they only cover one subdomain level, and reissue the certificate to cover all required domains.
Conclusion
Let's revisit the two questions from the introduction.
- Which Cloudflare TLS setting should you pick? → Check the TLS 1.2 connection ratio via traffic analysis and make a data-driven decision on when to switch. If keeping 1.2, pay attention to cipher suite restrictions
- What causes certificate verify failed? → Three patterns: missing intermediate CA, expired certificate, or domain mismatch. Use openssl s_client to inspect the chain of trust and narrow it down
Both become straightforward to diagnose and address once you understand the underlying mechanisms. Cryptography is a broad field, but the points where SREs actually need to make decisions are limited. Just covering those fundamentals will make a real difference in your troubleshooting speed and confidence when changing configurations.