Every TLS certificate has an expiration date. When it passes, browsers show warnings, API clients reject connections, and automated systems start throwing errors. The failure is always total, always visible, and almost always preventable.
SSL certificate expiration monitoring is the practice of continuously tracking the validity of every certificate on your infrastructure and alerting before expiry becomes an outage. This post covers what a real monitoring setup looks like, what signals matter, and how to avoid the common pitfalls.
Why Expired Certificates Keep Happening
Certificate expiry is the most predictable failure in infrastructure operations. You know the exact moment it will happen. And yet expired certificates have taken down production services at Microsoft, Spotify, Cisco, and countless smaller operators.
The reasons are consistent:
- Automation is configured but silently fails weeks before expiry, with no alerting
- A certificate is deployed manually once and then inherited by the next team
- Internal services (monitoring agents, load balancer backends, mTLS clients) are missed by public scanners
- Short-lived certificates renew successfully on one host but not on the others behind the same DNS record
- The person who set up renewal has left the company
None of these are technical problems. They're visibility problems. Monitoring is how you close that gap.
What Certificate Monitoring Actually Checks
A useful monitoring system goes beyond a single expiry date. Certificates can be broken in multiple ways, each with its own failure signature:
- Expiration: the most obvious. Alert at a configurable threshold before notAfter is reached.
- Chain integrity: an expired or missing intermediate certificate causes the same browser errors as an expired leaf. The full chain must be validated, not just the end certificate.
- Hostname matching: a certificate served on the wrong hostname (common after load balancer changes or CDN migrations) is functionally expired from the client's perspective.
- Revocation: a certificate that has been revoked by its CA via OCSP or CRL is no longer valid, even if it hasn't expired.
- CA trust: a certificate issued by a CA that browsers have distrusted (a rare but real event) will fail to validate.
- Fingerprint change: an unexpected certificate rotation may indicate a successful renewal, or it may indicate a configuration change you didn't authorize.
A monitoring system that only tracks the leaf certificate's expiry date misses five of the six failure modes above.
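As a rough illustration of several of these checks, here is a minimal Python sketch using only the standard library. The function names `days_until_expiry` and `check_endpoint` are made up for this example. It relies on `ssl.create_default_context()`, which validates the chain against the system trust store and checks the hostname, so it touches expiration, chain integrity, hostname matching, and CA trust. It does not check revocation, and the fingerprint it returns still has to be compared against a previously stored value by whatever calls it.

```python
import hashlib
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days from `now` until a notAfter string in getpeercert() format."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def check_endpoint(host: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Handshake with full verification and report what was found."""
    # The default context verifies the chain and the hostname.
    context = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
                der = tls.getpeercert(binary_form=True)
    except ssl.SSLCertVerificationError as exc:
        # Covers expired, untrusted, incomplete-chain, and wrong-hostname cases.
        return {"host": host, "ok": False, "reason": exc.verify_message}
    except OSError as exc:
        return {"host": host, "ok": False, "reason": f"connection failed: {exc}"}
    return {
        "host": host,
        "ok": True,
        "days_left": days_until_expiry(cert["notAfter"], datetime.now(timezone.utc)),
        # SHA-256 of the DER-encoded leaf, for detecting unexpected rotation.
        "fingerprint": hashlib.sha256(der).hexdigest(),
    }
```

Calling `check_endpoint("example.com")` returns a dictionary that an alerting layer could inspect. A failed verification is reported with the reason given by the TLS library rather than raised, so one bad endpoint does not stop a scan.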
Choosing the Right Alert Threshold
The single most important configuration decision is how far in advance to alert. Set it too far out and you'll get noise during normal renewal cycles. Set it too close and you won't have time to fix problems before users are affected.
The right threshold depends on certificate lifetime:
| Certificate Lifetime | Suggested Alert Threshold | Why |
|---|---|---|
| 398 days (legacy) | 30 days | Plenty of runway; rarely noisy |
| 90 days (Let's Encrypt current) | 15 days | Past the automatic renewal window |
| 47 days (post-2029 maximum) | 7 days | Short but still actionable |
The goal is to alert after the normal automatic renewal window has clearly failed, but with enough lead time to diagnose and fix the problem. As certificate lifetimes continue to shrink across the industry, this window is tightening. Alert thresholds need to shrink alongside certificate validity periods.
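The table can be expressed as a small lookup. This is only a sketch: the function names are hypothetical, and the cut-offs between the three listed lifetimes are an assumption, since the table only gives values for 398, 90, and 47 days.

```python
def alert_threshold_days(lifetime_days: int) -> int:
    """Suggested alert threshold for a certificate of the given lifetime."""
    # Boundaries between the table's rows are assumed, not prescribed.
    if lifetime_days > 90:
        return 30
    if lifetime_days > 47:
        return 15
    return 7


def should_alert(days_left: float, lifetime_days: int) -> bool:
    """True once the remaining validity drops to the threshold or below."""
    return days_left <= alert_threshold_days(lifetime_days)
```

For a 90-day certificate, `should_alert(20, 90)` is false and `should_alert(10, 90)` is true, which matches the idea of alerting only after automatic renewal should already have happened.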
Internal Certificates Are Often the Biggest Gap
Public-facing certificates on your website and API are typically the ones that get monitored first, because they're visible. But internal infrastructure often runs on certificates that are harder to track:
- mTLS between microservices
- Kubernetes ingress and internal service meshes
- Database connections (PostgreSQL, MySQL, MongoDB) using TLS
- LDAP directories
- Internal VPN endpoints
- Build and CI/CD systems with code-signing or artifact certificates
These certificates typically come from an internal CA, aren't visible to external scanners, and often get renewed manually. They cause the same kind of outages as public certificates, but they're harder to find and easier to forget. Any serious monitoring setup needs a way to cover internal infrastructure, whether through agents running inside the network or direct visibility into your internal CA.
Where Manual Checking Fails
A common starter approach is a cron job that runs openssl s_client or a curl probe once a day. It works for a handful of certificates, but it breaks down quickly:
- It doesn't validate the full chain
- It doesn't check alternate IPs behind a DNS round-robin
- It doesn't track certificates on STARTTLS protocols (SMTP, IMAP, LDAP)
- It has no alerting layer, so failures become cron output nobody reads
- It doesn't handle certificates on non-standard ports
- It doesn't catch certificates that stop serving because a listener was removed
For a single-server blog, the cron approach is fine. For anything larger, you need a dedicated monitoring layer. You can also sanity-check any public certificate manually with the SSL Certificate Check on Mr. DNS, which shows the full chain, expiry, SANs, and validation status for any hostname.
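One of the gaps listed above, alternate IPs behind a DNS round-robin, can be narrowed with a few lines of Python. The sketch below resolves every address for a hostname, connects to each one directly while still sending the hostname for SNI and verification, and groups the addresses by certificate fingerprint. The helper names are invented for this example, and more than one group is not always an error (a staggered rollout produces the same picture), so treat the result as a signal to investigate.

```python
import hashlib
import socket
import ssl


def resolve_all(host: str, port: int = 443) -> list[str]:
    """Every distinct IP address the hostname currently resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def fingerprint_at(ip: str, host: str, port: int = 443, timeout: float = 10.0) -> str:
    """SHA-256 fingerprint of the leaf certificate served at one IP."""
    context = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as sock:
        # Connect to the IP, but verify and send SNI for the hostname.
        with context.wrap_socket(sock, server_hostname=host) as tls:
            der = tls.getpeercert(binary_form=True)
    return hashlib.sha256(der).hexdigest()


def group_by_fingerprint(fingerprints: dict[str, str]) -> dict[str, list[str]]:
    """Map each fingerprint to the IPs serving it."""
    groups: dict[str, list[str]] = {}
    for ip, fingerprint in fingerprints.items():
        groups.setdefault(fingerprint, []).append(ip)
    return groups
```

A caller would build the input with something like `{ip: fingerprint_at(ip, host) for ip in resolve_all(host)}` and flag the hostname when `group_by_fingerprint` returns more than one key.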
Getting TLS Configuration Right
Monitoring catches problems once they exist. Good TLS configuration prevents a category of them in the first place: weak ciphers that force certificate replacement, chain configuration errors that make valid certs appear broken, and HSTS/OCSP stapling issues that compound expiry problems. GoodTLS has production-ready configurations for common web servers (Nginx, Apache, Caddy, HAProxy) and mail servers (Postfix, Exim, Dovecot) that pair well with active monitoring.
Building the Monitoring Layer
A complete SSL certificate expiration monitoring setup has three components:
- Discovery: an inventory of every certificate you need to track, including internal services, not just public URLs.
- Continuous validation: automated checks at least daily, covering expiry, chain integrity, hostname matching, and revocation status.
- Alerting with triage context: notifications routed to the team that owns the certificate, with enough detail to diagnose without digging through logs.
Generator Labs certificate monitoring handles all three. It checks every certificate on a configurable schedule, validates the full chain across every IP behind your hostnames, and supports on-premise agents for internal infrastructure. Alerts include the certificate details and the specific failure mode, so the person on call can act immediately.
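To make the three components concrete, here is a minimal, vendor-neutral Python sketch of how an inventory entry, a validation finding, and alert routing might fit together. It is not a description of any product's internals. All names (`Endpoint`, `Finding`, `route_alerts`) and the message format are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoint:
    """One inventory entry (discovery)."""
    host: str
    port: int
    owner: str  # team that should receive alerts for this certificate


@dataclass(frozen=True)
class Finding:
    """One problem found by a validation run (continuous validation)."""
    endpoint: Endpoint
    failure_mode: str  # e.g. "expiring", "chain", "hostname", "revoked"
    detail: str


def route_alerts(findings: list[Finding]) -> dict[str, list[str]]:
    """Group alert messages by owning team (alerting with triage context)."""
    alerts: dict[str, list[str]] = {}
    for finding in findings:
        ep = finding.endpoint
        message = f"{ep.host}:{ep.port} [{finding.failure_mode}] {finding.detail}"
        alerts.setdefault(ep.owner, []).append(message)
    return alerts
```

Keeping the owner on the inventory entry, rather than deciding it at alert time, means a certificate inherited by a new team only needs one field updated for alerts to reach the right people.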
Certificate outages are predictable and entirely preventable. The monitoring layer that makes them so is the cheapest insurance you can buy against the most visible kind of infrastructure failure.