Most teams running databases on EC2 or bare metal have never actually tested a restore. That's not a backup strategy — it's a false sense of security. Here's the uncomfortable truth about self-hosted databases and why switching to RDS or Aurora isn't optional anymore.
Ask any engineering team running a database on EC2 if they have backups, and the answer is almost always yes.
Ask them when they last tested a restore — and the room goes quiet.
That silence is the real problem. It's not that teams don't care about their data. It's that "we have backups" and "we can actually recover from a failure" are two completely different things — and most teams have only the first one.
The Myth of the Nightly Dump
The most common database backup strategy for self-hosted setups is a scheduled mysqldump or pg_dump that runs at 2am and uploads to S3. On paper, this sounds reasonable. In practice, it has three silent failure modes that most teams never discover until it's too late.
The dump might be failing. Cron jobs fail silently. If nobody is monitoring the backup job's exit code and sending alerts on failure, your last successful backup might be from three weeks ago. You won't know until you need it.
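One way to close the silent-failure gap is to wrap the backup command so that a non-zero exit code always reaches a human. The following is a minimal sketch, not a complete backup tool: `run_backup` and the `alert` callback are hypothetical names, and the `pg_dump` arguments in the trailing comment (database name, output path) are placeholders.

```python
import subprocess
import sys
from typing import Callable, Sequence


def run_backup(argv: Sequence[str], alert: Callable[[str], None]) -> bool:
    """Run a backup command; call alert() and return False if it fails."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True)
    except OSError as exc:
        # The binary is missing or not executable: also a failed backup.
        alert(f"backup could not start: {exc}")
        return False
    if result.returncode != 0:
        # Surface the failure instead of letting cron swallow it.
        alert(f"backup failed (exit {result.returncode}): {result.stderr.strip()}")
        return False
    return True


# Hypothetical usage from a cron-invoked script (placeholder names and paths):
# ok = run_backup(
#     ["pg_dump", "--format=custom", "--file=/backups/nightly.dump", "appdb"],
#     alert=lambda msg: print(msg, file=sys.stderr),
# )
# sys.exit(0 if ok else 1)
```

In practice the `alert` callback would post to whatever paging or chat channel the team already watches; printing to stderr is only a stand-in.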
The dump might be corrupt. Large database dumps occasionally produce files that appear valid but fail to restore. The only way to know is to actually run a restore. If you haven't done that recently, your backup is unverified.
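A restore drill can be scripted so that it runs on a schedule rather than only during an incident. The sketch below assumes a PostgreSQL dump and a scratch database; `restore_drill` is a hypothetical helper, and the database name, dump path, and `orders` table in the commented example are placeholders.

```python
import subprocess
from typing import Optional, Sequence, Tuple

Step = Tuple[str, Sequence[str]]


def restore_drill(steps: Sequence[Step]) -> Optional[str]:
    """Run each step in order.

    Returns the name of the first step that fails, or None if all succeed.
    """
    for name, argv in steps:
        try:
            code = subprocess.run(argv).returncode
        except OSError:
            code = -1  # command missing or not executable
        if code != 0:
            return name
    return None


# Hypothetical drill (all names and paths are placeholders):
# failed = restore_drill([
#     ("create scratch db", ["createdb", "restore_test"]),
#     ("restore dump", ["pg_restore", "--dbname=restore_test", "/backups/nightly.dump"]),
#     ("sanity query", ["psql", "-v", "ON_ERROR_STOP=1", "-d", "restore_test",
#                       "-c", "SELECT count(*) FROM orders"]),
# ])
```

A drill like this only proves that the dump loads and a known table is queryable; a stricter check would compare row counts or checksums against production.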
The dump is up to 24 hours stale. If something goes wrong at 11pm (an accidental DELETE, a bad migration, data corruption), you're restoring to the 2am backup. Every transaction from the 21 hours in between is gone.
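The staleness arithmetic is simple enough to write down. This small sketch uses the 2am dump and 11pm incident from the scenario above; the calendar date is arbitrary and `data_loss_window` is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, failure: datetime) -> timedelta:
    """Time span of writes lost when restoring from the last dump."""
    if failure < last_backup:
        raise ValueError("failure time precedes the last backup")
    return failure - last_backup


last_dump = datetime(2024, 1, 9, 2, 0)   # nightly dump at 2am
incident = datetime(2024, 1, 9, 23, 0)   # something goes wrong at 11pm
print(data_loss_window(last_dump, incident))  # 21:00:00
```

The worst case is an incident just before the next dump starts, where the window approaches the full 24 hours, and it grows further if the previous night's dump failed.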
That last point is what makes nightly dumps fundamentally insufficient for any business where data has real value. And almost every business's data has real value.
What Point-in-Time Recovery Actually Means
Point-in-Time Recovery (PITR) is the capability to restore your database to any specific moment — any second — within a defined window. Not to last night's dump. To 3:47pm on Tuesday if that's when the problem occurred.
This is how real database protection works. And it's not something you get for free from a nightly dump script.
True PITR requires continuous archiving of database transaction logs — WAL files in PostgreSQL, binary logs in MySQL — shipped to durable storage as they're generated. It requires infrastructure to replay those logs from a base backup to an exact point. It requires monitoring to ensure archiving hasn't stopped. And it requires periodic testing to confirm that recovery actually works.
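The replay idea can be illustrated with a toy model: start from a base backup and apply logged changes in commit order until the target time is reached. This is a conceptual sketch only. Real WAL and binary log replay work on pages and transactions, not a key-value dictionary, and every name here is hypothetical.

```python
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple

# (commit time, key, new value); a value of None stands for a delete.
LogEntry = Tuple[datetime, str, Optional[str]]


def restore_to_point(
    base: Dict[str, str], log: Iterable[LogEntry], target: datetime
) -> Dict[str, str]:
    """Replay log entries committed at or before target onto a copy of base."""
    state = dict(base)  # never mutate the base backup itself
    for ts, key, value in sorted(log, key=lambda entry: entry[0]):
        if ts > target:
            break  # stop just before the first change after the target time
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


base = {"a": "1"}
log = [
    (datetime(2024, 1, 9, 15, 40), "b", "2"),
    (datetime(2024, 1, 9, 15, 47), "a", None),  # the accidental delete
    (datetime(2024, 1, 9, 15, 50), "c", "3"),
]
# Recover to one second before the bad change at 3:47pm.
print(restore_to_point(base, log, datetime(2024, 1, 9, 15, 46, 59)))
# {'a': '1', 'b': '2'}
```

The model also shows why archiving gaps matter: if any entry before the target is missing from the log, the replayed state is wrong and nothing in the replay itself will flag it.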
Building and maintaining that correctly, on a self-hosted database, is a genuine engineering project. Most teams don't have the bandwidth to do it right. So they run the nightly dump and tell themselves they're covered.
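Amazon RDS provides PITR through automated backups, which are enabled by default: there are no archiving scripts to write and no log shipping to monitor. The retention period is configurable up to 35 days, and a point-in-time restore creates a new instance at the second you choose, typically up to within the last five minutes.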
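For reference, a point-in-time restore on RDS can be requested through boto3's `restore_db_instance_to_point_in_time` call, which creates a new instance rather than overwriting the source. The sketch below separates building the request from sending it; `pitr_request` and `restore` are hypothetical helper names, the instance identifiers are placeholders, and a real restore usually needs further options (instance class, subnet group, and so on) that are omitted here.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def pitr_request(
    source: str, target: str, restore_time: Optional[datetime] = None
) -> Dict[str, Any]:
    """Build keyword arguments for restore_db_instance_to_point_in_time."""
    params: Dict[str, Any] = {
        "SourceDBInstanceIdentifier": source,
        "TargetDBInstanceIdentifier": target,
    }
    if restore_time is None:
        # No explicit time: restore to the most recent recoverable point.
        params["UseLatestRestorableTime"] = True
    else:
        if restore_time.tzinfo is None:
            raise ValueError("restore_time must be timezone-aware")
        params["RestoreTime"] = restore_time.astimezone(timezone.utc)
    return params


def restore(params: Dict[str, Any]) -> None:
    """Send the request. Requires boto3 and AWS credentials."""
    import boto3  # third-party; imported lazily so pitr_request works without it

    boto3.client("rds").restore_db_instance_to_point_in_time(**params)


# Placeholder identifiers, restoring to 3:47pm UTC on a given day:
request = pitr_request(
    "prod-db", "prod-db-restored", datetime(2024, 1, 9, 15, 47, tzinfo=timezone.utc)
)
```

Because the restore lands on a new instance, the application still has to be pointed at it (or the data copied back), so the procedure is worth rehearsing before it is needed.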
Failover: The 3am Phone Call You're Waiting For
Self-hosted databases fail. Hardware breaks, disks fill up, kernels panic. When that happens at 3am, someone gets a phone call.
That person has to SSH in, assess the situation, decide whether to fail over to a replica (if one exists), update connection strings, restart application servers, and verify that everything is working correctly. In a best-case scenario with an experienced engineer who's done it before, this takes 15–30 minutes. In reality, with a tired engineer, an unfamiliar system, and cascading alerts, it often takes much longer.
Every minute of that downtime has a cost. Lost revenue, frustrated users, eroded trust. For a business that processes transactions or serves clients in real time, the math is not kind.
Amazon RDS Multi-AZ keeps a synchronised standby in a separate Availability Zone at all times. When the primary fails, RDS detects the failure automatically, promotes the standby, and updates the DNS endpoint, so your application reconnects to the same hostname it always used. The process typically takes 60 to 120 seconds, with no human intervention required. Aurora is usually faster, with failover typically completing in around 30 seconds.
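Automatic failover still assumes the application reconnects on its own once the endpoint points at the new primary. A minimal sketch of that client-side behaviour is a retry loop with exponential backoff. The `connect` callable is a stand-in for whatever your driver provides, and catching `ConnectionError` is an assumption; real drivers raise their own exception types.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    attempts: int = 10,
    base_delay: float = 0.5,
    max_delay: float = 15.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call connect() until it succeeds, doubling the wait after each failure."""
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries: let the caller see the failure
            sleep(delay)
            delay = min(delay * 2, max_delay)
    raise ValueError("attempts must be at least 1")
```

The attempt count and delays should be tuned so the total retry budget comfortably covers the expected failover window; connection pools and DNS caching in the client can lengthen the effective recovery time.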
The 3am phone call doesn't happen. The engineer sleeps. The business stays online.
The Security Problem Nobody Talks About
Self-hosted databases accumulate security debt quietly. The database version that was current when it was set up gradually falls behind. CVEs get published. Security advisories get sent. And patching a live production database — with the downtime and risk that implies — keeps getting pushed to "next sprint."
Meanwhile, the vulnerability window grows.
With automatic minor version upgrades enabled, RDS applies minor version patches during a maintenance window you choose. Security updates that would require a manual process on self-hosted infrastructure happen automatically, on a schedule you control, with RDS managing the coordination. You stop accumulating that security debt.
Beyond patching, RDS integrates with AWS's security infrastructure in ways that are difficult to replicate on EC2. Encryption at rest, credentials managed through IAM and Secrets Manager, network isolation through VPC security groups, and audit logging of RDS API calls through CloudTrail are all built-in options you switch on, not systems you have to build. Encryption at rest in particular has to be chosen when the instance is created.
The Storage Crisis That Will Eventually Happen
Self-hosted databases run out of disk space. It's not a question of if — it's when.
The disk fills gradually, then suddenly. The application starts throwing errors. Someone has to extend the EBS volume, resize the filesystem, wait for the database to acknowledge the new space, all while the application is degraded or down.
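On a self-hosted database, the "gradually, then suddenly" pattern can at least be anticipated with a simple projection. The sketch below assumes linear growth, which is a simplification, and both helper names are hypothetical. `shutil.disk_usage` is the standard-library call for reading current usage.

```python
import shutil
from typing import Tuple

GIB = 1024 ** 3


def usage_gib(path: str) -> Tuple[float, float]:
    """Return (total, used) capacity in GiB for the filesystem holding path."""
    usage = shutil.disk_usage(path)
    return usage.total / GIB, usage.used / GIB


def days_until_full(capacity: float, used: float, growth_per_day: float) -> float:
    """Days of headroom left, assuming growth stays linear."""
    if growth_per_day <= 0:
        return float("inf")  # not growing: no projected fill date
    return max(capacity - used, 0.0) / growth_per_day


# Illustrative numbers only: a 500 GiB volume, 450 GiB used, growing 2.5 GiB/day.
print(days_until_full(500, 450, 2.5))  # 20.0
```

Alerting on projected days remaining rather than on a fixed percentage gives more warning on fast-growing volumes, although bursts such as a large migration or runaway logging can still outpace a linear estimate.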
On Aurora, storage grows automatically with your data, up to 128 TiB. On RDS, storage autoscaling, once enabled with a maximum storage threshold, expands the volume as free space runs low. Neither requires manual intervention. Neither pages anyone at 3am because a disk is at 95%.
"But It's More Expensive"
This is the objection that comes up every time. And it's technically true — an RDS instance costs more in pure compute terms than an equivalent EC2 instance running the same database.
What the comparison misses is everything else.
The engineering time to build and maintain proper backup and recovery infrastructure. The cost of testing restores (and the cost of not testing them). The cost of a database engineer's time when a failover happens manually at 3am. The cost of a security incident from an unpatched vulnerability. The cost of an hour of downtime for your business.
When you add those up, RDS often comes out cheaper in total. It's a trade: you pay a premium on the infrastructure line item to avoid paying a much larger cost elsewhere, in engineering time, operational risk, and eventual incidents.
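The comparison can be made explicit by putting infrastructure, engineering time, and expected downtime into one annual figure. The sketch below is only a framework for that arithmetic: the `Scenario` class is hypothetical, and every number in the example is an invented placeholder, not a measurement or a price quote. Substitute your own figures.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    infra_per_month: float          # compute and storage bill
    ops_hours_per_month: float      # engineer time on backups, patching, drills
    engineer_hourly_rate: float
    downtime_hours_per_year: float  # expected, not worst case
    downtime_cost_per_hour: float

    def annual_cost(self) -> float:
        """Yearly infrastructure and labour cost plus expected downtime cost."""
        recurring = 12 * (
            self.infra_per_month
            + self.ops_hours_per_month * self.engineer_hourly_rate
        )
        incidents = self.downtime_hours_per_year * self.downtime_cost_per_hour
        return recurring + incidents


# Placeholder inputs for illustration only.
self_hosted = Scenario(300, 10, 100, 4, 5000)
managed = Scenario(500, 2, 100, 0.5, 5000)
print(self_hosted.annual_cost(), managed.annual_cost())
```

The outcome depends entirely on the inputs: a team with very cheap downtime and little operational work may find the self-hosted figure lower, which is why the estimate is worth doing with real numbers.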
Most teams discover this the hard way. Some discover it before the incident. The ones who migrate before something goes wrong are the ones who looked honestly at their self-hosted setup and asked: "If I needed to restore right now, could I? How long would it take? What would I lose?"
If you can't answer those questions confidently, that's your answer.
The Databases RDS and Aurora Support
This isn't a PostgreSQL-only conversation. RDS supports the major relational database engines, including:
- PostgreSQL — full compatibility, including logical replication and most extensions
- MySQL and MariaDB — drop-in compatibility with binary log CDC
- Microsoft SQL Server — Express through Enterprise editions
- Oracle — Standard and Enterprise editions
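Aurora supports PostgreSQL and MySQL with a distributed storage engine that offers faster failover and up to 15 read replicas. Because the replicas share the same storage layer as the writer, replica lag is typically well under a second, though not zero. Aurora MySQL also offers Backtrack: the ability to rewind your live cluster to a point within a configured window (up to 72 hours) without a restore. Backtrack is not available for Aurora PostgreSQL and has to be enabled on the cluster up front.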
Whatever database you're running today on self-hosted infrastructure, there's a managed path to RDS or Aurora.
When to Make the Move
The right time to migrate to RDS is before you need to. Not after a disk fills up, not after a failover takes 45 minutes, not after you discover your last three weeks of backups failed silently.
If you're running a self-hosted database and any of these are true, you should be planning the migration now:
- You've never successfully restored from a backup
- You don't have PITR — only periodic dumps
- A database failure would require manual steps to recover
- You're not monitoring your backup jobs
- Your database version is more than two minor releases behind
- Storage capacity is managed manually
The migration itself (moving data from a self-hosted database to RDS) is well understood and, for most use cases, can be done with near-zero downtime using AWS Database Migration Service, which keeps the target in sync through ongoing replication until a short cutover. The complexity is manageable. The risk of not migrating is not.
Your database is probably the most critical piece of infrastructure your business runs on. It deserves the same level of managed reliability that you'd apply to any other critical system. RDS exists precisely because managing that reliability at the infrastructure level — so application teams don't have to — is genuinely hard to do well.
Most self-hosted database setups are one bad night away from a very bad week. The question is whether you find out on your terms or the infrastructure's.
If you're evaluating a database migration or want to understand what moving to RDS would look like for your specific setup, let's have a conversation. Getting this right before something goes wrong is exactly the kind of work I help teams with.