This matches my testing. Aurora is fast enough that we actually started to contend on the heap locks, and had to tune our batch ordering. Too bad network-attached I/O is so slow.
>from self-managed PostgreSQL on EC2 to the managed Aurora service
They were on AWS already, they just moved from "the most expensive" to "the second most expensive" alternative.
That also explains the performance gains.
I don't think these numbers would hold up against PostgreSQL on a bare-metal server with NVMe SSDs attached.
Note that they were running psql on ec2 instances, which implies bare metal with SSDs.
Yes you can get actual hosts in AWS if you pay for them.
The article also implies that they were never able to get psql to replicate effectively.
Whoever their DBA was couldn't do it, so they were like "fuck it, let's move to Aurora." Running the database themselves brought no actual value, so it makes sense for them.
> Note that they were running psql on ec2 instances, which implies bare metal with SSDs.
No, this doesn't imply SSDs.
EC2 instances don't get local SSDs by default; they get EBS volumes, which are a much slower distributed network block storage medium.
Not sure about their PostgreSQL replication issues but lots of companies manage to make it work without a hitch.
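Agreed; for reference, basic streaming replication is mostly a handful of settings plus a base backup. A minimal sketch, assuming PostgreSQL 12+ (the hostname, role name, and data path below are placeholders, not from the article):

```shell
# On the primary, allow WAL streaming (postgresql.conf):
#   wal_level = replica
#   max_wal_senders = 5
# and permit the standby host in pg_hba.conf for the "replication" database.

# Seed the standby from the primary; -R writes standby.signal and
# primary_conninfo so the node comes up as a follower, and -X stream
# ships WAL while the base backup is being taken.
pg_basebackup -h primary.internal -U replicator \
    -D /var/lib/postgresql/data -R -X stream -P
```

Once the standby starts, it connects back to the primary and replays WAL continuously.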
You can have EC2 instances with SSDs. But generally the workflow for running your own DB is smart caching in memory on top of whatever storage you've got.
The problem with the metal instances you are referring to is that the data on the instance-store SSD is lost if you stop and start the server.
I wonder if he still has a job? Running all your Postgres on EC2 is like 1000x harder than looking at the dashboard for an Aurora cluster.
They likely were running data on EBS volumes instead of bare metal SSDs, due to ease of recovery (a failed instance does not lose data on the attached EBS volumes). You can only run your DBs on bare metal SSDs if you are prepared to lose a node’s data completely.
In fact, many instance types no longer have any ephemeral storage attached and it’s a default practice to use EBS for root and data volumes.
There are some instance types that have extremely fast EBS performance (EBS io2 Block Express), which has hardware acceleration and an optimized network protocol for EBS network I/O and offers sub-millisecond latency. However, these are expensive and get even more so if you go up in IOPS.
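To put rough numbers on how the cost scales, here is a toy sketch. The rates below are illustrative placeholders, not actual AWS prices (real io2 pricing is tiered by IOPS and varies by region), and the function is my own construction:

```python
# Toy estimate of a provisioned-IOPS volume's monthly cost.
# NOTE: gb_rate and iops_rate are placeholder rates for illustration
# only -- real io2 Block Express pricing is tiered and region-specific.
def pio_volume_monthly_cost(size_gib: float, provisioned_iops: int,
                            gb_rate: float = 0.125,
                            iops_rate: float = 0.065) -> float:
    return size_gib * gb_rate + provisioned_iops * iops_rate

# Cost is dominated by the IOPS term as you provision more:
low = pio_volume_monthly_cost(1000, 16_000)    # 125 + 1040 = 1165.0
high = pio_volume_monthly_cost(1000, 64_000)   # 125 + 4160 = 4285.0
```

Even with flat placeholder rates, quadrupling IOPS roughly quadruples the bill; storage size is almost a rounding error by comparison.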
I'd argue you need the infrastructure to recover from the loss of a host regardless, provided you have backups set up properly.
Using EBS seems like a total anti-pattern for DB workloads.
Why doesn't the Griffiths Waite website have Netflix in its "we have helped" customer list?
It doesn’t appear they were involved in the migration, just in writing this article, which mostly reiterates the AWS press release?
Probably contractually disallowed.