Close-to-the-metal architecture handles millions of OPS with predictable single-digit millisecond latencies.
Learn MoreClose-to-the-metal architecture handles millions of OPS with predictable single-digit millisecond latencies.
Learn MoreScyllaDB is purpose-built for data-intensive apps that require high throughput & predictable low latency.
Learn MoreLevel up your skills with our free NoSQL database courses.
Take a CourseOur blog keeps you up to date with recent news about the ScyllaDB NoSQL database and related technologies, success stories and developer how-tos.
Read MoreScyllaDB has recently experienced a significant increase in the number of DynamoDB users moving to ScyllaDB Cloud, our database-as-a-service offering. Price performance is cited as a primary driver in virtually all of these interactions.
To help teams better assess whether a move makes sense, we decided to perform a detailed DynamoDB benchmark with a price-performance comparison analyzing:
This benchmark report outlines the detailed findings, but here’s the bottom line: ScyllaDB costs are significantly lower than DynamoDB costs in all but one scenario. In realistic workloads, costs would be 5X to 40X lower — with up to 4X better P99 latency.
Here is a consolidated look at how DynamoDB and ScyllaDB compare on cost and performance for just one of the many workloads we tested. DynamoDB shines with a Uniform distribution and struggles with the others; we chose to highlight a case where it shines.
For our cost comparisons, we started with a range of throughput + workload scenarios and calculated their cost on ScyllaDB Cloud with the AWS instances specified in the extended report. Next, we calculated the cost of running the same throughput + workload scenarios on DynamoDB. We needed to know:
We used an item size of 1081 bytes (calculated using the DynamoDB documentation), which translates to 2 WCUs per write operation and 1 RCU per read operation on DynamoDB. 1 TB costs ~$250/month. The cost per unit varies according to the pricing model: provisioned or on-demand.
For ScyllaDB we used a 3-node cluster for all modes. Users can scale out/in the cluster at any point. Hourly rates (on-demand) were used for ScyllaDB. As the platform scales with the amount of resources, you can linearly change the price for the performance level you require. Annual pricing provides significant cost reduction but is out of the scope of this benchmark.
DynamoDB has two modes for non-annual pricing: provisioned and on-demand pricing. Provisioned mode is recommended if your workloads are reasonably predictable. On-demand pricing is significantly more expensive and is a fit for unpredictable workloads. It is possible to combine modes, add auto-scaling, and so forth. DynamoDB provides considerable flexibility around managing the cost and scale of the aforementioned options, but this also results in considerable complexity.
We measured the ScyllaDB performance in all different workload mixes and distributions, then compared it to the cost of DynamoDB with allocations to the required read/write units. Note that beyond ScyllaDB’s on-demand pricing (which can be estimated via our pricing calculator), discounts are provided for annual commitments. To see how we calculated costs, refer to the Cost Calculations section of the full report’s Appendix.
Provisioned mode is recommended if your workloads are reasonably predictable. With DynamoDB, you need to be able to predict per-table capacity regarding read and write capacity units.
With just one exception, DynamoDB’s cost estimates were consistently higher than ScyllaDB’s – and much more so for the most write-heavy workloads.* This is not surprising, given that DynamoDB charges 5X more for writes than for reads, while ScyllaDB does not differentiate between operations, and its pricing is based on the actual cluster size.
We also compared pricing using DynamoDB’s on-demand pricing model. For details, see the complete benchmark report.
Next, let’s shift the focus to performance. For use cases that require near real-time responses (e.g., AdTech, messaging, gaming, IoT), P99 latency is critical for meeting SLAs and delivering an engaging user experience.
Here are the performance results for the Uniform distribution.
ScyllaDB’s latency was significantly lower than that of DynamoDB – at a fraction of the cost. An even larger reduction in P99 latency and mean latency (under 1 ms) could be achieved by using a larger ScyllaDB cluster and reducing the utilization from 75% to the 30%-50% range. Additionally, ScyllaDB comes with several options which might further improve latencies, such as the BYPASS CACHE extension, designed for workloads that don’t make effective use of caching (e.g., a workload that is much larger than the RAM with Uniform distribution). Such options have not been applied for the purpose of this performance testing.
Now, let’s shift to a more realistic distribution: Hotspot. As explained in the full report’s Appendix, the Hotspot distribution has some sets of values accessed more frequently than others.
As you can see, the discrepancy between DynamoDB and ScyllaDB latencies widened for these more realistic workloads. Since a portion of the requests had a higher chance of hitting an object from the hot set, the probability of ScyllaDB’s cache being touched during reads increased significantly. The results show that ScyllaDB P99 latencies decreased compared to the previous Uniform results as the read ratio increased.
The next graph represents the P99 latency as a function of ScyllaDB’s utilization. Note how the P99 decreases along with a lower cluster load:
Finally, let’s look at the interesting case of Zipfian distribution. Zipfian distribution takes the hotspot to new levels by simulating access patterns with an exponential relation between items (a.k.a. “hotness”). As Alex DeBrie notes, Zipfian is generally problematic for DynamoDB: “The most popular items are accessed orders of magnitude more than the average item. Because DynamoDB wants a more even distribution of your data, your application may get throttled as it tries to access popular items.”
As expected, DynamoDB performed poorly on these workloads.
As noted by DeBrie, the Zipfian distribution is likely to create hot partitions, an imbalance introduced by uneven item access from within the database. As hot partitions are a known antipattern, DynamoDB restricts the number of hits on the same partition (documented to be 3,000 RCUs and 1,000 WCUs per partition at the time of writing). Aware of this limitation and of Zipfian’s imbalances, we compared both ScyllaDB and DynamoDB at scale.
Once DynamoDB limits were reached, requests started getting throttled, requiring the application to retry. As a result of the throttling and retries, the YCSB latency became severely impacted. Under some circumstances, we experienced DynamoDB throttling at up to ~2.5 seconds per request – thus preventing the application from accessing the item during that time.
At best, DynamoDB delivered 39.72% of the expected throughput (88.19 good kops/s vs. the target 222 kops/s). At worst, it delivered only 16.22% (20.28 kops/s) of what was purchased (125 kops/s) and supposedly provisioned. Given the documented DynamoDB limitations, some discrepancy is to be expected. However, the magnitude of this unfulfilled throughput was surprising.
Unlike DynamoDB, ScyllaDB managed to sustain the target throughput without any throttling and still deliver single-digit millisecond latencies. In that regard, ScyllaDB does not impose any hard limits on querying hot partitions. Even though ScyllaDB implements concurrency-limiting mechanisms, frequently accessing popular items will typically benefit from its cache implementation – thus explaining why no failures have been seen during the Zipfian tests. Even then, it is worth underscoring that hot partitions are an antipattern, even for databases like ScyllaDB. Read more about ScyllaDB’s advanced control mechanisms in ScyllaDB in this blog.
As the results indicate, what might begin at a seemingly reasonable cost can quickly escalate into “bill shock” with DynamoDB – especially as the throughput increases, and particularly with write-heavy workloads. This makes it a suboptimal choice for data-intensive applications anticipating steady or rapid growth. ScyllaDB’s significantly lower costs – a reflection of ScyllaDB taking full advantage of modern infrastructure for high throughput and low latency – make it a more cost-effective solution for data-intensive applications.
ScyllaDB – with its LSM-tree-based storage, unified caching, shard-per-core design, and advanced schedulers – allows you to maximize the advantages of modern hardware, from huge CPU chips to blazing-fast NVMe.
This small-scale benchmark demonstrated how a 57K OPS workload with a 50:50 read/write ratio that cost $4,500/month on-demand on ScyllaDB would cost $30,000 to $200,000/month with DynamoDB. Beyond those cost savings, ScyllaDB sustains 2X peaks and provides 2X-4X better P99 latency. Additionally, it can further reduce latency when idle – or enable spare resources to be shared across multiple tables. For larger workloads spanning 500K-1M OPS and beyond, this can result in a cost saving in the millions – with better performance and fewer query limitations.
Apache® and Apache Cassandra® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries. Amazon DynamoDB® and Dynamo Accelerator® are trademarks of Amazon.com, Inc. No endorsements by The Apache Software Foundation or Amazon.com, Inc. are implied by the use of these marks.