Back in 2017, we published a performance benchmark to showcase the vast volumes of events Apache Kafka can process. As is natural for Aiven services, we evaluated performance across the public cloud providers we support.

Kafka, and the resources the public cloud providers offer, have both changed considerably since then. As such, we felt it was due time for a refresher and decided to remeasure write performance in this test.

That said, we made some small changes to the benchmark setup so that it better reflects real-world workloads. This time, we also calculated the monthly throughput cost for each plan on each cloud. So, let’s jump in!

2019 Aiven Kafka benchmark setup

This time around, though, we used a replication factor of 3 to match the typical production use case. With replication, this test accounts for the network traffic between the brokers as well.
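As a back-of-the-envelope illustration (not part of the benchmark itself): with a replication factor of 3, every byte a partition leader accepts is fetched by two follower brokers, so inter-broker replication traffic is roughly twice the produced throughput.

```python
def replication_traffic_mb_s(produced_mb_s: float, replication_factor: int = 3) -> float:
    """Approximate inter-broker replication traffic for a given produce rate.

    Each byte accepted by a partition leader is fetched by
    (replication_factor - 1) follower brokers.
    """
    return produced_mb_s * (replication_factor - 1)

# e.g. 35 MB/s produced with RF=3 implies roughly 70 MB/s of replication traffic
print(replication_traffic_mb_s(35.0))  # 70.0
```

This is why replicated clusters hit network limits sooner than our 2017 unreplicated test did.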

We used a single topic for our write operations with a partition count set to either 3 or 6, depending on the number of brokers in each test cluster. As the test clusters were regular Aiven services, the partitions and replicas were spread out across availability zones.

Messages were produced via the librdkafka_performance tool with a message size of 512 bytes, a default batch size of 10,000 and no compression. Continuing our quest to simulate real-world use, client connections were made over TLS.
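A load run along these lines can be reproduced with librdkafka's `rdkafka_performance` example tool. This is a sketch only; the broker address, topic name, and certificate paths below are placeholders you would replace with your own service's values:

```shell
# Produce 512-byte messages over TLS with no compression.
# Host, port, topic and certificate paths are placeholders.
rdkafka_performance -P \
  -b kafka.example.aivencloud.com:12345 \
  -t benchmark-topic \
  -s 512 \
  -z none \
  -X batch.num.messages=10000 \
  -X security.protocol=ssl \
  -X ssl.ca.location=ca.pem \
  -X ssl.certificate.location=service.cert \
  -X ssl.key.location=service.key
```

Note that 10,000 is librdkafka's default for `batch.num.messages`, so the flag is shown here only to make the setting explicit.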

We used Kafka version 2.1 running with Java 8; as a side note, it’ll be interesting to benchmark Aiven Kafka running with Java 11 in future tests because we expect Java improvements to positively impact its performance.

During the test, we kept increasing the number of producing clients until we reached the maximum throughput rate each plan tier’s cluster could accept. To verify our readings, we left the load running for some time.

If you’re interested in verifying our results, you can get the test code here. In our tests, we used Google’s managed Kubernetes service to easily scale the number of load-generating nodes up and down.

Kafka Business-4 benchmark results

The somewhat lower performance of GCP and Azure compared to last time can be explained by this test’s inclusion of replication, which our previous test omitted. Surprisingly, AWS’s performance jumped from the previous test’s 50,000 messages/second to this number. This is explained by the more recent instance types and network improvements AWS has been fielding in their cloud.

Kafka Business-4 performance in MB/second

We then used the message rates to derive throughput numbers, which were over 65 MB/second for AWS, and just under 35 MB/second for GCP and Azure. Pretty impressive! But, what is the cost per performance?
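Cost per performance is just the monthly plan price divided by the sustained write throughput. A minimal sketch of that calculation, using a hypothetical plan price rather than any real Aiven list price:

```python
def monthly_cost_per_mb_s(plan_price_usd_month: float, throughput_mb_s: float) -> float:
    """Monthly plan price divided by sustained write throughput (USD per MB/s)."""
    return plan_price_usd_month / throughput_mb_s

# Illustrative figures only: a $1,000/month plan sustaining 65 MB/s
print(round(monthly_cost_per_mb_s(1000.0, 65.0), 2))  # 15.38
```

The lower this number, the more write throughput each dollar buys on that plan and cloud.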

Kafka Business-4 monthly throughput cost

Kafka Business-8 benchmark results

AWS performance didn’t move from the previous plan size. A look into our monitoring revealed that both tests were capped by the available network bandwidth of the broker instances.

Kafka Business-8 performance in MB/second

Kafka Business-8 monthly throughput cost

Kafka Premium-6x-8 benchmark results

And the message rates? An impressive 270,000 messages per second on AWS, 238,000 on Azure and 167,000 on GCP; well in line with the expected results.
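Converting these message rates back to throughput is straightforward with the 512-byte message size used throughout the test; a small sketch:

```python
MESSAGE_SIZE_BYTES = 512  # message size used across all test runs

def throughput_mb_s(messages_per_second: int, message_size: int = MESSAGE_SIZE_BYTES) -> float:
    """Derive write throughput in MB/s from a message rate."""
    return messages_per_second * message_size / 1_000_000

for cloud, rate in [("AWS", 270_000), ("Azure", 238_000), ("GCP", 167_000)]:
    print(f"{cloud}: {throughput_mb_s(rate):.1f} MB/s")
# AWS: 138.2 MB/s, Azure: 121.9 MB/s, GCP: 85.5 MB/s
```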

Kafka Premium-6x-8 performance in MB/second

Kafka Premium-6x-8 monthly throughput cost

Wrapping up

Again, we’d like to stress that monthly throughput cost should not be considered in isolation when comparing plans. Although important, it is only one of several factors that come into play when pricing plans, such as storage.

Additionally, we can’t stress enough that workloads vary, so you should definitely benchmark your own representative event flows. For a more robust picture, we’ll be addressing combined read/write tests in the near future.

Aiven Kafka

After launching your Aiven Kafka service in minutes, you can rest assured it’ll remain operational, performant, up-to-date, and secure. Find out more about Aiven Kafka, keep up to date with our changelog, find us on social media, and try the Aiven platform free for 30 days.

Your database in the cloud,