Performance Tuning Our RabbitMQ Routing Strategy

A couple of years ago we made the decision to move our software to headers exchanges. We did this because headers provide routing information that can be extended without breaking any existing bindings. For example, most of our messages carry a FixtureId and a BookmakerId, so when we bind to these messages we can easily pick up everything for a single sporting fixture and/or customer. If I now wanted every message for a single sport, I could add a SportId header and bind to it without breaking any other binding implementations. If I added that information to a routing key instead, any queues bound to it would need their binding keys changed to wildcard (* or #) the new part of the key.

Another huge benefit is that by using well defined names for the header keys we can further decouple components. With topic routing keys you are more likely to end up in a situation where producers and consumers share knowledge of how to construct the relevant routing keys to talk to each other, adding a level of coupling to your supposedly decoupled messaging system.
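To make that concrete, here is a minimal sketch using Python's pika client (the exchange, queue and header names are illustrative, not our actual ones) of how a consumer binds on FixtureId and BookmakerId, and how a publisher can later start adding a SportId header without touching that binding:

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

channel.exchange_declare(exchange="sports.headers", exchange_type="headers")

# Existing consumer: binds on FixtureId + BookmakerId only.
channel.queue_declare(queue="fixture-123-bookmaker-7")
channel.queue_bind(
    queue="fixture-123-bookmaker-7",
    exchange="sports.headers",
    arguments={"x-match": "all", "FixtureId": "123", "BookmakerId": "7"},
)

# Publisher later starts adding SportId; the binding above still matches,
# because x-match only inspects the headers it names and ignores extras.
channel.basic_publish(
    exchange="sports.headers",
    routing_key="",  # ignored by headers exchanges
    body=b"...",
    properties=pika.BasicProperties(
        headers={"FixtureId": "123", "BookmakerId": "7", "SportId": "2"}
    ),
)
```

With a topic exchange, by contrast, a queue bound with the key 123.7 would stop matching as soon as the publisher appended a SportId segment (123.7.2); the binding would have to become 123.7.* or 123.7.# instead.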

All of this sounds great, but we have recently started to see strange load profiles on our RabbitMQ servers during busy periods. The servers are not scaling in the linear fashion we would expect, so I started thinking about our use of RabbitMQ and our routing strategies. I know there are other users out there who have achieved much higher throughput than us, so we must be doing something wrong.

I recently posted this question to the RabbitMQ team:

@RabbitMQ do you have any comparative metrics on the computational cost of headers vs topic routing?

And they replied with:

@cjbhaines we don't have any numbers but the headers exchange is rarely used not optimised

@cjbhaines on the other hand, the topic one is very often used and is optimised

The optimizations they made to the topic routing algorithms in 2010 (2.4.0) are documented in two blog posts titled "Very fast and scalable topic routing": Part 1 and Part 2. This is an excerpt from the end of the second post: “the performance improvement in this graph varies from 25 to 145 times faster” (than the 2.3.1 version of the topic routing algorithm). I have to commend them on that piece of work; those are some impressive stats.

To performance test the different exchange types, I wrote a simple test harness (a rough sketch of it follows the list below):

  • 1000 queues
  • 10 message types – All have 4 headers/routing key parts. Each message published only hits 1 of the 1000 queues
  • 50 publisher threads – Each thread publishes a random message type with a random message number every 100-200ms, resulting in ~330 msg/s
  • 50KB message size
  • 5 second message TTL – There are no consumers running so we need a small TTL
  • 5 minute test run time
  • 64 core server running RabbitMQ 3.3.5
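
The harness itself isn't reproduced here, but the sketch below shows roughly what the setup and publisher side look like, assuming pika, a per-queue x-message-ttl for the 5 second TTL, and a simplified two-header layout (the real harness used four headers/routing key parts):

```python
import random
import threading
import time

import pika

EXCHANGE = "perf.headers"      # switch exchange_type to "topic" for the comparison run
QUEUE_COUNT = 1000
MESSAGE_TYPES = 10
PUBLISHER_THREADS = 50
BODY = b"x" * 50 * 1024        # 50KB payload
RUN_SECONDS = 5 * 60

def setup():
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.exchange_declare(exchange=EXCHANGE, exchange_type="headers")
    for i in range(QUEUE_COUNT):
        name = f"perf-q-{i}"
        # No consumers are running, so give messages a 5 second TTL to keep queues bounded.
        channel.queue_declare(queue=name, arguments={"x-message-ttl": 5000})
        # Each (MessageType, MessageNumber) pair maps to exactly one of the 1000 queues.
        channel.queue_bind(
            queue=name,
            exchange=EXCHANGE,
            arguments={
                "x-match": "all",
                "MessageType": str(i % MESSAGE_TYPES),
                "MessageNumber": str(i // MESSAGE_TYPES),
            },
        )
    connection.close()

def publisher(stop_at):
    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    while time.time() < stop_at:
        headers = {
            "MessageType": str(random.randrange(MESSAGE_TYPES)),
            "MessageNumber": str(random.randrange(QUEUE_COUNT // MESSAGE_TYPES)),
        }
        channel.basic_publish(
            exchange=EXCHANGE,
            routing_key="",
            body=BODY,
            properties=pika.BasicProperties(headers=headers),
        )
        # 50 threads each publishing every 100-200ms works out at roughly 330 msg/s.
        time.sleep(random.uniform(0.1, 0.2))
    connection.close()

if __name__ == "__main__":
    setup()
    stop_at = time.time() + RUN_SECONDS
    threads = [threading.Thread(target=publisher, args=(stop_at,))
               for _ in range(PUBLISHER_THREADS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```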

Results:

[Chart: performance test results]

It is a shame that they have assumed RabbitMQ users do not use the headers exchange, because it is difficult to gauge uptake without surveying users regularly. Headers give the implementer a lot of power to keep code clean and simple; however, it is their product and we have to respect their choices. Let's hope they do some optimization on this in the future.

Back to topic exchanges we go!
