RabbitMQ vs Kafka

Message brokers are an important part of modern web development. A well-implemented broker behaves well when there is a large backlog of messages, can form a cluster, and, when a node in the cluster fails, tries to protect the data without ever blocking publishers, even though that might mean some data loss.

There are different enterprise architecture patterns for working with message queue systems. Primarily:

  1. Point-to-Point (Synchronous): The caller places a message onto a queue that is meant to be read by one consumer (although that consumer can have multiple instances in a cluster, competing for those messages). This is often used in a synchronous manner, like REST communication: the caller blocks and cannot continue working until it has received a response message from the consumer.
  2. Publish-Subscribe (Asynchronous): Callers place a message onto a queue (often called a topic), and multiple consumers subscribe to it and each receive their own copy of the message. This is analogous to sending an email to a mailing list and is an inherently asynchronous mode of communication. The callers can move on to other things and deal with a response later (or not at all).
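
To make the difference between the two patterns concrete, here is a minimal in-memory sketch in Python. It is not tied to any real broker; the class names PointToPointQueue and PubSubTopic are made up for illustration. The only point is that a queue hands each message to exactly one receiver, while a topic gives every subscriber its own copy.

```python
from collections import deque


class PointToPointQueue:
    """Toy queue: each message is handed to exactly one competing consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns the oldest message, or None when the queue is empty.
        return self._messages.popleft() if self._messages else None


class PubSubTopic:
    """Toy topic: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._inboxes = {}

    def subscribe(self, name):
        # Each subscriber gets a private inbox (a plain list in this sketch).
        self._inboxes[name] = []
        return self._inboxes[name]

    def publish(self, message):
        for inbox in self._inboxes.values():
            inbox.append(message)
```

With the queue, two workers calling receive() split the messages between them. With the topic, two subscribers both see everything that was published after they subscribed.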

Message brokers are useful in a number of situations; any time we want to execute a task asynchronously, we put the task on a queue and some executor (which could be another thread, process, or machine) eventually runs it. Depending on the use case, brokers can give various guarantees on message persistence and delivery. For some use cases, it is enough to have an in-memory, volatile message broker. For others, we want to be sure that once the message send completes, the message is persistently enqueued and will eventually be delivered, despite node or system crashes. For comparison, we’ll look at both RabbitMQ and Apache Kafka.
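
The simplest version of this idea, an in-memory and volatile queue with a single executor thread, can be sketched with nothing but the Python standard library. The function name run_tasks_in_background is hypothetical, and the sketch deliberately offers no persistence: if the process dies, anything still in the queue is lost, which is exactly the trade-off described above.

```python
import queue
import threading


def run_tasks_in_background(tasks):
    """Enqueue callables and let one worker thread execute them in order."""
    task_queue = queue.Queue()  # in-memory only: contents vanish if the process dies
    results = []

    def worker():
        while True:
            task = task_queue.get()
            if task is None:  # sentinel value: no more work is coming
                task_queue.task_done()
                break
            results.append(task())
            task_queue.task_done()

    thread = threading.Thread(target=worker)
    thread.start()
    for task in tasks:
        task_queue.put(task)  # the caller returns right after enqueueing
    task_queue.put(None)
    task_queue.join()  # demo only: wait until the worker has drained the queue
    thread.join()
    return results
```

A durable broker replaces the in-memory queue with storage that survives a crash and only acknowledges the send once the message has been safely written.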

RabbitMQ

RabbitMQ is one of the leading open-source messaging systems. It is written in Erlang, implements AMQP, and is a very popular choice when messaging is involved. Its extensive documentation makes it a good pick for developers who are learning about messaging systems. It supports both message persistence and replication, with well-documented behaviour in failure cases such as network partitions.

RabbitMQ follows a standard store-and-forward pattern where you have the option to store the data in RAM, on disk, or both. It supports a variety of message routing paradigms. RabbitMQ can be deployed in a clustered fashion for performance, and in a mirrored fashion for high availability. Consumers listen directly on queues, but publishers only know about “exchanges.” These exchanges are linked to queues via bindings, which specify the routing paradigm (among other things). More importantly, RabbitMQ is broker-centric, focused on delivery guarantees between producers and consumers, with transient messages preferred over durable ones.
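
The exchange, binding, and queue relationship can be modelled in a few lines of Python. This is a toy model of a direct exchange (routing on an exact routing-key match), not RabbitMQ client code; the class name DirectExchange and its methods are assumptions made for this sketch, and real exchanges support other routing types as well.

```python
from collections import defaultdict


class DirectExchange:
    """Toy model of a direct exchange: routes on an exact routing-key match."""

    def __init__(self):
        self._bindings = defaultdict(list)  # routing key -> list of bound queues

    def bind(self, queue, routing_key):
        # A binding links this exchange to a queue for one routing key.
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key, body):
        # The publisher names only the exchange and a routing key, never a queue.
        queues = self._bindings.get(routing_key, [])
        for queue in queues:
            queue.append(body)
        return len(queues)  # number of queues the message was routed to
```

For example, binding an "orders" list and an "audit" list to the key "order.created" means one publish call lands in both, while the publisher never learns which queues exist.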

RabbitMQ’s messaging engine also offers a variety of powerful features, such as built-in queues, topics, a publish-subscribe model, support for persistence, web panels with monitoring tools, and scalability options.

Apache Kafka

Kafka was originally designed at LinkedIn, is written in Scala and Java, and is now maintained under the Apache Software Foundation. It is sometimes described as RabbitMQ on steroids: Kafka has a fairly plug-and-play setup and is built for much higher throughput.

With Kafka you can do both real-time and batch processing. It can ingest very large volumes of data and route them via publish-subscribe (or queuing). The broker barely knows anything about the consumer. All that’s really stored is an “offset” value that specifies where in the log the consumer left off. Unlike many integration brokers that assume consumers are mostly online, Kafka can successfully persist a lot of data and supports “replay” scenarios.
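
The offset idea is easy to illustrate with a toy append-only log. The sketch below is plain Python, not a Kafka client; TopicLog, poll, and seek are names chosen for this example. Reading never deletes anything, the only per-consumer state is a number, and a replay is just a matter of moving that number back.

```python
class TopicLog:
    """Toy append-only log; reading never removes records."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to the new record

    def poll(self, consumer, max_records=10):
        # Return the next batch for this consumer and advance its offset.
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        # Replay: move the cursor back; the records are still in the log.
        self._offsets[consumer] = offset
```

A consumer that was offline simply polls from its stored offset when it comes back, and a consumer that needs to reprocess history calls seek with an earlier offset.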

The architecture is fairly unique: topics are split into partitions (for parallelism), partitions are replicated across nodes (for high availability), and the server itself is a streaming publish-subscribe system. Because the server is so simple and does very little per-consumer work, the system is very fast and light on resources.
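
As a rough sketch of how partitioning and replication fit together, the Python below assigns a record to a partition by hashing its key and places each partition's replicas on consecutive brokers. Both choices are assumptions made for illustration: CRC32 stands in for whatever hash a real client uses, the round-robin placement is a simplification, and the function names are hypothetical.

```python
import zlib


def partition_for(key, num_partitions):
    """Pick a partition from a stable hash of the key (CRC32 is a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_replicas(num_partitions, brokers, replication_factor):
    """Round-robin placement: partition p's replicas sit on consecutive brokers."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    return {
        p: [brokers[(p + r) % len(brokers)] for r in range(replication_factor)]
        for p in range(num_partitions)
    }
```

Because the hash is stable, all records with the same key land in the same partition and keep their relative order, while different partitions can be consumed in parallel. With a replication factor of 2, each partition survives the loss of any single broker.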

Kafka is producer-centric, based around partitioning a fire hose of event data into durable message brokers with cursors, supporting batch consumers that may be offline, or online consumers that want messages at low latency.

Feature-wise, Kafka is lacking by comparison. While it clearly outperforms other frameworks on raw throughput, it lacks built-in broker infrastructure and monitoring tools, so it is better suited to designs that keep the broker simple than to a microservices environment that relies on rich broker features.

Conclusion

Kafka offers amazing performance for the “pub-sub” pattern and is far easier than RabbitMQ to scale across multiple data centers. However, it is not as attractive an option for the “point-to-point” pattern, because that simply isn’t what it was designed for. Since there can be various ways that messaging queue systems are used, it is best to determine which engine can fit your intended use case.