RabbitMQ: a Message Broker

created:

updated:

tags: rabbitmq message broker

What is RabbitMQ?

RabbitMQ is a message broker: it accepts and forwards messages. You can think of it as a post office: when you put the mail that you want to post in a post box, you can be sure that the letter carrier will eventually deliver it to your recipient. In this analogy, RabbitMQ is a post box, a post office, and a letter carrier.

What messaging consists of

  • Producing:

Producing means nothing more than sending. A program that sends messages is a producer.

  • Queue:

A queue is the name for the post box in RabbitMQ. Although messages flow through RabbitMQ and your applications, they can only be stored inside a queue. A queue is bound only by the host’s memory and disk limits; it’s essentially a large message buffer. Many producers can send messages that go to one queue, and many consumers can try to receive data from one queue.

  • Consuming:

Consuming has a similar meaning to receiving. A consumer is a program that mostly waits to receive messages.

‘Hello World’ tutorial 1

Hello World tutorial

  • This tutorial focuses on sending and receiving messages from a named queue.
  • Before sending a message, we need to ensure the recipient queue exists. If the message is sent to a non-existing location, RabbitMQ will drop the message.
  • In RabbitMQ, a message can never be sent directly to a queue; it always needs to go through an exchange.

  • The default exchange is identified by an empty string. It lets us specify which queue the message should go to by giving the queue name as the routing key.
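
The steps above could be sketched as follows, assuming the pika Python client (these notes do not name a client library, so that choice is an assumption). The queue name `hello`, the helper names, and the `localhost` broker in `main()` are illustrative only. The helpers take the channel as a parameter, so they work with any object that exposes the same methods.

```python
QUEUE_NAME = "hello"  # hypothetical queue name, used only for this sketch


def send_hello(channel, body=b"Hello World!"):
    """Declare the queue, then publish one message through the default exchange."""
    # Declaring first ensures the queue exists, so the message is not dropped.
    channel.queue_declare(queue=QUEUE_NAME)
    # exchange="" selects the default exchange; routing_key names the target queue.
    channel.basic_publish(exchange="", routing_key=QUEUE_NAME, body=body)


def register_receiver(channel, received):
    """Register a consumer that appends every message body to `received`."""
    # The consumer declares the queue too, since it may start before the producer.
    channel.queue_declare(queue=QUEUE_NAME)

    def on_message(ch, method, properties, body):
        received.append(body)

    # auto_ack=True: the broker treats a message as handled once it is delivered.
    channel.basic_consume(
        queue=QUEUE_NAME, on_message_callback=on_message, auto_ack=True
    )
    return on_message


def main():
    # Assumes `pip install pika` and a broker listening on localhost.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    send_hello(connection.channel())
    connection.close()
```

Calling `main()` sends a single message; a receiver would call `register_receiver(channel, received)` followed by `channel.start_consuming()`, which blocks until interrupted.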

‘Work Queues (aka: Task Queues)’ tutorial 2

Work Queue tutorial

  • This tutorial focuses on distributing time-consuming tasks among multiple workers.
  • The main idea behind Work Queues (aka: Task Queues) is to avoid doing a resource-intensive task immediately and having to wait for it to complete. Instead we schedule the task to be done later. We encapsulate a task as a message and send it to the queue. A worker process running in the background will pop the tasks and eventually execute the job. When you run many workers, the tasks will be shared between them. This concept is especially useful in web applications where it’s impossible to handle a complex task during a short HTTP request window.

  • One of the advantages of using a Task Queue is the ability to easily parallelize work. If we are building up a backlog of work, we can just add more workers and that way, scale easily.

  • By default, RabbitMQ will send each message to the next consumer, in sequence. On average, every consumer will get the same number of messages. This way of distributing messages is called round-robin.
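
Round-robin dispatch can be illustrated without a broker. The following is a small in-memory simulation, not RabbitMQ code; the function name and the worker and message labels are made up for the example.

```python
from itertools import cycle


def dispatch_round_robin(messages, consumers):
    """Assign each message to the next consumer in sequence, wrapping around."""
    assignment = {consumer: [] for consumer in consumers}
    # cycle() repeats the consumer list endlessly; zip stops when messages run out.
    for message, consumer in zip(messages, cycle(consumers)):
        assignment[consumer].append(message)
    return assignment


result = dispatch_round_robin(
    ["m1", "m2", "m3", "m4", "m5"], ["worker-a", "worker-b"]
)
print(result)
# {'worker-a': ['m1', 'm3', 'm5'], 'worker-b': ['m2', 'm4']}
```

Each worker ends up with roughly the same number of messages, regardless of how long each message takes to process.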

How to ensure tasks are not lost even when a worker dies

  • In order to make sure a message is never lost, RabbitMQ supports ‘message acknowledgements’. An ack (acknowledgement) is sent back by the consumer to tell RabbitMQ that a particular message has been received and processed, and that RabbitMQ is free to delete it.

  • If a consumer dies (its channel is closed, connection is closed, or TCP connection is lost) without sending an ack, RabbitMQ will understand that a message wasn’t processed fully and will re-queue it. If there are other consumers online at the same time, it will then quickly redeliver it to another consumer. That way you can be sure that no message is lost, even if workers occasionally die.

  • A timeout (30 minutes by default) is enforced on consumer delivery acknowledgement.
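
A worker that acknowledges only after finishing its work might look like the sketch below. It assumes the pika client's callback signature and `basic_ack` call; `make_acking_callback`, `start_worker`, `handle`, and the queue name `task_queue` are hypothetical names chosen for the example.

```python
def make_acking_callback(handle):
    """Wrap `handle` so the message is acked only after it returns normally."""

    def on_message(ch, method, properties, body):
        # If handle() raises, the next line is never reached, so the message
        # stays unacknowledged and the broker can re-queue it once the
        # consumer's channel or connection closes.
        handle(body)
        ch.basic_ack(delivery_tag=method.delivery_tag)

    return on_message


def start_worker(channel, handle, queue_name="task_queue"):
    """Register the callback with manual acknowledgements (auto_ack=False)."""
    channel.basic_consume(
        queue=queue_name,
        on_message_callback=make_acking_callback(handle),
        auto_ack=False,
    )
```

After `start_worker(channel, handle)`, the worker would call `channel.start_consuming()` to begin receiving messages.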

What happens with forgotten acknowledgements

It’s a common mistake to miss the basic_ack. It’s an easy error to make, but the consequences are serious. Messages will be redelivered when your client quits (which may look like random redelivery), and RabbitMQ will eat more and more memory because it won’t be able to release any unacked messages.

  • We can debug by using rabbitmqctl:
    sudo rabbitmqctl list_queues name messages_ready messages_unacknowledged
    

What happens when the RabbitMQ server stops

  • When RabbitMQ quits or crashes, it will forget the queues and messages unless you tell it not to. Two things are required to make sure messages aren’t lost: we need to mark both the queue and messages as durable.

  • RabbitMQ doesn’t allow you to redefine an existing queue with different parameters and will return an error to any program that tries to do that.

    • As a workaround, we can declare a new queue with a different name.
  • Marking messages as persistent doesn’t fully guarantee that a message won’t be lost. Although it tells RabbitMQ to save the message to disk, there is still a short time window when RabbitMQ has accepted a message and hasn’t saved it yet. Also, RabbitMQ doesn’t do fsync(2) for every message; it may be just saved to cache and not really written to the disk. If you need a stronger guarantee, then you can use ‘publisher confirms’.
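
The two durability settings could be applied as in the sketch below, again assuming the pika client. The queue name `task_queue` and the helper names are illustrative; the properties object is passed in by the caller so the helpers themselves do not depend on pika.

```python
QUEUE_NAME = "task_queue"  # hypothetical name; must not clash with an existing non-durable queue


def declare_durable_queue(channel, queue_name=QUEUE_NAME):
    """Declare a queue that survives a broker restart."""
    channel.queue_declare(queue=queue_name, durable=True)


def publish_persistent(channel, body, persistent_properties, queue_name=QUEUE_NAME):
    """Publish via the default exchange with the given message properties."""
    channel.basic_publish(
        exchange="",
        routing_key=queue_name,
        body=body,
        properties=persistent_properties,
    )


def main():
    # Assumes `pip install pika` and a broker listening on localhost.
    import pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
    channel = connection.channel()
    # Optional: turn on publisher confirms for the stronger guarantee.
    channel.confirm_delivery()
    declare_durable_queue(channel)
    # delivery_mode=2 marks the message as persistent.
    publish_persistent(channel, b"a task", pika.BasicProperties(delivery_mode=2))
    connection.close()
```

Both settings are needed together: a durable queue holding non-persistent messages comes back empty after a restart.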

Fair dispatch

  • Round-robin dispatch can be unfair. For example, “with two workers, when all odd messages are heavy and even messages are light, one worker will be constantly busy and the other one will do hardly any work”. “This happens because RabbitMQ just dispatches a message when the message enters the queue. It doesn’t look at the number of unacknowledged messages for a consumer. It just blindly dispatches every n-th message to the n-th consumer.”
  • We can prevent this by calling the channel’s basic_qos method with prefetch_count=1, which tells RabbitMQ “not to give more than one message to a worker at a time”. A worker then receives a new message only after it has acknowledged the previous one.
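
The difference can be seen in a small in-memory simulation. This is not RabbitMQ code: it is a simplified model in which each message has a processing cost, and with a prefetch limit of one the next message goes to whichever worker becomes free first. In pika, the real setting is applied with `channel.basic_qos(prefetch_count=1)`. The function names and cost values are made up for the example.

```python
import heapq


def round_robin_load(costs, workers=2):
    """Total work per worker when the n-th message goes to the n-th worker."""
    load = [0] * workers
    for index, cost in enumerate(costs):
        load[index % workers] += cost
    return load


def prefetch_one_load(costs, workers=2):
    """Total work per worker when each holds at most one unacked message."""
    # Heap of (time the worker becomes free, worker id).
    free_at = [(0, worker) for worker in range(workers)]
    heapq.heapify(free_at)
    load = [0] * workers
    for cost in costs:
        # The next message goes to the worker that finishes earliest.
        time_free, worker = heapq.heappop(free_at)
        load[worker] += cost
        heapq.heappush(free_at, (time_free + cost, worker))
    return load


# Odd messages (1st, 3rd, ...) are heavy, even messages are light.
costs = [10, 1] * 4
print(round_robin_load(costs))   # [40, 4]
print(prefetch_one_load(costs))  # [22, 22]
```

Under round-robin the first worker does almost all the work; with the prefetch limit the same messages are split evenly in this model.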
