Kleppmann defines a message broker as a database purpose-built for handling message streams.

To quote (p. 690):

Databases often support secondary indexes and various ways of searching for data, while message brokers often support some way of subscribing to a subset of topics matching some pattern. The mechanisms are different, but both are essentially ways for a client to select the portion of the data that it wants to know about.
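
The parallel drawn in the quote can be illustrated with a minimal sketch. The topic names, the payloads, and the helper select_by_pattern are all hypothetical, and glob-style matching via Python's fnmatch module merely stands in for whatever pattern syntax a real broker offers.

```python
from fnmatch import fnmatchcase

# Hypothetical messages, each tagged with a topic name.
messages = [
    ("orders.created", "order 1"),
    ("orders.cancelled", "order 2"),
    ("payments.settled", "payment 7"),
]


def select_by_pattern(messages, pattern):
    """Broker-style selection: keep payloads whose topic matches a glob pattern."""
    return [payload for topic, payload in messages if fnmatchcase(topic, pattern)]


# Database-style selection: a secondary index from topic to payloads.
index = {}
for topic, payload in messages:
    index.setdefault(topic, []).append(payload)

orders = select_by_pattern(messages, "orders.*")  # subscription-like
settled = index.get("payments.settled", [])       # index-lookup-like
```

In both cases the client names the slice of data it cares about; only the mechanism differs.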

Ephemeral (traditional) message brokers

Ephemeral message brokers such as RabbitMQ and Google Cloud Pub/Sub behave very differently from traditional databases:

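  • Messages are deleted once they have been successfully delivered to their consumers
    • Hence a read is destructive: it removes the message from the broker
  • The working set is assumed to be small, because queues are expected to stay short
  • Arbitrary queries are not supported; clients can only subscribe and consume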

Of course, such a broker still stores data that is queried in a particular way, so it is a database in some sense.
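
The destructive-read behaviour can be sketched with a toy in-memory queue. The class EphemeralQueue and its methods are hypothetical and do not mirror the API of RabbitMQ or any other broker; real brokers usually wait for a consumer acknowledgement before deleting, which is left out here for brevity.

```python
from collections import deque


class EphemeralQueue:
    """Toy queue: receiving a message also removes it from the queue."""

    def __init__(self):
        self._pending = deque()

    def publish(self, payload):
        self._pending.append(payload)

    def receive(self):
        """Return the oldest message and delete it, or None if the queue is empty."""
        if not self._pending:
            return None
        return self._pending.popleft()

    def __len__(self):
        return len(self._pending)
```

Once a message has been received there is no way to read it again, and as long as consumers keep up the queue holds only a handful of messages.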

Persistent message brokers

However, once a message broker keeps messages durably instead of deleting them on delivery (as Apache Kafka does), it becomes a kind of database that is well aligned with change data capture and event sourcing.
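
A minimal sketch of the idea, assuming an append-only log with one read offset per consumer. LogBroker, its methods, the consumer names, and the deposit/withdraw events are hypothetical illustrations, not Kafka's API.

```python
class LogBroker:
    """Toy persistent broker: messages are appended to a log and never deleted."""

    def __init__(self):
        self._log = []      # append-only list of messages
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, payload):
        """Add a message to the end of the log and return its offset."""
        self._log.append(payload)
        return len(self._log) - 1

    def poll(self, consumer, max_messages=10):
        """Return the next batch for this consumer; the log itself is unchanged."""
        start = self._offsets.get(consumer, 0)
        batch = self._log[start:start + max_messages]
        self._offsets[consumer] = start + len(batch)
        return batch

    def seek(self, consumer, offset):
        """Move a consumer back (or forward) so it can re-read from that offset."""
        self._offsets[consumer] = offset


def replay_balance(events):
    """Event-sourcing style: derive current state by folding over the event log."""
    balance = 0
    for kind, amount in events:
        balance += amount if kind == "deposit" else -amount
    return balance


broker = LogBroker()
for event in [("deposit", 10), ("withdraw", 3), ("deposit", 5)]:
    broker.append(event)

first = broker.poll("billing")    # reading does not delete anything
second = broker.poll("audit")     # a second consumer sees the same events
broker.seek("billing", 0)         # rewind and replay from the beginning
replayed = broker.poll("billing")
balance = replay_balance(replayed)
```

Because reads only advance an offset, any consumer can rebuild derived state by replaying the log from the start, which is what makes this kind of broker a natural fit for event sourcing.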