Kleppmann defines a message broker as a database purpose-built for handling message streams.
To quote (p. 690):
“Databases often support secondary indexes and various ways of searching for data, while message brokers often support some way of subscribing to a subset of topics matching some pattern. The mechanisms are different, but both are essentially ways for a client to select the portion of the data that it wants to know about.”
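As a rough illustration of "subscribing to a subset of topics matching some pattern", the sketch below filters topic names with shell-style wildcards from Python's standard `fnmatch` module. The topic names and the `matching_topics` helper are hypothetical, and real brokers define their own wildcard syntax, so treat this as an analogy rather than any broker's actual API.

```python
from fnmatch import fnmatchcase


def matching_topics(topics, pattern):
    """Return the topics that a subscription with this pattern would select."""
    return [topic for topic in topics if fnmatchcase(topic, pattern)]


# Hypothetical topic names, for illustration only.
topics = ["orders.created", "orders.cancelled", "payments.settled"]

order_topics = matching_topics(topics, "orders.*")
print(order_topics)
```

The pattern plays a role similar to a query predicate: the client states which slice of the data it cares about, and the system selects it.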
Ephemeral (traditional) message brokers
Ephemeral message brokers such as RabbitMQ and Google Cloud Pub/Sub behave very differently from traditional databases:
- Messages are deleted as soon as they are retrieved
- Hence read operations have a destructive side-effect
- The working set is assumed to be small, i.e. queues are short
- Arbitrary queries are not supported
Of course, they still store data that is queried in a particular way, so they are databases in some sense.
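To make the destructive-read behaviour concrete, here is a minimal in-memory sketch. The `EphemeralQueue` class is hypothetical and is not how RabbitMQ or Pub/Sub are implemented; in particular, it deletes a message the moment it is handed out and skips the acknowledgement step that real brokers typically use.

```python
from collections import deque


class EphemeralQueue:
    """Toy in-memory queue: reading a message also removes it."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def receive(self):
        # Destructive read: the message is gone once it has been handed out.
        return self._messages.popleft() if self._messages else None

    def __len__(self):
        return len(self._messages)


queue = EphemeralQueue()
queue.publish("a")
queue.publish("b")

first = queue.receive()   # oldest message is returned and deleted
remaining = len(queue)    # only one message is left in the queue
```

Note that there is no way to ask the queue for "all messages containing X" or to read the same message twice: the only supported access pattern is taking the next message.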
Persistent message brokers
However, once a message broker persists its messages (as Apache Kafka does), it becomes a kind of database that is well suited to change data capture and event sourcing.
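The contrast with an ephemeral queue can be sketched with an append-only log in which reads do not delete anything and each consumer tracks its own position. The `AppendOnlyLog` class and the event names are hypothetical; this is not Kafka's API and it ignores partitions, retention and durability.

```python
class AppendOnlyLog:
    """Toy persistent log: records are appended and never removed by reads."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset of the record just written

    def read_from(self, offset):
        # Non-destructive read: returns a copy of everything from `offset` on.
        return self._records[offset:]


log = AppendOnlyLog()
for event in ["created", "updated", "deleted"]:
    log.append(event)

# A consumer keeps its own offset and advances it after processing a batch.
consumer_offset = 0
batch = log.read_from(consumer_offset)
consumer_offset += len(batch)

# A new consumer (or a rebuild of derived state) can replay from the start.
replay = log.read_from(0)
```

Because the full history stays available, a downstream system can rebuild its state by replaying the log, which is the property that change data capture and event sourcing rely on.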