Persistent and ephemeral message brokers as databases

Feb 14, 2025

Kleppmann defines a message broker as a database purpose-built for handling message streams.

To quote (p 690):

Databases often support secondary indexes and various ways of searching for data, while message brokers often support some way of subscribing to a subset of topics matching some pattern. The mechanisms are different, but both are essentially ways for a client to select the portion of the data that it wants to know about.
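As a rough illustration of the parallel in the quote, the sketch below selects the same records two ways: through a secondary index (database style) and through a topic pattern (broker style). The record layout, the topic names and the `matches` helper are made up for this note, and `fnmatchcase` glob matching only stands in for whatever pattern syntax a real broker uses.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical records; "topic" and "amount" are illustrative fields.
rows = [
    {"id": 1, "topic": "orders.created", "amount": 30},
    {"id": 2, "topic": "orders.cancelled", "amount": 30},
    {"id": 3, "topic": "payments.settled", "amount": 12},
]

# Database side: a secondary index maps an attribute value to row ids.
index_by_amount = defaultdict(list)
for row in rows:
    index_by_amount[row["amount"]].append(row["id"])


# Broker side: a subscription is a pattern over topic names.
def matches(pattern, topic):
    """Glob-style match, standing in for a broker's own pattern syntax."""
    return fnmatchcase(topic, pattern)


selected_by_index = index_by_amount[30]
selected_by_pattern = [r["id"] for r in rows if matches("orders.*", r["topic"])]
```

Both selections pick out a subset of the data the client cares about; only the mechanism differs.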

Ephemeral (traditional) message brokers

Ephemeral message brokers such as RabbitMQ and Google Cloud Pub/Sub behave very differently from traditional databases:

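- Messages are deleted once they have been delivered to their consumers and acknowledged
  - Hence read operations have a destructive side effect
- The working set is assumed to be small (queues are expected to stay short)
- Arbitrary queries are not supported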

Of course, they still store data that is queried in a particular way, so they are databases in some sense.
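A minimal sketch of the destructive-read behaviour, assuming a toy in-memory queue. `EphemeralQueue` and its methods are hypothetical names for this note, not the API of RabbitMQ or Pub/Sub, and the sketch collapses delivery and acknowledgement into a single `receive` call.

```python
from collections import deque


class EphemeralQueue:
    """Toy in-memory queue: a message is gone once it has been received."""

    def __init__(self):
        self._messages = deque()

    def publish(self, message):
        self._messages.append(message)

    def receive(self):
        # Destructive read: popleft() removes the message it returns.
        return self._messages.popleft() if self._messages else None

    def __len__(self):
        return len(self._messages)


queue = EphemeralQueue()
queue.publish("a")
queue.publish("b")

first = queue.receive()   # oldest message, now removed from the queue
second = queue.receive()  # next message, also removed
third = queue.receive()   # nothing left to read
```

The only supported "query" here is "give me the next message", and a message cannot be read a second time.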

Persistent message brokers

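However, once a message broker supports persistence (as Apache Kafka does), it becomes a kind of database that is well aligned with change data capture and event sourcing: messages are retained in an append-only log, so reading them is no longer destructive and consumers can replay earlier messages.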
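For contrast, here is a sketch of a log-based broker under the same toy assumptions. `LogBroker`, `poll` and `seek` are hypothetical names chosen for this note and only loosely mirror the idea of consumer offsets; this is not Kafka's client API.

```python
class LogBroker:
    """Toy append-only log: reads never delete, consumers track offsets."""

    def __init__(self):
        self._log = []
        self._offsets = {}  # consumer name -> next offset to read

    def append(self, event):
        self._log.append(event)
        return len(self._log) - 1  # offset of the appended event

    def poll(self, consumer):
        # Return everything this consumer has not seen yet, then advance
        # its offset. The log itself is left untouched.
        start = self._offsets.get(consumer, 0)
        events = self._log[start:]
        self._offsets[consumer] = len(self._log)
        return events

    def seek(self, consumer, offset):
        # Rewinding an offset lets a consumer replay earlier events.
        self._offsets[consumer] = offset


log = LogBroker()
for event in ("user_created", "email_changed", "user_deleted"):
    log.append(event)

first_read = log.poll("indexer")   # all three events
second_read = log.poll("indexer")  # nothing new since the last poll
log.seek("indexer", 0)
replayed = log.poll("indexer")     # the same three events again
late_consumer = log.poll("audit")  # a new consumer still sees the full history
```

Because the log is retained, a derived view (a search index, a cache, an audit trail) can be rebuilt by replaying it from the start, which is the property that change data capture and event sourcing rely on.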
