Fanout

I have encountered the term “fanout” (or “fan-out”) in two contexts:

The dissemination of messages in a stream to consumers; and
The dissemination of social messages to followers.

My first thought was that it was an overloaded term, but writing it as I did above, it appears to me that these are essentially the same task.

The difference is one of typical cardinality: a message broker collects messages from multiple producers, and you expect each message to end up in one or a small number of queues. By comparison, in a social network the number of consumers and the number of producers are equal by definition, since everyone is both. Hence, you end up with a different optimal data structure for each.

Stream

You’ll need a persistent message broker if you want to fan out to an arbitrary number of consumers without reconfiguring the broker whenever one is added or removed, but the data representation is sequential across topics, where the number of topics is far smaller than the number of messages published. So you can get by with log-structured storage, as with Apache Kafka.
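A minimal sketch of why a log suits this case, assuming an in-memory stand-in for a broker (the `TopicLog` class and its method names are hypothetical, not Kafka's API): each topic is an append-only list, and each consumer is nothing more than an offset into it, so adding a consumer changes no broker-side configuration and copies no messages.

```python
from collections import defaultdict


class TopicLog:
    """Hypothetical in-memory stand-in for a log-structured broker."""

    def __init__(self):
        # One append-only list of messages per topic.
        self._logs = defaultdict(list)
        # (topic, consumer) -> next offset to read; created lazily on first poll.
        self._offsets = defaultdict(int)

    def publish(self, topic, message):
        """Append a message and return its offset in the topic's log."""
        log = self._logs[topic]
        log.append(message)
        return len(log) - 1

    def poll(self, topic, consumer, max_messages=10):
        """Return the consumer's unread messages and advance its offset."""
        start = self._offsets[(topic, consumer)]
        batch = self._logs[topic][start:start + max_messages]
        self._offsets[(topic, consumer)] = start + len(batch)
        return batch


broker = TopicLog()
broker.publish("orders", "order-1")
broker.publish("orders", "order-2")

# Two independent consumers read the same stored messages; nothing is copied.
first = broker.poll("orders", "billing")
second = broker.poll("orders", "shipping")
```

Storage here grows with the number of messages, while the per-consumer state is a single integer per topic.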

Social

Here there is, in some sense, one “topic” per publisher-follower pair, meaning that the number of topics can greatly exceed the number of messages published. In some cases, as with tweets from celebrities, a single message might be published as tens of millions of copies.
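To make the cardinality concrete, here is a small sketch that counts stored copies when every message is copied once per follower. The account names and all counts are invented for illustration.

```python
# Hypothetical follower counts and message counts; not real data.
followers = {"celebrity": 30_000_000, "alice": 200, "bob": 50}
messages_published = {"celebrity": 3, "alice": 10, "bob": 4}

# One copy per (message, follower) pair.
copies = sum(messages_published[user] * followers[user] for user in followers)
total_messages = sum(messages_published.values())

print(total_messages)  # 17
print(copies)          # 90002200
```

Under these assumed numbers, the three celebrity messages account for nearly all of the copies.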

So two special issues come up. First, you’d like to be able to choose between fanning out on read or on write in order to minimize duplication. Second, you’d like to have some efficient way to query for publisher-follower relationships. Since we can fan out on either read or write, we must be able to query in either direction. This is a very natural application of graph databases.
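Both requirements can be sketched together. The following is an illustration under stated assumptions, not a production design: the `Feed` class, its method names, and the `FANOUT_ON_WRITE_LIMIT` cutoff are all hypothetical, and two plain dictionaries stand in for the graph database. Accounts at or below the cutoff push copies to follower inboxes when they publish (fan-out on write); larger accounts store each message once and followers pull it when they read (fan-out on read).

```python
from collections import defaultdict

# Hypothetical cutoff: accounts with more followers than this fan out on read.
FANOUT_ON_WRITE_LIMIT = 2


class Feed:
    def __init__(self):
        # The follow graph, indexed in both directions.
        self.followers = defaultdict(set)  # publisher -> followers (used on write)
        self.following = defaultdict(set)  # follower -> publishers (used on read)
        self.outbox = defaultdict(list)    # publisher -> posts, stored once
        self.inbox = defaultdict(list)     # follower -> pushed copies
        self._seq = 0                      # global counter used for ordering

    def follow(self, follower, publisher):
        self.followers[publisher].add(follower)
        self.following[follower].add(publisher)

    def _is_large(self, publisher):
        return len(self.followers[publisher]) > FANOUT_ON_WRITE_LIMIT

    def publish(self, publisher, text):
        self._seq += 1
        post = (self._seq, publisher, text)
        self.outbox[publisher].append(post)
        if not self._is_large(publisher):
            # Fan-out on write: one copy per follower.
            for follower in self.followers[publisher]:
                self.inbox[follower].append(post)

    def timeline(self, user):
        # A set removes duplicates if a publisher crossed the cutoff
        # after some of its posts had already been pushed.
        posts = set(self.inbox[user])
        for publisher in self.following[user]:
            if self._is_large(publisher):
                # Fan-out on read: pull from the publisher's single copy.
                posts.update(self.outbox[publisher])
        return [text for _, _, text in sorted(posts)]


feed = Feed()
for fan in ("ann", "ben", "cat"):
    feed.follow(fan, "star")
feed.follow("ann", "dan")

feed.publish("star", "hello from star")  # 3 followers > 2: kept only in the outbox
feed.publish("dan", "hello from dan")    # 1 follower: pushed to ann's inbox

ann_timeline = feed.timeline("ann")
ben_timeline = feed.timeline("ben")
```

The write path walks `followers` and the read path walks `following`, which is the two-directional query mentioned above. One limitation of this sketch: a user who follows a small account only sees posts published after the follow, since earlier posts were pushed before the edge existed.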

David's raw ML reference notes
