I have encountered the term “fanout” (or “fan-out”) in two contexts:
- The dissemination of messages in a stream to consumers; and
- The dissemination of social messages to followers.
My first thought was that it was an overloaded term, but having written the two out as above, I think they are essentially the same task.
The difference is one of typical cardinality: a message broker collects messages from multiple producers, and you expect each message to exist in one or a small number of queues. By comparison, in a social network, the number of consumers and the number of producers are equal by definition, since everyone is both. Hence, you end up with a different optimal data structure for each.
Stream
You’ll need a persistent message broker if you want to fan out to an arbitrary number of consumers without reconfiguring the broker whenever one is added/removed, but the data representation is sequential across
Social
Here there is, in some sense, one “topic” per publisher-follower pair, meaning that the number of topics can greatly exceed the number of messages published. In some cases, as with tweets from celebrities, you might end up with tens of millions of copies of a single message.
So two special issues come up. First, you’d like to be able to choose between fanning out on read or on write in order to minimize duplication. Second, you’d like to have some efficient way to query for publisher-follower relationships. Fanning out on write means looking up a publisher’s followers, while fanning out on read means looking up the publishers a reader follows; since we can do either, we must be able to query in either direction. This is a very natural application of graph databases.
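To make those two issues concrete, here is a minimal in-memory sketch in Python. Everything in it is an assumption made for illustration: the class names `SocialGraph` and `Feeds`, the `write_limit` threshold, and the rule of switching strategy on follower count are hypothetical and not taken from any particular system. A real deployment would back the two indexes with a graph database or similar store.

```python
from collections import defaultdict


class SocialGraph:
    """Follow edges indexed in both directions (hypothetical helper)."""

    def __init__(self):
        self.followers = defaultdict(set)  # publisher -> who follows them
        self.following = defaultdict(set)  # follower -> whom they follow

    def follow(self, follower, publisher):
        self.followers[publisher].add(follower)
        self.following[follower].add(publisher)


class Feeds:
    """Chooses fan-out on write or on read per post (illustrative only)."""

    def __init__(self, graph, write_limit):
        self.graph = graph
        self.write_limit = write_limit   # assumed threshold, not a real default
        self.inbox = defaultdict(list)   # follower -> copies pushed at write time
        self.outbox = defaultdict(list)  # publisher -> posts stored once
        self.seq = 0                     # global sequence number for ordering

    def publish(self, publisher, text):
        self.seq += 1
        post = (self.seq, publisher, text)
        followers = self.graph.followers[publisher]
        if len(followers) <= self.write_limit:
            # Fan out on write: one copy per follower (publisher -> followers).
            for follower in followers:
                self.inbox[follower].append(post)
        else:
            # Too many followers: store once and defer the work to read time.
            self.outbox[publisher].append(post)

    def timeline(self, reader):
        # Fan out on read: pull from followed outboxes (follower -> publishers).
        posts = list(self.inbox[reader])
        for publisher in self.graph.following[reader]:
            posts.extend(self.outbox[publisher])
        return sorted(posts)


graph = SocialGraph()
for fan in ("ann", "bob", "cy"):
    graph.follow(fan, "celebrity")
graph.follow("ann", "dana")

feeds = Feeds(graph, write_limit=2)
feeds.publish("dana", "hello")          # 1 follower: copied into ann's inbox
feeds.publish("celebrity", "big news")  # 3 followers: stored once in the outbox
print(feeds.timeline("ann"))
```

Note that `publish` reads the publisher-to-followers index while `timeline` reads the follower-to-publishers index, which is why the graph has to be queryable in both directions. The sketch ignores what happens to older posts when someone follows later or when a publisher crosses the threshold.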