Consider a social news feed. Social relationships exhibit scale-free properties, so we expect the number of connections per vertex to follow an approximately power-law distribution. When designing a fanout process for such a network, there are two main options:
Fanout on write (push): when someone creates content, push the content to all of their connections. Fanout on read (pull): store content in a central store; when someone reads their feed, pull the content written by all of their connections.
Push fanout works well for most users: each write touches a relatively small number of users, and reads are fast. It performs poorly, however, for nodes with many connections, since a single post triggers one write per connection. It also wastes write operations on users who rarely sign on. Pull fanout is better for these cases.
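The trade-off between the two options can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the in-memory dictionaries stand in for real storage, and every name here (`followers`, `inboxes`, `posts_by_author`, `write_push`, `read_pull`, and so on) is hypothetical.

```python
from collections import defaultdict

# Hypothetical in-memory stand-ins for real storage.
followers = defaultdict(set)         # author -> users who see the author's posts
following = defaultdict(set)         # reader -> authors the reader is connected to
inboxes = defaultdict(list)          # push model: per-reader precomputed feed
posts_by_author = defaultdict(list)  # pull model: central store keyed by author


def connect(reader, author):
    """Record that `reader` should see content written by `author`."""
    followers[author].add(reader)
    following[reader].add(author)


def write_push(author, ts, text):
    # Fanout on write: one write per follower, so cost grows with follower count.
    for reader in followers[author]:
        inboxes[reader].append((ts, author, text))


def read_push(reader):
    # The read is a single lookup of the precomputed inbox.
    return sorted(inboxes[reader], reverse=True)


def write_pull(author, ts, text):
    # Fanout on read: a single write, regardless of follower count.
    posts_by_author[author].append((ts, author, text))


def read_pull(reader):
    # The read gathers posts from every connection, so cost grows with connections.
    feed = []
    for author in following[reader]:
        feed.extend(posts_by_author[author])
    return sorted(feed, reverse=True)
```

Both models return the same feed for a given reader; they differ only in where the work happens. With push, an author with a million connections pays a million writes per post, while with pull the same post costs one write and each reader pays at read time.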
So in practice, we need a hybrid fanout system for networks like these. We consider three cases: typical nodes, central nodes (those with many connections), and dormant nodes (those that rarely sign on).
| Node class | On write | On read |
|---|---|---|
| Typical nodes | Push a copy to the central store and to the node-specific store for typical nodes. | Fetch node-specific store; query central store for events from central and dormant nodes. |
| Central nodes | Push a copy only to the central store. | Likely involves special selection logic, since central users will not want updates from all their followers. LinkedIn allows premium users to decouple connecting from “following,” which reduces both information overload and processing time. |
| Dormant nodes | Push a copy to the central store and to the node-specific store for typical nodes. | Query central store for events from all connections. |

It may also be desirable to [[Materialized view | materialize]] counts in a cache for greater perceived response speed.
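
The hybrid scheme in the table can be sketched as follows. This is a rough illustration under several assumptions: the class `HybridFeed`, the `NodeClass` enum, and all attribute names are hypothetical; connections are treated as symmetric; and the "special selection logic" for central readers is reduced to a hand-maintained `selected` set, which is only a placeholder. Because the table has dormant authors pushing to typical nodes' stores while typical readers also query the central store for dormant authors, the sketch deduplicates events on read.

```python
from collections import defaultdict
from enum import Enum


class NodeClass(Enum):
    TYPICAL = "typical"
    CENTRAL = "central"
    DORMANT = "dormant"


class HybridFeed:
    """Hypothetical in-memory model of the hybrid fanout table."""

    def __init__(self):
        self.node_class = {}                    # user -> NodeClass (default: TYPICAL)
        self.connections = defaultdict(set)     # user -> connected users (symmetric)
        self.selected = defaultdict(set)        # central reader -> authors they chose (assumption)
        self.central_store = defaultdict(list)  # author -> [(ts, author, text)]
        self.node_store = defaultdict(list)     # typical reader -> [(ts, author, text)]

    def classify(self, user):
        return self.node_class.get(user, NodeClass.TYPICAL)

    def connect(self, a, b):
        self.connections[a].add(b)
        self.connections[b].add(a)

    def write(self, author, ts, text):
        event = (ts, author, text)
        # Every class writes a copy to the central store.
        self.central_store[author].append(event)
        if self.classify(author) is NodeClass.CENTRAL:
            return  # central nodes never fan out on write
        # Typical and dormant authors also push to typical readers' stores.
        for reader in self.connections[author]:
            if self.classify(reader) is NodeClass.TYPICAL:
                self.node_store[reader].append(event)

    def read(self, reader):
        cls = self.classify(reader)
        if cls is NodeClass.TYPICAL:
            # Pushed events, plus a pull from central and dormant connections.
            events = set(self.node_store.get(reader, []))
            pull_from = [a for a in self.connections[reader]
                         if self.classify(a) is not NodeClass.TYPICAL]
        elif cls is NodeClass.DORMANT:
            # Nothing was pushed to this reader; pull from all connections.
            events = set()
            pull_from = self.connections[reader]
        else:
            # Placeholder selection logic: pull only from chosen authors.
            events = set()
            pull_from = self.selected.get(reader, set())
        for author in pull_from:
            events.update(self.central_store.get(author, []))
        # The set removes events seen via both push and pull; newest first.
        return sorted(events, reverse=True)
```

In this sketch, a post by a central node costs a single write, and a dormant node never accumulates a node-specific store. How users are assigned to a class (for example, by connection count or last sign-on time) is left open here, as it is in the table.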