Xu dedicates ch. 12 of vol. 1 to this topic. It is not his best chapter.
The first thing he drives home is that plain Hypertext Transfer Protocol (HTTP) is wasteful for chat. The server cannot push to the client, so the client has to poll, and every poll is a new request whether or not a message is waiting. So he suggests WebSockets.
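To get a feel for why polling is wasteful, here is a back-of-the-envelope sketch. The user count and poll interval are numbers I made up for illustration, not figures from the book, and it assumes every user stays online and polls all day.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def polling_requests_per_day(users: int, poll_interval_s: int) -> int:
    """HTTP requests per day if every user polls at a fixed interval.

    Assumes (hypothetically) that all users are online around the clock.
    """
    return users * SECONDS_PER_DAY // poll_interval_s


def websocket_handshakes_per_day(users: int, reconnects_per_user: int = 1) -> int:
    """Connection handshakes per day if each user holds a long-lived socket.

    reconnects_per_user is an assumed figure covering dropped connections.
    """
    return users * reconnects_per_user


if __name__ == "__main__":
    users = 1_000_000  # hypothetical user count
    print(polling_requests_per_day(users, poll_interval_s=5))
    print(websocket_handshakes_per_day(users, reconnects_per_user=3))
```

With these made-up inputs, polling costs about 17 billion requests a day, most of which return nothing, against a few million handshakes for WebSockets.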
Most of the service can be built statelessly, but WebSocket connections are long-lived, so the chat service itself has to be stateful. He also points out that, even at a meaningfully large volume of requests, a single moderately sized server could handle the entire thing, though obviously this is unacceptable from an availability standpoint.
Finally, there is the question of the data store. Given that chats mean roughly equal reads and writes, he recommends a key-value store. But then he mentions that Facebook uses HBase and Discord uses Apache Cassandra for this purpose. Neither of these is really a key-value store (both are usually classed as wide-column stores).
(He then goes on to propose two tables, one for Message and one for GroupMessage, that look a whole lot like relational tables to me. I’m guessing he is thinking that the “primary key” is the key and the other columns are represented as a document-typed value. I’m going to read Seven Databases in Seven Weeks (Perkins, Wilson, and Redmond) to finally have some idea of what’s going on with these many kinds of databases.)
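Here is a minimal sketch of that guess: the primary key becomes the store key, and the remaining columns become a JSON document stored as the value. The field names, the key format, and the plain dict standing in for the store are all my assumptions, not something taken from the book.

```python
import json


def put_message(store: dict[str, str], message_id: int, sender: str,
                recipient: str, content: str, created_at: int) -> str:
    """Store a one-on-one message under a key derived from its primary key."""
    key = f"message:{message_id}"  # hypothetical key format
    store[key] = json.dumps({
        "from": sender,
        "to": recipient,
        "content": content,
        "created_at": created_at,
    })
    return key


def put_group_message(store: dict[str, str], channel_id: int, message_id: int,
                      sender: str, content: str, created_at: int) -> str:
    """Store a group message under a composite key (channel, then message)."""
    key = f"group_message:{channel_id}:{message_id}"  # hypothetical key format
    store[key] = json.dumps({
        "from": sender,
        "content": content,
        "created_at": created_at,
    })
    return key


def get_document(store: dict[str, str], key: str) -> dict:
    """Fetch a value and decode the document stored in it."""
    return json.loads(store[key])
```

Seen this way the "table" is just a naming convention for keys, which may be why the schemas still read like relational tables.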
He continues by pointing out that you need a way to find a chat server to which your request should be routed. He uses this as a way to introduce Apache ZooKeeper and service discovery, though honestly, this seems like a job for a load balancer.
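For what it is worth, the selection step itself is small. The sketch below picks the chat server with the most spare capacity from an in-memory list. The list stands in for whatever registry is actually used, it does not call ZooKeeper, and the `ChatServer` fields and the selection criterion are assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class ChatServer:
    host: str          # address handed back to the client
    connections: int   # currently open WebSocket connections
    capacity: int      # assumed maximum connections for this server


def pick_chat_server(servers: list[ChatServer]) -> ChatServer:
    """Return the registered server with the lowest load ratio.

    Servers that are already full are skipped entirely.
    """
    candidates = [s for s in servers if s.connections < s.capacity]
    if not candidates:
        raise RuntimeError("no chat server has spare capacity")
    return min(candidates, key=lambda s: s.connections / s.capacity)
```

The client then opens its WebSocket to the returned host and stays there, which is the one place this differs from balancing individual short-lived requests.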
One high point is a discussion of keeping track of which users are online. You can use a heartbeat to detect if a user has become disconnected. You can then use a key-value store to keep track of who is online, and a fan-out strategy similar to that of a social news feed (though with different cardinality) to notify friends when a user's online status changes.
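A minimal sketch of how the heartbeat, the key-value store, and the fan-out fit together. Plain dicts stand in for the store, a list stands in for the notification channel, and the 30-second timeout and the `PresenceTracker` name are my own assumptions.

```python
import time
from typing import Callable


class PresenceTracker:
    def __init__(self, timeout_s: float = 30.0,
                 clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock
        self.last_seen: dict[str, float] = {}    # stand-in for the KV store
        self.friends: dict[str, set[str]] = {}   # user -> set of friends
        self.online: set[str] = set()
        # (recipient, user, status) tuples; stand-in for pushing to clients
        self.notifications: list[tuple[str, str, str]] = []

    def add_friendship(self, a: str, b: str) -> None:
        self.friends.setdefault(a, set()).add(b)
        self.friends.setdefault(b, set()).add(a)

    def heartbeat(self, user_id: str) -> None:
        """Record a heartbeat; announce the user if they were offline."""
        self.last_seen[user_id] = self.clock()
        if user_id not in self.online:
            self.online.add(user_id)
            self._fan_out(user_id, "online")

    def sweep(self) -> None:
        """Mark users offline whose last heartbeat is older than the timeout."""
        now = self.clock()
        for user_id in list(self.online):
            if now - self.last_seen[user_id] > self.timeout_s:
                self.online.discard(user_id)
                self._fan_out(user_id, "offline")

    def _fan_out(self, user_id: str, status: str) -> None:
        """Notify each friend of the status change, one message per friend."""
        for friend in sorted(self.friends.get(user_id, ())):
            self.notifications.append((friend, user_id, status))
```

The fan-out here is one notification per friend per status change, which is fine for friend lists but would need rethinking for very large groups.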