- Etsy had a wide range of two-tower models used for recommendations and ads
- All of these required predicated ANN
- Either because of fast-changing predicates (ads budget), or
- Multi-use embeddings (taxonomy)
- Additionally, there was a need for both online updates and sharding
- So Etsy’s existing VDBMS, FAISS, was not a suitable option
- Horizontal scaling: manual replication only
- Predicated queries: no predicate support; post-filtering is the only option
- Online updates: every update must be sent manually to each replica
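To make the post-filtering limitation concrete, here is a minimal sketch. It uses brute-force NumPy scoring as a stand-in for an ANN index with no predicate support; it is not FAISS code, and the function name `post_filtered_search`, the `overfetch` parameter, and the 5% eligibility rate are all assumptions made for illustration.

```python
import numpy as np


def post_filtered_search(index_vectors, allowed, query, k, overfetch=1):
    """Brute-force stand-in for an ANN index that cannot filter during search."""
    # Score every vector by inner product (a real index would approximate this).
    scores = index_vectors @ query
    # Retrieve the top k * overfetch candidates, ignoring the predicate.
    n_candidates = min(len(scores), k * overfetch)
    candidates = np.argsort(-scores)[:n_candidates]
    # Apply the predicate only after retrieval; fewer than k may survive.
    return [int(i) for i in candidates if allowed[i]][:k]


rng = np.random.default_rng(0)
vectors = rng.normal(size=(1000, 16)).astype(np.float32)
query = rng.normal(size=16).astype(np.float32)
# Hypothetical predicate: roughly 5% of items are eligible
# (for example, ads that still have budget).
allowed = rng.random(1000) < 0.05

naive = post_filtered_search(vectors, allowed, query, k=10)
wide = post_filtered_search(vectors, allowed, query, k=10, overfetch=100)
print(len(naive), len(wide))
```

With a selective predicate, the naive call tends to return far fewer than `k` results, and the usual workaround is to over-fetch, which costs latency and still offers no guarantee. That trade-off is why filtering inside the index matters when predicates are selective or change quickly.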
- Prior to my arrival, Etsy had settled on Vertex AI Vector Search (VVS)
- Then called “Vertex Matching Engine” (VME)
- In theory, this should have suited Etsy’s use case very well
- However, VVS/VME had some major usability issues:
- Network configuration
- Metadata / lifecycle management
- Very poor documentation (at the time)
- Additionally, some teams needed to use VME both for real-time and batch serving
- They were hitting quota limits while running ETL jobs in Beam (on Dataflow)
- Previous integration efforts had been siloed and did not take the teams' use cases into account
- So I got together with scientists from Recsys and worked backwards from their use cases
- Came up with two systems
- A stateful abstraction layer to manage index lifecycles, discovery, and versioning
- An Apache Beam module capable of making concurrent requests from the same worker
- Unblocked work that, I have been told, was worth $100M in incremental revenue
- This work had been stalled for almost a year
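The idea of making concurrent requests from a single Beam worker can be sketched as follows. This is not the module described above, whose design is not given here; it is a plain-Python class shaped like a Beam `DoFn` (with `setup`, `process`, and `teardown` methods) so that it runs without Beam installed. `ConcurrentLookupFn`, `fake_query`, and `max_in_flight` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


class ConcurrentLookupFn:
    """Shaped like a Beam DoFn (setup / process / teardown), but plain Python.

    `query_fn` is a hypothetical stand-in for a blocking call to the vector
    search service; `max_in_flight` caps concurrent requests per instance.
    """

    def __init__(self, query_fn, max_in_flight=8):
        self._query_fn = query_fn
        self._max_in_flight = max_in_flight
        self._pool = None  # created in setup(), since thread pools cannot be pickled

    def setup(self):
        # One pool per instance, reused across batches.
        self._pool = ThreadPoolExecutor(max_workers=self._max_in_flight)

    def process(self, batch):
        # Issue the batch's requests concurrently; map() yields results in input order.
        for key, result in zip(batch, self._pool.map(self._query_fn, batch)):
            yield key, result

    def teardown(self):
        if self._pool is not None:
            self._pool.shutdown(wait=True)


def fake_query(item_id):
    # Placeholder for a network call; returns a made-up neighbor list.
    return [item_id + 1, item_id + 2]


fn = ConcurrentLookupFn(fake_query, max_in_flight=4)
fn.setup()
results = list(fn.process([10, 20, 30]))
fn.teardown()
print(results)
```

Capping in-flight requests per worker is one plausible way to keep a batch job's total request rate predictable relative to a service quota, since the ceiling becomes roughly the number of workers times `max_in_flight`.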