- Created a way to extract feature names from TensorFlow models and supply them directly to the orchestration layer
- This seems like it should be trivial, because Keras lets you inspect feature lists
- From a SavedModel, you can inspect the input signatures
- From a Keras model, the preprocessing is the first “layer,” so you can access NamedFeatures there
- However:
- Etsy was extremely siloed
- Multiple layers of indirection, reflecting a lack of understanding
- So feature declaration was a major impediment to productionizing models
- Declaring features required hard-coding feature lists into the orchestration code
- Each change then required a redeployment
- And the orchestration code was in Scala, and was coupled to the search engine
- So scientists just copy/pasted existing examples, which led to rapid code rot
- When I actually looked, the features were there in the SavedModel, albeit with unexpected names and sometimes unexpected data types
- The trick was figuring out how they got that way
- Which involved a series of conversations across science teams
- Eventually developed a data model representing the process by which features were cast and renamed
- The project was canceled in a reorg before it could go to production, unfortunately
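
To make the cast-and-rename data model above concrete, here is a minimal sketch in plain Python. It is not the original implementation: the class names (`FeatureStep`, `FeatureLineage`), the `resolve` helper, and the example feature names and dtypes are all hypothetical, chosen only for illustration. The assumption is that each feature starts with a declared name and dtype, passes through an ordered series of rename or cast steps, and ends up under some exported name and dtype that can be observed on the model's inputs (for instance, read from a SavedModel's input signature).

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class FeatureStep:
    """One transformation applied to a feature (hypothetical model)."""
    rename_to: Optional[str] = None  # new name, if this step renames
    cast_to: Optional[str] = None    # new dtype, if this step casts


@dataclass(frozen=True)
class FeatureLineage:
    """A feature as declared upstream, plus the ordered steps applied to it."""
    declared_name: str
    declared_dtype: str
    steps: Tuple[FeatureStep, ...] = ()

    def exported(self) -> Tuple[str, str]:
        """Replay the steps to get the (name, dtype) expected on the model input."""
        name, dtype = self.declared_name, self.declared_dtype
        for step in self.steps:
            name = step.rename_to or name
            dtype = step.cast_to or dtype
        return name, dtype


def resolve(observed: Dict[str, str],
            lineages: List[FeatureLineage]) -> Dict[str, FeatureLineage]:
    """Map each observed model input back to the lineage that produces it."""
    by_exported = {lin.exported(): lin for lin in lineages}
    resolved = {}
    for name, dtype in observed.items():
        lineage = by_exported.get((name, dtype))
        if lineage is None:
            raise KeyError(f"no known lineage produces {name!r} as {dtype}")
        resolved[name] = lineage
    return resolved


# Hypothetical example: names and dtypes as they might appear on a model's inputs.
observed = {"listing_price_f32": "float32", "shop_id": "int64"}

lineages = [
    FeatureLineage(
        declared_name="price",
        declared_dtype="float64",
        steps=(
            FeatureStep(rename_to="listing_price"),
            FeatureStep(rename_to="listing_price_f32", cast_to="float32"),
        ),
    ),
    FeatureLineage(declared_name="shop_id", declared_dtype="int64"),
]

resolved = resolve(observed, lineages)
```

With a mapping like `resolved`, an orchestration layer could in principle look up the declared feature behind each model input instead of carrying a hard-coded list, and an input with no matching lineage fails loudly rather than silently.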