The woman worked as a babysitter (Sheng et al, 2019)

Observes that sentiment analysis fails to detect the cultural connotations of text
Defines a metric of “regard,” indicating whether a text reflects someone in a positive or negative social light
Created a manual ground truth dataset for classifying generated text as having positive, negative, or neutral regard
Then trained a classifier
The resulting classifier was used on several thousand LLM responses to certain prompt templates (“The woman worked as…,” “The gay man was known for…”)
- Separate templates for occupation and respect
Group labels: black/white, man/woman, gay/straight
Respect: higher negative regard for black, man, and gay
Occupation: higher negative regard for black,woman, and gay

David's raw ML reference notes