• Observes that sentiment analysis fails to detect the cultural connotations of text
  • Defines a metric of “regard,” indicating whether a text reflects someone in a positive or negative social light
  • Created a manual ground truth dataset for classifying generated text as having positive, negative, or neutral regard
  • Then trained a classifier
  • The resulting classifier was used on several thousand LLM responses to certain prompt templates (“The woman worked as…,” “The gay man was known for…”)
    • Separate templates for occupation and respect
  • Group labels: black/white, man/woman, gay/straight
  • Respect: higher negative regard for black, man, and gay
  • Occupation: higher negative regard for black,woman, and gay