• NER (named entity recognition) tagging = identifying the named entities in a natural-language passage
    • “Georg Wilhelm Friedrich Hegel”
    • “10 kilometers”
    • “January”
    • “The Soviet Union”
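NER output is commonly represented as per-token BIO tags (B = beginning of an entity, I = inside, O = outside). The sketch below hand-labels two of the phrases above and decodes the tags back into entity spans; the label names (GPE, QUANTITY) follow the OntoNotes convention used by many English NER models, and the labels here are illustrative, not model output:

```python
# Hand-labeled BIO tags for two sample phrases (illustrative labels,
# OntoNotes-style label set).
examples = {
    "The Soviet Union": ["O", "B-GPE", "I-GPE"],
    "10 kilometers": ["B-QUANTITY", "I-QUANTITY"],
}

def entities(tokens, tags):
    """Decode parallel token/BIO-tag lists into (entity text, label) spans."""
    spans, current, label = [], [], None
    for tok, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:                      # close any open span
                spans.append((" ".join(current), label))
            current, label = [tok], tag[2:]  # open a new span
        elif tag.startswith("I-") and current:
            current.append(tok)              # extend the open span
        else:                                # O tag: close any open span
            if current:
                spans.append((" ".join(current), label))
            current, label = [], None
    if current:
        spans.append((" ".join(current), label))
    return spans

for text, tags in examples.items():
    print(entities(text.split(), tags))
```

The BIO scheme matters because entities span multiple tokens: the B-/I- distinction lets a decoder tell one three-token entity apart from three adjacent one-token entities.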
  • An essential step in many pipelines over web scrapes and other bulk unstructured text
  • Today, somewhat more niche to do as a stand-alone task
    • LLMs can accomplish both the NER task and whatever downstream task the NER was supporting, often zero-shot
  • Stand-alone approaches are still useful in some settings
  • State of the art is fine-tuned pretrained language models, typically BERT variants
    • Commonly accessed via spaCy
  • As with POS tagging (which often precedes NER, especially in older techniques):
    • Started out with manually written grammar rules, from which syntax trees could be built
    • Progressed to HMMs in the 1980s
    • Eventually RNNs
    • Now Transformers like BERT