Part-of-Speech (POS) Tagging

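POS tagging is the task of assigning a word class (part of speech, such as noun, verb, or adjective) to each token, based on its context. For example, "book" is a noun in "read the book" but a verb in "book a flight".

Common approaches include: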
  • Rule-Based: A dictionary listing the possible tags for each word is constructed manually, with machine learning models, or both. Rules then guide the tagger in disambiguating words that have more than one possible tag.
  • Statistical (stochastic or probabilistic): A text corpus is used to estimate useful probabilities, such as how often a word occurs with a given tag or how often one tag follows another. Given a sequence of words, the most probable sequence of tags is selected.
  • Memory-Based: Cases consisting of a context and its suitable tag are stored in memory. A new token is tagged by selecting the best-matching stored case.
  • Transformation-Based: Combines rule-based and stochastic methods (as in the Brill tagger). Tokens are first tagged using broad rules, and the tags are then improved, or transformed, by applying narrower, context-specific rules.
  • Artificial Neural Networks (ANN): Mainly Recurrent Neural Networks (RNN), including Long Short-Term Memory (LSTM) networks.
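
The transformation-based idea (broad rules first, narrower rules afterwards) can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the tiny corpus, the tag names, the single contextual rule, and the helper names (train_baseline, baseline_tag, apply_rules) are all made up for the example. The broad rule is a simple statistical baseline that gives each word its most frequent tag in the corpus. The narrow rule is hand-written here, whereas a real transformation-based tagger would typically learn such rules from data.

```python
from collections import Counter, defaultdict

# Tiny hand-made tagged corpus (illustrative only, not real data).
CORPUS = [
    [("the", "DET"), ("book", "NOUN"), ("is", "VERB"), ("good", "ADJ")],
    [("a", "DET"), ("book", "NOUN"), ("fell", "VERB")],
    [("they", "PRON"), ("book", "VERB"), ("flights", "NOUN")],
    [("we", "PRON"), ("read", "VERB"), ("the", "DET"), ("book", "NOUN")],
]


def train_baseline(corpus):
    """Map each word to its most frequent tag in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        for word, tag in sentence:
            counts[word][tag] += 1
    return {word: tags.most_common(1)[0][0] for word, tags in counts.items()}


def baseline_tag(words, lexicon, default="NOUN"):
    """Broad rule: give each word its most frequent tag (or a default)."""
    return [lexicon.get(word, default) for word in words]


# Narrower contextual rules: (from_tag, to_tag, required_previous_tag).
# Hand-written assumption for this sketch, not learned from the corpus.
RULES = [("NOUN", "VERB", "PRON")]


def apply_rules(tags, rules):
    """Rewrite tags left to right wherever a rule's context matches."""
    tags = list(tags)
    for from_tag, to_tag, prev_tag in rules:
        for i in range(1, len(tags)):
            if tags[i] == from_tag and tags[i - 1] == prev_tag:
                tags[i] = to_tag
    return tags


lexicon = train_baseline(CORPUS)
words = ["they", "book", "the", "flights"]
initial = baseline_tag(words, lexicon)  # context-free first pass
final = apply_rules(initial, RULES)     # context-aware correction

print(list(zip(words, initial)))
print(list(zip(words, final)))
```

In this toy corpus "book" occurs more often as a noun, so the first pass tags it NOUN even after "they". The contextual rule then changes it to VERB because the preceding tag is PRON, which is the kind of correction the second stage is meant to make.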