Web Mining Process

Web mining is a subset of data mining that focuses on extracting and preparing text from online data sources.

  1. Data collection:
    • Web Crawling
    • Data Streams, e.g. social media and news feeds
  2. Text Cleaning:
    • Remove HTML Tags
    • Remove URLs
    • Lower-casing
    • Spelling Correction
    • Abbreviation Correction
    • Removing Punctuation
    • Handling Emojis
    • Removing Stop Words
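
The text-cleaning steps listed above can be sketched in Python using only the standard library. This is a minimal illustration, not a reference implementation: the function name `clean_text`, the regular expressions, and the small lookup tables (`ABBREVIATIONS`, `EMOJI_NAMES`, `STOP_WORDS`) are all assumptions made for the example. Spelling correction is left out because it normally needs a dictionary or an external library.

```python
import html
import re
import string

# Hypothetical, deliberately tiny lookup tables for illustration only.
# A real pipeline would load much larger resources.
ABBREVIATIONS = {"u": "you", "gr8": "great", "thx": "thanks"}
EMOJI_NAMES = {"🙂": "smile", "😢": "sad"}
STOP_WORDS = {"a", "an", "the", "is", "are", "for", "to", "of", "and", "this"}

TAG_RE = re.compile(r"<[^>]+>")                 # crude HTML tag matcher
URL_RE = re.compile(r"(?:https?://|www\.)\S+")  # http(s):// or www. links
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def clean_text(raw: str) -> list[str]:
    """Apply the cleaning steps in order and return the remaining tokens."""
    text = TAG_RE.sub(" ", raw)      # remove HTML tags
    text = html.unescape(text)       # decode entities such as &amp;
    text = URL_RE.sub(" ", text)     # remove URLs
    text = text.lower()              # lower-casing

    # Handle emojis by replacing each known emoji with a descriptive word.
    for emoji, name in EMOJI_NAMES.items():
        text = text.replace(emoji, f" {name} ")

    text = text.translate(PUNCT_TABLE)  # remove ASCII punctuation
    tokens = text.split()

    # Abbreviation correction via dictionary lookup.
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in tokens]

    # Remove stop words.
    return [tok for tok in tokens if tok not in STOP_WORDS]


if __name__ == "__main__":
    sample = "<p>Thx for the GR8 article 🙂! See https://example.com/post?id=1 &amp; www.example.org</p>"
    print(clean_text(sample))
```

The order of the steps matters in this sketch: tags are stripped before URLs so that a closing tag is not swallowed by the URL pattern, and URLs are removed before punctuation so that they are still recognizable as URLs.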