Table of Contents

• Convolutional Neural Networks for Language (CMU)
• Text Classification (Oxford)

Introduction

  1. Text Classification Breakdown:
    We can think of text classification as a two-stage process:
    1. Representation: Process the text into some (fixed-size) representation -> How to learn \(\mathbf{x}'\).
    2. Classification: Classify the document given that representation -> How to learn \(p(c \vert \mathbf{x}')\).
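    Written as a pipeline (a schematic sketch only; \(g\), \(f\), and the softmax output are illustrative placeholders, not notation from the lectures):

    \[\mathbf{x}' = g(\text{document}), \qquad p(c \vert \mathbf{x}') = \mathrm{softmax}\big(f(\mathbf{x}')\big)_c\]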
  2. Representation:
    Bag of Words (BOW):
    • Pros:
      • Easy, no effort
    • Cons:
      • Variable size, ignores sentential structure, sparse representations
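
    A minimal sketch of a BOW count-vector representation, assuming a toy vocabulary and whitespace tokenization (both illustrative):

    ```python
    from collections import Counter

    def bag_of_words(tokens, vocab):
        """Map a token list to a |V|-dimensional count vector.
        Unknown words are simply dropped (one common choice)."""
        counts = Counter(tokens)
        vec = [0] * len(vocab)
        for word, count in counts.items():
            if word in vocab:
                vec[vocab[word]] = count
        return vec

    vocab = {"i": 0, "do": 1, "not": 2, "hate": 3, "this": 4, "movie": 5}
    print(bag_of_words("i do not hate this movie".split(), vocab))
    # [1, 1, 1, 1, 1, 1] -- for a realistic vocabulary this is mostly zeros (sparse)
    ```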

    Continuous BOW:

    • Pros:
      • Continuous (dense) representations
    • Cons:
      • Ignores word ordering
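
    A continuous-BOW sketch: sum (or average) word embeddings into one dense, fixed-size vector. NumPy here, with random vectors standing in for learned embeddings (all sizes illustrative):

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    vocab = {"i": 0, "do": 1, "not": 2, "hate": 3, "this": 4, "movie": 5}
    E = rng.normal(size=(len(vocab), 4))  # stand-in for a learned embedding table

    def cbow(tokens):
        """Dense and fixed-size, but order-free: a sum over word vectors."""
        ids = [vocab[t] for t in tokens if t in vocab]
        return E[ids].sum(axis=0)

    x1 = cbow("do not hate this movie".split())
    x2 = cbow("hate not do movie this".split())
    print(np.allclose(x1, x2))  # True: word order is lost
    ```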

    Deep CBOW:

    • Pros:
      • Can learn feature combinations (e.g. “not” AND “hate”)
    • Cons:
      • Cannot capture word order (positional info) directly (e.g. “not hate”)
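
    A Deep-CBOW sketch, assuming PyTorch as the framework (layer sizes are arbitrary): the nonlinear layers on top of the summed embeddings are what let the model pick up feature combinations like “not” AND “hate”:

    ```python
    import torch
    import torch.nn as nn

    class DeepCBOW(nn.Module):
        def __init__(self, vocab_size, emb_dim, hidden, n_classes):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.mlp = nn.Sequential(
                nn.Linear(emb_dim, hidden), nn.Tanh(),
                nn.Linear(hidden, hidden), nn.Tanh(),
                nn.Linear(hidden, n_classes),
            )

        def forward(self, token_ids):           # token_ids: (seq_len,)
            x = self.emb(token_ids).sum(dim=0)  # order is still discarded here
            return self.mlp(x)                  # scores for p(c | x')

    model = DeepCBOW(vocab_size=6, emb_dim=4, hidden=8, n_classes=2)
    logits = model(torch.tensor([0, 1, 2, 3, 4, 5]))
    ```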

    Bag of n-grams:

    • Pros:
      • Captures (some) combination features and word-ordering (e.g. “not hate”), works well
    • Cons:
      • Parameter explosion; no parameter sharing between similar words/n-grams
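
    A bag-of-n-grams sketch (plain Python; n_max and tokenization are illustrative). “not hate” becomes its own feature, so some local word order survives; the flip side is that the feature space grows roughly as \(\vert V \vert^n\), and “not hate” shares nothing with “don't hate”:

    ```python
    from collections import Counter

    def ngram_features(tokens, n_max=2):
        """Count all 1- to n_max-grams as separate features."""
        feats = Counter()
        for n in range(1, n_max + 1):
            for i in range(len(tokens) - n + 1):
                feats[" ".join(tokens[i:i + n])] += 1
        return feats

    print(ngram_features("i do not hate this movie".split()))
    # Counter({'i': 1, ..., 'not hate': 1, 'hate this': 1, ...})
    ```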
  3. CNNs for Text:
    Two main paradigms:
    1. Context-window modeling: for tagging etc., compute a representation of each word's surrounding context window, then tag the word from it.
    2. Sentence modeling: convolve over the sentence to extract n-gram features, then pool to combine them over the whole sentence (sketched below).
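
    A sentence-modeling sketch, again assuming PyTorch (all sizes arbitrary): a 1-D convolution acts as a learned n-gram detector, and max-pooling over time collapses the variable-length sentence into one fixed-size vector for classification:

    ```python
    import torch
    import torch.nn as nn

    class SentenceCNN(nn.Module):
        def __init__(self, vocab_size, emb_dim, n_filters, width, n_classes):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=width)  # n-gram detector
            self.out = nn.Linear(n_filters, n_classes)

        def forward(self, token_ids):                # (batch, seq_len)
            x = self.emb(token_ids).transpose(1, 2)  # (batch, emb_dim, seq_len)
            h = torch.relu(self.conv(x))             # (batch, n_filters, seq_len - width + 1)
            pooled = h.max(dim=2).values             # pool over the whole sentence
            return self.out(pooled)                  # scores for p(c | x')

    model = SentenceCNN(vocab_size=100, emb_dim=16, n_filters=8, width=3, n_classes=2)
    logits = model(torch.randint(0, 100, (1, 10)))   # works for any seq_len >= width
    ```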