Table of Contents
- Convolutional Neural Networks for Language (CMU)
- Text Classification (Oxford)
Introduction
- Text Classification Breakdown:
We can think of text classification as being broken down into a two-stage process:
- Representation: Process text into some (fixed-size) representation -> How to learn \(\mathbf{x}'\).
- Classification: Classify the document given that representation -> How to learn \(p(c \vert \mathbf{x}')\).
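As a minimal sketch of the classification stage: here \(\mathbf{x}'\) is a placeholder vector (in practice it would come from one of the representations below), and \(p(c \vert \mathbf{x}')\) is a linear layer followed by a softmax. All sizes and weights are illustrative assumptions, not from the lecture:

```python
import numpy as np

rng = np.random.default_rng(0)
num_classes, repr_dim = 3, 8          # illustrative sizes

W = rng.normal(size=(num_classes, repr_dim))  # class weight matrix (learned in practice)
b = np.zeros(num_classes)                     # class biases

x_prime = rng.normal(size=repr_dim)           # stand-in for a learned representation x'

logits = W @ x_prime + b
p_c_given_x = np.exp(logits - logits.max())   # subtract max for numerical stability
p_c_given_x /= p_c_given_x.sum()              # p(c | x'), sums to 1
print(p_c_given_x)
```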
- Representation:
Bag of Words (BOW):
- Pros:
- Easy, no effort
- Cons:
- Variable size, ignores sentential structure, sparse representations
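A minimal BOW sketch (toy documents and vocabulary assumed): each document becomes a vector of word counts in a fixed vocabulary order, which is where the sparsity comes from:

```python
from collections import Counter

docs = ["i do not hate this movie", "i hate this movie"]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

def bow(doc):
    counts = Counter(doc.split())
    return [counts.get(w, 0) for w in vocab]  # fixed order, mostly zeros

for d in docs:
    print(bow(d))  # same vector length regardless of document length
```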
Continuous BOW:
- Pros:
- Continuous representation
- Cons:
- Ignores word ordering
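A minimal continuous-BOW sketch: look up an embedding for each word and average (or sum) them into one continuous, fixed-size vector. The embedding table here is random, standing in for learned embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"i": 0, "do": 1, "not": 2, "hate": 3, "this": 4, "movie": 5}
emb_dim = 4
E = rng.normal(size=(len(vocab), emb_dim))  # embedding matrix (learned in practice)

doc = "i do not hate this movie".split()
ids = [vocab[w] for w in doc]
x_prime = E[ids].mean(axis=0)  # continuous and fixed-size, but order-insensitive
print(x_prime)
```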
Deep CBOW:
- Pros:
- Can learn feature combinations (e.g. “not” AND “hate”)
- Cons:
- Cannot learn word order (positional information) directly (e.g. “not hate”)
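A minimal Deep CBOW sketch in PyTorch, with illustrative (assumed) layer sizes: nonlinear layers on top of summed embeddings let the model pick up feature combinations such as “not” AND “hate”, though the sum itself still discards word order:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, hidden, num_classes = 100, 32, 64, 2  # illustrative sizes

class DeepCBOW(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.mlp = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, word_ids):           # word_ids: (seq_len,)
        x = self.emb(word_ids).sum(dim=0)  # word order still ignored here
        return self.mlp(x)                 # unnormalized class scores

model = DeepCBOW()
logits = model(torch.tensor([1, 5, 9, 3]))
print(logits)
```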
Bag of n-grams:
- Pros:
- Captures (some) combination features and word-ordering (e.g. “not hate”), works well
- Cons:
- Parameter explosion; no sharing between similar words/n-grams
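A minimal bag-of-n-grams sketch (toy document assumed): counting bigrams alongside unigrams preserves local word order such as “not hate”, at the cost of a much larger feature space:

```python
from collections import Counter

def ngrams(tokens, n):
    return list(zip(*(tokens[i:] for i in range(n))))

doc = "i do not hate this movie".split()
features = Counter(ngrams(doc, 1)) + Counter(ngrams(doc, 2))
print(features[("not", "hate")])  # 1 -- local word order is captured
```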
- CNNs for Text:
Two main paradigms:
- Context-window modeling: for tagging etc., use the surrounding context window of a word to classify/tag it.
- Sentence modeling: apply convolution to extract n-gram features, then pooling to combine them over the whole sentence.
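A minimal sentence-modeling sketch in PyTorch (window size, filter count, and all sizes are illustrative assumptions): a 1-D convolution acts as a set of n-gram detectors (here window size 3), and max-pooling over time combines them into one fixed-size sentence vector:

```python
import torch
import torch.nn as nn

vocab_size, emb_dim, num_filters, num_classes = 100, 32, 16, 2  # illustrative sizes

emb = nn.Embedding(vocab_size, emb_dim)
conv = nn.Conv1d(emb_dim, num_filters, kernel_size=3, padding=1)
out = nn.Linear(num_filters, num_classes)

word_ids = torch.tensor([[1, 5, 9, 3, 7]])  # (batch=1, seq_len=5)
x = emb(word_ids).transpose(1, 2)           # (1, emb_dim, seq_len) for Conv1d
feats = torch.relu(conv(x))                 # n-gram feature per position
sentence = feats.max(dim=2).values          # max-pool over the whole sentence
print(out(sentence))                        # class scores
```

Max-pooling keeps only the strongest activation of each filter anywhere in the sentence, which is what makes the representation length-independent.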