Introduction
-
- Sentiment Analysis:
- The automated identification and quantification of affective states and subjective information in textual data.
-
- Applications:
-
- Formulating the Problem:
- Tasks to Extract:
- Holder (source): who holds the attitude.
- Target (aspect): what the attitude is about.
- Type: the kind of attitude, e.g. positive or negative polarity.
- Input:
- Text: the text that contains the attitude, analyzed at one of two levels:
- Sentence Analysis
- Entire-Document Analysis
- main: second
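As an illustration only, the problem formulation above can be sketched as a small record type. The class name `Attitude`, its field names, and the example values are hypothetical and chosen for this sketch; they do not come from the notes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attitude:
    """One extracted attitude (hypothetical structure, for illustration)."""
    holder: Optional[str]    # who holds the attitude (source), if known
    target: Optional[str]    # what the attitude is about (aspect), if known
    attitude_type: str       # kind of attitude, e.g. "positive" or "negative"
    text: str                # the input text that contains the attitude
    scope: str = "sentence"  # "sentence" or "document" level analysis


example = Attitude(
    holder="reviewer",
    target="battery life",
    attitude_type="negative",
    text="The battery life is terrible.",
)
```

Keeping holder and target optional reflects that a simple polarity classifier may only fill in the type and leave the other fields empty.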
Algorithms
-
- Binarized (Boolean Feature) Multinomial Naive Bayes:
- This algorithm works the same way as Multinomial Naive Bayes, with one change to how features are counted.
- The features (tokens) are counted by occurrence rather than by frequency,
i.e. if a word occurs in a text, its count for that text is always one, regardless of how many times it appears there.
- Justification: The reasoning behind the binarized version follows from the nature of the problem.
The sentiment of a piece of text is usually signalled by the mere occurrence of a word that carries that sentiment (e.g. “Fantastic”), rather than by how many times that word appears.
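The binarization step described above can be sketched as follows. This is a minimal illustration, assuming pre-tokenized documents and add-one (Laplace) smoothing; the function names `train_binarized_nb` and `predict`, the `alpha` parameter, and the toy data are assumptions made for this sketch, not part of the notes.

```python
import math
from collections import Counter, defaultdict


def train_binarized_nb(docs, labels, alpha=1.0):
    """Multinomial Naive Bayes trained on per-document deduplicated tokens."""
    vocab = set()
    class_docs = Counter(labels)
    token_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        unique = set(tokens)  # binarization: each word counts once per document
        vocab.update(unique)
        token_counts[label].update(unique)

    model = {"log_prior": {}, "log_likelihood": {}, "vocab": vocab}
    for label, n_label_docs in class_docs.items():
        model["log_prior"][label] = math.log(n_label_docs / len(docs))
        # Smoothed denominator: total binarized count plus alpha per vocab word.
        total = sum(token_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            w: math.log((token_counts[label][w] + alpha) / total) for w in vocab
        }
    return model


def predict(model, tokens):
    """Return the most probable class; the test document is binarized too."""
    unique = set(tokens) & model["vocab"]  # ignore words never seen in training
    scores = {
        label: prior + sum(model["log_likelihood"][label][w] for w in unique)
        for label, prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


docs = [
    ["fantastic", "fantastic", "movie"],
    ["great", "fun", "movie"],
    ["boring", "boring", "plot"],
    ["terrible", "boring", "movie"],
]
labels = ["pos", "pos", "neg", "neg"]
model = train_binarized_nb(docs, labels)
print(predict(model, ["fantastic", "fun"]))  # pos
print(predict(model, ["boring", "plot"]))    # neg
```

Note that the repeated "fantastic" in the first document contributes a count of one, which is the only difference from the standard frequency-based training loop.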
-
- Better Algorithms:
-
- Maximum Entropy (MaxEnt)
- Support Vector Machines (SVMs)
Sentiment Lexicons
-
- Sentiment Lexicons:
- Specific keywords that are associated with specific polarities.
Relying on them can be more useful than analyzing all of the words (tokens) in a piece of text.
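A lexicon-based polarity check might look like the sketch below. The toy `LEXICON` and the function name `lexicon_polarity` are hypothetical and serve only to show the idea; a real sentiment lexicon would be far larger and may carry graded scores rather than +1/-1.

```python
# Hypothetical toy lexicon: +1 for positive words, -1 for negative words.
LEXICON = {
    "fantastic": 1,
    "great": 1,
    "good": 1,
    "boring": -1,
    "terrible": -1,
    "bad": -1,
}


def lexicon_polarity(tokens, lexicon=LEXICON):
    """Sum the polarity of lexicon words only; all other tokens are ignored."""
    score = sum(lexicon.get(token.lower(), 0) for token in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_polarity("A fantastic and great film".split()))  # positive
print(lexicon_polarity("boring and terrible".split()))         # negative
```

Only tokens found in the lexicon affect the result, which is what distinguishes this approach from scoring every word in the text.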