Ahmad Badary

Image Classification

The Problem:

Assigning a semantic label from a fixed set of categories to an image, which the computer sees only as a grid of pixel values.

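The difficulty of this task is often referred to as the Semantic Gap: the gap between the numbers stored in the image and the concept a human reads from it.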
The Challenges:
1. Viewpoint Variation
2. Illumination Conditions
3. Deformation
4. Occlusion
5. Background Clutter
6. Intra-class variation
7. Scale Variation
Attempts:
The Data-Driven Approach:
1. Collect a dataset of images and labels.
2. Use Machine Learning to train a classifier.
3. Evaluate the classifier on new images.
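The three steps above can be sketched as a tiny train/predict/evaluate loop. This is only an illustration: the function names, the toy "images" (flat lists of pixel intensities), and the majority-label baseline model are all assumptions made for the example, not something the notes prescribe.

```python
from collections import Counter

def train(images, labels):
    """Memorize the most common label (a deliberately trivial baseline model)."""
    return Counter(labels).most_common(1)[0][0]

def predict(model, image):
    """Return the stored majority label; this baseline ignores the image content."""
    return model

def evaluate(model, images, labels):
    """Fraction of held-out images whose predicted label matches the true label."""
    correct = sum(predict(model, img) == lbl for img, lbl in zip(images, labels))
    return correct / len(labels)

# Step 1: collect a (toy, made-up) dataset of images and labels.
train_images = [[0, 0, 0, 0], [10, 10, 10, 10], [250, 250, 250, 250]]
train_labels = ["dark", "dark", "bright"]
test_images = [[5, 5, 5, 5], [240, 240, 240, 240]]
test_labels = ["dark", "bright"]

# Step 2: train a classifier on the training set.
model = train(train_images, train_labels)

# Step 3: evaluate the classifier on images it has not seen.
accuracy = evaluate(model, test_images, test_labels)
print(model, accuracy)
```

Any real classifier can be dropped in by replacing `train` and `predict` while keeping the same evaluation step.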

K-Nearest-Neighbors:
Complexity:
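Training: \(\mathcal{O}(1)\), since the classifier only stores the training data.

Predict: \(\mathcal{O}(N)\), where \(N\) is the number of training examples, since the test image is compared against every one of them.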
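A minimal sketch of a k-nearest-neighbors classifier makes the two costs visible. The class name, the method names, the toy data, and the choice of the L1 distance are assumptions made for illustration.

```python
import heapq
from collections import Counter

class KNearestNeighbors:
    def __init__(self, k=1):
        self.k = k

    def train(self, images, labels):
        # Constant time: only references to the training data are stored.
        self.images = images
        self.labels = labels

    def predict(self, image):
        # One distance per stored example, so N distance computations in total.
        distances = [
            (sum(abs(a - b) for a, b in zip(image, stored)), label)
            for stored, label in zip(self.images, self.labels)
        ]
        # Keep the k closest examples and take a majority vote over their labels.
        nearest = heapq.nsmallest(self.k, distances, key=lambda pair: pair[0])
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]

# Toy data: each "image" is a flat list of two pixel intensities.
knn = KNearestNeighbors(k=3)
knn.train([[0, 0], [1, 1], [10, 10], [11, 11]],
          ["dark", "dark", "bright", "bright"])
print(knn.predict([2, 2]))
print(knn.predict([9, 9]))
```

The cost profile is the opposite of what is usually wanted in practice: training is free, but every prediction has to scan the whole training set.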

L1 Distance:

\[d_1(I_1, I_2) = \sum_{p} \left| I^p_1 - I^p_2 \right|\]

Pixel-wise absolute value differences.
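As a small worked example, the L1 distance can be computed by hand for two four-pixel images. The function name `l1_distance` and the pixel values are made up for illustration.

```python
def l1_distance(image_a, image_b):
    """Sum of absolute pixel-wise differences between two equally sized images."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b))

image_1 = [56, 32, 10, 18]
image_2 = [10, 20, 24, 17]

# |56 - 10| + |32 - 20| + |10 - 24| + |18 - 17| = 46 + 12 + 14 + 1 = 73
print(l1_distance(image_1, image_2))
```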
L2 Distance:

\[d_2 (I_1, I_2) = \sqrt{\sum_{p} \left( I^p_1 - I^p_2 \right)^2}\]
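The same kind of worked example for the L2 distance, again with a hypothetical function name and made-up pixel values:

```python
import math

def l2_distance(image_a, image_b):
    """Square root of the sum of squared pixel-wise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(image_a, image_b)))

image_1 = [3, 0, 0, 0]
image_2 = [0, 4, 0, 0]

# sqrt(3^2 + (-4)^2 + 0 + 0) = sqrt(25) = 5.0
print(l2_distance(image_1, image_2))
```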
L1 vs. L2:

The L2 distance penalizes large errors (pixel differences) much more heavily than the L1 distance does, because each difference is squared.
The L2 distance stays small when the two vectors differ by many small amounts, but it blows up if there is even one big difference between them.

Another difference is that the L1 distance depends on the choice of coordinate frame, while the L2 distance is invariant to rotations of the coordinate frame.
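Both differences can be checked numerically. The sketch below uses made-up vectors and hypothetical helper names (`l1`, `l2`, `rotate`): it first compares one big difference against several small ones, then rotates a pair of 2D points by 45 degrees and recomputes both distances.

```python
import math

def l1(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))

def l2(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rotate(point, angle):
    """Rotate a 2D point counter-clockwise about the origin by `angle` radians."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

# Same total absolute difference, distributed differently.
zeros = [0, 0, 0, 0]
one_big = [4, 0, 0, 0]
many_small = [1, 1, 1, 1]
print(l1(zeros, one_big), l1(zeros, many_small))  # L1 treats both cases alike
print(l2(zeros, one_big), l2(zeros, many_small))  # L2 is larger for the single big difference

# Rotating the coordinate frame changes L1 but leaves L2 unchanged.
u, v = (1.0, 0.0), (0.0, 0.0)
u_rot, v_rot = rotate(u, math.pi / 4), rotate(v, math.pi / 4)
print(l1(u, v), l1(u_rot, v_rot))
print(l2(u, v), l2(u_rot, v_rot))
```

In the first comparison the L1 distance is 4 in both cases, while the L2 distance is 4 for the single big difference and 2 for the four small ones. In the second, the L1 distance grows from 1 to the square root of 2 after the rotation, while the L2 distance stays at 1 up to floating-point rounding.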