ML System Design

ML System Design (NG)

  1. ML System Design (NG) - Summary:
    Problem: Spam Classification
    Task: Build a Spam Classifier

    1. How to spend your time to lower the system's error:
      • Collect lots of data
      • Develop sophisticated features based on email routing information (from the email header)
      • Develop sophisticated features for the message body (e.g. should “deal” and “Deals” be treated as the same word?)
      • Develop a sophisticated algorithm to detect deliberate misspellings (“Med1cine”, “M0rtgage”, etc.)

      Tip: List all of the options above before committing to one. Brainstorm first, then systematically select and prioritize.
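
      As an illustration of the message-body and misspelling options, here is a minimal sketch of token normalization. The substitution table, the crude plural rule, and the names `LEET_MAP`, `normalize_token`, and `tokenize` are assumptions made for this example, not part of the original notes. A real system would likely use a proper stemmer and a more careful misspelling detector.

```python
import re
from typing import List

# Hypothetical table of common character swaps used to disguise spam words.
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "@": "a", "$": "s"}
)


def normalize_token(token: str) -> str:
    """Map variants such as 'Deals' and 'Med1cine' to a shared form."""
    token = token.lower()
    # Undo substitutions only when letters are mixed with digits or symbols,
    # so genuine numbers such as "2024" are left alone.
    if re.search(r"[a-z]", token) and re.search(r"[0-9@$]", token):
        token = token.translate(LEET_MAP)
    # Crude plural stripping (assumption: good enough for a first pass).
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        token = token[:-1]
    return token


def tokenize(text: str) -> List[str]:
    """Split text into normalized tokens, dropping punctuation."""
    return [normalize_token(t) for t in re.findall(r"[A-Za-z0-9@$]+", text)]


print(tokenize("Cheap Med1cine deals, M0rtgage Deals"))
```

      With this normalization, “deal” and “Deals” collapse to one feature, and “Med1cine” and “M0rtgage” map back to their dictionary spellings. Whether that helps is an empirical question to check on the cross-validation set.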

    Recommended Approach:

    • Start with a simple algorithm that you can implement quickly. Implement it and test it on your cross-validation data.
    • Plot learning curves to decide if more data, more features, etc. are likely to help.
    • Error analysis: Manually examine the examples (in the cross-validation set) that your algorithm made errors on. See if you can spot any systematic trend in the types of examples it gets wrong.
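
    The first and last steps above can be sketched as follows: a deliberately simple baseline, evaluated on cross-validation examples, with the errors tallied per category. The keyword list, the example messages, the category tags, and the names `keyword_baseline` and `error_breakdown` are all hypothetical. In practice the categories come from reading the misclassified emails by hand.

```python
from collections import Counter


def keyword_baseline(text, spam_words=("deal", "free", "mortgage")):
    """Very simple baseline: predict spam (1) if any listed word appears."""
    words = text.lower().split()
    return int(any(w.strip(".,!?") in spam_words for w in words))


def error_breakdown(examples, predict):
    """Count misclassified examples per hand-assigned category."""
    errors = Counter()
    for text, label, category in examples:
        if predict(text) != label:
            errors[category] += 1
    return errors


# Hypothetical cross-validation examples: (text, label, category),
# where label 1 means spam and the category was assigned by hand.
cv_examples = [
    ("Great deal on a new mortgage!", 1, "finance"),
    ("Cheap med1cine shipped today", 1, "pharma"),
    ("Buy M3ds online now", 1, "pharma"),
    ("Lunch at noon tomorrow?", 0, "personal"),
    ("Your password will expire, click here", 1, "phishing"),
]

breakdown = error_breakdown(cv_examples, keyword_baseline)
cv_error = sum(breakdown.values()) / len(cv_examples)

print("cross-validation error:", cv_error)
for category, count in breakdown.most_common():
    print(category, count)
```

    In this toy data most errors fall in the "pharma" category and involve disguised spellings, which would suggest prioritizing misspelling detection over, say, header features. The point is to let the tally, not intuition, decide where to spend time next.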



Notes: