Machine Learning System Design (Part - 2)
Content
- Content
- How to approach System Design Interview Question
- Design Youtube Recommendation system?
- Design a movie recommendation system like Netflix?
- ML in production
- Production Machine Learning Monitoring: Outliers, Drift, Explainers & Statistical Performance
- System design interview question strategy?
- What is CDN - Content Delivery Network?
- What is Bloom Filter?
- How to design Twitter?
- TODO: How to design a Search Engine ?
- How the Instagram algorithm works in 2019?
- How Does the YouTube Algorithm Work? A Guide to Getting More Views
- How to succeed in a system design interview?
- Recommendation system for Duolingo [@chiphuyen]
- Fraud Detection [@chiphuyen]
- Build a recommendation system to suggest replacement items [@chiphuyen]
- Twitter follower recommendation
- How would you design an algorithm to match pool riders for Lyft or Uber?
- Trigger/Wake word detection, e.g, ‘ok, google!’ [@chiphuyen]
- Click Through Rate (CTR) Prediction
- Design dynamic pricing
How to approach System Design Interview Question
Be familiar with basic knowledge
First of all, there's no doubt you should be very good at data structures and algorithms. Take the URL shortening service as an example: you won't be able to come up with a good solution if you are not clear about hashing and time/space complexity analysis. Quite often, there's a trade-off between time and memory efficiency, and you must be very proficient in big-O analysis in order to figure everything out.
There are also several other things you'd better be familiar with, although it's possible that they may not be covered in your interview.
- Abstraction: It's a very important topic for system design interviews. You should be clear about how to abstract a system, what is visible and invisible to other components, and the logic behind it. Object-oriented programming is also important to know.
- Database: You should be clear about basic concepts like relational databases. Knowing about NoSQL might be a plus depending on your level (new grad or experienced engineer).
- Network: You should be able to explain clearly what happens when you type "gainlo.co" in your browser; things like DNS lookup and HTTP requests should be clear.
- Concurrency: It will be great if you can recognize concurrency issues in a system and tell the interviewer how to solve them. Sometimes this topic can be very hard, but knowing basic concepts like race conditions and deadlocks is the bottom line.
- Operating system: Sometimes your discussion with the interviewer can go very deep, and at that point it's better to know how the OS works at a low level.
- Machine learning (optional): You don't need to be an expert, but it's good to be familiar with basic concepts like feature selection and how ML algorithms work in general.
Top-down + modularization
This is the general strategy for solving a system design problem and explaining it to the interviewer.
The worst approach is to jump into details immediately, which only makes a mess of things.
It's always good to start with high-level ideas and then figure out the details step by step, so this should be a top-down approach. Why? Because many system design questions are very general and there's no way to solve them without the big picture.
You should always have a big picture.
Let's use the Youtube recommendation system as an example. I might first divide this into front-end and backend (the interviewer may only ask about the backend or a specific part, but I'll cover the whole system to give you an idea). For the backend, the flow can be 3 steps: collect user data (like videos watched, location, preferences, etc.), run an offline pipeline that generates the recommendations, and store and serve the data to the front-end. Then we can jump into each component in detail.
Reference:
Design Youtube Recommendation system?
Big Picture:
- Front End
- Back End
- Collect user data (like videos he watched, location, preferences etc.)
- Offline pipeline that generates the recommendation [Hybrid Approach: Heuristic + ML Based]
- Store and Serve the data to front-end.
Basically, we can simplify the system into a couple of major components as follows:
- Storage: How do you design the database schema? What database to use? Videos and images can be a subtopic as they are quite special to store.
- Scalability: When you get millions or even billions of users, how do you scale the storage and the whole system? This can be an extremely complicated problem, but we can at least discuss some high-level ideas.
- Web server: The most common structure is that the front ends (both mobile and web) talk to the web server, which handles logic like user authentication, sessions, fetching and updating users' data, etc. The server then connects to multiple backends like video storage, the recommendation server and so forth.
- Cache: Another important component. We've discussed caching in detail before, but there are still some differences here, e.g. we need caches in multiple layers like the web server, video serving, etc.
- There are a couple of other important components like the recommendation system, security system and so on. As you can see, just a single feature can be a stand-alone interview question.
Storage and data model
If you are using a relational database like MySQL, designing the data schema can be straightforward. And in reality, Youtube has used MySQL as its main database from the beginning and it works pretty well.
First and foremost, we need to define the data models:
- `User model`: can be stored in a single table including email, name, registration date, profile information and so on.
- Another common approach is to keep user data in two tables:
    - `Authentication Table`: authentication-related information like email, password, name, registration date, etc.
    - `Profile Info Table`: additional profile information like address, age and so forth.
- `Video Model`: a video contains a lot of information, including metadata (title, description, size, etc.), the video file itself, comments, view counts, like counts and so on. Basic video information should be kept in its own video table.
- `Author-Video Table`: another table to map `user id` to `video id`. A `user-like-video` relation can also be a separate table. The idea here is database normalization: organizing the columns and tables to reduce data redundancy and improve data integrity.
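A minimal sketch of this normalized schema in SQLite (the table and column names are illustrative assumptions, not Youtube's actual schema):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE auth (
    user_id       INTEGER PRIMARY KEY,
    email         TEXT UNIQUE NOT NULL,
    password_hash TEXT NOT NULL,
    name          TEXT,
    registered_at TEXT
);
CREATE TABLE profile (
    user_id INTEGER PRIMARY KEY REFERENCES auth(user_id),
    address TEXT,
    age     INTEGER
);
CREATE TABLE video (
    video_id    INTEGER PRIMARY KEY,
    title       TEXT,
    description TEXT,
    size_bytes  INTEGER
);
-- normalized mapping tables: who uploaded what, who liked what
CREATE TABLE author_video (
    user_id  INTEGER REFERENCES auth(user_id),
    video_id INTEGER REFERENCES video(video_id),
    PRIMARY KEY (user_id, video_id)
);
CREATE TABLE user_like_video (
    user_id  INTEGER REFERENCES auth(user_id),
    video_id INTEGER REFERENCES video(video_id),
    PRIMARY KEY (user_id, video_id)
);
""")
```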
Video and image storage
It's recommended to store large static files like videos and images separately, as this gives better performance and makes them much easier to organize and scale. It's quite counterintuitive, but Youtube has more images than videos to serve. Imagine that each video has thumbnails of different sizes for different screens; the result is having 4X more images than videos. Therefore we should never ignore image storage.
One of the most common approaches is to use CDN (Content delivery network). In short, CDN is a globally distributed network of proxy servers deployed in multiple data centers. The goal of a CDN is to serve content to end-users with high availability and high performance. It’s a kind of 3rd party network and many companies are storing static files on CDN today.
The biggest benefit of using a CDN is that it replicates content in multiple places, so there's a better chance of content being closer to the user, with fewer hops, and content will travel over a friendlier network. In addition, the CDN takes care of issues like scalability and you just need to pay for the service.
Popular VS long-tailed videos
If you thought that a CDN is the ultimate solution, you'd be wrong. Given the number of videos Youtube has today ($819,417,600$ hours of video), it would be extremely costly to host all of them on a CDN, especially since the majority of videos are long-tailed, i.e. videos that have only $1$-$20$ views a day.
However, one of the most interesting things about the Internet is that it's usually the long-tailed content that attracts the majority of users. The reason is simple: popular content can be found everywhere, and only the long-tailed things make the product special.
Coming back to the storage problem. One straightforward approach is to
- Host popular videos in CDN
- Less popular videos are stored in our own servers by location.
This has a couple of advantages:
- Popular videos are viewed by a huge audience in different locations, which is what a CDN is good at. It replicates the content in multiple places so that it's more likely to serve a video from a close and friendly network.
- Long-tailed videos are usually consumed by a particular group of people, and if you can predict this in advance, it's possible to store that content efficiently.
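A toy sketch of that routing decision (the popularity threshold and URLs are made-up assumptions):

```python
def video_url(video_id: str, daily_views: int, user_region: str,
              popular_threshold: int = 10_000) -> str:
    """Serve popular videos from the CDN and long-tail videos from
    regional origin servers (illustrative policy, not Youtube's)."""
    if daily_views >= popular_threshold:
        # hot content: the CDN replicates it close to users everywhere
        return f"https://cdn.example.com/videos/{video_id}"
    # long-tail content stays on our own servers, chosen by region
    return f"https://{user_region}.origin.example.com/videos/{video_id}"

print(video_url("abc123", daily_views=50_000, user_region="us-east"))
print(video_url("xyz789", daily_views=12, user_region="eu-west"))
```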
Recommendation Architecture:
- The system contains multiple steps/components, which can be divided into an `online` and an `offline` part.
    - e.g. comparing similar users/videos can be time-consuming on Youtube, so this part should be done in offline pipelines. For the offline part, all the user models and videos need to be stored in `distributed systems`.
    - In fact, for most machine learning systems it's common to use an offline pipeline to process big data, as you can't expect it to finish within a few seconds.
- Feedback loop
- Periodic model training to capture new behavior
ML Algorithm:
- `Collaborative Filtering`: In a nutshell, to recommend videos for a user, we can provide videos liked by similar users. For instance, if users A and B have watched a bunch of the same videos, it's highly likely that user A will like videos liked by B. Of course, there are many ways to define what "similar" means here. It could be that two users have liked the same videos, or that they share the same location.
    - The above algorithm is called `user-based` collaborative filtering. Another version is called `item-based` collaborative filtering, which means recommending videos (items) that are similar to videos a user has already watched.
- `Locality-Sensitive Hashing`: to find similar users/items approximately at scale.
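A minimal user-based collaborative filtering sketch on a toy user-video matrix (the data and neighborhood size are illustrative; at Youtube scale you would switch to approximate methods such as locality-sensitive hashing instead of exact pairwise similarity):

```python
import numpy as np

# rows = users, cols = videos; 1 = liked/watched
R = np.array([
    [1, 1, 0, 0, 1],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
], dtype=float)

def recommend(user: int, k: int = 2, top_n: int = 2):
    # cosine similarity between the target user and every other user
    norms = np.linalg.norm(R, axis=1) + 1e-9
    sims = (R @ R[user]) / (norms * norms[user])
    sims[user] = -1.0                      # exclude the user themselves
    neighbors = np.argsort(sims)[-k:]      # k most similar users
    # score videos by neighbor similarity, mask out already-seen ones
    scores = sims[neighbors] @ R[neighbors]
    scores[R[user] > 0] = -1.0
    return np.argsort(scores)[::-1][:top_n]

print(recommend(user=3))   # videos liked by users most similar to user 3
```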
Cold Start:
For a new user, based on their age, sex and location (available from login information), recommend from a general pool. The general pool may contain:
- Most seen/liked videos in the country X
- Most seen/liked videos for the given age, sex
Slowly the user starts to engage with the platform: clicking some videos, liking/disliking some, commenting on some, searching for some videos. All these activities will help to improve the recommendations.
The final solution can be a hybrid solution, that is a mixture of Rule Based (Heuristic) and AI based approach.
Heuristic Solution
- Rule based approach
- Based on videos a user has watched, we can simply suggest videos from same authors
- Suggest videos with similar titles or labels.
- If we use popularity (number of comments, shares) as another signal, the recommendation system can work pretty well as a baseline
- Suggest videos whose `title` is similar to the user's `search queries`
Feature engineering:
Usually, there are two types of features: `explicit` and `implicit` features.
- Demographic:
- Age [From Login information]
- Sex [From Login Information]
- Country
- Explicit features can be `ratings`, `favorites`, etc. On Youtube, they can be:
    - `like`/`share`/`subscribe` actions
    - Whether the user comments on a video
    - Video title, label, category
    - Time of the day [morning ritual/religious/gym video, evening music video, dance video, party]
    - Adding videos to `Watch Later` or to an explicit `User Playlist`
- Implicit features are less obvious:
    - `Watch Time`: if a user has watched a video for only a couple of seconds, that is probably a negative signal.
    - `Personal Preference`: given a list of recommended videos, if a user clicks one over another, it can mean they prefer the one clicked. Usually, we need to explore a lot to find good implicit features.
    - Freshness [just launched]
- Recommend from the heavy tail
    - Under this category the recommendation system will show some diversified content.
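A sketch of how these explicit and implicit signals could be assembled into one feature vector per (user, video) pair (the field names and scalings below are illustrative assumptions, not Youtube's actual features):

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    age: int
    liked: bool
    shared: bool
    commented: bool
    watch_seconds: float
    video_length_seconds: float
    hour_of_day: int
    video_age_days: float  # freshness

def to_features(x: Interaction) -> list:
    watch_ratio = x.watch_seconds / max(x.video_length_seconds, 1.0)
    return [
        x.age / 100.0,                   # demographic (explicit)
        float(x.liked),                  # explicit engagement signals
        float(x.shared),
        float(x.commented),
        watch_ratio,                     # implicit: watch time
        float(watch_ratio < 0.05),       # implicit: likely a negative signal
        x.hour_of_day / 23.0,            # time of day
        1.0 / (1.0 + x.video_age_days),  # freshness
    ]

print(to_features(Interaction(25, True, False, True, 42.0, 300.0, 21, 2.0)))
```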
Potential Scale Issues:
- Response time: use offline pipelines to precompute signals that can speed up the ranking
- Model inference time
- Scale the architecture: with millions of users, a single server is far from enough due to storage, memory and CPU bound issues, etc. That's why it's pretty common to see server crashes when there are a large number of requests. To scale the architecture, the rule of thumb is that a service-oriented architecture beats a monolithic application.
    - Instead of having everything together, it's better to divide the whole system into small components by service, separate each component, and put a `load balancer` in front of the components to distribute requests between them.
        - Cloud-based solution: AWS / Google Cloud / Azure
        - Horizontal scaling
        - Kubernetes-based solution
- Scale the database: even if we put the database on a separate server, it will not be able to store an infinite amount of data. At a certain point, we need to scale the database. For this specific problem, we can either do `vertical splitting` (partitioning) by splitting the database into sub-databases like a `user database`, a `comment database`, etc., or `horizontal splitting` (sharding) by splitting based on attributes like US users vs. European users.
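A toy sketch of the two splitting strategies (the database names and the region-based shard key are illustrative assumptions):

```python
# Vertical splitting (partitioning): different entities live in different databases
VERTICAL_PARTITIONS = {
    "user":    "user_db",
    "comment": "comment_db",
    "video":   "video_metadata_db",
}

# Horizontal splitting (sharding): the same entity, split by an attribute
def user_shard(country_code: str) -> str:
    if country_code in {"US", "CA", "MX"}:
        return "users_na"
    if country_code in {"DE", "FR", "GB", "IT", "ES"}:
        return "users_eu"
    return "users_row"   # rest of world

print(VERTICAL_PARTITIONS["comment"])   # -> comment_db
print(user_shard("FR"))                 # -> users_eu
```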
You can check this post for deeper analysis of scalability issues.
Reference:
- Design a Recommendation System
- Leetcode Discussion
- IJCAI 2013 Tutorial PPT
- 4 Architecture Issues When Scaling Web Applications: Bottlenecks, Database, CPU, IO
Design a movie recommendation system like Netflix?
ML in production
- Deploying ML models is hard :(
- Deploying a model for friends to play with is easy.
- Export trained model, create an endpoint, build a simple app. 30 mins.
- Deploying it reliably is hard. Serving 1000s of requests with ms latency is hard. Keeping it up all the time is hard.
- You only have a few ML models in production
Booking and eBay have 100s of models in prod. Google has 10,000s. An app has multiple features, and each might have one or multiple models for different data slices.
- You can also serve combos of several models outputs like an ensemble.
- If nothing happens, model performance remains the same
- ML models perform best right after training. In prod, ML systems degrade quickly because of concept drift.
- Tip: train models on data generated 6 months ago & test on current data to see how much worse they get (see the sketch at the end of this list).
- You won’t need to update your models as much
- One mind-boggling fact about DevOps: Etsy deploys 50 times/day. Netflix deploys 1000s of times/day. AWS deploys every 11.7 seconds.
- MLOps isn't an exception. For online ML systems, you want to update them as fast as humanly possible.
- Deploying ML systems isn’t just about getting ML systems to the end-users
- It’s about building an infrastructure so the team can be quickly alerted when something goes wrong, figure out what went wrong, test in production, roll-out/rollback updates.
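A minimal sketch of the drift-measurement tip above (train on roughly 6-month-old data, evaluate on a held-out slice of that period and on current data), assuming a scikit-learn-style classifier and two labelled snapshots of your data:

```python
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

def drift_report(model_cls, old_X, old_y, new_X, new_y):
    """Train on data from ~6 months ago, then compare performance on a
    held-out slice of that period vs. on current data; the gap is a
    rough measure of how quickly the model degrades (concept drift)."""
    X_tr, X_old_test, y_tr, y_old_test = train_test_split(
        old_X, old_y, test_size=0.2, random_state=0)
    model = model_cls().fit(X_tr, y_tr)
    auc_then = roc_auc_score(y_old_test, model.predict_proba(X_old_test)[:, 1])
    auc_now = roc_auc_score(new_y, model.predict_proba(new_X)[:, 1])
    return {"auc_then": auc_then, "auc_now": auc_now, "drop": auc_then - auc_now}
```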
It’s fun!
Reference:
Production Machine Learning Monitoring: Outliers, Drift, Explainers & Statistical Performance
ML Performance monitoring
- Prometheus + Grafana
- Seldon Core
- Monitoring Machine Learning Models in Production
Reference:
System design interview question strategy?
- Define the problem
- High level design only
- Tackle it from different angles
- Scale
- Latency
- Response time
- Database management
- …
Formal Way:
- Step 1 — Understand the Goals
- What is the goal of the system?
- Who are the users of the system? What do they need it for? How are they going to use it?
- What are the inputs and outputs of the system?
- Step 2 — Establish the Scope [Ask clarifying questions, such as:]
- Do we want to discuss the end-to-end experience or just the API?
- What clients do we want to support (mobile, web, etc)?
- Do we require authentication? Analytics? Integrating with existing systems?
- Step 3 — Design for the Right Scale
- What is the expected read-to-write ratio?
- How many concurrent requests should we expect?
- What’s the average expected response time?
- What’s the limit of the data we allow users to provide?
- Step 4 — Start High-Level, then Drill-Down
- User interaction
- External API calls
- Offline processes
- Step 5 — Data Structures and Algorithms (DS&A)
- URL shortener? Makes me think of a hashing function.
- Oh, you need it to scale? Sharding might help
- Concurrency?
- Redundancy?
- Generating keys becomes even more complicated.
- Step 6 — Tradeoffs
- What type of database would you use and why?
- What caching solutions are out there? Which would you choose and why?
- What frameworks can we use as infrastructure in your ecosystem of choice?
Technologies to focus on
- Horizontal Scaling
- Vertical Scaling
- Intelligent Caching
- Database:
    - NoSQL
        - Key-Value Pair Based: key-value stores help the developer store schema-less data. They work best for `shopping cart contents`. Redis, `Dynamo` and `Riak` are some examples of key-value store databases. They are all based on Amazon's Dynamo paper.
        - Column-Based: column-based NoSQL databases are widely used to manage `data warehouses`, business intelligence, CRM and library card catalogs. Examples: HBase, `Cassandra`, Hypertable.
        - Document-Oriented: the document type is mostly used for CMS systems, `blogging platforms`, real-time analytics and `e-commerce applications`. Examples: Amazon SimpleDB, CouchDB, `MongoDB`, Riak, Lotus Notes.
        - Graph-Based: graph databases are mostly used for `social networks`, logistics and `spatial data`. Examples: Neo4J, Infinite Graph, `OrientDB`, FlockDB.
Algorithms to focus on:
- Ranking Algorithm
- Searching Algorithm
- Similarity Score
- Recommendation Algo
Resource:
What is CDN - Content Delivery Network?
A content delivery network, or content distribution network (CDN), is a geographically distributed network of proxy servers and their data centers. The goal is to provide high availability and performance by distributing the service spatially relative to end users.
CDNs are a layer in the internet ecosystem. Content owners such as media companies and e-commerce vendors pay CDN operators to deliver their content to their end users. In turn, a CDN pays Internet service providers (ISPs), carriers, and network operators for hosting its servers in their data centers.
CDN is an umbrella term spanning different types of content delivery services: video streaming, software downloads, web and mobile content acceleration, licensed/managed CDN, transparent caching, and services to measure CDN performance, load balancing, Multi CDN switching and analytics and cloud intelligence. CDN vendors may cross over into other industries like security, with DDoS protection and web application firewalls (WAF), and WAN optimization.
Reference:
What is Bloom Filter?
A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set.
- False positive matches are possible
- False negatives are not
A query returns either `possibly in set` or `definitely not in set`. Elements can be added to the set, but not removed (though this can be addressed with the counting Bloom filter variant); the more items added, the larger the probability of false positives.
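A minimal Bloom filter sketch (the bit-array size and hash count are arbitrary here; real implementations derive them from the expected number of items and the target false-positive rate):

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # use a big int as the bit array

    def _positions(self, item: str):
        # derive k independent positions from salted SHA-256 digests
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.num_bits

    def add(self, item: str):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item: str) -> bool:
        # False => definitely not in set; True => possibly in set
        return all(self.bits >> pos & 1 for pos in self._positions(item))

bf = BloomFilter()
bf.add("alice")
print(bf.might_contain("alice"))  # True (possibly in set)
print(bf.might_contain("bob"))    # almost certainly False
```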
Reference:
How to design Twitter?
Define Problem:
- Data modeling
    - If we want to use a relational database like `MySQL`, we can define a `user object` and a `feed object`. Two relations are also necessary: users can follow each other, and each feed has a user owner.
- How to serve feeds.
- The most straightforward way is to fetch feeds from all the people you follow and render them by time.
Follow Up question
- When users followed a lot of people, fetching and rendering all their feeds can be costly. How to improve this?
    - There are many approaches. Since Twitter has the infinite scroll feature, especially on mobile, each time we only need to `fetch the most recent N` feeds instead of all of them. There will then be many details about how the `pagination` should be implemented.
    - Use a `cache` to store the most recent feeds to reduce fetching time.
- How to detect fake users?
    - This can be related to machine learning. One way to do it is to identify several relevant features like `registration date`, the `number of followers`, the `number of feeds`, etc., and build a machine learning model to detect whether a user is fake.
    - Check for patterns in how a regular user's number of followers and number of feeds grows over time. For a regular user the growth is gradual and monotonic, but a fake user typically gains lots of followers and content in a short span [excluding true celebrities, who may get a million followers on day one].
Can we order the feed by other algorithms? Relevance and recency ranking
There has been a lot of debate about this topic over the past few weeks. If we want to order the feed based on users' interests, how do we design the algorithm?
Facebook's feed, for example, is ranked by relevance.
Relevance Ranking: relevancy ranking is the method used to order the results list in such a way that the records most likely to be of interest to a user will be at the top. This makes searching easier for users, as they won't have to spend as much time looking through records for the information that interests them. A good ranking algorithm will put the information most relevant to a user's query at the beginning of the returned results.
How do `relevance ranking` algorithms work?
Some factors/features:
- The number of times the search term occurs within a given record.
- The number of times the search term occurs across the collection of records.
- The number of words within a record.
- The frequencies of words within a record.
- The number of records in the index.
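These factors are essentially what a TF-IDF style score captures; a minimal sketch with toy records and a toy query:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

records = [
    "machine learning system design interview",
    "how to cook pasta at home",
    "system design for recommendation systems",
]
query = "system design"

vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(records)
query_vector = vectorizer.transform([query])

# rank records by relevance to the query (higher score = more relevant)
scores = cosine_similarity(query_vector, doc_vectors).ravel()
ranking = scores.argsort()[::-1]
print([(records[i], round(scores[i], 3)) for i in ranking])
```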
How do `recency ranking` algorithms work?
According to Instagram, back when feeds were organized in `reverse-chronological order`, i.e. using recency, Instagram estimates people missed 50 percent of those important posts, and 70 percent of their feed overall.
Resource
TODO: How to design a Search Engine ?
How the Instagram algorithm works in 2019?
Instagram’s primary goal is to maximize the time users spend on the platform. Because the longer users linger, the more ads they see. So directly or indirectly, accounts that help Instagram achieve that goal are rewarded.
How the algorithm uses `ranking signals` to decide how to arrange each individual user's feed:
- Relationship
- Instagram's algorithm prioritizes content from accounts that users interact with a lot (commenting on each other's posts, DMing each other, tagging each other in posts)
- Interest:
- Algorithm also predicts which posts are important to users based on their past behaviour. Potentially includes the use of machine vision (a.k.a. image recognition) technology to assess the content of a photo.
- Timeliness (Recency)
- For brands, the timeliness (or “recency”) ranking signal means that paying attention to your audience’s behaviour, and posting when they’re online, is key.
Resource:
How Does the YouTube Algorithm Work? A Guide to Getting More Views
Features for the algorithm:
- what people watch or don’t watch (a.k.a. impressions vs plays)
- how much time people spend watching your video (watch time, or retention)
- how quickly a video’s popularity snowballs, or doesn’t (view velocity, rate of growth)
- how new a video is (new videos may get extra attention in order to give them a chance to snowball)
- how often a channel uploads new video
- how much time people spend on the platform (session time)
- likes, dislikes, shares (engagement)
- ‘not interested’ feedback (ouch)
Resource:
How to succeed in a system design interview?
Recommendation system for Duolingo [@chiphuyen]
Question: Duolingo is a platform for language learning. When a student is learning a new language, Duolingo wants to recommend increasingly difficult stories to read.
- How would you measure the difficulty level of a story?
- Given a story, how would you edit it to make it easier or more difficult?
Answer:
Prologue: this problem can be mapped to predicting `text readability`.
The RAND Reading Study Group (2002:25), a 14-member panel funded by the United States Department of Education's Office of Educational Research and Improvement, proposed the following categories and dimensions that vary among texts and create varying challenges for readers:
- discourse genre, such as narration, description, exposition, and persuasion;
- discourse structure, including rhetorical composition and coherence;
- media forms, such as textbooks, multimedia, advertisements, and the Internet;
- Sentence difficulty, including vocabulary, syntax, and the propositional text base;
- content, such as age-appropriate selection of subject matter;
- texts with varying degrees of engagement for particular classes of readers.
A text can introduce different levels of complexity.
Lexical and syntactic complexity
The best estimate of a text’s difficulty involved the use of eight elements:
- Number of different hard words
- Number of easy words
- Percentage of monosyllables
- Number of personal pronouns
- Average sentence length in words
- Percentage of different words
- Number of prepositional phrases
- Percentage of simple sentences
These are all structural elements in the style group, as they "lend themselves most readily to quantitative enumeration and statistical treatment".
Content and subject matter
In terms of content and subject matter, it is commonly believed that abstract texts (e.g., philosophical texts) will be harder to understand than concrete texts describing real objects, events or activities (e.g., stories), and texts on everyday topics are likely to be easier to process than those that are not (Alderson 2000:62).
How to measure Text Readability?
To measure text difficulty, reading researchers have tended to focus on developing `readability formulas` since the early 1920s. A readability formula is an equation which combines the statistically measurable text features that best predict text difficulty, such as:
- average sentence length in words or in syllables,
- average word length in characters
- percentage of difficult words (i.e., words with more than two syllables, or words not on a particular wordlist)
Until the 1980s, more than 200 readability formulas had been published (Klare 1984). Among them, the `Flesch Reading Ease Formula`, the `Dale–Chall Formula`, the `Gunning Fog Index`, the `SMOG Formula`, the `Flesch–Kincaid Readability Test` and the `Fry Readability Formula` are the most popular and influential (DuBay 2004). These formulas use one to three factors with a view to easy manual application.
Among these factors, vocabulary difficulty (or semantic factors) and sentence length (or syntactic factors) are the strongest indexes of readability (Chall and Dale 1995). The following is the `Flesch Reading Ease` formula:
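$$\text{Reading Ease} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)$$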
The resulting score ranges from 0 to 100; the lower the score, the more difficult the material is to read.
Final Solution:
Q1. How would you measure the difficulty level of a story?
- Given a story, process it to get all the features related to Syntactic or Structural complexities as mentioned above.
- Then for all the stories $S_i$ we have such a feature vector $f_i$.
- Apply clustering technique on all the data points over the feature space.
- Now, for each data point inside a cluster, measure its `readability score` as per the formula mentioned above and rank the stories inside each cluster by sorting.
- Also calculate the `mean readability score` for each cluster and rank the clusters (again by sorting).
- Now pick the cluster with the minimum mean readability score, randomly pick $k$ stories, and recommend them to the user one after another. After the user finishes the $k$ stories from cluster $C_i$, pick the next tougher cluster $C_{i+1}$ and pick $k$ stories again.
- Also keep a liked/disliked or easy/medium/hard checkbox for each story to get user feedback after they finish the story. This feedback can be fed back into the loop and combined with the existing recommendation system.
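A minimal sketch of Q1's pipeline: compute a few structural features per story, cluster them, then rank the clusters by a crude difficulty proxy (the features below only approximate the structural elements listed earlier, and the stories are toy examples):

```python
import re
import numpy as np
from sklearn.cluster import KMeans

def story_features(text: str) -> list:
    words = re.findall(r"[A-Za-z']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_sentence_len = len(words) / max(len(sentences), 1)
    avg_word_len = sum(map(len, words)) / max(len(words), 1)
    pct_long_words = sum(len(w) > 6 for w in words) / max(len(words), 1)
    return [avg_sentence_len, avg_word_len, pct_long_words]

stories = [
    "The cat sat. The dog ran. They played all day.",
    "Notwithstanding considerable uncertainty, the committee deliberated extensively.",
    "I like tea. Tea is warm. It is nice.",
    "Epistemological frameworks necessitate rigorous methodological scrutiny.",
]
X = np.array([story_features(s) for s in stories])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
# rank clusters by mean average-sentence-length (a crude difficulty proxy)
difficulty_order = np.argsort([X[kmeans.labels_ == c, 0].mean() for c in range(2)])
print(kmeans.labels_, "easiest cluster first:", difficulty_order)
```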
Q2. Given a story, how would you edit it to make it easier or more difficult?
- Modify the structural complexities.
- Refactor the content to introduce more simple sentences, easier synonyms, more monosyllabic words, etc.
Resource:
Fraud Detection [@chiphuyen]
Given a dataset of credit card purchase information, where each record is labelled as fraudulent or safe, how would you build a fraud detection algorithm?
Decision Tree based approach:
First, in the decision tree, we will check whether the transaction amount is greater than ₹50,000. If it is 'yes', then we will check the location where the transaction was made.
And if it is ‘no,’ then we will check the frequency of the transaction.
After that, as per the probabilities calculated for these conditions, we will predict the transaction as ‘fraud’ or ‘non-fraud.’
Here, if the amount is greater than ₹50,000 and location is equal to the IP address of the customer, then there is only a 25 percent chance of ‘fraud’ and a 75 percent chance of ‘non-fraud.’
Similarly, if the amount is greater than ₹50,000 and the number of locations is greater than 1, then there is a 75 percent chance of ‘fraud’ and a 25 percent chance of ‘non-fraud.’
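A sketch of this decision-tree idea on toy transactions (the features, thresholds and labels are illustrative and only mirror the ₹50,000 / location-mismatch example above):

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# features: [amount_inr, location_matches_customer_ip (1/0), txns_last_hour]
X = np.array([
    [2000, 1, 1], [80000, 1, 1], [60000, 0, 4],
    [500, 1, 2],  [95000, 0, 6], [70000, 0, 5],
    [1500, 1, 1], [55000, 1, 2],
])
y = np.array([0, 0, 1, 0, 1, 1, 0, 0])  # 1 = fraud

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree, feature_names=["amount", "loc_match", "freq"]))
print(tree.predict_proba([[65000, 0, 3]]))  # [P(non-fraud), P(fraud)]
```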
Main challenges involved in credit card fraud detection are:
- Enormous data is processed every day, and the model built must be fast enough to respond to the scam in time.
- Imbalanced data: most transactions (99.8%) are not fraudulent, which makes it really hard to detect the fraudulent ones.
- Data availability as the data is mostly private.
- Misclassified Data can be another major issue, as not every fraudulent transaction is caught and reported.
- Adaptive techniques used against the model by the scammers.
Reference:
- Class Imbalance Python
- Credit-Card-Fraud-Detection-in-python-using-scikit-learn
- practical-guide-deal-imbalanced-classification-problems
- fraud-detection-machine-learning-algorithms
- Check location of an IP address
Build a recommendation system to suggest replacement items [@chiphuyen]
Question: You run an e-commerce website. Sometimes, users want to buy an item that is no longer available. Build a recommendation system to suggest replacement items.
- Do a nearest neighbor search for all the items that are similar to the original item and recommend those as replacements.
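A minimal sketch of that nearest-neighbor lookup (the item embeddings here are random placeholders; in practice they would come from item metadata, text descriptions or purchase co-occurrence):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
item_ids = [f"item_{i}" for i in range(100)]
item_embeddings = rng.normal(size=(100, 16))   # placeholder item vectors

index = NearestNeighbors(n_neighbors=6, metric="cosine").fit(item_embeddings)

def suggest_replacements(item_idx: int, k: int = 5):
    _, neighbors = index.kneighbors(item_embeddings[item_idx:item_idx + 1])
    # drop the item itself (it is always its own nearest neighbor)
    return [item_ids[j] for j in neighbors[0] if j != item_idx][:k]

print(suggest_replacements(7))
```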
Twitter follower recommendation
Question: For any user on Twitter, how would you suggest who they should follow? What do you do when that user is new? What are some of the limitations of data-driven recommender systems?
- Search for accounts whose bio is similar to the user's bio and suggest them.
How would you design an algorithm to match pool riders for Lyft or Uber?
Design Decisions
In the initial design of Line,
- passengers would enter in their origin and destination,
- receive a fare for the ride
- be put in the matching pool for 1 minute before being assigned a driver.
- If we didn't find a good match at the end of that minute, we'd still dispatch a driver to the passenger.
Naive Matching
- Haversine matchmaking system:
    - Haversine distances $d_{hvrsn}(A,B)$ are straight-line distances between two points A and B, multiplied by the surrounding region's ($region(A,B)$) average speed $\mu_v$ to get a time estimate $t_{AB}$.
- A greedy system is one in which we take the first match we find that satisfies our constraints, as opposed to the best one for the entire system.
- We denoted passengers as letters of the alphabet and every passenger has two stops — a pickup and drop-off — A and A’, B and B’, etc. So when comparing passenger A and B, we looked at 24 potential orderings: ABB’A’, ABA’B’, B’A’BA, B’AA’B, B’ABA’, BAA’B’, AA’B’B, B’BAA’, etc. We were able to reduce the number of permutations down to only 4 given that there would never be a drop-off before a pickup, and an ordering such as AA’BB’ had no overlap and thus wasn’t worth considering. We would look at all four permutations and eliminate ones that didn’t satisfy all of our constraints. We would then choose the most optimal ordering.
- Optimal Ordering: the most optimal ordering is the one in which the `total matched distance is minimized`. For example, if an ABBA route had a total distance of 3.5 miles but an ABAB route had a total distance of 3.2 miles, we would select the ABAB route.
def make_matches(all_rides):
    # Greedy O(n^2) matchmaking: compare every pending ride with every other.
    for r1 in all_rides:
        for r2 in all_rides:
            if r1 is r2:
                continue  # don't try to match a ride with itself
            orderings = []
            # Enumerate the valid pickup/drop-off permutations (e.g. ABB'A', ABA'B')
            for ordering in get_permutations(r1, r2):
                # Keep only orderings that satisfy the detour/pickup-time constraints
                if is_good_match(r1, r2, ordering):
                    orderings.append(ordering)
            # Pick the ordering with the smallest total matched distance
            best_ordering = get_best_ordering(r1, r2, orderings)
            if best_ordering:
                make_match(r1, r2, best_ordering)
# etc ...
Improvement
- We needed to get away from using haversine estimates as they were just too inaccurate:
- Haversine algorithm would probably be matching passengers on opposite sides of the mountain.
- We considered `building a routing graph` and using the A* algorithm, similar to the Open Source Routing Machine (OSRM) and something we had done for our pricing estimates, but we knew it wouldn't scale in our $O(n^2)$ algorithm without an investment in offline computational techniques like contraction hierarchies and a lot of work on building a scalable system.
GeoHash based model
Geohash is a public domain geocode system which:
- Encodes a geographic location into a short string of letters and digits.
- Geohashing is a technique for bucketing latitude and longitude coordinates
- It is a hierarchical spatial data structure which subdivides space into buckets of grid shape
- Geohashes offer properties like arbitrary precision and the possibility of gradually removing characters from the end of the code to reduce its size (and gradually lose precision). As a consequence of the gradual precision degradation, nearby places will often (but not always) present similar prefixes. The longer a shared prefix is, the closer the two places are.
Using historical data from past Lyft rides, we could record the average speed $\mu_v$ of our rides from one geohash $h_{geo}(A)$ to another $h_{geo}(B)$ and store that in a simple hash-table lookup. Then, when calculating estimates, we would multiply the haversine distance $d_{hvrsn}(A,B)$ between those two points with this speed to figure out a time estimate $t_{AB}$
- We added another nested hash table for `each hour` of the week between origin and destination, which reduced our inaccuracies around rush hour.
- This approach also became more accurate as we collected more data, since we could break our model down into smaller geohash sizes.
Efficiency improvements
Triple matching (ABCBCA or even ABACBDCD) adds up to a total of $1,776$ permutations. This meant we had to quickly scale the efficiency of our system to handle this load.
- Longitudinal Sorting: when considering pairing A and B together, there was some maximum distance they could be apart from each other. So by sorting the outer and inner loops by longitude, we could short-circuit out of the inner loop once we had passed this maximum distance.
The Road to Becoming Less Greedy
- Our greedy algorithm was built to find `a match`, and we made the first match that came along. We instead had to find the `best possible match`. This included considering all riders in our system in addition to predicting future demand.
- For those interested in algorithms, this became something of a maximum matching problem on a `weighted graph`, combined with elements of the Secretary Problem. The optimal solution would combine an accurate prediction of future demand with an algorithm that optimized over all possible matches before making any one.
Constraints
- For a match to be made, the total detour that match added for each passenger would have to be below an absolute threshold, but would also have to be below a proportional threshold.
- This makes sense, as one can imagine a 5 minute detour is much more tolerable on a 30 minute ride than on a 5 minute ride.
- We had similar constraints for additional time until passengers were picked up.
- We started learning that passengers didn’t want to go backwards, and in fact the angle of the lines they saw on the map would affect their satisfaction with the match.
- Users would often rather have a 10 minute detour with no backtracking than a 5 minute detour with backtracking.
- We also learned that people would rather have a short pick up time and long detour than vice versa
- One of our constraints for making a good match was the time it took for a passenger to be picked up. If it took 15 minutes to get picked up, it didn’t matter how fast the rest of the route was — passengers wouldn’t consider that an acceptable route. This meant that when considering pairing A and B together, there was some maximum distance they could be apart from each other.
Challenges
- Finding drivers for all of our passengers at the same time would add a supply shock to our system as we’d need to have a pool of drivers available on the ten minute mark
- The initial implementation compared every passenger with every other passenger in the system $O(n^2)$, in all possible orderings.
Resource
Meta-knowledge
Maximum Matching
In the mathematical discipline of graph theory, a matching or independent edge set in a graph is a set of edges without common vertices. Finding a matching in a bipartite graph can be treated as a network flow problem. A maximum matching (also known as maximum-cardinality matching) is a matching that contains the largest possible number of edges. There may be many maximum matchings.
Secretary Problem
The secretary problem is a problem that demonstrates a scenario involving `optimal stopping theory`.
The basic form of the problem is the following: imagine an administrator who wants to hire the best secretary out of $n$ rankable applicants for a position. The applicants are interviewed one by one in random order. A decision about each particular applicant is to be made immediately after the interview. Once rejected, an applicant cannot be recalled. During the interview, the administrator gains information sufficient to rank the applicant among all applicants interviewed so far, but is unaware of the quality of yet unseen applicants. The question is about the optimal strategy (stopping rule) to maximize the probability of selecting the best applicant. If the decision can be deferred to the end, this can be solved by the simple maximum selection algorithm of tracking the running maximum (and who achieved it), and selecting the overall maximum at the end. The difficulty is that the decision must be made immediately.
A* Algorithm
A* is an `informed search algorithm`, or a `best-first search`, meaning that it is formulated in terms of weighted graphs: starting from a specific start node of a graph, it aims to find a path to the given goal node having the `smallest cost` (least distance travelled, shortest time, etc.).
It does this by maintaining a tree of paths originating at the start node and extending those paths one edge at a time until its termination criterion is satisfied.
At each iteration of its main loop, A* needs to determine which of its paths to extend. It does so based on the cost of the path and an estimate of the cost required to extend the path all the way to the goal. Specifically, A* selects the path that minimizes

$$f(n) = g(n) + h(n)$$

where
- $n$ is the next node on the path
- $g(n)$ is the cost of the path from the start node to $n$
- $h(n)$ is a heuristic function that estimates the cost of the cheapest path from n to the goal
Contraction Hierarchies
In computer science, the method of contraction hierarchies is a speed-up technique for finding the shortest-path in a graph. The most intuitive applications are car-navigation systems: A user wants to drive from A to B using the quickest possible route. The metric optimized here is the travel time. Intersections are represented by vertices, the street sections connecting them by edges. The edge weights represent the time it takes to drive along this segment of the street.
A path from A to B is a sequence of edges (streets); the shortest path is the one with the minimal sum of edge weights among all possible paths. The shortest path in a graph can be computed using Dijkstra’s algorithm; but given that road networks consist of tens of millions of vertices, this is impractical.
Contraction hierarchies is a speed-up method optimized to exploit properties of graphs representing road networks. The speed-up is achieved by creating shortcuts in a preprocessing phase which are then used during a shortest-path query to skip over unimportant vertices
. This is based on the observation that road networks are highly hierarchical. Some intersections, for example highway junctions, are “more important” and higher up in the hierarchy than for example a junction leading into a dead end. Shortcuts can be used to save the precomputed distance between two important junctions such that the algorithm doesn’t have to consider the full path between these junctions at query time. Contraction hierarchies do not know about which roads humans consider “important” (e.g. highways), but they are provided with the graph as input and are able to assign importance to vertices using heuristics.
Trigger/Wake word detection, e.g, ‘ok, google!’ [@chiphuyen]
Question: How would you build a trigger word detection algorithm to spot the word “activate” in a 10 second long audio clip?
Click Through Rate (CTR) Prediction
Design dynamic pricing
- Traditional price management methods almost never achieve optimal pricing because they are designed for traditional environments, where the frequency of price changes is inherently limited (e.g., brick-and-mortar stores), and the complexity of pricing models is constrained by the capabilities of off-the-shelf tools and manual processes.
- Dynamic pricing algorithms help to increase the quality of pricing decisions in e-commerce environments by leveraging the ability to change prices frequently and collect the feedback data in real time.
Overview
Traditional price optimization requires knowing or estimating the dependency between the `price` and `demand`. Assuming that this dependency is known (at least within a certain time interval), the `revenue-optimal price` can be found by employing the following equation:

$$p^* = \underset{p}{\operatorname{argmax}} \; p \cdot d(p)$$

where $p$ is the price and $d(p)$ is the demand function.
The traditional price management process assumes that the demand function is estimated from the historical sales data, that is, by doing some sort of regression analysis for observed pairs of prices and corresponding demands $(p_i,d_i)$.
- Since the price-demand relationship changes over time, the traditional process typically re-estimates the demand function on a regular basis.
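A sketch of that traditional process: fit a demand curve to observed (price, demand) pairs by regression, then pick the price maximizing $p \cdot d(p)$ (the linear demand form and the numbers are illustrative assumptions):

```python
import numpy as np

# observed (price, demand) pairs from historical sales
prices = np.array([10.0, 12.0, 15.0, 18.0, 20.0])
demand = np.array([200., 180., 140., 100., 80.])

# estimate a linear demand function d(p) = a + b*p by least squares
b, a = np.polyfit(prices, demand, deg=1)

candidate_prices = np.linspace(5, 30, 251)
revenue = candidate_prices * (a + b * candidate_prices)   # p * d(p)
p_star = candidate_prices[np.argmax(revenue)]
print("revenue-optimal price:", round(p_star, 2), "revenue:", round(revenue.max(), 1))
```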
Challenge
The fundamental limitation of this approach is that it passively learns the demand function without actively exploring the dependency between the price and demand. This may or may not be a problem depending on how dynamic the environment is:
- If the product life cycle is relatively long and the demand function changes relatively slowly, the passive learning approach combined with organic price changes can be efficient, as the price it sets will be close to the true optimal price most of the time.
- If the product life cycle is relatively short or the demand function changes rapidly, the difference between the price produced by the algorithm and the true optimal price can become significant, and so will the lost revenue. In practice, this difference is substantial for many online retailers, and critical for retailers and sellers that extensively rely on short-time offers or flash sales (Groupon, Rue La La, etc.).
Constraints
This is a classical `exploration-exploitation` problem (see the sketch after this list):
- Minimize the time spent on testing different price levels and collecting the corresponding demand points to accurately estimate the demand curve
- Maximize the time used to sell at the optimal price calculated based on the estimate
- Optimize the exploration-exploitation trade-off given that the seller does not know the demand function in advance
- Provide the ability to limit the number of price changes during the product life cycle.
- Provide the ability to specify valid price levels and price combinations. Most retailers restrict themselves to a certain set of price points (e.g., $25.90, $29.90, …, $55.90), and the optimization process has to support this constraint.
- Enable the optimization of prices under inventory constraints, or given dependencies between products.
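One common way to handle the exploration-exploitation trade-off is to treat the allowed price points as arms of a multi-armed bandit. Below is a minimal Thompson-sampling-style sketch; the price grid, priors and simulated demand are all illustrative assumptions, and it ignores the inventory and price-change-limit constraints listed above:

```python
import numpy as np

rng = np.random.default_rng(0)
price_points = np.array([25.90, 29.90, 34.90, 39.90])  # allowed price levels
true_buy_prob = np.array([0.30, 0.24, 0.16, 0.09])     # unknown to the seller

# Beta(1, 1) prior on the conversion probability of each price point
alpha = np.ones(len(price_points))
beta = np.ones(len(price_points))

for _ in range(5000):                                # each iteration ~ one customer
    sampled = rng.beta(alpha, beta)                  # sample a plausible demand curve
    arm = int(np.argmax(price_points * sampled))     # price with best sampled revenue
    sale = rng.random() < true_buy_prob[arm]         # observe whether the customer buys
    alpha[arm] += sale
    beta[arm] += 1 - sale

posterior_mean = alpha / (alpha + beta)
best = int(np.argmax(price_points * posterior_mean))
print("estimated revenue-optimal price:", price_points[best])
```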
Reference