In this video, we design an end-to-end Machine Learning system for YouTube video search using embedding-based retrieval. We go step by step and build the system exactly how it is done in real-world large-scale ML systems, without relying on traditional keyword search like Elasticsearch.

🔍 What you’ll learn in this video:
1) How YouTube-like search works using semantic embeddings
2) Query embeddings vs video embeddings
3) Using title + description + video frames to represent videos
4) Contrastive learning (InfoNCE loss) for training retrieval models
5) Offline vs online evaluation (MRR, NDCG@K, CTR, Watch Time)
6) Complete serving pipeline with projection heads and embedding consistency
7) Real-world ML system design trade-offs

🧠 Who is this video for?
1) ML Engineers preparing for system design interviews
2) Engineers learning recommendation systems & search
3) Anyone interested in how YouTube search works internally
4) Beginners transitioning into applied ML / GenAI roles

Timestamps
00:00 Introduction
01:17 Objective
03:08 Input & Output
04:00 Data Engineering
05:28 Feature Engineering
17:29 Model Development
27:25 Evaluation
37:50 Serving

👍 Like, Share & Subscribe for more ML System Design content!