Building AI-powered workflows is only the first step; making them secure, reliable, and scalable is where most developers run into challenges. The AI Gateway in Azure API Management provides enterprise-grade tools to expose AI workloads safely and efficiently.

In this session, you’ll learn how to manage token quotas, semantic caching, safety policies, and authentication, so that your AI services perform reliably under load while staying secure. We’ll demo how to wrap AI services in API Management, apply policies for rate limiting, monitoring, and cost control, and optimize AI workload performance in production.

By the end, you’ll have practical patterns and examples for turning AI capabilities into secure, production-ready APIs that your teams can confidently consume.

📌 This session is part of a series! Learn more here: https://aka.ms/AIS/series

Chapters:
00:06 – Welcome & Housekeeping
01:00 – Introducing the Speakers & Session Overview
01:54 – Challenges in Securing AI Workloads
03:05 – Why AI Needs an API Gateway
04:20 – The AI Gateway Pattern Explained
05:02 – Key Features: Security, Token Limits & Content Safety
06:07 – Load Balancing & Session Awareness
06:34 – Semantic Caching for Cost & Latency Optimization
07:03 – Supporting OpenAI-Compatible & Third-Party Models
07:58 – Developer Velocity & Observability Features
08:42 – Demo: Deploying Models in Azure AI Foundry
10:21 – Demo: Serving Models via Azure API Management
13:00 – Setting Token Limits & Logging
15:02 – Testing Rate Limits with Python SDK
17:01 – Semantic Caching in Action
20:06 – Logging Prompts & Completions
23:07 – Cost Monitoring & FinOps Dashboards
26:04 – Continuous Model Evaluation with Azure AI Foundry
30:04 – Content Safety Integration with Azure API Management
33:52 – Demo: Blocking Unsafe Prompts & Jailbreak Attacks
36:00 – Agentic Workloads & MCP (Model Context Protocol)
38:00 – Demo: Building Agents with Microsoft Agent Framework
41:00 – Tracing Agent Behavior in Azure AI Foundry
43:00 – Summary & Key Takeaways
45:00 – Upcoming Session on MCP Deep Dive

#MicrosoftReactor #learnconnectbuild

[eventID:26309]