AI Podcast

Kirill Solodskikh

Released: 2025-01-15

Free 0%

1 Episode

Audio

#23 in Top Podcasts > Science > Mathematics

Most Recent Episode

AI Podcast: Quantization of Neural Networks, Part 1. Introduction, Definitions, Examples.

Time: 19:08

Quantization is a powerful technique for reducing memory usage and speeding up AI applications built with LLMs, diffusion models, CNNs, and other architectures. In fact, quantization is fundamental to much of data compression, from JPEG and GIF to MP3 and MP4 (HEVC)! In this episode, we'll cover the basics of neural network quantization, laying the groundwork for future episodes where we'll dive into specific quantization algorithms.

The AI Podcast is hosted by Kirill, CEO of TheStage AI. Drawing on deep scientific and industrial expertise in neural network acceleration and deployment, he and his team will show you how to run AI anywhere and everywhere!

OUTLINE:

00:00 - Jingle!

01:24 - Structure of Podcast

01:46 - When and How to Use Quantization?

03:11 - Speed up or reduce memory? Or both?

04:18 - Hardware with quantization support

05:28 - DNN compilers to run quantized networks

06:01 - What is quantization mathematically?

07:22 - Fake Quantized Tensors

08:43 - Symmetric, asymmetric, per-tensor, per-channel, per-group

09:43 - Quantized matrix multiplication

11:31 - Quantization algorithms

13:23 - Examples of PTQ and QAT

16:11 - Quantized parameters do not exist in a discrete space! Is it a manifold?

18:08 - Details of the next episode!
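
For readers who want a concrete picture of the terms in the outline, here is a minimal sketch of symmetric, per-tensor quantization and the "fake quantized" round trip, written in plain Python. It is not taken from the episode: the function names, the 8-bit default, and the sample weights are all assumptions chosen for illustration.

```python
def quantize_symmetric(values, num_bits=8):
    """Symmetric per-tensor quantization (illustrative sketch).

    One scale is shared by the whole tensor and the zero point is fixed at 0,
    so floats map to signed integers in [-qmax, qmax].
    """
    qmax = 2 ** (num_bits - 1) - 1              # 127 when num_bits == 8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level, then clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to floats."""
    return [qi * scale for qi in q]


def fake_quantize(values, num_bits=8):
    """Quantize then immediately dequantize: the result is still a list of
    floats, but each value sits on the quantization grid."""
    q, scale = quantize_symmetric(values, num_bits)
    return dequantize(q, scale)


# Hypothetical weights, used only to exercise the functions above.
weights = [0.5, -1.27, 0.031, 1.0]
q, scale = quantize_symmetric(weights)
restored = fake_quantize(weights)
```

In this sketch the reconstruction error of each value is at most half the scale, which is the usual rounding bound for a uniform grid. Asymmetric, per-channel, and per-group variants differ mainly in how many scales (and zero points) are stored and which slice of the tensor each one covers.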

Episode ID: 1000684836484

GUID: TheStageAi.podbean.com/57e5be5c-c365-3780-9b9b-b4c468514f92

Release Date: 15/01/2025, 00:51:10

Description

An educational AI podcast from the CEO of TheStage AI. We will learn the mathematics and engineering behind efficient model deployment.

Feed URL

https://feed.podbean.com/TheStageAi/feed.xml
