What can Storage do for AI?

Tue Sep 17 | 2:00pm
Location: Cypress
Abstract

With the increased business value that AI-enabled applications can unlock, there is a need to support Gen AI models at varying degrees of scale, from foundation model training in data centers to inference deployment on edge and mobile devices. Flash storage, and PCIe/NVMe storage in particular, can play an important role in enabling this thanks to its density and cost benefits. Enabling NVMe offload for Gen AI requires a combination of careful ML model design and effective deployment on a memory-flash storage tier. Using inference as an example with the Microsoft DeepSpeed library, we highlight the benefits of NVMe offload and call out specific optimizations and improvements that NVMe storage can target to deliver improved LLM inference metrics.
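
A minimal sketch of what NVMe offload can look like in practice, using DeepSpeed's ZeRO-Inference path (ZeRO stage 3 with parameters offloaded to flash). The model checkpoint, NVMe mount path, and aio tuning values below are illustrative assumptions, not settings from the session, and the sketch assumes a DeepSpeed build with the async_io (libaio) op available:

```python
# Sketch: ZeRO-Inference with parameter offload to an NVMe SSD.
# Checkpoint, nvme_path, and aio values are placeholders, not tuned settings.
import deepspeed
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ds_config = {
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 3,                       # partition parameters (ZeRO-3)
        "offload_param": {
            "device": "nvme",             # spill model weights to flash
            "nvme_path": "/local_nvme",   # assumed NVMe mount point
            "pin_memory": True,
        },
    },
    "aio": {                              # async I/O knobs for the NVMe tier
        "block_size": 1048576,
        "queue_depth": 8,
        "thread_count": 1,
        "single_submit": False,
        "overlap_events": True,
    },
    "train_micro_batch_size_per_gpu": 1,  # required key even for inference
}

model = AutoModelForCausalLM.from_pretrained("facebook/opt-1.3b")
tokenizer = AutoTokenizer.from_pretrained("facebook/opt-1.3b")

# The engine pages parameters in from NVMe on demand during the forward pass.
engine = deepspeed.initialize(model=model, config=ds_config)[0]
engine.module.eval()

inputs = tokenizer("What can storage do for AI?", return_tensors="pt").to(engine.device)
with torch.no_grad():
    output = engine.module.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Run with the DeepSpeed launcher (e.g. `deepspeed script.py`) so distributed state is initialized. The aio block is where request size, queue depth, and submission overlap meet the storage device, which is where the device-level optimizations the abstract alludes to would surface in end-to-end LLM inference metrics.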

Learning Objectives

Recognize the need to democratize training and inference at scale (a back-of-envelope sizing sketch follows this list)
Understand what enabling NVMe offload of LLMs requires
Be aware of opportunities for NVMe flash to enable improved LLM inference performance
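
To make the first objective concrete, a rough back-of-envelope sizing; the model size and HBM capacity are illustrative assumptions, not figures from the session:

```python
# Illustrative sizing: FP16 weights of an assumed 70B-parameter LLM
# versus a single accelerator with 80 GB of HBM.
params = 70e9            # assumed parameter count, for illustration only
bytes_per_param = 2      # FP16
weights_gb = params * bytes_per_param / 1e9
print(f"weights alone: {weights_gb:.0f} GB")  # 140 GB, well above 80 GB of HBM
```

Weights alone overflow a single device before activations or the KV cache are counted, which is the gap a memory-flash tier is positioned to fill.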

---

Suresh Rajgopal
Micron Technology