The performance gap between compute and storage has always been considerable, and the mismatch between what applications need from storage and what large-scale deployments deliver keeps widening, especially in deployments serving AI workloads. The storage requirements span nearly every dimension, i.e., capacity/$, availability, reliability, IOPS, throughput, security, etc. Moreover, these requirements are highly dynamic across the different phases of AI pipelines. For instance, emerging storage-centric AI applications such as vector databases and retrieval-augmented generation (RAG) have performance, capacity, and throughput requirements distinct from those of the most widely discussed workloads, i.e., AI training and inference. The main challenge common to all of them, however, is to reduce data movement across the network, i.e., from storage to compute and vice versa, all while balancing foreground and background IO processing.
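As a rough illustration of how divergent these IO profiles can be, the following minimal Python sketch (not from the talk material; all file sizes, block sizes, and counts are arbitrary assumptions) contrasts large sequential reads, as a training data loader might issue, against small random reads, as a vector-index lookup might issue. On a real system the OS page cache and device characteristics dominate such numbers, so this is a toy, not a benchmark.

```python
# Toy sketch (illustrative assumptions throughout): contrast the streaming,
# throughput-bound reads of a training-style data loader with the small
# random, IOPS-bound reads typical of a vector-index / RAG lookup path.
# Note: results are distorted by the OS page cache; a real benchmark would
# bypass it (e.g., with direct IO).
import os
import random
import tempfile
import time

FILE_SIZE = 256 * 1024 * 1024   # 256 MiB scratch file (assumption)
SEQ_BLOCK = 4 * 1024 * 1024     # 4 MiB blocks: training-style streaming reads
RAND_BLOCK = 4 * 1024           # 4 KiB blocks: index/embedding point reads
N_RANDOM = 4096                 # number of random point reads to issue

def make_scratch_file(path: str) -> None:
    """Fill a scratch file with FILE_SIZE bytes of random data."""
    with open(path, "wb") as f:
        remaining = FILE_SIZE
        while remaining > 0:
            n = min(remaining, SEQ_BLOCK)
            f.write(os.urandom(n))
            remaining -= n

def sequential_read(path: str) -> float:
    """Stream the whole file in large blocks; return MiB/s."""
    start = time.perf_counter()
    read = 0
    with open(path, "rb") as f:
        while chunk := f.read(SEQ_BLOCK):
            read += len(chunk)
    elapsed = time.perf_counter() - start
    return (read / (1024 * 1024)) / elapsed

def random_read(path: str) -> float:
    """Issue small reads at random offsets; return IOPS."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        for _ in range(N_RANDOM):
            f.seek(random.randrange(0, FILE_SIZE - RAND_BLOCK))
            f.read(RAND_BLOCK)
    elapsed = time.perf_counter() - start
    return N_RANDOM / elapsed

if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        path = tmp.name
    try:
        make_scratch_file(path)
        print(f"sequential (training-style): {sequential_read(path):8.1f} MiB/s")
        print(f"random     (RAG-style):      {random_read(path):8.1f} IOPS")
    finally:
        os.unlink(path)
```

The point of the sketch is only that a storage system tuned for one of these profiles (high sequential throughput) can be a poor fit for the other (high small-read IOPS), which is why requirements shift across pipeline phases.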
In this talk, we discuss the core issues and the requirements placed on storage deployed in AI data centers, with a focus on emerging application use cases. The talk also aims to provide a holistic, full-stack view of the infrastructure and an under-the-hood look at how AI impacts each layer of computing, i.e., compute, network, and storage. This is followed by a discussion of the myriad existing solutions and optimizations that are deployed today as piecemeal fixes but need a refresh to enable storage to meet the challenges posed by AI.