GPU-based computing for AI/DL and other high-performance workflows demands a level of performance that legacy file and object storage systems cannot deliver. Such use cases have typically required parallel file systems such as Lustre, which depend on specialized networking and skill sets not commonly found in standard enterprise data centers.
Standards-based parallel file systems such as pNFS v4.2 deliver the high performance these workloads need, and do so on commodity hardware and standard Ethernet infrastructure. They also provide the multi-protocol file and object access that HPC parallel file systems typically lack. pNFS v4.2 architectures used in this way are often called Hyperscale NAS, since they merge very high-throughput parallel file system performance with the standard capabilities of enterprise NAS solutions. It is this architecture that Meta deploys to feed 24,000 GPUs in its AI Research SuperCluster at 12.5TB per second, on commodity hardware and standard Ethernet, to power its Llama 2 & 3 large language models (LLMs).
But AI/DL data sets are often distributed across multiple incompatible storage types in one or more locations, including S3 storage at edge locations. Traditionally, pulling S3 data from the edge into such workflows has required deploying file gateways or other protocol-bridging methods.
This session will examine an architecture that seamlessly and automatically integrates data on S3 storage into a multi-platform, multi-protocol, multi-site Hyperscale NAS environment. Drawing on real-world implementations, the session will highlight how this standards-based approach enables organizations to use conventional enterprise infrastructure, with data in place on existing storage of any type, to feed GPU-based AI and other high-performance workflows.