Read Performance Strategies for Workloads Using eBPF

IBM

Abstract

In today’s AI-driven workloads, read performance matters a great deal. There are three main ways to keep read performance high:

1. Serve data from the kernel page cache to avoid the latency of fetching it from the storage device.
2. Proactively prefetch data from the storage device in the background, so that the next read request finds the data in the kernel page cache.
3. Keep discarding pages that are no longer in use, leaving more room to prefetch data from storage devices into the kernel page cache.

To help with these points, the kernel provides an on-demand readahead mechanism, and user applications can advise the kernel about how I/O will happen. For example, the “madvise” system call has flags such as “MADV_SEQUENTIAL” (expect page references in sequential order) and “MADV_WILLNEED” (expect access in the near future, so it might be a good idea to read some pages ahead).
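As a rough illustration of passing such hints from user space, the sketch below maps a file read-only and calls madvise through Python's standard mmap module (Python 3.8 or later, on Linux). The function name read_with_advice is hypothetical, and the hints are treated as best-effort: whether they change readahead behaviour depends on the kernel and the file system.

```python
import mmap
import os


def read_with_advice(path, advice_names=("MADV_SEQUENTIAL", "MADV_WILLNEED")):
    """Map a file read-only, pass advisory hints to the kernel, then read it.

    Returns the file contents and the list of hints that were accepted.
    """
    applied = []
    with open(path, "rb") as f:
        if os.fstat(f.fileno()).st_size == 0:
            return b"", applied  # an empty file cannot be mapped
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            for name in advice_names:
                flag = getattr(mmap, name, None)  # constant may be missing on some platforms
                if flag is None or not hasattr(m, "madvise"):
                    continue
                try:
                    m.madvise(flag)  # advice covers the whole mapping
                    applied.append(name)
                except OSError:
                    pass  # hints are advisory; carry on without them
            data = m[:]  # read of the whole mapping, front to back
    return data, applied
```

The call only advises the kernel; it does not report whether any prefetching happened, which is why a separate verification step is still needed.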

To be sure that the advice and strategies above work as expected, we need a way to verify whether prefetching yields more cache hits and whether unused pages are actually being discarded. eBPF is well suited to this: we can attach probes and tracepoints at interesting places in the kernel, store data and state in BPF storage such as maps, and then display those maps graphically in user space.
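The verification step can be reduced to simple arithmetic once the counters are in user space. The sketch below assumes two hypothetical counters read from a BPF map, total page cache accesses and misses, and compares the hit ratio before and after a hint is applied. The counter names and the CacheCounters type are illustrative only; which kernel functions to probe for them is not specified here.

```python
from dataclasses import dataclass


@dataclass
class CacheCounters:
    accesses: int  # hypothetical: page cache lookups counted by a probe
    misses: int    # hypothetical: lookups that had to go to storage


def hit_ratio(c):
    """Fraction of accesses served from the page cache (0.0 if no accesses)."""
    if c.accesses <= 0:
        return 0.0
    hits = max(c.accesses - c.misses, 0)  # guard against counter skew
    return hits / c.accesses


def hit_ratio_change(before, after):
    """Positive result means the hit ratio improved after the change."""
    return hit_ratio(after) - hit_ratio(before)
```

For example, 1000 accesses with 250 misses gives a hit ratio of 0.75; if a later run shows 1000 accesses with 50 misses, the ratio has risen by about 0.2.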

This talk focuses on how to use eBPF to gather information from different functions in the kernel, and how to use user-space techniques such as histograms and graphs to decide whether our advice to the kernel is working as expected. Data collected with eBPF can also be sent to an LLM (large language model) for contextual analysis using RAG (retrieval-augmented generation) to generate insights, identify patterns, and offer recommendations based on the input data. For example, it could identify correlations between different performance metrics, highlight anomalies or trends, and suggest adjustments to optimise read performance further.
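One common user-space presentation is a power-of-two histogram of read latencies. The sketch below is a minimal version under stated assumptions: the samples are microsecond latencies already copied out of a BPF map, and the function names log2_histogram and render are hypothetical rather than part of any eBPF library.

```python
from collections import Counter


def log2_histogram(samples_us):
    """Count samples per power-of-two bucket.

    Bucket k covers values from 2**k to 2**(k+1) - 1; values below 1
    are counted in bucket 0.
    """
    buckets = Counter()
    for v in samples_us:
        v = max(int(v), 1)
        buckets[v.bit_length() - 1] += 1
    return dict(sorted(buckets.items()))


def render(hist, width=40):
    """Return one text line per bucket with a bar scaled to the largest bucket."""
    if not hist:
        return ""
    peak = max(hist.values())
    lines = []
    for k, count in hist.items():
        lo, hi = 1 << k, (1 << (k + 1)) - 1
        bar = "*" * max(1, round(width * count / peak))
        lines.append(f"{lo:>8} -> {hi:<8} : {count:<6} |{bar}")
    return "\n".join(lines)
```

If prefetching is effective, one would expect the mass of the histogram to shift toward the low-latency buckets, since more reads are served from the page cache.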

Learning Objectives

Understand strategies for improving read performance, and use eBPF to verify that those strategies work as expected for specific workloads in a particular environment.
Learn how to attach eBPF programs at various places in the kernel and how to display eBPF output in user space so that it is easy to comprehend.
Learn strategies that can be applied to eBPF output, e.g. contextual analysis using RAG (retrieval-augmented generation) to generate insights, identify patterns, and offer recommendations based on the input data.

Related Sessions

File Systems & Protocols

Linux NFS Server Progress Update

The Linux in-kernel NFS server continues its history of innovation, leveraging the rich storage and network ecosystems available in the Linux kernel.

Charles Lever

Oracle

File Systems & Protocols

Samba Development Status Update

This talk is going to give an overview of recent changes in the Samba fileserver and an outlook on the development roadmap.

Ralph Böhme

Samba Team / SerNet

File Systems & Protocols

NFS and SMB Common Infrastructure

Volker Lendecke

SerNet GmbH

File Systems & Protocols

Advancements in pNFS/NFS4.2 for High-Performance and Distributed Storage

NFS 4.2 introduces significant advancements tailored for high-performance workloads, GPU computing, and distributed storage environments, elevating the capabilities of standards-based modern data c

Trond Myklebust

Hammerspace

Mike Snitzer
Hammerspace

File Systems & Protocols

What's new in macOS SMB Client - 2024 Edition

Brad Suinn

Apple

File Systems & Protocols

Elevating Linux File Access: Recent Enhancements to the SMB 3.1.1 Client

Steven French

Microsoft

File Systems & Protocols

Demystifying Linux SMB Mount Options

Bharath S M

Microsoft

Shyam Prasad
Microsoft

File Systems & Protocols

SMB Witness Service in Samba

Samba 4.20 will ship with rpcd_witness, which provides a service for MS-SWN within a ctdb cluster.

Stefan Metzmacher

SerNet GmbH

File Systems & Protocols

Integrating S3 into Distributed, Multi-protocol Hyperscale NAS

The performance requirements needed to power GPU-based computing use cases for AI/DL and other high-performance workflows are challenged by the performance limitations of legacy file and object sto

Alan Wright

Hammerspace

Johan Ballin
Hammerspace

File Systems & Protocols

Azure Files: Design Challenges for the Biggest File Server in the World

For over nine years, Microsoft Azure has provided completely managed file shares in the cloud.
Azure Files provides SMB3, NFS4.1 and REST based access to file shares.

Utsav Mohata

Microsoft

Rena Shah
Microsoft

File Systems & Protocols

Joining the Cephalopods: Adding SMB Support to Ceph

The Ceph storage ecosystem currently covers the full range of File, Object and Block access with Ceph-specific protocols.

Günther Deschner

IBM

John Mulligan
IBM

File Systems & Protocols

Troubleshooting/Debugging Issues on Linux SMB Client

Bharath S M

Microsoft
