Optimized Resource Allocation for CXL Tiered-Memory Systems

Heiner Litz
University of California, Santa Cruz (UCSC)
Abstract

Main memory dominates data center server cost, so data center operators are exploring alternative technologies such as CXL-attached memory to reduce cost without jeopardizing performance. Multiple memory tiers, however, bring new challenges, such as selecting the appropriate memory configuration for a given workload mix. In particular, we observe that inefficient configurations increase cost by up to 2.6× for clients, and resource stranding increases cost by 2.2× for cloud operators. To address this challenge, we introduce TMC, a system that recommends cloud configurations according to workload characteristics and the dynamic resource utilization of a cluster. Whereas prior work relied on extensive simulation or costly machine learning techniques, incurring significant search costs, our approach profiles applications to reveal internal properties that enable fast and accurate performance estimates. Our novel configuration-selection algorithm incorporates a new heuristic, the packing penalty, to ensure that recommended configurations also achieve good resource efficiency. Our experiments demonstrate that TMC reduces the search cost by up to 4× over the state of the art while improving resource utilization by up to 17% compared to a naive policy that requests optimal tiered-memory allocations in isolation. We will discuss profiling techniques that automatically learn which data should be placed into a given tier based on hotness, memory-level parallelism, and latency sensitivity.
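To make the idea of a packing penalty concrete, the following is a minimal sketch of configuration selection that trades off a per-workload cost estimate against how well a request fits the cluster's free resources. It is not TMC's actual algorithm, which the abstract does not specify. The prices, the cost model, the penalty formula, and the names `Config`, `Server`, `packing_penalty`, and `select` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Config:
    dram_gb: int
    cxl_gb: int
    est_slowdown: float  # estimated runtime relative to all-DRAM (1.0 = none)


@dataclass(frozen=True)
class Server:
    free_dram_gb: int
    free_cxl_gb: int


# Hypothetical per-GB prices; real prices depend on the provider.
DRAM_PRICE = 1.0
CXL_PRICE = 0.5


def cost(cfg: Config) -> float:
    """Assumed cost model: memory price scaled by the estimated slowdown."""
    return (cfg.dram_gb * DRAM_PRICE + cfg.cxl_gb * CXL_PRICE) * cfg.est_slowdown


def packing_penalty(cfg: Config, servers: List[Server]) -> float:
    """Smallest leftover imbalance over all servers that can host cfg.

    A large imbalance means one tier is left mostly unused (stranded).
    Returns infinity if no server can host the configuration.
    """
    best = float("inf")
    for s in servers:
        left_dram = s.free_dram_gb - cfg.dram_gb
        left_cxl = s.free_cxl_gb - cfg.cxl_gb
        if left_dram < 0 or left_cxl < 0:
            continue  # does not fit on this server
        total_free = max(1, s.free_dram_gb + s.free_cxl_gb)
        best = min(best, abs(left_dram - left_cxl) / total_free)
    return best


def select(configs: List[Config], servers: List[Server],
           max_slowdown: float, weight: float) -> Optional[Config]:
    """Pick the feasible configuration with the lowest penalized cost."""
    best_cfg, best_score = None, float("inf")
    for cfg in configs:
        if cfg.est_slowdown > max_slowdown:
            continue  # violates the performance target
        penalty = packing_penalty(cfg, servers)
        if penalty == float("inf"):
            continue  # fits nowhere in the cluster
        score = cost(cfg) * (1.0 + weight * penalty)
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg


configs = [Config(32, 0, 1.0), Config(24, 24, 1.0), Config(8, 40, 1.6)]
servers = [Server(64, 64)]
in_isolation = select(configs, servers, max_slowdown=1.2, weight=0.0)
cluster_aware = select(configs, servers, max_slowdown=1.2, weight=1.0)
```

With `weight=0.0` the sketch behaves like a policy that optimizes each request in isolation and picks the all-DRAM configuration. With `weight=1.0` it prefers the balanced configuration, which costs slightly more for the client but leaves the server's two tiers evenly usable.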

Learning Objectives

Learn about the challenges of tiered memory
Learn about profiling and memory cost models
Learn about system and application changes introduced by CXL