SNIA DEVELOPER CONFERENCE



September 16-18, 2024 Santa Clara, CA

# Smart Data Accelerator Interface Use Cases Proof Points v1.1 and beyond

Shyam Iyer

Chair, SNIA SDXI TWG; Member, SNIA Technical Council; Distinguished Engineer, Dell



- SDXI Intro and brief overview of v1.0
- SDXI v1.1 preview
- Software Enablement
- Proof points
- Summary



# SDXI Intro and brief overview of v1.0



#### Sample accelerator usage models





#### **SDXI** Intro

- Smart Data Accelerator Interface (SDXI) is a SNIA standard for a memory-to-memory data movement and acceleration interface that is:
  - Extensible
  - Forward-compatible
  - Independent of I/O interconnect technology
  - Features:
    - Virtualized address space to address space data movement
    - Offloads data movement, common memory operations, and data transformations while moving data
    - Offloads data movement while preserving address space and context isolation
    - Standardized interfaces and architected states for the DMA engine
    - Standardized for user-level software
- v1.0 released!
  - https://www.snia.org/sdxi
- SNIA's SDXI TWG is now working on v1.1
  - SDXI TWG also has a software-focused group that is working on a reference libsdxi implementation





# Memory Structures(1) – Simplified view



- All states in memory
- One standard descriptor format
- Easy to virtualize
- Architected function setup and control
  - \*layered model for interconnect specific function management
  - SDXI class code registered for PCIe implementations



#### A Standard Descriptor Format (1)

- 64-byte descriptor:
  - Rsvd | Operation | Op Group | Rsvd | CTL | V
  - Operation-Specific Descriptor Body
  - Completion_Ptr
  - \*Room for lots of future operations
- Architecturally registered operation groups:
  - DMA Base
  - Administrative
  - Vendor-Defined
  - Full Atomic
  - Minimal Atomic
  - Others
- Operations by group:
  - DMA: Nop, Copy, RepCopy, WriteImm
  - Atomic: Bitwise Ops, Add (minimal), Sub, Swap (minimal), Min, Max, CmpSwap (minimal), etc.
  - Admin: Start/Stop/Update/Sync, Interrupt Function & Contexts (easily virtualizable)
- Completion_Ptr: a pointer to a 32-byte aligned region of memory containing the Completion Status Block, which contains:
  - Completion Signal
    - Initialized by SW, decremented by the function on success
  - Error bit (ER) to indicate that the operation encountered an error
  - Other bits in the 32-byte field are reserved to support future expansion of error codes
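The descriptor and completion model above can be sketched in software. The following is a minimal Python model, not the layout defined by the SDXI specification: the 64-byte descriptor size, the 32-byte aligned Completion Status Block, the software-initialized completion signal that the function decrements on success, and the error bit come from the slide, while the field offsets and widths, the opcode values, the idea of two descriptors sharing one status block, and every name (`pack_descriptor`, `CompletionStatusBlock`, `fake_function_complete`) are assumptions made for illustration.

```python
import struct

DESCRIPTOR_SIZE = 64      # from the slide: one standard 64-byte descriptor
CSB_ALIGNMENT = 32        # from the slide: 32-byte aligned Completion Status Block
BODY_SIZE = 52            # ASSUMPTION: whatever is left after header and pointer

# ASSUMED layout (illustrative only): 4-byte header word, operation-specific
# body, 8-byte completion pointer. Little-endian, no padding: 4 + 52 + 8 = 64.
_LAYOUT = struct.Struct(f"<I{BODY_SIZE}sQ")


def pack_descriptor(op_group, operation, body, completion_ptr, valid=True):
    """Pack one descriptor; the header bit positions are hypothetical."""
    if completion_ptr % CSB_ALIGNMENT != 0:
        raise ValueError("completion pointer must be 32-byte aligned")
    if len(body) > BODY_SIZE:
        raise ValueError("operation-specific body too large")
    header = (1 if valid else 0) | ((op_group & 0xFF) << 8) | ((operation & 0xFF) << 16)
    return _LAYOUT.pack(header, body, completion_ptr)


class CompletionStatusBlock:
    """Software model of the status block a descriptor points at."""

    def __init__(self, pending):
        self.signal = pending   # initialized by software
        self.error = False      # error bit (ER)

    def done(self):
        return self.signal == 0 and not self.error


def fake_function_complete(csb, success=True):
    """Stand-in for the SDXI function finishing one descriptor."""
    if success:
        csb.signal -= 1         # decremented by the function on success
    else:
        csb.error = True        # error reported instead of a decrement


# ASSUMPTION: two descriptors share one status block, so the signal starts at 2.
csb = CompletionStatusBlock(pending=2)
descriptors = [pack_descriptor(0, 1, b"copy #%d" % i, 0x1000) for i in range(2)]
for _ in descriptors:
    fake_function_complete(csb)
print(len(descriptors[0]), csb.done())
```

Because all state lives in memory, software in this model only has to watch the completion signal reach zero (or the error bit become set) to know that a batch of descriptors has finished.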

#### Multi-Address Space Data Movement within an SDXI function group (2)





#### Need more on SDXI internals?

#### SNIA SDXI Specification v1.0 Internals

https://www.youtube.com/watch?v=wjc4ZnCQibw&pp=ygUNc2RjIDIwMjMgc2R4aQ%3D%3D





# SDXI v1.1 Preview



10 | ©2024 SNIA. All Rights Reserved.

# SDXI v1.1 investigations

- Connection manager
- New data mover operations for smart acceleration
- SDXI Host to Host investigations
- Scalability & latency improvements
- Cache coherency models for data movers
- Security features involving data movers
- Data mover operations involving persistent memory targets
- QoS
- [...]-related use cases
- Heterogeneous environments





# SDXI v1.1, v1.2, and v2.0

While investigating features for v1.1, the SDXI TWG developed a framework for features:

- v1.1
  - Mostly errata fixes from v1.0
  - Additional use cases prioritized by member participation
  - Retains compatibility with v1.0
- v1.2
  - Overflow from v1.1
  - Retains compatibility with v1.0 and v1.1
- v2.0
  - More intrusive features


#### v1.1 Sneak Peek

#### SDXI v1.1 Practical Considerations

- Definable Operations to enable innovation
- Define new data mover operations to enable critical member use cases
- Improvements around memory ordering
- Improved point of view for
  - Connection Manager
  - Use cases involving memory fabrics
  - Host to Host use cases
  - QoS use cases
  - Storage use cases involving NVMe and Computational Storage
  - Security considerations
  - AI use cases



#### v1.1 Candidate: Definable Operations Group

- v1.0 Vendor-defined operations group definition was rigid
- Required vendors to register a vendor opcode to avoid collisions
  - Slows innovation
- Innovators want flexibility in defining new operations
  - However, they need to leverage existing software and APIs without rewriting infrastructure code
- Definable operations group to the rescue!
  - Requires new UUID for definable operations in vendor space
  - Each vendor can support a profile to enable its own set of definable operations
- Are you using the v1.0 vendor-defined encodings?
  - Expect this v1.0 feature to be deprecated in v1.1
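One way to picture why a UUID-scoped profile removes the need to register vendor opcodes centrally is the sketch below. It is an illustration only, not the mechanism defined by the v1.1 draft: the `DefinableOpRegistry` class, the opcode values, and the operation names are all hypothetical. The point it shows is that two vendors can reuse the same opcode number without colliding, because the lookup is keyed by the profile UUID first.

```python
import uuid


class DefinableOpRegistry:
    """Hypothetical registry: opcodes are only unique within a profile UUID."""

    def __init__(self):
        self._profiles = {}

    def register_profile(self, profile_uuid, operations):
        if profile_uuid in self._profiles:
            raise ValueError("profile already registered")
        # Copy so later changes to the caller's dict do not leak in.
        self._profiles[profile_uuid] = dict(operations)

    def lookup(self, profile_uuid, opcode):
        return self._profiles[profile_uuid][opcode]


# Each vendor picks its own UUID; no central opcode assignment is needed.
VENDOR_A_PROFILE = uuid.uuid4()
VENDOR_B_PROFILE = uuid.uuid4()

registry = DefinableOpRegistry()
registry.register_profile(VENDOR_A_PROFILE, {0x01: "vendor_a_transform"})
registry.register_profile(VENDOR_B_PROFILE, {0x01: "vendor_b_checksum"})

# The same opcode number resolves to a different operation per profile.
print(registry.lookup(VENDOR_A_PROFILE, 0x01))
print(registry.lookup(VENDOR_B_PROFILE, 0x01))
```

Common infrastructure code only needs to understand the profile lookup; the operations themselves stay vendor-specific.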



#### While we are on the topic of deprecation:

#### Potential candidates for deprecation from v1.0

- Mailbox
- Vendor Defined Operations Group in favor of Definable Operations Group
- Are you affected?
  - Please yell or join the workgroup!



#### v1.1 Candidate: Make me another copy!

#### **Double Copy**

- Single source buffer and two destination buffers
  - Each buffer can be in different address spaces.
  - Producer context can also be in an independent address space.
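The semantics of this candidate operation can be modeled in a few lines of Python. This is a software sketch under stated assumptions, not the v1.1 descriptor definition: address spaces are modeled as named `bytearray` objects, and the `double_copy` function and its `(address_space, offset)` argument shape are hypothetical.

```python
def double_copy(spaces, src, dst1, dst2, length):
    """Copy `length` bytes from one source to two destinations.

    `spaces` maps an address-space name to a bytearray; `src`, `dst1`, and
    `dst2` are (address_space_name, offset) pairs. All names are illustrative.
    """
    src_space, src_off = src
    data = bytes(spaces[src_space][src_off:src_off + length])
    if len(data) != length:
        raise ValueError("source range out of bounds")

    # Validate both destinations before writing, so a failure leaves
    # neither destination partially updated.
    for space, off in (dst1, dst2):
        if off + length > len(spaces[space]):
            raise ValueError("destination range out of bounds")

    # The source is read once and written twice.
    for space, off in (dst1, dst2):
        spaces[space][off:off + length] = data
    return length


# Three independent "address spaces": one source, two destinations.
spaces = {
    "vm_a": bytearray(b"hello---"),
    "vm_b": bytearray(8),
    "vm_c": bytearray(8),
}
double_copy(spaces, ("vm_a", 0), ("vm_b", 0), ("vm_c", 3), 5)
print(spaces["vm_b"], spaces["vm_c"])
```

The benefit suggested by the slide is that the source is read once while both destinations receive the data, even when each buffer lives in a different address space.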





# v1.1 Candidate: Data Integrity

- Cyclic Redundancy Checks (CRC)
- Protection Information (PI)
  - Memory to memory with PI Check, Strip, Insert, Update, Compare, etc.
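As a rough illustration of a copy that carries an integrity check with it, the sketch below computes a CRC while moving data and verifies it at the destination. The slide does not say which CRC polynomial or PI format v1.1 would use, so CRC-32 from Python's `zlib` module is only a stand-in, and the function names `copy_with_crc` and `verify_crc` are hypothetical.

```python
import zlib


def copy_with_crc(src, dst):
    """Copy src into dst and return a CRC computed over the source data.

    CRC-32 (zlib) is an ASSUMED stand-in for whatever check the
    specification ends up defining.
    """
    n = len(src)
    if len(dst) < n:
        raise ValueError("destination buffer too small")
    dst[:n] = src
    return zlib.crc32(src)


def verify_crc(buf, expected_crc):
    """Recompute the CRC over buf and compare it with the expected value."""
    return zlib.crc32(buf) == expected_crc


source = b"123456789"
destination = bytearray(len(source))
crc = copy_with_crc(source, destination)
print(hex(crc), verify_crc(destination, crc))

# Flipping one bit at the destination makes the check fail.
destination[0] ^= 0x01
print(verify_crc(destination, crc))
```

Offloading this kind of check to the data mover means the data does not need a second pass through the CPU just to be validated.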







#### v1.1: New Data Mover Operations





#### v1.1 Memory Operations and Data Transformations

#### **Operations**

- POSIX memory ops
- Compression
- Bring Your Own Operation (BYOO)





# v1.1: Memory Ordering improvements

#### SDXI v1.0 memory ordering

- Write after Write 'seq'
- Read after Completion of previous operation 'sync'

#### v1.1 Memory ordering relaxations and clarifications

- Read after Write
- Valid bit checking
- Flagged Write



#### Point of view: Connection Manager






#### Point of view: CXL based Architectures





#### Point of View: Computational Storage, NVMe, and SDXI



[Figure panels: (1) CSEE, CSF is SDXI Producer; (2) Host is SDXI Producer]



# Point of view: Does it apply to AI? Yes!!!





# Software Enablement



# Software Ecosystem

#### SDXI TWG is working on libsdxi

- OS-agnostic user space library
- Helps user space applications use SDXI accelerated data movement operations
- Control Plane API
  - Probing resource discovery
  - Context management
  - Connection management
- Data Plane API
  - Memcpy
  - Zero Memfill
  - \<Memory Operations\>
- SDXI TWG is enabling SDXI driver work in various OSes
- SDXI kernel-mode use cases
  - Linux DMA engine
  - Mem-zero
  - AutoNUMA, aka NUMA page migration
- SDXI emulation project investigation for ecosystem development



#### **Baremetal Stack View**





#### Scale with Compute Virtualization – Multi-VM address space




# Proofpoints



#### Proofpoints: SDXI PoC Demo at Memcon 2024

#### SDXI Sample User Mode application with Linux

[Demo screenshot: a stack diagram (application virtual memory, user application code, operating system, SDXI driver, security controls) beside a terminal recording. The recording runs `modprobe sdxi`, lists `/dev/sdxi`, and then lists the `samples/` directory of the source tree, which includes `context.c`, `uadd.c`, and `write-imm.c` along with built memcopy, repcopy, uadd, and write-imm samples.]





#### Summary and Call to Action

- SNIA is developing SDXI, a memory-to-memory data movement standard
  - v1.0 released!
- Multiple companies involved in the effort
- SDXI standard continues to improve with new features and use cases
  - SDXI TWG is working on the v1.1 specification
  - TWG has a framework and roadmap for v1.1, v1.2, and v2.0
- SDXI software ecosystem is developing, and proof points are emerging
- Learn more:
  - https://www.snia.org/sdxi









