Cloud customers need a range of storage types to use as the durable layer of their systems. For example, some customers want a blob interface, some want to use files, and others want disks. Cloud providers must offer all of these options and more, while continuing to innovate and reduce costs. Creating a separate system for each product offering is inefficient. Trying to build every product offering on one system is ineffective.
Azure Storage has built two distinct platforms that together serve all of its offerings. One is an append-only platform, optimized for blob-style data and file data. The other is an update-in-place platform, optimized for virtual machine remote disk storage. Both platforms include a replication layer and an indirection layer; however, these layers are ordered differently in the two systems. That difference in layering results in radically different architectures, each with its own pros and cons. This talk describes both architectures and explores how Microsoft is converging components of these divergent systems to improve efficiency while maintaining the benefits each has to offer.
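The difference between the two write models can be illustrated with a minimal, single-node sketch. It is not Azure Storage code: the class names `AppendOnlyStore` and `UpdateInPlaceStore` are hypothetical, replication is omitted entirely, and the in-memory byte arrays stand in for durable media. It only shows that an append-only store needs an index (a form of indirection) to find the latest version of a key, while an update-in-place store overwrites data at a fixed address.

```python
class AppendOnlyStore:
    """Hypothetical append-only store: writes never overwrite existing bytes."""

    def __init__(self):
        self.log = bytearray()   # stands in for an append-only medium
        self.index = {}          # indirection: key -> (offset, length) of latest version

    def put(self, key, data: bytes) -> int:
        offset = len(self.log)
        self.log += data                      # new data always goes at the tail
        self.index[key] = (offset, len(data)) # older versions become garbage
        return offset

    def get(self, key) -> bytes:
        offset, length = self.index[key]
        return bytes(self.log[offset:offset + length])


class UpdateInPlaceStore:
    """Hypothetical block store: each block number maps to a fixed location."""

    def __init__(self, num_blocks: int, block_size: int = 4):
        self.block_size = block_size
        self.blocks = bytearray(num_blocks * block_size)

    def write(self, block_no: int, data: bytes) -> None:
        if len(data) != self.block_size:
            raise ValueError("writes must be exactly one block")
        start = block_no * self.block_size
        self.blocks[start:start + self.block_size] = data  # overwrite in place

    def read(self, block_no: int) -> bytes:
        start = block_no * self.block_size
        return bytes(self.blocks[start:start + self.block_size])
```

In this sketch, writing the same key twice to `AppendOnlyStore` grows the log and leaves the old bytes behind for later reclamation, whereas writing the same block twice to `UpdateInPlaceStore` leaves the total size unchanged.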
Running two separate systems is expensive, but each has different benefits. Finding a way to share a common foundation while maintaining their separate layer ordering provides a cost-effective way to realize the benefits of each system. This talk explains two previously undisclosed software technologies that Microsoft is developing for this purpose.
The first technology is a new local object store. This store will provide a common storage platform for both Azure Storage systems, abstracting the underlying hardware while taking advantage of various storage hardware advancements. It will allow for the optimal use of SMR HDDs and ZNS SSDs, providing a better abstraction than existing general-purpose file systems for building efficient, high-performance, large-scale distributed storage systems.
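To make the hardware constraint concrete: host-managed SMR HDDs and ZNS SSDs both divide the device into zones that must be written sequentially at a write pointer and reset as a whole before reuse. The sketch below is a toy model under that assumption, not the design of Microsoft's object store, which the abstract does not describe. The names `Zone` and `LocalObjectStore`, the first-fit placement policy, and the in-memory buffers are all hypothetical.

```python
class Zone:
    """Toy model of a zone: sequential writes only, reclaimed by a whole-zone reset."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = bytearray()

    @property
    def write_pointer(self) -> int:
        return len(self.data)

    def append(self, payload: bytes) -> int:
        if self.write_pointer + len(payload) > self.capacity:
            raise OSError("zone full")
        offset = self.write_pointer
        self.data += payload  # the only legal write position is the write pointer
        return offset

    def read(self, offset: int, length: int) -> bytes:
        return bytes(self.data[offset:offset + length])

    def reset(self) -> None:
        self.data = bytearray()  # rewinds the write pointer to zero


class LocalObjectStore:
    """Hypothetical object layer that hides zones behind put/get by name."""

    def __init__(self, zone_count: int, zone_capacity: int):
        self.zones = [Zone(zone_capacity) for _ in range(zone_count)]
        self.locations = {}  # object name -> (zone id, offset, length)

    def put(self, name: str, payload: bytes) -> None:
        # First-fit placement, chosen only to keep the example short.
        for zone_id, zone in enumerate(self.zones):
            if zone.write_pointer + len(payload) <= zone.capacity:
                offset = zone.append(payload)
                self.locations[name] = (zone_id, offset, len(payload))
                return
        raise OSError("no zone has room for this object")

    def get(self, name: str) -> bytes:
        zone_id, offset, length = self.locations[name]
        return self.zones[zone_id].read(offset, length)
```

Callers of `put` and `get` never see zones or write pointers, which is the sense in which an object interface can abstract zoned hardware more directly than a general-purpose file system built around in-place block updates.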
The second technology is a development environment. It will allow code to be written once and compiled for a variety of silicon types. It uses software patterns modeled on run-to-completion to provide an efficient platform for developing high-throughput data storage and processing applications.
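As a rough illustration of the run-to-completion pattern in general, and not of Microsoft's development environment, the sketch below runs each short task uninterrupted until it returns. A task never blocks or yields partway through; instead it hands back any follow-up tasks, which the scheduler queues. The function names `run_to_completion` and `make_write`, and the two-stage write request, are hypothetical.

```python
from collections import deque


def run_to_completion(initial_tasks):
    """Run tasks one at a time; each task finishes before the next one starts."""
    queue = deque(initial_tasks)
    trace = []
    while queue:
        task = queue.popleft()
        follow_ups = task(trace)       # runs uninterrupted until it returns
        queue.extend(follow_ups or []) # continuation work is queued, not awaited
    return trace


def make_write(request_id):
    """Build a hypothetical two-stage write request as a chain of short tasks."""

    def checksum_stage(trace):
        trace.append(f"checksum:{request_id}")
        return [persist_stage]         # schedule the next stage

    def persist_stage(trace):
        trace.append(f"persist:{request_id}")
        return []                      # request finished

    return checksum_stage


if __name__ == "__main__":
    print(run_to_completion([make_write(1), make_write(2)]))
```

Because no task is preempted, a single worker needs no locks around the state a task touches, which is one reason the pattern is commonly used for high-throughput I/O paths.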
These two technologies will allow both of Azure Storage’s cloud-scale storage systems to retain their unique advantages while sharing a common storage foundation beneath them.