Programmer Thoughts

By John Dickinson

Swift (OpenStack Object Storage) Overview

November 06, 2010

What is it?

Swift is a highly scalable, redundant, unstructured data store designed to store large amounts of data cheaply. “Highly scalable” means that it can scale to thousands of machines with tens of thousands of hard drives. Swift is designed to be horizontally scalable: there is no single point of failure. In most large-scale deployments, swift should become more performant as the cluster grows larger. In terms of the CAP theorem, swift sacrifices C (consistency) for A (availability) and P (partition tolerance). Most operations happen synchronously, but consistency is sacrificed in failure scenarios.

“Redundant” means that swift stores multiple copies of each entity in the system. Each copy is stored in a physically distinct availability zone, so common failures like hard drive failures or network issues are highly unlikely to cause data loss or downtime.

“Unstructured data store” means that swift simply stores bits. Swift is not a database. Swift is not a block-level storage system. Swift stores blobs of data. Swift offers namespace groupings within accounts as containers, but no other relation between objects is stored.

For more information on the internal workings of swift, see http://swift.openstack.org/overview_architecture.html.

What can it do?

Although swift is a key-value store, it is optimized for highly available reads and writes. This makes it ideal for storing backups and static web content. Swift is well-suited to storing and serving server backups, VM snapshots, database backups, image libraries, scripts and stylesheets, or any other static content that needs to be accessed frequently.

Also, because swift guarantees that objects will be available for reading as soon as they are successfully written, swift can be used to store content that changes frequently.

How does one use swift?

Swift has a RESTful API. All communication with swift is done over HTTP, using the HTTP verbs to signal the requested action. A swift storage URL looks like:

swift.example.com/v1/account/container/object

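Swift’s URLs have four basic parts. Using the example above, these parts are:

Base: swift.example.com/v1/

Account: account

Container: container, a namespace grouping within the account

Object: object, the stored blob of data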
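To make the four parts concrete, here is a minimal Python sketch that splits the example storage URL into its pieces. The helper name split_storage_url is hypothetical and not part of swift or any client library. It also assumes that an object name may itself contain slashes, so everything after the container is treated as the object name.

```python
from urllib.parse import urlsplit


def split_storage_url(url):
    """Split a swift storage URL into base, account, container, and object.

    Hypothetical helper for illustration only; not part of swift.
    """
    # Add a scheme when the URL is written without one, as in the example.
    if "://" not in url:
        url = "http://" + url
    parts = urlsplit(url)
    # Path looks like /v1/account/container/object. Split at most three
    # times so any slashes in the object name stay with the object.
    version, account, container, obj = parts.path.lstrip("/").split("/", 3)
    return {
        "base": f"{parts.netloc}/{version}",
        "account": account,
        "container": container,
        "object": obj,
    }


example = split_storage_url("swift.example.com/v1/account/container/object")
for name, value in example.items():
    print(f"{name}: {value}")
```

The sketch expects a full object URL; a URL that stops at the account or container would need extra handling.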

One may get a list of all containers in an account with a GET on the account: GET http://swift.example.com/v1/account/

One may create new containers with a PUT to the container: PUT http://swift.example.com/v1/account/new_container

One may list all objects in a container with a GET on the container: GET http://swift.example.com/v1/account/container/

One may create new objects with a PUT on the object: PUT http://swift.example.com/v1/account/container/new_object

Additionally, one may use POST to change metadata on containers and objects.
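The requests described above can be sketched with nothing but Python's standard library. This is an illustration under stated assumptions, not an official client: the build helper and the token value are placeholders, the storage URL is the example one from this post, and the X-Auth-Token and X-Object-Meta-Color headers are assumptions about how a deployment handles authentication and object metadata. The sketch only constructs the requests; nothing is sent over the network.

```python
from urllib.request import Request

# Example storage URL from the text; a real one comes from your deployment.
STORAGE_URL = "http://swift.example.com/v1/account"
# Placeholder token (assumption); a real token comes from your auth system.
TOKEN = "example-token"


def build(method, path="", headers=None, body=None):
    """Build (but do not send) an HTTP request against the storage URL."""
    all_headers = {"X-Auth-Token": TOKEN}
    all_headers.update(headers or {})
    return Request(f"{STORAGE_URL}/{path}", data=body,
                   headers=all_headers, method=method)


# List all containers in the account.
list_containers = build("GET")
# Create a new container.
create_container = build("PUT", "new_container")
# List all objects in a container.
list_objects = build("GET", "container/")
# Create a new object; the request body is the object's data.
create_object = build("PUT", "container/new_object", body=b"hello swift")
# Change metadata on an object (the header name is an assumption).
set_metadata = build("POST", "container/new_object",
                     headers={"X-Object-Meta-Color": "blue"})

for req in (list_containers, create_container, list_objects,
            create_object, set_metadata):
    print(req.get_method(), req.full_url)
```

To actually send one of these against a real cluster, pass it to urllib.request.urlopen(req).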

Get it

Swift is completely open source and is released under the Apache 2.0 license. Find it at http://swift.openstack.org. Current documentation is found at http://swift.openstack.org. Patches are welcome.

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

The thoughts expressed here are my own and do not necessarily represent those of my employer.