Programmer Thoughts

By John Dickinson

Swift State of the Project Spring 2012

April 05, 2012

The last six months of OpenStack swift development have been the most active six-month period for the project since the code was first put into production. The developer community has grown, the code has improved, and adoption has increased. These six months covered the OpenStack “Essex” release cycle, during which swift made five releases: 1.4.4 through 1.4.8.

Where We Are

The easiest way to get an overview of swift’s evolution is to look at the version control logs.

Swift has had 125 non-merge commits:

git shortlog -nes --no-merges 1.4.3..1.4.8 | awk '{SUM+=$1} END {print SUM}'

Greg Holt has been the most prolific committer:

git shortlog -nes --no-merges 1.4.3..1.4.8 | head -1

Swift has had contributions from Rackspace, SDSC, Red Hat, Nebula, HP, SwiftStack, Internap, Memset, CERN, and others.

The three largest commits in the last six months have been for the formpost middleware, man pages, and the expiring objects feature:

formpost 7fc1721d7d5290a6af278f9b6844cd3b96b7c7c3
    (11 files changed, 3359 insertions(+), 16 deletions(-))
man pages 0b0785e984d9164c1d1cd84f05dd9909bb7d37a8
    (27 files changed, 3148 insertions(+), 0 deletions(-))
expiring objects 872420efdb8e6e945cd2fe06994136b8c2ee153a
    (20 files changed, 2043 insertions(+), 53 deletions(-))
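A ranking like the one above can be reproduced from the log itself. The following is a minimal sketch, not the method used for this post: it assumes the input is the text printed by `git log --no-merges --shortstat --format=%H 1.4.3..1.4.8`, and it assumes "largest" means insertions plus deletions. The function name `largest_commits` is made up for illustration.

```python
import re

# Matches git's shortstat summary line; either count may be absent.
_STAT = re.compile(
    r"(\d+) files? changed"
    r"(?:, (\d+) insertions?\(\+\))?"
    r"(?:, (\d+) deletions?\(-\))?"
)
_HASH = re.compile(r"^[0-9a-f]{40}$")


def largest_commits(log_text, top=3):
    """Return (lines_touched, commit_hash) pairs, biggest first.

    Assumption: size is insertions + deletions, which may not be the
    measure the original ranking used.
    """
    sizes = []
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if _HASH.match(line):
            current = line  # remember which commit the next stat line belongs to
            continue
        match = _STAT.search(line)
        if match and current is not None:
            insertions = int(match.group(2) or 0)
            deletions = int(match.group(3) or 0)
            sizes.append((insertions + deletions, current))
            current = None
    sizes.sort(reverse=True)
    return sizes[:top]
```

Feeding it the captured output of the git command returns the top three commits by lines touched; pass a different `top` to see more.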

But looking at VCS logs doesn’t tell the whole story. What is in these commits?

Several important new features have been added to swift. Swift now supports expiring objects, HTML form POSTs with temporary signed URLs, and the OpenStack auth 2.0 API in the swift CLI. Other new features include new config options, optional functionality in middleware, and more ops tools.

Expiring objects allow a swift user to set an expiry time or a TTL on an object, after which the object is no longer accessible and will be deleted from the system. This feature enables new use cases for swift. For example, it could be used by a document management system with data retention requirements.
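As a rough illustration of the two options, the sketch below builds the request header a client would send with an object PUT or POST. It assumes the `X-Delete-At` (absolute Unix timestamp) and `X-Delete-After` (seconds from now) headers used by swift's expiring objects support; the helper name `expiry_headers` and its arguments are invented for this example.

```python
import time


def expiry_headers(ttl_seconds=None, delete_at=None, now=None):
    """Return the header asking swift to expire an object.

    Pass exactly one of ttl_seconds (relative TTL) or delete_at
    (absolute Unix timestamp). `now` exists only to make testing easy.
    """
    if (ttl_seconds is None) == (delete_at is None):
        raise ValueError("pass exactly one of ttl_seconds or delete_at")
    if ttl_seconds is not None:
        if ttl_seconds <= 0:
            raise ValueError("ttl_seconds must be positive")
        return {"X-Delete-After": str(int(ttl_seconds))}
    now = time.time() if now is None else now
    if delete_at <= now:
        raise ValueError("delete_at must be in the future")
    return {"X-Delete-At": str(int(delete_at))}
```

A retention system that keeps documents for thirty days might call `expiry_headers(ttl_seconds=30 * 24 * 3600)` and merge the result into the headers of its upload request.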

The new formpost and tempurl middleware modules allow a swift user to create a URL with write access and then use that URL as the target of an HTML form POST. This feature is aimed at a control panel use case. Since swift uses an auth method based on information in request headers, browsers typically can’t access swift directly. With these two new middleware modules, someone building a swift control panel can have the browser directly upload content into the swift cluster. Since the requests are going directly to swift and don’t have to be proxied through the control panel web servers for auth, the control panel deployer only has to scale infrastructure based on the control panel usage, not swift usage.
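To make the control panel flow concrete, here is a minimal sketch of what the panel's server side might compute before rendering the upload form. It assumes the signing scheme described in the formpost middleware documentation as I understand it: an HMAC-SHA1 over the path, redirect, max_file_size, max_file_count, and expires values joined by newlines, using a key stored on the account. The function name, default limits, and example path are hypothetical; check the signing string against the swift version you deploy.

```python
import hmac
import time
from hashlib import sha1


def formpost_fields(path, key, redirect="", max_file_size=104857600,
                    max_file_count=10, ttl_seconds=600, now=None):
    """Build the hidden form fields for a browser upload to swift.

    path is the upload target, e.g. /v1/AUTH_account/container/prefix
    (a made-up example). key is the secret shared with the cluster.
    """
    now = time.time() if now is None else now
    expires = int(now + ttl_seconds)
    # Assumed signing string: five values separated by newlines.
    body = "%s\n%s\n%s\n%s\n%s" % (
        path, redirect, max_file_size, max_file_count, expires)
    signature = hmac.new(
        key.encode("utf-8"), body.encode("utf-8"), sha1).hexdigest()
    return {
        "redirect": redirect,
        "max_file_size": str(max_file_size),
        "max_file_count": str(max_file_count),
        "expires": str(expires),
        "signature": signature,
    }
```

The control panel would render these values as hidden inputs in a form whose action is the swift URL for `path`. The secret key never reaches the browser; only the time-limited signature does, which is why the upload can bypass the control panel's web servers.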

In addition to new features, many bugs have been squashed as well. Swift developers have found and fixed memory leaks, improved data corruption detection, improved replication, and improved the way rings are built.

Swift’s documentation has also been greatly improved in the last six months. Thanks to Marcelo Martins, an ops engineer at Rackspace, swift now has a full set of man pages. Additionally, swift’s self-auditing tool (swift-recon) now has full documentation.

Beyond the code, swift’s community has grown quite a bit. In addition to many private deployments, several companies have announced public deployments or their internal usage of swift. SoftLayer, Haylix, and Aptira have all announced public clouds that use swift. The Wikimedia Foundation has announced that all thumbnails on Wikipedia are now served from a swift cluster, and they are migrating all of their media files to a swift back end.

Swift now has fifty-nine contributors listed in the AUTHORS file. Twenty-seven have been added in the last six months. This is incredible growth (nearly half of the listed contributors are new), and many of these new contributors come from companies that had not previously contributed to swift. This growth speaks to the increasing rate of adoption of swift and builds a strong developer base that will ensure swift’s success in the future.

Where We’re Going

However, swift is by no means “finished” or “complete”. There are always bugs to fix and edge cases that can be handled better. There are new features to add and new use cases to support. Some examples include solving multi-site deployments and keeping very large containers performant. Both of these improvements will allow swift to grow beyond its current use case, but they involve tremendous complexity to implement well. It is unlikely that serious attempts to solve these issues will be made until they become pain points for swift deployers. As one of the swift developers said, “Swift has solved all the easy problems. All we have left are the really hard problems.”

The biggest challenges facing swift are not technical; they are about the developer community. Expect the swift community to continue to grow. More companies are deploying swift. More developers will be contributing to swift. A larger developer community will of course bring new challenges, but much can be learned from other OpenStack projects like nova. Bringing more developers to swift will allow swift to become more robust and more adaptable to a wider variety of use cases.

The next six months for swift should bring more community education and a larger ecosystem. More companies will deploy swift, and their unique experiences will allow swift to become more robust and feature-filled. Swift’s future is bright as both public and private clouds continue to grow.

Storage is important. Everyone has data, and it’s always growing. You should have ownership of everything that touches your data. OpenStack gives you that power.

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

The thoughts expressed here are my own and do not necessarily represent those of my employer.