Programmer Thoughts

By John Dickinson

Swift State of the Project Fall 2011

October 13, 2011

Swift has been running in production at Rackspace for over a year with near 100% uptime. Rackspace’s swift clusters store billions of objects and petabytes of data. Several other companies, including Internap, KT, SDSC, and HP, have also deployed swift and are running clusters in production. These clusters range in size from fairly small to several petabytes. Other organizations, including CERN, are evaluating swift for production use. Overall, swift is a success.

Swift’s active developers are all currently Rackspace employees. Other companies have talked about features and promised to contribute code, but so far, no major patches have been forthcoming. Unfortunately, this means that the day-in, day-out requirements driving swift development come from Rackspace. While many needs can be anticipated and met simply by looking at Rackspace’s needs, some key areas of development are missed. For example, Rackspace does not often deploy new swift clusters, so automated deployment tools are neglected. Similarly, Rackspace’s needs focus on a general use case for a broad set of customers. Other clusters may have more specific needs based on different use cases. Until swift developers see these other use cases, swift is not likely to be optimized for them.

Although other companies are not currently contributing swift code, many companies are active in the community. Piston, Nebula, Voxel, HP, and others are actively engaging the developer and user communities. They sponsor biweekly meetups, engage on the mailing lists, contribute in IRC, participate in the design summits, and generally talk about what they are doing and what needs to be done. For this, I am grateful. It seems that swift currently meets the needs of these groups. I hope that as they grow and use swift more, they will see areas to improve the software and contribute those improvements back to the community.

As we move forward with swift development, certain fundamental things must be preserved, protected, and encouraged. We must maintain a healthy project. We must ensure good feedback channels with users. We must encourage other companies to continue to participate and even submit patches. We must do what we can to encourage and support an active ecosystem of tools for swift. The universe of end-user tools, automation software, and monitoring systems all factor into a decision to use swift or not. If we fail in these fundamental areas, we might as well pack up and go do something else.

With these concerns in mind, I see three realms of future swift development. Realm one is improving swift by fixing bugs and adding features. Realm two is data-compute locality. Most (if not all) data processing tasks can be improved by reducing the latency between where the data is stored and where it is processed. Realm three moves beyond data-compute locality for a single swift deployer and solves data federation.

We are currently working on realm one: improving swift by fixing bugs and adding features. The main goals center on better serving very large and very small clusters. This is an ongoing task, and even when large and small deployments are better served, there will always be bug fixes and smaller feature improvements. Some features will be large, and some will be small. The work here mostly focuses on filling the feature gaps in swift for specific use cases.

Realm two is waiting on nova stability. After large nova clusters are running in production, we can start to explore what it will take to unify the clusters. The goal is to bring compute “near” the data in a network sense. The nearest option is running compute on the same server that holds the data, but placing it in the same cabinet or availability zone may be a simpler solution. Since data is “sticky” and hard to move, bringing the compute to the data is often more realistic than moving the data to the compute. I do not foresee swift ever merging with nova; rather, I would like to see swift and nova cooperate in such a way that swift’s ring can be used as a scheduler for nova VMs. Currently nova is in a state of flux and needs to focus on maturity before large problems like swift integration are tackled. I expect nova-swift integration to be on hold for about another 12 months while nova matures.
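To make the idea of a ring acting as a scheduler a little more concrete, here is a rough sketch in Python. It is a toy consistent-hash ring, not swift's actual ring code, and nothing here is a real swift or nova API: the names ToyRing, nodes_for, and pick_compute_host, along with the node, zone, and object names, are all hypothetical. The sketch assumes each storage node and each compute host is labeled with a zone, and it prefers a compute host on the same server as the data, then one in the same zone, then anything available.

```python
import hashlib
from bisect import bisect_right


class ToyRing:
    """Toy consistent-hash ring. Illustrative only; NOT swift's ring."""

    def __init__(self, nodes, vnodes=64):
        # nodes: dict of storage node name -> zone name (hypothetical layout)
        self.zones = dict(nodes)
        points = []
        for name in nodes:
            # Give each node several points on the ring to even out placement.
            for i in range(vnodes):
                points.append((self._hash(f"{name}-{i}"), name))
        points.sort()
        self._keys = [key for key, _ in points]
        self._names = [name for _, name in points]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def nodes_for(self, obj_path, replicas=3):
        """Walk clockwise from the object's hash, collecting distinct nodes."""
        start = bisect_right(self._keys, self._hash(obj_path))
        found = []
        for offset in range(len(self._names)):
            name = self._names[(start + offset) % len(self._names)]
            if name not in found:
                found.append(name)
                if len(found) == replicas:
                    break
        return found


def pick_compute_host(ring, obj_path, compute_hosts):
    """Pick a compute host "near" the data.

    compute_hosts: dict of compute host name -> zone name (hypothetical).
    Preference order: same server, then same zone, then any host.
    """
    storage = ring.nodes_for(obj_path)
    # 1. Same server: a compute host that is also a storage node for the data.
    for name in storage:
        if name in compute_hosts:
            return name
    # 2. Same zone as any replica of the data.
    data_zones = {ring.zones[name] for name in storage}
    for host in sorted(compute_hosts):
        if compute_hosts[host] in data_zones:
            return host
    # 3. Fall back to any available host, or None if there are none.
    return min(compute_hosts) if compute_hosts else None


ring = ToyRing({"s1": "z1", "s2": "z2", "s3": "z3", "s4": "z1"})
path = "AUTH_demo/photos/cat.jpg"
replicas = ring.nodes_for(path)
chosen = pick_compute_host(ring, path, {"c1": ring.zones[replicas[0]], "c2": "z9"})
print(replicas, chosen)
```

The point of the sketch is only that the same lookup that answers "where does this object live?" can also answer "where should work on this object run?". A real integration would also have to weigh host capacity, load, and failure handling.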

Realm three is the ultimate goal. Federating compute is a fairly simple concept to understand. “Bursting into the cloud” is common enough to have become a marketing phrase. Federating storage still needs to be defined even before it can be understood. I believe it involves datasets distributed and replicated across many storage providers and dynamically balancing access to them. This is something I’ve talked about in a previous blog post.

Solving these problems will take a lot of work and a lot of time. As we move from one realm to the next, we must not consider work to be “done” in the previous realms. We must always listen to feedback and continue to polish the system as a whole.

Swift has been actively developed for a little over two years now. It was revealed to the world about one year ago and has made tremendous progress since. I’m quite proud to have been a part of the project. We have all learned a lot and had a lot of fun. Swift is in a great place: openstack momentum is growing, more users are deploying swift, and the vast majority of the feedback we hear is positive. Swift’s first two years have been a success. As we remember the fundamental things and work together as part of an active community, swift’s future will be even brighter than its past.

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

The thoughts expressed here are my own and do not necessarily represent those of my employer.