Hello,
My question concerns the development of Eucalyptus. Since I could not find a developer mailing list, I'll post it here; please redirect my request if there is a better place ;)
At the Distributed Systems Group of the Philipps University, Marburg, we developed and evaluated an image distribution mechanism via BitTorrent for cloud environments. It allows an image to be distributed to any number of downloading hosts in approximately constant time (that is, largely independent of the number of hosts). The ideas were successfully integrated into the Xen Grid Engine (see http://mage.uni-marburg.de/trac/xge).
Since image distribution in Eucalyptus between Walrus and the node controllers is done via separate HTTP requests (using curl), which may result in a performance overhead if an image is downloaded by multiple hosts at the same time, we thought about integrating the BitTorrent distribution mechanism into Eucalyptus.
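To illustrate the scaling concern, here is a back-of-the-envelope sketch using the standard fluid-model lower bounds for client-server and peer-to-peer distribution. It is not a measurement of Eucalyptus or the XGE: the function names are hypothetical, all hosts are assumed to have identical links, and protocol overhead is ignored.

```python
def http_time(size_mb, n_hosts, server_up, host_down):
    """Lower bound (seconds) when every host pulls its own copy from one server.

    The server must upload n_hosts full copies, and no host can finish
    faster than its own downlink allows. Rates are in MB/s.
    """
    return max(n_hosts * size_mb / server_up, size_mb / host_down)


def swarm_time(size_mb, n_hosts, server_up, host_up, host_down):
    """Lower bound (seconds) when hosts also upload pieces to each other.

    Three limits apply: the server must send at least one full copy,
    each host is limited by its downlink, and the total data delivered
    is limited by the combined upload capacity of server and hosts.
    """
    total_up = server_up + n_hosts * host_up
    return max(
        size_mb / server_up,
        size_mb / host_down,
        n_hosts * size_mb / total_up,
    )


if __name__ == "__main__":
    # Assumed figures: 1000 MB image, 100 MB/s links everywhere.
    for n in (1, 10, 50):
        print(n, http_time(1000, n, 100, 100), swarm_time(1000, n, 100, 100, 100))
```

With these assumed figures, 50 hosts give a bound of 500 s for separate HTTP downloads and 10 s for the swarm, the same as for a single host.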
However, there are several open questions for us. It is not clear which components would need modification: the different Eucalyptus components use multiple frameworks and web services, so there are many dependencies that need to be kept in mind and that are difficult to track when reading the source. In setups where the node controllers cannot be accessed from the outside (e.g. because of a private network, firewall, etc.), either the images will need to be cached at the corresponding cluster controller, or the cluster controller will need to act as a BitTorrent bridge between Walrus and the node controllers. If the caching solution is chosen, a decision has to be made between a push protocol (images are forwarded to the cluster controllers once submitted to Walrus) and a pull protocol (images are retrieved by the cluster controllers only on demand). Are there already any plans to implement other image distribution mechanisms?
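The push/pull distinction for the caching variant can be sketched as follows. This is only an illustration: `ClusterImageCache`, `on_image_registered`, and the `fetch_from_walrus` callback are hypothetical names and do not correspond to any existing Eucalyptus interface.

```python
class ClusterImageCache:
    """Toy image cache as it might sit on a cluster controller (hypothetical)."""

    def __init__(self, fetch_from_walrus, mode="pull"):
        if mode not in ("push", "pull"):
            raise ValueError("mode must be 'push' or 'pull'")
        self._fetch = fetch_from_walrus  # callable: image_id -> image bytes
        self._mode = mode
        self._cache = {}

    def on_image_registered(self, image_id):
        """Called when an image is submitted to Walrus."""
        if self._mode == "push":
            # Push: the image is forwarded right away, before any NC asks.
            self._cache[image_id] = self._fetch(image_id)

    def get_image(self, image_id):
        """Called when a node controller needs the image."""
        if image_id not in self._cache:
            # Pull: transfer happens only on first demand.
            self._cache[image_id] = self._fetch(image_id)
        return self._cache[image_id]
```

Push trades storage and bandwidth for images that may never be started against a shorter wait on first launch; pull does the opposite.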
Regard this post as a kind of request for comments. Any ideas, objections or information about where to start best are welcome.
Kathi Haselhorst
Hi Kathi,
Thanks for contacting us! (And, yes, this is the right venue as we prefer to communicate via the forum.)
You are absolutely right that the image distribution mechanism within Eucalyptus can be made more efficient. We've discussed a few approaches to this problem, but so far nothing concrete has made it into our short-term plans. (This is partly because we are not confident that simply plugging BitTorrent in between Walrus and the NCs will be near optimal, which makes this an interesting problem.) So, if this is something you are interested in working on in, say, the next six months, I do not anticipate any duplication of work. (There might be changes to image cache management on the NC, but that would be complementary to adding a swarming-based transport.)
You also seem to have the right understanding of the Eucalyptus architecture components related to this task. To your two options I could add a third: instead of involving the CC in image distribution (either as a BitTorrent bridge or as a cache), it seems that the NCs could form a swarm of their own, with the first one(s) to download the image from Walrus using HTTP acting as seeds. This solution has the advantage of narrowing down changes in Eucalyptus to the NC, which is the simplest component, written in C with some shellouts to Perl.
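A minimal sketch of the seeding decision in this third option, assuming some shared coordination point that the NCs can all reach (how they would discover each other is not specified here). `ImageSwarm`, `join`, and `plan_transports` are hypothetical names, and Python is used only for brevity; the NC itself is written in C.

```python
from dataclasses import dataclass, field


@dataclass
class ImageSwarm:
    """Tracks which NCs fetch one image over HTTP and which join the swarm."""

    max_http_seeds: int = 1
    seeds: list = field(default_factory=list)
    peers: list = field(default_factory=list)

    def join(self, nc_id):
        """Return the transport an NC should use for this image."""
        if nc_id in self.seeds:
            return "http"
        if nc_id in self.peers:
            return "swarm"
        if len(self.seeds) < self.max_http_seeds:
            # The first requester(s) download from Walrus and then seed.
            self.seeds.append(nc_id)
            return "http"
        # Everyone else fetches pieces from the seeds and from each other.
        self.peers.append(nc_id)
        return "swarm"


def plan_transports(nc_ids, max_http_seeds=1):
    """Assign a transport to each NC in request order."""
    swarm = ImageSwarm(max_http_seeds=max_http_seeds)
    return {nc_id: swarm.join(nc_id) for nc_id in nc_ids}
```

Raising `max_http_seeds` above one would let several NCs pull from Walrus in parallel, at the cost of more load on Walrus.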
In terms of push/pull, we've generally stuck to top-down push for control messages while lower-level components (SCs and NCs) initiate large data transfers, either via pull or push.
Hope this helps!
Dmitrii
Hello Dmitrii,
Thanks for your reply! Your third option indeed sounds interesting, especially because we would not need to touch the WS stack in the cloud controller :)
Is there any specification for the manifest file that describes an image? Is it some Amazon EC2 standard or a Eucalyptus-specific format?
Kathi