How to write MCollective Agents to run actions in background

The Marionette Collective, or MCollective in short, is a cutting-edge tech for running system administration tasks and jobs in parallel against a cluster of servers. When the number of servers you have to manage grows, the task of managing them, including keeping the OS and packages installed on them updated, becomes without a doubt a nightmare. MCollective helps you drag yourself out of that nightmare and into a jolly dream, where you are the king, and at your disposal is a powerful tool, by merely wielding which you can control all of your servers in one go. I’ll probably do a bad job of painting MCollective in good light, so I’d recommend you read all about it on MCollective’s main website.

Like every good tool worth its while, MCollective gives you the power to extend it by writing what it calls agents to run custom code on your servers to perform any kind of job. You can read about agents here. MCollective is all in Ruby, so if you know Ruby, which is a pretty little programming language by the way, you can take full advantage of MCollective. Incidentally, a part of my day job revolves around writing MCollective agents to automate all sorts of jobs you can think of performing on a server.

For a while I have been perplexed at the lack of support for being able to run agents in the background. Not every job takes milliseconds to finish itself. Most average-level jobs, in terms of what they have to do, take anywhere from seconds to, even longer, minutes. And since I write an API which uses MCollective agents to execute jobs, I often run into the problem of having the API block while the agent is taking its sweet time to run. As far as I’ve looked, I haven’t found support within MCollective for asynchronously running actions.

So, I got down to thinking, and came up with a solution, which you could call a bit of a hack. But insofar as my experience testing it has been, I’ve yet to face any issues with it.

I’ve an example hosted on GitHub. It’s very straightforward, even if crude. The code is self explanatory. At the heart of it is creating your actions in agents to fork and run as childs, without having the parent wait for the childs to reap, and having one extra action for each agent to fetch the status of the other actions. So, essentially, the approach has to be taken for every agent you have, but only for those actions which you wish to run asynchronously. With agents in place, all you’ll need to do is call the agents thus:

$ mco rpc bg run_bg -I node -j
{"status": 0, "result": "2434"}

and repeatedly fetch the status of the async action thus:

$ mco rpc bg status pid=2434 operation=run_bug -I node -j
{"status": 0, "result": ""}

It’s a solution that works. I’ve tested it over a month and a half now over many servers without any issues. I would like to see people play with it. The code is full of comments which explain how to do what. But if you have any questions, I’d love to entertain them