How to write MCollective agents to run actions in the background

The Marionette Collective, or MCollective for short, is a framework for running system administration tasks and jobs in parallel against a cluster of servers. As the number of servers you manage grows, the task of managing them, including keeping their operating systems and installed packages up to date, quickly becomes a nightmare. MCollective helps you drag yourself out of that nightmare and into a jolly dream, one where you are the king, with a powerful tool at your disposal that lets you control all of your servers in one go. I’ll probably do a bad job of painting MCollective in a good light, so I’d recommend you read all about it on MCollective’s main website.

Like every good tool worth its salt, MCollective gives you the power to extend it by writing what it calls agents: custom code that runs on your servers to perform any kind of job. You can read about agents here. MCollective is written entirely in Ruby, so if you know Ruby, which is a pretty little programming language by the way, you can take full advantage of it. Incidentally, a part of my day job revolves around writing MCollective agents to automate all sorts of jobs you might want to perform on a server.

For a while I have been perplexed by the lack of support for running agents in the background. Not every job finishes in milliseconds. Most jobs of average complexity take anywhere from seconds to minutes. And since I write an API that uses MCollective agents to execute jobs, I often run into the problem of the API blocking while an agent takes its sweet time to run. As far as I have looked, I haven’t found any support within MCollective for running actions asynchronously.

So I got down to thinking and came up with a solution, which you could call a bit of a hack. But in all my experience testing it so far, I have yet to face any issues with it.

I’ve an example hosted on GitHub. It’s very straightforward, even if crude, and the code is self-explanatory. At the heart of it is writing your agents’ actions so that they fork and run as child processes, without having the parent wait to reap the children, and adding one extra action per agent to fetch the status of the other actions. The approach has to be repeated for every agent you have, but only for those actions you wish to run asynchronously. With the agents in place, all you’ll need to do is call them thus:

$ mco rpc bg run_bg -I node -j
{"status": 0, "result": "2434"}

and repeatedly fetch the status of the async action thus:

$ mco rpc bg status pid=2434 operation=run_bg -I node -j
{"status": 0, "result": ""}
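Under the hood, the trick is just the classic Unix background-job pattern. Here is a minimal shell sketch of the idea (the sleep stands in for the real work; none of this is the actual agent code):

```shell
# Start the long-running work in the background; the caller gets control
# back immediately instead of blocking until the job finishes.
sleep 2 &          # stand-in for the real job the forked child would perform
pid=$!             # this pid is what an action like run_bg would hand back

# A status action can later probe the pid without blocking:
if kill -0 "$pid" 2>/dev/null; then
    echo "running"
else
    echo "finished"
fi
```

In the Ruby agent, the equivalent is forking the child and detaching it (e.g. with Process.detach) so the parent never waits on it; the status action then checks whether the recorded pid is still alive.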

It’s a solution that works. I have tested it for over a month and a half now, across many servers, without any issues. I would love to see people play with it. The code is full of comments that explain how everything works, but if you have any questions, I’d love to entertain them.


Mercurial: Merging changes from an untracked copy

One of our projects, tracked using Mercurial and hosted remotely on BitBucket.org, presented us with a version control conundrum we had not faced before. This post is dedicated to describing the problem and one particular solution to it.

The project in question is tracked using Mercurial and has a history to it. By history, I refer to a set of change-sets (or commits, if you are a Git user not familiar with the use of change-sets to refer to commits). We had a user, let’s call them User A for the sake of clarity, who downloaded the source of the project at a particular change-set. Let’s call that change-set ‘Change-set X’. Note that User A didn’t clone the project repository and go back to Change-set X. They used the ‘get source’ convenience feature on BitBucket, which makes it possible to download only the source of the project as it looks at a particular change-set or at the tip of the repository (by tip, I refer to the last change-set recorded in the project). So User A got the source on their local system and made some changes to it. Unfortunately, they did all this without actually tracking the source under Mercurial, or any other version control system for that matter.

Let’s introduce User B. User B is another contributor on the project. Unlike User A, they have a clone of the repository on their local machine and contribute changes and additions through their tracked working copy. User B had committed and pushed several changes since Change-set X, all of which were visible in the remote repository on BitBucket. This is when the problem arose: both users now wanted the changes User A had made to their private, untracked copy to be merged and pushed to the remote repository, so that they would be available to User B. How would they go about doing it?

The biggest hurdle was that the copy User A had wasn’t tracked under Mercurial at all. It had no history, nothing of the sort. It simply wasn’t being tracked.

I thought about this for a while and came up with a plan I thought would be worth giving a go. It went like this. First of all, User A had to initialize a Mercurial repository in the local copy they had made changes to, creating a brand-new repository. Next, they had to push the changes upstream to the remote repository on BitBucket. However, when they did that, Mercurial aborted and complained about the local repository being unrelated to the remote one. Having looked around for what that really meant, I found that Mercurial complains in that tone whenever the root or base change-set of the two repositories is missing or different, which was indeed the case here. Reading the help page for the “hg push” command, though, I noticed the “--force, -f” switch, with a note saying it could be used to force a push to an unrelated repository.

For what it’s worth, the forced push worked, in that it was able to push the changes to the remote repository. But it did mess up the history and the change-set time-line a bit, because the change-sets from User A’s copy had a different base and parent than those at the tip of the remote repository. As far as User B was concerned, when they pulled the latest changes from the repository, they had two dangling heads to deal with and had to merge. The merge resulted in a lot of unresolved files with marked conflicts. Since I wasn’t familiar with the changes in the project, I didn’t pursue User B (and User A) beyond this point.

While wondering what a better way to handle this situation might be, I posed the question in the #mercurial IRC channel on FreeNode. User “krigstask” was kind enough to provide a solution. His solution went like this. User A would clone the remote repository on their local machine.

$ hg clone URL cloned-repo

They would then jump back to Change-set X.

$ cd cloned-repo
$ hg update -C change-setX

They would then recursively copy the files from their local, untracked copy of the repo into the working copy of the clone.

$ cp -r /path/to/local/untracked/repo/* .

They would go through the diffs to make sure their changes were all there.

$ hg diff

They would commit their changes.

$ hg commit

Finally, they would merge their commit with the tip of the repo (following up with a commit and a push of the merge result).

$ hg merge

In hindsight, this is a cleaner approach, and it actually works. I wonder why it didn’t cross my mind when I was thinking about a solution. Many thanks to “krigstask” for taking the time to explain this approach to me.

I hope I was able to provide something helpful to my version-control-inclined readers in particular, and to everyone else in general.

Converting a Git repo to Mercurial

Until today, we had most of our projects in Mercurial repositories hosted on a Windows box in-house. The Mercurial web interface was set up to provide convenient read-only access to contributors and non-contributors alike. However, the box in question was unreliable, and so was the Internet link it lived on. And in this world, there are few things more troublesome than having your source control server unavailable due to unexpected downtime.

What’s special about today is that we moved most of our Mercurial repositories to BitBucket.org. While they will remain private, both the contributors and the internal non-contributors will be given access to them. Aside from having the repositories on a central server we can almost always rely on, and a lovely Web interface to browse and control them, we also get useful goodies like issue trackers, wiki pages, and an easy user management interface. It is a win-win situation for everyone.

One of the projects I was working on had only ever been in a local Git repository. I had started work on it at a time when I found myself deeply in love with Git. Since BitBucket is a Mercurial repository warehouse, I had to find a way to convert or migrate the Git repository into a Mercurial one.

I looked around on the Web and found a lot of people recommending the HgGit plugin. As I understood it, this plugin makes possible, among other things, a workflow that involves working in a Git repository and pushing the change-sets to a Mercurial counterpart. However, the process of setting it up seemed rather complex to me. Besides, I didn’t want to keep the Git repository lying around once I was done with the migration. I wanted to migrate the Git repository to a Mercurial one, push it upstream to BitBucket, and make any future changes to the source by cloning from the Mercurial repository. HgGit seemed like overkill for my needs.

I then discovered Mercurial’s ConvertExtension. This extension does just what I wanted: convert repositories from a handful of different SCMs into Mercurial. The process of converting a Git repository (or any other supported repository) to a Mercurial one through ConvertExtension is very straightforward.

As a first step, you are required to edit your global .hgrc file to enable the extension thus:


[extensions]
hgext.convert=

You are then required to run the hg convert command on your target Git repository thus:


$ hg convert /path/to/git/repo

This will migrate the Git repository and store it in the current directory, inside a new directory named repo-hg (the name of the source directory with -hg appended). Once inside the newly created Mercurial repository (and this part is important), you have to run the following command to check out a working copy, since the convert extension leaves the working directory empty:


$ hg checkout

You may then push the repository to BitBucket with the usual hg push.

:)

PS: This blog post really helped get me going in the right direction.

Ranting about Mercurial (hg) (part 1)

The day I got a grip on git, I fell in love with it. Yes, I am talking about the famous distributed version control application: git. I was a happy Subversion (svn) user before discovering git, but the day-to-day workflow with git and the way git worked made me hate Subversion. I was totally head over heels in love with git.

I have come to a point where the first thing I do when I am about to start a new project is to start a git repository for it. Why? Because it is so easy and convenient to do. I really like the git workflow and the little tools that come with it. I know that people who complain about git often lament its lack of good, proper GUI front-ends. While the state of GUI front-ends for git has improved drastically over the years, I consider myself a happy command-line user of git. All my interactions with git happen on the command-line, and I will admit, that does not slow me down, hinder my progress, or limit me in any way. I have managed fairly large projects with git from the command-line, and let me tell you, if you are not afraid of working on the command-line, it is extremely manageable. However, I understand that a lot of people run away from the command-line as if it were the plague. For those people, there are some really good git GUI front-ends available.

I have to admit that for someone coming from no version control background, or from a centralized version control background (such as Subversion or CVS), the learning curve of git can turn out to be a little steep. I know it was for me, especially when it came to getting to grips with concepts like pushing, pulling, rebasing, and merging. I don’t think I am off-base in thinking that a good number of people who use version control for fairly basic tasks are afraid of merge conflicts when collaborating on code. I know I was always afraid of them, until I finally faced a merge conflict for the first time. That was the last time I felt afraid, because I was surprised to find out how easy it is to deal with and (usually) resolve merge conflicts. There is no rocket science to it.

I have been so impressed with git that I wince when I am forced to use any other version control system. In fact, I refuse to use Subversion, for example, and try my best to convince people to move to git. Why? Because git is just that good. I should probably point the reader to this lovely article that lays down several reasons why git is better than a number of other popular version control applications out there.

Recently, at work, I have had to deal with the beast they call Mercurial (hg). Yes, it is another famous distributed version control system. I was furious at my peers’ decision to select Mercurial for work-related use, and tried my best to convince them to use git instead, but they didn’t give in to my attempts to persuade them. I had no choice but to unwillingly embrace the beast and work with it.

Instantly, I had big gripes with the way Mercurial worked, as well as with its general workflow. Where I fell in love with how git tracks content and not files, I hated the fact that Mercurial only tracks files. What I mean by that is that if you add a file in a Mercurial repository to be tracked and commit it, then any subsequent change you make to that file is automatically included in the next commit. Mercurial won’t let you manually specify which of the changed files should be part of the pending commit, something git does through its staging area: a staging area holds the changes that have been explicitly selected to be part of the next commit. With git, you have to manually add each file to the staging area, and I loved doing it that way. With Mercurial, this happens automatically.

This in turn brings me to the second big gripe I had with Mercurial: incremental commits. The git workflow places an indirect emphasis on incremental commits. What are incremental commits? In the version control world, it is considered best practice to make each commit as small as possible, without breaking any existing functionality. If, for example, you have changed ten files to add five different functionalities that are independent of each other, you would like to commit only one functionality at a time. That is where incremental commits come into play: you specify only the files you want committed and commit them, instead of committing all the changed files in one bunch with a general-purpose commit message. Incremental commits are a big win for me, and I adore the feature. Because git makes you manually specify which changed files go into the staging area, incremental commits come easily and naturally. With Mercurial, which automatically sweeps all changes to tracked files into the next commit, they do not.

So I asked the obvious question: how do we do incremental commits with Mercurial? After sifting through the docs, I found out that it is indeed possible, but in a rather ugly way. The Mercurial commit command takes an extra parameter, -I, that can be used to specify explicitly which files to make part of the commit. But that isn’t all. If you want to commit, say, five of the ten changed files, you have to attach the -I switch to each of those files; otherwise, Mercurial will fail to include them in the commit. To this day, this little quirk bites me every now and then.
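To make the quirk concrete, here is what committing only two of several changed files looks like in each tool (the file names and commit message are made up):

```shell
# Mercurial: the -I switch has to be repeated for every file
$ hg commit -I app.py -I util.py -m "Add input validation"

# The git equivalent: stage exactly the files you want, then commit them
$ git add app.py util.py
$ git commit -m "Add input validation"
```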

I will admit that the one thing I really liked about Mercurial was the built-in web server it comes with, which provides a web-based interface for viewing your Mercurial repositories. This was something that was missing from, or quite difficult to do with, git. And this was also one of the biggest reasons why my peers at work decided to go with Mercurial: they all wanted a convenient way to access the repositories over the web.

To be continued.

Short and useless commit messages

I have been guilty of writing short, pointless, and completely useless commit messages before. Prompted by a recent thread on StackOverflow about the worst commit message one has ever authored, I looked into the commit log of a project I have been working on, and subsequently maintaining, for the past year. I had to learn git because it had been decided that git would be used for version control on the project. Git has indeed been used since the onset of the project, but during my first three quarters on it, owing to my unfamiliarity with distributed version control, a shameful lack of knowledge of how git worked, and a growing misperception that git is overly complicated for beginners, I used git sparingly (in the sense that I did not commit regularly, nor did I use any but its simplest features) and ham-handedly. Nevertheless, skimming through the commit log from the very first commit onwards, I found the following gems:

  • minor changes
  • some brief changes
  • assorted changes
  • lots and lots of changes
  • another big bag of changes
  • lots of changes after a lot of time
  • LOTS of changes. period

Of course, the ever notorious “bug fixes” is also to be found in a lot of places.

On a happier note, I have since come to understand and utilise the full potential of git. And I no longer blurt out the first thing that comes to mind when authoring commit messages.