I have been writing a small Perl script (which is part of a larger Perl script) to parse the log file generated by MRTG at work. However, for a reason I have not yet identified, MRTG consistently introduces a time drift in the first few entries of the log file. Although I have devised a workaround that works well for the requirement at hand, I would still like to know what’s going on and whether the drift only exists in my environment. If any of you has experience running MRTG, do take a look at this thread I opened on the mrtg-mailing-list, and do suggest anything you know that might explain the situation. I’d appreciate that.
Just this afternoon, I had occasion to think up a quick regular expression to match some numbers in a fixed string within a large block of text. The regular expression didn’t take long to figure out. However, after testing it against the block, it became obvious that the text contained multiple substrings that matched the pattern and that all the matches were required. Now, if I were doing this in Python, I would without thinking use re.findall() to apply the pattern across the string and retrieve every matched group (re.search() stops at the first match). In Perl, despite having spent thrice as much time coding in it as in Python, I didn’t know what to do. I tried looping around the match expression, but that didn’t quite work as I wanted it to. It was then, while haplessly groping for some sort of help, that I opened perldoc perlrequick. It is a quick tutorial on regular expressions in Perl that comes bundled with the Perl distribution. The section on “More matching” in that tutorial introduced the use of the global modifier “//g”. Now, I had known of the global modifier, but I didn’t know how to use it in the context of multiple group matches.
I had to match the pattern http://www.securityfocus.com/bid/(\d+) multiple times. Using the global modifier it can be achieved thusly:
my @bid = ($alert =~ m#http://www\.securityfocus\.com/bid/(\d+)#g);
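For comparison, here is a minimal sketch of the same extraction in Python, where re.findall() plays the role of the global modifier. The alert text below is made-up sample data, not output from any real system.

```python
import re

# Made-up sample text containing several SecurityFocus BID links.
alert = (
    "See http://www.securityfocus.com/bid/1234 and "
    "http://www.securityfocus.com/bid/5678 for details."
)

# With a single capturing group, findall() returns a list holding
# that group's text for every non-overlapping match.
bids = re.findall(r"http://www\.securityfocus\.com/bid/(\d+)", alert)
print(bids)
```

As in the Perl version, the result is a plain list of the captured numbers (as strings), one per match.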
Last night, I was bitten by namespace clobbering in Perl. I am not sure if that is what it is called.
I was modifying Perl code I had written to parse log files generated by MRTG to track traffic that comes into and goes out of an interface on a device being monitored. The part of the code I was tweaking depended on calculating date and time through localtime, which is a built-in function in Perl. However, instead of returning what I was expecting, the function was returning what looked like a bogus array reference. Since that part of the code worked as documented in the manual when moved into and run from a separate source file, independent of the bigger piece of code I was trying to get it to coexist with, I was stumped looking for the problem for a long time.
It took me a few hours to bump into the documentation for the Time::localtime module imported early in the actual code. Apparently, importing Time::localtime was clobbering the local namespace by overriding the built-in localtime function. Since the module was being loaded in the actual source file and not in the smaller test script, the code broke when run from within the bigger source but not when run from the test script. This behaviour is documented in “perldoc Time::localtime”.
As I usually do when I stumble on a problem that involves programming languages, I logged on to #perl on irc.freenode.net. A few people suggested a few alternatives. The one that worked for me involves not importing the module’s localtime into my namespace at all (for instance, by loading the module with an empty import list) and, instead, calling the module’s function by its fully qualified name at whatever point in the code it is required. For example:
$sec = Time::localtime::localtime()->sec;
my ($sec, $min, $hour, $mday, $mon, $year, undef, undef, undef) = localtime($timestamp);
The first line invokes the localtime() function defined within Time::localtime, which returns an object, while the second calls the built-in localtime() function, which returns a list.
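The same kind of clobbering can be sketched in Python, purely as an analogy and not as a description of how Perl does it. Here a module-level function named divmod hides the built-in of the same name and returns an object instead of a tuple, much as the imported localtime returns an object instead of a list. The QuotRem class is a hypothetical stand-in invented for this illustration.

```python
import builtins


class QuotRem:
    """Hypothetical object-style result, invented for this illustration."""

    def __init__(self, quot, rem):
        self.quot = quot
        self.rem = rem


def divmod(a, b):
    """Module-level definition that hides the built-in divmod() in this file."""
    quot, rem = builtins.divmod(a, b)
    return QuotRem(quot, rem)


shadowed = divmod(7, 2)            # the replacement: an object with attributes
original = builtins.divmod(7, 2)   # the qualified name still reaches the built-in

try:
    q, r = shadowed                # code written for the built-in's tuple now fails
    unpack_error = None
except TypeError as exc:
    unpack_error = exc
```

The fix has the same shape as the one above: say explicitly which of the two functions you mean by using its qualified name.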
I am hooked on Skype. If you can figure out my Skype ID, you can have a talk with me. I will leave behind a simple hint: first name minus last name. Heh.
Sunday morning just drifted away into afternoon. We are having the most wonderful weather here in Karachi. It has been raining, on and off, since the wee hours of the morning, with cold winds making the bathing leaves dance in the air. Lovely.
Now, I only wish I had better eyes. I could then go out and get drenched. If I do that right now, my glasses will get all wet, I won’t be able to see anything, and three quarters of my time would be spent taking my glasses off, cleaning them, putting them back on, then taking them off again, and so on and so forth, instead of actually enjoying the weather. Oh well!
Some people, when confronted with a problem, think “I know, I’ll use regular expressions.”
Now they have two problems.
-- Jamie Zawinski, in comp.emacs.xemacs
And it is true. Regular expressions, known as REs for short in Python, are a beast. They are powerful, yet difficult to handle, read, and maintain. It is tempting to use regular expressions to solve problems involving string parsing. However, it is just as easy to fall into the trap they set.
I was reading the Python Regular Expression HOWTO this morning. I came across what are called “named groups”. I didn’t know Python’s re module (the regular expression engine) supported named groups. In regular expressions, if a part of the string that is matched against a pattern is to be extracted, then the part of the pattern that matches that substring is enclosed within the special metacharacters “(” and “)”. A pattern enclosed within these metacharacters is called a group. The name comes from the use of parentheses to group together smaller expressions in mathematics.
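To make the idea of a group concrete, here is a minimal sketch in Python using plain numbered groups. The sample string and the date pattern are made up for illustration.

```python
import re

# Parentheses mark the parts of the match we want to pull out.
match = re.search(r"(\d{4})-(\d{2})-(\d{2})", "backup taken on 2007-03-18 at noon")

whole = match.group(0)    # the entire matched substring
year = match.group(1)     # text matched by the first pair of parentheses
parts = match.groups()    # all groups, in order, as a tuple
print(whole, year, parts)
```

Group 0 is always the whole match; the numbered groups follow the order of their opening parentheses.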
In Perl, if one wants to extract the user and domain parts from a standardised e-mail address (I am not considering exceptions here), one could use the following RE:
my $email = 'firstname.lastname@example.org';
my ($user, $domain) = $email =~ m/([\w+.]+)@([\w+.]+)/;
That was a simple RE. There were only two groups to match. If there are more groups, they can be accessed using Perl’s special variables $1, $2, and so on, which automatically contain the substrings matched by the grouped parts of the RE. One has to keep track of the numbers, and, with REs exhibiting the naughty habit of getting pretty dirty real fast, it can go wrong in all sorts of ways. I can safely say I have been burned by that before: REs with more than ten groups, and me ending up confused by the group numbering. I am not sure if Perl has a cleaner workaround for this. I haven’t looked.
Enter the world of the re module in Python. The re module is fascinating and fun to work with. Why? You’ll know when you use it yourself: there are just too many good points to list and too little space to do that in. I will, however, shed light on one feature of the re module that provides a convenient solution to the problem identified in the above paragraph (and it also happens to be the theme of this post, so I’d make no sense if I didn’t talk about it).
You already know what groups are, and you know that groups are accessed via numbers (remember $1, $2?). What has Python’s re module got that makes it so fascinating with respect to this particular problem? Named Groups. Yes. Named Groups: groups that can be accessed via names as well. Confusing, eh? A quick example will clear it all up.
import re
pattern = re.compile(r'(?P<user>[\w+.]+)@(?P<domain>[\w+.]+)')
address = pattern.search('email@example.com')
user, domain = address.group('user'), address.group('domain')
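A small follow-up sketch, using the same pattern and sample address as above: named groups are still numbered, so both styles of access work side by side, and groupdict() hands back all the named groups at once.

```python
import re

pattern = re.compile(r"(?P<user>[\w+.]+)@(?P<domain>[\w+.]+)")
match = pattern.search("email@example.com")

by_name = match.groupdict()   # every named group, keyed by its name
by_number = match.groups()    # the same groups, in positional order
same = match.group("user") == match.group(1)
print(by_name, by_number, same)
```

With ten or more groups in a pattern, looking things up by name in the dictionary is a lot harder to get wrong than counting parentheses.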
Isn’t that simply convenient, intuitive, and beautiful?
The other day I finally got a 4-port USB hub, a cheap headphone and microphone set, and a nice, hand-size optical mouse with that snazzy yet pretty useless (for me) 2x click button (it performs the double-click function so you don’t have to double click anymore). The hub as well as the headphone and microphone cost Rs 100 each, while the mouse was a decent 250.
The USB hub worked out of the box on Slackware. I’ve my external USB keyboard attached to it. The optical mouse put up resistance, though. Prior to getting this, I was using another normal optical mouse plugged into the PS/2 port on my lovely ThinkPad T21. I had only one InputDevice section defined in xorg.conf, and both the previous mouse and the TrackPoint on the ThinkPad coexisted peacefully. The scroll wheel didn’t work as setting the appropriate parameters in xorg.conf to enable wheel scrolling made the TrackPoint go mad (plus the mouse refused to work sanely as well). However, as I plugged in this 2x click optical mouse and restarted X, I found only the TrackPoint to be responding. I frantically tried different settings, rebooted the system, booted into an older release of Slackware I’ve installed on the laptop, but couldn’t get anywhere.
Frustrated and disappointed, since I had been excited to use that mouse, I went over to ##slackware on irc.freenode.net to solicit help. Old_Fogie there was kind enough to bear my questions. He suggested I needed two InputDevice sections in xorg.conf: one to account for the TrackPoint, the other for the external PS/2 mouse. I hacked up modifications to xorg.conf, but instead broke the TrackPoint (not physically, but it stopped working altogether). I was stumped. Pissed. Disappointed.
I don’t know how the thought crossed my mind, but I remembered the disable TrackPoint option in the BIOS from a long time ago. I hurriedly rebooted the laptop, dropped into the BIOS, and disabled the feature. Two minutes later, I was comfortably using the external mouse with a smile across my face.
I also set up xorg.conf properly to get the scroll wheel working. Surprisingly, the 2x click button needed no tweaks to xorg.conf beyond those required for the scroll wheel, and it works as intended. (The relevant settings that go into xorg.conf to enable the scroll wheel are provided at the end of this post.)
I am so glad. Next up on the list of things to fix is sound on Slackware. Apparently, the system fails to reload the sound modules after a resume, which leaves sound disabled. I’ve tried toying with the OSS modules, but have had no luck so far. I will probably have to compile a newer version of the ALSA library and drivers. I hope that’s sufficient.
Section "InputDevice"
    Identifier "ExternalMouse"
    Driver "mouse"
    Option "Protocol" "IMPS/2"
    Option "Device" "/dev/psaux"
    Option "Buttons" "5"
    Option "ZAxisMapping" "4 5"
EndSection

Section "ServerLayout"
    # this section has other stuff too, which has been
    # snipped out to keep it short.
    InputDevice "ExternalMouse" "CorePointer"
EndSection