Guide: Deploying web.py on IIS7 using PyISAPIe

I spent the last week toiling away, painfully trying to find a way to deploy a web.py based API on IIS7 using PyISAPIe. As frustrations mounted, I had nearly decided to give up. Being a die-hard Linux and Mac guy, I despise having to work on Windows. Here I was, not only forced to work on Windows, but to find a solution to a problem that left no stone unturned in its effort to drive me crazy. As if all this misery wasn’t quite enough, I had to work over a remote desktop session in order to research, tweak, bang my head, and get things to work. Eventually, I cut through massive frustration and despair, and managed to find a satisfactory solution. I almost danced in excitement and relief, letting out all sorts of expletives directed at Windows in general and IIS in particular.

To get back to the important question of deploying a web.py script on IIS7 using PyISAPIe: this guide lists the steps I took, including snippets of the relevant code I changed, to tame the beast. I can only hope that what follows will help some poor, miserable soul looking for help as I did (and found none).

I worked with PyISAPIe because I had successfully deployed multiple Django websites on IIS7 with it. The script in question was going to be a part of another Django website (though acting independently). It only made sense to use PyISAPIe for it as well.

First and foremost, I had to install the web.py module on the system. Having had trouble before on IIS with web.py installed through easy_install, I decided to play it safe and installed it from source. Getting web.py to work with PyISAPIe required a small hack (I may make it sound as though it all came to me in a dream, but in reality it took me days to figure out, after much anguish and pain). In the file Lib\site-packages\web\wsgi.py lies the following function:

def _is_dev_mode():
    # quick hack to check if the program is running in dev mode.
    if os.environ.has_key('SERVER_SOFTWARE') \
        or os.environ.has_key('PHP_FCGI_CHILDREN') \
        or 'fcgi' in sys.argv or 'fastcgi' in sys.argv \
        or 'mod_wsgi' in sys.argv:
            return False
    return True

In its pristine state, when web.py is imported from a source file through PyISAPIe, an exception is thrown. I don’t have the exact message, but it complains about the sys module not having an attribute argv, which reads fishy. Since the function _is_dev_mode() only checks whether web.py is being run in development mode, and I wanted everything to run in production mode anyway, I didn’t care about it. I edited the function so that its body would be bypassed and it would always return False. It looked like this (the only change is the return False added as the first line of the body):

def _is_dev_mode():
    return False
    # quick hack to check if the program is running in dev mode.
    if os.environ.has_key('SERVER_SOFTWARE') \
        or os.environ.has_key('PHP_FCGI_CHILDREN') \
        or 'fcgi' in sys.argv or 'fastcgi' in sys.argv \
        or 'mod_wsgi' in sys.argv:
            return False
    return True

This innocuous little addition did away with the exception.
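
A less invasive variation would be to keep the original checks but read sys.argv defensively. This is only a sketch, not the change described above: it is written as a standalone function for illustration (the name is_dev_mode and its two parameters are made up, not part of web.py), and it uses the in operator rather than has_key so that it also runs on newer Pythons.

```python
import os
import sys


def is_dev_mode(environ=None, argv=None):
    """Return False when a server environment is detected, True otherwise."""
    if environ is None:
        environ = os.environ
    if argv is None:
        # Embedded interpreters may not define sys.argv at all,
        # so fall back to an empty list instead of raising AttributeError.
        argv = getattr(sys, "argv", [])

    if "SERVER_SOFTWARE" in environ or "PHP_FCGI_CHILDREN" in environ:
        return False
    if any(flag in argv for flag in ("fcgi", "fastcgi", "mod_wsgi")):
        return False
    return True
```

Whether this is enough under PyISAPIe depends on nothing else in web.py touching sys.argv at import time, which I have not verified.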

Next up, I used the default Hello World-esque example of web.py found on their site to test the deployment (of course, I went on to use my original API script, which was far too complex to trim down into an example). I called it code.py and placed it inside the folder C:\websites\myproject. It looked like this:

  import web
  urls = (
      '/.*', 'hello',
      )
  class hello:
      def GET(self):
          return "Hello, world."
  application = web.application(urls, globals()).wsgifunc()

It was pretty simple. Pay particular attention to the call to web.application: I called wsgifunc() on it to get back a WSGI-compatible function with which to boot the application. I prefer WSGI.
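
For readers unfamiliar with WSGI, the object that wsgifunc() hands back follows the standard WSGI calling convention: a callable that takes the request environment and a start_response function, and returns an iterable of body chunks. The sketch below is not web.py code. It is a hand-written stand-in (simple_app and call_app are made-up names) that only illustrates that convention, exercised with the standard library's wsgiref helpers.

```python
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A bare WSGI callable, standing in for what a framework would return."""
    body = b"Hello, world."
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI callable once and collect its status and body."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the keys WSGI requires
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


status, body = call_app(simple_app, "/api/")
print(status, body)
```

Anything that can call such a function, PyISAPIe's RunWSGI included, can serve the application.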

I set up a website under IIS using the IIS Management Console. Since I was working on a 64-bit server edition of Windows and had chosen to use the 32-bit version of Python and all modules, I made sure to enable 32-bit support for the application pool being used for the website. This was important.

I decided to keep the PyISAPIe folder inside the folder where code.py rested. This PyISAPIe folder contained, importantly, the PyISAPIe.dll file and the Http folder. Inside the Http folder, I placed the most important file of all: Isapi.py. That file can be thought of as the starting point for each request that is made; it is what glues the request to the proper handler and code. I started from the Examples\WSGI\Isapi.py available as part of PyISAPIe and tweaked it to look like this:

from Http.WSGI import RunWSGI
from Http import Env
#from md5 import md5
from hashlib import md5
import imp
import os
import sys
sys.path.append(r"C:\websites\myproject")
from code import application
ScriptHandlers = {
  "/api/": application,
}
def RunScript(Path):
  global ScriptHandlers
  try:
    # attempt to call an already-loaded request function.
    return ScriptHandlers[Path]()
  except KeyError:
    # uses the script path's md5 hash to ensure a unique
    # name - not the best way to do it, but it keeps
    # undesired characters out of the name that will
    # mess up the loading.
    Name = '__'+md5(Path).hexdigest().upper()
    ScriptHandlers[Path] = \
      imp.load_source(Name, Env.SCRIPT_TRANSLATED).Request
    return ScriptHandlers[Path]()
# URL prefixes to map to the roots of each application.
Apps = {
  "/api/" : lambda P: RunWSGI(application),
}
# The main request handler.
def Request():
  # Might be better to do some caching here?
  Name = Env.SCRIPT_NAME
  # Apps might be better off as a tuple-of-tuples,
  # but for the sake of representation I leave it
  # as a dict.
  for App, Handler in Apps.items():
    if Name.startswith(App):
      return Handler(Name)
  # Cause 500 error: there should be a 404 handler, eh?
  raise Exception, "Handler not found."

The important bits to note in the above code are the following:

  • I import application from my code module. I append the directory containing code.py to sys.path so that the import statement does not complain. (I have to admit that the idea of importing application and feeding it into RunWSGI came to me while I was in the loo.)
  • I defined a script handler which matches the URL prefix I want to associate with my web.py script. (In hindsight, this isn’t necessary, as RunScript() is not being used in this example.)
  • In the Apps dictionary, I again route the URL prefix to a lambda function which actually calls the `RunWSGI` function and feeds it application.
  • I also imported the md5 function from the hashlib module instead of the md5 module as originally done in the file. This was because Python complained about the md5 module being deprecated and suggested using hashlib instead.
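
The prefix matching done by Request() can be sketched on its own. The version below is an illustration rather than PyISAPIe code: find_handler and the sample handler names are made up, and it checks longer prefixes first so that a more specific prefix wins over a shorter one, which a plain loop over a dictionary does not guarantee.

```python
def find_handler(script_name, apps):
    """Return the handler whose URL prefix matches script_name, or None."""
    # Try longer prefixes first so "/api/v2/" is preferred over "/api/".
    for prefix in sorted(apps, key=len, reverse=True):
        if script_name.startswith(prefix):
            return apps[prefix]
    return None


apps = {
    "/api/": "api_handler",        # placeholder values; real code would
    "/api/v2/": "api_v2_handler",  # store callables here
}

print(find_handler("/api/users", apps))
print(find_handler("/api/v2/users", apps))
print(find_handler("/other", apps))
```

Returning None for an unknown prefix leaves it to the caller to decide between a 404 and the 500 that the example raises.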

And that’s pretty much it. It worked. I couldn’t believe what I saw on the browser in front of me. I danced around my room (while hurling all kinds of expletives).

There’s a caveat, though. If you have specific URLs in your web.py script, as I did in my API script, you will have to modify each of those URLs and add the /api/ prefix to them (or whatever URL prefix you set in Isapi.py). Without that, web.py will not match any URLs in the file.
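
If there are many URLs, the prefix can be added in one place rather than by hand. This is a small sketch, assuming the flat (pattern, class name, pattern, class name, ...) tuple that web.py uses for urls; add_prefix is a hypothetical helper, and the patterns are made up.

```python
def add_prefix(urls, prefix):
    """Prepend prefix to every pattern in a flat web.py-style urls tuple."""
    prefix = prefix.rstrip("/")
    prefixed = []
    # Patterns sit at even positions, handler names at odd positions.
    for pattern, handler in zip(urls[0::2], urls[1::2]):
        prefixed.extend([prefix + pattern, handler])
    return tuple(prefixed)


urls = (
    "/users", "users",
    "/items/(\\d+)", "item",
)
urls = add_prefix(urls, "/api/")
print(urls)
```

The resulting tuple can then be passed to web.application in place of the original one.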

What a nightmare! I hope this guide serves to help others.

Thank you for reading. Good bye!

PS: If you want to avoid using PyISAPIe, there is a simpler way of deploying web.py on IIS. It is documented crudely over here.

Get all public interface IPs on a system using Python

     I recently came across a requirement in a project where I had to programmatically extract, in Python, all public IPs on the available interfaces of the machine the code would run on. I looked around and settled on the following snippet of code, which uses Python’s built-in socket module:

import socket
ip_list = [ip for ip in socket.gethostbyname_ex(socket.gethostname())[2] if not ip.startswith("127.")]

     While this piece of code does find a public IP listening on any of the available interfaces, its restriction lies in not being able to return all public IPs on interfaces: It gives back just one IP.

     This clearly wasn’t sufficient. I looked around again, and this time found a third-party Python module called pynetinfo. This module makes it possible to work with different network device settings.

     I rearranged the code around pynetinfo and produced this:

def get_inet_ips():
  try:
    import netinfo
  except ImportError:
    return None
  else:
    inetIPs = []
    for interface in netinfo.list_active_devs():
      if not interface.startswith('lo'):
        ip = netinfo.get_ip(interface)
        inetIPs.append(ip)
    return inetIPs

     The code above loops through all available and active interfaces on the system, fetching their IPs and storing them in a simple list. That got me all of the IPs available to a machine, excluding the loopback one, which the code was set to discard.

     But that wasn’t it. There was a slight problem. Not all the active interfaces on the system had public IPs. Some had private, local LAN IPs in the 192.168.0.0/16 and 10.0.0.0/8 subnets. The code above was returning all the IPs it could find, including public and private ones.

     I then found the netaddr third-party Python module, which provides a Pythonic means of manipulating network addresses. I modified my code to use the netaddr module and ended up with the following:

def get_inet_ips():
  try:
    import netinfo
    from netaddr import IPAddress, AddrFormatError
  except ImportError:
    return None
  else:
    inetIPs = []
    for interface in netinfo.list_active_devs():
      if not interface.startswith('lo'):
        ip = netinfo.get_ip(interface)
        try:
          ip_address = IPAddress(ip)
        except AddrFormatError:
          continue
        else:
          # If the IP is not private, use it.
          if not ip_address.is_private():
            inetIPs.append(ip)
    return inetIPs

     The netaddr.IPAddress.is_private() method in the code above determines whether the given IP is part of any of the defined private networks.
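
On newer Pythons the same filtering can be done without netaddr, using the ipaddress module from the standard library (Python 3.3 and later). This is a swapped-in technique, not what the code above does; filter_public is a made-up name, and the sample addresses are only for illustration.

```python
import ipaddress


def filter_public(ips):
    """Keep only the addresses that parse and are not private or loopback."""
    public = []
    for ip in ips:
        try:
            address = ipaddress.ip_address(ip)
        except ValueError:
            # Skip anything that is not a valid IP address.
            continue
        if not (address.is_private or address.is_loopback):
            public.append(ip)
    return public


sample = ["192.168.1.5", "10.0.0.1", "127.0.0.1", "8.8.8.8", "not-an-ip"]
print(filter_public(sample))
```

Collecting the per-interface addresses in the first place still needs something like pynetinfo; this sketch only covers the filtering step.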

     Admittedly, there is much room for improvement in the code above. I can only hope that if it doesn’t help, then at the very least it serves as an interesting read.

Debugging Django applications!

When developing Django applications, there can be many times when you would want to roll up your sleeves and get your hands dirty. You may want to know, for example, why the control isn’t falling into a certain block of code, what values are being returned to the view from the browser or test client, why a certain form is bailing out on you, or any manner of possible problematic scenarios. For these and many more, there are a number of things you can do to help yourself and your code. I am going to describe a couple of those that I frequently employ.

1. Quick and dirty debugging

Yes! It is the quickest, dirtiest, and easiest form of debugging that has been around since who knows when. As with many other programming languages, it takes the form of print statements, Python ones in the case of Django. It is really helpful when you are running your Django applications off the Django development server, where the print output makes its way to the console from which the development server is run.

2. Python Logging framework

The Logging framework for Python is as easy to use as they come. Its use resembles quick and dirty debugging to some extent, but it goes much further in terms of flexibility as well as the levels of sophistication it provides. The simplest and quickest use is to import the logging module in the file carrying the code you want to debug, and use any of the available log message creation methods: debug(), info(), warning(), error(), etc. The output from these methods makes its way to the console where the development server is running (with the default settings, only messages at the warning level and above are shown).

With the default settings, however, the logged messages do not provide useful information beyond what you tell them to print. Also, you have no control over where the messages end up. So if, say, you are running your Django project over Apache with mod_python or mod_wsgi, you may be up a stump trying to locate where the messages go. Or you may want to keep the messages separate in a different file, only to find that the default settings for the logging framework don’t lend you much room to breathe.

However, that is where the Logging framework really shines. It is configurable to a great extent. The docs for the framework give detailed information about its different nuts and bolts and the different ways in which it can be tuned. For the sake of this article, I will briefly go over a fairly basic configuration that I use when debugging Django applications. I simply create a basic configuration for the logger and put it in the settings.py file of the Django project. It looks like this:

import logging
logging.basicConfig(level=logging.DEBUG,
  format='%(asctime)s %(funcName)s %(levelname)-8s %(message)s',
  datefmt='%a, %d %b %Y %H:%M:%S',
  filename='/tmp/project.log', filemode='w')

Not only does this separate the log messages to the file /tmp/project.log, it also adds useful debugging information to the start of each logged message. In this particular case, the date and time, the name of the function from which the logging method was called, the logging level, and the actual message passed are displayed. All these and much more are thoroughly documented with elaborate examples in the documentation for the logging module.
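
To see what the format string produces, here is a small self-contained sketch. It attaches a logging.Formatter to a dedicated logger rather than calling basicConfig, and writes to a temporary file instead of /tmp/project.log, so that it can run anywhere; the logger name project_sketch and the function check_form are made up.

```python
import logging
import os
import tempfile

log_path = os.path.join(tempfile.mkdtemp(), "project.log")

handler = logging.FileHandler(log_path, mode="w")
handler.setFormatter(logging.Formatter(
    fmt="%(asctime)s %(funcName)s %(levelname)-8s %(message)s",
    datefmt="%a, %d %b %Y %H:%M:%S",
))

logger = logging.getLogger("project_sketch")
logger.setLevel(logging.DEBUG)
logger.addHandler(handler)


def check_form():
    # funcName in the format string will record "check_form" here.
    logger.debug("form is not valid")


check_form()

# Detach and close the handler so the file is flushed before reading it.
logger.removeHandler(handler)
handler.close()

with open(log_path) as log_file:
    line = log_file.read().strip()
print(line)
```

The printed line starts with the timestamp, followed by the function name, the padded level name, and the message.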

3. The Python Debugger (pdb)

You have probably already used the pdb module for debugging Python scripts. If you have not figured it out already, you can just as easily use pdb to debug your Django applications.

If you are like me, you may have got into the habit of writing unit tests prior to writing down Django views that the unit tests test. It is a wonderful habit, but it can get unnerving at times when you are writing your tests first. This is primarily because of the way the unit tests interact with your code: Once written, they run in their entirety without any form of interaction from the user. If any number of tests fail, the fact is made clear at the conclusion of the test runner. What I want to say is that there is no easy way for you to interact when the tests are being run.

With pdb, you can have a moment to sit back and take a sigh of relief. By importing the pdb module, and calling the pdb.set_trace() method right before the point where you want to start to debug your code, you can force the test runner to freeze itself and drop to the familiar, friendly pdb prompt. This helps immensely when you want to find out, for example, why a form that you are unit testing is not validating, what error messages it is receiving, what errors or outputs it should produce so that your tests can simulate those, etc. Once at the pdb prompt, you may use the usual lot of pdb commands to inspect and step through the code.
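
As a sketch of the idea, here is a plain unittest test case with a breakpoint that is switched on by an environment variable, so the test still runs unattended when the variable is unset. The variable name DEBUG_TESTS, the helper maybe_break, and the stand-in validate function are all made up for illustration; in a Django project the test would exercise a real form or view instead.

```python
import os
import pdb
import unittest


def maybe_break():
    """Drop to the pdb prompt only when DEBUG_TESTS is set."""
    if os.environ.get("DEBUG_TESTS"):
        pdb.set_trace()


def validate(data):
    """Stand-in for form validation: returns a dict of error messages."""
    errors = {}
    if not data.get("email"):
        errors["email"] = "This field is required."
    return errors


class ValidateTest(unittest.TestCase):
    def test_missing_email(self):
        errors = validate({"name": "ayaz"})
        maybe_break()  # when debugging, pdb stops inside maybe_break
        self.assertIn("email", errors)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(ValidateTest)
result = unittest.TextTestRunner().run(suite)
```

With DEBUG_TESTS set, the prompt opens inside maybe_break; typing up moves to the test's frame, where errors can be inspected.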

The use of pdb, however, is not restricted to unit testing. It may equally well be used when serving your Django applications over the development server. However, there is one little detail that needs to be accounted for. When you want to debug your Django applications over the development server with pdb, you must start the development server with these additional switches:

$ python -m pdb manage.py runserver --noreload

The -m pdb switch is documented in the documentation for the pdb module. Simply put, it runs the program under the control of the debugger, so that if the program crashes with an unhandled error, pdb automatically enters post-mortem debugging; calls to pdb.set_trace() drop to the same prompt. This is very convenient, because it means that the friendly pdb prompt shows up, and you can dissect the code from that point onwards.

The --noreload switch to the runserver command, however, is crucial. The Django development server is designed to automatically reload the Python interpreter if there’s a crash or error of some sort, or to reread all the Django files if there has been a change in any one of those. One fallout of this default behaviour of the development server is that since the Python interpreter is reloaded, all previous context is lost, and therefore, there is no way for pdb to save face. The --noreload switch, therefore, forces the Django server to stray off of its default behaviour.

With the development server running with these switches, all you have to do is make sure you have placed calls to the pdb.set_trace() method in your code where you want to break. And that’s as easy as it gets.

I hope that what I have described finds its way into the useful bucket of my readers. For now, that’s all. I hope you do enjoy, if you did not before, debugging your Django applications after reading this article. Please, stay safe, and good bye!

A guide to configuring two different Django versions for development

The Django 0.9x and Django 1.x branches are so far apart that porting a project built on top of the former to the 1.x branch, not least when it has already been rolled out into production, can easily escalate into a nightmare. If you have to maintain a legacy Django project (I choose to call projects built on anything before 1.x legacy), you will find that it is easier to maintain two different versions of Django in your development environment. How you do that is the quagmire this post revolves around.

I prefer Apache with mod_python to Django’s simplistic development web-server for Django projects. Initially, the task of setting up mod_python and Apache to your tastes may seem daunting. However, once you get around its not so steep curve, the difficult task becomes second nature.

I am working with Python 2.5 on a project with Django 1.0.2. Django is deployed on the system in the site-wide default directory for third-party Python modules and applications. I also have to support development for a legacy application built on Django 0.97 (a trunk snapshot from a time long ago). My requirement is simple: I want to be able to run both projects simultaneously without having to muck around tweaking configurations besides the one time when I set the projects up.

A garden-variety solution is to change the default Django directory in the Python path to point to the specific Django that is the need of the moment. This approach puts upfront a limitation of not being able to run both projects side by side. It also is annoying because it requires tap-dancing around Django directories when switching between the two versions. It is not a considerable blow to productivity, but if it can be avoided for a better, more efficient solution, there is no reason not to.

The solution I am proposing involves having the recent Django set up as the default, with the legacy Django installed in a different directory—after all, the legacy Django is the odd item of the group. I develop on an OS X environment, where I have the two Django versions set up thus:

/Library/Python/2.5/site-packages/django
/Library/Python/2.5/site-packages/django-0.97/django

You can probably tell which is which.

I have both projects deployed as virtual hosts on Apache, each running off the root of the web server without stepping on the toes of the other. They could just as easily be set up to serve from under a unique sub-URL path instead of the root of the web server. That is mostly a matter of preference.

For each virtual host, I have assigned a domain name based off of the name of the project. I should emphatically point out that these domains are set up to resolve to localhost on my system. They may likely have real world twins that are routable over the Internet, but for all intents and purposes, they are local to my development environment. I have done it so, partly out of convenience, and partly because one of the projects hinges on subdomain-based user-to-account mapping (what this means, simply, is that registered users on the project are assigned different accounts that directly relate to subdomains, and can log in to the project only via their respective subdomains) for proper functioning. For the latter, the domain based approach was inevitable.

With that in mind, here are the virtual host settings for the two projects:

<VirtualHost *:80>
   ServerAdmin root@localhost
   ServerName projectone.com
   <Location "/">
      SetHandler python-program
      PythonHandler django.core.handlers.modpython
      SetEnv DJANGO_SETTINGS_MODULE projectone.settings
      PythonPath "['/Users/ayaz/Work/', 
            '/Users/ayaz/Work/projectone/'] + sys.path"
      PythonDebug On
   </Location>
</VirtualHost>

<VirtualHost *:80>
   ServerAdmin root@localhost
   ServerName legacyproject.com
   <Location "/">
      SetHandler python-program
      PythonHandler django.core.handlers.modpython
      SetEnv DJANGO_SETTINGS_MODULE legacyproject.settings
      PythonPath "['/Library/Python/2.5/site-packages/django-0.97', 
            '/Users/ayaz/Work/', '/Users/ayaz/Work/legacyproject/'] + sys.path"
      PythonDebug On
   </Location>
</VirtualHost>

While each line of the above is important, the following is the highlight of this post:

   PythonPath "['/Library/Python/2.5/site-packages/django-0.97', 
            '/Users/ayaz/Work/', '/Users/ayaz/Work/legacyproject/'] + sys.path"

I have effectively injected the path to Django 0.97 into Python path. What this helps achieve is that when mod_python loads the Python interpreter to serve an incoming request for the project, Python analyses its path, looking for a module named django. The first successful match is under the /django-0.97 directory, which, if I have not already lulled you into sleep, is where Django 0.97 lives.
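
The "first match wins" behaviour is easy to see outside Apache. The sketch below is an illustration only: it creates two throwaway packages with the same made-up name, fakeframework, in temporary directories standing in for the two Django locations, and puts the legacy one ahead on sys.path, much as the PythonPath directive above does.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, version):
    """Create a minimal package under root that only records a version."""
    package_dir = os.path.join(root, name)
    os.makedirs(package_dir)
    with open(os.path.join(package_dir, "__init__.py"), "w") as init_file:
        init_file.write("VERSION = %r\n" % (version,))


default_root = tempfile.mkdtemp()  # stands in for site-packages
legacy_root = tempfile.mkdtemp()   # stands in for site-packages/django-0.97
make_package(default_root, "fakeframework", "1.0.2")
make_package(legacy_root, "fakeframework", "0.97")

# The default location is on the path, and the legacy one is injected
# in front of it, so the legacy copy is found first.
sys.path.insert(0, default_root)
sys.path.insert(0, legacy_root)
importlib.invalidate_caches()

import fakeframework

print(fakeframework.VERSION)
```

Swapping the order of the two insert calls would make the import pick up the other copy.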

For the curious, all possible mod_python directives are documented here.

All this is a one-time headache: you pop in a pill, and forget about it. I can now type in projectone.com or legacyproject.com in the browser to get to either project.

I should mention, still and all, that I have of late come to know of virtualenv, a Virtual Python Environment builder. It may be a more proper solution than what I have proposed, but I have not yet used it myself to say more.

I hope I was able to clearly explain what I had intended to.

Building MySQL-python on OS X 10.5.x (Leopard)

Stock OS X 10.5.4 (Leopard) is devoid of MySQL. Thankfully, binary packages are available from the official MySQL.com website (MySQL 5.0.67, in this case). To use Python with MySQL, not least when running Django with the MySQL backend, a Python binding to MySQL needs to be installed. It is called MySQL-python, and at the time of writing, 1.2.2 is its latest available version. To many a user’s dismay, binary packages of it are not yet available on the official website. To add fuel to the fire, building MySQL-python from source is an exercise best not attempted by the faint of heart.

Two packages, named mysql15-dev and mysql15-shlibs (peculiarly to many and aptly to some), are required. They provide development headers and libraries, along with a bunch of shared libraries, all related in some manner to MySQL 5.0.x. With a binary package of MySQL already installed, it makes sense to install binary packages of these two as well. This is where the mighty, god-sent fink comes to the timely rescue. Luckily, the fink repo has binary packages of the two available. Being a dependency of mysql15-dev, mysql15-shlibs is installed automatically with a command such as this:

$ sudo fink --use-binary-dist install mysql15-dev

Building and installing MySQL-python from here on isn’t any more difficult than running the de facto Python package build and install commands (python setup.py build, followed by sudo python setup.py install). Voila!
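
Once the build finishes, a quick way to confirm that the binding is usable is to try importing it. A minimal sketch, assuming the package installs the MySQLdb module; the helper name mysqldb_version is made up.

```python
def mysqldb_version():
    """Return the installed MySQLdb version string, or None if missing."""
    try:
        import MySQLdb
    except ImportError:
        return None
    # Fall back to a placeholder if the attribute is absent.
    return getattr(MySQLdb, "__version__", "unknown")


print(mysqldb_version())
```

A result of None means the import failed, in which case the build or install step is worth revisiting.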

MacBook, OS X, some cool software, and happy me!

I have always dreamt of having a MacBook one day. Last week was nothing short of a dream coming true (much thanks to you know who you are). I got my first brand-new, shiny, spanking white MacBook. It’s got a 2.1-GHz Core 2 Duo processor. I bumped up the RAM from the standard 1 GB to a whopping 4 GB. The screen is smaller than my Dell’s, about 13.1 inches. The entire laptop, in fact, is much smaller than the Dell. But doubtless it is nothing short of a beauty. It is running the latest iteration of Mac OS X, Leopard, 10.5.5.

I wanted to mention some of the software I have downloaded and/or installed separately. Some of these are what I believe any first-time Mac user would want to have on their Mac. Do note that I’ve never earnestly used a Mac before, which pretty much makes me a first-time user.

IM

  • Adium Adium is a multi-protocol IM client for Mac. Being multi-protocol, it supports some two dozen different protocols. I use it mostly for MSN, Yahoo, and GTalk. The interface very much resembles Pidgin, if you have used that. It is stable, and works very reliably.
  • Mac Messenger There is also a free port of MSN Messenger available on Mac called Mac Messenger. It isn’t exactly like the Windows counterpart in terms of UI and features, but for those of you who want a similar experience, it is the thing that comes closest.
  • X-Chat Aqua Yes. That is X-Chat on Mac. It is an awesome IRC client for Mac. I have used it on Windows and Linux before.
  • Skype You know what Skype is. Best for voice and video chat on Mac with all your friends who don’t own a Mac–those who do, I would highly recommend the built-in Mac application iChat. Excellent stuff.
  • Colloquy This is an advanced IRC client for Mac that supports both IRC and SILC (if you’ve ever used that before).

Office Productivity

  • OpenOffice.org for Mac I needn’t say anything. It is great.
  • Microsoft Office for Mac There is also the famous Microsoft Office for Mac, but, you guessed it, it isn’t free of cost.
  • FreeMind An excellent Java-based mind mapping tool. Great for brain-storming and generally anything that requires you to create mind maps.

Browser

  • Firefox How can I not mention that? Safari, Apple’s premier browser, is great, but Firefox is greater.

Package Management
If you are migrating from a Linux background, as I am, you will find the following two tools indispensable. They are the equivalents of tools you might be in love with on Linux, such as, ‘apt-get’, ‘yum’, etc.

Development

  • XCode and Mac Dev Tools XCode is Apple’s development environment on Mac. Not merely an IDE, it constitutes the entire development tool chain, including gcc, gdb, make, etc., along with the Cocoa and Carbon frameworks and tools for development in Objective-C. Even if you don’t require the IDE or the frameworks, you may still need the development tool chain if you ever plan to build software from source (not least your favourite open source software).
  • iPython If you hack often on the Python shell, it goes without saying that you MUST get iPython. You will never look back. It is an excellent wrapper over the bare Python shell, providing countless convenience features and lots of colourful eye-candy.
  • pysqlite Mac comes with the SQLite DB and client pre-installed. For the Python SQLite binding, you have to compile and install pysqlite from scratch. There may also be binary packages available.

SCM

  • Git If you want to move onto a feature-rich, robust and reliable distributed source code management system, do give Git a go.
  • Subversion (SVN) SVN comes pre-installed with Mac. For a non-distributed SCM, I’d pick SVN any day.

Right, that’s all for now. I’ll be droning on about everything Mac quite often now.

On someone’s request, I made a five-minute unboxing video of my Mac. I have it available privately on YouTube. If you’d like to take a peek at it, please email me your YouTube account ID at ayaz -at- ayaz.pk and I’ll send you the link.