Software Development

Stop Whining, Marissa Mayer is right

I agree with Mayer. Stop whining. The media is quick to jump on the bandwagon and proclaim that Mayer is heading backwards in time. Not true.

  1. Not allowing full-time work from home is not the same as an inflexible work arrangement.
  2. Nothing can substitute for in-person communication (read up on Sherry Turkle's work).
  3. Would you rather never see your adult children in person? No more family gatherings for life? I don't think so.
  4. People who claim working from home is more productive are missing the point. Personal productivity is a very narrow measure of success.
  5. A virtual team has to be built from in-person connections.

I am all for a flexible workplace. Since I have to work with many different people at different stages of their lives, this is what I do:

  1. allow for flexible work time, but require a core time block when everyone is in the office, say 10-3 M-Th
  2. allow people to start the day early and leave early -- great for parents who need to pick up their young children -- or to start the day late and work late, for the stereotypical techie
  3. allow for a comfortable workplace, with access to food/drinks/support services and R&R spaces (a given in tech companies)
  4. allow for short Fridays as long as work is done M-Th

Most importantly, stop whining. If you are unwilling to get dressed and commute into the office to work with your peers just because you feel like you work better at home, what else are you unwilling to do?

mysql installation on ubuntu failed

We often use WordPress as the CMS for our application's public site. That means we have to install MySQL on our Rackspace servers. Today the installation process failed several times, with this error message in syslog:



ERROR: 1064  You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'ALTER TABLE user ADD column Show_view_priv enum('N','Y') CHARACTER SET utf8 NOT ' at line 1


I first tried to uninstall and reinstall mysql-server, but it would not uninstall cleanly. In the end I had to both use apt-get and manually remove some directories to get back to a clean install:

[shell]
apt-get purge mysql-server
apt-get purge mysql-common
rm -rf /var/log/mysql
rm -rf /var/log/mysql.*
rm -rf /var/lib/mysql
rm -rf /etc/mysql
# and then:
apt-get install mysql-server --fix-missing --fix-broken
[/shell]

After that I got a clean (re)install of mysql and it started up.

Paul Bissex Presenting at the Boston Django Meetup 2010

Paul Bissex, author of "Python Web Development with Django", described how he used Django to replace two legacy desktop applications at the Boston Django meetup this month. Because of Django's ease of use and robustness, not only did he replace and deploy those apps easily, the apps have also been running error free for a very long time!

Paul Bissex Presenting at the Boston Django Meetup 2010 from PK Shiu on Vimeo.

Jacob Kaplan-Moss on DevOps

Jacob gave a talk at the Boston Django Meetup this month on the topic of DevOps -- the role of the developer and the role of the sysadmin are merging, and it is a good thing. This idea certainly resonates with me. I started my career as a DevOps by necessity -- I worked with a new and proprietary mini-computer, the Stratus. The OS was designed to be a developer's OS. All the operation support tools were designed for use by developers. It was a great OS to work with. When I moved to the *nix world, I met Ben, still one of the best sysadmins I know, who actually introduced me to Python. The world has come full circle now. With utility computing and browser based clients, small teams with small budgets launch large scale, fast growing web applications, and developers need to be their own sysadmins as well. This is the video of Jacob's talk. I was a little bit late and missed a few minutes at the beginning.

Jacob Kaplan-Moss on DevOps at Boston Django Meetup 2010 from PK Shiu on Vimeo.

Django CSRF Migration

Like many of you, I am migrating all my Django sites to Django 1.2.1. For sites that are currently in production, I am taking the slow migration route: just trying to get the site up with 1.2 without using any of the new features yet. One thing that I ran into is the new CSRF support. If you were not using it before, there really is no change, with one exception -- all the generic views and admin views require CSRF protection. This means that if you are using the Django login view, django.contrib.auth.views.login, you have to make sure that any wrapper or custom template supports CSRF.


  1. If you use your own login template, you must add {% csrf_token %} inside the form, right after the opening form tag.
  2. If you wrap the call to login with your own view, you must add the csrf decorator @csrf_protect to your view, after importing it from django.views.decorators.csrf.
  3. If you use the django.contrib.auth.logout view to redisplay a login form, you have to replace that with a wrapper, because the auth.logout view does NOT add the csrf token. (Updated)

Otherwise Django will send you a 403 error when you try to log in.

Massachusetts DOT Opens Transit Data for Public Access

The Massachusetts Department of Transportation (Mass DOT) recently decided to offer real time and other transit data to the public on a trial basis. I attended their very first developer conference (official website). My notes follow:

Open is Good - Don't Build Apps like Bridges

The central message was voiced by many presenters: open is good for everyone. The case that open is good for the transit agency was nicely explained by Christopher Dempsey of the Executive Office of Transportation (EOT).

The DOT would build software applications the way they would build a bridge. It can take two years and a full bidding and awarding process, and cost up to 200,000 dollars, and that's to build one app. For this trial, the DOT opened up the data and let the developer community innovate, and they got some amazing applications built in a few months, almost for free apart from the operating cost of serving up the data.

Reduce Travel Time, not just Transit Time

The case that open is good for the people was explained by Michael Smith of Next Bus Inc. Michael made a convincing argument that cities need a working mass transit system. A working mass transit system needs customers (riders). There are many ways to increase ridership, and cutting down travel time (door to door time) is an important part of it. Having real time data available to riders will often mean getting someone who would not normally ride public transit to use it.

Power of Technology

Keynote speaker Robin Chase, founder of ZipCar, eloquently explained how the right technology platform can open up a business model. She used "beds" as an example. If you have a spare bed in your guest bedroom, you could share that resource in a very limited way with your friends and family, using the guest bedrooms in your personal network. Hotels, on the other hand, fully commercialize the bed sharing concept, with huge investments in physical infrastructure. In return, hotels need great profits to continue operation. Compare that to -- no infrastructure, but with the right technology platform, their website, and the community support (their members), it has provided 2.8 million positive experiences in 231 countries in 67,438 cities since 2004.

Open Platform, Let the Market Decide

To summarize: providing open access to transit data allows far quicker development of innovative applications, and the market will drive the good applications to the top.

What data is available?

There are really several different types of data being made available:

  • transit schedule/route information -- Think of these as bus and train schedules. Mass DOT refers to these as the GTFS files, because they are in Google Transit Feed Specification format.
  • real time vehicle location and arrival / departure prediction -- For Mass DOT this is the real-time XML feed, provided by their service vendor Next Bus.
  • informational alerts and schedule events -- these are short term immediate alerts of equipment problems, elevator outages, and route changes, or longer term announcements of construction.
  • other statistical data -- including accident statistics, which are not available from Mass DOT yet.

Schedule / Route data

These are schedules that are defined by the transit agencies. They typically change several times a year. A schedule is defined via several components. A route is what we normally think of when we think of mass transit: Bus 77, or the Silver Line. The route shape defines the long/lat location of each stop. The service defines the days of operation: weekday, Sat, Sun. The stop times define when the bus (or train or...) is supposed to arrive at each stop, all grouped into trips. A trip is one single "run" of a bus.

Because of Google (you can now get trip planning information from Google Maps by selecting transit instead of driving), there is a "standard" called GTFS that some agencies use to provide the data. Mass DOT uses GTFS files.
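As a rough illustration of how stop times are grouped into trips, here is a minimal Python sketch. The column names are the ones the GTFS specification uses for stop_times.txt; the sample rows, the ids, and the stops_by_trip helper are made up for illustration and do not come from the Mass DOT feed.

```python
import csv
import io
from collections import defaultdict

# Made-up sample rows. A real feed ships this as stop_times.txt inside a zip file.
SAMPLE_STOP_TIMES = """\
trip_id,arrival_time,departure_time,stop_id,stop_sequence
trip_a,08:05:00,08:05:00,stop_2,2
trip_a,08:00:00,08:00:00,stop_1,1
trip_b,09:00:00,09:00:00,stop_1,1
"""


def stops_by_trip(stop_times_file):
    """Group stop_times rows by trip_id, ordered by stop_sequence."""
    trips = defaultdict(list)
    for row in csv.DictReader(stop_times_file):
        trips[row["trip_id"]].append(
            (int(row["stop_sequence"]), row["stop_id"], row["arrival_time"])
        )
    # Rows may arrive in any order, so sort each trip by its stop sequence.
    return {trip_id: sorted(rows) for trip_id, rows in trips.items()}


trips = stops_by_trip(io.StringIO(SAMPLE_STOP_TIMES))
for trip_id, stops in sorted(trips.items()):
    print(trip_id, stops)
```

Each trip ends up as an ordered list of (sequence, stop, arrival time) tuples, which is one "run" of a bus in the sense used here.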

Real Time Data

Real time data is the exciting part of the open initiative. Commuters benefit when agencies provide real time information on the location of a vehicle, and the estimated arrival or departure time of a vehicle at each specific stop along a transit route. Imagine you can sit in the comfort of your office, watching and waiting for the next bus to arrive at your bus stop. Only when the bus is about to arrive do you leave, getting to the bus without waiting at the bus stop.

Most transit agencies will use a data vendor like Next Bus to provide the real time data. Note that a data processing vendor like Next Bus is not merely a data aggregator. They actually apply their own algorithms to make "predictions" for real time transit information. For example, consider a bus waiting to start its route at the terminal. What is the arrival time at the first stop? Since the bus has not started moving yet, we need to predict the arrival time using historical data (what's a typical travel time from the terminal to the first stop at this time of day) and published schedule data (when the bus is supposed to begin the trip).
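The terminal case can be sketched in a few lines of Python. This is only my guess at the shape of such a calculation, not Next Bus's actual algorithm; the function name, its parameters, and the sample date and times are all hypothetical.

```python
from datetime import datetime, timedelta


def predict_first_stop_arrival(now, scheduled_departure, typical_travel_time):
    """Estimate the arrival at the first stop for a bus still at the terminal."""
    # A bus that is already past its scheduled departure cannot leave before now.
    expected_departure = max(now, scheduled_departure)
    # Historical data supplies the typical terminal-to-first-stop travel time.
    return expected_departure + typical_travel_time


# Hypothetical numbers: it is 8:00, the bus is scheduled to leave at 8:05, and
# the run to the first stop typically takes 4 minutes at this time of day.
now = datetime(2009, 11, 16, 8, 0)
eta = predict_first_stop_arrival(
    now,
    scheduled_departure=datetime(2009, 11, 16, 8, 5),
    typical_travel_time=timedelta(minutes=4),
)
print(eta.strftime("%H:%M"))  # 08:09
```

Once the bus is actually moving, its reported location would take over from the schedule as the starting point.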


Making real time data available involves several challenges:

  • in most cases, the transit agencies own the data. We could argue that the citizens own the data, since we pay taxes and fares to keep the agencies operational. So the agencies have to be persuaded to provide open data access.
  • Agencies, being government entities, have a formal process for vendor selection. Selecting a vendor like Next Bus to manage and provide access to the transit data requires the typical procurement process, which can take time.
  • vehicle location tracking -- this is not simply a GPS issue. GPS information may not be available to a vehicle if it is underground (in a tunnel). Some subways have track segment information available within the subway management system. If the data vendor can access this information, it can be used to locate a train very accurately.
  • vehicle location reporting -- even if a vehicle knows its own location, it has to report it. Some systems use a radio network, some use a cellular network. These are not perfect and not always available.


Currently Mass DOT has made all their route information available as GTFS files, and on a trial basis has made real time information available for some of the highly used bus routes. Let's hope that they will officially make real time information available for all routes soon. Remember that they have to go through formal governmental processes to make that happen.


Django Production Error Handler

I am working on an application that, besides providing a dynamic website, also talks to an iPhone application. What happens when the iPhone, or a web visitor, triggers a bug in the application? Django actually provides a nice mechanism to report errors, part of its "batteries included" goodness. You can easily set up the Django environment so that it will send you an email when a "server error" occurs. You just need to make sure the following is set up:

Outbound email working

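The Django environment must be able to send outbound emails. The actual requirement depends on your server environment, but you definitely need to have correct values set up for the EMAIL_* settings (EMAIL_HOST, EMAIL_PORT, EMAIL_HOST_USER and EMAIL_HOST_PASSWORD in the sample below).


Admin users

settings.ADMINS -- this is a list of lists (or more accurately a tuple of tuples) of names and email addresses
settings.SERVER_EMAIL -- the "from" address of the error reports


Debug Setup

settings.DEBUG -- must be False; Django only emails error reports when DEBUG is off.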


500.html and 404.html

Once DEBUG is off, Django will want to display your 500 or 404 page. Create these pages and make them available in one of your template directories.


Here are some sample entries from my settings file:

EMAIL_HOST=''
EMAIL_PORT=25
EMAIL_HOST_USER='my_mailbox_name'
EMAIL_HOST_PASSWORD='my_mailbox_password'
SERVER_EMAIL=''
ADMINS=(
    ('PK Shiu', ''),
)
DEBUG=False


Understanding this, $(this), and event in a JQuery callback function

I run into this issue all the time. Inside a callback function, like a function that responds to onClick, do I use this, $(this), or event to get to what I need?

Let's use a real example to demonstrate what to do: a biography page on a website uses a side menu to select one of several bios to display in the main area. Doing this in JQuery means that we want to hook a handler to the click event of each of the <A> elements in the menu. When a menu item is clicked, we will display the corresponding biography enclosed in a <DIV>.


Here's the HTML for the menu area and the biography area. I am using ID's to distinguish each biography.

Line 1 gives the menu a class of MENU so that we can easily find it in JQuery. Lines 2 to 4 provide three menu items for the user. Notice that at lines 3 and 4, there is extra styling for the menu text.

Line 7 and on are three DIV's with ID corresponding to each of the menu items. Each DIV has a class of BIO, again for easy selection by JQuery.

<ul class="menu">
<li><a href="#" id="1">1. Click me to show first bio.</a></li>
<li><a href="#" id="2">2. Click <em>me</em> to show <em>second</em> bio.</a></li>
<li><a href="#" id="3">3. Click <strong>me to show third</strong> bio.</a></li>
</ul>

<div class="bio" id="1">
Biography for person number one.
</div>
<div class="bio" id="2">
Biography for person number two.
</div>
<div class="bio" id="3">
Biography for person number three.
</div>


JQuery: Initialization

The following initialization code first hides all biography divs on load, and then shows just the first one by default. Notice how we use the :first selector to make selecting the first matching DIV easy.

The last line attaches our function click_me to the anchors in the menu.

    /* Hide all bio divs, then show the very first one. */
    $(".bio").hide();
    $(".bio:first").show();

    /* Hook the onClick function, any A inside an
        object with class menu (UL in this case) */
    $(".menu A").click(click_me);


JQuery Handler Function

Here is the handler function that we hooked to the anchors in the menu. Can you guess what the console log output would be?

function click_me(event){

    /* "this" is DOM element attached by the function */
    console.log('this=' + this);

    /* "event.target" is DOM element receiving the event, can be a child */
    console.log('event.target=' + event.target);

    /* Show the bio selected */
    var s = ".bio[id=" + this.id + "]";
    $(".bio").hide();
    $(s).show();

    /* Silly effect to show the use of $(this). */
    $(this).fadeOut().fadeIn();
}


Here is the explanation:

1. this inside a JQuery event handler is the DOM element that has the handler attached. This (sic) this is not necessarily the A link itself. If we had attached the handler to something else, like a containing DIV, we would get the DIV DOM element no matter which element inside the DIV we clicked. In this case we have attached the click_me function to the anchors (A) themselves, so we can be sure that this.id will give us the ID specified in our A tags.

2. The standard argument passed to the function, event, is the eventObject. It is a JQuery structure that contains many useful attributes regarding the (click) event. Specifically, event.target is the DOM element that received the click event. Note that it will be "the DOM element that issued the event. This can be the element that registered for the event or a child of it".

For example, in menu item two, if you click on the word "me", it will be the EM element, not the A element.

3. Finally, when you turn this into $(this), you are creating a JQuery object out of, well, this. Turning a DOM element (the A in this case) into a JQuery object allows you to use all the JQuery functions on it. We do not really need to do that here, but just as a demonstration, we use $(this) to fade the menu item out and in. We need to turn the A into a JQuery object so that we can call the JQuery fadeOut and fadeIn functions.

Fix Python source code to use spaces instead of tabs

What if someone gave you a Python source file that is indented using tabs? If you are using emacs, the following will let you convert it back to using spaces:

[shell]
# first set the buffer tab width to 4 (or whatever you like)
M-x set-variable <return> tab-width <return> 4

# then mark the entire file
C-x h

# do untabify to convert:
M-x untabify <return>
[/shell]

That's it!
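If emacs is not at hand, the same conversion can be sketched in Python itself with str.expandtabs from the standard library. This is a different tool from the emacs recipe, not a replacement for it, and the untabify function name and sample source are made up for the example.

```python
def untabify(source, tab_width=4):
    """Return source with every tab expanded to spaces.

    str.expandtabs resets its column count at each newline, so the whole
    file can be converted in one call.
    """
    return source.expandtabs(tab_width)


tabbed = "def f():\n\tif True:\n\t\treturn 1\n"
converted = untabify(tabbed)
print(converted)
```

Note that expandtabs pads each tab to the next tab stop, so a tab in the middle of a line may become fewer than tab_width spaces. It also touches tabs inside string literals, so check the result before saving over the original.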


Serving favicon in an Django App using Apache

I got a few free minutes to work on my own site here. Since I migrated the site from all static pages to being served by Django, I still haven't put the favicon back onto the site. The site runs under a virtual host in apache2 at WebFaction. This is what you need to put in your httpd.conf file:

alias /favicon.ico /home/your-home/your-app-etc/static/image/favicon.ico

<LocationMatch "\.(jpg|css|gif|pdf|ico)$">
SetHandler None
</LocationMatch>

The alias line tells apache to go look for the favicon.ico file at a static location of your choice.

The LocationMatch directive tells apache not to run those files through the Django engine.

Resetting Django Admin Password

This barely qualifies for a blog post, but what do you do if, during testing, you loaded a full JSON file from someone via loaddata, and don't have their users' passwords?

Just run the Django shell, and by hand reset all the passwords:

from django.contrib.auth.models import User
for u in User.objects.all():
    u.set_password('secret')
    u.save()  # set_password() only changes the object in memory

That's why you have to keep your shell login and settings files safe!!

Django and PyTextile Revisited

I wrote a post earlier about PyTextile not working well in Django. James Bennett was nice enough to add some pointers to the issue. (Comments are working now on my blog.) Now that I am using textile more, I want to investigate and document the issue better. Most importantly, the problem exists only with the 0.96x version of Django. If you are up to date with trunk, you can skip this article.

The issues are as follows:

- the textile code in Django 0.96x assumes that input to textile is in the Django default encoding, which usually is utf-8
- the Django database typically uses utf-8 encoding for storage, but
- Python2.5 + Django 0.96x is mostly unicode

Consider the following sources of data for use by the "textile" filter in a template. Only the last case, using clean_data, will break:

1. Direct objects retrieved from the database: works fine.
2. Browser input via GET: works fine.
3. Browser input via form POST, accessed not using clean_data: works fine.
4. The clean_data pseudo dictionary returns data in unicode, and will break the textile filter.

Code Example:

# view:
obj = ModelClass.objects.get(pk=id)
# template:
{{ obj.textile }}
works

# view:
field = request.GET.get('some_field')
# template:
{{ field | textile }}
works

# view:
field = request.POST.get('some_field')
# template:
{{ field | textile }}
works

# view:
form = TestForm(request.POST)
field = form.clean_data['some_field']
# template:
{{ field | textile }}
gives:

Exception Type: UnicodeDecodeError
Exception Value: 'ascii' codec can't decode byte 0xb4 in position 0: ordinal not in range(128)
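The error message itself is easy to reproduce outside Django. The following sketch is written for Python 3, where the decode has to be requested explicitly; on Python 2 the same ASCII decode happened implicitly whenever a byte string was mixed with a unicode string. The byte value is taken from the exception shown here; the variable names are just for illustration, and this is not the textile code path itself.

```python
# The offending byte from the exception, as a byte string.
raw = b"\xb4"

try:
    # The 'ascii' codec only accepts byte values 0-127, so 0xb4 is rejected.
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding with an encoding that covers the byte works fine.
acute_accent = raw.decode("latin-1")
print(repr(acute_accent))
```

The fix in either Python version is the same idea: decide on one encoding at the boundary and convert explicitly, instead of letting the default ASCII codec guess.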

Implication

A very common pattern for object editing is to let the user input data for a field, save it to the database, and finish with a template displaying the updated data. Unfortunately, since the database object is updated with the form.clean_data input, it will break on the display template after saving to the database.

Solution

There are two possible solutions. The first one, which I proposed earlier, is to force the data encoding to utf-8 when assigning a model field from clean_data, e.g.:

my_object.field = clean_data['field_name'].encode('utf-8')

A better solution is to encapsulate the textile processing in a function. Do not force the template author to even know that we are using textile. Before Django 1.0, this function will do the magic utf-8 conversion. At Django 1.0, the function can simply call pytextile.

[sourcecode lang='python']
from django.db import models

class MyClass(models.Model):
    body = models.CharField()

    def body_formatted(self):
        import textile
        return textile.textile(
            self.body.encode('utf-8'),
            encoding='utf-8',
            output='utf-8')
[/sourcecode]

J2EE to Django, slides for the Presentation at Cambridge Python Meetup

I gave a short presentation on Django to the Cambridge Python Users group earlier. Nate has a great writeup of the event and the other presentations that evening. I just want to share the slides here. The slides are just visual reminders and do not stand on their own. If you want more info, feel free to shoot me an email. I switched from J2EE to Django as my sole web application platform two years ago and have not looked back since. It allows me to develop, and more importantly maintain, web apps faster and better. It is more time and cost effective for my customers and me.

Slides from J2EE to Django Presentation at Cambridge Python Group from PK Shiu on Vimeo.

Django Tip: No leading slash for upload_to for FileField and ImageField

This is a common mistake. When defining a FileField or an ImageField, you need to specify where the files are stored. This is done by specifying a relative path in the upload_to argument. Django will then store your files in a subdirectory as named, under the MEDIA_ROOT directory. But don't put a leading slash in the relative path. Otherwise Django will try to store the file at the system root directory.

MEDIA_ROOT = '/home/myname/files/'
upload_to='pictures'
file: abc.jpg
results in: /home/myname/files/pictures/abc.jpg


MEDIA_ROOT = '/home/myname/files/'
upload_to='/pictures'
file: abc.jpg
results in: /pictures/abc.jpg
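A handy way to remember the rule: Python's own path joining treats an absolute component the same way. Here is a small sketch using posixpath.join from the standard library. The stored_path helper is hypothetical, and I am not claiming this is exactly how Django builds the path internally.

```python
import posixpath

MEDIA_ROOT = "/home/myname/files/"


def stored_path(upload_to, filename):
    """Join the pieces the way a POSIX path join would."""
    # An absolute component makes join() throw away everything before it.
    return posixpath.join(MEDIA_ROOT, upload_to, filename)


relative = stored_path("pictures", "abc.jpg")
absolute = stored_path("/pictures", "abc.jpg")
print(relative)
print(absolute)
```

The first call keeps MEDIA_ROOT in the result; the second discards it, which is how a file ends up under the system root.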

Also a side tip: you can add strftime style arguments to the upload_to argument, storing the files in dated subdirectories, e.g.

MEDIA_ROOT = '/home/myname/files/'
upload_to='pictures/%Y/%b/%d'
file: abc.jpg
results in: /home/myname/files/pictures/2008/May/09/abc.jpg
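To preview what a strftime style upload_to value expands to, you can run the same format string through Python's date formatting. A minimal sketch; the date is picked to match the example, and note that %b (the abbreviated month name) depends on the locale.

```python
from datetime import date

upload_to = "pictures/%Y/%b/%d"

# A made-up upload date matching the example.
uploaded_on = date(2008, 5, 9)

# strftime leaves the literal parts alone and fills in the % codes.
subdirectory = uploaded_on.strftime(upload_to)
print(subdirectory)
```

With the default C locale this gives a year/month-name/day directory under pictures, relative to MEDIA_ROOT.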

Django Tip: Outputting list of items separated by commas, but only if it has more than one item

How many times do you need to do this? You have a list of things to output. The list can be empty, have one element, or more. You want to separate the items with a separator for readability. What do you do?

1. The simple but not reader friendly way:

toppings = [ 'cheese','tomatos','pineapple' ]
or
toppings = ['cheese']

{% for t in toppings %} {{ t }} , {% endfor %}

that will output:
cheese, tomatos, pineapple,
or
cheese,

Note the ugly trailing comma.

2. This is the smart way using the template variables available in loops:

{% for t in toppings %} {% if not forloop.first %}, {% endif %} {{ t }} {% endfor %}

that will output:
cheese, tomatos, pineapple
or
cheese

No more trailing commas, thanks to the built in forloop variables.
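Outside of templates, the same idea in plain Python: either mirror the forloop.first check by hand, or let str.join do the work. Both function names below are made up for this sketch.

```python
def join_with_first_check(items, separator=", "):
    """Emit the separator before every item except the first."""
    pieces = []
    for index, item in enumerate(items):
        if index > 0:  # the equivalent of "not forloop.first"
            pieces.append(separator)
        pieces.append(item)
    return "".join(pieces)


def join_builtin(items, separator=", "):
    """Same result using the built-in str.join."""
    return separator.join(items)


for toppings in (["cheese", "tomatos", "pineapple"], ["cheese"], []):
    print(repr(join_with_first_check(toppings)), repr(join_builtin(toppings)))
```

Django templates also have a built-in join filter, {{ toppings|join:", " }}, for the cases where you do not need anything else inside the loop.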

gettext on Leopard for Django Internationalization

I started working on one of my internationalized applications on the new Mac. I realized I had not installed "gettext", which is required by the make-messages and compile-messages scripts. I want to avoid installing things into OS X if I can. Then I found the easy way out:

1. Install poedit for OS X. I need this to edit po files anyway. You can download it here.
2. There is no step two -- poedit for OS X comes with (obviously) all the gettext tools.

o.k. There is a little step two. You want to include the path to those tools when you run compile-messages. Create a little script like this:

PATH=$PATH:/Applications/ python /Library/python/2.5/site-packages/django/bin/

and run this script instead of running compile-messages directly.