Django and PyTextile Revisited

I wrote a post earlier about PyTextile not working well in Django. James Bennet was nice enough to add some pointers to the issue. (Comments are working now on my blog.) Now that I am using textile more, I want to investigate and document the issue better. Most importantly, the problem exists only with the 0.96x version of Django. If you are up to date with trunk, you can skip this article.

The issues are as follows:

- the textile markup.py code in Django 0.96x assumes that input to textile is in the Django default encoding, which is usually utf-8
- the Django database typically uses utf-8 encoding for storage, but
- Python 2.5 + Django 0.96x is mostly unicode

Considering the following sources of data used by the "textile" filter in a template, only the last case, using clean_data, will break:

1. Direct objects retrieved from the database: works fine.
2. Browser input via GET: works fine.
3. Browser input via form POST, accessed without clean_data: works fine.
4. The clean_data pseudo dictionary returns data in unicode, and will break the textile filter.

Code Example:

# view:
obj = ModelClass.objects.get(pk=id)
# template:
{{ obj.textile }}
works

# view:
field = request.GET.get('some_field')
# template:
{{ field | textile }}
works

# view:
field = request.POST.get('some_field')
# template:
{{ field | textile }}
works

# view:
form = TestForm(request.POST)
field = form.clean_data['some_field']
# template:
{{ field | textile }}
gives:

Exception Type: UnicodeDecodeError
Exception Value: 'ascii' codec can't decode byte 0xb4 in position 0: ordinal not in range(128)
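To make the exception message less mysterious, here is a minimal sketch that produces the same text outside Django. It is an illustration only, not the actual code path inside the textile filter: it just shows what happens when a non-ASCII byte is decoded with the 'ascii' codec, which is the step Python 2 performs implicitly when byte strings and unicode strings are mixed. The variable names are my own.

```python
# Illustration only: reproduces the error message, not the Django code path.
raw = b'\xb4'  # a single non-ASCII byte, like the one named in the traceback

try:
    # Python 2 does this decode implicitly when str and unicode are mixed;
    # here it is spelled out explicitly.
    raw.decode('ascii')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)
```

The same bytes decode without complaint once the right codec is named, which is why both solutions below come down to being explicit about utf-8.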

Implication

A very common pattern for object editing is to let the user input data for a field, save it to the database, and finish with a template displaying the updated data. Unfortunately, since the object is updated with the form.clean_data input, the display template will break after the save.

Solution

There are two possible solutions. The first one, which I proposed earlier, is to force the data encoding to utf-8 when assigning a model field from clean_data, e.g.:

my_object.field = clean_data['field_name'].encode('utf-8')
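If several fields need the same treatment, the conversion could be pulled into a small helper. This is only a sketch under my own assumptions: to_utf8 is a hypothetical name, not something Django or PyTextile provides, and it is written so that it runs on a current Python.

```python
def to_utf8(value, encoding='utf-8'):
    """Return value as an encoded byte string (hypothetical helper)."""
    if isinstance(value, bytes):
        # Already a byte string: leave it alone.
        return value
    # A unicode string, such as the values clean_data returns: encode it.
    return value.encode(encoding)

# Usage, mirroring the assignment above:
# my_object.field = to_utf8(form.clean_data['field_name'])
```

Passing byte strings through unchanged means the helper can be applied to every source of data listed earlier, not only to clean_data.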

A better solution is to encapsulate the textile processing in a function, so that the template author does not even need to know that we are using textile. Before Django 1.0, this function will do the magic utf-8 conversion. At Django 1.0, the function can simply call pytextile.

[sourcecode lang='python']
from django.db import models

class MyClass(models.Model):
    body = models.CharField()

    def body_formatted(self):
        # Encode to utf-8 before handing the text to textile, and tell
        # textile to read and write utf-8 as well.
        import textile
        return textile.textile(
            self.body.encode('utf-8'),
            encoding='utf-8', output='utf-8')
[/sourcecode]