Sam Wilmott

Markup Languages -- Programming Languages          Experience, Insight and Innovation


Sam's Python Sandbox

I've been looking at and playing with Python over the last few months, and having fun doing so. It serves well as a scripting and prototyping language, and it'll definitely be a key part of my toolbox for a while.

If you're interested in Python and haven't had any experience with it, I'd recommend going to the Python website at www.python.org. There's lots of material there, including a tutorial. The material is quite good, but playing with the language helps a lot, clearing up some questions you'll probably ask while reading the material.

If you're going to play with Python, I strongly recommend using Stackless Python, and looking into Christian Tismer's "tasklets", which are described on the Wiki pages. Tasklets used with "channels" give a clean and efficient implementation of coroutines. For many purposes they improve on threads in both performance and functionality.
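To give a feel for the coroutine style, here is a minimal sketch that uses ordinary Python generators as a stand-in. It is not the Stackless API: there are no tasklets or channels here, and the names `producer`, `consumer` and `run_demo` are made up for illustration. What it does show is the same hand-off of control, with the producer and consumer taking turns without any threads.

```python
# Generator-based stand-in for the tasklet/channel hand-off.
# Standard Python only; this is NOT the Stackless Python API.

def producer(words, log):
    """Hand out one word at a time, like a tasklet sending on a channel."""
    for word in words:
        log.append("send " + word)
        yield word  # control passes to the consumer here


def consumer(source, log):
    """Pull words as they arrive, like a tasklet receiving from a channel."""
    for word in source:
        log.append("recv " + word)


def run_demo(words):
    """Run the pair and return the order in which the steps happened."""
    log = []
    consumer(producer(words, log), log)
    return log


print(run_demo(["spam", "eggs"]))
# ['send spam', 'recv spam', 'send eggs', 'recv eggs']
```

The interleaved log is the point: each side runs only until it has something to hand over or needs something more.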

Here's what I've been doing with Python.

1. Pattern Matching

I've been a big fan of text processing since I first encountered it in the late '60s. Pattern matching is a very important tool if you're processing text, be it structured or unstructured. I've been experimenting with an alternative model of pattern matching in Python, and you can find what I've done here:

Pattern Matching In Python.

This paper and its accompanying implementation are also a good example of using Python's overloaded operator definition facilities.
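For readers who haven't seen operator overloading used this way, here is a small sketch of the general idea. It is not the implementation from the paper: the `Pattern` class, the `lit` helper, and the choice of `&` for "followed by" and `|` for "or" are all assumptions made for this example.

```python
class Pattern:
    """Wraps a function match(text, pos) that returns the end position or None."""

    def __init__(self, match):
        self.match = match

    def __and__(self, other):
        # Sequence: other must match starting where self stopped.
        def match(text, pos):
            mid = self.match(text, pos)
            return None if mid is None else other.match(text, mid)
        return Pattern(match)

    def __or__(self, other):
        # Alternation: try self first, fall back to other.
        # (No backtracking into self if a later pattern fails.)
        def match(text, pos):
            end = self.match(text, pos)
            return end if end is not None else other.match(text, pos)
        return Pattern(match)


def lit(s):
    """Hypothetical helper: a pattern matching the literal string s."""
    def match(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return Pattern(match)


# "&" binds tighter than "|" in Python, hence the parentheses.
greeting = (lit("hello") | lit("hi")) & lit(" ") & lit("world")

print(greeting.match("hi world", 0))   # 8
print(greeting.match("hey world", 0))  # None
```

Defining `__and__` and `__or__` lets patterns be combined with ordinary expression syntax, so a pattern reads much like the grammar it describes.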

This should be considered an alpha implementation. I've used it successfully for a number of purposes, but it needs a more thorough review and testing before it's ready for prime time.

Latest update, 15 August 2004: there have been a few improvements and additions, most notably "lastMatch", and I've added a new example program.

2. Serial XML Processing

There are two major models of XML parsing out there:

  • DOM: Read and parse the whole document at once and make the result available to the client as a data structure.
  • SAX: Read and parse the document a bit at a time, and give the components to the client as they become available.
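The difference between the two models can be sketched with the parsers in Python's standard library, `xml.dom.minidom` and `xml.sax`. The sample document and the `ItemCollector` class are invented for this example.

```python
import xml.dom.minidom
import xml.sax

DOC = "<order><item>tea</item><item>milk</item></order>"

# DOM style: parse the whole document, then navigate the resulting tree.
dom = xml.dom.minidom.parseString(DOC)
dom_items = [node.firstChild.data for node in dom.getElementsByTagName("item")]


# SAX style: the parser calls the handler as each piece is recognized.
class ItemCollector(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.items = []
        self._in_item = False
        self._text = []

    def startElement(self, name, attrs):
        if name == "item":
            self._in_item = True
            self._text = []

    def characters(self, content):
        # Text may arrive in more than one piece, so accumulate it.
        if self._in_item:
            self._text.append(content)

    def endElement(self, name):
        if name == "item":
            self.items.append("".join(self._text))
            self._in_item = False


handler = ItemCollector()
xml.sax.parseString(DOC.encode("utf-8"), handler)
sax_items = handler.items

print(dom_items, sax_items)
```

Both approaches find the same items, but the SAX client has to keep its own state between callbacks, which is a large part of why serial parsing tends to feel less natural.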

The DOM model seems to be more popular, and it's the appropriate model to use a lot of the time. However, the serial model is a good choice at other times. This implementation presents an alternative model of serial XML parsing that makes serial parsing easier and more natural, and might help popularize this approach.

An XML Parser For Python.

This paper and its accompanying implementation make substantial use of generators -- the key component of the implementation is a generator of parsed XML tokens (objects) -- and might help those interested in using generators in Python.
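As a rough illustration of the "generator of tokens" idea, here is a sketch built on the standard library's `xml.dom.pulldom`. It is not the parser described in the paper: the `tokens` and `item_texts` functions and the tuple-shaped tokens are assumptions made for this example.

```python
from xml.dom import pulldom


def tokens(xml_text):
    """Yield simple (kind, value) tokens as the document is read."""
    for event, node in pulldom.parseString(xml_text):
        if event == pulldom.START_ELEMENT:
            yield ("start", node.tagName)
        elif event == pulldom.CHARACTERS:
            yield ("text", node.data)
        elif event == pulldom.END_ELEMENT:
            yield ("end", node.tagName)


def item_texts(xml_text):
    """Client code pulls tokens in an ordinary loop; no callbacks needed."""
    texts, current = [], None
    for kind, value in tokens(xml_text):
        if kind == "start" and value == "item":
            current = []
        elif kind == "text" and current is not None:
            current.append(value)
        elif kind == "end" and value == "item":
            texts.append("".join(current))
            current = None
    return texts


print(item_texts("<order><item>tea</item><item>milk</item></order>"))
```

Because the client asks for the next token when it is ready for one, its state lives in ordinary local variables and control flow rather than in handler attributes.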

This should also be considered an alpha implementation. I've used it successfully for a paper I presented at the Extreme Markup Conference in Montreal in August, but that's it. I've included that document as an example.

Latest update, 15 August 2004: quite a few things have changed, most notably the ".children" method (thanks, Paul), overlapped markup support, and a switch to Stackless Python's tasklets.


There's some related, though not exactly Python, material on my Words for Nerds page.

15 August 2004