>I'll also point out that modern browsers do a decent job with imperfect 
>DOM trees (i.e., pages with unclosed elements, illegal characters, 
>etc.).  There might be a strategy that involves using the browser's
>native DOM parsing capabilities, but that would likely lead to
>browser-specific solutions.

This is another story and off topic for this group, but I could really
do with a PHP HTMLTidy library to clean up users' HTML input. It's either
that or switch to a Wiki/phpBB-style intermediate markup language.

