[uf-discuss] "Must Ignore vs. Microformats"
kmarks at technorati.com
Wed Jul 19 19:05:49 PDT 2006
On Jul 19, 2006, at 10:55 AM, Tantek Çelik wrote:
> On 7/19/06 10:34 AM, "Charles Iliya Krempeaux"
> <supercanadian at gmail.com> wrote:
>> One "good" thing about XML, IMO, is that for certain simple markups
>> based on XML, it's easier for a beginner-level or intermediate-level
>> developer to write a parser for it (as compared to writing a parser
>> for Microformats... since HTML is more difficult to parse).
>> (For example, writing a parser in C, C++, PHP, Java, C# or whatever.)
> This is why the supposed "easier to parse" aspect of XML is incredibly
> misleading. It ignores both the need to be easier to publish, and the
> fact that XML is actually *harder* to publish.
Also, the Babel aspect of XML means that you always do need to write a
parser, if not for the XML itself then to transform the
plucked-from-the-air schema, with its arbitrary choices of what is an
attribute and what is an element, into the data structure you are using.
A key part of Microformats is converging the schemas so this becomes
much less necessary.
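
To make that concrete, here is a rough sketch in Python. Both XML
vocabularies below are invented for illustration (the element and
attribute names are assumptions, not real schemas): one puts the data in
attributes, the other in child elements. The XML library parses both
without complaint, but each still needs its own hand-written mapping
before you get the same data structure out.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vocabularies describing the same person.
# Neither is a real schema; the names are made up for this example.
DOC_A = '<person name="Ada Lovelace" email="ada@example.org"/>'
DOC_B = (
    "<contact>"
    "<fullName>Ada Lovelace</fullName>"
    "<mail>ada@example.org</mail>"
    "</contact>"
)

def from_a(xml_text):
    """Map vocabulary A (data in attributes) to a plain dict."""
    root = ET.fromstring(xml_text)
    return {"name": root.get("name"), "email": root.get("email")}

def from_b(xml_text):
    """Map vocabulary B (data in child elements) to the same dict."""
    root = ET.fromstring(xml_text)
    return {"name": root.findtext("fullName"), "email": root.findtext("mail")}

# Same data, two schemas, two separate pieces of mapping code.
assert from_a(DOC_A) == from_b(DOC_B)
```

Every new schema adds another from_x() function like these; a shared
schema means you write the mapping once.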
>> One example of such a simple format based on XML is RSS.
> You're kidding right?
> It is certainly *not* pretty easy for someone to write a parser for
> RSS that actually works with real RSS on the Web.
Have a look at the Universal Feed Parser's 3000 test cases...