[uf-discuss] "Must Ignore vs. Microformats"

Wed Jul 19 10:34:58 PDT 2006

Hello,

On 7/19/06, Tantek Çelik <tantek at cs.stanford.edu> wrote:
> On 7/19/06 8:37 AM, "Frances Berriman" <fberriman at gmail.com> wrote:
>
> > http://cafe.elharo.com/xml/must-ignore-vs-microformats
> >
> > A friend of mine showed me this today.  Macroformats, over Microformats.
>
> The article is terrible and about 90% incorrect.  Unfortunately this is
> perhaps due in some part to the IBM article which, though decent overall,
> has some errors itself, and takes a walk through transcoding to XML and back
> which is interesting but perhaps unnecessary.
>
> The author of the "macroformats" article misses all the reasons that XML has
> failed on the Web, and all the specific design principles that have gone
> into microformats that were developed by learning from XML's failure.  In
> fact, he continues to push several of these reasons as actual *plusses* for
> XML (namespaces, invalidity, etc.)
>
> There will continue to be plenty of folks banging their heads against the
> wall and trying to push "plain old xml" (POX) on the Web, and they will
> likely continue to see the same amount of success as they have to date.
>
> What we can do to be helpful:
>
>
> 1. Dissect articles like this into a series of assertions/questions and put
> them on the wiki, e.g.:
>
> * "why would anyone write markup like this? It brings exactly nothing to the
> table."

(Sorry to bring up a point for XML, but.... I know others will
probably bring this up outside of here... so I might as well do it
here....)

One "good" thing about XML, IMO, is that for certain simple markups
based on XML, it's easier for a beginner-level or intermediate-level
developer to write a parser for it (as compared to writing a parser
for Microformats... since HTML is more difficult to parse).

(For example, writing a parser in C, C++, PHP, Java, C# or whatever.)

One example of such a simple format based on XML is RSS.

I'd say it is pretty easy for someone to write a parser for it since
RSS is such a simple markup.  (Although, technically, their parser
will probably be wrong and might choke and die if some fancy things
are done with the XML... like using namespaces, adding DTDs, etc.)
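
(A rough sketch of the kind of "quick" RSS parser meant here, in Python
rather than one of the languages listed above.  The sample feed and the
name parse_rss_items are made up for illustration.  It is deliberately
naive: it only looks for plain, un-namespaced <item> elements.)

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed, made up for this example.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <item>
      <title>First post</title>
      <link>http://example.com/1</link>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/2</link>
    </item>
  </channel>
</rss>"""


def parse_rss_items(xml_text):
    """Return a list of {"title", "link"} dicts, one per <item>.

    Naive on purpose: it matches only un-namespaced <item>, <title> and
    <link> elements, so a feed that puts them in an XML namespace
    (as RSS 1.0 does) yields an empty list.
    """
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return items


if __name__ == "__main__":
    for entry in parse_rss_items(SAMPLE_RSS):
        print(entry["title"], entry["link"])
```

(That is enough to "see" something going, which is the point.  The
namespace blind spot is exactly the "technically wrong" part.)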

OPML is probably another example of a simple XML markup.

And yes, I know both formats have a lot of problems.  But their
simplicity (in that respect) helps bring on developer adoption.  (Or
at least, helps bring on adoption by a certain kind of developer.)

Sure the parsers they write might be technically wrong... but these
developers can "see" something going pretty quickly.  (Which might
encourage them to further develop their systems.  And maybe even
eventually support all the "advanced" stuff to make their parsers
technically correct.)

Now, having said that, in other realms, Microformats are much much
easier to parse.  (Like for in-browser technologies.  Like CSS
styling, JavaScript manipulation, and user scripts.... like
Greasemonkey.)

(I even have a PHP parser written that makes parsing Microformats and
other kinds of semantic HTML dead easy... coming to you via LGPL
eventually... once I improve the HTML-repairing part of it.  Gotta
compile tidy and see if that can improve the HTML-repairing.)
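
(For the server-side case, here is a rough Python sketch of pulling
class-based properties out of HTML with the standard library's tolerant
html.parser.  This is not the PHP parser mentioned above, and it is not
a real Microformats parser either.  ClassTextExtractor, extract_by_class
and the sample markup are made-up names for illustration, and it assumes
non-void tags are properly closed, so it does no HTML repairing.)

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextExtractor(HTMLParser):
    """Collect the text content of every element carrying a given class."""

    def __init__(self, class_name):
        super().__init__()
        self.class_name = class_name
        self.results = []
        self._depth = 0   # nesting depth inside the matching element
        self._buf = []    # text pieces collected for the current match

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            # Already inside a match: just track nesting.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.class_name in classes:
            self._depth = 1
            self._buf = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.results.append("".join(self._buf).strip())

    def handle_data(self, data):
        if self._depth:
            self._buf.append(data)


def extract_by_class(html_text, class_name):
    """Return the text of each element whose class list has class_name."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html_text)
    parser.close()
    return parser.results


# Hypothetical hCard-style markup, made up for this example.
SAMPLE_HTML = ('<div class="vcard"><span class="fn"><b>Jane</b> Doe</span>'
               '<br> works at <span class="org">Example Org</span></div>')

if __name__ == "__main__":
    print(extract_by_class(SAMPLE_HTML, "fn"))
    print(extract_by_class(SAMPLE_HTML, "org"))
```

(Even this toy version needs more bookkeeping than the RSS case, and
real-world tag soup with unclosed tags would throw the depth counting
off.  That is the part a ready-made parser saves people from.)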

So, maybe we should address that point too.  Maybe something like...

Q: But writing parsers for Microformats is hard in language X...
A: You don't need to write a parser in language X, here's a list of
some parsers....

See ya

-- 
    Charles Iliya Krempeaux, B.Sc.

    charles @ reptile.ca
    supercanadian @ gmail.com

    developer weblog: http://ChangeLog.ca/
___________________________________________________________________________
 Make Television                                http://maketelevision.com/