[microformats-discuss] FYI: two posting about the Semantic Web, the "SynWeb", scraping and microformats

Tue Oct 25 08:05:04 PDT 2005

It might seem strange given that the RDF t-shirt and I merged
long ago, but I feel obliged to nitpick.

On 10/25/05, Dimitri Glazkov <dimitri.glazkov at gmail.com> wrote:

> RDF and HTML are very closely related, because one is subset of
> another. Yes, from the infoset perspective, HTML is a form of RDF.
> It's just the former is limited to only two (implied) predicates:
>
> * holds/contains
> * links to
>
> All other relationships have to be defined in "rel", "rev", "class",
> "type", etc. attributes and are not part of the baseline semantics.
> So, let's not try to juxtaposition the two -- there is a reason why
> they seem so similar at times.

Here I'd step back further and say: sure, the SGML/XML structure of
(X)HTML can be seen as having an implied containership predicate.
More than that, the components of the markup carry additional
semantics; check Tantek's slides for those. The hyperlink
relationships do match RDF's statements nicely (there was Intelligent
Design at work).
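
To make that reading concrete, here's a rough Python sketch that walks
some markup and emits one triple per containment and one per
hyperlink. The predicate names "contains" and "linksTo", the minted
node names (div1, p1, ...) and the extract() helper are all made up
for illustration; it assumes well-formed markup and uses only the
standard library's html.parser.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not stay on the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TripleExtractor(HTMLParser):
    """Collects (subject, predicate, object) tuples from well-formed markup."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.stack = []    # currently open elements, outermost first
        self.counts = {}   # per-tag counter used to mint node names
        self.triples = []

    def handle_starttag(self, tag, attrs):
        # Mint a hypothetical node name such as "p1" or "a2".
        self.counts[tag] = self.counts.get(tag, 0) + 1
        node = f"{tag}{self.counts[tag]}"
        parent = self.stack[-1] if self.stack else self.page_uri
        self.triples.append((parent, "contains", node))
        href = dict(attrs).get("href")
        if tag == "a" and href:
            self.triples.append((self.page_uri, "linksTo", href))
        if tag not in VOID:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Assumes tags are properly nested; void elements were never pushed.
        if tag not in VOID and self.stack:
            self.stack.pop()


def extract(page_uri, html):
    parser = TripleExtractor(page_uri)
    parser.feed(html)
    parser.close()
    return parser.triples
```

Feeding it '<div><p>Hi <a href="http://example.org/">x</a></p></div>'
gives three "contains" triples and one "linksTo" triple, which is
about all the baseline markup says on its own.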

But I believe it's misleading to frame HTML as a subset of RDF, or at
least there's little to be gained from the picture. Yep, the stuff can
be viewed as a bunch of predicates, but you could equally say HTML is
a subset of Prolog, or a subset of Codd's relational model, or even a
subset of first-order logic. You *can* view HTML data in this way, but
without a specific application in mind I don't think it buys you much,
apart from maybe irritating a few docheads (heh).

In the rainy day section of my to-do list is the task of writing a
text editor based on the RDF model. I think it's feasible, and likely
to be fun, but in many respects it goes against the grain. Keeping
things rigorously structured, in (document) order, is difficult in
RDF because it is such a generalised, open model, whereas HTML just
works.
This is probably a minority opinion, but I really don't think RDF's
strengths are around the handling of literal data at a granular level.
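
A small sketch of the ordering problem, in plain Python with tuples
standing in for triples (no RDF library involved). A graph is an
unordered set of statements, so document order has to be spelled out
explicitly. One standard way is the rdf:first/rdf:rest linked list
from the RDF vocabulary; the helper names and the "_:cellN" blank node
labels below are just my own for illustration.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"


def to_rdf_list(items):
    """Encode a Python list as rdf:first/rdf:rest triples; return (head, triples)."""
    triples = set()
    head = NIL
    # Build back to front so each cell can point at the one after it.
    for index in range(len(items) - 1, -1, -1):
        cell = f"_:cell{index}"
        triples.add((cell, FIRST, items[index]))
        triples.add((cell, REST, head))
        head = cell
    return head, triples


def from_rdf_list(head, triples):
    """Recover the order by walking the rdf:rest chain from head."""
    first = {s: o for s, p, o in triples if p == FIRST}
    rest = {s: o for s, p, o in triples if p == REST}
    items = []
    while head != NIL:
        items.append(first[head])
        head = rest[head]
    return items
```

Three lines of text cost six triples plus a traversal to read them
back in order, and inserting a line in the middle means rewiring a
rest pointer. In HTML the order simply falls out of the byte stream.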

> Also, it is no surprise that holding/containment relationship is so
> overwhelmingly dominant in the current Web -- hierarchy is the most
> natural and easiest thing to grasp conceptually for us, the sapient
> ones.

Hmm, it's possible to argue that case for a lot of different data
structures; check any of the material on lists, e.g.
http://www.dehora.net/journal/2005/01/lml_list_markup_language.html

> So, the rest of argument really comes down to addressing these two questions:
>
> a) Is there a need to express more relationships than the two
> mentioned above? Looking at XFN, No-follow, and pretty much the whole
> microformats movement, the answer is undoubtedly yes.

On that I agree, and it's an intuition TimBL and others had long ago
that is now being borne out by the machine-logical expression
appearing in microformat HTML.
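
As a rough illustration of how directly that expression maps across,
here is a Python sketch that treats each token in a "rel" attribute as
a predicate between the page and the link target. The XFN values
"friend" and "met" and the "nofollow" value are real; the
rel_triples() helper is hypothetical, and the predicates are left as
bare strings rather than mapped to proper URIs, which a real
conversion would need to do.

```python
from html.parser import HTMLParser


class RelExtractor(HTMLParser):
    """Turns rel-annotated links into (page, rel-token, target) tuples."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href, rel = attrs.get("href"), attrs.get("rel")
        if tag in ("a", "link") and href and rel:
            # rel holds a space-separated list of tokens; one triple per token.
            for value in rel.split():
                self.triples.append((self.page_uri, value.lower(), href))


def rel_triples(page_uri, html):
    parser = RelExtractor(page_uri)
    parser.feed(html)
    parser.close()
    return parser.triples
```

A link marked rel="friend met" yields two statements about the same
target, and a link with no rel yields none here, since it carries
nothing beyond the baseline "links to".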

> b) Is it a near-future possibility that the definition and expression
> of relationships in popular serialization formats (ahem) will be
> strong enough to treat Web as the universal database, rather than data
> mine? I hope nobody argues that the stronger, the better.

I personally believe so, and I think experience suggests that it isn't
necessary to boil the ocean to get some utility from such a
perspective. A layered architecture is possible, and incremental steps
are possible. Mr. Pilgrim's phrase for such an increment, if I
remember correctly, was "1000 times more useful".

Cheers,
Danny.
--

http://dannyayers.com