[uf-discuss] HTML Analytical Lexicon

Scott Reynen scott at randomchaos.com
Sat Apr 1 08:58:24 PST 2006


After reading the recent discussion of a "meta-microformat," I  
decided to play around with the idea.  I've probably gone through  
9000 revisions, but I think I have a usable tool now, which I'm  
calling "HTML Analytical Lexicon."  What it does is parse natural  
language queries and derive a semantic representation (internally  
stored as RDF) using Princeton's WordNet:

http://wordnet.princeton.edu/

Then it searches an internal cache of microformatted HTML content
collected from around the web and converted to RDF via XSLT.  Meaning
is assigned via XMDP where a profile is available, and via WordNet
otherwise.  Finally, it translates the RDF result back into natural
language.

So far it seems to be working well.  You can submit a query like  
"Where is Tantek Çelik right now?" and it will find Tantek's vcard  
info, match that against all the hcalendar data on the web, look for
a vevent matching the current time, pull out the location, and return  
an answer like "Tantek is at home."
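
To make the "look for a vevent matching the current time" step a bit
more concrete, here is a rough Python sketch.  It is not the tool's
actual code: the Event tuple, the find_current_location function, and
the sample events are all made up for illustration, and it assumes the
hcalendar data has already been flattened into simple records.

```python
from datetime import datetime
from typing import Iterable, NamedTuple, Optional


class Event(NamedTuple):
    # Hypothetical flattened vevent: whose it is, when it runs, and where.
    attendee: str
    start: datetime
    end: datetime
    location: str


def find_current_location(
    events: Iterable[Event], person: str, now: datetime
) -> Optional[str]:
    """Return the location of the first event for `person` that contains `now`.

    Events are treated as half-open intervals: start <= now < end.
    Returns None when no event matches.
    """
    for event in events:
        if event.attendee == person and event.start <= now < event.end:
            return event.location
    return None


# Invented sample data, standing in for vevents collected from the web.
events = [
    Event("Tantek Çelik", datetime(2006, 4, 1, 8, 0),
          datetime(2006, 4, 1, 10, 0), "home"),
    Event("Tantek Çelik", datetime(2006, 4, 1, 12, 0),
          datetime(2006, 4, 1, 13, 0), "lunch meeting"),
]

if __name__ == "__main__":
    print(find_current_location(events, "Tantek Çelik",
                                datetime(2006, 4, 1, 9, 0)))
```

A real version would also have to deal with time zones and overlapping
events, which this sketch ignores.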

Because it uses WordNet to determine the meaning of both content and
class names, it doesn't require formal definitions of microformats,
so it will work just as well for microformats that haven't been
created yet, or even for the informal conventions individuals use in
their own markup.
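
As an illustration of what mapping class names to meanings could look
like, here is a small Python sketch.  It does not use WordNet at all:
the SYNONYMS table is a tiny hand-written stand-in for a real lexical
lookup, and concept_for is a hypothetical helper, so treat this as a toy
under those assumptions rather than how the tool works.

```python
from typing import Optional

# Toy stand-in for a WordNet lookup (an assumption for illustration):
# maps a word to a canonical concept.
SYNONYMS = {
    "location": "place",
    "venue": "place",
    "place": "place",
    "summary": "title",
    "title": "title",
    "name": "title",
}


def concept_for(class_name: str) -> Optional[str]:
    """Map a class name to a concept, or None if no word in it is known.

    Class names are split on hyphens and underscores, so an ad hoc name
    like "event-venue" can still be matched through its "venue" part.
    """
    words = class_name.lower().replace("_", "-").split("-")
    for word in words:
        if word in SYNONYMS:
            return SYNONYMS[word]
    return None


if __name__ == "__main__":
    for name in ("location", "event-venue", "gig_place", "xyzzy"):
        print(name, "->", concept_for(name))
```

The point of the sketch is only that an unfamiliar class name can be
matched to a known concept without a formal definition of the format it
came from.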

I'm still making changes and testing, so things may break, but if
you'd like to try it out, here are some example queries:

http://randomchaos.com/microformats/hal/?q=Where+it+Tantek+Çelik+right+now?
http://randomchaos.com/microformats/hal/?q=What+is+the+date+today?
http://randomchaos.com/microformats/hal/?q=Open+the+pod+bay+doors

Peace,
Scott

