[uf-discuss] Microformat for dictionary/thesaurus ... + research

Kevin Marks kmarks at technorati.com
Wed Nov 30 12:38:59 PST 2005


Certainly worth gathering examples. For my boys' wordlists, I wrote a
scraper that went to Merriam-Webster, looked up a list of words, and
created an XOXO approximation, going from their markup:

<table cellpadding="0" cellspacing="0" width="400" border="0">
	<tr>
		<td align="left">
One entry found for <b>impostor</b>.<form name="entry"  method=post  
action="/cgi-bin/dictionary"><table border="0" cellpadding="0"  
cellspacing="0" valign="top"><tr><td>
<input type=hidden name=hdwd value="impostor"><input type=hidden  
name=listword value="imposter"><input type=hidden name=book  
value=Dictionary></td></tr></table>
</form>
Main Entry:	<b>im·pos·tor</b> <a  
href="javascript:popWin('/cgi-bin/audio.pl? 
impost02.wav=impostor')"><img src="/images/audio.gif" border=0  
height=11 width=16></a><br>
Variant(s):	<i>or</i> <b>im·pos·ter</b> <a  
href="javascript:popWin('/cgi-bin/audio.pl? 
impost02.wav=imposter')"><img src="/images/audio.gif" border=0  
height=11 width=16></a> /<tt>im-'p&auml;s-t&amp;r</tt>/<br>
Function:	<i>noun</i><br>
Etymology:	Late Latin <i>impostor, </i>from Latin <i>imponere</i><br>
<b>:</b> one that assumes false identity or title for the purpose of  
deception
		</td>
		<td><img src="/images/pixt.gif" alt="" width="10" height="1"  
border="0"></td>
	</tr>	
</table>

to something cleaner, e.g.:

http://homepage.mac.com/kevinmarks/flock.html

It needs more work: the third-layer senses should be a sub-list too, the
markup could be far more semantic, and converting their home-made
phonetics to IPA would be nice.
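
For anyone who wants to try the same kind of conversion, here is a minimal
Python sketch of the idea. It is not the original scraper. It assumes the
entry has already been fetched and cut down to the "Label: value<br>" lines
plus the "<b>:</b>" sense marker seen in the markup above. The names
SAMPLE, parse_entry and to_xoxo are hypothetical, and the output layout
(headword in an li, properties in a dl, senses in a nested ol) is only one
plausible XOXO-style shape.

```python
import html
import re

# Hypothetical input, trimmed from the markup quoted above.
# \u00b7 is the middle dot used as the syllable separator.
SAMPLE = """Main Entry:\t<b>im\u00b7pos\u00b7tor</b> <a href="#"><img src="/images/audio.gif"></a><br>
Function:\t<i>noun</i><br>
Etymology:\tLate Latin <i>impostor, </i>from Latin <i>imponere</i><br>
<b>:</b> one that assumes false identity or title for the purpose of
deception
"""

TAG = re.compile(r"<[^>]+>")
SENSE_MARK = re.compile(r"<b>\s*:\s*</b>")


def strip_tags(fragment):
    """Drop tags, decode entities, and collapse runs of whitespace."""
    text = html.unescape(TAG.sub("", fragment))
    return " ".join(text.split())


def parse_entry(markup):
    """Split on <br> and sort each chunk into labelled fields or senses."""
    fields, senses = {}, []
    for chunk in re.split(r"<br\s*/?>", markup, flags=re.I):
        parts = SENSE_MARK.split(chunk)
        if len(parts) > 1:
            # Everything after each bold colon is one sense.
            senses += [s for s in map(strip_tags, parts[1:]) if s]
            continue
        label, sep, value = strip_tags(chunk).partition(":")
        if sep and value.strip():
            fields[label.strip()] = value.strip()
    headword = fields.pop("Main Entry", "").replace("\u00b7", "")
    return {"headword": headword, "fields": fields, "senses": senses}


def to_xoxo(entry):
    """Render the parsed entry as a nested XOXO-style list."""
    esc = html.escape
    out = ['<ul class="xoxo">', f"  <li>{esc(entry['headword'])}"]
    if entry["fields"]:
        out.append("    <dl>")
        for label, value in entry["fields"].items():
            out.append(f"      <dt>{esc(label)}</dt><dd>{esc(value)}</dd>")
        out.append("    </dl>")
    if entry["senses"]:
        out.append("    <ol>")
        out += [f"      <li>{esc(s)}</li>" for s in entry["senses"]]
        out.append("    </ol>")
    out += ["  </li>", "</ul>"]
    return "\n".join(out)


if __name__ == "__main__":
    print(to_xoxo(parse_entry(SAMPLE)))
```

Regular expressions are enough for a fragment this regular. For whole
pages a real HTML parser would be the safer choice, and sub-senses would
need a further nested list.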


On Nov 30, 2005, at 12:27 PM, Ryan King wrote:

> On Nov 30, 2005, at 12:20 PM, Chris Messina wrote:
>> Anyway, checking out that page resulted in a format that seemed ripe
>> for some dl|dt|dd microformat love:
>>
>> <entry type="subject" sortkey="sortkeyhere" status="Active"
>>                        title="noad:1.01" entry="0" stage="1">
>>     <meta> ...  </meta>
>>     <hwGrp> ... </hwGrp>
>>     <senseBlock>
>> 	   <meta> ...  </meta>
>> 	   <prelim> ... </prelim>
>> 	   <sense>
>> 	       <meta> ...  </meta>
>> 	       ...
>> 	   </sense>
>>     </senseBlock>
>> </entry>


