[uf-dev] parsing question

Ryan King ryan at technorati.com
Thu Jan 11 10:21:08 PST 2007


On Jan 11, 2007, at 1:28 AM, Kevin Marks wrote:
> On Jan 11, 2007, at 1:13 AM, Brian Suda wrote:
>> On 1/10/07, Ryan King <ryan at technorati.com> wrote:
>>> How should I handle the following in hCard:
>>
>> --- i think the value would be '//'
>>
>> The class="email" on the DIV is not an 'a' or 'abbr' (it is a div) so
>> parsing rules say use the node value.
>>
>> In this case the node value is a script node, which contains only  
>> ' //'
>>
>> As a parser, you should NOT look inside comments <!-- --> or <![CDATA
>>
>> That's my take. Any other thoughts?
>>
> I think that also matches the author's intent - the point of that  
> kind of gibberish is to avoid machine-harvesting of email  
> addresses, so pulling it out in a microformat parser would undo that.

Thanks for all the input, guys. I posted this partially because it  
frustrated and disgusted me and I wanted you to share in my pain. :D

I'm actually thinking now that we should just ignore content inside  
<script> elements. I've yet to see a case where there's valuable  
microformat content inside a script tag.
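
To make the idea concrete, here is a rough sketch of what "ignore content
inside <script>" could look like when computing a node value. It uses
Python's standard html.parser module. The class name, the helper function,
and the sample markup are all made up for illustration; this is not taken
from any existing hCard parser, and a real one would work on a DOM rather
than a raw string.

```python
from html.parser import HTMLParser


class TextOutsideScripts(HTMLParser):
    """Collect character data, skipping anything inside <script> elements.

    Hypothetical helper for illustration only. HTML comments are also
    skipped, because handle_comment is left as the default no-op.
    """

    def __init__(self):
        super().__init__()
        self._script_depth = 0  # > 0 while we are inside a <script>
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._script_depth += 1

    def handle_endtag(self, tag):
        if tag == "script" and self._script_depth > 0:
            self._script_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a script element.
        if self._script_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the pieces and collapse runs of whitespace.
        return " ".join("".join(self._chunks).split())


def node_text_ignoring_scripts(html):
    """Return the text of an HTML fragment with <script> content dropped."""
    parser = TextOutsideScripts()
    parser.feed(html)
    parser.close()
    return parser.text()


if __name__ == "__main__":
    # Invented sample: an obfuscated email written out by a script.
    sample = '<div class="email"><script>// document.write("a@b")</script></div>'
    print(repr(node_text_ignoring_scripts(sample)))
```

With a rule like this, the email property in the invented sample comes out
empty instead of '//', so a parser could simply treat it as missing.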

-ryan
--
Ryan King
ryan at technorati.com




