title attribute and abbreviated classnames (Was: [uf-discuss] Currency Quickpoll: Preliminary results)
Brian Suda
brian.suda at gmail.com
Thu Oct 19 09:43:01 PDT 2006
On 10/19/06, Ciaran McNulty <mail at ciaranmcnulty.com> wrote:
> For instance, we could introduce the implied optimisation that if
> there is no explicit 'amount' then the amount could be taken to be
> everything inside the 'money' that isn't the 'currency'.
>
> i.e. <span class="money"><abbr class="currency"
> title="USD">$</abbr>5.99</span> would be equivalent to your example
> above.
>
> That would simplify the markup in a large number of the cases, and I
> don't think would complicate the parsing *too* much.
--- i would disagree: it is in fact complicated for parsers, but parsers
are not our target audience. i know in the past i've mentioned
things like "oh, it would be so much easier for X2V if ..." and the
response has always been no, we should favour the publishers.
So, i am willing to explore some optimizations at the risk of adding
some complexity to the parsers.
To be more specific, the issue with not having a class="value" or
class="amount" is that there is no easy XPath expression to extract
the data.
//*[@class="money"] would get you both the $ and the 5.99
//*[@class="money"]//*[@class="currency"] will get you the currency value $
(you actually need to check the element type first: if name()='abbr' then use @title ...)
//*[@class="money"]//*[@class="amount"] would easily get you 5.99
If we then say that class="amount" is optional, the XPath would need
to be something like
//*[@class="money"]//*[@class="currency"]/following-sibling::text()[1] to get the
#text node that follows the $, but the amount could actually come before the
currency, as in 35.99 kr, so there is more checking involved. (and that is
assuming my XPath is correct?)
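
To make the extra checking concrete, here is a rough sketch of the implied-amount rule in Python, using only the standard library. It is not how X2V or any other existing parser works; the function names (parse_money, text_excluding, and so on) are made up for illustration, and it assumes the markup is well-formed XML. Class names are matched as whitespace-separated tokens rather than with an exact @class comparison, and ElementTree's text/tail attributes are used instead of an XPath sibling axis, which ElementTree does not support.

```python
import xml.etree.ElementTree as ET


def has_class(el, name):
    """True if `name` is one of the whitespace-separated class tokens."""
    return name in el.get("class", "").split()


def find_by_class(root, name):
    """First element (root included) carrying the given class, or None."""
    for el in root.iter():
        if has_class(el, name):
            return el
    return None


def text_excluding(el, skip):
    """All text inside `el`, leaving out the subtree rooted at `skip`.

    The text that follows `skip` (its tail) is kept, so this works
    whether the amount comes before or after the currency.
    """
    parts = [el.text or ""]
    for child in el:
        if child is not skip:
            parts.append(text_excluding(child, skip))
        parts.append(child.tail or "")
    return "".join(parts)


def parse_money(markup):
    """Return (currency, amount) for the first class="money" element."""
    money = find_by_class(ET.fromstring(markup), "money")
    if money is None:
        return None

    currency_el = find_by_class(money, "currency")
    currency = None
    if currency_el is not None:
        # abbr pattern: prefer @title on <abbr>, else the visible text
        if currency_el.tag == "abbr" and currency_el.get("title"):
            currency = currency_el.get("title")
        else:
            currency = "".join(currency_el.itertext()).strip()

    amount_el = find_by_class(money, "amount")
    if amount_el is not None:
        # explicit class="amount" wins
        amount = "".join(amount_el.itertext()).strip()
    else:
        # implied amount: everything in 'money' that isn't the 'currency'
        amount = text_excluding(money, currency_el).strip()

    return currency, amount
```

With this, parse_money('<span class="money"><abbr class="currency" title="USD">$</abbr>5.99</span>') gives ('USD', '5.99'), and the trailing-currency case '<span class="money">35.99 <abbr class="currency" title="SEK">kr</abbr></span>' gives ('SEK', '35.99'). The fallback branch is the added parser complexity under discussion; the explicit class="amount" branch is the simple case.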
Depending on how you extract microformats, it might be easier or impossible.
Your regex mileage may vary...
But i am willing to discuss optimizations and any issues.
-brian
--
brian suda
http://suda.co.uk