From dimitri.glazkov at gmail.com  Thu Oct  4 08:53:53 2007
From: dimitri.glazkov at gmail.com (Dimitri Glazkov)
Date: Thu Oct  4 08:53:57 2007
Subject: [uf-dev] JSON serialization of microformats
Message-ID: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>

Just had a brief conversation on IRC w/Mike, and I agree with him
(Mike, feel free to object if I am putting words in your mouth). We
need to set some guidelines (standard?) on JSON serialization of
microformats. This has been an issue of minimal importance when JSON
intersection was limited to test cases, but with arrival of Optimus,
the mass has shifted.

For instance, currently Optimus (Dmitry, this is no knock against you
or the product), will intelligently substitute values for arrays if
there is more than one value detected. For instance, hcard: {} will
become hcard: [ {}, ... ] if there is more than one hcard. While this
is indeed clever, it requires branching for the reader of the
resulting JSON, albeit a very easy one.
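
A minimal sketch of the branching a consumer needs today, in Python for
illustration. The helper name as_list and the sample cards are assumptions
made up for this example, not Optimus output.

```python
def as_list(value):
    """Return value unchanged if it is already a list, else wrap it in one."""
    return value if isinstance(value, list) else [value]


# Hypothetical payloads: one hcard serialized bare, several as an array.
one = {"hcard": {"fn": "Alice Example"}}
many = {"hcard": [{"fn": "Alice Example"}, {"fn": "Bob Example"}]}

# The same loop handles both shapes once the value is normalized.
names_one = [card["fn"] for card in as_list(one["hcard"])]
names_many = [card["fn"] for card in as_list(many["hcard"])]
print(names_one, names_many)
```

If producers always emitted an array for plural values, the as_list call
could be dropped entirely.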

It seems that if the value can be plural, it should be always wrapped
into an array. But ultimately, both Mike and I feel that this should
be agreed upon and documented on the wiki to let developers of JSON
producers and consumers be on the same page. I'd be happy to get the
process started, documenting existing behavior based on test cases and
Optimus output, provided we have consensus on utility of the endeavor.

What are your thoughts on this?

:DG<
From wilson.jim.r at gmail.com  Thu Oct  4 10:05:22 2007
From: wilson.jim.r at gmail.com (Jim Wilson)
Date: Thu Oct  4 10:05:25 2007
Subject: [uf-dev] JSON serialization of microformats
In-Reply-To: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
References: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
Message-ID: <ac08e8d0710041005w3b5c0c1cq4a959ea5a8d08c89@mail.gmail.com>

Hi Dimitri,

> While this is indeed clever, it requires branching for the reader of the
> resulting JSON, albeit a very easy one. It seems that if the value can
> be plural, it should be always wrapped into an array.

Personally (and this is just my opinion) I don't like that.  In order
to enforce such a rule, you have to know in advance whether any
particular thing can or cannot be plural - which would seem to be
context specific.

Also, what does that say about when something can be plural but there
are none of them? Should it be an empty array?  Should it be null?
Should the value even be represented as a key in the parent hash?

I'd like to see the JSON representations stay flexible and expect the
consumer to deal with reasonable variability.

> But ultimately, both Mike and I feel that this should
> be agreed upon and documented on the wiki to let developers of JSON
> producers and consumers be on the same page.

Agreed - sounds like a good idea!  Perhaps a discussion page is in order?

-- Jim R. Wilson (jimbojw)

From microformats at kaply.com  Thu Oct  4 10:23:37 2007
From: microformats at kaply.com (Mike Kaply)
Date: Thu Oct  4 10:23:40 2007
Subject: [uf-dev] JSON serialization of microformats
In-Reply-To: <ac08e8d0710041005w3b5c0c1cq4a959ea5a8d08c89@mail.gmail.com>
References: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
	<ac08e8d0710041005w3b5c0c1cq4a959ea5a8d08c89@mail.gmail.com>
Message-ID: <e06e0e0b0710041023u33794b53ufdb714d19f606742@mail.gmail.com>

On 10/4/07, Jim Wilson <wilson.jim.r@gmail.com> wrote:
>
> Hi Dimitri,
>
> > While this is indeed clever, it requires branching for the reader of the
> > resulting JSON, albeit a very easy one. It seems that if the value can
> > be plural, it should be always wrapped into an array.
>
> Personally (and this is just my opinion) I don't like that.  In order
> to enforce such a rule, you have to know in advance whether any
> particular thing can or cannot be plural - which would seem to be
> context specific.


You do know in advance. The microformat spec indicates which things are
plural.

> Also, what does that say about when something can be plural but there
> are none of them? Should it be an empty array?  Should it be null?
> Should the value even be represented as a key in the parent hash?


In JSON it simply wouldn't be in the structure at all...
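
A rough sketch of that rule from the producer side, in Python. The PLURAL
set is an illustrative assumption (a real producer would take it from the
relevant microformat spec), and serialize() is a hypothetical helper.

```python
import json

# Assumption: a small illustrative subset of hCard properties that may repeat.
PLURAL = {"tel", "email", "url", "adr"}


def serialize(parsed):
    """Turn parser output (property name -> list of found values) into a dict.

    Plural properties always become lists; singular ones become a bare value;
    properties with no values are left out of the structure entirely.
    """
    out = {}
    for name, values in parsed.items():
        if not values:
            continue  # nothing found: omit the key rather than emit null or []
        out[name] = list(values) if name in PLURAL else values[0]
    return out


card = serialize({"fn": ["Alice Example"], "tel": ["+1-555-0100"], "email": []})
print(json.dumps(card, sort_keys=True))
```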

Mike Kaply
From dmitry.baranovskiy at gmail.com  Thu Oct  4 18:33:00 2007
From: dmitry.baranovskiy at gmail.com (Dmitry Baranovskiy)
Date: Thu Oct  4 18:33:04 2007
Subject: [uf-dev] JSON serialization of microformats
In-Reply-To: <e06e0e0b0710041023u33794b53ufdb714d19f606742@mail.gmail.com>
References: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
	<ac08e8d0710041005w3b5c0c1cq4a959ea5a8d08c89@mail.gmail.com>
	<e06e0e0b0710041023u33794b53ufdb714d19f606742@mail.gmail.com>
Message-ID: <8a52ddad0710041833p51db7566k67f72716c04124bd@mail.gmail.com>

As far as I know, there is already a project related to standardising
JSON for µf: http://microjson.org/
There is a proposal for hCard: http://microjson.org/wiki/JCard

What I don't like there is the use of different names: "name" instead of
"n", for instance. We already have µf specs, so I think it would be
easier to simply follow them.

Another thing: I think JSON is good as long as it has all the data you
want and it is structured properly. I was thinking it would be better
to have a single value as a plain value, not an array of one element.
I still think so, but it is not a point I am going to fight for.

Probably we could start a page on the wiki for this and draft some
proposals for the JSON format for, let's say, hCalendar.

From mdagn at spraci.com  Thu Oct  4 19:05:09 2007
From: mdagn at spraci.com (Michael MD)
Date: Thu Oct  4 19:05:00 2007
Subject: [uf-dev] JSON serialization of microformats
References: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
Message-ID: <001c01c806f4$282b6aa0$116bacca@COMCEN>

>
> For instance, currently Optimus (Dmitry, this is no knock against you
> or the product), will intelligently substitute values for arrays if
> there is more than one value detected. For instance, hcard: {} will
> become hcard: [ {}, ... ] if there is more than one hcard. While this
> is indeed clever, it requires branching for the reader of the
> resulting JSON, albeit a very easy one.
>
> It seems that if the value can be plural, it should be always wrapped
> into an array. But ultimately, both Mike and I feel that this should
> be agreed upon and documented on the wiki to let developers of JSON
> producers and consumers be on the same page. I'd be happy to get the
> process started, documenting existing behavior based on test cases and
> Optimus output, provided we have consensus on utility of the endeavor.
>


not just JSON....

The little liberal Perl parser I made here (work in progress, still some
bugs to fix) puts its output into a Perl hash, with any element that can
have multiple values represented by an array. (That should be reflected
in any conversion to JSON too.)

I don't really see the need to substitute:
the first one found could be in [0] regardless...

I'd like to see a bit of discussion about such output from parsers
generally, with the aim of working out something that makes it as easy
as possible to get at data for simple applications while still
maintaining some clues to its hierarchy for more complex apps
(even if it means some data might need to be duplicated).

any ideas?


From kevinmarks at gmail.com  Fri Oct  5 00:43:12 2007
From: kevinmarks at gmail.com (Kevin Marks)
Date: Fri Oct  5 00:43:14 2007
Subject: [uf-dev] JSON serialization of microformats
In-Reply-To: <001c01c806f4$282b6aa0$116bacca@COMCEN>
References: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
	<001c01c806f4$282b6aa0$116bacca@COMCEN>
Message-ID: <73766b160710050043v643eb60clde8324709ecad800@mail.gmail.com>

> not just JSON....
>
> The little liberal Perl parser I made here (work in progress, still some
> bugs to fix) puts its output into a Perl hash, with any element that can
> have multiple values represented by an array. (That should be reflected
> in any conversion to JSON too.)
>
> I don't really see the need to substitute:
> the first one found could be in [0] regardless...
>
> I'd like to see a bit of discussion about such output from parsers
> generally, with the aim of working out something that makes it as easy
> as possible to get at data for simple applications while still
> maintaining some clues to its hierarchy for more complex apps
> (even if it means some data might need to be duplicated).

JSON is a good way to represent the kinds of nested hashes and arrays
that perl, PHP, Python and Javascript all make natural - there is a
consistent two-way mapping to native forms in each of these languages.
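
A small illustration of that round trip using Python's standard json
module. The sample data is invented for the example.

```python
import json

# Hypothetical parser output: nested dicts and lists, as the thread discusses.
data = {"hcard": [{"fn": "Alice Example", "tel": ["+1-555-0100"]}]}

text = json.dumps(data)      # native structure -> JSON text
restored = json.loads(text)  # JSON text -> native structure

# Dicts with string keys, lists, strings, numbers, booleans and None
# survive the round trip unchanged.
print(restored == data)
```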
From danny.ayers at gmail.com  Fri Oct  5 01:53:39 2007
From: danny.ayers at gmail.com (Danny Ayers)
Date: Fri Oct  5 01:53:42 2007
Subject: [uf-dev] JSON serialization of microformats
In-Reply-To: <73766b160710050043v643eb60clde8324709ecad800@mail.gmail.com>
References: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
	<001c01c806f4$282b6aa0$116bacca@COMCEN>
	<73766b160710050043v643eb60clde8324709ecad800@mail.gmail.com>
Message-ID: <1f2ed5cd0710050153r57424e71k5bfb07c670057155@mail.gmail.com>

fyi, there's a JSON/RDF under discussion:

http://n2.talis.com/wiki/RDF_JSON_Specification

It's normalised down to this kind of shape:

(
resource (
                property:value,
                property:value...
             )
resource (
               property:value,
               property:value...
             )
)

Anything expressed in the other main RDF serializations (RDF/XML,
Turtle, RDFa, eRDF/HTML, GRDDL/XML etc...) should be serializable this
way - although I'm not sure how much that's been tested. There's
already a fairly stable JSON representation of SPARQL query results:
http://www.w3.org/2001/sw/DataAccess/json-sparql/

If the microformats JSON captures all the relevant info, it should be
fairly cleanly mappable to RDF/JSON. I imagine names used for
attributes in the uF version could be disambiguated to URIs by
prepending the microformat's profile URI (plus a '#' or '/' if
necessary).
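
As a sketch of that disambiguation step in Python: the profile URI below
is a placeholder, and property_uri() and qualify() are hypothetical
helpers, not part of any RDF/JSON tooling.

```python
def property_uri(profile_uri, name):
    """Prepend the profile URI to a property name, adding '#' if needed."""
    separator = "" if profile_uri.endswith(("#", "/")) else "#"
    return profile_uri + separator + name


def qualify(obj, profile_uri):
    """Return a copy of a flat uF JSON object with URI-qualified keys."""
    return {property_uri(profile_uri, key): value for key, value in obj.items()}


# Placeholder profile URI, used for illustration only.
PROFILE = "http://example.org/profile/hcard"
print(qualify({"fn": "Alice Example"}, PROFILE))
```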

Cheers,
Danny.

-- 

http://dannyayers.com
From dimitri.glazkov at gmail.com  Fri Oct  5 08:13:29 2007
From: dimitri.glazkov at gmail.com (Dimitri Glazkov)
Date: Fri Oct  5 08:13:33 2007
Subject: [uf-dev] Re: JSON serialization of microformats
In-Reply-To: <8a52ddad0710041833p51db7566k67f72716c04124bd@mail.gmail.com>
References: <fb15ac210710040853r26d43c10xe0ecd413195ea6cd@mail.gmail.com>
	<ac08e8d0710041005w3b5c0c1cq4a959ea5a8d08c89@mail.gmail.com>
	<e06e0e0b0710041023u33794b53ufdb714d19f606742@mail.gmail.com>
	<8a52ddad0710041833p51db7566k67f72716c04124bd@mail.gmail.com>
Message-ID: <fb15ac210710050813v513d36e6l70967a084a7e0f74@mail.gmail.com>

I created a page yesterday:

http://microformats.org/wiki/json

Dmitry, can you briefly document how Optimus does the JSON serialization?

Ideally, I would like the serialization spec to be a short bulleted
list of rules that are applicable to any formats.


From uf-discuss at cilux.org  Thu Oct 11 02:42:42 2007
From: uf-discuss at cilux.org (Duncan Cragg)
Date: Thu Oct 11 02:42:45 2007
Subject: [uf-dev] Linked JSON Microformats (and the 'Micro Mash' viewer)
Message-ID: <5b5fe14a0710110242n18690399r4542d5e143f892b0@mail.gmail.com>

Hello, uf-dev!
____________________________

I've been advised on uf-discuss to continue a thread of mine over here:

 http://microformats.org/discuss/mail/microformats-discuss/2007-October/010838.html

It started here:

 http://microformats.org/discuss/mail/microformats-discuss/2007-September/010769.html

Then went on to this:

 http://microformats.org/discuss/mail/microformats-discuss/2007-October/010833.html

Is there any interest over here in this kind of thing?
____________________________

Well, two things:

 - pulling out uFs into URI-referencable JSON resources
 - then linking those JSON objects up.

And calling it 'hyperdata' (as opposed to 'Hyperdata', the Semantic
Web version).

Or linked data (as opposed to Linked Data). You get the idea...
____________________________

Sidebar:

I'm currently working on a Javascript 'browser extension' for viewing
such linked JSON objects, which I call 'micros'.  Micros can be uFs or
other uF-like data.

Working name: 'Micro Mash'    =0)

Aim: to be able to create mashups declaratively, i.e. without
writing any imperative code.

Uses Ajax to pull in micros and assemble them in the page down to some
depth into the graph. Uses script tags to allow cross-site mashing. If
the script sees a geo, it can decorate it with a map, if it sees an
hfeed, it can look like a feed reader, etc. Of course, hCard, etc
JSONs get rendered in DOM hCard, etc form, so Operator can see them.
Non-uF JSONs end up in POSH. Whole page wrapped in a nav decoration
(header, footer, sidebar, view controls, etc), also defined in JSON.

I'm working on JSON versions of hAtom and hCard (here at the FT we've
got news and companies, right?!). JSON elements can be selected via
JSONPath-like selectors, which jump inter-micro links transparently.
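
One way such link-jumping selection could work, sketched in Python. This
is only a guess at the idea, not Micro Mash itself: the in-memory STORE
stands in for fetching over HTTP, and the URLs, the dotted path syntax
and select() are all assumptions.

```python
# Hypothetical micros keyed by URL; "author" holds a link to another micro.
STORE = {
    "http://example.org/articles/1": {
        "title": "Hello",
        "author": "http://example.org/people/alice",
    },
    "http://example.org/people/alice": {"fn": "Alice Example"},
}


def select(store, start, path):
    """Walk a dotted path from the micro at `start`, following links.

    Whenever the current node is a string that names another micro in the
    store, it is replaced by that micro before the next key is applied.
    """
    node = store[start]
    for key in path.split("."):
        if isinstance(node, str) and node in store:
            node = store[node]  # jump the inter-micro link
        node = node[key]
    return node


print(select(STORE, "http://example.org/articles/1", "author.fn"))
```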

No server-side support needed (just vanilla Apache). All the funky
Ajax/DHTML driven by declarative programming of JSON, no raw
Javascript needed.

If you're interested I'll let you know when it's worth looking at.
I've already done an XHTML version, but I'm now porting it to pure
JSON.
____________________________

Cheers!

Duncan Cragg
The Financial Times Group (UK)
http://duncan-cragg.org/blog/
____________________________
From andy at pigsonthewing.org.uk  Thu Oct 11 03:25:06 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Thu Oct 11 03:25:16 2007
Subject: [uf-dev] Linked JSON Microformats (and the 'Micro Mash' viewer)
In-Reply-To: <5b5fe14a0710110242n18690399r4542d5e143f892b0@mail.gmail.com>
References: <5b5fe14a0710110242n18690399r4542d5e143f892b0@mail.gmail.com>
Message-ID: <37506.80.249.57.38.1192098306.squirrel@www.gradwell.com>

On Thu, October 11, 2007 10:42, Duncan Cragg wrote:

> 'hyperdata' (as opposed to 'Hyperdata', the Semantic Web version).

> Or linked data (as opposed to Linked Data).

Differentiating terms by capitalisation is A Bad Thing (TM). Some search
engines are case-insensitive; there are problems at the start of
sentences, and assistive technologies and aural browsers may render both
versions equally (just how different do 'hyperdata' and 'Hyperdata' 
sound?)

What about microdata (data from microformats)?

-- 
Andy Mabbett
** via webmail **

From uf-discuss at cilux.org  Thu Oct 11 03:51:49 2007
From: uf-discuss at cilux.org (Duncan Cragg)
Date: Thu Oct 11 03:51:50 2007
Subject: [uf-dev] Linked JSON Microformats (and the 'Micro Mash' viewer)
In-Reply-To: <37506.80.249.57.38.1192098306.squirrel@www.gradwell.com>
References: <5b5fe14a0710110242n18690399r4542d5e143f892b0@mail.gmail.com>
	<37506.80.249.57.38.1192098306.squirrel@www.gradwell.com>
Message-ID: <5b5fe14a0710110351v4600c011j313eda36d8378391@mail.gmail.com>

> > 'hyperdata' (as opposed to 'Hyperdata', the Semantic Web version).
> > Or linked data (as opposed to Linked Data).
>
> Differentiating terms by capitalisation is A Bad Thing (TM).

Ooops: It was not a suggestion to actually do this: I was just trying
to implicitly refer to the concept of 'small-s' and 'big-s' semantic
web!  Sorry I wasn't more clear about that!  =0)  Oh, btw/BTW, isn't
it 'tm' not 'TM'??!!

Duncan
From andy at pigsonthewing.org.uk  Thu Oct 11 04:02:15 2007
From: andy at pigsonthewing.org.uk (Andy Mabbett)
Date: Thu Oct 11 04:02:27 2007
Subject: [uf-dev] Linked JSON Microformats (and the 'Micro Mash' viewer)
In-Reply-To: <5b5fe14a0710110351v4600c011j313eda36d8378391@mail.gmail.com>
References: <5b5fe14a0710110242n18690399r4542d5e143f892b0@mail.gmail.com>
	<37506.80.249.57.38.1192098306.squirrel@www.gradwell.com>
	<5b5fe14a0710110351v4600c011j313eda36d8378391@mail.gmail.com>
Message-ID: <48526.80.249.57.38.1192100535.squirrel@www.gradwell.com>

On Thu, October 11, 2007 11:51, Duncan Cragg wrote:

>> Differentiating terms by capitalisation is A Bad Thing (TM).

>Oh, btw/BTW, isn't it 'tm' not 'TM'??!!

They're the same. Differentiating terms by capitalisation is A Bad Thing.

-- 
Andy Mabbett
** via webmail **

From jeff at jeffmcneill.com  Sat Oct 20 17:42:47 2007
From: jeff at jeffmcneill.com (Jeff McNeill)
Date: Sat Oct 20 17:42:50 2007
Subject: [uf-dev] MediaWiki extension to support classes in anchor tags
Message-ID: <519fa62f0710201742i287008b4j3f0ddb7ae0543c07@mail.gmail.com>

Hi folks,

I hacked a MediaWiki extension for class support in anchor tags...
http://www.mediawiki.org/wiki/Extension:ExtendAnchorTags

Any and all input or suggestions welcome.

-- 
Sincerely,
Jeff McNeill
http://jeffmcneill.com/
From jeff at jeffmcneill.com  Sat Oct 20 21:56:37 2007
From: jeff at jeffmcneill.com (Jeff McNeill)
Date: Sat Oct 20 21:56:40 2007
Subject: [uf-dev] Second mediawiki extension: EnableAbbrTags
Message-ID: <519fa62f0710202156g21869dbduc3e2676590bfdf31@mail.gmail.com>

Aloha folks,

A second mediawiki extension has been released that enables support
for the abbr tag:

http://www.mediawiki.org/wiki/Extension:EnableAbbrTags

This along with ExtendAnchorTags (
http://www.mediawiki.org/wiki/Extension:ExtendAnchorTags ) helps
provide support for POSH and Microformats on the MediaWiki platform.

With these two, just replace all <abbr...></abbr> with
<xabbr...></xabbr> and <a...></a> with <xa...></xa> and everything is
good to go... woot!

All feedback welcome and encouraged.

P.S., It would be lovely if these extensions could be enabled on
microformats.org itself.

-- 
Sincerely,
Jeff McNeill
http://jeffmcneill.com/
From tantek at cs.stanford.edu  Sun Oct 21 09:56:09 2007
From: tantek at cs.stanford.edu (Tantek Çelik)
Date: Sun Oct 21 09:54:49 2007
Subject: [uf-dev] Second mediawiki extension: EnableAbbrTags
In-Reply-To: <519fa62f0710202156g21869dbduc3e2676590bfdf31@mail.gmail.com>
Message-ID: <C340D26B.9619D%tantek@cs.stanford.edu>

On 10/20/07 9:56 PM, "Jeff McNeill" <jeff@jeffmcneill.com> wrote:

> Aloha folks,
> 
> A second mediawiki extension has been released that enables support
> for the abbr tag:
> 
> http://www.mediawiki.org/wiki/Extension:EnableAbbrTags
> 
> This along with ExtendAnchorTags (
> http://www.mediawiki.org/wiki/Extension:ExtendAnchorTags ) helps
> provide support for POSH and Microformats on the MediaWiki platform.
> 
> With these two, just replace all <abbr...></abbr> with
> <xabbr...></xabbr> and <a...></a> with <xa...></xa> and everything is
> good to go... woot!
> 
> All feedback welcome and encouraged.
> 
> P.S., It would be lovely if these extensions could be enabled on
> microformats.org itself.

Jeff, both of these look very cool!

Just so this is not lost in email, could you add both extensions to:

http://microformats.org/wiki/to-do#Wiki_improvements

Thanks again!

Tantek

From wilson.jim.r at gmail.com  Sun Oct 21 22:21:24 2007
From: wilson.jim.r at gmail.com (Jim Wilson)
Date: Sun Oct 21 22:21:26 2007
Subject: [uf-dev] MediaWiki extension to support classes in anchor tags
In-Reply-To: <519fa62f0710201742i287008b4j3f0ddb7ae0543c07@mail.gmail.com>
References: <519fa62f0710201742i287008b4j3f0ddb7ae0543c07@mail.gmail.com>
Message-ID: <ac08e8d0710212221o40d34d28i9bbc0d8aec27095d@mail.gmail.com>

Hi Jeff,

Please excuse the length of this email.  I think it's really great
that you're doing MediaWiki extension development, and I encourage you
to keep it up!

I noticed another of your extensions at mediawiki.org having to do
with "enabling" <abbr> tags.  Unfortunately, MediaWiki does not make
it easy to simply "enable" these kinds of things - causing a vacuum to
be filled by extensions such as yours.  It has been on my TODO list
for a while to modify the Parser and Sanitizer classes to give wiki
administrators granular control over which tags/attributes are
allowed, but alas I haven't gotten around to it :/

In any case, here are my thoughts on your extension as it is today:

Coding Observations:

* The following line should be removed from the Extension code (it is
unnecessary):

  $wgHooks['ParserAfterStrip'][] = 'extendAnchorTag';

* The comments mention matching the href attribute against
$wgUrlProtocols, but that variable isn't actually checked.  Instead, I
suggest validating against the result of wfUrlProtocols() since this
will handle the concatenation for you, and allow the wiki admin to
retain control over the allowed protocol list.

* There's a lot of code duplication in the tag rendering portion of
startExtendAnchor().  This can be shortened greatly (code sample
provided at the end).

* Also, it's legal in MediaWiki to create extension tags with the
names of real actual tags, so feel free to use <a> instead of <xa> if
you feel it's more appropriate - it's totally up to you.

Security Observations:

* I notice you use htmlspecialchars() to sanitize user input prior to
display.  This is a great first step.  To be even more effective, you
probably want to set the optional second parameter to ENT_QUOTES.
This will ensure that single quotes will be encoded along with double
quotes.  For non-url parameters (like target or class), I usually
recommend wfUrlencode(), which builds on PHP's native urlencode()
method.  Also note that htmlspecialchars() is often not sufficient to
stop XSS attacks (as illustrated by the following example).

* If the internal anchor text starts with "<img ", it is kept after an
htmlspecialchars() cleansing. I believe this indicates that you mean
to allow images.  Unfortunately this permits the user to hotlink
external images (which a wiki administrator may want to prevent).
Secondly, this provides a convenient injection vector for arbitrary
JavaScript.  To see why, consider this markup:

  <xa href="#"><img src='http://jimbojw.com/images/transparent.png'
onLoad='alert("XSS Code Here...")' /></xa>

When added to a wiki page, this will execute the alert() after the
image has loaded, meaning after it has downloaded from the server (or
been retrieved from the browser cache).

The alternative I'd suggest is using the Parser's recursiveTagParse()
method to evaluate the $input.
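
On the ENT_QUOTES point above, here is the same concern shown with
Python's standard html.escape() rather than PHP. It is only an analogy
(with quote=False Python leaves both quote characters alone, whereas
PHP's default at the time of this thread encoded double quotes only),
and the payload is made up.

```python
import html

# Hypothetical user input aimed at a single-quoted attribute.
payload = "x.png' onload='alert(1)"

# Without quote escaping, the single quotes survive and can close the
# attribute early, letting the input inject an extra attribute.
unsafe = html.escape(payload, quote=False)
print("<img src='%s' />" % unsafe)

# With quote escaping, both " and ' are encoded, so the value stays
# inside the attribute.
safe = html.escape(payload, quote=True)
print("<img src='%s' />" % safe)
```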

So, without further ado, here is my proposed implementation of
startExtendAnchor():

----------------------------------------
function startExtendAnchor( $input, $argv, &$parser ) {

    # Short-circuit if required 'href' param is missing
    if ( !isset( $argv['href'] ) ) {
        return "<div class='errorbox'>Error: <tt>href</tt> attribute " .
            "missing for <tt>&lt;xa&gt;</tt> tag.</div>";
    }

    # Short-circuit if a bad protocol is encountered
    # (wfUrlProtocols() supplies the admin-configured protocol list)
    if ( !preg_match( '/^(#|' . wfUrlProtocols() . ')/', $argv['href'] ) ) {
        return "<div class='errorbox'>Error: Bad protocol specified in " .
            "<tt>href</tt> attribute of <tt>&lt;xa&gt;</tt> tag.</div>";
    }

    # Set aside $href, keep only the allowed attributes, and sanitize them
    $href = htmlspecialchars( $argv['href'], ENT_QUOTES );
    $argv = array_intersect_key( $argv,
        array( 'target' => 1, 'class' => 1, 'rel' => 1 ) );
    # array_map() keeps the encoded values; array_walk() would discard them
    $argv = array_map( 'wfUrlencode', $argv );

    # Build and return anchor markup
    $anchor = "<a href=\"$href\"";
    foreach ( $argv as $attrib => $val ) {
        $anchor .= " $attrib=\"$val\"";
    }
    return $anchor . " >" . $parser->recursiveTagParse( $input ) . "</a>";

}
----------------------------------------

I'll be happy to answer any questions, and good luck!

-- Jim R. Wilson (jimbojw)
