[uf-new] Microformats archival of non-HTML files? (was: Receipt microformat)

Manu Sporny msporny at digitalbazaar.com
Mon Jul 23 12:47:14 PDT 2007

Storset, Leif wrote:
> Manu Sporny wrote:
>> I'm assuming that Leif will take lead on this and post the first set 
>> of examples (let me know if this is not the case, Leif).
> I have now sent all my collected samples to Rob Manson for upload. 

The receipt-examples collection process raises an interesting problem:

Where are these receipt files going to be stored long-term?

I attempted to upload one of my receipts to the website because its
original URL is no longer valid. The same is true of a large number of
receipt-displaying websites. It is therefore vital that we archive these
receipts in a public area that is permanent and accessible to all,
including a record of how the examples changed over time. This place
should be under the control of microformats.org.

Several of the audio-examples links no longer work, making it
impossible to verify that the analysis statistics are valid. If we had
archived the examples when the analysis happened, this wouldn't be an issue.

I propose that the Microformats community create a central repository to
store content that is used as examples - such as inaccessible HTML,
data files, and images. Preferably, this repository would keep a
versioned history, much like the wiki, and be accessible via the Web.

There are two ways that we could do this:

1. Enable image and data upload support via the wiki (preferable).
2. Create a Subversion repository and make it browseable via HTTP.
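
To make "versioned history" concrete, here is a rough sketch of the kind
of record either option would need to keep for each example. It is
neither the wiki nor Subversion, only an illustration: the function
names, the manifest.jsonl file, and the directory layout are all
hypothetical, and a real setup would more likely rely on the revision
history that the wiki or Subversion already provides.

```python
import hashlib
import json
import time
from pathlib import Path


def archive_example(repo_dir, source_url, content, fetched_at=None):
    """Store one example file and record where and when it came from.

    Hypothetical helper: the file is saved under its SHA-256 digest, so
    a changed example never overwrites an earlier snapshot.
    """
    repo = Path(repo_dir)
    repo.mkdir(parents=True, exist_ok=True)

    digest = hashlib.sha256(content).hexdigest()
    (repo / digest).write_bytes(content)

    entry = {
        "url": source_url,
        "sha256": digest,
        "fetched_at": fetched_at if fetched_at is not None else time.time(),
    }
    # Append-only manifest: one JSON object per archived snapshot.
    with open(repo / "manifest.jsonl", "a", encoding="utf-8") as manifest:
        manifest.write(json.dumps(entry) + "\n")
    return entry


def history(repo_dir, source_url):
    """Return every archived snapshot of a URL, in the order archived."""
    manifest_path = Path(repo_dir) / "manifest.jsonl"
    if not manifest_path.exists():
        return []
    with open(manifest_path, encoding="utf-8") as manifest:
        entries = [json.loads(line) for line in manifest if line.strip()]
    return [e for e in entries if e["url"] == source_url]
```

Archiving the same URL twice with different content leaves both copies
on disk, and history() lists them in order, so an analysis can cite the
exact snapshot it was run against even after the original link dies.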

Has this problem been addressed before, and if so, where are we supposed
to store files in the long term?

-- manu

PS: I would also like to donate the crawler and Microformat image
analysis software we created to the community. Where is the source code
