[uf-discuss] picoformats and extracting event data from text email
mdagn at spraci.com
Sun Jul 30 19:44:17 PDT 2006
I'm thinking about easy (for the user) ways an event promoter could add
machine-readable data to their emails for adding events to spraci.com
I get a lot of emails about upcoming events but there is no time to do data
(I say to them the correct way to add events to spraci.com is to use the
forms on the website or to provide a data feed).
However, it would be good if at least some of the stuff sent by email could
be included in the listings (as that could be a significant amount of extra
Obviously for HTML emails they could use hCalendar, but for plain text
emails and for users who are not familiar with HTML, XML, etc. I'm thinking an
easy way for them might be just to include something like this in their
Name: (event name)
Date: event dates (full date or start-end including the year)
Location: (including city and country)
Description: (short description, lineup, etc)
Categories: (comma separated list of tags for things such as event-type,
music, genres, etc)
This could be added after each event blurb with at least one blank
line separating it from any other text.
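
As a rough sketch of how such blocks might be pulled out of a plain text
email, here is one possible approach in Python. The function name
parse_event_blocks is made up for illustration, and treating Name and Date
as the minimum required fields is an assumption, not something decided above.

```python
import re

# Field labels taken from the list above, matched case-insensitively.
FIELD_RE = re.compile(
    r"^(Name|Date|Location|Description|Categories):\s*(.*)$", re.IGNORECASE
)

def parse_event_blocks(text):
    """Return one dict per blank-line-separated block that looks like an event."""
    events = []
    for block in re.split(r"\n\s*\n", text.strip()):
        fields = {}
        current = None
        for line in block.splitlines():
            stripped = line.strip()
            match = FIELD_RE.match(stripped)
            if match:
                current = match.group(1).lower()
                fields[current] = match.group(2).strip()
            elif current and stripped:
                # A wrapped line continues the previous field's value.
                fields[current] += " " + stripped
        # Assumption: a block counts as an event only if it has a name and a date.
        if "name" in fields and "date" in fields:
            if "categories" in fields:
                fields["categories"] = [
                    c.strip() for c in fields["categories"].split(",") if c.strip()
                ]
            events.append(fields)
    return events
```

Blocks of ordinary blurb text contain no recognised labels, so they are
skipped without any special handling.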
The important thing here is it has to be easy for any event promoter to do
without too much thought and must not need any special authoring tools. (so
the names must be obvious - this could get complicated if support for other
languages is added so initially I guess it's just English)
If they omit the location field I could also make it check for '@' and use
anything between that and a new line.
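
A minimal sketch of that '@' fallback, in Python. The helper name
location_from_at is hypothetical. Requiring the '@' to stand at the start of
a line or after whitespace is my own assumption, added so that email
addresses such as info@example.com are not mistaken for a venue.

```python
import re

# '@' not preceded by a non-space character, then everything up to the newline.
AT_RE = re.compile(r"(?<!\S)@[ \t]*([^\n]+)")

def location_from_at(text):
    """Return the text after the first standalone '@', or None if there is none."""
    for match in AT_RE.finditer(text):
        candidate = match.group(1).strip()
        if candidate:
            return candidate
    return None
```

This would only be consulted when the block has no Location field.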
Of course the names would have to be loose because it has to be very easy
for people to do without having to look them up (otherwise I would probably
think of using the iCal/hCal names; a parser could accept both).
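
One way to keep the names loose while still accepting the iCal/hCal ones is
a simple alias table. The sketch below is in Python. summary, dtstart,
location, description, category and categories are the iCalendar/hCalendar
names; the other aliases (event, title, when, where, venue, tags) are only
guesses at what promoters might type, and canonical_field is a made-up name.

```python
# Maps whatever label the sender typed to one canonical field name.
ALIASES = {
    "name": "name", "event": "name", "title": "name", "summary": "name",
    "date": "date", "dates": "date", "when": "date", "dtstart": "date",
    "location": "location", "where": "location", "venue": "location",
    "description": "description",
    "categories": "categories", "category": "categories", "tags": "categories",
}

def canonical_field(label):
    """Return the canonical field name for a typed label, or None if unknown."""
    key = label.strip().rstrip(":").strip().lower()
    return ALIASES.get(key)
```

Unknown labels return None, so a parser can ignore them or treat the line
as ordinary text.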
For date formats I think it would have to accept most common formats, such
as those accepted by commonly used Perl modules or strtotime in PHP
(with a warning to people not to use dd/mm/yyyy or mm/dd/yyyy, as those are
ambiguous). It will probably be hard to explain to event promoters why the
year is required (they seem to be so used to leaving it off), so I will
probably have to provide them with a list of accepted formats for dates.
I don't really see any easy way around that.
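
To illustrate the idea of a fixed list of accepted formats, here is a small
Python sketch. The particular formats in ACCEPTED_FORMATS are only examples
I picked, not a proposed final list, and parse_event_date is a hypothetical
name. It handles a single date only (no start-end ranges), and month names
are matched in the current locale, assumed here to be English.

```python
import re
from datetime import datetime

# Example formats only; every one of them includes the year.
ACCEPTED_FORMATS = ["%Y-%m-%d", "%d %B %Y", "%d %b %Y", "%B %d, %Y", "%b %d, %Y"]

# All-numeric dates such as 12/08/2006 could be dd/mm or mm/dd.
AMBIGUOUS_RE = re.compile(r"^\d{1,2}[/.-]\d{1,2}[/.-]\d{2,4}$")

def parse_event_date(text):
    """Parse a single date, rejecting ambiguous or year-less input."""
    text = text.strip()
    if AMBIGUOUS_RE.match(text):
        raise ValueError("ambiguous numeric date, please spell out the month")
    for fmt in ACCEPTED_FORMATS:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue
    raise ValueError("unrecognised date format or missing year")
```

Because none of the formats can match without a year, "12 August" is
rejected the same way as any other unrecognised input.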
For the city/country a parser could check words against a list of known
cities/countries. If both a city and country (or state) are found it could
reduce the possibility of an incorrect match. (of course they will have to
be careful to spell them correctly and I probably would have to make it
somehow check for common alternate spellings of names and common
abbreviations for state names). If there are no matches it probably should
default to the area/city associated with the user.
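
The city/country check could look roughly like the Python sketch below. The
tiny CITIES, COUNTRIES and ALTERNATES tables are placeholder data made up
for the example, as is the name match_location. For simplicity it compares
comma-separated parts of the location rather than individual words.

```python
import re

# Placeholder data; a real list would be far larger.
COUNTRIES = {"australia", "germany", "usa"}
CITIES = {
    "sydney": {"australia"},
    "berlin": {"germany"},
    "melbourne": {"australia", "usa"},  # same city name in two countries
}
ALTERNATES = {"u.s.a.": "usa", "united states": "usa", "deutschland": "germany"}

def match_location(text, default):
    """Return (city, country), or the user's default area if nothing matches."""
    parts = [p.strip().lower() for p in re.split(r"[,\n]", text) if p.strip()]
    parts = [ALTERNATES.get(p, p) for p in parts]
    cities = [p for p in parts if p in CITIES]
    countries = [p for p in parts if p in COUNTRIES]
    # A city confirmed by a matching country is the most reliable result.
    for city in cities:
        for country in countries:
            if country in CITIES[city]:
                return (city, country)
    # A city alone is accepted only if it belongs to exactly one known country.
    if cities and not countries and len(CITIES[cities[0]]) == 1:
        return (cities[0], next(iter(CITIES[cities[0]])))
    return default
```

For example, "Melbourne" on its own falls back to the default because it is
ambiguous in this table, while "Melbourne, U.S.A." resolves to the USA.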
This is similar to something I had for processing incoming events from a
certain contributor back in the mid 90s.
I needed something like this because the data from that person was
just plain text that was sent to me by email and I wanted to reduce the
amount of manual retyping.
I think this idea is closely related to picoformats so I'll be watching that
I'm thinking about this (yet again) for incoming emails and possibly also
future mobile applications where people are manually typing text as a single