[uf-discuss] Storing Microformats
zen at zenpsycho.com
Mon Sep 17 15:26:06 PDT 2007
I would say that a relational database would be the best option here,
if it weren't for SQL. These suggestions all seem to be more about
avoiding the pain of dealing with SQL than about doing the right
thing. The fact is that microformat attributes all have well-defined
relations, which can easily be modeled with the relational model. The
difficulty SQL adds to actually working with the relational model
is the huge barrier here (that's why you were using a "flat" table,
and also why it didn't work).
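
To make the relational point concrete, here is a minimal sketch using
Python's standard sqlite3 module. The table and column names (item,
property, source_url) and the helper functions are hypothetical, chosen
only for illustration. The idea is one row per microformat object and
one row per property, so a repeated property such as a second "tel" is
simply another row rather than another column in a flat table.

```python
import sqlite3

# Hypothetical schema: one row per microformat object, one row per property.
SCHEMA = """
CREATE TABLE item (
    id         INTEGER PRIMARY KEY,
    type       TEXT NOT NULL,      -- e.g. 'vcard' or 'vevent'
    source_url TEXT NOT NULL       -- page the object was found on
);
CREATE TABLE property (
    id      INTEGER PRIMARY KEY,
    item_id INTEGER NOT NULL REFERENCES item(id),
    name    TEXT NOT NULL,         -- e.g. 'fn', 'tel', 'url'
    value   TEXT
);
CREATE INDEX property_name_value ON property(name, value);
"""


def open_store(path=":memory:"):
    """Open (or create) a store and install the schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_item(conn, item_type, source_url, props):
    """Insert one object; props is a list of (name, value) pairs."""
    cur = conn.execute(
        "INSERT INTO item (type, source_url) VALUES (?, ?)",
        (item_type, source_url),
    )
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO property (item_id, name, value) VALUES (?, ?, ?)",
        [(item_id, name, value) for name, value in props],
    )
    conn.commit()
    return item_id


def find_items(conn, name, value):
    """Return (id, type, source_url) for objects with a matching property."""
    rows = conn.execute(
        "SELECT DISTINCT item.id, item.type, item.source_url "
        "FROM item JOIN property ON property.item_id = item.id "
        "WHERE property.name = ? AND property.value = ? "
        "ORDER BY item.id",
        (name, value),
    )
    return rows.fetchall()
```

Nested properties (an adr inside a vcard, for instance) are not covered
here; a nullable self-reference on the property table would be one
way to extend the sketch.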
One easy option is to simply serialize the microformat as JSON, XML,
or the original HTML markup, and store it in a text blob. This is a
perfectly legitimate solution which is often avoided due to
misunderstandings about atomicity in databases, or due to performance
concerns (most often these are 'premature optimization').
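
A rough sketch of the blob approach, again with Python's standard
sqlite3 and json modules. The table name microformat, its columns, and
the save/load helpers are assumptions made up for this example; the
parsed microformat is assumed to already be a plain dict.

```python
import json
import sqlite3


def open_blob_store(path=":memory:"):
    """Open a store with a single table holding serialized objects."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS microformat ("
        " id INTEGER PRIMARY KEY,"
        " type TEXT NOT NULL,"
        " source_url TEXT NOT NULL,"
        " body TEXT NOT NULL)"  # the serialized object, stored as text
    )
    return conn


def save(conn, item_type, source_url, data):
    """Serialize a parsed microformat (a dict) to JSON and store it."""
    body = json.dumps(data, sort_keys=True)
    cur = conn.execute(
        "INSERT INTO microformat (type, source_url, body) VALUES (?, ?, ?)",
        (item_type, source_url, body),
    )
    conn.commit()
    return cur.lastrowid


def load(conn, item_id):
    """Fetch one object by id and deserialize it, or return None."""
    row = conn.execute(
        "SELECT body FROM microformat WHERE id = ?", (item_id,)
    ).fetchone()
    return None if row is None else json.loads(row[0])
```

The trade-off is that the database cannot see inside the body column,
so searching by property means either scanning the blobs or handing
them to a separate full-text index.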
If you have more questions about the specifics of how I would do the
more difficult (but more flexible) relational solution, please feel
free to email me. I fear that would stray off topic here, and it would
take some significant effort to write out.
On 9/18/07, Philip Tellis <philip.tellis at gmail.com> wrote:
> On 18/09/2007, Paul Kinlan <paul.kinlan at gmail.com> wrote:
> > One of the other ideas that I am toying with is a Microformat spider,
> > that crawls the web looking for microformats, storing them and then
> > allowing them to be searched. My question is: How are people storing
> > the data present in microformats so that they can be searched and
> > maintained and consumed in applications etc?
> You may want to look at either an Object Oriented Database or an XML
> Database. Short of that, you're probably best off just using flat text
> files (each file is an object here) and letting your search engine
> index them. You'll quickly run into scale issues wrt number of files
> per directory, so take care of that.
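
For the files-per-directory problem mentioned above, one common
workaround is to spread the files over nested subdirectories named
after a hash prefix. A small sketch, assuming each object is keyed by
some string such as its source URL; the function names, the ".html"
suffix, and the two-level layout are arbitrary choices for the example.

```python
import hashlib
import os


def shard_path(root, key, levels=2, width=2):
    """Map a key to root/ab/cd/<sha1>.html using hash-prefix directories."""
    digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
    parts = [digest[i * width:(i + 1) * width] for i in range(levels)]
    return os.path.join(root, *parts, digest + ".html")


def write_object(root, key, markup):
    """Write one object's markup to its sharded location and return the path."""
    path = shard_path(root, key)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(markup)
    return path
```

With two levels of two hex characters there are 65,536 leaf
directories, so each one stays small even with many millions of files.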