From kavi at google.com Mon Apr 20 13:28:01 2009
From: kavi at google.com (Kavi Goel)
Date: Mon Apr 20 13:28:08 2009
Subject: [uf-new] New proposal - aggregate microformats
In-Reply-To: <199b56630904101822k2284c44bp24fd084f870b5813@mail.gmail.com>
References: <199b56630904101822k2284c44bp24fd084f870b5813@mail.gmail.com>
Message-ID: <199b56630904201328w5fdb320and0e83942a06012c6@mail.gmail.com>

[I've been hearing that this email may not have gone through, so I am resending.]

Please read a proposal for microformats about aggregate data (e.g. collections of reviews, collections of products, etc.) below. Feedback is encouraged.

---------- Forwarded message ----------
From: Kavi Goel
Date: Fri, Apr 10, 2009 at 6:22 PM
Subject: New proposal - aggregate microformats
To: "For discussion of new microformats."

Microformats gurus,

A while ago, I expressed interest in extending hReview to handle the "aggregate reviews" case (e.g. 34 reviews, average rating of 3.5 / 5). Currently hReview is designed to mark up only a single review, whereas frequently on the web the aggregate info is more important than any single user review. Technically, aggregating reviews by aggregating a bunch of individual hReviews across a site can be very difficult, so there is a lot of value to be gained by marking up aggregate information directly.

The proposal that gained momentum was to define a new "aggregate hReview" microformat that contains one field "count" and has one embedded hReview. A further comment was made that if we are going to define a new microformat, we should apply this idea more generally to handle many types of aggregations rather than consider aggregate price info (aggregate hProduct/hListing), aggregate discussion posts (aggregate hAtom), etc. separately.

Othar Hansson and I have put together a proposal to handle aggregations of microformats in a general way.

Please provide feedback:

http://microformats.org/wiki/aggregate-microformat-template-examples
http://microformats.org/wiki/aggregate-microformat-template-brainstorming

Kavi

From yngyani at gmail.com Tue Apr 21 06:08:01 2009
From: yngyani at gmail.com (Ganesh YN)
Date: Tue Apr 21 06:08:05 2009
Subject: [uf-new] DCMF - Dublin Core metadata and Microformats
Message-ID: <6f11d5660904210608v14c09b1dn3ac81266aa3dedf7@mail.gmail.com>

Hi Everyone,

Sharing a thought,

The concept of Dublin Core Microformat (DCMF) was last presented by Dr. Eva Mendez during the DC2008 in Berlin.
- http://dc2008.de/wp-content/uploads/2008/09/dc2008_mendezetal.pdf

I had earlier discussed with Dr. Eva that the core elements in the microformat can be derived from the already established DC-KERNEL elements,
(http://yngyani-semanticweb.blogspot.com/2009/04/microformats-and-dublin-core.html)

There would be a terrific potential usage by the Library community as they can benefit from this microformat if it gets formalized here as well.

I would like to seek feedback & suggestions from the group.

regards,
Ganesh Yanamandra

--
Please consider the environment before printing this email.

From brian.suda at gmail.com Tue Apr 21 06:54:49 2009
From: brian.suda at gmail.com (Brian Suda)
Date: Tue Apr 21 06:54:54 2009
Subject: [uf-new] DCMF - Dublin Core metadata and Microformats
In-Reply-To: <6f11d5660904210608v14c09b1dn3ac81266aa3dedf7@mail.gmail.com>
References: <6f11d5660904210608v14c09b1dn3ac81266aa3dedf7@mail.gmail.com>
Message-ID: <21e770780904210654y5712e4e3k1dab6a6b438661b9@mail.gmail.com>

On Tue, Apr 21, 2009 at 1:08 PM, Ganesh YN wrote:
> There would be a terrific potential usage by the Library community as
> they can benefit from this microformat if it gets formalized here as
> well.
>
> I would like to seek feedback & suggestions from the group.

--- Hello,
You should have a look at http://microformats.org/wiki/cite where there was an effort to map Dublin Core, as well as other citation formats, to class values as a microformat.

Hopefully this will give you more information about where things stand and where they can move forward.

-brian

--
brian suda
http://suda.co.uk

From khaitan at gmail.com Thu Apr 23 07:25:48 2009
From: khaitan at gmail.com (Indus Khaitan)
Date: Thu Apr 23 07:53:32 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
Message-ID:

ufrs,

I'm interested in finding a way to identify the content boundaries for content aggregated/published on a composite page. This is from the perspective of a search engine (and/or a data aggregator) which looks at the page as a single unit of content while in reality it may be a container page of aggregated data (I have explained this visually here http://www.khaitan.org/blog/2009/03/the-micro-content-problem-in-search-result-pollution/).

Examples of such pages are comments, multiple posts on a single page, twitter's public_timeline, message boards/groups with discussions/threads, flickr photo sets, Question & Answer pages, composite FAQ pages, a monthly calendar, a task list, activity streams and so on.

Expanding on a simple example: twitter's public timeline page, consists of 20 individual content (can I say micro-content?) units (or twitter statuses). On an aggregate page, these status messages are uniquely identifiable units of content but there is no determinate way of discerning boundaries of the individual statuses and successfully parsing them without knowing the visual arrangement of markup. The same problem statement can be attached to other example situations.

The current mechanisms do not provide any semantic cues to a search bot. Nor is there any easy way to detect duplicate content across multiple sites, although this could be done with a simple annotation in an extended use-case of the proposed solution.

I was hoping to see something around better content identification and grouping in the upcoming efforts, but the HTML5 spec for grouping content (see Sec. 4.5 http://www.whatwg.org/specs/web-apps/current-work/#grouping-content) only proposes markup for visually grouping the content.

Possible Solution:
I'm thinking of using something like "rel=cboundary" (not able to come up with a better name) with a link tag. Similar to "rel=nofollow", which provides meta information for un-endorsed links, "rel=cboundary" can provide the meta information for content demarcation/chunking. By adding this, the page would indicate that the markup (or content) following the link is semantically demarcated from the markup preceding the link.

This solution can be extended and can work with "rel=bookmark" to identify duplicate content when the same micro-content is present elsewhere.

Another benefit I see is that this can become an elemental microformat and can be used in hReview, hCalendar and several other microformats like activity-streams, comments, etc. which are in active discussions.

Would love to hear some comments before proceeding further.

Indus

--
http://khaitan.org
+1 408 689 9587

From scott at makedatamakesense.com Thu Apr 23 08:39:10 2009
From: scott at makedatamakesense.com (Scott Reynen)
Date: Thu Apr 23 08:39:15 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To:
References:
Message-ID: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com>

On Apr 23, at 8:25, Indus Khaitan wrote:

> Expanding on a simple example: twitter's public timeline page,
> consists of 20 individual content (can I say micro-content?) units (or
> twitter statuses). On an aggregate page, these status messages are
> uniquely identifiable units of content but there is no determinate way
> of discerning boundaries of the individual statuses and successfully
> parsing them without knowing the visual arrangement of markup.
> The same problem statement can be attached to other example situations.
>
> The current mechanisms do not provide any semantic cues to a search
> bot.

Twitter already uses hentry [1] to separate individual posts. You seem to be describing something more generic than hentry, but I'd suggest starting with hentry anyway to see if it doesn't handle most of your use cases.

[1] http://microformats.org/wiki/hatom

--
Scott Reynen
MakeDataMakeSense.com

From mail at tobyinkster.co.uk Thu Apr 23 09:03:43 2009
From: mail at tobyinkster.co.uk (Toby Inkster)
Date: Thu Apr 23 09:08:55 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To:
References:
Message-ID: <1240502623.7021.6.camel@ophelia2.g5n.co.uk>

On Thu, 2009-04-23 at 19:55 +0530, Indus Khaitan wrote:
> I'm interested in finding a way to identify the content boundaries for
> content aggregated/published on a composite page.

Surely hAtom entries are perfect for this purpose. What specific need do you have which isn't addressed by hAtom?

--
Toby Inkster

From khaitan at gmail.com Thu Apr 23 20:32:58 2009
From: khaitan at gmail.com (Indus Khaitan)
Date: Thu Apr 23 20:33:16 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com>
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com>
Message-ID:

Scott, Toby:

Thanks for the input about hentry/hatom.

I'm looking for something generic for demarcating arbitrary content boundaries. Whether the content following is hAtom(s), hResume(s), hProduct(s) or something else would be decided by the capabilities of the parser (if at all it is interested in parsing the specific microformat).

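For comparison with the generic marker being asked for, here is a minimal sketch of the hentry approach suggested earlier in the thread: a parser that treats every element carrying the class "hentry" as one content unit and collects its text. The names `HentrySplitter` and `split_hentries` are hypothetical, the sketch assumes reasonably well-formed markup (every non-void element is explicitly closed), and a nested hentry is simply folded into its outer entry.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class HentrySplitter(HTMLParser):
    """Collect the text of every element whose class list contains 'hentry'."""

    def __init__(self):
        super().__init__()
        self.entries = []   # finished entries, one whitespace-normalised string each
        self._depth = 0     # nesting depth inside the current hentry (0 = outside)
        self._chunks = []   # text fragments of the entry currently being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth:
            # Already inside an entry: only track nesting.
            self._depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if "hentry" in classes:
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # The hentry element itself just closed: this is the content boundary.
            self.entries.append(" ".join(" ".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def split_hentries(html):
    """Return the text of each hentry unit found in an HTML string."""
    parser = HentrySplitter()
    parser.feed(html)
    parser.close()
    return parser.entries
```

Note that the sketch only looks at the class name, not at hAtom's required fields, so it finds boundaries even for entries that carry nothing but text.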
Another prime motivation in the extended use-case is to allow bots to see that the individual content units (for example comments) have 'URL equivalence' [1] with other pages where the content unit has its own linked page but also exists in other linked container pages.

Indus

[1] URL equivalence where an embedded content unit has its own linked page somewhere, e.g. an embedded YouTube video has its own linked page somewhere on YouTube.

From scott at makedatamakesense.com Thu Apr 23 21:53:53 2009
From: scott at makedatamakesense.com (Scott Reynen)
Date: Thu Apr 23 21:54:02 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To:
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com>
Message-ID: <90BC1891-7AA6-41DD-96E5-B5AB098DB1C4@makedatamakesense.com>

On Apr 23, at 9:32, Indus Khaitan wrote:

> Thanks for the input about hentry/hatom.
>
> I'm looking for something generic for demarcating arbitrary content
> boundaries.

Can you give a few examples on the web today of content that you think could not be demarcated with hAtom? I'm assuming you've read the process, with its heavy emphasis on starting with existing formats?

http://microformats.org/wiki/process

> Another prime motivation in the extended use-case is to allow bots to
> see that the individual content units (for example comments) have 'URL
> equivalence'

This sounds like another problem hAtom solves (with rel-bookmark) [1]. Can you give some examples of where this wouldn't work?

[1] http://microformats.org/wiki/hatom#Entry_Permalink

--
Scott Reynen
MakeDataMakeSense.com

From davidjanes at blogmatrix.com Fri Apr 24 03:12:01 2009
From: davidjanes at blogmatrix.com (David Janes)
Date: Fri Apr 24 03:12:07 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To:
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com>
Message-ID: <21e523c20904240312w10b14ca1n88776d087db6a863@mail.gmail.com>

On Thu, Apr 23, 2009 at 11:32 PM, Indus Khaitan wrote:
>
> Scott, Toby:
>
> Thanks for the input about hentry/hatom.
>
> I'm looking for something generic for demarcating arbitrary content
> boundaries. Whether the content following is hAtom(s), hResume(s),
> hProduct(s) or something else would be decided by the capabilities of
> the parser (if at all it is interested in parsing the specific
> microformat).
>
> Another prime motivation in the extended use-case is to allow bots to
> see that the individual content units (for example comments) have 'URL
> equivalence' [1] with other pages where the content unit has its own
> linked page but also exist in other linked container pages.
>
> Indus
>
> [1] URL equivalence where an embedded content unit has its own linked
> page somewhere. eg. an embedded YouTube video has its own linked page
> somewhere on YouTube.
> _______________________________________________
> microformats-new mailing list
> microformats-new@microformats.org
> http://microformats.org/mailman/listinfo/microformats-new

Demarcating arbitrary content within a page is (as mentioned by all the other responders) exactly the purpose of hAtom. rel-bookmark seems to implicitly handle your URL issues. This is a decent reference [1].

The weakest parts of hAtom right now are the required element issues.

Regards, etc...

[1] http://www.ablognotlimited.com/articles/getting-semantic-with-microformats-part-5-hatom/

--
David Janes
Mercenary Programmer
http://code.davidjanes.com

From khaitan at gmail.com Mon Apr 27 17:23:28 2009
From: khaitan at gmail.com (Indus Khaitan)
Date: Mon Apr 27 17:23:46 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To: <21e523c20904240312w10b14ca1n88776d087db6a863@mail.gmail.com>
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com> <21e523c20904240312w10b14ca1n88776d087db6a863@mail.gmail.com>
Message-ID:

I'm after something more basic and elemental than requiring hAtom for demarcating content boundaries, something which can be used across any microformat that deals with content.

> The weakest parts of hAtom right now are the required element issues.

True. Not every piece of arbitrary content may have date, author and title. Content may just have text.

From jeremy at adactio.com Tue Apr 28 04:38:22 2009
From: jeremy at adactio.com (Jeremy Keith)
Date: Tue Apr 28 04:38:32 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To:
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com> <21e523c20904240312w10b14ca1n88776d087db6a863@mail.gmail.com>
Message-ID: <30D7CC17-B109-4414-8655-86CA6FC8E7CB@adactio.com>

>> The weakest parts of hAtom right now are the required element issues.
>
> True. Not every piece of arbitrary content may have date, author and
> title. Content may just have text.

I concur. But I think that rather than looking at creating a new format, we'd be better off relaxing the required content rules for hAtom.

Shall we move this over to the discuss list and kick start the discussion there?

There are already some issues raised on the hAtom issues page, such as the author requirement (probably the most frustrating requirement):

http://microformats.org/wiki/hatom-issues#author_as_an_hcard_is_too_much_to_require

--
Jeremy Keith

a d a c t i o

http://adactio.com/

From brian.suda at gmail.com Tue Apr 28 04:44:23 2009
From: brian.suda at gmail.com (Brian Suda)
Date: Tue Apr 28 04:44:29 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To: <30D7CC17-B109-4414-8655-86CA6FC8E7CB@adactio.com>
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com> <21e523c20904240312w10b14ca1n88776d087db6a863@mail.gmail.com> <30D7CC17-B109-4414-8655-86CA6FC8E7CB@adactio.com>
Message-ID: <21e770780904280444y1142e096o20917483d101caa3@mail.gmail.com>

On Tue, Apr 28, 2009 at 11:38 AM, Jeremy Keith wrote:
>>> The weakest parts of hAtom right now are the required element issues.
>>
>> True. Not every piece of arbitrary content may have date, author and
>> title. Content may just have text.
>
> I concur. But I think that rather than looking at creating a new format,
> we'd be better off relaxing the required content rules for hAtom.

--- There is some discussion about an 'item' container, http://microformats.org/wiki/item that would allow for arbitrary data to be grouped together. It is another option to explore.

-brian

--
brian suda
http://suda.co.uk

From davidjanes at blogmatrix.com Tue Apr 28 04:51:28 2009
From: davidjanes at blogmatrix.com (David Janes)
Date: Tue Apr 28 04:51:32 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To: <30D7CC17-B109-4414-8655-86CA6FC8E7CB@adactio.com>
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com> <21e523c20904240312w10b14ca1n88776d087db6a863@mail.gmail.com> <30D7CC17-B109-4414-8655-86CA6FC8E7CB@adactio.com>
Message-ID: <21e523c20904280451q4e08bf21u2d66fc5d9bc487c4@mail.gmail.com>

Seconded.

On Tue, Apr 28, 2009 at 7:38 AM, Jeremy Keith wrote:
>>> The weakest parts of hAtom right now are the required element issues.
>>
>> True. Not every piece of arbitrary content may have date, author and
>> title. Content may just have text.
>
> I concur. But I think that rather than looking at creating a new format,
> we'd be better off relaxing the required content rules for hAtom.
>
> Shall we move this over to the discuss list and kick start the discussion
> there?
>
> There are already some issues raised on the hAtom issues page, such as the
> author requirement (probably the most frustrating requirement):
>
> http://microformats.org/wiki/hatom-issues#author_as_an_hcard_is_too_much_to_require
>
> --
> Jeremy Keith
>
> a d a c t i o
>
> http://adactio.com/

--
David Janes
Mercenary Programmer
http://code.davidjanes.com

From khaitan at gmail.com Tue Apr 28 08:13:45 2009
From: khaitan at gmail.com (Indus Khaitan)
Date: Tue Apr 28 08:14:10 2009
Subject: [uf-new] New proposal: Elemental microformat for content boundaries
In-Reply-To: <21e770780904280444y1142e096o20917483d101caa3@mail.gmail.com>
References: <173AE03B-191B-4AE4-8C30-B13AA24A92AF@makedatamakesense.com> <21e523c20904240312w10b14ca1n88776d087db6a863@mail.gmail.com> <30D7CC17-B109-4414-8655-86CA6FC8E7CB@adactio.com> <21e770780904280444y1142e096o20917483d101caa3@mail.gmail.com>
Message-ID:

> --- There is some discussion about an 'item' container,
> http://microformats.org/wiki/item that would allow for arbitrary data
> to be grouped together. It is another option to explore.

+1. This is what I'm looking for. I'll explore more on the wiki and report back.

--
http://khaitan.org
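
The 'URL equivalence' use case raised in this thread, together with the rel-bookmark answer given to it, can be roughly sketched as follows. This is only an illustration under stated assumptions: each content unit is an element with class "hentry", its permalink is the first `<a rel="bookmark">` inside it, and two pages carry the same unit when they share a permalink. The names `PermalinkFinder`, `entry_permalinks` and `shared_entries` are hypothetical, and the example.com URLs used when trying it out are placeholders.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not change nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class PermalinkFinder(HTMLParser):
    """Record the first rel="bookmark" URL found inside each hentry element."""

    def __init__(self):
        super().__init__()
        self.permalinks = []   # one item per hentry; None when it has no permalink
        self._depth = 0        # nesting depth inside the current hentry (0 = outside)
        self._current = None   # permalink of the entry currently being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attrs = dict(attrs)
        if self._depth:
            self._depth += 1
            rels = (attrs.get("rel") or "").split()
            if tag == "a" and "bookmark" in rels and self._current is None:
                self._current = attrs.get("href")
        elif "hentry" in (attrs.get("class") or "").split():
            self._depth = 1
            self._current = None

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.permalinks.append(self._current)


def entry_permalinks(html):
    """Return the permalink (or None) of each hentry unit in an HTML string."""
    finder = PermalinkFinder()
    finder.feed(html)
    finder.close()
    return finder.permalinks


def shared_entries(page_a, page_b):
    """Permalinks present on both pages, i.e. candidate duplicate content units."""
    urls_a = {url for url in entry_permalinks(page_a) if url}
    urls_b = {url for url in entry_permalinks(page_b) if url}
    return urls_a & urls_b
```

Entries without a permalink are reported as None rather than dropped, which mirrors the point made above that arbitrary content may carry nothing beyond its text.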