web-page

This brainstorm is premature and fails to follow the microformats process.

Please see and use existing microformats, e.g. hAtom (which completely handles the required properties), before making new proposals.

- Tantek 07:57, 17 August 2009 (UTC)

brainstorm 2009-08-14

Author

Luís Nóbrega (nobrega.luis@gmail.com)

Copyright

Public Domain Contribution Requirement. Since the author(s) released this work into the public domain, in order to maintain this work's public domain status, all contributors to this page agree to release their contributions to this page to the public domain as well. Contributors may indicate their agreement by adding the public domain release template to their user page per the Voluntary Public Domain Declarations instructions. Unreleased contributions may be reverted/removed.

Patents

This specification is subject to a royalty-free patent policy, e.g. per the W3C Patent Policy, and IETF RFC 3667 & RFC 3668.

Introduction

The purpose of this document is to standardize the way a web page is identified across different software, platforms and purposes.

If you think you can contribute to this, please feel free to do so.

Problem statement

The variety of web pages today is enormous.

Trends suggest that their number will keep growing, and with that growth they will increasingly influence our lives.

We use web pages for many purposes because they are wide-ranging and because they differ in some respects. In other respects, however, they are all alike.

Why should we fill in the same fields in Delicious, Endnote, Connotea or Citeulike when it is the same page?

Why do we have to maintain a separate file such as sitemap.xml to improve Google's indexing?

I think we should not have to.

If there were an open standard for the way a web page is identified, we could perhaps do a great many things directly from the browser, and far more simply.

Possible uses

- Bookmark from the browser directly to bookmarking services such as Delicious, without extensions or plugins;
- Create references automatically from the browser in reference management software, web-based or not, such as Endnote, Citeulike or Connotea;
- Let webmasters easily define how search engines should index their web pages.

A prescriptive proposal

Required (in alphabetical order)
- Author (1)
- Keywords (1)
- Publisher (1)
- Title (1)
- To be indexed (2)
- URL (1)
- Year (1)

Recommended/optional (in alphabetical order)
- Abstract (1)
- Accession number (1)
- City (1)
- Contents (1)
- Database Provider (1)
- Description (1)
- DOI (1)
- Edition (1)
- Thumb Image (3)
- Medium Image (3)
- Large Image (3)
- Language (1)
- Last Update Date (1)
- Name of Database (1)
- Notes (1)
- Research Notes (1)
- Series Editor (1)
- Series Title (1)
- Type of Medium (1)
- Priority in this domain (4)

(1) Inspired by Endnote.

(2) A boolean value that allows or disallows indexing by search engines.

(3) Defines the URLs of images that represent the web page. These could appear in the results of search engines and other web content providers.

(4) Defines the priority of a page relative to other pages in the same domain. Proposed by Google (see https://www.google.com/webmasters/tools/docs/en/protocol.html (*)); its purpose was to order results among URLs from the same web site.

(*) This information was collected from that page, but the page recently disappeared and Google does not say why. Does anyone know why?

Possible Mark-up

<div class="hWebPage">
  <h3 class="webpage-title">Dummy title</h3>
  ...
</div>
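
As a rough illustration only, the required properties listed in this proposal might be marked up along the same lines as the fragment above. Only hWebPage and webpage-title come from this page; every other class name, the dummy values and the example.com URL are hypothetical placeholders, not agreed names, and the text encoding of the "to be indexed" boolean is just one possible choice.

```html
<!-- Hypothetical sketch: only "hWebPage" and "webpage-title" appear in the
     proposal; every other class name below is an assumed placeholder. -->
<div class="hWebPage">
  <h3 class="webpage-title">Dummy title</h3>
  <p>
    By <span class="webpage-author">Dummy Author</span>,
    published by <span class="webpage-publisher">Dummy Publisher</span>
    in <span class="webpage-year">2009</span>.
  </p>
  <p>
    Keywords:
    <span class="webpage-keyword">dummy</span>,
    <span class="webpage-keyword">example</span>
  </p>
  <a class="webpage-url" href="http://example.com/dummy-page">Permalink</a>
  <!-- "To be indexed" is a boolean; a plain text value is one possible encoding. -->
  <span class="webpage-indexed">true</span>
</div>
```

Repeating one element per keyword is a choice made for this sketch; a single element holding a comma-separated list would be another option.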