[uf-discuss] Worth blogging about W3C's press release today?
khare at alumni.caltech.edu
Tue Sep 11 21:56:56 PDT 2007
September 11, 2007 10:00 AM Eastern Daylight Time
W3C Completes Bridge Between HTML/Microformats and Semantic Web
GRDDL Gives Web Content Hooks to Powerful Reuse and Data Integration
http://www.w3.org/--(BUSINESS WIRE)--Today, the World Wide Web
Consortium completed an important link between Semantic Web and
microformats communities. With "Gleaning Resource Descriptions from
Dialects of Languages", or GRDDL (pronounced "griddle"), software can
automatically extract information from structured Web pages to make
it part of the Semantic Web. Those accustomed to expressing
structured data with microformats in XHTML can thus increase the
value of their existing data by porting it to the Semantic Web, at
very low cost.
"Sometimes one line of code can make a world of difference," said Tim
Berners-Lee, W3C Director. "Just as stylesheets make Web pages more
readable to people, GRDDL makes Web pages, microformat tags, XML
documents, and data more readable to Semantic Web applications,
opening more data to new possibilities and creative reuse."
Getting Data into and out of the Web; how is it happening today?
One aspect of recent developments some people call "Web 2.0" involves
applications based on combining — in "mashups" — various types of
data that are spread all around on the Web. A number of active
communities innovating on the Web aim to share data such
as calendar information, contact information, and geopositioning
information. These communities have developed diverse social
practices and technologies that satisfy their particular needs. For
instance, search engines have had great success using statistical
methods while people who share photos have found it useful to tag
their photos manually with short text labels. Much of this work can
be captured via "microformats". Microformats refer to sets of simple,
open data formats built upon existing and widely adopted standards,
including HTML, CSS and XML.
This wave of activity has direct connections to the essence of the
Semantic Web. The Semantic Web-based communities have pursued ways to
improve the quality and availability of data on the Web, enabling
more intensive data integration and more diverse
applications that can scale to the size of the Web and allow even
more powerful mashups. The Web-based set of standards that supports
this work is known as the Semantic Web stack. The foundations of the
Semantic Web stack meet the requirements for formality of some
applications such as managing bank statements, or combining volumes
of medical data.
Each approach to "getting your data out there" has its place. But why
limit yourself to just one approach if you can benefit, at low cost,
from more than one? As microformats users consider more uses that
require data modelling or validation, how can they take advantage of
their existing data in more formal applications?
A Bridge from Flexible Web Applications to the Semantic Web
GRDDL is the bridge for turning data expressed in an XML format (such
as XHTML) into Semantic Web data. With GRDDL, authors transform the
data they wish to share into a format that can be used and
transformed again for more rigorous applications.
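
As a rough illustration of the "one line of code" idea, an XHTML page can opt in to GRDDL by declaring the GRDDL profile on its head element and linking to a transformation. The Python sketch below only performs the discovery step: it checks for the profile and lists the linked transformations. It does not run any transformation, and it ignores other places a transformation may be declared. The sample page, the example.org stylesheet URL, and the function name find_grddl_transformations are assumptions made up for this example.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
GRDDL_PROFILE = "http://www.w3.org/2003/g/data-view"

# Hypothetical page: the stylesheet URL is illustrative only.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <head profile="http://www.w3.org/2003/g/data-view">
    <title>Example</title>
    <link rel="transformation" href="http://example.org/hcard2rdf.xsl"/>
  </head>
  <body><p class="vcard"><span class="fn">Jane Doe</span></p></body>
</html>"""


def find_grddl_transformations(xhtml_text):
    """Return transformation URLs declared in the head of a GRDDL-enabled page."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return []
    # The profile attribute is a space-separated list of URIs.
    if GRDDL_PROFILE not in head.get("profile", "").split():
        return []
    hrefs = []
    for link in head.iter(f"{{{XHTML_NS}}}link"):
        # rel is also a space-separated list of tokens.
        if "transformation" in link.get("rel", "").split() and link.get("href"):
            hrefs.append(link.get("href"))
    return hrefs


if __name__ == "__main__":
    print(find_grddl_transformations(SAMPLE))
```

A GRDDL-aware agent would then fetch each listed transformation and apply it to the page to obtain RDF; that step is left out here.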
GRDDL Use Cases provides insight into why this is useful through a
number of real-world scenarios, including scheduling a meeting,
comparing information from various retailers before making a
purchase, and extracting information from wikis to facilitate
e-learning. Once data is part of the Semantic Web, it can be merged
with other data (for example, from a relational database, similarly
exposed to the Semantic Web) for queries, inferences, and conversion
to other formats.
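
The merging step can be pictured with a very small model in which each source contributes a set of (subject, predicate, object) triples. The Python sketch below is an illustration only and is not how any particular GRDDL or RDF toolkit works: the example.org identifiers, the short predicate names, and the helper functions are all hypothetical.

```python
def merge_graphs(*graphs):
    """Union several sets of (subject, predicate, object) triples."""
    merged = set()
    for graph in graphs:
        merged.update(graph)
    return merged


def match(graph, subject=None, predicate=None, obj=None):
    """Return triples matching the given pattern; None acts as a wildcard."""
    return sorted(
        (s, p, o)
        for (s, p, o) in graph
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    )


def event_dates(graph, person):
    """Find (event, start date) pairs for events the person attends."""
    return [
        (event, date)
        for _, _, event in match(graph, person, "attends")
        for _, _, date in match(graph, event, "dtstart")
    ]


# Hypothetical triples, as if gleaned from a microformats page ...
page_triples = {
    ("http://example.org/people#jane", "fn", "Jane Doe"),
    ("http://example.org/people#jane", "attends", "http://example.org/events#kickoff"),
}
# ... and as if exposed from a relational database.
db_triples = {
    ("http://example.org/events#kickoff", "dtstart", "2007-09-12"),
    ("http://example.org/events#kickoff", "location", "Boston"),
}

if __name__ == "__main__":
    merged = merge_graphs(page_triples, db_triples)
    print(event_dates(merged, "http://example.org/people#jane"))
```

Neither source alone can answer the query; only the merged set links the person to the event's start date, which is the kind of integration the paragraph above describes.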
The Working Group has reported on implementation experience, and its
members have come forward with statements of support and commitments
to implement GRDDL.
GRDDL Test Cases, also published today, describes and
includes test cases for software agents that support GRDDL. The Working
Group has produced a GRDDL service that allows users to input a
GRDDL'd file and extract the important data.
About the World Wide Web Consortium [W3C]
The World Wide Web Consortium (W3C) is an international consortium
where Member organizations, a full-time staff, and the public work
together to develop Web standards. W3C primarily pursues its mission
through the creation of Web standards and guidelines designed to
ensure long-term growth for the Web. Over 400 organizations are
Members of the Consortium. W3C is jointly run by the MIT Computer
Science and Artificial Intelligence Laboratory (MIT CSAIL) in the
USA, the European Research Consortium for Informatics and Mathematics
(ERCIM) headquartered in France and Keio University in Japan, and has
additional Offices worldwide. For more information see http://