<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>http://microformats.org/wiki/index.php?action=history&amp;feed=atom&amp;title=dataset-examples</id>
	<title>dataset-examples - Revision history</title>
	<link rel="self" type="application/atom+xml" href="http://microformats.org/wiki/index.php?action=history&amp;feed=atom&amp;title=dataset-examples"/>
	<link rel="alternate" type="text/html" href="http://microformats.org/wiki/index.php?title=dataset-examples&amp;action=history"/>
	<updated>2026-04-09T10:05:39Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.38.4</generator>
	<entry>
		<id>http://microformats.org/wiki/index.php?title=dataset-examples&amp;diff=69595&amp;oldid=prev</id>
		<title>Aaronpk: Replace &lt;entry-title&gt; with {{DISPLAYTITLE:}}</title>
		<link rel="alternate" type="text/html" href="http://microformats.org/wiki/index.php?title=dataset-examples&amp;diff=69595&amp;oldid=prev"/>
		<updated>2020-07-18T16:21:41Z</updated>

		<summary type="html">&lt;p&gt;Replace &amp;lt;entry-title&amp;gt; with {{DISPLAYTITLE:}}&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 16:21, 18 July 2020&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l1&quot;&gt;Line 1:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 1:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;entry-title&amp;gt;&lt;/del&gt;Dataset examples&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&amp;lt;/entry-title&amp;gt;&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;{{DISPLAYTITLE:&lt;/ins&gt;Dataset examples&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;}}&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br/&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;There are many people and organizations publishing datasets online in a wide variety of formats (csv, sequence, xls, etc). Examples of webpages describing and linking to datasets are explored here.  &lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;There are many people and organizations publishing datasets online in a wide variety of formats (csv, sequence, xls, etc). Examples of webpages describing and linking to datasets are explored here.  &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Aaronpk</name></author>
	</entry>
	<entry>
		<id>http://microformats.org/wiki/index.php?title=dataset-examples&amp;diff=53129&amp;oldid=prev</id>
		<title>Aloisius: Initial version</title>
		<link rel="alternate" type="text/html" href="http://microformats.org/wiki/index.php?title=dataset-examples&amp;diff=53129&amp;oldid=prev"/>
		<updated>2013-05-01T22:40:50Z</updated>

		<summary type="html">&lt;p&gt;Initial version&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&amp;lt;entry-title&amp;gt;Dataset examples&amp;lt;/entry-title&amp;gt;&lt;br /&gt;
&lt;br /&gt;
There are many people and organizations publishing datasets online in a wide variety of formats (csv, sequence, xls, etc). Examples of webpages describing and linking to datasets are explored here. &lt;br /&gt;
&lt;br /&gt;
== The Problem ==&lt;br /&gt;
Discovering these datasets is difficult because there is no simple way of marking up the pages that describe them. Today, links to datasets are scattered throughout the web or entered into various central repositories. Publishing a dataset in a way that an automated search engine or software tool could discover it would go a long way towards easing the discovery process.&lt;br /&gt;
&lt;br /&gt;
== Use Cases ==&lt;br /&gt;
As the originator of the data, you publish a webpage with a link to that data for discovery purposes.&lt;br /&gt;
&lt;br /&gt;
Alternatively, a third party may publish links to your data (or webpage describing the data) and include extra metadata about it that the originator may not have included.&lt;br /&gt;
&lt;br /&gt;
== Real-World Examples ==&lt;br /&gt;
''Links to public web pages, either popular or insightful''&lt;br /&gt;
&lt;br /&gt;
=== Individual/Organizational Publishers ===&lt;br /&gt;
* FreeBase - https://developers.google.com/freebase/data&lt;br /&gt;
* 1000 Genomes - http://www.1000genomes.org/data&lt;br /&gt;
* Common Crawl - https://commoncrawl.atlassian.net/wiki/display/CRWL/About+the+Data+Set&lt;br /&gt;
* Data.gov - https://explore.data.gov/Geography-and-Environment/Worldwide-M1-Earthquakes-Past-7-Days/7tag-iwnu&lt;br /&gt;
* Data.gov.uk - http://data.gov.uk/dataset/average_earnings_index&lt;br /&gt;
&lt;br /&gt;
=== Centralized Repositories and/or Directories ===&lt;br /&gt;
* DataBib - http://databib.org/repository/380&lt;br /&gt;
* Amazon Public Datasets - http://aws.amazon.com/datasets/Economics/2285&lt;br /&gt;
* DataHub - http://datahub.io/dataset/diavgeia&lt;br /&gt;
&lt;br /&gt;
== Common Practices ==&lt;br /&gt;
Datasets are typically described using several common fields.&lt;br /&gt;
&lt;br /&gt;
* fn - name of the dataset&lt;br /&gt;
* records - number of records&lt;br /&gt;
* size - byte size of dataset&lt;br /&gt;
* schema - link to something describing the schema or a description of the schema itself&lt;br /&gt;
* url - url to dataset&lt;br /&gt;
** type - format the data is in&lt;br /&gt;
* sample - sample of data or link to the sample&lt;br /&gt;
** type - format the data is in&lt;br /&gt;
* summary - summary of the dataset&lt;br /&gt;
* description - description of the dataset&lt;br /&gt;
* terms - terms of use for the dataset, likely a URL&lt;br /&gt;
* dtpublished - date dataset was published&lt;br /&gt;
* dtupdated - date dataset was updated&lt;br /&gt;
* contributor - people/organizations contributing to the dataset&lt;br /&gt;
&lt;br /&gt;
== Existing Practices ==&lt;br /&gt;
* http://www.w3.org/standards/semanticweb/data&lt;br /&gt;
* http://www.w3.org/wiki/WebSchemas/Datasets&lt;br /&gt;
&lt;br /&gt;
== Brainstorming ==&lt;br /&gt;
&lt;br /&gt;
== See Also ==&lt;/div&gt;</summary>
		<author><name>Aloisius</name></author>
	</entry>
</feed>