Genealogy Formats

(Difference between revisions)

(GEDCOM _IS_ a format for genealogical data.)
Current revision (15:30, 25 July 2013) (view source)
(subheads, fix typo)
 
(11 intermediate revisions not shown.)
Line 1: Line 1:
-
= Genealogy Formats =
+
<entry-title> Genealogy Formats </entry-title>
-
I started this page because someone (Bob Jonkman apparently) added a bunch of stuff to the [http://developers.technorati.com/wiki/MicroFormats Technorati microformats page] on genealogy, and I moved it here. -[http://tantek.com/log/ Tantek]
+
Per the microformats [[process]], towards the development of a [[genealogy]] microformat, this page documents previous/existing genealogy related formats.
-
 
+
-
 
+
-
== In the wild ==
+
-
 
+
-
see: The Dring tree [http://www.sussexbarn.com/dring/web/dring/pafg01.htm] for an interesting family tree website.
+
-
 
+
-
[http://www.comp.utas.edu.au/users/rsmith/levett/wc01/wc01_045.html this family group] is pretty much a direct translation of a gedcom FAM structure, but with some names added to the links. It also includes back links to parents.
+
-
 
+
-
[http://www.comp.utas.edu.au/users/rsmith/levett/ps02/ps02_361.html an individual from the same tree] This is basically an INDI record from GEDCOM.
+
-
 
+
-
== problem statement ==
+
-
The main problem for geneaology on the web is that many people are posting their family trees, but if you were searching for your ancestors, there is no semantic in these pages which helps you link them to similar named individuals in your own tree. some sites like freeCEN and freeBMD have databases which can assist in this linkage, but they are incomplete and frustrating to use.
+
-
 
+
-
If there were some kind of order to this process, ordinary web searching might be used; and we could interlink family trees more readily.
+
-
 
+
-
[http://jay.askren.net/Projects/SemWeb/ RDF] and the semantic web has been used to tackle this problem, but this doesnt help people that want to publish, or search published trees until there is a real semantic web.
+
-
 
+
-
What I think we need is some kind of microformat markup to add to examples like [http://jay.askren.net/Projects/SemWeb/FamilyTrees/AbrahamLincoln.html this tree of Abraham Lincoln].
+
== GEDCOM ==
== GEDCOM ==
-
GEDCOM has become pretty much the defacto standard for sharing data between geneaology systems.  It is hierachical and link based, much like HTML; but it encodes family structure (which is a general graph) outside of this structural hierachy. <blockquote>GEDCOM was developed (...) to provide a flexible, uniform format for exchanging computerized genealogical data.[http://homepages.rootsweb.com/~pmcbride/gedcom/55gcint.htm]</blockquote>
+
GEDCOM has become pretty much the de facto standard for sharing data between genealogy systems.  It is hierarchical and link-based, much like HTML, but it encodes family structure (which is a general graph) outside of this structural hierarchy. <blockquote>GEDCOM was developed (...) to provide a flexible, uniform format for exchanging computerized genealogical data.[http://homepages.rootsweb.com/~pmcbride/gedcom/55gcint.htm]</blockquote>
* Coding the GEDCOM standard into hGEDCOM. I don't know of any projects working on this, but nominate [http://www.starkeffect.com/ged2html/3.6a/ Gene Stark's GED2HTML translator] (esp. the [http://www.starkeffect.com/ged2html/3.6a/templates.html modifiable output program]) as a suitable candidate for hGEDCOM hacking. See also [http://homepages.rootsweb.com/~pmcbride/genweb.html#gedcom Paul McBride's links] to GEDCOM standards and resources. --[mailto:bjonkman@sobac.com Bob Jonkman]
* Coding the GEDCOM standard into hGEDCOM. I don't know of any projects working on this, but nominate [http://www.starkeffect.com/ged2html/3.6a/ Gene Stark's GED2HTML translator] (esp. the [http://www.starkeffect.com/ged2html/3.6a/templates.html modifiable output program]) as a suitable candidate for hGEDCOM hacking. See also [http://homepages.rootsweb.com/~pmcbride/genweb.html#gedcom Paul McBride's links] to GEDCOM standards and resources. --[mailto:bjonkman@sobac.com Bob Jonkman]
-
* I'm not sure whether it makes sense to do GEDCOM as its own format, the FAM structure and the need to present different reports, suggest to me that we need some kind of post GEDCOM markup. To see how direct use of GEDCOM might pan out I hacked up this [[GEDCOM Worked example]]. To me the main issue seems to revolve around the FAM structure. I think the [http://jay.askren.net/Projects/SemWeb/ Jay Askren] approach might be better thsn the Gene Stark work as a starting point.
+
* I'm not sure whether it makes sense to do GEDCOM as its own format, the FAM structure and the need to present different reports, suggest to me that we need some kind of post-GEDCOM markup. To see how direct use of GEDCOM might pan out I hacked up this [[GEDCOM Worked example]]. To me the main issue seems to revolve around the FAM structure. I think the [http://jay.askren.net/Projects/SemWeb/ Jay Askren] approach might be better than the Gene Stark work as a starting point.
-
 
+
* Had a look at some examples of what GEDCOM creates [http://en.wikipedia.org/wiki/GEDCOM#Example].  Basically, seems to be [[xfn|XFN]] relationships (siblings, spouses etc.) and [[hcard|hCard]] information (could genealogy be inferred from existing XFNs regardless of a hGED format?). The only additional information we do not currently hold in a format is that of gender. GEDCOM specifies male or female for each individual. Creating something using these formats would be quite straightforward, but not sure its takeup would be good unless someone was interested in creating a hGEDCOM2GEDCOM. -- [[user:Phae|Frances Berriman]]
* Had a look at some examples of what GEDCOM creates [http://en.wikipedia.org/wiki/GEDCOM#Example].  Basically, seems to be [[xfn|XFN]] relationships (siblings, spouses etc.) and [[hcard|hCard]] information (could genealogy be inferred from existing XFNs regardless of a hGED format?). The only additional information we do not currently hold in a format is that of gender. GEDCOM specifies male or female for each individual. Creating something using these formats would be quite straightforward, but not sure its takeup would be good unless someone was interested in creating a hGEDCOM2GEDCOM. -- [[user:Phae|Frances Berriman]]
-
* GEDCOM is basically a set of INDIvidual records, related by FAMily nodes the family nodes contain the HUSBand, WIFE and CHILd. The INDI records are quite similar and might be replaced by hCard records, but the graph structure is a little harder to capture; families arent strict trees, so a direct mapping to XML doesnt really work. Publishing a GEDCOM database directly to the web might not be the most logical thing to do.
+
* GEDCOM is basically a set of INDIvidual records related by FAMily nodes; the family nodes contain the HUSBand, WIFE and CHILd links. The INDI records are quite similar to hCard records and might be replaced by them, but the graph structure is a little harder to capture: families aren't strict trees, so a direct mapping to XML doesn't really work. Publishing a GEDCOM database directly to the web might not be the most logical thing to do.
* Genealogical information has date-of-death, which is also missing in hCard format (although hCard does have date-of-birth).  Much of genealogical information is event based: Date of birth, date of death, dates of marriages and divorces, and many other significant events such as religious observances (Baptisms, Bar/Bat Mitzvahs) and migrations ("Moved to Canada from the Netherlands").  This all translates wonderfully to [[hCalendar]].  Additionally, a properly researched family tree will cite sources for all the data listed, and so could use [[citation|hCite]].  The biggest problem I see in using hCalendar is that genealogical data allows approximate dates, specifically "ABT 4 July 1776", "BEF 25 Dec 1903", "AFT 11 Nov 1918". It also allows ambiguous dates, "July 1867" or just "1886", or even "4 July".  And these in combination (Approximately ambiguous dates?  Ambiguously approximate dates?), e.g. "BEF Feb 2007", "AFT 1945".  The most ambiguous entries I've seen for dates are "DECEASED" when date-of-death is unknown, and "NOT MARRIED" for couples who have not had a wedding ceremony.  (Info from ''Guidelines for event dates'' in the PAF Help File).
* Genealogical information has date-of-death, which is also missing in hCard format (although hCard does have date-of-birth).  Much of genealogical information is event based: Date of birth, date of death, dates of marriages and divorces, and many other significant events such as religious observances (Baptisms, Bar/Bat Mitzvahs) and migrations ("Moved to Canada from the Netherlands").  This all translates wonderfully to [[hCalendar]].  Additionally, a properly researched family tree will cite sources for all the data listed, and so could use [[citation|hCite]].  The biggest problem I see in using hCalendar is that genealogical data allows approximate dates, specifically "ABT 4 July 1776", "BEF 25 Dec 1903", "AFT 11 Nov 1918". It also allows ambiguous dates, "July 1867" or just "1886", or even "4 July".  And these in combination (Approximately ambiguous dates?  Ambiguously approximate dates?), e.g. "BEF Feb 2007", "AFT 1945".  The most ambiguous entries I've seen for dates are "DECEASED" when date-of-death is unknown, and "NOT MARRIED" for couples who have not had a wedding ceremony.  (Info from ''Guidelines for event dates'' in the PAF Help File).
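
The approximate and ambiguous dates quoted in the last bullet could be captured in a small structure before deciding how to express them in hCalendar. The following is only a rough sketch: the names <code>FuzzyDate</code>, <code>month_number</code> and <code>parse_fuzzy_date</code> are hypothetical, and it assumes that only the ABT/BEF/AFT qualifiers and the day/month/year orderings shown in the examples above need to be handled. Free-text entries such as "DECEASED" or "NOT MARRIED" are deliberately left unparsed.

```python
from typing import NamedTuple, Optional

# Qualifiers taken from the examples in the text ("ABT", "BEF", "AFT").
QUALIFIERS = {"ABT": "about", "BEF": "before", "AFT": "after"}

MONTH_NAMES = [
    "JANUARY", "FEBRUARY", "MARCH", "APRIL", "MAY", "JUNE",
    "JULY", "AUGUST", "SEPTEMBER", "OCTOBER", "NOVEMBER", "DECEMBER",
]


class FuzzyDate(NamedTuple):
    """Hypothetical container: any part may be missing (None)."""
    qualifier: Optional[str]
    day: Optional[int]
    month: Optional[int]
    year: Optional[int]


def month_number(token: str) -> Optional[int]:
    """Return 1-12 if token is a month name or a prefix of one (3+ letters)."""
    upper = token.upper()
    if len(upper) < 3 or not upper.isalpha():
        return None
    for number, name in enumerate(MONTH_NAMES, start=1):
        if name.startswith(upper):
            return number
    return None


def parse_fuzzy_date(text: str) -> Optional[FuzzyDate]:
    """Parse strings like 'ABT 4 July 1776'; return None if not date-like."""
    tokens = text.split()
    qualifier = None
    if tokens and tokens[0].upper() in QUALIFIERS:
        qualifier = QUALIFIERS[tokens.pop(0).upper()]
    if not tokens:
        return None
    day = month = year = None
    for token in tokens:
        number = month_number(token)
        if number is not None:
            month = number
        elif token.isdigit() and len(token) <= 2:
            day = int(token)   # one or two digits: assume day of month
        elif token.isdigit():
            year = int(token)  # longer digit runs: assume a year
        else:
            return None        # e.g. "DECEASED", "NOT MARRIED"
    return FuzzyDate(qualifier, day, month, year)


print(parse_fuzzy_date("ABT 4 July 1776"))
print(parse_fuzzy_date("BEF Feb 2007"))
print(parse_fuzzy_date("DECEASED"))
```

Keeping the qualifier and the missing parts explicit, rather than rounding to a full date, avoids publishing more precision than the source data has.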
Line 44: Line 25:
::[[User:Bob Jonkman|Bob Jonkman]] 07:58, 9 Feb 2007 (PST)
::[[User:Bob Jonkman|Bob Jonkman]] 07:58, 9 Feb 2007 (PST)
-
==Wikipedia's Persondata==
+
== GEDCOM Replacement Efforts ==
 +
 
 +
There are currently two major efforts to develop a replacement for the largely out-of-date GEDCOM format (last updated in 1999).
 +
 
 +
=== GEDCOM X ===
 +
One effort is GEDCOM X [http://www.gedcomx.org/], by FamilySearch, the original creator of GEDCOM. While the format is openly published on GitHub and the development is fairly transparent, it is completely controlled by FamilySearch (a division of the Mormon church). It includes JSON and XML serialization formats, as well as a file format that bundles many files into a single zip file.
 +
 
 +
=== FHISO ===
 +
The other effort is the Family History Information Standards Organization (FHISO) [http://fhiso.org/], which is gathering member companies into a consortium to develop a replacement format. Part of the goal of FHISO is specifically to take genealogy standards out of the control of a single organization. FHISO was spawned out of a grass-roots effort to replace GEDCOM called BetterGEDCOM [http://bettergedcom.wikispaces.com/].
 +
 
 +
==Wikipedia Persondata==
Wikipedia's [http://en.wikipedia.org/wiki/Wikipedia:Persondata Persondata] aligns very closely with hCard, but has additional date and place of birth & death fields. [[User:AndyMabbett|Andy Mabbett]] 13:04, 28 Jan 2007 (PST)
Wikipedia's [http://en.wikipedia.org/wiki/Wikipedia:Persondata Persondata] aligns very closely with hCard, but has additional date and place of birth & death fields. [[User:AndyMabbett|Andy Mabbett]] 13:04, 28 Jan 2007 (PST)
 +
 +
==vCard birth death extensions==
 +
http://tools.ietf.org/html/draft-li-vcarddav-vcard-id-property-extensions
 +
 +
This vCard extension draft proposes new properties related to birth location, death date, and death location.
 +
* BIRTHPLACE
 +
* DEATHPLACE
 +
* DEATHDATE
== External Links ==
== External Links ==
Line 58: Line 57:
==See also==
==See also==
-
*[[genealogy-brainstorming]]
+
{{genealogy-related-pages}}
-
*[[hcard|hCard]]
+

Current revision


Per the microformats process, towards the development of a genealogy microformat, this page documents previous/existing genealogy related formats.


GEDCOM

GEDCOM has become pretty much the de facto standard for sharing data between genealogy systems. It is hierarchical and link-based, much like HTML, but it encodes family structure (which is a general graph) outside of this structural hierarchy.
GEDCOM was developed (...) to provide a flexible, uniform format for exchanging computerized genealogical data.[1]
The only relationship links in GEDCOM are HUSBand, WIFE and CHILd. All other relationships (brother, sister, grandparents, grandchildren, uncles, aunts, nieces, nephews, cousins) can be inferred by traversing family records. This does mean that any collection of genealogical pages needs some way for the pages to cross-reference each other. This isn't a problem for pages on a single Web site, which can use RIN (Record Identifier) or REFN (User Reference Number). However, different Web pages maintained by different genealogists may have conflicting RINs and REFNs. There is a globally-unique AFN (Ancestral File Number) issued by the Church of Jesus Christ of Latter-day Saints (LDS), but I don't know how they're issued and most genealogical sites don't use them anyway.
The GEDCOM format contains much other data specific to the LDS, but I don't know how widespread it is, nor how appropriate it would be to code it into a microformat intended to reach well beyond the LDS.
Regardless of whether an hGED microformat is developed, it would still be valuable to mark up genealogical information with microformats on Web pages for the semantic value.
Bob Jonkman 07:58, 9 Feb 2007 (PST)
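
The point above, that relationships other than HUSB, WIFE and CHIL are inferred by traversing family records, can be illustrated with a small sketch. It is not a full GEDCOM parser: it assumes the simple "level, tag, value" line layout of GEDCOM 5.5, reads only FAM records, and uses made-up sample data. The function names (parse_families, parents_of, siblings_of, grandparents_of) are hypothetical.

```python
from collections import defaultdict

# Made-up sample data: two families spanning three generations.
SAMPLE = """\
0 @I1@ INDI
1 NAME Alice /Example/
0 @I2@ INDI
1 NAME Bob /Example/
0 @I3@ INDI
1 NAME Carol /Example/
0 @I4@ INDI
1 NAME Dave /Example/
0 @I5@ INDI
1 NAME Erin /Other/
0 @I6@ INDI
1 NAME Frank /Example/
0 @F1@ FAM
1 HUSB @I2@
1 WIFE @I1@
1 CHIL @I3@
1 CHIL @I4@
0 @F2@ FAM
1 HUSB @I4@
1 WIFE @I5@
1 CHIL @I6@
"""


def parse_families(text):
    """Collect the HUSB/WIFE/CHIL links of every FAM record."""
    families = defaultdict(lambda: {"parents": [], "children": []})
    current = None  # id of the FAM record being read, if any
    for line in text.splitlines():
        parts = line.split(maxsplit=2)
        if len(parts) < 2:
            continue
        if parts[0] == "0":
            # Level-0 records look like "0 @F1@ FAM".
            is_family = len(parts) == 3 and parts[2] == "FAM"
            current = parts[1] if is_family else None
        elif parts[0] == "1" and current is not None and len(parts) == 3:
            tag, value = parts[1], parts[2]
            if tag in ("HUSB", "WIFE"):
                families[current]["parents"].append(value)
            elif tag == "CHIL":
                families[current]["children"].append(value)
    return dict(families)


def parents_of(families, person):
    return {p for fam in families.values()
            if person in fam["children"] for p in fam["parents"]}


def siblings_of(families, person):
    return {c for fam in families.values()
            if person in fam["children"]
            for c in fam["children"] if c != person}


def grandparents_of(families, person):
    return {g for p in parents_of(families, person)
            for g in parents_of(families, p)}


families = parse_families(SAMPLE)
print(sorted(siblings_of(families, "@I3@")))      # ['@I4@']
print(sorted(grandparents_of(families, "@I6@")))  # ['@I1@', '@I2@']
```

Note that the @I…@ identifiers are only meaningful within one file, which is the cross-referencing problem described above: two sites can both have an @I1@ that refers to different people.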

GEDCOM Replacement Efforts

There are currently two major efforts to develop a replacement for the largely out-of-date GEDCOM format (last updated in 1999).

GEDCOM X

One effort is GEDCOM X [3], by FamilySearch, the original creator of GEDCOM. While the format is openly published on GitHub and the development is fairly transparent, it is completely controlled by FamilySearch (a division of the Mormon church). It includes JSON and XML serialization formats, as well as a file format that bundles many files into a single zip file.

FHISO

The other effort is the Family History Information Standards Organization (FHISO) [4], which is gathering member companies into a consortium to develop a replacement format. Part of the goal of FHISO is specifically to take genealogy standards out of the control of a single organization. FHISO was spawned out of a grass-roots effort to replace GEDCOM called BetterGEDCOM [5].

Wikipedia Persondata

Wikipedia's Persondata aligns very closely with hCard, but has additional date and place of birth & death fields. Andy Mabbett 13:04, 28 Jan 2007 (PST)

vCard birth death extensions

http://tools.ietf.org/html/draft-li-vcarddav-vcard-id-property-extensions

This vCard extension draft proposes new properties related to birth location, death date, and death location.
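
The three properties the draft names are BIRTHPLACE, DEATHPLACE and DEATHDATE. As a rough illustration of how they might sit alongside the existing BDAY property, here is a minimal sketch that writes out a vCard as text. The helper names (escape_text, build_vcard) are hypothetical, the compact YYYYMMDD date form and plain-text place values are assumptions, and the draft itself should be consulted for the exact value syntax.

```python
def escape_text(value: str) -> str:
    """Escape backslash, newline, comma and semicolon in a vCard text value."""
    return (value.replace("\\", "\\\\")
                 .replace("\n", "\\n")
                 .replace(",", "\\,")
                 .replace(";", "\\;"))


def build_vcard(name, birth_date=None, birth_place=None,
                death_date=None, death_place=None) -> str:
    """Build a vCard string; dates are assumed to be given as YYYYMMDD."""
    lines = ["BEGIN:VCARD", "VERSION:4.0", f"FN:{escape_text(name)}"]
    if birth_date:
        lines.append(f"BDAY:{birth_date}")
    if birth_place:
        lines.append(f"BIRTHPLACE:{escape_text(birth_place)}")
    if death_date:
        lines.append(f"DEATHDATE:{death_date}")
    if death_place:
        lines.append(f"DEATHPLACE:{escape_text(death_place)}")
    lines.append("END:VCARD")
    # vCard content lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


card = build_vcard(
    "Abraham Lincoln",
    birth_date="18090212",
    birth_place="Hodgenville, Kentucky",
    death_date="18650415",
    death_place="Washington, D.C.",
)
print(card)
```

Properties that are unknown are simply omitted, which matches the common genealogical case where a birth date is recorded but the place of death is not.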

External Links

See also

Genealogy Formats was last modified: Thursday, July 25th, 2013
