Difference between revisions of "species-strawman-01"

From Microformats Wiki

Latest revision as of 15:04, 21 July 2010

Species straw-man proposal

This page is a sub-page of species-brainstorming

I (Andy Mabbett) am tending towards this model, nested according to components of the microformat, not taxonomically:

[Note: in taxonomy, a level such as "subphylum", "class" or "order" is known as a "rank".]


[Note: It is intended that all these (X)HTML classes be optionally available to publishers, but they need use only those which apply to their particular needs. Compare, for instance, with all the classes and types available in hCard.]

  • species (scientific name; aka botanical name) (better: taxon; or biota)
    • domain (alternatively: "superregnum")
    • kingdom (alt: "regnum")
    • subkingdom (alt: "subregnum")
    • superphylum
    • phylum
    • subphylum
    • taxoclass (alt: "taxo-class", "taxonclass", "taxon-class", "classis")
    • subclass (alt: "subclassis")
    • infraclass (alt: "infraclassis")
    • superorder (alt: "superordo")
    • order (alt: "ordo")
    • suborder (alt: "subordo")
    • infraorder (alt: "infraordo")
    • parvorder
    • superfamily (alt: "superfamilia")
    • family (alt: "familia")
    • subfamily (alt: "subfamilia")
    • rank (alt: "taxorank", "taxon-rank", et al) - "unranked". See [1]; might also be used where the level of a rank is disputed, or the author simply has no ability or wish to declare the rank more explicitly.
    • binominal ("binominal name"; alt: "binomial")
      • genus
      • specific (="specific epithet")
    • subsp ("subspecies")
    • variety
    • subvar ("subvariety")
    • form
    • subform
    • cultivar
    • cultgp ("cultivar group")
    • cross (e.g. "F1")
    • strain
    • ? morph (or phase) (e.g. "Gyrfalcons, for example, have a grey morph and a white morph" [2]; "the Lesser Snow Goose (C. c. caerulescens), commonly occurs in two plumage variants. White-phase birds are white except for black wing tips, but blue-phase geese have bluish-grey plumage replacing most of the white except on the head, neck and tail tip." [3])
    • trade ("trade name")
    • breed (e.g. Bull Terrier)
    • sense (botanical - see examples)
    • authority
      • year (...of authority)
    • cname ("common name" - should this be "common" or "vernacular"?)
    • guid
    • vgroup ("vernacular group"; alt: "vernacular-group") - there is possibly a better term for this. Often, a genus or family doesn't encapsulate a particular group of species in a practical or useful fashion. For example, it is difficult to separate fungi species and lichen species as they are taxonomically intermingled. Thus, within taxonomic databases, a vernacular group of "fungi" or "lichen" is often applied to species falling into either of these groups. A vernacular group could be considered similar to a common name, but for groups of species. See the NBN Gateway for an example of vernacular groups in use; these group names are also used in the Recorder biological recording software.
    • ? gender (useful for species exhibiting sexual dimorphism - "find me a picture of a male Pintail"; "I want to buy a female Holly bush" - a binary value, male or female; or including neuter, hermaphrodite, unspecified and/or mixed?) - see Future development
    • ? age bracket (adult/ juvenile/ seed/ egg/ nymph/ nestling/ pup/ cub/ instar1/ instar2 etc. - needs more work) - see Future development
    • ? count (a number, or representation of some other value - none, unspecified, "present", etc.) - see Future development
    • [name to be suggested ("type", "role"?)] an indicator of type, e.g. for bees, "queen" or "worker" [Q: Is there a proper name, in the scientific community, for these distinctions?]

where "genus" and "specific" are optional, and it is possible to infer from simply:

<abbr class="binominal" title="Anas platyrhynchos">Mallard</abbr>

or:

<span class="binominal">Anas platyrhynchos</span>

that the genus is Anas and the specific epithet is platyrhynchos (thus, "binominal" is to "sci" as "adr" is to "hCard 1.0").
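
The inference described above could be sketched roughly as follows. This is only an illustration in Python, not part of the proposal: the names BinominalParser, split_binominal and infer are hypothetical, and the rule "prefer the title of an abbr, otherwise use the element's text" is an assumption drawn from the two examples above.

```python
from html.parser import HTMLParser


def split_binominal(name):
    """Split a two-word scientific name into genus and specific epithet."""
    parts = name.split()
    if len(parts) != 2:
        return None  # not a plain two-word name; nothing can be inferred
    return {"genus": parts[0], "specific": parts[1]}


class BinominalParser(HTMLParser):
    """Collect names from elements whose class list contains "binominal"."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._depth = 0     # > 0 while inside a "binominal" element
        self._text = []     # text seen inside the current element
        self._title = None  # title attribute, if the element is an abbr

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if self._depth:
            self._depth += 1  # nested element (assumes it has an end tag)
        elif "binominal" in (attrs.get("class") or "").split():
            self._depth = 1
            self._text = []
            self._title = attrs.get("title") if tag == "abbr" else None

    def handle_endtag(self, tag):
        if self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Assumed rule: an abbr title wins over the visible text.
                name = self._title or "".join(self._text)
                self.names.append(split_binominal(name))

    def handle_data(self, data):
        if self._depth:
            self._text.append(data)


def infer(markup):
    """Return the genus/specific pairs inferred from a fragment of markup."""
    parser = BinominalParser()
    parser.feed(markup)
    parser.close()
    return parser.names


print(infer('<abbr class="binominal" title="Anas platyrhynchos">Mallard</abbr>'))
print(infer('<span class="binominal">Anas platyrhynchos</span>'))
```

Explicit "genus" and "specific" child elements, where present, would presumably take precedence over this whitespace split; the sketch does not handle that case.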

A species (Citrine Wagtail, a bird):

    <span class="species">
        <span class="binominal">Motacilla citreola</span>
    </span>

Sub-species (animal):

    <span class="species">
        <span class="binominal">Larus glaucoides</span>
        <span class="subsp">kumlieni</span>
    </span>

Variety (plant):

  <span class="species">
    <span class="binominal">Pisum sativum</span>
    var. <span class="variety">macrocarpon</span>
  </span>

Species (animal, common name displayed):

    <span class="species">
        <abbr class="binominal" title="Larus thayeri">
            <span class="common">Thayer's Gull</span>
        </abbr>
    </span>

Species (animal, scientific name displayed):

    <span class="species">
        <abbr class="common" title="Thayer's Gull">
            <span class="binominal">Larus thayeri</span>
        </abbr>
    </span>

Fungus, kingdom included:

    <span class="species">
        <abbr class="kingdom" title="Fungi">
            <span class="binominal">Amanita muscaria</span>
        </abbr>
    </span>

Same name for different Genera:

    <p class="species">
        An unidentified
        <abbr class="taxoclass" title="Aves">
            <abbr class="genus" title="Oenanthe">
                <span class="common">Wheatear</span>
            </abbr>
        </abbr>
    </p>

and :

    <p class="species">
        An unidentified
        <abbr class="taxoclass" title="Magnoliopsida">
            <abbr class="genus" title="Oenanthe">
                <span class="common">Water Dropwort</span>
            </abbr>
        </abbr>
    </p>

Species (animal, with authority and year):

    <span class="species">
        <span class="binominal">Pica pica</span>
        <span class="authority">Linnaeus</span>,
        (<span class="year">1758</span>)
    </span>

Re-classified species (animal):

    The species was classified as
    <span class="species">
        <abbr class="binominal" title="Bartramia longicauda">Tringa longicauda</abbr>
    </span>
    by Johann Bechstein in 1812.

A more extreme example, where there is a need to describe the full taxonomic hierarchy:

  <span class="species">
    <span class="domain">Eukarya</span>
    <span class="kingdom">Animalia</span>
    <span class="subkingdom">Eumetazoa</span>
    <span class="superphylum">Deuterostomia</span>
    <span class="phylum">Chordata</span>
    <span class="subphylum">Vertebrata</span>
    <span class="taxoclass">Aves</span>
    <span class="subclass">Neognathae</span>
    <span class="order">Passeriformes</span>
    <span class="suborder">Passeri</span>
    <span class="parvorder">Passerida</span>
    <span class="superfamily">Passeroidea</span>
    <span class="family">Motacillidae</span>
    <span class="binominal">
      <span class="genus">Motacilla</span>
      <span class="specific">alba</span>
    </span>
    <span class="subsp">yarrellii</span>
    <span class="cname">Pied Wagtail</span>
    <span class="authority">Linnaeus</span>
    <span class="year">1758</span>
  </span>

Expressing a species with a GUID

In the following example case an NBN GUID is provided. This GUID would be usable on the NBN Gateway, The NHM Species Dictionary, in Recorder 2002 and Recorder 6, and in the forthcoming OpenRecorder online recording toolkit. As there are different GUIDs for different databases, the type of GUID can be indicated with a code followed by a hyphen followed by the GUID (e.g. nbn-NBNSYS0000005133).

    <span class="sci nbn-NBNSYS0000005133">
        <span class="binominal">Lutra lutra</span>
    </span>
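
A consumer might recover the typed GUID from such a class attribute along these lines. This is a minimal sketch in Python under stated assumptions: the function name guid_from_class and the GUID_TYPES registry are hypothetical, and "nbn" is the only type code taken from the text above.

```python
# Hypothetical registry of GUID type codes; only "nbn" appears in the proposal.
GUID_TYPES = {"nbn"}


def guid_from_class(class_attr):
    """Return (type_code, guid) for the first class token of the form code-GUID."""
    for token in class_attr.split():
        code, hyphen, guid = token.partition("-")
        if hyphen and guid and code in GUID_TYPES:
            return code, guid
    return None  # no recognised GUID token among the classes


print(guid_from_class("sci nbn-NBNSYS0000005133"))
```

Checking the code against a known list, rather than accepting any hyphenated token, avoids mistaking ordinary class names such as "taxon-name" for GUIDs.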

Alternatively, the GUID could be expressed as an element in its own right, with the GUID type being expressed as a secondary class name:

    <span class="species">
        <span class="binominal">Lutra lutra</span>
        <span class="uid nbn">NBNSYS0000005133</span>
    </span>

As a further alternative, the abbr design pattern could potentially be used, although this is semantically questionable:

    <span class="species">
        <abbr class="binominal" title="NBNSYS0000005133">Lutra lutra</abbr>
    </span>

Yet another alternative, using a uBio LSID as the GUID:

    <span class="species urn:lsid:ubio.org:namebank:8341384">
      <span class="cname">Green Sandpiper</span>
      <span class="binominal">Tringa ochropus</span>
    </span>

uBio has a publicly available SOAP web services interface which makes mining for taxonomic intelligence relatively easy.

Questions


  • Is "sci" the best attribute name for the top-level?
    • No - Scott Reynen
      • What do you think would be better? - Andy Mabbett
        • Assuming "sci" is short for "scientific name", I propose "scientific-name".
          • It is. That's 12 extra characters! - Andy Mabbett
    • Taxon is a far better solution [4]. It's short, meaningful and in keeping with the other class types. - Andy Mabbett
      • I think "taxonname" or "taxon-name" would be a better value for the class attribute. It is more descriptive of the data you're trying to specify the format of. Taxon refers more to the classification grouping, I thought. The class attribute is used frequently for the application of CSS styling, so the top level class at least needs to be fairly distinctive, I would have thought, to avoid clashes with other class attribute values in the page and CSS files. - Tony Prichard
        • The OED defines taxon as "A taxonomic group". See also the URL cited, [5]. - Andy Mabbett
          • I agree that taxon would be the most suitable name. It could be considered as a shortening of TaxonConcept (or TaxonName), which is the term used by the TCS - Charles Roper
    • or Biota - Andy Mabbett
  • Species is used in the above, for the sake of having one name to use, but "biota" or "taxon" are likely to be used in the final version. AndyMabbett 09:15, 22 Oct 2006 (PDT)
  • Should "bin", "var", "cult", etc., be written in full? (I think not, to save bloating file sizes)
    • Yes - Scott Reynen
      • Conceded, and applied to the above. What about "subsp", etc? AndyMabbett 09:15, 22 Oct 2006 (PDT)
  • Should other attribute names be abbreviated for brevity?
    • No, brevity is not one of the naming principles. "bin", "var", and "cult" all leave ambiguous meaning, which is a problem. We should "Use class names based on names from the original schema," e.g. full words or phrases where they aren't especially long. - Scott Reynen
      • Fair enough, though I worry about some of my pages, with tens or hundreds of species listed! Also, note that "var" "sub" and suchlike are the proper abbreviations to use, in botanical nomenclature (see the posted examples). - Andy Mabbett
      • I think a balance will need to be achieved between brevity in the interests of avoiding bloated html in a page with many species names and giving a meaningful name - Tony Prichard
        • Would bloating really be an issue? Many, if not most, servers (including this one) now gzip or deflate content, so transfer times aren't so much of an issue. The front page of the microformats site states "Designed for humans first and machines second[...]", so unabbreviated terms would be more consistent with this aim. - Charles Roper
          • 341 species, 58Kb. 'Nuff said? AndyMabbett 11:53, 26 Sep 2006 (PDT)
            • Your bird list page can be compressed by 79%, i.e. it would go down from 58KB to 12KB by enabling output compression on your server. It would also make the page load faster and save you bandwidth. No doubt compression technologies will improve over time, as will connection speeds and server speeds, so the technical solution to reducing page size would seem to me to be preferable over the "manual compression" method, i.e. using abbreviated, less clear, less readable class names. While it is easy to improve the compression technology (or switch it on, even), it's much harder to change an established microformat standard. - Charles Roper
          • Clearly page size is an issue. One way of reducing page size would be to have two versions of the page. One would be microformat marked up and the other not. It might be that the non-marked up page would be used on larger lists and as you got more refined, it would switch to marked-up pages, with the option of going with only marked-up pages or never being left to the web-user in some kind of preferences area. Styling the marked-up version into simpler HTML should be quite easy. - MichaelLee
  • Is "class" a potentially confusing attribute name, and what should replace it ("taxoclass", perhaps? or "classis"?)
    • Yes, I would avoid class as it is a frequent keyword in software languages - Tony Prichard
      • "bin" and "var" are also extremely common terms used in programming languages - Charles Roper
      • Conceded, and "taxoclass" applied to the above. "classis" would be an alternative. AndyMabbett 09:15, 22 Oct 2006 (PDT)
  • What other attribute names are needed, if any (we could do with help from a taxonomist!)
  • How to deal with: "Podiceps sp." (a grebe of unknown species)
    • How about the following, where we can infer an unknown species by the absence of that attribute?:
<span class="bin"><span class="genus">Podiceps</span></span>
      • This fails to account for the fact that this is thought to be only one species of a genus. Perhaps it's better to mark the species as unknown or "sp." or even "sp. #2": MichaelLee
<span class="bin"><span class="genus">Podiceps</span><span class="species">unknown</span></span>
    • There are also species aggregates and groups to be considered, e.g. Grey/Dark Dagger sp., where it is one of two species but where the genus Acronicta cannot be used as there are more than the two species in the genus - Tony Prichard
      • Any suggestions? Or other examples? - AndyMabbett
        • These kinds of aggregates are often used by birdwatchers (in Finland). How about separating the names with a slash (or some other sign)?: - MikkoBiomi
<span class="bin">Phylloscopus trochilus/Phylloscopus collybita</span>
<span class="bin"><span class="genus">Anser</span>/<span class="genus">Branta</span></span>
          • It would be clearer to come up with operators to allow a customized grouping of one's choice. Someone may want to say "I saw this or that" and one might want to say "I saw this AND that." You can use operators like / and +, but these aren't easily computer parsed. These could also be nested to come up with all sorts of groups. MichaelLee
<span class="group.or"><span class="bin">Phylloscopus trochilus</span><span>Phylloscopus collybita</span></span>
<span class="group.and"><span class="bin">Phylloscopus trochilus</span><span>Phylloscopus collybita</span></span>
  • Should we allow divisions of "binominal" with no parent "species", such as:
<span class="binominal">Larus glaucoides <span class="sub">kumlieni</span></span>
  • Is the "fungus" example OK, given that Amanita muscaria is not an abbreviation of "Fungi"?
    • I do not like the use of the abbr tag at all in the examples given. The abbr tag is for abbreviations, with the suggestion that the title is used for the full name. The implication in the Mallard example is that Mallard is an abbreviation for the scientific name; it is not, it is a different type of name - Tony Prichard
  • Do the "authority" and "year" pair need a joint wrapper?
  • Is "bin" (short for binominal) the most appropriate term for a taxon name? When subspecies, var, subvar, etc. are nested, then surely it becomes trinomial? Would simply name or TaxonName not be more flexible? - Charles Roper
  • Some instances of "authority" can be complex, such as:
Salmonella enterica (ex Kauffmann & Edwards 1952) Le Minor & Popoff 1987

in which the parentheses are significant (see Wikipedia Taxobox usage and Wikipedia's article on zoological authority citations for further examples). Should we therefore allow "authority" to be a text string, and omit a separate year field? Andy Mabbett 03:17, 27 Jun 2007 (PDT)

I definitely prefer "superregnum" over "superkingdom". Using the Latin forms of the taxon names makes the format less language-dependent and more international. It probably also reduces ambiguity somewhat (a biological "kingdom" is also called an "empire" in English).

  • Is there still anything happening with this? The proposal seems to have been around for a couple of years, could it be finalised? There will always be edge cases, but that shouldn't prevent us from starting to use something in the simpler cases. I'd favour "taxon" over "species" or "biota" as the top level tag, and "commonname" over "cname". ThomasK 15:04, 21 July 2010 (UTC)
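
On the page-size question raised in the discussion above, the effect of compression on longer class names can be checked with a rough sketch like the one below. It is written in Python, uses synthetic markup rather than any real page, and the function name species_list and the row template are hypothetical; only the figure of 341 species and the class names come from the discussion.

```python
import gzip


def species_list(top_class, name_class, n=341):
    """Build n list items of synthetic markup using the given class names."""
    row = '<li class="{}"><span class="{}">Genus species{}</span></li>\n'
    rows = (row.format(top_class, name_class, i) for i in range(n))
    return "".join(rows).encode("utf-8")


# Same synthetic list, once with abbreviated and once with full class names.
short_raw = species_list("sci", "bin")
long_raw = species_list("species", "binominal")
short_gz = gzip.compress(short_raw)
long_gz = gzip.compress(long_raw)

print("short names:", len(short_raw), "bytes raw,", len(short_gz), "gzipped")
print("long names: ", len(long_raw), "bytes raw,", len(long_gz), "gzipped")
```

Because the class names repeat on every row, the extra characters cost far less after gzip than they do in the raw markup. Real pages carry more varied content, so actual savings would differ.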

To add

  • Animal hybrids
  • GUID (Globally Unique Identifier). When referring to a taxon name, there is also often a need to provide a GUID which relates to a taxonomic concept database (such as the NHM Species Dictionary). By providing a GUID, ambiguity is removed. - Charles Roper

Future development

Instead of including gender, age-bracket and count, we could allow for a future microformat, called, perhaps, "sighting", which might have the components:

  • sighting
    • species (a "species" microformat)
      • set (one or more)
        • count
        • age-bracket
        • gender
    • location (hCard or geo)
    • date-time
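
The nesting above could be modelled, for the sake of discussion, as the following data structure. This is a sketch in Python only: the class names Sighting and SightingSet, the field types, and the total_count helper are all assumptions, since no "sighting" format has been defined.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SightingSet:
    """One group within a sighting, e.g. "2 adult males"."""
    count: Optional[int] = None        # None stands for "unspecified"
    age_bracket: Optional[str] = None  # e.g. "adult", "juvenile"
    gender: Optional[str] = None       # e.g. "male", "female"


@dataclass
class Sighting:
    species: str                       # scientific name from a "species" microformat
    sets: List[SightingSet] = field(default_factory=list)  # one or more
    location: Optional[str] = None     # would come from an hCard or geo
    date_time: Optional[str] = None    # assumed here to be an ISO 8601 string

    def total_count(self):
        """Sum the counts of all sets, ignoring unspecified ones."""
        return sum(s.count for s in self.sets if s.count is not None)


sighting = Sighting(
    species="Anas platyrhynchos",
    sets=[
        SightingSet(count=2, age_bracket="adult", gender="male"),
        SightingSet(count=3, age_bracket="juvenile"),
    ],
)
print(sighting.total_count())
```

Non-numeric counts such as "present" are not covered by this sketch and would need a different representation.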

See West Midland Bird Club's Latest news from Ladywalk and In and around South Staffordshire 2006 (blog) for simple examples.


This proposal has been implemented approximately in Cognition and Operator (user script). Both are roughly compatible, but more work is required fleshing out the details of this spec so that full compatibility can be achieved, and more implementors can jump on board.

See also