species-brainstorming

(Difference between revisions)

(Taxonomic Databases Working Group)
(sub-dividing)
Line 22:
===Straw man proposal===
-
I'm tending towards this model, nested according to components of the microformat, not taxonomically:
+
See : [[species-strawman-01]]
-
 
+
-
[Note: in taxonomy, levels such as "subphylum", "class" or "order" are known as a "rank"].
+
-
 
+
-
[Note: It is intended that all these (X)HTML classes be '''''optionally''''' available to publishers, but they need use '''''only''''' those which apply to their particular needs. Compare, for instance, with all the classes and types available in [[hcard|hCard]].]
+
-
 
+
-
*species (scientific name; aka botanical name) (better: '''taxon'''; or '''biota''')
+
-
**domain (alternatively: "superregnum")
+
-
**kingdom (alt: "regnum")
+
-
**subkingdom (alt: "subregnum")
+
-
**superphylum
+
-
**phylum
+
-
**subphylum
+
-
**taxoclass (alt: "taxo-class ", "taxonclass", "taxon-class", "classis")
+
-
**subclass (alt: "subclassis")
+
-
**infraclass (alt: "infraclassis")
+
-
**superorder (alt: "superordo")
+
-
**order (alt: "ordo")
+
-
**suborder (alt: "subordo")
+
-
**infraorder (alt: "infraordo")
+
-
**parvorder
+
-
**superfamily (alt: "superfamilia")
+
-
**family (alt: "familia")
+
-
**subfamily (alt: "subfamilia")
+
-
**rank (alt: "taxorank", "taxon-rank", et al) - "unranked". See [http://names.ubio.org/browser/classifications.php?conceptID=2463046]; might also be used where the level of a rank is disputed, or the author simply has no ability or wish to declare the rank more explicitly.
+
-
**binomial ("binomial name" alt: "binominal")
+
-
***genus
+
-
***specific (="''specific epithet''")
+
-
**subsp ("subspecies")
+
-
**variety
+
-
**subvar ("subvariety")
+
-
**form
+
-
**subform
+
-
**cultivar
+
-
**cultgp ("cultivar group")
+
-
**cross (e.g. "F1")
+
-
**strain
+
-
**? morph (or phase) (e.g. "Gyrfalcons, for example, have a grey morph and a white morph"  [http://www.peregrine-foundation.ca/info/dictionary.html]; "the Lesser Snow Goose (C. c. caerulescens), commonly occurs in two plumage variants. White-phase birds are white except for black wing tips, but blue-phase geese have bluish-grey plumage replacing most of the white except on the head, neck and tail tip." [http://en.wikipedia.org/wiki/Snow_goose])
+
-
**trade ("trade name")
+
-
**breed (e.g. Bull Terrier)
+
-
**sense (botanical - see [[species-examples#Sense (plant)|examples]])
+
-
**authority
+
-
***year (...of authority)
+
-
**cname ("common name" - should this be "common" or "vernacular"?)
+
-
**guid
+
-
**vgroup ("vernacular group"; alt: "vernacular-group") - there is possibly a better term for this. Often, a genus or family doesn't encapsulate a particular group of species in a practical or useful fashion. For example, it is difficult to separate fungi species and lichen species as they are taxonomically intermingled. Thus, within taxonomic databases, a vernacular group of "fungi" and "lichen" is often applied to species falling into either of these groups. A vernacular group could be considered similar to a common name, but for groups of species. See the [http://www.searchnbn.net/directory/browseTGLevel1.jsp NBN Gateway] for an example of vernacular groups in use; these group names are also used in the [http://www.recordersoftware.org/ Recorder] biological recording software.
+
-
**? gender (useful for species exhibiting sexual dimorphism - "find me a picture of a male Pintail"; "I want to buy a female Holly bush" - a binary value, '''m'''ale or '''f'''emale; or including '''n'''euter, '''h'''ermaphrodite, '''u'''nspecified and/or '''m'''ixed?) - see [[#Future development|Future development]]
+
-
**? age bracket (adult/ juvenile/ seed/ egg/ nymph/ nestling/ pup/ cub/ instar1/ instar2 etc. - '''needs more work''') - see [[#Future development|Future development]]
+
-
**? count (a number, or representation of some other value - none, unspecified, "present", etc.) - see [[#Future development|Future development]]
+
-
** [name to be suggested ("type", "role"?)] an indicator of type, e.g. for bees, "queen" or "worker" [Q: Is there a proper name, in the scientific community, for these distinctions?]
+
-
 
+
-
 
+
-
where the sub-components are optional, and it is possible to infer from simply:
+
-
:<pre><nowiki><abbr class="binominal" title="Anas platyrhynchos">Mallard</abbr></nowiki></pre>
+
-
or
+
-
:<pre><nowiki><span class="binominal">Anas platyrhynchos</span></nowiki></pre>
+
-
that the genus is ''Anas'' and the species is ''platyrhynchos'' (and, thus, "binominal" is to "sci"; as "[[adr]]" is to "[[hCard]]")
+
-
 
+
-
A species (Citrine Wagtail, a bird):
+
-
 
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
<span class="binominal">Motacilla citreola</span>
+
-
    </span>
+
-
</nowiki></pre>
+
-
 
+
-
Sub-species (animal):
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
        <span class="binominal">Larus glaucoides</span>
+
-
        <span class="subsp">kumlieni</span>
+
-
    </span>
+
-
</nowiki></pre>
+
-
 
+
-
Variety (plant):
+
-
<pre><nowiki>
+
-
  <span class="species">
+
-
    <span class="binominal">Pisum sativum</span>
+
-
    var. <span class="variety">macrocarpon</span>
+
-
  </span>
+
-
</nowiki></pre>
+
-
 
+
-
Species (animal, common name displayed):
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
        <abbr class="binominal" title="Larus thayeri">
+
-
            <span class="common">Thayer's Gull</span>
+
-
        </abbr>
+
-
    </span>
+
-
</nowiki></pre>
+
-
 
+
-
Species (animal, scientific name displayed):
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
        <abbr class="common" title="Thayer's Gull">
+
-
            <span class="binominal">Larus thayeri</span>
+
-
        </abbr>
+
-
    </span>
+
-
</nowiki></pre>
+
-
 
+
-
Fungus, kingdom included:
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
        <abbr class="kingdom" title="Fungi">
+
-
            <span class="binominal">Amanita muscaria</span>
+
-
        </abbr>
+
-
    </span>
+
-
</nowiki></pre>
+
-
 
+
-
Same name for different Genera:
+
-
 
+
-
<pre><nowiki>
+
-
    <p class="species">
+
-
        An unidentified
+
-
        <abbr class="taxoclass" title="Aves">
+
-
        <abbr class="genus" title="Oenanthe">
+
-
        <span class="common">
+
-
            Wheatear
+
-
        </span>
+
-
        </abbr>
+
-
        </abbr>
+
-
    </p>
+
-
</nowiki></pre>
+
-
 
+
-
and :
+
-
 
+
-
<pre><nowiki>
+
-
    <p class="species">
+
-
        An unidentified
+
-
        <abbr class="taxoclass" title="Magnoliopsida">
+
-
        <abbr class="genus" title="Oenanthe">
+
-
        <span class="common">
+
-
            Water Dropwort
+
-
        </span>
+
-
        </abbr>
+
-
        </abbr>
+
-
        sp.
+
-
    </p></nowiki></pre>
+
-
 
+
-
Species (animal, with authority and year):
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
        <span class="binominal">Pica pica</span>
+
-
        <span class="authority">Linnaeus</span>,
+
-
        (<span class="year">1758</span>)
+
-
    </span>
+
-
</nowiki></pre>
+
-
 
+
-
Re-classified species (animal):
+
-
<pre><nowiki>
+
-
    The species was classified as
+
-
    <span class="species">
+
-
        <abbr class="binominal" title="Bartramia longicauda">Tringa longicauda</abbr>
+
-
        by Johann Bechstein in 1812.
+
-
    </span>
+
-
</nowiki></pre>
+
-
 
+
-
A more extreme example, where there is a need to describe the full taxonomic hierarchy:
+
-
 
+
-
<pre><nowiki>
+
-
  <span class="species">
+
-
    <span class="domain">Eukarya</span>
+
-
    <span class="kingdom">Animalia</span>
+
-
    <span class="subkingdom">Eumetazoa</span>
+
-
    <span class="superphylum">Deuterostomia</span>
+
-
    <span class="phylum">Chordata</span>
+
-
    <span class="subphylum">Vertebrata</span>
+
-
    <span class="taxoclass">Aves</span>
+
-
    <span class="subclass">Neognathae</span>
+
-
    <span class="order">Passeriformes</span>
+
-
    <span class="suborder">Passeri</span>
+
-
    <span class="parvorder">Passerida</span>
+
-
    <span class="superfamily">Passeroidea</span>
+
-
    <span class="family">Motacillidae</span>
+
-
    <span class="binominal">
+
-
<span class="genus">Motacilla</span>
+
-
<span class="specific">alba</span>
+
-
<span class="subspecies">yarrellii</span>
+
-
    </span>
+
-
    <span class="cname">Pied Wagtail</span>
+
-
    <span class="authority">Linnaeus</span>
+
-
    <span class="year">1758</span>
+
-
  </span>
+
-
</nowiki></pre>
+
-
 
+
-
=====Expressing a species with a GUID=====
+
-
Work is currently underway, through [http://www.nhm.ac.uk/hosted_sites/tdwg/ TDWG] to develop a [http://wiki.gbif.org/guidwiki/wikka.php?wakka=HomePage truly global GUID system] based on [http://wiki.gbif.org/guidwiki/wikka.php?wakka=LSID LSID]s. [http://xml.coverpages.org/lsid.html More on LSIDs].
+
-
 
+
-
In the following example case an NBN GUID is provided. This GUID would be usable on the [http://www.searchnbn.net/speciesInfo/taxonomy.jsp?searchTerm=lutra%20lutra&spKey=NBNSYS0000005133 NBN Gateway], [http://nbn.nhm.ac.uk/nhm/bin/nbntaxa.dll/taxon_details?taxon_key=NBNSYS0000005133 The NHM Species Dictionary], in Recorder 2002 and Recorder 6, and in the forthcoming OpenRecorder online recording toolkit. As there are different GUIDs for different databases, the type of GUID can be indicated with a code followed by a hyphen followed by the GUID (e.g. nbn-NBNSYS0000005133).
+
-
<pre><nowiki>
+
-
    <span class="sci nbn-NBNSYS0000005133">
+
-
        <span class="binominal">Lutra lutra</span>
+
-
    </span>
+
-
</nowiki>
+
-
</pre>
+
-
Alternatively, the GUID could be expressed as an element in its own right, with the GUID type being expressed as a secondary class name:
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
        <span class="binominal">Lutra lutra</span>
+
-
        <span class="uid nbn">NBNSYS0000005133</span>
+
-
    </span>
+
-
</nowiki>
+
-
</pre>
+
-
As a further alternative, the [[abbr-design-pattern]] could potentially be used, although this is semantically questionable:
+
-
<pre><nowiki>
+
-
    <span class="species">
+
-
        <abbr class="binominal" title="NBNSYS0000005133">Lutra lutra</abbr>
+
-
    </span>
+
-
</nowiki>
+
-
</pre>
+
-
Yet another alternative, using a [http://www.ubio.org/index.php?pagename=home uBio] LSID as the GUID:
+
-
<pre><nowiki>
+
-
    <span class="species urn:lsid:ubio.org:namebank:8341384">
+
-
      <span class="cname">Green Sandpiper</span>
+
-
      <span class="binominal">Tringa ochropus</span>
+
-
    </span>
+
-
</nowiki>
+
-
</pre>
+
-
uBio has a publicly available [http://www.ubio.org/index.php?pagename=soap_tools SOAP web services interface] which makes mining for taxonomic intelligence relatively easy.
+
-
 
+
-
====Questions====
+
-
 
+
-
* Is "sci" the best attribute name for the top-level?
+
-
** No - Scott Reynen
+
-
*** What do you think would be better? - Andy Mabbett
+
-
**** Assuming "sci" is short for "scientific name", I propose "scientific-name".
+
-
***** It is. That's 12 extra characters! - Andy Mabbett
+
-
** '''Taxon''' is a far better solution [http://en.wiktionary.org/wiki/taxon]. It's short, meaningful and in keeping with the other class types. - Andy Mabbett
+
-
*** I think "taxonname" or "taxon-name" would be a better value for the class attribute. It is more descriptive of the data whose format you're trying to specify. Taxon refers more to the classification grouping, I thought. The class attribute is used frequently for the application of CSS styling, so the top-level class at least needs to be fairly distinctive, I would have thought, to avoid clashes with other class attribute values in the page and CSS files. - Tony Prichard
+
-
**** The OED defines ''taxon'' as "A taxonomic group". See also the URL cited, [http://en.wiktionary.org/wiki/taxon]. - Andy Mabbett
+
-
***** I agree that '''taxon''' would be the most suitable name. It could be considered as a shortening of '''TaxonConcept''' (or '''TaxonName'''), which is the term used by the  [http://tdwg.napier.ac.uk/index.php?pagename=VotingDraftIntroduction TCS] - Charles Roper
+
-
** or '''Biota''' - Andy Mabbett
+
-
*'''Species''' is used in the above, for the sake of having one name to use, but "biota" or "taxon" are likely to be used in the final version. [[User:AndyMabbett|AndyMabbett]] 09:15, 22 Oct 2006 (PDT)
+
-
 
+
-
* Should "bin", "var", "cult", etc., be written in full? (I think not, to save bloating file sizes)
+
-
** Yes - Scott Reynen
+
-
***'''Conceded''', and applied to the above. What about "subsp", etc? [[User:AndyMabbett|AndyMabbett]] 09:15, 22 Oct 2006 (PDT)
+
-
 
+
-
* Should other attribute names be abbreviated for brevity?
+
-
** No, brevity is not one of the [[naming-principles|naming principles]]. "bin", "var", and "cult" all leave ambiguous meaning, which is a problem. We should "Use class names based on names from the original schema," e.g. full words or phrases where they aren't especially long. - Scott Reynen
+
-
*** Fair enough, though I worry about some of my pages, with tens or hundreds of species listed! Also, note that "var" "sub" and suchlike are the ''proper'' abbreviations to use, in botanical nomenclature (see the posted examples). - Andy Mabbett
+
-
*** I think a balance will need to be achieved between brevity in the interests of avoiding bloated html in a page with many species names and giving a meaningful name - Tony Prichard
+
-
**** Would bloating really be an issue? Many, if not most, servers (including this one) now gzip or deflate content, and thus transfer times aren't so much of an issue. The front page of the microformats site states "Designed for humans first and machines second[...]", so unabbreviated terms would be more consistent with this aim. - Charles Roper
+
-
*****[http://www.westmidlandbirdclub.com/records/lists.htm 341 species, 58Kb]. 'Nuff said? [[User:AndyMabbett|AndyMabbett]] 11:53, 26 Sep 2006 (PDT)
+
-
******Your bird list page can be [http://snipurl.com/zfmj compressed by 79%], i.e. it would go down from 58KB to 12KB by enabling output compression on your server. It would also make the page load faster and save you bandwidth. No doubt compression technologies will improve over time, as will connection speeds and server speeds, so the technical solution to reducing page size would seem to me to be preferable over the "manual compression" method, i.e. using abbreviated, less clear, less readable class names. While it is easy to improve the compression technology (or switch it on, even), it's much harder to change an established microformat standard. - [[User:CharlesRoper| Charles Roper]]
+
-
*****Clearly page size is an issue.  One way of reducing page size would be to have two versions of the page.  One would be microformat marked up and the other not.  It might be that the non-marked up page would be used on larger lists and as you got more refined, it would switch to marked-up pages, with the option of going with only marked-up pages or never being left to the web-user in some kind of preferences area.  Styling the marked-up version into simpler HTML should be quite easy. - [[User:MichaelLee|MichaelLee]]
+
-
 
+
-
* Is "class" a potentially confusing attribute name, and what should replace it ("taxoclass", perhaps? or "classis"?)
+
-
** Yes, I would avoid class as it is a frequent keyword in software languages - Tony Prichard
+
-
*** "bin" and "var" are also extremely common terms used in programming languages - Charles Roper
+
-
***'''Conceded''', and "taxoclass" applied to the above. "classis" would be an alternative. [[User:AndyMabbett|AndyMabbett]] 09:15, 22 Oct 2006 (PDT)
+
-
 
+
-
* What other attribute names are needed, if any (we could do with help from a taxonomist!)
+
-
 
+
-
* How to deal with: "Podiceps sp." (a grebe of unknown species)
+
-
** How about the following, where we can infer an unknown species by the absence of that attribute?:
+
-
:<pre><nowiki><span class="bin"><span class="genus">Podiceps</span></span></nowiki></pre>
+
-
*** This fails to account for the fact that this is thought to be only one species of a genus. Perhaps it's better to mark the species as unknown or "sp." or even "sp. #2": [[User:MichaelLee|MichaelLee]]
+
-
:<pre><nowiki><span class="bin"><span class="genus">Podiceps</span><span class="species">unknown</span></span></nowiki></pre> 
+
-
** There are also species aggregates and groups to be considered Grey/Dark Dagger sp., where it is one of two species but where the genus Acronicta cannot be used as there are more than the two species in the genus - Tony Prichard
+
-
*** Any suggestions? Or other examples? - [[User:AndyMabbett|AndyMabbett]]
+
-
**** This kind of aggregates are often used by birdwatchers (in Finland). How about separating the names with a slash (or some other  sign)?: - [[User:MikkoBiomi|MikkoBiomi]]
+
-
:<pre><nowiki><span class="bin">Phylloscopus trochilus/Phylloscopus collybita</span></nowiki></pre>
+
-
:<pre><nowiki><span class="bin"><span class="genus">Anser</span>/<span class="genus">Branta</span></span></nowiki></pre>
+
-
***** It would be clearer to come up with operators to allow a customized grouping of one's choice.  Someone may want to say "I saw this or that" and one might want to say "I saw this AND that." You can use operators like / and +, but these aren't easily computer parsed.  These could also be nested to come up with all sorts of groups. [[User:MichaelLee|MichaelLee]]
+
-
:<pre><nowiki><span class="group.or"><span class="bin">Phylloscopus trochilus</span><span>Phylloscopus collybita</span></span></nowiki></pre>
+
-
:<pre><nowiki><span class="group.and"><span class="bin">Phylloscopus trochilus</span><span>Phylloscopus collybita</span></span></nowiki></pre>
+
-
 
+
-
* Should we allow divisions of "binominal" with no parent "species", such as:
+
-
:<pre><nowiki><span class="binominal">Larus glaucoides <span class="sub">kumlieni</span></span></nowiki></pre>
+
-
 
+
-
* Is the "fungus" example OK, given that '''Amanita muscaria''' is not an abbreviation of "Fungi"?
+
-
** I do not like the use of the abbr tag at all in the examples given. The abbr tag is for abbreviations, with the suggestion that the title is used for the full name. The implication in the Mallard example is that Mallard is an abbreviation for the scientific name. It is not; it is a different type of name - Tony Prichard
+
-
 
+
-
* Do the "authority" and "date" pair need a joint wrapper?
+
-
 
+
-
* Is "bin" (short for binominal) the most appropriate term for a taxon name? When subspecies, var, subvar, etc. are nested, then surely it becomes [http://en.wikipedia.org/wiki/Trinomial_nomenclature trinomial]? Would simply '''name''' or '''TaxonName''' not be more flexible? - Charles Roper
+
-
 
+
-
*Some instances of "authority" can be complex, such as:
+
-
 
+
-
:'''''Salmonella enterica''''' (ex Kauffmann & Edwards 1952) Le Minor & Popoff 1987
+
-
 
+
-
in which the parentheses are significant (see [http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Tree_of_Life/taxobox_usage#Authorities Wikipedia Taxobox usage] and [http://en.wikipedia.org/wiki/Author_citation_%28zoology%29 Wikipedia's article on zoological authority citations] for further examples). Should we therefore allow "authority" to be a text string, and omit a separate year field? [[User:AndyMabbett|Andy Mabbett]] 03:17, 27 Jun 2007 (PDT)
+
-
 
+
-
*[http://species.wikimedia.org/w/index.php?title=Wikispecies:Village_Pump&curid=7207&diff=338726&oldid=338563 Comment on WikiSpecies, by Ucucha (19:21, 1 July 2007 UTC)]:
+
-
 
+
-
:<blockquote>I definitely prefer "superregnum" over "superkingdom". Using the Latin forms of the taxon names makes the format less language-dependent and more international. It probably also reduces ambiguity somewhat (a biological "kingdom" is also called an "empire" in English).</blockquote>
+
-
 
+
-
====To add====
+
-
* Animal hybrids
+
-
* GUID (Globally Unique Identifier). When referencing to a taxon name, there is also often the need to provide a GUID which relates to a taxonomic concept database (such as the [http://nbn.nhm.ac.uk/nhm/ NHM Species Dictionary]). By providing a GUID, ambiguity is removed. - Charles Roper
+
-
** Thank you. [[User:AndyMabbett|AndyMabbett]] 11:55, 26 Sep 2006 (PDT)
+
-
 
+
-
====Future development====
+
-
Instead of including gender, age-bracket and count, we could allow for a future microformat, called, perhaps, "sighting", which might have the components:
+
-
 
+
-
*sighting
+
-
**species (a "species" microformat)
+
-
***set (one or more)
+
-
****count
+
-
****age-bracket
+
-
****gender
+
-
**location (hCard or geo)
+
-
**date-time
+
-
 
+
-
See [http://www.westmidlandbirdclub.com/ladywalk/latest.htm West Midland Bird Club's Latest news from Ladywalk] and [http://www.birdforum.net/showthread.php?t=48505 In and around South Staffordshire 2006 (blog)] for simple examples.
+
==Bill Hull==

Revision as of 09:13, 17 January 2008


Species Brainstorming

Note: the original name of the proposed microformat, "species", is likely to change, probably to "biota" or "taxon". The former has been retained here, to avoid having to make many repetitive and perhaps redundant edits.
The Operator extension now detects Species. A test page is available. Work on both continues!

Andy Mabbett

Proposal

There should, I believe, be a "species" microformat for the markup of plant and animal names, to include their scientific names. Consider:

<abbr class="species" title="Anas platyrhynchos">Mallard</abbr>

or

<span class="species">Anas platyrhynchos</span>

The microformat would allow user agents to be configured to perform look-ups on on-line databases of species, according to user preferences. Specification of the taxonomic class would help user agents to know which such databases were applicable (i.e., use database A for plants, but database B for mammals and database C for insects.)
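As a rough illustration of the first step a user agent would need, here is a minimal sketch that pulls the scientific name out of either of the two patterns above. It assumes the class name "species" used on this page (which may yet change), and the names `SpeciesExtractor` and `extract_species` are made up for this example. It does not handle a species element nested inside another element of the same tag name.

```python
from html.parser import HTMLParser


class SpeciesExtractor(HTMLParser):
    """Collect scientific names from elements carrying class="species"."""

    def __init__(self):
        super().__init__()
        self.names = []        # scientific names found so far
        self._open_tag = None  # tag name of the species element being read
        self._text = []        # text collected inside that element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._open_tag is None and "species" in classes:
            if tag == "abbr" and attrs.get("title"):
                # abbr pattern: the title attribute carries the scientific name
                self.names.append(attrs["title"].strip())
            else:
                # plain element: the visible text is the scientific name
                self._open_tag = tag
                self._text = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open_tag is not None and tag == self._open_tag:
            # collapse runs of whitespace before storing the name
            self.names.append(" ".join("".join(self._text).split()))
            self._open_tag = None


def extract_species(html):
    """Return the scientific names marked up in an HTML fragment."""
    parser = SpeciesExtractor()
    parser.feed(html)
    parser.close()
    return parser.names
```

A look-up against an on-line database could then be keyed on each returned name; which database to use is left to the user agent.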

It would also allow for more specific searching (do I mean "crow" or do I mean "Corvus corone"?).

The specification should encourage, but not mandate, the correct capitalisation of scientific names, so "Anas platyrhynchos", not "anas platyrhynchos" nor (except historically) "Anas Platyrhynchos". A reminder that such names should be styled with italics will also be included.
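The capitalisation convention could be checked, or suggested to an author, with something like the following sketch. The function name `normalise_binomial` is hypothetical, and it only handles plain two-word names (no subspecies, authority or hybrid markers).

```python
def normalise_binomial(name):
    """Return a two-part scientific name as 'Genus epithet'."""
    parts = name.split()
    if len(parts) != 2:
        raise ValueError("expected exactly two words: genus and specific epithet")
    genus, epithet = parts
    # genus takes an initial capital; the specific epithet is all lower case
    return f"{genus.capitalize()} {epithet.lower()}"
```

Since the specification would only encourage this form, a publishing tool might compare the input with the normalised result and warn, instead of rewriting the text.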

Straw man proposal

See : species-strawman-01

Bill Hull

My website has 17000+ photos of 4700+ bird species. There are also a handful of butterflies (organized very poorly as I am unaware of any published butterfly world taxonomies) and shortly will have a number of dragon/damselflies. The site is made up of static pages but is built from a database, so it is easy for me to add new HTML tags to the pages. If you are interested in some prototyping at some point I can probably build stuff into the pages. - Bill Hull

Roger Hyam

Taxonomic Databases Working Group

TDWG is the organisation for standardisation in exchange of biodiversity data. The organisation has now (November 2007) undergone some re-organization. It has a new collaborative development environment, standards process, standards architecture and it has formed alliances with major organizations in the domains of geospatial and ecological data.

Central to the TDWG standards architecture are the LSID vocabularies. The role of these vocabularies is to define URIs for the nuts-and-bolts concepts that occur in the biodiversity informatics domain. See a description of what the TDWG ontology is for details. Although the vocabularies are defined in OWL the intention is for their URIs to be used as namespaces across different XML and non-XML based technologies. They can act as a central mapping point for those hard pressed developers who want to combine data presented to them in many formats.

The species microformats that are proposed here are a good thing. The only danger is that they re-define any of the central terms defined in the TDWG vocabularies. If they do that then they are creating another language instead of extending HTML to embrace existing semantics - which I don't think is their intent. It would be nice to have the data in web pages in a form that can be combined with the hundreds of millions of records marked up with the TDWG URIs.

If there is enough belief in the need for a Species Microformat, why not propose a TDWG Applicability Statement and take it through a peer review process? The TDWG process is quite simple and free (unless you count blood, sweat and tears). You would need to form a Task Group with a charter saying what you intended to do. As convener of the TAG Interest Group I would willingly host the Task Group. You could then propose a standard and have it reviewed by a range of biologists and IT people before it becomes ratified and recommended for adoption. RogerHyam 2007-11-5

Malcolm Storey

(extracted from e-mails to Andy Mabbett, by kind permission)

ICZN, ICBN et al

You don't cover the full set of levels of taxonomic hierarchy defined by the relevant body, ICZN or ICBN (plus the others: one each for garden plant varieties, bacteria and viruses). I don't know about mycoplasmas, diseases, BSE factors etc.

ICBN Ranks listed [1], [2]

AIUI ICBN only goes down to species.

ICZN isn't so easy: [3]

1.2.2. The Code regulates the names of taxa in the family group, genus group, and species group. Articles 1-4, 7-10, 11.1-11.3, 14, 27, 28 and 32.5.2.5 also regulate names of taxa at ranks above the family group. (But none of the above articles list the taxonomic ranks.)

ICZN Only goes down to subspecies (art 1.3.4)

Note also:

1.4. Independence. Zoological nomenclature is independent of other systems of nomenclature in that the name of an animal taxon is not to be rejected merely because it is identical with the name of a taxon that is not animal (see Article 1.1.1)

(eg Trichia, Oenanthe, Melanotus)

Myxomycetes are the exception - they're in kingdom protozoa which falls under ICZN but they fall under the ICBN name space. (Hence "Trichia").

DNA

You may want to consider refs to DNA sequences. They're not part of taxonomy, but they can be considered the bottom rung of the taxonomic hierarchy and they will be of increasing significance.

Typography

What about Adalia 2-punctata and Adalia bipunctata (not to mention those with hyphens [or apostrophes], which may get left out)? And what about accented characters?

Adalia 2-punctata is an abbreviation of Adalia bipunctata, so:
<abbr class="binominal" title="Adalia bipunctata">Adalia 2-punctata</abbr>

AndyMabbett 09:55, 21 Oct 2006 (PDT)

Gaps

The hierarchy is not always fully populated. Not every species belongs to a class. Maybe this was where fungi are different. In Paul Kirk's databases (which are the official ones used to drive the checklists and NBN) he has fixed fields for the higher level taxa, which means that only certain ranks can be used. The blanks he fills in (mostly!!) with "incertae sedis" (I think it's Latin for "uncertain seat"). In my database I use a self-join, which gives much more flexibility. Anyway, there are lots of "incertae sedis" in Paul's database!
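To make the self-join idea concrete, here is a small sketch using an in-memory SQLite database. The table and column names (`taxon`, `taxon_rank`, `parent_id`) are invented for illustration and are not taken from either database mentioned above. Each row points at its parent, so ranks can simply be skipped where none is assigned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE taxon (
        id         INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        taxon_rank TEXT NOT NULL,
        parent_id  INTEGER REFERENCES taxon(id)  -- self-join; NULL at the root
    )
    """
)
conn.executemany(
    "INSERT INTO taxon (id, name, taxon_rank, parent_id) VALUES (?, ?, ?, ?)",
    [
        (1, "Fungi", "kingdom", None),
        (2, "Amanitaceae", "family", 1),  # intermediate ranks deliberately omitted
        (3, "Amanita", "genus", 2),
        (4, "Amanita muscaria", "species", 3),
    ],
)


def lineage(conn, taxon_id):
    """Walk parent links from a taxon up to the root, returning (rank, name) pairs."""
    rows = []
    while taxon_id is not None:
        name, rank, taxon_id = conn.execute(
            "SELECT name, taxon_rank, parent_id FROM taxon WHERE id = ?",
            (taxon_id,),
        ).fetchone()
        rows.append((rank, name))
    return rows
```

With fixed fields, the missing phylum, class and order would each need a placeholder value; here they are just absent from the chain.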

Homonyms

Apion carduorum sensu Morris 1990 is Apion gibbirostre (Gyllenhal, 1813). Apion carduorum Kirby, 1808 is a different species.

You'd mark the former up as something like
<abbr class="binominal" title="Apion gibbirostre">''Apion carduorum'' sensu Morris 1990</abbr>
AndyMabbett 12:21, 5 Oct 2006 (PDT)

Citations for authorities

If people are citing the authority in full they would include the literature reference, not just the date e.g.

Cuphophyllus niveus (Scop.) Bon, Doc. Mycol. 14(56): 11 (1985)[1984]
Perhaps we should allow for the inclusion of an hCitation? Andy Mabbett 15:08, 28 Feb 2007 (PST)

Hyppo

Nomenclatural challenge

You asked for comments. One challenge I see is the difference in nomenclature for Animalia and Plantae (coming from the old 2 kingdom system). For Plantae the International Code of Botanical Nomenclature[4] is used and for Animalia the code from http://www.iczn.org/. Animalia code is not officially accepted but ICZN tries to be authoritative starting from 2008.

The two different nomenclatural systems differ in a few areas, and they affect markup.

--Hyppo 14:23, 9 Oct 2006 (PDT)
I would mark those up as:
<span class="genus">Dendroceros</span> subg. <span class="subgenus">Apoceros</span>
<span class="genus">Sula</span> <span class="subgenus">Morus</span>
<span class="binominal">Begonia grandis</span> ssp. <span class="subspecies">evansiana</span>
<span class="binominal">Gorilla beringei</span> <span class="subspecies">graueri</span>
With a wrapping class="biota" and, possibly, kingdom attributes.
AndyMabbett 11:37, 10 Oct 2006 (PDT)

Cyndy Parr

The ideas expressed here are promising. Below are my comments on all the preceding -- as I have time I'll organize, elaborate, and try to move parts into the right discussion threads above.

In the Spire project we have been developing ontologies in OWL for taxonomic names and hierarchies. Ideally, we'd like to have a microformat where people can tag a scientific name and an application can then check an ontology of their choice for more information (richer semantics).

We would discourage full expression of the Linnaean hierarchy except for those who are maintaining such classifications (such as uBio). The rest of the hierarchy can be retrieved ontologically as necessary.

Better to tie the scientific name (taxon name) to the authority or ontology from which it came. I.e. for those who are able to provide information on taxonomic concepts, support for TCS (Taxonomic Concept Schema) fields would be important.

I prefer "taxon" or "taxon-name" or TaxonName over biota (which is plural, and too close to biotic, which has a far larger scope than taxa). I would also prefer "binomial" to "binominal".

"class" is difficult not only because of the confusion with the programming concept of classes, but because it is a taxonomic rank. However, most of us have figured out the difference by now so this is not critical.

"cname" should be "comname" or "common-name" or "vernacular" to make it more obvious what the information is. A sub-component would be the language for which that common name is used ( something like an HTML attribute lang="en")

There are known conflicts between names across kingdoms (as current codes of nomenclature allow these). Thus specification of kingdom may be encouraged. Disambiguation could be handled by applications outside the microformats (this could be difficult), or they could be dealt with in the core microformat: e.g. plant-taxon or fungal-taxon or animal-taxon.
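A sketch of why the kingdom matters, using the Oenanthe homonym discussed on this page: without a kingdom, a look-up on the genus alone returns two candidates. The `GENERA` table and `resolve_genus` function are illustrative stand-ins, not a real database or API.

```python
# Toy look-up table keyed on (kingdom, genus); the descriptions are informal.
GENERA = {
    ("Animalia", "Oenanthe"): "wheatears (birds)",
    ("Plantae", "Oenanthe"): "water dropworts (plants)",
}


def resolve_genus(genus, kingdom=None):
    """Return (kingdom, description) pairs matching a genus name.

    If kingdom is given, only entries in that kingdom are returned.
    """
    return [
        (k, description)
        for (k, g), description in GENERA.items()
        if g == genus and (kingdom is None or k == kingdom)
    ]
```

An application receiving more than one match could either ask the user or fall back on other context in the page; which of those is appropriate is outside the microformat itself.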

A sightings microformat is a good idea and I would be interested in being involved in that. We've been toying with this in OWL and also using structured blogging over at http://fieldmarking.reger.com

Your terms such as gender (better: sex), age bracket (better: life stage), count, type (better: depending on the meaning, caste or morph) all belong in a specimen or sighting microformat, to be used in combination with the taxon microformat rather than being part of it.

Response by Andy Mabbett

Thank you very much for your detailed contribution. I have a few responses:

  <span class="taxon lsidres:urn:lsid:ubio.org:namebank:21833">
    <i class="sci-name">Passeriformes</i>
  </span>

Or, to simplify further:

  <i class="taxon sci-name lsidres:urn:lsid:ubio.org:namebank:21833">Passeriformes</i>

Or, at the simplest level:

  <i class="taxon">Passeriformes</i>

Simply marking up the word as a taxon would lighten the load of any parser, making its job much simpler. --Charles Roper 10:50, 8 Jan 2007 (PST)
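For the variants that carry an LSID in the class attribute, a parser would have to pick the identifier out of the space-separated class tokens. A minimal sketch, assuming the "lsidres:" prefix shown in the examples above (the function name `lsid_from_class` is hypothetical):

```python
def lsid_from_class(class_attr, prefix="lsidres:"):
    """Return the LSID embedded in a class attribute value, or None if absent."""
    for token in class_attr.split():
        if token.startswith(prefix):
            # strip the prefix, leaving the bare urn:lsid:... identifier
            return token[len(prefix):]
    return None
```

For the simplest form, `class="taxon"`, this returns None and the parser has only the element text to go on.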

(I'm either in agreement with your other points, or ambivalent.)

Thank you again - do stick around. Are you on the mailing list?

Andy Mabbett 11:06, 5 Jan 2007 (PST)

Pengo

Unfortunately scientific names seem to change as often as common names. I have some examples and use cases this microformat needs to address, around the problems of ambiguity:

Ambiguity 1. Ambiguous scientific names. Sousa chinensis may refer either to the Chinese White Dolphin (also known as Sousa chinensis chinensis) or to the Humpback dolphin, also known as Sousa (genus), which includes up to five species or subspecies of dolphin including the Chinese White Dolphin. I don't care whether the Chinese White Dolphin is a species or subspecies, but the microformat needs to allow the user to be specific about which system is being addressed.

Ambiguity 2. Another example is the Orangutan... or Orangutans. Orangutans were once believed to be a single species, but are now considered two separate species. The problem is that the new scientific name for just the Bornean species (Pongo pygmaeus) is the same as the old scientific name which encompassed both species (Pongo pygmaeus). Meanwhile the new scientific name for the Sumatran Orangutan (Pongo abelii) is always unambiguous.

Ambiguity 3. Doronomyrmex pocahontas is an ant species that probably doesn't belong in the genus Doronomyrmex, but rather in Leptothorax. But until a full taxonomic study of the known species of Doronomyrmex and Leptothorax is carried out, it will stay there. Meanwhile the term "Leptothorax (sensu stricto)" is used to mean "in the sense of the original author".

Use cases: So how do we:

  1. tag species in new documents, where we are using the most current nomenclature in the tags, to indicate that we don't mean the old nomenclature
  2. tag species allowing for new nomenclature to arise which may obsolete what we're using
  3. tag species in old documents, where we have updated the nomenclature in the tag, but the text may be referring to the old nomenclature, and we want to indicate that the updated nomenclature is being used.
  4. tag species in [others'] documents that are tagged automatically and where the specific nomenclature being used is unknown or ambiguous
  5. address issues where competing nomenclatures exist side-by-side, or transition periods
  6. tag species that have some clues as to which nomenclature is being used, e.g. the date of publication, and the author.
  7. tag a taxon which is now considered paraphyletic
  8. decide what's out of the scope of this microformat

Brainstorm solutions:

<span species="Pongo pygmaeus" old-synonym="Pongo pygmaeus pygmaeus">Bornean Orangutan</span>

Basically, I don't think synonyms are necessary unless they are there to show that the species was previously called something else, which may help to give a more exact meaning.

Comments? Are there already existing solutions to this problem in the real world? Pengo 19:49, 28 Jan 2007 (PST)

Response to Pengo by Andy Mabbett

Thank you for your expert contribution. Of your proposed solutions, the common (or vernacular) name, UID and author/year are already in the current proposal. It may be sensible to have a "synonym" property (as used on http://en.wikipedia.org/wiki/Doronomyrmex_pocahontas), but I don't think "old-synonym" is particularly well named. Perhaps, if it's needed at all, "formerly" would be better? It is worth remembering, though, that the microformat is meant for labelling what people already publish and, for instance, http://en.wikipedia.org/wiki/Bornean_Orangutan refers to Pongo pygmaeus, not any previous name. Andy Mabbett 02:20, 30 Jan 2007 (PST)

Charles Roper

Synonyms

I found an interesting example of synonym usage in the Tiger Beetles of Connecticut checklist. In the particular example cited, the synonyms refer to, or are associated with, the species name - Cicindela duodecimguttata Dejean 1825. Synonyms are often mentioned alongside or near preferred scientific names; how should we tie them together, especially when, as in this case, the name and the synonym are not positioned close to one another, but are still clearly associated? As a segue to this question, how should multiple synonymous common names be represented? How about common names in different languages? For example, the Otter has many different common names.

I take it you refer to the text which may be paraphrased (by omitting some prose) as:
Cicindela duodecimguttata is known from 23 localities. Cicindela duodecimguttata, once classified as a subspecies of C. repanda, shares many traits with C. repanda. Where C. duodecimguttata occurs, the more common C. repanda is usually found.
Synonomies: Cicindela proteus Kirby 1837:9. Cicindela bucolica Casey 1913:28. Cicindela hudsonica Casey 1916:29. Cicindela edmontonensis Carr 1920:21
The problem would seem to be that C. repanda is referred to both as a species in its own right, and as a past synonym of C. duodecimguttata. If the whole thing is wrapped in one div class="biota", allowing the other listed synonyms to be included, then how is C. repanda to be marked up as a species in its own right?
I would mark up the first occurrence of each, then use the include-pattern to "attach" the other listed synonyms to the former (I've only included one synonym in the following, for clarity):

<span class="biota">

 <span class="binominal">Cicindela duodecimguttata</span>
 <object class="include" data="#C-proteus"></object>

</span>

is known from 23 localities. Cicindela duodecimguttata, once classified as a subspecies of

<span class="biota">

 <span class="binominal">C. repanda</span>

</span>

, shares many traits with C. repanda. Where C. duodecimguttata occurs, the more common C. repanda is usually found.

Synonomies: <span class="synonym" id="C-proteus">

 <span class="binominal">Cicindela proteus</span> [or maybe "synonym-binominal" ?]
 <span class="authority">Kirby</span>
 <span class="year">1837</span>:9.</span>

Cicindela bucolica Casey 1913:28. Cicindela hudsonica Casey 1916:29. Cicindela edmontonensis Carr 1920:21
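To show how a consumer might follow the include-pattern in the markup above, here is a hedged Python sketch. It assumes the page is well-formed XHTML (so the standard library XML parser can read it) and that the fragment is wrapped in a single root element. The function names and the record layout are invented for the example.

```python
import xml.etree.ElementTree as ET


def has_class(element, name):
    return name in element.get("class", "").split()


def text_of(element):
    """Text of an element and its descendants, with whitespace collapsed."""
    return " ".join("".join(element.itertext()).split())


def first_with_class(element, name):
    for descendant in element.iter():
        if has_class(descendant, name):
            return text_of(descendant)
    return None


def resolve_biota(xhtml):
    """Return one record per 'biota' element, with included synonyms attached."""
    root = ET.fromstring(xhtml)
    by_id = {el.get("id"): el for el in root.iter() if el.get("id")}
    records = []
    for biota in (el for el in root.iter() if has_class(el, "biota")):
        record = {"binominal": first_with_class(biota, "binominal"),
                  "synonyms": []}
        for child in biota.iter():
            if child.tag == "object" and has_class(child, "include"):
                # data="#C-proteus" points at the element with id="C-proteus".
                target = by_id.get(child.get("data", "").lstrip("#"))
                if target is not None and has_class(target, "synonym"):
                    record["synonyms"].append({
                        "binominal": first_with_class(target, "binominal"),
                        "authority": first_with_class(target, "authority"),
                        "year": first_with_class(target, "year"),
                    })
        records.append(record)
    return records


SAMPLE = """<div>
<span class="biota"><span class="binominal">Cicindela duodecimguttata</span>
<object class="include" data="#C-proteus"></object></span>
is known from 23 localities.
<span class="biota"><span class="binominal">C. repanda</span></span>
Synonomies: <span class="synonym" id="C-proteus">
<span class="binominal">Cicindela proteus</span>
<span class="authority">Kirby</span>
<span class="year">1837</span>:9.</span>
</div>"""

for record in resolve_biota(SAMPLE):
    print(record)
```

C. repanda stays a species record in its own right, while Cicindela proteus is attached to C. duodecimguttata only through the include reference.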

I might then use its entry on the "shares many traits" line to mark up C. repanda as a synonym, and include it in the same way.
Multiple and foreign-language common names would be catered for by allowing the common name attribute to be "0 or many" (the first such occurrence having precedence), and using a lang attribute where appropriate.
Andy Mabbett 14:42, 28 Feb 2007 (PST)
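The "0 or many" rule for common names can be sketched in Python as follows. The class name "vernacular" is only one of the names suggested on this page, not a settled property, and the sketch assumes each common-name element contains plain text and carries its own lang attribute.

```python
from html.parser import HTMLParser


class VernacularCollector(HTMLParser):
    """Collect common names from elements carrying a 'vernacular' class."""

    def __init__(self):
        super().__init__()
        self.names = []        # (lang, name) pairs in document order
        self._open_tag = None  # tag name of the vernacular element being read
        self._lang = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        if self._open_tag is None and "vernacular" in classes:
            self._open_tag = tag
            # "und" (undetermined) stands in when no lang attribute is given.
            self._lang = attributes.get("lang") or "und"
            self._text = []

    def handle_data(self, data):
        if self._open_tag is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._open_tag is not None and tag == self._open_tag:
            name = " ".join("".join(self._text).split())
            self.names.append((self._lang, name))
            self._open_tag = None


def common_names(html):
    collector = VernacularCollector()
    collector.feed(html)
    collector.close()
    return collector.names


def preferred_name(names, lang):
    """Return the first name recorded for lang, or None if there is none."""
    for name_lang, name in names:
        if name_lang == lang:
            return name
    return None


SAMPLE = ('<span class="vernacular" lang="en">Eurasian Otter</span> '
          '<span class="vernacular" lang="de">Fischotter</span> '
          '<span class="vernacular" lang="en">European Otter</span>')

names = common_names(SAMPLE)
print(names)
print(preferred_name(names, "en"))
```

Keeping the names in document order is what lets the first occurrence in each language take precedence.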

Other use cases

Please add your suggestions!

'Species' microformats could be used to:


See also

