Latest revision as of 17:16, 28 April 2021
Species
- For the latest ideas, and to make comments, please see species-brainstorming.
- Note: the original name of the proposed microformat, "species", is likely to change, probably to "biota" or "taxon". The original name has been retained here, to avoid having to make many repetitive and perhaps redundant edits.
- The beta of Operator (0.7a) detects Species. A test page is available. Work on both continues!
Introduction
People use the vernacular AND taxonomic names of species in everyday speech and writing - just read or watch any populist gardening magazine or television programme.
- Consider this list: "Blackbird", "poodle", "T Rex", "potato", "French Marigold", "Wisteria", "E. Coli", "HIV", "Rubella" and "human being".
- "T Rex" is "Tyrannosaurus rex"; "E. Coli" is "Escherichia coli"; "HIV" is "Human immunodeficiency virus" and "Rubella" is "Rubella virus". All are the taxonomic (or scientific) names of unique species.
- "Wisteria" is a taxonomic genus.
- "Blackbird", "poodle", "potato", "French Marigold" and "human being" (arguments about Neanderthals notwithstanding) are vernacular (or common) names, but still refer to individual species.
- The scientific naming of organisms is a part of biodiversity informatics - "the application of information technology to the domain of biodiversity".
- The proposal aligns with E. O. Wilson's wish:
...that we will work together to help create the key tool that we need to inspire preservation of Earth's biodiversity: the Encyclopedia of Life [...] an encyclopedia that lives on the Internet, with an ever-evolving page for every species [and which] does not duplicate existing efforts, but instead incorporates them through linking [with a] search technology that can aggregate existing biological information and make it easily accessible.
- As long ago as April 1998, Robert Rothenburg's paper on Dictionaries (Note 4) said:
What is missing [from HTML] is an element for marking up "proper names" (names of people, geographic locations, institutions, or even scientific names such as genus/species).
- It's interesting that microformats have given us the first three missing items - and we're now debating the fourth!
Proposal
Imagine viewing a web page with a reference to a species, and being able to use a browser add-on to go directly to information about that species on, say, Wikipedia, Wikispecies, Google Images, or another site of your choosing, such as an academic database.
Your software would automatically know to search site A if the scientific name referred to a moth, site B for a bird, and site C for a plant - and you could set your preferences as to which sites those were to be, and in which order two or more were to be searched (e.g. for moths, try UK Moths first, if not found try The Global Lepidoptera Names Index).
Or suppose someone writes a long, chronologically ordered web page about all the birds, insects, mammals and plants they saw on a wildlife safari, with lots of prose description of the places where they saw them and the people they were with, but you want to extract a list of species, sorted alphabetically within taxonomic class (birds first, then insects, then...) or in taxonomic order.
Those are just two of the things a "species" microformat might do for you.
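The second scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names "biota", "vernacular", "binominal" and "taxon-class" are hypothetical placeholders (this page does not define the markup; see species-brainstorming for the straw man), and the names SpeciesExtractor and species_list are invented for the example.

```python
from html.parser import HTMLParser

# Hypothetical class names, chosen for illustration only.
ROOT_CLASS = "biota"
FIELD_CLASSES = ("vernacular", "binominal", "taxon-class")
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}
ROOT = object()  # stack marker for the element that opened a record


class SpeciesExtractor(HTMLParser):
    """Collect one dict per element carrying the (assumed) root class."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._open = []      # one marker per open element: ROOT, a field name, or None
        self._record = None  # record being filled while inside a root element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:  # void elements never get an end tag
            return
        classes = (dict(attrs).get("class") or "").split()
        marker = None
        if self._record is None:
            if ROOT_CLASS in classes:
                self._record = {}
                marker = ROOT
        else:
            marker = next((c for c in FIELD_CLASSES if c in classes), None)
        self._open.append(marker)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._open:
            return
        if self._open.pop() is ROOT:
            self.records.append({k: v.strip() for k, v in self._record.items()})
            self._record = None

    def handle_data(self, data):
        if self._record is None:
            return
        # Attribute the text to the innermost open field element, if any.
        for marker in reversed(self._open):
            if marker is ROOT:
                return
            if marker is not None:
                self._record[marker] = self._record.get(marker, "") + data
                return


def species_list(html, class_order=("Aves", "Insecta", "Mammalia")):
    """Return records sorted by taxonomic class, then by scientific name."""
    parser = SpeciesExtractor()
    parser.feed(html)
    parser.close()

    def sort_key(record):
        cls = record.get("taxon-class", "")
        rank = class_order.index(cls) if cls in class_order else len(class_order)
        return (rank, record.get("binominal", ""))

    return sorted(parser.records, key=sort_key)
```

Called on a safari page marked up in this way, species_list would return the birds first, then the insects, each group in alphabetical order of scientific name, regardless of the order in which the prose mentions them.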
Your software, or a search engine, would be able to differentiate between pages discussing HMS Beagle, the ship, and a Beagle dog; or between birds that fly and a slang term for women.
Another benefit would be that user-agents could be instructed to treat text marked up in this way as not being in the base language of the document or element in which it occurs. Such names should be pronounced as for Latin; should not be translated (e.g. where a component word happens also to be a valid word in that language, such as the genus Colon, Circus cyaneus, Hesperia comma, or anything with major or minor on an English-language page); and should either not be spell-checked, or be spell-checked with a specialised dictionary (a need identified in this 2003 ietf-languages discussion of language values for taxonomic names).
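As a rough illustration of the spell-checking case, the Python sketch below collects only the words that fall outside marked-up scientific names, so that "Circus" or "comma" in a binomial would never reach an English spell-checker or translator. The class name "binominal" and the name CheckableText are assumptions made for this example, not part of the proposal.

```python
import re
from html.parser import HTMLParser

TAXON_CLASS = "binominal"  # hypothetical class name marking a scientific name
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class CheckableText(HTMLParser):
    """Gather words that may safely be spell-checked in the page's base language."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._skip_depth = 0  # greater than zero while inside a scientific name

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._skip_depth or TAXON_CLASS in classes:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.words.extend(re.findall(r"[A-Za-z']+", data))
```

A user-agent could apply the same test in reverse, handing the skipped text to a Latin pronunciation rule or a specialised taxonomic dictionary instead of discarding it.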
A further benefit of the species microformat would be the enrichment of species checklists, which are commonly found on the web. Broadly speaking, a species checklist is a list of taxa, usually for a particular group of similar organisms such as birds or vascular plants, found within a particular geographical region (a country, a region, a county, or a specific site, large or small). A typical example is the Checklist of Beetles of the British Isles which, as the name suggests, lists beetles known to be found within the British Isles. A Google Search for "species checklist" will reveal many other examples. Species checklists are presented in a broadly consistent manner, but computers usually cannot parse and use them because there is no common standard for marking them up in HTML. The species microformat would provide that common standard. A fully microformat-enabled checklist would be parsable by computers, and would thus give developers a means to aggregate and otherwise make use of this invaluable content beyond the current, rather limited, use of simple online viewing.
A specific example of checklist use might be in enabling biological recording software to parse and aggregate checklists in order to include them in its own species dictionaries. Typically there are waits of many months or even years while humans collate checklist changes manually for inclusion in recording software; automated checklist parsing and aggregation would greatly speed up this process and make it more efficient.
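The aggregation step might look something like the following Python sketch. It assumes the checklists have already been parsed into simple dictionaries; that data shape, and the function name merge_checklists, are invented for illustration and do not describe how any existing recording software works.

```python
def merge_checklists(*checklists):
    """Merge parsed checklists into one species dictionary keyed by scientific name.

    Each checklist is assumed to look like:
        {"region": "British Isles",
         "entries": [{"binominal": "Lucanus cervus", "vernacular": "Stag Beetle"}]}
    """
    dictionary = {}
    for checklist in checklists:
        for entry in checklist["entries"]:
            # Collapse stray whitespace so the same name always yields the same key.
            name = " ".join(entry["binominal"].split())
            merged = dictionary.setdefault(
                name, {"vernacular": set(), "regions": set()}
            )
            merged["regions"].add(checklist["region"])
            if entry.get("vernacular"):
                merged["vernacular"].add(entry["vernacular"])
    return dictionary
```

Real checklists would also need synonym handling and authority information, which this sketch deliberately leaves out.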
Existing taxonomies
The proposal respects all existing biological taxonomies, and is not intended to change or supplant any of them - it is intended merely to provide webmasters (from personal hobby sites to major academic databases; from news outlets to retail organisations) with a method of either:
- marking up a taxonomical name (or taxon-common name pair) in such a way that its components can be recognised by computers; or
- marking up a common name, so as to associate a taxonomical name with it, in such a way that the latter's components can be recognised by computers.
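To make the two methods concrete, here is a minimal Python sketch that emits an HTML fragment for each. The class names ("biota", "vernacular", "binominal") and the use of an abbr element with a title attribute are assumptions made for this example only; the actual markup is whatever the straw man on species-brainstorming specifies.

```python
from html import escape


def mark_up_pair(binominal, vernacular=None):
    """Method 1: mark up a scientific name, optionally paired with a common name."""
    sci = f'<i class="binominal">{escape(binominal)}</i>'
    if vernacular is None:
        return f'<span class="biota">{sci}</span>'
    return (f'<span class="biota"><span class="vernacular">{escape(vernacular)}'
            f'</span> ({sci})</span>')


def mark_up_common(vernacular, binominal):
    """Method 2: show only the common name, carrying the scientific name in an attribute."""
    return (f'<abbr class="biota" title="{escape(binominal)}">'
            f'{escape(vernacular)}</abbr>')
```

In the first method the scientific name is visible to readers; in the second it travels with the common name for software to find, without changing what the page displays.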
Embedding within other microformats
The proposed plant microformat (with care regime, supplier, etc.), hlisting, recipe or hReview (and possibly others) could contain a scientific name microformat, in the same way that an hCalendar can contain an hCard.
See also: species-brainstorming#Future development
References
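General taxonomy
- Wikipedia: Scientific classification
  - Wikipedia: Binomial nomenclature
  - Wikipedia: Binomen
  - Wikipedia: Trinomial nomenclature
  - Wikipedia: Trinomen
  - Wikipedia: Taxon
  - Wikipedia: Rank (zoology)
  - Wikipedia: Rank (botany)
  - Wikipedia: Cultivars
  - Wikipedia: Varieties
  - Wikipedia: Sub-varieties
  - Wikipedia: Forms
  - Wikipedia: Synonyms
  - How to read a taxobox
Taxonomic codes
- Wikipedia: Nomenclature Codes
- http://www.iczn.org/ International Commission on Zoological Nomenclature (ICZN)
  - Wikipedia: International Code of Zoological Nomenclature
  - Wikipedia: International Commission on Zoological Nomenclature
- International Code of Botanical Nomenclature (ICBN)
  - Wikipedia: International Code of Botanical Nomenclature
- International Committee on Systematics of Prokaryotes (ICSP)
  - Wikipedia: International Code of Nomenclature of Bacteria (ICNB)
- International Committee on Taxonomy of Viruses
  - Wikipedia: International Committee on Taxonomy of Viruses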
Other references
- RHS Plant Finder: The naming of plants
- International Code of Nomenclature for Cultivated Plants
- Taxonomic Databases Working Group
- Hortax - The Names of Garden Plants
- www.bgbm.fu-berlin.de/iapt/nomenclature/code/SaintLouis/0000St.Luistitle.htm
- WikiSpecies
- Wiktionary: Taxonomy
- Biodiversity Informatics: Taxonomic names, metadata, and the Semantic Web
- Wikipedia: Virus classification
Contributors & Supporters
- Andy Mabbett (proponent)
- Roger Hyam (interested party?)
- Charles Roper, Sussex Biodiversity Record Centre (proponent)
- Steve McWilliam, rECOrd - The Biodiversity Information System for the Cheshire region (supporter)
- Kieren Pitts, ILRT - Institute for Learning and Research Technology and the Amateur Entomologists' Society (supporter)
- Cyndy Parr, Spire project (interested party)
- Animal Diversity Web, http://animaldiversity.org (interested party)
- Christoph Champ, http://www.christophchamp.com/ (supporter with a background in biochemistry, biophysics, and bioinformatics)
- David Stang, http://ZipcodeZoo.com (interested party)
- Stuart Turner, DVM, Leafpath Informatics (interested party)
See also
Here's some work-in-progress:
- species
- examples
- quantitative evidence
- brainstorming (includes the straw man, or draft standard)
- Biobar - A customisable Firefox Extension providing a toolbar and content menu options for browsing biological data and databases, which shows what a user-agent could do with data extracted from a species microformat
Examples in the wild
- An example use of the species microformat on the Amateur Entomologists' Society Web site.
- Grows on You, a social network for gardeners, marks up plant names with microformats.
- A "proof-of-concept" example, with binomial and sub-family, but no other ranks, has been added to the Wikispecies entry for Glaucidium sanchezi; and to its subfamily, Surniinae. See Wikispecies:Microformat for related discussion.
- The current "straw man" for the Species microformat has been deployed, in part, on Wikipedia. All Wikipedia articles with "taxoboxes" (information panels on living things; there were 37,140 at the time of posting) now emit a species microformat. Examples include Southern Tamandua (a species) and Anteater (a family of species).
- A West Midland Bird Club test page is available.
- BBC Wildlife Finder
Implementations
- Operator has a user-script for parsing Species.
- Cognition has an experimental implementation.
Implementations (pending)
- Taxon Checker - a software tool which, given a common name, searches for the relevant taxonomic data and outputs the selected species' details as (among other options) an HTML fragment. It is intended to provide templates for outputting such fragments with "species" microformat markup, once this proposal is implemented.
- Wikipedia's Beastie Bot can be used to update the "taxoboxes" of articles about living things.