any23: Difference between revisions

From Microformats Wiki
(microformats2 support - link to issues, add clients, project pages)
m No edit summary
 
Line 1: Line 1:
'''Apache Any23''' is an open source Java parser that extracts [[RDFa]], classic microformats and a variety of other formats, and turns them into an RDF graph.
'''Apache Anything To Triples''' (Any23) is a library, a web service and a command line tool that extracts structured data in RDF format from a variety of Web documents.


Project pages:
Project pages:
* http://incubator.apache.org/projects/any23.html
* Homepage: http://any23.apache.org/
* https://any23.apache.org/supported-formats.html
* Supported I/O Formats: https://any23.apache.org/supported-formats.html
* https://any23.apache.org/dev-microformat-extractors.html
* Microformats Extractor Support: https://any23.apache.org/dev-microformat-extractors.html
* https://any23.apache.org/apidocs/org/apache/any23/extractor/html/package-summary.html
* Microformats Extractor Javadoc: https://any23.apache.org/apidocs/org/apache/any23/extractor/html/package-summary.html
* Issues: https://issues.apache.org/jira/browse/ANY23
* Project Issue Management: https://issues.apache.org/jira/browse/ANY23


== implemented microformats ==
== Implemented Microformats ==
* [[adr]]
* [[adr]]
* [[geo]]
* [[geo]]
Line 20: Line 20:
* [[species]]
* [[species]]


== microformats2 support ==
== Microformats2 support ==
Any23 does not yet support [[microformats2]], however there is a desire to do so per:
Any23 supports [[microformats2]], which was implemented in [https://issues.apache.org/jira/browse/ANY23-207]
* 2014-11-09 [http://krijnhoetmer.nl/irc-logs/microformats/20141109#l-24 in #microformats IRC] <blockquote>&lt;hectorMcSpector&gt; I am making the effort to implement Microformats2 support in Any23</blockquote>


See and comment on:
== Clients ==
* https://issues.apache.org/jira/browse/ANY23-207
The WebDataCommons [http://webdatacommons.org/] project uses Any23 and now extracts a large and varied volume of Microformats from the Common Crawl Corpus [http://commoncrawl.org/].
** They appear to be blocked on needing a Java microformats2 [[parser]].


Related issue:
== Web Service ==
* https://issues.apache.org/jira/browse/ANY23-249
TODO (lewismc 2017-03-28)
 
== clients ==
The WebDataCommons uses Any23, per [https://www.assembla.com/code/commondata/subversion/nodes/244/Extractor/trunk/extractor/src/main/java/org/webdatacommons/extractor/IgnorantHcardExtractorFactory.java]
 
== web service ==
The web service listed on the Any23 website does not operate as of 2014-12.
 
Instead, try [http://inspector.sindice.com Sindice Inspector].


== see also ==
== see also ==
* [[parsers]]
* [[parsers]]
* [[microformats2]]
* [[microformats2]]

Latest revision as of 18:29, 28 March 2017

Apache Anything To Triples (Any23) is a library, a web service and a command line tool that extracts structured data in RDF format from a variety of Web documents.

Project pages:

Implemented Microformats

Microformats2 support

Any23 supports microformats2, which was implemented in [1]

Clients

The WebDataCommons [2] project uses Any23 and now extracts a large and varied volume of Microformats from the Common Crawl Corpus [3].

Web Service

TODO (lewismc 2017-03-28)

see also