html-stripping-examples

From Microformats Wiki
Revision as of 11:14, 14 September 2013 by TomMorris (talk | contribs) (started documenting)

This page documents existing library code that strips elements and attributes from HTML so that it can be displayed "safely" (e.g. when embedded in another page).

jsoup (Java)

Details

  • none – "This whitelist allows only text nodes: all HTML will be stripped."
  • simpleText – "This whitelist allows only simple text formatting: b, em, i, strong, u"
  • basic – "This whitelist allows a fuller range of text nodes: a, b, blockquote, br, cite, code, dd, dl, dt, em, i, li, ol, p, pre, q, small, strike, strong, sub, sup, u, ul, and appropriate attributes."
  • relaxed – "This whitelist allows a full range of text and structural body HTML: a, b, blockquote, br, caption, cite, code, col, colgroup, dd, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u, ul"

Problems: the built-in whitelists do not include the new elements defined in HTML5 (for example, article or section).
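
As a rough illustration, the element lists quoted under Details can be modelled as plain sets and queried to see which level keeps which element. This is a minimal sketch and not jsoup code: the class WhitelistLevels and its allows method are hypothetical names, the lists are copied from the descriptions quoted on this page, and the "appropriate attributes" are not modelled, so the result may not match any particular jsoup release.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Illustrative model only: this class is hypothetical and is NOT part of jsoup.
// Element names are copied from the whitelist descriptions quoted on this page.
final class WhitelistLevels {
    private static final Map<String, Set<String>> LEVELS = new LinkedHashMap<>();

    static {
        // "none" allows text nodes only, so its element set is empty.
        LEVELS.put("none", tags());
        LEVELS.put("simpleText", tags("b", "em", "i", "strong", "u"));
        LEVELS.put("basic", tags(
                "a", "b", "blockquote", "br", "cite", "code", "dd", "dl", "dt",
                "em", "i", "li", "ol", "p", "pre", "q", "small", "strike",
                "strong", "sub", "sup", "u", "ul"));
        LEVELS.put("relaxed", tags(
                "a", "b", "blockquote", "br", "caption", "cite", "code", "col",
                "colgroup", "dd", "dl", "dt", "em", "h1", "h2", "h3", "h4",
                "h5", "h6", "i", "img", "li", "ol", "p", "pre", "q", "small",
                "strike", "strong", "sub", "sup", "table", "tbody", "td",
                "tfoot", "th", "thead", "tr", "u", "ul"));
    }

    private static Set<String> tags(String... names) {
        return Collections.unmodifiableSet(new HashSet<>(Arrays.asList(names)));
    }

    /** Returns true if the named level lists the element (case-insensitive). */
    static boolean allows(String level, String tag) {
        Set<String> allowed = LEVELS.get(level);
        if (allowed == null) {
            throw new IllegalArgumentException("Unknown level: " + level);
        }
        return allowed.contains(tag.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Check one HTML5 element against every level modelled above.
        for (String level : LEVELS.keySet()) {
            System.out.println(level + " allows <article>: " + allows(level, "article"));
        }
    }
}
```

Keeping the lists as data makes it easy to test other HTML5 element names (section, nav, figure and so on) against each level without running a sanitizer.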

See also