html-stripping-examples
This page documents existing library code that strips elements and attributes from HTML so that the result can be displayed "safely" (e.g. when embedded in another page).
jsoup (Java)
- none – "This whitelist allows only text nodes: all HTML will be stripped."
- simpleText – "This whitelist allows only simple text formatting: b, em, i, strong, u"
- basic – "This whitelist allows a fuller range of text nodes: a, b, blockquote, br, cite, code, dd, dl, dt, em, i, li, ol, p, pre, q, small, strike, strong, sub, sup, u, ul, and appropriate attributes."
- relaxed – "This whitelist allows a full range of text and structural body HTML: a, b, blockquote, br, caption, cite, code, col, colgroup, dd, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u, ul"
Problems: none of the built-in whitelists above includes the new elements defined in HTML5, so those elements are stripped.
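The whitelist idea can be illustrated with a small standalone sketch. It is not jsoup's implementation: jsoup parses the document into a DOM, while this toy version only matches tags with a regular expression, drops every attribute, and should not be used as a real sanitizer. The class name `AllowlistSketch` and the method `strip` are made up for this example. The element names in `SIMPLE_TEXT` are copied from the simpleText description above. In jsoup versions that ship the `Whitelist` class, the corresponding call is `Jsoup.clean(html, Whitelist.simpleText())`.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of jsoup and not a safe sanitizer.
class AllowlistSketch {
    // "none": no elements are allowed, only text survives.
    static final Set<String> NONE = Set.of();
    // "simpleText": element names as listed in the description above.
    static final Set<String> SIMPLE_TEXT = Set.of("b", "em", "i", "strong", "u");

    // Matches an opening or closing tag.
    // Group 1 is "/" for a closing tag (otherwise empty), group 2 is the element name.
    private static final Pattern TAG =
            Pattern.compile("<(/?)([a-zA-Z][a-zA-Z0-9]*)[^>]*>");

    // Removes every tag whose element name is not in 'allowed'.
    // Allowed tags are re-emitted without attributes; text between tags is kept.
    static String strip(String html, Set<String> allowed) {
        Matcher m = TAG.matcher(html);
        StringBuilder out = new StringBuilder();
        while (m.find()) {
            String name = m.group(2).toLowerCase(Locale.ROOT);
            String replacement = allowed.contains(name)
                    ? "<" + m.group(1) + name + ">"
                    : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String input =
                "<p>Hello <b onclick=\"x()\">bold</b> <a href=\"http://example.com\">link</a></p>";
        System.out.println(strip(input, NONE));         // Hello bold link
        System.out.println(strip(input, SIMPLE_TEXT));  // Hello <b>bold</b> link

        // "mark" is an HTML5 element that is absent from the list, so its tags are dropped.
        System.out.println(strip("<mark>new</mark> <em>old</em>", SIMPLE_TEXT)); // new <em>old</em>
    }
}
```

The last call shows the problem noted above in miniature: an element that is missing from the list loses its tags, even when it is harmless, until someone adds it to the list.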