Purify & Filter HTML

Here are two tools I found for cleaning HTML and removing MS Word tags using PHP. Both are also useful in text-mining applications, where markup has to be normalized before the text is processed.

htmLawed – PHP code to purify & filter HTML

  • make HTML markup in text more secure and standard-compliant
  • processed text can be used in HTML, XHTML or XML documents
  • restrict HTML elements, attributes, protocols, etc.
  • balance tags, check element nesting, transform deprecated attributes and tags, convert relative to absolute URLs, etc.
  • highly customizable
  • single file of ~45 KB
  • simple HTML Tidy alternative
  • use to filter, secure & sanitize HTML code submitted in blog comments, forum posts, etc., generate XML-compatible feed items from web-page excerpts, make old HTML code XHTML-compliant, pretty-print HTML, scrape web-pages, and so on
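As a rough illustration of the filtering described above, here is a minimal sketch of calling htmLawed on a submitted comment. It assumes htmLawed.php has been downloaded into the same directory as the script; the helper name clean_comment and the list of allowed elements are only examples, not anything htmLawed prescribes.

```php
<?php
// Assumption: htmLawed.php sits next to this script; adjust the path as needed.
require_once __DIR__ . '/htmLawed.php';

// Hypothetical helper: filter user-submitted markup down to a small set of tags.
function clean_comment(string $dirty): string
{
    $config = array(
        'safe'          => 1,                          // disallow script, iframe, etc. and on* attributes
        'elements'      => 'a, b, em, strong, p, br',  // example allowlist, pick your own
        'clean_ms_char' => 1,                          // replace MS Word-style special characters
    );

    return htmLawed($dirty, $config);
}

$dirty = '<p onclick="evil()">Hello <b>world<script>alert(1)</script></p>';
echo clean_comment($dirty), "\n";
```

With the default settings, htmLawed also balances unclosed tags, so the missing closing tag for b in the sample input should be repaired as part of the same call.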

To download click here.

HTML Purifier – Standards-Compliant HTML Filtering

HTML Purifier is a standards-compliant HTML filter library written in PHP. HTML Purifier will not only remove all malicious code (better known as XSS) with a thoroughly audited, secure yet permissive whitelist, it will also make sure your documents are standards compliant, something only achievable with a comprehensive knowledge of W3C’s specifications. Tired of using BBCode due to the current landscape of deficient or insecure HTML filters? Have a WYSIWYG editor but never been able to use it? Looking for high-quality, standards-compliant, open-source components for that application you’re building? HTML Purifier is for you!
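For comparison, a minimal sketch of the same kind of clean-up with HTML Purifier might look like this. It assumes the library has been unpacked into an htmlpurifier directory next to the script (a Composer autoloader would work as well); the helper name purify_html and the allowed-tag list are illustrative only.

```php
<?php
// Assumption: HTML Purifier is unpacked in ./htmlpurifier; adjust the path as needed.
require_once __DIR__ . '/htmlpurifier/library/HTMLPurifier.auto.php';

// Hypothetical helper: purify untrusted HTML against a small whitelist.
function purify_html(string $dirty): string
{
    $config = HTMLPurifier_Config::createDefault();

    // Example whitelist: only these elements (and href on links) survive.
    $config->set('HTML.Allowed', 'p,b,strong,em,a[href]');

    // Disable the on-disk definition cache so the sketch runs without a writable cache folder.
    $config->set('Cache.DefinitionImpl', null);

    $purifier = new HTMLPurifier($config);

    return $purifier->purify($dirty);
}

$dirty = '<p onclick="evil()">Hello <b>world<script>alert(1)</script></p>';
echo purify_html($dirty), "\n";
```

In a real application the definition cache is usually left enabled, since building the configuration on every request is comparatively slow.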

To download click here.

One Response to “Purify & Filter HTML”

  1. Rory Says:

    Nice utilities

    Interesting that htmLawed is so much smaller than HTML Purifier but seems to do everything

