Revision: 5314
Updated Code
at February 28, 2008 05:19 by localhorst
Updated Code
function cleanHTML($html) { /// <summary> /// Removes all FONT and SPAN tags, and all Class and Style attributes. /// Designed to get rid of non-standard Microsoft Word HTML tags. /// </summary> // start by completely removing all unwanted tags $html = ereg_replace("<(/)?(font|span|del|ins)[^>]*>","",$html); // then run another pass over the html (twice), removing unwanted attributes $html = ereg_replace("<([^>]*)(class|lang|style|size|face)=(\"[^\"]*\"|'[^']*'|[^>]+)([^>]*)>","<\\1>",$html); $html = ereg_replace("<([^>]*)(class|lang|style|size|face)=(\"[^\"]*\"|'[^']*'|[^>]+)([^>]*)>","<\\1>",$html); return $html }
Revision: 5313
Initial Code
Initial URL
Initial Description
Initial Title
Initial Tags
Initial Language
at February 27, 2008 13:49 by localhorst
Initial Code
function cleanHTML($html) { /// <summary> /// Removes all FONT and SPAN tags, and all Class and Style attributes. /// Designed to get rid of non-standard Microsoft Word HTML tags. /// </summary> // start by completely removing all unwanted tags $html = ereg_replace("<(/)?(font|span|del|ins)[^>]*>","",$html); // then run another pass over the html (twice), removing unwanted attributes $html = ereg_replace("<([^>]*)(class|lang|style|size|face)=(\"[^\"]*\"|'[^']*'|[^>]+)([^>]*)>","<\\1>",$html); $html = ereg_replace("<([^>]*)(class|lang|style|size|face)=(\"[^\"]*\"|'[^']*'|[^>]+)([^>]*)>","<\\1>",$html); // sample word html <p class="aaa" style="background:dot">abc</p> will return <p > </p> }
Initial URL
http://tim.mackey.ie/CommentView,guid,2ece42de-a334-4fd0-8f94-53c6602d5718.aspx
Initial Description
The PHP Code appears in Post Comments
Initial Title
Clean Word HTML using Regular Expressions
Initial Tags
php
Initial Language
PHP