Posted by iroybot on 01/29/13

Tagged: xml, cdata, tidy


Fix CDATA blocks in XML files after reformatting with Tidy


Published in: PHP
 

For some reason, tidy inserts line breaks and indentation before and after <![CDATA[ ... ]]> sections when it reformats XML files. I like the benefits of reformatted, readable XML, so I run tidy first and then remove the whitespace before and after each CDATA block:

    # Command line, exec(), etc., or use PHP's tidy functions to tidy up your document
    tidy -indent -utf8 -xml -wrap 1000 input.xml > output.xml

    <?php
    /**
     * Replaces tidy's output:
     * <element>
     * <![CDATA[whatever content]]>
     * </element>
     *
     * With the compact form:
     * <element><![CDATA[whatever content]]></element>
     */
    // Remove whitespace (and only whitespace) between a tag and the CDATA opener.
    $out = preg_replace('~>\s+<!\[CDATA\[~', '><![CDATA[', file_get_contents("output.xml"));
    // Remove whitespace between the CDATA terminator and the next tag.
    $out = preg_replace('~\]\]>\s+<~', ']]><', $out);
    file_put_contents("final.xml", $out);
    ?>
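
To try the cleanup without touching any files, the two substitutions can be wrapped in a small function and run against an in-memory string. This is only a sketch: the function name `collapse_cdata_whitespace` and the sample input are made up for illustration, and it assumes that only whitespace (`\s+`) should be stripped around the CDATA markers.

```php
<?php
// Hypothetical helper (the name is not part of the original snippet):
// strips the whitespace that tidy adds around CDATA sections.
function collapse_cdata_whitespace(string $xml): string
{
    // Remove whitespace between a tag's closing '>' and the CDATA opener.
    $xml = preg_replace('~>\s+<!\[CDATA\[~', '><![CDATA[', $xml);
    // Remove whitespace between the CDATA terminator and the next tag.
    return preg_replace('~\]\]>\s+<~', ']]><', $xml);
}

// Sample input imitating tidy's indented output (made up for this example).
$sample = "<element>\n  <![CDATA[whatever content]]>\n</element>\n";
echo collapse_cdata_whitespace($sample);
```

Keep in mind that the second pattern removes whitespace between a CDATA section and any tag that follows it, not just the parent's closing tag, which may matter for mixed content.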
