RegEx to clean tiddlywiki files of google-navclient-hilite junk.


Published in: Regular Expression

TiddlyWiki files accumulate google-navclient-hilite junk created by the search highlighter in the Google Toolbar. Search the internet for "SPAN id google-navclient-hilite" and you will find wiki entries containing markup junk generated by the toolbar.

A regular-expression find-and-replace can clean it up, using the RegEx shown below.


  /*
  Replace this:
  <SPAN id=google-navclient-hilite style="COLOR: black; BACKGROUND-COLOR: cyan">Word</SPAN>
  with this:
  Word

  Notes:
  The chars '<' and '>' appear in the HTML code as &lt; and &gt;.
  The minimal (lazy) "one or more" operator in Visual Studio is '#', which is roughly equivalent to '+?' in other RegEx syntaxes.
  {...} is the tagged expression that \1 refers to in the replacement - again, Visual Studio syntax.

  Test vectors used (snipplr stripped the complete SPAN markup):
  <SPAN>file</SPAN> (expect file)
  <SPAN>file</SPAN> (expect file)
  <SPAN>file</SPAN> (expect file)
  <SPAN>aaa</SPAN> (expect aaa)
  <SPAN>file</SPAN><SPAN>file</SPAN> (expect filefile)
  <SPAN>file aaa</SPAN> (expect file aaa)

  (Visual Studio 2008 syntax)
  */

  find:
  &lt;SPAN id=google-navclient-hilite style=&quot;COLOR\:.#; BACKGROUND-COLOR\:.#&quot;&gt;{.#}&lt;\/SPAN&gt;

  replace with:
  \1
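
For readers without Visual Studio, here is a rough Python sketch of the same find-and-replace. It assumes the translation described in the notes ('#' becomes '+?', and the tagged expression {...} becomes a capture group); the names HILITE_ESCAPED and strip_hilite are made up for illustration and are not part of the original snippet. The sample string spells out the entity-escaped SPAN markup in full, which is an assumption about what the stripped test vectors looked like.

```python
import re

# Assumed translation of the Visual Studio 2008 pattern:
#   '#'     (lazy one-or-more)   -> '+?'
#   '{...}' (tagged expression)  -> '(...)'
# The file stores '<', '>' and '"' as HTML entities, so the pattern matches those.
HILITE_ESCAPED = re.compile(
    r'&lt;SPAN id=google-navclient-hilite '
    r'style=&quot;COLOR:.+?; BACKGROUND-COLOR:.+?&quot;&gt;'
    r'(.+?)&lt;/SPAN&gt;'
)


def strip_hilite(text: str) -> str:
    """Replace each highlighted SPAN with just the text it wraps."""
    return HILITE_ESCAPED.sub(r'\1', text)


# Hypothetical test vector: two adjacent highlighted words.
one_span = (
    '&lt;SPAN id=google-navclient-hilite '
    'style=&quot;COLOR: black; BACKGROUND-COLOR: cyan&quot;&gt;'
    'file&lt;/SPAN&gt;'
)
sample = one_span * 2

print(strip_hilite(sample))  # prints "filefile"
```

The lazy quantifiers matter for the adjacent-span case: a greedy '.+' would run from the first opening SPAN to the last closing one and swallow the markup in between.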
