Published in: Regular Expression
Wiki files accumulate google-navclient-hilite junk created by the search highlighter in the Google Toolbar. Search the internet for "SPAN id google-navclient-hilite" and you will find wiki entries containing markup junk generated by the toolbar.
A regular-expression find-and-replace in Visual Studio can clean it up, using the RegEx shown in Source, below.
/*
Replace this:
    <SPAN id=google-navclient-hilite style="COLOR: black; BACKGROUND-COLOR: cyan">Word</SPAN>
with this:
    Word

Notes:
    The chars '<' and '>' obviously appear in the html code as < and >
    The minimal or lazy search operator used in Visual Studio is '#', which is roughly equivalent to '+?' in other RegEx syntax.
    {...} marks the tagged expression, which the replacement refers to as \1 - again, Visual Studio syntax.

Test vectors used (snipplr stripped the complete SPAN markup):
    <SPAN>file</SPAN> (expect file)
    <SPAN>file</SPAN> (expect file)
    <SPAN>file</SPAN> (expect file)
    <SPAN>aaa</SPAN> (expect aaa)
    <SPAN>file</SPAN><SPAN>file</SPAN> (expect filefile)
    <SPAN>file aaa</SPAN> (expect file aaa)

(Visual Studio 2008 syntax)
*/
find: <SPAN id=google-navclient-hilite style="COLOR\:.#; BACKGROUND-COLOR\:.#">{.#}<\/SPAN>
replace with: \1
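For readers working outside Visual Studio, the same cleanup can be sketched in Python. This is a rough translation, not part of the original snippet: it assumes '#' maps to the lazy '+?' and '{...}' maps to a capture group '(...)', and the names HILITE_SPAN and strip_hilite are hypothetical.

```python
import re

# Assumed translation of the Visual Studio 2008 pattern above:
#   '#'     -> '+?'   (lazy one-or-more)
#   '{...}' -> '(...)' (capture group, referenced as \1)
# The '\:' and '\/' escapes are not needed in Python's syntax.
HILITE_SPAN = re.compile(
    r'<SPAN id=google-navclient-hilite '
    r'style="COLOR:.+?; BACKGROUND-COLOR:.+?">(.+?)</SPAN>'
)


def strip_hilite(text: str) -> str:
    """Replace each toolbar highlight SPAN with the text it wraps."""
    return HILITE_SPAN.sub(r"\1", text)


if __name__ == "__main__":
    sample = (
        '<SPAN id=google-navclient-hilite '
        'style="COLOR: black; BACKGROUND-COLOR: cyan">Word</SPAN>'
    )
    print(strip_hilite(sample))
```

The lazy quantifiers matter when two highlighted words sit side by side: a greedy '.+' would run from the first opening tag to the last closing tag and swallow the markup in between.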
