<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
<channel>
<title>Snipplr - noah</title>
<link>http://snipplr.com/users/noah/tags/analysis</link>
<description>Recent snippets posted on Snipplr.com</description>
<language>en-us</language>
<pubDate>Wed, 19 Jun 2013 09:39:46 GMT</pubDate>
<item>
<title>(Perl) Detect PHP files with trailing whitespace, using Perl</title>
<link>http://snipplr.com/view/46218/detect-php-files-with-trailing-whitespace-using-perl/</link>
<description><![CDATA[ <p>The following incantation returns a nonzero exit status when the terminating `?>` of a PHP file is followed by whitespace.</p> ]]></description>
<pubDate>Wed, 29 Dec 2010 08:25:03 GMT</pubDate>
<guid>http://snipplr.com/view/46218/detect-php-files-with-trailing-whitespace-using-perl/</guid>
</item>
<item>
<title>(Ruby) Analysis of Hudson JUnit logfiles</title>
<link>http://snipplr.com/view/44499/analysis-of-hudson-junit-logfiles/</link>
<description><![CDATA[ <p>Assumes you are standing in `$HUDSON_HOME/jobs/job_foo/builds/123`.</p> ]]></description>
<pubDate>Sat, 20 Nov 2010 17:15:44 GMT</pubDate>
<guid>http://snipplr.com/view/44499/analysis-of-hudson-junit-logfiles/</guid>
</item>
<item>
<title>(SVN) How to list all the file extension types in an SVN log dump</title>
<link>http://snipplr.com/view/28195/howto-list-all-the-file-extension-types-in-an-svn-log-dump/</link>
<description><![CDATA[ <p>On Windows, double-quote the string argument to `perl -ne` rather than single-quoting it. Otherwise this works on Windows (with Cygwin) as well.</p> ]]></description>
<pubDate>Thu, 11 Feb 2010 14:59:31 GMT</pubDate>
<guid>http://snipplr.com/view/28195/howto-list-all-the-file-extension-types-in-an-svn-log-dump/</guid>
</item>
<item>
<title>(Ruby) Rendered WGet with Selenium</title>
<link>http://snipplr.com/view/7906/rendered-wget-with-selenium/</link>
<description><![CDATA[ <p>Created in response to a discussion about "ghosting" between Kord Campbell of Splunk and Christian Heilmann of Yahoo! at Ajax World 2008.

IMPORTANT: The Selenium-RC server must be running on port 4444 (the default), and you must have curl and Tidy installed on your system.

NOTE: Diffing the rendered source against the "server" source works OK as a learning tool, but I need to do more to normalize the two. I run both the "server" and innerHTML sources through Tidy, but unfortunately there still seem to be a lot of extraneous differences between them.

So while this works OK for downloading the rendered source via a Ruby script, I've got a ways to go before it can produce a reliable "rendered diff."

keywords: rwget, rwdiff, ruby, selenium rc, selenium remote control, examples</p> ]]></description>
<pubDate>Mon, 18 Aug 2008 09:23:43 GMT</pubDate>
<guid>http://snipplr.com/view/7906/rendered-wget-with-selenium/</guid>
</item>
<item>
<title>(Bash) Scrape Google from the command line</title>
<link>http://snipplr.com/view/4299/scrape-google-from-the-command-line/</link>
<description><![CDATA[ <p>This code is a proof of concept (POC) only -- actually using it would violate Google's TOS, which forbids scraping. It is published here for educational value only.

Hypothetically, the following command should return a list of the top 500 or so hits in Google for onemorebug.com.

Each result will be prefixed with digits, followed by a dot and some whitespace (Lynx adds these).
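
If the numbering gets in the way, it can be stripped afterwards. Here is a minimal sketch in Python, assuming each line looks like `   1. http://example.com/`. The `strip_lynx_prefix` name and the sample lines are hypothetical and not part of the original snippet.

```python
import re

# Assumed prefix shape: optional leading whitespace, digits, a dot, whitespace.
_PREFIX = re.compile(r"^\s*\d+\.\s+")


def strip_lynx_prefix(line):
    """Remove a leading 'N. ' numbering prefix, if one is present."""
    return _PREFIX.sub("", line.rstrip("\n"))


if __name__ == "__main__":
    # Made-up sample lines in the assumed format.
    sample = ["   1. http://example.com/", "  12. http://example.com/about"]
    for line in sample:
        print(strip_lynx_prefix(line))
```

Lines without a numeric prefix pass through unchanged, so it should be safe to run over mixed output.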

_You must have Lynx and Wget installed on your system for this to work._

Keep in mind that *nix shells don't like it when you double-quote strings; see the comments.</p> ]]></description>
<pubDate>Sun, 09 Dec 2007 21:16:58 GMT</pubDate>
<guid>http://snipplr.com/view/4299/scrape-google-from-the-command-line/</guid>
</item>
</channel>
</rss>