URL: http://search.cpan.org/dist/String-Approx/Approx.pm
Fuzzy string matches with Jarkko Hietaniemi's String::Approx module.
Get approximate matches that are close to what you want. This is useful when filenames might contain misspellings, extra underscores, or other typos and mistakes. It also helps when searching for files in a project that uses several different naming conventions.
Mainly I am concerned with matching strings that have underscores inserted (or deleted) in arbitrary places. But the result I came up with here does a pretty good job of matching when there are all sorts of typos, without picking up too many false positives.
use String::Approx 'amatch';
use Test::More 'no_plan';

# Fuzzy-match the given string against "homer_simpson".
sub fuzm {
    my ($candidate) = @_;
    return scalar amatch("homer_simpson", [
        "i",    # match case-insensitively
        "10%",  # tolerate up to 1 character in 10 being wrong
        "S0",   # but no substituting one character for another
        "D1",   # although, tolerate up to one deletion
        "I2"    # and tolerate up to two insertions
    ], $candidate);
}

ok(fuzm("homer_simpson"), "exact match for 'homer_simpson'");
ok(fuzm("homersimpson"), "still matches without the underscore");
ok(fuzm("homers_impson"), "putting the underscore in a different place, still matches");
ok(fuzm("ho_mer_simpson"), "an extra underscore still matches");
ok(fuzm("ho_mer_simp_son"), "2 extra underscores still matches");
ok((not fuzm "ho_mersimp_son"), "2 underscores, both in the wrong places, doesn't match");
ok((not fuzm "ho_mer_sim_ps_on"), "3 extra underscores doesn't match");
ok((not fuzm "homer____simpson"), "3 extra underscores in a row doesn't match");
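For the filename-searching case, the same modifiers can be handed to amatch together with a whole list of candidates. In list context amatch returns the inputs that matched, so it works as a filter. This is only a sketch: the filenames below are made up for illustration, and which of the near-misses come back depends on how the module applies the modifiers, so no expected output is shown.

```perl
use strict;
use warnings;
use String::Approx 'amatch';

# Hypothetical filenames, invented for this illustration.
my @candidates = (
    'homer_simpson.txt',
    'homersimpson.txt',
    'ho_mer_simpson.txt',
    'marge_simpson.txt',
    'ned_flanders.txt',
);

# Same modifiers as in fuzm.
my @modifiers = ('i', '10%', 'S0', 'D1', 'I2');

# In list context, amatch returns the inputs that approximately
# contain the pattern.
my @hits = amatch('homer_simpson', \@modifiers, @candidates);

print "$_\n" for @hits;
```

Keeping the modifiers in one array makes it easy to loosen or tighten the match in a single place while trying it against real directory listings.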
