Posted By

dreadwarrior on 10/01/09


Tagged

sql telephone normalization


Versions (?)

Who likes this?

1 person have marked this snippet as a favorite

mgerdt


Normalize telephone numbers


 / Published in: PHP
 

This snippet is usable for telephone number normalization. It supports re-mapping of misused alphabetical characters (e.g. i for 1) and some other clean up stuff... Take a look.

Update, 2009-10-02: added function to create a corresponding MySQL REPLACE construct to normalize data which is already stored in database. I decided to use this approach because MySQL's regular expression engine is very slow. TODO: add the possibility to remove non digit characters at the end of the laaaarge REPLACE call.

  1. <?php
  2. $search_replace_mapping = array(
  3. // country prefix normalization
  4. '+00' => '+', '++' => '+',
  5. // country prefix is always 00
  6. '+' => '00',
  7. // funny user input goulash
  8. 'i' => '1', 'I' => '1', 'l' => '1',
  9. 'o' => '0', 'O' => '0',
  10.  
  11. // ([^\diIloO\+]*)
  12. // ...brackets
  13. '(' => '', ')' => '',
  14. '[' => '', ']' => '',
  15. '[' => '', ']' => '',
  16. // slashes
  17. '/' => '', '\\\\' => '',
  18. // dashes
  19. '-' => '', '_' => '',
  20. // whitespaces
  21. ' ' => ''
  22. );
  23.  
  24. function normalizeTelephoneNumber($telephone_number, $search_replace_mapping) {
  25. // fetch search and replace arrays
  26. $search = array_keys($search_replace_mapping);
  27. $replace = array_values($search_replace_mapping);
  28.  
  29. // simple string replacement
  30. $telephone_number = str_replace($search, $replace, $telephone_number);
  31.  
  32. // lets kick out all dutty stuff which is left...
  33. $telephone_number = preg_replace('~[^\d]~', '', $telephone_number);
  34.  
  35. return $telephone_number;
  36. }
  37.  
  38. function generateSqlReplaceStatement($telephone_number, $search_replace_mapping)
  39. {
  40. $s = '';
  41. $template = 'REPLACE(%s, \'%s\', \'%s\')';
  42.  
  43. $i = 0;
  44. foreach ($search_replace_mapping as $search => $replace)
  45. {
  46. $s = sprintf($template, ($i == 0 ? '\''. $telephone_number .'\'' : $s), $search, $replace);
  47. $i++;
  48. }
  49.  
  50. $s = 'SELECT '. $s .' AS normalized FROM Accommodation';
  51.  
  52. return $s;
  53. }
  54.  
  55. $numbers = array(
  56. '0049 03831 667 156',
  57. '+39 0471 / 975 642',
  58. '+0039 6757 - 3939 9393',
  59. '+49 (0) 3834 50 77 73',
  60. '+43 (i) 4m 4n idiOt',
  61. '+44 (0) 1234 \\ 55 55'
  62. );
  63.  
  64. print_r($numbers);
  65.  
  66. $numbers_clean = array();
  67. foreach ($numbers as $number)
  68. {
  69. $numbers_clean[] = normalizeTelephoneNumber($number, $search_replace_mapping);
  70. }
  71.  
  72. print_r($numbers_clean);
  73.  
  74. $sql = array();
  75. foreach ($numbers as $number)
  76. {
  77. $sql[$number] = generateSqlReplaceStatement($number, $search_replace_mapping);
  78. }
  79.  
  80. print_r($sql);
  81. ?>

Report this snippet  

You need to login to post a comment.