Posted by rambles on 05/27/11


Tagged: hash, sas, Summary


Summary datasets using hashes


Published in: SAS
 

URL: http://www2.sas.com/proceedings/sugi30/236-30.pdf

This snippet comes directly from Paul M. Dorfman's paper on programming with hash objects. The hash object is useful for summarising huge datasets that are neither sorted nor indexed by the variable(s) being summarised; it can often be quicker than proc summary and is generally less resource intensive. The data step keeps one running total per key combination in the hash table, so a single pass over the unsorted input is enough.

** Dummy data: keys k1 (numeric) and k2 (character), written in descending key order ;
data input ;
  do k1 = 1e6 to 1 by -1 ;
    k2 = put (k1, z7.) ;
    ** Write between 1 and 6 rows per key ;
    do num = 1 to ceil (ranuni(1) * 6) ;
      output ;
    end ;
  end ;
run ;

** Standard approach using Proc Summary ;
proc summary data = input nway ;
  class k1 k2 ;
  var num ;
  output out = summ_sum (drop = _:) sum = sum ;
run ;

** Alternative using the hash object ;
data _null_ ;
  ** Never executes: it only puts k1, k2 and num in the PDV at compile time ;
  if 0 then set input ;

  dcl hash hh (hashexp:16) ;
  hh.definekey ('k1', 'k2' ) ;
  hh.definedata ('k1', 'k2', 'sum') ;
  hh.definedone () ;
  do until (eof) ;
    set input end = eof ;
    ** New key: start its total at zero. Otherwise find() loads the running total into sum ;
    if hh.find () ne 0 then sum = 0 ;
    ** Sum statement, equivalent to sum + num ;
    sum ++ num ;
    ** Store the updated total back in the hash table ;
    hh.replace () ;
  end ;
  ** Write one row per key. Row order is not guaranteed without the ordered: argument ;
  rc = hh.output (dataset: 'hash_sum') ;
run ;
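
To check that the two approaches agree, something along these lines should work. It is only a sketch and not part of the original paper: it assumes both steps above have already run, so that summ_sum and hash_sum exist in the WORK library with the variables k1, k2 and sum. The hash output is sorted first because its row order is not guaranteed.

```sas
** Put the hash output in key order so it lines up with the Proc Summary output ;
proc sort data = hash_sum ;
  by k1 k2 ;
run ;

** Match rows on the keys and compare the totals. No differences are expected ;
proc compare base = summ_sum compare = hash_sum ;
  id k1 k2 ;
run ;
```

Declaring the hash with the ordered: 'a' argument, as in dcl hash hh (hashexp:16, ordered: 'a'), should make hh.output write the rows in ascending key order, which would make the sort step unnecessary.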
