Word Frequency Count


Published in: SAS



/* test data set: each comment may span several lines and ends with # */
data comments;
  length obs 8 comment $1000;
  length p 8 c $1;
  drop p c;
  input obs @@;
  /* skip the blanks between the obs number and the comment text */
  do while (c='');
    /* $char1., not $1.: $1. would read a lone "." as a missing value (blank) */
    input c $char1. @@;
  end;
  /* read one char at a time until the # terminator */
  p = 1;
  substr(comment, p, 1) = c;
  do until (c='#');
    p + 1;
    input c $char1. @@;
    substr(comment, p, 1) = c;
  end;
  substr(comment, p, 1) = " "; /* get the # out */
  /* release the input line */
  input;
cards;
1 SAS defines analytics as data-driven insight for
better decisions. With SAS Analytics you get an integrated environment
for predictive analytics and descriptive modeling, data mining, text
mining, forecasting, optimization, simulation, experimental design and
more.#
2 Our analytic solutions provide a range of
techniques and processes for the collection, classification, analysis
and interpretation of data to reveal patterns, anomalies, key
variables and relationships, leading ultimately to new insights for
guided decision making.#
3 We offer a comprehensive suite of analytics
software#
4 SAS offers an integrated suite of analytics
software unmatched in the industry, and delivered to you in a single
environment.#
;
run;

/* parse each word into an obs */
data words;
  length obs no 8 word $16; /* words longer than 16 chars will be truncated */
  keep obs no word;
  set comments;
  no = 0;
  do while(1);
    no + 1;
    /* blank, period, comma, ! and ? are the word delimiters */
    word = upcase(scan(comment, no, " .,!?"));
    if word="" then leave; /* no more words in this comment */
    output;
  end;
run;

/* count how often each word occurs */
proc freq data=words;
  tables word / out=counts;
run;

/* print the counts, three words per line */
data test;
  set counts;
  file print;
  if word>='A'; /* drop tokens that sort before 'A', such as numbers */
  n + 1;
  drop n;
  if mod(n,3)=1 then put; /* changed to 3 to narrow */
  /* $16. matches the length of word, so long words are not cut off; */
  /* the trailing @ holds the line so the next word lands on the same row */
  put word $16. count 5. +3 @;
run;
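The report above lists the words alphabetically. To see the most frequent words first, a minimal sketch along these lines may help. It assumes the counts data set produced by the PROC FREQ step above (with its WORD, COUNT and PERCENT variables); the data set name counts_sorted and the cutoff of 10 rows are illustrative choices, not part of the original program.

```sas
/* sketch: rank words by frequency; counts_sorted is a hypothetical name */
proc sort data=counts out=counts_sorted;
  by descending count word; /* ties are broken alphabetically */
run;

/* show the ten most frequent words (the cutoff is arbitrary) */
proc print data=counts_sorted(obs=10) noobs;
  var word count percent;
run;
```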

URL: http://jaredprins.squarespace.com/blog/2008/3/31/sas-program-for-word-frequency-count.html
