Posted By

willcodeforfood on 12/16/07


Tagged

cli awk


AWK more of it



URL: http://snippets.dzone.com/posts/show/4893


  1. MORE AWK
  2.  
  3. HANDY ONE-LINERS FOR AWK 22 July 2003
  4. compiled by Eric Pement <[email protected]> version 0.22
  5. Latest version of this file is usually at:
  6. http://www.student.northpark.edu/pemente/awk/awk1line.txt
  7.  
  8.  
  9. USAGE:
  10.  
  11. Unix: awk '/pattern/ {print "$1"}' # standard Unix shells
  12. DOS/Win: awk '/pattern/ {print "$1"}' # okay for DJGPP compiled
  13. awk "/pattern/ {print \"$1\"}" # required for Mingw32
  14.  
  15. Most of my experience comes from a version of GNU awk (gawk) compiled for
  16. Win32. Note in particular that DJGPP compilations permit the awk script
  17. to follow Unix quoting syntax '/like/ {"this"}'. However, the user must
  18. know that single quotes under DOS/Windows do not protect the redirection
  19. arrows (<, >) nor do they protect pipes (|). Both are special symbols
  20. for the DOS/CMD command shell and their special meaning is ignored only
  21. if they are placed within "double quotes." Likewise, DOS/Win users must
  22. remember that the percent sign (%) is used to mark DOS/Win environment
  23. variables, so it must be doubled (%%) to yield a single percent sign
  24. visible to awk.
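
A note on the USAGE lines above: because $1 sits inside awk double quotes there, awk prints the literal text $1 rather than the first field. The following is a minimal sketch of the difference, assuming a POSIX shell and any POSIX awk; the sample input and variable names are made up for illustration.

```shell
# Assumption: POSIX sh and a POSIX awk; the sample data is hypothetical.
sample='alpha beta
gamma delta'

# Inside shell single quotes the shell leaves $1 alone, so awk treats it
# as a reference to field 1.
first_fields=$(printf '%s\n' "$sample" | awk '{print $1}')

# Wrapping $1 in awk double quotes turns it into a string literal.
literal=$(printf '%s\n' "$sample" | awk '{print "$1"}')

printf '%s\n' "$first_fields"
printf '%s\n' "$literal"
```

Under Mingw32 or CMD quoting the same distinction applies, only the outer quoting changes.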
  25.  
  26. If I am sure that a script will NOT need to be quoted in Unix, DOS, or
  27. CMD, then I normally omit the quote marks. If an example is peculiar to
  28. GNU awk, the command 'gawk' will be used. Please notify me if you find
  29. errors or new commands to add to this list (total length under 65
  30. characters). I usually try to put the shortest script first.
  31.  
  32. FILE SPACING:
  33.  
  34. # double space a file
  35. awk '1;{print ""}'
  36. awk 'BEGIN{ORS="\n\n"};1'
  37.  
  38. # double space a file which already has blank lines in it. Output file
  39. # should contain no more than one blank line between lines of text.
  40. # NOTE: On Unix systems, DOS lines which have only CRLF (\r\n) are
  42. # often treated as non-blank, and thus 'NF' alone will return TRUE.
  43. awk 'NF{print $0 "\n"}'
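
One way to sidestep the CRLF caveat is to strip a trailing carriage return before NF is tested. This is only a sketch: the function name double_space_crlf is hypothetical, and it assumes an awk that understands \r in a regular expression (gawk, mawk, and the one-true-awk do).

```shell
# Hypothetical helper: double space DOS-style input on a Unix system.
double_space_crlf() {
    # Removing the trailing \r re-splits the record, so a line that held
    # only CRLF now has NF == 0 and is skipped like any other blank line.
    awk '{sub(/\r$/, "")} NF {print $0 "\n"}'
}

printf 'one\r\n\r\ntwo\r\n' | double_space_crlf
```

The output uses plain LF line endings, so this also converts the file to Unix format as a side effect.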
  44.  
  45. # triple space a file
  46. awk '1;{print "\n"}'
  47.  
  48. NUMBERING AND CALCULATIONS:
  49.  
  50. # precede each line by its line number FOR THAT FILE (left alignment).
  51. # Using a tab (\t) instead of space will preserve margins.
  52. awk '{print FNR "\t" $0}' files*
  53.  
  54. # precede each line by its line number FOR ALL FILES TOGETHER, with tab.
  55. awk '{print NR "\t" $0}' files*
  56.  
  57. # number each line of a file (number on left, right-aligned)
  58. # Double the percent signs if typing from the DOS command prompt.
  59. awk '{printf("%5d : %s\n", NR,$0)}'
  60.  
  61. # number each line of file, but only print numbers if line is not blank
  62. # Remember caveats about Unix treatment of \r (mentioned above)
  63. awk 'NF{$0=++a " :" $0};{print}'
  64. awk '{print (NF? ++a " :" :"") $0}'
  65.  
  66. # count lines (emulates "wc -l")
  67. awk 'END{print NR}'
  68.  
  69. # print the sums of the fields of every line
  70. awk '{s=0; for (i=1; i<=NF; i++) s=s+$i; print s}'
  71.  
  72. # add all fields in all lines and print the sum
  73. awk '{for (i=1; i<=NF; i++) s=s+$i}; END{print s}'
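
To make the difference between the two summing scripts concrete, here is a small worked example on made-up data. The variable names are hypothetical, and s+0 is used in END only so that empty input prints 0 instead of a blank line.

```shell
# Hypothetical sample data: two rows of three numbers.
data='1 2 3
4 5 6'

# Per-line sums: s is reset for every record.
row_sums=$(printf '%s\n' "$data" |
    awk '{s=0; for (i=1; i<=NF; i++) s+=$i; print s}')

# Grand total: s accumulates across all records and is printed once.
grand_total=$(printf '%s\n' "$data" |
    awk '{for (i=1; i<=NF; i++) s+=$i} END {print s+0}')

printf '%s\n' "$row_sums"      # 6 and 15, one per line
printf '%s\n' "$grand_total"   # 21
```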
  74.  
  75. # print every line after replacing each field with its absolute value
  76. awk '{for (i=1; i<=NF; i++) if ($i < 0) $i = -$i; print }'
  77. awk '{for (i=1; i<=NF; i++) $i = ($i < 0) ? -$i : $i; print }'
  78.  
  79. # print the total number of fields ("words") in all lines
  80. awk '{ total = total + NF }; END {print total}' file
  81.  
  82. # print the total number of lines that contain "Beth"
  83. awk '/Beth/{n++}; END {print n+0}' file
  84.  
  85. # print the largest first field and the line that contains it
  86. # Compares field #1 by value (numeric or string order), not by length
  87. awk '$1 > max {max=$1; maxline=$0}; END{ print max, maxline}'
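
If the goal really is the longest string in field #1, the comparison has to be on length($1) rather than on $1 itself. A possible sketch follows; the function name longest_first_field is an assumption, not part of the original list.

```shell
# Hypothetical helper: report the length of the longest first field and
# the line it came from. On ties the earliest line wins.
longest_first_field() {
    awk 'length($1) > max {max = length($1); maxline = $0}
         END {print max, maxline}'
}

printf 'pear 1\nwatermelon 2\nzoo 3\n' | longest_first_field
```

With this input a plain `$1 > max` test would select the "zoo" line, since "zoo" sorts last as a string, whereas the length-based test selects "watermelon".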
  88.  
  89. # print the number of fields in each line, followed by the line
  90. awk '{ print NF ":" $0 } '
  91.  
  92. # print the last field of each line
  93. awk '{ print $NF }'
  94.  
  95. # print the last field of the last line
  96. awk '{ field = $NF }; END{ print field }'
  97.  
  98. # print every line with more than 4 fields
  99. awk 'NF > 4'
  100.  
  101. # print every line where the value of the last field is > 4
  102. awk '$NF > 4'
  103.  
  104.  
  105. TEXT CONVERSION AND SUBSTITUTION:
  106.  
  107. # IN UNIX ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
  108. awk '{sub(/\r$/,"");print}' # assumes EACH line ends with Ctrl-M
  109.  
  110. # IN UNIX ENVIRONMENT: convert Unix newlines (LF) to DOS format
  111. awk '{sub(/$/,"\r");print}'
  112.  
  113. # IN DOS ENVIRONMENT: convert Unix newlines (LF) to DOS format
  114. awk 1
  115.  
  116. # IN DOS ENVIRONMENT: convert DOS newlines (CR/LF) to Unix format
  117. # Cannot be done with DOS versions of awk, other than gawk:
  118. gawk -v BINMODE="w" '1' infile >outfile
  119.  
  120. # Use "tr" instead.
  121. tr -d \r <infile >outfile # GNU tr version 1.22 or higher
  122.  
  123. # delete leading whitespace (spaces, tabs) from front of each line
  124. # aligns all text flush left
  125. awk '{sub(/^[ \t]+/, ""); print}'
  126.  
  127. # delete trailing whitespace (spaces, tabs) from end of each line
  128. awk '{sub(/[ \t]+$/, "");print}'
  129.  
  130. # delete BOTH leading and trailing whitespace from each line
  131. awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'
  132. awk '{$1=$1;print}' # also removes extra space between fields
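
The two trimming scripts above behave differently on interior whitespace. A short illustration on a made-up line (variable names are hypothetical):

```shell
# Hypothetical sample line with leading, interior and trailing spaces.
line='   foo    bar   '

# gsub only touches the ends; the run of spaces between fields survives.
trimmed=$(printf '%s\n' "$line" |
    awk '{gsub(/^[ \t]+|[ \t]+$/, ""); print}')

# Assigning to a field makes awk rebuild $0 with single-space separators.
squeezed=$(printf '%s\n' "$line" | awk '{$1=$1; print}')

printf '[%s]\n' "$trimmed"    # [foo    bar]
printf '[%s]\n' "$squeezed"   # [foo bar]
```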
  133.  
  134. # insert 5 blank spaces at beginning of each line (make page offset)
  135. awk '{sub(/^/, "     ");print}'
  136.  
  137. # align all text flush right on a 79-column width
  138. awk '{printf "%79s\n", $0}' file*
  139.  
  140. # center all text on a 79-character width
  141. awk '{l=length();s=int((79-l)/2); printf "%"(s+l)"s\n",$0}' file*
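
The centering script hard-codes a width of 79. A sketch of the same idea with the width passed in through -v is shown below; the function name center and the clamp for over-long lines are additions for illustration, not part of the original.

```shell
# Hypothetical helper: center each input line within a given width.
center() {
    awk -v w="$1" '{
        l = length($0)
        s = int((w - l) / 2)       # left padding needed
        if (s < 0) s = 0           # lines wider than w are left as-is
        fmt = "%" (s + l) "s\n"    # right-justify in a field of s+l
        printf fmt, $0
    }'
}

printf 'abc\n' | center 9
```

For a 3-character line and a width of 9 this pads with 3 spaces on the left.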
  142.  
  143. # substitute (find and replace) "foo" with "bar" on each line
  144. awk '{sub(/foo/,"bar");print}' # replaces only 1st instance
  145. gawk '{$0=gensub(/foo/,"bar",4);print}' # replaces only 4th instance
  146. awk '{gsub(/foo/,"bar");print}' # replaces ALL instances in a line
  147.  
  148. # substitute "foo" with "bar" ONLY for lines which contain "baz"
  149. awk '/baz/{gsub(/foo/, "bar")};{print}'
  150.  
  151. # substitute "foo" with "bar" EXCEPT for lines which contain "baz"
  152. awk '!/baz/{gsub(/foo/, "bar")};{print}'
  153.  
  154. # change "scarlet" or "ruby" or "puce" to "red"
  155. awk '{gsub(/scarlet|ruby|puce/, "red"); print}'
  156.  
  157. # reverse order of lines (emulates "tac")
  158. awk '{a[i++]=$0} END {for (j=i-1; j>=0;) print a[j--] }' file*
  159.  
  160. # if a line ends with a backslash, append the next line to it
  161. # (fails if there are multiple lines ending with backslash...)
  162. awk '/\\$/ {sub(/\\$/,""); getline t; print $0 t; next}; 1' file*
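
The limitation noted above (several consecutive lines ending with a backslash) can be worked around with a loop, at the cost of exceeding the 65-character budget. A possible sketch, where the function name join_continuations and the variable nxt are hypothetical:

```shell
# Hypothetical helper: join any run of backslash-continued lines.
join_continuations() {
    awk '{
        line = $0
        # Keep pulling lines while the accumulated text ends in a backslash.
        while (line ~ /\\$/ && (getline nxt) > 0) {
            sub(/\\$/, "", line)
            line = line nxt
        }
        print line
    }'
}

printf 'a \\\nb \\\nc\nd\n' | join_continuations
```

If the final line of input ends with a backslash there is nothing to append, so it is printed unchanged.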
  163.  
  164. # print and sort the login names of all users
  165. awk -F ":" '{ print $1 | "sort" }' /etc/passwd
  166.  
  167. # print the first 2 fields, in opposite order, of every line
  168. awk '{print $2, $1}' file
  169.  
  170. # switch the first 2 fields of every line
  171. awk '{temp = $1; $1 = $2; $2 = temp; print}' file
  172.  
  173. # print every line, deleting the second field of that line
  174. awk '{ $2 = ""; print }'
  175.  
  176. # print in reverse order the fields of every line
  177. awk '{for (i=NF; i>0; i--) printf("%s ",$i);printf ("\n")}' file
  178.  
  179. # remove duplicate, consecutive lines (emulates "uniq")
  180. awk 'NR==1 || $0 != a; {a = $0 ""}'
  181.  
  182. # remove duplicate, nonconsecutive lines
  183. awk '! a[$0]++' # most concise script
  184. awk '!($0 in a) {a[$0];print}' # most efficient script
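
To show how consecutive and nonconsecutive duplicate removal differ, here is a small example on made-up input. The variable names are hypothetical; the adjacent-duplicate test compares whole lines as strings (a regex match against $0 can misfire on lines containing metacharacters).

```shell
# Hypothetical sample input with adjacent and non-adjacent repeats.
input='a
b
a
a
c
b'

# Like "uniq": drop a line only when it equals the line just before it.
# Appending "" forces a string comparison even for numeric-looking lines.
adjacent=$(printf '%s\n' "$input" |
    awk 'NR == 1 || $0 != prev; {prev = $0 ""}')

# Drop every later repeat, keeping first occurrences in original order.
global=$(printf '%s\n' "$input" | awk '!seen[$0]++')

printf '%s\n' "$adjacent"   # a b a c b, one per line
printf '%s\n' "$global"     # a b c, one per line
```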
  185.  
  186. # concatenate every 5 lines of input, using a comma separator
  187. # between fields
  188. awk 'ORS=NR%5?",":"\n"' file
  189.  
  190.  
  191.  
  192. SELECTIVE PRINTING OF CERTAIN LINES:
  193.  
  194. # print first 10 lines of file (emulates behavior of "head")
  195. awk 'NR < 11'
  196.  
  197. # print first line of file (emulates "head -1")
  198. awk 'NR>1{exit};1'
  199.  
  200. # print the last 2 lines of a file (emulates "tail -2")
  201. awk '{y=x "\n" $0; x=$0};END{print y}'
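
The "tail -2" script keeps exactly two lines in variables. For an arbitrary count, one option is a small ring buffer indexed by NR modulo n. This is a sketch under the assumption that n is a positive integer; the function name tail_n is hypothetical.

```shell
# Hypothetical helper: print the last n lines of standard input.
tail_n() {
    awk -v n="$1" '
        { buf[NR % n] = $0 }    # ring buffer holding the last n lines
        END {
            start = (NR > n) ? NR - n + 1 : 1
            for (i = start; i <= NR; i++) print buf[i % n]
        }'
}

printf '1\n2\n3\n4\n5\n' | tail_n 2
```

Memory use stays proportional to n rather than to the size of the file.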
  202.  
  203. # print the last line of a file (emulates "tail -1")
  204. awk 'END{print}'
  205.  
  206. # print only lines which match regular expression (emulates "grep")
  207. awk '/regex/'
  208.  
  209. # print only lines which do NOT match regex (emulates "grep -v")
  210. awk '!/regex/'
  211.  
  212. # print the line immediately before a regex, but not the line
  213. # containing the regex
  214. awk '/regex/{print x};{x=$0}'
  215. awk '/regex/{print (x=="" ? "match on line 1" : x)};{x=$0}'
  216.  
  217. # print the line immediately after a regex, but not the line
  218. # containing the regex
  219. awk '/regex/{getline;print}'
  220.  
  221. # grep for AAA and BBB and CCC (in any order)
  222. awk '/AAA/ && /BBB/ && /CCC/'
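
A brief sketch contrasting patterns joined with && against patterns separated by semicolons, using made-up input. Semicolon-separated patterns are independent rules, so a line is printed once per rule it matches, which gives OR-like output with repeats rather than AND.

```shell
# Hypothetical sample input.
lines='AAA only
CCC then BBB then AAA
BBB and CCC'

# AND: a line must match all three patterns, in any order.
all_three=$(printf '%s\n' "$lines" | awk '/AAA/ && /BBB/ && /CCC/')

# Three separate rules: each matching rule prints the line again.
any_each=$(printf '%s\n' "$lines" | awk '/AAA/; /BBB/; /CCC/')

printf '%s\n' "$all_three"
printf '%s\n' "$any_each"
```

Here the && form prints only the second line, while the semicolon form prints six lines in total (1 + 3 + 2).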
  223.  
  224. # grep for AAA and BBB and CCC (in that order)
  225. awk '/AAA.*BBB.*CCC/'
  226.  
  227. # print only lines of 65 characters or longer
  228. awk 'length > 64'
  229.  
  230. # print only lines of less than 65 characters
  231. awk 'length < 65'
  232.  
  233. # print section of file from regular expression to end of file
  234. awk '/regex/,0'
  235. awk '/regex/,EOF'
  236.  
  237. # print section of file based on line numbers (lines 8-12, inclusive)
  238. awk 'NR==8,NR==12'
  239.  
  240. # print line number 52
  241. awk 'NR==52'
  242. awk 'NR==52 {print;exit}' # more efficient on large files
  243.  
  244. # print section of file between two regular expressions (inclusive)
  245. awk '/Iowa/,/Montana/' # case sensitive
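
Range patterns of the form start,end appear in several scripts in this section. A small demonstration on made-up data (variable names are hypothetical): the range opens on a line matching the first pattern and closes on the next line matching the second, and an end condition of 0 is never true, so the range runs to end of file.

```shell
# Hypothetical sample input.
states='Idaho
Iowa
Kansas
Montana
Nevada'

# Inclusive range between two regular expressions.
between=$(printf '%s\n' "$states" | awk '/Iowa/,/Montana/')

# From the first match to end of input.
to_end=$(printf '%s\n' "$states" | awk '/Kansas/,0')

printf '%s\n' "$between"   # Iowa, Kansas, Montana
printf '%s\n' "$to_end"    # Kansas, Montana, Nevada
```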
  246.  
  247.  
  248. SELECTIVE DELETION OF CERTAIN LINES:
  249.  
  250. # delete ALL blank lines from a file (same as "grep '.' ")
  251. awk NF
  252. awk '/./'
  253.  
  254.  
  255. CREDITS AND THANKS:
  256.  
  257. Special thanks to Peter S. Tillier for helping me with the first release
  258. of this FAQ file.
  259.  
  260. For additional syntax instructions, including the way to apply editing
  261. commands from a disk file instead of the command line, consult:
  262.  
  263. "sed & awk, 2nd Edition," by Dale Dougherty and Arnold Robbins
  264. O'Reilly, 1997
  265. "UNIX Text Processing," by Dale Dougherty and Tim O'Reilly
  266. Hayden Books, 1987
  267. "Effective awk Programming, 3rd Edition." by Arnold Robbins
  268. O'Reilly, 2001
  269.  
  270. To fully exploit the power of awk, one must understand "regular
  271. expressions." For detailed discussion of regular expressions, see
  272. "Mastering Regular Expressions, 2d edition" by Jeffrey Friedl
  273. (O'Reilly, 2002).
  274.  
  275. The manual ("man") pages on Unix systems may be helpful (try "man awk",
  276. "man nawk", "man regexp", or the section on regular expressions in "man
  277. ed"), but man pages are notoriously difficult. They are not written to
  278. teach awk use or regexps to first-time users, but as a reference text
  279. for those already acquainted with these tools.
  280.  
  281. USE OF '\t' IN awk SCRIPTS: For clarity in documentation, we have used
  282. the expression '\t' to indicate a tab character (0x09) in the scripts.
  283. All versions of awk, even the UNIX System 7 version should recognize
  284. the '\t' abbreviation.
  285.  
  286. #---end of file---
