AWK
AWK is a small programming language designed for text processing. Its name is derived from the surnames of its authors: Alfred Aho, Peter Weinberger, and Brian Kernighan. The language is standardized and widely available on Unix-like systems.
Installation
On Arch, the awk(1p) command is provided by gawk, which is installed by default and offers native Unicode support and many extra features.
Alternative implementations
As with many other core utilities, several more-or-less compliant implementations are available:
- BusyBox — The BusyBox implementation is slower but has a smaller footprint, which suits memory-constrained environments.
- GoAWK — An AWK implementation written in Go.
- nawk — The "new" AWK as described in The AWK Programming Language, also known as BWK AWK or the One-True-AWK. It is now co-maintained by Arnold Robbins and B. W. Kernighan and features UTF-8 and CSV support.
- mawk — A rather performant AWK implementation.
Troubleshooting
Assigning ARGC via the -v option does not persist at runtime
Although this is undocumented, many implementations appear to reset the ARGC variable internally after processing the variable assignments given with -v options on the command line. Therefore, to get the desired value of ARGC at runtime (e.g. in BEGIN blocks), the variable has to be set directly in a code block:
BEGIN { ARGC=1; ... }
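The difference can be checked with a short shell session. This is only a sketch: the value 5 and the variable names from_option and from_begin are arbitrary choices for illustration, and the result of the first command depends on the implementation in use, so no output is claimed for it.

```shell
# Assign ARGC with -v; many implementations overwrite this value
# before the BEGIN block runs, so the printed number may not be 5.
from_option=$(awk -v ARGC=5 'BEGIN { print ARGC }')

# Assign ARGC inside the BEGIN block; the assignment takes effect
# immediately, so the value printed afterwards is the one just set.
from_begin=$(awk 'BEGIN { ARGC = 5; print ARGC }')

printf 'ARGC set via -v:      %s\n' "$from_option"
printf 'ARGC set in BEGIN:    %s\n' "$from_begin"
```

If the first line reports a value other than 5, the implementation resets ARGC after handling -v, and the in-block assignment is the one to rely on.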
See also
- nawk(1): a refcard-style man page of nawk
- The GNU AWK manual: both a comprehensive tutorial and a canonical reference text on gawk(1)
- Alpine Linux's community wiki article on AWK: has some notes on implementation differences between BusyBox and other AWKs
- AWK tech notes: language gotchas and design differences compared to other modern programming languages
- Idiomatic AWK: how concise an AWK program can be