GoAWK Versions

A POSIX-compliant AWK interpreter written in Go, with CSV support

v1.19.0

1 year ago

Notable changes in this release:

In other news, check out awk-demo, an amazing "old skool demo" written in AWK by @patsie75. It now works under GoAWK, at least on Linux. Clone that repo and run it with awk=goawk ./demo.sh!

Thanks to @ko1nksm for several bug reports.

See full list of commits since v1.18.0.

v1.18.0

2 years ago

Relatively minor release with the following changes:

See the list of commits.

v1.17.0

2 years ago

Now with proper CSV input and output support! Here is a simple example showing CSV input parsing and the new @"named-field" syntax:

$ goawk -i csv -H '{ print @"Abbreviation" }' testdata/csv/states.csv
AL
AK
AZ
...
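
To make the idea behind -H and @"named-field" concrete, here is a rough Go sketch of a header-based field lookup using only the standard library. It is not GoAWK's implementation: the function name namedField and the sample data are made up for illustration.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// namedField (a hypothetical helper) reads CSV text whose first row is a
// header and returns the values of the column called name, one per data row.
func namedField(data, name string) ([]string, error) {
	rows, err := csv.NewReader(strings.NewReader(data)).ReadAll()
	if err != nil {
		return nil, err
	}
	if len(rows) == 0 {
		return nil, fmt.Errorf("no header row")
	}
	// Find the column index whose header matches the requested name.
	col := -1
	for i, h := range rows[0] {
		if h == name {
			col = i
			break
		}
	}
	if col < 0 {
		return nil, fmt.Errorf("no field named %q", name)
	}
	// encoding/csv rejects rows with a different field count than the
	// first row by default, so row[col] is always in range here.
	var out []string
	for _, row := range rows[1:] {
		out = append(out, row[col])
	}
	return out, nil
}

func main() {
	// Illustrative sample data, not the contents of testdata/csv/states.csv.
	data := "State,Abbreviation\nAlabama,AL\nAlaska,AK\nArizona,AZ\n"
	vals, err := namedField(data, "Abbreviation")
	if err != nil {
		panic(err)
	}
	for _, v := range vals {
		fmt.Println(v)
	}
}
```

Looking the column up by name rather than by position means the script keeps working if the columns are reordered.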

Read the full documentation.

This feature was sponsored by the library of the University of Antwerp -- many thanks!

v1.16.0

2 years ago

v1.15.0

2 years ago

This release adds no new features. It's a significant performance improvement due to switching the internals of the interpreter from a tree-walking interpreter to a bytecode compiler with a virtual machine interpreter.

Results show that it's 18% faster overall on microbenchmarks, 13% on more real-world benchmarks. It should be fully backwards compatible -- please file an issue if you find a regression!
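
For readers unfamiliar with the two designs, the following toy Go sketch contrasts them on a tiny arithmetic language. All the types and opcodes here are invented for illustration and bear no relation to GoAWK's actual internals.

```go
package main

import "fmt"

// Node is a tiny expression tree: a number literal or a binary operation.
type Node struct {
	Op          byte // 0 for a literal, otherwise '+' or '*'
	Value       float64
	Left, Right *Node
}

// evalTree is the tree-walking approach: it recurses over the nodes
// every time the expression is evaluated.
func evalTree(n *Node) float64 {
	switch n.Op {
	case '+':
		return evalTree(n.Left) + evalTree(n.Right)
	case '*':
		return evalTree(n.Left) * evalTree(n.Right)
	default:
		return n.Value
	}
}

// Opcodes for a small stack machine (hypothetical).
const (
	opPush = iota // followed by an index into Program.Consts
	opAdd
	opMul
)

// Program is the compiled form: flat code plus a constant table.
type Program struct {
	Code   []int
	Consts []float64
}

// compile flattens the tree into postfix bytecode once, up front.
func compile(n *Node, p *Program) {
	if n.Op == 0 {
		p.Consts = append(p.Consts, n.Value)
		p.Code = append(p.Code, opPush, len(p.Consts)-1)
		return
	}
	compile(n.Left, p)
	compile(n.Right, p)
	if n.Op == '+' {
		p.Code = append(p.Code, opAdd)
	} else {
		p.Code = append(p.Code, opMul)
	}
}

// run executes the bytecode in a flat loop using an operand stack.
func run(p *Program) float64 {
	var stack []float64
	for pc := 0; pc < len(p.Code); pc++ {
		switch p.Code[pc] {
		case opPush:
			pc++ // the next code word is the constant index
			stack = append(stack, p.Consts[p.Code[pc]])
		case opAdd:
			n := len(stack)
			stack = append(stack[:n-2], stack[n-2]+stack[n-1])
		case opMul:
			n := len(stack)
			stack = append(stack[:n-2], stack[n-2]*stack[n-1])
		}
	}
	return stack[len(stack)-1]
}

func main() {
	// (1 + 2) * 3
	expr := &Node{Op: '*',
		Left:  &Node{Op: '+', Left: &Node{Value: 1}, Right: &Node{Value: 2}},
		Right: &Node{Value: 3}}
	var p Program
	compile(expr, &p)
	fmt.Println(evalTree(expr), run(&p))
}
```

Both paths compute the same value. The compiled form replaces recursive calls and pointer chasing with a single loop over a flat slice, which is the usual source of the speedup in this kind of rewrite.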

Read the details here.

v1.14.0

2 years ago

This reverts the feature from v1.11.0 that changed the builtin functions length, substr, index, and match to use character indexes instead of byte indexes (as per the POSIX spec). The reason is that it changed those functions from O(1) to O(N), which created "accidentally quadratic" behavior in scripts that expected these functions to be O(1).

For example, @xonixx's gron.awk script on a relatively large JSON input file took about 1s in bytes mode (goawk -b), but 8 minutes (!) in the new unicode char default mode. That's extremely problematic.
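
The cost difference is easy to see in a small Go sketch. The helpers substrBytes and substrChars are hypothetical names, not GoAWK functions, and both assume start >= 1 and n >= 0.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// substrBytes returns n bytes of s starting at 1-based byte index start.
// Slicing by byte offset is constant time. It assumes the range lies
// within the string.
func substrBytes(s string, start, n int) string {
	return s[start-1 : start-1+n]
}

// substrChars returns up to n characters of s starting at 1-based
// character index start. UTF-8 is variable width, so it must decode from
// the beginning of the string to find the offsets: O(len(s)) per call.
func substrChars(s string, start, n int) string {
	begin, end := len(s), len(s)
	pos := 1 // 1-based index of the character at byte offset i
	for i := range s {
		if pos == start {
			begin = i
		}
		if pos == start+n {
			end = i
			break
		}
		pos++
	}
	return s[begin:end]
}

func main() {
	s := "héllo" // 'é' occupies two bytes in UTF-8
	fmt.Println(len(s), utf8.RuneCountInString(s))
	fmt.Println(substrBytes(s, 2, 3))
	fmt.Println(substrChars(s, 2, 3))
}
```

A script that calls the character-based version in a loop that advances through a long string pays the scan on every call, which is how linear work per call turns into quadratic work overall.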

Like v1.11.0, this release is again a small breaking change, but once again it shouldn't affect many scripts (it will only affect scripts that use constant indexes for substr on non-ASCII strings). I hope not many people are using interp.Config.Bytes or the goawk -b option yet, as those are gone again. Seeing as v1.11.0 was introduced only a few weeks ago, I think it's worth the breakage for a performance problem of this magnitude.

Fixes https://github.com/benhoyt/goawk/issues/93: "Major speed regression for gron.awk in goawk 1.11.0+".

v1.13.0

2 years ago

Support multi-character and regular expression values for RS (#86), allowing significantly more powerful text processing. This is a Gawk extension to POSIX, which says, "If RS contains more than one character, the results are unspecified."
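
As a rough illustration of what a regular expression record separator means, this Go sketch splits input into records wherever a pattern matches. The splitRecords helper is hypothetical and is not how GoAWK reads records. Note also that Go's regexp package uses RE2 syntax, which differs in places from the POSIX extended regular expressions AWK uses.

```go
package main

import (
	"fmt"
	"regexp"
)

// splitRecords (a hypothetical helper) splits input into records wherever
// the regular expression rs matches, similar in spirit to a regex RS.
func splitRecords(input, rs string) ([]string, error) {
	re, err := regexp.Compile(rs)
	if err != nil {
		return nil, err
	}
	records := re.Split(input, -1)
	// A separator at the very end of the input leaves an empty final
	// element; drop it so it is not treated as an extra record.
	if n := len(records); n > 0 && records[n-1] == "" {
		records = records[:n-1]
	}
	return records, nil
}

func main() {
	// Records are separated by runs of dashes or by a newline.
	recs, err := splitRecords("one--two---three\n", `-+|\n`)
	if err != nil {
		panic(err)
	}
	for i, r := range recs {
		fmt.Printf("%d: %s\n", i+1, r)
	}
}
```

A real implementation would stream the input instead of holding it all in memory, but the splitting rule is the same.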

v1.12.0

2 years ago

This release adds support for "getline lvalue" forms. See #85.