Release 5.4.0 of gawk introduced a new
regular expression matching engine, named
MinRX.
MinRX is fully compliant with the POSIX standard for Extended
Regular Expressions (EREs), including the additional features needed
by awk and gawk. It is also a little stricter
than the original matchers about what it accepts as valid
regular expression syntax. (These restrictions apply to corner cases that should
not come up in day-to-day use.)
Previously, gawk used GNU regex and dfa
from GNULIB. These matchers are fast and generally robust, albeit not
fully POSIX compliant. MinRX replaces both of them.
Because regular expression matching is such a fundamental part of what
awk programs do, introducing a new regular expression engine
carries some risk. To mitigate that risk, gawk continues to provide
access to the original regexp matchers for the duration of one major
release, should they be needed.
If the environment variable GAWK_GNU_MATCHERS exists, then
gawk uses GNU regex and dfa,
as it did previously. Otherwise, gawk uses the MinRX matcher,
which is the default.
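
As an illustration, the switch between matchers might be made from the shell as sketched below. The helper name with_gnu_matchers is hypothetical, and the value 1 is arbitrary: the text above says only that the variable must exist. The sample program and the file name data.txt in the comments are placeholders.

```shell
# Hypothetical helper: run any command with GAWK_GNU_MATCHERS present
# in its environment. The value 1 is arbitrary; only the existence of
# the variable is documented as mattering.
with_gnu_matchers() {
    env GAWK_GNU_MATCHERS=1 "$@"
}

# Usage sketch (left as comments, since it assumes gawk 5.4.0 or later
# is installed and that data.txt exists):
#   gawk '/ab+c/ { print }' data.txt                     # MinRX, the default
#   with_gnu_matchers gawk '/ab+c/ { print }' data.txt   # GNU regex and dfa
```

Setting the variable for a single command, rather than exporting it for the whole session, keeps MinRX as the default everywhere else.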
Should you find a need to switch from MinRX to the original matchers, please submit a bug report describing what did not work (see Reporting Problems and Bugs). Doing so is very important, as it will help the maintainers and the MinRX author fix any issues that are found.
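
When preparing such a report, it may help to show that the same program behaves differently under the two matchers. The following is only a sketch: the function name compare_matchers and the AWK override variable are assumptions made for illustration, not part of gawk. It expects the input to come from file arguments rather than standard input, because the program is run twice.

```shell
# Hypothetical helper: run one awk program under both matchers and
# report whether the combined output (stdout and stderr) differs.
# AWK may name an alternative gawk binary; it defaults to "gawk".
compare_matchers() {
    awk_bin=${AWK:-gawk}

    # First run: make sure the variable is absent, so MinRX is used.
    new_out=$( (unset GAWK_GNU_MATCHERS; "$awk_bin" "$@") 2>&1 )

    # Second run: the variable exists, so the GNU matchers are used.
    old_out=$(env GAWK_GNU_MATCHERS=1 "$awk_bin" "$@" 2>&1)

    if [ "$new_out" = "$old_out" ]; then
        echo "same"
    else
        echo "different"
        # Send both outputs to stderr so they can be pasted into a report.
        printf 'MinRX output:\n%s\n' "$new_out" >&2
        printf 'GNU matchers output:\n%s\n' "$old_out" >&2
    fi
}

# Usage sketch (assumes gawk 5.4.0 or later and a file named data.txt):
#   compare_matchers '/ab+c/ { print }' data.txt
```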
After one major release, the old matchers, and the use of the
GAWK_GNU_MATCHERS environment variable, will be removed
from gawk.