** ~save-lisp-and-die~
This function takes two parameters: (1) the filename of a glotawk
interpreter; (2) a filename to which to write a new interpreter. The
AWK code from glotawk is written into the new file, and then a dump of
the current heap. In this way, you can construct a single-file
interpreter containing not only glotawk's base library, but also
definitions of your own, which will be instantly available when you
run the new interpreter, without the overhead of parsing and without
the need to place copies of the files containing the definitions on
machines where the code needs to run.
Before this change, you could put a built copy of ~glotawk~
on the machine where you need to run the code, plus copies of the
files where you define all your functions, ~a.glotawk~, ~b.glotawk~,
and ~c.glotawk~, and the file where you call them, ~end-use.glotawk~.
Then you'd run =glotawk a.glotawk b.glotawk c.glotawk
end-use.glotawk=. All the expressions in each file would be parsed and
evaluated. Most of the code by volume would be definitions. Not all
defined functions would be called. Startup time would be slower.
As of this change, you can instead run ~/wherever/glotawk~, tell it to
=(load "a.glotawk")= and so on for each file, and then evaluate
=(save-lisp-and-die "/wherever/glotawk" "/wherever-else/my-new-interpreter")=.
Then ~my-new-interpreter~ will contain those definitions.
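As a rough sketch of that workflow, the forms to evaluate can be
generated by a small shell helper. The name ~snapshot_forms~ is
hypothetical, and the helper only prints the forms; it does not
escape quotes or backslashes in file names.

```shell
# Hypothetical helper: print the glotawk forms that build a snapshot.
# $1 = existing glotawk interpreter, $2 = new interpreter to write,
# remaining arguments = source files to load first.
snapshot_forms() {
    base=$1
    out=$2
    shift 2
    for f in "$@"; do
        printf '(load "%s")\n' "$f"
    done
    printf '(save-lisp-and-die "%s" "%s")\n' "$base" "$out"
}
```

Something like =snapshot_forms /wherever/glotawk
/wherever-else/my-new-interpreter a.glotawk b.glotawk c.glotawk= prints
three ~load~ forms followed by the ~save-lisp-and-die~ form. Piping
that output into the base interpreter would assume it reads forms
from standard input, which is an assumption here.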
Maybe I was a little too excited when naming this function, because it
doesn't yet support changing which code runs at startup.
** Parsing across lines
The reader is retooled to support forms spanning multiple lines, which
lets the user write Lisp source in the commonly expected fashion.
them using ~string-join~, and then runs ~unsafe-system~ with that. But
then you can't do redirections using ~system~, so as a workaround,
when there are any symbols among the arguments, their names are
inserted into the shell command without quoting. For example, you can
write src_glotawk{ (system "echo" "hi" '> "/dev/null") }.
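The idea can be sketched in plain shell. This is not glotawk's
implementation: it assumes string arguments are wrapped in single
quotes, and it marks symbols with a made-up ~sym:~ prefix, since the
shell has no symbol type. ~shell_quote~ and ~build_command~ are
hypothetical names.

```shell
# Quote one string for the shell: wrap it in single quotes and
# rewrite each embedded single quote as '\'' .
shell_quote() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Join the arguments into one command line. Arguments starting with
# "sym:" stand in for symbols and are inserted unquoted; everything
# else is treated as a string and quoted.
build_command() {
    cmd=
    for arg in "$@"; do
        case $arg in
            sym:*) word=${arg#sym:} ;;
            *)     word=$(shell_quote "$arg") ;;
        esac
        cmd="${cmd:+$cmd }$word"
    done
    printf '%s\n' "$cmd"
}
```

With these definitions, =build_command echo hi 'sym:>' /dev/null=
prints ='echo' 'hi' > '/dev/null'=, so the redirection survives while
the strings stay quoted.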
** Bugs fixed
*** Backslash escapes in strings
Inside strings, the usual backslash escapes act how you would expect
them to. That's ~\a \b \f \n \r \t \v \\ \"~ for the bell, backspace,
formfeed, newline, carriage return, horizontal tab, vertical tab,
backslash and double quotes, respectively. This means when the
characters ~\a~ are found in your string literal, they mean a single
character inside the resultant string: ASCII code 7; when that string
is printed using ~print~, which prints the readable representation of
a thing, it will come back out as the characters ~\a~; when that
string is printed using ~printf~, the ASCII code 7 will be sent to the
output.
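The same distinction between a literal and the string it denotes can
be seen in plain AWK string literals. The sketch below is ordinary
AWK driven from the shell, not glotawk code, and the function names
are made up for illustration.

```shell
# Three escapes in the literal denote three characters in the string.
escape_length() {
    awk 'BEGIN { print length("\a\t\n") }'
}

# Sending the string to the output with printf emits the raw byte;
# od shows its numeric value.
bell_byte() {
    awk 'BEGIN { printf "\a" }' | od -An -tu1 | tr -d ' \n'
}
```

Here ~escape_length~ prints =3= and ~bell_byte~ prints =7=.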
*** ~strcat~, ~printf~, ~sprintf~ infinite loops
Evaluation environments were not being passed into these special
forms, causing infinite loops. This is fixed.
*** Strings now ~equal~ by contents
Due to a bug, ~equal~ was comparing strings by identity (i.e. is
string a /stored at the same place as/ string b?), not by contents
(i.e. would awk consider them ~==~ ?). This is fixed.
*** ~quasiquote~ works
*** Some infinite loops squashed
In addition to the fixes mentioned above, several places where an
infinite loop could still occur now check for one, print an error,
and exit instead.
*** ~match~ special form
The ~match~ special form had a wrong variable name that kept it from
ever working right; this is fixed. In cases where there was no match,
previously this special form would return ~(0 -1)~, hewing closely to
AWK. It has been modified to return ~nil~ if there is no match, so you
can more easily use its results in conditional checks.
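For reference, the AWK values behind the old ~(0 -1)~ result can be
checked directly. This sketch calls AWK's own ~match~ from the shell;
~awk_match~ is a hypothetical wrapper and does not involve glotawk.

```shell
# Print RSTART and RLENGTH after AWK's match(string, regex).
# $1 = string to search, $2 = regular expression.
awk_match() {
    awk -v s="$1" -v re="$2" 'BEGIN { match(s, re); print RSTART, RLENGTH }'
}
```

=awk_match hello 'l+'= prints =3 2=, and =awk_match hello z= prints
=0 -1=, which is the pair glotawk used to hand back on a failed match.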
** Other changes
*** Library moved to lib.glotawk
~lib-eval.awk~ contained the upper-level parts of glotawk, written in
glotawk. With the ability to parse across lines, and the addition of
the ~load~ function, the library is now moved to ~lib.glotawk~ and
is loaded in by ~lib-eval.awk~.
*** Busybox awk compatibility attempted; failed
Busybox's awk was barfing on the regular expressions used in the
tokenizer. That's fixed, but you can't take the length() of an array
in Busybox awk. POSIX does not specify this behavior, so Busybox is
not in the wrong here; but glotawk takes the length of many arrays.
Perhaps the next version of POSIX will require awk to support the use
of length() on arrays.
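One portable workaround, shown here only as a sketch and not as
something glotawk does, is to count elements with a ~for ... in~
loop. The names ~alen~ and ~count_words~ are made up for this example.

```shell
# Count the elements of an AWK array without calling length() on it.
# $1 = a string that is split on whitespace into the array.
count_words() {
    awk -v s="$1" '
        # k and n are locals, declared as extra parameters in the AWK idiom.
        function alen(arr,    k, n) {
            n = 0
            for (k in arr) n++
            return n
        }
        BEGIN { split(s, parts); print alen(parts) }'
}
```

=count_words "a b c d"= prints =4=. The loop visits every element, so
it costs more than a built-in ~length()~ would on large arrays.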
*** GC_DEBUG
If you run glotawk like ~awk -v GC_DEBUG=1 -f glotawk~, then on every
garbage collection the heap before and after, as well as the GC marks,
are written to Graphviz ~dot~ files. You may wish to avoid
using the ~dot~ algorithm to visualize these: with thousands of nodes,
~dot~ thinks for a long time, and makes an inscrutable drawing. ~fdp~
and ~sfdp~ can render an inscrutable drawing much faster.
*** awk flexibility
Three awks are specified in the Makefile now: the one used to build
glotawk (~BUILD_AWK~), the one used to run the tests (~TEST_AWK~), and
the one expected to exist when glotawk is run (~TARGET_AWK_F~). These
all default to the value of ~AWK~, which in turn defaults to
~/usr/bin/awk~.
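The defaulting chain can be mimicked in shell to see which awk each
role ends up with. This is an analogy only: the Makefile's actual
syntax is not shown here, and ~resolve_awks~ is a hypothetical name.

```shell
# Print the awk chosen for each role. Each variable falls back to
# AWK, and AWK itself falls back to /usr/bin/awk.
resolve_awks() {
    base=${AWK:-/usr/bin/awk}
    printf 'BUILD_AWK=%s\n' "${BUILD_AWK:-$base}"
    printf 'TEST_AWK=%s\n' "${TEST_AWK:-$base}"
    printf 'TARGET_AWK_F=%s\n' "${TARGET_AWK_F:-$base}"
}
```

With none of the variables set, all three lines show ~/usr/bin/awk~;
setting only ~TEST_AWK~ changes just the second line.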