jaredj/glotawk - Change VIEK2XMYJX37ELGVE2P5PZQU47NJ3GBXPV3XY2RWG2TDHIZEV57QC

make repl code a special form; make main function a variable, defaulting to repl; allow changing using save-lisp-and-die

Created by jaredj on September 7, 2025

Dependencies

In channels

main

Change contents

Replacement in repl.awk at line 30 [5.3599]

B:BD[5.3870] → [5.3870:3878]

BEGIN {

[5.3870]

[6.358]

function _repl(     tokens, where, inside_string, to_eval) {
    delete tokens
    delete where
    delete inside_string
    delete to_eval

Replacement in repl.awk at line 36 [5.3599]

B:BD[6.392] → [7.204:261]

∅:D[7.261] → [6.418:620]

B:BD[6.418] → [6.418:620]

    _forget_parse_state(_TOKENS, _WHERE, _INSIDE_STRING)
    # it appears that during BEGIN, if we are operating on a file
    # specified on the command line, FILENAME is not set yet. So we
    # have to find out whether any filename was specified, instead.

[6.392]

[6.620]

    _forget_parse_state(tokens, where, inside_string)
    # maybe we are running from a BEGIN and so FILENAME is not set
    # yet. instead we have to find whether any filename was specified.

Replacement in repl.awk at line 40 [5.3599]

∅:D[6.655] → [5.3953:3955]

B:BD[5.3953] → [5.3953:3955]

[6.655]

[5.3955]

    while((getline) > 0) {
        # the reason to keep parser state outside tokenize and read is
        # so that we can have forms that span lines, rather than
        # requiring each line of input to be exactly one complete
        # form.
        tokenize($0, tokens, where, inside_string)
        read_forms_into(to_eval, tokens, where, inside_string)

Insertion in repl.awk at line 48 [5.3599]

[5.3956]

[6.656]

        if(!FILENAME || (FILENAME == "-")) _print_eval_all(to_eval)
        else                               _just_eval_all(to_eval)
        delete to_eval

Replacement in repl.awk at line 52 [5.3599]

B:BD[6.657] → [6.657:659]

B:BD[6.659] → [7.262:567]

{
    # the reason to keep global arrays of parser state (_TOKENS,
    # _TO_EVAL) is so we can have forms that span lines, rather than
    # requiring each line of input to be a complete form.
    tokenize($0, _TOKENS, _WHERE, _INSIDE_STRING)
    read_forms_into(_TO_EVAL, _TOKENS, _WHERE, _INSIDE_STRING)

[6.657]

[6.870]

        if(length(tokens) == 0 && length(to_eval) == 0) {
            # we have come to the end of all complete expressions in the
            # input so far. this is important so that we don't GC data
            # that we have read but not yet eval'd.
            _maybe_gc()

Replacement in repl.awk at line 58 [5.3599]

B:BD[6.871] → [7.568:717]

∅:D[7.717] → [6.984:1263]

B:BD[6.984] → [6.984:1263]

    if(!FILENAME || (FILENAME == "-")) _print_eval_all(_TO_EVAL)
    else                               _just_eval_all(_TO_EVAL)
    delete _TO_EVAL
    if(length(_TOKENS) == 0 && length(_TO_EVAL) == 0) {
        # we have come to the end of all complete expressions in the
        # input so far. this is important so that we don't GC data
        # that we have read but not yet eval'd.
        _maybe_gc()
        _prompt()

[6.871]

[6.1263]

            # and also, all previous input having completed, this is
            # when we should print the next prompt.
            _prompt()
        }

Insertion in repl.awk at line 63 [5.3599]

[6.1269]

    # after all input has ended -
    _incomplete_parse_at_end(tokens, where, inside_string)

Deletion in repl.awk at line 66 [5.3599]
B:BD[6.1271] → [6.1271:1272]
Deletion in repl.awk at line 67 [5.3599]
B:BD[6.1273] → [6.1273:1279]
B:BD[6.1279] → [7.718:780]
∅:D[7.780] → [6.1310:1312]
B:BD[6.1310] → [6.1310:1312]
```
END {
    _incomplete_parse_at_end(_TOKENS, _WHERE, _INSIDE_STRING)
}
```

Insertion in osf.awk at line 28 [9.954]

[2.192]

[8.700]

    else if(car == _symbol("dump-append-changing-main"))
        # the breaking point was here. here, unlike the other dumps,
        # we break out the awkification of the parameters into the awk
        # function implementing the form.
        return _dump_append_changing_main(_eval3(_cadr(form), env, env, d+1),
                                          _eval3(_caddr(form), env, env, d+1))

Insertion in osf.awk at line 40 [9.954]

[7.4673]

[9.3499]

    else if(car == _symbol("repl"))
        # this is likely only to be called once, ever
        return _repl()

Replacement in lib.glotawk at line 183 [11.36]

B:BD[10.987] → [3.0:197]

(defun save-lisp-and-die (me-file new-file)
  (let* ((snip-string "# IMAGE BEGINS QMGFAMRISGQJ48IWDOWPHOOGW3MLKKGSPXD4DTFIG0")
         (script (sprintf "/^%s/ { exit 0 } { print }" snip-string)))

[10.987]

[3.197]

(defun save-lisp-and-die (me-file new-file main-symbol)
  (let* ((image-begin "# IMAGE BEGINS QMGFAMRISGQJ48IWDOWPHOOGW3MLKKGSPXD4DTFIG0")
         (script (sprintf "/^%s/ { exit 0 } { print }" image-begin)))

Replacement in lib.glotawk at line 189 [11.36]

B:BD[3.343] → [3.343:370]

    (dump-append new-file)

[3.343]

[3.370]

    (dump-append-changing-main new-file main-symbol)

Insertion in lib-eval.awk at line 8 [13.21710]

[12.818]

[13.23695]

    _MAIN = "repl"
    main_expression = _cons(_symbol(_MAIN), _nil())
    _eval(main_expression)

Insertion in first-symbols.awk at line 114 [9.17737]
[2.251]
[8.3876]
```
    _symbol("dump-append-changing-main")
```
Insertion in first-symbols.awk at line 117 [9.17737]
[7.5739]
[9.19392]
```
    _symbol("repl")
```

Replacement in dump.awk at line 5 [13.37838]

B:BD[2.261] → [2.261:340]

    _SNIP_STRING = "# IMAGE BEGINS QMGFAMRISGQJ48IWDOWPHOOGW3MLKKGSPXD4DTFIG0"

[2.261]

[2.340]

    _IMAGE_BEGIN = "# IMAGE BEGINS QMGFAMRISGQJ48IWDOWPHOOGW3MLKKGSPXD4DTFIG0"

Replacement in dump.awk at line 30 [13.37838]

B:BD[13.38241] → [2.343:465]

function _dump(filename, append,   i, t, v, s, line) {
    logg_dbg("_dump", "dumping " (append ? ">> " : "> ") filename)

[13.38241]

[2.465]

function _dump(filename, append, main,   i, t, v, s, line) {
    # This is what code to execute when this image is started. It has
    # to be a glotawk form with no parameters. Usually it is the repl
    # special form, so that when we start we'll do the REPL. But maybe
    # you want some other function to happen at startup.
    if(!main) main = "repl"
    logg_dbg("_dump", "dumping " (append ? ">> " : "> ") filename ". main symbol is " main)

Replacement in dump.awk at line 39 [13.37838]

B:BD[2.480] → [2.480:518]

        print _SNIP_STRING >>filename

[2.480]

[2.518]

        print _IMAGE_BEGIN >>filename

Replacement in dump.awk at line 41 [13.37838]

B:BD[2.527] → [2.527:572]

        print _SNIP_STRING >filename

[2.527]

[2.572]

        print _IMAGE_BEGIN >filename

Insertion in dump.awk at line 75 [13.37838]

[8.5457]

[13.39160]

    # we are storing here the name of the _MAIN symbol as an awk
    # string, which we may have received in the main parameter. that's
    # why it's a little awkward
    print "    _MAIN = \"" awkescape(main) "\"" >>filename
    print "    main_expression = _cons(_symbol(_MAIN), _nil())" >>filename
    print "    _eval(main_expression)" >>filename

Insertion in dump.awk at line 95 [13.37838]

[2.792]

[8.5462]

function _dump_append_changing_main(g_filename, g_main,      filename, main) {
    if(_TYPE[g_filename] == "s") {
        if(_TYPE[g_main] == "'") {
            filename = _STRING[g_filename]
            main = _SYM_NUMBERS[g_main]
            logg_dbg("_dump_append_changing_main", "filename is " filename " and the form to evaluate at startup will be " main)
            return _dump(filename, 1, main)
        } else {
            logg_err("_dump_append_changing_main", "the value of the second parameter, main, must be a symbol")
            return _nil()
        }
    } else {
        logg_err("_dump_append_changing_main", "the value of the first parameter, filename, must be a string")
        return _nil()
    }
}

Insertion in doc/releases.org at line 7 [14.37]
[4.129]
[14.150]
```
repl
```

Deletion in doc/releases.org at line 22 [14.37]

B:BD[4.243] → [4.243:244]

B:BD[4.244] → [4.244:1354]


This function takes two parameters: (1) the filename of a glotawk
interpreter; (2) a filename to which to write a new interpreter. The
AWK code from glotawk is written into the new file, and then a dump of
the current heap. In this way, you can construct a single-file
interpreter containing not only glotawk's base library, but also
definitions of your own, which will be instantly available when you
run the new interpreter, without the overhead of parsing and without
the need to place copies of the files containing the definitions on
machines where the code needs to run.

So, previous to this change, you could put a built copy of ~glotawk~
on the machine where you need to run the code, plus copies of the
files where you define all your functions, ~a.glotawk~, ~b.glotawk~,
and ~c.glotawk~, and the file where you call them, ~end-use.glotawk~.
Then you'd run =glotawk a.glotawk b.glotawk c.glotawk
end-use.glotawk=. All the expressions in each file would be parsed and
evaluated. Most of the code by volume would be definitions. Not all
defined functions would be called. Startup time would be slower.

Replacement in doc/releases.org at line 23 [14.37]

B:BD[14.262] → [4.1355:1738]

As of this change, instead, you could run ~/wherever/glotawk~, tell it
to =(load "a.glotawk")= and etc., and =(save-lisp-and-die
"/wherever/glotawk" "/wherever-else/my-new-interpreter")=. Then
~my-new-interpreter~ will contain the definitions inside it.

Maybe I was a little too excited when naming this function, because it
doesn't yet support changing which code runs at startup.

[14.262]

[4.1738]

This function takes three parameters: (1) the filename of a glotawk
interpreter; (2) a filename to which to write a new interpreter; (3) a
symbol, whose value is a function to run at startup (usually, this is
~'repl~). The AWK code from glotawk is written into the new file, and
then a dump of the current heap. In this way, you can construct a
single-file interpreter containing not only glotawk's base library,
but also definitions of your own, which will be instantly available
without the overhead of parsing and without the need to distribute
more files. In addition, if you define your own main function and pass
its symbol in, the file written will run your function on startup,
instead of the REPL.
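
As a usage illustration (not part of the change itself), the three-parameter form described above might be called as follows. This is a minimal sketch under stated assumptions: ~my-main~ and the file names are hypothetical, it assumes a zero-parameter function defined with =defun= is acceptable as the startup form (as the comment in =_dump= suggests), and it assumes =;= starts a comment in glotawk source.

```lisp
; Hypothetical session in an existing glotawk interpreter.
; Load your own definitions so they end up in the dumped heap.
(load "a.glotawk")

; my-main is an assumed name. The startup form must take no parameters.
(defun my-main ()
  (load "end-use.glotawk"))

; The third argument must evaluate to a symbol, so it is quoted.
(save-lisp-and-die "/wherever/glotawk"
                   "/wherever-else/my-new-interpreter"
                   'my-main)
```

Going by the lines that =_dump= writes into the image, the new file would store the symbol name in =_MAIN= and evaluate a one-element list built from it at startup, so it should run =(my-main)= where a default image runs =(repl)=.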