Using #!/usr/local/bin/jconsole scripts
This little snippet explains what #!-scripts are, how they function, and how they can be used as "CGI" programs.
Intro: running J scripts from the command line
A J script foo.ijs can be run in two ways:
- Invoke jconsole with the script as argument:
jconsole foo.ijs
- Make foo.ijs a so-called #! script. In that case, you have to do the following preparations:
  - Add a first line to the script, reading
  #!/usr/local/bin/jconsole
  The "#!" must be the very first two characters. The path to the jconsole must be as appropriate on your installation.
  - Mark the script as executable:
  chmod +x foo.ijs
  After these two preparations, you can run the J script henceforth simply by invoking it with its name:
  foo.ijs
  Since it is not relevant in the Unix world how any tool is implemented, we could also decide to hide the fact that this program is a J program and rename foo.ijs to foo.
We will treat additional arguments for running foo later. First, we'll have a look at the Unix technology behind the #!.
How #!-scripts in Unix work
There are slight variations between different Unixens, but in most of today's systems the initial "#!" magic cookie is handled by the exec(2) system call. Here is a quote from the Solaris exec(2) manual page:
  exec() in all its forms overlays a new process image on an old process. The new process image is constructed from an ordinary, executable file. This file is either an executable object file, or a file of data for an interpreter. [...]
  An interpreter file begins with a line of the form
  #! pathname [arg]
  where pathname is the path of the interpreter, and arg is an optional argument. When an interpreter file is exec'd, the system execs the specified interpreter. The pathname specified in the interpreter file is passed as arg0 to the interpreter. If arg was specified in the interpreter file, it is passed as arg1 to the interpreter. The remaining arguments to the interpreter are arg0 through argn of the originally exec'd file.
This is admittedly rather cryptic if you see it for the first time. We will clarify what happens by doing little experiments. Because we don't know anything about the internals of the proprietary black box J interpreter, we will conduct our experiments with a simple mini-interpreter we write ourselves, in the C language. Here is program hashbang.c:
#include <stdio.h>
#include <stdlib.h>

int main (int ac, char **av)
{
  char line[100];
  int i;
  FILE *f;

  /* echo the entire argument vector */
  for (i=0; i<ac; i++) printf ("ARGV: %s\n", av[i]);
  if (ac<2) exit (1);
  /* "interpret" the file named in the first argument by echoing it */
  if (! (f = fopen (av[1], "r"))) { perror (av[1]); exit (2); }
  while (fgets(line, 100, f)) printf ("line> %s", line);
  fclose(f);
  exit (0);
}
(You can download this C source here.)
Compile the program. (A make hashbang should do.)
This simple "interpreter program" does just two things:
- It echoes its entire argument vector.
- It opens the file named in the first argument and reads every line from it. Instead of doing some more sophisticated interpretation, it will simply echo every line.
Now that we have our interpreter, we can write a simple #!-script for it. Here is johnjoey:
#!hashbang
John Howland
Joey K. Tuttle
In your script, it may or may not be necessary to use the full path to the hashbang program in the #! line. Anyway,
% chmod +x johnjoey
% johnjoey
ARGV: hashbang
ARGV: ./johnjoey
line> #!hashbang
line> John Howland
line> Joey K. Tuttle
Voila! A second experiment with two arguments will be instructive, too:
% johnjoey foo bar
ARGV: hashbang
ARGV: ./johnjoey
ARGV: foo
ARGV: bar
line> #!hashbang
line> John Howland
line> Joey K. Tuttle
Now let's review what the exec(2) man page had to say about an optional additional argument in the #! line and do the appropriate experiment. Edit johnjoey to read:
#!hashbang /etc/shells
John Howland
Joey K. Tuttle
The /etc/shells just refers to a short text file which should exist on all Unix systems. Run the new script:
johnjoey foo bar
It creates output like this:
ARGV: hashbang
ARGV: /etc/shells
ARGV: ./johnjoey
ARGV: foo
ARGV: bar
line> /bin/sh
line> /bin/bash
line> /bin/csh
line> /etc/ftponly
Note the tricky reordering which is going on here, as documented in the exec(2) excerpt above. It will place the /etc/shells argument into argv[1], causing it to be opened and printed. The johnjoey script name itself is inserted between the #! argument and the command line arguments.
Lastly, we will be a bit nasty and violate the "just one argument" rule right away. Modify "johnjoey" so that it reads:
#!hashbang -a -b
John Howland
Joey K. Tuttle
and run it:
% johnjoey foo bar
ARGV: hashbang
ARGV: -a -b
ARGV: ./johnjoey
ARGV: foo
ARGV: bar
-a -b: No such file or directory
- Whoa-1: the two words "-a" and "-b" in the #!-line got rolled into a single argument to our interpreter. Just Don't Do It. Stick to the exec(2) allowance of at most one argument supplied by the #! line. The real reason for that optional parameter is something like the -f flag of sed(1) and awk(1).
  Another restriction not obvious from the manual entry above is that there are often severe restrictions on the length of the line.
- Whoa-2: no line>s here. Our dumb interpreter is a bit too fixed on expecting the scriptname as the very first argument. Hence the error message. The real jconsole interpreter will do this correctly. In particular, a #!jconsole -jnoprofile line will work OK.
So much on the #! mechanism as provided by the Unix system.
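The argument shuffling seen in these experiments can be modelled in a few lines. The following is only a sketch in Python, not the kernel's actual code. The function name interpreter_argv is made up for this illustration, and it assumes the behaviour observed above, where everything after the interpreter path is handed over as one single argument. Other systems may split or truncate that text differently.

```python
def interpreter_argv(shebang_line, script_path, cli_args):
    """Model the argv an interpreter receives for a #! script (simplified)."""
    if not shebang_line.startswith("#!"):
        raise ValueError("not an interpreter file")
    # Split off the interpreter pathname; whatever follows stays ONE argument.
    parts = shebang_line[2:].strip().split(None, 1)
    if not parts:
        raise ValueError("no interpreter named after #!")
    argv = [parts[0]]
    if len(parts) == 2:
        argv.append(parts[1])      # the optional argument, unsplit
    argv.append(script_path)       # the script name is inserted here
    argv.extend(cli_args)          # then the original command line arguments
    return argv


if __name__ == "__main__":
    for arg in interpreter_argv("#!hashbang -a -b", "./johnjoey", ["foo", "bar"]):
        print("ARGV:", arg)
```

Run as a program, this prints the same five ARGV lines as the last experiment, including the rolled-together "-a -b".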
#!jconsole scripts in particular
So how does this translate into #!jconsole scripts?
- You, as a J programmer, should keep in mind that additional arguments should still be placed on the command line, not in the #! line after the #!jconsole. You can still invoke your script with as many parameters as you like, and they will show up dutifully in the (2!:4'') which J provides as ARGV.
- The jconsole will actually see the #! line. There is currently no special code in the J interpreter to skip such a #! line at the beginning of a script. (That used to be the case with the #!jrun scripts; more on that below.) In other words: J will try to interpret the line as an ordinary J sentence. So, what would that be, according to J?
  #!/usr/local/bin/jconsole
  The standard profile declares none of the names usr, local, bin, or jconsole. Therefore, these are assumed to be verb names. The entire sentence is just a long verb train with the components
  # !/ usr/ local/ bin/ jconsole
  A verb train can stand for itself; it is a legal sentence. Phew, that was close! You might not be so lucky if your profile defines one of the words.
Wrapper or glue scripts for #!jconsole
Remember that we are allowed to provide one (and just one) additional first parameter on the #! line? This could be some J script which does some setup preparations and then takes care of the additional parameters, i.e., those supplied on the command line. The file shell.ijs provided with J/Unix releases 3.x, 4.01 and 4.02 was such a wrapper. Here it is:
NB. Set up environment and run a #!jrun or #!jconsole script.
NB. Useful nouns, defined in z locale:
18!:4 <'z'
ARGC=:$ARGV=: 2!:4 ''
LF=:NL=: 10{a.
JDIR=:1!:42''
SCRIPT=:SCRIPT}.~('#!'-:2{.SCRIPT)*>:0{#;._2 SCRIPT=:1!:1]0{ARGV
NB. Useful verbs:
exit=: 2!:55
getenv=: 2!:5
echo=: 1!:2&2
stdout=: 1!:2&4
stderr=: 1!:2&5
stdin=: 1!:1 bind 3 :. stdout
NB. return to base:
18!:4 <'base'
NB. Define script as verb (to allow use of control structures) and run:
(0 0$0)[3 : SCRIPT]0
exit 0
Now imagine a #! J script foo with these contents:
#!/usr/local/bin/jconsole /usr/local/lib/j/shell.ijs
1!:2&2 'Hello world'
And imagine that we mark foo as executable and invoke it with
foo one two
The jconsole binary is triggered, the full command line being:
jconsole /usr/local/lib/j/shell.ijs ./foo one two
The jconsole now treats shell.ijs (and that only) as the J script to execute. This wrapper will in turn do the following things:
- Set the convenience variables ARGC and ARGV. They could actually be trimmed a bit here because the initial elements won't be very interesting at later stages.
- Extract ./foo from ARGV as the name of the script to run. Read in the J code, purge the initial #! line, assign the J text to the SCRIPT variable.
- Use 3 : SCRIPT to embed the J code into an anonymous temporary function. Invoke that function.
- Exit.
The business about the temporary function may appear needlessly convoluted. After all, a simple 0!:110 SCRIPT would have done the job, too. But it was done for a reason: this way, you could use control structures such as if. ... end. directly in the script.
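The first two wrapper steps (pick the script name out of the argument vector, then purge the #! line from its text) can be sketched as follows. This is a Python illustration rather than J, the function names are invented for this sketch, and it assumes the command line layout shown above: interpreter, wrapper, script, then the user's arguments.

```python
def strip_shebang(source):
    """Return the script text without a leading #! line, if there is one."""
    if source.startswith("#!"):
        # partition() splits at the first newline; the tail is the real code.
        return source.partition("\n")[2]
    return source


def split_wrapper_argv(argv):
    """Split the wrapper's argv into (script name, script arguments).

    Assumed layout: [interpreter, wrapper, script, arg1, arg2, ...].
    """
    if len(argv) < 3:
        raise ValueError("expected interpreter, wrapper and script name")
    return argv[2], argv[3:]


if __name__ == "__main__":
    script, args = split_wrapper_argv(
        ["jconsole", "/usr/local/lib/j/shell.ijs", "./foo", "one", "two"])
    print(script, args)
```

A wrapper built this way can hand only the trimmed argument list to the script, which is the kind of trimming the ARGC/ARGV remark above hints at.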
With J/Unix releases 3.x, 4.01, and 4.02, the shell.ijs was actually run under the hood of the "jrun" binary interface to the J engine. The "#!/usr/local/bin/jrun" lines in the #! J scripts would have no "shell.ijs" as the optional argument.
With J4.05/Unix, shell.ijs as above was decommissioned. Its utility definitions live on in $JLIB/system/main/jconsole.ijs. So "stdout" or "echo" are available independently of how your script is started, which is a good thing.
With the demise of shell.ijs, the automatic function wrapping and the automatic final exit went away, too. Remember this when you need to upgrade older #! J scripts to 4.05.
If you want to use something like the old shell.ijs wrapper, though, you can still do that with the J4.05/Unix. The wrappers could be part of your script, but sometimes it can be handy to separate some setup/cleanup steps from the script which does the core work. The additional optional argument can be just the right thing in those cases.
Don't forget that you can do all kinds of stunts with your wrapper as soon as it gets control. If you are not happy with the way the standard jconsole treats the command line, you are free to make your own interpretations of it.
Implementing CGI programs with #!jconsole scripts
People have been asking whether it would be possible to do CGI programming with J, i.e., deal with HTML forms in J. It is.
Different technologies meet at this point. You have to
- understand how J programs can be invoked (dealt with above)
- understand the CGI 1.1 standard (see below)
- know how your web server deals with CGI programs (naming, placements, access policies, ... not covered here, consult your manuals)
The remaining text will guide you to your first J-based CGI program. Before we can tackle CGIs in J, we make sure to understand the underlying...
CGI theory
First you just have to understand how the Common Gateway Interface is defined. This is important. Here is the "official standard reference" for CGI:
Review the "standard proper" link now, with the focus on how a CGI program receives its information. It's not too much to read, just 1+4 pages. Never mind if you cannot understand much of it.
Come back here then. The following text will have targeted references into the standard.
Before you use some #!jconsole scripts as CGI programs, it is very instructive to make your first CGI steps with our "hashbang" and "johnjoey" programs. That is:
hashbang takes the role of the jconsole interpreter,
johnjoey takes the role of the CGI program.
- According to the CGI output rules (which you should have just read), we must emit some "Content-Type" header as our first output to receive any feedback. While our "johnjoey" is an executable program, it has no active code itself; we must modify hashbang.c so that it can send its output (via the web server) to the client:
#include <stdio.h>
#include <stdlib.h>

int main (int ac, char **av)
{
  char line[100];
  int i;
  FILE *f;

  /* CGI header plus blank line must come before any other output */
  printf ("Content-Type: text/plain\n\n");
  for (i=0; i<ac; i++) printf ("ARGV: %s\n", av[i]);
  if (ac<2) exit (1);
  if (! (f = fopen (av[1], "r"))) { perror (av[1]); exit (2); }
  while (fgets(line, 100, f)) printf ("line> %s", line);
  fclose(f);
  exit (0);
}
- Compile this version of hashbang.c into a hashbang binary again. Test if it works from the command line:
hashbang one two three
It should dutifully echo the arguments and complain about a missing file "one".
- "johnjoey" will be our CGI program. Consult your web server man pages about the requirements for CGI programs. You might have to copy johnjoey into a special directory, give it a name with a special suffix, make an entry in some configuration file, whatever. The juggle web server here would expect either a .cgi suffix or placement in a cgi subdirectory in order to recognize and accept a program as a CGI program. YMMV. But I do:
cp johnjoey johnjoey.cgi
Because the notion of the current directory may be warped in various ways when a web server invokes the CGI program, it is also a good idea to make the #! interpreter path absolute. So my johnjoey.cgi reads:
#!/usr/WWW/juggle/bnp/hashbang
John Howland
Joey K. Tuttle
- Things are now in place. Maybe you have to do something special for your web server now, like HUPing it or re-generating some access file.
Ready for Rock'n'Roll. Trigger your new CGI program by accessing it with your web browser. Having an eye on the server's log file with tail -f or some such is always a good idea in case something goes wrong. You don't need to write an HTML page linking to your CGI program. Just enter the reference in the URL box of your browser.
You can also access my johnjoey.cgi by using the following URLs. I provided some more interesting extended URLs all going through that script, too:
- johnjoey.cgi
  The unadorned reference without any further information in the request.
- johnjoey.cgi?one
  A request with a simple query part. Check the CGI spec and confirm the effect on the command line arguments.
- johnjoey.cgi?one+two+three
  A query part, still "simple" but now with multiple words. Note that the blanks between the words have to be properly "URL-encoded" with plus characters.
  References:
  URL encoding
  HTTP URIs
  HTML forms and encodings
- johnjoey.cgi?one=foo&two=bar
  This is a query part with name/value associations, like forms would cause them. Again, try to bring the CGI specs into accordance with this example.
- johnjoey.cgi/one/two/three
  This is a case where we have not a query part but some additional "path info" instead.
- johnjoey.cgi/one/two/three?four
  And here, finally, some path info and a query part.
OK. If you are new to CGI programming or to its fundamentals, this was all certainly quite exhausting. You will probably want to take a break now.
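The effect of the query part on the command line arguments can be modelled like this. It is a Python sketch of the rule as the CGI spec describes it (a query without an unencoded "=" is split at the plus characters, and each word is decoded and passed as an argument), not a description of what any particular web server does. The function name cgi_command_line is made up for this illustration.

```python
from urllib.parse import unquote


def cgi_command_line(query_string):
    """Return the extra command line words a CGI program may receive."""
    # A query with name=value pairs (or no query at all) yields no arguments.
    if not query_string or "=" in query_string:
        return []
    # Otherwise the words are separated by "+" and each one is URL-decoded.
    return [unquote(word) for word in query_string.split("+")]


if __name__ == "__main__":
    for query in ("", "one", "one+two+three", "one=foo&two=bar"):
        print(repr(query), "->", cgi_command_line(query))
```

Comparing this with the ARGV lines the example URLs produce is a quick way to see whether your server follows the rule.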
Using J for CGI programs
Relaxed and refreshed? Good.
It is time to look beyond the command line and investigate those environment variables the CGI spec was talking about. Also, the standard input fed from the server to the CGI program can be interesting.
Our completely passive johnjoey.cgi is too brain-dead to have a look at the environment variables. We cannot program with this CGI script in any way, and we won't bother to cram more intelligence into the "hashbang" C program. Instead, we will switch to proper J now.
Download the following #! J script. Rename it to my1stjcgi.cgi or whatever is a working CGI filename for your webserver, place it into a suitable spot, chmod it executable, mark it as a CGI, yadda, yadda:
#!/usr/local/bin/jconsole
safe_getenv =. monad define
  if. 0 -: value =. getenv y. do. 'undefined' else. value end.
)
show_info =. monad define
  for_arg. 2!:4'' do.
    stdout 'ARGV: ', (>arg), LF
  end.
  for_envvar. ;:'REQUEST_METHOD PATH_INFO QUERY_STRING' do.
    val =. safe_getenv >envvar
    stdout (>envvar), ': ', val, LF
  end.
  if. (
This mimics what we have learned before, now in J. It also looks at three of the more important standard environment variables defined. (You surely remember the full list as given in the CGI spec?)
Depending on the request type, the J script also receives some information via stdin. We dump that where appropriate, too.
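For readers who want to compare against a language they may know better, here is the same kind of dump as a Python sketch. It is not a translation of the J script: the function names and the "STDIN:" label are choices made for this illustration, and it assumes that the request body is only of interest for POST requests.

```python
def safe_getenv(environ, name):
    """Look up an environment variable, with a visible marker when it is unset."""
    return environ.get(name, "undefined")


def cgi_report(argv, environ, body=""):
    """Build a plain-text CGI response that shows what the program received."""
    # Header line and the mandatory blank line come first.
    lines = ["Content-Type: text/plain", ""]
    lines += ["ARGV: " + arg for arg in argv]
    for name in ("REQUEST_METHOD", "PATH_INFO", "QUERY_STRING"):
        lines.append(name + ": " + safe_getenv(environ, name))
    # Only POST requests carry their data on standard input.
    if environ.get("REQUEST_METHOD") == "POST":
        lines.append("STDIN: " + body)
    return "\n".join(lines) + "\n"
```

In a real CGI program one would call cgi_report(sys.argv, os.environ, body), reading body from standard input up to CONTENT_LENGTH bytes, and write the result to standard output.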
Here are the same queries as above, this time going to the J CGI:
- my1stjcgi.cgi
  The unadorned reference without any further information in the request.
- my1stjcgi.cgi?one
  A request with a simple query part. Check the CGI spec and confirm the effect on the command line arguments.
- my1stjcgi.cgi?one+two+three
  A query part, still "simple" but now with multiple words. Note that the blanks between the words have to be properly "URL-encoded" with plus characters.
  References:
  URL encoding
  HTTP URIs
- my1stjcgi.cgi?one=foo&two=bar
  This is a query part with name/value associations, like forms would cause them. Again, try to bring the CGI specs into accordance with this example.
- my1stjcgi.cgi/one/two/three
  This is a case where we have not a query part but some additional "path info" instead.
- my1stjcgi.cgi/one/two/three?four
  And here, finally, some path info and a query part.
In most cases, CGIs are not triggered because someone entered a complicated URL manually but because someone filled out a form. The browser then translates the fields into the long request URL when you "submit" the form to the CGI specified as the "action" for the form. The following two forms are identical except for different submit methods, with the following source:
Note where the form data appears in these two cases and relate that to the specs.
Where to go from here
It is obvious that one wants to have some standard utility to dissect and decode the information which is passed to a CGI program. The old dinosaur of the web server history, the CERN httpd, came with a very handy utility called "cgiparse"; the Perl hackers have their standard "libcgi" module which does the job for them.
So, write your own CGI utility script for J and submit it to the juggle mailing list! I'll gladly add it to the Bits'n'Pieces section here, too.
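To give an idea of what such a utility has to do, here is a minimal sketch in Python (a J version is left to the reader). The function name parse_form is hypothetical. It assumes that form data arrives in QUERY_STRING for GET requests and on standard input for POST requests, and it leans on the standard library's parse_qs for the URL decoding.

```python
from urllib.parse import parse_qs


def parse_form(environ, body=""):
    """Decode submitted form fields into a dict of name -> list of values."""
    method = environ.get("REQUEST_METHOD", "GET").upper()
    # POST data comes from the request body, everything else from the URL.
    raw = body if method == "POST" else environ.get("QUERY_STRING", "")
    # keep_blank_values=True keeps fields that were submitted empty.
    return parse_qs(raw, keep_blank_values=True)


if __name__ == "__main__":
    print(parse_form({"REQUEST_METHOD": "GET", "QUERY_STRING": "one=foo&two=bar"}))
```

Returning a list per field is deliberate: the same name may legitimately appear several times in one request.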
This article provides you with all the base reference material you need. I have chosen rather old editions of the relevant RFCs because they are much smaller than current versions.
There are several ways to take it from here. Experiment. For example, would it be better to require 'cgitools' or to exploit the optional arg in the #!jconsole line? It's up to you.
There are two things you have to be careful about:
- The standard jconsole will try to go interactive when a script fails. The server will appear to hang, and you have to "stop" the request then in the browser. Improve your script to be really, really robust. (I usually use a jconsole replacement called "jfront" which gives much better control over exiting.)
- Never ever trust the values you get from the outside.
  The HTML form above provides just the precanned values "yes" and "no" for the form variable "tcl". However, you should be aware by now that everybody from the outside could trigger the CGI URL with other values than "yes" or "no", for example with no value at all (remember the robustness remark), or with the url-encoded form representing
  1!:55 {."1 ] 1!:0 (2!:5'HOME'), '/*'
  which is better not subjected to J's ". (Execute), be it directly or indirectly (remember the trust remark).
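One defensive pattern is to compare every incoming value against the short list of values the form can legitimately send, and to fall back to a safe default otherwise. The sketch below shows this in Python. The name checked_choice is invented here, and it assumes the form data has already been decoded into a dictionary mapping each field name to a list of values (the shape Python's urllib.parse.parse_qs returns).

```python
def checked_choice(form, name, allowed=("yes", "no"), default="no"):
    """Return the submitted value only if it is one of the allowed strings."""
    values = form.get(name, [])
    # Exactly one value, and it must match the whitelist literally.
    if len(values) == 1 and values[0] in allowed:
        return values[0]
    # Missing, repeated, empty or unexpected input: never evaluate it.
    return default


if __name__ == "__main__":
    hostile = {"tcl": ["1!:55 {.\"1 ] 1!:0 (2!:5'HOME'), '/*'"]}
    print(checked_choice(hostile, "tcl"))
```

The point is that the untrusted text is only ever compared, never executed or interpolated into something that is.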