Using Shared Libraries under Unix
The family of 15!: foreigns enables you to call arbitrary functions from so-called shared libraries. Shared libraries can be linked to the J interpreter dynamically (i.e., at run-time). You can use libraries that come with your Unix system (e.g., the standard C library or the "resolver" library), libraries from third-party vendors such as a LAPACK library, or libraries you made on your own. Consult your system documentation, notably the man page for the linker ld(1), when you want to build your own shared libraries. This documentation uses the standard Unix libc in the examples. You are of course free to use any other shared library, too.
The 15!: family has three parts:
- calling shared library functions
- self-allocated memory
- associating J names with allocated memory
Calling shared library functions
- cd =: 15!:0
- The left argument specifies
- the shared library to use
- the function to call
- the types of the result and the parameters
The right argument holds the parameters of the function to be called. For example:
   '/usr/lib/libc.so _strcmp i *c *c' 15!:0 'foo';'bar'
+-+---+---+
|4|foo|bar|
+-+---+---+
The result is the returned function value (the "4" in this case), followed by the parameters (possibly modified by the function).
The underlying facility is the "dlopen" library routine. You are well advised to take a look at its man page and to find out the peculiarities of your Unix system. Most Unix variants use a ".so.MAJORVERSION.MINORVERSION" suffix for shared libraries, AIX being a major exception (".a" for both static and dynamic libs). On some variants you have to give the full path to the lib, including exact version numbers. Other systems will search for the "most suitable" library if you just request "libc.so". Some dlopen() variants search along the path in the LD_LIBRARY_PATH environment variable, others don't. Some systems will trigger the functions "__init" and "__fini" upon first loading and last release of a shared library, respectively. For all these reasons, it's a good idea to have a glance at the dlopen man page.
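Note that on some systems the C compiler prepends a leading underscore to all programmer-defined external symbols. Hence "_strcmp", not "strcmp", in the examples here. Other systems export the plain name, so check what your library actually provides (nm(1) lists the symbols).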
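For readers who want to try the same dlopen-based pattern outside J, here is a minimal sketch using Python's ctypes module. It is not J code, just an analogy: the library, the function name, and the result and parameter types are declared first, then the call is made. The assumption is a Unix-like system where `ctypes.CDLL(None)` exposes the C library already loaded into the process. The helper name `compare` is made up for this illustration.

```python
import ctypes

# Assumption: on Unix-like systems, CDLL(None) gives access to the symbols
# already loaded into the running process, which include the C library.
libc = ctypes.CDLL(None)

# Declare the prototype, comparable to "i *c *c" in the J call:
# int strcmp(const char *, const char *)
libc.strcmp.restype = ctypes.c_int
libc.strcmp.argtypes = [ctypes.c_char_p, ctypes.c_char_p]


def compare(a: bytes, b: bytes) -> int:
    """Return strcmp's raw result: negative, zero, or positive."""
    return libc.strcmp(a, b)


# Only the sign is portable; the exact value depends on the C library.
print(compare(b"foo", b"bar"))
```

Note that ctypes takes the C-level name ("strcmp"); whether an underscore has to be added by hand is a property of the calling facility and the platform.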
Type specifiers:
- n
- none (like "void"); just for a void return type
- s
- short
- i
- int and long int
- c
- char
- d
- double
- f
- float; note that the old K&R C standard didn't have any float parameters (they were implicitly promoted to double parameters). Float parameters are made possible by the ISO C standard but I don't know any libraries using float parameters yet. Summary: it's far more likely that a "double" is needed.
- *X
- Pointer to X (something of the above); goes along with array values
- *
- Pointer (without any further checks)
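Why the declared types matter can be tried out with any dlopen-style interface. The following sketch uses Python's ctypes (not J) and assumes a Unix-like system where `ctypes.CDLL(None)` reaches the C library. The correspondence to the specifiers above is only approximate: `c_double` plays the role of "d", `c_int` of "i", and `c_char_p` of "*c". The function names `parse_double` and `absolute` are invented for this example.

```python
import ctypes

libc = ctypes.CDLL(None)  # assumption: Unix-like system

# double atof(const char *)   roughly "d *c"
# Without the restype declaration, ctypes would assume an int result
# and the returned value would be meaningless.
libc.atof.restype = ctypes.c_double
libc.atof.argtypes = [ctypes.c_char_p]

# int abs(int)                roughly "i i"
libc.abs.restype = ctypes.c_int
libc.abs.argtypes = [ctypes.c_int]


def parse_double(text: bytes) -> float:
    """Convert a byte string to a double with the C library's atof."""
    return libc.atof(text)


def absolute(n: int) -> int:
    """Absolute value of a C int, computed by the C library's abs."""
    return libc.abs(n)
```

Calling `parse_double(b"2.5")` and `absolute(-7)` exercises one double-valued and one int-valued function.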
The different parameters must be boxed. An open scalar argument will be boxed automatically but that's as much convenience as you'll get. A single array parameter must be boxed explicitly.
J values as parameters:
- A list of integers is good for the parameter "*i" etc. Note that there is a big difference between the scalar 5 and the singleton vector ,5 and be careful.
- A scalar boolean 0 or 1 will be converted to an integer. You have to do this yourself for boolean lists. Watch out and add a "0+" or so where necessary to force the type widening from boolean to integer.
- <0 is a fine NULL pointer.
- Be sure to provide enough memory space wherever a function expects it. For example:
   '/usr/lib/libc.so _getwd *c *c' 15!:0 <(60#'*')
+-------+------------------------------------------------------------+
|1282920|/net/ohura/home/ohura/neitzel/src/j/builds/ohura/j/engine **|
+-------+------------------------------------------------------------+
is toying with your life, since getwd may write more than the 60 bytes provided, and will cause a well-deserved core dump, just as the corresponding faulty C program would. So, be careful and know what you do (RTFM).
- Addresses (pointers), in particular to allocated memory, are dealt with as boxed integers. The boxing is a necessary device when some DLL function wants to set a pointer variable (and expects a (char**) argument in ordinary C call-by-value lingo). In earlier Windows releases, the "*m" prototype used to go along with such boxed pointer values (as opposed to a "*c" which goes along with an array). The "*m" has been officially decommissioned now. Use an unadorned "*" with boxed addresses. (<0) is indeed a fine NULL pointer.
- cder =: 15!:10
- When cd returns an error (usually a "domain error"), it can have one of two reasons. 15!:10 '' returns the result from the parameter checks. The number of arguments in the type specification and on the right side are compared and the types are checked. An error code consists of two integers: the first gives the general class of the error, the second often indicates which parameter was found to be in error.
- 0 0
- no error
- 1 0
- file not found
- 2 0
- procedure not found
- 3 0
- too many DLLs loaded (max 20)
- 4 0
- parameter count doesn't match declarations
- 5 x
- declaration x invalid
- 6 x
- parameter x type doesn't match declaration
- cderx =: 15!:11
- This covers the other kind of failure. Whenever the parameters pass the formal prototype checks but the actual call to the shared library fails, you can trigger 15!:11 with an empty vector to get a more detailed error code and text:
   '/usr/lib/liiiiibc.so _getwd *c *c' 15!:0 <(60#'*')
|domain error
|       '/usr/lib/liiiiibc.so _getwd *c *c' 15!:0<(60#'*')
   15!:11''
+-+--------------------------------------------------------------------+
|0|ld.so.1: jint: fatal: /usr/lib/liiiiibc.so: can't open file: errno=2|
+-+--------------------------------------------------------------------+
- cdf =: 15!:5
- Call this with an empty vector once you are done with all the shared library work. On some systems, this will trigger the "__fini" routines in those libs which are now unmapped from the process' address space (i.e. all except those used by the J system itself).
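The two typical failures, a library that cannot be opened and a function that is not in the library, show up in any dlopen-based interface. As a rough analogy only (Python's ctypes, not J, and assuming a Unix-like system), the sketch below reports them in a checkable way instead of letting the call blow up. The helper names `try_load` and `has_symbol` are hypothetical.

```python
import ctypes


def try_load(path: str):
    """Return (library, None) on success or (None, message) on failure."""
    try:
        return ctypes.CDLL(path), None
    except OSError as exc:
        # The message comes from the dynamic loader, much like the
        # text returned by 15!:11 in the example above.
        return None, str(exc)


def has_symbol(lib, name: str) -> bool:
    """True if the loaded library exports a symbol with this name."""
    # ctypes raises AttributeError for unknown symbols; hasattr turns
    # that into a plain False.
    return hasattr(lib, name)


lib, message = try_load("/usr/lib/liiiiibc.so")
if lib is None:
    print("load failed:", message)
```

Checking the load and the symbol lookup separately mirrors the "file not found" and "procedure not found" classes in the list above.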
Self-allocated memory
For many external routines you have to provide memory space which is under your control (unlike the temporary memory used for parameters, or the memory behind variables, which is managed by J and comes and goes along with (re-)assignments to those variable names). The following routines allow you to maintain such memory. The memr and memw routines are also necessary to deal with pointers to memory returned from library routines. For example, mmap'ping a file results in memory managed neither by J nor by mema/memf. But you'd use memr/memw to access the memory.
Note that it is only necessary to use mema where you need a long-time reference to the data in one and the same location. All arrays you pass are effectively passed as pointers to writable memory during a 15!:0 call. The array will be returned to you modified if the called function makes any changes. Very often, that is good enough and you don't need any "mema" allocations.
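The "pass an array, get it back modified" pattern can be sketched outside J as well. The following Python ctypes example is an analogy under assumptions: a Unix-like system where `ctypes.CDLL(None)` reaches the C library, and getcwd instead of getwd, because getcwd takes the buffer size explicitly and therefore cannot overrun the buffer. The function name `current_directory` and the default size of 4096 are choices made for this illustration.

```python
import ctypes
import os

libc = ctypes.CDLL(None)  # assumption: Unix-like system

# char *getcwd(char *buf, size_t size)
# The result is declared as a plain pointer so that NULL shows up as None.
libc.getcwd.restype = ctypes.c_void_p
libc.getcwd.argtypes = [ctypes.c_char_p, ctypes.c_size_t]


def current_directory(size: int = 4096) -> bytes:
    """Ask the C library for the working directory via a caller-owned buffer."""
    buf = ctypes.create_string_buffer(size)  # writable, zero-filled
    if not libc.getcwd(buf, size):
        raise OSError("getcwd failed; the buffer may be too small")
    return buf.value  # bytes up to the terminating NUL


print(current_directory() == os.getcwdb())
```

The buffer only lives as long as the Python object does, which corresponds to the temporary array in the J call; memory that must stay at one address for longer calls for explicit allocation.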
- mema =: 15!:3
- Memory allocate. The sole parameter is the number of bytes to allocate. Return value is the address as a J integer.
- memr =: 15!:1
- Memory read. Argument is a three- or four-element vector of integers specifying:
- base address (typically but not necessarily previously obtained by mema)
- byte offset from base address
- number of items to read or _1 for null-terminated strings
- (optional) char/integer/double/complex type code (see 3!:0), defaulting to characters/bytes.
Return value is always a vector of the requested type.
- memw =: 15!:2
- Memory write. Left argument: the data to write. Right argument: integer vector with:
- base address of memory to write to
- byte offset
- number of items to store
- type (optional, defaulting to character)
- memf =: 15!:4
- Memory free. Parameter is an address previously obtained by mema. Use this when you do not need the memory anymore.
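The allocate/write/read/free cycle has a direct counterpart in other dlopen-style interfaces. The sketch below uses Python's ctypes with the C library's malloc and free; it is an analogy to mema, memw, memr and memf, not their implementation. It assumes a Unix-like system, handles bytes only (no type codes), and the names `alloc`, `write`, `read` and `release` are invented here.

```python
import ctypes

libc = ctypes.CDLL(None)  # assumption: Unix-like system

libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]


def alloc(nbytes: int) -> int:
    """Allocate nbytes and return the address as an integer."""
    addr = libc.malloc(nbytes)
    if not addr:
        raise MemoryError("malloc returned NULL")
    return addr


def write(data: bytes, addr: int, offset: int = 0) -> None:
    """Copy data to addr + offset; the caller must stay within bounds."""
    ctypes.memmove(addr + offset, data, len(data))


def read(addr: int, offset: int, count: int) -> bytes:
    """Read count bytes starting at addr + offset."""
    return ctypes.string_at(addr + offset, count)


def release(addr: int) -> None:
    """Give the memory back; the address must not be used afterwards."""
    libc.free(addr)


a = alloc(16)
write(b"hello", a)
print(read(a, 0, 5))
release(a)
```

As with the J routines, nothing checks the offsets or counts against the allocated size, so bookkeeping is entirely the caller's job.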
For example, the following two approaches are both viable:
   cd=:15!:0
   mema=:15!:3
   memr=:15!:1
   memf=:15!:4
   statbuf_size=.500 NB. play it safe
   fn=.'/etc/passwd'
   NB. approach 1, using the array:
   ]r=. '/usr/lib/libc.so _stat i *c *c' cd '/etc/passwd';statbuf_size#'*'
+-+-----------+---------------------------------------------...---+
|0|/etc/passwd|#$%-many-cryptic-bytes-here!@#$**************...***|
+-+-----------+---------------------------------------------...---+
   statbuf=. >2{r
   NB. continue to dissect and analyse the statbuf data
   NB. approach 2, using allocated memory:
   a=.mema statbuf_size
   '/usr/lib/libc.so _stat i *c *' cd '/etc/passwd';<<a
+-+-----------+---------+
|0|/etc/passwd|+-------+|
| |           ||1505952||
| |           |+-------+|
+-+-----------+---------+
   memr a,0,10,4
8388608 0 0 0 96209 33060 1 0 3 0
   NB. have more looks at the data...
   memf a
The big difference with the mema approach is that you'll be able to keep the original data in its original location. Compare that to the temporary array approach, where statbuf is already a copy in a different location. Sometimes, this matters.
Name associations
[This part still lacks all the details. It may even be incorrect. I need to do more experiments myself, stay tuned. For the very adventurous juggler, the system/main/jmf.ijs script on the Windows version shows an application of these facilities. Expect some tough reading, though, since it requires familiarity with the organisation of J data. If you happen to have the publication "An Implementation of J" by Roger Hui: it is outdated in that the "A" header now (i.e., starting with version 4.02) maintains references to both a shape vector and the raveled data instead of including these in place. Keeping this change in mind, the "Noun" chapter provides a helpful orientation.]
- symget =: 15!:6 [. symset =: 15!:7
- Tie self-allocated memory to J referents and retrieve the values via those names.
- gh15 =: 15!:8 [. fh15 =: 15!:9
- gh15 gets a descriptive header and ties it to memory you manage. The result is that the memory appears as an ordinary J object with rank, shape, and type as other J objects.
fh15 frees the header again. Usually, you would free the memory associated with the header at the same time.