Common Lisp Package: LFARM-COMMON

(private) Common components for lfarm.

README:

lfarm

lfarm is a Common Lisp library for distributing work across machines using the lparallel API (http://lparallel.org).

Download

Assuming that you have Quicklisp installed,

$ cd ~/quicklisp/local-projects  
$ git clone git://github.com/lmj/lfarm.git 

lfarm is known to run on Allegro, Clozure, LispWorks, and SBCL.

Kernel

In lparallel, a kernel is defined as an abstract entity that schedules and executes tasks. lparallel implements it with a thread pool, while lfarm implements it with a set of servers that execute tasks.

;; Create two servers bound to ports 11111 and 22222.  
(ql:quickload :lfarm-server)  
(lfarm-server:start-server "127.0.0.1" 11111 :background t)  
(lfarm-server:start-server "127.0.0.1" 22222 :background t)  
 
;; Connect to the servers. `lfarm' is a package nickname for `lfarm-client'.  
(ql:quickload :lfarm-client)  
(setf lfarm:*kernel* (lfarm:make-kernel '(("127.0.0.1" 11111)  
                                          ("127.0.0.1" 22222))))  
 
;; Use the lparallel API.  
(let ((channel (lfarm:make-channel)))  
  (lfarm:submit-task channel #'+ 3 4)  
  (lfarm:receive-result channel))  
;; => 7 

Although the servers in this example are local, lfarm servers may run in separate Lisp instances on remote machines.

Tasks

There are some restrictions on a task slated for remote execution. A task must be

  1. a lambda form, or
  2. a function that exists on the remote servers, or
  3. a function defined with deftask.

deftask is just like defun except the function definition is recorded. (A Lisp implementation may record a function definition, but is not required to do so.)

(defpackage :example (:use :cl :lfarm))  
(in-package :example)  
 
(deftask add (x y)  
  (+ x y))  
 
(let ((channel (make-channel)))  
  (submit-task channel #'add 3 4)  
  (receive-result channel))  
;; => 7 

submit-task notices that add was defined with deftask and converts it to a named lambda before submitting it to a server.
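
The idea of "defun plus a recorded definition" can be sketched in plain Common Lisp. This is only an illustration under stated assumptions, not lfarm's implementation: the names `my-deftask`, `task-form`, and `*task-definitions*` are hypothetical, and the recorded form is an anonymous lambda rather than a named one.

```lisp
;; Hypothetical sketch; lfarm's real deftask may work differently.
(defvar *task-definitions* (make-hash-table :test #'eq)
  "Maps a task name to the lambda form recorded for it.")

(defmacro my-deftask (name lambda-list &body body)
  "Define NAME like DEFUN and also record its definition as data."
  `(progn
     (setf (gethash ',name *task-definitions*)
           '(lambda ,lambda-list ,@body))
     (defun ,name ,lambda-list ,@body)))

(defun task-form (name)
  "Return the lambda form recorded for NAME, or NAME itself if none exists."
  (values (gethash name *task-definitions* name)))

(my-deftask add (x y)
  (+ x y))

(task-form 'add)   ; => (LAMBDA (X Y) (+ X Y)), can be shipped as data
(task-form 'car)   ; => CAR, assumed to exist on the receiving side
```

A submitting function could call something like `task-form` to decide whether to send a full definition or just a name.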

deftask* is a variant of deftask which records the function body without defining the function.

To define add remotely use broadcast-task, which executes a given task on all servers.

(broadcast-task (lambda () (defun add (x y) (+ x y)))) 

Or more likely add would be part of a system that is loaded on all servers.

(broadcast-task #'ql:quickload :my-stuff) 

Limited support for closures is available on SBCL, CCL, LispWorks, and Allegro. Lexical variables and symbol macrolets are captured, but flet functions are not.

Tasks are not macroexpanded; this ensures portability across clients and servers.

API

The lfarm-client system defines the lfarm-client package which has the nickname lfarm. It exports the lparallel kernel API with the following differences.

  • tasks have the aforementioned restrictions placed upon them
  • the addition of deftask and its non-locally-defining cousin deftask*
  • make-kernel expects addresses, and lacks the :context and :bindings arguments
  • task-handler-bind does not exist
  • *debug-tasks-p* and *kernel-spin-count* exist but have no effect
  • submit-task is a macro that wraps submit-task* (explained below)
  • the addition of broadcast-task which similarly wraps broadcast-task*
  • task-execution-error is signaled when a task fails on a remote server, instead of the actual error (which may not have local meaning)

Promises and a limited number of cognates are also available, found in the packages lfarm-client.promise and lfarm-client.cognate respectively and also exported by lfarm-client.

The systems lfarm-server and lfarm-admin provide the following functions.

  • lfarm-server:start-server host port &key background name -- Start a server instance listening at host:port. If background is true then spawn the server in a separate thread named name.

  • lfarm-admin:ping host port &key timeout -- Send a ping to the lfarm server at host:port. Keep trying to make contact for timeout seconds, or if timeout is nil then try forever. Default is 3 seconds. Returns true if successful and nil otherwise.

  • lfarm-admin:end-server host port -- End the server at host:port. This only stops new connections from being made. Connections in progress are unaffected.

Security

The purpose of an lfarm server is to execute arbitrary code, so it is highly advised to enable some form of security. lfarm directly supports Kerberos (or Active Directory) authentication. Alternatively, SSH tunnels may be used.

Security with SSH tunneling

;; On the remote machine  
(ql:quickload :lfarm-server)  
(lfarm-server:start-server "127.0.0.1" 33333) 

To create a tunnel,

# On the local machine  
$ ssh -f -L 33333:127.0.0.1:33333 <remote-address> -N 

The remote server should now be accessible locally.

;; On the local machine  
(ql:quickload :lfarm-admin)  
(lfarm-admin:ping "127.0.0.1" 33333) ;=> T 

Of course there is still local security to consider, as local users on both ends have access to the server. If this is a concern then a packet filtering tool such as iptables may be used.

Security with Kerberos/GSSAPI

The lfarm-gss system provides support for GSSAPI authentication. The :auth argument to lfarm-server:start-server and lfarm-client:make-kernel accepts an instance of lfarm-gss:gss-auth-server and lfarm-gss:gss-auth-client respectively.

When creating a server, the class lfarm-gss:gss-auth-server accepts the initialization keyword :service-name. This value indicates which service type should be used when requesting a ticket for the remote service. The default is lfarm. In other words, if an attempt is made to connect to the server at server.example.com, the service principal will be lfarm/server.example.com.

When creating a kernel (client), the class lfarm-gss:gss-auth-client accepts the initialization keyword :allowed-users which specifies a list of all users that are allowed to connect to the server. Each element should be a string representing the principal name (including realm) of the user that is allowed to connect. For example: user@EXAMPLE.COM.

If you need an authorization mechanism more complex than the simple user list described above, you can subclass the gss-auth-server class and implement the method lfarm-gss:name-accepted on your new class. This generic function takes two arguments, the authentication object and the name to be verified, and should return non-NIL if the user is allowed to connect. Note that the name is an instance of cl-gss:name, and you need to call the function cl-gss:name-to-string on it to extract the actual name.

The server needs to have access to the service principal in a keytab file. How to create the keytab file depends on your Kerberos server implementation:

  • For MIT Kerberos: http://web.mit.edu/kerberos/krb5-1.5/krb5-1.5.4/doc/krb5-admin/Adding-Principals-to-Keytabs.html

  • For Heimdal: http://www.h5l.org/manual/HEAD/info/heimdal/keytabs.html (don't forget to add the -k flag to specify the file to which the key should be written)

  • For Active Directory: http://technet.microsoft.com/en-us/library/bb742433.aspx

Once you have the keytab file, make sure that the environment variable KRB5_KTNAME is set to the path of the keytab file and that the file is readable by the lfarm server instance. Otherwise the server will not be able to authenticate itself to the client, which prevents the connection from being established.

Details

That covers perhaps all you need to know about lfarm. Those who are curious may read on (or not).

Serialization

Serialization is done with cl-store. It uses a portable serialization format, allowing lfarm clients and servers to run on different Lisp implementations.

Packages

A symbol is deserialized on the remote server with its home package intact. If the server encounters a symbol whose package does not exist, an empty version of the package is automatically generated.
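
The "create an empty package on demand" behavior can be sketched with standard package functions. This is an assumption-laden illustration, not lfarm's deserialization code; `ensure-symbol` and the package name `SKETCH-MISSING-PACKAGE` are hypothetical.

```lisp
;; Hypothetical sketch of resolving a symbol whose home package may be absent.
(defun ensure-symbol (symbol-name package-name)
  "Intern SYMBOL-NAME in PACKAGE-NAME, creating an empty package if needed."
  (let ((package (or (find-package package-name)
                     ;; :use '() makes the new package inherit nothing.
                     (make-package package-name :use '()))))
    (values (intern symbol-name package))))

(let ((sym (ensure-symbol "WIDGET" "SKETCH-MISSING-PACKAGE")))
  (list (symbol-name sym)
        (package-name (symbol-package sym))))
;; => ("WIDGET" "SKETCH-MISSING-PACKAGE")
```

A symbol resolved this way has no function or value attached, so calling it as a task would still require the real definition to be loaded on the server.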

Connection errors

The lfarm client is obstinate with regard to connections: if there is a connection error then it tries to reconnect, and will continue trying. We may therefore restart servers while using the same kernel instance, or call make-kernel before any servers exist (the call will block until they do).
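
A retry-until-connected loop of this kind might look as follows. This is a minimal sketch in plain Common Lisp, not lfarm's code: `call-with-retry`, its `:interval` and `:timeout` keywords, and the fake connect function are all hypothetical, and a real client would open a socket instead.

```lisp
;; Hypothetical sketch of an obstinate connect loop.
(defun call-with-retry (connect-fn &key (interval 0.01) timeout)
  "Call CONNECT-FN until it returns non-NIL or TIMEOUT seconds have elapsed.
A TIMEOUT of NIL means retry forever. Errors count as failed attempts."
  (let ((start (get-internal-real-time)))
    (flet ((expired-p ()
             (and timeout
                  (>= (/ (- (get-internal-real-time) start)
                         internal-time-units-per-second)
                      timeout))))
      (loop
        (let ((result (ignore-errors (funcall connect-fn))))
          (cond (result (return result))
                ((expired-p) (return nil))
                (t (sleep interval))))))))

;; A stand-in for a server that only becomes reachable on the third attempt.
(let ((attempts 0))
  (call-with-retry (lambda ()
                     (incf attempts)
                     (if (< attempts 3)
                         (error "connection refused")
                         attempts))
                   :timeout 5))
;; => 3
```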

Note it is possible for a task to be executed twice (or more). If a connection error occurs in the time interval after a task has been submitted and before its result has been received, the client will attempt to submit the task again.
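
One way to live with possible re-execution is to make side-effecting tasks idempotent. The sketch below is plain Common Lisp with hypothetical names (`record-payment`, `*completed*`, `*total*`); it is a design suggestion rather than anything lfarm provides, and a server handling several clients at once would additionally need a lock around the shared state.

```lisp
;; Hypothetical sketch: a task that is safe to run more than once.
(defvar *completed* (make-hash-table :test #'equal)
  "Identifiers of requests that have already been applied.")

(defvar *total* 0
  "Running total affected by the task.")

(defun record-payment (id amount)
  "Add AMOUNT to *TOTAL* at most once per ID; return the running total."
  (unless (gethash id *completed*)
    (setf (gethash id *completed*) t)
    (incf *total* amount))
  *total*)

(record-payment "order-1" 10) ; => 10
(record-payment "order-1" 10) ; => 10, a resubmission changes nothing
(record-payment "order-2" 5)  ; => 15
```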

submit-task

In lparallel submit-task is a function, but in lfarm it is a macro that provides syntactic sugar for the function submit-task*.

(submit-task channel #'+ 3 4)  
;; =macroexpand=> (SUBMIT-TASK* CHANNEL '+ 3 4)  
 
(submit-task channel (lambda (x) (1+ x)) 3)  
;; =macroexpand=> (SUBMIT-TASK* CHANNEL '(LAMBDA (X) (1+ X)) 3) 

submit-task may alter the task argument before giving it to submit-task*, which expects a symbol or a lambda form. Sharp-quote is replaced with quote, and a lambda form gets quoted. This makes submit-task resemble lparallel:submit-task and relieves us from having to write '(lambda ...) and 'f in place of (lambda ...) and #'f.
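
The rewriting rule described above can be reproduced with a small macro. This is a sketch under stated assumptions: `my-submit-task` and `my-submit-task*` are hypothetical stand-ins, and the real macro may handle more cases (for example tasks defined with deftask).

```lisp
;; Hypothetical stand-in for the function that would send the task.
(defun my-submit-task* (channel task &rest args)
  "Return the pieces it was given so the rewriting can be inspected."
  (list channel task args))

(defmacro my-submit-task (channel task &rest args)
  "Quote TASK when it is #'NAME or a lambda form, then call MY-SUBMIT-TASK*."
  (let ((task (cond ((and (consp task) (eq (first task) 'function))
                     `',(second task))   ; #'f is read as (function f)
                    ((and (consp task) (eq (first task) 'lambda))
                     `',task)            ; keep the lambda form as data
                    (t task))))          ; anything else is left alone
    `(my-submit-task* ,channel ,task ,@args)))

(macroexpand-1 '(my-submit-task channel #'+ 3 4))
;; => (MY-SUBMIT-TASK* CHANNEL '+ 3 4)

(macroexpand-1 '(my-submit-task channel (lambda (x) (1+ x)) 3))
;; => (MY-SUBMIT-TASK* CHANNEL '(LAMBDA (X) (1+ X)) 3)
```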

Logging

Verbose logging is enabled by binding lfarm-common:*log-level* to :info (default is :error). The log stream is lfarm-common:*log-stream* (default is *debug-io*).

Tests

The lfarm test suite assumes a working ssh executable is present and that passwordless authorization has been set up for "ssh localhost". To run it load the lfarm-test system and call lfarm-test:execute, which may be given some configuration options. Unrecognized Lisp implementations will require configuration (namely, specifying the lisp executable and the command-line switch to eval a form). Tests also assume that Quicklisp has been installed (but not necessarily loaded), although configuration may remove this assumption.

Implementation

The client has an internal lparallel kernel in which each worker thread manages a connection to an assigned remote server, one worker per server. When a worker connects to a server, the server enters a task execution loop wherein a form is deserialized, maybe compiled, and funcalled; repeat. A server may serve multiple clients.
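
The server side of this loop (deserialize, maybe compile, funcall, repeat) can be approximated in a few lines. The sketch below swaps the real wire format for the Lisp reader over a string stream, so it only illustrates the control flow; `run-task-loop` is a hypothetical name, and each incoming item is assumed to be a list of a task followed by its already-evaluated arguments.

```lisp
;; Hypothetical sketch of a task execution loop.
(defun run-task-loop (input)
  "Read (TASK . ARGS) forms from INPUT until end of file; return the results."
  (let ((*read-eval* nil))               ; do not honor #. while reading
    (loop for form = (read input nil :eof)
          until (eq form :eof)
          collect (destructuring-bind (task &rest args) form
                    (let ((fn (if (symbolp task)
                                  (symbol-function task) ; already defined here
                                  (compile nil task))))  ; a lambda form
                      (apply fn args))))))

(with-input-from-string (in "(+ 3 4) ((lambda (x) (1+ x)) 3)")
  (run-task-loop in))
;; => (7 4)
```

A real server would also send each result back to the client and report errors instead of letting them unwind the loop.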

Though an async backend is possible, this threaded implementation was chosen because it was easy and portable.

Opportunities for optimization in the realm of remote task queues and remote task stealing have been callously ignored. Task queues are local.

Author

James M. Lawrence <llmjjmll@gmail.com>

Kerberos support by Elias Martenson <lokedhs@gmail.com>

FUNCTION

Public

TASK-ERROR-DATA-DESC (INSTANCE)

TASK-ERROR-DATA-REPORT (INSTANCE)

Undocumented

DESERIALIZE-BUFFER (BUFFER)

ENSURE-ADDRESSES (ADDRESSES)

LAMBDA-LIST-PARAMETERS (LAMBDA-LIST &KEY DISCARD-AUX)

MAKE-TASK-ERROR-DATA (ERR)

RECEIVE-OBJECT (STREAM)

RECEIVE-SERIALIZED-BUFFER (STREAM)

SEND-OBJECT (OBJECT STREAM)

SERIALIZE-TO-BUFFER (OBJECT)

SOCKET-ACCEPT (SOCKET)

SOCKET-CLOSE (SOCKET)

SOCKET-CONNECT (HOST PORT)

SOCKET-CONNECT/RETRY (HOST PORT &KEY TIMEOUT)

SOCKET-LISTEN (HOST PORT)

(SETF TASK-ERROR-DATA-DESC) (NEW-VALUE INSTANCE)

(SETF TASK-ERROR-DATA-REPORT) (NEW-VALUE INSTANCE)

UNSPLICE (FORM)

WAIT-FOR-INPUT (SOCKET &KEY TIMEOUT)

Private

MAKE-LOCK (&OPTIONAL NAME)

Creates a lock (a mutex) whose name is NAME. If the system does not support multiple threads this will still return some object, but it may not be used for very much.

Undocumented

%MAKE-TASK-ERROR-DATA (&KEY ((REPORT DUM0) NIL) ((DESC DUM1) NIL))

%SOCKET-CONNECT/RETRY (HOST PORT TIMEOUT)

BACKEND-DESERIALIZE (STREAM)

BACKEND-SERIALIZE (OBJECT STREAM)

BAD-ADDRESS (THING)

CALL-WITH-CONNECTED-SOCKET (BODY-FN65 SOCKET-VALUE)

CALL-WITH-CONNECTED-STREAM (BODY-FN101 SOCKET-VALUE)

CALL-WITH-EACH-ADDRESS (BODY-FN20 ADDRESSES)

CALL-WITH-EACH-ADDRESS/HANDLE-ERROR (BODY-FN52 ADDRESSES FN-NAME)

COPY-TASK-ERROR-DATA (INSTANCE)

ENSURE-ADDRESS (ADDRESS)

EXPIREDP (START TIMEOUT)

GET-TIME

MAKE-SOCKET (USOCKET)

MAKE-STREAMING-CLIENT-SOCKET (USOCKET SERVER-NAME)

MAKE-STREAMING-SERVER-SOCKET (USOCKET)

MAKE-STREAMING-SOCKET (INIT-FN USOCKET &REST ARGS)

STRIP-AUX (LAMBDA-LIST)

TASK-ERROR-DATA-P (OBJECT)

TIMESTAMP

TRANSLATE-ERROR (ERR)

WRITE-LOG (LEVEL PACKAGE &REST ARGS)

MACRO

Public

DEFWITH (MACRO-NAME LAMBDA-LIST &BODY BODY)

Define a function along with a macro that expands to a call of that function. Inside `defwith' is an flet named `call-body'.

(defwith with-foo (value)
  (let ((*foo* value))
    (call-body)))

is equivalent to

(defun call-with-foo (body-fn value)
  (let ((*foo* value))
    (funcall body-fn)))

(defmacro with-foo ((value) &body body)
  `(call-with-foo (lambda () ,@body) ,value))

Placing a `:vars' form at the head of the lambda list will generate a macro that assigns to the given variables.

(defwith with-add-result ((:vars result) x y)
  (call-body (+ x y)))

is equivalent to

(defun call-with-add-result (body-fn x y)
  (funcall body-fn (+ x y)))

(defmacro with-add-result ((result x y) &body body)
  `(call-with-add-result (lambda (,result) ,@body) ,x ,y))
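
To see what the generated pair does in practice, the equivalent definitions from the docstring can be written out by hand and called. The sketch below uses only standard Common Lisp; the `defvar` for `*foo*` and the sample calls are additions for illustration.

```lisp
(defvar *foo* nil)

;; What (defwith with-foo (value) ...) is documented to be equivalent to.
(defun call-with-foo (body-fn value)
  (let ((*foo* value))
    (funcall body-fn)))

(defmacro with-foo ((value) &body body)
  `(call-with-foo (lambda () ,@body) ,value))

(with-foo (42)
  (* 2 *foo*))
;; => 84

;; The :vars variant: the body receives a value through a lambda parameter.
(defun call-with-add-result (body-fn x y)
  (funcall body-fn (+ x y)))

(defmacro with-add-result ((result x y) &body body)
  `(call-with-add-result (lambda (,result) ,@body) ,x ,y))

(with-add-result (sum 3 4)
  (list :sum sum))
;; => (:SUM 7)
```

Keeping the logic in a function and the macro thin means the function can be redefined without recompiling every use of the macro.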

NAMED-LAMBDA (NAME LAMBDA-LIST &BODY BODY)

Expands into a lambda-expression within whose BODY NAME denotes the corresponding function.
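
A macro with this behavior can be sketched with `labels`. The name `my-named-lambda` is hypothetical and the exported macro's actual expansion is not shown in this document, so treat the following as one plausible way to get a self-referencing anonymous function.

```lisp
;; Hypothetical sketch of a named lambda built on LABELS.
(defmacro my-named-lambda (name lambda-list &body body)
  "Return a function whose BODY can call itself through NAME."
  `(labels ((,name ,lambda-list ,@body))
     #',name))

(funcall (my-named-lambda fact (n)
           (if (zerop n)
               1
               (* n (fact (1- n)))))
         5)
;; => 120
```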

UNWIND-PROTECT/SAFE (&KEY PREPARE MAIN CLEANUP ABORT)

Interrupt-safe `unwind-protect'.

  `prepare' : executed first, outside of `unwind-protect'
  `main'    : protected form
  `cleanup' : cleanup form
  `abort'   : executed if `main' does not finish

UNWIND-PROTECT/SAFE-BIND (&KEY BIND MAIN CLEANUP ABORT)

Bind a variable inside `unwind-protect' with interrupt safety.

WHEN-LET (BINDINGS &BODY FORMS)

Creates new variable bindings, and conditionally executes FORMS.

BINDINGS must be either a single binding of the form:

(variable initial-form)

or a list of bindings of the form:

((variable-1 initial-form-1)
 (variable-2 initial-form-2)
 ...
 (variable-n initial-form-n))

All initial-forms are executed sequentially in the specified order. Then all the variables are bound to the corresponding values.

If all variables were bound to true values, then FORMS are executed as an implicit PROGN.

WHEN-LET* (BINDINGS &BODY FORMS)

Creates new variable bindings, and conditionally executes FORMS.

BINDINGS must be either a single binding of the form:

(variable initial-form)

or a list of bindings of the form:

((variable-1 initial-form-1)
 (variable-2 initial-form-2)
 ...
 (variable-n initial-form-n))

Each initial-form is executed in turn, and the variable bound to the corresponding value. Initial-form expressions can refer to variables previously bound by the WHEN-LET*.

Execution of WHEN-LET* stops immediately if any initial-form evaluates to NIL. If all initial-forms evaluate to true, then FORMS are executed as an implicit PROGN.
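
The short-circuiting behavior described for WHEN-LET* can be demonstrated with a reduced version of the macro. `my-when-let*` is a hypothetical sketch that accepts only the list-of-bindings form; `*servers*` and `server-port` are made up for the example.

```lisp
;; Hypothetical, simplified sketch of a sequential WHEN-LET*.
(defmacro my-when-let* (bindings &body forms)
  "Bind sequentially; stop and return NIL at the first NIL value."
  (if (null bindings)
      `(progn ,@forms)
      (destructuring-bind ((var init) &rest more) bindings
        `(let ((,var ,init))
           (when ,var
             (my-when-let* ,more ,@forms))))))

(defparameter *servers* '(("alpha" . 11111) ("beta" . nil)))

(defun server-port (name)
  "Return the port registered for NAME, or NIL if either lookup fails."
  (my-when-let* ((entry (assoc name *servers* :test #'string=))
                 (port (cdr entry)))   ; may refer to ENTRY bound above
    port))

(server-port "alpha") ; => 11111
(server-port "beta")  ; => NIL, the entry exists but its port is NIL
(server-port "gamma") ; => NIL, the second initial-form is never evaluated
```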

WITH-GENSYMS (NAMES &BODY FORMS)

Binds each variable named by a symbol in NAMES to a unique symbol around FORMS. Each of NAMES must be either a symbol, or of the form:

(symbol string-designator)

Bare symbols appearing in NAMES are equivalent to:

(symbol symbol)

The string-designator is used as the argument to GENSYM when constructing the unique symbol the named variable will be bound to.
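
The purpose of such a macro is easiest to see in a macro that must evaluate its argument exactly once. The sketch below is a reduced, hypothetical `my-with-gensyms` that supports bare symbols only (not the `(symbol string-designator)` form), together with a made-up `square-of` macro that uses it.

```lisp
;; Hypothetical, simplified sketch handling bare symbols only.
(defmacro my-with-gensyms (names &body forms)
  "Bind each symbol in NAMES to a fresh uninterned symbol around FORMS."
  `(let ,(mapcar (lambda (name)
                   `(,name (gensym ,(symbol-name name))))
                 names)
     ,@forms))

(defmacro square-of (form)
  "Expand to code that evaluates FORM once and squares the result."
  (my-with-gensyms (value)
    `(let ((,value ,form))
       (* ,value ,value))))

(let ((calls 0))
  (list (square-of (progn (incf calls) 7))
        calls))
;; => (49 1)
```

Because the temporary variable is an uninterned symbol, it cannot collide with a variable named `value` in the caller's code.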

WITH-TAG (RETRY-TAG &BODY BODY)

For those of us who forget RETURN-FROM inside TAGBODY.

Undocumented

ALIAS-FUNCTION (ALIAS ORIG)

ALIAS-MACRO (ALIAS ORIG)

BAD (&REST ARGS)

DOSEQUENCE ((VAR SEQUENCE &OPTIONAL RETURN) &BODY BODY)

IMPORT-NOW (&REST SYMBOLS)

INFO (&REST ARGS)

REPEAT (N &BODY BODY)

WITH-CONNECTED-SOCKET (&WHOLE WHOLE86 (SOCKET-VAR SOCKET-VALUE) &BODY BODY87)

WITH-CONNECTED-STREAM (&WHOLE WHOLE122 (STREAM-VAR SOCKET-VALUE) &BODY BODY123)

WITH-EACH-ADDRESS (&WHOLE WHOLE37 (HOST PORT ADDRESSES) &BODY BODY38)

WITH-EACH-ADDRESS/HANDLE-ERROR (&WHOLE WHOLE83 (HOST PORT ADDRESSES FN-NAME) &BODY BODY84)

WITH-ERRORS-LOGGED (&BODY BODY)

WITH-TIMEOUT ((TIMEOUT) &BODY BODY)

Private

WITH-LOCK-HELD ((PLACE) &BODY BODY)

Evaluates BODY with the lock named by PLACE, the value of which is a lock created by MAKE-LOCK. Before the forms in BODY are evaluated, the lock is acquired as if by using ACQUIRE-LOCK. After the forms in BODY have been evaluated, or if a non-local control transfer is caused (e.g. by THROW or SIGNAL), the lock is released as if by RELEASE-LOCK. Note that if the debugger is entered, it is unspecified whether the lock is released at debugger entry or at debugger exit when execution is restarted.

Undocumented

DEFINE-WITH-FN (MACRO-NAME FN-NAME LAMBDA-LIST DECLARES BODY)

DEFINE-WITH-MACRO (MACRO-NAME FN-NAME LAMBDA-LIST VARS DOC)

FLET-ALIAS ((NAME FN) &BODY BODY)

WITH-INTERRUPTS (&BODY BODY)

WITHOUT-INTERRUPTS (&BODY BODY)

SLOT-ACCESSOR

Public

Undocumented

SOCKET-STREAM (OBJECT)

Private

Undocumented

USOCKET (OBJECT)

VARIABLE

Public

*LOG-LEVEL*

Set to :error to log only errors; set to :info for verbosity.

*LOG-STREAM*

Stream for logging.

Undocumented

*AUTH*

*CONNECT-RETRY-INTERVAL*

*ELEMENT-TYPE*

Private

Undocumented

*LOG-LOCK*

CLASS

Public

Undocumented

TASK-ERROR-DATA

Private

Undocumented

SOCKET

STREAMING-SOCKET

CONDITION

Public

Undocumented

CONNECTION-REFUSED-ERROR

CORRUPTED-STREAM-ERROR

Private

UNKNOWN-ERROR

Error raised when there's no other - more applicable - error available.

Undocumented

CONNECTION-ABORTED-ERROR

TIMEOUT-ERROR

CONSTANT

Public

Undocumented

+CORRUPT-STREAM-FLAG+