Common Lisp Package: MGL-BP

Backpropagation.

README:

FUNCTION

Public

ADD-LUMP (LUMP BPN)

Add LUMP to BPN. MAX-N-STRIPES of LUMP is set to that of the previously last non-weight lump of BPN.

BACKWARD-BPN (BPN &KEY (LAST-LUMP NIL LAST-LUMP-P))

Accumulate derivatives of weights.

FORWARD-BPN (BPN &KEY FROM-LUMP TO-LUMP END-LUMP)

Propagate the values from the already clamped inputs.
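
Together these two implement one round of backpropagation. A minimal sketch of their use (assuming a BPN whose ->INPUT lumps have already been clamped for the current batch):

  ;; Compute node values from the inputs forward, then accumulate
  ;; weight derivatives from the error lumps backward.
  (defun forward-backward (bpn)
    (mgl-bp:forward-bpn bpn)
    (mgl-bp:backward-bpn bpn))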

Undocumented

->* (&REST ARGS)

->+ (&REST ARGS)

->ABS (&REST ARGS)

->ACTIVATION (&REST ARGS)

->CONSTANT (&REST ARGS)

->CROSS-ENTROPY (&REST ARGS)

->CROSS-ENTROPY-SOFTMAX (&REST ARGS)

->DROPOUT (&REST ARGS)

->ERROR (&REST ARGS)

->EXP (&REST ARGS)

->INPUT (&REST ARGS)

->LINEAR (&REST ARGS)

->MAX (&REST ARGS)

->NORMALIZED (&REST ARGS)

->PERIODIC (&REST ARGS)

->RECTIFIED (&REST ARGS)

->REF (&REST ARGS)

->REP (&REST ARGS)

->ROUGH-EXPONENTIAL (&REST ARGS)

->SCALED-TANH (&REST ARGS)

->SIGMOID (&REST ARGS)

->SOFTMAX (&REST ARGS)

->SOFTPLUS (&REST ARGS)

->SQUARED-ERROR (&REST ARGS)

->STOCHASTIC-SIGMOID (&REST ARGS)

->STRETCH (&REST ARGS)

->SUM (&REST ARGS)

->SUM-SQUARED-ERROR (&REST ARGS)

->WEIGHT (&REST ARGS)

COLLECT-BPN-ERRORS (SAMPLER BPN &KEY COUNTERS-AND-MEASURERS)

FIND-LUMP (NAME BPN &KEY ERRORP)

REMOVE-LUMP (LUMP BPN)

RENORMALIZE-ACTIVATIONS (->ACTIVATIONS L2-UPPER-BOUND)
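
RENORMALIZE-ACTIVATIONS is undocumented, but its L2-UPPER-BOUND argument suggests the max-norm weight constraint often paired with dropout. A plain Common Lisp illustration of that general technique (not the library's implementation):

  (defun clip-l2-norm (v l2-upper-bound)
    ;; Scale V down uniformly if its L2 norm exceeds the bound.
    (let ((norm (sqrt (reduce #'+ v :key (lambda (x) (* x x))))))
      (if (> norm l2-upper-bound)
          (map 'vector (lambda (x) (* x (/ l2-upper-bound norm))) v)
          v)))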

Private

FIRST-TRAINED-WEIGHT-LUMP (TRAINER BPN)

Much time can be wasted computing derivatives of non-trained weight lumps. Return the first one that TRAINER trains.

Undocumented

->LUMP (BPN LUMP-SPEC)

->WEIGHT* (&REST ARGS)

ADD-AND-FORGET-DERIVATIVES (TRAINER BPN)

COPY-BPN-WEIGHTS (FROM-BPN TO-BPN &KEY ERROR-IF-NO-MATCH-P)

DERIVATIVES* (LUMP)

DERIVE-ROUGH-EXPONENTIAL (X &KEY SIGNAL-VARIANCE LENGTH-SCALE (ROUGHNESS 2))

ENSURE-SOFTMAX (LUMP)

LIMIT-STRIPES (LUMP N)

MAX-N-STRIPES* (LUMP)

N-STRIPES* (LUMP)

NEXT-LUMP-NAME

NODES* (LUMP)

NORM (V)

ROUGH-EXPONENTIAL (X &KEY SIGNAL-VARIANCE LENGTH-SCALE (ROUGHNESS 2))

SEGMENT-SET-DERIVATIVES->WEIGHTS (SEGMENT-SET WEIGHTS)

ZERO-NON-WEIGHT-DERIVATIVES (BPN &KEY (LAST-LUMP NIL LAST-LUMP-P))

MACRO

Public

BUILD-BPN ((&KEY (CLASS ''BPN) INITARGS (MAX-N-STRIPES 1)) &BODY LUMPS)

Syntactic sugar to assemble BPNs from lumps. Like LET*, it is a sequence of bindings (of symbols to lumps). The names of the lumps created default to the symbol of the binding. In case a lump is not bound to a symbol (because it was created in a nested expression), the local function LUMP finds the lump with the given name in the BPN being built. Example:

  (mgl-bp:build-bpn ()
    (features (mgl-bp:->input :size n-features))
    (biases (mgl-bp:->weight :size n-hiddens))
    (weights (mgl-bp:->weight :size (* n-hiddens n-features)))
    (activations0 (mgl-bp:->activation :weights weights :x features))
    (activations (mgl-bp:->+ :args (list biases activations0)))
    (output (mgl-bp:->sigmoid :x activations)))
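
The local function LUMP is handy when a lump is created in a nested expression. A hedged sketch (the :NAME initarg and the weight sharing are assumptions for illustration):

  (mgl-bp:build-bpn ()
    (features (mgl-bp:->input :size n-features))
    (hiddens1 (mgl-bp:->activation
               :weights (mgl-bp:->weight :name 'w
                                         :size (* n-hiddens n-features))
               :x features))
    ;; The nested ->WEIGHT above has no binding, but LUMP finds it
    ;; by name, here to share it with a second activation.
    (hiddens2 (mgl-bp:->activation :weights (lump 'w) :x features)))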

WITH-WEIGHTS-COPIED ((FROM-BPN) &BODY BODY)

In BODY, ->WEIGHT will first look for a weight lump of the same name in FROM-BPN and return it if found, or else create a weight lump normally. If FROM-BPN is NIL, then no weights are copied.
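
A hedged usage sketch: build a new network that reuses, by name, the weight lumps of an already trained one (TRAINED-BPN, N-FEATURES and N-HIDDENS are assumptions for illustration):

  (mgl-bp:with-weights-copied (trained-bpn)
    (mgl-bp:build-bpn ()
      (features (mgl-bp:->input :size n-features))
      ;; If TRAINED-BPN has a weight lump named WEIGHTS, that lump
      ;; is returned here instead of a fresh one being created.
      (weights (mgl-bp:->weight :size (* n-hiddens n-features)))
      (hiddens (mgl-bp:->activation :weights weights :x features))))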

Undocumented

DEFLUMP (NAME DIRECT-SUPERCLASSES DIRECT-SLOTS &REST OPTIONS)

GENERIC-FUNCTION

Public

Undocumented

COMPUTE-DERIVATIVES (SAMPLES TRAINER BPN)

DEFAULT-SIZE (LUMP)

DERIVE-LUMP (LUMP)

TRANSFER-LUMP (LUMP)

Private

Undocumented

SET-INPUT-DONE (LUMP)

SLOT-ACCESSOR

Public

CLASS-WEIGHTS (OBJECT)

If non-NIL, an FLT-VECTOR of GROUP-SIZE. Useful when TARGET's distribution differs between the training and test sets. Just set w_i to test_frequency_i / training_frequency_i.

SETFCLASS-WEIGHTS (NEW-VALUE OBJECT)

If non-NIL, an FLT-VECTOR of GROUP-SIZE. Useful when TARGET's distribution differs between the training and test sets. Just set w_i to test_frequency_i / training_frequency_i.
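
A worked example of the suggested reweighting: with two classes whose frequencies are 0.9 and 0.1 in training but 0.5 and 0.5 at test time:

  ;; w_i = test_frequency_i / training_frequency_i
  (map 'vector #'/ #(0.5 0.5) #(0.9 0.1))
  ;; => #(0.5555556 5.0)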

COST (OBJECT)

Return the sum of costs for all active stripes. The cost of a stripe is the sum of the error nodes. The second value is the number of stripes.
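
Since the number of stripes is returned as the second value, the mean cost per stripe is (a hedged sketch, assuming OBJECT is a BPN):

  (multiple-value-bind (total-cost n-stripes) (mgl-bp:cost bpn)
    (/ total-cost n-stripes))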

DEFAULT-VALUE (OBJECT)

Upon creation or resize the lump's nodes get filled with this value.

DERIVATIVES (OBJECT)

Derivatives of the nodes; derivatives of input nodes are not calculated. A 1D array representing a matrix of the same dimensions as NODES.

DROPOUT (OBJECT)

If non-NIL, then in the forward pass each node in this lump is zeroed out with probability DROPOUT. See Geoffrey Hinton's 'Improving neural networks by preventing co-adaptation of feature detectors'.
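
A plain Common Lisp illustration of the forward-pass effect (not the library's implementation):

  (defun apply-dropout (nodes dropout)
    ;; Zero each node independently with probability DROPOUT.
    (map 'vector (lambda (x) (if (< (random 1.0) dropout) 0.0 x))
         nodes))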

IMPORTANCE (OBJECT)

If non-NIL, an FLT-VECTOR of n-stripes.

SETFIMPORTANCE (NEW-VALUE OBJECT)

If non-NIL, an FLT-VECTOR of n-stripes.

INDICES-TO-CALCULATE (OBJECT)

NIL, or a simple vector of array indices into this lump's range (i.e. in the interval [0, (1- SIZE)]). Need not be ordered. If non-NIL, a node's value is not calculated and its derivatives are not propagated unless its index is in INDICES-TO-CALCULATE. It has no effect on subsequent lumps: they may use values that have not been recalculated. The primary use case is to temporarily mask out an uninteresting part of the network.

SETFINDICES-TO-CALCULATE (NEW-VALUE OBJECT)

NIL, or a simple vector of array indices into this lump's range (i.e. in the interval [0, (1- SIZE)]). Need not be ordered. If non-NIL, a node's value is not calculated and its derivatives are not propagated unless its index is in INDICES-TO-CALCULATE. It has no effect on subsequent lumps: they may use values that have not been recalculated. The primary use case is to temporarily mask out an uninteresting part of the network.
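
A plain Common Lisp sketch of the masking semantics (not the library's implementation): only the listed indices receive fresh values, the rest keep whatever they held before:

  (defun recalculate-masked (nodes indices fn)
    ;; FN computes the new value for index I; indices not in the
    ;; mask are left untouched, i.e. possibly stale.
    (loop for i across indices
          do (setf (aref nodes i) (funcall fn i)))
    nodes)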

LUMPS (OBJECT)

Lumps in reverse order.

NODES (OBJECT)

The values of the nodes. All nodes have values. Conceptually an N-STRIPES x SIZE matrix that can be enlarged up to MAX-N-STRIPES x SIZE by setting N-STRIPES.

SOFTMAX (OBJECT)

A matrix of the same size as X, EXP'ed and normalized in groups of GROUP-SIZE.

TARGET (OBJECT)

A lump of the same size as INPUT-LUMP that provides target_k in -sum_k target_k * ln(x_k), the cross entropy error.

Undocumented

GROUP-SIZE (OBJECT)

NAME (OBJECT)

NOISYP (OBJECT)

SETFNOISYP (NEW-VALUE OBJECT)

NORMALIZE-WITH-STATS-P (OBJECT)

SETFNORMALIZE-WITH-STATS-P (NEW-VALUE OBJECT)

NORMALIZED-CAP (OBJECT)

SETFNORMALIZED-CAP (NEW-VALUE OBJECT)

SIZE (OBJECT)

TRANSPOSE-WEIGHTS-P (OBJECT)

UPDATE-STATS-P (OBJECT)

SETFUPDATE-STATS-P (NEW-VALUE OBJECT)

Private

SAME-STRIPES-P (OBJECT)

Non-NIL iff all stripes are the same. If true, it effectively overrides both N-STRIPES and MAX-N-STRIPES, and there is only one column in NODES and DERIVATIVES. Set up by the lump itself, taking its inputs into account. Notably, ->WEIGHT lumps always have SAME-STRIPES-P T.

SCALE (OBJECT)

The sum of nodes after normalization. Can be changed during training, for instance when clamping. If it is a vector, then its length must be MAX-N-STRIPES, which is automatically maintained.

SETFSCALE (NEW-VALUE OBJECT)

The sum of nodes after normalization. Can be changed during training, for instance when clamping. If it is a vector, then its length must be MAX-N-STRIPES, which is automatically maintained.

X (OBJECT)

This is the input lump.

Undocumented

ARGS (OBJECT)

DROP-NEGATIVE-INDEX-P (OBJECT)

FIRST-TRAINED-LUMP (OBJECT)

INTO (OBJECT)

LENGTH-SCALE (OBJECT)

N (OBJECT)

NORMALIZED-LUMP (OBJECT)

PERIOD (OBJECT)

ROUGHNESS (OBJECT)

RUNNING-STATS (OBJECT)

SIGNAL-VARIANCE (OBJECT)

WEIGHTS (OBJECT)

Y (OBJECT)

VARIABLE

Private

Undocumented

*BPN-BEING-BUILT*

*IN-TRAINING-P*

*LUMPS-TO-COPY*

*NEXT-LUMP-NAME*

CLASS

Public

->ACTIVATION (&REST ARGS)

Perform X*WEIGHTS where X is of size M and WEIGHTS is a ->WEIGHT whose single stripe is taken to be of dimensions M x N stored in column major order. N is the size of this lump. If TRANSPOSE-WEIGHTS-P then WEIGHTS is N x M and X*WEIGHTS' is computed.
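
A plain Common Lisp illustration of the indexing that the column-major layout implies (not the library's implementation):

  (defun activate (x weights m n)
    ;; OUT_j = sum_i X_i * W_ij with the M x N matrix W stored
    ;; column major, so column J starts at index (* J M).
    (let ((out (make-array n :initial-element 0.0)))
      (dotimes (j n out)
        (dotimes (i m)
          (incf (aref out j)
                (* (aref x i) (aref weights (+ (* j m) i))))))))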

->CROSS-ENTROPY-SOFTMAX (&REST ARGS)

A specialized lump that is equivalent to hooking ->EXP with NORMALIZED-LUMP and ->CROSS-ENTROPY but is numerically stable. See <http://groups.google.com/group/comp.ai.neural-nets/msg/a7594ebea01fef04?dmode=source>. It has two parameters, X and TARGET. In the transfer phase it computes the EXP of each input node and normalizes them as if by NORMALIZED-LUMP; these intermediate values are placed into SOFTMAX. The value of node k is nodes_k = -target_k * ln(softmax_k). Since the sum of these is the cross entropy -sum_k target_k * ln(softmax_k), simply plug this lump into an ->ERROR. In the derive phase it computes the derivative of the cross entropy of the normalized input: d(-sum_k target_k * ln(softmax_k))/dx_i = sum_k target_k * (softmax_i - KDEL_ik), where KDEL is the Kronecker delta. This is equal to softmax_i - target_i if TARGET sums to 1.
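
A plain Common Lisp sketch of the math above (not the library's implementation): a numerically stable softmax followed by the softmax_i - target_i derivative:

  (defun stable-softmax (xs)
    ;; Subtracting the max before EXP avoids floating point overflow
    ;; and does not change the result.
    (let* ((max (reduce #'max xs))
           (exps (map 'vector (lambda (x) (exp (- x max))) xs))
           (sum (reduce #'+ exps)))
      (map 'vector (lambda (e) (/ e sum)) exps)))

  (defun cross-entropy-softmax-derivatives (xs target)
    ;; Valid when TARGET sums to 1: d/dx_i = softmax_i - target_i.
    (map 'vector #'- (stable-softmax xs) target))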

->ERROR (&REST ARGS)

An error node is usually a leaf in the graph of lumps. Unlike non-error leaf lumps, it gets a non-zero derivative: 1. Error lumps have exactly one node (in each stripe) whose value is computed as the sum of the nodes in the X parameter lump.

->RECTIFIED (&REST ARGS)

max(0,x) activation function. If NOISYP then add normal(0,sigmoid(x)) noise to x.

->SOFTPLUS (&REST ARGS)

log(1+exp(x)) activation function.
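
For reference, the transfer functions of ->RECTIFIED and ->SOFTPLUS in plain Common Lisp (illustration only; the overflow guard in SOFTPLUS is a common refinement, not necessarily the library's):

  (defun rectified (x)
    (max 0.0 x))

  (defun softplus (x)
    (if (> x 30.0)
        x ; log(1+exp(x)) is within single-float epsilon of x here
        (log (+ 1.0 (exp x)))))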

->SUM (&REST ARGS)

Sum of all nodes (per stripe).

Undocumented

->* (&REST ARGS)

->+ (&REST ARGS)

->ABS (&REST ARGS)

->CONSTANT (&REST ARGS)

->CROSS-ENTROPY (&REST ARGS)

->DROPOUT (&REST ARGS)

->EXP (&REST ARGS)

->INPUT (&REST ARGS)

->LINEAR (&REST ARGS)

->MAX (&REST ARGS)

->NORMALIZED (&REST ARGS)

->PERIODIC (&REST ARGS)

->REF (&REST ARGS)

->REP (&REST ARGS)

->ROUGH-EXPONENTIAL (&REST ARGS)

->SCALED-TANH (&REST ARGS)

->SIGMOID (&REST ARGS)

->SOFTMAX (&REST ARGS)

->SQUARED-ERROR (&REST ARGS)

->STOCHASTIC-SIGMOID (&REST ARGS)

->STRETCH (&REST ARGS)

->SUM-SQUARED-ERROR (&REST ARGS)

->WEIGHT (&REST ARGS)

BP-TRAINER

BPN

CG-BP-TRAINER

LUMP

Private

Undocumented

BASE-BP-TRAINER

DATA-LUMP