# FUNCTION

# Public

# ADD-LUMP (LUMP BPN)

Add LUMP to BPN. The MAX-N-STRIPES of LUMP is set to that of the
previously last non-weight lump of BPN.

# BACKWARD-BPN (BPN &KEY (LAST-LUMP NIL LAST-LUMP-P))

Accumulate derivatives of weights.

# FORWARD-BPN (BPN &KEY FROM-LUMP TO-LUMP END-LUMP)

Propagate the values from the already clamped inputs.
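
A typical training step runs these two in sequence. Below is a
minimal sketch (not MGL's trainer code), assuming the inputs of BPN
have already been clamped:

```lisp
(defun forward-backward (bpn)
  ;; Compute node values from the clamped ->INPUT lumps through the net.
  (mgl-bp:forward-bpn bpn)
  ;; Accumulate derivatives of the weights into DERIVATIVES.
  (mgl-bp:backward-bpn bpn))
```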

# Undocumented

# ->* (&REST ARGS)

# ->+ (&REST ARGS)

# ->ABS (&REST ARGS)

# ->ACTIVATION (&REST ARGS)

# ->CONSTANT (&REST ARGS)

# ->CROSS-ENTROPY (&REST ARGS)

# ->CROSS-ENTROPY-SOFTMAX (&REST ARGS)

# ->DROPOUT (&REST ARGS)

# ->ERROR (&REST ARGS)

# ->EXP (&REST ARGS)

# ->INPUT (&REST ARGS)

# ->LINEAR (&REST ARGS)

# ->MAX (&REST ARGS)

# ->NORMALIZED (&REST ARGS)

# ->PERIODIC (&REST ARGS)

# ->RECTIFIED (&REST ARGS)

# ->REF (&REST ARGS)

# ->REP (&REST ARGS)

# ->ROUGH-EXPONENTIAL (&REST ARGS)

# ->SCALED-TANH (&REST ARGS)

# ->SIGMOID (&REST ARGS)

# ->SOFTMAX (&REST ARGS)

# ->SOFTPLUS (&REST ARGS)

# ->SQUARED-ERROR (&REST ARGS)

# ->STOCHASTIC-SIGMOID (&REST ARGS)

# ->STRETCH (&REST ARGS)

# ->SUM (&REST ARGS)

# ->SUM-SQUARED-ERROR (&REST ARGS)

# ->WEIGHT (&REST ARGS)

# COLLECT-BPN-ERRORS (SAMPLER BPN &KEY COUNTERS-AND-MEASURERS)

# FIND-LUMP (NAME BPN &KEY ERRORP)

# REMOVE-LUMP (LUMP BPN)

# RENORMALIZE-ACTIVATIONS (->ACTIVATIONS L2-UPPER-BOUND)

# Private

# FIRST-TRAINED-WEIGHT-LUMP (TRAINER BPN)

Much time can be wasted computing derivatives of non-trained weight
lumps. Return the first one that TRAINER trains.

# Undocumented

# ->LUMP (BPN LUMP-SPEC)

# ->WEIGHT* (&REST ARGS)

# ADD-AND-FORGET-DERIVATIVES (TRAINER BPN)

# COPY-BPN-WEIGHTS (FROM-BPN TO-BPN &KEY ERROR-IF-NO-MATCH-P)

# DERIVATIVES* (LUMP)

# DERIVE-ROUGH-EXPONENTIAL (X &KEY SIGNAL-VARIANCE LENGTH-SCALE (ROUGHNESS 2))

# ENSURE-SOFTMAX (LUMP)

# LIMIT-STRIPES (LUMP N)

# MAX-N-STRIPES* (LUMP)

# N-STRIPES* (LUMP)

# NEXT-LUMP-NAME

# NODES* (LUMP)

# NORM (V)

# ROUGH-EXPONENTIAL (X &KEY SIGNAL-VARIANCE LENGTH-SCALE (ROUGHNESS 2))

# SEGMENT-SET-DERIVATIVES->WEIGHTS (SEGMENT-SET WEIGHTS)

# ZERO-NON-WEIGHT-DERIVATIVES (BPN &KEY (LAST-LUMP NIL LAST-LUMP-P))

# MACRO

# Public

# BUILD-BPN ((&KEY (CLASS ''BPN) INITARGS (MAX-N-STRIPES 1)) &BODY LUMPS)

Syntactic sugar to assemble BPNs from lumps. Like LET*, it is a
sequence of bindings (of symbols to lumps). The names of the lumps
created default to the symbol of the binding. In case a lump is not
bound to a symbol (because it was created in a nested expression), the
local function LUMP finds the lump with the given name in the BPN
being built. Example:

```lisp
(mgl-bp:build-bpn ()
  (features (mgl-bp:->input :size n-features))
  (biases (mgl-bp:->weight :size n-features))
  (weights (mgl-bp:->weight :size (* n-hiddens n-features)))
  (activations0 (mgl-bp:->activation :weights weights :x features))
  (activations (mgl-bp:->+ :args (list biases activations0)))
  (output (mgl-bp:->sigmoid :x activations)))
```
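
Since lump names default to the binding symbols, a lump can later be
retrieved by name with FIND-LUMP. A hedged sketch (N-FEATURES and
N-HIDDENS are assumed to be bound):

```lisp
(let ((bpn (mgl-bp:build-bpn ()
             (features (mgl-bp:->input :size n-features))
             (weights (mgl-bp:->weight :size (* n-hiddens n-features)))
             (hiddens (mgl-bp:->activation :weights weights :x features)))))
  ;; :ERRORP T signals an error if no lump named HIDDENS is found.
  (mgl-bp:find-lump 'hiddens bpn :errorp t))
```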

# WITH-WEIGHTS-COPIED ((FROM-BPN) &BODY BODY)

In BODY, ->WEIGHT will first look up whether a weight lump of the same
name exists in FROM-BPN and return that, or else create a weight lump
normally. If FROM-BPN is NIL, then no weights are copied.
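
A hedged sketch: building a second BPN whose ->WEIGHT calls return the
identically named weight lumps of BPN-1 instead of creating fresh ones
(BPN-1 and the sizes are assumptions):

```lisp
(mgl-bp:with-weights-copied (bpn-1)
  (mgl-bp:build-bpn ()
    (features (mgl-bp:->input :size n-features))
    ;; Returned from BPN-1 if a weight lump named WEIGHTS exists there.
    (weights (mgl-bp:->weight :size (* n-hiddens n-features)))
    (hiddens (mgl-bp:->activation :weights weights :x features))))
```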

# Undocumented

# DEFLUMP (NAME DIRECT-SUPERCLASSES DIRECT-SLOTS &REST OPTIONS)

# GENERIC-FUNCTION

# Public

# Undocumented

# COMPUTE-DERIVATIVES (SAMPLES TRAINER BPN)

# DEFAULT-SIZE (LUMP)

# DERIVE-LUMP (LUMP)

# TRANSFER-LUMP (LUMP)

# Private

# Undocumented

# SET-INPUT-DONE (LUMP)

# SLOT-ACCESSOR

# Public

# CLASS-WEIGHTS (OBJECT)

If non-NIL, an FLT-VECTOR of length GROUP-SIZE. Useful when TARGET's
distribution is different on the training and test sets. Just set
w_i to test_frequency_i / training_frequency_i.
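
A hypothetical example: with two classes at 10% and 90% of the
training set but 50% each on the test set, the weights come out to
0.5/0.1 = 5 and 0.5/0.9 ≈ 0.56 (LUMP is assumed to be bound):

```lisp
(setf (mgl-bp:class-weights lump)
      (map 'vector #'/ #(0.5d0 0.5d0) #(0.1d0 0.9d0)))
;; => #(5.0d0 0.5555555555555556d0)
```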

# (SETF CLASS-WEIGHTS) (NEW-VALUE OBJECT)

If non-NIL, an FLT-VECTOR of length GROUP-SIZE. Useful when TARGET's
distribution is different on the training and test sets. Just set
w_i to test_frequency_i / training_frequency_i.

# COST (OBJECT)

Return the sum of costs for all active stripes. The cost of a
stripe is the sum of the error nodes. The second value is the number
of stripes.

# DEFAULT-VALUE (OBJECT)

Upon creation or resize the lump's nodes get
filled with this value.

# DERIVATIVES (OBJECT)

Derivatives of the nodes; input node derivatives are not calculated.
A 1d array representing a matrix of the same dimension as NODES.

# DROPOUT (OBJECT)

If non-NIL, then in the forward pass zero out each node in this
lump with DROPOUT probability. See Geoffrey Hinton's 'Improving
neural networks by preventing co-adaptation of feature detectors'.
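
A hedged one-liner (the :X and :DROPOUT initargs are assumed from the
slot accessors): a dropout lump over a HIDDENS lump with a 0.5 zeroing
probability:

```lisp
(mgl-bp:->dropout :x hiddens :dropout 0.5d0)
```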

# IMPORTANCE (OBJECT)

If non-NIL, an FLT-VECTOR of length N-STRIPES.

# (SETF IMPORTANCE) (NEW-VALUE OBJECT)

If non-NIL, an FLT-VECTOR of length N-STRIPES.

# INDICES-TO-CALCULATE (OBJECT)

NIL or a simple vector of array indices into this lump's range
(i.e. in the 0 to (1- SIZE) interval). Need not be ordered. If not
NIL, a node's value is not calculated and its derivatives are not
propagated unless its index is in INDICES-TO-CALCULATE. It has no
effect on subsequent lumps: they may use values that have not been
recalculated. The primary use-case is to temporarily mask out an
uninteresting part of the network.
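
A minimal sketch: masking a lump down to its first three nodes, then
restoring full computation:

```lisp
;; Only nodes 0, 1 and 2 are calculated and their derivatives propagated.
(setf (mgl-bp:indices-to-calculate lump) (vector 0 1 2))
;; NIL restores the default: calculate everything.
(setf (mgl-bp:indices-to-calculate lump) nil)
```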

# (SETF INDICES-TO-CALCULATE) (NEW-VALUE OBJECT)

NIL or a simple vector of array indices into this lump's range
(i.e. in the 0 to (1- SIZE) interval). Need not be ordered. If not
NIL, a node's value is not calculated and its derivatives are not
propagated unless its index is in INDICES-TO-CALCULATE. It has no
effect on subsequent lumps: they may use values that have not been
recalculated. The primary use-case is to temporarily mask out an
uninteresting part of the network.

# LUMPS (OBJECT)

Lumps in reverse order.

# NODES (OBJECT)

The values of the nodes. All nodes have values. It
is conceptually an N-STRIPES x SIZE matrix that can be enlarged to
MAX-N-STRIPES x SIZE by setting N-STRIPES.

# SOFTMAX (OBJECT)

A matrix of the same size as X, EXP'ed and
normalized in groups of GROUP-SIZE.

# TARGET (OBJECT)

A lump of the same size as INPUT-LUMP that holds the targets, i.e. the
target_k in -sum_k target_k * ln(x_k), which is the cross-entropy
error.

# Undocumented

# GROUP-SIZE (OBJECT)

# NAME (OBJECT)

# NOISYP (OBJECT)

# (SETF NOISYP) (NEW-VALUE OBJECT)

# NORMALIZE-WITH-STATS-P (OBJECT)

# (SETF NORMALIZE-WITH-STATS-P) (NEW-VALUE OBJECT)

# NORMALIZED-CAP (OBJECT)

# (SETF NORMALIZED-CAP) (NEW-VALUE OBJECT)

# SIZE (OBJECT)

# TRANSPOSE-WEIGHTS-P (OBJECT)

# UPDATE-STATS-P (OBJECT)

# (SETF UPDATE-STATS-P) (NEW-VALUE OBJECT)

# Private

# SAME-STRIPES-P (OBJECT)

Non-NIL iff all stripes are the same. If true, it
effectively overrides both N-STRIPES and MAX-N-STRIPES, and there is
only one column in NODES and DERIVATIVES. Set up by the lump itself
taking its inputs into account. Notably, ->WEIGHT lumps always have
SAME-STRIPES-P T.

# SCALE (OBJECT)

The sum of nodes after normalization. Can be
changed during training, for instance when clamping. If it is a vector
then its length must be MAX-N-STRIPES which automatically
maintained.

# (SETF SCALE) (NEW-VALUE OBJECT)

The sum of nodes after normalization. Can be
changed during training, for instance when clamping. If it is a vector
then its length must be MAX-N-STRIPES, which is automatically
maintained.

# X (OBJECT)

This is the input lump.

# Undocumented

# ARGS (OBJECT)

# DROP-NEGATIVE-INDEX-P (OBJECT)

# FIRST-TRAINED-LUMP (OBJECT)

# INTO (OBJECT)

# LENGTH-SCALE (OBJECT)

# N (OBJECT)

# NORMALIZED-LUMP (OBJECT)

# PERIOD (OBJECT)

# ROUGHNESS (OBJECT)

# RUNNING-STATS (OBJECT)

# SIGNAL-VARIANCE (OBJECT)

# WEIGHTS (OBJECT)

# Y (OBJECT)

# VARIABLE

# Private

# Undocumented

# *BPN-BEING-BUILT*

# *IN-TRAINING-P*

# *LUMPS-TO-COPY*

# *NEXT-LUMP-NAME*

# CLASS

# Public

# ->ACTIVATION (&REST ARGS)

Perform X*WEIGHTS where X is of size M and WEIGHTS
is a ->WEIGHT whose single stripe is taken to be of dimensions M x N
stored in column major order. N is the size of this lump. If
TRANSPOSE-WEIGHTS-P then WEIGHTS is N x M and X*WEIGHTS' is
computed.
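
An illustrative sketch (not MGL's implementation, which works on whole
stripes in bulk): the product computed for one stripe when
TRANSPOSE-WEIGHTS-P is NIL:

```lisp
(defun activate (x weights m n)
  "Return X*WEIGHTS where X has M elements and WEIGHTS is an M x N
matrix stored column major in a flat vector."
  (let ((y (make-array n :initial-element 0d0)))
    (dotimes (j n y)
      (dotimes (i m)
        ;; Column major: element (i, j) lives at index i + j*M.
        (incf (aref y j) (* (aref x i)
                            (aref weights (+ i (* j m)))))))))
```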

# ->CROSS-ENTROPY-SOFTMAX (&REST ARGS)

A specialized lump that is equivalent to hooking
->EXP with NORMALIZED-LUMP and ->CROSS-ENTROPY but is numerically
stable. See <http://groups.google.com/group/comp.ai.neural-nets/msg/a7594ebea01fef04?dmode=source>.
It has two parameters, X and TARGET. In the transfer phase it computes
the EXP of each input node and normalizes them as if by
NORMALIZED-LUMP. These intermediate values are placed into SOFTMAX.
The value of node K is nodes_k = -target_k * ln(softmax_k). Since the
sum of these is the cross entropy, -sum_k target_k * ln(softmax_k),
simply plug this lump into an ->ERROR.
In the derive phase it computes the derivative of the cross-entropy
error of the normalized input: d(-sum_k target_k * ln(softmax_k))/dx_k
= sum_j target_j * (softmax_k - KDEL_jk), where KDEL is the Kronecker
delta. This is equal to softmax_k - target_k if the target sums to 1.
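
An illustrative sketch (not the lump's actual code) of the numerically
stable computation described above, for a single group:

```lisp
(defun stable-softmax (xs)
  ;; Shift by the maximum before EXP so the largest exponent is zero;
  ;; this avoids overflow and leaves the normalized result unchanged.
  (let* ((max (reduce #'max xs))
         (exps (map 'vector (lambda (x) (exp (- x max))) xs))
         (sum (reduce #'+ exps)))
    (map 'vector (lambda (e) (/ e sum)) exps)))

;; Per-node values as described above: nodes_k = -target_k * ln(softmax_k).
(defun cross-entropy-nodes (target softmax)
  (map 'vector (lambda (tk sk) (- (* tk (log sk)))) target softmax))
```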

# ->ERROR (&REST ARGS)

An error node is usually a leaf in the graph of lumps. Contrary to
non-error leaf lumps, it gets a non-zero derivative, namely 1. Error
lumps have exactly one node (in each stripe) whose value is computed
as the sum of the nodes in the X parameter lump.
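
A hedged sketch of plugging a cost into an ->ERROR leaf, in the style
of the BUILD-BPN example above (the names, sizes and the
->SQUARED-ERROR initargs are assumptions):

```lisp
(mgl-bp:build-bpn ()
  (inputs (mgl-bp:->input :size n-inputs))
  (weights (mgl-bp:->weight :size (* n-outputs n-inputs)))
  (predictions (mgl-bp:->sigmoid
                :x (mgl-bp:->activation :weights weights :x inputs)))
  (expectations (mgl-bp:->input :size n-outputs))
  ;; The error leaf's single node (per stripe) sums the squared errors.
  (cost (mgl-bp:->error
         :x (mgl-bp:->squared-error :x expectations :y predictions))))
```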

# ->RECTIFIED (&REST ARGS)

max(0,x) activation function. If NOISYP then add
normal(0,sigmoid(x)) noise to x.

# ->SOFTPLUS (&REST ARGS)

log(1+exp(x)) activation function.
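
An illustrative sketch of a numerically stable version: for large x,
exp(x) overflows, but log(1+exp(x)) = x + log(1+exp(-x)):

```lisp
(defun softplus (x)
  ;; Equivalent to (log (1+ (exp x))) but safe for large positive X,
  ;; where EXP alone would overflow.
  (if (plusp x)
      (+ x (log (1+ (exp (- x)))))
      (log (1+ (exp x)))))
```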

# ->SUM (&REST ARGS)

Sum of all nodes (per stripe).