Common Lisp Package: MGL-BM

Fully General Boltzmann Machines, Restricted Boltzmann Machines and their stacks called Deep Belief Networks (DBN).

FUNCTION

Public

COLLECT-BM-MEAN-FIELD-ERRORS (SAMPLER BM &KEY (COUNTERS-AND-MEASURERS (MAKE-BM-RECONSTRUCTION-RMSE-COUNTERS-AND-MEASURERS BM)))

Set the hidden and then the visible mean field and collect the errors with COLLECT-BATCH-ERRORS. By default, return the reconstruction rmse.

COLLECT-BM-MEAN-FIELD-ERRORS/LABELED (SAMPLER BM &KEY (COUNTERS-AND-MEASURERS (APPEND (MAKE-BM-RECONSTRUCTION-MISCLASSIFICATION-COUNTERS-AND-MEASURERS BM) (MAKE-BM-RECONSTRUCTION-CROSS-ENTROPY-COUNTERS-AND-MEASURERS BM))))

Like COLLECT-BM-MEAN-FIELD-ERRORS but reconstruct the labels even if they were missing.

COLLECT-DBN-MEAN-FIELD-ERRORS (SAMPLER DBN &KEY (RBM (LAST1 (RBMS DBN))) (COUNTERS-AND-MEASURERS (MAKE-DBN-RECONSTRUCTION-RMSE-COUNTERS-AND-MEASURERS DBN RBM RBM)))

Run the mean field up to RBM then down to the bottom and collect the errors with COLLECT-BATCH-ERRORS. By default, return the rmse at each level in the DBN.

COLLECT-DBN-MEAN-FIELD-ERRORS/LABELED (SAMPLER DBN &KEY (RBM (LAST1 (RBMS DBN))) (COUNTERS-AND-MEASURERS (APPEND (MAKE-DBN-RECONSTRUCTION-MISCLASSIFICATION-COUNTERS-AND-MEASURERS DBN RBM RBM) (MAKE-DBN-RECONSTRUCTION-CROSS-ENTROPY-COUNTERS-AND-MEASURERS DBN RBM RBM))))

Like COLLECT-DBN-MEAN-FIELD-ERRORS but reconstruct labeled chunks even if they are missing in the input.

DBM->DBN (DBM &KEY (RBM-CLASS 'RBM) (DBN-CLASS 'DBN) DBN-INITARGS)

Convert DBM to a DBN by discarding intralayer connections and doubling activations of clouds where necessary. If a chunk does not have input from below then scale its input from above by 2; similarly, if a chunk does not have input from above then scale its input from below by 2. By default, weights are shared between clouds and their copies. For now, unrolling the resulting DBN to a BPN is not supported.

DOWN-DBM (DBM)

Do a single downward pass in DBM, propagating the mean-field much like performing approximate inference, but in the other direction. Disregard intralayer and upward connections, double activations to chunks having downward connections.

DOWN-MEAN-FIELD (DBN &KEY (RBM (LAST1 (RBMS DBN))))

Propagate the means down from the means of RBM.

INPUTS->NODES (BM)

Copy the previously clamped INPUTS to NODES as if SET-INPUT were called with the same parameters.

MAKE-BM-RECONSTRUCTION-CROSS-ENTROPY-COUNTERS-AND-MEASURERS (BM &KEY CHUNK-FILTER)

Return a list of counter, measurer conses to keep track of cross entropy error suitable for COLLECT-BM-MEAN-FIELD-ERRORS.

MAKE-BM-RECONSTRUCTION-MISCLASSIFICATION-COUNTERS-AND-MEASURERS (BM &KEY CHUNK-FILTER)

Return a list of counter, measurer conses to keep track of misclassifications suitable for COLLECT-BM-MEAN-FIELD-ERRORS.

MAKE-DBM-RECONSTRUCTION-RMSE-COUNTERS-AND-MEASURERS (DBM &KEY CHUNK-FILTER)

Return a list of counter, measurer conses to keep track of reconstruction rmse suitable for COLLECT-BM-MEAN-FIELD-ERRORS.

MAKE-DBN-RECONSTRUCTION-MISCLASSIFICATION-COUNTERS-AND-MEASURERS (DBN &KEY (RBM (LAST1 (RBMS DBN))) CHUNK-FILTER)

Return a list of counter, measurer conses to keep track of misclassifications suitable for COLLECT-DBN-MEAN-FIELD-ERRORS.

MAKE-DBN-RECONSTRUCTION-RMSE-COUNTERS-AND-MEASURERS (DBN &KEY (RBM (LAST1 (RBMS DBN))) CHUNK-FILTER)

Return a list of counter, measurer conses to keep track of reconstruction rmse suitable for COLLECT-DBN-MEAN-FIELD-ERRORS.

MERGE-CLOUD-SPECS (SPECS DEFAULT-SPECS)

Combine cloud SPECS and DEFAULT-SPECS. If the first element of SPECS is :MERGE then merge them, else return SPECS. Merging concatenates them but removes those specs from DEFAULT-SPECS that are between chunks that have a spec in SPECS. If a spec has CLASS NIL then it is removed as well.

A cloud spec at minimum specifies the name of the chunks it connects:

    (:chunk1 inputs :chunk2 features)

in which case it defaults to be a FULL-CLOUD. If that is not desired then the class can be specified:

    (:chunk1 inputs :chunk2 features :class factored-cloud)

To remove a cloud from DEFAULT-SPECS use :CLASS NIL:

    (:chunk1 inputs :chunk2 features :class nil)

Other initargs are passed as is to MAKE-INSTANCE:

    (:chunk1 inputs :chunk2 features :class factored-cloud :rank 10)

You may also pass a CLOUD object as a spec.
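The merging rule can be restated on plain property lists. The following is only a sketch under stated assumptions, not the library's implementation: MERGE-SPECS-SKETCH, SAME-CHUNKS-P and CLASS-NIL-P are hypothetical names, chunk names are compared in an order-sensitive way, surviving default specs are placed before the new ones, and CLOUD objects passed as specs are not handled.

```lisp
;;; Hypothetical stand-alone sketch; none of these functions are part of MGL-BM.

(defun same-chunks-p (spec1 spec2)
  "True if both specs name the same :CHUNK1 and :CHUNK2 (order-sensitive,
which is an assumption of this sketch)."
  (and (equal (getf spec1 :chunk1) (getf spec2 :chunk1))
       (equal (getf spec1 :chunk2) (getf spec2 :chunk2))))

(defun class-nil-p (spec)
  "True if SPEC explicitly contains :CLASS NIL."
  (null (getf spec :class t)))

(defun merge-specs-sketch (specs default-specs)
  "Apply the documented rule to specs given as plists."
  (if (eq (first specs) :merge)
      (let ((new-specs (rest specs)))
        ;; Drop defaults overridden by a new spec, concatenate, then
        ;; drop every spec that asks for :CLASS NIL.
        (remove-if #'class-nil-p
                   (append (remove-if (lambda (default)
                                        (some (lambda (spec)
                                                (same-chunks-p spec default))
                                              new-specs))
                                      default-specs)
                           new-specs)))
      ;; Without :MERGE the specs are returned untouched.
      specs))

(defparameter *merged-example*
  (merge-specs-sketch
   '(:merge
     (:chunk1 inputs :chunk2 features :class nil)
     (:chunk1 inputs :chunk2 labels :class factored-cloud :rank 10))
   '((:chunk1 inputs :chunk2 features)
     (:chunk1 inputs :chunk2 labels)
     (:chunk1 constant :chunk2 features))))
```

In this example the default cloud between INPUTS and FEATURES is removed, the one between INPUTS and LABELS is replaced by the factored spec, and the default between CONSTANT and FEATURES is kept.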

NODES->INPUTS (BM)

Copy NODES to INPUTS.

RECONSTRUCTION-ERROR (BM)

Return the squared norm of INPUTS - NODES not considering constant or conditioning chunks that aren't reconstructed in any case. The second value returned is the number of nodes that contributed to the error.
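As an illustration of the two-value convention, the sketch below computes a sum of squared differences together with the number of contributing nodes, and shows one plausible way to turn the pair into an RMSE. It works on plain vectors only; SQUARED-ERROR-SKETCH and RMSE-SKETCH are hypothetical names, and the skipping of constant and conditioning chunks is not modelled.

```lisp
;;; Hypothetical stand-alone sketch on plain vectors, not MGL-BM code.

(defun squared-error-sketch (inputs nodes)
  "Return the squared norm of INPUTS - NODES and, as the second value,
the number of elements that contributed."
  (assert (= (length inputs) (length nodes)))
  (values (loop for x across inputs
                for y across nodes
                sum (expt (- x y) 2))
          (length inputs)))

(defun rmse-sketch (inputs nodes)
  "Root mean squared error derived from the two values above.
Assumes at least one element."
  (multiple-value-bind (sum-of-squares n)
      (squared-error-sketch inputs nodes)
    (sqrt (/ sum-of-squares n))))
```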

RECONSTRUCTION-RMSE (CHUNKS)

Return the squared norm of INPUTS - NODES not considering constant or conditioning chunks that aren't reconstructed in any case. The second value returned is the number of nodes that contributed to the error.

SAMPLE-HIDDEN (BM)

Generate samples from the probability distribution defined by the chunk type and the mean that resides in NODES.

SAMPLE-VISIBLE (BM)

Generate samples from the probability distribution defined by the chunk type and the mean that resides in NODES.

SET-HIDDEN-MEAN/1 (BM)

Set NODES of the chunks in the hidden layer to the means of their respective probability distributions.

SET-VISIBLE-MEAN/1 (BM)

Set NODES of the chunks in the visible layer to the means of their respective probability distributions.

SETTLE-HIDDEN-MEAN-FIELD (BM &KEY (SUPERVISOR (DEFAULT-MEAN-FIELD-SUPERVISOR BM)))

Convenience function on top of SETTLE-MEAN-FIELD.

SETTLE-MEAN-FIELD (CHUNKS BM &KEY (OTHER-CHUNKS (SET-DIFFERENCE (CHUNKS BM) CHUNKS)) (SUPERVISOR (DEFAULT-MEAN-FIELD-SUPERVISOR BM)))

Do possibly damped mean field updates on CHUNKS until convergence. Compute V'_{t+1}, what would normally be the means, but average it with the previous value: V_{t+1} = k * V_t + (1 - k) * V'_{t+1}, where k is the damping factor (an FLT between 0 and 1). Call SUPERVISOR with CHUNKS, BM and the iteration. Settling is finished when SUPERVISOR returns NIL. If SUPERVISOR returns a non-nil value then it's taken to be a damping factor. For no damping return 0.
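The damped update and the supervisor protocol can be sketched on a plain list of numbers. This is a minimal illustration, not the library's code: DAMPED-SETTLE-SKETCH is a hypothetical name, the supervisor here receives the old values, the undamped new values and the iteration (rather than CHUNKS and BM), and returning the undamped values when the supervisor stops is an assumption.

```lisp
;;; Hypothetical stand-alone sketch, not MGL-BM code.

(defun damped-settle-sketch (v update-fn supervisor)
  "Iterate V <- k*V + (1-k)*V' where V' = (FUNCALL UPDATE-FN V).
SUPERVISOR is called with the old values, V' and the iteration number.
It returns NIL to stop or the damping factor k to use for this step."
  (loop for iteration from 0
        do (let* ((new (funcall update-fn v))
                  (k (funcall supervisor v new iteration)))
             (when (null k)
               (return new))
             (setf v (mapcar (lambda (old-x new-x)
                               (+ (* k old-x) (* (- 1 k) new-x)))
                             v new)))))

;; Toy update x -> x/2 + 1 whose fixed point is 2, damped with k = 0.5.
(defparameter *settled*
  (damped-settle-sketch
   (list 0d0)
   (lambda (v) (mapcar (lambda (x) (+ (* 0.5d0 x) 1d0)) v))
   (lambda (old new iteration)
     (cond ((>= iteration 1000) nil)
           ((< (reduce #'max (mapcar (lambda (a b) (abs (- a b))) old new))
               1d-9)
            nil)
           (t 0.5d0)))))
```

With k = 0 the loop reduces to plain fixed-point iteration, which matches the "for no damping return 0" convention.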

SETTLE-VISIBLE-MEAN-FIELD (BM &KEY (SUPERVISOR (DEFAULT-MEAN-FIELD-SUPERVISOR BM)))

Convenience function on top of SETTLE-MEAN-FIELD.

SUPERVISE-MEAN-FIELD/DEFAULT (CHUNKS BM ITERATION &KEY (NODE-CHANGE-LIMIT 1.0000000116860974d-7) (N-UNDAMPED-ITERATIONS 100) (N-DAMPED-ITERATIONS 100) (DAMPING-FACTOR 0.8999999761581421d0))

A supervisor for SETTLE-MEAN-FIELD. Return NIL if the average of the absolute value of change in nodes is below NODE-CHANGE-LIMIT, else return 0 damping for N-UNDAMPED-ITERATIONS then DAMPING-FACTOR for another N-DAMPED-ITERATIONS, then NIL.
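The schedule described here can be written as a small pure function. This is a sketch under assumptions: SUPERVISE-SKETCH is a hypothetical name, it takes the average node change directly instead of computing it from CHUNKS, the defaults are rounded versions of the ones in the signature above, and the exact boundary iterations (strict versus non-strict comparisons) are guesses.

```lisp
;;; Hypothetical stand-alone sketch, not MGL-BM code.

(defun supervise-sketch (average-change iteration
                         &key (node-change-limit 1d-7)
                              (n-undamped-iterations 100)
                              (n-damped-iterations 100)
                              (damping-factor 0.9d0))
  "Return NIL to stop settling, 0 for an undamped step, or
DAMPING-FACTOR for a damped step."
  (cond ((< average-change node-change-limit) nil)
        ((< iteration n-undamped-iterations) 0)
        ((< iteration (+ n-undamped-iterations n-damped-iterations))
         damping-factor)
        (t nil)))
```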

UP-DBM (DBM)

Do a single upward pass in DBM, performing approximate inference. Disregard intralayer and downward connections, double activations to chunks having upward connections.

Undocumented

CLOUD-CHUNK-AMONG-CHUNKS (CLOUD CHUNKS)

CONDITIONING-CLOUD-P (CLOUD)

MAKE-BM-RECONSTRUCTION-RMSE-COUNTERS-AND-MEASURERS (BM &KEY CHUNK-FILTER)

RANK (CLOUD)

Private

ACTIVATE-CLOUD (CLOUD REVERSEP &KEY (FROM-FN #'OLD-NODES) (TO-FN #'NODES))

From CHUNK1 calculate the activations of CHUNK2 and _add_ them to CHUNK2. If REVERSEP then swap the roles of the chunks. FROM-FN and TO-FN are the accessors to use to get the node value arrays (one of #'NODES, #'OLD-NODES, #'MEANS). In the simplest case it adds weights (of CLOUD) * OLD-NODES (of CHUNK1) to the nodes of the hidden chunk.

FULL-CLOUDS-EVERYWHERE (VISIBLE-CHUNKS HIDDEN-CHUNKS)

Return a list of cloud specifications suitable for instantiating a BM. Put a cloud between each pair of visible and hidden chunks unless they are both conditioning chunks. The names of the clouds are two element lists of the names of the visible and hidden chunks.

HIJACK-MEANS-TO-ACTIVATION (CHUNKS CLOUDS)

Set NODES of CHUNKS to the activations calculated from CLOUDS. Skip chunks that don't need activations. If ADDP don't zero NODES first, but add to it.

MAKE-DBN-RECONSTRUCTION-CROSS-ENTROPY-COUNTERS-AND-MEASURERS (DBN &KEY (RBM (LAST1 (RBMS DBN))) CHUNK-FILTER)

Return a list of counter, measurer conses to keep track of cross entropy error suitable for COLLECT-DBN-MEAN-FIELD-ERRORS.

NODE-CHANGE (CHUNKS)

Return the average of the absolute values of NODES - OLD-NODES over CHUNKS. The second value returned is the number of nodes that contributed to the average.

Undocumented

->CHUNK (CHUNK-DESIGNATOR CHUNKS)

->CLOUD (CLOUD-DESIGNATOR BM)

->CLOUDS (CHUNKS CLOUD-SPECS)

ADD-CHUNK-NODES (CHUNK FROM TO)

ADD-RBM (RBM DBN)

BOTH-CLOUD-ENDS-IN-P (CLOUD CHUNKS)

CHECK-DBM-CLOUDS (DBM)

CHECK-NO-NAME-CLASHES (RBMS)

CHECK-NO-SELF-CONNECTION (BM)

CHECK-STRIPES (CHUNK)

CLOUD-BETWEEN-CHUNKS-P (CLOUD CHUNKS1 CHUNKS2)

CONDITIONING-CHUNK-P (CHUNK)

CONDITIONING-CLOUDS-TO (CHUNKS CLOUDS)

CONNECTS-TO-P (CHUNK CHUNKS CLOUDS)

COPY-CHUNK-NODES (CHUNK FROM TO)

COPY-DBM-CHUNK-TO-DBN (CHUNK)

COPY-DBM-CLOUD-TO-DBN (CLOUD CLOUDS LAYER-BELOW LAYER1 LAYER2 LAYER-ABOVE)

ENSURE-ARRAY-LARGE-ENOUGH (ARRAY PROTOTYPE)

FACTORED-CLOUD-SHARED-CHUNK (CLOUD)

FILL-CHUNK (CHUNK VALUE &KEY ALLP)

FORMAT-FULL-CLOUD-NORM (CLOUD)

FULL-CLOUD-NORM (CLOUD)

FULL-CLOUDS-EVERYWHERE-BETWEEN-LAYERS (LAYERS)

MAKE-CHUNK-RECONSTRUCTION-CROSS-ENTROPY-COUNTERS-AND-MEASURERS (CHUNKS &KEY CHUNK-FILTER)

MAKE-CHUNK-RECONSTRUCTION-MISCLASSIFICATION-COUNTERS-AND-MEASURERS (CHUNKS &KEY CHUNK-FILTER)

MAKE-DO-CLOUD/CHUNK2 (CHUNK2-INDEX INDEX CHUNK2-SIZE OFFSET BODY)

MAP-SPARSER (TRAINER BM)

MARK-LABELS-PRESENT (OBJECT)

MAYBE-REMEMBER (CHUNK)

MAYBE-USE-REMEMBERED (CHUNK)

MEANS-OR-SAMPLES (TRAINER BM CHUNK)

NAME-CLASHES (LIST)

NODES* (CHUNK)

NODES->MEANS (CHUNK)

NORM (MATRIX)

NOT-BEFORE (LIST OBJ)

OTHER-CHUNK (CLOUD CHUNK)

PREVIOUS-RBM (DBN RBM)

REMOVE-IF* (FILTER SEQ)

SET-MEAN (CHUNKS BM &KEY (OTHER-CHUNKS (SET-DIFFERENCE (CHUNKS BM) CHUNKS)))

SET-MEAN* (CHUNKS BM &KEY (OTHER-CHUNKS (SET-DIFFERENCE (CHUNKS BM) CHUNKS)))

STABLE-SET-DIFFERENCE (LIST1 LIST2)

SUM-CHUNK-NODES-AND-OLD-NODES (CHUNK NODE-WEIGHT OLD-NODE-WEIGHT)

SUM-NODES-AND-OLD-NODES (CHUNKS NODE-WEIGHT OLD-NODE-WEIGHT)

SWAP-NODES (CHUNKS)

SWAP-NODES* (CHUNKS)

USE-BLAS-ON-CHUNK-P (CHUNK)

VERSION (OBJ)

VISIBLE-NODES->MEANS (BM)

ZERO-CHUNK (CHUNK)

MACRO

Public

Undocumented

DO-CLOUDS ((CLOUD BM) &BODY BODY)

Private

DO-CHUNK ((INDEX CHUNK) &BODY BODY)

Iterate over the indices of nodes of CHUNK skipping missing ones.

DO-CLOUD-RUNS (((START END) CLOUD) &BODY BODY)

Iterate over consecutive runs of weights present in CLOUD.

Undocumented

DO-CLOUD/CHUNK1 ((CHUNK1-INDEX CLOUD) &BODY BODY)

DO-STRIPES ((CHUNK &OPTIONAL (STRIPE (GENSYM))) &BODY BODY)

WITH-VERSIONS ((VERSION OBJECTS) &BODY BODY)

GENERIC-FUNCTION

Public

DEFAULT-MEAN-FIELD-SUPERVISOR (BM)

Return a function suitable as the SUPERVISOR argument for SETTLE-MEAN-FIELD. The default implementation

FIND-CHUNK (NAME OBJECT &KEY ERRORP)

Find the chunk in OBJECT whose name is EQUAL to NAME. Signal an error if not found and ERRORP.

FIND-CLOUD (NAME OBJECT &KEY ERRORP)

Find the cloud in OBJECT whose name is EQUAL to NAME. Signal an error if not found and ERRORP.

SAMPLE-CHUNK (CHUNK)

Sample from the probability distribution of CHUNK whose means are in NODES.

SET-CHUNK-MEAN (CHUNK)

Set NODES of CHUNK to the means of the probability distribution. When called NODES contains the activations.

SET-HIDDEN-MEAN (BM)

Like SET-HIDDEN-MEAN/1, but settle the mean field if there are hidden-to-hidden connections. For an RBM it trivially calls SET-HIDDEN-MEAN/1, for a DBM it calls UP-DBM before settling.

SET-VISIBLE-MEAN (BM)

Like SET-VISIBLE-MEAN/1, but settle the mean field if there are visible-to-visible connections. For an RBM it trivially calls SET-VISIBLE-MEAN/1.

Undocumented

NEGATIVE-PHASE (BATCH TRAINER BM)

POSITIVE-PHASE (BATCH TRAINER BM)

Private

ACCUMULATE-CLOUD-STATISTICS (TRAINER BM CLOUD MULTIPLIER)

Take the accumulator of TRAINER that corresponds to CLOUD and add MULTIPLIER times the cloud statistics of [persistent] contrastive divergence.

ACTIVATE-CLOUD* (CLOUD REVERSEP FROM-CHUNK TO-CHUNK FROM-MATRIX TO-MATRIX)

Like ACTIVATE-CLOUD but without keyword parameters.

ZERO-WEIGHT-TO-SELF (CLOUD)

In a BM W_{i,i} is always zero.

Undocumented

ACCUMULATE-CLOUD-STATISTICS* (CLOUD V1 V2 MULTIPLIER START ACCUMULATOR)

ACCUMULATE-NEGATIVE-PHASE-STATISTICS (TRAINER BM &KEY MULTIPLIER (MULTIPLIER (FLT 1)))

ACCUMULATE-POSITIVE-PHASE-STATISTICS (TRAINER BM &KEY MULTIPLIER (MULTIPLIER (FLT 1)))

ACCUMULATE-SPARSITY-STATISTICS (SPARSITY MULTIPLIER)

COPY-NODES (CHUNK)

FLUSH-ACCUMULATOR (SPARSITY ACCUMULATOR START N-INPUTS-IN-BATCH)

RESIZE-CHUNK (CHUNK SIZE MAX-N-STRIPES)

SLOT-ACCESSOR

Public

CHUNKS (OBJECT)

A list of all the chunks in this BM. It's VISIBLE-CHUNKS and HIDDEN-CHUNKS appended.

CLOUD-A (OBJECT)

A full cloud whose visible chunk is the same as the visible chunk of this cloud and whose hidden chunk is the same as the visible chunk of CLOUD-B.

CLOUD-B (OBJECT)

A full cloud whose hidden chunk is the same as the hidden chunk of this cloud and whose visible chunk is the same as the hidden chunk of CLOUD-A.

CLOUDS (OBJECT)

Normally, a list of CLOUDS representing the connections between chunks. During initialization cloud specs are allowed in the list.

CLOUDS-UP-TO-LAYERS (OBJECT)

Each element of this list is a list of clouds connected from below to the layer of the same index.

COST (OBJECT)

Return the sum of costs for all active stripes. The cost of a stripe is the sum of the error nodes. The second value is the number of stripes.

DEFAULT-VALUE (OBJECT)

Upon creation or resize the lump's nodes get filled with this value.

HIDDEN-CHUNKS (OBJECT)

A list of CHUNKs that are not directly observed. Disjoint from VISIBLE-CHUNKS.

HIDDEN-SAMPLING (OBJECT)

Controls whether and how hidden nodes are sampled during learning, or whether the mean field is used instead. :HALF-HEARTED, the default value, samples the hiddens but uses the hidden means to calculate the effect of the positive and negative phases on the gradient. The default should almost always be preferable to T, as it is a less noisy estimate.

(SETF HIDDEN-SAMPLING) (NEW-VALUE OBJECT)

Controls whether and how hidden nodes are sampled during learning, or whether the mean field is used instead. :HALF-HEARTED, the default value, samples the hiddens but uses the hidden means to calculate the effect of the positive and negative phases on the gradient. The default should almost always be preferable to T, as it is a less noisy estimate.

INDICES-PRESENT (OBJECT)

NIL or a simple vector of array indices into the layer's NODES. Need not be ordered. SET-INPUT sets it. Note that if it is non-NIL then N-STRIPES must be 1.

(SETF INDICES-PRESENT) (NEW-VALUE OBJECT)

NIL or a simple vector of array indices into the layer's NODES. Need not be ordered. SET-INPUT sets it. Note that if it is non-NIL then N-STRIPES must be 1.

INPUTS (OBJECT)

This is where the after method of SET-INPUT saves the input for later use by RECONSTRUCTION-ERROR, INPUTS->NODES. It is NIL in CONDITIONING-CHUNKS.

LAYERS (OBJECT)

A list of layers from bottom up. A layer is a list of chunks. The layers partition the set of all chunks in the BM. Chunks with no connections to layers below are visible (including constant and conditioning) chunks. The layered structure is used in the single, bottom-up, approximate inference pass. When instantiating a DBM, VISIBLE-CHUNKS and HIDDEN-CHUNKS are inferred from LAYERS and CLOUDS.

MEANS (OBJECT)

Saved values of the means (see SET-MEAN) last computed.

N-GIBBS (OBJECT)

The number of steps of Gibbs sampling to perform. This is how many full (HIDDEN -> VISIBLE -> HIDDEN) steps are taken for CD learning, and how many times each chunk is sampled for PCD.

(SETF N-GIBBS) (NEW-VALUE OBJECT)

The number of steps of Gibbs sampling to perform. This is how many full (HIDDEN -> VISIBLE -> HIDDEN) steps are taken for CD learning, and how many times each chunk is sampled for PCD.

N-PARTICLES (OBJECT)

The number of persistent chains to run. Also known as the number of fantasy particles.

NODES (OBJECT)

The values of the nodes. All nodes have values. It is conceptually a N-STRIPES x SIZE matrix that can be enlarged to MAX-N-STRIPES x SIZE by setting N-STRIPES.

PERSISTENT-CHAINS (OBJECT)

A BM that keeps the states of the persistent chains (each stripe is a chain), initialized from the BM being trained by COPY with 'PCD as the context. Suitable for training BM and RBM.

SCALE (OBJECT)

The sum of the means after normalization. Can be changed during training, for instance when clamping. If it is a vector then its length must be MAX-N-STRIPES which is automatically maintained when changing the number of stripes.

(SETF SCALE) (NEW-VALUE OBJECT)

The sum of the means after normalization. Can be changed during training, for instance when clamping. If it is a vector then its length must be MAX-N-STRIPES which is automatically maintained when changing the number of stripes.

TARGET (OBJECT)

A lump of the same size as INPUT-LUMP that is the T in -sum_{k}target_k*ln(x_k) which is the cross entropy error.

VISIBLE-CHUNKS (OBJECT)

A list of CHUNKs whose values come from the outside world: SET-INPUT sets them.

VISIBLE-SAMPLING (OBJECT)

Controls whether visible nodes are sampled during the learning or the mean field is used instead.

(SETF VISIBLE-SAMPLING) (NEW-VALUE OBJECT)

Controls whether visible nodes are sampled during the learning or the mean field is used instead.

WEIGHTS (OBJECT)

A chunk is represented as a row vector (disregarding the multi-striped case). If the visible chunk is 1xN and the hidden is 1xM then the weight matrix is NxM. Hidden = hidden + weights * visible. Visible = visible + weights^T * hidden.
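The shapes involved can be made concrete with plain arrays. The sketch below is only an illustration with hypothetical names (ADD-ACTIVATIONS-SKETCH, ADD-REVERSE-ACTIVATIONS-SKETCH); it ignores stripes, BLAS and everything else the library does, and simply adds the product of a length-N visible vector with an NxM weight matrix to a length-M hidden vector, and the transposed product in the other direction.

```lisp
;;; Hypothetical stand-alone sketch on plain arrays, not MGL-BM code.

(defun add-activations-sketch (visible weights hidden)
  "Destructively add VISIBLE * WEIGHTS to HIDDEN and return HIDDEN.
VISIBLE has N elements, HIDDEN has M elements, WEIGHTS is N x M."
  (destructuring-bind (n m) (array-dimensions weights)
    (assert (= n (length visible)))
    (assert (= m (length hidden)))
    (dotimes (j m hidden)
      (dotimes (i n)
        (incf (aref hidden j)
              (* (aref visible i) (aref weights i j)))))))

(defun add-reverse-activations-sketch (hidden weights visible)
  "Destructively add HIDDEN * WEIGHTS^T to VISIBLE and return VISIBLE."
  (destructuring-bind (n m) (array-dimensions weights)
    (assert (= n (length visible)))
    (assert (= m (length hidden)))
    (dotimes (i n visible)
      (dotimes (j m)
        (incf (aref visible i)
              (* (aref hidden j) (aref weights i j)))))))
```

Both functions accumulate into their last argument rather than overwriting it, mirroring the "hidden + ..." and "visible + ..." form of the description.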

Undocumented

CHUNK (OBJECT)

CHUNK1 (OBJECT)

CHUNK2 (OBJECT)

CLOUD (OBJECT)

CONDITIONING-CHUNKS (OBJECT)

DAMPING (OBJECT)

DBN (OBJECT)

GROUP-SIZE (OBJECT)

NAME (OBJECT)

RBMS (OBJECT)

SIZE (OBJECT)

SPARSER (OBJECT)

Private

NORMAL-CHAINS (OBJECT)

The BM being trained.

OLD-NODES (OBJECT)

The previous value of each node. Used to provide parallel computation semantics when there are intralayer connections. Swapped with NODES or MEANS at times.

SCALE1 (OBJECT)

When CHUNK1 is being activated count activations coming from this cloud multiplied by SCALE1.

SCALE2 (OBJECT)

When CHUNK2 is being activated count activations coming from this cloud multiplied by SCALE2.

Undocumented

CACHED-ACTIVATIONS1 (OBJECT)

CACHED-ACTIVATIONS2 (OBJECT)

CACHED-VERSION1 (OBJECT)

(SETF CACHED-VERSION1) (NEW-VALUE OBJECT)

CACHED-VERSION2 (OBJECT)

(SETF CACHED-VERSION2) (NEW-VALUE OBJECT)

HAS-HIDDEN-TO-HIDDEN-P (OBJECT)

HAS-INPUTS-P (OBJECT)

HAS-VISIBLE-TO-VISIBLE-P (OBJECT)

HIDDEN-AND-CONDITIONING-CHUNKS (OBJECT)

HIDDEN-SOURCE-CHUNK (OBJECT)

NEXT-NODE-INPUTS (OBJECT)

OLD-PRODUCTS (OBJECT)

OLD-SUM1 (OBJECT)

PRODUCTS (OBJECT)

SPARSITY-GRADIENT-SOURCES (OBJECT)

SPARSITY-TARGET (OBJECT)

SUM1 (OBJECT)

SUM2 (OBJECT)

VISIBLE-AND-CONDITIONING-CHUNKS (OBJECT)

VARIABLE

Private

Undocumented

*VERSIONS*

CLASS

Public

BM

The network is assembled from CHUNKS (nodes of the same behaviour) and CLOUDs (connections between two chunks). To instantiate, arrange for VISIBLE-CHUNKS, HIDDEN-CHUNKS, CLOUDS (either as initargs or initforms) to be set. Usage of CLOUDS is slightly tricky: you may pass a list of CLOUD objects connected to chunks in this network. Alternatively, a cloud spec may stand for a cloud. Also, the initial value of CLOUDS is merged with the default cloud spec list before the final cloud spec list is instantiated. The default cloud spec list is what FULL-CLOUDS-EVERYWHERE returns for VISIBLE-CHUNKS and HIDDEN-CHUNKS. See MERGE-CLOUD-SPECS for the gory details. The initform, '(:MERGE), simply leaves the default cloud specs alone.

BM-PCD-TRAINER

Persistent Contrastive Divergence trainer.

CHEATING-SPARSITY-GRADIENT-SOURCE

Like NORMAL-SPARSITY-GRADIENT-SOURCE, but it needs less memory because it only tracks average activation levels of nodes independently (as opposed to simultaneous activations), and thus it may produce the wrong gradient. An example is when two connected nodes are on a lot, but never at the same time: clearly, it makes little sense to change the weight, but this is exactly what happens.

CHUNK (OBJECT)

A chunk is a set of nodes of the same type in a Boltzmann Machine. This is an abstract base class.

CLOUD (OBJECT)

A set of connections between two chunks. The chunks may be the same, be both visible or both hidden subject to constraints imposed by the type of Boltzmann machine the cloud is part of.

CONDITIONING-CHUNK

Nodes in CONDITIONING-CHUNK never change their values on their own so they are to be clamped. Including this chunk in the visible layer allows `conditional' RBMs.

CONSTANT-CHUNK

A special kind of CONDITIONING-CHUNK whose NODES are always DEFAULT-VALUE. This conveniently allows biases in the opposing layer.

CONSTRAINED-POISSON-CHUNK

Poisson units with normalized (EXP ACTIVATION) means.

DBM

A Deep Boltzmann Machine. See "Deep Boltzmann Machines" by Ruslan Salakhutdinov and Geoffrey Hinton at <http://www.cs.toronto.edu/~hinton/absps/dbm.pdf>. To instantiate, set up LAYERS and CLOUDS but not VISIBLE-CHUNKS and HIDDEN-CHUNKS, because contrary to how initialization works in the superclass (BM), the values of these slots are inferred from LAYERS and CLOUDS: chunks without a connection from below are visible while the rest are hidden. The default cloud spec list is computed by calling FULL-CLOUDS-EVERYWHERE-BETWEEN-LAYERS on LAYERS.

DBN (OBJECT)

Deep Belief Network: a stack of RBMs. DBNs with multiple hidden layers are not Boltzmann Machines. The chunks in the hidden layer of a constituent RBM and the chunk in the visible layer of the RBM one on top of it must be EQ for the DBN to consider them the same. Naming them the same is not enough, in fact, all chunks must have unique names under EQUAL as usual. Similarly to DBMs, DBNs can be constructed using the :LAYERS initarg. When using this feature, a number of RBMs are instantiated. Often one wants to create a DBN that consists of some RBM subclass, this is what the :RBM-CLASS initarg is for.

EXP-NORMALIZED-GROUP-CHUNK

Means are normalized (EXP ACTIVATION).

FACTORED-CLOUD

Like FULL-CLOUD but the weight matrix is factored into a product of two matrices: A*B. At activation time, HIDDEN += VISIBLE*A*B.

GAUSSIAN-CHUNK

Nodes are real valued. The sample of a node is its activation plus Gaussian noise of unit variance.

NORMAL-SPARSITY-GRADIENT-SOURCE

Keep track of how much pairs of nodes connected by CLOUD are simultaneously active. If a node in CHUNK deviates from the target sparsity, that is, its average activation is different from the target, then decrease or increase the weight to nodes to which it's connected by CLOUD in such a way that it will be closer to the target. Smooth the empirical estimates in simultaneous activations in PRODUCTS by DAMPING.

RBM

An RBM is a BM with no intralayer connections. An RBM when trained with PCD behaves the same as a BM with the same chunks, clouds but it can also be trained by contrastive divergence (see RBM-CD-TRAINER) and stacked in a DBN.

RBM-CD-TRAINER

A contrastive divergence based trainer for RBMs.

SIGMOID-CHUNK

Nodes in a sigmoid chunk have two possible samples: 0 and 1. The probability of a node being on is given by the sigmoid of its activation.
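The mean and the sampling rule of a single sigmoid node can be sketched as follows. SIGMOID and SAMPLE-SIGMOID-NODE are names chosen for this illustration and are not taken from the library; extreme activations that would overflow EXP are not guarded against.

```lisp
;;; Hypothetical stand-alone sketch, not MGL-BM code.

(defun sigmoid (x)
  "Logistic function 1 / (1 + e^-x), computed in double precision."
  (/ 1 (+ 1 (exp (- (float x 1d0))))))

(defun sample-sigmoid-node (activation &optional (random-state *random-state*))
  "Return 1 with probability (SIGMOID ACTIVATION), else 0."
  (if (< (random 1d0 random-state) (sigmoid activation))
      1
      0))
```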

SOFTMAX-CHUNK

Binary units with normalized (EXP ACTIVATION) firing probabilities representing a multinomial distribution. That is, samples have exactly one 1 in each group of GROUP-SIZE.
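The group-wise normalization can be sketched on a plain vector of activations. This is an illustration only: SOFTMAX-GROUPS-SKETCH is a hypothetical name, each group is normalized to sum to 1, and the subtraction of the group maximum is a common numerical-stability trick that the library may or may not use.

```lisp
;;; Hypothetical stand-alone sketch, not MGL-BM code.

(defun softmax-groups-sketch (activations group-size)
  "Return a fresh vector in which each consecutive group of GROUP-SIZE
elements of ACTIVATIONS is replaced by its normalized exponentials."
  (assert (zerop (mod (length activations) group-size)))
  (let ((means (make-array (length activations)
                           :element-type 'double-float
                           :initial-element 0d0)))
    (loop for start from 0 below (length activations) by group-size
          do (let* ((end (+ start group-size))
                    (peak (reduce #'max activations :start start :end end))
                    (sum 0d0))
               ;; Exponentiate relative to the group maximum.
               (loop for i from start below end
                     do (let ((e (exp (float (- (aref activations i) peak)
                                             1d0))))
                          (setf (aref means i) e)
                          (incf sum e)))
               ;; Normalize so that the group sums to 1.
               (loop for i from start below end
                     do (setf (aref means i) (/ (aref means i) sum)))))
    means))
```

Sampling from such a group then amounts to picking exactly one index per group with these probabilities, which gives the "exactly one 1 in each group" property.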

TEMPORAL-CHUNK

After a SET-HIDDEN-MEAN, the means of HIDDEN-SOURCE-CHUNK are stored in NEXT-NODE-INPUTS and on the next SET-INPUT copied onto NODES. If there are multiple SET-HIDDEN-MEAN calls between two SET-INPUT calls then only the first set of values is remembered.

Undocumented

FULL-CLOUD

SOFTMAX-LABEL-CHUNK

SPARSITY-GRADIENT-SOURCE

Private

BM-MCMC-PARAMETERS

Parameters for Markov Chain Monte Carlo based trainers for BMs.

NORMALIZED-GROUP-CHUNK

Means are normalized to SCALE within node groups of GROUP-SIZE.

SEGMENTED-GD-SPARSE-BM-TRAINER

For the chunks with . Collect the average means over samples in a batch and adjust weights in each cloud connected to it so that the average is closer to SPARSITY-TARGET. This is implemented by keeping track of the average means of the chunks connected to it. The derivative is (M* (MATLISP:TRANSPOSE (M.- C1-MEANS TARGET)) C2-MEANS) and this is added to derivative at the end of the batch. Batch size comes from the superclass.

Undocumented

FACTORED-CLOUD-SHARED-CHUNK (CLOUD)

SEGMENTED-GD-BM-TRAINER