# FUNCTION

# Public

# ADD-CONFUSION-MATRIX (MATRIX RESULT-MATRIX)

Add MATRIX into RESULT-MATRIX.

# BACKING-ARRAY (ARRAY)

Return the array in which the contents of ARRAY are stored. For
simple arrays, this is always the array itself. The second value is
the displacement.

# BINARIZE-RANDOMLY (X)

Return 1 with probability X and 0 otherwise.

# BINOMIAL-LOG-LIKELIHOOD-RATIO (K1 N1 K2 N2)

See "Accurate Methods for the Statistics of Surprise and
Coincidence" by Ted Dunning (http://citeseer.ist.psu.edu/29096.html).
All classes must have non-zero counts, that is, K1, N1-K1, K2, N2-K2
are positive integers. To ensure this - and also as a kind of
prior - add a small number such as 1 to K1 and K2, and 2 to N1 and N2
before calling.
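Concretely, Dunning's binomial log-likelihood ratio statistic can be written as follows (a sketch reconstructed from the cited paper, not from this library's source):

```latex
\begin{aligned}
-2\log\lambda &= 2\bigl[\log L(p_1,k_1,n_1) + \log L(p_2,k_2,n_2)\\
              &\qquad\; - \log L(p,k_1,n_1) - \log L(p,k_2,n_2)\bigr],\\
\log L(p,k,n) &= k\log p + (n-k)\log(1-p),\\
p_1 &= k_1/n_1, \qquad p_2 = k_2/n_2, \qquad p = \frac{k_1+k_2}{n_1+n_2}.
\end{aligned}
```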

# BREAK-SEQ (FRACTIONS SEQ)

Split SEQ into a number of subsequences. FRACTIONS is either a
positive integer or a list of non-negative real numbers. If FRACTIONS
is a positive integer then return a list of that many subsequences of
equal size (bar rounding errors), else split SEQ into subsequences,
where the length of subsequence I is proportional to element I of
FRACTIONS:
(BREAK-SEQ '(2 3) '(0 1 2 3 4 5 6 7 8 9))
=> ((0 1 2 3) (4 5 6 7 8 9))
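For the positive-integer case, a sketch of the expected behaviour inferred from the description above (not output captured from the actual implementation):

```lisp
;; Two subsequences of equal size; inferred from the docstring,
;; not verified against the implementation.
(break-seq 2 '(0 1 2 3 4 5))
;; => ((0 1 2) (3 4 5))
```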

# COMPUTE-FEATURE-DISAMBIGUITIES (DOCUMENTS MAPPER CLASS-FN &KEY (CLASSES (ALL-DOCUMENT-CLASSES DOCUMENTS CLASS-FN)))

Return scored features as an EQUAL hash table whose keys are
features of DOCUMENTS and values are their disambiguity scores.
MAPPER takes a function and a document and calls the function with
features of the document.

# COMPUTE-FEATURE-LLRS (DOCUMENTS MAPPER CLASS-FN &KEY (CLASSES (ALL-DOCUMENT-CLASSES DOCUMENTS CLASS-FN)))

Return scored features as an EQUAL hash table whose keys are
features of DOCUMENTS and values are their log likelihood ratios.
MAPPER takes a function and a document and calls the function with
features of the document.

# CONFUSION-MATRIX-ACCURACY (MATRIX &KEY FILTER)

Return the overall accuracy of the results in MATRIX. It's computed
as the number of correctly classified cases (hits) divided by the
number of cases. Return the number of hits and the number of cases as
the second and third values. If the FILTER function is given, then
call it with the target and the prediction of each cell. Disregard
cells for which FILTER returns NIL.

Precision and recall can be easily computed by giving the right
filter, although those are provided in separate convenience
functions.
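For instance, precision for a single class can be obtained through FILTER (a hypothetical sketch; :SPAM is a made-up class name, and CONFUSION-MATRIX-PRECISION wraps this pattern for you):

```lisp
;; Accuracy restricted to cells where the classifier said :SPAM,
;; i.e. the precision of the :SPAM prediction.
(confusion-matrix-accuracy matrix
                           :filter (lambda (target prediction)
                                     (declare (ignore target))
                                     (eq prediction :spam)))
```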

# CONFUSION-MATRIX-PRECISION (MATRIX PREDICTION)

Return the accuracy over the cases when the classifier said
PREDICTION.

# CONFUSION-MATRIX-RECALL (MATRIX TARGET)

Return the accuracy over the cases when the correct class is
TARGET.

# COUNT-FEATURES (DOCUMENTS MAPPER)

Return scored features as an EQUAL hash table whose keys are
features of DOCUMENTS and values are counts of occurrences of
features. MAPPER takes a function and a document and calls the
function with features of the document.

# ENCODE/BAG-OF-WORDS (DOCUMENT MAPPER FEATURE->INDEX &KEY (KIND BINARY))

Return a sparse vector that represents the encoded DOCUMENT. Get
the features of DOCUMENT from MAPPER, convert each feature to an index
by FEATURE->INDEX. FEATURE->INDEX may return NIL if the feature is not
used. The result is a vector of index/value conses. Indexes are unique
within the vector and are in increasing order. Depending on KIND, the
value is calculated differently: for :FREQUENCY it is the number of
times the corresponding feature was found in DOCUMENT, for :BINARY it
is always 1. :NORMALIZED-FREQUENCY and :NORMALIZED-BINARY are like the
unnormalized counterparts except that as the final step values in the
assembled sparse vector are normalized to sum to 1.
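A hypothetical usage sketch (WORD-MAPPER and the vocabulary are assumptions for illustration, not part of the API; ALIST->HASH-TABLE is listed below):

```lisp
;; Encode a document against a fixed two-word vocabulary.
;; WORD-MAPPER is a hypothetical mapper that calls its function
;; argument with each word of the document.
(let ((feature->index (alist->hash-table '(("the" . 0) ("cat" . 1))
                                         :test #'equal)))
  (encode/bag-of-words document
                       #'word-mapper
                       (lambda (feature)
                         (gethash feature feature->index))
                       :kind :frequency))
;; e.g. ((0 . 2) (1 . 1)) if "the" occurs twice and "cat" once
```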

# GAUSSIAN-RANDOM-1

Return a single float drawn from the standard normal distribution
(zero mean, unit variance).

# INDEX-SCORED-FEATURES (FEATURE-SCORES N &KEY (START 0))

Take scored features as a feature -> score hash table (returned by
COUNT-FEATURES or COMPUTE-FEATURE-LLRS, for instance) and return a
feature -> index hash table that maps the first N (or fewer) features
with the highest scores to distinct dense indices starting from
START.
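These functions compose into a simple feature-selection pipeline (a hedged sketch; DOCUMENTS and WORD-MAPPER are assumptions, not part of the API):

```lisp
;; Count features across the corpus, keep the 1000 highest-scoring
;; ones, then encode a document against the resulting dense indices.
(let* ((counts (count-features documents #'word-mapper))
       (feature->index (index-scored-features counts 1000)))
  (encode/bag-of-words (first documents)
                       #'word-mapper
                       (lambda (feature)
                         (gethash feature feature->index))))
```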

# MAKE-N-GRAM-MAPPEE (FUNCTION N)

Make a function of a single argument that's suitable as the
function argument to a mapper function. It calls FUNCTION with every
N consecutive elements (that is, with each n-gram).

# MAKE-RANDOM-GENERATOR (SEQ)

Return a function that returns elements of SEQ in random order
without end. When there are no more elements, start over with a
different random order.

# MAKE-SEQ-GENERATOR (VECTOR)

Return a function that returns elements of VECTOR in order without
end. When there are no more elements, start over.

# MULTINOMIAL-LOG-LIKELIHOOD-RATIO (K1 K2)

See "Accurate Methods for the Statistics of Surprise and
Coincidence" by Ted Dunning (http://citeseer.ist.psu.edu/29096.html).
K1 is a sequence of counts of outcomes in each class. K2 is the same
for a possibly different process.
All elements in K1 and K2 must be positive integers. To ensure this -
and also as a kind of prior - add a small number such as 1 to each
element of K1 and K2 before calling.
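The multinomial generalization of the statistic can be sketched as follows, with $n_1 = \sum_i k_{1i}$ and $n_2 = \sum_i k_{2i}$ (reconstructed from the cited paper, not from this library's source):

```latex
-2\log\lambda = 2\sum_i \left[
  k_{1i}\log\frac{k_{1i}}{n_1} + k_{2i}\log\frac{k_{2i}}{n_2}
  - (k_{1i}+k_{2i})\log\frac{k_{1i}+k_{2i}}{n_1+n_2}
\right]
```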

# MV-GAUSSIAN-RANDOM (&KEY MEANS COVARIANCES (COVARIANCES-LEFT-SQUARE-ROOT (CHOLESKY (HERMITIAN-MATRIX COVARIANCES))))

Return a column vector of samples from the multivariate normal
distribution defined by MEANS (Nx1) and COVARIANCES (NxN). For
multiple calls with the same parameters one can pass in
COVARIANCES-LEFT-SQUARE-ROOT instead of COVARIANCES.

# NSHUFFLE-VECTOR (VECTOR)

Shuffle VECTOR in place using the Fisher-Yates algorithm.
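A minimal sketch of the Fisher-Yates shuffle for reference (MY-NSHUFFLE-VECTOR is a made-up name; this is the standard algorithm, not the library's actual source):

```lisp
(defun my-nshuffle-vector (vector)
  "Shuffle VECTOR in place by swapping each position with a
uniformly chosen earlier (or same) position."
  (loop for i from (1- (length vector)) downto 1
        for j = (random (1+ i))
        do (rotatef (aref vector i) (aref vector j)))
  vector)
```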

# REVERSE-HASH-TABLE (HASH-TABLE &KEY (TEST #'EQL))

Return a hash table that maps from the values of HASH-TABLE back to
its keys. HASH-TABLE had better be a bijection.

# SPLIT-BODY (BODY)

Return a list of declarations and the rest of BODY.

# STRATIFIED-SPLIT (FRACTIONS SEQ &KEY (KEY #'IDENTITY) (TEST #'EQL) RANDOMIZEP)

Similar to BREAK-SEQ, but also makes sure that keys are equally
distributed among the partitions. It can be useful for classification
tasks to partition the data set while keeping the distribution of
classes the same.

# Undocumented

# ADD-TO-RUNNING-STAT (X STAT)

# ALIST->HASH-TABLE (ALIST &REST ARGS)

# APPEND1 (LIST OBJ)

# AS-COLUMN-VECTOR (A)

# ASDF-SYSTEM-RELATIVE-PATHNAME (PATHNAME)

# CALL-PERIODIC-FN (N FN &REST ARGS)

# CALL-PERIODIC-FN! (N FN &REST ARGS)

# CLEAR-RUNNING-STAT (STAT)

# FILL! (ALPHA X)

# FLT (X)

# FLT-VECTOR (&REST ARGS)

# GROUP (SEQ N)

# HASH-TABLE->ALIST (HASH-TABLE)

# HASH-TABLE->VECTOR (HASH-TABLE)

# LAST1 (SEQ)

# MAKE-FLT-ARRAY (DIMENSIONS &KEY (INITIAL-ELEMENT 0.0d0))

# MAX-POSITION (SEQ START END)

# POISSON-RANDOM (MEAN)

# PRINT-TABLE (LIST &KEY (STREAM T))

# READ-DOUBLE-FLOAT-ARRAY (ARRAY STREAM)

# READ-INDEXED-FEATURES (STREAM)

# RUNNING-STAT-MEAN (STAT)

# RUNNING-STAT-VARIANCE (STAT)

# SCALED-TANH (X)

# SECH (X)

# SELECT-RANDOM-ELEMENT (SEQ)

# SHUFFLE-VECTOR (VECTOR)

# SIGMOID (X)

# SIGN (X)

# SPLIT-PLIST (LIST KEYS)

# SUBSEQ* (SEQUENCE START &OPTIONAL END)

# SUFFIX-SYMBOL (SYMBOL &REST SUFFIXES)

# TO-SCALAR (MATRIX)

# TRY-CHANCE (CHANCE)

# WRITE-DOUBLE-FLOAT-ARRAY (ARRAY STREAM)

# WRITE-INDEXED-FEATURES (FEATURES->INDICES STREAM)

# Private

# Undocumented

# ->DESCRIPTION (OBJECT DESCRIPTION)

# ALL-DOCUMENT-CLASSES (DOCUMENTS CLASS-FN)

# COLLECT-DISTINCT (SEQ &KEY (KEY #'IDENTITY) (TEST #'EQL))

# COMPACT-BINARY-FEATURE-VECTOR (FEATURE-VECTOR)

# DOCUMENT-FEATURES (DOCUMENT MAPPER)

# FORMAT-DESCRIPTION (DESCRIPTION STREAM)

# PPRINT-DESCRIPTIONS (CLASS DESCRIPTIONS STREAM)

# READ-AS-BYTES (N STREAM)

# WRITE-AS-BYTES (INTEGER N STREAM)

# MACRO

# Public

# REPEATEDLY (&BODY BODY)

Like CONSTANTLY but evaluates BODY each time the returned function
is called.

# SPECIAL-CASE (TEST &BODY BODY)

Let the compiler compile BODY both for the case when TEST is true and
for when it's false. The purpose is to allow different constraints to
propagate to the two branches, allowing them to be better optimized.

# Undocumented

# DEFINE-DESCRIPTIONS ((OBJECT CLASS &KEY INHERITP) &BODY DESCRIPTIONS)

# DEFINE-SLOTS-NOT-TO-BE-COPIED (CONTEXT CLASS &BODY SLOT-NAMES)

# DEFINE-SLOTS-TO-BE-SHALLOW-COPIED (CONTEXT CLASS &BODY SLOT-NAMES)

# PUSH-ALL (LIST PLACE)

# THE! (&REST ARGS)

# WHILE (TEST &BODY BODY)

# WITH-COPYING (&BODY BODY)

# WITH-GENSYMS (VARS &BODY BODY)

# Private

# Undocumented

# DEFINE-SLOT-NOT-TO-BE-COPIED (CONTEXT CLASS SLOT-NAME)

# DEFINE-SLOT-TO-BE-SHALLOW-COPIED (CONTEXT CLASS SLOT-NAME)

# WITH-SAFE-PRINTING (&BODY BODY)

# WITH-ZERO-ON-UNDERFLOW (&BODY BODY)

# GENERIC-FUNCTION

# Public

# CONFUSION-CLASS-NAME (MATRIX CLASS)

Name of CLASS for presentation purposes.

# CONFUSION-MATRIX-CLASSES (MATRIX)

A list of all classes. The default is to collect
classes from the counts. This can be overridden if, for instance, some
classes are not present in the results.

# COPY (CONTEXT OBJECT)

Make a deepish copy of OBJECT in CONTEXT.

# COPY-OBJECT-EXTRA-INITARGS (CONTEXT ORIGINAL-OBJECT)

Return a list of extra initargs to use when creating the copy of
ORIGINAL-OBJECT in CONTEXT.

# COPY-OBJECT-SLOT (CONTEXT ORIGINAL-OBJECT SLOT-NAME VALUE)

Return the value of the slot in the copied object and T, or NIL as
the second value if the slot need not be initialized. The default
implementation of COPY for standard objects calls COPY-OBJECT-SLOT
for each slot.

# MAP-CONFUSION-MATRIX (FN MATRIX)

Call FN with TARGET, PREDICTION and COUNT parameters
for each cell in the confusion matrix. Cells with a zero count may be
omitted.

# READ-WEIGHTS (OBJECT STREAM)

Read the weights of OBJECT from STREAM.

# SORT-CONFUSION-CLASSES (MATRIX CLASSES)

Return a list of CLASSES sorted for presentation
purposes.

# WRITE-WEIGHTS (OBJECT STREAM)

Write the weights of OBJECT to STREAM.

# Undocumented

# CONFUSION-COUNT (MATRIX TARGET PREDICTION)

# (SETF CONFUSION-COUNT) (COUNT MATRIX TARGET PREDICTION)

# SLOT-ACCESSOR

# Public

# Undocumented

# INDEX (OBJECT)

# LAST-EVAL (OBJECT)

# (SETF LAST-EVAL) (NEW-VALUE OBJECT)

# Private

# Undocumented

# COUNTS (OBJECT)

# FN (OBJECT)

# PERIOD (OBJECT)

# VARIABLE

# Public

# Undocumented

# *NO-ARRAY-BOUNDS-CHECK*

# Private

# Undocumented

# *MGL-DIR*

# CLASS

# Public

# CONFUSION-MATRIX

A confusion matrix keeps count of classification
results. The correct class is called `target' and the output of the
classifier is called `prediction'. Classes are compared with EQUAL.