# FUNCTION

# Public

# BIN-AND-COUNT (SEQUENCE N)

Make N equal width bins and count the number of elements of sequence
that belong in each.
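
A minimal Python sketch of this kind of equal-width binning (the edge handling here — clamping the maximum value into the last bin — is an assumption, not necessarily what the library does):

```python
def bin_and_count(sequence, n):
    """Count elements of sequence falling in n equal-width bins over its range."""
    lo, hi = min(sequence), max(sequence)
    width = (hi - lo) / n
    counts = [0] * n
    for x in sequence:
        # Clamp so the maximum value lands in the last bin rather than bin n.
        i = min(int((x - lo) / width), n - 1) if width else 0
        counts[i] += 1
    return counts
```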

# BINOMIAL-CUMULATIVE-PROBABILITY (N K P)

P(X<k) for X a binomial random variable with parameters n & p: the
binomial probability of fewer than k events in N trials, each having
probability p.

# BINOMIAL-GE-PROBABILITY (N K P)

The probability of k or more occurrences in N events, each with
probability p.

# BINOMIAL-PROBABILITY (N K P)

P(X=k) for X a binomial random variable with parameters n & p: the
binomial probability of seeing exactly k events in N trials, each
having probability p. Uses the Poisson approximation if N>100 and
P<0.01.
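
The relationship to the Poisson approximation can be sketched in Python (illustrative only; these are not the library's function names):

```python
from math import comb, exp, factorial

def binomial_probability(n, k, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_probability(mu, k):
    """P(X = k) for X ~ Poisson(mu)."""
    return exp(-mu) * mu**k / factorial(k)

# For large N and small P, Binomial(N, P) is well approximated by
# Poisson(N * P), which avoids enormous binomial coefficients.
n, k, p = 1000, 3, 0.005
exact = binomial_probability(n, k, p)
approx = poisson_probability(n * p, k)
```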

# BINOMIAL-PROBABILITY-CI (N P ALPHA &KEY EXACT?)

Confidence intervals on a binomial probability. If a binomial
probability of p has been observed in N trials, what is the 1-alpha
confidence interval around p? Uses the normal theory approximation
when npq >= 10, unless told otherwise.

# BINOMIAL-TEST-ONE-SAMPLE (P-HAT N P &KEY (TAILS BOTH) (EXACT? NIL))

The significance of a one sample test for the equality of an observed
probability p-hat to an expected probability p under a binomial
distribution with N observations. Use the normal theory approximation
if n*p*(1-p) > 10 (unless the exact flag is true).

# BINOMIAL-TEST-ONE-SAMPLE-SSE (P-ESTIMATED P-NULL &KEY (ALPHA 0.05) (1-BETA 0.95) (TAILS BOTH))

Returns the number of subjects needed to test whether an observed
probability is significantly different from a particular binomial null
hypothesis with a significance alpha and a power 1-beta.

# BINOMIAL-TEST-PAIRED-SSE (PD PA &KEY (ALPHA 0.05) (1-BETA 0.95) (TAILS BOTH))

Sample size estimate for the McNemar (discordant pairs) test. Pd is
the projected proportion of discordant pairs among all pairs, and Pa
is the projected proportion of type A pairs among discordant
pairs. Alpha, 1-beta and tails are as in binomial-test-two-sample-sse.
Returns the number of individuals necessary; that is twice the number
of matched pairs necessary.

# BINOMIAL-TEST-TWO-SAMPLE (P-HAT1 N1 P-HAT2 N2 &KEY (TAILS BOTH) (EXACT? NIL))

Are the observed probabilities of an event (p-hat1 and p-hat2) in
N1/N2 trials different? The normal theory method is implemented here;
the exact test is Fisher's contingency table method, below.

# BINOMIAL-TEST-TWO-SAMPLE-SSE (P1 P2 &KEY (ALPHA 0.05) (SAMPLE-RATIO 1) (1-BETA 0.95) (TAILS BOTH))

The number of subjects needed to test if two binomial probabilities
are different at a given significance alpha and power 1-beta. The
sample sizes can be unequal; the p2 sample is sample-ratio times the
size of the p1 sample. It can be a one tailed or two tailed test.

# CHI-SQUARE (DOF PERCENTILE)

Returns the point which is the indicated percentile in the Chi
Square distribution with dof degrees of freedom.

# CHI-SQUARE-CDF (X DOF)

Computes the left hand tail area under the chi square distribution
with dof degrees of freedom, up to X. Adopted from CLASP 1.4.3,
http://eksl-www.cs.umass.edu/clasp.html

# CHI-SQUARE-TEST-FOR-TREND (ROW1-COUNTS ROW2-COUNTS &OPTIONAL SCORES)

This test works on a 2xk table and assesses whether there is an
increasing or decreasing trend. Arguments are equal-sized lists of
counts. Optionally, provide a list of scores representing some
numeric attribute of each group. If not provided, scores are assumed
to be 1 to k.

# CHI-SQUARE-TEST-ONE-SAMPLE (VARIANCE N SIGMA-SQUARED &KEY (TAILS BOTH))

The significance of a one sample Chi square test for the variance of
a normal distribution. Variance is the observed variance, N is the
number of observations, and sigma-squared is the test variance.

# CHI-SQUARE-TEST-RXC (CONTINGENCY-TABLE)

Takes contingency-table, an RxC array, and returns the significance
of the relationship between the row variable and the column variable.
Any difference in proportion will cause this test to be significant --
consider using the test for trend instead if you are looking for a
consistent change.

# CHOOSE (N K)

The number of ways to take n things k at a time, when order doesn't
matter.
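
For illustration, the combination count in Python (a sketch, not the library's code):

```python
from math import factorial

def choose(n, k):
    """n! / (k! (n-k)!): ways to pick k of n items when order doesn't matter."""
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))
```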

# CONVERT-TO-STANDARD-NORMAL (X MU SIGMA)

Convert X from a Normal distribution with mean mu and variance
sigma to standard normal

# CORRELATION-COEFFICIENT (POINTS)

Just r from linear-regression; also called the Pearson correlation.

# CORRELATION-SSE (RHO &KEY (ALPHA 0.05) (1-BETA 0.95))

Returns the size of a sample necessary to find a correlation of
expected value rho with significance alpha and power 1-beta.

# CORRELATION-TEST-TWO-SAMPLE (R1 N1 R2 N2 &KEY (TAILS BOTH))

Tests if two correlation coefficients are different. Uses Fisher's Z
test.

# F-SIGNIFICANCE (F-STATISTIC NUMERATOR-DOF DENOMINATOR-DOF &OPTIONAL ONE-TAILED-P)

Adopted from CLASP, but changed to handle F < 1 correctly in the
one-tailed case. The `f-statistic' must be a positive number. The
degrees of freedom arguments must be positive integers. The
`one-tailed-p' argument is treated as a boolean.
This implementation follows Numerical Recipes in C, section 6.3 and
the `ftest' function in section 13.4.

# F-TEST (VARIANCE1 N1 VARIANCE2 N2 &KEY (TAILS BOTH))

F test for the equality of two variances

# FALSE-DISCOVERY-CORRECTION (P-VALUES &KEY (RATE 0.05))

A multiple testing correction that is less conservative than Bonferroni.
Takes a list of p-values and a false discovery rate, and returns the
number of p-values that are likely to be good enough to reject the
null at that rate. Returns a second value which is the p-value
cutoff. See
Benjamini Y and Hochberg Y. "Controlling the false discovery rate:
a practical and powerful approach to multiple testing." J R Stat
Soc Ser B 57:289-300, 1995.

# FISHER-EXACT-TEST (CONTINGENCY-TABLE &KEY (TAILS BOTH))

Fisher's exact test. Gives a p value for a particular 2x2
contingency table

# FISHER-Z-TRANSFORM (R)

Transforms the correlation coefficient to an approximately normal
distribution.
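
The transform itself is z = (1/2) ln((1+r)/(1-r)), i.e. arctanh(r), sketched in Python (illustrative, not the library's code):

```python
from math import log

def fisher_z(r):
    """Fisher's z-transform of a correlation coefficient r (|r| < 1).

    The result is approximately normal with standard deviation
    1/sqrt(n - 3) for a sample of size n.
    """
    return 0.5 * log((1 + r) / (1 - r))
```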

# LINEAR-REGRESSION (POINTS)

Computes the regression equation for a least squares fit of a line to
a sequence of points (each a list of two numbers, e.g. '((1.0
0.1) (2.0 0.2))) and reports the intercept, slope, correlation
coefficient r, R^2, and the significance of the difference of the
slope from 0.
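
The underlying least-squares arithmetic can be sketched in Python (returning only intercept, slope and r; the R^2 and significance computations are omitted):

```python
def linear_regression(points):
    """Least-squares fit of y = intercept + slope * x.

    points is a sequence of (x, y) pairs; returns (intercept, slope, r).
    """
    n = len(points)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)       # sum of squares of x
    syy = sum((y - my) ** 2 for y in ys)       # sum of squares of y
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))  # cross products
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / (sxx * syy) ** 0.5
    return intercept, slope, r
```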

# MCNEMARS-TEST (A-DISCORDANT-COUNT B-DISCORDANT-COUNT &KEY (EXACT? NIL))

McNemar's test for correlated proportions, used for longitudinal
studies. Look only at the number of discordant pairs (one treatment is
effective and the other is not). If the two treatments are A and B,
a-discordant-count is the number where A worked and B did not, and
b-discordant-count is the number where B worked and A did not.

# MEAN-SD-N (SEQUENCE)

A combined calculation that is often useful. Takes a sequence and
returns three values: mean, standard deviation and N.

# MODE (SEQUENCE)

Returns two values: a list of the modes and the number of times
they occur.
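
A Python sketch of the same two-valued return (illustrative; returning the tied modes in sorted order is an assumption):

```python
from collections import Counter

def mode(sequence):
    """Return (list of modes, count): every value tied for most frequent."""
    counts = Counter(sequence)
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top), top
```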

# NORMAL-MEAN-CI (MEAN SD N ALPHA)

Confidence interval for the mean of a normal distribution
The 1-alpha percent confidence interval on the mean of a normal
distribution with parameters mean, sd & n.

# NORMAL-MEAN-CI-ON-SEQUENCE (SEQUENCE ALPHA)

The 1-alpha confidence interval on the mean of a sequence of
numbers drawn from a Normal distribution.

# NORMAL-PDF (X MU SIGMA)

The probability density function (PDF) for a normal distribution
with mean mu and variance sigma at point x.

# NORMAL-SD-CI (SD N ALPHA)

As normal-variance-ci, but a confidence interval for the standard
deviation.

# NORMAL-VARIANCE-CI (VARIANCE N ALPHA)

The 1-alpha confidence interval on the variance of a sequence of
numbers drawn from a Normal distribution.

# PERMUTATIONS (N K)

The number of ways to take n things k at a time, when order matters.

# PHI (X)

The CDF of the standard normal distribution. Adopted from CLASP 1.4.3,
see copyright notice at http://eksl-www.cs.umass.edu/clasp.html

# POISSON-CUMULATIVE-PROBABILITY (MU K)

Probability of seeing fewer than K events over a time period when
the expected number of events over that time is mu.

# POISSON-GE-PROBABILITY (MU X)

Probability of X or more events when expected is mu.

# POISSON-MU-CI (X ALPHA)

Confidence interval for the Poisson parameter mu
Given x observations in a unit of time, what is the 1-alpha confidence
interval on the Poisson parameter mu (= lambda*T)?
Since find-critical-value assumes that the function is monotonically
increasing, we adjust the value we are looking for, taking advantage
of reflection.

# POISSON-PROBABILITY (MU K)

Probability of seeing k events over a time period when the expected
number of events over that time is mu.

# POISSON-TEST-ONE-SAMPLE (OBSERVED MU &KEY (TAILS BOTH) (APPROXIMATE? NIL))

The significance of a one sample test for the equality of an observed
number of events (observed) and an expected number mu under the
Poisson distribution. The normal theory approximation is not very
good, so it is not used unless requested.

# RANDOM-NORMAL (&KEY (MEAN 0) (SD 1))

Returns a random number drawn from a normal distribution with the
specified mean and standard deviation.

# RANDOM-PICK (SEQUENCE)

Random selection from sequence

# RANDOM-SAMPLE (N SEQUENCE)

Return a random sample of size N from sequence, without replacement.
If N is equal to or greater than the length of the sequence, return
the entire sequence.
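
A Python sketch of the same behavior, using the standard library's `random.sample` (the Lisp implementation may differ):

```python
import random

def random_sample(n, sequence):
    """Random sample of size n without replacement.

    Returns the whole sequence (as a list) when n >= its length.
    """
    seq = list(sequence)
    if n >= len(seq):
        return seq
    return random.sample(seq, n)
```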

# ROUND-FLOAT (X &KEY (PRECISION 5))

Rounds a floating point number to a specified number of digits
precision.

# SIGN-TEST (PLUS-COUNT MINUS-COUNT &KEY (EXACT? NIL) (TAILS BOTH))

Really just a special case of the binomial one sample test with p =
1/2. The normal theory version has a correction factor to make it a
better approximation.

# SIGN-TEST-ON-SEQUENCES (SEQUENCE1 SEQUENCE2 &KEY (EXACT? NIL) (TAILS BOTH))

Same as sign-test, but takes two sequences and tests whether the
entries in one are different from (greater or less than) those in the
other.

# SPEARMAN-RANK-CORRELATION (POINTS)

Spearman rank correlation computes the relationship between a pair of
variables when one or both are either ordinal or have a distribution
that is far from normal. It takes a list of points (same format as
linear-regression) and returns the Spearman rank correlation
coefficient and its significance.

# T-DISTRIBUTION (DOF PERCENTILE)

Returns the point which is the indicated percentile in the T
distribution with dof degrees of freedom. Adopted from CLASP 1.4.3,
http://eksl-www.cs.umass.edu/clasp.html

# T-SIGNIFICANCE (T-STATISTIC DOF &KEY (TAILS BOTH))

Lookup table in Rosner; this is adopted from CLASP/Numerical
Recipes (CLASP 1.4.3), http://eksl-www.cs.umass.edu/clasp.html

# T-TEST-ONE-SAMPLE (X-BAR SD N MU &KEY (TAILS BOTH))

The significance of a one sample T test for the mean of a normal
distribution with unknown variance. X-bar is the observed mean, sd is
the observed standard deviation, N is the number of observations and
mu is the test mean.
See also t-test-one-sample-on-sequence

# T-TEST-ONE-SAMPLE-ON-SEQUENCE (SEQUENCE MU &KEY (TAILS BOTH))

As t-test-one-sample, but calculates the observed values from a
sequence of numbers.

# T-TEST-ONE-SAMPLE-SSE (MU MU-NULL VARIANCE &KEY (ALPHA 0.05) (1-BETA 0.95) (TAILS BOTH))

Returns the number of subjects needed to test whether the mean of a
normally distributed sample mu is different from a null hypothesis
mean mu-null and variance variance, with alpha, 1-beta and tails as
specified.

# T-TEST-PAIRED (D-BAR SD N &KEY (TAILS BOTH))

The significance of a paired t test for the means of two normal
distributions in a longitudinal study. D-bar is the mean difference,
sd is the standard deviation of the differences, N is the number of
pairs.

# T-TEST-PAIRED-ON-SEQUENCES (BEFORE AFTER &KEY (TAILS BOTH))

The significance of a paired t test for means of two normal
distributions in a longitudinal study. Before is a sequence of before
values, after is the sequence of paired after values (which must be
the same length as the before sequence).

# T-TEST-PAIRED-SSE (DIFFERENCE-MU DIFFERENCE-VARIANCE &KEY (ALPHA 0.05) (1-BETA 0.95) (TAILS BOTH))

Returns the number of subjects needed to test whether paired
differences with mean difference-mu and variance difference-variance
are different from zero, with alpha, 1-beta and tails as specified.

# T-TEST-TWO-SAMPLE (X-BAR1 SD1 N1 X-BAR2 SD2 N2 &KEY (VARIANCES-EQUAL? TEST) (VARIANCE-SIGNIFICANCE-CUTOFF 0.05) (TAILS BOTH))

The significance of the difference of two means (x-bar1 and x-bar2)
with standard deviations sd1 and sd2, and sample sizes n1 and n2
respectively. The form of the two sample t test depends on whether
the sample variances are equal or not. If the variable
variances-equal? is :test, then we use an F test and the
variance-significance-cutoff to determine if they are equal. If the
variances are equal, then we use the two sample t test for equal
variances. If they are not equal, we use the Satterthwaite method,
which has good type I error properties (at the loss of some power).
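
The Satterthwaite method mentioned here replaces the pooled degrees of freedom with an approximation; the standard formula can be sketched in Python (illustrative, not the library's code):

```python
def welch_satterthwaite_df(sd1, n1, sd2, n2):
    """Approximate degrees of freedom for the unequal-variance t test.

    Uses the Welch-Satterthwaite formula on the two per-sample
    variance-over-n terms.
    """
    a, b = sd1**2 / n1, sd2**2 / n2
    return (a + b) ** 2 / (a**2 / (n1 - 1) + b**2 / (n2 - 1))
```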

# T-TEST-TWO-SAMPLE-ON-SEQUENCES (SEQUENCE1 SEQUENCE2 &KEY (VARIANCE-SIGNIFICANCE-CUTOFF 0.05) (TAILS BOTH))

Same as t-test-two-sample, but providing the sequences rather than
the summaries.

# T-TEST-TWO-SAMPLE-SSE (MU1 VARIANCE1 MU2 VARIANCE2 &KEY (SAMPLE-RATIO 1) (ALPHA 0.05) (1-BETA 0.95) (TAILS BOTH))

Returns the number of subjects needed to test whether the mean mu1 of
a normally distributed sample (with variance variance1) is different
from a second sample with mean mu2 and variance variance2, with alpha,
1-beta and tails as specified. It is also possible to set a sample
size ratio of sample 1 to sample 2.

# WILCOXON-SIGNED-RANK-TEST (DIFFERENCES &OPTIONAL (TAILS BOTH))

A test on the ranking of positive and negative differences (are the
positive differences significantly larger/smaller than the negative
ones). Assumes a continuous and symmetric distribution of differences,
although not a normal one. This is the normal theory approximation,
which is only valid when N > 15.
The closely related Wilcoxon rank-sum test, for two independent
samples, is equivalent to the Mann-Whitney test.

# Z (PERCENTILE &KEY (EPSILON 1.d-15))

The inverse normal function, P(X<Zu) = u where X is distributed as
the standard normal. Uses binary search.
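
The binary-search approach can be sketched in Python (bracketing the answer in [-10, 10] is an assumption of this sketch):

```python
from math import erf, sqrt

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1 + erf(x / sqrt(2)))

def z(percentile, epsilon=1e-12):
    """Inverse normal CDF: find z such that phi(z) = percentile.

    Bisects on the monotonically increasing CDF until the bracket is
    narrower than epsilon.
    """
    lo, hi = -10.0, 10.0
    while hi - lo > epsilon:
        mid = (lo + hi) / 2
        if phi(mid) < percentile:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```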

# Z-TEST (X-BAR N &KEY (MU 0) (SIGMA 1) (TAILS BOTH))

The significance of a one sample Z test for the mean of a normal
distribution with known variance.
mu is the null hypothesis mean, x-bar is the observed mean, sigma is
the standard deviation and N is the number of observations. If tails
is :both, it is the significance of a difference between x-bar and
mu; if :positive, the significance of x-bar being greater than mu;
if :negative, the significance of x-bar being less than mu.
Returns a p value.

# Undocumented

# COEFFICIENT-OF-VARIATION (SEQUENCE)

# CORRELATION-TEST-TWO-SAMPLE-ON-SEQUENCES (POINTS1 POINTS2 &KEY (TAILS BOTH))

# GEOMETRIC-MEAN (SEQUENCE &OPTIONAL (BASE 10))

# MEAN (SEQUENCE)

# MEDIAN (SEQUENCE)

# NORMAL-SD-CI-ON-SEQUENCE (SEQUENCE ALPHA)

# NORMAL-VARIANCE-CI-ON-SEQUENCE (SEQUENCE ALPHA)

# PERCENTILE (SEQUENCE PERCENT)

# RANGE (SEQUENCE)

# SD (SEQUENCE)

# STANDARD-DEVIATION (SEQUENCE)

# STANDARD-ERROR-OF-THE-MEAN (SEQUENCE)

# VARIANCE (SEQUENCE)

# WILCOXON-SIGNED-RANK-TEST-ON-SEQUENCES (SEQUENCE1 SEQUENCE2 &OPTIONAL (TAILS BOTH))

# Z-TEST-ON-SEQUENCE (SEQUENCE &KEY (MU 0) (SIGMA 1) (TAILS BOTH))

# Private

# 2-TAILED-CORRELATION-SIGNIFICANCE (N R)

We use the first line for anything less than 5, and the last line
for anything over 500. Otherwise, find the nearest value (maybe we
should interpolate ... too much bother!)

# ANOVA1 (D)

One way simple ANOVA, from Neter, et al. p677+. Data is given as a
list of lists, each one representing a treatment, and each containing
the observations.

# ANOVA2 (A1B1 A1B2 A2B1 A2B2)

Two-Way Anova. (From Misanin & Hinderliter, 1991, p. 367-) This is
specialized for four groups of equal n, called by their plot location
names: left1 left2 right1 right2.

# ANOVA2R (G1 G2)

Two way ANOVA with repeated measures on one dimension. From
Ferguson & Takane, 1989, p. 359. Data is organized differently for
this test: each group (g1, g2) contains a list of each subject's
repeated measures, e.g. g1: ((t1s1g1 t2s1g1 ...) (t1s2g1 t2s2g1
...) ...). Every subject must have the same number of repeated
measures, and this assumes the same number of subjects in each group.

# AVERAGE-RANK (VALUE SORTED-VALUES)

Average rank calculation for non-parametric tests. Ranks are 1
based, but lisp is 0 based, so add 1!
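
A Python sketch of the midrank computation (1-based, as described above):

```python
def average_rank(value, sorted_values):
    """Average of the 1-based positions at which value occurs in sorted_values.

    Ties therefore receive their midrank, as required by
    non-parametric rank tests.
    """
    positions = [i + 1 for i, v in enumerate(sorted_values) if v == value]
    return sum(positions) / len(positions)
```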

# BETA-INCOMPLETE (A B X)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

# CORRELATE (X Y)

Correlation of two sequences, as in Ferguson & Takane, 1989,
p. 125. Assumes NO MISSING VALUES!

# CROSS-MEAN (L &AUX K R)

Cross-mean takes a list of lists, e.g. ((1 2 3) (4 3 2 1) ...), and
produces a list with the mean and standard error for each vertical
entry, e.g. ((2.5 . 1) ...), where the first pair is computed from
the first element of every sublist in the input set, the second pair
from the second elements, and so on. This is useful in some cases of
data crunching.
Note that missing data is assumed to be always at the END of lists.
If it isn't, you've got to do something previously to interpolate.

# DUMPLOT (V &OPTIONAL SHOW-VALUES)

A dumb terminal way of plotting data.

# ERROR-FUNCTION (X)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

# ERROR-FUNCTION-COMPLEMENT (X)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

# FIND-CRITICAL-VALUE (P-FUNCTION P-VALUE &OPTIONAL (X-TOLERANCE 1.e-5) (Y-TOLERANCE 1.e-5))

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

# GAMMA-INCOMPLETE (A X)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

# GAMMA-LN (X)

Adopted from CLASP 1.4.3, http://eksl-www.cs.umass.edu/clasp.html

# HARMONIC-MEAN (SEQ)

See: http://mathworld.wolfram.com/HarmonicMean.html

# HISTOVALUES (V* &KEY (NBINS 10))

Take a set of values and produce a histogram binned into n groups,
so that you can get a report of the distribution of values. There's a
large chance for off-by-one errors here!

# LMEAN (LL)

Lmean takes the mean of entries in a list of lists vertically.
So: (lmean '((1 2) (5 6))) -> (3 4). The sublists have to be the
same length.
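
A Python sketch of the same column-wise mean (illustrative):

```python
def lmean(lists):
    """Element-wise (column) mean of equal-length lists."""
    return [sum(col) / len(col) for col in zip(*lists)]
```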

# N-RANDOM (N L &AUX R)

Select n random sublists from a list, without replacement. This
copies the list and then destroys the copy. N better be less than or
equal to (length l).

# NORMALIZE (V)

Normalize a vector by subtracting its min and then dividing through
by its range (max-min). If the numbers are all the same this would
divide by zero, so we check for that first and just return a list of
0.5s if so!
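
A Python sketch of this normalization, including the constant-vector guard:

```python
def normalize(v):
    """Scale v linearly onto [0, 1] via (x - min) / (max - min).

    A constant vector would divide by zero, so it maps to all 0.5s.
    """
    lo, hi = min(v), max(v)
    if lo == hi:
        return [0.5] * len(v)
    return [(x - lo) / (hi - lo) for x in v]
```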

# PROTECTED-MEAN (L)

Computes a mean while protecting against division by zero, returning
n/a in that case.

# PROUND (N V)

Returns a string that is rounded to the appropriate number of
digits, but the only thing you can do with it is print it. It's just
a convenience hack for rounding recursive lists.

# REGRESS (X Y)

Simple linear regression.

# SAFE-EXP (X)

Eliminates floating point underflow for the exponential function; on
underflow it just returns 0.0d0.

# T1-TEST (VALUES TARGET &OPTIONAL (WARN? T))

One way t-test to see if a group differs from a numerical mean
target value. From Misanin & Hinderliter p. 248.

# T2-TEST (L1 L2)

T2-test calculates an UNPAIRED t-test.
From Misanin & Hinderliter p. 268. The t-cdf part is inherent in
xlispstat, and I'm not entirely sure that it's really the right
computation since it doesn't agree entirely with Table 5 of M&H, but
it's close, so I assume that M&H have round-off error.

# TUKEY-Q (K DFWG)

Finds the Q table for the appropriate K, and then walks BACKWARDS
through it (in a kind of ugly way!) to find the appropriate place in
the table for the DFwg, and then uses the level (which must be 0.01 or
0.05, indicating the first or second column of the table) to determine
if the Q value reaches significance, and gives us a + or - final
result.

# WILCOXON-1 (INITIAL-VALUES TARGET)

Nonparametric one-sample (signed) rank test (Wilcoxon).
From http://www.graphpad.com/instatman/HowtheWilcoxonranksumtestworks.htm

# X2TEST

Simple chi-square from Clarke & Cooke p. 431; should = ~7.0

# Undocumented

# ALL-SQUARES (AS BS &AUX SQUARES)

# BINOMIAL-LE-PROBABILITY (N K P)

# CHI-SQUARE-1 (EXPECTED OBSERVED)

# CHI-SQUARE-2 (TABLE)

# EVEN-POWER-OF-TWO? (N)

# F-SCORE>P-LIMIT? (DF1 DF2 F-SCORE LIMITS-TABLE)

# FACTORIAL (NUMBER)

# MAX* (L &REST LL &AUX M)

# MIN* (L &REST LL &AUX M)

# P2 (V)

# ROUND-UP (X)

# S2 (L N)

# SIGN (X)

# SQR (A)

# STANDARD-ERROR (SEQUENCE)

# SUM (L &AUX (SUM 0))

# T-P-VALUE (X DF &OPTIONAL (WARN? T))

# T1-VALUE (VALUES TARGET)

# T2-VALUE (L1 L2)

# TESTANOVA2

# MACRO

# Public

# Undocumented

# SQUARE (X)

# TEST-VARIABLES (&REST ARGS)

# Private

# UNDERFLOW-GOES-TO-ZERO (&BODY BODY)

Protects against floating point underflow errors and sets the value to 0.0 instead.

# Z/PROTECT (EXPR TESTVAR)

Macro to protect from division by zero.