# FUNCTION

# Public

# JARO-DISTANCE (STR1 STR2 &KEY (START1 0) (END1 NIL) (START2 0) (END2 NIL) (TEST #'EQL HAVE-TEST) (TEST-NOT NIL HAVE-TEST-NOT) (KEY #'IDENTITY))

jaro-distance SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => DISTANCE

# LEVENSHTEIN-DISTANCE

levenshtein-distance S1 S2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => NUMBER
Computes the Levenshtein distance between sequences S1 and S2. The result
value DISTANCE is the minimum number of edit operations required to
transform S1 into S2 (or vice versa), where allowed operations are:
- insert a single character at some position
- delete a single character at some position
- substitute a single character at some position by another one
The Levenshtein distance is a measure of similarity between strings. If
two strings have a distance of 0, they are equal. The TEST function is
used to compare sequence elements. The default test function is eql.
This function is a generalization of string-levenshtein-distance
for arbitrary sequence types. Use string-levenshtein-distance, if you
need a version optimized for use with strings.

# LONGEST-COMMON-SUBSEQUENCE*-LENGTH (SEQ1 SEQ2 &KEY (START1 0) (END1 NIL) (START2 0) (END2 NIL) (TEST #'EQL HAVE-TEST) (TEST-NOT NIL HAVE-TEST-NOT) (KEY #'IDENTITY HAVE-KEY))

longest-common-subsequence*-length SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LENGTH
Returns the length of the (or: a) longest common contiguous subsequence
of sequences SEQ1 and SEQ2. This problem is usually called the longest
common `substring´ problem, but in order to avoid confusion, the we use the
term subsequence* here to refer to contiguous subsequences.

# LONGEST-COMMON-SUBSEQUENCES* (SEQ1 SEQ2 &KEY (START1 0) (END1 NIL) (START2 0) (END2 NIL) (TEST #'EQL HAVE-TEST) (TEST-NOT NIL HAVE-TEST-NOT) (KEY #'IDENTITY HAVE-KEY))

longest-common-subsequences* SEQ1 SEQ2 &key START1 END1 START2 END2 TEST TEST-NOT KEY => LENGTH
Returns a list containing all contigous longest common subsequences
of SEQ1 and SEQ2. This problem is usually called the longest common
`substring´ problem, but in order to avoid confusion, the we use the
term subsequence* here to refer to contiguous subsequences.

# MAP-NGRAMS (FUNCTION SIZE LIST &KEY (START-PADDING NIL) (END-PADDING NIL))

map-ngrams FUNCTION N LIST &key START-PADDING END-PADDING => UNDEFINED
Calls FUNCTION for each n-gram of size N constructed from the elements
in the given LIST. Initially, the value of START-PADDING is used to fill
the first N - 1 elements in the call. When the list is exhausted, the
value of END-PADDING is used to pad to a size of N. The function must
accept N arguments.

# STRING-JARO-DISTANCE

string-jaro-distance STR1 STR2 &key START1 END1 START2 END2 CASE-SENSITIVE => DISTANCE

# STRING-LEVENSHTEIN-DISTANCE

string-levenshtein-distance S1 S2 &key START1 END1 START2 END2
CASE-SENSITIVE => DISTANCE
Computes the Levenshtein distance between strings S1 and S2. The result
value DISTANCE is the minimum number of edit operations required to
transform S1 into S2 (or vice versa), where allowed operations are:
- insert a single character at some position
- delete a single character at some position
- substitute a single character at some position by another one
The Levenshtein distance is a measure of similarity between strings. If
two strings have a distance of 0, they are equal.
If CASE-SENSITIVE is true (which is the default), then comparing of
characters is done in a case sensitive way, distinguishing between lower
and upper case letters. If CASE-SENSITIVE is false, then this function
does not distinguish between lower case and upper case letters.
See function levenshtein-distance for a generalization of this function
to arbitrary sequences.