Common Lisp Package: AUTOMATON

README:

FUNCTION

Public

REGEXP-AUTOMATON (R &OPTIONAL AS)

Returns a new automaton object corresponding to regexp R. AS is a hash table mapping from identifiers to auxiliary named automata. (An error is signaled if R uses an identifier not in AS.) The constructed automaton is deterministic and minimal, and has no transitions to dead states.

RUN (A STR)

Returns true if STR is accepted by A. As a side-effect, A is determinized if not already deterministic. Complexity: linear in the length of STR (when A is deterministic).

RUN-TO-FIRST-MATCH (A STR &OPTIONAL (START 0) (END (LENGTH STR)))

Returns the end position of match if a substring of STR, optionally between positions START and END, is found that is accepted by A; otherwise, returns nil. Complexity: linear in the length of STR (when A is deterministic).

RUN-TO-FIRST-UNMATCH (A STR &OPTIONAL (START 0) (END (LENGTH STR)))

Returns the end position of match if a substring of STR, optionally between positions START and END, is found that is accepted by A; otherwise, returns nil. A greedy approach is taken until the first match failure or the end of STR (whatever happens first), trying to extend the match length one character at a time. Complexity: linear in the length of STR (when A is deterministic).

STRING-REGEXP (S &OPTIONAL FS)

Returns a new regexp object corresponding to regular expression string S. FS is a logior or optional syntax flags.

Undocumented

AUTOMATON-EQUAL (A1 A2)

REGEXP-EQUAL (R1 R2)

STATE-EQUAL (S1 S2)

Private

ACCEPTING-STATES (A)

Returns a hash table containing the set of accepting states reachable from the initial state of A.

ACOMPLEMENT (A)

Returns a new deterministic

ACONCATENATE (A1 A2)

Returns a new automaton that accepts the concatenation of the languages of A1 and A2. Complexity: linear in the number of states.

ACONCATENATE-MANY (L)

Returns a new automaton that accepts the concatenation of the languages of automata in list L, respecting the order. Complexity: linear in the total number of states.

ADD-EPSILON (S TO)

Adds transitions of state TO to state S. Also, if TO accepts, so does S.

ADD-EPSILONS (A PAIRS)

Adds epsilon transitions to A and returns it. This is done by adding extra character interval transitions that are equivalent to the given set of epsilon transitions. PAIRS is a list of state-pair objects representing pairs of source-destination states where the epsilon transitions should be added.

AINTERSECTION (A1 A2)

Returns a new deterministic automaton that accepts the intersection of the languages of A and A2. As a side-effect, both A1 and A2 are determinized if not already deterministic. Complexity: quadratic in the number of states (when deterministic).

ANY-CHAR-AUTOMATON

Returns a new deterministic automaton that accepts any single character.

ANY-OF-RIGHT-LENGTH-SUBAUTOMATON (STR N)

Returns a new sub-automaton (root of a state graph) accepting non-negative (decimal) integers of length of (subseq STR N).

ANY-STRING-AUTOMATON

Returns a new deterministic automaton that accepts any string.

AREDUCE (A)

Reduces automaton A by combining overlapping and adjacent edge intervals with the same destination. Finally, returns A.

AREVERSE (A)

Reverses the language of non-singleton A. Returns a hash table of new initial states.

AT-LEAST-SUBAUTOMATON (STR N INITIALS ZEROS)

Returns a new sub-automaton (root of a state graph) accepting non-negative (decimal) integers of value at least the one represented by (subseq STR N), and of length of (subseq STR N).

AT-MOST-SUBAUTOMATON (STR N)

Returns a new sub-automaton (root of a state graph) accepting non-negative (decimal) integers of value at most the one represented by (subseq STR N), and of length of (subseq STR N).

AUNION (A1 A2)

Returns a new automaton that accepts the union of the languages of A1 and A2. Complexity: linear in the number of states.

AUNION-MANY (L)

Returns a new automaton that accepts the union of the languages of automata given in list L.

BETWEEN-SUBAUTOMATON (STR1 STR2 N INITIALS ZEROS)

Returns a new sub-automaton (root of a state graph) accepting non-negative (decimal) integers of value between the one represented by (subseq STR1 N) and (subseq STR2 N), inclusive, and of length of (subseq STR1 N) = (subseq STR2 N).

CHAR-AUTOMATON (C)

Returns a new deterministic automaton that accepts a single character whose code is C.

CHAR-RANGE-AUTOMATON (CMIN CMAX)

Returns a new deterministic automaton that accepts a single character whose code is in closed interval [CMIN, CMAX].

CHAR-SET-AUTOMATON (STR)

Returns a new deterministic automaton that accepts a single character in set STR.

CLONE-EXPANDED (A)

Returns a clone of A, expanded if singleton.

DETERMINIZE (A)

Determinizes A and returns it.

DETERMINIZE2 (A INITIALSET)

Determinizes A using the set of initial states in INITIALSET state-set.

EMPTY-AUTOMATON

Returns a new determinsitic automaton with the empty language.

EMPTY-P (A)

Returns true if A accepts no strings.

EMPTY-STRING-AUTOMATON

Returns a new deterministic automaton that accepts only the empty string.

EXPAND-SINGLETON (A)

Expands the singleton representation of A into the regular representation, and returns A.

INTERVAL-AUTOMATON (MIN MAX DIGITS)

Returns a new automaton that accepts strings representing non-negative (decimal) integers in interval [MIN, MAX]. If DIGITS > 0, uses the fixed number of digits (strings must be prefixed by 0s to obtain the right length). Otherwise, the number of digits is not fixed. If MIN > MAX or if the numbers in the interval cannot be expressed with the given fixed number of digits, an error is signaled.

LIVE-STATES2 (A STATES)

Returns the set of live states of A that are in STATES hash table. A state is live if an accepting state is reachable from it.

MINIMIZE (A)

Minimizes, and determinizes if not already deterministic, A and returns it.

MINIMIZE-BRZOZOWSKI (A)

Minimizes A using Brzozowski's algorithm. Complexity: O(2 ^ N), where N is the number of states, but works very well in practice (even better than Hopcroft's).

MINIMIZE-HOPCROFT (A)

Minimizes A using Hopcroft's algorithm. Complexity: O(N log N), regarded as one of the most generally efficient existing algorithms.

MINIMIZE-HUFFMAN (A)

Minimizes A using the standard textbook, Huffman's algorithm. Complexity: O(N ^ 2), where N is the number of states.

NUM-OF-STATES (A)

Returns the number of states of A.

NUM-OF-TRANSITIONS (A)

Returns the number of transitions of A.

OPTIONAL (A)

Returns a new automaton that accepts the union of the empty string and the language of A. Complexity: linear in the number of states.

REMOVE-DEAD-TRANSITIONS (A)

Returns reduced A with transitions to dead states removed. A state is dead if no accepting state is reachable from it.

REPEAT (A)

Returns a new automaton that accepts the Kleene star (zero or more concatenated repetitions) of the language of A. Complexity: linear in the number of states.

REPEAT-MIN (A MIN)

Returns a new automaton that accepts MIN or more concatenated repetitions of the language of A.

REPEAT-MINMAX (A MIN MAX)

Returns a new automaton that accepts a number, from [MIN, MAX], of concatenated repetitions of the language of A. If MIN > MAX, the empty automaton is returned.

SET-STATE-NUMS (STATES)

Renumerates, by assigning consecutive numbers to the NUM slot of states being the keys of STATES hash table, and finally returns STATES.

SLNADD (SL Q)

Adds state Q to state-list SL and returns a new state-list-node object.

SLNREMOVE (SLN)

Removes state-list-node SLN from its state-list object.

SORTED-TRANSITION-LIST (S *TO-FIRST*)

Returns a list of all transitions of S, sorted using TRANSITION< and *TO-FIRST*.

SORTED-TRANSITION-VECTOR (S *TO-FIRST*)

Returns a vector of all transitions of S, sorted using TRANSITION< and *TO-FIRST*.

SORTED-TRANSITIONS (STATES)

Renumerates each state in STATES hash table, and returns a vector of sorted vectors of transitions for each state, ordered by the NUM slot.

SSTEP (S C)

Returns a state reachable from S, given the input character code C.

START-POINTS (A)

Returns a sorted vector of all interval start points (character codes).

STATES (A)

Returns a hash table containing the set of states reachable from the initial state of A.

STRING-AUTOMATON (STR)

Returns a new deterministic automaton that accepts the single given string STR.

SUBSET-OF (A A2)

Returns true if the language of A is a subset of the language of A2.

TOTALIZE (A)

Adds transitions to an explicit crash state, added to A, to ensure that the transition function is total. Finally, returns A.

TRANSITION< (TR1 TR2)

Returns true if TR1 is strictly less than TR2. If *TO-FIRST* special variable is bound to true, the values of the destination states' NUM slots are compared first, followed by the intervals comparison. The intervals comparison is done as follows: the lower interval bounds are compared first, followed by reversed upper interval bounds comparisons. If *TO-FIRST* is bound to nil, the interval comparison is done first, followed by the NUM comparisons.

Undocumented

ADD-TRIGGERS (TRS TRIGGERS N1 N2)

CHECK (R FLAG)

CHECK-MINIMIZE-ALWAYS (A)

ESCAPED-CHAR (C)

HT-SET-TO-VECTOR (HT)

MAKE-REGEXP (KIND &OPTIONAL E1 E2)

MARK-PAIR (MARK TRIGGERS N1 N2)

MATCH (R C)

MORE (R)

NEXT (R)

PARSE-CHAR-CLASS (R)

PARSE-CHAR-CLASS-EXP (R)

PARSE-CHAR-CLASSES (R)

PARSE-CHAR-EXP (R)

PARSE-COMPLEMENT-EXP (R)

PARSE-CONCATENATION-EXP (R)

PARSE-INTERSECTION-EXP (R)

PARSE-REPEAT-EXP (R)

PARSE-SIMPLE-EXP (R)

PARSE-UNION-EXP (R)

PEEK (R S)

RESET-TRANSITIONS (S)

RESTORE-INVARIANT (A)

STATES-AGREE (TRS MARK N1 N2)

TRANSITIONS-EQUAL (TS1 TS2)

GENERIC-FUNCTION

Private

Undocumented

CLONE (TR)

SLOT-ACCESSOR

Private

Undocumented

ACCEPT (OBJECT)

SETFACCEPT (NEW-VALUE OBJECT)

C (OBJECT)

DETERMINISTIC (OBJECT)

SETFDETERMINISTIC (NEW-VALUE OBJECT)

DIGITS (OBJECT)

EXP1 (OBJECT)

EXP2 (OBJECT)

FLAGS (OBJECT)

FROM (OBJECT)

FST (OBJECT)

SETFFST (NEW-VALUE OBJECT)

HASH-CODE (OBJECT)

SETFHASH-CODE (NEW-VALUE OBJECT)

ID (OBJECT)

SETFID (NEW-VALUE OBJECT)

INFO (OBJECT)

SETFINFO (NEW-VALUE OBJECT)

INITIAL (OBJECT)

SETFINITIAL (NEW-VALUE OBJECT)

KIND (OBJECT)

LST (OBJECT)

SETFLST (NEW-VALUE OBJECT)

MAXC (OBJECT)

SETFMAXC (NEW-VALUE OBJECT)

MAXR (OBJECT)

MINC (OBJECT)

SETFMINC (NEW-VALUE OBJECT)

MINIMIZATION (OBJECT)

SETFMINIMIZATION (NEW-VALUE OBJECT)

MINIMIZE-ALWAYS (OBJECT)

SETFMINIMIZE-ALWAYS (NEW-VALUE OBJECT)

MINR (OBJECT)

N1 (OBJECT)

N2 (OBJECT)

NEXT-ID (OBJECT)

SETFNEXT-ID (NEW-VALUE OBJECT)

NUM (OBJECT)

SETFNUM (NEW-VALUE OBJECT)

POS (OBJECT)

SETFPOS (NEW-VALUE OBJECT)

PRED (OBJECT)

SETFPRED (NEW-VALUE OBJECT)

Q (OBJECT)

SETFQ (NEW-VALUE OBJECT)

S (OBJECT)

SETFS (NEW-VALUE OBJECT)

S1 (OBJECT)

SETFS1 (NEW-VALUE OBJECT)

S2 (OBJECT)

SETFS2 (NEW-VALUE OBJECT)

SINGLETON (OBJECT)

SETFSINGLETON (NEW-VALUE OBJECT)

SIZE (OBJECT)

SETFSIZE (NEW-VALUE OBJECT)

SL (OBJECT)

SETFSL (NEW-VALUE OBJECT)

SUCC (OBJECT)

SETFSUCC (NEW-VALUE OBJECT)

TEXT (OBJECT)

TO (OBJECT)

SETFTO (NEW-VALUE OBJECT)

TRANSITIONS (OBJECT)

SETFTRANSITIONS (NEW-VALUE OBJECT)

VARIABLE

Private

Undocumented

*ESCAPE-UNICODE-CHARS*

*MINIMIZATION*

*MINIMIZE-ALWAYS*

*PRINT-RENUMERATE-STATES*

CLASS

Private

Undocumented

AUTOMATON

INT-PAIR

REGEXP

STATE

STATE-LIST

STATE-LIST-NODE

STATE-PAIR

STATE-SET

TRANSITION

CONSTANT

Private

Undocumented

+ALL+

+ANYSTRING+

+AUTOMATON+

+COMPLEMENT+

+EMPTY+

+INTERSECTION+

+INTERVAL+

+MAX-CHAR-CODE+

+MIN-CHAR-CODE+

+NONE+