# Trait fst::Automaton
[−]
[src]

pub trait Automaton { type State; fn start(&self) -> Self::State; fn is_match(&self, state: &Self::State) -> bool; fn accept(&self, state: &Self::State, byte: u8) -> Self::State; fn can_match(&self, _state: &Self::State) -> bool { ... } fn will_always_match(&self, _state: &Self::State) -> bool { ... } fn starts_with(self) -> StartsWith<Self> where Self: Sized { ... } fn union<Rhs: Automaton>(self, rhs: Rhs) -> Union<Self, Rhs> where Self: Sized { ... } fn intersection<Rhs: Automaton>(self, rhs: Rhs) -> Intersection<Self, Rhs> where Self: Sized { ... } fn complement(self) -> Complement<Self> where Self: Sized { ... } }

Automaton describes types that behave as a finite automaton.

All implementators of this trait are represented by *byte based* automata.
Stated differently, all transitions in the automata correspond to a single
byte in the input.

This implementation choice is important for a couple reasons:

- The set of possible transitions in each node is small, which may make efficient memory usage easier.
- The finite state transducers in this crate are all byte based, so any automata used on them must also be byte based.

In practice, this does present somewhat of a problem, for example, if
you're storing UTF-8 encoded strings in a finite state transducer. Consider
using a `Levenshtein`

automaton, which accepts a query string and an edit
distance. The edit distance should apply to some notion of *character*,
which could be represented by at least 1-4 bytes in a UTF-8 encoding (for
some definition of "character"). Therefore, the automaton must have UTF-8
decoding built into it. This can be tricky to implement, so you may find
the `utf8-ranges`

crate useful.

## Associated Types

`type State`

The type of the state used in the automaton.

## Required Methods

`fn start(&self) -> Self::State`

Returns a single start state for this automaton.

This method should always return the same value for each implementation.

`fn is_match(&self, state: &Self::State) -> bool`

Returns true if and only if `state`

is a match state.

`fn accept(&self, state: &Self::State, byte: u8) -> Self::State`

Return the next state given `state`

and an input.

## Provided Methods

`fn can_match(&self, _state: &Self::State) -> bool`

Returns true if and only if `state`

can lead to a match in zero or more
steps.

If this returns `false`

, then no sequence of inputs from this state
should ever produce a match. If this does not follow, then those match
states may never be reached. In other words, behavior may be incorrect.

If this returns `true`

even when no match is possible, then behavior
will be correct, but callers may be forced to do additional work.

`fn will_always_match(&self, _state: &Self::State) -> bool`

Returns true if and only if `state`

matches and must match no matter what
steps are taken.

If this returns `true`

, then every sequence of inputs from this state
produces a match. If this does not follow, then those match states may
never be reached. In other words, behavior may be incorrect.

If this returns `false`

even when every sequence of inputs will lead to
a match, then behavior will be correct, but callers may be forced to do
additional work.

`fn starts_with(self) -> StartsWith<Self> where Self: Sized`

Returns an automaton that matches the strings that start with something this automaton matches.

`fn union<Rhs: Automaton>(self, rhs: Rhs) -> Union<Self, Rhs> where Self: Sized`

Returns an automaton that matches the strings matched by either this or the other automaton.

`fn intersection<Rhs: Automaton>(self, rhs: Rhs) -> Intersection<Self, Rhs> where Self: Sized`

Returns an automaton that matches the strings matched by both this and the other automaton.

`fn complement(self) -> Complement<Self> where Self: Sized`

Returns an automaton that matches the strings not matched by this automaton.

## Implementors

`impl<'a, T: Automaton> Automaton for &'a T`

`impl Automaton for Levenshtein`

`impl Automaton for Regex`