# Struct fst::Levenshtein
[−]
[src]

pub struct Levenshtein { /* fields omitted */ }

A Unicode aware Levenshtein automaton for running efficient fuzzy queries.

A Levenshtein automata is one way to search any finite state transducer
for keys that *approximately* match a given query. A Levenshtein automaton
approximates this by returning all keys within a certain edit distance of
the query. The edit distance is defined by the number of insertions,
deletions and substitutions required to turn the query into the key.
Insertions, deletions and substitutions are based on
**Unicode characters** (where each character is a single Unicode scalar
value).

# Example

This example shows how to find all keys within an edit distance of `1`

from `foo`

.

use fst::{IntoStreamer, Streamer, Levenshtein, Set}; let keys = vec!["fa", "fo", "fob", "focus", "foo", "food", "foul"]; let set = Set::from_iter(keys).unwrap(); let lev = Levenshtein::new("foo", 1).unwrap(); let mut stream = set.search(&lev).into_stream(); let mut keys = vec![]; while let Some(key) = stream.next() { keys.push(key.to_vec()); } assert_eq!(keys, vec![ "fo".as_bytes(), // 1 deletion "fob".as_bytes(), // 1 substitution "foo".as_bytes(), // 0 insertions/deletions/substitutions "food".as_bytes(), // 1 insertion ]);

This example only uses ASCII characters, but it will work equally well on Unicode characters.

# Warning: experimental

While executing this Levenshtein automaton against a finite state
transducer will be very fast, *constructing* an automaton may not be.
Namely, this implementation is a proof of concept. While I believe the
algorithmic complexity is not exponential, the implementation is not speedy
and it can use enormous amounts of memory (tens of MB before a hard-coded
limit will cause an error to be returned).

This is important functionality, so one should count on this implementation being vastly improved in the future.

## Methods

`impl Levenshtein`

[src]

`fn new(query: &str, distance: u32) -> Result<Self>`

Create a new Levenshtein query.

The query finds all matching terms that are at most `distance`

edit operations from `query`

. (An edit operation may be an insertion,
a deletion or a substitution.)

If the underlying automaton becomes too big, then an error is returned.

A `Levenshtein`

value satisfies the `Automaton`

trait, which means it
can be used with the `search`

method of any finite state transducer.

## Trait Implementations

`impl Debug for Levenshtein`

[src]

`impl Automaton for Levenshtein`

[src]

`type State = Option<usize>`

The type of the state used in the automaton.

`fn start(&self) -> Option<usize>`

Returns a single start state for this automaton. Read more

`fn is_match(&self, state: &Option<usize>) -> bool`

Returns true if and only if `state`

is a match state.

`fn can_match(&self, state: &Option<usize>) -> bool`

Returns true if and only if `state`

can lead to a match in zero or more steps. Read more

`fn accept(&self, state: &Option<usize>, byte: u8) -> Option<usize>`

Return the next state given `state`

and an input.

`fn will_always_match(&self, _state: &Self::State) -> bool`

Returns true if and only if `state`

matches and must match no matter what steps are taken. Read more

`fn starts_with(self) -> StartsWith<Self> where Self: Sized`

Returns an automaton that matches the strings that start with something this automaton matches. Read more

`fn union<Rhs: Automaton>(self, rhs: Rhs) -> Union<Self, Rhs> where Self: Sized`

Returns an automaton that matches the strings matched by either this or the other automaton. Read more

`fn intersection<Rhs: Automaton>(self, rhs: Rhs) -> Intersection<Self, Rhs> where Self: Sized`

Returns an automaton that matches the strings matched by both this and the other automaton. Read more

`fn complement(self) -> Complement<Self> where Self: Sized`

Returns an automaton that matches the strings not matched by this automaton.