Struct fst::SetBuilder [] [src]

pub struct SetBuilder<W>(_);

A builder for creating a set.

This is not your average everyday builder. It has two important qualities that make it a bit unique from what you might expect:

  1. All keys must be added in lexicographic order. Adding a key out of order will result in an error.
  2. The representation of a set is streamed to any io::Write as it is built. For an in memory representation, this can be a Vec<u8>.

Point (2) is especially important because it means that a set can be constructed without storing the entire set in memory. Namely, since it works with any io::Write, it can be streamed directly to a file.

With that said, the builder does use memory, but memory usage is bounded to a constant size. The amount of memory used trades off with the compression ratio. Currently, the implementation hard codes this trade off which can result in about 5-20MB of heap usage during construction. (N.B. Guaranteeing a maximal compression ratio requires memory proportional to the size of the set, which defeats the benefit of streaming it to disk. In practice, a small bounded amount of memory achieves close-to-minimal compression ratios.)

The algorithmic complexity of set construction is O(n) where n is the number of elements added to the set.

Example: build in memory

This shows how to use the builder to construct a set in memory. Note that Set::from_iter provides a convenience function that achieves this same goal without needing to explicitly use SetBuilder.

use fst::{IntoStreamer, Streamer, Set, SetBuilder};

let mut build = SetBuilder::memory();
build.insert("bruce").unwrap();
build.insert("clarence").unwrap();
build.insert("stevie").unwrap();

// You could also call `finish()` here, but since we're building the set in
// memory, there would be no way to get the `Vec<u8>` back.
let bytes = build.into_inner().unwrap();

// At this point, the set has been constructed, but here's how to read it.
let set = Set::from_bytes(bytes).unwrap();
let mut stream = set.into_stream();
let mut keys = vec![];
while let Some(key) = stream.next() {
    keys.push(key.to_vec());
}
assert_eq!(keys, vec![
    "bruce".as_bytes(), "clarence".as_bytes(), "stevie".as_bytes(),
]);

Example: stream to file

This shows how to stream construction of a set to a file.

use std::fs::File;
use std::io;

use fst::{IntoStreamer, Streamer, Set, SetBuilder};

let mut wtr = io::BufWriter::new(File::create("set.fst").unwrap());
let mut build = SetBuilder::new(wtr).unwrap();
build.insert("bruce").unwrap();
build.insert("clarence").unwrap();
build.insert("stevie").unwrap();

// If you want the writer back, then call `into_inner`. Otherwise, this
// will finish construction and call `flush`.
build.finish().unwrap();

// At this point, the set has been constructed, but here's how to read it.
let set = Set::from_path("set.fst").unwrap();
let mut stream = set.into_stream();
let mut keys = vec![];
while let Some(key) = stream.next() {
    keys.push(key.to_vec());
}
assert_eq!(keys, vec![
    "bruce".as_bytes(), "clarence".as_bytes(), "stevie".as_bytes(),
]);

Methods

impl SetBuilder<Vec<u8>>
[src]

Create a builder that builds a set in memory.

impl<W: Write> SetBuilder<W>
[src]

Create a builder that builds a set by writing it to wtr in a streaming fashion.

Insert a new key into the set.

If a key is inserted that is less than any previous key added, then an error is returned. Similarly, if there was a problem writing to the underlying writer, an error is returned.

Calls insert on each item in the iterator.

If an error occurred while adding an element, processing is stopped and the error is returned.

Calls insert on each item in the stream.

Note that unlike extend_iter, this is not generic on the items in the stream.

Finishes the construction of the set and flushes the underlying writer. After completion, the data written to W may be read using one of Set's constructor methods.

Just like finish, except it returns the underlying writer after flushing it.