haccepted-0.1.0.0: Data structures and algorithms
Safe HaskellNone
LanguageHaskell2010

SuffixArray

Description

Suffix array and LCP array

A suffix array is the sorted array of all suffixes of a string, each suffix represented by its start index. An LCP (longest common prefix) array is an array of the lengths of the longest common prefixes of consecutive suffixes in the suffix array. A suffix array, sometimes with its LCP array, can be used to perform tasks like finding patterns in the string or calculating certain properties of the string. Other suffix structures, such as suffix trees and suffix automata may serve as alternates to suffix arrays.

This implementation constructs the suffix array and LCP array from a given string indexing function, with elements of type Int. It takes O(n log n), which is usually fast enough. O(n) algorithms to construct suffix arrays exist.

Sources:

Synopsis

Documentation

type Chr = Int Source #

buildSufA :: Chr -> Int -> (Int -> Chr) -> (UArray Int SuffixId, UArray Int Int) Source #

Builds a suffix array and LCP array. n is the length of the string. at is a 0-based indexing function into the string. Characters must be in [0..b-1]. Faster than buildSufAL unless b is too large. O(b + n log n).

buildSufAL :: Int -> (Int -> Chr) -> (UArray Int SuffixId, UArray Int Int) Source #

Builds a suffix array and LCP array. n is the length of the string. at is a 0-based indexing function into the string. Intended for large alphabets. O(n log n).