Abstract
We present a novel implementation of compressed suffix arrays exhibiting new tradeoffs between search time and space occupancy for a given text (or sequence) of n symbols over an alphabet , where each symbol is encoded by lg|Σ| bits. We show that compressed suffix arrays use just nH_h + O(n lg lg n/lg_|Σ| n) bits, while retaining full text indexing functionalities, such as searching any pattern sequence of length m in O(m lg|Σ| + polylog(n)) time. The term H_h ≤ lg|Σ| denotes the hth-order empirical entropy of the text, which means that our index is nearly optimal in space apart from lower-order terms, achieving asymptotically the empirical…
Citation impact
672
total citations
- FWCI
- 18.26
- Percentile
- 100%
- References
- 20
Citations per year
Authors
3Topics & keywords
Topics
Keywords
- Compressed suffix array
- Suffix
- Search engine indexing
- Alphabet
- Entropy (arrow of time)
- Combinatorics
- Multiplicative function
- Suffix array
No related works found for this paper.