Suffix Tries

A suffix trie is a type of trie mainly used for representing all the suffixes of a string. It is a tree where each node represents the suffix of a string. The root node is the entire string, and each child node is a suffix of the string that’s one character shorter than the parent node’s suffix.

For example, consider the string “banana”. The suffix trie for this string would look like this:

             root
           /  | \
          b   a  n
         /    |   \
         a    n    a
        /     |     \
        n     a      n
       /      |       \
      a       n        a
     /        |         \
    n         a          $
   /          |
  a           $
 /
$

Each path from the root to a leaf node represents a suffix of the string. The leaf nodes represent the end of a suffix, and the dollar sign ($) is used to indicate the end of the string.

The children store entire word instead of characters.

Features of Suffix Tries

Following are the Features of Suffix Tries −

Operations on Suffix Tries

Some of the common operations on suffix tries include:

  • Insertion: Insert a new suffix into the trie.
  • Deletion: Remove a suffix from the trie.
  • Search: Find a specific suffix in the trie.

Insertion Operation on Suffix Tries

We can insert a new suffix into the trie by following these steps:

1. Start at the root node.
2. For each character in the suffix:
    a. If the character is not in the current node's children, create a new node for the same.
    b. Then move to the child node corresponding to the character and repeat the process.
3. Repeat the process for all character in the suffix.
4. Mark the last node as a leaf node to indicate the end of the suffix.

Code for Insertion Operation

For example, to insert the suffix “ana” into the trie, we would follow these steps:

CC++JavaPython

// C Program to insert a suffix into a trie#include <stdio.h>#include <stdlib.h>#define ALPHABET_SIZE 26// Structure for trie nodestructTrieNode{structTrieNode* children[ALPHABET_SIZE];int isEndOfWord;};// Function to create a new trie nodestructTrieNode*createNode(){structTrieNode* node =(structTrieNode*)malloc(sizeof(structTrieNode));
   node->isEndOfWord =0;for(int i =0; i < ALPHABET_SIZE; i++){
      node->children[i]=NULL;}return node;}// Function to insert a suffix into the trievoidinsert(structTrieNode* root,constchar* suffix){structTrieNode* current = root;for(int i =0; suffix[i]!='\0'; i++){int index = suffix[i]-'a';if(current->children[index]==NULL){
         current->children[index]=createNode();}
      current = current->children[index];}
   current->isEndOfWord =1;}// Display the trievoiddisplay(structTrieNode* root,char str[],int level){if(root->isEndOfWord){
      str[level]='\0';printf("%s\n", str);}for(int i =0; i < ALPHABET_SIZE; i++){if(root->children[i]!=NULL){
         str[level]= i +'a';display(root->children[i], str, level +1);}}}intmain(){structTrieNode* root =createNode();insert(root,"banana");insert(root,"ana");insert(root,"na");insert(root,"a");insert(root,"$");char str[20];// Allocating memory for string bufferdisplay(root, str,0);return0;}

Output

Following is the output of the above code:

a
ana
banana
na

Deletion and Search Operations on Suffix Tries

Deletion and search operations on suffix tries are similar to insertion. We can delete a suffix by removing the corresponding leaf node from the trie. To search for a specific suffix, we can traverse the trie from the root node to the leaf node corresponding to the suffix.

Following are the steps to delete a suffix from the trie:

1. Start at the root node.
2. For each character in the suffix:
    a. Move to the child node corresponding to the character.
3. Mark the last node as a non-leaf node to delete the suffix.

Following are the steps to search for a specific suffix in the trie:

1. Start at the root node.
2. For each character in the suffix:
    a. Move to the child node corresponding to the character.
3. Check if the last node is a leaf node to find the suffix.

Code for Deletion and Search Operations

Here is the code for deletion and search operations on suffix tries:

CC++JavaPython

// C Program to delete and search a suffix in a trie#include <stdio.h>#include <stdlib.h>#define ALPHABET_SIZE 26// Structure for trie nodestructTrieNode{structTrieNode* children[ALPHABET_SIZE];int isEndOfWord;};// Function to create a new trie nodestructTrieNode*createNode(){structTrieNode* node =(structTrieNode*)malloc(sizeof(structTrieNode));
   node->isEndOfWord =0;for(int i =0; i < ALPHABET_SIZE; i++){
      node->children[i]=NULL;}return node;}// Function to insert a suffix into the trievoidinsert(structTrieNode* root,constchar* suffix){structTrieNode* current = root;for(int i =0; suffix[i]!='\0'; i++){int index = suffix[i]-'a';if(current->children[index]==NULL){
         current->children[index]=createNode();}
      current = current->children[index];}
   current->isEndOfWord =1;}// Function to delete a suffix from the trievoiddelete(structTrieNode* root,constchar* suffix){structTrieNode* current = root;for(int i =0; suffix[i]!='\0'; i++){int index = suffix[i]-'a';if(current->children[index]==NULL){return;// Suffix not found}
      current = current->children[index];}
   current->isEndOfWord =0;}// Function to search for a suffix in the trieintsearch(structTrieNode* root,constchar* suffix){structTrieNode* current = root;for(int i =0; suffix[i]!='\0'; i++){int index = suffix[i]-'a';if(current->children[index]==NULL){return0;// Suffix not found}
      current = current->children[index];}return current->isEndOfWord;}intmain(){structTrieNode* root =createNode();insert(root,"banana");insert(root,"ana");insert(root,"na");insert(root,"a");insert(root,"$");printf("Search for 'ana': %s\n",search(root,"ana")?"Found":"Not found");delete(root,"ana");printf("After deletion:\n");printf("Search for 'ana': %s\n",search(root,"ana")?"Found":"Not found");printf("Search for 'na': %s\n",search(root,"na")?"Found":"Not found");return0;}

Output

Following is the output of the above code:

Search for 'ana': Found
After deletion:
Search for 'ana': Not found
Search for 'na': Found

Time Complexity of Suffix Tries

The time complexity of suffix tries depends on the operations performed:

  • Insertion: O(n) for inserting a suffix of length n into the trie.
  • Deletion: O(n)
  • Search: O(n)

Applications of Suffix Tries

Suffix tries are used in various applications, including:

  • Pattern matching and substring search algorithms
  • Text indexing and searching
  • Genomic sequence analysis
  • Spell checking and correction
  • Biological sequence analysis
  • Information retrieval
  • Compression algorithms

Conclusion

In this chapter, we discussed suffix tries, a type of trie used for representing all the suffixes of a string. We explored the structure of a suffix trie, its features, operations, and applications. Suffix tries are a powerful data structure for substring searches, pattern matching, and other string-related operations.

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *