C++ Program to Implement Boyer-Moore Algorithm for Pattern Searching

This is a C++ Program to implement Boyer-Moore algorithm. The idea of bad character heuristic is simple. The character of the text which doesn’t match with the current character of pattern is called the Bad Character. Whenever a character doesn’t match, we slide the pattern in such a way that aligns the bad character with the last occurrence of it in pattern. We preprocess the pattern and store the last occurrence of every possible character in an array of size equal to alphabet size. If the character is not present at all, then it may result in a shift by m (length of pattern). Therefore, the bad character heuristic takes O(n/m) time in the best case.

Here is source code of the C++ Program to Implement Boyer-Moore Algorithm for String Matching. The C++ program is successfully compiled and run on a Linux system. The program output is also shown below.

  1. /* Program for Bad Character Heuristic of Boyer Moore String Matching Algorithm */
  2.  
  3. # include <limits.h>
  4. # include <string.h>
  5. # include <stdio.h>
  6.  
  7. # define NO_OF_CHARS 256
  8.  
  9. // A utility function to get maximum of two integers
  10. int max(int a, int b)
  11. {
  12.     return (a > b) ? a : b;
  13. }
  14.  
  15. // The preprocessing function for Boyer Moore's bad character heuristic
  16. void badCharHeuristic(char *str, int size, int badchar[NO_OF_CHARS])
  17. {
  18.     int i;
  19.  
  20.     // Initialize all occurrences as -1
  21.     for (i = 0; i < NO_OF_CHARS; i++)
  22.         badchar[i] = -1;
  23.  
  24.     // Fill the actual value of last occurrence of a character
  25.     for (i = 0; i < size; i++)
  26.         badchar[(int) str[i]] = i;
  27. }
  28.  
  29. void search(char *txt, char *pat)
  30. {
  31.     int m = strlen(pat);
  32.     int n = strlen(txt);
  33.  
  34.     int badchar[NO_OF_CHARS];
  35.  
  36.     badCharHeuristic(pat, m, badchar);
  37.  
  38.     int s = 0; // s is shift of the pattern with respect to text
  39.     while (s <= (n - m))
  40.     {
  41.         int j = m - 1;
  42.  
  43.         while (j >= 0 && pat[j] == txt[s + j])
  44.             j--;
  45.  
  46.         if (j < 0)
  47.         {
  48.             printf("\n pattern occurs at shift = %d", s);
  49.  
  50.             s += (s + m < n) ? m - badchar[txt[s + m]] : 1;
  51.  
  52.         }
  53.  
  54.         else
  55.             s += max(1, j - badchar[txt[s + j]]);
  56.     }
  57. }
  58.  
  59. /* Driver program to test above funtion */
  60. int main()
  61. {
  62.     char txt[] = "ABAAABCD";
  63.     char pat[] = "ABC";
  64.     search(txt, pat);
  65.     return 0;
  66. }

Output:

$ g++ Boyer-Moore.cpp
$ a.out
 
pattern occurs at shift = 4
------------------
(program exited with code: 0)
Press return to continue

Sanfoundry Global Education & Learning Series – 1000 C++ Programs.

advertisement
advertisement

Here’s the list of Best Books in C++ Programming, Data Structures and Algorithms.

If you find any mistake above, kindly email to [email protected]

advertisement
advertisement
Subscribe to our Newsletters (Subject-wise). Participate in the Sanfoundry Certification contest to get free Certificate of Merit. Join our social networks below and stay updated with latest contests, videos, internships and jobs!

Youtube | Telegram | LinkedIn | Instagram | Facebook | Twitter | Pinterest
Manish Bhojasia - Founder & CTO at Sanfoundry
Manish Bhojasia, a technology veteran with 20+ years @ Cisco & Wipro, is Founder and CTO at Sanfoundry. He lives in Bangalore, and focuses on development of Linux Kernel, SAN Technologies, Advanced C, Data Structures & Alogrithms. Stay connected with him at LinkedIn.

Subscribe to his free Masterclasses at Youtube & discussions at Telegram SanfoundryClasses.