C Program to Implement Knuth-Morris-Pratt Algorithm for Pattern Searching

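This is a C Program to implement the Knuth-Morris-Pratt (KMP) algorithm for string matching. The algorithm first preprocesses the pattern into an array lps[], where lps[k] is the length of the longest proper prefix of pat[0..k] that is also a suffix of pat[0..k]. Unlike the Naive algorithm, where we slide the pattern by one position after every mismatch, we use a value from lps[] to decide the next sliding position. Let us see how we do that. When we compare pat[j] with txt[i] and see a mismatch, we know that the characters pat[0..j-1] match txt[i-j..i-1]. We also know that the first lps[j-1] characters of pat[0..j-1] are both a proper prefix and a suffix of it, so they already match the last lps[j-1] characters of txt[i-j..i-1] and do not need to be compared again. We therefore set j = lps[j-1] and continue comparing from txt[i], without ever moving i backwards. See KMPSearch() in the code below for details.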
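To make the lps[] values concrete, here is a minimal sketch of the table computation on its own. The function name build_lps is hypothetical and used only for this illustration; it follows the same idea as computeLPSArray() in the program, assuming lps[i] means the length of the longest proper prefix of pat[0..i] that is also a suffix of pat[0..i].

```c
#include <string.h>

/* Hypothetical helper for illustration only (not part of the program below).
   Fills lps[0..m-1] for the pattern pat of length m.
   lps[i] = length of the longest proper prefix of pat[0..i]
            that is also a suffix of pat[0..i]. */
void build_lps(const char *pat, int m, int *lps) {
    int len = 0; /* length of the current matched prefix */
    int i = 1;

    if (m == 0)
        return;  /* nothing to fill for an empty pattern */

    lps[0] = 0;  /* a single character has no proper prefix */
    while (i < m) {
        if (pat[i] == pat[len]) {
            len++;
            lps[i] = len;
            i++;
        } else if (len != 0) {
            /* fall back to the next shorter candidate prefix; i stays put */
            len = lps[len - 1];
        } else {
            lps[i] = 0;
            i++;
        }
    }
}
```

For the pattern "ABABCABAB" this fills lps[] with 0 0 1 2 0 1 2 3 4, and for "AAACAAAA" with 0 1 2 0 1 2 3 3. In the second pattern the last entry is 3 rather than 4, because at i = 7 the candidate length falls back from 3 to 2 before the match succeeds.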

Here is the source code of the C Program to Implement Knuth-Morris-Pratt Algorithm for String Matching. The C program is successfully compiled and run on a Linux system. The program output is also shown below.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

void computeLPSArray(const char *pat, int M, int *lps);

void KMPSearch(const char *pat, const char *txt) {
    int M = (int) strlen(pat);
    int N = (int) strlen(txt);

    // an empty pattern has no lps[] entries, so there is nothing to search for
    if (M == 0)
        return;

    // create lps[] that will hold the longest prefix suffix values for pattern
    int *lps = (int *) malloc(sizeof(int) * M);
    if (lps == NULL)
        return; // allocation failed
    int j = 0; // index for pat[]

    // Preprocess the pattern (calculate lps[] array)
    computeLPSArray(pat, M, lps);

    int i = 0; // index for txt[]
    while (i < N) {
        if (pat[j] == txt[i]) {
            j++;
            i++;
        }

        if (j == M) {
            // full match ending at txt[i-1]
            printf("Found pattern at index %d\n", i - j);
            // keep the longest border so overlapping matches are also found
            j = lps[j - 1];
        }

        // mismatch after j matches
        else if (i < N && pat[j] != txt[i]) {
            // The first lps[j-1] characters of pat already match the text
            // just before txt[i], so do not compare them again
            if (j != 0)
                j = lps[j - 1];
            else
                i = i + 1;
        }
    }
    free(lps); // to avoid memory leak
}

void computeLPSArray(const char *pat, int M, int *lps) {
    int len = 0; // length of the previous longest prefix suffix
    int i;

    lps[0] = 0; // lps[0] is always 0
    i = 1;

    // the loop calculates lps[i] for i = 1 to M-1
    while (i < M) {
        if (pat[i] == pat[len]) {
            len++;
            lps[i] = len;
            i++;
        } else // (pat[i] != pat[len])
        {
            if (len != 0) {
                // This is tricky. Consider the example AAACAAAA and i = 7:
                // len falls back from 3 to lps[2] = 2 before pat[7] matches.
                len = lps[len - 1];

                // Also, note that we do not increment i here
            } else // if (len == 0)
            {
                lps[i] = 0;
                i++;
            }
        }
    }
}

// Driver program to test above function
int main(void) {
    const char *txt = "ABABDABACDABABCABAB";
    const char *pat = "ABABCABAB";
    KMPSearch(pat, txt);
    return 0;
}

Output:

$ gcc KMP.c
$ ./a.out
 
Found pattern at index 10
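The sample text contains only one occurrence, so it does not show why j is reset to lps[j - 1] after a full match. The following sketch is one way to see it: a variant that collects match positions instead of printing them. The name kmp_find_all and its out/max_out parameters are assumptions made for this illustration and are not part of the program above; returning 0 for an empty pattern is likewise a choice made here.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for illustration only.
   Stores up to max_out match positions of pat in txt into out[] and
   returns the total number of matches, or -1 if allocation fails. */
int kmp_find_all(const char *pat, const char *txt, int *out, int max_out) {
    int m = (int) strlen(pat);
    int n = (int) strlen(txt);
    int count = 0;

    if (m == 0 || m > n)
        return 0; /* assumption: an empty pattern reports no matches */

    int *lps = malloc(sizeof *lps * (size_t) m);
    if (lps == NULL)
        return -1;

    /* build the lps table for the pattern */
    lps[0] = 0;
    for (int i = 1, len = 0; i < m; ) {
        if (pat[i] == pat[len])
            lps[i++] = ++len;
        else if (len != 0)
            len = lps[len - 1];
        else
            lps[i++] = 0;
    }

    /* scan the text; i never moves backwards */
    for (int i = 0, j = 0; i < n; ) {
        if (pat[j] == txt[i]) {
            i++;
            j++;
            if (j == m) {
                if (count < max_out)
                    out[count] = i - j; /* start index of this match */
                count++;
                j = lps[j - 1];         /* keep the overlap with the next match */
            }
        } else if (j != 0) {
            j = lps[j - 1];
        } else {
            i++;
        }
    }

    free(lps);
    return count;
}
```

Searching for "AA" in "AAAA" with this sketch reports three overlapping matches, at indices 0, 1 and 2. If j were reset to 0 after each match instead of to lps[j - 1], the match at index 1 would be missed.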

Sanfoundry Global Education & Learning Series – 1000 C Programs.

Manish Bhojasia - Founder & CTO at Sanfoundry