This is a java program to implement Aho-Corasick Algorithm. It is a kind of dictionary-matching algorithm that locates elements of a finite set of strings (the “dictionary”) within an input text. It matches all patterns simultaneously. The complexity of the algorithm is linear in the length of the patterns plus the length of the searched text plus the number of output matches. Note that because all matches are found, there can be a quadratic number of matches if every substring matches (e.g. dictionary = a, aa, aaa, aaaa and input string is aaaa).
Here is the source code of the Java Program to Implement Aho-Corasick Algorithm for String Matching. The Java program is successfully compiled and run on a Windows system. The program output is also shown below.
package com.sanfoundry.setandstring;
import java.util.Arrays;
import java.util.Scanner;
public class StringSearchUsingAhoCorasickAlgo
{
static final int ALPHABET_SIZE = 26;
Node[] nodes;
int nodeCount;
public static class Node
{
int parent;
char charFromParent;
int suffLink = -1;
int[] children = new int[ALPHABET_SIZE];
int[] transitions = new int[ALPHABET_SIZE];
boolean leaf;
{
Arrays.fill(children, -1);
Arrays.fill(transitions, -1);
}
}
public StringSearchUsingAhoCorasickAlgo(int maxNodes)
{
nodes = new Node[maxNodes];
// create root
nodes[0] = new Node();
nodes[0].suffLink = 0;
nodes[0].parent = -1;
nodeCount = 1;
}
public void addString(String s)
{
int cur = 0;
for (char ch : s.toCharArray())
{
int c = ch - 'a';
if (nodes[cur].children[c] == -1)
{
nodes[nodeCount] = new Node();
nodes[nodeCount].parent = cur;
nodes[nodeCount].charFromParent = ch;
nodes[cur].children[c] = nodeCount++;
}
cur = nodes[cur].children[c];
}
nodes[cur].leaf = true;
}
public int suffLink(int nodeIndex)
{
Node node = nodes[nodeIndex];
if (node.suffLink == -1)
node.suffLink = node.parent == 0 ? 0 : transition(
suffLink(node.parent), node.charFromParent);
return node.suffLink;
}
public int transition(int nodeIndex, char ch)
{
int c = ch - 'a';
Node node = nodes[nodeIndex];
if (node.transitions[c] == -1)
node.transitions[c] = node.children[c] != -1 ? node.children[c]
: (nodeIndex == 0 ? 0 : transition(suffLink(nodeIndex), ch));
return node.transitions[c];
}
// Usage example
public static void main(String[] args)
{
StringSearchUsingAhoCorasickAlgo ahoCorasick = new StringSearchUsingAhoCorasickAlgo(
1000);
Scanner sc = new Scanner(System.in);
System.out.println("Enter the main string: ");
String main = sc.nextLine().toLowerCase().trim();
System.out.println("Enter the pattern string: ");
String pattern = sc.nextLine().toLowerCase().trim();
ahoCorasick.addString(pattern);
int node = 0;
for (char ch : main.toCharArray())
{
node = ahoCorasick.transition(node, ch);
}
if (ahoCorasick.nodes[node].leaf)
System.out.println("A '" + pattern + "' string is substring of "
+ main + " string.");
else
System.out.println("A '" + pattern
+ "' string is not substring of " + main + " string.");
sc.close();
}
}
Output:
$ javac StringSearchUsingAhoCorasickAlgo.java $ java StringSearchUsingAhoCorasickAlgo Enter the main string: Sanfoundry Enter the pattern string: Asanfoundry A 'asanfoundry' string is not substring of sanfoundry string.
Sanfoundry Global Education & Learning Series – 1000 Java Programs.
advertisement
advertisement
Here’s the list of Best Books in Java Programming, Data Structures and Algorithms.
If you find any mistake above, kindly email to [email protected]Related Posts:
- Practice BCA MCQs
- Practice Information Technology MCQs
- Check Java Books
- Check Programming Books
- Apply for Computer Science Internship