Two-dimensional pattern matching
In computer science, two-dimensional pattern matching is the problem of locating occurrences of a two-dimensional matrix of characters ("the pattern") in a bigger two-dimensional matrix ("the picture", or, in analogy with string searching, "the text").[1]
The naïve solution
Assume given a pattern and a text , where . The simplest approach is to compare P against every sub-array of size in T. This algorithm has a worst-case time of . It is usually assume that , whence this can be written as .
An automaton-based solution
We shall illustrate a more efficient solution (due essentially to Bird 1977) by means of an example. Suppose we have the following pattern and text
abaab
aba baaab
P = bab T = abaab
aba babab
ababa
We first construct a Aho-Corasick automaton to search a linear text for occurrences of the columns of P. As a biproduct we obtain an identifier for every column, so that two identical columns get the same identifier: in our example, say the first (and third) columns are idntified by 0, and the middle column by 1.
Next, we run the automaton on the columns of T, obtaining (in linear time) an indication whenever one of the columns of P appears in T: in our example this will be a 3×5 matrix C as follow (a "-" marks an absence of a match):
abaab
aba baaab
P = bab T = abaaa C = 01---
aba babab 10--1
ababa 010-0
We now use the Knuth-Morris-Pratt Algorithm to search the rows of C for a pattern identical to P, i.e., 010. When we find one, we have identified an occurrence of P in T. Note that it is not necessary to keep all of C, or all of T, in memory; T can be read row by row, and from C one only needs to keep one row in memory---along with a row of states of the Aho-Corasick and the KMP automata.
The running time of this algorithm is if one ignores the dependence on the size of the alphabet, which exists in the Aho-Corasick algorithm. Taking this into account, the complexity is where k is the size of the alphabet.
More efficient solutions
The dependence on the size of the alphabet was removed by an algorithm of Galil and Park.[2]
See also
Footnotes
References
- Apostolico, Alberto (1999). "Chapter 13: General Pattern Matching". In Atallah (ed.). Algorithms and Theory of Computation Handbook. CRC Press. pp. 13–11. ISBN 0849326494.
- Bird, Richard S. (1977). "Two dimensional pattern matching". Information Processing Letters. 6 (5): 168–170.
- Galil, Zvi; Park, Kunsoo (1996). "Alphabet-independent two-dimensional witness computation". SIAM Journal on Computing. 25 (5): 907–935.
Content Disclaimer
Informasi ini disarikan dari Wikipedia dan disajikan kembali untuk tujuan edukasi. Konten tersedia di bawah lisensi CC BY-SA 3.0. Kami tidak bertanggung jawab atas ketidakakuratan data yang bersumber dari kontribusi publik tersebut.
- The information displayed on this website is sourced in part or in whole from Wikipedia and has been adapted for the purpose of restating it. We strive to provide accurate and relevant information, however:
- There is no guarantee of absolute accuracy. Wikipedia is an open, collaborative project that can be edited by anyone, so information is subject to change.
- It is not intended to constitute professional advice. The content displayed is for informational and educational purposes only. For important decisions (e.g., medical, legal, or financial), please consult a professional.
- Content copyright. Wikipedia is licensed under the Creative Commons Attribution-ShareAlike License (CC BY-SA). This means that content may be reused with appropriate attribution and shared under a similar license.
- Responsible use. Any risk arising from the use of information from this website is entirely the responsibility of the user.