Help needed in understanding the prefix function.

By AdHocMan, 4 years ago, In English

Today I was trying to solve the problem CF R1600 - Camp Schedule. I realized that I would need the longest proper prefix of a string that is also a suffix of that string. I searched for it on Google and found that this is what the 'Prefix Function' computes. I tried to understand the concept from the CP Algorithms article, but I am too dumb to understand it.

I got the code I needed from there, but I can't understand how it works. Can anyone please help me understand this concept, ideally with some examples?

Prefix Function Code

// everything 0-indexed
// pi[i] = 0 -> only the empty prefix matched
// pi[i] = j -> prefix s[0..j-1] (length j) matched the suffix ending at i
int prefixFunction(const string& s){
    int n = (int)s.size();
    if(n == 0) return 0; // empty string: pi[n - 1] would be out of bounds
    vector<int> pi(n);
    for(int i = 1; i < n; i++){
        int j = pi[i - 1];
        while(j > 0 && s[j] != s[i]) j = pi[j - 1]; // This part is hard to understand for me.
        if(s[j] == s[i]) j++;
        pi[i] = j;
    }
    return pi[n - 1];
}

Note: I am very new to this type of algorithm.

#string, difficult to understand, prefix function


Comments (5)


TiredOfLife

4 years ago

Go to YouTube and search for "tushar roy kmp algorithm".

Everule

4 years ago

Let's get the basics clear. $$$\pi[i] - \pi[i-1] \le 1$$$, because each new character can extend the match by at most one.
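To make this concrete, here is a minimal sketch that returns the whole $$$\pi$$$ array instead of only its last value, plus a small check of the inequality above. The names prefixArray and growsByAtMostOne are made up for this illustration; the loop is the same as in the post.

```cpp
#include <string>
#include <vector>
using namespace std;

// Same loop as in the post, but returning the whole array (hypothetical helper name).
vector<int> prefixArray(const string& s) {
    int n = (int)s.size();
    vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];                            // longest match ending at i - 1
        while (j > 0 && s[j] != s[i]) j = pi[j - 1];  // fall back to shorter matches
        if (s[j] == s[i]) j++;                        // the new character extends the match
        pi[i] = j;
    }
    return pi;
}

// True if no value exceeds its predecessor by more than one (hypothetical helper name).
bool growsByAtMostOne(const vector<int>& pi) {
    for (size_t i = 1; i < pi.size(); i++)
        if (pi[i] - pi[i - 1] > 1) return false;
    return true;
}
```

For "abacaba" this gives 0 0 1 0 1 2 3: the value can drop by any amount, but it never rises by more than one per step.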

In my amazing drawing, the substrings covered by the same color are equal, and we have processed up to $$$i$$$.

First we check if we can extend red. That requires $$$C_2 = C_3$$$. If that is not true, we need to check the next lower match, which is cyan. To check whether cyan works, we have to check $$$C_1 = C_3$$$.
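The chain "longest match, then the next lower match, and so on" can be listed explicitly. The sketch below is only an illustration with made-up names (prefixArray, borderLengths): it walks j = pi[j - 1] starting from the last value and collects every prefix length that is also a suffix of the whole string.

```cpp
#include <string>
#include <vector>
using namespace std;

// Whole prefix-function array, same loop as in the post (hypothetical helper name).
vector<int> prefixArray(const string& s) {
    int n = (int)s.size();
    vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];
        while (j > 0 && s[j] != s[i]) j = pi[j - 1];
        if (s[j] == s[i]) j++;
        pi[i] = j;
    }
    return pi;
}

// Lengths of all proper prefixes of s that are also suffixes of s, longest first.
vector<int> borderLengths(const string& s) {
    vector<int> result;
    if (s.empty()) return result;
    vector<int> pi = prefixArray(s);
    // pi.back() is the longest match; pi[j - 1] is the next lower match inside it.
    for (int j = pi.back(); j > 0; j = pi[j - 1]) result.push_back(j);
    return result;
}
```

For "abacaba" the chain is 3, 1 ("aba", then "a"). For "aaaa" it is 3, 2, 1, a case where the matching prefix and suffix overlap.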

Now if cyan doesn't work either, we want the next such matching inside cyan, which is $$$\pi[\pi[\pi[i]]]$$$ (written loosely; in the 0-indexed code each step down is j = pi[j - 1]). We keep going down to the next largest matching until the next character matches or no matching is left. This is $$$O(n)$$$ overall, because each character comparison takes $$$O(1)$$$, each new character adds at most one to the matching length, and each time we go back the matching length decreases by at least $$$1$$$. The length never goes below zero, so over the whole string the while loop cannot run more than $$$O(n)$$$ times in total.
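One way to convince yourself of the $$$O(n)$$$ bound is to count how often the while loop body actually runs. The function below is a sketch for experimenting, not part of the original code; countFallbacks is a made-up name.

```cpp
#include <string>
#include <vector>
using namespace std;

// Runs the prefix function on s and returns how many times the while body
// executed in total (hypothetical helper, for experimenting only).
long long countFallbacks(const string& s) {
    int n = (int)s.size();
    vector<int> pi(n, 0);
    long long fallbacks = 0;
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];
        while (j > 0 && s[j] != s[i]) {
            j = pi[j - 1];   // each fallback shrinks j by at least 1
            fallbacks++;
        }
        if (s[j] == s[i]) j++;  // j grows by at most 1 per character
        pi[i] = j;
    }
    return fallbacks;
}
```

For "aaab" the loop body runs twice, both times at the final 'b', after the matching length had grown to 2. Since j only grows by one per character, the total number of fallbacks can never exceed the length of the string.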

Also keep in mind that the substrings may intersect, but it doesn't really matter. I didn't draw intersecting substrings so as not to make the diagram too messy.

AdHocMan

4 years ago

while(j > 0 && s[j] != s[i]) j = pi[j - 1];

Isn't this the part that searches for the next lower match?

Everule

4 years ago

Well, read the loop like this.

while there is some matching left before it:
    if the characters at the two positions are equal: break
    else: go to the next smaller matching
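Written out literally in C++, that reading of the loop could look like the sketch below. It should behave the same as the one-line while in the post, only with the break made explicit; the name longestBorder and the empty-string guard are additions for this illustration.

```cpp
#include <string>
#include <vector>
using namespace std;

// Length of the longest proper prefix of s that is also a suffix of s
// (hypothetical name; same algorithm as in the post).
int longestBorder(const string& s) {
    int n = (int)s.size();
    if (n == 0) return 0;                 // guard added for this sketch
    vector<int> pi(n, 0);
    for (int i = 1; i < n; i++) {
        int j = pi[i - 1];
        while (j > 0) {                   // while there is some matching left
            if (s[j] == s[i]) break;      // characters are equal: stop here
            j = pi[j - 1];                // else: go to the next smaller matching
        }
        if (s[j] == s[i]) j++;            // extend the matching we stopped at
        pi[i] = j;
    }
    return pi[n - 1];
}
```

The check after the loop is still needed, because the loop can also end with j = 0, and then s[0] has not been compared with s[i] yet.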

AdHocMan

4 years ago

Thank you so much. The explanation was very clear.