Warning:This article is only for beginners.Don't waste your time reading if you have known about suffix-array.
So this is today's problem.
443B - Kolya and Tandem Repeat
Although the question is so easy,there is lots of fun in it,if you go deep.
1.
First,why is it a easy problem?
The thing who makes this question so easy is the limit of n: n<=200.
First every high school students shall be able to get it AC by a O(n^3) sulution.
In which we can use one demention for the start, one demention for the end,one demention for the check.
Wow.A really bruteforce solution.
But in this question ,the solution runs so fast that everybody gets a 15msAC,which means the fastest CF can configure.
But,when we face bigger test data, the n^3 solution goes never so fast again.
What will happen when we add n to 5000?
We use one of the 15ms O(n^3) solution for test.
Here is the solution :
(Sorry Natureal,don't beat me,at least not to hard~)
And Here is the data :
(I random it.)
Well on my computer it runs for 38.846s.
2.
So if there a quicklier solution?
Ofcause there is.
We can use Rabin-Karp to speed up the third demention,in witch we check if a given begin and end can do,from O(n)to O(1).
That means the whole solution is O(n^2).
You can easily learn basic of Rabin-Karp from any website,so I don't explain for it,but I do give comment in code.
Use it to deal with the n=5000 data,and it runs for 0.723s.
Don't stop thinking.
38.846/0.723 = 53.729.
So why a n^2 solution faster than a n^3 solution only 54 times,while n == 5000?
Because Rabin-Karp has lots of [Modulo Operation],which is far slower than [simply check if equal] in that n^3.
3.
Let's get it better.
In that solution I check k and n.
when n < k,that's obvious that he can take the whole string(if it's odd,-1).
Or,we shall do the n^2.
maxa means the largest answer we had met,and it starts from 0.
As you can see,I use maxa to jump the answers whose len<=maxa.I only check substrings whose length>maxa.
But wait a minute.Why let maxa be 0?
At the begining we do have a solution of len=2*k,which means use the k spaces he add to mirror the last k words.
let's add it.
Also ,you can know that in my code,when the check-work get "yes",the second demention will still run,because the next
solution is always lager than the last in one loop.
So why don't we get it reversed?
We can do it from the largest to shortest.when check-work get "yes",simply break the second loop.
so we add the two optimization to our origion n^2 code.
And I get a 0.026s on my n=5000 data.
Why so fast?
Because I made that data n=k=5000.so the advanced n^2 solution will let maxa = k in the begining,just the max.
So the next n^2 = 0^2,so 0.026.
Well that's not fair because not every time we are so lucky to get k similar to n.
So I change k in the n=k=5000 data.now he is n=5000;k=1000 data.
And get a 0.143s.
Not fair again.so we use the original n^2 code to run n=5000;k=1000 data.
And get a 0.282s.
This time fair enough.
But notice that when k get more larger,the advanced code runs far more faster.
Interesting,right?
4.
So how to get it faster?
I know some of you knows about suffix-array.
All right,we do can use suffix-arry to deal with the strings totally in n,and use exgcd to deal with the strings who is in both n and k.
And that is O(n).Rignt,I know.
But I want a solution to get my code faster now.
Just imagine I am a green-hand,knowing nothing about suffix-array.Not every time we have high-tech weapons to deal with problems in contest.
OK HERE I DO REALLY HAVE A CRAZY SOLUTION!!!
Bad news is that it can only deal with randomed data.Targeted-and-well-designed data can hack it.But when data is randomed,it really works.
run n=5000;k=1000 data.(data is randomed)
get a 0.018s.
When n gets larger,the difference gets biggger.
run n=500000;k=100000 data.(data is randomed)
22683992 gets 0.387s
22681668 gets ...
10 mins later.
still running.
20 mins later.
still running.
And it has no sign like "Oh I will be out soon!".Nope.Maybe next time I shall write something like that to show me how far it has gone.
All right,what ever.
I'd go sleep now.Hope I can report when I wake tomorrow.
Actually it will be simpler to understand if you've learned bucket-sort.
The main idea is : if it do,then there is two same strings one close to another.And if it do,the two strings must be same.And if the two strings same,then their first three words is equal.AH,here is it.WE ONLY CHECK TWO STRINGS WITH SAME PREFIX OF LENGTH 3.
The strategy is really really really crazy,and since there is lots of great HACKERS on CF,I know it does not work.
But in other places,things could be different,especially in some places totally without HACK.
Or we can still see it as a joke.
By the way,if the data is godly-perfect randomed,the solution woule be O(26^3 + n^2 / 26^3 )
Ofcause it would't be.
Thanks for the the force of buckets!
See you next time~