tenshi_kanade's blog

By tenshi_kanade, history, 8 years ago, In English

Prologue:

OK, so I finally decided to learn Suffix Tree. It's a monster I'd never dared to look in the eyes, I had always gotten away with using Suffix Array with the tail between my legs until now. The thing is that some problems require a more efficient data structure, so I decided to leave my fears behind and learn it once and for all.

First of all, I read about it in a few books and some papers online. When I understood its final structure and how to work with it, I searched an implementation online (no, I didn't plan on coding everything from scratch). I found a good and efficient implementation on Geeks for Geeks, but unfortunately it was written in C and it worked with native types.

The tweaking and testing:

I worked with the code and tweaked it here and there. I removed unnecessary comments and upgraded the code to C++ to make it work with strings instead of char pointers. I also added some extra functionality that didn't originally come with the implementation (Longest Common Substring, Longest Repeated Substring, Occurence Count, etc.). I finally wrapped everything into a class and went on to testing.

Everything worked fine and very fast too. I solved a few problems, but then I came across some other ones that couldn't be solved 'cause I realized that the class I had in my hands worked only with one single string. A lot of problems that are solved with Suffix Trees require a different kind of data structure, one that supports working with multiple strings. That data structure, I learned, is called Generalized Suffix Tree. This is where the problems began.

Generalized Suffix Tree:

I searched online about it, but all the papers and articles I found built it by adding a different delimiter character for each word. The problem with this implementation is that you can work with a very small number of words. A lot of problems give you 500, 100 or even 104 strings. There aren't that many characters.

My idea was to use a single delimiter character for every word, but then do a DFS and count how many delimiters I pass along the way. When I reach a leaf, I know what word this suffix belongs to by the amount of delimiters I passed.

So the problem boils down to this: We have a tree with numbers written on the leaves. How do we efficiently compute for every inner node how many different numbers are written on the leaves in its subtree?

My workaround (the best idea that crossed my mind) was to do a DFS and maintain a set for each node with the numbers I found on its subtree. By merging smaller sets into bigger ones and upgrading pointers instead of moving everything from one place to another, I was able to achieve something in the lines of O(Nlog2N), but the problem is that Suffix Array + LCP + Range Query ends up being faster than this Generalized Suffix Tree. That's not very nice after all the effort I put into it. But I still have hopes, I think there has to be a better way out there.

Final question:

How do you guys implement a Generalized Suffix Tree and how do you efficiently compute for every inner node, how many words this suffix belongs to?

Thanks very much in advance.

Full text and comments »

  • Vote: I like it
  • +25
  • Vote: I do not like it

By tenshi_kanade, history, 8 years ago, In English

I want to tell a curious story that happened to me while solving a problem on SPOJ. It happened a few days ago.

  • Solved a problem. Submitted the code. Result: WA.
  • I, being pretty sure that my solution was correct and knowing that SPOJ's checker is an abomination, decided to resubmit the exact same code, only seconds apart from my first submission. Result: AC.
  • Not happy enough, I decided to test the unstable system of SPOJ. So I submitted the same code, for the third time. As I expected, the junk that SPOJ use went haywire again, and result was WA.

This was obviously not the first time I had problems with SPOJ and its rusty short-circuited servers. Every single time I want to solve a problem there, I know what's coming. I expect lots of unstable testing and false WA's.

So, I've made up my mind. From now on, or at least until SPOJ staff decide to do something about that junkyard they use, I'll only use SPOJ if I want to solve a problem that's not on any other online judge. Nothing will give more headaches, not even UVA with its ridiculous parsers.

OK, I just wanted to share my thoughts. Let's hope I don't get so many downvotes for my shitpost. Have a nice day!

Full text and comments »

  • Vote: I like it
  • +33
  • Vote: I do not like it

By tenshi_kanade, history, 9 years ago, In English

I've recently come across a very strange problem using C++ 11.

I submitted a solution for problem 620C - Жемчужинки using unordered_map and got TLE, then I submitted the exact same solution using map (it's supposed to be slower: O(1) amortized VS O(logN) per operation) and it got Accepted with 280 ms.

Submission with unordered_map: 15756653

Submission with map: 15759645

Is this some kind of problem with the C++11 compiler used here on Codeforces? Anyone got a clue?

Full text and comments »

  • Vote: I like it
  • +11
  • Vote: I do not like it

By tenshi_kanade, history, 9 years ago, In English

I experienced a strange WA on 576C - Точки на плоскости.

Compare these two submissions: 13000332 and 13000388. They are exactly the same, except that one submission had comparators <= and >=, while the other had comparators < and >. There shouldn't be any undefined behaviour as far as I can see, so I'm wondering why the WA in the first submission.

This would have been very frustrating had it been during a contest... Is there anything I'm missing here?

Full text and comments »

  • Vote: I like it
  • 0
  • Vote: I do not like it

By tenshi_kanade, history, 9 years ago, In English

Hi. I need a little help with an algorithm...

I've recently read the chapter on Suffix Array from the book Competitive Programming 2, and I've implemented it using the O(NlogN) method described there. It works, but only as long as there isn't a suffix which is itself a suffix of another suffix. For example, it doesn't work for the simple string "AAAA". I tried copying/pasting the code written in the book to make sure it wasn't my code that was wrong, but it didn't work either.

So, my question is: How do we overcome that problem and make the algorithm work even for those strings?

Here's the implementation in C++: Suffix Array

Full text and comments »

  • Vote: I like it
  • +8
  • Vote: I do not like it

By tenshi_kanade, 9 years ago, In English

I've seen YT videos and blog posts here of contests screencasts (mainly from dj3500), and I was planning to do a screencast. But the thing is I don't know what software to use. I was wondering what lightweight video capture program I can use, because the only one I know of is Camtasia Studio, and that's anything but lightweight. On the contrary, it's so CPU-heavy that I believe if I were to capture 2 hours of HD video nonstop, my hard drive would blow into a thousand pieces and my CPU would heat up to the point it would melt the motherboard, the case and the table that supports it.

Anyway, I'd appreciate if someone could englihten me on the matter and share a lightweight video capture program I can use to record a screencast.

Thanks in advance!

Full text and comments »

  • Vote: I like it
  • -5
  • Vote: I do not like it

By tenshi_kanade, 10 years ago, In English

258B - Little Elephant and Elections is driving me crazy!

I am getting WA in Test 3.

The problem statement states that the lucky digits in the Elephant's party must be strictly larger than the total number of lucky digits in all the other 6 parties. In that case, m = 47. Let Ci be the number of numbers with i lucky digits in the range [1, m]. For m = 47, C0 = 31, C1 = 14 and C2 = 2. Then the possible assignments are...

  • The elephant's party gets one of the 2 numbers from C2, then either one of the other 6 parties gets a number from C1 and the other 5 parties get numbers from C0, or all the 6 parties get numbers from C0. Then the assignments are 2 * 14 * 31 * 30 * 29 * 28 * 27 + 2 * 31 * 30 * 29 * 28 * 27 * 26 = 1631145600.
  • The elephant's party gets one of the numbers from C1, then all of the other 6 parties must get numbers from C0. In this case, the assignments are 14 * 31 * 30 * 29 * 28 * 27 * 26 = 7421712480.

This means the total number of correct assignments is 1631145600 + 7421712480 = 9052858080. This number modulo 109 + 7 is 52858017.

Then my question is: Where the hell is my reasoning wrong? Cause the answer seems to be 907362803.

I'll be grateful to any kind soul willing to enlighten me.

UPDATE: I figured it out. I was failing to consider the order of political parties. Thanks for opening my post anyway, and please don't downvote.

Full text and comments »

  • Vote: I like it
  • +13
  • Vote: I do not like it

By tenshi_kanade, 10 years ago, In English

I just want to make public my fascination towards this problem and recommend a tutorial to all the users. I loved solving this problem. It's a truly beautiful Eulerian Path problem in disguise.

It also helped me refresh my knowledge on the algorithm. I learned Eulerian Paths/Tours like 10 years ago when I first started at the USACO Gateway and then I rarely ever used it again. Now I googled it to refresh my memory, but I couldn't come upon any good implementations out there. There are a few, but all of them seem either overly complicated or inefficient. No other algorithm I found out there beats the beautiful recursive algorithm presented at the USACO Gateway.

For those not registered at USACO, you can view the tutorial here: Eulerian Paths/Tours Tutorial

Full text and comments »

  • Vote: I like it
  • +18
  • Vote: I do not like it

By tenshi_kanade, 11 years ago, In English

Hello!

Long ago that I do not write here on the blog. Tomorrow there is another Div 2 contest, so I'm going to use to try to amend a bit bad that I was in the previous contest. I'm going to try to not be so gil at last and a little review code before sending. I trusted too much because the problems were very easy... my mistake. Me di account which had a small error only when the lockee to hack, and there was no turning back... But well, I'm calm because I know that my real place is in Div 1.

Tomorrow I hope to do + 240 rating and be nothing be Div 1, and in the next that has reach the 1750 and remain purple. My ultimate goal is to become red as the thickness, but that is impossible. I am satisfied with being yellow.

Also now I am excited because the leprosy goes well, Dad! So I am going to go well!

Greetings comrades!

Full text and comments »

  • Vote: I like it
  • -7
  • Vote: I do not like it

By tenshi_kanade, 11 years ago, In English

I'd like to introduce myself.

My name is Pablo and I'm from Rosario, Argentina, one of the city most beautiful in the world. Cup me much programming, is what most excites me together with go to encourage the leprosy. Then I got here to see what wave... and the truth that is re good! This competition Abbyy in which I participated was re grosa, chabon! Re-zarpados problems were! And I d like to tell you to touch if the problem accepted it or not, because it allowed you to correct if there was some error idiot somewhere in the code. I hope to do more competitions like this, crazy!

All that don't drink me all these skills online is that we they discriminate against many much other cultures and countries, because the problems are nothing but English, and there are many who do not know English. I problems put on the translator to be able to make them, and sometimes I am confused a band that because it means evil. But outside of that, all jewel.

Well, I say goodbye then. Best regards!

Well, here down command just as I wrote above in Spanish for those who speak Spanish...

Me gustaría presentarme.

Me llamo Pablo y soy de Rosario, Argentina, una de las ciudad más lindas del mundo. Me copa mucho la programación, es lo que más me apasiona junto con ir a alentar a la lepra. Entonces me metí acá a ver qué onda... y la verdad que está re bueno! Esta competencia Abbyy en la que participé fue re grosa, chabón! Los problemas re-zarpados eran! Y me re gustó que te dijeran al toque si el problema lo aceptaban o no, porque te permitía corregir si había algún error boludo en alguna parte del código. Espero que hagan más competencias como esta, loco!

Lo único que no me copa de todas estas competencias on-line es que discriminan a muchas culturas y países, porque los problemas están en inglés nada más, y hay muchos que no sabemos inglés. Yo los problemas los meto en el traductor para poder hacerlos, y a veces me confunde una banda eso porque traduce mal. Pero fuera de eso, todo joya.

Bueno, me despido entonces. Saludos!

Full text and comments »

  • Vote: I like it
  • +28
  • Vote: I do not like it