Hey all,
I find one of my biggest weaknesses is debugging efficiently under contest settings. If you look at my contest history, there are plenty of instances where I get horrible penalties and wrong multiple submissions on a given problem before potentially solving it. The most recent example is today's Educational Round, where I took 40 minutes to figure out what was wrong with my D solution.
I am well aware that there are plenty of blogs and posts on debugging upon Googling, but a lot of the advice is kind of generic and don't always work for me, so I was hoping to get some more specific answers via this blog.
I'll try to provide some context so you know what I'm already doing. Here is my current debug strategy during contests:
TLE/MLE/RE: Usually not very common for me. I just look for common sources of these errors. For TLE or MLE, I check things like allocating lots of vectors inside loops, while loop conditions and recursive functions, etc. For RE, I check for index out of bounds, etc.
WA: The verdict I get 90% of the time. In general, I first do a quick scan through my code for immediately obvious errors that I can catch on the spot.
A. WA on samples/I have a test case that I know breaks my program: Usually not too bad because I can just print intermediate variables and locate which part of my program is bust. I have a basic debug template so I can type DEBUG(x)
to print the contents of variables. Depending on how complicated the implementation is, this is probably fixable in 5-10 min if it's not a fault in my logic.
B. WA and I don't have a test case, but I do know a brute force solution: I'll write up the bash solution and generator, and run my bash solution against my current solution with a bash script. Once I find a test case, I proceed to case A. If I type fast, I might pull this off in 5-10 min as well.
C. WA, nothing to work with except the dreaded "WA on pretest X" message: This is where shit hits the fan. I try things like rereading the problem statement, rereading my code, looking over my logic. I try edge cases like n = 1, but it's rarely that obvious. I try to think of counter cases, but I can never think of the case needed to break my program. It probably doesn't help that when I can't debug quickly, I get frustrated, and that only makes it worse. This can take anywhere from 30+ min to never fixing it at all throughout the entire contest.
So I guess my question is, how do you all deal with case C? Thanks in advance.
My debugging process:
Step 0: Reread the problem statement and check for incorrect formatting + debug statements + output
More than once I've gotten WA by not reading right or doing something stupid. This seems trivial but this simple step can save you an hour on debugging.
Step 1: Figure out if my algorithm/solution is wrong.
Usually this happens when I try proof by AC and it fails. I then try to formally prove my results or formally disprove the results. If I can't prove the result, then I might try rethinking it. If the problem is hard enough that I can't prove/disprove the result, then there's nothing I can really do besides hope my guesses are correct. Sometimes if I can't prove my solution but have enough intuition to be confident in my results, I skip to step 2.
Step 2: Figure out where my implementation went wrong.
Once I'm reasonably confident my algorithm is probably right, I debug my code using print statements. Even though my code gets the right answer, it might be doing some funky stuff, so samples can still be helpful here. Printing is especially helpful here. The template I use (Benq's) has easy print methods, which makes it convenient for me to debug. I highly recommend make printing functions to make printing easier unless you have super fast typing skills.
I usually break my code into blocks where I try to see what happens in each block and whether it does things correctly. I've learned that I'm generally stupider than I think I am, and I can make mistakes anywhere, even in simple precomputation. So checking my code in blocks forces me to look at everywhere. Of course, if I'm highly suspicious about an area of code, I'll check that block first, but generally I look completely through my code. I'll also use some random test cases to check my logic in each block.
Step 3: I completed Step 1 and Step 2 and both have failed to resolve the issues.
This is the hard scenario where you either fake solved the problem or you have a very subtle bug. I usually go back to step 1, and try to figure out where I went wrong. If my algorithm is incorrect, it means either I've assumed something invalid in my proof or did something that doesn't make any sense. Being able to find out where you've fake solved something is a skill that takes time and practice to pick up, so this step is definitely not easy. After this, if I can't see why my solution could be wrong, I go back to step 2, and after a while of failing I go to step 4.
Step 4: Rewrite the code.
Sometimes your code gets so messy that it might be helpful just to completely rewrite it. You might see something you wouldn't see while skimming over your code. It's also good for just having something to do instead of just poring over code. You should be actively thinking while rewriting code instead of just copying though.
Step 5: Give up and move on.
I'm too stubborn to ever do this step, so I don't have really much experience with this. :P
General advice: - Keep your code clean. Especially when rewriting code or debugging, meaningful variable names will go a long way. I've confused myself with one letter variables names many times. - Keep track of your assumptions whether mentally or physically. This will help you figure out if your algorithm is wrong at some point if you previously glossed over something you didn't have time to prove.
Interesting. There definitely are moments where I could benefit from rewriting code or keeping it clean in the first place, so I will keep that in mind.
Your process is very similar to mine. But I almost always try step 2 before or with step 1, as I have seen my solution being accepted even when I had no idea why it works, quite a few times. Step 1 is often more time consuming in proof by AC problems, as you have to construct a non-trivial corner case (non-trivial in the sense that I have already tried to prove or disprove by corner cases before coding).
One thing I found is that often after getting WA I look into the code and need only like 1 or 2 minutes to spot the bug. If you look at solutions of other people this is often the case, too. They resubmit often fairly fast, and successful.
So now before I submit, I ask myself what will I now do if I get WA? And then just do that for like a minute or two.
Interesting exercise. Thanks for the reply!
Even if you do not know the brute force solution, it might be a good idea to run your code on a large number of random test cases and in the meanwhile check for the mistakes you tend to make. For example, I check for overflows by monitoring if something goes negative, or if I rely on something — check if it always holds (recent example, in the last Hacker Cup round, I was not updating dp states which were not reachable, with infinity, because I initialized them with infinity in the beginning — and I did not realize I was updating that place in the dp array multiple times, until I checked if the value was indeed always infinity).
Strange but highly effective tip: solve lots of 1500/1600 problems. They tend to contain enough substance to not be completely trivial but simple enough to have concise and elegant solutions, providing for perfect debugging practice.
Others have covered some really good tips, additionally, when it is a greedy problem but you are not sure about the proof and the bug is not in the implementation, and when there are relatively more number of AC submissions, I delete my code, start from scratch, trust me it works at times, because it could mean I probably took off to a wrong path, it is easier to find the right solution than to disprove/debug the wrong solution.
Don't forget some multi-testcases stuffs like reset your array or a vector