sarkarasm's blog

By sarkarasm, 4 years ago, In English

Usually, for long testcases, the full testcase is not visible. Is there any reason for this? It would be much easier to debug our solutions if we had access to them. If it's not possible to show them, then at least let us download the file.

  • Vote: I like it
  • +127
  • Vote: I do not like it

»
4 years ago, # |
  Vote: I like it -36 Vote: I do not like it

Commenting for better reach.

»
4 years ago, # |
Rev. 2   Vote: I like it +15 Vote: I do not like it

I remember that someone asked the same question long ago. That post is hard to find again, so here are some of the opinions from it, as well as mine.

  • Sure, fixing a program is part of the debugging process. But debugging is not only fixing. Debugging is the process where you have to come up with a case where the program produces the wrong output, understand why, and then fix it.
  • $$$10^5$$$ numbers are actually a lot to fit on your screen. Also, there are lots of testcases, so displaying all of them is not really feasible.
  • Even if you got the full testcase, what are you going to do with $$$10^5$$$ numbers? Good luck going step by step through that many numbers!
  • »
    »
    4 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    I also thought the same thing.

    • Most of the time I can find the problem in my solution. But sometimes, no matter how hard I try, I cannot come up with testcases where my solution fails. In such situations, I have to turn to others to figure out the problem. Although the Codeforces community is really helpful, this method is not always reliable.

    • It is not possible to display 10^5 numbers. Hence, I am asking for a file that we can download. CSES has that feature.

    • The CF judge usually tells us which testcase is wrong. Hence, it is not that hard to filter out. Even if it does not, we can build our own checker to filter out the wrong testcases.

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 2   Vote: I like it +17 Vote: I do not like it
      • In a real contest, knowing the testcases is not possible. After the contest, if you really cannot come up with a testcase, write yourself a test generator, with or without a checker, and then run the tests yourself. I believe this is OK, since you also mention that you can build your own checker.
      • But CSES only has at most about 30 test cases, while on Codeforces there are likely to be 100 tests for each problem. Implementing this feature might take more time to optimize. I am not saying that it is not possible, but considering that you really can debug without knowing big testcases, not implementing such features is logical (at least imo) and understandable.
      • Also, imo, knowing which testcase is wrong does not contribute to your debugging process. It is the same as knowing your solution is wrong.
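
The "test generator plus checker" approach mentioned in this comment can be sketched roughly as follows. This is only an illustration under stated assumptions: the problem (maximum subarray sum) and the names `brute`, `fast`, `gen`, and `stress` are hypothetical and not taken from the thread. The idea is to compare a slow, obviously correct solution with the solution under test on many small random inputs.

```cpp
#include <algorithm>
#include <climits>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical example problem: maximum sum of a non-empty subarray.

// Slow but obviously correct reference: try every subarray, O(n^2).
long long brute(const std::vector<int>& a) {
    long long best = LLONG_MIN;
    for (std::size_t i = 0; i < a.size(); ++i) {
        long long sum = 0;
        for (std::size_t j = i; j < a.size(); ++j) {
            sum += a[j];
            best = std::max(best, sum);
        }
    }
    return best;
}

// The solution under test: Kadane's algorithm, O(n). Expects a non-empty array.
long long fast(const std::vector<int>& a) {
    long long best = a[0], cur = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {
        cur = std::max<long long>(a[i], cur + a[i]);
        best = std::max(best, cur);
    }
    return best;
}

// Generator: a small random array, so that a failing case is easy to read.
std::vector<int> gen(std::mt19937& rng) {
    std::uniform_int_distribution<int> len(1, 8), val(-10, 10);
    std::vector<int> a(len(rng));
    for (int& x : a) x = val(rng);
    return a;
}

// Runs `iterations` random tests and returns the first array on which the
// two solutions disagree, or an empty vector if no mismatch was found.
std::vector<int> stress(int iterations, unsigned seed) {
    std::mt19937 rng(seed);
    for (int it = 0; it < iterations; ++it) {
        std::vector<int> a = gen(rng);
        if (brute(a) != fast(a)) return a;
    }
    return {};
}
```

Calling `stress(1000, seed)` from a local `main` and printing the returned array gives a small counterexample whenever the two solutions differ. As noted elsewhere in the thread, random tests can still miss a hand-made case.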
      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +8 Vote: I do not like it

        not implementing such features is logical (at least imo) and understandable

        No feature needs to be implemented at all. The problem packages from Polygon just need to be uploaded to Google Drive / Dropbox. Some help is better than no help. I have seen problems with independent small test cases where my code got WA and, despite trying for a long time, I couldn't find any bug; eventually I had to give up. I am willing to spend a day on a huge test case rather than give up without knowing why my solution fails.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it +9 Vote: I do not like it

          Well, first of all, small testcases are visible on Codeforces. This discussion is about fully visible testcases, which means it's about big testcases.

          As for publishing on Dropbox/Google Drive or similar services, two problems arise. First, the data is actually big. Lots of space need to be rented in order to store those data. Second, both the problems and the testcases can be stolen for some local contest, without the real setters even being credited.

          Of course help is necessary when you are stuck, but a small piece of help on one single problem doesn't help you avoid getting stuck in the long run. Sadly, only you can help yourself with that.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it +6 Vote: I do not like it

            I think you misunderstood my point about small testcases. There exist problems with T independent test cases (say T = 1e5) where each test case is just an array of 5-10 numbers. If your solution fails on the 1000th test case, you have no idea where it fails, even though the test case itself is actually small.

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it +9 Vote: I do not like it

              You have a point. But the number 1000 is unlikely: a solution that passes 999 tests should, statistically, be correct. More realistically it is about the 10th to 30th test, which is, yeah, also visible.

              And if you are desperate to see that small test case, here is one way to do it: first, hash the first few numbers of the input on which you got the wrong answer; let's call the result $$$X$$$. Then, in your code, hash the same number of input values in the same order. If you get the magic number $$$X$$$, then print out the test case you got wrong at the beginning.
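
A rough sketch of the hashing trick described in this comment follows. Everything named here is an assumption made for illustration: the helpers `prefixHash` and `dumpIfTarget`, the alias `Case`, the multiplier `1000003`, and the input layout (each sub-case is a plain array) do not come from the thread. The magic value $$$X$$$ would be computed offline from the visible beginning of the failing test.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Polynomial hash of the first `k` numbers (or fewer, if the input is shorter).
// Unsigned arithmetic wraps around, which is fine for a fingerprint.
std::uint64_t prefixHash(const std::vector<long long>& nums, std::size_t k) {
    std::uint64_t h = 0;
    for (std::size_t i = 0; i < nums.size() && i < k; ++i)
        h = h * 1000003ULL + static_cast<std::uint64_t>(nums[i]);
    return h;
}

// One sub-case of a multi-test input; assumed here to be a plain array.
using Case = std::vector<long long>;

// If the first `k` raw input numbers hash to `magic`, returns sub-case `idx`
// (0-based) as text in the form "n\na_1 ... a_n\n"; otherwise returns an
// empty string, meaning "this is not the test file we are looking for".
std::string dumpIfTarget(const std::vector<long long>& rawInput,
                         std::size_t k, std::uint64_t magic,
                         const std::vector<Case>& cases, std::size_t idx) {
    if (prefixHash(rawInput, k) != magic || idx >= cases.size()) return "";
    std::ostringstream out;
    out << cases[idx].size() << '\n';
    for (std::size_t i = 0; i < cases[idx].size(); ++i)
        out << (i ? " " : "") << cases[idx][i];
    out << '\n';
    return out.str();
}
```

In a submission, a non-empty result would be printed before the normal answers, so that the hidden sub-case shows up in the visible part of the program's output for that test.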

              • »
                »
                »
                »
                »
                »
                »
                »
                4 years ago, # ^ |
                  Vote: I like it +6 Vote: I do not like it

                Hashing is overkill though. If you are sure the tests have different first numbers, then just check them one by one by hand.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
            Rev. 3   Vote: I like it +4 Vote: I do not like it

            Lots of space need to be rented in order to store those data.

            Dumb me: [I don't agree with it. 10000 problems * 100 test cases * 1 MB average for each test case = 1 GB. Google drive gives 15 GB for free.]

            You are right about it.

            I agree with your second point about problems being stolen. In that regard, I wish setters were given the freedom to release the full problem packages publicly, if they want to.

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              Well, this one is up to our administrators and problem setters. I won't complain if it is made public, though :)

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it +48 Vote: I do not like it

              You should check your math.
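
For reference, taking the bracketed estimate's own figures at face value (10000 problems, 100 tests per problem, 1 MB per test, all of which are rough guesses from the comment above rather than measured values), the arithmetic works out as:

$$$10^4 \cdot 10^2 \cdot 1\ \text{MB} = 10^6\ \text{MB} = 10^3\ \text{GB} = 1\ \text{TB}$$$

That is about 67 times the 15 GB free tier mentioned in the same comment, since $$$1000 / 15 \approx 66.7$$$.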

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +30 Vote: I do not like it

      Have you ever tried to debug a complicated algorithm which fails on a huge input? In many tasks it's not practical at all. In most cases the ability to see the beginning of the test data might give you some clues; if it doesn't, then the full test data won't help either.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I remember I was solving a problem on trees and it failed on a big test. After looking at some of the initial edges, I figured it was a Bamboo and found the problem.

        So yeah, looking at the beginning does help sometimes!

  • »
    »
    4 years ago, # ^ |
    Rev. 2   Vote: I like it +54 Vote: I do not like it

    Your comments in this thread mention that in a real contest they don't show testcases, so you shouldn't have them in upsolving either. But by that logic, why do they show small/truncated testcases in upsolving mode at all?

    I also personally don't agree with the general sentiment that it's ideal to replicate your contest environment while training -- that's what contests are for. While training, I'm happy to be able to ask friends for help, read editorials, and read the exact testcases where my code fails without the logistical overhead of writing a generator/checker locally (often this isn't even that hard, it's just time-consuming). I don't always do those things, but I sometimes do -- I think you should trust the trainee to only do them when appropriate.

    • »
      »
      »
      4 years ago, # ^ |
      Rev. 4   Vote: I like it -10 Vote: I do not like it

      I didn't say that testcases should be eliminated in upsolving, but it is my fault if I made you think I implied it. "Don't rely on wrong testcases" is what I wanted to say. Having a training environment without known testcases is one of the ways to get rid of that mindset. Of course, it is not ideal or optimal. I know that lots of people have very little time to train for competitions, and what they should focus on is algorithms and problem-solving skill, with solving problems fast as the priority. This part depends on the person, though: not everybody is the same, so different training approaches should be applied.

      I don't agree that contests on various OJs are for training. The more correct term is virtual contest, which is also an environment without visible testcases during the training session. It is true that for us Div 2/3 is for training, but that is not the ideal environment for people at the corresponding level. The real contest is for competing. If it is for training, then what is the point of the rating system?

      I agree with you on the rest. That is also what I said below in this thread.

»
4 years ago, # |
  Vote: I like it +3 Vote: I do not like it

I think some help is better than no help at all. There have been problems where, despite thinking and trying extensively for a long time, I just couldn't find any corner case and eventually had to give up.

The only logic behind not publishing testcases I can think of is: setters put a huge effort into preparing problems and test cases, and releasing the testcases means anyone can copy their problems to another OJ without their consent.

I wish CF would at least release the testcase inputs on Dropbox like AtCoder does. It would help a lot without giving everything away (a checker is a must for problems with multiple correct answers).

Also, I think setters should be given the freedom to release the full problem packages publicly, but only if they want to.

  • »
    »
    4 years ago, # ^ |
      Vote: I like it -10 Vote: I do not like it

    I think you are being misleading here. Codeforces does provide small testcases and you can use them, and I think that counts as help. As for big testcases, they really do not help you debug at all. Even if you have those big testcases, what are you going to do with them? The only information you get from each big testcase is whether you are correct on that test or not, and this great platform, Codeforces, already tells you that. Big testcases are hard to analyze. Even if you can analyze it, it would take a lot of time, and I don't think it's worth it.

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I have faced problems with T independent test cases where each test case is just ~30 numbers. If my solution fails on the 200th test case, I have no idea where it fails, even though the test case itself is actually small.

      There have been problems where I tried to stress test and wrote a random test case generator but couldn't find any corner case, because the failing test was actually hand-made and its special property was not recognizable from the visible portion alone.

      Even if you can analyze it, it would take a lot of time, and I don't think it's worth it.

      In some cases it's worth it. It's better to have the option than to be forced to give up without finding out why my solution fails.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it -10 Vote: I do not like it

        Before we go deeper, let's review debugging in a real contest. In a real contest, you don't have a testcase right in front of your eyes. It is all up to you. Thus, besides algorithmic and problem-solving skill, debugging skill is also a requirement, and it also needs to be trained while solving.

        I know that you want to save time for the debugging part. But what about viewing AC code, as well as the editorial and the problem setter's code? Analyzing that code is much easier with a detailed explanation. Then compare it to your code and see what is wrong. I think that will save more time than just looking at a bunch of numbers.

        And this is the corner case I think you might have: you have another solution that you are sure is correct and that is different from the editorial. If you are that sure, then I think you would be willing to try writing test generators and your own checker, as well as asking others for help, because at that point it is not upsolving anymore but proving correctness.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it 0 Vote: I do not like it

          I know that you want to save time for the debugging part

          It's not only about saving time, it's about having more options. Having the option to spend one day analyzing a huge case is better for me than being forced to give up on a solution that differs from the editorial without understanding why it doesn't work. There can be mistakes in proofs and/or code that are only recognizable with a hand-made test case. I think it's better to have more options.

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
            Rev. 2   Vote: I like it -10 Vote: I do not like it

            But that is not my point. My point is that during a real contest you have literally no option other than coming up with the test yourself, and therefore upsolving should have the same environment.

            And I am not saying that you should give up. There are options. I did mention that you can ask others. You can even ask the setter for that hand-made testcase. And if the test is small enough, use the option I mentioned above to pull out the test you want.

            Personal opinion though: seeing the testcase, especially that special hand-made testcase, seems to me like giving up. I guess what is better for you is not suitable for me.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it +20 Vote: I do not like it

        In that case, you can use: if t==30 cout the input. I did the same thing here 87208919
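
One possible reading of the "if t==30 cout the input" trick is sketched below. This is an assumption-laden illustration, not the code from the linked submission: `t` is taken to be the number of sub-cases in the file, and `solve`, `run`, `targetT`, and `targetIdx` are hypothetical names. The input format (each sub-case is `n` followed by `n` numbers) is also assumed.

```cpp
#include <iostream>
#include <sstream>
#include <vector>

// Placeholder for the real solution (an assumption): it just sums the array.
long long solve(const std::vector<long long>& a) {
    long long s = 0;
    for (long long x : a) s += x;
    return s;
}

// Reads T sub-cases of the form "n a_1 ... a_n". When the file has exactly
// `targetT` sub-cases, sub-case number `targetIdx` (1-based) is echoed back
// instead of answered, so that it appears in the visible program output.
void run(std::istream& in, std::ostream& out, int targetT, int targetIdx) {
    int t;
    in >> t;
    for (int tc = 1; tc <= t; ++tc) {
        int n;
        in >> n;
        std::vector<long long> a(n);
        for (auto& x : a) in >> x;
        if (t == targetT && tc == targetIdx) {
            out << "DUMP " << n;
            for (long long x : a) out << ' ' << x;
            out << '\n';
        } else {
            out << solve(a) << '\n';
        }
    }
}
```

In an actual submission this would be called as `run(std::cin, std::cout, targetT, targetIdx)`, with both targets filled in from the verdict of the failing test. Two test files with the same number of sub-cases cannot be told apart this way, which is where the hashing idea from earlier in the thread comes in.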

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it -10 Vote: I do not like it

          Exactly! And I think there are a lot of tricks to check those, like using assert to test our hypothesis.
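
The assert trick could look roughly like the sketch below. The hypothesis ("the failing test is a strictly increasing array") and the names `strictlyIncreasing` and `probe` are made up for illustration. It also assumes that asserts are not compiled out with `NDEBUG`, so that a failed assert aborts the program and the verdict on that test changes from a wrong answer to a runtime error.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothesis to test (hypothetical): the input array is strictly increasing.
bool strictlyIncreasing(const std::vector<int>& a) {
    for (std::size_t i = 1; i < a.size(); ++i)
        if (a[i - 1] >= a[i]) return false;
    return true;
}

// Call right after reading the input. If the hypothesis holds, the assert
// fails and the program aborts; if it does not hold, execution continues
// and the solution produces its usual output.
void probe(const std::vector<int>& a) {
    assert(!strictlyIncreasing(a));
}
```

If the failing test now reports a runtime error instead of a wrong answer, the hypothesis holds for that test. Each probe reveals only one bit of information per submission.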

    • »
      »
      »
      4 years ago, # ^ |
        Vote: I like it +10 Vote: I do not like it

      Big testcases are hard to analyze. Even if you can analyze it, it would take a lot of time, and I don't think it's worth it.

      This is not always true: if your submission has a bug that causes it to enter an infinite loop, throw an exception, or erroneously allocate an unreasonably large amount of memory, and if your local testing can't trigger the bug, then getting the large test case to reproduce the bug locally is more than half the battle.

      • »
        »
        »
        »
        4 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I don't agree with this. You can stress test yourself. If you happen to get WA on a big testcase, then, well, your algorithm might be wrong. Debugging in this case requires deeper analysis in order to fully understand your code.

        • »
          »
          »
          »
          »
          4 years ago, # ^ |
            Vote: I like it +10 Vote: I do not like it

          Of course you can stress test yourself; I said "if your local testing can't trigger the bug".

          • »
            »
            »
            »
            »
            »
            4 years ago, # ^ |
              Vote: I like it 0 Vote: I do not like it

            If local testing does not trigger the bug, then viewing the huge testcase, downloading it to your machine, and running it locally might not trigger it either. Also, in that situation you are likely dealing with some undefined behavior, which should be well known to C/C++ coders.

            • »
              »
              »
              »
              »
              »
              »
              4 years ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              Yes, it's possible that your code fails (on the testing server) on a big test case, but when you download it and run it locally, you cannot reproduce the failure. But it's also possible that you can reproduce it locally, and I'd argue that it's actually quite likely (especially with tools like valgrind and -ftrapv). In any case, in the cases where it turns out to be possible to reproduce the failure locally, you don't have to spend a lot of time after reproducing the error to understand your bug.

              • »
                »
                »
                »
                »
                »
                »
                »
                4 years ago, # ^ |
                Rev. 2   Vote: I like it +6 Vote: I do not like it

                Please see some of my suggestions above for time saving when upsolving. That would save more time than just analyzing the testcase.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it +10 Vote: I do not like it

                  Which suggestions? The cases I'm thinking of don't require analyzing the testcase, just finding out the line of code my submission failed on

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  Here, on the same page. Sorry that I missed the word "above"; I have just edited it in.

                  Like I said, it is not as practical as stress testing. And hell yeah, Codeforces has had a diagnostics feature for like 2 or 3 months now, so we don't need to do that ourselves anymore.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  Ah, I see. Yeah, I had read that before, it contains good advice.

                  Regarding diagnostics: they only apply to C++, right? AFAICT python stack traces are never printed on runtime errors.

                  I feel like if we are allowed to "cheat" by looking at small testcases in practice mode ("cheat" in the sense that it is an action we cannot do in contest mode), it shows that there is no philosophical objection to allowing us to "cheat" by downloading large testcases or by printing stack traces, just that it takes too much effort / resources to support those operations.

                • »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  »
                  4 years ago, # ^ |
                    Vote: I like it 0 Vote: I do not like it

                  Philosophically, yeah, it is possible to pull out the testcase, as I also said above. Some real hacker could even write a compressor and print out the compressed testcase.

                  But economically, it's not worth it. To pull out all the testcases of a problem with 100 testcases, 100 WA submissions need to be made, and that also carries a risk of being banned for spamming.

                  I think printing out the stack trace is OK, though. Java has that feature built in.

»
4 years ago, # |
Rev. 2   Vote: I like it +1 Vote: I do not like it

Yes, they could provide files; it would be great for beginners like me. Maybe HackerEarth provides files of test cases, but isn't it very complicated to find a bug among 10^5 numbers? And there are 100 or even 200 test cases. Just giving my opinion.