AtCoder C++ compiler doubt

4 years ago, # ^ |

+16

I don't know. I just like to solve puzzles, and if we can set some options which will be used by most platforms by default that should be beneficial for me as I don't have to think which compiler options are used in this particular platform. Looks like some people (now including me due to influence of my teammate) believe that asserts should be executed. It feels like this is true for most platforms, but I'm not sure as I don't usually look at the compiler flags until something unexpected happens. If this actually a standard I think it might be better to remove the option, but I should be the last one to ask if this is actually a standard or not.

→ Reply

traviscott

4 years ago, # |

+238

"Compilator" — Um_nik, 2020

→ Reply

4 years ago, # ^ |

+133

Yeah, I also don't know what's the right English word for many things, I just used Russian word because I thought that the Russian word came from English directly.

→ Reply

Monogon

4 years ago, # ^ |

+503

He doesn't understand English, he just memorized some stuff that's enough for solving puzzles.

→ Reply

4 years ago, # ^ |

+147

Does anybody understand English?

→ Reply

mani100011

4 years ago, # ^ |

-104

I'm not sure you understand the meaning of the word "understand".

→ Reply

traviscott

4 years ago, # ^ |

-55

Um_nik by puzzles do you mean tasks?

→ Reply

Agile_Eagle

4 years ago, # |

+52

I just memorized some stuff that's enough for solving puzzles

Weird flex but ok.

→ Reply

antontrygub

4 years ago, # ^ |

+26

this is the exact opposite of flex

→ Reply

4 years ago, # |

+152

I'd say NDEBUG shouldn't be defined by an online judge.

The flag turns assertions into no-ops. The standard caveat of an assertion is that, when you disable assertions, assert(fun(n)) discards the fun(n) part as well.

In contests, people tend to put all sorts of checks inside assertions. More importantly, often the contestants make debug submissions relying on the assert working, and as far as I know, it's considered a good practice in contests. Defining NDEBUG system-wide is therefore counter-productive. The user can always define it explicitly if they want, maybe even as part of their #ifdef LOCAL or #ifdef ONLINE_JUDGE template.

→ Reply

albjeno

4 years ago, # ^ |

-48

Why would it be considered good practice to rely on something that is supposed to be disabled in the the final build – at least in a professional context? Assertions should be used to check pre- and postconditions, not to change the state by calling a mutating function. This seems like a misuse of a language feature.

You can also reenable assert by putting #undef NDEBUG before you include the header cassert, so your last argument is not very convincing.

PS: Needless to say that offering compilers with different flags is a poor decision if it had been deliberate at all.

→ Reply

4 years ago, # ^ |

+31

Some common uses of assert by contestants may indeed be considered misuses. Problem is, what would you write instead, in vanilla C++, to check validity of statements about your program? This way is short and concise, and it's better than nothing, and better than rolling up your own. This is what matters a lot in a contest program.

I agree that assert shouldn't call functions with side effects, and when people learn it, it goes the painful way. But it's beside the point of defining NDEBUG in a judge environment.

I don't agree that contest judge is a production environment which should have assertions disabled. One of a judge's valid and agreed uses in the contest is helping debug your solution.

And as we can see, requiring users to write #undef NDEBUG is against the principle of least astonishment.

→ Reply

albjeno

4 years ago, # ^ |

-53

A statement is by definition non-mutating. Whenever we want to execute a function with side-effects and check the return value, we should write it exactly like that, i. e. auto result = change(); assert(result);. Typing speed is hardly a hindrance to a good performance (maybe unless you are in the Top 20 of users) so these extra keystrokes shouldn't hurt.

I did not mean to say judges should only run release builds. I just wanted to make the misconception about the intended behavior of assert clear. On the other hand, I wasn't aware you could check the output of judge's run during a contest.

→ Reply

4 years ago, # ^ |

+16

Usually debug submitting goes like this: put a couple assertions, and a Wrong Answer turns into a Runtime Error. So you narrow the bug down to what you checked in these assertions.

→ Reply

4 years ago, # ^ |

+22

What are you talking about "a statement is by definition non-mutating"? Maybe you are used to writing it like that, but that doesn't mean everyone is.

How do you know the intended behavior of assert? Are you the developer of C++ who added asserts?

→ Reply

dalex

4 years ago, # ^ |

-30

Outside of competitive programming, release builds are built with -DNDEBUG (asserts are skipped).

→ Reply

4 years ago, # ^ |

You say it like it's predetermined, it was not someone in your company who wrote that compilation line

→ Reply

dalex

4 years ago, # ^ |

← Rev. 2 →

+13

It's not only in my company, so I believe it's one of the best practices in software industry

→ Reply

Xellos

4 years ago, # ^ |

That's not the point of his post. Treat it like "a company" or "one's company".

→ Reply

PavelKunyavskiy

4 years ago, # ^ |

← Rev. 2 →

+18

I don't have good source for statistics, but I would bet on 50/50 on enabling and not. And a lot of projects have there own assert implementation to avoid this.

Probably it's more frequent on Windows, because it's default behavior for MSVC (-DNDEBUG in Release and no -DNDEBUG in Debug).

→ Reply

4 years ago, # ^ |

+21

Seconding this, the C++ spec (https://en.cppreference.com/w/cpp/error/assert) says nothing about "by definition non-mutating", it only says that it evaluates to 0 if NDEBUG is defined and checks the boolean if NDEBUG is not defined.

Also, it's completely untrue that release builds are always built with NDEBUG. I work on a "real production codebase" at a "real software company" and our builds to not use NDEBUG. In C++, there's no such thing as "a standard release build", there are only options that the compiler/language gives you, e.g. the option to set NDEBUG to disable all asserts.

The point here is to be simple, transparent, and accessible by using essentially the default set of flags. If anything, I think -DNDEBUG is a super niche and weird feature, and it's only a step away from -Dassert=printf("HAHAHAHA\n").

→ Reply

WolfBlue

4 years ago, # ^ |

← Rev. 2 →

-31

Uh, assert is definitely intended to be non-mutating... there is a reason that there exists a flag to disable it. There's no flag to disable sort(), and nobody would ever dream of making one, because it is intended to be mutating. One does not have to be the developer of C++ to know this is good practice, it's been trained into anyone with any idea of how to do decent software development

→ Reply

4 years ago, # ^ |

+12

This is a good practice, I understand pros of that, and I don't doubt that you have been trained that. But please don't say anyone, and don't say what people were intending with asserts. You have to be the developer of C++ (and the exact one who made asserts) to know the intent.

→ Reply

4 years ago, # ^ |

+21

Just to go to a bit of an extreme: there totally is a flag to disable sort, it's -Dsort="". More reasonably, there are flags to set your heap size to 0 (force all allocations to fail), there are flags to disable the standard library entirely (-nostdinc++), and many more configurations that people do use in practice, but we obviously shouldn't care about.

The point is that no code has any obligation to work with all possible (intended) flags, especially in a production environment. For many many C++ projects, the build system and compile flags are as much part of the source code as the .cpp files, and they won't work if you change random flags. Don't take "the spec writers intended for this to be something you can choose to use" as "you should/must use this feature".

→ Reply

PavelKunyavskiy

4 years ago, # ^ |

+31

Too be formally correct, -Dsort="" will just not compile, because it will try to define name function with empty name in algorithm.

Something like -D_GLIBCXX_ALGORITHM would work better ;)

→ Reply

WolfBlue

4 years ago, # ^ |

-19

Wait, there is a flag to disable sort??? What the... Congratulations, you have just won the argument.

→ Reply

Xellos

4 years ago, # ^ |

Problem is, what would you write instead, in vanilla C++, to check validity of statements about your program?

Return codes. In vanilla C, too, in fact. Both show up as RE instead of WA. It has an added benefit — if an online judge shows you just the return code, it's already telling you what "assertion" you failed.

I agree that -DNDEBUG shouldn't be in an online judge, it's useful for things like production builds which isn't the case here.

→ Reply

4 years ago, # ^ |

Return codes, yes, actually true to some extent. However, when you are in a function other than main and, for example, want to literally check preconditions and postconditions, return codes get cumbersome.

→ Reply

Xellos

4 years ago, # ^ |

Depending on the specific case, return codes for the program instead of function work as well: if(condition) exit(1) is RTE just like assert(condition).

→ Reply

-is-this-fft-

4 years ago, # ^ |

+57

Although I would never dare to mutate the state with an assert, I think principles from software engineering do not need to be blindly implemented in competitive programming. Best practices in software engineering exist because they make something better: if those things don't make anything better in competitive programming then they are not necessary here.

→ Reply

4 years ago, # ^ |

I think "assert is supposed to be disabled in the final build" is not true. It can be disabled as an optimization if you choose, but I don't think it's true that all "professional contexts" use -DNDEBUG.

I think it's reasonable to expect the judge to be an unurprising and unmodified base environment with base flags (additive changes like -Iboost only), and having the -DNDEBUG flag is certainly surprising and not additive.

→ Reply

VLamarca

4 years ago, # |

+46

This topic is related to a discussion I had the other day. Are there cases in which the use of assert is ilegal and should result in disqualification of the team (in lets say ICPC)?

This may sound like an absurd question at first because one may think that anything you can code should be acceptable, but that is not true: if someone uses the judge result to discover the hidden test cases (by doing a binary search for example — if n>=x WA, else TLE) this is ilegal (or isnt it?). I believe that one rule is that every submission should be an actual attempt of solving the problem. So, suppose I submit a code and I got WA. Can I add only an assert to the code to check if the error is in a certain supposition of mine? Maybe not because that is not an attempt of solving the problem. But what if the first submission contained the assert already and I got RTE? I get free information that the error is in the suposition. But this is just weird, maybe this is a loophole. Anyone has any thoughts on that?

→ Reply

4 years ago, # ^ |

+86

No, this is explicitly an allowed use of assert; that's the whole point of having a separate RTE verdict from a WA verdict. The judge telling you RTE from failing an assert is similar to the judge telling you RTE from an out-of-bounds. You should be using asserts for this reason, it's one the the tricks that really helped me.

However, there is a question of whether we're giving too much feedback, which is obviously a balance. Overall, I think allowing this feedback is good for the contests, because it makes them more fun for the contestants (less stress from unknowns/debugging), and is still reasonably skill-based.

There are several arguments for witholding feedback:

You don't want contestants to extract/hack the test data. This is mostly a concern in IOI-style contests because there is no penalty system for many submits and there is ample time to make many submissions, though IOI also mitigates this with submission limits. (I'm totally guilty of doing this; I've fully extracted test data through WA/TLE/RTE, through partial scoring, and through runtime, as well as just using it to debug.)
Contestants "should" be able to code correct solutions on their own, that's the whole point of CP.

The main argument for giving more feedback is that it makes the contest a better test of skill, rather than having too much luck from random bugs. (This is similar to how competitive Super Smash Bros is played without items; just because a better player "should" be better with items, doesn't mean it's good for the competition.) Secondarily, it also makes the contest more fun for contestants because you're able to solve/debug more problems.

How much feedback you give is also a spectrum. Here are a few points on the spectrum:

No feedback. Contests were like this for a while, I think USACO didn't get feedback until ~2012. This obviously doesn't have any problems with extracting test data, but also is very high variance because even top contestants sometimes have bugs. In this, you have no choice but to spend any extra time diffing your solution against a slow, which kinda sucks as a competitor.
Per-problem verdict (AC/WA/TLE/MLE/RTE/CE/PE), taken from first failed case. Modern IOI's do this (per subtask), as well as ICPC. In my opinion this is a good, conservative balance. It's very difficulty to extract test data because you only get the verdict for the first failed test case, and you can extract at most a couple bits at a time, though it's not foolproof.
Per-problem verdict, with first failed case number. Codeforces does this (though they also tell you runtime + memory usage), as well as a few ICPC training platforms (e.g. ejudge). This definitely can be very useful in debugging (whether you're passing the same case or not), but still very hard to extract the test data because you only get aggregate statistics. This is also a good balance.
Per-test verdict, sometimes with per-test memory and time usage. I think IOI did this in 2014, and many online judges do this. This definitely gives you avenues to extract test data, because you can leak data through verdict (~2 perfect bits), time (sleep(n/1000)), and memory (a = new int[k*1000]). There are many variants of this; AtCoder randomizes the order of test cases, many judges stop after the first failed case. If leaking data is not a concern (e.g. because penalty or because the contest is too short for it to be practical), this is also decent.
All test data. No one does this outside of practice mode, but it's obviously good to help people who are just upsolving.

Overall, I think we've hit a happy medium with per-problem verdict, maybe with index-of-failed-case and max runtime/memory.

None of these directly address your question of whether giving different verdicts for RTE/WA/TLE/MLE is good. I think these are small and helpful hints to give. I think not distinguishing TLE/WA would be very annoying to debug, and at that point, giving a little more feedback about RTE seems totally fine (if you insist, I can and will write void tle_assert(bool b) { while(!b); }).

I believe that one rule is that every submission should be an actual attempt of solving the problem.

I think this is impractical and unenforceable. If I resubmit some code which failed before because I think the issue might be nondeterministic, is that illegal? What if I accidentally introduce some bug which guarantees an MLE? What if I intentionally introduce some bug which guarantees an MLE? Most scoring schemes also do more than enough to penalize extraneous submissions, so I also don't think this is a problem.

→ Reply

VLamarca

4 years ago, # ^ |

I like your number 2 in spectrum for ICPC indeed. But going back to my question, you said that making the use of assert in some cases(and other ways of exploiting the judge veredict) is unenforceable, but I dont agree with that. What this rule may be is subjective, that is, it depends on the judge looking at your code and thinking that you are intentionally trying to get information from the hidden test cases. I believe that this is the current rule in ICPC, but in my opinion a better rule would be allowing you to do anything and limiting the number of submissions to 50 per problem per team so you cannot “binary search” stuff. That way there is no subjective factor and it is easier to judge.

→ Reply

ksun48

4 years ago, # ^ |

Most serious contests don't have such rules and instead have strong/numerous test cases to make sure that this isn't an issue.

→ Reply

VLamarca

4 years ago, # ^ |

+38

I should have started by saying the following: an icpc organizer from Latin America said that an example of an ilegal use of assert is checking whether the input graph is always bipartite (because this may be the case even though the statement does not say that directly, this should be a deduction the contestant has). This use is ilegal because you are trying to check something from the hidden test cases. This rule is BS imo.

→ Reply

4 years ago, # ^ |

+10

Yeah I agree that's pretty BS. I guess the ICPC rules are pretty loose (they pretty much just give judges final say), so it's kinda unfortunate that they're not becoming more objective; at least they haven't codified subjective grading.

→ Reply

4 years ago, # ^ |

+28

Sorry, I got a little confused, I didn't realize you were just stating that such a rule exists and not that you were advocating for it. I think the subjectivity is precisely why rules like "all submissions should be real attempts" are starting to die off in favor of more test data with carefully designed tests. Right now, I don't believe ICPC includes any such rule. ICPC rules prohibit interfering with the network or the judge machines, but there are no rules against using valid submissions to gain information.

I want to point out that there's a spectrum of possibilities of how you can "hack" a problem: the most egregious is to get all test data locally and hard-code the answers in a big if statement. Then, there's identifying/hardcoding particular special cases to fix 1-2 test cases that you failed. Beyond that, one could identify some patterns in 1-2 test cases that get TLE and try to write special heuristics optimize those cases. Finally, there's a whole range of heuristic performance optimizations that could also be considered hacking. I think the consensus is that hard-coding all cases is definitely bad and we should prevent it, but case-specific heuristic performance optimizations are fine but sometimes frowned upon (some ICPC search problems even are designed to be solved with heuristic performance optimizations).

In practice, we usually don't need a submission limit to prevent hacking because (a) there are too many distinct test cases, which totally prevents extracting all data and (b) because 20 minutes of penalty is a big deal. Penalty gives a clear ordering of solve normally > hacking with many submits > not solving, which I think is definitely the right ordering. (FWIW, serious contests without penalty often do have a generous submission limit. I believe IOI's is around 100 submissions per problem.)

Finally, I think there's this assumption that hacking problems is automatically bad; I think it definitely shows some relevant skill and it can be a fun part of contests. There's also a very a fine line solving heuristic search problems and squeezing solutions via hacking. For many heuristic problems, there's a balance between WA-AC-TLE, and you often need to "binary search" on the judge data to tune your parameters. These are totally acceptable and a good skill to encourage.

→ Reply

ffao

4 years ago, # ^ |

+15

Right now, I don't believe ICPC includes any such rule. ICPC rules prohibit interfering with the network or the judge machines, but there are no rules against using valid submissions to gain information.

It's a rule for the Latin American ICPC regional.

I agree with pretty much all that you said (well, except for heuristic problems being acceptable, but that's a different discussion...): being able to trade penalty time for only a few bits of information is not necessarily a bad thing and adds an interesting dimension to strategy. This rule being hard to enforce means it's very rarely actually invoked (for the times I can remember, a team was attempting to interfere with the judge machines), so while I would prefer to see the rule gone, it doesn't affect much in practice.

→ Reply

tribute_to_Ukraine_2022

4 years ago, # ^ |

Has anyone ever been disqualified upon this rule?

→ Reply

ffao

4 years ago, # ^ |

Personally all I know is more than one team has been disqualified for submitting fork bombs. You'd need someone older than me to know what happened prior to 2011, though.

→ Reply

Golovanov399

4 years ago, # ^ |

+20

Actually, I don't fully understand if a believing in any hypothesis violates the rule. Like imagine a problem where the graph is not said to be bipartite, but it follows from the statement, the hardest part of the problem is to discover this fact, and the bipartite case is trivial to implement. If I submit the following (pseudo)code:

Graph g;
g.read();
assert(g.is_bipartite());
solve(g);

then it seems that it's illegal. Is it legal to comment the assert here and then submit? I mean, on one hand, it's still "trying to check something from the hidden test cases", the only difference is that the verdict I get in response does not have to be RE. On the other hand, as far as I know, a contestant doesn't have to prove that their solution is correct, this is what the testing system is for. So please, if I miss any crucial difference between these, anyone tell me what it is.

→ Reply

tribute_to_Ukraine_2022

4 years ago, # ^ |

← Rev. 2 →

If your thought process was sth like "maybe the graph is not necessarily bipartite, but is nearly bipartite (whatever that means) so solve should still work, or maybe I just made a bug in testing if the graph is bipartite" then it's definitely "an actual attempt of solving the problem", but what if Jury says "we don't believe you that's what you thought"

EDIT: I slightly misunderstood what you said, my response refers to case: "you submitted the code with assert, got RE and resubmitted without it (or submitted without got some verdict other than AC, and resubmitted with assert)". If you made only one submission it's obviously "an actual attempt of solving the problem"

→ Reply

ffao

4 years ago, # ^ |

As written, your code is an attempt to solve the problem and therefore should be OK according to my interpretation of the rule.

However, if you submit without the assert and later submit the same code with the assert, then you know the second submission is not going to pass so it would be an illegal submission.

So this is a very interesting case, because to me it sounds like the rule would make the exact same code either legal or illegal, depending on the circumstances.

→ Reply

tribute_to_Ukraine_2022

4 years ago, # ^ |

If the only change in the code is adding an assertion, then yes, but what if I also make some change like: in dfs before iterating through all vertices I shuffle them/sort them by degree/indices, starf dfs from vertex with minimum/maximum degree/random vertex or even start from vertex no 2 instead of 1 or anything like this? It is entirely possible that with a change like this I can slip some random bullshit/solution with a very subtle bug past weak testcases and it is less likely but still possible that sorting all vertices by sth seemingly reasonable like degree/subtree size etc. will actually turn my solution into a provable greedy. Is it still illegal to submit it?

→ Reply

Merkurev

4 years ago, # ^ |

And what's the point of this whole thread? To learn in perfection trashy Latin American RC rules?

One of the cool things about competitive programming is automated judgement. And this rule makes it manual. In my opinion, it makes no sense at all.

→ Reply

The_Dark_Knight_Rises

4 years ago, # |

I don't understand computers, I just memorized some stuff that's enough for solving puzzles.

Funny, how the guy at #1 in top rated describes competitive programming. Can I get even remotely close to where you are if I memorize stuff/problems/ideas/tricks ? Please tell how you "memorize" ?

→ Reply

swapnilr

4 years ago, # ^ |

+20

I think he means to say he memorized basic things like how to compile programs, terminal I/O redirection, and useful compiler flags etc. These, together with basics of C++/Java/other languages, is mostly what he needs to convert his solutions of puzzles (= contest tasks) to code and test it.

→ Reply

The_Dark_Knight_Rises

4 years ago, # ^ |

+49

Fair enough. You just thwarted my plans to take over Codeforces.

→ Reply