Geothermal's blog

By Geothermal, history, 6 days ago, In English

On November 16th starting at 1:30 PM US Eastern Time, the 2024 ICPC North America South Regional will take place. This year, galen_colin and I will host the official contest broadcast, starting shortly before the beginning of the contest, on the ICPC Live YouTube channel. During the stream, we'll

  • Follow the scoreboard and discuss the contest
  • Discuss the problems and possibly solve a few of them (though unlike some of my past personal ICPC streams, solving problems and competing for the highest score will not be the main goal of the stream)
  • Talk about the teams competing to advance to the 2025 ICPC North America Championship
    • In keeping with recent tradition, I'm hoping to share some fun facts about the competing teams. If you're the contestant or coach of a competing team and want to share a fun fact for me to read on stream while covering your team, please comment below or send me a DM (including your university and team name).
  • Answer questions asked by viewers
    • If you have a question you'd like us to answer on stream, feel free to comment below!
  • And more!

The South Division includes the Mid-Atlantic, South Central, and Southeast North America regions. Teams from all three regions will compete on the same problemset and scoreboard, and the top few teams from each region, as well as potentially some extra teams from the combined scoreboard, will advance to the North America Championship. Historically, these three regions have been highly competitive, so we should be in for an exciting regional!

The contest website is at this link, and open mirrors for the two divisions will be held simultaneously with the contest: Division 1 and Division 2. Division 1 is meant for teams with competitive programming experience and will decide the teams advancing to NAC, while Division 2 is meant for teams with less competition/CS experience.

We hope to see you on the stream! Good luck to all of the participating teams!

Full text and comments »

  • Vote: I like it
  • +67
  • Vote: I do not like it

By Geothermal, history, 9 months ago, In English

On February 24th starting at 2 PM US Eastern Time, the 2023 ICPC North America South Regional* will take place. This year, I will host the official contest broadcast, starting shortly before the beginning of the contest, on the ICPC Live YouTube channel. During the stream, I'll

  • Follow the scoreboard and discuss the contest
  • Discuss the problems and possibly solve a few of them (though unlike some of my past personal ICPC streams, solving problems and competing for the highest score will not be the main goal of the stream)
    • Around two hours into the contest, SecondThread and I will join each other's streams via Zoom to discuss the problems overlapping between the South Division and the Pacific Northwest regional.
  • Talk about the teams competing to advance to the 2024 ICPC North America Championship
    • In keeping with recent tradition, I'm hoping to share some fun facts about the competing teams. If you're the contestant or coach of a competing team and want to share a fun fact for me to read on stream while covering your team, please comment below or send me a DM (including your university and team name).
  • Answer questions asked by viewers
    • If you have a question you'd like me to answer on stream, feel free to comment below!
  • And more!

The South Division comprises the Mid-Atlantic, South Central, and Southeast North America regions. Teams from all three regions will compete on the same problemset and scoreboard, and the top few teams from each region, as well as potentially some extra teams from the combined scoreboard, will advance to the North America Championship. Historically, these three regions have been highly competitive, so we should be in for an exciting regional!

The contest website is at this link, and open mirrors for the two divisions will be held simultaneously with the contest: Division 1 and Division 2.

I hope to see you on the stream! Good luck to all of the participating teams!

*: Not a typo; due to ongoing schedule difficulties resulting from COVID, the 2023 regional is being held in 2024.

Full text and comments »

  • Vote: I like it
  • +66
  • Vote: I do not like it

By Geothermal, history, 11 months ago, In English

While using the virtual participation tool on AtCoder, the "Standings" page shows the result of the contest as of the end of the competition. This means there isn't a way to view the standings as of the current time in your virtual contest (i.e., when I'm one hour into my virtual participation I would want the standings to display the results from submissions made in the first hour of the contest). This prevents me from practicing using the scoreboard to inform my strategy. The "Virtual Standings" tab does adjust to the time of the virtual contest, but there are usually too few participants for this page to provide much useful information. Right now, the best solution I know of is to use CLIST, which does support viewing scoreboards partway through a contest, by just scrolling to the current time in my VC whenever I want to check the scoreboard, but this isn't an ideal solution because it involves using an external site and requires me to manually select the current time in my virtual participation. Is there any chance the AtCoder developers could change the virtual participation feature so that the standings page updates correctly during virtual contests?

(My apologies if this has been brought up already or if this functionality is already available somewhere that I've missed.)

Full text and comments »

  • Vote: I like it
  • +72
  • Vote: I do not like it

By Geothermal, history, 12 months ago, In English

Recently, registrant counts for Codeforces rounds have been reported incorrectly on the Contests page. For example, at the time of writing, Codeforces Round 911 is listed as having 2162 registrants:

Clicking the 2162 number to access the list of registrants, however, shows that only 302 people have actually registered:

Indeed, the registrant list indeed only contains 303 names, including one participant who registered while I was taking these screenshots :)

The same bug applies to CodeTON Round 7, which currently reports over 10,000 participants on the contest list but lists fewer than 8,000 contestants in its registrant list. Obviously, contest registration statistics are not a critical feature of the site, but even so, I wanted to make sure Codeforces staff are aware of this issue since I haven't seen anyone else report this bug yet.

Full text and comments »

  • Vote: I like it
  • +77
  • Vote: I do not like it

By Geothermal, history, 14 months ago, In English

While I very much enjoy Codeforces rounds overall, FSTs are by far my least favorite part of the Codeforces contest format. There are two reasons I dislike contests with weak pretests. The first is stylistic: I don't care very much about penalizing people who don't rigorously prove their solutions before submitting (which is the main justification for having weak pretests) and I think it's essentially impossible to FST these solutions without more frequently FSTing contestants who make minor mistakes in their code. (The vast majority of my FSTs are due to this sort of minor glitch rather than an error in the underlying idea).

The second is that FSTs introduce an element of hidden information to Codeforces rounds. During a contest, you have no way of knowing whether the authors have written strong pretests. If you act as though the pretests are weak, you will be at a disadvantage if pretests are actually strong (because if you knew pretests are strong, you could have e.g. submitted ideas that are difficult to prove but easy to code, submitted code without spending time checking every detail to make sure you haven't missed any silly details like int vs long long, etc). If you act as though the pretests are strong, you will be at a disadvantage if pretests turn out to be weak (because you didn't thoroughly check your solutions before submitting them). When pretests are weak, there is also ambiguity over which errors pretests will catch, and it's not obvious to me why some mistakes deserve to be penalized more than others by FSTing. Thus, the current system (FSTs are usually strong, but sometimes pretests are unintentionally weak and occasionally an author will write weak pretests on purpose) introduces a degree of variance that I believe both unnecessarily decreases the competitive value of Codeforces rounds.

In this post, I offer four proposals to improve the state of pretests on Codeforces. The first proposal serves to eliminate the hidden information problem described above. The remaining three focus on minimizing FSTs for authors who seek to write strong pretests. (These three proposals are generally commonly implemented already, and my purpose in posting this blog is to try to make them universal norms.) All of these proposals are supported by concrete examples of problems that I believe they would have improved.


Proposal One

If an author intentionally writes weak pretests, they should announce this decision in the contest announcement or in the statements of the affected problems.

Case study: Round 869, Div. 1 E. In this problem, the author chose to include only six pretests in order to discourage people from incorrectly guessing the solution idea and attempting proofs by AC. The issue is that (a) many minor implementation bugs seem to have been caught in the crossfire and (b) contestants had no way of knowing that pretests would be weak (only one large test was included, so unlike most problems around this difficulty level, typical implementation errors common in large test cases did not fail pretests).

I disagree with the stylistic choice to use weak pretests here, but I think it's a matter of judgment and I respect the author's right to decide to make pretests weak. However, to eliminate the hidden information problem, contestants should be told that pretests are weak in advance, especially because most contestants have come to assume that pretests for late problems in Div. 1 are strong.

I can think of a couple of ways to implement this, but my favorite is to add a note at the end of the problem statement reading "Note that the pretests for this problem are intentionally made weak. There are six pretests, including the sample test case." This lets contestants know that they are responsible for carefully confirming the validity of their solutions before submitting their code. This isn't a full solution to the hidden information problem, as there's still ambiguity over which specific errors the pretests are designed to catch, so I would oppose weak pretests even if this change was implemented, but I think this would significantly reduce the damage weak pretests do to the contest experience.

Sidenote: This proposal, including the language of the note above, was inspired by maroonrk's comment at this link.

The remaining proposals are specific to authors who intend to write strong pretests. (I think this applies to the strong majority of CF setters.)


Proposal Two

If a problem is expected to receive under 100 solutions, the pretests should be the same as the system tests. No exceptions.

Case study: Round 896, Div. 1 D. In this problem, there were 41 pretests, which I assume were meant to be strong (for example, I accidentally declared my variable storing the sum of the input degrees as an int and not a long long and this error was caught by pretests, even though the only way it should matter is due to undefined behavior or if the sum of the input degrees is not equal to $$$2N$$$ but is evaluated as $$$2N$$$ due to overflow). However, several people FSTed when it turned out that there were 37 additional system tests not included in pretests. Given that this problem received only about 25 solutions in-contest, running the full test suite on every submission would not have created a significant load for the contest servers, so these FSTs could easily be avoided by setting pretests equal to systests.

More generally, running a large number of test cases is not especially expensive when a problem receives very few solutions throughout the contest, so unless pretests are intentionally made weak, there is no reason for them to not coincide with the system tests.


Proposal Three

If pretests do not include all system tests, then multitesting should be used. If possible, pretests should include all possible small cases. Exceptions can be made for e.g. Div. 2 As if with only 2-3 pretests allowed, it is impossible to include all small cases, any larger edge cases, and larger maximal cases (though this is fairly rare because most modern D2As have small inputs and large numbers of test cases).

I'm ambivalent on whether exceptions should be allowed for e.g. problems where multitesting makes the complexity analysis more complex. However, any exceptions should require justifications approved by the contest coordinator (and if possible, these problems should just have pretests = systests).

Case study: CodeTON Round 5 D. In this problem, I FSTed due to a minor bug in my logic that would have been caught on a fair number of small cases (if I recall correctly, my solution fails most cases where there exists a zero-length edge incident to $$$N$$$). There wasn't any meaningful reason the problem didn't use multitesting, and incorporating multitests would have prevented this and similar FSTs.

Multitesting is a very powerful tool for preventing FSTs, and authors should use it (and take full advantage by including all small cases) unless they have a good reason not to.


Proposal Four

Assuming multiple meaningfully distinct test case generators are used, at least one test from each generator should be included in pretests. Authors should prepare solutions meant to fail on each category of test case and should ensure that each solution actually does fail pretests.

Case study: Global Round 19 E. In this problem, the answer is immediately NO whenever $$$n$$$ is not a Fibonacci number. No such cases are included in pretests, however, and my solution FSTed by not immediately returning after outputting NO. There was a generator in place to output non-Fibonacci numbers, but no such test was included in pretests (I think this was the result of last-minute communication issues during the preparation of the round).

I'm pretty sure that following this proposal is standard practice already. Aside from rounds where pretests were intentionally weak, I can't recall any authors intentionally not following this practice. The point of this proposal is not to substantially change the process of test case generation but to make sure that this condition is checked prior to the start of each round.


Thanks for reading! If you have feedback, please feel free to write a comment; I'd be happy to hear input from people with more problem preparation experience than me.

Full text and comments »

  • Vote: I like it
  • +502
  • Vote: I do not like it

By Geothermal, history, 16 months ago, In English

Very loosely inspired by some comments on Thoughts on Reaching Cyan? and my recent AMA.

Until now, I've avoided telling beginners to practice math before working on competitive programing problems because I haven't figured out a helpful way of doing so. This post is my attempt at telling grays to train math first in a way that might lead to improvement without an absurd time commitment.

Introduction

This section is largely motivation for why I'm proposing this training plan. If you just want to see the instructions for the training plan I'm proposing, you can skip this part. The first subsection in particular is not especially relevant to the rest of the post, but I wanted to have some documentation explaining why I still think the traditional "just solve problems" advice is good.

The Conventional Advice on Improvement

I'm frequently asked for advice on how to improve at competitive programming. Typically, I tell people to focus on solving problems and learning from the ones they can't do, and I think this is the most common advice most strong competitive programmers give to beginners asking for tips. Below, I'll explain why I like to give this advice.

Looking back at my own experience, there were two main points in my career when I was totally stuck and felt like I wasn't improving at all. The first was when I had no background in algorithms at all, and I needed to learn e.g. what big-O notation is, what DP is, what it means to binary search for the answer, etc. Most of the people who ask me for advice know all of these things, so this isn't especially relevant to their situations (but if you don't know these things, read the first eight chapters of the Competitive Programmer's Handbook, and chapter four of Competitive Programming 4 if you have it).

The second was when as a high school junior, I couldn't solve USACO Platinum problems and convinced myself that segment trees, centroid decomposition, and heavy-light decomposition were really important algorithms that I needed to learn to do well on USACO, and thus I decided to spend a bunch of time repeatedly rereading articles about these topics in order to try to understand them better. I generally understood the concept of a segtree, but I wasn't able to implement them from scratch and also didn't have a great sense of how to apply them to problems, meaning that this knowledge was absolutely useless in USACO contests (indeed, that year I failed to solve several easy segtree problems). Meanwhile, centroid decomposition and HLD hardly come up, and even if they had, I didn't really come to understand them well enough to apply them to actual problems. I occasionally spent time thinking about Platinum problems, but they were mostly way too hard for me, so I found it hard to focus on them for extended periods of time or even to understand solutions when I saw them.

That experience is why I think the "just solve problems slightly above your level" advice is actually pretty good--I think I would have stood a fair chance at making USACO Camp as a high schooler if instead of trying to learn advanced topics and solve problems that were way too hard for me, I solved problems from the CF archive that were a bit above my current rating. This also would have fixed my segtree issues because I would eventually have encountered actual segtree problems in the wild that were closer to my skill level, and by solving them or understanding their solutions I would have eventually become strong enough to work on USACO Platinum problems.

Indeed, I improved a lot faster once I started following this advice. After tanking USACO as a junior and failing to make camp, I no longer had a good reason to focus on USACO (it's very hard to make camp for the first time as a senior, and it would have been too late for college admissions anyway, which in retrospect I cared about much more than I should have). I ended up shifting my focus to Codeforces contests; I switched from Java to C++ and started solving problems from the CF archive. Despite not doing any actual USACO practice that year, I did much better on the USACO (plus, I actually started to understand how to use segtrees), solving five problems during the season and getting very close to six, compared to just two the year before.

Motivating a Different Approach

I feel unqualified to give good advice to people who are hardstuck gray/green because I was never stuck in low Elo on Codeforces. This is partly because I had some programming contest experience before I started CF, but I was still a relative beginner when I signed up for CF, so I think the larger factor is that I had five years of math contest experience at the time. Many of the specific topics from math contests transferred well to CF, but more than that, the general thinking process was very similar, and many steps within Codeforces problems (e.g. when proving that your solution is correct) are essentially just math contest problems.

Thus, if I wanted to tell people to follow the same training plan I did, I would have to advise them to spend five years doing math contests first. This is obviously infeasible, so until now I've mostly given up on telling people to do math before trying competitive programming.

However, I've seen a large number of people, especially beginners, who at least appear to be following the conventional advice I outlined above but who don't see the same improvement I did. My theory is that the conventional advice may not work for some people because they haven't developed the background in mathematical thinking necessary to succeed in programming contests. Empirically, many of the top competitive programmers had extensive experience in math contests first, but this may be correlation and not causation. As a result, the goal of this post is to see if focused practice on math problems can help beginners improve at competitive programming more efficiently without the sort of time commitment I put into math contests.

There are (at least) two reasons I think training on math problems might be more efficient than directly training on programming contest tasks. First, programming contest editorials are often not great at motivating the mathematical thinking necessary to come up with solutions, meaning that it may be difficult for beginners who don't have practice thinking about hard math problems to reflect on how they could have solved problems themselves. Second, while most easy programming contest problems involve mathematical thinking, they are usually tagged "math", "combinatorics", "number theory", etc, making it difficult to isolate specific problem-solving strategies or to find patterns across problems. In contrast, the plan I'm proposing uses a tool designed to systematically build up various mathematical problem-solving strategies.


The Plan

In short, this plan calls for systematic practice on math contest problems involving no programming. I'll outline a specific roadmap below.

Instructions

For this training plan, we'll be using Alcumus. Alcumus is a completely free platform provided by Art of Problem Solving that allows you to pick a topic and receive problems from that topic targeted to your skill level. I suspect that practicing problems from e.g. Mathcounts, AMC 10/12, and AIME would give similar results, but the advantage of Alcumus is that we can focus on problems in topics relevant to competitive programming and e.g. ignore all geometry.

Start by registering for an Alcumus account. Go to your settings and set Problem Difficulty to Insanely Hard (you can tune this down later if the problems you receive are much too hard for you, but generally you want to be doing the harder problems on Alcumus) and Advance Only on Mastery to Yes. When you click play, you will be presented with a problem. You'll have two tries to answer it correctly. In the top-right, you can change your focus topic, the topic which most of your problems will be drawn from. Your focus topic will automatically change when you master it (once the progress bar in the top right becomes blue).

Once you're signed up for Alcumus, work through the following steps.

  1. Algebra: Set your focus topic to "All Topics in Algebra" and solve problems until you're mostly receiving Level 20+ algebra problems. If these problems feel much too easy for you, skip to the next step. Otherwise, master all topics in algebra up to "Advanced Quadratics". The easy topics will not take long to master (you should finish them in ~5 problems each). Algebra is less directly relevant to competitive programming problems, but it's fundamental to the problem-solving process and to the other fields of math we'll work on. If high-level problems in the early sections of algebra feel too difficult, master all topics in prealgebra first. Finish all topics up to "Factoring General Quadratics" before moving to the next steps. The remaining topics ("Sum and Product of Roots" through "Advanced Quadratics") can be done concurrently while working on the next steps.
  2. Counting and Probability: Master all topics in Counting and Probability. Combinatorics is the field of math that comes up most in competitive programming and is the most relevant to the style of thinking in programming contests, so this is where you should see the most improvement. Master topics through "Distinguishability" before starting the next step; the remaining topics can be done concurrently with the next step.
  3. Number Theory: Master all topics in Number Theory. These topics come up frequently in programming contests and are also valuable for building general numerical reasoning skills.

You can ignore all topics in Geometry, Intermediate Algebra, and Precalculus.

Finishing these steps should give you a reasonable baseline level of mathematical thinking skills (comparable to where I was when I started competitive programming). If you want to keep solving math problems, you can work on past problems from the AMC 10, AMC 12, and AIME. My advice is to start by working on algebra, combinatorics, and number theory problems numbered 1-5 on the AIME. Once you can solve these consistently, move on to combinatorics and number theory problems numbered 6-10, then work on combinatorics problems numbered 11-15. (It may be that AIME problems feel too difficult, in which case you should start with AMC 10/12 problems.)

You should continue to participate in programming contests and upsolve problems from those contests while working through this plan; it is designed to supplement and not replace programming contest training.

Time Commitment

The time commitment for this plan depends on your prior math background. I think most people should be able to master at least two topics per day in algebra and one per day in C&P/NT, with easier topics going faster. With about thirty topics per section, this should lead to a completion time of at most around two months. I'm not especially confident in this estimate, though, and you might work through it much faster or much slower than I did (which is totally fine). (However, if you e.g. complete the plan in under two weeks, it was probably too easy for you unless you spent many hours on it in that timeframe.)

Target Audience

I think this plan is most likely to help grays, and maybe greens. People above cyan probably already have enough mathematical reasoning skills that the problems on Alcumus will be too easy and thus won't benefit from this plan. I'm also not sure this will be useful if you've already completed a high-quality discrete math course, which would have taught many of the same concepts.

Obviously, this training plan will not teach you any programming. If you want to advance past gray, you should also know big-O notation, the basics of programming in your language of choice, and the STL data structures or their equivalent in your language. As mentioned previously, this plan is not a substitute for programming contest practice.

Additionally, I think this plan will be much more helpful in competitive programming than in job interviews because technical interview/LeetCode problems don't heavily emphasize mathematical problem-solving. I don't recommend this plan if your main goal is to prepare for technical interviews.

Some Thoughts on the Plan

I think this plan has several advantages as a way of practicing math for competitive programming:

  • It is completely free. Many of the other math resources I'd recommend (including the AoPS books) are not, and their costs may be prohibitive for some people.
  • It is very systematic. There are clear instructions and a specific set of benchmarks you need to achieve to finish the program.
  • Alcumus isolates specific math problem-solving skills. For those with limited math background, I think it can be very helpful to e.g. isolate casework, constructive counting, and complementary counting as separate counting strategies rather than just trying combinatorics problems in general.

My main concern about these instructions is that I'm not sure the Alcumus problems are difficult enough to be useful for this purpose. I've tried to compensate for this by suggesting a way of organizing practice on AIME problems, which I'm hoping will be useful for anyone who wants to continue working specifically on math problems.

I also am not confident that training on math problems will ultimately be more efficient for beginners than training directly on programming contest problems. To my knowledge, this theory has not been tested before, hence why I'm writing this blog :)

I've described the plan as "highly experimental" because I have very little sense of whether this plan is actually the best way for beginners to start their training. I think those who complete this plan will almost certainly experience some improvement, but I'm not sure whether this improvement will come faster than if they had instead spent an equivalent amount of time on programming contests. However, I think there's a small but nontrivial chance this plan will lead to significant improvements for a sizable population of beginning competitive programmers who are struggling to train by solving programming contest problems. I'm posting the plan here so that hopefully some people will try it out and we can see if it works for them.

If you try this training plan, please feel free to post updates on your progress below! Let me know if you have thoughts or suggestions, and I'll be eager to see whether you see improvement within a few months of finishing.

Full text and comments »

  • Vote: I like it
  • +248
  • Vote: I do not like it

By Geothermal, history, 16 months ago, In English

After reaching 3200 in yesterday's contest, now feels like as good a time as any to hop on this bandwagon. Feel free to ask questions in the comments and I'll respond to as many as I can.

A few sidenotes:

Thoughts on Practicing

In my streams, by far the most common questions I receive are variants of "how can I improve at competitive programming?" I'm happy to offer advice if people have specific questions about training related to their particular circumstances, but I wanted to write down some general thoughts so that I can redirect people who ask about general practice strategies (in this AMA or in my future streams) to this post.

I endorse most of the advice given at this link. In short, I recommend solving problems around or slightly above your skill level (I like to go to clist.by and choose Codeforces problems with luck between around 25% and 75%). If you run out of ideas and get totally stuck, you can look at the editorial--usually I try to just read the first part I haven't figured out already and then solve the rest of the problem on my own. (I don't have a specific timeframe for looking at the editorial; usually I give up whenever I feel like I haven't made any progress for a while and am unlikely to come up with ideas in the future, or when I'm bored of the problem.) I almost always implement solutions to the problems I attempt to make sure I haven't missed anything. (I think this is especially important after reading editorials in order to confirm that I really understood the solution.)

If the solution to a problem relies on an algorithm you haven't seen before, I suggest reading about it in e.g. CPH, CP4, cp-algorithms, or USACO Guide. Afterwards, solve a few related problems to make sure you've picked up the general idea. I don't recommend spending much time learning algorithms before related problems have come up in your training, since this strategy risks spending too much time on ideas that aren't relevant to the problems you need to solve in-contest. (One exception is that if you're just getting started, you should learn the basics of programming--in C++, this includes things like conditionals, loops, functions, etc, as well as the STL data structures, but not memory management, advanced topics related to e.g. pointers, or any object-oriented programming--and big-O complexity before doing anything else because these ideas are foundational to almost every problem you'll solve.)

Gauging Interest in Lockout Tournament

I'm considering hosting another Lockout tournament similar to the 2020 Lockout Championship. If you'd be interested in competing in such a tournament, please click here:

Edits

Edit 7/31: I've received a lot of questions where people have asked me to review their profiles to see if they're doing anything wrong. If you've asked such a question, I'll take a quick look at your profile but probably won't have time to review it in depth aside from looking for obvious issues (e.g. not doing contests or upsolving consistently, exclusively practicing problems beneath your level, submitting lots of WAs on test 1, etc). I also probably won't have especially helpful responses to questions of the form "How do I reach X rank?" because my practice strategy outlined above is applicable to all ranks.

Edit 8/17: Thanks for all the questions, everyone! Another quick update: I'm no longer going to review individual profiles because doing so is relatively time-consuming and I don't think I've had any new comments or advice to offer in a while. Instead, consider reviewing the advice at the top of this post, at this link, and in the comments on this AMA. (I'm happy to continue answering other questions as long as people have them.)

Full text and comments »

  • Vote: I like it
  • +206
  • Vote: I do not like it

By Geothermal, history, 19 months ago, In English

Posting since there isn't a thread yet. Discuss the GCJ Farewell Rounds here.

Full text and comments »

  • Vote: I like it
  • +122
  • Vote: I do not like it

By Geothermal, history, 2 years ago, In English

In the past, most elite programming competitions (e.g. IOI, ICPC, TopCoder Open, AtCoder World Tour, GCJ, Hacker Cup) have held their final stages in person. In 2020 and 2021, however, most of these contests were moved online. This was unfortunate, but understandable given the circumstances: in 2020, the state of the pandemic clearly did not allow for extensive travel, and while the dangers decreased by late 2021 as a result of rising vaccination rates around the world, there was still sufficient uncertainty to justify holding events online in 2021.

This year, however, both Google Code Jam and Meta Hacker Cup have committed to hosting online finals. [UPD: Meta has retracted its commitment to a virtual finals, so it seems that the plans for MHC are still up in the air.] I believe that this is a highly unnecessary precaution--there is plenty of evidence showing that it is possible to successfully run an in-person event in 2022. Notably, TopCoder has committed to hosting the TCO finals in person in November. Moreover, as a recent/particularly relevant example, the ICPC North America Championship was hosted in person a few weeks ago; while I'm aware of a handful of COVID cases that occurred at the contest, my understanding is that there was no especially large outbreak, in spite of the fact that the NAC had nearly 10x as many participants as either GCJ or Hacker Cup finals and that there were several precautions ICPC chose not to take (e.g. masks were not required except for during the competition itself; participants were not tested for COVID before/after the contest). If GCJ/Hacker Cup were to take these additional precautions, I find it hard to believe that the COVID risks would be particularly significant.

I participated in the online FHC finals in 2020. All in all, the Facebook staff did a great job transitioning the event to the online format (some highlights included adding some unannounced prizes, including Oculus Quests for all the finalists, and organizing card games after the contest). Even so, there's simply no comparison between online and in-person events. Moreover, the opportunities to compete in-person are so limited, especially for those who have run out of ICPC eligibility, that each additional in-person event contributes huge value in terms of community-building, particularly because many finalists in these events have reached a point at which only a few others in their country are as passionate about/skilled at competitive programming as they are. Finally, the in-person competition experience is the biggest difference between these events and the competitions taking place on CF, AtCoder, and CodeChef every week, and I suspect that if this experience doesn't return, these contests will become much less exciting to top competitors and will thus lose prominence in the community over time.

That said, I'm concerned that if in-person finals don't return within the next year or two, online finals will become the new default, and it's unlikely that in-person competition will ever return to these events. I would view this as a huge loss to the community, and as a result, I'd like to strongly encourage the organizers of GCJ/Hacker Cup to return to the in person format for their 2023 finals.

A few parting thoughts:

  • My understanding is that there's been at least some effort to hold in-person finals this year, and I suspect that one primary reason these events haven't been held in person is difficulty finding the necessary support from events staff rather than a lack of effort put in by the contest organizers themselves. (Shout-out to the problem authors/others involved in preparing these events; my intent is not to call any of you out, and I appreciate what you do to make these contests happen!) I'm not sure how best to convince these organizations that returning to in-person competition is worth the effort; if anyone has any thoughts on this, I'd be happy to hear them.
  • If in-person competition returns, I'd strongly encourage Google/Meta to implement a remote option for people unable to attend due to travel restrictions, underlying health concerns, the ongoing war in Ukraine, etc.

Thanks for reading--please feel free to share your thoughts below!

Full text and comments »

  • Vote: I like it
  • +374
  • Vote: I do not like it

By Geothermal, history, 3 years ago, In English

Hi, Codeforces!

Recently, Wind_Eagle wrote a blog with his thoughts on practicing in which he asserted the standard advice to solve more problems is not a reliable way for gray/green users to improve. I don't agree with some of his conclusions: for example, he asserts that it is difficult/impossible to practice at the gray/green level without a coach; as anecdotal evidence, I have never had a coach coordinating my competitive programming practices, and I think many other top competitive programmers have also never had access to coaching. However, I agree that the standard "just solve problems" advice is not working, and I don't know what is the best advice to give to gray/green users on how best to practice, so I wanted to start a discussion of what has worked for people in the past.

Therefore, I have a request for everyone reading this blog: if you are cyan or above, please post a comment sharing what you did to advance to cyan (or perhaps even blue). Everyone is welcome to comment, no matter your current rank and no matter when you made the climb to cyan.

I'm hoping it should be interesting to see if there are any patterns in the responses (e.g. how many people had a coach? What sort of mathematical preparation did most people have? What resources did people use? How do all of things vary when we compare people who started competitive programming many years ago to people who started today?). I'm primarily interested in hearing what you actually did, not what you would recommend to someone else who's currently gray/green (though if you want to give advice after sharing your experience, please feel free to do so). The reason I'm writing this blog is because I think the advice we're giving is generally not working, so I'd like to learn more about what actually worked for others in order to get a better sense of what the "standard" path to cyan is (and how it can perhaps be made more efficient).

I'll start by posting a comment about my own experience and some thoughts on why training as a new competitive programmer is so difficult in the current landscape. I'm looking forward to reading what you have to say!

Full text and comments »

  • Vote: I like it
  • +313
  • Vote: I do not like it

By Geothermal, history, 3 years ago, In English

Hi, Codeforces!

I'm writing to ask if there's any way Round #745 can be rescheduled in light of the conflict between the round and the ICPC World Finals Invitational Division. For those unaware, teams unable or unwilling to travel to Moscow to attend the 2020 ICPC World Finals in person are competing remotely on September 30th. The two competitions completely overlap: the WF Invitational Division begins 90 minutes before the Codeforces round starts and ends 90 minutes after the round ends.

For what it's worth, the clash affects a fairly sizable population of participants: a total of 57 teams are currently slated to compete in WF remotely, so this affects around 170 people. Moreover, as ICPC WF qualifiers, these people are particularly likely to benefit from access to Div. 1 rounds, which generally aren't all too common. Unfortunately, the series of dates surrounding the 30th are relatively packed with contests, but here are a few scheduling options that might make sense (i.e., as far as I'm aware no other contests are being held during the CF timeslot on these dates):

  • Move CF Round #745 to Sunday, September 26th or Monday, September 27th
  • Swap Rounds #744 and #745 (I think it's not a problem if a Div. 3 round clashes with WF, since very few WF participants are in Div. 3 anyway)
  • Move Round #745 to between Sunday and Wednesday of the following week (October 3rd — 6th)

I think moving Round #745 to October 3rd may make the most the sense, given that holding the round on a weekend may make it easier for those interested to participate. That said, I'm open to other solutions (or to keeping the round on the 30th if need be, but for obvious reasons I'd prefer a different date!).

If anyone has thoughts or suggestions, feel free to share them below.

Full text and comments »

  • Vote: I like it
  • +207
  • Vote: I do not like it

By Geothermal, history, 4 years ago, In English

As the editorial for round #709 has yet to be released, I thought I'd write up and post my solutions to the problems (with the exception of Div. 1 F). My code may not be especially useful, since it's a bit messier than necessary in some cases, but I'm hoping that the written explanations will be fairly helpful. Feel free to comment if you have any questions!


2A — Prison Break

By simply playing around with a few small test cases, you might guess that the answer is always equal to $$$ab$$$. This intuition is correct; let's prove it formally.

Consider a graph with $$$ab+1$$$ vertices including one vertex representing each cell of the prison, one vertex representing the outside world, and an edge for each wall of the prison connecting the two vertices on either side of the wall. Then, observe that a set of walls we can remove such that the outside world is accessible from every cell is equivalent to a set of edges that spans our graph (as if every cell is reachable from the outside world, then all cells must be in the same connected component). In particular, a minimal set of walls we can remove is equivalent to a spanning tree of our graph, and since our graph has $$$ab+1$$$ vertices, its spanning trees have $$$ab$$$ edges.

Consequentially, we can simply output $$$ab$$$.

Time Complexity: $$$\mathcal{O}(1)$$$ per test case. Click here for my solution.


2B — Restore Modulo

Let $$$d_i = a_{i+1} - a_i.$$$ The key observation is that $$$c$$$ must be congruent to $$$d_i$$$ modulo $$$m$$$ for all $$$i$$$. Moreover, note that $$$c$$$ cannot be congruent to both $$$x$$$ and $$$y$$$ modulo $$$m$$$, if $$$x$$$ and $$$y$$$ are both nonnegative or if they are both negative. Consequentially, if $$$d_i$$$ takes on multiple different positive values or multiple different negative values, the answer is no.

Now, let's say $$$d_i$$$ takes on at most one positive value and at most one negative value. We do casework.

First, suppose $$$d_i$$$ takes no positive values and no negative values. Clearly, this occurs only when $$$n = 1$$$, at which point $$$m$$$ can be arbitrarily large and the answer is zero.

Second, suppose $$$d_i$$$ takes on a positive value but no negative values. Then, $$$m$$$ can take any value greater than $$$a_n$$$, while $$$c = d_i$$$ for all $$$i$$$. Therefore, since $$$m$$$ is unbounded, the answer to the problem is zero.

Third, suppose $$$d_i$$$ takes on a negative value but no positive values. Then, $$$m$$$ can take any value greater than $$$a_1$$$ with $$$c = m - d_i$$$ for all $$$i$$$. Therefore, $$$m$$$ is unbounded, so the answer to the problem is zero.

Finally, suppose $$$d_i$$$ takes on a positive value and a negative value. Let the positive value be $$$x$$$ and the negative value be $$$y$$$. Then, since $$$d_i = x = y \mod m$$$, we must have $$$x - y = 0 \mod m$$$. Therefore, $$$m$$$ cannot be larger than $$$x-y$$$. We can then verify that $$$x-y$$$ is a valid choice of $$$m$$$ with $$$c = x$$$, giving us our desired answer.

Having addressed every case, we're finished here.

Time Complexity: $$$\mathcal{O}(n)$$$ per test case. Click here for my solution.


2C/1A — Basic Diplomacy

Initially, let's pick $$$f_{i1}$$$ as our teammate on day $$$i$$$, for all $$$i$$$. (In other words, we choose the first available friend from the given list on each day.) Obviously, this may not provide a valid set of selections, as some friend may be chosen more than $$$\lceil \frac{m}{2} \rceil$$$ times.

Note that because $$$2 \lceil \frac{m}{2} \rceil \geq m$$$, it is impossible for two friends to both be chosen more than $$$\lceil \frac{m}{2} \rceil$$$ times. If no friend is chosen this many times, we can simply output our initial selections and be done. Otherwise, there must be exactly one friend $$$x$$$ who is chosen more than $$$\lceil \frac{m}{2} \rceil$$$ times.

We give a greedy algorithm that will turn this into a valid set of selections, if doing so is possible. Iterate over all days once again. If friend $$$x$$$ is chosen on day $$$i$$$ and at least two friends are available on this day, replace friend $$$x$$$ with $$$f_{i2}$$$ in our selections. Terminate the algorithm once we've iterated over all the days or friend $$$x$$$ no longer appears more than $$$\lceil \frac{m}{2} \rceil$$$ times.

We conclude by proving that this algorithm is optimal. We clearly see that this algorithm will ensure that friend $$$x$$$ appears at most $$$\lceil \frac{m}{2} \rceil$$$ times unless there are more than $$$\lceil \frac{m}{2} \rceil$$$ days on which only friend $$$x$$$ can be chosen, in which case it is impossible to choose friend $$$x$$$ at most $$$\lceil \frac{m}{2} \rceil$$$ times, and the answer is no.

Then, we note that this algorithm will never lead to a different friend being chosen more than $$$\lceil \frac{m}{2} \rceil$$$ times. Suppose for contradiction that this algorithm causes friend $$$y$$$ to be chosen more than $$$\lceil \frac{m}{2} \rceil$$$ times. Then, immediately before this occurs, friend $$$y$$$ must have been chosen exactly $$$\lceil \frac{m}{2} \rceil$$$, while friend $$$x$$$ must have been chosen more than $$$\lceil \frac{m}{2} \rceil$$$ times. However, the sum of these two values must then be greater than $$$m$$$, which is impossible since we choose only $$$m$$$ friends in total, providing our desired contradiction.

Therefore, if it is possible for friend $$$x$$$ to be chosen at most $$$\lceil \frac{m}{2} \rceil$$$ times, this algorithm will ensure that this happens, and it will never cause another friend to be chosen more than $$$\lceil \frac{m}{2} \rceil$$$ times. Therefore, if a solution exists, this algorithm will find it, so it is optimal, as desired.

Time Complexity: $$$\mathcal{O}(n + m + \sum k)$$$ per test case. Click here for my solution.


2D/1B — Playlist

We maintain two sets. First, we maintain a set containing all songs that have not been removed from the playlist. (We'll refer to this as set one.) Then, we maintain a set containing all songs such that the GCD of their genre with the song of the previous song in the current playlist is equal to $$$1$$$. (We'll refer to this as set two.) Both of these sets can be initialized efficiently using the initial array.

Then, maintain the index $$$nxt$$$ of the next song we must listen to. While set two is non-empty, the next song we must remove from the playlist is the next element of set two after $$$nxt$$$. Call this song $$$rem.$$$ Then, we must remove $$$rem$$$ from both sets. Afterwards, we must update set two. The trick here is that we only need to update one element of set two with each removal: we only need to update whether the next element in set one after $$$rem$$$ is in set two, which occurs if and only if it has GCD $$$1$$$ with the previous element in set one.

We repeat this process until set two is empty. It will then give us a complete list of all the songs that must be removed from the playlist, in the order in which we remove them.

Time Complexity: $$$\mathcal{O}(n \log n)$$$ per test case. Click here for my solution.


2E/1C — Skyline Photo

We'll need a bit of heavy machinery to solve this problem. We'll maintain a segment tree of pairs $$$(dp_i, val_i)$$$ that allows us to perform the following operations: - Set the value of $$$dp_i$$$ for some $$$i$$$. - Set the value of $$$val_i$$$ to some fixed value for all $$$i$$$ in a range. - Find the maximum value of $$$dp_i + val_i$$$ over all $$$i$$$ in a given range.

Now, let's actually define these variables. We'll iterate over the buildings in reverse order. Let $$$dp_i$$$ be the maximum beauty of a set of photos covering buildings $$$i$$$ through $$$n$$$. For simplicity, define $$$dp_{n+1}$$$ to be zero. Then, we'll maintain $$$val_i$$$ such that as we iterate through building $$$j$$$, we have that $$$val_i$$$ is the beauty of the shortest building from $$$i$$$ to $$$j-1$$$. Note that $$$val_i$$$ is then the beauty of a photo consisting of buildings $$$i$$$ through $$$j-1.$$$ The last piece of machinery we'll need is a monotonic stack, which will contain a series of buildings in decreasing order (from the bottom) of position and increasing order of height. We'll use this to find the next building shorter than each building we process.

Let's begin iterating over the buildings. When we reach a building, pop all taller buildings from the stack. Once we're finished, the building on top of the stack will be the next building shorter than $$$i$$$. Let this building have position $$$j$$$. Then, update $$$val_k$$$ to $$$b_i$$$ over the interval $$$[i+1, j].$$$ Finally, let $$$dp_i$$$ be the maximum value of $$$dp_j + val_j$$$ over the interval $$$[i+1, n+1]$$$. Each choice of $$$j$$$ accounts for the case in which our first photo contains buildings $$$i$$$ through $$$j-1$$$.

After iterating over all our buildings, our answer is $$$dp_1$$$.

Time Complexity: $$$\mathcal{O}(n \log n)$$$. Click here for my solution.


2F/1D — Useful Edges

Let's start with some precomputation. First, using Floyd-Warshall, let's compute the shortest paths between all pairs of vertices. Let the shortest path from $$$i$$$ to $$$j$$$ have length $$$d_{ij}.$$$ Then, let's compute an array $$$L_{ij}$$$ representing the maximum length of a path from vertex $$$i$$$ to vertex $$$j$$$ such that this path can be extended to form a path satisfying one of our conditions $$$(u, v, l).$$$ We initialize this array such that $$$L_{uv} = l$$$ for all queries $$$(u, v, l).$$$ Then, observe that for all $$$i, j,$$$ and $$$k$$$, we have $$$L_{ij} \geq L_{ik} - D_{jk}$$$, since a path from $$$i$$$ to $$$j$$$ can be part of a valid path from $$$i$$$ to $$$k$$$ if it can be extended using the shortest possible path from $$$j$$$ to $$$k$$$ to reach a length less than $$$L_{ik}.$$$ We can therefore compute the final array $$$L$$$ using another application of Floyd-Warshall, taking the edges to have negative length.

Now, let's find the useful edges. Iterate over all root vertices $$$r$$$. Then, iterate over each edge $$$(u, v),$$$ and let this edge have length $$$x$$$. Then, if $$$D_{iu} + x \leq L_{iv}$$$ or $$$D_{iv} + x \leq L_{iu}$$$, the edge $$$(u, v)$$$ can be marked as useful. In the first case, $$$D_{iu} + x$$$ is the minimum length of a path starting at $$$i$$$ and using the edge in the direction from $$$u$$$ to $$$v$$$ to finally reach $$$v$$$, and we know from our computation above that $$$L_{iv}$$$ is the maximum length of a path starting at $$$i$$$ and ending at $$$v$$$ such that this path can be extended to satisfy one of our queries. The second case proceeds similarly.

Once we've finished this process, we can output the list of valid edges.

Time Complexity: $$$\mathcal{O}(n^3 + nm + q)$$$. Click here for my solution.


1E — Vabank

Let's start with an observation. Let's let our account balance equal $$$B$$$, and let $$$H$$$ be the highest query we've made so far without exceeding the limit. Then, we should never make a query greater than $$$\max(B, H)$$$, as we'll go bankrupt if $$$M$$$ is between $$$\max(B, H)$$$ and our query.

Consequentially, we should clearly start by querying $$$1$$$. Moreover, this observation motivates us to increase $$$H$$$ as quickly as possible, both to bound our answer and to maximize our ability to increase our account balance quickly. Therefore, let's begin by querying $$$1, 2, 4, \cdots$$$ until one of these values exceeds $$$M$$$ or until we would need to query a value greater than $$$10^{14}$$$. These are the maximum values we can query at any given time, since each query is equal to our current balance, which is greater than the highest previous query we've made. Since $$$\log_2 10^{14} \approx 46.5$$$, this will take at most $$$47$$$ queries.

Now, we've bounded the answer in a range of the form $$$[2^k, 2^{k+1} - 1].$$$ At this point, we might be tempted to binary search for $$$M$$$. Suppose at any given time that we've bounded $$$M$$$ in the range $$$[lo, hi].$$$ Then, we might query $$$lo$$$ until our balance exceeds $$$\frac{lo+hi}{2}$$$, at which point we query $$$\frac{lo+hi}{2}$$$ and update our range based on whether this query is successful. However, this is not efficient enough. If all queries of this form fail, then we take about $$$\log_2 10^{14}$$$ of these queries, and we must also query $$$lo$$$ at least once for each of these queries (in some cases, twice) in order to ensure our balance is at least $$$\frac{lo+hi}{2}.$$$ Therefore, this process consumes a number of queries on the order of $$$2 \log_2 10^{14}$$$ queries. After our step above, it takes $$$3 \log_2 10^{14}$$$ queries, which vastly exceeds our query limit.

We realize here that we need more queries to get an upper bound on our range than a lower bound, since when a query fails, it costs us money (and thus forces us to query $$$lo$$$ again), while when a query succeeds, it increases our balance (and thus decreases the extent to which we'll need to query $$$lo$$$ in the future). Therefore, as an improvement over the above idea, we might try doing a similar approach in the style of binary search, but instead of taking our query to be $$$\frac{lo+hi}{2}$$$, we might take the query to be $$$lo + r(hi-lo)$$$ for some fixed ratio $$$r$$$. By taking $$$r < \frac{1}{2}$$$, we try to establish relatively tight lower bounds so that it will take fewer lower bounds to find our answer, thus avoiding the $$$3 \log_2 10^{14}$$$ worst-case complexity of the above solution.

This is the approach that many people attempted in-contest; unfortunately, some quick math shows that this idea is probably hopeless. With about $$$60$$$ queries remaining after our initial step, we have the capacity to perform roughly $$$30$$$ upper bounds, since each lower bound requires at least one additional operation. (In practice, the maximum number of upper bounds turns out to be slightly lower, but this is a good enough approximation for our purposes.) Since our range can contain up to $$$2^{45}$$$ values initially, we must have $$$r^{30} \leq \frac{1}{2^{45}}$$$, which implies $$$r \leq \frac{1}{2^{3/2}} \approx 0.35.$$$ However, when $$$M = hi$$$, we must also be able to identify $$$M$$$ using at most about $$$60$$$ lower bounds, which means that we must have $$$(1-r)^{60} \leq \frac{1}{2^{45}}.$$$ This gives $$$1-r \leq \frac{1}{2^{3/4}} \approx 0.59$$$, which implies $$$r \geq 0.41.$$$ No $$$r$$$ satisfies both of these conditions, implying that any choice of $$$r$$$ is doomed to fail (though it may be quite hard to find a value of $$$M$$$ to break any given solution using this approach, leading to a number of unfortunate FSTs).

For the right value of $$$r$$$, this approach intuitively feels relatively optimized (and, in fact, it's quite close to passing), so let's figure out what precision we haven't yet squeezed out of this algorithm. The key idea is that when our queries are successful, they actively help us perform future queries--that is, after successfully finding a new lower bound for our answer, performing upper bounds becomes cheaper in the future. As a result, after each lower bound, we can afford to increase $$$r$$$ slightly to account for the fact that we won't have to perform as many $$$lo$$$ queries in order to get a given number of lower bounds.

Let's now state the correct solution formally. Suppose that at some point, we know that $$$M$$$ is in the range $$$[lo, hi]$$$ and we have $$$Q$$$ queries left. Moreover, let's say that $$$\frac{B}{lo} = K$$$. At this point, we can perform approximately $$$K$$$ upper bounds without needing to query $$$lo$$$ in order to restore our balance. Therefore, the total number of upper bounds we can perform is $$$\frac{Q+K}{2}.$$$ As a result, if all our future queries are unsuccessful, we must have $$$r \leq \frac{1}{(hi-lo+1)^{2/(Q+K)}}$$$. We take this value of $$$r$$$, or $$$\frac{1}{2}$$$ if $$$r > \frac{1}{2}$$$, and query $$$lo + r \cdot (hi-lo).$$$ Finally, we update $$$lo$$$ or $$$hi$$$ accordingly.

Unfortunately, this solution is a little finicky; I had to perform a bit of tweaking to get my solution to pass, but once configured appropriately, this approach consistently passes within the given query limit.

Time Complexity: $$$\mathcal{O}(q)$$$ per test case, where $$$q$$$ is our query limit. Click here for my solution.

Full text and comments »

  • Vote: I like it
  • +207
  • Vote: I do not like it

By Geothermal, history, 4 years ago, In English

A — Determinant

Just perform the requested computation. There really isn't much to do here.

Time Complexity: $$$O(1).$$$ Click here for my submission.


B — Quizzes

A simple brute-force simulation works here. Iterate over the characters of $$$S$$$ from left to right, adding one to the answer if the current character is an o and subtracting one if it's an x (while ensuring that the score never drops below zero).

Time Complexity: $$$O(N).$$$ Click here for my submission.


C — Super Ryuma

The key observation is that the answer will always be at most three. We will provide a constructive proof. Suppose that we want to go from $$$(x_1, y_1)$$$ to $$$(x_2, y_2)$$$. In our first move, we move to any point $$$(x_3, y_3)$$$ such that $$$x_3 + y_3$$$ has the same parity as $$$x_2 - y_2.$$$ Then, we will finish with two diagonal moves. We want to construct a point on one diagonal containing $$$(x_3, y_3)$$$ and another containing $$$(x_2, y_2).$$$ Thus, we seek a point $$$(x_4, y_4)$$$ with $$$x_4 + y_4 = x_3 + y_3$$$ and $$$x_4 - y_4 = x_2 - y_2.$$$ Some quick algebra confirms that this system has a solution as long as $$$x_3 + y_3$$$ and $$$x_2 - y_2$$$ have the same parity. So, we can move from $$$(x_1, y_1)$$$ to $$$(x_3, y_4)$$$ to $$$(x_4, y_4)$$$ to $$$(x_2, y_2)$$$, as desired.

Now, we can do casework to figure out when we can finish the problem in fewer moves. Obviously, a zero-move solution exists only if the two points are equal. We can use the described conditions to determine whether a one-move solution exists. A two-move solution exists by similar logic to the above if $$$x_1 + y_1$$$ has the same parity as $$$x_2 - y_2.$$$ Otherwise, a solution consisting of two diagonal moves cannot exist (since moves along diagonals maintain the parity of $$$x+y$$$), so we instead brute-force over all possible moves to cells within three units and check if a one-move solution exists starting at any of those points. If so, then we have a two-move solution. Otherwise, we've exhausted all possibilities, at which point we know that three moves is the best possible solution.

Time Complexity: $$$O(1).$$$ Click here for my submission.


D — Increment of Coins

We use Dynamic Programming. Let $$$dp_{ijk}$$$ be the expected number of operations necessary, supposing that we start with $$$i$$$ gold coins, $$$j$$$ silver coins, and $$$k$$$ gold coins. Then, if $$$i, j,$$$ or $$$k$$$ equals $$$100$$$, we know $$$dp_{ijk} = 0.$$$ Otherwise, we have

$$$dp_{ijk} = 1 + \frac{i}{i+j+k} dp_{(i+1)jk} + \frac{j}{i+j+k} dp_{i(j+1)k} + \frac{k}{i+j+k} dp_{ij(k+1}.$$$

Then, our answer is $$$dp_{ABC}.$$$

Time Complexity: $$$O(N^3),$$$ where $$$N$$$ is the number of coins needed to end the game. Click here for my submission.


E — Third Avenue

We represent this problem as a graph. First, add an edge between every two vertices adjacent in the grid. Then, we might be tempted to add edges between every pair of vertices marked with the same letter, but this could result in $$$O(N^2M^2)$$$ edges in the worst case. So, instead, we create a new helper vertex for each letter. Then, every cell marked with a letter has an edge of length zero to its letter's helper vertex, and each letter's helper vertex has an edge of length one to each vertex marked with that letter. This ensures that each vertex has a path of length one to each other vertex marked with the same letter while creating just $$$O(NM)$$$ edges in the graph.

Now, what's left is a fairly simple 0-1 BFS problem. There are a number of ways to implement this; I think the easiest is probably to use a deque, pushing vertices reached from an edge of length $$$0$$$ to the front of the deque and edges of length $$$1$$$ to the back. (This ensures that the crucial property that we visit vertices in increasing order of distance is maintained.)

Time Complexity: $$$O(NM).$$$ Click here for my submission.


F — Programming Contest

We use a meet-in-the-middle approach. Split the array $$$A$$$ into two halves. Then, construct sets of all total times that can be constructed by solving some set of the problems within each half-array. (There will be $$$2^{N/2}$$$ elements in each set.)

For each element $$$X$$$ of the first set, find the greatest element of the second set less than or equal to $$$T - X$$$, and let this element be $$$Y$$$. Note that $$$Y$$$ is the maximum amount of time we can spend on problems from the second half-array, given that we spend $$$X$$$ minutes on problems from the first half-array. Then, our answer is the maximum value of $$$X+Y$$$ over all possible choices of $$$X$$$.

Time Complexity: $$$O(N 2^{N/2}).$$$ Click here for my submission.

Full text and comments »

  • Vote: I like it
  • +45
  • Vote: I do not like it

By Geothermal, history, 4 years ago, In English

Hi all!

I'm pleased to announce the 2020 Lockout Championship, a lockout tournament that will be held over the next few weeks on Discord.

A full description of the tournament and a link to register are on the tournament Discord, but the gist is pretty simple: I'm running an open Lockout tournament for all interested competitors. All interested players will be allowed to compete. Matches will be run on the Discord Lockout Bot. The tournament will consist of a double elimination bracket. There are separate brackets for lower-rated competitors so that everyone has access to balanced matches, but anyone is welcome to join a bracket above their rating, should they want to do so. More specifics are on the Discord server.

For the first few rounds, matches will be scheduled between you and your opponent, but when we're down to the last few participants, the final rounds will be held live (and will hopefully be streamed on YouTube).

If you're interested in competing (or spectating), please join the tournament Discord at this link. Looking forward to seeing y'all there!

Registration is open on the server; to compete, you must register by 5 PM UTC on Saturday, August 15th.

If you have any questions (after reading the detailed info on Discord), please feel free to let me know, either in the comments below or on the Discord server.

Full text and comments »

  • Vote: I like it
  • +269
  • Vote: I do not like it

By Geothermal, history, 4 years ago, In English

A — Calc

Just directly print the given sum.

Time Complexity: $$$O(1)$$$. Click here for my submission.


B — Minor Change

The answer is simply the number of positions $$$i$$$ for which $$$S[i] \neq T[i]$$$. This is because we must make a move fixing each of those position, and each move can only make $$$S[i] = T[i]$$$ for one new position. Thus, we can iterate through the positions and count those where $$$S$$$ and $$$T$$$ differ, printing the total as our answer.

Time Complexity: $$$O(|S|)$$$. Click here for my submission.


C — Tsundoku

We apply two pointers. Iterate over the number of books to be read from desk A in decreasing order, starting with the greatest number of those books we could read without exceeding $$$K$$$ minutes. Maintain the sum of all books to be read from desk A. Then, while we can do so without exceeding $$$K$$$ minutes, add the next book from desk B to the list of books to be read. We can then consider the total number of books read between desk A and desk B as a possible answer, decrement the number of books from desk A, and continue.

The basic intuition is that as we decrease the number of books we read from desk A, the number of books we read from desk B monotonically increases. By processing the number of books to be read from desk A in order, we avoid having to recompute any sums, achieving a time complexity of $$$O(N+M)$$$.

Time Complexity: $$$O(N+M)$$$. Click here for my submission.


D — Sum of Divisors

Let's reframe the problem by considering the total contribution each divisor makes to the answer. For any given $$$i$$$, $$$i$$$ contributes $$$K$$$ to the sum for each $$$K$$$ from $$$1$$$ to $$$N$$$ such that $$$K$$$ is a multiple of $$$i$$$.

Notice that the list of these $$$K$$$ is of the form $$$i$$$, $$$2i$$$, $$$3i$$$, ..., $$$\lfloor \frac{N}{i} \rfloor i$$$, noting that the last term is the greatest multiple of $$$i$$$ less than $$$N$$$. For simplicity, let $$$M = \lfloor \frac{N}{i} \rfloor.$$$ Then, the sum of this sequence is equal to $$$i + 2i + 3i + \cdots + Mi = i (1 + 2 + \cdots + M) = \frac{iM(M+1)}{2}.$$$

Thus, by computing $$$M$$$ and the above product, we can compute each divisor's contribution to the answer in $$$O(1)$$$, giving us an $$$O(N)$$$ solution.

Time Complexity: $$$O(N)$$$. Click here for my submission.


E — NEQ

First, let's count the number of ways to choose $$$A$$$, irrespective of $$$B$$$. There are $$$\dbinom{M}{N}$$$ ways to choose the $$$N$$$ integers to appear in $$$A$$$ and $$$N!$$$ ways to arrange them, so we have $$$\dbinom{M}{N} N!$$$ choices for $$$A$$$ in total.

Now, without loss of generality, assume the sequence $$$A$$$ is simply $$$1, 2, \cdots, N$$$, since the number of choices for $$$B$$$ doesn't depend on our sequence $$$A$$$. We want to count the number of sequences $$$B$$$ where there is no position $$$i$$$ such that $$$B_i = i$$$.

We do this using the principle of inclusion and exclusion. By PIE, we know that the number of sequences $$$B$$$ where $$$B_i \neq i$$$ for all $$$i$$$ is equal to the sum over all valid $$$j$$$ of $$$(-1)^j$$$ times the number of ways to create a sequence $$$B$$$ where at least $$$j$$$ positions $$$i$$$ have $$$B_i = i$$$.

Thus, it remains to compute the number of ways to compute the number of sequences $$$B$$$ with at least $$$j$$$ equal positions. There are $$$\dbinom{N}{j}$$$ ways to choose $$$j$$$ positions where $$$B_i = i$$$. Then, it doesn't matter what we put in the remaining positions, so there are $$$\dbinom{M-j}{N-j}$$$ ways to choose the numbers and $$$(N-j)!$$$ ways to arrange them. Thus, the total number of ways to ensure that at least $$$j$$$ positions have $$$B_i = i$$$ is $$$\dbinom{N}{j} \dbinom{M-j}{N-j} (N-j)!$$$.

By precomputing all relevant factorials and their modular inverses in $$$O(M)$$$, we can compute the above product in $$$O(1)$$$ for each $$$j$$$. We then sum these up using our PIE formula, giving us an $$$O(N+M)$$$ solution. Since $$$N \leq M$$$, this is simply $$$O(M)$$$.

Time Complexity: $$$O(M)$$$. Click here for my submission.


F — Unfair Nim

It is well known that the second player wins a game of Nim if and only if the bitwise XOR of the numbers of stones in each pile is equal to $$$0$$$. Thus, we want to make the equation $$$A_1 \wedge A_2 \wedge A_3 \wedge \cdots \wedge A_n = 0$$$. By taking the XOR of this equation with $$$A_3 \wedge A_4 \wedge \cdots \wedge A_n$$$, we have $$$A_1 \wedge A_2 = A_3 \wedge A_4 \wedge \cdots \wedge A_n.$$$ We can immediately compute the right-hand side, since it is fixed before we make our moves. Now, we want to find the largest $$$x$$$ between $$$1$$$ and $$$A_1$$$, inclusive, such that $$$x \wedge (A_1 + A_2 - x) = A_3 \wedge A_4 \wedge \cdots \wedge A_n.$$$

We build $$$x$$$ via digit DP, determining the bits of $$$x$$$ from the most significant bit to the least significant bit. Let $$$a$$$ be the maximum possible value of $$$x$$$ thus far such that we have found a bit in which $$$x$$$ contains a $$$0$$$ while $$$A_1$$$ contains a $$$1$$$ (so, we know that no matter what bits we add to $$$a$$$ in the future, it will be less than $$$A_1$$$), or $$$-\infty$$$ if no such value exists thus far. Let $$$b$$$ be the value of $$$x$$$ such that thus far, all bits in $$$b$$$ are the same as those in $$$A_1$$$, or $$$-\infty$$$ if creating this value is impossible. Note that we cannot add a $$$1$$$ to $$$b$$$ if the corresponding bit in $$$A_1$$$ is a $$$0$$$, as this would make $$$b$$$ too large.

Let $$$S = A_3 \wedge A_4 \wedge \cdots \wedge A_n.$$$ If $$$S$$$ contains a $$$0$$$ in any given bit, we know that $$$x$$$ and $$$A_1 + A_2 - x$$$ must either both contain a bit or neither must contain a bit. We can tell which case applies by determining whether after subtracting out all bits we've assigned so far, $$$A_1 + A_2$$$ is greater than $$$2^{k+1}$$$, where $$$k$$$ is the index of the current bit. This is because the sum of all future bits will be at most $$$2^{k+1} - 2$$$, so $$$A_1 + A_2 \geq 2^{k+1}$$$ both enables us to add two $$$1$$$s at position $$$b$$$ and forces us to do so. If we add a bit to both numbers, we add $$$2^k$$$ to $$$a$$$ and $$$b$$$, but set $$$b$$$ to $$$-\infty$$$ if $$$A_1$$$ contains a $$$0$$$ in position $$$k$$$. Otherwise, we don't add anything to either $$$a$$$ or $$$b$$$, but if $$$A_1$$$ contains a $$$1$$$ in position $$$k$$$, we can set $$$a$$$ to $$$\max(a, b)$$$, since now we've found a position in which $$$x < A_1$$$.

If $$$S$$$ contains a $$$1$$$ in any given bit, we know that exactly one of $$$x$$$ and $$$A_1 + A_2 - x$$$ must contain a bit. Thus, we can add $$$2^k$$$ to $$$a$$$, because this will not make $$$a$$$ exceed $$$A_1$$$. If $$$A_1$$$ contains a $$$1$$$ in position $$$k$$$, we can also add $$$2^k$$$ to $$$b$$$, or we can alternatively assign a $$$0$$$ in order to guarantee that $$$x < A[1]$$$, allowing us to set $$$a$$$ to $$$\max(a, b)$$$.

At the end of this process, $$$a$$$ and $$$b$$$ are our possible values for $$$x$$$. Of course, if $$$a$$$ and $$$b$$$ are zero or $$$-\infty$$$, they can't work (the zero case fails because we cannot move every stone from the first pile). Moreover, we also must confirm that $$$a \wedge (A_1 + A_2 - a) = S$$$ in order to use $$$a$$$, and we must check the same equation for $$$b$$$. This is because the total bits we assign throughout our process may not equal $$$A_1 + A_2$$$, since the number of bits we assign is largely determined by $$$S$$$. (One countercase that demonstrates this issue is $$$N=3, A = 7, 5, 6$$$: our solution assigns one bit in position $$$2$$$, one bit in position $$$1$$$, and two bits in position $$$0$$$, but the total of these bits is $$$4 + 2 + 2 = 8 < 7+5.$$$) Then, we take the largest of these possible values for $$$x$$$ and subtract it from $$$A_1$$$ to get our answer.

Time Complexity: $$$O(N + \log A_i).$$$ Click here for my submission.


Full text and comments »

  • Vote: I like it
  • +259
  • Vote: I do not like it

By Geothermal, history, 4 years ago, In English

I just did my first ABC in a while, so I decided to write up and share my solutions below. Feel free to leave questions in the comments!

There were several questions this round that had the potential to create precision issues; however, the solutions below give approaches that sidestep those errors altogether. Sadly, I think compiling and testing my A prevented me from winning the round :(

A — Multiplication 1

Just multiply the numbers and print them out. It's not that hard.

Time Complexity: $$$O(1)$$$. Click here for my submission.


B — Multiplication 2

This turns out not to be quite as easy as the last problem. Multiplying numbers this large is actually a challenge in C++ because the result is likely to be greater than $$$2^{63}$$$, and thus will cause long long overflow. Thus, in C++, we can't just multiply the numbers and check if the result is greater than $$$10^{18}$$$.

One possible approach is to switch to a different language in order to solve this problem. For example, Python integers seem like a natural choice given their ability to handle integers of arbitrary size. Java BigIntegers also seem workable for similar reasons. However, we have to be a little careful here: if you try to multiply all the integers together and then check whether they're greater than $$$10^{18}$$$, you might TLE: you're performing $$$O(N)$$$ multiplications using a number with $$$O(N)$$$ digits, so the time complexity could potentially be at least $$$O(N^2)$$$.

To get around this, we need a slightly smarter approach: multiply the numbers together, and print $$$-1$$$ as soon as our product exceeds $$$10^{18}$$$. Now, the number of digits we're working with is bounded, so this is safe time-wise. Note that you need to separately check if the input contains a $$$0$$$, since that's the only way the product can decrease: for example, on an input of $$$10^{10}$$$, $$$10^{10}$$$, $$$0$$$, you need to check for the $$$0$$$ separately because otherwise you'll stop as soon as you scan the first two numbers and reach a product of $$$10^{20}$$$.

But, what if you don't want to switch languages? It turns out that there's a pretty simple C++ solution. The key idea is that we can easily compute the base-10 logarithm of the answer, using the identity that $$$\log A_1 A_2 A_3 \cdots A_N = \log A_1 + \log A_2 + \log A_3 + \cdots + \log A_N$$$. Then, the product is greater than $$$18$$$ if and only if the logarithm is greater than $$$18$$$.

To avoid precision errors, though, my solution just checks that the logarithm is small enough that we can perform the computation without long long overflow. If the logarithm is smaller than some value slightly greater than $$$18$$$ (say, $$$18.1$$$), we perform the multiplication and print the result if it is less than $$$10^{18}$$$. Note that we also need to handle the $$$0$$$ case separately here because $$$\log 0$$$ is undefined.

Time Complexity: $$$O(N)$$$. Click here for my submission.


C — Multiplication 3

One could perform the multiplication using doubles, but precision errors are always scary, so let's try a different approach. Let's read $$$A$$$ as a long long and $$$B$$$ as two integers: one representing its integer part and one representing its decimal part. Then, we can write the integer $$$100B$$$ as the sum of $$$100$$$ times $$$B$$$'s integer part plus the two digits representing $$$B$$$'s fractional part. For example, in the first sample, we read $$$1$$$ as $$$B$$$'s integer part and $$$10$$$ as its decimal part, so $$$100B = 110$$$.

Now that we have $$$A$$$ and $$$100B$$$ as integers, we can compute $$$100AB$$$ by taking their product. Then, we can compute the integer part of $$$AB$$$ by dividing this by $$$100$$$, noting that division in C++ drops the fractional part.

Time Complexity: $$$O(1).$$$ Click here for my submission.


D — Div Game

First, let's factor $$$N$$$ in $$$O(\sqrt{N})$$$ time. This is doable by attempting trial divisions by all numbers up to $$$\sqrt{N}$$$. Then, if there's anything left over when all these factors are removed from $$$N$$$, what's left must be $$$N$$$'s last prime factor.

Realize that since each $$$z$$$ we select only affects one prime, we can just process each of $$$N$$$'s prime factors independently, compute the number of times we can apply the given operation to the power of that prime factor, and sum up the results.

So, we now only need to solve the problem for numbers equal to $$$p^k$$$ for some prime $$$p$$$. We first attempt a simple greedy algorithm: divide out $$$p$$$, then $$$p^2$$$, then $$$p^3$$$, and so on, until our input is no longer divisible by the next power of $$$p$$$; let the last power of $$$p$$$ we divided out be $$$p^x$$$. Then, we claim that $$$x$$$ is the maximum number of times we can apply this operation to this number.

First, we see that $$$x+1$$$ is impossible: since $$$p p^2 p^3 \cdots p^x p^{x+1}$$$ does not divide $$$p^k$$$, we know that the smallest $$$x+1$$$ operations are still too large, so we cannot have $$$x+1$$$ or more operations. Then, we see that $$$x$$$ is doable: apply the operations with $$$p$$$, $$$p^2$$$, and so on, up to $$$p^{x-1}$$$, then apply one last operation to whatever is left over. The remaining exponent must be at least $$$p^x$$$ because we know $$$p p^2 p^3 \cdots p^x$$$ divides $$$p^k$$$, so we don't repeat any values of $$$z$$$, so this is a valid series of operations. Thus, we can't do any better than $$$x$$$, but we can definitely achieve $$$x$$$ operations, so $$$x$$$ is our answer for $$$p^k$$$.

We thus compute the value of $$$x$$$ for each $$$p^k$$$ and sum up the results to get our answer. Though there are more efficient ways to compute $$$x$$$, you can just do it naively (by adding $$$1$$$, then $$$2$$$, then $$$3$$$, and so on, until the result exceeds $$$k$$$): $$$N$$$ can't have very many prime factors and none of them can be raised to especially large powers, so the $$$O(\sqrt{N})$$$ factor from factoring $$$N$$$ will dominate.

Time Complexity: $$$O(\sqrt{N}).$$$ Click here for my submission.


E — Count Median

So that we can just deal with integers, rather than decimals, we redefine the definition of median for even numbers to refer to $$$x_{N/2} + x_{N/2 + 1}$$$. This is essentially twice the given median. We can see that this does not change the number of unique medians in the set because we're essentially doubling every number in the set of medians, which means that two equal medians will still be equal and two different medians will still be different after we double them.

We can easily compute the smallest and largest possible medians: take the medians of the arrays $$$A$$$ and $$$B$$$, respectively. Let these medians be $$$Y$$$ and $$$Z$$$. Then, we make a critical claim: the medians we can achieve are exactly the set of integers between $$$Y$$$ and $$$Z$$$. Thus, our answer is the number of integers from $$$Y$$$ to $$$Z$$$, which is $$$Z-Y+1$$$.

But, of course, we need to prove this claim. Obviously, we can't achieve a non-integer median, nor can we have a median lower than $$$Y$$$ or greater than $$$Z$$$, so there's no way to have any medians that aren't integers from $$$Y$$$ to $$$Z$$$.

Now, we just need to show that every median from $$$Y$$$ to $$$Z$$$ is achievable. Start with an array equal to $$$A$$$, which has median $$$Y$$$. Here's the key observation: whenever we increase an integer in the array by $$$1$$$, the median will either not change or will increase by $$$1$$$. This can be proven by fairly simple casework, but it's also pretty intuitive. Thus, as we increase the integers by $$$1$$$ in some arbitrary order until the array becomes $$$B$$$, our median never skips any integers, so it goes from $$$Y$$$ to $$$Z$$$ and, at some point, touches all the integers in between. This shows that any integer between $$$Y$$$ and $$$Z$$$ is a possible median, as desired.

Now, we can sort the arrays $$$A$$$ and $$$B$$$, compute their medians, and print $$$Z-Y+1$$$ as our answer.

Time Complexity: $$$O(N \log N).$$$ Click here for my submission.


F — Knapsack for All Subsets

Consider a subset $$$x_1, x_2, x_3, \cdots, x_k$$$ summing to $$$K$$$. Then, notice that there are $$$2^{N-k}$$$ subsets of $$$A$$$ containing this subset, because for each element not in our set of $$$k$$$, we have two options: to include it or exclude it from our subset. So, for each subset containing $$$k$$$ integers, we want to add $$$2^{N-k}$$$ to our answer.

We compute this summation using DP. Let $$$dp[i][j]$$$ be the sum of $$$2^{N-k}$$$ over all subsets of the first $$$i$$$ elements of the array $$$A$$$ that sum to $$$j$$$. Initially, $$$dp[0][0] = 2^N$$$ and $$$dp[0][j] = 0$$$ for all other $$$j$$$, since there is one subset of the first $$$0$$$ elements, which has size $$$0$$$ and sums to $$$0$$$.

Then, to transition, we can either add the next element in the array to our subset or not add it. If we don't add it, the result is simple: we add $$$dp[i][j]$$$ to $$$dp[i+1][j]$$$, effectively skipping this element. Alternatively, though, we can add the array to the subset. However, this adds $$$1$$$ to $$$k$$$, effectively multiplying the sum of $$$2^{N-k}$$$ by one-half, so we add $$$\frac{dp[i][j]}{2}$$$ to $$$dp[i+1][j + A[i]]$$$. Since we're working with modular arithmetic, we can divide by $$$2$$$ by multiplying by the modular inverse of $$$2$$$.

Then, our answer is $$$dp[N][S]$$$. Since we have $$$O(NS)$$$ states and each state has $$$O(1)$$$ transitions, this easily passes in time.

Time Complexity: $$$O(NS).$$$ Click here for my submission.


Full text and comments »

  • Vote: I like it
  • +281
  • Vote: I do not like it

By Geothermal, history, 4 years ago, In English

As most of you are likely aware, cheating in virtual contests (typically by submitting prewritten or copied solutions in order to appear at the top of the leaderboard) is very common. Most recently, wannabecandidatemaster submitted solutions to all problems in this round in order to take first place in the standings, and bemandrei competed virtually in this round after competing in the round officially, and ended up in second place on the leaderboard.

In response to the rise of VC cheating (as well as some other issues with the VC system I'll describe below), I'd like to make a simple proposal: Remove all virtual participants from the contest leaderboards. The one exception is that you should see your own name on leaderboards for contests you VCed (that is, virtual contestants should be visible only to themselves). Similarly, virtual participants should not be counted in the solve statistics during future contestants' VCs.

Cheating on VCs is harmful for two reasons. First, it makes it impossible to review the scoreboard as it appeared at the end of the round; in particular, this makes it difficult to discern accurate rankings for out of contest participants. Though this may seem like a relatively minor issue, Div 2, 3, and 4 rounds regularly attract hundreds or even thousands of competitors too high rated to compete officially, so it has an exceptionally broad scope and affects a large number of users.

Second, and more importantly, VC cheaters damage the VC experience for future competitors by making the solve statistics inaccurate. Many competitors use the dashboard's view of how many people have solved each problem in order to make strategic decisions during contests. In the round when I first reached GM, I achieved a high rank by noticing a few early solves on D, correctly discerning that it was unusually easy, and solving it before B or C in order to achieve a better penalty time. This kind of strategy is very common, but the presence of VC cheaters makes it much less practical because in pretty much every contest, when there are a few early solves on the hardest problems, they're not because those problems are easy--they're the result of VC cheaters. The result is that VC cheaters make the VC experience less realistic, so removing them from the solve statistics and leaderboards is critical to achieving the intended goals of the VC feature.

One might claim that the solution is simply to identify cheaters and remove them from the standings after the fact. However, this is a worse approach than removing VCs from the leaderboards altogether for three reasons. First, this approach requires continual effort on the part of the site administrators to remove VC cheaters, whereas my proposed solution needs only a one-time update. Second, it is impossible to accurately detect cheaters, especially because many forms of cheating are less obvious--for example, VCers can look up the test data using a second browser in order to gain an advantage when debugging, and there is no way to accurately distinguish between this and simply guessing the error quickly.

Third and finally, VCs are fundamentally different from participating in a contest live, so even in the absence of cheating, they should not appear on the leaderboards. There are a number of reasons for this, but I see two of them as most important. First, VC solutions are judged on all tests, rather than just the pre-tests, so it is impossible for them to FST: if their solution fails a test not originally included in the pre-tests, they get a chance to fix their error and resubmit, giving them a huge advantage over in-contest competitors. Second, frequently, Codeforces users release blog posts discussing techniques that came up in recent rounds; competitors who read an article inspired by a certain round and then VCed that round would have an advantage over anyone who did that round live.

As a result, I claim that the ideal way to deal with VCs is to remove them from the leaderboards entirely. Please feel free to leave thoughts below; I'm happy to discuss this further with anyone interested or to answer questions about my proposal.

Full text and comments »

  • Vote: I like it
  • +1001
  • Vote: I do not like it

By Geothermal, history, 4 years ago, In English

Hi all!

I'm here to share my solutions to today's Div. 3 round, since I managed to finish the round relatively quickly and thought that my approaches might be of some use to others. Feel free to leave questions in the comments!


A — Minimal Square

Without loss of generality, let's say $$$A \leq B$$$, that is, $$$B$$$ is the larger side-length. We claim that the minimum side length is $$$\max(2A, B)$$$, hence, the answer is $$$\max(2A, B)^2$$$. This is relatively easy to observe just by looking at the sample cases and trying different configurations, but I'll prove this claim below in case you're curious.

To prove this, we first show that $$$\min(2A, B)$$$ is a valid side-length. To achieve this, just stack the two rectangles on top of each other, with the two short sides stacking vertically and the long sides overlapping. (This looks much like the solution given in the problem statement for the second test case.) The vertical side-length of the block formed will be $$$2A$$$, and the horizontal side-length of the block formed will be $$$B$$$. Thus, as long as we can fit a block with dimensions $$$2A$$$ by $$$B$$$ in the square, we can fit in the two rectangles; this is clearly doable when the side-length of the square is $$$\min(2A, B)$$$.

Now, we prove that no lower side-length is achievable. First, suppose the side-length is less than $$$B$$$. Then, there is no way to fit either of the rectangles in the square because the sides of length $$$B$$$ won't fit. Thus, the side-length must be at least $$$B$$$. Then, we can show that the side-length must also be greater than $$$2A$$$ through messy casework, so the side-length must be at least $$$\max(2A, B)$$$.

Since a side-length of $$$\max(2A, B)$$$ is both attainable and optimal, our answer is $$$\max(2A, B)^2$$$. We can print this value directly, after swapping $$$A$$$ and $$$B$$$ if necessary to ensure that $$$B$$$ is larger.

Time Complexity: $$$O(1)$$$ per test case. Click here for my solution.


B — Honest Coach

We claim that the answer is the minimum difference between any two athletes' strengths. Obviously, since the answer must be a difference between two strengths, it is impossible to do any better. Then, to show that this is attainable, consider the two athletes $$$X$$$ and $$$Y$$$ who form this minimal difference, letting $$$X$$$ be at least as strong as $$$Y$$$. Place $$$X$$$ and all athletes weaker than $$$X$$$ except for $$$Y$$$ on the first team, and $$$Y$$$ and all athletes at least as strong as $$$X$$$ on the second team. It is easy to see that each team will have at least one athlete and that each athlete will be on a team. Moreover, $$$X$$$ is the strongest athlete on the first team and $$$Y$$$ is the weakest athlete on the second team, so $$$|max(A) - min(B)|$$$ is equal to the difference in their strengths, as desired.

Thus, we can simply sort the input, compare every two consecutive values, and print the minimum difference. Alternatively, since the bound on $$$n$$$ is fairly low, we can just brute-force all pairs of athletes and compare their strengths to give the answer.

Time Complexity: $$$O(N \log N)$$$ per test case. $$$O(N^2)$$$ is also possible and acceptable. Click here for my solution.


C — Similar Pairs

Suppose that there is an even number of even numbers in the input (and thus, there is also an even number of odd numbers in the input). Then, the answer is always yes--we can just pair up the even numbers with each other and pair up the odd numbers similarly.

Now, suppose that there is an odd number of even numbers (and thus, an odd number of odd numbers). Then, if there are no numbers with difference $$$1$$$ in the input, the answer is no, because in this case we cannot pair odd numbers with even numbers, and it is impossible to pair all the numbers with other numbers of their same parity. However, if there does exist a pair of numbers with difference $$$1$$$, the answer is yes: pair these two numbers up, and then there will be even numbers of odd and even numbers in the input; we have already shown that we can pair up these remaining numbers.

Thus, the answer is yes if there is a pair of numbers with difference $$$1$$$ or there is an even number of even numbers in the input. We can check these conditions in $$$O(N \log N)$$$ by sorting the input and checking consecutive numbers to see if they have difference $$$1$$$ or in $$$O(N^2)$$$ by brute-forcing all possible pairs; either will be accepted. (Obviously, we can count even numbers in $$$O(N)$$$.)

Time Complexity: $$$O(N \log N)$$$ per test case. $$$O(N^2)$$$ is also possible and acceptable. Click here for my submission.


D — Buying Shovels

Clearly, the type of packages must be a factor of $$$N$$$, since to get exactly $$$N$$$ shovels from buying a certain number of packages, the number of shovels per package must be a factor of $$$N$$$. We can thus compute all factors of $$$N$$$ in $$$O(\sqrt(N))$$$, then, any factors less than $$$K$$$ could be the type of package we want to use. The last observation is that we want to buy the largest type of package possible to minimize the number of packages we have to buy. Thus, the type of package we want is the largest factor of $$$N$$$ that is less than or equal to $$$K$$$, and our answer is then $$$N$$$ divided by this package size.

Time Complexity: $$$O(\sqrt{N})$$$. Click here for my submission.


E — Polygon

We claim that the answer is YES if and only if every $$$1$$$ has another $$$1$$$ or the border in either the cell to the right or the cell below it. We prove that this criterion is correct.

First, suppose that this is the case. Then, the desired grid is achievable; we will prove it by giving a construction. We build the grid such that the lowest $$$1$$$'s are added first; if several $$$1$$$'s appear on the same level, then we add the rightmost $$$1$$$'s first. This order ensures that whenever it comes time to add a $$$1$$$ to the grid, all $$$1$$$'s in the rectangle below and to the right of the $$$1$$$ we intend to add are already in the grid, while no $$$1$$$'s directly above or to the left of this $$$1$$$ have been added yet. Then, if there is a $$$1$$$ or the grid border below the $$$1$$$ we intend to add, we can fire the $$$1$$$ vertically and it will land in its intended position, while if there is a $$$1$$$ or the grid border to the right of the $$$1$$$ we intend to add, we can fire the $$$1$$$ horizontally and it will land where we want it. Thus, we can use this process to add $$$1$$$'s to the grid until all $$$1$$$'s have been added, showing that this grid is achievable.

Now, suppose this is not the case. Then, there exists a $$$1$$$ with $$$0$$$'s below it and to its right. It is then clearly impossible to fire a $$$1$$$ into this position because if it is fired horizontally, it will pass through to the $$$0$$$ position on its right, and if it is fired vertically, it will pass through to the $$$0$$$ position below it. Thus, in these cases, the grid is not achievable, showing that our criterion is exactly right.

Then, we can check this criterion for each $$$1$$$ in our grid; it is fairly easy to directly confirm whether it is satisfied.

Time Complexity: $$$O(N^2)$$$ per test case. Click here for my submission.


F — Spy-string

This problem falls to a brute-force approach. Clearly, the solution must differ from the first string in the input in at most one position, thus, there are less than $$$26M$$$ possible input strings. We can simply check all of them to see if they satisfy the condition: for each string, iterate over all input strings and count the differences between the input string and our potential solution; if we find a string that has at most $$$1$$$ difference, then we can report it as our answer.

Time Complexity: $$$O(26NM^2)$$$ per test case. Click here for my submission.


G — A/B Matrix

First, note that the matrix must have exactly $$$NA$$$ ones, since it has $$$N$$$ rows and $$$A$$$ ones per row. Likewise, it must have exactly $$$MB$$$ ones, since it has $$$M$$$ columns and $$$B$$$ ones per column. Thus, we must have $$$NA = MB$$$, so if $$$NA \neq MB$$$, the answer is NO.

If $$$NA = MB$$$, then we claim the answer is YES. Initially, let the matrix contain only zeros. Then, iterate over all rows in the matrix. For each row, $$$A$$$ times, let $$$X$$$ be the number of ones we've added so far, then, place a $$$1$$$ in the corresponding row and in column $$$X \mod M$$$. (We can maintain $$$X$$$ throughout this process so we don't need to recompute it every time.)

This clearly gives us $$$A$$$ ones per row, so we have a total of $$$NA$$$ rows. Moreover, it is fairly easy to see that this process distributes ones evenly across all the columns, so each column gets $$$\frac{NA}{M}$$$ ones. Then, since $$$NA = MB$$$, we have $$$B = \frac{NA}{M}$$$, so each column gets $$$B$$$ ones, as desired.

Runtime: $$$O(NM)$$$ per test case. Click here for my submission.


H — Binary Median

First, note that the resulting set contains $$$2^M - N$$$ strings. Thus, our median will be larger than exactly $$$X = \lfloor \frac{2^M - N - 1}{2} \rfloor$$$ strings in our set.

Then, we build the string bit by bit, starting from the most significant bit. Suppose we add a $$$1$$$ to the end of the string. Then, if the answer string is now lexicographically greater than at most $$$X$$$ strings (not counting strings that are equivalent to our answer up until the point where the answer ends), we know that we must append a $$$1$$$, because if we do not, then our string cannot be greater than $$$X$$$ strings because it will be smaller than a string greater than at most $$$X$$$ strings. Meanwhile, if our answer string becomes lexicographically greater than more than $$$X$$$ strings, we must append a $$$0$$$ instead, because if we append a $$$1$$$, then no matter what, our string will be greater than more than $$$X$$$ strings and thus cannot be our median.

We can compute the number of lexicographically smaller strings than our current answer by computing our answer string's binary representation and subtracting the number of smaller strings we've removed from the set.

We need to complete this building process $$$O(M)$$$ times. Each time, we must evaluate whether each of the input strings is lexicographically smaller; this can be done in $$$O(NM)$$$ time. By reusing data between iterations, we can also count smaller input strings in $$$O(N)$$$ time, but this makes the code slightly more complex. As is, we have an $$$O(NM^2)$$$ solution; since the sum of all $$$NM$$$ is bounded at $$$10^5$$$, the sum of all $$$NM^2$$$ is at most $$$6 \cdot 10^6$$$, so we can expect this algorithm to pass easily.

Time Complexity: $$$O(NM^2)$$$ per test case. Click here for my submission.

Full text and comments »

  • Vote: I like it
  • +109
  • Vote: I do not like it

By Geothermal, history, 5 years ago, In English

Since yesterday's Div. 3 round, I've received several PMs asking me to explain my solution to problem E (link here). As far as I'm aware, it's one of the simplest solutions submitted from an implementation standpoint; the code is relatively short and only one line involves nontrivial logic. So, I figured it might be more efficient to just post a writeup publicly. (As an aside, this is not something I do often; I am unable to respond to most PMs I receive asking for problem explanations. I'm writing this post because I received multiple requests related specifically to my solution for this particular problem, and because I happened to have the time. Generally, though, PMing me over Codeforces is unfortunately not an especially efficient way to get help with problems.) I'll try to outline my thought process as I worked through the problem, since I think solving E within three minutes was probably the biggest single contributing factor to my performance in the round.

The first thing I noticed about this problem was that the definition of $$$k$$$-periodic was unusual---typically, for example, 110110110 would have period $$$3$$$, while 1001000 would not. After parsing the definition, we see that the basic idea is that there must be one string of 1's, each $$$k$$$ spaces apart, and no other 1's left in the string.

The intuition we gather from this is that spaces $$$i$$$ and $$$i-k$$$ are probably related. More formally, if we have a string such that the last 1 is at position $$$i-K$$$, then we can create a string such that the last 1 is at position $$$i$$$ by taking the same string, then changing the character in position $$$i$$$ to a 1.

Our ability to reuse computation for earlier positions in the string to help us deal with later positions motivates a DP approach. We define our state as follows: let $$$dp[i]$$$ be the maximum value of the number of 1's kept as 1's (rather than changed to 0) minus the number of 0's changed to 1's in order to get a valid string that has its last 1 at position $$$i$$$, if it has any 1's at all. Then, our answer is the minimum value of $$$cnt_1 - dp[i]$$$ over all $$$i$$$, where $$$cnt_1$$$ is the number of 1's in the whole string. This is because, since $$$dp[i]$$$ counts unchanged 1's minus changed 0's, $$$cnt_1 - dp[i]$$$ counts changed 1's plus changed 0's, which is exactly what we want.

This seems like a very bizarre DP state, but it's actually quite nicely motivated. The key idea is that since we're forming a chain of 1's separated by $$$k$$$ spaces each, and our transition will thus relate $$$dp[i]$$$ to $$$dp[i-k]$$$, we probably want to form a state based only on changes made within the chain of positions we're changing to 1's, so that we don't need to worry about any positions after $$$i$$$ or outside this chain. Then, within our chain, we keep all the 1's as 1's and change all the 0's to 1's. Keeping 1's that were in the string already effectively saves us from performing a modification, while changing 0's effectively forces us to perform a modification. Thus, $$$dp[i]$$$ is effectively the number of modifications we saved minus the number of extra modifications we added compared to if we changed all the elements of the string to 0.

Then, we just need to figure out the transitions. First, it is possible for $$$dp[i]$$$ to equal $$$S[i]$$$. If $$$S[i] = 0$$$, then we can achieve the string with no 1's without changing any 0's to 1's, so $$$dp[i]$$$ can equal $$$0$$$. If $$$S[i] = 1$$$, then we can save one modification by creating the string where the only 1 is at position $$$i$$$.

This leaves one case left: taking a string ending at $$$i-K$$$, and adding a $$$1$$$. Starting with $$$dp[i-K]$$$, this forces us to make one extra change if $$$S[i] = 0$$$, because we now need to change position $$$i$$$ to a 1. However, it saves us one modification if $$$S[i] = 1$$$, because we no longer need to change position $$$i$$$ to a 0. This can be written concisely as dp[i-K] - 1 + 2 * (S[i] - '0'), since $$$dp[i] = dp[i-K] + 1$$$ if $$$S[i] = 1$$$ or $$$dp[i-K] - 1$$$ if $$$S[i] = 0$$$.

From here, we can compute the DP table in $$$O(N)$$$ and use it to generate the answer. The relevant portion of my code is below.

    int T; cin >> T;
    while(T--) {
        int N, K; cin >> N >> K;
        string S; cin >> S;
        int dp[N];
        int ans = 0;
        F0R(i, N) {
            dp[i] = S[i] - '0';
            if (i >= K) {
                dp[i] = max(dp[i], dp[i-K] - 1 + 2 * (S[i] - '0'));
            }
            ans = max(ans, dp[i]);
        }
        int cnt = 0; F0R(i, N) cnt += S[i] - '0';
        cout << cnt-ans << nl;
    }

Full text and comments »

  • Vote: I like it
  • +158
  • Vote: I do not like it

By Geothermal, history, 5 years ago, In English

Hi all!

Since nobody has posted about the round yet, I thought I'd open up a page to discuss GCJ Round 1A, which happened earlier this evening.

The scoreboard, problems, and editorials are available at this link. Practice mode should be open now--I'd recommend trying the problems if you didn't compete, since I thought they were generally pretty nice.

It looks like everyone with a score greater than 63 or with a score of 63 and a penalty time no greater than 2:14:05 qualifies for Round 2, assuming that nobody gets removed from the scoreboard later on. Everyone else can attempt to qualify through Rounds 1B and 1C, even if they participated in this contest but failed to qualify.

If nobody posts solutions here within the nearish future, I'll write up and share mine, but as far as I can tell, my ideas were pretty much identical to the ones explained in the official editorials.

Full text and comments »

  • Vote: I like it
  • +75
  • Vote: I do not like it

By Geothermal, history, 5 years ago, In English

Hi all!

As it might be a little while before the editorial for today's round is released (due to the open hacking phase), I thought I'd publish my solutions in case anyone wants to review the problems right away.


A — Divisibility Problem

There are a few ways to approach this. One is to find the next multiple of $$$B$$$ after $$$A$$$ and subtract $$$A$$$. Recall that we can compute $$$\frac{A}{B}$$$, rounded up, as $$$\frac{A+B-1}{B}$$$, rounded down. We can compute this value and then multiply it by $$$B$$$ to get the next multiple of $$$B$$$, subtracting $$$A$$$ to get our answer.

An alternative solution (and the one I implemented) is to realize that, as long as $$$A$$$ isn't already a multiple of $$$B$$$, the number of moves required is $$$B - (A \% B)$$$, since the distance to the next lower multiple of $$$B$$$ plus the distance to the next higher multiple of $$$B$$$ will be $$$B$$$. We can use this to compute the answer. In the case where $$$A$$$ is a multiple of $$$B$$$, this prints $$$B$$$ instead of $$$0$$$, so we can take this value modulo $$$B$$$ to compute the correct answer.

Runtime: $$$O(1)$$$. Click here for my submission.


B — K'th Beautiful String

We build the string character-by-character. At any stage, we want to find the $$$K$$$'th lexicographically smallest string starting with a certain prefix and including $$$X$$$ more characters, including $$$Y$$$ b's. Then, there are $$$Z = \dbinom{X-1}{Y}$$$ strings satisfying these conditions and with an 'a' as the next character. These strings are lexicographically smaller than all strings with b's as the next character, so if $$$K \leq Z$$$, we'll use one of these strings, and we should append an 'a' and continue with $$$X$$$ decreased by $$$1$$$. Otherwise, we need to append a 'b', so for our next stage, we want to find the $$$(K-Z)$$$'th lexicographically minimal string with $$$X-1$$$ more characters and $$$Y-1$$$ more b's.

Runtime: $$$O(N)$$$. Click here for my submission.


C — Ternary XOR

We build the answer digit by digit. Notice that the two numbers will be equal up to a certain point. As soon as we make the two numbers different, though, whichever number is bigger at this stage will always be bigger, no matter what we do, so we should seek to minimize this number without caring what happens to the smaller number. This means that as soon as we add different characters to the two numbers, every remaining digit in the larger number should be a zero, while the remaining digits in the smaller number should be the same as the digits of the input number.

Until we reach this point, though, we should try to keep the two numbers as balanced as possible to avoid making the larger one too big. When we process a $$$0$$$, we are clearly best off adding a $$$0$$$ to each number. When we process a $$$2$$$, we should add a $$$1$$$ to each number: this is better than adding a $$$0$$$ to one number and a $$$2$$$ to the other because this gives a maximum of $$$2$$$ rather than $$$1$$$. When we process a $$$1$$$, we must add a $$$1$$$ to one number and a $$$0$$$ to the other, and from here, we move into the stage described above, adding only $$$0$$$'s to the number that received the $$$1$$$ and appending all digits from the input to the number which received a $$$0$$$.

Runtime: $$$O(N)$$$. Click here for my submission.


D — Carousel

First, note that the answer will always be at most $$$3$$$. To achieve this, alternate between $$$1$$$ and $$$2$$$ for the first $$$N-1$$$ numbers, with the last number receiving a $$$3$$$. We can easily see that this guarantees that all adjacent figures have distinct colors, so we will never have a pair of adjacent figures with the same color and distinct types. (The reason $$$2$$$ may not be achievable is because the carousel is circular, so the last figure is adjacent to the first--this makes the odd case harder to deal with, since, for example, when $$$N=3$$$, we could end up with a result of 1, 2, 1, which doesn't work if the first and last figures are of different types.)

Now, we determine whether we can do better. It is trivial to check if the answer is $$$1$$$: this works only if all figures are of the same type. Now, we simply need to check if the answer is $$$2$$$ and provide a construction if so. To do this, we use dynamic programming. Without loss of generality, let the first figure have color $$$1$$$. Then, if it is possible to color the first $$$i$$$ figures in a valid way such that figure $$$i$$$ has color $$$j$$$, let $$$dp[i][j]$$$ be the color of figure $$$i-1$$$ used in one such coloring. Otherwise, let $$$dp[i][j] = -1$$$. To transition, if $$$dp[i][j] \neq -1$$$, we can transition to $$$dp[i+1][3-j]$$$ (if we let the colors be $$$1$$$ and $$$2$$$) from $$$dp[i][j]$$$, and if figures $$$i$$$ and $$$i+1$$$ have the same color, we can also transition to $$$dp[i+1][j].$$$

From here, we can determine whether an answer of $$$2$$$ is possible. If $$$dp[N-1][2]$$$ isn't $$$-1$$$, then we can definitely get an answer of $$$2$$$. If $$$dp[N-1][1]$$$ isn't $$$-1$$$ and figure $$$N-1$$$ has a different color from figure $$$0$$$, we can also achieve an answer of $$$2$$$. Either way, we can pick one of these cases and backtrack through our DP table to compute a valid coloring, if one exists.

If no such coloring exists, the answer must be $$$3$$$, and we can output the construction described above.

Runtime: $$$O(N)$$$. Click here for my submission.


E — Tree Queries

Let's start with an observation. Realize that if a path from the root goes through any vertex within distance $$$1$$$ of a node, it will also go through the vertex's parent. Thus, the problem is reduced to determining whether some path from the root contains the parents of all vertices in the query set. For simplicity, we simply replace every vertex in the query set with its parent (replacing the root with itself).

Since a path from the root to a vertex includes that vertex and its ascendants, we see that the answer is yes if and only if there is some node in the set that's a descendant of every other node in the set. We can perform a pre-order traversal, tracking entry and exit times, to enable us to determine in $$$O(1)$$$ whether some node $$$v$$$ is an ancestor of some node $$$u$$$. Then, we maintain a node that's a descendant of all nodes we've considered thus far. We process each node in the query set---if this node is an ancestor of our current node, we do nothing, and if the current node is an ancestor of this node, this node is now a descendant of all nodes we've considered, so we replace the current node with the new one. If neither node is an ancestor of the other, there is no vertex in the set that is a descendant of all the others, so the answer is no. If we finish this process and still have a valid result node, the answer is yes.

Click here for my submission.


F — Make k Equal

Suppose we want to make $$$K$$$ elements of the set equal to some number $$$X$$$. If there are at least $$$K$$$ copies of $$$X$$$ in the set, this takes $$$0$$$ operations. Otherwise, realize that to make any number lower than $$$X$$$ equal to $$$X$$$, we must first raise all numbers lower than $$$X$$$ to $$$X-1$$$. Meanwhile, to make any number higher than $$$X$$$ equal to $$$X$$$, we must first lower all numbers greater than $$$X$$$ to $$$X+1$$$. Take for granted that we can compute the costs of doing these things---we'll explain how to do this below. Then, if there are at least $$$K$$$ numbers less than or equal to $$$X$$$, we can bring all the lower numbers to $$$X-1$$$ and then raise however many of them we need to $$$X$$$. If there are at least $$$K$$$ numbers greater than or equal to $$$X$$$, we can bring all the higher numbers to $$$X+1$$$ and then lower however many we need to $$$X$$$. In any case, we can also bring all the lower numbers to $$$X-1$$$ and all the higher numbers to $$$X+1$$$, after which we can take however many numbers we need to $$$X$$$.

Now, how can we determine the number of operations this will take for each possible $$$X$$$ efficiently? First, a quick observation: we only need to consider elements of the input array as $$$X$$$, as we can prove that as long as $$$X$$$ is not an element of the array, we can increase or decrease it to the next element of the input array without increasing the number of required operations. Then, we consider each possible $$$X$$$ in increasing order. We can then maintain the number of elements less than $$$X$$$ and the sum of all those elements, as well as the same values for the numbers greater than $$$X$$$. These values allow us to compute the costs of bringing all numbers lower than $$$X$$$ to $$$X-1$$$ or the costs of bringing all numbers higher than $$$X$$$ to $$$X+1$$$, after which we can consider each of the three cases above to compute our answer.

Runtime: $$$O(N \log N)$$$, to account for sorting. Click here for my submission.

Full text and comments »

  • Vote: I like it
  • +97
  • Vote: I do not like it

By Geothermal, history, 5 years ago, In English

It's been a while since I've done one of these!


A — Station and Bus

After parsing the problem statement, we see that we simply need to determine whether the two companies each operate at least one station, since if this is the case, those stations will be connected, and otherwise, if one company operates every station, no connections will be built. This is now a fairly easy task---one way to do it is to check whether every character in the string is the same, printing No if this is the case and Yes otherwise. Alternatively, we can directly compare $$$S$$$ to "AAA" and "BBB", rejecting if the input is one of these strings and accepting otherwise. There are a wide variety of approaches, but they're all essentially equivalent.

Runtime: $$$O(1)$$$. Click here for my submission.


B — Count Balls

Given the large limits on $$$N, A,$$$ and $$$B$$$, a brute force simulation will be vastly too slow---we need a faster approach here. First, we compute $$$X = \lfloor \frac{N}{A+B} \rfloor$$$ to determine the number of times the operation is fully completed before we reach $$$N$$$ balls. Then, each of these operations adds $$$A$$$ blue balls to our answer, so we can initialize our answer as $$$XA$$$. Then, we have $$$Y = N \% (A+B)$$$ remaining balls to add. If $$$Y < A$$$, all of the remaining balls will be blue, and if $$$Y \geq A$$$, then we add another full set of blue balls. Thus, we can add $$$\textbf{min} (Y, A)$$$ to our answer and print it.

Runtime: $$$O(1)$$$. Click here for my submission.


C — Tax Increase

This initially seems like an annoying but doable math exercise--we could compute the minimum and maximum values at which the taxes are $$$A$$$ and $$$B$$$, intersect these ranges, and take the minimum value in the intersection. However, implementing this is mildly frustrating, and it turns out that there's an even simpler solution.

Notice that the inputs on $$$A$$$ and $$$B$$$ are extremely low. Indeed, as soon as we reach a tax of $$$1010$$$, we see that the $$$8\%$$$ tax will already be greater than $$$100$$$, so the answer must be less than $$$1010$$$. Thus, we can check the taxes resulting from every potential answer from $$$1$$$ to $$$1009$$$ and print the first one that works, returning $$$-1$$$ if none of them are valid.

Runtime: $$$O(A+B)$$$, albeit with a bad constant factor. Click here for my submission. Note that I capped the answer at $$$10000$$$ instead of $$$1009$$$ in my solution--this doesn't change the output; I just used an unnecessarily large bound in case I did my arithmetic wrong and the correct bound was higher than I thought.


D — String Formation

We can simulate the process fairly directly. First, we maintain whether or not the string $$$S$$$ is currently reversed or in its original configuration. Then, we maintain two strings $$$B$$$ and $$$A$$$, containing the characters before and after $$$S$$$ when $$$S$$$ is in its original configuration.

Processing the queries is fairly easy. For queries of type $$$1$$$, we just update our variable to indicate that $$$S$$$ is in the reverse of its position before the query. For queries of type $$$2$$$, we simply append the new character to either $$$B$$$ or $$$A$$$. If $$$S$$$ is in its original configuration, we append to the end specified in the problem. If $$$S$$$ is reversed, we append to the other end, because, for example, appending to the end when $$$S$$$ is reversed is equivalent to appending to the beginning when $$$S$$$ is in its original configuration.

Then, at the end of the process, if $$$S$$$ is in its original configuration, we print $$$B^RSA$$$, where $$$B^R$$$ is $$$B$$$ reversed (we define this operator for other strings similarly). We reverse $$$B$$$ because we want characters added to the beginning last to show up at the beginning of our string. If $$$S$$$ is reversed, we print $$$A^RS^RB$$$, which is the reverse of this string.

Runtime: $$$O(|S|+Q).$$$ Click here for my submission.


E — Divisible Substring

We use a common trick in problems asking us to count substrings satisfying a specific criterion. Essentially, we store the residues, modulo $$$P$$$, of every prefix of the string, followed by appending zeros until we reach $$$N$$$ characters. For example, in the first sample input, the numbers we get are $$$0000$$$, $$$3000$$$, $$$3500$$$, $$$3540$$$, and $$$3543$$$, giving us residues of $$$0, 0, 2, 0,$$$ and $$$0$$$ modulo $$$3$$$. Then, given that each residue appears $$$C[i]$$$ times in the result, we sum $$$\dbinom{C[i]}{2}$$$ over all $$$i$$$ and print that as our answer. There's one catch: because we're appending extra zeros, we handle the cases where $$$P = 2$$$ or $$$5$$$ separately. (These are fairly easy because we simply need to count the substrings whose last digits are divisible by $$$P$$$.)

Why does this work? Notice that if we subtract the $$$i$$$'th value in this array from the $$$j$$$'th, we get the substring over the range $$$i+1 \cdots j$$$, followed by some zeros. For example, subtracting $$$3000$$$ from $$$3540$$$ gives us $$$540$$$. Moreover, note that as long as $$$10$$$ is relatively prime to $$$P$$$, appending zeros won't affect whether or not this is divisible by $$$P$$$. Thus, we can choose any two of these numbers with the same residue mod $$$P$$$ and subtract them to get a substring divisible by $$$P$$$, while choosing any two of these numbers with different residues mod $$$P$$$ will not give us a substring divisible by $$$P$$$. Therefore, this process computes exactly the right answer.

How do we compute these numbers? Note that we don't need to know the numbers exactly (and in fact, computing all of them would require $$$O(N^2)$$$ memory): we only need their residues modulo $$$P$$$. We can thus compute these values sequentially. First, we compute the residue of each prefix--to compute one prefix's residue from the last, multiply by ten, add the new digit, and take the remainder modulo $$$P$$$. Then, to get the residue of a number from its corresponding prefix, multiply by $$$10^{N-1-i}$$$, given that we just added the digit in position $$$i$$$ in the string. These powers of ten can be computed quickly using any efficient modular exponentiation algorithm.

Once we have the counts of each residue, we can compute the answer easily.

Runtime: $$$O(N \log P).$$$ Click here for my submission.


F — Removing Robots

Start by sorting the robots by position. Realize that each robot $$$i$$$, either directly or through a chain reaction, will necessarily activate all robots up to some position $$$L[i]$$$, and no robots after this position. As a subproblem, we determine how to compute $$$L[i]$$$. We compute the values in reverse. Maintain a segment tree storing the values of $$$L$$$. Then, for each robot, use binary search to determine the furthest robot it directly activates. Finally, set $$$L[i]$$$ to the maximum of the values of $$$L[i]$$$ up to this furthest robot (using the segment tree to query for this value). This works because the furthest robot activated by $$$i$$$ is also the furthest robot directly or indirectly activated by any robot $$$i$$$ touches.

Now, let's compute the answer. Let $$$dp[i]$$$ be the number of sets of robots that could be activated given that we're only activating robots in the range $$$i \cdots N-1$$$. Define $$$dp[N]$$$ to be $$$1$$$. We see that $$$dp[0]$$$ is our answer. Then, we compute $$$dp[i]$$$ from $$$N-1$$$ to $$$0$$$. Note that when we consider adding given robot, we have two possible choices. We could avoid activating this robot, giving us $$$dp[i+1]$$$ possible sets (since if we're considering robots $$$i \cdots N-1$$$ but don't activate robot $$$i$$$, we can now pick any valid set from $$$i+1 \cdots N-1$$$). Alternatively, we can activate this robot, which adds robots $$$i \cdots L[i]$$$ to this set and gives us $$$dp[L[i]+1]$$$ possible sets. Notice also that if we're considering robots $$$i \cdots N-1$$$ and don't directly activate robot $$$i$$$, no higher-numbered robot will activate robot $$$i$$$, so the only way we'll ever activate robot $$$i$$$ is if we choose it now. Thus, we have $$$dp[i] = dp[i+1] + dp[L[i] + 1].$$$

We compute this recursion and print $$$dp[0]$$$ to get our answer.

Runtime: $$$O(N \log N)$$$. Click here for my submission.

Full text and comments »

  • Vote: I like it
  • +81
  • Vote: I do not like it

By Geothermal, history, 5 years ago, In English

A — Serval vs Monster

Unpacking the problem, we need to find the least multiple of $$$A$$$ that is greater than $$$H$$$. This is equal to $$$H$$$ divided by $$$A$$$, rounded up, or $$$\lceil \frac{H}{A} \rceil$$$. Using the integer division provided by most programming languages, we can compute this as $$$\frac{H + A - 1}{A}$$$, rounded down. (This is correct because when $$$H$$$ is equal to $$$0$$$ mod $$$A$$$, the $$$A-1$$$ component will be discarded, but otherwise, adding $$$A-1$$$ will be enough to push $$$H$$$ to the next multiple of $$$A$$$, effectively adding $$$1$$$ to the result similarly to rounding up.)

Runtime: $$$O(1)$$$. Click here for my submission.


B — Common Raccoon vs Monster

Given that we can't use the same move twice, our best option is to just use all the moves once, so the amount of damage we can deal is the sum of $$$A_i$$$ over all $$$i$$$. We can compute this sum and then compare it to $$$H$$$, printing Yes if the sum is at least $$$H$$$ and No otherwise.

Runtime: $$$O(N)$$$. Click here for my submission.


C — Fennec vs Monster

First, we claim that we should use the special move on the $$$K$$$ monsters with the greatest $$$H_i$$$. To prove this, note that we should never attack a monster and then use the special move on it (it'd be pointless--we could save a move just by not attacking beforehand), so the special move will always save us exactly $$$H_i$$$ attacks when used on monster $$$i$$$. Thus, since we want to save as many attacks as possible, we should use the special move on the monsters with the $$$K$$$ greatest values of $$$H_i$$$.

Now, we need to attack the remaining $$$N-K$$$ monsters. We sort the data and take the sum of the $$$N-K$$$ (or $$$0$$$, if $$$N < K$$$) monsters with the lowest $$$H_i$$$. We can then print this sum as our answer.

Runtime: $$$O(N \log N)$$$. Click here for my submission.


D — Caracal vs Monster

We claim that the answer is $$$2^{\lfloor \log_2 H \rfloor + 1} - 1$$$. We will prove this by strong induction on $$$H$$$. The base case, $$$N = 1$$$, is trivial: $$$2^{0 + 1} - 1 = 2 - 1 = 1$$$, and indeed, we need to use only a single attack to kill the monster, as desired.

Now, for our inductive step, we prove that this answer is correct for some $$$H$$$ given that it works for all other values. First, notice that the answer for $$$H$$$ is equal to one more than twice the answer for $$$\lfloor \frac{H}{2} \rfloor$$$, since after our first attack, we have two independent subproblems with value $$$\lfloor \frac{H}{2} \rfloor$$$. Thus, our answer for $$$H$$$ is

$$$2(2^{\lfloor \log_2 \lfloor \frac{H}{2} \rfloor \rfloor + 1} - 1) + 1.$$$

We can observe that $$$\lfloor \log_2 \lfloor \frac{H}{2} \rfloor \rfloor = \lfloor \log_2 H \rfloor - 1$$$. The intuition comes from a basic application of logarithm rules, but we can also prove this formally: note that if $$$H$$$ is an odd number greater than $$$1$$$, then $$$\lfloor \log_2 H \rfloor = \lfloor \log_2 (H-1) \rfloor$$$ and $$$\lfloor \frac{H}{2} \rfloor = \lfloor \frac{H-1}{2} \rfloor$$$, so subtracting $$$1$$$ from $$$H$$$ won't change the result on either side of the equation. Thus, if $$$H$$$ is odd, we can subtract $$$1$$$ from it, so we can assume $$$H$$$ is even. Then, $$$\lfloor \frac{H}{2} \rfloor = \frac{H}{2}$$$, and $$$\lfloor \log_2 \frac{H}{2} \rfloor = \lfloor \log_2 H \rfloor - 1$$$, as desired, where this final step comes from our logarithm rules.

Thus, our answer becomes

$$$2(2^{\lfloor \log_2 H \rfloor} - 1) + 1 = \boxed{2^{\lfloor \log_2 H \rfloor + 1} - 1},$$$

as desired.

Now, we can simply implement this formula and output its value for our $$$H$$$.

Runtime: $$$O(1)$$$ if you use a native function to compute logarithms, or $$$O(\log H)$$$ if you do it yourself. Click here for my submission.


E — Crested Ibis vs Monster

Though it initially might seem like this problem demands some sort of greedy solution, the problem is fraught with edge cases, so coming up with a correct greedy approach is more difficult than it looks (maybe even impossible). Luckily, the constraints are low enough that $$$O(HN)$$$ solutions are admitted, motivating a dynamic programming approach.

Let $$$dp[i]$$$ be the fewest MP necessary to deal $$$i$$$ damage. Of course, we are looking for $$$dp[H]$$$. To start off, note that $$$dp[0] = 0$$$, and initialize all other $$$dp[i]$$$ to $$$\infinity$$$. Then, our transitions are our spells--for each $$$i$$$ and $$$j$$$, we take $$$dp[i] = min(dp[i], dp[i-A_j] + B_j)$$$, raising $$$i-A_j$$$ to zero if it is negative. The intuition is that we're transitioning from the state $$$i-A_j$$$, since that's the damage we had dealt before casting this spell, and then we add $$$B_j$$$, the cost of this spell.

Our answer is then $$$dp[H]$$$.

Runtime: $$$O(HN)$$$. Click here for my submission.


F — Silver Fox vs Monster

My solution essentially overkills this problem with a lazy segment tree. Of course, this problem can be solved via several simpler approaches, most of which implement the same general idea using prefix sums or a BIT (or any other structure that can support range updates and point queries), but the asymptotic complexity is the same either way, and I found this solution easiest to write quickly since I had an easily accessible template I could use.

We start by sorting the monsters by location and creating a map from monster locations to their indices in the sorted list. We then create a lazy segment tree, where the value at position $$$i$$$ represents the health of monster $$$i$$$, where monsters are indexed based on their position in the sorted list. We initialize all of these values to $$$H_i$$$.

Then, we proceed through the monsters in increasing order of location. Our basic idea is to continually drop bombs whose leftmost position is the position of the leftmost remaining monster. We can prove that this will lead to an optimal solution because any bombs with positions further to the right will avoid killing the leftmost remaining monster, and this configuration kills the leftmost monster while maximizing damage to monsters further to the right.

For each monster $$$i$$$, we start by querying the segment tree for its health. Then, we compute the number of bombs $$$B$$$ necessary to kill it, similarly to our approach to problem A. We also compute, using the map created beforehand, the highest index $$$j$$$ whose monster is at a position less than or equal to $$$X_i + 2D$$$, since this will be the furthest monster to the right that we can reach. Finally, we update the segment tree by subtracting $$$AB$$$ over the range $$$[i, j]$$$ and add $$$B$$$ to our answer. Then, we move on to the next monster.

Runtime: $$$O(N \log N)$$$, due to sorting, our maps, and $$$O(N)$$$ segment tree operations. Click here for my submission.

Full text and comments »

  • Vote: I like it
  • +23
  • Vote: I do not like it

By Geothermal, history, 5 years ago, In English

A — Next Alphabet

The easiest way to deal with this problem is probably to convert the given character to an integer, add one, and convert it back. If you absolutely can't do that, a somewhat more annoying implementation would involve iterating through the alphabet, waiting until you find the given letter, using a boolean variable to store that the letter has been found, and printing the letter on the next iteration of the loop. (However, iterating over the alphabet is itself nasty unless you have conversion from letters to integers, so this approach is rather pointless unless your language just doesn't support the first method, though in that case you should probably switch languages anyway.)

Runtime: $$$O(1)$$$. Click here for my submission.


B — Achieve the Goal

In total, we need to earn $$$NM$$$ points. We subtract the points earned so far and determine whether this is a valid score for the final test. There are two special cases to deal with. First, if we need more than $$$K$$$ points, the answer is $$$-1$$$ because we cannot achieve the goal. Second, if we need a negative number of points, we should print $$$0$$$, since we must earn a nonnegative score. Otherwise, we can print $$$NM$$$ minus the sum of the input values.

Runtime: $$$O(N)$$$. Click here for my submission.


C — Welcome to AtCoder

This problem can be solved via direct simulation. Maintain two arrays: the number of incorrect submissions and whether we've seen an accepted solution, with both arrays containing entries for each problem. Then, when we see an incorrect submission for a problem, we can increment the first array. When we see a correct submission, we check whether this is the first correct submission we've seen so far and, if so, add the count of incorrect submissions for this problem to our answer.

Runtime: $$$O(N)$$$. Click here for my submission.


D — Maze Master

Note that using BFS, we can compute the distance from one point to all other points in $$$O(HW)$$$. Thus, by performing BFSes rooted from all $$$HW$$$ points, we can compute all pairwise distances in this graph in $$$O(H^2W^2)$$$. From here, we can iterate over each of these distances and compute the largest among them, which is our answer.

Runtime: $$$O(H^2W^2)$$$. Click here for my submission.


E — Max-Min Sums

We will separately count the number of times each element appears as the minimum and the maximum in a set of $$$K$$$ integers. Then, if $$$A_i$$$ appears $$$L_i$$$ times as the largest element in the set and $$$S_i$$$ times as the smallest element in the set, we add $$$(L_i - S_i) A_i$$$ to our answer.

First, sort the array and zero-index it (so that, for example, $$$A_0$$$ is the smallest element). Note that the number of sets in which $$$A_i$$$ is the largest value is the number of ways to choose $$$K-1$$$ smaller integers, or $$$\dbinom{i}{K-1}$$$. Meanwhile, the number of sets in which $$$A_i$$$ is the smallest value is the number of ways to choose $$$K-1$$$ larger integers, or $$$\dbinom{N-i-1}{K-1}$$$. We can thus compute $$$L_i$$$ and $$$S_i$$$ for each value in $$$O(1)$$$ each, with $$$O(N \log MOD)$$$ precomputation. (If you aren't familiar with how to efficiently compute the choose function, I strongly recommend looking into this--the basic idea is to precompute all relevant factorials and their modular inverses, which we can then use to compute the value of the choose function.) Now, we can use these values to compute our final answer.

Runtime: $$$O(N \log MOD)$$$. Click here for my submission.


F — Enclose All

Using the Google Algorithm, we can find code for this problem on the internet in $$$O(1)$$$ real time. From here, we can apply the copy-and-paste lemma to enter this code into our editor. Afterwards, we can read in our input and feed it to our copied algorithm, printing out the result. If you implemented the Google Algorithm effectively, your code should have $$$O(N)$$$ runtime, which will easily pass in time. This is the approach I implemented during the contest.

Of course, I'm (mostly--I actually did do this during the contest) joking, and I'll now describe the solution I assume was intended. The basic idea is that there are two cases to deal with. First, two points form a diameter of the circle, or second, at least three points are on the circle.

We will now prove that in any other case, we can shrink the circle to create a better answer. First, if only one point is on the circle, we can shrink the circle by dilating it with respect to that one point until another point is on the border, shrinking it without excluding any points. Second, if two points are on the circle, we can move the center towards these points while keeping them on the border, until the two points are on the diameter or another point is on the border, to shrink the circle without excluding any points. The reason this shrinks the circle is that the distance between the two points is constant, but the distance from the center of the circle to the line is decreasing, so by the Pythagorean Theorem we can see that the distance from the center to each of the two points, which is the radius, decreases. In all other cases, we have that either two points are on the diameter or three points are on the circle, as desired.

Thus, we can make a list of possible circles by computing the circles with the segments connecting each pair of points as diameters and the circumcircles of all triples of points. The circumcenter can be computed with some algebra by noting that it is the intersection of the perpendicular bisectors or through a closed form in terms of the given coordinates. Given the circumcenter, we can compute the circumradius by finding its distance to one of our three points. Now, we can consider each of these circles in turn and determine whether they contain all points. Among all the circles that do contain all of the points, we print the smallest radius.

Runtime: $$$O(N^4)$$$ if implemented from scratch, or $$$O(N)$$$ if copied from the internet. Click here for my submission. All credit goes to nayuki.io!

A bit of a soapbox: though this was an ABC, I think this problem was far too standard to appear on a modern programming competition, especially as problem F. Finding the solution on Google took me less than five minutes during the round, and given the simplicity of the problem statement, at least one of the round's authors/testers ought to have realized that the solution might exist on the internet already. Then, they should have done some basic research and, upon realizing how easy it would be to copy/paste the solution, scrapped this problem and written a new F.

Yes, writing original, challenging problems is very challenging, but this would have been a very easy issue to avoid. And, yes, the intended solution was not absurdly complex, but this problem still substantially advantages those who use Google and copy/paste code (including yours truly, oops), and even the intended solution is a lot easier to write if you Google the circumcenter formula. I'm not especially bothered by this as a one-time issue, but I do hope that this doesn't create a norm of allowing standard problems to appear as ABC E or F. (In contrast, for example, I'm mostly fine with problem D, even though it was also very standard, because at that level the focus probably should be on educating newcomers over encouraging competition.) I don't intend this as a personal attack on the authors, though, and I thank them for the contest and for the other five problems, which made for a very solid (if somewhat easier than usual) ABC.

Full text and comments »

  • Vote: I like it
  • +40
  • Vote: I do not like it

By Geothermal, history, 5 years ago, In English

A — 500 Yen Coins

We are essentially asked whether $$$500K \geq X$$$. We can determine this using an if statement. If you'd like to be fancy, you can shorten your code using the ternary operator, printing $$$\texttt{500K >= X ? "Yes" : "No"}$$$.

Runtime: $$$O(1)$$$. Click here for my submission.


B — Count ABC

There are several ways to do this. The first is to compute each three-letter substring of $$$S$$$, either through brute force or your language's substring computation function, then comparing them to "ABC". Another, which I implemented, is to simply iterate over each position in the string up to $$$N-3$$$, using zero-indexing, and check whether $$$S[i] = A$$$, $$$S[i+1] = B$$$, and $$$S[i+2] = C$$$. If so, we increment the answer. Either way, we can maintain a count of "ABC" substrings and return it at the end. (Of course, we theoretically could use a more complicated pattern matching algorithm, but because the string whose occurrences we're counting is extremely short, brute force is more than sufficient here.)

Runtime: $$$O(N)$$$. Click here for my submission.


C — Count Order

Obviously, if we can compute $$$a$$$ and $$$b$$$ efficiently, we're finished. Since the two values can be computed similarly, this reduces the problem to determining the lexicographical position of a permutation.

Luckily, $$$N$$$ is quite small, so we can use brute force. Using $$$\texttt{next_permutation}$$$ in C++, we can generate all permutations in lexicographical order and compare each of them to $$$P$$$ and $$$Q$$$ in order to compute $$$a$$$ and $$$b$$$. Then, we can print $$$|a-b|$$$, as requested. If you're not using C++, you could simply generate all arrays of $$$N$$$ numbers from $$$1$$$ to $$$N$$$ and ignore the ones that aren't permutations. This has complexity $$$O(N^{N+1})$$$, which still passes in time.

As an extra challenge, it turns out that there's a way to solve this problem in $$$O(N \log N)$$$. Of course, doing so is not necessary here and takes substantially more thought and code.

Runtime: $$$O(N \cdot N!)$$$. Click here for my submission.


D — Semi Common Multiple

It should be noted that much of this explanation is largely messy number theory used to prove that the solution works. The majority of the actual implementation details is contained in the final two paragraphs.

We are essentially being asked to count values of $$$X$$$ such that $$$X = \frac{a_i}{2} \mod a_i$$$ for all $$$i$$$. Let's start by characterizing the possible values of $$$X$$$. The key here is the Chinese remainder theorem, which tells us in this case that there is at most one valid solution to this system of modular congruences when we take it modulo $$$L = \texttt{lcm}(a_1, a_2, \cdots, a_n)$$$. In other words, there is at most one value $$$K$$$, with $$$0 \leq K < L$$$, such that $$$K = \frac{a_i}{2} \mod a_i$$$ for all $$$i$$$.

Now, there are two possible cases. First, suppose $$$2$$$ divides some two values of $$$a_i$$$ in the array a different number of times. (This is the case in the second sample input.) In this case, there is no working value of $$$K$$$. To prove this, suppose that $$$a_i$$$ is divisible by $$$2^a$$$ but not $$$2^{a+1}$$$, and $$$a_j$$$ is divisible by $$$2^b$$$ but not $$$2^{b+1}$$$, with $$$a < b$$$. Then, any value that is $$$\frac{a_i}{2} \mod a_i$$$ must be divisible by $$$2^{a-1}$$$ but not $$$2^a$$$, but any value satisfying the same equation for $$$a_j$$$ must be divisible by $$$2^{b-1}$$$, which, since $$$a < b$$$, implies that it is divisible by $$$2^a$$$, a contradiction. Therefore, in this case, the answer is $$$0$$$.

Now, assume that $$$2$$$ divides all $$$a_i$$$ the same number of times. In this case, $$$K = \frac{L}{2}$$$. We can see that since $$$L$$$ contains the same power of two as all of the $$$a_i$$$, $$$K$$$ contains the same power of two as all of the $$$\frac{a_i}{2}$$$ values, and for all other primes $$$p$$$ such that $$$p^k$$$ divides $$$a_i$$$, $$$L = a_i = 0$$$ mod $$$p^k$$$, so $$$\frac{L}{2} = \frac{a_i}{2}$$$ mod $$$p^k$$$, so $$$\frac{L}{2}$$$ will be congruent to $$$\frac{a_i}{2}$$$ modulo any power of any prime dividing $$$a_i$$$, proving that $$$\frac{L}{2} = \frac{a_i}{2} \mod a_i$$$, as desired.

By the way, it's probably not worth working through the logic to prove that $$$\frac{L}{2}$$$ works in the second case below--in-contest, I figured this out by experimenting with the sample inputs.

From here, we can first determine which case we're in--we can do this either by counting the times $$$2$$$ divides each $$$a_i$$$ or by computing $$$\frac{L}{2}$$$ and checking that it is congruent to $$$\frac{a_i}{2}$$$ modulo $$$a_i$$$ for all $$$i$$$. (Recall that we can compute the LCM of two numbers by multiplying them and taking their GCD. To prevent overflow, we need to return zero immediately if $$$L$$$ gets too large, noting that if $$$L > 2M$$$, the answer will always be zero.)

Now, we need to count values that are $$$\frac{L}{2}$$$ mod $$$L$$$ and less than or equal to $$$M$$$. To start, our answer is at least $$$M / L$$$, rounded down, since for any block of $$$L$$$ consecutive numbers, one of them will be $$$\frac{L}{2} \mod L$$$. Then, if $$$M \% L \geq \frac{L}{2}$$$, we get one extra value, so we increment the answer. Finally, we print the result.

Runtime: $$$O(N \log M)$$$. (The latter factor comes in from our LCM computation.) Click here for my submission.


E — Change a Little Bit

Starting with an observation, note that we should make the necessary changes in decreasing order of cost, as we want to pay the smallest cost the most times and the largest cost the fewest times. Let's sort the data so that without loss of generality, $$$C_1 \geq C_2 \geq \cdots \geq C_N$$$.

Now, let's compute the amount each $$$C_i$$$ contributes to the answer. We do this by computing the amount each $$$C_i$$$ contributes on average given a completely random $$$(S, T)$$$ pair (to simplify the computation, we won't assume $$$S \neq T$$$, as the cases where $$$S = T$$$ don't actually add to the answer) and multiplying by the $$$2^{2N}$$$ total ways to pick $$$S$$$ and $$$T$$$.

How much does $$$C_i$$$ contribute for any given string? Obviously, if the two strings are not different in position $$$i$$$, $$$C_i$$$ doesn't contribute at all. However, if the strings are different in position $$$i$$$, $$$C_i$$$ will count once for position $$$i$$$, plus once for each position less than $$$i$$$ at which $$$S$$$ and $$$T$$$ are different. $$$S$$$ and $$$T$$$ have a $$$\frac{1}{2}$$$ chance of differing at each position, so this adds $$$\frac{i-1}{2}$$$, using one-indexing, to our count, for a total of $$$\frac{i+1}{2} C_i$$$ as our average cost among strings that differ in position $$$i$$$. Half the strings differ in position $$$i$$$, so $$$C_i$$$ contributes $$$\frac{i+1}{4} C_i$$$ to our sum.

Thus, our answer is $$$2^{2N} \sum_{i = 1}^N \frac{i+1}{4} C_i$$$, or alternatively, $$$2^{2N-2} \sum_{i = 1}^N (i+1)C_i$$$. We can directly compute this sum in $$$O(N)$$$.

Runtime: $$$O(N \log N)$$$, due to sorting. Click here for my submission.


F — Xor Shift

Consider the function $$$d(a)$$$ that computes, given an array $$$a$$$, a result array such that $$$d[i] = a[i] \, XOR \, a[i+1]$$$ for $$$0 \leq i < N-1$$$ and $$$d[N-1] = a[N-1] \, XOR \, A[0].$$$ The first insight here is that the operation given in the problem is that $$$d(a')$$$ is a cyclic shift of $$$d(a)$$$ by $$$k$$$ positions. The proof is that XORing two values by the same number does not change the XOR of the two values--in other words, $$$(a \, XOR \, x) \, XOR \, (b \, XOR \, x) = a \, XOR b$$$ for all $$$a, b, x$$$, since XOR is commutative, associative, and satisfies $$$x \, XOR \, x = 0$$$. Therefore, since we obviously must have $$$d(a') = d(b)$$$, our values of $$$k$$$ must have that $$$d(a)$$$, cyclically shifted $$$k$$$ positions, is equal to $$$d(b)$$$.

The second observation is that all of these values of $$$k$$$ have exactly one value of $$$x$$$. The proof is that we can uniquely construct an array $$$a$$$ from $$$d(a)$$$ given $$$a[0]$$$, by the recursion $$$a[i+1] = a[i] \, XOR \, d[i]$$$. Thus, if $$$a[0] \, XOR \, x = b[(N-k) \mod N]$$$ and $$$d(a') = d(b)$$$, we will have $$$a' = b$$$. Thus, for any $$$k$$$ such that $$$d(a') = d(b)$$$, we can take $$$x = a[0] \, XOR \, b[(N-k) \mod N]$$$ and have a working pair. Note that this uniquely defines $$$x$$$, so there will be at most one value of $$$x$$$ for any $$$k$$$.

Now, we need to enumerate the ways to cyclically shift $$$d(a)$$$ to get $$$d(b)$$$. There are several ways to do this--the one I used involves hashing the sequence. Note that we can compute the polynomial hash of an array $$$d$$$ as $$$d[0] \cdot s^{N-1} + d[1] \cdot s^{N-2} + \cdots + d[N-1] \cdot s^0$$$ for some arbitrary base $$$s$$$, all modulo some large prime. We can compute the hash of $$$a$$$ and $$$b$$$ in $$$O(N)$$$. Then, to cyclically shift our hash of $$$a$$$ by one position, we can subtract $$$a[0] \cdot s^{N-1}$$$, multiply the remainder of the hash by $$$s$$$, and add $$$a[0]$$$ back. Now, we have the hash of the array $$$a[1], a[2], \cdots, a[N-1], a[0]$$$. We can repeat this process for all $$$i$$$ and list the cyclic shifts for which our hash of $$$a$$$ equals our hash of $$$b$$$. With an appropriate hashing prime (the prime should be low enough to prevent long long overflow but higher than $$$2^{30}$$$ to prevent collisions), this is extremely unlikely to cause collisions and should pass essentially every time.

Once we have our values of $$$k$$$, we can use the approach outlined above to compute the corresponding values of $$$x$$$, at which point we're done.

Complexity: $$$O(N)$$$. Click here for my submission.

Full text and comments »

  • Vote: I like it
  • +25
  • Vote: I do not like it