marat.snowbear's blog

By marat.snowbear, history, 9 years ago, In English

I was trying to add yesterday Google Code Jam qualification round to the gym and decided this time to avoid typing in statements myself and print them from website to PDF and attach to the contest. I thought that I'll be able to create a contest with statements in attachments easily and get contest like 2009-2010 Petrozavodsk Winter Training Camp, Moscow SU Unpredictable + SE + old ST Contest for example. It turned out that I have no idea how do that.

What I did:

  1. Created regular contest in Polygon with checkers, tests, solutions, etc but with no statements.
  2. Created a gym contest and added polygon contest's descriptor through FTP.

This creates a contest but there are two issues about it:

First of all I had no idea how to attach PDF statements. Eventually I was able to do that using Codeforces Contest Wizard in update mode using a link in gym contest. This Java tool is not that intuitive to use but I was able to upload my PDF document eventually and specify that this is English statements. Is there easier/better/other way to do that?

Secondly, my problems in gym have no name, usually I specify the names as part of the statement in the Polygon but since I had no statements in Polygon I had no names as well. I tried using Contest Wizard to solve this issue as well but the only way I think I found to do that requires uploading all problem data (checker/tests/etc) which I'm not going to change and don't want to bother getting that again assuming that Polygon already created everything.

If somebody wants to help me with that I can give an access to the gym contest, but I would be grateful for an advice as well.

Full text and comments »

  • Vote: I like it
  • +16
  • Vote: I do not like it

By marat.snowbear, history, 9 years ago, In English

I'd like to inform you that on 30th of June on Coursera will start second session of Tim Roughgarden's "Algorithms: Design and Analysis, Part 1" course. I took first session of this course a couple of years ago, in fact this is a source of my initial knowledge about algorithms and data structures and I started taking part in competitive programming contests soon after finishing second part of the course. I would definitely recommend taking it, I'd say that it should be interesting and useful for CF green & blue rating zone coders. The course consists of video lectures, quizzes and programming assignments (it's not a pure theory class) and will take approximately 4-8 hours per week to finish it successfully.

To give you an idea on topics covered by this course, let me just copy the course syllabus here:

Week 1: Introduction. Asymptotic analysis including big-oh notation. Divide-and-conquer algorithms for sorting, counting inversions, matrix multiplication, and closest pair.

Week 2: Running time analysis of divide-and-conquer algorithms. The master method. Introduction to randomized algorithms, with a probability review. QuickSort.

Week 3: More on randomized algorithms and probability. Computing the median in linear time. A randomized algorithm for the minimum graph cut problem.

Week 4: Graph primitives. Depth- and breadth-first search. Connected components in undirected graphs. Topological sort in directed acyclic graphs. Strongly connected components in directed graphs.

Week 5: Dijkstra's shortest-path algorithm. Introduction to data structures. Heaps and applications.

Week 6: Further data structures. Hash tables and applications. Balanced binary search trees.

There is a second part of this course as well which covers greedy algorithms, dynamic programming and NP hard problems, I would recommend taking it as well but it turns out that there was a session of it which finished recently, I don't know when they will schedule the next one but I guess you can still watch the lectures.

Full text and comments »

  • Vote: I like it
  • +93
  • Vote: I do not like it

By marat.snowbear, history, 9 years ago, In English

I have written a bat script for Windows to automate a bit testing DCJ solutions on multiple examples, putting it here, might be helpful to somebody. I've written it to test C++ solutions but in case you write in other languages you should be able to change it easily.

script itself:

echo off

set PYTHONPATH=C:\Program Files (x86)\Python27\
set DCJ_PATH=c:/Temp/SP/DistributedCodeJam/dcj/dcj.py

cls
for %%* in (.) do set TaskName=%%~nx*
echo %TaskName%

for %%f in ("%TaskName% (*).h") do (
	echo ----------------------------
	echo %%f
	copy /y "%%f" %TaskName%.h > NUL	
	copy /y "%%f.out" expected.out > NUL
	for /F "delims=" %%i in (expected.out) do echo EXPECTED: %%i

	"%PYTHONPATH%\python.exe" "%DCJ_PATH%" test --source=solution.cpp --nodes=3
)

There are two path variables on top, don't forget to set them appropriately.

Here is an output for sandwich problem:

Sandwich
----------------------------
sandwich (1).h
EXPECTED: 14
STDOUT 0: 14
Duration: 15.6001ms (longest running instance: 2)
----------------------------
sandwich (2).h
EXPECTED: 0
STDOUT 0: 0
Duration: 15.6001ms (longest running instance: 1)
----------------------------
sandwich (3).h
EXPECTED: 5
STDOUT 0: 5
Duration: 31.2002ms (longest running instance: 0)
----------------------------
sandwich (4).h
EXPECTED: 50
STDOUT 0: 50
Duration: 15.6001ms (longest running instance: 2)

How to make it working:

  1. Download sample libraries from DCJ Dashboard. I download them with Google Chrome to the same folder all at once, so they get names like 'sandwich.h', 'sandwich (1).h', 'sandwich (2).h', etc. The script assumes that all input files have names matching 'PROBLEM NAME (TEST_CASE).h' pattern, to do that I simply download first input file twice so it becomes 'sandwich (1).h'.
  2. Copy all 'sandwich (*).h' files to the same folder where your solution is.
  3. Put this script to the same folder, change path variables in the script.
  4. For each *.h file create an expected output file appending .out extension to it. So expected output for 'sandwich (1).h' should be at 'sandwich (1).h.out'.

That should be it and you should be able to run the script and check all testcases in one run. It is assumed that you already made DCJ tool running on your PC in advance.

Full text and comments »

Tags dcj, gcj
  • Vote: I like it
  • +47
  • Vote: I do not like it

By marat.snowbear, 9 years ago, In English

I'd like to start a discussion on how CF community is involved or at least notified about the changes happening in CF and in CF API in particular.

It seems that there are people more or less constantly working on implementing some new features on CF. And sometimes we can see blog posts from MikeMirzayanov or somebody from CF team highlighting a list of changes which happened during some period, but these posts are very rare, they are published like once or twice per year, while it seems that we have something changed almost every month. CF has very strong community, some people from the community do something on their own built on top of CF web site or CF API (submit solution from command line, VIM plugin, my CF achievements project). For all of us it is critical to be aware of changes happening on CF. For example this submission script stopped working after CSRF security fix. Another example I found today: since it's launch CF API had a bug that when retrieving hacks hacker and defender were swapped, turns out this was fixed several weeks ago. My achievements web site relies on retrieving hacks through CF API, so I had a workaround to swap them back, and turns out it's broken for several weeks because there is no need to swap hacker&defender anymore. Of course I can blame myself in this as well, I should have implement a check because I expected that CF will fix the bug at some point, but still I think it would be great if CF administration could inform us about such changes happening (hopefully in advance of course).

So my main concern is that CF community is not informed well about some important changes happening on CF, it makes it harder to do something on top of CF if you are not even aware when API is changing or some bug is being fixed. I'm not sure how it should be handled though, CF blogs do not seem to be a good solution in this case because they get almost no attention as soon as they go off the 'recent actions' list. Also major part of the community might be not interested in these updates. I'm not asking to implement something for this particular purpose, even simple mailing list will work fine for me.

Another point similar to the one I discussed above is that it's not clear how community can affect the development of CF API. Obviously CF API is implemented for the community so we should have a way to tell CF what we actually need. For example while working on CF Achievements site there were several places where CF API was not enough to grab all the data I need and I wanted to inform CF about it, but I don't what's the best way to do it:

  1. PM to Mike? Quite often he might be busy, might not have time to respond, etc. Also again it leaves the entire community uninformed. Some other people in community might also need similar features so it doesn't seem to make sense to discuss this topic in PM.

  2. Ivan's introduction of CF API blog post? But it's quite old, I don't think anybody from CF team monitors it, Ivan is not working in CF anymore.

  3. Create a new blog post? Also doesn't seem to be a good option. It will be lost as soon as it will go off 'recent actions' list, it might be unnoticed by CF team.

Again, I'm not sure what is the best option to handle it, but something like issue tracker should do it well.

So I want to discuss this topic here both with the community and with CF administration. It might be a case that community is happy with the way it is right now, in this case I'll just switch back to PM'ing CF administration :-)

Full text and comments »

  • Vote: I like it
  • +47
  • Vote: I do not like it

By marat.snowbear, 10 years ago, In English

Hi!

I'd like to introduce my project to Codeforces community. My project's name is 'Codeforces achievements' and as you can probably guess it is about achievements! Your achievements on Codeforces site. For those who do not like my long posts I should mention that the link to the site is at the end of this post.

Introduction

Last summer I left my last job, so I have quite a lot of spare time which I could invest in my hobby. Also I decided to learn something more or less new for me (for last several years my job was to write desktop software for Windows), so I decided to put my hands on some modern popular web-framework. At that point I chosen Python and Django, cause it was matching one position I applied. So I had a time and I wanted to learn Django. The best way to learn something is to practice in it, so I started looking for the idea for the site to create (didn't want to write yet another forum engine). Luckily at that point I ran into I_love_Tanya_Romanova's comment:

System> tourist unsuccessfully challenged LeBron's 250-point problem.
Achievement unlocked:)

That was it! I like the idea of achievements in the games and I decided that I want to something similar for Codeforces. That achievement Bohdan mentioned in his comment became the first achievement I implemented in my CF-achievements.

Achievements

The original idea was to have more funny & unusual achievements and less hard time-consuming achievements like 'solve XXX problems'. I think I didn't really succeed in that, there are not that many achievements there but I still have a couple of ideas for the future (of course I'm open to your ideas as well). Generally at the moment there are two types of achievements implemented. First one is something like 'language expert', in order to get this one you need to solve 50 problems in some particular programming language or language family. The second type of achievements can be rewarded for something you did in a single contest. For example tourist might try to hack your solution and if he doesn't succeed you get a You didn't even scratch me! achievement. Unfortunately Gennady won't be able to get this achievement himself, I apologise for that.

I was trying to come up with funny & interesting names for achievements, if you do not get the point in some achievement's name feel free to Google it or ask about it in the comments.

I should also mention here that all the achievements are given only for the online round you take part in, there is nothing for solving the problems after the round has finished. Also some achievements might take only rated rounds into account. There is a description for each achievement on how to get it.

Technical side

The site at the moment is not automated much, I need to manually start achievement crawling jobs after each contest so please do not blame me if you don't see your achievement two minutes after contest is over. It might take around 10-15 minutes after the system test phase for me to run all the scripts. I might also be in the gym, stay patient then :-) I'm working on automating the site.

And finally...

Finally the link: CF-achievements

Link to 'You didn't even scratch me!' achievement

You start point should be the main page and the user search textbox there, type in your CF handle and see what you have earned.

I should say that the project is in the beta stage at the moment, there is a lot to be done, I just wanted to involve the community into it, get some feedback and maybe some support. I'm open to all ideas about this project, please feel free to post them as comments here. Also you can see that the site was designed by a programmer (me), if you are more talented designer that I am, I would appreciate your help.

Initially I was going to let you know about it after CF round 288, but looks like Div. 1 part of it was cancelled so I decided to introduce it immediately. Also in the last round I_love_Tanya_Romanova got his achievement in CF, the idea of this site was born when I saw his comment so I decided it's good sign to start.

Hope you enjoy it,
Marat.

Full text and comments »

  • Vote: I like it
  • +333
  • Vote: I do not like it

By marat.snowbear, 10 years ago, In English

Hi guys!

I'm currently unemployed, looking for a job and would like to make some survey here, basically the topic would be "the interesting job". To give you a quick introduction: I was working as a .Net developer for around 7 years in different but more or less similar outsourcing companies, I've started participating in competitive programming 1.5 years ago and eventually this ruined my career in regular outsourcing cause the tasks there most of the time are more boring and less brain-teasing than what you have in CP.

To be clear, what I want here is to have as much opinions as possible, I would appreciate any ideas, consider this topic to be a brainstorming platform. I just want to broaden my understanding of what is there outside of the companies where I was working before. I'm interested to hear about your companies (I won't blame you if you decide to advertise your company in this blog post), or maybe you won't give out the company name but will share what do you do there, whether you find it interesting or not, anything.

I'm interested to hear about different areas you are working in, I'm not even restricting it to be a programming job, maybe something CS-related (but still supposed to be interesting for a programmer), maybe you're working in CS-research centres, maybe in some educational centre, maybe you're winning a lot of monthly/yearly contests and selling the T-shirts you get there and that's basically your salary.

Good example of what I'd like to see here is gojira's Rockethon introduction blog post where he was introducing Rocket Fuel, what do they do there, why is it interesting to work there, what might be the challenging task there and what does it have to do with competitive programming.

Full text and comments »

  • Vote: I like it
  • +105
  • Vote: I do not like it

By marat.snowbear, 10 years ago, In English

475A - Bayan Bus

As usual for A we have an implementation problem and the main question is how to implement it quickly in the easiest (thus less error-prone) way. During the contest I spent almost no time thinking about it and decided to implement quite straightforward approach — start with empty string and concatenate different symbols to the end of it. The result is quite messy and worst of all I forgot one of the main principles in the [commercial] programming — separate logic from its presentation. The "presentation" here is all these bus characters which are all the same across different inputs. So I paid my price, my original solution failed on system tests.

Much better approach afterwards I looked up in Swistakk's solution — you copy&paste the "template" of the bus from the example test into your source code, then you just change the characters for each of 34 possible seats. Result becomes much cleaner — 8140120.

475B - Strongly Connected City

For B we already might need some kind of algorithm to solve it, though for the constraints given you might use almost any approach. If you just have a directed graph then in order to check whether you can reach all nodes from all other nodes you can do different things — you can dfs/bfs from all nodes, two dfs/bfs from the single node to check that every other node has both paths to and from that node, or you can employ different "min distance between all pairs of nodes" algorithm. All of these seem to work fine for this problem.

But we shouldn't ignore here that the input graph has a special configuration — it's a grid of vertical and horizontal directed lines. For this graph all you need is to check whether four outer streets form a cycle or not. If they form a cycle then the answer is "YES", otherwise "NO". The good explanation why this works was already given by shato. And then the entire solution takes just 10 lines of code — 8140438.

475C - Kamal-ol-molk's Painting

There are probably several completely different ways to solve this problem, I will describe only the one I implemented during the round. The start point for the solution is that there are three cases we need to cover: either first time we move the brush right or we move it down or we do not move it at all. Let's start with the last one. Since we're not moving the brush at all then it's obvious that altered cells on the painting form a rectangle. It can also be proven that the only case we need to consider here is the rectangle 1x1, that's the only rectangle which cannot be achieved by moving some smaller rectangle. So we need to count the number of altered cells, if it is equal to 1 then the answer is 1.

Now we're left with two cases to consider, you start by moving right in the case and you move down in the second case. You can write some code to cover each of those cases, but you can also notice that they are symmetric in some sense. To be more clear, they are symmetric against the diagonal line which goes from (0, 0) point through (1, 1) point. If you have some code to solve "move down first" case, then you don't need to write almost completely the same code to solve "move right" case, you can simply mirror the image against that diagonal and invoke "move down" method again. Small trick which saves a lot of time and prevents copy-pasting of the code.

Now let's try to solve this last case. Basically the approach I used can be shortly described like that: we start in the leftmost topmost altered cell, then we move down and that move already leaves us only one option what do in the next moves until the end, so we get the entire sequence of moves. As well as getting these moves we also get the only possible width and height of the brush, so we know everything, I was not very strict while moving the brush so at the end I also compared whether these moves and these sizes in fact give exactly the same image as we get in the input.

That was the brief explanation of the "move down" method, now let's get into details. First of all since we move down immediately from the start point then there is only one value which can be valid for the brush width — it is the number of altered cells which go to the right from the start point. So we know one dimension. Now let's try moving the brush. Ok, first time as we said we move it down, what is next? Then you might have a choice whether you want to move right or to move down again. The answer is to move right first because your width is already fixed while height might be unknown still, so if you miss point to be altered in the left column of the brush and you move right, you might still have a chance to paint it if you take the correct height of the brush. But if you miss the point to the right of the current topmost brush row then you won't be able to cover it later, because your width was fixed on the first step. Here is a picture:

Grayed out is the current position of the brush. So what I'm saying is that you should move to the right if the cell in the red area is painted, otherwise you will definitely miss it. So this simple thing gives you the entire idea on how to build the only possible sequence of moves. You also need to calculate some possible value for the brush height. It can be calculated just before moving right, in order not to miss any painted cells you need to extend you height of the brush to cover all the painted cells in the leftmost column of you current brush position (this was highlighted as green on the image above). Now you know all the information about the probable painting — you have both dimensions of the brush and you know how it was moving, as I said before all you need to do is to double check that these moves and these dimensions in fact give you the same painting you get. If it gives the same painting then the answer for this case is width·height, otherwise there is no answer for this particular case.

If you implement all of these steps carefully and you won't paint cells more than one time, then you will be able to achieve an O(N·M) complexity.

475D - CGCDSSQ

It seems that all the solutions for this problem based on the same observation. Let's introduce two number sequences: a0, a1, ..., an and x0 = a0, xi = gcd(xi - 1, ai). Then the observation is that the number of distinct values in x sequence is no more than 1 + log2a0. It can be proven by taking into account that x is non-increasing sequence and value for greatest common divisor in case if it decreases becomes at most half of the previous value.

So we have this observation and we want to calculate the count of GCD values across all the intervals, I see couple of ways how we can exploit the observation:

  1. We can fix the left side of the interval and look for the places where gcd(al, ..., ar) function will decrease. Between the points where it decreases it will stay flat so we can add the size of this flat interval to the result of this particular gcd. There are different ways how to find the place where the function changes its value, some people were using sparse tables, I have never used those so won't give this solution here. Otherwise you can use segment tree to calculate gcd on the interval and then do a binary search to find where this value changes. This adds squared logarithmic factor to the complexity, not sure if that passes the TL easily. Otherwise you can do basically the same by doing only one descent in the segment tree. Then you get rid of one logarithmic factor in the complexity but that will make segment tree a bit more complicated.

  2. Another solution seems to be much simpler, you just go left to right and you calculate what is the number of segments started to the left of current element and what is the greatest common divisors values on those intervals. These values you need to store grouped by the gcd value. This information can be easily updated when you move to the next element to the right — you can check that part in my submission. Our observation guarantees that the number of different gcd values that we have to track is quite small all the time, it's no more than 1 + logAmax. Submission: 8141810, maps are used here to count gcd values, using the vectors instead makes it running two times faster but the code is not that clear then.

475E - Strongly Connected City 2

Here we can start with the simple observation — if we have a cycle then we can choose such an orientation for the edges such that it will become a directed cycle. In this case all nodes in the cycle will be accessible from all other nodes of the same cycle. I was using the bridges finding algorithm to find the cycles. So you find all the cycles, for each cycle of size si you add si2 to the answer. Then you can collapse all the cycles into the single node (storing the cycle size as well). After merging all cycles into single nodes original graph becomes a tree.

Now it comes to some kind of a magic, as far as I've seen from the discussions here people were make an assumption about the optimal solution, but nobody proved this assumption. Assumption goes like this: "there exists an optimal solution which has some node (let's call it root) such that for every other node v you can either reach root from v or you can reach v from root using directed edges of this optimal solution". Intuitively this assumption makes sense because probably in order to increase the number of pairs of reachable nodes you will try to create as least as possible components in the graph which will be mutually unreachable. So we have this assumption, now in order to solve the problem we can iterate over all nodes in the tree and check what will be the answer if we will make this node to be a root from the assumption.

Let's draw some example of the graph which might be a solution according to our assumption:

Here the root is drawn on top and every other node has either a path to the root or a path from the root. So let's say we decided that some node will be a root but we didn't decide yet for each of its children whether their edges will directed to the root or from the root. In order to fulfil our assumption the orientation of the edges within some child's subtree should be the same as the orientation of the edge between the root and its child. There will be two more numbers which we need to add to the result — pairs of reachable nodes within some child's (root children only!) subtree and pairs of reachable nodes for nodes from different subtrees. Let's introduce some variables:

si — size of the i-th cycle from the step when we were merging the cycles. In our tree all cycles are already merged into a single node, so si is a weight of the i-th node. It can be equal to 1 if node i did not belong to any cycle.

Wi — weight of the entire subtree rooted at node i.

It can be seen that if we orient all edges in the subtree in the same manner according to our assumption then the number of reachable pairs of nodes will not depend on the chosen orientation. Each particular node i adds the following term to the final result: si·(Wi - si). Now we need to decide what should be the orientation of all the edges adjacent to the root. Let's declare two more variables:

in — sum of Wi for all root's children whose edge is directed towards the root.

out — same for the children whose edge is directed from the root.

So if we have those definitions then the last term for the result will be equal to: (in + srootout + in·sroot. We can see that this term depends only on the in and out values, it doesn't depend on which particular children contribute to each of these values. We check which values for in and out we can achieve, we can do it using the DP similar to the one used in knapsack. So we check every possible value for in, based on this we can calculate out value because their sum is fixed and equals to Wroot - sroot, then we put them into the formula above and check whether it gives us a better total answer.

There is one more thing to be noted about this solution — you can see that I said that we need to iterate over all nodes and make them to be the root, then you will need to get all the sizes of the children for particular root and for those sizes you will need to find all the possible values for their sums. The sum can be up to N, for some root we might have up to O(N) children, so if you've fixed the root then the rest might have O(N2) complexity. This might lead you to a conclusion that the entire algorithm then has O(N3) complexity, which is too slow for the given constraints. But that is not the case because outermost and innermost loops depend on each other, basically there you iterate over all the parent-children relations in the tree, and we know that this number in the tree equals to 2(N - 1), less than N2, so in total it gives O(N2) complexity for this solution.

Submission: 8168350

475F - Meta-universe

Let's try to solve this problem naively first, the approach should be obvious then — you have a set of points, you iterate them left to right and you mind the gap between them. If there is such a gap between two neighbours you split this meta-universe with vertical line and do the same thing for each of two subsets. If you found no such gap by going left to right, you do the same going top to bottom. If you are unlucky again then you are done, this is a universe, it cannot be split. Let's think about the complexity of this approach now, even if get the points sorted and split for free then the complexity will highly depend on how soon you found a place where you could split the meta-universe. For example if you are lucky and you will always split close to the start of the sequence then your total complexity would be O(n), which is very good indeed. If you will encounter the option to split somewhere in the middle of the sequence, which means that you split the meta-universe in two halves, you will get a O(NlogN) solution, not linear but still good enough for this problem. The worst case occurs when you split always after iterating through the entire sequence, that means that in order to extract one point from the set you will have to iterate through all of them, now the complexity becomes O(N2), that's not enough already.

Let's think about it again, we got several cases, two of them are good (when we split in the beginning or in the middle) and the third one is too slow, it happens when we split always in the end. We need to make our solution better in this particular case and that should be almost enough to solve the entire problem. The obvious approach how to get rid of "split at the end" case is to start iterating from the end initially. And it should be obvious as well that it doesn't change much because your old start becomes the end now. But what we can do instead is to start iterating from both sides, then in any case you are winner! Intuitively, if you encountered a way to split at some end of the sequence then you're good to go because you spent almost no time on this particular step and you already found a way to split it further. In the opposite case, if you had to iterate through the entire sequence and found a split point in the middle you should be still happy about it because it means that on the next steps each of your new sets will be approximately two times smaller, that leads you back to O(NlogN) complexity. Also you should not forget that you need to do the same for Y coordinate as well, that means that you need to have four iterators and you need to check them simultaneously.

This is the basic idea for my solution, there is just one more detail to be added. Initially I told that our complexity estimate is given if we "sort and split for free". That's not that easy to achieve, but we can achieve something else almost as good as this one. In order to split cheaply all you need to do is to avoid the actual split :-) Instead of splitting the sequence into two subsequences you can just leave the original sequence and extract part of it which will become a new subsequence. Obviously if you want to make this operation cheaper you need to extract the smaller part. For example if you have a sequence of points sorted by X and you found a split point by iterating from the end of the sequence then you can be sure that the right subsequence will not be longer than the left sequence, because you iterate from the left-hand side and from the right-hand side simultaneously. Now we need to take care about sorting part as well. This one is easy, all you need to do is instead of storing one sequence of points you store all the points in two sorted sets — one by X and one by Y coordinate. In total this should add just one more logarithmic factor to the complexity.

That is the entire solution, I'd like to get back one more time to the complexity analysis. We have a recurring algorithm, on every step of the recurrence we're looking for the split point, then we split and invoke this recurring algorithm two more times. It looks that for the worst case (which is split in the middle) we will split a sequence into two subsequences of the same size, so we have a full right to apply a Master theorem here. On each step our complexity seems to be O(NlogN) (log factor because of using sets) so the "Case 2. Generic form" section from the Wiki gives us a solution that our entire algorithm has an O(Nlog2N) time complexity. Am I correct?

My submission — 8138406

Full text and comments »

  • Vote: I like it
  • +224
  • Vote: I do not like it

By marat.snowbear, 10 years ago, In English

471A - MUH and Sticks

Given six sticks and their lengths we need to decide whether we can make an elephant or a bear using those sticks. The only common requirement for both animals is that four leg-sticks should have same length. This means that the answer "Alien" should be given only if we can't find four sticks for the legs. Otherwise we will be able to make some animal. The type of the animal will depend on the relation of the remaining sticks' lengths. If they are equal then it will an elephant, if they are different we will have a bear.

So this algorithm should solve the problem:

  1. Find the number which appears at least four times in the input.

  2. If no such number exist then the answer is "Alien".

  3. Otherwise remove four entries of that number from the input.

  4. After removing that number you will have two numbers left, compare them and decide whether it's an elephant or a bear.

One shortcut for this problem might be to sort the input array, then if it's a bear or an elephant then 3rd and 4th elements in the sorted should be animal's legs. So you can assume that one of these numbers is the length of the leg and count how many times you see this number in the input.

Author's solution: 7977022

Since the numbers in the input are very small you can implement 'brute force' solution as well. By brute force solution in this case I mean that you can actually check all possible values for leg-length, head-length and body-length. If in total they give the same set as the input then you found a matching, all you need is to check whether it's a bear or an elephant. And it's an alien if you checked all possible combinations and found nothing matching to the input. Though in this case the brute force solution is not easier than another one.

Checking all possible lengths solution: 7975645

It seems that there were two common mistakes people were making in this problem:

  1. Not taking into account that legs can be of the same length as body or head. So you can't just count the number of distinct numbers in the input to decide which type of animal is that. We assumed that people might make such a mistake, there was a relevant warning in the statement.

  2. Trying to sort the input and then check whether some elements in this array are equal to other elements and based on this comparison decide which type of animal is that. People were simply making mistakes when deciding which elements to compare. The correct way to implement this is:

 // 0-indexing below, array is assumed to be sorted

     if (l[0] == l[3]) cout << (l[4] == l[5] ? "Elephant" : "Bear");
else if (l[1] == l[4]) cout << "Bear";
else if (l[2] == l[5]) cout << (l[0] == l[1] ? "Elephant" : "Bear");
else cout << "Alien";

Solution: 7977214

This solution seems to be shorter but there are many failed solutions like this because people were not paying enough attention to the details. So I would prefer implementing more straightforward approach.

Hope you liked the pictures!


471B - MUH and Important Things

You need to check whether exist three pairwise different permutations of the indices of the input which result in array being sorted. Generally you can count the number of total permutation which give non-decreasing array. This number might be very large and it might easily overflow even long integer type. And what is more important is that you don't actually need to count the exact number of such permutations.

Let's tackle this problem from another angle. Assume you already sorted the input array and you have the corresponding permutation of indices. This already gives you one array for the result, you only need to find two more. Let's look for any pair of equal numbers in the input array, if we swap them then we will get another valid permutation. And if we find one more pair of equal numbers then swapping them you can get third permutation and that will be the answer. You need to keep in mind here that one of the indices when swapping the second time might be the same as one of the numbers in the first swap, that's ok as soon as second index is different. So all you need to do is to find two pairs of indices which point to the equal elements.

The entire algorithm is as follows:

  1. Transform the input array into array of pairs (tuples), first element of the pair will be the number given in the input array, the second element will be the index of that number.

  2. Sort that array of pairs by the first element of the pairs. Then the second elements will give you one correct permutation.

  3. Scan this array in order to find possible swaps. You just iterate over this array and check if the first element in the current pair equals to the first element of the previous pair. If it equals you remember the indices of these two pairs. You stop scanning the array as soon as you have two swaps.

  4. Then you check how many swaps you have, if you have less than two swaps then there are no three distinct permutations.

  5. Otherwise you have two swaps which means that you have an answer. So you print the current permutation, then you execute the first swap (you just swap those two elements you remembered in the first swap), then you print the permutation array received after executing that swap. And you execute the second swap and print the permutation array third time.

Author's solution: 7977528


471C - MUH and House of Cards

Card house. This problem required some maths, but just a little bit. So in order to start here let's first observe that the number of cards you need to use for a complete floor with R rooms equals to:

C = Crooms + Cceiling = 2·R + (R - 1) = 3·R - 1

Then if you have F floors with Ri rooms on the i-th floor then the total number of cards would be:

where R is the total number of the rooms in the card house.

This already gives you an important property — if you divide N + F on 3 then the remainder of this division should be 0. This means that if you have found some minimum value of floors somehow and you found maximum possible number of floors in the house, then within that interval only every third number will be a part of the solution, the rest of the numbers will give a non-zero remainder in the equation above.

Now let's think what is the highest house we can build using N cards. In order to build the highest possible house obviously you need to put as few cards on each floor as you can. But we have a restriction that every floor should have less rooms than the floor below. This gives us the following strategy to maximize the height of the house: we put 1 room on the top floor, then 2 rooms on the floor below, then 3 rooms on the next floor, etc. In total then the number of cards we will need equals to:

This is minimum number of cards we need in order to build a house with F floors. This gives us a way to calculate the maximum height of the house we can build using N cards, we just need to find maximum F which gives Nmin <  = N. Mathematicians would probably solve the quadratic inequation, programmers have two options:

  1. Check all possible F until you hit that upper bound. Since Nmin grows quadratically with F then you will need to check only up to numbers. This gives time complexity and fits nicely in the given time limit.

  2. The second approach would be a binary search. Using binary search to find maximum number of the floors would give you O(logN) time complexity. This was the intended originally solution but it was decided to lower the constraints in order to allow sqrt solutions as well.

Now that you know the maximum number of the floors in the house you might need to correct it a bit because of that remainder thing we discussed above, this might make your maximum height one or two floors lower. Looking again at the remainder discussion on top we can see that starting from here only every third number will be valid for an answer. Now you can either count them brutally (back to solution) or you can simply calculate them using this formulae:

ans = (Fmax + 3 - 1) / 3 (integer division)

That seems to be it, just don't forget to use longs all the time in this problem.

Author's O(logN) solution: 7977863

Authors solution: 7977888


471D - MUH and Cube Walls

In this problem we are given two arrays of integers and we need to find how many times we can see second array as a subarray in the first array if we can add some arbitrary constant value to every element of the second array. Let's call these arrays a and b. As many people noticed or knew in advance this problem can be solved easily if we introduce difference arrays like that:

aDiffi = ai - ai + 1 (for every i =  = 0..n - 1)

If we do that with both input arrays we will receive two arrays both of which have one element less than original arrays. Then with these arrays the problem simply becomes the word search problem (though with possibly huge alphabet). This can be solved using your favourite string data structure or algorithm. Originally it was intended to look for linear solution but then we made time limit higher in case if somebody will decide to send O(NlogN) solution. I haven't seen such solutions (that is understandable) but some people tried to squeeze in a quadratic solution.

Linear solution can be made using Z-function or KMP algorithm. In order to add a logarithmic factor you can exercise with suffix array for example. I had suffix array solution as well, but it's a lot messier than linear solution.

There is one corner case you need to consider — when Horace's wall contains only one tower, then it matches bears' wall in every tower so the answer is n. Though for some algorithms it might not even be a corner case if you assume that empty string matches everywhere. Another error which several people did was to use actual string data structures to solve this problem, so they converted the differences to chars. This doesn't work since char can't hold the entire range an integer type can hold.

I didn't think that switching to difference arrays will be that obvious or well-known, so I didn't expect that this problem will be solved by that many people.

Author's Z-function O(n + w) solution: 7978022

Author's suffix array O((n + wlog(n + w)) solution: 7978033


471E - MUH and Lots and Lots of Segments

Given a set of horizontal/vertical lines you need to erase some parts of the lines or some lines completely in order to receive single connected drawing with no cycles.

First of all let's go through naive N2 solution which won't even remove cycles. In order to solve this problem you will need a DSU data structure, you put all your lines there and then for every pair of horizontal and vertical line you check if they intersect and if they do you join them in DSU. Also your DSU should hold the sum of the lengths of the joined lines. Initially it should be equal to the line lengths. Since there might be up to N2 / 4 intersections between lines we receive a quadratic solution.

Now let's get rid of cycles. Having the previous solution we can do it pretty easily, all we need is to change the way we were connecting the sets in DSU if some lines intersect. Previously we were simply asking DSU to join them even if they already belong to the same set. Now what we will do is when finding some pair of lines which intersects and is already joined in DSU instead of asking DSU to join them again we will ask DSU to decrement their length sum. In terms of the problem it is equivalent to erasing a unit piece of segment in the place where these two lines intersect and this will break the cycle. With this change we already have a correct solution which is too slow to pass the time limits.

Now we need to make our solution work faster. We still might have up to N2 / 4 intersections so obviously if we want to have a faster solution we can't afford to process intersections one by one, we need to process them in batches. All our lines are horizontal and vertical only, so let's do a sweep line, this should make our life easier.

Let's assume that we're sweeping the lines from the left to the right. Obviously then the code where the intersections will be handled is the code where we process the vertical line. Let's look at this case closer. We can assume that we're going to add some vertical line on with coordinates (x, y1, x, y2), we can also assume that there are some horizontal lines which go through the given x coordinate, we track this set of lines while sweeping left to right. So we're going to add a vertical line and let's say that it has n intersections with horizontal line. Previously we were handling each intersection separately, but you can see that if some horizontal lines already belong to the same set in DSU and they go next to each other then we don't need to handle them one by one anymore. They already belong to the set in DSU so there is no need to join them, we might only need to count the number of them between y1 and y2 coordinates, but that can be calculated in logarithmic time. So the trick to get rid of quadratic time complexity is to avoid storing horizontal lines one by one and store instead an interval (across y coordinate) of horizontal lines which belong to the same set in DSU. I will call these intervals chunks. You will need to manipulate these chunks in logarithmic time and you will need to locate them by y coordinate so you need to store them in a treap or a STL map for example with y coordinate serving as a key.

To be clear let's see what data each of these chunks will hold:

struct chunk{
	int top, bottom; // two coordinates which describe the interval covered by the chunk
	int id; // id of this chunk in DSU. Several chunks might have the same id here if they belong to the same set in DSU
};

And we agreed that data structure which holds these chunks can manipulate them in logarithmic time.

Let's now get into details to see how exactly it works. While sweeping we will have one of three possible events (listed in the order we need to handle them): new horizontal line starting, vertical line added, horizontal line finishing. First and third operation only update our chunks data structure while the second operation uses it and actually joins the sets. Let's look into each of these operations:

horizontal line start. We need to add one more chunk which will consist of a single point. The only additional operation we might need to do happens when this new line goes through some chunk whose interval already covers this point. In this case we need to split this covering chunk into two parts — top and bottom one. It's a constant number of updates/insertions/removals in our chunk data structure and we agreed that each of these operations can be done in logarithmic time so the time complexity of a single operation of this time is O(logN). It should be also mentioned here that during processing a single operation of this type we might add at most two new blocks. Since in total we have no more than N operations of this type them it means that in total we will have no more than N blocks created. This is important for the further analysis.

vertical line. In this operation we need to find all chunks affected by this vertical line and join them. Each join of two chunks takes logarithmic time and we might have up to n chunks present there, so we might need up O(NlogN) time to draw a single vertical line. This doesn't give us a good estimate. But we can see that we have only two ways to get new chunks — they are either added in the step 1 because it's a new line or one chunk is being split into two when we add a line in between. But we have an upper bound on total number of the chunks in our structure as shown above. Ans since we have such an upper bound then we can say that it doesn't matter how many chunks will join a single vertical line because in total all vertical lines will not join more than N chunks. So we have an amortized time complexity analysis here, in total all vertical line operations will take O(NlogN) time.

There are some other details we need to handle here. For example we need to avoid cycles correctly. This margin is too narrow to contain the proof but the formulae to correct the length sum is like this:

d = y2 - y1 - (Nintersections - 1) + Ndistinctsets - 1

where d — the number you need to add to the sum of the lengths in DSU

Nintersections — number of horizontal lines intersecting with the given vertical line, I used a separate segment tree to get this value in O(logN)

Ndistinctsets — number of distinct sets in DSU joined by this vertical line, you need to count them while joining

So this gives you a way to correct the lengths sums. There is one more thing that needs to be mentioned here — it might happen that your vertical line will be contained by some chunk but will not intersect any horizontal lines in it. In this case you simply ignore this vertical line as if it doesn't overlap any chunk at all.

horizontal line end. Finally we came here and it seems to be simple. When some horizontal line ends we might need to update our chunks. There are three cases here:

a. This line is the only line in the chunks — we simply delete the chunk then.

b. This line lays on some end of the interval covered by the chunk — we update that end. In order to update it we need to know the next present horizontal line or the previous present horizontal line. I used the same segment tree mentioned above to handle these queries.

c. This line lays inside some chunk — we don't need to update that chunk at all.

And that's it! In total it gives O(NlogN) solution.

Author's solution: 7978166 (that chunk data structure is called 'linked_list' in the code because originally I thought it would be a linked list with some way to manipulate it quickly and later I removed all the list functionality).


This editorial was written very late in the night, I'm pretty sure there are tons of typos here, I will proof read it tomorrow, but please don't hesitate to report typos and some minor error to be fixed in private messages.

Full text and comments »

  • Vote: I like it
  • +186
  • Vote: I do not like it

By marat.snowbear, 10 years ago, In English

Hello Codeforces community! I'm happy to invite you to participate in Codeforces Round #269 for Division 2 which will be held this Friday, September 26th. As usual first division coders are invited to participate unofficially, I would appreciate a lot if they will join.

All problems are mine, so if you won't like them... well, you know whom to blame for those. I did my best in order not to disappoint you.

Since this is my first round, let me introduce myself. I'm 27, have around 7 years of programming experience and was never into ACM/ICPC and stuff like that before, learned words like "dynamic programming" couple of years ago from the algorithms course on Coursera and a bit later found this site. I'm from Saint Petersburg, Russia, was born and lived here until I became 24. Then I got married and me and my wife decided to go live somewhere else and we moved to Kyiv, Ukraine, and we love that city a lot and we will return there some day, hopefully soon.

I hope you will find my problem set interesting and varied. Meanwhile let me get you prepared for the round and introduce you the guys you will help to. So the heroes of my round are three animals from Saint Petersburg and Kyiv zoos. Those are Menshikov and Uslada ("Uslada" means Delight in Russian) — polar bears, symbols of Saint Petersburg zoo. These are my favourite animals of that zoo, they can do a moonwalk dance. Their third friend would be Khoras — an elephant from the Kyiv zoo, not sure what kind of dances he likes but he was very friendly last time we've been there. I was always fond of polar bears and elephants so now I have a round with them! Hope you will remember their names, in problem statements I call them by name and by animal type interchangeably. They want the friendship to exist between their countries and they have some minor problems you can help them with.

Traditional and untraditional regards for this round go to:

  • MikeMirzayanov and the entire team of Codeforces for the site,

  • Gerald for his help with the problems,

  • Maria Belova aka Delinur for the translation of the statements,

  • last but definitely not least, my wife Tanya for her infinite support in everything I do. This round wouldn't be here without her.

Wikipedia asked me to remind you that the day we'll have the round (at least in Russian tome zone and time zones close to it) is 269th day of the year, matches the round number perfectly thanks to Gerald's scheduling skill.

If you read up (down?) to here and I still have your attention then let me know the secret every non-red coder wants to know — in order to become red you need to solve at least 1513 problems on this site. That's a personal experience, trust me.

Another thing I'd like to mention here: gojira was cheating! In his round announcement he mentioned that he's taking the 'eldest problem setter' title from Sammarize but he didn't mention his age, so I can't tell whether this title should be mine or not. gojira, please let us know ASAP!

Will see you on the round and I will appreciate a feedback (both positive and negative) a lot!

P.S.: as usual the scoring will be announced closer to the start time, hopefully my memory won't fail and I will update this blog post then.

upd 0: small update for the English version. In English statements the names of the characters will be Menshykov, Uslada and Horace, don't get surprised :-)

upd 1: scoring will be a bit different from normal today: 500, 1000, 2000, 2000, 2500


Thanks to everybody who was with us today, hope you enjoyed it! My congratulations to the winners. worse didn't take part in today's contest so the fight was tough on both sides of the table. While Menshykov and Uslada are updating the ratings, Horace is finishing the editorial, he's promising it will be ready a bit later today.

I understood and noted some mistakes of mine in this contest, but please let me know if you have anything to tell about it. I like how it's started, thanks for beating the 5K record, that's awesome.

Unfortunately nobody solved E, though hos.lyric was very close to it. His solution failed on the second to last test, which I didn't consider to be a max test. I consider this to be my mistake that I underestimated the problem, otherwise I would leave it for some next contest.

Best regards,
Will see you at some next round,
Marat.

Round stats from DmitriyH
Editorial

Full text and comments »

  • Vote: I like it
  • +511
  • Vote: I do not like it

By marat.snowbear, 10 years ago, translation, In English

416A - Guess a number!

Let's use the usual Div 2 problem A approach — the naive one. We will track the interval which might contain the number we're guessing. With each of the query we update this interval. If at the end the interval is non-empty then we output any number from it, otherwise the result is "Impossible".

Submission: 6606892

416B - Art Union

All we need is to iterate over all painters and for each painter to iterate over all pictures. In the inner loop we also remember when the painter finished working on the picture to make sure that the next painter will not start working on it earlier.

Submission: 6606994

416C - Booking System

Let's solve this one greedy. All we need to notice is that the optimal solution will be to place first the groups with biggest sum which they are ready to pay. For each such group it will be optimal to allocate the smallest matching table. The input limits allow to do a full search when looking for a table.

Submission: 6617198

416D - Population Size

One thing to notice for this problem is that if we cover some interval with a progression then it will better (at least no worse) to include as many elements to the right of it as possible. So the solution is to greedy — find the leftmost number not covered by a progression, start a new progression with that number (the interval covered by that progression will be of size 1) and then try to extend this interval to the right as far as possible. Repeat this step until all the numbers are covered. One thing you should pay attention to is which numbers can be covered by one arithmetic progression, for example:

  • If there are no fixed numbers in the interval then we can cover it with one progression.

  • If there is only one non-fixed number in the interval then we can cover this interval with one progression.

  • If there are more than one non-fixed numbers in the interval then we can calculate the parameters of the progression (start value and difference). All non-fixed numbers should match those parameters. Difference should be integer.

  • If the progression is ascending and there are some non-fixed numbers in the beginning then those numbers should match positive numbers in the progression.

  • Same way if the progression is descending then we can include numbers from the right side only while matching progression term is positive.

Submission: 6607174

416E - President's Path

Let's look at the graph given to us in the example:

We need to count the count of the edges on all the shortest paths between each pair of vertices. Let's do something easier first — instead of counting all the edges we will count only those which have the destination vertex on its side. For example here are the edges belonging to shortest paths from 4 to 2 which are connected to vertex 2:

Let's denote this number like this: inEdgessource, v — number of edges which go into vertex v on some shortest path from source to v. In the given example inEdges4, 2 = 3. Let's also denote the set Ssource, dest — it is a set of the vertices which belong to at least one shortest path from source to dest. For example S4, 2 = {1, 2, 3, 4}. With these two variables it can be seen that the answer for vertices source and dest will be:

In other words the answer for vertices s and d will be equal to the sum of inEdgess, v for all vertices v, which belong to any shortest path from s to d. So the only thing left is to calculate these S and inEdges. Both of them can be easily calculated if you have minimum distances between all pairs of vertices. And these distances can be calculated using the Floyd-Warshall. So the full solution is:

  1. Calculate minimum distances between all pairs of vertices using Floyd-Warshall algorithm.

  2. Count inEdges. Simply iterate over all source vertices and all edges. For each edge check whether any of its ends belong to any shortest path from source.

  3. Calculate the answer. Let's have three loops to iterate over the vertices — — source, destination и mid. First two vertices are those for which we're calculating the answer. Third vertex is the vertex which should belong to any shortest path (basically we're checking whether v belongs to Ssource, dest). If mid belongs to any shortest path from source to dest then we add inEdgessource, mid to the answer.

Each step has a complexity O(n3).

Submission: 6607257

P.S.: Please feel free to let me know about any typos, errors, etc using the private messages.

Full text and comments »

  • Vote: I like it
  • +35
  • Vote: I do not like it