brunomont's blog

By brunomont, 22 months ago, In English

Hello, Codeforces!

I want to share with you two problems for which I was not able to find a tight time complexity bound. After trying for a while, I decided to ask here; maybe someone can help me.

Problem 1

We want to maintain a collection of arrays of integers in $$$[0, U)$$$ represented with binary search trees, such as Treaps. We will have $$$q$$$ updates, which can be one of the following:

  1. Split an array by position;
  2. Concatenate two arrays;
  3. Reverse an array;
  4. Split an array by value.

Of course, it is well known how to support operations (1), (2) and (3) each in $$$\mathcal O(\log n)$$$, where $$$n$$$ is the total size of the collection.

What is interesting here is operation (4). Going into more detail about what it means, given an array and some value $$$x$$$, we want to split this array into two: one containing values that are smaller than $$$x$$$ in the same order as they were, and the other containing values that are at least $$$x$$$, also in the same order as they originally were. This is equivalent to running

std::stable_partition(v.begin(), v.end(), [&](int k) { return k < x; });
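
For concreteness, here is a tiny self-contained example of this semantics (the values and the threshold are chosen arbitrarily):

#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
	std::vector<int> v = {3, 8, 1, 9, 2, 7};
	int x = 5;
	// Stable: the part < x and the part >= x each keep their original order.
	std::stable_partition(v.begin(), v.end(), [&](int k) { return k < x; });
	for (int k : v) printf("%d ", k); // prints: 3 1 2 8 9 7
	puts("");
}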

My proposed solution to implement this is to repeatedly split the array at the first position whose value is at least $$$x$$$ (peeling off a run of small elements), then split at the first position whose value is smaller than $$$x$$$ (peeling off a run of big elements), and repeat. It would look like this:

void treap::partition(int x) {
	// Assumed interface: split(k, t) moves the prefix of length k of this
	// treap into t; first_not_smaller(x) / first_smaller(x) return the
	// position of the first element >= x (respectively < x), or size().
	treap small, big;
	while (size()) {
		treap tmp;
		split(first_not_smaller(x), tmp); // tmp = maximal prefix of elements < x
		small.concat(tmp);
		split(first_smaller(x), tmp); // tmp = maximal prefix of elements >= x
		big.concat(tmp);
	}
	// Now the elements are split into small and big,
	// and this treap is empty.
}

The number of splits this algorithm performs is, up to a constant, the number of positions at which a "small" element and a "big" element are adjacent in the array.

Example
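
To pin down that quantity, here is a small helper (mine, for illustration; not part of the original code):

#include <vector>

// Counts adjacent (small, big) boundaries: positions i where exactly one of
// v[i], v[i+1] is < x. The partition loop above does O(boundaries + 1) splits.
int count_boundaries(const std::vector<int>& v, int x) {
	int cnt = 0;
	for (size_t i = 0; i + 1 < v.size(); i++)
		cnt += (v[i] < x) != (v[i + 1] < x);
	return cnt;
}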

Of course, a single call can be linear in the worst case, but can you find an instance on which this algorithm does $$$\Theta(q \cdot n)$$$ splits in total? My guess is that it does at most $$$\mathcal O(\log U)$$$ amortized splits per operation, but I was not able to prove this.

I have tried a few potential functions, such as the sum of logs of gaps between adjacent values in the arrays, but they did not work.
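
To be concrete, one such candidate is $$$\Phi = \sum_{\text{arrays}} \sum_{i} \log_2\big(|a_{i+1} - a_i| + 1\big)$$$, which is always $$$\mathcal O(n \log U)$$$ (this is just one way to formalize the attempt).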

Full code: link.

Problem 2

It turns out that a bound for this problem implies a bound for the previous problem, if we implement it in a specific way.

Imagine that there are $$$n$$$ coins, arranged in some piles in an arbitrary way. We will do $$$n$$$ operations. In each operation, we choose some number $$$k > 0$$$, and choose $$$k$$$ non-empty piles. For each of these piles, we choose a non-zero number of coins to take from it. All of the coins we took are arranged in a new pile.

The cost of the operation is $$$k$$$ (note that it is not the total number of coins, but the number of piles).

If we have the freedom to choose the operations in any way we like (and also the starting configuration), what is the maximum cost we can achieve? I could not find an instance that took $$$\Theta(n^2)$$$. However, it is easy to create an instance that takes $$$\Theta(n \sqrt n)$$$.

Sqrt instance
Fun observation
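
For reference, here is a self-contained simulation of one standard "staircase" construction achieving $$$\Theta(n \sqrt n)$$$ (this sketch may differ from the instance in the spoiler): keep piles of sizes $$$1, 2, \dots, k$$$; taking one coin from every pile empties the size-$$$1$$$ pile and creates a new pile of size $$$k$$$, so the staircase is restored and each of the $$$n$$$ operations costs $$$k \approx \sqrt{2n}$$$.

#include <algorithm>
#include <cmath>
#include <cstdio>
#include <numeric>
#include <vector>

int main() {
	int k = 100; // staircase height; n = k*(k+1)/2 coins
	long long n = 1LL * k * (k + 1) / 2;
	std::vector<long long> piles(k);
	std::iota(piles.begin(), piles.end(), 1); // sizes 1, 2, ..., k
	long long cost = 0;
	for (long long op = 0; op < n; op++) {
		// Take one coin from every non-empty pile; this empties the size-1
		// pile and creates a new pile of size k, restoring the staircase.
		long long taken = piles.size();
		for (auto& p : piles) p--;
		cost += taken; // cost = number of piles we took from
		piles.erase(std::remove(piles.begin(), piles.end(), 0LL), piles.end());
		piles.push_back(taken);
	}
	printf("n = %lld, total cost = %lld, n*sqrt(2n) = %.0f\n",
	       n, cost, n * std::sqrt(2.0 * n));
}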

We think the maximum cost is $$$\mathcal O(n \sqrt n)$$$, but we could not prove it. We did manage to prove this bound for a restricted version of the operation: when we are only allowed to take at most one coin per pile.

Proof

Challenge

Can you prove any of the above unproven bounds? Can you find a counter-example? Please let me know; I will be happy to discuss this in the comments!


»
22 months ago, # |

Problem 2:

Let's split the piles into heavy and light ones. Heavy piles have size at least sqrt(N), and light piles have size less than sqrt(N).

At any point in time there are at most sqrt(N) heavy piles (each holds at least sqrt(N) of the N coins), so a single operation touches at most sqrt(N) heavy piles. Over N operations, the cost that comes from heavy piles is thus O(N^1.5) in total.

During the lifespan of a pile, it might start as a heavy pile and turn into a light pile. The cost of operations while it's heavy is already counted above. While it's light, it has fewer than sqrt(N) coins and every operation it takes part in removes at least one of them, so each pile contributes to at most O(sqrt(N)) operations while it's light. It helps if you visualize the i-th operation as creating a new pile with index N+i.

Thus the total cost is the sum of these two contributions, which is O(N^1.5).
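
Making the counting explicit (my summary of the argument above): there are at most N initial piles plus one new pile per operation, so at most 2N piles ever exist, and each contributes at most sqrt(N) to the cost while light; each of the N operations touches at most sqrt(N) heavy piles. So the total cost is at most N*sqrt(N) + 2N*sqrt(N) = O(N^1.5).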

»
22 months ago, # |

The fun observation is a well-known problem. The proof is elementary, but very hard (at least for me). Hint: consider the potential energy after rotating the plane by $$$45$$$ degrees.

  • »
    »
    22 months ago, # ^ |

Thank you for this comment, I completely missed reading that observation. I remember there was some div1E or so problem that used this observation (or something similar) as a small step in the solution; perhaps someone who upsolved it might be able to find it?

»
22 months ago, # |

For the 1st question, can you specify the constraints on N, U and Q? The reason I ask is that the time complexity of your method is definitely at least O(U*min(N, Q)). This is because, if you think about it, the worst case of operation (4) is linear in the size of the array, which can be as big as U. So if we do operation (4) on every array in the collection once, the worst-case time complexity would be O(U*N). The reason I put min(N, Q) instead of just N at the beginning was to make sure we have enough operations.

Or is this wrong and I just have a huge misunderstanding? If so, then please correct me.

  • »
    »
    22 months ago, # ^ |

$$$N$$$ is the total size of the collection of arrays (the sum of their lengths); maybe this is what you got mixed up. Of course, one call to operation (4) can do a linear number of splits in the array, but you can't do this all the time: the arrays get kind of "more sorted" in some sense. This is what feels like it should give a good amortized complexity.