Petr's blog

By Petr, history, 10 months ago, In English
  • Vote: I like it
  • +62
  • Vote: I do not like it

»
10 months ago, # |
  Vote: I like it +40 Vote: I do not like it

There was still some work to do to improve the complexity from O(n2) to O(n*logn),

As I see, you went for the diagonals (probably because the probabilities on the same diagonal are the same), but it was much easier to go for rows (or columns), because the sum of $$$1/i(i+1) = 1/i - 1/(i+1)$$$ on a subsegment is very nice.

  • »
    »
    10 months ago, # ^ |
      Vote: I like it +24 Vote: I do not like it

    Very nice!

    Do you know if this has a combinatorial meaning? Maybe we can arrive at 1/i directly from some linearity of expectation argument?

    • »
      »
      »
      10 months ago, # ^ |
        Vote: I like it +8 Vote: I do not like it

      Okay, so imagine that we have a chip at $$$(n, m)$$$, and every second we randomly move it left or down, decreasing some coordinate somehow to stay positive (this is essentially what happens in the problem). If $$$a < n$$$ and $$$b < m$$$ are two positive integers, and we look at the coordinates $$$(x, y)$$$ of the chip during the operations, then eventually one of the following happens: $$$x \leq a$$$ or $$$y \leq b$$$. In other words, we arrive at one of the rectangles $$$A$$$ and $$$B$$$ on the figure below.

      -------+      <-o
             |        |
         A   |        v
             |         
             |         
      -------+         
              +--------
              |        
              |   B    
              |        
              |        
              +--------
      

      Now, while this didn't happen, on each operation, if $$$s$$$ is the number of possible jumps, the probability to jump to $$$A$$$ right now is $$$a / s$$$, to jump to $$$B$$$ is $$$b / s$$$, and to stay outside them is some other number. In any case, these two probabilities are always in $$$a:b$$$ proportion, therefore the probability that we arrive at $$$A$$$ before $$$B$$$ is $$$a / (a + b)$$$ (exercise for the reader, but should be intuitively obvious).

      Then, provided that we ended up in $$$A$$$, the probability that we are at its right border is the probability that the last jump was at this exact point chosen from $$$a$$$ possible points given from the premise, which is $$$1/a$$$. Thus, the probability that we ever were on the right border of $$$A$$$ is clearly the probability that the first point from $$$A\cup B$$$ that we entered was on the right border of $$$A$$$, which is (the probability that we entered $$$A$$$ and not $$$B$$$) times (the probability that on the last jump we picked this point among $$$a-1$$$ others) $$$= a/(a+b)\cdot 1/a = 1/(a+b)$$$.

      Is this what you are asking?

      • »
        »
        »
        »
        10 months ago, # ^ |
        Rev. 3   Vote: I like it +8 Vote: I do not like it

        Actually, you could even say that while the chip still can reach the required border, there are three options:

        • to reach it with probability $$$1/s$$$,

        • to go somewhere where we can no longer win from, with probability $$$(a + b - 1) / s$$$,

        • to continue from some other place.

        Again, since the first two options are the only ones that stop the process, and since their probabilities ratio is $$$1 : (a+b-1)$$$, then the answer is $$$1/(a+b)$$$.

        • »
          »
          »
          »
          »
          10 months ago, # ^ |
            Vote: I like it +8 Vote: I do not like it

          This argument makes sense, but it seems to be computing something different from what we need to compute?

          In your top-level comment you suggest to take my solution that computes the probability that we pass through position (x, y), then sum up those probabilities by row. So we effectively compute "the expected number of times we visit the cells (x, y) for any y between y_0 and m", where y_0=ceil(k/x), and this expected number turns out to have a nice formula, something like 2/(x+y0-1)-2/(x+m). My question was whether we can arrive at this formula directly without computing the per-cell probability.

          In your latest comment, you seem to be computing something slightly different: instead of expected number of visits on the boundary you compute the probability that we ever visit the boundary. It is similar, but does not have 2 in the numerator, and also does not have the "-2/(x+m)" term, so I'm not sure it is helpful in solving the original problem?

          • »
            »
            »
            »
            »
            »
            10 months ago, # ^ |
              Vote: I like it +8 Vote: I do not like it

            Yes, I indeed did something different in the second comment. I also made a mistake (namely, I said or implied that the sum of probabilities over a set $$${x = a, b < y\leq m}$$$ is the probability to eventually end up on that edge, which is incorrect), which is probably why you noticed that this doesn't solve the original problem immediately. So below I write a correction.

            So I assume you are interested in probabilities $$$p(x, y)$$$ to reach $$$(x, y)$$$, then we just need to add them over all $$$(x, y)$$$ s.t. $$$xy > k$$$ or something. We can represent this probability as $$$p(x, y) = p_h(x, y) + p_v(x, y)$$$, where $$$p_h(x, y)$$$ is the probability to eventually come to $$$(x, y)$$$ from the right, and $$$p_v(x, y)$$$ is the same but from the above. Let's calculate the sum of $$$p_h(x, y)$$$ over all good cells grouping them by columns and the sum of $$$p_v(x, y)$$$ grouping them by rows. Then it is true that the sum of $$$p_h(x, y)$$$ over $$${(x, y)\,\colon\,x = a, b < y\leq m}$$$ is what I wrote in the second comment (i.e. $$$1/(a+b)$$$). The solution becomes even easier: link.

            You may be confused because the equation for the rows was something like $$$2 / (x + y) - 2 / (x + m)$$$ instead of a simple ratio like $$$1 / (x + y)$$$ as in our case. The reason behind the $$$1$$$ in the numerator is that we decomposed that probability into the horizontal one and the vertical one (and they turned out to be the same, so $$$2/c(c+1)$$$ became two $$$1/c(c+1)$$$'s). The reason why we don't have the $$$-2/(x+m)$$$ leftover is that we no longer distinguish the border cells from the inner ones. In the initial solution we had something like $$$1/i(i+1) + \ldots + 1/j(j+1) = 1/i - 1/(j+1)$$$, but here we add one $$$1/(j+1)$$$ from the boundary, which cancels out, thus leaving only $$$1/i$$$.

            Regarding explaining the initial solution combinatorically, I am not ready to do this in any way except the one I just described, as I now think it is the most natural way to think about it.

            • »
              »
              »
              »
              »
              »
              »
              10 months ago, # ^ |
                Vote: I like it 0 Vote: I do not like it

              Thanks a lot for the detailed clarification! Now everything is clear and indeed is even more beautiful than the original approach.