Блог пользователя spike1236

Автор spike1236, история, 36 часов назад, По-английски

Hello, Codeforces community! Today, I want to share my solution to a problem from recent contest: Binary Subsequence Value Sum. Rather than using polynomials and FFT, I tried to approach this problem from a different viewpoint — probability! Now, let's get started with the solution itself.


1. Deterministic definition of the score

Suppose you have a binary string $$$v$$$ of length $$$m$$$. Interpret each character as a “contribution”: assign $$$+1$$$ for a ‘1’ and $$$-1$$$ for a ‘0’. Define

$$$ d = \text{(number of 1's)} - \text{(number of 0's)} $$$

so that $$$d$$$ is the overall “imbalance” of $$$v$$$.

The score of $$$v$$$ is defined as

$$$ \text{score}(v) = \max_{1 \le i < m} \Bigl( F(v,1,i) \cdot F(v,i+1, m) \Bigr), $$$

where $$$F(v,1,i)$$$ is imbalance of prefix and $$$F(v,i+1,m)$$$ is imbalance of suffix; if the prefix imbalance is $$$S$$$, then the suffix is $$$d-S$$$, and the product is

$$$ P(S) = S\cdot (d-S). $$$

For fixed $$$d$$$, the product $$$S(d-S)$$$ is a quadratic function in $$$S$$$. It is maximized when

$$$ S = \frac{d}{2}. $$$

However, since $$$S$$$ must be an integer (and may not be exactly $$$d/2$$$ when $$$d$$$ is odd), the best we can do is to take

$$$ S = \left\lfloor\frac{d}{2}\right\rfloor \quad \text{and} \quad d-S = \left\lceil\frac{d}{2}\right\rceil. $$$

Thus the maximum product is

$$$ \left\lfloor\frac{d}{2}\right\rfloor \cdot \left\lceil\frac{d}{2}\right\rceil. $$$

So in one compact expression, the score of $$$v$$$ can be written as

$$$ \text{score}(v) = \frac{d^2 - (d \bmod 2)}{4}. $$$

2. Counting scores over all subsequences

Now, consider a fixed binary string $$$s$$$ of length $$$n$$$ with $$$R$$$ ones and $$$Z = n-R$$$ zeros.

A nonempty subsequence of $$$s$$$ is obtained by choosing, independently for each position, whether to include that bit. (There are $$$2^n$$$ subsequences in total, including the empty one—but the empty subsequence contributes nothing to the score.)

Imagine that for each bit in $$$s$$$ we make an independent choice with probability $$$1/2$$$ of including it. Then every subsequence occurs with equal probability $$$(\frac{1}{2})^n$$$.

For any subsequence $$$v$$$, let its imbalance be

$$$ d' = \text{(# of ones chosen)} - \text{(# of zeros chosen)}. $$$

Then the score of that subsequence is, by our previous argument,

$$$ \text{score}(v) = \frac{(d')^2 - ((d') \bmod 2)}{4}. $$$

We want to sum the scores over all nonempty subsequences. Rather than summing directly, we use the idea of expectation.

Recall that for any function $$$f(v)$$$ defined on the set of subsequences (with the uniform probability distribution), the average value is:

$$$ E[f(v)] = \frac{1}{2^n} \sum_{v} f(v). $$$

Multiplying by $$$2^n$$$ recovers the sum over all subsequences:

$$$ \sum_{v} f(v) = 2^n \cdot E[f(v)]. $$$

3. Calculating the Expectation $$$E[(d')^2]$$$

For each bit of $$$s$$$, define a random variable $$$X_i$$$ as follows:

  • If the $$$i$$$th bit is a ‘1’ and is chosen, $$$X_i=+1$$$; if it’s not chosen, $$$X_i=0$$$.
  • If the $$$i$$$th bit is a ‘0’ and is chosen, $$$X_i=-1$$$; if it’s not chosen, $$$X_i=0$$$.

Then the imbalance for a given subsequence is

$$$ d' = \sum_{i=1}^n X_i. $$$
  • For ‘1’: $$$E[X_i]= \frac{1}{2}$$$ (since it contributes $$$+1$$$ with probability $$$1/2$$$).
  • For ‘0’: $$$E[X_i]= -\frac{1}{2}$$$.

Thus, if there are $$$R$$$ ones and $$$Z$$$ zeros,

$$$ E[d'] = R\cdot\frac{1}{2} + Z\cdot\left(-\frac{1}{2}\right) = \frac{R - Z}{2} = \frac{2R-n}{2}. $$$

Each $$$X_i$$$ is a Bernoulli-type variable with two outcomes (nonzero with probability $$$1/2$$$ and zero with probability $$$1/2$$$). In either case (for ‘1’ or ‘0’), one can show that

$$$ \operatorname{Var}(X_i) = \frac{1}{4}. $$$

Since the choices are independent,

$$$ \operatorname{Var}(d') = \sum_{i=1}^n \operatorname{Var}(X_i) = \frac{n}{4}. $$$

Recall that

$$$ E[(d')^2] = \operatorname{Var}(d') + (E[d'])^2. $$$

Thus,

$$$ E[(d')^2] = \frac{n}{4} + \left(\frac{2R-n}{2}\right)^2 = \frac{n}{4} + \frac{(2R-n)^2}{4} = \frac{(2R-n)^2 + n}{4}. $$$

Then the expected value of $$$\frac{(d')^2}{4}$$$ is

$$$ E\!\left[\frac{(d')^2}{4}\right] = \frac{1}{4} \cdot \frac{(2R-n)^2+n}{4} = \frac{(2R-n)^2+n}{16}. $$$

So if we (naively) approximated the score of every subsequence by $$$\frac{(d')^2}{4}$$$, then the total over all $$$2^n$$$ subsequences would be:

$$$ 2^n \cdot \frac{(2R-n)^2+n}{16} = 2^{n-4}\Bigl((2R-n)^2+n\Bigr). $$$

4. $$$d'$$$ parity correction

The true score of a subsequence with imbalance $$$d'$$$ is

$$$ \frac{(d')^2 - (d'\bmod 2)}{4}, $$$

which means that whenever $$$d'$$$ is odd, our naive expression $$$\frac{(d')^2}{4}$$$ is an overestimate by $$$\frac{1}{4}$$$.

A classical combinatorial fact (which one may also derive via the bruteforce check, as I did) is that the total number of subsequences with an odd imbalance is exactly half of all subsequences. Thus, the total “overcount” when summing $$$\frac{(d')^2}{4}$$$ is:

$$$ \frac{1}{4} \cdot \bigl(\text{number of subsequences with odd } d'\bigr) = \frac{1}{4} \cdot 2^{n-1} = 2^{n-3}. $$$

Therefore, by linearity of expectation, the correct total sum $$$S$$$ over all nonempty subsequences is:

$$$ S = \left(2^n \cdot E\!\left[\frac{(d')^2}{4}\right]\right) - 2^{n-3}. $$$

Substitute the expectation we computed:

$$$ S = 2^n \cdot \frac{(2R-n)^2+n}{16} - 2^{n-3}. $$$

5. Simplifying to the Final Formula

$$$ S = 2^n \cdot \frac{(2R-n)^2+n}{16} - 2^{n-3} = $$$
$$$ 2^{n-4} \cdot ((2R-n)^2+n) - 2^{n-3}. $$$

Factor $$$2^{n-4}$$$ out of both terms, and we obtain the final closed-form formula:

$$$ S = 2^{n-4} ((2R-n)^2+n-2). $$$

My solution: 309852412. Thank you for reading!

  • Проголосовать: нравится
  • +112
  • Проголосовать: не нравится

»
30 часов назад, # |
  Проголосовать: нравится +22 Проголосовать: не нравится

Nice solution!

I'd like to hijack this post take this opportunity to advertise my own solution for the same problem, which I haven't really seen anywhere else:

So, do as the above blog says at first. Then, this reduces to finding the sum of $$$\lfloor \frac{s^2}{4} \rfloor$$$ over all subsequences ($$$s$$$ denotes the sum). We can reduce to finding $$$s^2 - (s^2 \pmod{4})$$$, but the second term depends only on $$$s \pmod{2}$$$, which is classic (the sum of that over all subsequences is $$$2^{n - 1}$$$ if there is an odd number in the array, and $$$0$$$ otherwise), therefore we can reduce it to finding the sum of $$$s^2$$$ over all subsequences. Now, here is the main part: We will use a segment tree to achieve this. How do we make the combine function? Well, we obviously only need to consider $$$(\text{sum}(a) + \text{sum}(b))^2$$$ where $$$a$$$ is a subsequence (possibly empty) of the left child $$$l$$$, and $$$b$$$ is a subsequence (possibly empty) of the right child $$$r$$$. Thus, it's $$$\text{sum}(a)^2 + \text{sum}(b)^2 + 2 \cdot \text{sum}(a) \cdot \text{sum}(b)$$$. What is the contribution of the first term? For a fixed subsequence $$$a$$$ of $$$l$$$, it is counted $$$2^{\text{length}(r)}$$$ times, because we can pair any subsequence in $$$r$$$ (possibly empty) with this term. Thus, it's $$$2^{\text{length}(r)} \cdot \text{answer}(l)$$$. Similarly, the contribution of the second term is $$$2^{\text{length}(l)} \cdot \text{answer}(r)$$$. What is the contribution of the last term? Well, for a fixed subsequence $$$a$$$ of $$$l$$$, we take each subsequence of $$$b$$$ with it, and we take the sum of that over all subsequences $$$a$$$, thus the total sum is $$$2 \cdot (\text{sum of sums of all subsequences of} \; l) \cdot (\text{sum of sums of all subsequences of} \; r)$$$. Both of these are standard to calculate; they are equal to $$$2^{\text{length}(l) - 1} \cdot \text{sum}(l)$$$, and $$$2^{\text{length}(r) - 1} \cdot \text{sum}(r)$$$, which can both be stored in a segment tree.

Code: 309800687

  • »
    »
    23 часа назад, # ^ |
      Проголосовать: нравится +13 Проголосовать: не нравится

    Too many Gods on this site

»
27 часов назад, # |
  Проголосовать: нравится +5 Проголосовать: не нравится

I know I am no where near comprehending this level of math, but another thing I know is that i will get there someday.

»
10 часов назад, # |
  Проголосовать: нравится +8 Проголосовать: не нравится

базар жоқ тигр