Say you have a binary string of length N where each character is equally likely to appear. What is the probability that there will be a substring of K 1 characters?
For instance if K = 2, N = 3 the answer is 3/8 = 37.5%.
000: NO
001: NO
010: NO
011: YES
100: NO
101: NO
110: YES
111: YES
How to solve this problem. Can it be done in polynomial time?
What are the bounds on $$$N$$$ and $$$K$$$? There is a pretty straightforward $$$\mathcal{O}(NK)$$$ dynamic programming solution.
No particular bounds, I'm simply wondering what is the most efficient solution.
What would be your DP formulation/recursion?
Oh, okay.
Well its obvious that if a string contains substring of $$$1$$$'s with length $$$\geq K$$$, then it also contains a substring of exactly length $$$K$$$.
To compute the probability we can look at the complement problem and that is the probability that longest substring of $$$1$$$'s in a string is strictly less than $$$K$$$. It is known that $$$P(A) = 1 - P(\overline{A})$$$. Which means that we can just compute the probability of the complement and subtract it from 1.
Let $$$C$$$ be equal to all binary strings of length $$$N$$$ that don't contain a substring of $$$1$$$'s with length $$$\geq K$$$, then the answer will be $$$1 - \frac{C}{2^n}$$$, since there are $$$2^n$$$ binary strings.
To compute $$$C$$$ we use dynamic programming. Let $$$f(i, j)$$$ be equal to number of strings of lenght $$$i$$$ with current substring of $$$1$$$'s of length $$$j$$$ that don't have substrings of $$$1$$$'s with lenght $$$\geq$$$ K. Obviously, we are looking for the value $$$f(N, 0)$$$. The relations are the following, at each step we can either add $$$1$$$ or $$$0$$$ next, if we add $$$1$$$ then we increase our current substring lenght by one and if we add $$$0$$$ then we reset it back to 0.
There are $$$\mathcal{O}(NK)$$$ states and the transitions are $$$\mathcal{O}(1)$$$ so the total complexity is $$$\mathcal{O}(NK)$$$.
Thanks! Thinking about the complement problem was clever, nice solution.
https://math.stackexchange.com/questions/1944807/arrangements-of-a-a-a-b-b-b-c-c-c-in-which-no-three-consecutive-letters-are-the
https://math.stackexchange.com/questions/949165/how-many-binary-string-are-there-such-that-there-are-no-k-consecutive-characters
Consider the complement problem of counting how many binary strings of length $$$N$$$ exist such that length of longest substring of 1s is strictly less than $$$K$$$. Define $$$DP(i)$$$ to represent number of such binary strings of length $$$i$$$.
Trivially, $$$DP(i) = 2^i$$$ for $$$0 \leq i < K$$$.
Furthermore, $$$DP(i) = DP(i - 1) + DP(i - 2) + \ldots + DP(i - K)$$$ for $$$i \geq K$$$
This is because to get a string of length $$$N$$$, we could do
This can be done in $$$O(N)$$$ time with a sliding window approach.
If $$$N$$$ is large but $$$K$$$ is small, then this can be easily solved as a linear recurrence in $$$O(K^3log(N))$$$ time (matrix exponentiation).
Linear recurrences can also be solved in $$$O(K^2log(N))$$$ time or even $$$O(Klog(K)log(N))$$$ time (see this blog and this code)
Let's solve the problem when there can not be k consecutive 1s.
Let $$$dp[i]$$$ be the answer for prefix $$$i$$$.
Now there are two cases. first, if we set 0 at the ith position then the answer will be $$$dp[i-1]$$$.
In the second case, we assume some subarray of length $$$j$$$ filled with 1 ending at ith position and starting at $$$i-j+1$$$ also there will 0 be $$$(i-j)$$$th index. $$$j$$$ will vary from 1 to k-1. In that case, the answer will be $$$dp[i-j-1]$$$. summing up both the cases we get
$$$dp[i]$$$ = $$$\sum\limits_{j = {i-k}}^{i-1}dp[j]$$$
So by using prefix sums we can calculate the answer in $$$O(N)$$$
Why is this downvoted? Solution looks okay to me