[Variants] An interesting counting problem related to square product

The statement:

Given three integers $$$n, k, p$$$, $$$(1 \leq k \leq n < p)$$$.

Count the number of arrays $$$a[]$$$ of size $$$k$$$ that satisfy

$$$1 \leq a_1 < a_2 < \dots < a_k \leq n$$$
$$$a_i \times a_j$$$ is a perfect square $$$\forall 1 \leq i < j \leq k$$$

Since the result can be big, output it modulo $$$p$$$.

For convenience, you can assume $$$p$$$ is the large constant prime $$$10^9 + 7$$$
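To make the statement concrete, here is a minimal brute-force sketch that checks the definition directly by backtracking. It is only a reference for tiny $$$n$$$, it returns the exact count without any modulo, and the names `is_square`, `extend` and `brute_count` are hypothetical helpers, not part of the original solutions.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: true if x >= 0 is a perfect square.
bool is_square(long long x)
{
    long long r = (long long)std::sqrt((long double)x);
    while (r * r > x) --r;                 // correct floating point drift downwards
    while ((r + 1) * (r + 1) <= x) ++r;    // and upwards
    return r * r == x;
}

// Extend the strictly increasing prefix `cur` and count the completed arrays
// of size k whose pairwise products are all perfect squares.
long long extend(int n, int k, std::vector<int>& cur)
{
    if ((int)cur.size() == k) return 1;
    long long total = 0;
    int start = cur.empty() ? 1 : cur.back() + 1;
    for (int v = start; v <= n; ++v)
    {
        bool ok = true;
        for (int u : cur)
            if (!is_square(1LL * u * v)) { ok = false; break; }
        if (!ok) continue;
        cur.push_back(v);
        total += extend(n, k, cur);
        cur.pop_back();
    }
    return total;
}

// Exact count for small inputs, assuming 1 <= k <= n as in the statement.
long long brute_count(int n, int k)
{
    assert(1 <= k && k <= n);
    std::vector<int> cur;
    return extend(n, k, cur);
}
```

Such a checker is handy for stress testing the faster solutions of the extra tasks on small inputs.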

Note that in this blog we solve generalized, harder variants.

For the original problem, see the blog [Tutorial] An interesting counting problem related to square product

Extra Tasks

These are harder variants and generalizations of the original problem. You can see more detail here

*Marked as solved only if tested with at least $$$10^6$$$ queries

Solved A: Can we also use the phi function or something similar to solve for $$$k = 2$$$ in $$$O(\sqrt{n})$$$ or faster?

Solved B: Can we also use the phi function or something similar to solve for general $$$k$$$ in $$$O(\sqrt{n})$$$ or faster?

Solved C: Can we also solve the problem where there can be duplicates: $$$a_i \leq a_j\ (\forall\ i < j)$$$ and no longer $$$a_i < a_j\ (\forall\ i < j)$$$?

Solved D: Can we solve the problem where there is no restriction between $$$k, n, p$$$?

Solved E: Can we solve for negative integers, where $$$-n \leq a_1 < a_2 < \dots < a_k \leq n$$$?

Solved F: Can we solve for a specific range, where $$$L \leq a_1 < a_2 < \dots < a_k \leq R$$$?

Solved G: Can we solve for the cube product $$$a_i \times a_j \times a_t$$$ efficiently?

H: Can we solve it if $$$n$$$ is given and there are queries for $$$k$$$?

I: Can we solve it if $$$k$$$ is given and there are queries for $$$n$$$?

J: Can we also solve the problem where there is no order: just simply $$$1 \leq a_i \leq n$$$?

K: Can we also solve the problem where there is no order: just simply $$$0 \leq a_i \leq n$$$?

M: Can we solve for the $$$q$$$-product $$$a_{i_1} \times a_{i_2} \times \dots \times a_{i_q} = x^q$$$ (for a given constant $$$q$$$)?

N: Given $$$0 \leq \delta \leq n$$$, can we also solve the problem when $$$1 \leq a_1 \leq a_1 + \delta \leq a_2 \leq a_2 + \delta \leq \dots \leq a_k \leq n$$$?

O: What if the condition applies only to adjacent elements and not to all pairs? That is, $$$a_i \times a_{i+1}$$$ is a perfect square $$$\forall\ 1 \leq i < k$$$.

A better solution for k = 2

Extra task A

Problem

Easy Version

Hard Version

Examples

Example 1

Input 1:

Output 1:

Explanation 1:

No integer pair $$$(a, b)$$$ satisfies $$$1 \leq a < b \leq 1$$$.

Example 2

Input 2:

Output 2:

Explanation 2:

There are $$$4$$$ valid pairs: {$$$1, 4$$$}, {$$$1, 9$$$}, {$$$2, 8$$$}, {$$$4, 9$$$}.

Example 3

Input 3:

Output 3:

Explanation 3:

There are $$$16$$$ valid pairs: {$$$1, 4$$$}, {$$$1, 9$$$}, {$$$1, 16$$$}, {$$$1, 25$$$}, {$$$2, 8$$$}, {$$$2, 18$$$}, {$$$3, 12$$$}, {$$$4, 9$$$}, {$$$4, 16$$$}, {$$$4, 25$$$}, {$$$5, 20$$$}, {$$$6, 24$$$}, {$$$8, 18$$$}, {$$$9, 16$$$}, {$$$9, 25$$$}, {$$$16, 25$$$}.

Idea

Observation

Definition

Property

Formula

Implementation

O(sqrt n log log sqrt n) solution

#include <iostream>
#include <cstring>
#include <numeric>
#include <cmath>

using namespace std;

const int MOD = 1e9 + 7;
const int LIM = 1e7 + 17;
const int SQRT_LIM = ceil(sqrt(LIM) + 1) + 1;

int euler[SQRT_LIM + 10]; /// +10 so that index ceil(sqrt(n) + 1) + 1 stays in bounds when n is close to LIM
void sieve_phi(int n)
{
    iota(euler, euler + n + 1, 0);
    for (int x = 2; x <= n; x++) if (euler[x] == x)
        for (int j = x; j <= n; j += x)
            euler[j] -= euler[j] / x;
}

int solve(int n)
{
    sieve_phi(ceil(sqrt(n) + 1) + 1);
    
    long long res = 0;
    for (int p = 2; p * p <= n; ++p)
        res += 1LL * euler[p] * (n / (p * p));

    res %= MOD;
    return res;
}

int main()
{
    ios::sync_with_stdio(false);
    cin.tie(NULL);
    
    int n;
    cin >> n;
    cout << solve(n);
    return 0;
}

O(sqrt n) solution

#include <iostream>
#include <cstring>
#include <numeric>
#include <vector>
#include <cmath>

using namespace std;

const int MOD = 1e9 + 7;

vector<int> lpf;
vector<int> prime;
vector<int> euler;
void linear_sieve_phi(int n)
{
    lpf.assign(n + 1, 0);
    euler.assign(n + 1, 1);
    for (int x = 2; x <= n; ++x)
    {
        if (lpf[x] == 0)
        {
            prime.push_back(lpf[x] = x);
            euler[x] = x - 1;                    
        }
        for (int i = 0; i < prime.size() && x * prime[i] <= n; ++i)
        {
            lpf[x * prime[i]] = prime[i];
            if (x % prime[i] == 0) {
                euler[x * prime[i]] = euler[x] * prime[i];    
                break;
            }
            euler[x * prime[i]] = euler[x] * euler[prime[i]];
        }
    }
}

int solve(int n)
{
    linear_sieve_phi(ceil(sqrt(n) + 1) + 1);
    
    long long res = 0;
    for (int p = 2; p * p <= n; ++p)
        res += 1LL * euler[p] * (n / (p * p));

    res %= MOD;
    return res;
}

int main()
{
    ios::sync_with_stdio(false);
    cin.tie(NULL);
    
    int n;
    cin >> n;
    cout << solve(n);
    return 0;
}

Complexity

Hint

A better solution for general k

Extra task B

Problem

Very Easy Version

Easy Version

Hard Version

Examples

Example 1

Input 1:

2 1

Output 1:

2

Explanation 1:

There are $$$2$$$ valid arrays of size $$$1$$$: {$$$1$$$}, {$$$2$$$}.

Example 2

Input 2:

10 2

Output 2:

4

Explanation 2:

There are $$$4$$$ valid arrays of size $$$2$$$: {$$$1, 4$$$}, {$$$1, 9$$$}, {$$$2, 8$$$}, {$$$4, 9$$$}.

Example 3

Input 3:

27 3

Output 3:

12

Explanation 3:

There are $$$12$$$ valid arrays of size $$$3$$$: {$$$1, 4, 9$$$}, {$$$1, 4, 16$$$}, {$$$1, 4, 25$$$}, {$$$1, 9, 16$$$}, {$$$1, 9, 25$$$}, {$$$1, 16, 25$$$}, {$$$2, 8, 18$$$}, {$$$3, 12, 27$$$}, {$$$4, 9, 16$$$}, {$$$4, 9, 25$$$}, {$$$4, 16, 25$$$}, {$$$9, 16, 25$$$}.
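The examples above can be reproduced with a small sketch that groups the numbers $$$1 \dots n$$$ by their squarefree part: two numbers have a perfect square product exactly when their squarefree parts are equal, so each group of size $$$c$$$ contributes $$$\binom{c}{k}$$$ arrays. This is an exact, slow reference for small $$$n$$$ only; `binom_small` and `count_by_core` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Exact binomial C(m, k) for small arguments (no modulo).
long long binom_small(int m, int k)
{
    if (k < 0 || k > m) return 0;
    long long r = 1;
    for (int i = 1; i <= k; ++i)
        r = r * (m - k + i) / i;   // r equals C(m - k + i, i) after each step
    return r;
}

// Count strictly increasing arrays of size k in [1, n] with all pairwise
// products being perfect squares, by grouping numbers on their squarefree part.
long long count_by_core(int n, int k)
{
    assert(n >= 1 && k >= 1);
    std::vector<int> core(n + 1);
    for (int i = 1; i <= n; ++i) core[i] = i;
    for (int s = 2; s * s <= n; ++s)              // strip every square factor s^2
        for (int m = s * s; m <= n; m += s * s)
            while (core[m] % (s * s) == 0) core[m] /= s * s;

    std::vector<int> cnt(n + 1, 0);               // cnt[u] = how many x <= n have squarefree part u
    for (int i = 1; i <= n; ++i) ++cnt[core[i]];

    long long total = 0;
    for (int u = 1; u <= n; ++u) total += binom_small(cnt[u], k);
    return total;
}
```

For $$$n = 27, k = 3$$$ the groups {$$$1, 4, 9, 16, 25$$$}, {$$$2, 8, 18$$$} and {$$$3, 12, 27$$$} give $$$10 + 1 + 1 = 12$$$.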

Idea

Definition

The formula

Implementation

O(sqrt n log sqrt n)

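#include <iostream>
#include <algorithm>
#include <cstring>
#include <vector>
#include <cmath>

using namespace std;

const int LIM = 5e6 + 56;
const int SQRT_LIM = ceil(sqrt(LIM) + 1) + 1;
const int MOD = 1e9 + 7;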

/// Precalculating factorials under prime modulo
int fact[SQRT_LIM + 10]; /// fact[n] = n!
int invs[SQRT_LIM + 10]; /// invs[n] = n^(-1)
int tcaf[SQRT_LIM + 10]; /// tcaf[n] = (n!)^(-1)
void precal_nck(int n = SQRT_LIM)
{
    fact[0] = fact[1] = 1;
    invs[0] = invs[1] = 1;
    tcaf[0] = tcaf[1] = 1;
    for (int i = 2; i <= n; ++i)
    {
        fact[i] = (1LL * fact[i - 1] * i) % MOD;
        invs[i] = MOD - 1LL * (MOD / i) * invs[MOD % i] % MOD;
        tcaf[i] = (1LL * tcaf[i - 1] * invs[i]) % MOD;
    }
}

/// Calculating binomial coefficient queries
int nck(int n, int k)
{
    k = min(k, n - k);
    if (k < 0) return 0;

    long long res = fact[n];
    res *= tcaf[k];         res %= MOD;
    res *= tcaf[n - k];     res %= MOD;
    return res;
}

/// Linear Sieve
vector<int> prime;           /// prime list              = A000040
bool isPrime[SQRT_LIM + 10]; /// characteristic function = A010051
int lpf[SQRT_LIM + 10];      /// lowest prime factor     = A020639
int mu[SQRT_LIM + 10];       /// mobius                  = A008683
void linear_sieve(int n)
{
    if (n < 1) return ;
    /// Extension Sieve || You can add something more
    memset(lpf, 0, sizeof(lpf[0]) * (n + 1));
    fill_n(mu, n + 1, 1);
    /// Main Sieve || Without this, you barely able to achive linear complexity
    prime.clear();
    prime.reserve(n / log(n - 1));
    memset(isPrime, true, sizeof(isPrime[0]) * (n + 1));
    isPrime[0] = isPrime[1] = false;
    for (int x = 2; x <= n; ++x) /// For each number
    {
        if (isPrime[x]) /// Func[Prime]
        {
            mu[x] = -1;
            lpf[x] = x;
            prime.push_back(x);
        }
        for (int p : prime) /// Func[Prime * X] <- Func[Prime]
        {
            if (p > lpf[x] || x * p > n) break;
            isPrime[x * p] = 0;
            lpf[x * p] = p;
            mu[x * p] = (lpf[x] == p) ? 0 : -mu[x];
        }
    }
}

/// Divisor sieve
vector<int> divisors[SQRT_LIM + 10]; /// +10 so that index t = ceil(sqrt(n) + 1) + 1 stays in bounds
void precal_div(int n) /// O(n log n)
{
    for (int u = n; u >= 1; --u)
    {
        divisors[u].clear();
        for (int v = u; v <= n; v += u)
            divisors[v].push_back(u);
    }
}

/// Solving for n, k
long long solve(int n, int k)
{
    /// We only care for d that 1 <= d <= sqrt(n)
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t);
    precal_div(t);

    long long res = 0;
    for (int d = 1; d * d <= n; ++d) /// For each fixed p^2
    {
        long long sum = 0;
        for (int p : divisors[d]) /// For each (p | d)
            sum += mu[d / p] * nck(p - 1, k - 1);

        sum %= MOD;
        res += sum * (n / (d * d));
    }

    res %= MOD;
    if (res < 0) res += MOD; /// the Mobius terms can leave a negative remainder
    return res;
}

int main()
{
    ios::sync_with_stdio(false);
    cin.tie(NULL);

    /// Assumming constant p = 10^9 + 7
    int n, k;
    cin >> n >> k;
    cout << solve(n, k);
    return 0;
}

O(sqrt n log log sqrt n)

vector<int> prime;           /// prime list              = A000040
bool isPrime[SQRT_LIM + 10]; /// characteristic function = A010051
int lpf[SQRT_LIM + 10];      /// lowest prime factor     = A020639
void linear_sieve(int n)
{
    if (n < 1) return ;
    prime.clear();
    prime.reserve(n / log(n - 1));
    memset(lpf, 0, sizeof(lpf[0]) * (n + 1));
    memset(isPrime, true, sizeof(isPrime[0]) * (n + 1));
    isPrime[0] = isPrime[1] = false;
    for (int x = 2; x <= n; ++x)
    {
        if (isPrime[x]) /// Func[Prime]
        {
            lpf[x] = x;
            prime.push_back(x);
        }
        for (int p : prime) /// Func[Prime * X] <- Func[Prime]
        {
            if (p > lpf[x] || x * p > n) break;
            isPrime[x * p] = 0;
            lpf[x * p] = p;
        }
    }
}

long long res[SQRT_LIM + 10];
long long solve(int n, int k)
{
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
        res[d] = nck(d - 1, k - 1);

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    if (ans < 0) ans += MOD; /// res[d] may be negative after the subtractions
    return ans;
}
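The double loop over primes in the listing above is an in-place Möbius transform: after it, each entry holds the sum over divisors $$$e \mid d$$$ of $$$\mu(d / e)$$$ times the original entry at $$$e$$$. Here is a small standalone sketch to convince yourself, comparing it with a naive divisor sum. The function names are hypothetical and the prime detection is a plain sieve rather than the linear sieve used in this blog.

```cpp
#include <cassert>
#include <vector>

// In place: f[m] <- sum over d | m of mu(m / d) * f[d], for 1 <= m <= t,
// where t = f.size() - 1 and f[0] is unused.
void mobius_transform(std::vector<long long>& f)
{
    assert(f.size() >= 2);
    int t = (int)f.size() - 1;
    std::vector<bool> composite(t + 1, false);
    for (int p = 2; p <= t; ++p)
    {
        if (composite[p]) continue;                        // p is prime
        for (int m = 2 * p; m <= t; m += p) composite[m] = true;
        // Descending d: f[d] still holds its value from before this prime
        // at the moment it is subtracted from f[d * p].
        for (int d = t / p; d >= 1; --d)
            f[d * p] -= f[d];
    }
}

// Naive reference with explicit mu values and divisor loops.
std::vector<long long> naive_transform(const std::vector<long long>& f)
{
    int t = (int)f.size() - 1;
    std::vector<int> mu(t + 1, 1);
    std::vector<bool> composite(t + 1, false);
    for (int p = 2; p <= t; ++p)
    {
        if (composite[p]) continue;
        for (int m = p; m <= t; m += p)
        {
            if (m > p) composite[m] = true;
            mu[m] = -mu[m];                                // one more prime factor
        }
        for (long long m = 1LL * p * p; m <= t; m += 1LL * p * p)
            mu[m] = 0;                                     // divisible by p^2
    }
    std::vector<long long> g(t + 1, 0);
    for (int d = 1; d <= t; ++d)
        for (int m = d; m <= t; m += d)
            g[m] += mu[m / d] * f[d];
    return g;
}

// Transforming f[d] = d should give Euler's phi function.
std::vector<long long> phi_via_transform(int t)
{
    std::vector<long long> f(t + 1);
    for (int d = 0; d <= t; ++d) f[d] = d;
    mobius_transform(f);
    return f;
}

// Compare both transforms on f[d] = d^2.
bool agree_on_squares(int t)
{
    std::vector<long long> f(t + 1);
    for (int d = 0; d <= t; ++d) f[d] = 1LL * d * d;
    std::vector<long long> a = f;
    mobius_transform(a);
    return a == naive_transform(f);
}
```

With $$$f(d) = d$$$ the transform returns $$$\phi$$$, which matches the $$$k = 2$$$ solution of task A.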

But while doing research for task H, I found an improvement

O(sqrt (n/k) log log sqrt(n/k) - k) solution

vector<int> valid;
int cnt[SQRT_LIM];
bool is_squarefree[LIM];
long long res[SQRT_LIM + 10];
int solve(int n, int k)
{
    int t = ceil(sqrt(n) + 0.5);
    if (k > t) return 0;
    linear_sieve(t);
    precal_nck(t);

    memset(res, 0, sizeof(res[0]) * (t - k + 1));
    for (int d = k; d * d <= n; ++d) 
        res[d - k] = nck(d - 1, k - 1);

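    for (int p : prime)
        for (int d = t / p; d >= k; --d)
            res[d * p - k] -= res[d - k]; /// both indices are shifted by k

    long long ans = 0;
    for (int d = k; d <= t; ++d)
        ans += res[d - k] * (n / (d * d));

    ans %= MOD;
    if (ans < 0) ans += MOD; /// res[] may hold negative values after the subtractions
    return ans;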
}

Complexity

The first implementation

The second implementation

The third implementation

Solution for duplicates elements in array

Extra task C

Problem

Given $$$k, n (1 \leq k \leq n \leq 10^9)$$$, count the number of arrays $$$a[]$$$ of size $$$k$$$ that satisfy

$$$1 \leq a_1 \leq a_2 \leq \dots \leq a_k \leq n$$$
$$$a_i \times a_j$$$ is a perfect square $$$\forall 1 \leq i < j \leq k$$$

Since the result can be big, output it modulo $$$10^9 + 7$$$.
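As a quick reference for small inputs, here is a minimal sketch of the counting with duplicates: for each squarefree $$$u$$$ there are $$$m = \lfloor \sqrt{n / u} \rfloor$$$ candidates $$$u \times v^2$$$, and choosing a multiset of size $$$k$$$ from them gives $$$\binom{m + k - 1}{k}$$$ arrays. It is exact and slow on purpose (trial division, no modulo); all function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Exact binomial C(m, k) for small arguments (no modulo).
long long binom_exact(long long m, long long k)
{
    if (k < 0 || k > m) return 0;
    long long r = 1;
    for (long long i = 1; i <= k; ++i)
        r = r * (m - k + i) / i;
    return r;
}

// floor(sqrt(x)) for x >= 0, corrected for floating point error.
long long isqrt_floor(long long x)
{
    long long r = (long long)std::sqrt((long double)x);
    while (r * r > x) --r;
    while ((r + 1) * (r + 1) <= x) ++r;
    return r;
}

// Trial-division squarefree test, fine for small u.
bool is_squarefree_small(int u)
{
    for (int s = 2; s * s <= u; ++s)
        if (u % (s * s) == 0) return false;
    return true;
}

// Non-decreasing arrays of size k in [1, n] whose pairwise products are squares.
long long count_with_duplicates(int n, int k)
{
    assert(n >= 1 && k >= 1);
    long long total = 0;
    for (int u = 1; u <= n; ++u)
    {
        if (!is_squarefree_small(u)) continue;
        long long m = isqrt_floor(n / u);      // number of v with u * v^2 <= n
        total += binom_exact(m + k - 1, k);    // multisets of size k from m items
    }
    return total;
}
```

For $$$n = 10, k = 2$$$ this gives $$$14$$$: the $$$10$$$ pairs $$$(a, a)$$$ plus the $$$4$$$ pairs of distinct numbers from the original problem.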

Idea

Observation

Calculation

Implementation

O(n) solution


int fact[2 * LIM + 10]; /// precal_nck(2 * n + 1) below fills indices up to 2 * n + 1
int invs[2 * LIM + 10];
int tcaf[2 * LIM + 10];
void precal_nck(int n = SQRT_LIM)
{
    fact[0] = fact[1] = 1;
    invs[0] = invs[1] = 1;
    tcaf[0] = tcaf[1] = 1;
    for (int i = 2; i <= n; ++i)
    {
        fact[i] = (1LL * fact[i - 1] * i) % MOD;
        invs[i] = MOD - 1LL * (MOD / i) * invs[MOD % i] % MOD;
        tcaf[i] = (1LL * tcaf[i - 1] * invs[i]) % MOD;
    }
}

int nck(int n, int k)
{
    k = min(k, n - k);
    if (k < 0) return 0;

    long long res = fact[n];
    res *= tcaf[k];         res %= MOD;
    res *= tcaf[n - k];     res %= MOD;
    return res;
}

bool is_squarefree[LIM];
int solve(int n, int k)
{
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (n + 1));
    precal_nck(2 * n + 1);

    long long res = 0;
    for (int i = 1, j; i <= n; ++i) if (is_squarefree[i]) 
    {
        for (j = 1; i * j * j <= n; ++j)
            is_squarefree[i * j * j] = false;

        res += nck(k + j - 2, k);
    }

    res %= MOD;
    return res;
}

O(sqrt n log sqrt n + k) solution

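#include <iostream>
#include <algorithm>
#include <cstring>
#include <vector>
#include <cmath>

using namespace std;

const int LIM = 5e6 + 56;
const int SQRT_LIM = ceil(sqrt(LIM) + 1) + 1;
const int MOD = 1e9 + 7;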

/// Precalculating factorials under prime modulo
int fact[SQRT_LIM + 10]; /// fact[n] = n!
int invs[SQRT_LIM + 10]; /// invs[n] = n^(-1)
int tcaf[SQRT_LIM + 10]; /// tcaf[n] = (n!)^(-1)
void precal_nck(int n = SQRT_LIM)
{
    fact[0] = fact[1] = 1;
    invs[0] = invs[1] = 1;
    tcaf[0] = tcaf[1] = 1;
    for (int i = 2; i <= n; ++i)
    {
        fact[i] = (1LL * fact[i - 1] * i) % MOD;
        invs[i] = MOD - 1LL * (MOD / i) * invs[MOD % i] % MOD;
        tcaf[i] = (1LL * tcaf[i - 1] * invs[i]) % MOD;
    }
}

/// Calculating binomial coefficient queries
int nck(int n, int k)
{
    k = min(k, n - k);
    if (k < 0) return 0;

    long long res = fact[n];
    res *= tcaf[k];         res %= MOD;
    res *= tcaf[n - k];     res %= MOD;
    return res;
}

/// Linear Sieve
vector<int> prime;           /// prime list              = A000040
bool isPrime[SQRT_LIM + 10]; /// characteristic function = A010051
int lpf[SQRT_LIM + 10];      /// lowest prime factor     = A020639
int mu[SQRT_LIM + 10];       /// mobius                  = A008683
void linear_sieve(int n)
{
    if (n < 1) return ;
    /// Extension Sieve || You can add something more
    memset(lpf, 0, sizeof(lpf[0]) * (n + 1));
    fill_n(mu, n + 1, 1);
    /// Main Sieve || Without this, you barely able to achive linear complexity
    prime.clear();
    prime.reserve(n / log(n - 1));
    memset(isPrime, true, sizeof(isPrime[0]) * (n + 1));
    isPrime[0] = isPrime[1] = false;
    for (int x = 2; x <= n; ++x) /// For each number
    {
        if (isPrime[x]) /// Func[Prime]
        {
            mu[x] = -1;
            lpf[x] = x;
            prime.push_back(x);
        }
        for (int p : prime) /// Func[Prime * X] <- Func[Prime]
        {
            if (p > lpf[x] || x * p > n) break;
            isPrime[x * p] = 0;
            lpf[x * p] = p;
            mu[x * p] = (lpf[x] == p) ? 0 : -mu[x];
        }
    }
}

/// Divisor sieve
vector<int> divisors[SQRT_LIM + 10]; /// +10 so that index t = ceil(sqrt(n) + 1) + 1 stays in bounds
void precal_div(int n) /// O(n log n)
{
    for (int u = n; u >= 1; --u)
    {
        divisors[u].clear();
        for (int v = u; v <= n; v += u)
            divisors[v].push_back(u);
    }
}

/// Solving for n, k
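long long solve(int n, int k)
{
    /// We only care for d that 1 <= d <= sqrt(n)
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t + k); /// nck(p + k - 2, k - 1) needs factorials up to t + k, so fact/invs/tcaf must be sized accordingly
    precal_div(t);

    long long res = 0;
    for (int d = 1; d * d <= n; ++d) /// For each fixed d^2
    {
        long long sum = 0;
        for (int p : divisors[d]) /// For each (p | d)
            sum += mu[d / p] * nck(p + k - 2, k - 1); /// the binomial depends on the divisor p, not on d

        sum %= MOD;
        res += sum * (n / (d * d));
    }

    res %= MOD;
    if (res < 0) res += MOD; /// the Mobius terms can leave a negative remainder
    return res;
}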

int main()
{
    ios::sync_with_stdio(false);
    cin.tie(NULL);

    /// Assumming constant p = 10^9 + 7
    int n, k;
    cin >> n >> k;
    cout << solve(n, k);
    return 0;
}

O(sqrt n log log sqrt n + k) solution

int fact[SQRT_LIM + 10];
int invs[SQRT_LIM + 10];
int tcaf[SQRT_LIM + 10];
void precal_nck(int n = SQRT_LIM)
{
    fact[0] = fact[1] = 1;
    invs[0] = invs[1] = 1;
    tcaf[0] = tcaf[1] = 1;
    for (int i = 2; i <= n; ++i)
    {
        fact[i] = (1LL * fact[i - 1] * i) % MOD;
        invs[i] = MOD - 1LL * (MOD / i) * invs[MOD % i] % MOD;
        tcaf[i] = (1LL * tcaf[i - 1] * invs[i]) % MOD;
    }
}

int nck(int n, int k)
{
    k = min(k, n - k);
    if (k < 0) return 0;

    long long res = fact[n];
    res *= tcaf[k];         res %= MOD;
    res *= tcaf[n - k];     res %= MOD;
    return res;
}

vector<int> prime;           /// prime list              = A000040
bool isPrime[SQRT_LIM + 10]; /// characteristic function = A010051
int lpf[SQRT_LIM + 10];      /// lowest prime factor     = A020639
void linear_sieve(int n)
{
    if (n < 1) return ;
    prime.clear();
    prime.reserve(n / log(n - 1));
    memset(lpf, 0, sizeof(lpf[0]) * (n + 1));
    memset(isPrime, true, sizeof(isPrime[0]) * (n + 1));
    isPrime[0] = isPrime[1] = false;
    for (int x = 2; x <= n; ++x)
    {
        if (isPrime[x]) /// Func[Prime]
        {
            lpf[x] = x;
            prime.push_back(x);
        }
        for (int p : prime) /// Func[Prime * X] <- Func[Prime]
        {
            if (p > lpf[x] || x * p > n) break;
            isPrime[x * p] = 0;
            lpf[x * p] = p;
        }
    }
}

long long res[SQRT_LIM + 10];
long long solve(int n, int k)
{
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);      /// only primes up to t are needed for the transform
    precal_nck(t + k);    /// fact/invs/tcaf must be large enough to hold t + k entries

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
        res[d] = nck(d + k - 2, k - 1);

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    if (ans < 0) ans += MOD; /// res[d] may be negative after the subtractions
    return ans;
}

Complexity

The first implementation

The second and third implementation

Solution when there are no restriction between k, n, p

Extra task D

Problem

Given $$$k, n, p (1 \leq k, n, p \leq 10^9)$$$, count the number of arrays $$$a[]$$$ of size $$$k$$$ that satisfy

$$$1 \leq a_1 < a_2 < \dots < a_k \leq n$$$
$$$a_i \times a_j$$$ is a perfect square $$$\forall 1 \leq i < j \leq k$$$

Since the result can be big, output it modulo $$$p$$$.

Idea

Observation

Large prime p

For a large prime $$$p > \max(n, k)$$$

Use the usual factorial-based combinatorics: since $$$p > \max(n, k)$$$, no factorial involved is divisible by $$$p$$$, so nothing affects the result.
To divide under the modulus, multiply by the modular inverse, which always exists modulo a prime for any number not divisible by it.
This is a standard problem; just be careful about overflow.
You can also precompute the factorials, the inverses, and the inverse factorials in linear time.

For a general prime $$$p$$$

We can ignore the factors $$$p$$$ when calculating $$$n!$$$.
You also need to know how many times the factor $$$p$$$ appears in $$$1 \dots n$$$,
then combine it back when calculating the answer.
If we do not do this, $$$n!$$$ might be divisible by $$$p$$$, and then it has no modular inverse.
With precalculation you can answer queries in $$$O(1)$$$.
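The idea for a general prime can be sketched as follows, assuming $$$p$$$ is prime (possibly $$$p \leq n$$$). The sketch recomputes its tables on every call for clarity, so it does not show the $$$O(1)$$$ query version; `pow_mod` and `binom_mod_prime` are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// Fast exponentiation b^e mod m, assuming m < 2^31 so products fit in long long.
long long pow_mod(long long b, long long e, long long m)
{
    long long r = 1 % m;
    b %= m;
    while (e > 0)
    {
        if (e & 1) r = r * b % m;
        b = b * b % m;
        e >>= 1;
    }
    return r;
}

// C(n, k) mod p for a prime p, valid even when p <= n.
long long binom_mod_prime(int n, int k, int p)
{
    assert(p >= 2 && n >= 0);
    if (k < 0 || k > n) return 0;

    std::vector<long long> fact(n + 1, 1 % p); // fact[x] = x! with all factors p removed, mod p
    std::vector<int> cnt(n + 1, 0);            // cnt[x]  = exponent of p in x!
    for (int x = 1; x <= n; ++x)
    {
        int t = x;
        cnt[x] = cnt[x - 1];
        while (t % p == 0) { t /= p; ++cnt[x]; }
        fact[x] = fact[x - 1] * (t % p) % p;
    }

    // Exponent of p left in C(n, k); any leftover factor p makes the result 0 mod p.
    int e = cnt[n] - cnt[k] - cnt[n - k];
    if (e > 0) return 0;

    // The stripped factorials are coprime to p, so Fermat's little theorem inverts them.
    long long denom = fact[k] * fact[n - k] % p;
    return fact[n] * pow_mod(denom, p - 2, p) % p;
}
```

For example, $$$\binom{10}{3} = 120$$$ still contains one factor $$$3$$$ after the cancellation, so the result modulo $$$3$$$ is $$$0$$$, while $$$\binom{5}{2} = 10$$$ gives $$$1$$$.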

For a squarefree $$$p$$$

Factorize $$$p = p_1 \times p_2 \times \dots \times p_q$$$ where all $$$p_i$$$ are prime.
Ignore all factors $$$p_i$$$ when calculating $$$n!$$$.
Remember to calculate how many times each factor $$$p_i$$$ appears in $$$1 \dots n$$$.
When querying for the answer, we just combine all those parts back.
Remember that for a base coprime to $$$p$$$ you can use Euler's theorem with $$$\phi(p)$$$ (for example to invert it), and $$$\phi(p)$$$ can be calculated while factorizing $$$p$$$.
Remember that the stripped $$$n!$$$ must not be divisible by any factor $$$p_i$$$; otherwise you will get a wrong answer.
With precalculation you can answer queries in $$$O(\log p)$$$.

For a general positive modulo $$$p$$$

Factorize $$$p = p_1^{f_1} \times p_2^{f_2} \times \dots \times p_q^{f_q}$$$ where the $$$p_i$$$ are distinct primes.
We calculate $$$C(n, k)$$$ modulo $$$p_i^{f_i}$$$ for each $$$i = 1 \dots q$$$.
To do that, we need to calculate $$$n!$$$ modulo $$$p_i^{f_i}$$$ which is described here.
To get the final answer we can use CRT.
This is rather hard to code and debug, and it is easy to make mistakes, so be careful.
I will leave the implementation to you, lovely readers.
Depending on how you calculate things, the query complexity might increase.
There are few papers about this (effective or at least fully correct ones), but you can read the one written here
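Only the final CRT step is sketched here, not the computation of $$$n!$$$ modulo a prime power. It combines two residues for coprime moduli, assuming $$$m_1 \times m_2$$$ fits in a signed 64-bit integer and $$$m_2 < 2^{31}$$$; `ext_gcd` and `crt_pair` are hypothetical names.

```cpp
#include <cassert>

// Extended Euclid: returns g = gcd(a, b) and sets x, y with a * x + b * y = g.
long long ext_gcd(long long a, long long b, long long& x, long long& y)
{
    if (b == 0) { x = 1; y = 0; return a; }
    long long x1, y1;
    long long g = ext_gcd(b, a % b, x1, y1);
    x = y1;
    y = x1 - (a / b) * y1;
    return g;
}

// Given r1 mod m1 and r2 mod m2 with gcd(m1, m2) = 1, return the unique
// residue in [0, m1 * m2) congruent to both. Assumes 0 <= r1 < m1.
long long crt_pair(long long r1, long long m1, long long r2, long long m2)
{
    long long x, y;
    long long g = ext_gcd(m1, m2, x, y);
    assert(g == 1);                                  // moduli must be coprime
    long long inv = ((x % m2) + m2) % m2;            // inverse of m1 modulo m2
    long long diff = (((r2 - r1) % m2) + m2) % m2;   // how far r1 is from r2 modulo m2
    long long k = diff * inv % m2;                   // r1 + m1 * k == r2 (mod m2)
    return r1 + m1 * k;
}
```

To combine more than two prime powers, fold them one at a time: the result of one call, together with the product of its two moduli, becomes the first pair of the next call.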

Implementation

O(n) for prime p > max(n, k)

/// SPyofgame linear template for precalculating factorials under large prime modulo
int fact[SQRT_LIM + 10]; /// fact[n] = n!
int invs[SQRT_LIM + 10]; /// invs[n] = n^(-1)
int tcaf[SQRT_LIM + 10]; /// tcaf[n] = (n!)^(-1)
void precal_nck(int n = SQRT_LIM)
{
    fact[0] = fact[1] = 1;
    invs[0] = invs[1] = 1;
    tcaf[0] = tcaf[1] = 1;
    for (int i = 2; i <= n; ++i)
    {
        fact[i] = (1LL * fact[i - 1] * i) % MOD;
        invs[i] = MOD - 1LL * (MOD / i) * invs[MOD % i] % MOD;
        tcaf[i] = (1LL * tcaf[i - 1] * invs[i]) % MOD;
    }
}

/// Calculating binomial coefficient queries
int nck(int n, int k)
{
    k = min(k, n - k);
    if (k < 0) return 0;

    long long res = fact[n];
    res *= tcaf[k];         res %= MOD;
    res *= tcaf[n - k];     res %= MOD;
    return res;
}

O(n log mod + sqrt(mod)) for prime p or squarefree p

vector<int> factor;
int factorize(int n) /// Calculating phi(n) while factorizing (n) in O(sqrt n)
{
    factor.clear();
    int phi = n;

    if (!(n & 1))
    {
        n >>= __builtin_ctz(n);
        factor.push_back(2);
        phi -= phi / 2;
    }

    for (int x = 3; x * x <= n; x += 2)
    {
        if (n % x == 0)
        {
            do n /= x; while (n % x == 0);
            factor.push_back(x);
            phi -= phi / x;
        }
    }

    if (n > 1)
    {
        factor.push_back(n);
        phi -= phi / n;
    }

    return phi;
}

int f[LIM];    /// f[x] = nck(n, x)
int fact[LIM]; /// n! 
int tcaf[LIM]; /// n!^(-1)
int divp[LIM]; /// x but ignore all factors p[i]
int cntp[LIM][LOG_LIM]; /// cntp[x][i] = Number of time factor p[i] appear in 1..x
void precal(int MOD) /// Calculate f[x] for all x = 1 -> n in O(n log mod + sqrt mod)
{
    int PHIMOD = factorize(MOD);
    for (int x = 1; x <= n; ++x) /// For each part x in n!
    {
        int &t = divp[x] = x;
        for (int i = 0; i < factor.size(); ++i) /// Ignore all factor p[i] of p
        {
            cntp[x][i] = cntp[x - 1][i];
            for (; t % factor[i] == 0; t /= factor[i]) /// Count how many times p[i] appears in 1..n
                ++cntp[x][i];
        }
    }

    fact[0] = fact[1] = 1;
    tcaf[0] = tcaf[1] = 1;
    for (int x = 2; x <= n; ++x) /// Finding n! and n!^(-1)
    {
        fact[x] = (1LL * fact[x - 1] * divp[x]) % MOD;
        tcaf[x] = powMOD(fact[x], PHIMOD - 1, MOD);
    }

    memset(f, 0, sizeof(f[0]) * k);
    for (int x = k; x <= n; ++x)
    {
        /// Calculate nck % p normally
        f[x] = fact[x];
        mulMOD(f[x], tcaf[k], MOD);
        mulMOD(f[x], tcaf[x - k], MOD);
        for (int i = 0; i < factor.size(); ++i) /// Bringing those factors back
        {
            int p = cntp[x][i] - cntp[k][i] - cntp[x - k][i];
            f[x] = 1LL * f[x] * powMOD(factor[i], p, MOD) % MOD;
        }
    }
}

Complexity

Spoiler

Solution when numbers are also bounded by negative number

Extra task E

Problem

Given $$$k, n (1 \leq k \leq n \leq 10^9)$$$, count the number of arrays $$$a[]$$$ of size $$$k$$$ that satisfy

$$$-n \leq a_1 < a_2 < \dots < a_k \leq n$$$
$$$a_i \times a_j$$$ is a perfect square $$$\forall 1 \leq i < j \leq k$$$

Since the result can be big, output it modulo $$$10^9 + 7$$$.

Idea

Hint

This is the same as extra task C; only the counting part changes.

Since we only care about integers, let us not bring complex math into this problem.

If the array contains both a negative number and a positive number, their product is negative, so the array is not valid.

Be careful: there are zeros too.

When the numbers are all unique, or $$$-n \leq a_1 < a_2 < \dots < a_k \leq n$$$

There are 4 cases:

This gives us the formula $$$task_E(n, k) = 2 \times task_B(n, k) + 2 \times task_B(n, k - 1)$$$.
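The formula can be sanity-checked on tiny inputs with a brute force over an arbitrary range. The sketch below only tests $$$k \geq 2$$$ and treats $$$0$$$ as a perfect square; it is a checker rather than a solution, and all names in it are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// True if x is a perfect square; negative numbers never are, 0 is.
bool is_square_signed(long long x)
{
    if (x < 0) return false;
    long long r = (long long)std::sqrt((long double)x);
    while (r * r > x) --r;
    while ((r + 1) * (r + 1) <= x) ++r;
    return r * r == x;
}

// Backtracking count of strictly increasing arrays of size k inside [lo, hi]
// whose pairwise products are all perfect squares (tiny ranges only).
long long count_range(int lo, int hi, int k, std::vector<int>& cur)
{
    if ((int)cur.size() == k) return 1;
    long long total = 0;
    int start = cur.empty() ? lo : cur.back() + 1;
    for (int v = start; v <= hi; ++v)
    {
        bool ok = true;
        for (int u : cur)
            if (!is_square_signed(1LL * u * v)) { ok = false; break; }
        if (!ok) continue;
        cur.push_back(v);
        total += count_range(lo, hi, k, cur);
        cur.pop_back();
    }
    return total;
}

long long brute_range(int lo, int hi, int k)
{
    assert(k >= 1 && lo <= hi);
    std::vector<int> cur;
    return count_range(lo, hi, k, cur);
}

// Compare task_E(n, k) with 2 * task_B(n, k) + 2 * task_B(n, k - 1) for k >= 2.
bool formula_matches(int n, int k)
{
    assert(n >= 1 && k >= 2);
    long long lhs = brute_range(-n, n, k);
    long long rhs = 2 * brute_range(1, n, k) + 2 * brute_range(1, n, k - 1);
    return lhs == rhs;
}
```

For $$$n = 4, k = 2$$$ both sides are $$$10$$$: the pairs $$$(1, 4)$$$ and $$$(-4, -1)$$$, plus the $$$8$$$ pairs that contain $$$0$$$.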

Hint 1

Hint 2

Hint 3

Hint 4

Proof

Remember that when $$$k = 0$$$ the answer is $$$0$$$; otherwise you might get wrong results from negative arguments in the binomial coefficient formula.

With duplicates case

So what if I mix the problem with task C too?

When the numbers can have duplicates, or $$$-n \leq a_1 \leq a_2 \leq \dots \leq a_k \leq n$$$

There are 5 cases:

Once again, you can simplify it to fewer cases for easier calculation.

There are 2 main cases:

This gives us the formula $$$task_E(n, k) = 1 + 2 \times \sum_{t = 1}^{k} task_C(n, t)$$$, where $$$task_C$$$ is the count with duplicates from extra task C.

Why is the formula 2 * ...?

No, I mean why is there no binomial coefficient for selecting the number of zeros?

So where does the 1 come from? Why isn't it 2 instead?

But this gives you an $$$O(k)$$$ solution.

You can do better with math

Hint 1

Hint 2

Solution
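A possible derivation, sketched under the assumption (taken from the $$$O(k \sqrt n)$$$ implementation in this section) that arrays of size $$$t$$$ contribute the weight $$$\binom{d + t - 2}{t - 1}$$$ for each $$$d$$$: the sum over $$$t$$$ collapses by the hockey-stick identity,

$$$\sum_{t = 1}^{k} \binom{d + t - 2}{t - 1} = \sum_{j = 0}^{k - 1} \binom{d - 1 + j}{j} = \binom{d + k - 1}{k - 1}$$$

so a single binomial per $$$d$$$ replaces the loop over $$$t$$$. For instance, with $$$d = 2$$$ and $$$k = 3$$$: $$$\binom{1}{0} + \binom{2}{1} + \binom{3}{2} = 1 + 2 + 3 = 6 = \binom{4}{2}$$$.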

Implementation

O(sqrt n log log sqrt n) when the numbers are unique


long long res[SQRT_LIM + 10];
long long solve(int n, int k)
{
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
    {
        res[d] += (k >= 1) * nck(d - 1, k - 1) * 2;
        res[d] += (k >= 2) * nck(d - 1, k - 2) * 2;  
    }

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    if (ans < 0) ans += MOD; /// res[d] may be negative after the subtractions
    return ans;
}

And for duplicates (mixed with task C), we have:

O(kn) = O(n^2)

bool is_squarefree[LIM];
int brute(int n, int k)
{
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (n + 1));
    precal_nck(2 * n + 1);

    long long res = 0;
    for (int i = 1, j; i <= n; ++i) if (is_squarefree[i]) 
    {
        for (j = 1; i * j * j <= n; ++j)
            is_squarefree[i * j * j] = false;

        res += nck(k + j - 2, k);
    }

    res %= MOD;
    return res;
}


long long solve(int n, int k)
{
    long long res = 1;
    for (int t = 1; t <= k; ++t)
        res += brute(n, t) * 2;

    res %= MOD;
    return res;
}

O(k sqrt n log sqrt n) = O(n sqrt n log n)

long long res[SQRT_LIM + 10];
long long brute(int n, int k)
{
    /// We only care for d that 1 <= d <= sqrt(n)
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t + k);
    precal_div(t);

    long long res = 0;
    for (int d = 1; d * d <= n; ++d) /// For each fixed d^2
    {
        long long sum = 0;
        for (int p : divisors[d]) /// For each (p | d)
            sum += mu[d / p] * nck(p + k - 2, k - 1); /// the binomial depends on the divisor p, not on d

        sum %= MOD;
        res += sum * (n / (d * d));
    }

    res %= MOD;
    if (res < 0) res += MOD; /// the Mobius terms can leave a negative remainder
    return res;
}

long long solve(int n, int k)
{
    long long res = 1;
    for (int t = 1; t <= k; ++t)
        res += brute(n, t) * 2;

    res %= MOD;
    return res;
}

O(k sqrt n log log sqrt n) = O(n sqrt n log log n)

long long res[SQRT_LIM + 10];
long long brute(int n, int k)
{
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t + k);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
        res[d] = nck(d + k - 2, k - 1);

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    if (ans < 0) ans += MOD; /// res[d] may be negative after the subtractions
    return ans;
}

long long solve(int n, int k)
{
    long long res = 1;
    for (int t = 1; t <= k; ++t)
        res += brute(n, t) * 2;

    res %= MOD;
    return res;
}

O(k sqrt n + sqrt n log log sqrt n) = O(n sqrt n)

long long res[SQRT_LIM + 10];
long long solve(int n, int k)
{
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t + k);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
        for (int s = 1; s <= k; ++s) /// s = array size; renamed so it does not shadow the bound t
            res[d] += nck(d + s - 2, s - 1) * 2;

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 1;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    if (ans < 0) ans += MOD; /// res[d] may be negative after the subtractions
    return ans;
}

O(k + sqrt n log log sqrt n) = O(n)


long long res[SQRT_LIM + 10];
long long solve(int n, int k)
{
    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t + k);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
        res[d] += nck(d + k - 1, k - 1) * 2;

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 1;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    if (ans < 0) ans += MOD; /// res[d] may be negative after the subtractions
    return ans;
}

Complexity

Spoiler

Conclusion

Spoiler

Solution when numbers are also bounded by a specific range

Extra task F

Problem

Given $$$k, L, R (1 \leq k, L, R \leq 10^9)$$$, count the number of arrays $$$a[]$$$ of size $$$k$$$ that satisfy

$$$L \leq a_1 < a_2 < \dots < a_k \leq R$$$
$$$a_i \times a_j$$$ is a perfect square $$$\forall 1 \leq i < j \leq k$$$

Since the result can be big, output it modulo $$$10^9 + 7$$$.

Idea

Observation

We split into 4 cases

The cases

You can easily solve each case in linear time

Case 1

Case 2

Case 3

Case 4

Now, assuming $$$1 \leq L \leq R$$$, here is how we solve it.

As a simple approach, we can proceed as in the original problem for general $$$k$$$.

We just need to iterate over each fixed squarefree $$$u$$$ and count the number of ways to select $$$p^2$$$ as usual, with a small change.
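One way to read the "small change" for a range: for a fixed squarefree $$$u$$$, the number of multipliers $$$v$$$ with $$$L \leq u \times v^2 \leq R$$$ is a difference of two integer square roots. A minimal sketch, assuming $$$1 \leq L$$$ and with hypothetical helper names:

```cpp
#include <cassert>
#include <cmath>

// floor(sqrt(x)) for x >= 0, corrected for floating point error.
long long isqrt_floor(long long x)
{
    assert(x >= 0);
    long long r = (long long)std::sqrt((long double)x);
    while (r * r > x) --r;
    while ((r + 1) * (r + 1) <= x) ++r;
    return r;
}

// Number of integers v >= 1 with L <= u * v * v <= R, for u >= 1 and L >= 1.
long long count_multipliers(long long u, long long L, long long R)
{
    assert(u >= 1 && L >= 1);
    if (L > R) return 0;
    long long hi = isqrt_floor(R / u);        // largest v with u * v^2 <= R
    long long lo = isqrt_floor((L - 1) / u);  // largest v with u * v^2 <= L - 1
    return hi - lo;
}
```

With this count $$$c$$$ in hand, the class of $$$u$$$ contributes $$$\binom{c}{k}$$$ arrays, exactly as in the original problem. For $$$u = 2, L = 5, R = 20$$$ the candidates are $$$8$$$ and $$$18$$$, so $$$c = 2$$$.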

Why do I get WA

Doing so naively gives a complexity of $$$O((R - L + 1) \log R)$$$ time and $$$O(R)$$$ space.

Optimize my time complexity

Hint 1 to optimize space complexity

Hint 2 to optimize space complexity

Hint 3 to optimize space complexity

Optimization

Is it over? Can't we do better?

Isn't there a trivial way that we forgot?

Is there a way to iterate through $$$[L, R]$$$?

Can we find a way that does not iterate over the whole range $$$[1, R]$$$?

Can we just iterate through $$$[1, \sqrt{R}]$$$ and $$$[L, R]$$$ to solve it?

What is that way?

Isn't it bad?

Can we factorize numbers faster?

Is there another way than Pollard's rho?

But how do we count the number of ways to select $$$p^2$$$?

But how can we iterate over each $$$p^2$$$?

Wait, is there another way to reduce the complexity?

Wait, do we still use factorization?

Another way, but still faster than applying Pollard's rho?

You mean something like sieving, or am I wrong about something?

In the normal sieve we mark by primes, but now the loop is inside. Do you mean to...

Can we use a trick like $$$u \times c^2$$$ to reduce the number of cases when we take primes as the outer loop?

Wait, isn't this a kind of segmented sieve?

So what is the better solution, and how do we implement it step by step?

Solution

Implementation

Let $$$Z = max(|L|, |R|)$$$

O(Z) time - O(R - L) space

bool is_squarefree[LIM + 10];
int solve(int l, int r, int k)
{
    if (l > r) return 0;
    if (r < 0) return solve(-r, -l, k);
    if (l <= 0 && 0 <= r)
    {
        long long res = 0LL + taskB(abs(l), k) + taskB(abs(l), k - 1) + taskB(abs(r), k) + taskB(abs(r), k - 1);
        while (res >= MOD) res -= MOD;
        return res;
    }

    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (r - l + 1));
    precal_nck(r - l + 1);

    int tot = r - l + 1;
    long long res = 0;
    for (int i = 1; i <= r; ++i)
    {
        int j = sqrt(l / i);
        while (i * j * j > l) --j;
        while (i * j * j < l) ++j;
        if (i * j * j <= r && is_squarefree[i * j * j - l]) /// skip when the first multiple >= l already exceeds r
        {
            int cnt = 0;
            for (; i * j * j <= r; ++j)
            {
                ++cnt;
                --tot;
                is_squarefree[i * j * j - l] = false;
            }
            res += nck(cnt, k);
            if (tot == 0) break;
        }
    }

    res %= MOD;
    return res;
}

O(R * sqrt(Z) / log(Z)) time - O(R - L) space


long long res[SQRT_LIM + 10];
long long taskB(int n, int k)
{
    if (k == 0) return 0;
    if (k == 1) return n;

    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
        res[d] = nck(d - 1, k - 1);

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    return ans;
}

bool is_squarefree[LIM + 10];
int solve(int l, int r, int k)
{
    if (l > r) return 0;
    if (r < 0) return solve(-r, -l, k);
    if (l <= 0 && 0 <= r)
    {
        long long res = 0LL + taskB(abs(l), k) + taskB(abs(l), k - 1) + taskB(abs(r), k) + taskB(abs(r), k - 1);
        while (res >= MOD) res -= MOD;
        return res;
    }
    
    int t = ceil(sqrt(r + 1) + 1) + 1;
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (r - l + 1));
    precal_nck(r - l + 1);
    linear_sieve(t);

    long long res = 0;
    for (int x = l; x <= r; ++x) if (is_squarefree[x - l])
    {
        int u = x;
        int v = 1;
        for (int p : prime)
        {
            int q = p * p;
            if (q > u) break;
            while (u % q == 0)
            {
                u /= q;
                v *= p;
            }
        }

        int cnt = 0;
        for (; u * v * v <= r; ++v)
        {
            is_squarefree[u * v * v - l] = false;
            ++cnt;
        }

        res += nck(cnt, k);
    }

    res %= MOD;
    return res;
}
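The double loop over `prime` inside `taskB` above can be read as an in-place Möbius transform: after it finishes, `res[d]` holds the sum of `mu(d / e) * g(e)` over the divisors `e` of `d`, where `g` is the array it started from. Here is a minimal standalone sketch of that one step. The name `mobius_transform` and the simple inline prime detection are assumptions for illustration, not the blog's `linear_sieve`.

```cpp
#include <cassert>
#include <vector>

// Returns f with f[d] = sum over e | d of mu(d / e) * g[e], for 1 <= d <= t,
// where t = g.size() - 1. Index 0 is ignored.
// Hypothetical helper: it mirrors the prime loop of taskB, nothing more.
std::vector<long long> mobius_transform(std::vector<long long> g)
{
    int t = (int)g.size() - 1;
    std::vector<bool> composite(t + 1, false);
    for (int p = 2; p <= t; ++p)
    {
        if (composite[p]) continue; // p is prime
        for (int m = 2 * p; m <= t; m += p) composite[m] = true;

        // Descending d: g[d] is read before prime p has modified it,
        // because g[d] is only written later, when the loop reaches d / p.
        for (int d = t / p; d > 0; --d)
            g[d * p] -= g[d];
    }
    return g;
}
```

As a sanity check, feeding `g[d] = d` should give Euler's phi function, since phi is the Möbius transform of the identity. This is presumably the link to the "phi function" mentioned in tasks A and B.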

O(sqrt R log log R + (R - L)) time - O(R - L) space

long long res[SQRT_LIM + 10];
long long taskB(int n, int k)
{
    if (k == 0) return 0;
    if (k == 1) return n;

    int t = ceil(sqrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d <= n; ++d) 
        res[d] = nck(d - 1, k - 1);

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = 1; d * d <= n; ++d)
        ans += res[d] * (n / (d * d));

    ans %= MOD;
    return ans;
}

#include <algorithm>
#define all(x) (x).begin(), (x).end()
bool is_squarefree[LIM + 10];
int squarefree[LIM + 10];
int sqf_factor[LIM + 10];
int solve(int l, int r, int k)
{
    if (l > r) return 0;
    if (r < 0) return solve(-r, -l, k);
    if (l <= 0 && 0 <= r)
    {
        long long res = 0LL + taskB(abs(l), k) + taskB(abs(l), k - 1) + taskB(abs(r), k) + taskB(abs(r), k - 1);
        while (res >= MOD) res -= MOD;
        return res;
    }
    
    int t = ceil(sqrt(r + 1) + 1) + 1;
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (r - l + 1));
    precal_nck(r - l + 1);
    linear_sieve(t);
    
    for (int x = l; x <= r; ++x)
    {
        squarefree[x - l] = x;
        sqf_factor[x - l] = 1;
    }

    long long res = 0;
    for (int p : prime)
    {
        for (int q = p * p, x = max(q, (l + q - 1) / q * q); x <= r; x += q)
        {
            while (squarefree[x - l] % q == 0)
            {
                squarefree[x - l] /= q;
                sqf_factor[x - l] *= p;
            }
        }
    }
    
    for (int x = l; x <= r; ++x)
    {
        int u = squarefree[x - l];
        int p = sqf_factor[x - l];
        if (is_squarefree[x - l])
        {
            int cnt = 0;
            for (; u * p * p <= r; ++p)
            {
                is_squarefree[u * p * p - l] = false;
                ++cnt;
            }
    
            res += nck(cnt, k);   
        }
    }
    
    res %= MOD;
    return res;
}
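To sanity-check any of the three implementations above on small positive ranges, a slow reference can help. The sketch below assumes $$$1 \leq L \leq R$$$ and groups every number by its squarefree part; two positive integers have a perfect-square product exactly when they share that part. The names `squarefree_part`, `small_nck` and `brute_range` are hypothetical, and the binomial is exact (no modulo), so it is only meant for tiny inputs.

```cpp
#include <cassert>
#include <map>

// Squarefree part of x: divide out every square factor p * p.
long long squarefree_part(long long x)
{
    for (long long p = 2; p * p <= x; ++p)
        while (x % (p * p) == 0) x /= p * p;
    return x;
}

// Exact binomial coefficient for small arguments (no modulo).
long long small_nck(long long n, long long k)
{
    if (k < 0 || k > n) return 0;
    long long res = 1;
    for (long long i = 1; i <= k; ++i)
        res = res * (n - k + i) / i; // division is exact at every step
    return res;
}

// Number of k-subsets of [l, r] (1 <= l <= r) whose pairwise products are squares.
long long brute_range(long long l, long long r, long long k)
{
    std::map<long long, long long> group; // squarefree part -> size of its group
    for (long long x = l; x <= r; ++x) ++group[squarefree_part(x)];

    long long res = 0;
    for (const auto &kv : group) res += small_nck(kv.second, k);
    return res;
}
```

For example, on $$$[1, 16]$$$ with $$$k = 2$$$ the groups are $$$\{1, 4, 9, 16\}$$$, $$$\{2, 8\}$$$, $$$\{3, 12\}$$$ and eight singletons, so the expected count is $$$6 + 1 + 1 = 8$$$.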

Complexity

Spoiler

Solution when the product you must find is a perfect cube

Extra task G

Problem

Given $$$k, n$$$ $$$(1 \leq k \leq n \leq 10^9)$$$, count the number of arrays $$$a[]$$$ of size $$$k$$$ that satisfy

$$$1 \leq a_1 < a_2 < \dots < a_k \leq n$$$
$$$a_i \times a_j \times a_t$$$ is perfect cube $$$\forall 1 \leq i < j < t \leq k$$$

Since the result can be big, output it under modulo $$$10^9 + 7$$$.

Idea

k < 3

k > 3

For $$$k > 3$$$, you can prove that all the numbers you select must share the same cube-free part, so a linear cube-free sieve is enough.
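A minimal sketch of that grouping idea, for small $$$n$$$ only: bucket every number by its cube-free part and pick $$$k$$$ elements from one bucket. The names `cubefree_part`, `small_nck` and `count_cube_groups` are hypothetical, and the binomial is exact rather than taken modulo $$$10^9 + 7$$$.

```cpp
#include <cassert>
#include <map>

// Cube-free part of x: divide out every cube factor p * p * p.
long long cubefree_part(long long x)
{
    for (long long p = 2; p * p * p <= x; ++p)
        while (x % (p * p * p) == 0) x /= p * p * p;
    return x;
}

// Exact binomial coefficient for small arguments (no modulo).
long long small_nck(long long n, long long k)
{
    if (k < 0 || k > n) return 0;
    long long res = 1;
    for (long long i = 1; i <= k; ++i)
        res = res * (n - k + i) / i; // division is exact at every step
    return res;
}

// For k > 3: number of k-subsets of [1, n] sharing one cube-free part.
long long count_cube_groups(long long n, long long k)
{
    std::map<long long, long long> group; // cube-free part -> size of its group
    for (long long x = 1; x <= n; ++x) ++group[cubefree_part(x)];

    long long res = 0;
    for (const auto &kv : group) res += small_nck(kv.second, k);
    return res;
}
```

For $$$n = 125$$$ and $$$k = 4$$$, the only group with at least four elements is $$$\{1, 8, 27, 64, 125\}$$$, so the count is $$$\binom{5}{4} = 5$$$.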

Hint 1

Hint 2

Hint 3

Hint 4

Hint 5

But still, you can apply the same idea used in extra task B to achieve a better complexity.

Instead of throwing around a bunch of math with weird formulas and long chains of theorems and proofs,

we can look at the algorithm the other way around and then arrive at the formula.

What we had done in task B

What we are going to do in task G

So now we come to the formula:
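As a hedged summary for $$$k > 3$$$, this is what the Möbius-based implementations in this section appear to compute. The symbols $$$g$$$, $$$f$$$ and $$$c(x)$$$ are introduced here only for illustration: $$$c(x)$$$ is the largest $$$d$$$ with $$$d^3 \mid x$$$, so every $$$x$$$ is written uniquely as $$$u \cdot c(x)^3$$$ with $$$u$$$ cube-free.

```latex
% g(d): number of k-subsets of the chain u*1^3 < u*2^3 < ... whose largest index is d
g(d) = \binom{d-1}{k-1}, \qquad f(d) = \sum_{e \mid d} \mu(d/e)\, g(e)

% hockey-stick identity: a chain of length m contributes C(m, k) = g(1) + ... + g(m)
% and e | c(x) holds exactly when e^3 | x
\text{answer}
  = \sum_{x=1}^{n} g\bigl(c(x)\bigr)
  = \sum_{x=1}^{n} \sum_{e \mid c(x)} f(e)
  = \sum_{e^3 \le n} f(e) \left\lfloor \frac{n}{e^3} \right\rfloor
```

The last sum has only about $$$\sqrt[3]{n}$$$ terms, which is where the cube-root complexity for $$$k > 3$$$ comes from.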

Defining stuffs

Hint 1

Hint 2

Hint 3

Hint 4

Bonus

k = 3

Implementation

k<3::O(1) || k>3::O(n) || k=3::O(n^2 log n) solution

int cnt[LIM];
vector<int> appear[LIM];
int solveG(int n, int k)
{
    linear_sieve(n * n);
    for (int i = 1; i <= n; ++i)
        appear[i].clear();

    for (int i = 1; i <= n; ++i)
    {
        for (int j = i + 1; j <= n; ++j)
        {
            int u = 1;
            for (int x = i * j; x > 1; )
            {
                int a = lpf[x], b = a * a * a;
                for (; x % b == 0; x /= b);
                for (; x % a == 0; x /= a) u *= a;
            }
            appear[j + 1].push_back(u);
        }
    }

    long long res = 0;
    memset(cnt, 0, sizeof(cnt[0]) * (n * n + 1));
    for (int i = 1; i <= n; ++i)
    {
        for (int u : appear[i]) ++cnt[u];
        int v = 1;
        for (int x = i; x > 1; )
        {
            int a = lpf[x], b = a * a * a;
            for (; x % b == 0; x /= b);
            if (x % (a * a) == 0)
            {
                x /= a * a;
                v *= a;
            }
            else if (x % a == 0)
            {
                x /= a;
                v *= a * a;
            }
        }

        res += cnt[v];
    }    

    res %= MOD;
    return res;
}

bool is_cubefree[LIM + 10];
int solve(int n, int k)
{
    if (k == 0) return 0;
    if (k == 1) return n;
    if (k == 2) return (1LL * n * (n - 1) / 2) % MOD;
    if (k == 3) return solveG(n, k);

    memset(is_cubefree, true, sizeof(is_cubefree[0]) * (n + 1));
    precal_nck(n);

    long long res = 0;
    for (int i = 1, j; i <= n; ++i) if (is_cubefree[i]) 
    {
        for (j = 1; i * j * j * j<= n; ++j)
        {
            is_cubefree[i * j * j * j] = false;
        }

        res += nck(j - 1, k);
    }

    res %= MOD;
    return res;
}

k<3::O(1) || k>3::O(cbrt n log cbrt n) || k=3::Õ(n) but practically O(n^(0.59)) for small n

vector<int> divisor[LIM];
int solveG(int n, int k)
{
    linear_sieve(n);
    for (int i = 1; i <= n; ++i)
        divisor[i].clear();
 
    for (int j = 1; j <= n; ++j)
    {
        int x = 1;
        for (int t = j; t > 1; )
        {
            if (x > n) break;
            int a = lpf[t];
            ll b = 1LL * a * a * a;
            while (t % b == 0)
            {
                t /= b;
                if (1LL * x * a > n) goto skip;
                x *= a;
            }
 
            if (t % a == 0)
            {
                if (1LL * x * a > n) goto skip;
                x *= a;
                do t /= a; while (t % a == 0);
            }
        }
 
        for (int i = x; i <= n; i += x)
            divisor[i].push_back(j);
 
        skip:{};
    }
 
    int res = 0;
    for (int i = 1; i <= n; ++i)
    {
        ll t = 1LL * i * i * i;
        for (int x = 0; x < divisor[i].size(); ++x)
        {
            for (int y = x + 1; y < divisor[i].size(); ++y)
            {
                int a = divisor[i][x];
                int b = divisor[i][y];
                int c = t / (1LL * a * b);
                res += (n >= c && c > b && 1LL * a * b * c == 1LL * i * i * i);
            }
        }
    }
 
    return res;
}
 
int res[LIM + 10];
int solve(int n, int k)
{
    if (k == 0) return 0;
    if (k == 1) return n;
    if (k == 2) return (1LL * n * (n - 1) / 2) % MOD;
    if (k == 3) return solveG(n, k);
    int t = ceil(cbrt(n) + 1) + 1;
    linear_sieve(t);
    precal_nck(t);
    precal_div(t);

    long long res = 0;
    for (int d = 1; d * d * d <= n; ++d) /// For each fixed d^3
    {
        long long sum = 0;
        for (int p : divisors[d]) /// For each (p | d)
            sum += mu[d / p] * nck(p - 1, k - 1);

        sum %= MOD;
        res += sum * (n / (d * d * d));
    }

    res %= MOD;
    if (res < 0) res += MOD; /// mu can be negative, keep the result in [0, MOD)
    return res;
}

k<3::O(1) || k>3::O(cbrt n log log cbrt n) || k=3::Õ(n) but practically fast

int ceil_sqrt(ll x)
{
    int t = sqrt(x);
    while (1LL * t * t > x) --t;
    while (1LL * t * t < x) ++t;
    return t;
}

#define sz(x) int((x).size())
#define all(x) (x).begin(), (x).end()
#define rall(x) (x).rbegin(), (x).rend()
#define lb(x, v) lower_bound(all(x), v) - (x).begin()
#define ub(x, v) upper_bound(all(x), v) - (x).begin()

ll scm[LIM];
vector<int> divisor[LIM];
int solveG(int n, int k)
{
    linear_sieve(n);
    fill_n(scm, n + 1, 1);
    for (int i = 1; i <= n; ++i)
        divisor[i].clear();
        
    for (int p : prime)
    {
        ll q = 1LL * p * p * p;
        for (int t = p; ; t *= q)
        {
            for (int i = t; i <= n; i += t)
                scm[i] = (scm[i] > n / p) ? n + 1 : scm[i] * p;
            
            if (1LL * t > n / q) break;
        }
    }

    for (int j = 1; j <= n; ++j)
        for (int i = scm[j]; i <= n; i += scm[j])
            divisor[i].push_back(j);
    
    int res = 0;
    for (int i = 1; i <= n; ++i)
    {
        ll t = 1LL * i * i * i;
        for (int x = 0; x < divisor[i].size(); ++x)
        {
            int a = divisor[i][x];
            int v = ceil_sqrt(t / a);
            int y0 = (divisor[i].back() < v) ? sz(divisor[i]) : lb(divisor[i], v);
            for (int y = y0 - 1; y > x; --y)
            {
                int b = divisor[i][y];
                int c = t / (1LL * a * b);
                if (c > n) break;
                res += (1LL * a * b * c == t);
            }
        }
    }

    return res;
}

long long res[LIM + 10]; /// 64-bit: res[d] * (n / d^3) does not fit in int
int solve(int n, int k)
{
    if (k == 0) return 0;
    if (k == 1) return n;
    if (k == 2) return (1LL * n * (n - 1) / 2) % MOD;
    if (k == 3) return solveG(n, k);
    int t = ceil(cbrt(n) + 1) + 1; /// only d with d^3 <= n are needed
    linear_sieve(t);
    precal_nck(t);

    memset(res, 0, sizeof(res[0]) * (t + 1));
    for (int d = 1; d * d * d <= n; ++d) 
        res[d] = nck(d - 1, k - 1);

    for (int p : prime)
        for (int d = t / p; d > 0; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = 1; d * d * d <= n; ++d)
        ans += res[d] * (n / (d * d * d));

    ans %= MOD;
    return ans;
}
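Since the complexity of the $$$k = 3$$$ implementations is hard to reason about, a tiny brute-force reference is handy for checking their output on small $$$n$$$. The sketch below is not part of the original solutions. The names `is_cube` and `brute_triples` are hypothetical, and the running time is far worse than cubic, so it only suits very small inputs.

```cpp
#include <cassert>

// True when x >= 1 is a perfect cube. Uses an integer search instead of
// cbrt() to avoid floating-point rounding issues.
bool is_cube(long long x)
{
    long long c = 0;
    while ((c + 1) * (c + 1) * (c + 1) <= x) ++c;
    return c * c * c == x;
}

// Counts triples a < b < c <= n such that a * b * c is a perfect cube.
long long brute_triples(int n)
{
    long long res = 0;
    for (long long a = 1; a <= n; ++a)
        for (long long b = a + 1; b <= n; ++b)
            for (long long c = b + 1; c <= n; ++c)
                res += is_cube(a * b * c);
    return res;
}
```

For $$$n = 9$$$ the valid triples are $$$(1, 2, 4)$$$, $$$(1, 3, 9)$$$, $$$(2, 4, 8)$$$, $$$(3, 8, 9)$$$ and $$$(4, 6, 9)$$$, so `brute_triples(9)` should return $$$5$$$.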

Complexity

k < 3

k > 3

k = 3

In the first implementation each number must be factorized, which costs $$$O(\log n)$$$ per number with the `lpf` table, hence that complexity.

Bonus

Secondary Bonus

About the actual complexity of, or a better algorithm for, the first implementation

Yet for the second implementation, things get crazy.

Precalculation part

Calculation part

I could not prove its complexity: I searched papers and blogs, and even with the help of some GMs I could not find anything good enough to claim its real complexity.

The measured behaviour might also not be the real complexity: it could be an illusion in which constant factors hide a higher true complexity.

Fast running time

Real Complexity

And the third implementation slightly improves the complexity

Precalculation part

Calculation part

Yet it is still hard to express the complexity in the form $$$O(n \log^k n)$$$

Solution when you are given n and queries for k

Extra task H

Problem

Given $$$n$$$, you have to answer queries on $$$k$$$ $$$(1 \leq k \leq n \leq 10^9)$$$: for each query, count the number of arrays $$$a[]$$$ of size $$$k$$$ that satisfy

$$$1 \leq a_1 < a_2 < \dots < a_k \leq n$$$
$$$a_i \times a_j$$$ is perfect square $$$\forall 1 \leq i < j \leq k$$$

Idea

Simplest idea

Observation

Implementation

O(n) precalculate - O(sqrt n - k) query

vector<int> valid;
int c[SQRT_LIM];
bool is_squarefree[LIM];
void precal(int n)
{
    int t = ceil(sqrt(n) + 0.5);
    memset(c, 0, sizeof(c[0]) * (t + 1));
    memset(is_squarefree, true, sizeof(is_squarefree[0]) * (n + 1));
    for (int i = 1, j; i <= n; ++i) if (is_squarefree[i]) 
    {
        for (j = 1; i * j * j <= n; ++j)
            is_squarefree[i * j * j] = false;

        ++c[j - 1];
    }

    precal_nck(t);
    valid.clear();
    for (int i = t; i >= 1; --i) if (c[i])
        valid.push_back(i);
}

int query(int k)
{
    long long res = 0;
    for (int x : valid)
    {
        if (x < k) break;
        res += 1LL * c[x] * nck(x, k);
        res %= MOD;
    }

    return res;
}
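The first implementation above rests on one identity: if `c[x]` is the number of squarefree `u` whose chain `u * 1 * 1, u * 2 * 2, ...` has exactly `x` values not exceeding `n`, then the answer for a query `k` is the sum of `c[x] * C(x, k)`. Here is a small self-contained sketch of that identity. It uses exact arithmetic and the hypothetical names `chain_length_counts`, `small_nck` and `query_small`, so it only suits small `n`.

```cpp
#include <cassert>
#include <vector>

// Exact binomial coefficient for small arguments (no modulo).
long long small_nck(long long n, long long k)
{
    if (k < 0 || k > n) return 0;
    long long res = 1;
    for (long long i = 1; i <= k; ++i)
        res = res * (n - k + i) / i; // division is exact at every step
    return res;
}

// c[x] = number of squarefree u <= n with exactly x values u * j * j <= n.
std::vector<long long> chain_length_counts(int n)
{
    int t = 0;
    while (1LL * (t + 1) * (t + 1) <= n) ++t; // t = floor(sqrt(n))

    std::vector<long long> c(t + 1, 0);
    std::vector<bool> used(n + 1, false);
    for (int u = 1; u <= n; ++u)
    {
        if (used[u]) continue; // already covered, so u is not squarefree
        int j = 1;
        for (; 1LL * u * j * j <= n; ++j) used[u * j * j] = true;
        ++c[j - 1]; // the chain of u has j - 1 elements
    }
    return c;
}

// Answer for one k: sum over chain lengths x of c[x] * C(x, k).
long long query_small(const std::vector<long long> &c, long long k)
{
    long long res = 0;
    for (int x = 1; x < (int)c.size(); ++x) res += c[x] * small_nck(x, k);
    return res;
}
```

For $$$n = 16$$$ the chain lengths are $$$4$$$ for $$$u = 1$$$, $$$2$$$ for $$$u = 2$$$ and $$$u = 3$$$, and $$$1$$$ for the eight other squarefree numbers, so the query $$$k = 2$$$ gives $$$6 + 1 + 1 = 8$$$. Only about $$$\sqrt{n}$$$ distinct chain lengths exist, which is why a query never needs more than that many terms.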

O(sqrt n) precalculate - O(sqrt (n/k) log log sqrt(n/k) + sqrt n - k) query

vector<int> valid;
int cnt[SQRT_LIM];
bool is_squarefree[LIM];
long long res[SQRT_LIM + 10];
int global_t, global_n;
void precal(int n)
{
    global_n = n;
    global_t = ceil(sqrt(n) + 0.5);
    linear_sieve(global_t);
    precal_nck(global_t);
}

int query(int k)
{
    if (k > global_t) return 0;
    memset(res + k, 0, sizeof(res[0]) * (global_t - k + 1));
    for (int d = k; d * d <= global_n; ++d) 
        res[d] = nck(d - 1, k - 1);

    for (int p : prime)
        for (int d = global_t / p; d >= k; --d)
            res[d * p] -= res[d];

    long long ans = 0;
    for (int d = k; d <= global_t; ++d)
        ans += res[d] * (global_n / (d * d));

    ans %= MOD;
    return ans;
}

Complexity

The first implementation

The second implementation

Contribution

Yurushia for pointing out the linear complexity of squarefree sieve.
clyring for fixing typos, and the approach for tasks A, B, C, D, E, G, H, J.
errorgorn for adding details, the approach for tasks F, J, M, O, and better complexity for C, E, G.
cuom1999 for contributing the $$$O(n^2)$$$ approach for problem G.
vinfat for contributing the approach of factorizing $$$p^3$$$ into a product of $$$3$$$ parts in problem G, which at first failed to achieve a better complexity (edited: confirmed that the complexity seems to be better now).
Lihwy, jalsol for combinatorics calculation and the proof of stars and bars in task C.
Editorial Slayers Team Lyde DeMen100ns Duy_e OnionEgg QuangBuiCPP _FireGhost_ Shironi for reviewing, fixing typos and feedback.

Rev.	By	When	Δ	Comment
en58	SPyofcode	2022-08-11 06:44:58	1	Fixing Latex Notation
en57	SPyofcode	2022-08-11 06:43:05	47
en56	SPyofcode	2022-08-11 06:40:20	80	Fix latex notations.
en55	SPyofcode	2021-11-10 19:53:18	1215
en54	SPyofcode	2021-11-09 16:59:56	533
en53	SPyofcode	2021-11-09 12:12:28	1853	Tiny change: '^9 + 7$.\nFor $n \' -> '^9 + 7$.\n\nFor $n \'
en52	SPyofcode	2021-11-08 13:43:39	1194
en51	SPyofcode	2021-11-08 06:13:01	13
en50	SPyofcode	2021-11-07 16:25:16	6097	Tiny change: 'es of $k$ (1 \leq k ' -> 'es of $k$ $(1 \leq k '
en49	SPyofcode	2021-11-07 14:18:44	12
en48	SPyofcode	2021-11-07 12:45:40	155	Reverted to en46
en47	SPyofcode	2021-11-07 12:28:35	155
en46	SPyofcode	2021-11-06 16:50:00	285	Tiny change: 't \rfloor \right + \unders' -> 't \rfloor + \unders'
en45	SPyofcode	2021-11-06 14:33:23	12
en44	SPyofcode	2021-11-06 13:46:02	153
en43	SPyofcode	2021-11-06 07:13:11	57
en42	SPyofcode	2021-11-06 07:06:43	3009
en41	SPyofcode	2021-11-06 04:59:50	35
en40	SPyofcode	2021-11-06 04:11:51	7
en39	SPyofcode	2021-11-06 04:11:11	556
en38	SPyofcode	2021-11-06 03:56:31	933
en37	SPyofcode	2021-11-06 02:55:34	920	Tiny change: '\n\n#### Algorithm\n\n<spoil' -> '\n\n#### Idea\n\n<spoil'
en36	SPyofcode	2021-11-05 17:25:43	37	Tiny change: 'squarefree as it j' -> 'squarefree, it is not how we use in the formula as it j'
en35	SPyofcode	2021-11-05 17:11:33	8990
en34	SPyofcode	2021-11-05 17:03:34	14732
en33	SPyofcode	2021-11-05 16:50:02	4141
en32	SPyofcode	2021-11-05 10:36:09	617
en31	SPyofcode	2021-11-05 07:21:43	631
en30	SPyofcode	2021-11-05 06:45:33	246
en29	SPyofcode	2021-11-05 06:39:36	2084
en28	SPyofcode	2021-11-03 05:22:22	537
en27	SPyofcode	2021-11-03 05:04:31	1713
en26	SPyofcode	2021-11-03 03:04:41	58
en25	SPyofcode	2021-11-02 19:52:48	5	Tiny change: 'log/entry/edit/96379)\' -> 'log/entry/96379)\'
en24	SPyofcode	2021-11-02 13:30:46	44
en23	SPyofcode	2021-11-02 12:45:42	90
en22	SPyofcode	2021-11-02 12:43:07	181
en21	SPyofcode	2021-11-02 12:30:16	107
en20	SPyofcode	2021-11-02 12:28:56	50
en19	SPyofcode	2021-11-02 12:27:13	5785
en18	SPyofcode	2021-11-02 04:29:23	46	Tiny change: 'g,2021-7-11] for fixi' -> 'g,2021-7-12] for fixi'
en17	SPyofcode	2021-11-01 20:05:24	74
en16	SPyofcode	2021-11-01 14:07:51	12
en15	SPyofcode	2021-11-01 13:56:28	7	Tiny change: ' generalization harder va' -> ' generalized harder va'
en14	SPyofcode	2021-11-01 13:50:48	92
en13	SPyofcode	2021-11-01 13:35:43	24	Tiny change: 'qrt{R - L} {\Large )' -> 'qrt{R - L}) {\Large )'
en12	SPyofcode	2021-11-01 13:33:59	299
en11	SPyofcode	2021-11-01 13:31:23	188
en10	SPyofcode	2021-11-01 13:27:51	9686	Tiny change: 'ask F, J, better' -> 'ask F, J, M, O, better'
en9	SPyofcode	2021-11-01 08:44:03	156
en8	SPyofcode	2021-11-01 06:59:09	39
en7	SPyofcode	2021-11-01 06:51:52	29
en6	SPyofcode	2021-11-01 06:50:49	77
en5	SPyofcode	2021-11-01 06:49:01	21
en4	SPyofcode	2021-11-01 06:44:37	71
en3	SPyofcode	2021-11-01 06:42:35	605
en2	SPyofcode	2021-11-01 06:40:17	15718	(published)
en1	SPyofcode	2021-11-01 06:22:35	32737	Initial revision (saved to drafts)
