Using bitset to Count Dominating Pairs in High Dimensions

№	Пользователь	Рейтинг
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

№	Пользователь	Вклад
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

Hi everyone!

Here I want to share a method to count dominating pairs in high dimensions, while the number of points ($$$n$$$) is small but the dimensionality ($$$k$$$) is large (like $$$n = 32768, k = 8$$$).

The problem of counting dominating pairs in K-Dimension is as follows:

For two $$$k$$$-tuples $$$a = (a_0, a_1, \dots, a_{k - 1})$$$ and $$$b = (b_0, b_1, \dots, b_{k - 1})$$$, define $$$dom(a, b) = \prod\limits_{i = 0}^{k - 1}[a_i \geq b_i]$$$, which outputs $$$1$$$ if each $$$a_i$$$ is greater than or equal to $$$b_i$$$ ($$$a$$$ dominates $$$b$$$), otherwise $$$0$$$.
Given $$$n$$$ $$$k$$$-tuples $$$t_0, t_1, \dots, t_{n - 1}$$$, for each $$$i \in [0, n)$$$, compute $$$g(i) = \sum\limits_{j = 0}^{n - 1} dom(t_i, t_j)$$$.

When $$$k = 2$$$, the problem simplifies to counting inversions, solvable directly using a Fenwick Tree.

For $$$k = 3$$$, one can consider a Divide and Conquer (D&C) approach, achieving a complexity of $$$O(n \log^2 n)$$$, which is excellent. Alternatively, a K-D Tree approach achieves $$$O(n \sqrt{n})$$$ complexity, also a good option.

For $$$k = 4$$$, D&C embedded with D&C or just K-D Tree solutions are less optimal, with reduced complexity benefits.

At $$$k = 5$$$, the Divide and Conquer approach is notably less efficient than brute force, and the K-D Tree offers no significant advantage over brute force.

An optimized brute force approach involves using bitset to enhance efficiency.

Brute Force Approach

An obvious brute force approach is to enumerate $$$i, j$$$ and check whether $$$p_i$$$ dominates $$$p_j$$$ in each dimension in $$$O(k)$$$ time. The overall complexity is $$$O(n^2 k)$$$.

A key property is that only if $$$dom[(a_{0..i}), (b_{0..i})] = 1$$$, $$$dom[(a_{0..i + 1}), (b_{0..i + 1})]$$$ can be equal to $$$1$$$. Therefore, the enumeration order can be swapped. First, enumerating dimensions (tuple indices) $$$d$$$, maintaining the dominant relation under dimension $$$d$$$, and computing the relation up to the next dimension based on the current order:

Let $$$b_{d, i, j} = dom(p_{i, 0..d - 1}, p_{j, 0..d - 1})$$$. Then:

$$$ b_{d + 1, i, j} \gets b_{d, i, j} \cdot [p_{i, d} \geq p_{j, d}] $$$

In practice, dimension $$$d$$$ can be omitted because each $$$(i, j)$$$ pair is independent, simplifying to:

$$$ b_{i, j} \gets b_{i, j} \cdot [p_{i, d} \geq p_{j, d}] $$$

The total complexity is $$$O(n^2 k)$$$.

Bitset Optimization

It's observed that this problem can be optimized using bitset. Utilize a 2D array bitset<N> b[N]. Considering the transition condition $$$[p_{i, d} \geq p_{j, d}]$$$, sort points by $$$p_{i, d}$$$ to obtain $$$q$$$, where $$$q_i$$$ is located at $$$p$$$ index $$$l_i$$$.

Enumerating from $$$0$$$ to $$$n - 1$$$, deduce a relation array $$$r$$$, where $$$r_j$$$ indicates $$$p_{l_j, d} \leq q_{i, d}$$$. Then update $$$b_{l_i}$$$ using $$$b_{l_i} \gets \left(b_{l_i} \text{ bitand } r\right)$$$.

Time complexity is $$$O(\frac{n^2 k}{w})$$$, space complexity is $$$O(\frac{n^2}{z})$$$, where $$$w = 64, z = 8$$$, with excellent constant factors.

Space Optimization

If space constraints are strict, use grouping.

Divide $$$S$$$ (where $$$S \geq w$$$) points $$$p_{l..l + S - 1}$$$ into groups. Each time, focus only on the values of $$$b_{l} \dots b_{l + S - 1}$$$. Analyzing the complexity:

Sorting takes $$$O(nk\log n)$$$ time;
Calculating $$$r$$$ for each group is $$$O(nk)$$$, thus the total complexity is $$$O(\frac{n^2k}{S})$$$. Thus, without space optimization, calculating $$$r$$$ has a complexity of $$$O(nk)$$$;
Calculating the array $$$b$$$ for each group takes $$$O(\frac{Snk}{w})$$$, resulting in a total complexity of $$$O(\frac{n^2k}{w})$$$.

Since $$$S \geq w$$$, the time complexity is $$$O(\frac{n^2k}{w})$$$, and the space complexity is $$$O(\frac{Sn}{z})$$$.

Implementation (C++)

Note that this implementation is not for CF1826E but for the pure counting problem above.

#include <iostream>
#include <cstdio>
#include <vector>
#include <cstring>
#include <algorithm>
#include <bitset>
#include <cmath>
using namespace std;

using i32 = int;
using i64 = long long;
using u32 = unsigned int;
using u64 = unsigned long long;

template<typename T> using vec = vector<T>;

signed main() 
{
    u32 k, n;
    cin >> k >> n;
    // p[k][i].first = p[i][k], p[k][i].second = l[i]
    vec<vec<pair<u32, u32>>> p(k, vec<pair<u32, u32>>(n));
    for (u32 i = 0; i < n; i++) {
        for (u32 d = 0; d < k; d++) {
            cin >> p[d][i].first;
            p[d][i].second = i;
        }
    }
    for (u32 d = 0; d < k; d++) sort(p[d].begin(), p[d].end());
    vec<u32> ans(n);
    // S mentioned above.
    // Note that larger bs, less time cost, but more space cost.
    const u32 bs = ceil(sqrt(n)) * 16;
    for (u32 i = 0; i * bs < n; i++) {
        u32 l = i * bs, r = min(n, i * bs + bs);
        // '80000' means n is no more than 80000.
        vec<bitset<80000>> b(r - l);
        for (auto &c: b) c.set();
        for (u32 d = 0; d < k; d++) {
            bitset<80000> s; ///< relationship array mentioned above
            for (u32 j = 0, it = 0; j < n; j++) {
                auto [v, q] = p[d][j];
                while (it < n && p[d][it].first <= v) s.set(p[d][it++].second);
                if (l <= q && q < r) b[q - l] &= s;
            }
        }
        for (u32 j = l; j < r; j++) ans[j] = b[j - l].count();
    }
    for (u32 i = 0; i < n; i++) cout << ans[i] << '\n';
    return 0;
}

Rev.	Кто	Когда	Δ	Комментарий
en8	robinyqc	2024-10-22 14:27:03	35	Fixed a typo. (And solved the mistake of revisions).
en7	robinyqc	2024-10-22 14:25:48	35	Reverted to en4
en6	robinyqc	2024-10-22 14:24:21	0	Fixed a typo.
en5	robinyqc	2024-10-22 14:22:08	35
en4	robinyqc	2024-07-10 04:26:47	4	Tiny change: ', i, j} = f(p_{i, 0..' -> ', i, j} = dom(p_{i, 0..'
en3	robinyqc	2024-07-10 04:14:18	421	publish (published)
en2	robinyqc	2024-07-10 03:36:10	325
en1	robinyqc	2024-07-09 17:19:30	5604	Initial revision (saved to drafts)

Brute Force Approach

Bitset Optimization

Space Optimization

История