TunaMayoSandwich's blog

By TunaMayoSandwich, history, 13 months ago, In English

As a pupil there are a lot of problems I come across that I can't figure out by myself, at least not in a contest's time. Most of the time there are great resources available online and on this very site to learn more and understand all the nuances implied in these problems, but I recently stumbled upon what is regarded as a "standard problem" and couldn't find satisfying explanations about it so I decided to write one myself hoping it'll be of some help to other beginners. I'm going to discuss the problem of finding the maximum bitwise XOR of a subarray in $$$O(n \cdot log(C))$$$ as it pertains to the problem 1847C - Vampiric Powers, anyone?. I'll try to be as thorough as possible in the first part and as formal as possible in the second, only skipping over basic concepts (what is a subarray, what is bitwise xor, etc).

Task

Given an array $$$a$$$, let's define $$$f(l, r)$$$ as the xor of elements of $$$a$$$ from index $$$l$$$ to $$$r$$$, where $$$0 \leq l < r < n$$$. Find $$$\text{max} f(l, r)$$$ in $$$O(n \cdot log(C))$$$ where $$$C$$$ is the greatest value in the array.

Solution

The idea is to use the fact that $$$f(0, r) = f(0, l-1) \oplus f(l, r) \implies f(l, r) = f(0, r) \oplus f(0, l-1)$$$. This means that one can traverse the array and maintain a variable $$$\text{pre_xor} = f(0, i)$$$ where $$$i$$$ is the current position, then all that remains to do is to find $$$l-1$$$ such that $$$\text{pre_xor} \oplus f(0, l-1)$$$ is maximised, this way one has found $$$\text{max} (f(0, i) \oplus f(0, l-1)) = \text{max} f(l, i)$$$ and solves the problem by iterating over all values of $$$i$$$ from left to right.

The ideal way to find $$$l-1$$$ is to create a trie data structure. A trie is a k-ary search tree but in this case a branch is going to represent a bit and a path a number, since bits can only be 1s and 0s the trie will simply be a standard binary tree.

struct trieNode{
	ll value;
	trieNode *arr[2];
};

trieNode *newNode(){
	trieNode *temp = new trieNode;
	temp->value = 0;
	temp->arr[0] = NULL;
	temp->arr[1] = NULL;
	return temp;
}

Start traversing $$$a$$$ from index 0 and insert $$$\text{pre_xor}$$$ into the trie. To insert a number in the trie take a look at its bits from the most significant downwards, if it's set to 1 take a step right, otherwise take a step left and create new nodes along the way if needed. Exiting this cycle a path'll have been created through the trie representing $$$\text{pre_xor}$$$. Append $$$\text{pre_xor}$$$ to the end of this path so that when one reaches a leaf he has a convenient way of knowing what number he has arrived to. Note that nowhere in a trie node 1s and 0s appear, one just has to decide on a convention (right = 1, left = 0) and the encoding of 1s and 0s becomes implicit.

void insert(trieNode *root, ll pre_xor, ll max_bits){
	trieNode *temp = root;
 
	bool val;
	for (ll i=max_bits; i>=0; i--){
		val = pre_xor & (1<<i);
 
		if (temp->arr[val] == NULL) temp->arr[val] = newNode();
		temp = temp->arr[val];
	}
 
	temp->value = pre_xor;
}

Given $$$\text{pre_xor}$$$ the ideal value to xor it with is one that has 0s where $$$\text{pre_xor}$$$ has 1s and viceversa. The trie helps in finding the value that comes closest to the optimal one among $$$f(0, l-1)$$$ for all $$$0 \leq l \leq i$$$. Just start traversing it from top to bottom and try to go right (right = bit set to 1) if the most significant bit of $$$\text{pre_xor}$$$ is set to 0, left otherwise. If at any point one can't go where he wishes he shall go down the other path. When finally one reaches a leaf he'll find there the value representing the path he just went down, which is the optimal $$$f(0, l-1)$$$.

ll query(trieNode *root, ll pre_xor, ll max_bits){
	trieNode *temp = root;
	
	bool val;
	for (ll i=max_bits; i>=0; i--){
		val = pre_xor & (1<<i);
 
		if (temp->arr[!val] != NULL) temp = temp->arr[!val];
		else if (temp->arr[val] != NULL) temp = temp->arr[val];
	}
	
	return pre_xor^(temp->value);
}

ll maxSubarrayXOR(vector<ll>& arr, ll n = -1, ll max_val = -INF){
	if(n == -1) n = arr.size();
	if(max_val == -INF) max_val = *max_element(arr.begin(), arr.end());
	ll max_bits = 64 - __builtin_clzll(max_val | 1);
	
	trieNode *root = newNode();
	insert(root, 0, max_bits);
	ll result = 0, pre_xor =0;
 
	for (ll i=0; i<n; i++){
		pre_xor = pre_xor^arr[i];
		insert(root, pre_xor, max_bits);
		result = max(result, query(root, pre_xor, max_bits));
	}
	
	return result;
}

Why does this solve 1847C - Vampiric Powers, anyone?

It's not at all evident that problem 1847C tasks one to find the max xor of a subarray, but indeed it does, with one little adjustment. The function to optimise is still $$$f(l, r)$$$ but instead of $$$0 \leq l < r < n$$$, $$$0 \leq l \leq r < n$$$ stands where $$$f(i, i)$$$ is equal by definition to $$$a[i]$$$ for all $$$0 \leq i < n$$$. This means that if there's a value in the array which is not smaller than any xor will ever be then one needs to return that value. This is easily solved by inserting 0 into the trie at the beginning so that each number can be xored with 0 (that is, remain the same number) if there isn't anything better to xor it with. The piece of code posted above already does so.

If one doesn't find himself in such a lucky case, then the vampire powers only allow xoring with a fixed right index, that is $$$f(l, n-1)$$$, but allow to do so multiple times. Let's show that each xor of the form $$$f(l, r)$$$ can be constructed from 2 xors of the form $$$f(l_{1}, n-1)$$$, $$$f(l_{2}, n-1)$$$. This will prove that the xor of any subarray is not bigger than the solution to the problem at hand.

It goes as simple as this $$$f(l, r) = f(l, n-1) \oplus f(r+1, n-1)$$$, but there's a little caveat: after having performed the first xor the length of array $$$a$$$ will have grown to n+1 so, formally, the process is described as follows:

$$$a[0, ..., n-1] \rightarrow a[0, ..., n]$$$ where $$$a[n] = f(l, n-1) \rightarrow a[0, ..., n+1]$$$ where $$$ a[n+1] = f(r+1, n) = f(r+1, n-1) \oplus a[n] = f(r+1, n-1) \oplus f(l, n-1) = f(l, r)$$$

To end the proof that the solution to problem 1847C is the max xor of a subarray it's still required to show that vampire powers cannot create a solution that's better than $$$\text{max} f(l, r)$$$, let's do it by contradiction. Suppose vampire powers can create a solution that's better than $$$\text{max} f(l, r)$$$, this means that after having appended to the array several elements and having expanded it to be like $$$a[0, ..., m]$$$ where $$$a[i] = f(l_{i}, i-1)$$$ for all $$$n \leq i \leq m$$$, $$$0 \leq l_{i} \leq i-1$$$ one finds himself able to create a solution $$$f(l, m) >$$$ max xor of a subarray.

$$$f(l_{1}, m) = f(l_{1}, m-1) \oplus a[m] = f(l_{1}, m-1) \oplus f(l_{2}, m-1)$$$. Let $$$l_{+} = max(l_{1}, l_{2})$$$ and $$$l_{-} = min(l_{1}, l_{2})$$$, then $$$f(l_{1}, m) = f(l_{-}, l_{+})$$$. Repeating this process eventually yields the equality $$$f(l_{1}, m) = f(l, r)$$$ where $$$0 \leq l < r < n$$$ which is absurd and concludes the proof and this post. Have a nice rest of the day.

  • Vote: I like it
  • +19
  • Vote: I do not like it

| Write comment?
»
13 months ago, # |
  Vote: I like it +9 Vote: I do not like it

It is not $$$O(n)$$$, but $$$O(n\log C)$$$, where $$$C$$$ is the greatest value in the array.

»
13 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Once you understand this, this is a free CSES solve.

»
13 months ago, # |
  Vote: I like it 0 Vote: I do not like it

you have written "from" as "form" in task section.

Task :

Given an array a, let's define f(l, r) = xor of elements of a """form""" index l to r, where 0 <= l < r <= n-1. Find max f(l,r) in O(nlogC) where C is the greatest value in the array.

»
13 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Thanks for the blog. I've learned something new today.

»
6 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by TunaMayoSandwich (previous revision, new revision, compare).

»
4 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Can you tell me why did you take bitwise OR of the maximum number with 1 before calculating the leading zeros ?

ll max_bits = 64- __builtin_clzll(max_val | 1);
  • »
    »
    4 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Sure! log2() has a certain margin of error that makes it unfit for finding the index of the most significant bit set to one in a long long, for example it fails with 576460752303423487. A better way to do it is to use __builtin_clz() or __builtin_clzll() which return the number of leading zeros, because for x == 0 the result is undefined, the best way to do it is 64 — 1 — __builtin_clzll(x | 1). Of course the number of significant bits is 64 — __builtin_clzll(x | 1).

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Nice blog!, so basically you reduced the problem to finding the maximum xor of a given number with the numbers present till the index, and you are kind of calculating the pre_xor as you are moving forward which reduces it to a standard lc problem!

»
3 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Great blog ! I always appreciate formal proofs when the problem allows it. I'll try to remember this while grinding for cyan.