Urbanowicz's blog

By Urbanowicz, 5 years ago, In English

Hello everyone!

This time I’d like to tell you about a data structure from 1983 which, I believe, deserves to be widely known in the CP community. This structure solves the problem of quickly finding ancestors in a rooted tree. Like the usual binary lifting, it does so in $$$O(\log n)$$$. But unlike the usual binary lifting, it lets you add and remove leaves in $$$O(1)$$$, which means that the total space requirement is just $$$O(n)$$$.

Each node in the tree requires the following fields:

class Node {
  int depth;
  Node parent;
  Node jump;
}

Here depth is the number of times you need to follow .parent in order to reach the root. The jump field points to some ancestor of the node.

Below is the definition of the root node. Its jump pointer is undefined, strictly speaking, but it turns out convenient to make it point to the root itself.

root.parent = null;
root.depth = 0;
root.jump = root;

Here’s how you find an ancestor that has specified (smaller) depth:

Node find(Node u, int d) {
  while (u.depth > d) {
    if (u.jump.depth < d) {
      u = u.parent;
    } else {
      u = u.jump;
    }
  }

  return u;
}

The implementation is straightforward because there’s not much you can do with a pair of pointers. We simply go up towards the target node, taking a jump whenever we are sure that it won’t skip the target.

Jump pointers

Now let’s define the jump pointers. Here’s how you create a new leaf:

Node makeLeaf(Node p) {
  Node leaf = new Node();
  leaf.parent = p;
  leaf.depth = p.depth + 1;

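  // If p's jump and p.jump's jump have equal lengths, merge them:
  // the new leaf's jump gets length 2d + 1, where d is the common length.
  // Otherwise, start over with a length-1 jump to the parent.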
  if (p.depth - p.jump.depth == p.jump.depth - p.jump.jump.depth) {
    leaf.jump = p.jump.jump;
  } else {
    leaf.jump = p;
  }

  return leaf;
}

Parent and depth assignments are hopefully obvious. But what is this?

p.depth - p.jump.depth == p.jump.depth - p.jump.jump.depth

Unfortunately, a proper rigorous explanation is quite intimidating. If you want to read it, I direct you to Myers' original 1983 paper, “An applicative random-access stack”.

If you don’t feel like reading about the properties of canonical skew-binary representations, I’ve got some drawings that will hopefully give you rough ideas and intuitions and, most importantly, the confidence required to code and use the structure properly under the pressure of a real competition.

Below is an illustration of the case when the equality holds:

Here y is our new leaf. Blue arrows are paths formed by parent links, green arrows are jumps. Each green arrow has the length of the jump written above it. The important thing to notice is that our find algorithm will first try the $$$2d+1$$$ jump, and if it happens to be too long, it will go to the parent. The parent has a smaller jump, at most half as long.

This is almost what conventional binary lifting would do. But instead of cramming all the jumps of different lengths into a single node, it spreads them amongst the ancestors.

Okay, this explains how the structure exponentially reduces the size of a jump. But what if our very deep node happens to have only a very short jump? How might reducing such a jump help? To answer that, let’s look at the bigger picture:

This picture is not really that big, but I hope it helps you grasp the whole scheme. Notice that no farther than two jumps away there’s always a 2x or longer jump, if such a jump is possible at all. That is, if you keep jumping, the size of the jumps grows exponentially. This roughly explains how find manages to work in logarithmic time:

  1. It gains speed exponentially by encountering larger and larger jumps.
  2. It slows down by looking at parent’s shorter jumps.

At this point it is clear that the actual number of steps done by find is not just a binary logarithm. Indeed, the original paper gives the following upper bound, where $$$d$$$ is the depth of the starting node:

$$$3 \lfloor \log (d + 1) \rfloor - 2$$$

I didn’t perform any tests, but in practice this might be up to three times slower than conventional binary lifting.

Binary search on the ancestry path

You might be tempted to implement a standard binary search using find as its subroutine. That would be $$$O(\log^2 n)$$$. Luckily, we can repurpose find instead.

Node search(Node u, Predicate<Node> p) {
  while (!p.test(u)) {
    if (p.test(u.jump)) {
      u = u.parent;
    } else {
      u = u.jump;
    }
  }

  return u;
}

This function returns the deepest ancestor for which the predicate holds, assuming the predicate is monotone along the path to the root (once it becomes true, it stays true) and that such an ancestor always exists. Basically, we approach the target step by step, jumping whenever we’re sure that we won’t skip the target.

Finding LCA

Here LCA finding relies on two things:

  1. If the two nodes have different depths, we can quickly equalize them using find.
  2. The jump structure depends on nothing but depth.

The last point means that we can do the search on both nodes simultaneously — they’ll be at the same depth throughout the process. Then we can find the LCA using the binary search approach above: we simply check whether the two nodes coincide to detect whether we are past the LCA:

Node lca(Node u, Node v) {
  if (u.depth > v.depth) {
    return lca(v, u);
  }

  v = find(v, u.depth);

  while (u != v) {
    if (u.jump == v.jump) {
      u = u.parent;
      v = v.parent;
    } else {
      u = u.jump;
      v = v.jump;
    }
  }

  return u;
}


By Urbanowicz, 6 years ago, In English

Hi all,

Today I'd like to present to you yet another suffix data structure. To the best of my (quite limited) knowledge, it is new. But even if it's not, I believe the data structure is so clear that it is hard to screw up its implementation. So I think it might be interesting to a competitive programmer.

This is a byproduct of my attempt to find a simple solution to the problem of real-time indexing. Today we won't care about the real-time part, so hash tables and amortized algorithms are welcome to keep things simple.


By Urbanowicz, 13 years ago, In English

Consider the following problem. Given a finite set $$$S$$$ of integers and $$$k$$$ numbers $$$b_1, b_2, \ldots, b_k$$$, is there a partition $$$\{B_1, B_2, \ldots, B_k\}$$$ of $$$S$$$ such that the sum of all integers in $$$B_i$$$ is equal to $$$b_i$$$ for every $$$i = 1, \ldots, k$$$?

Obviously, this problem is in NP, and it is NP-complete because Subset Sum Problem instances reduce to it by setting $$$k = 2$$$: given an SSP instance asking whether some subset of $$$S$$$ sums to $$$t$$$, take $$$b_1 = t$$$ and $$$b_2 = \sum_{x \in S} x - t$$$. Although turning SSP into this problem is trivial, I cannot see any direct way to perform the reverse transformation. (It is well known that every NPC problem can be reduced to any other NPC problem in polynomial time.)

I tried to follow the proof of Cook's theorem: I built a Circuit-SAT instance that verifies a solution and successively reduced it to SSP through 3-SAT. Of course, I got the desired SSP instance, but it was over 9000 times larger than the original problem's instance, rendering pseudo-polynomial algorithms unusable (to say the least). So, here are my questions:

  1. Is there a direct way to reduce the problem to SSP?
  2. Are there any estimates of the minimal possible overhead of such transformations?
  3. Or maybe there is a pseudo-polynomial algorithm for the original problem?

Thank you in advance.


By Urbanowicz, 14 years ago, In English

Let $$$N$$$ be an integer power of 2. Now consider an array of $$$N$$$ elements numbered from 0 to $$$N - 1$$$. We will call that array $$$a$$$, and $$$a_i$$$ will denote its $$$i$$$-th element. Imagine that we have some rule to modify a single $$$a_i$$$ and we want to apply that rule to every element between $$$a_x$$$ and $$$a_y$$$ inclusive. We will call this a range modification.

Consider a binary operation that is associative, commutative, and has an identity element; let it be denoted by the $$$\times$$$ sign, for example. Imagine that we want to evaluate that operation over the elements between $$$a_x$$$ and $$$a_y$$$ inclusive ($$$a_x \times a_{x+1} \times \ldots \times a_{y-1} \times a_y$$$). We will call this a range query.

I am going to describe non-recursive algorithms that perform range queries and modifications in $$$O(\log N)$$$ using a data structure that takes $$$O(N)$$$ additional space.

This is a schematic representation of the data structure built for an array of 8 elements. It consists of 4 layers ($$$\log_2 N + 1$$$ in general). Each layer contains a set of ranges that don't overlap and cover the whole array. Ranges of the same layer have equal size, which grows exponentially from bottom to top. All the ranges are numbered in row-major order: from left to right, from top to bottom. The bottom-most layer (numbered from 8 to 15 here, from $$$N$$$ to $$$2N - 1$$$ in general) is intended to hold the $$$a_i$$$ array values. All other ranges hold the answer to their range query.

One may call this structure a segment tree or an interval tree, but thinking of it as a tree will give you nothing useful here.
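All the snippets below assume global storage along the following lines. This is my sketch — the original post never declares it; the names match the later sections, and m, h, q only become relevant once modifications appear:

const int N = 8;   // number of array elements; a power of 2 for now (but see the surprise at the end)
int a[2 * N];      // a[N..2N-1] hold the array values, a[1..N-1] hold the range answers
int m[2 * N];      // range modifiers (used in the modification sections)
bool h[2 * N];     // whether a modifier is attached to a range (opaque modifications section)
int q[2 * N];      // number of array elements covered by each range (see makeQ() below)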

Range Query over Static Array

First of all, we will discuss how to implement a range query while ignoring modifications. Assume that we have already built the structure somehow. Then a sample C++ implementation for the sum operation looks as follows. (Please note that zero is the identity element of the sum operation.)

int staticQuery(int L, int R) {
  L += N;
  R += N;
  int sum = 0;
  for (; L <= R; L = (L + 1) >> 1, R = (R - 1) >> 1) {
    if (L & 1) {
      sum += a[L];
    }
    if (!(R & 1)) {
      sum += a[R];
    }
  }
  return sum;
}

Let's break the function's code down. As mentioned before, the $$$a_i$$$ element resides at index $$$i + N$$$ in the structure. That explains what the first two statements do: the query's range bounds are turned into indices of the bottom layer. The next statement initializes the sum variable with the identity element; this variable will accumulate the sum over the range and yield the answer.

Now look at the for-loop and try to figure out how it changes the L and R indices. In this picture, the colored ranges are the ranges traversed while answering staticQuery(3, 7): the yellow ones were ignored, the red ones were accounted. It's plain to see that the red ranges don't overlap and that their union is exactly the range 3–7.

For simplicity, I will explain the movement of the L index only. Note that we must take the a[L] value into account if L is odd, because the other ranges that cover the L-range (ranges 1, 2, 5 for L = 11) depend on $$$a_i$$$ values that don't appear in the query's range. That is what the first if-statement checks. Conversely, we should ignore the a[L] value if L is even, because there is a wider range that covers the L-range and doesn't conflict with the query's range (range 3 for L = 6).

As you can guess from the picture, the next range to examine has index L / 2 (in case of even L). Generally, we get the set of covering ranges by dividing L by powers of 2. In case of odd L we aren't interested in covering ranges, but we can notice that the “interesting” ranges reside to the right. It's easy to verify that the assignment L = (L + 1) >> 1 handles both cases.

As for the R index, everything is the same; but because this index is responsible for the right edge, everything is mirrored right-to-left. That's why the second if-statement and the R = (R - 1) >> 1 assignment slightly differ from their counterparts. And of course, we must stop the for-loop once L > R, because when the L-range and the R-range meet, the query's range is completely covered.
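For concreteness, here is a worked trace of staticQuery(3, 7) for N = 8 (my own trace, not from the original post):

// L = 11, R = 15 : L odd  -> take a[11] (element 3);   R odd -> skip
// L = 6,  R = 7  : L even -> skip;                     R odd -> skip
// L = 3,  R = 3  : L odd  -> take a[3] (elements 4..7)
// L = 2,  R = 1  : L > R, the loop ends
// The taken ranges {11, 3} cover exactly the elements 3..7.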

Note 1. It is possible to implement the algorithm in such a way that it doesn't require the operation to be commutative. You can even avoid the identity element requirement. A sketch of such a variant is shown below.
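Here is one possible non-commutative variant (a sketch of mine, not from the original post). The key observation is that the left border only ever appends ranges to the right of what it has already collected, while the right border only ever prepends, so keeping two partial results preserves the order of the operands:

int orderedQuery(int L, int R) {
  L += N;
  R += N;
  int resL = 0, resR = 0;  // identities; to drop them, track "still empty" flags instead
  for (; L <= R; L = (L + 1) >> 1, R = (R - 1) >> 1) {
    if (L & 1) {
      resL = resL + a[L];  // resL = op(resL, a[L])
    }
    if (!(R & 1)) {
      resR = a[R] + resR;  // resR = op(a[R], resR)
    }
  }
  return resL + resR;      // op(resL, resR)
}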

Note 2. If the operation is idempotent (minimum, for example), you can omit both if-statements and rewrite the body of the for-loop as a single line:

m = min(m, min(a[L], a[R]));

And don't forget that identity element of minimum is positive infinity, not zero.

Note 3. The algorithm can easily be adapted for multidimensional queries. Just look at the code for the 2D case.

int staticQuery2D(int x1, int x2, int y1, int y2) {
  x1 += N;
  x2 += N;
  int sum = 0;
  for (; x1 <= x2; x1 = (x1 + 1) >> 1, x2 = (x2 - 1) >> 1) {
    int i1 = y1 + N;
    int i2 = y2 + N;
    for (; i1 <= i2; i1 = (i1 + 1) >> 1, i2 = (i2 - 1) >> 1) {
      if (x1 & 1) {
        if (i1 & 1) {
          sum += a[x1][i1];
        }
        if (!(i2 & 1)) {
          sum += a[x1][i2];
        }
      }
      if (!(x2 & 1)) {
        if (i1 & 1) {
          sum += a[x2][i1];
        }
        if (!(i2 & 1)) {
          sum += a[x2][i2];
        }
      }
    }
  }
  return sum;
}

Modification of a Single $$$a_i$$$ Element

Performing a single-element modification takes two steps. First, you modify the value stored at index $$$i + N$$$ in order to modify $$$a_i$$$. Second, you reevaluate the values of all covering ranges, from the bottom layer to the very top.

void pop(int i) {
  a[i >> 1] = a[i] + a[i ^ 1];
}

void popUp(int i) {
  for (; i > 1; i >>= 1) {
    pop(i);
  }
}

We already know how popUp() enumerates the covering ranges, so let's focus on its subroutine. You can see that the operation is evaluated over the ranges with indices i and i ^ 1. It's not hard to figure out that these are the two subranges of the same range i >> 1. This reevaluation is correct because the operation is associative. Please note that this code is not suitable if the operation is not commutative.
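For concreteness, a small wrapper along these lines (mine, not from the original post) performs the whole single-element modification:

// Set a_i to v and restore consistency of all covering ranges.
void setValue(int i, int v) {
  a[i + N] = v;  // step one: modify the bottom-layer value
  popUp(i + N);  // step two: reevaluate the covering ranges
}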

Clearly, we can use single-element modifications to build the structure up. As each modification has logarithmic complexity, the build-up will take $$$O(N \log N)$$$. However, you can do it in $$$O(N)$$$ by using the following code. (You still need to load the values into the bottom-most layer first, of course.)

void buildUp() {
  for (int i = 2 * N - 1; i > 1; i -= 2) {
    pop(i);
  }
}

Range Query in $$$O(\log^2 N)$$$ after Transparent Range Modifications

Imagine we have a set of range modifications with different parameters, and we apply these modifications to the array in some order; this results in some state of the array. Now imagine we apply the same modifications to the original array, but in another order. The result may or may not be the same as before. If the result doesn't depend on the order, we will call such modifications transparent. For example, “increment all the values between the first and the third inclusive by five” is a transparent modification.

Assume first that the range we want to modify is not arbitrary, but one of those present in the structure. We can simply apply the modification to its value, but that leaves the structure in an inconsistent state. Although we know how to update the covering ranges, we won't do that for every such modification: there is a more efficient way to update them once we perform a modification over an arbitrary range, so let's assume the covering ranges become consistent somehow. But there are still inconsistent subranges. We cannot update them directly, because there are $$$O(N)$$$ of them. Instead, we will attach a modifier that stores the parameters of the modification virtually applied to every subrange. Here's the code for performing an increment with parameter p. (Here q[i] is the number of array elements covered by range i; the makeQ() routine in the next section computes these values. The multiplier is needed because incrementing each of the q[i] elements increases the range's sum by p * q[i].)

void modifierHelper(int i, int p) {
  a[i] += p * q[i];
  m[i] += p;
}

The two statements play different roles. The first performs the modification on the range's own value: each of the q[i] covered elements grows by p, so the stored sum grows by p * q[i]. The second combines parameters in such a way that the stored modifier is equivalent to the sequential application of the modifications with parameters m[i] and p.

Now, how to perform an arbitrary modification? Recall that we already know from the staticQuery() function how to decompose an arbitrary range. So the code for the increment modification will be:

void pop(int i) {
  a[i >> 1] = (a[i] + m[i >> 1] * q[i]) + (a[i ^ 1] + m[i >> 1] * q[i ^ 1]);
}

void modify(int L, int R, int p) {
  L += N;
  R += N;
  int LCopy = L;
  int RCopy = R;
  for (; L <= R; L = (L + 1) >> 1, R = (R - 1) >> 1) {
    if (L & 1) {
      modifierHelper(L, p);
    }
    if (!(R & 1)) {
      modifierHelper(R, p);
    }
  }
  popUp(LCopy);
  popUp(RCopy);
}

The two last statements are the more efficient way to update the covering ranges that I mentioned before: it's enough to do only these two pop-ups. Please note that the pop() subroutine has changed. That's because the i and i ^ 1 ranges are under the influence of the covering range's modifier, which contributes m[i >> 1] for each element they cover.

Now, how to answer range queries? Note that we cannot just use the staticQuery() function, because the traversed ranges may be affected by modifiers in higher layers. To find the real value stored in a range, we will use the following subroutine, which simply applies all the modifiers of the covering ranges:

int realValue(int i) {
  int v = a[i];
  int cnt = q[i];  // ancestors' modifiers apply to each of the cnt covered elements
  for (i >>= 1; i > 0; i >>= 1) {
    v += m[i] * cnt;
  }
  return v;
}

Then this code will perform the range query:

int log2Query(int L, int R) {
  L += N;
  R += N;
  int sum = 0;
  for (; L <= R; L = (L + 1) >> 1, R = (R - 1) >> 1) {
    if (L & 1) {
      sum += realValue(L);
    }
    if (!(R & 1)) {
      sum += realValue(R);
    }
  }
  return sum;
}
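As a quick sanity check, here is a hypothetical usage example (mine, not from the original post); it assumes N = 8, zero-initialized globals, and q[] filled by the makeQ() routine shown in the next section:

#include <cstdio>

int main() {
  makeQ();    // fill q[]; defined in the next section
  buildUp();  // the leaves are zero-initialized globals here
  modify(2, 5, 3);                 // increment elements 2..5 by 3
  printf("%d\n", log2Query(0, 4)); // 9: elements 2, 3, 4 contribute 3 each
  printf("%d\n", log2Query(0, 7)); // 12: four elements were incremented by 3
  return 0;
}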

Range Query and Opaque Range Modification in $$$O(\log N)$$$

As opposed to transparent modifications, the result of opaque ones depends on the order of application. However, transparent modifications can be viewed as opaque, which makes the opaque case more general than the transparent one. In this section we will learn, as an example, how to perform range assignments and range sum queries. (A range assignment changes each value in the range to the desired value p.)

The simple trick with realValue() won't work anymore, because now modifications in lower layers can affect modifiers in higher layers. We need a way to detach the modifiers from the ranges we need to update. The way is the “push down”. It is so called because it removes the modifier from a range and applies it to both subranges.

void push(int i) {
  if (h[i >> 1]) {
    modifierHelper(i,     m[i >> 1]);
    modifierHelper(i ^ 1, m[i >> 1]);
    h[i >> 1] = false;
  }
}

void pushDown(int i) {
  int k;
  for (k = 0; (i >> k) > 0; k++);  // k is now the number of bits in i
  for (k -= 2; k >= 0; k--) {      // detach modifiers from the root down to i's parent
    push(i >> k);
  }
}

Let's look at pushDown() first. All you need to know about it is that it enumerates the same indices as popUp() does, but in reverse order. The reverse order is required because push-downs propagate from top to bottom.

Now look at push(). There is a new value h[i >> 1], which is true if a modifier is attached to range i >> 1 and false otherwise. If a modifier is present, it is attached to both subranges, and i >> 1 is marked as free of modifiers.

In order to support range assignments and sum queries, we need new modifierHelper() and pop() subroutines. I believe it will not be hard to understand how they work.

void modifierHelper(int i, int p) {
  a[i] = p * q[i];
  m[i] = p;
  h[i] = true;
}

void pop(int i) {
  if (h[i >> 1]) {
    a[i >> 1] = m[i >> 1] * q[i >> 1];
  } else {
    a[i >> 1] = a[i] + a[i ^ 1];
  }
}

Here q[i] is, as before, the number of array elements covered by range i. Please note that these values are necessary for the problem of our example only; you don't need them in general.

The most direct way to calculate the q[i] values is to use this routine, which is similar to the buildUp() function:

void makeQ() {
  for (int i = N; i <= 2 * N - 1; i++) {
    q[i] = 1;
  }
  for (int i = 2 * N - 1; i > 1; i -= 2) {
    q[i >> 1] = q[i] + q[i ^ 1];
  }
}

Clearly, after all the necessary push-downs have been made, we can modify the range straightforwardly. There are only two push-downs we need to do. So the code for range assignment is as follows:

void modify(int L, int R, int p) {
  L += N;
  R += N;
  int LCopy = L;
  int RCopy = R;
  pushDown(LCopy);
  pushDown(RCopy);
  for (; L <= R; L = (L + 1) >> 1, R = (R - 1) >> 1) {
    if (L & 1) {
      modifierHelper(L, p);
    }
    if (!(R & 1)) {
      modifierHelper(R, p);
    }
  }
  popUp(LCopy);
  popUp(RCopy);
}

Absolutely the same holds for the range sum query:

int query(int L, int R) {
  L += N;
  R += N;
  pushDown(L);
  pushDown(R);
  int sum = 0;
  for (; L <= R; L = (L + 1) >> 1, R = (R - 1) >> 1) {
    if (L & 1) {
      sum += a[L];
    }
    if (!(R & 1)) {
      sum += a[R];
    }
  }
  return sum;
}
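Again, a hypothetical usage example (mine, not from the original post), under the same assumptions as before:

#include <cstdio>

int main() {
  makeQ();
  buildUp();                   // the leaves are zero-initialized globals here
  modify(1, 4, 7);             // assign 7 to elements 1..4
  printf("%d\n", query(2, 6)); // 21: elements 2, 3, 4 hold 7 each
  modify(3, 6, 2);             // assign 2 to elements 3..6
  printf("%d\n", query(0, 7)); // 22: the array is now 0,7,7,2,2,2,2,0
  return 0;
}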

Note 4. Sometimes you can avoid the use of the h[i] values: you can simulate detachment with an “undefined” modifier, as sketched below.
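For example (my own illustration, not from the post), if real parameters are always non-negative, a sentinel value can play the role of the h[i] flags:

const int NO_MODIFIER = -1;  // a value a real modifier can never take

bool hasModifier(int i) {
  return m[i] != NO_MODIFIER;  // replaces the h[i] check
}

// Detaching becomes m[i] = NO_MODIFIER instead of h[i] = false;
// m[] must be initialized with NO_MODIFIER everywhere.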

Note 5. If your modifications are transparent, you can omit the push-downs in modify(). You can also avoid the push-downs in query(), but that won't be as simple as removing a pair of statements.

Surprise!

In the very beginning I said that N should be an integer power of 2. In fact, that's not true: all the algorithms described here work correctly for any positive integer N. So for an array of any size N, this structure takes just $$$O(N)$$$ of additional space.
