The following problem popped into my mind all of a sudden and I have no idea how to solve it.
You are given a weighted rooted tree with $$$n$$$ nodes. You need to find the number of unordered pairs of nodes $$$(u,v)$$$ such that
$$$Constraints: 1<=n<= 10^5, -10^6<=edge\,weights<=10^6$$$
It will be really helpful if you can provide me with a solution.
Let $$$s_u$$$ be the subtree weight of vertex u, and $$$l(u,v)$$$ be the lowest common ancestor of vertices $$$u$$$ and $$$v$$$.
$$$d_u + d_v - 2d_{l(u,v)} = s_u + s_v \iff (d_u - s_u) + (d_v - s_v) - 2d_{l(u,v)} = 0$$$. If we attach a vertex $$$v'$$$ to every vertex $$$v$$$ with edge weight $$$-s_v$$$, we have reduced the problem to finding the number of zero-length paths. This is solvable using DSU on tree.
Can you explain A BIT MORE? Because $$$2d_{l(u,v)}$$$ will not be the same in the original tree and the modified tree. OR am I missing smth or what?
$$$l(u,v) = l(u', v')$$$, since $$$u'$$$ and $$$v'$$$ are children of $$$u$$$ and $$$v$$$.
I assume that means merge small to large will work too?
Zero-length paths would includes paths (u,v) that:
(1) $$$ (d_u - s_u) + (d_v - s_v) = 2 d_{l(u,v)} $$$
(2) $$$ (d_u - s_u) + d_v = 2 d_{l(u,v)} $$$
(3) $$$ d_u + d_v = 2 d_{l(u,v)} $$$
(1) is what we want to count, but we'll instead count them all.
As an example, consider following graphs:
The left describes case (3), which is the 0-lenght paths the tree itself has.
(2, 3) is a 0-length path, but (2,3) isn't a valid pair.
The right describes case (2), considering only the attached vertex of 2.
The vertex V constructs a 0-length path (V, 3), but (2,3) isn't a valid pair.
Edit: Way to exlude the cases
The problem exist because, only the pair of attached vertices should be counted.
We can construct a new graph $$$G'$$$ with only the attached vertices, and edges connecting them be the path between them in original graph. We can then do a DSU on tree correctly.
Building such graph can be done by DFS once.
Edit: It's not hard to exclude the cases (in f2lk6wf90d 's comment below).
It is trivial to exclude the original vertices.
Then, in the DSU on tree, only vertices $$$v$$$ such that $$$v \geq n$$$ must be included.
Is the tree rooted?
i have same doubt.
yes. I have updated the statement. Sorry for the inconvenience.
Let's root the tree at vertex 1. Solve the original problem on this tree, storing the answers separately for each original vertex.
How do the subtree sizes of the tree's vertices change if we reroot the tree at vertex v?
Let $$$f_0 = 1, \dots, f_i = v$$$ be the DFS stack from vertex 1 to vertex v. Only the vertices in this stack will have different subtree sizes after rerooting. However, if we reroot the tree at a child $$$f_{i+1} = c$$$ of vertex v, the subtree sizes of $$$f_0, \dots, f_{i-1}$$$ will not change. Therefore, when moving to vertex $$$c$$$, we will only check for the candidate pairs containing $$$c$$$ or $$$v$$$.
To check for $$$(c, f_0), \dots, (c, f_i)$$$ pairs:
$$$c$$$'s new subtree size will be equal to $$$s_1$$$, i.e. the total edge weight in the original tree. Therefore, the new equation is $$$d(c, x) = s'_x + s_1$$$, where $$$s'$$$ denotes the new subtree size. $$$d(c,x) = s'_x + s_1 \iff d_c - d_x = s'_x + s_1 \iff d_c - s_1 = s'_x + d_x$$$. It is easy to calculate how many of these pairs exist by maintaining a set containing the $$$s'_x + d_x$$$ values while performing the DFS.
To check for other pairs containing $$$c$$$:
$$$d(c, x) = s_1 + s_x \iff d_c + d_x - 2d_{l(c,x)} = s_1 + s_x \iff d_c + (d_x - s_x) - 2d_{l(c,x)} = s_1 \iff d_c + d_{x'} - 2d_{l(c,x')} = s_1$$$ i.e. we must calculate the number of paths from a leaf to a non-leaf vertex with weight $$$s_1$$$, such that the parent of the leaf vertex is not an ancestor of the non-leaf vertex, and store the answers separately for each non-leaf vertex. This is similar to the original problem. To remove the pairs that do not satisfy the latter condition: $$$d(c, x') = s_1 \iff d_c - (d_x + s_x) = s_1 \iff d_c - s_1 = d_x + s_x$$$. It is easy to calculate how many of these pairs exist by maintaining a set containing the $$$s_x + d_x$$$ values while performing the DFS.
To check for $$$(v, f_0), \dots, (v, f_{i - 1})$$$ pairs:
$$$s'_v = s_1 - s_v$$$. Therefore, the new equation is $$$d(v, x) = s'_v + s'_x \iff d_v - d_x = s'_v + s'_x \iff d_v - s'_v = d_x + s'_x$$$. It is easy to calculate how many of these pairs exist by maintaining a set containing the $$$s'_x + d_x$$$ values while performing the DFS.
To check for other pairs containing $$$v$$$:
$$$d(v, x) = s_1 - s_v + s_x \iff d_v + d_x - 2d_{l(v,x)} = s_1 - s_v + s_x \iff (d_v + s_v) + (d_x - s_x) - 2d_{l(v,x)} = s_1$$$. This is similar to the original problem.
Therefore, we can solve the problem for all $$$n$$$ roots in $$$O(n \log^2 n)$$$ time, just like the original problem.
I think you can solve it using small to large trick.
Let's rewrite the formula:
Note: $path(u,v)$ is the weight of the simple path from $$$u$$$ to $$$v$$$ and $$$st(u)$$$ is the weight of the subtree rooted at $$$u$$$.
For every node $$$u$$$ you should store the value: $$$path(u, root) - st(u)$$$ (this should be straight forward).
Then fix every node $$$x$$$ as the $$$lca(\cdot, \cdot)$$$. You should find all pairs of nodes in the subtree of $$$x$$$ (but that are not in the same immediate subtree from $$$x$$$) that their value $$$path(u,root) - st(u)$$$ sums to $$$2 \cdot path(x, root)$$$.
Now should process every subtree. Keep track the values you have seen so far and it's frequency on a map. When you are processing one subtree iterate through every value on it, find for its complement in the map and add the frequency to the answer. After that add the elements to the map (you should add them after the counting because you don't want to check among nodes from the same subtree)
Now the trick: don't iterate over the largest subtree. By default get the frequency map from that subtree, and iterate through other subtrees only. This will guarantee complexity $$$O(n \cdot log^2 n)$$$. (Extra $$$log$$$ comes from the fact that we are using a map.