It turns out that the final amortized time complexity is $$$O(\alpha(n))$$$, where $$$\alpha(n)$$$ is the inverse Ackermann function, which grows very slowly. In fact, it grows so slowly that it doesn't exceed $$$4$$$ for any reasonable $$$n$$$ (approximately $$$n < 10^{600}$$$).
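For reference, here is a minimal sketch of the structure being discussed (my own naming and layout, not necessarily the code from the post) that combines the two heuristics the $$$O(\alpha(n))$$$ bound relies on: path compression in `find` and union by rank in `unite`.

```cpp
#include <utility>
#include <vector>

struct DSU {
    std::vector<int> parent, rank_;
    explicit DSU(int n) : parent(n), rank_(n, 0) {
        for (int i = 0; i < n; ++i) parent[i] = i;  // every vertex starts as its own set
    }
    int find(int v) {
        if (parent[v] != v) parent[v] = find(parent[v]);  // path compression
        return parent[v];
    }
    void unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return;
        if (rank_[a] < rank_[b]) std::swap(a, b);  // union by rank: shallower tree goes under
        parent[b] = a;
        if (rank_[a] == rank_[b]) ++rank_[a];
    }
};
```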
Question: Why does this work in $$$O(\alpha(n))$$$?
link.
Read this blog to understand, at least slightly, what exactly is going on.
?
The linked blog doesn't even mention inverse Ackermann.
I thought it would help him get a gist of what exactly is happening within DSU. Also, the blog does introduce a function, $$$\log^{*}(n)$$$, although it doesn't mention it by name. This function, and how it comes up in the complexity, is quite easy to see, and it provides quite a good bound too.
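For intuition about how slowly that function grows, here is a tiny sketch (a hypothetical helper of mine, not something from the linked blog) of what $$$\log^{*}(n)$$$ computes:

```cpp
#include <cmath>
#include <cstdio>

// Iterated logarithm: how many times log2 has to be applied before the value drops to <= 1.
int log_star(double n) {
    int count = 0;
    while (n > 1.0) {
        n = std::log2(n);
        ++count;
    }
    return count;
}

int main() {
    std::printf("%d %d\n", log_star(65536), log_star(1e18));  // prints "4 5"
}
```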
Using the path compression heuristic alone does not make your find operation work in $$$\mathcal{O}(\alpha(n))$$$. You have to use the union-by-rank heuristic as well.
Path compression alone gives an amortized bound of $$$O(\log n)$$$ per operation. For full details see the paper: http://www.csd.uwo.ca/~eschost/Teaching/07-08/CS445a/p245-tarjan.pdf (see Theorem 4 on page 264; the construction of the worst-case scenario starts in the middle of page 261 -- the essence is described in the last paragraph of that page and the general idea is depicted in Figure 9 on page 262; a nice summary of many DSU variants is also given in Tables I and II on page 280). So the whole idea is based on taking advantage of the binomial tree's structure: the unions are arranged to build the tree, and the finds are then issued in such a way that each one takes $$$\Theta(\log n)$$$.
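For clarity, this is the variant being talked about (a sketch with my own names): `find` flattens paths, but `unite` attaches roots arbitrarily, which is exactly what the worst-case construction in the paper exploits.

```cpp
#include <vector>

struct DsuPathCompressionOnly {
    std::vector<int> parent;
    explicit DsuPathCompressionOnly(int n) : parent(n) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }
    int find(int v) {
        return parent[v] == v ? v : parent[v] = find(parent[v]);  // path compression
    }
    void unite(int a, int b) {
        a = find(a); b = find(b);
        if (a != b) parent[b] = a;  // no rank/size tie-breaking at all
    }
};
```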
Once I wrote a task that exposes this nuance in path-compression-only DSU: you are given a graph of $$$n$$$ nodes and $$$m$$$ $$$(1 \leq n, m \leq 4\cdot 10^6)$$$ weighted edges, and you have to compute the sum of weights of the edges in the MST. Input: $$$n$$$ and $$$m$$$ in the first line, then $$$m$$$ triples $$$a_i, b_i, w_i$$$, meaning an edge of weight $$$w_i$$$ connects vertices $$$a_i$$$ and $$$b_i$$$, where $$$a_i \neq b_i$$$, $$$1 \leq a_i, b_i \leq n$$$, $$$1 \leq w_i \leq 10^9$$$, and $$$w_1 \leq w_2 \leq \ldots \leq w_m$$$.
The obvious solution is to use Kruskal's algorithm.
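A sketch of what that looks like (my own code, assuming the input format described above): since the edges already arrive sorted by weight, there is no sorting step, just a single scan that adds an edge whenever its endpoints are in different sets.

```cpp
#include <cstdio>
#include <utility>
#include <vector>

std::vector<int> parent, rnk;

int find(int v) {
    return parent[v] == v ? v : parent[v] = find(parent[v]);  // path compression
}

bool unite(int a, int b) {  // returns true if the edge joined two different components
    a = find(a); b = find(b);
    if (a == b) return false;
    if (rnk[a] < rnk[b]) std::swap(a, b);  // union by rank
    parent[b] = a;
    if (rnk[a] == rnk[b]) ++rnk[a];
    return true;
}

int main() {
    int n, m;
    std::scanf("%d %d", &n, &m);
    parent.resize(n + 1);
    rnk.assign(n + 1, 0);
    for (int i = 1; i <= n; ++i) parent[i] = i;
    long long total = 0;
    for (int i = 0; i < m; ++i) {
        int a, b, w;
        std::scanf("%d %d %d", &a, &b, &w);
        if (unite(a, b)) total += w;  // edges are already sorted by weight
    }
    std::printf("%lld\n", total);
}
```

Presumably the point of the constraints (and the pre-sorted weights) is that a path-compression-only DSU gets punished here, which is the nuance described above.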
I prefer union-by-size. It's slightly less "correct" as a heuristic, but has the same $$$O$$$ complexity, and size is the more useful statistic for the "client" side.
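To make the comparison concrete, a minimal union-by-size sketch (names are mine); the size array doubles as the tie-breaker for unions and as the statistic the "client" can query directly:

```cpp
#include <utility>
#include <vector>

struct DsuBySize {
    std::vector<int> parent, sz;
    explicit DsuBySize(int n) : parent(n), sz(n, 1) {
        for (int i = 0; i < n; ++i) parent[i] = i;
    }
    int find(int v) {
        return parent[v] == v ? v : parent[v] = find(parent[v]);  // path compression
    }
    void unite(int a, int b) {
        a = find(a); b = find(b);
        if (a == b) return;
        if (sz[a] < sz[b]) std::swap(a, b);  // smaller set goes under the larger one
        parent[b] = a;
        sz[a] += sz[b];
    }
    int componentSize(int v) { return sz[find(v)]; }  // the "client-side" statistic
};
```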
If I remember correctly, union by rank alone guarantees $$$\mathcal{O}(\log N)$$$ performance for the find operation. The worst possible test case is the one where you only combine sets of the same size.
According to GeeksForGeeks, "using size as rank also yields worst case time complexity as $$$O(\log n)$$$", but the link to the proof is defunct and redirects to some kind of news website.
Well, the proof for the famous "small-to-large" technique also works for explaining why the union by rank heuristic provides that time complexity. I think that the proof can be found somewhere on Codeforces. Do pardon me for my laziness to Google.
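For completeness, the usual doubling argument, paraphrased: with union by size (and no path compression), the root above a vertex changes only when its tree is attached under a root of a set that is at least as large, so the size of the set containing that vertex at least doubles each time. The size starts at $$$1$$$ and never exceeds $$$n$$$, so this happens at most $$$\log_2 n$$$ times, every vertex ends up at depth at most $$$\log_2 n$$$, and find runs in $$$O(\log n)$$$ -- which is exactly the small-to-large argument.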
Just a little surprised: Why did the blog post receive downvotes? I feel the above question is legitimate to ask, and it seems like it was an effective way for others to provide helpful comments / links.
The solution can be googled easily.
[a wiki page for it](http://en.wikipedia.org/wiki/Proof_of_O%28log*n%29_time_complexity_of_union%E2%80%93find)