JasonMendoza2008's blog

By JasonMendoza2008, history, 2 months ago, In English

I saw somewhere that if you were to pick randomly the next element to insert when building a binary search tree from an array, you would have 1/3 on the left and 2/3 on the right (or conversely) on average. I get that the best case is half-half and the worst case in everything in one of the branches but how do get formally to the 1/3-2/3?

  • Vote: I like it
  • +1
  • Vote: I do not like it

»
2 months ago, # |
  Vote: I like it 0 Vote: I do not like it

You either didn't read well or you didn't express the question right. If you randomly choose the root of a tree you will be expecting nodes to distribute half left half right (expected values of nodes to the right: $$$EV = \frac 1 n (0+1+2+...+(n-1)) $$$).

  • »
    »
    2 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    Having a 50/50 distribution would be the best case (as balanced as it gets)? How could the best case be the average case?

    • »
      »
      »
      2 months ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      I see what you mean. Well that is avarage in this case because if you calculate the difference of nodes from the left and from the right you are expected 0. If you are interested in the expected value of the maximum (and minimum) of the nodes to the right and to the left you will have to calculate the following: $$$EV_{max} = \frac 1 n \sum_{i=1}^nmax(i-1,n-i) \approx \frac 2 n ((n-1) + (n-2) + ... + \frac n 2) = \frac 3 4n$$$. Which gives me the maximum is expected $$$\frac 3 4$$$ and the minimum $$$\frac 1 4$$$. I got different results from what you said. You can do yourself the work again to verify.

  • »
    »
    2 months ago, # ^ |
    Rev. 3   Vote: I like it 0 Vote: I do not like it

    What the OP said could also be interpreted as, essentially, the expected value of the amount of nodes in the smaller subtree, whichever one that turns out to be (and proving it's $$$1/3$$$ of the nodes). That being said, I'm pretty sure that turns out to be $$$1/4$$$ of the nodes, which is still not $$$1/3$$$, so I'm confused as well.

    EDIT: Oops, too late.