Suppose we're given a graph with $$$N$$$ vertices and $$$M \space (\le \frac{N \times (N - 1)}{2})$$$ undirected edges, vertices are labelled from $$$0$$$ to $$$N - 1$$$.
We've to add $$$K \space (\le \frac{N \times (N - 1)}{2} - M)$$$ random undirected edges to this graph such that no edge is repeated (including the ones we added).
Goal is to deterministically minimize the number of calls made to $$$rand()$$$ function, only $$$K$$$ calls ideally, choice of edges should be provably random, while keeping rest of the algorithm not too inefficient, for $$$N, M, K \le 10^6$$$.
I feel this problem can be called "pick K random elements without repetition from complement set" where $$${ (i, j) \mid 0 \leq i \leq N \text{ and } 0 \leq j \leq N }$$$ is the universal set and we have a set of $$$M$$$ pairs initially.
I am unable to think any provably random and efficient ways to accomplish this, for example I could pick a random vertex, then pick $$$rand(0, N - degree)$$$ and do some mapping for second vertex, which is $$$O(degree)$$$ probably and can become inefficient.
I would like to know any ideas to accomplish this, thanks.
Will this work ?
Agree this approach will produce a less random. But how would you define "choice of edges should be provably random"?
Let $$$X = \frac{N \times (N - 1)}{2} - M$$$, there are $$$\binom{X}{K}$$$ ways to choose K random edges, so chances of choosing a particular set should be $$$\frac{1}{ways}$$$, though I don't know how we could calculate this for some algorithm, will try to implement this method, thanks.
You can generate $$$1$$$ edge at a time. To do that generate a random number between $$$1$$$ and $$$\frac{n \cdot (n - 1)}{2} - m$$$, get an edge by index with a binary search and then update everything with some data structures. One operation can be maintained in $$$O(\log n)$$$, however it has a bad constant factor.
Edit: actually, I can only see how to do in $$$O(\log ^2 n)$$$ because we need to get $$$k$$$-th element that is absent in some adjacency list, and I know how to dynamically do that only with ordered sets. There is probably a much better way.
You can achieve $$$O(\log n)$$$ complexity with that method if you use dynamic (implicit) segment trees instead of ordered sets, however I have a feeling that it would work even slower, and it uses $$$O(m \log n)$$$ memory.
I'm quite sure that you can find the $$$k$$$-th absent element in $$$O(\log N)$$$ by using treaps. The constant factor is bad, but memory is $$$O(M + K)$$$.
Compiling what others have said, this is I think the best you can expect:
I've also included some code, as I think in this case it's more suggestive. Please note that I haven't tested the code whatsoever and there might be bugs, but it highlights the general idea.
https://gist.github.com/theabbie/0be86edab481af581c450a4cfeef6d69
Implemented this random graph generator based on ideas suggested.