You are given N points with integer coordinates (Xi, Yi) and Q queries. In each query, you are given one of the points given in the input, let's call it P. For each query, you need to answer which point given in the input is the closest to P, considering that the distance between two points is the Manhattan Distance. If there is more than one point with the same distance, the one with lower X should be chosen. If the tie persists, the one with lower Y should be chosen.
1 <= N <= 105
1 <= Q <= 105
1 <= Xi, Yi <= 103
Any ideas?
Not sure if O(Q * max(X, Y)) would pass or not, but here is the idea. Make a quad tree of the points (some people mistakenly call it 2D segment tree) Example where every node contains four values. 1- min(X + Y) 2- min(-X + Y) 3- min(X + -Y) 4- min(-X + -Y)
Basically, every node represents a rectangle in your space and you can query every rectangle in O(max(X, Y)) time. Now for every query split the space to four quadrants 1- Rectangle to the right-top of your point 2- Rectangle to the left-top of your point 3- Rectangle to the right-bottom of your point 4- Rectangle to the left-bottom of your point
For every rectangle of the four rectangle you query it in the quad tree and the value you are interested in is the correspondent value of the four values mentioned earlier.
Revised: use accumulative arrays instead of 2D-BIT
Or you can actually do something even trickier accumulative arrays where you can reduce the time complexity to O( Q ). So it's basically the same idea above but instead of using Quad tree to query your answer we will use 4 accumulative arrays to store the min values.
So you can make 4 accumulative arrays where every array starts at a corner of your space and store the corresponding min values, for each query you need to query the four accumulative arrays for your answer.
Example:
Your query point is the intersection of all the four rectangles (bit should be accumulative array)
For every point find closest point in every of 4 parts of coordinate system centered at that point (not larger x and y coordinate, not larger x and not smaller y coordinate etc.). Then sweep coordinate system 4 times. E.g. first sweep determines for a point closest one so that it has not larger both x and y coordinate. That is as standard as it gets exercise for sweeping with segment tree, so I hope that is sufficient explanation. Complexity O((n+q) log (n+q)).
Yet another way to solve this.
If you convert each point (X; Y) to (X'; Y') so that X' = X-Y, Y' = X+Y, distance function becomes max(|X1 — X2|, |Y1 — Y2|) instead of original.
Then you can use binary search by answer for each query point, and you will just have to check whether some square contains a point. Due to low constraints on X_i and Y_i you can do it with a 2D array without any data structures.
OK, se here is fastest of all posted solutions -> bruteforce xD. For every point do sth like:
for every point P:
..for (int dist=1; ; dist++)
....check one by one all places with distance dist from P and if you find a point return that point as closest one to P
If I'm not mistaken this works in O(max(X, Y)2) which for given constraints is fastest :P.
My second comment is O(Q) so it should be faster :D, but your solution is much better and shorter tbh :D
Ahh, missed that, but that's also nice :). But isn't it O(Q + max(X, Y)2) = O(max(X, Y)2)? As I understand we need to store four precomputed arrays of size X × Y.
LOL you are right, I completely forgot about the preprocessing cost xD. By the way, could you please elaborate more why your above solution works in O(max(X, Y))2 as it doesn't look very obvious to me.
Ahhh, I solved a bit different problem :/. What I wanted to solve is "We are given set A of points. For every point in A determine closest other point in A", but problem we are given is "We are given sets A, B of points. For every point in B determine closest point in A.". Counterexample to my solution applied to second problem is easy (A = one point with large y, B = many point with small y), however if we wanted to solve first version it should be fast enough.
Aha I see.
Correct me if I'm wrong but the formulation you tried to solve is where the query points are from the ones given in the input and that's exactly what the problem wants (after reading it again). The query points are guaranteed to be from the ones given in the input according to the poster's statement.
Lol, it seems you're right :P. So, here's the trick. My claim is that no point (given in the input or not) won't be checked more than constant number of times. In other words, there exists a constant c (5 or something), so that there don't exists points Q, P1, ..., Pc, so that for a fixed i we have dist(Q, Pi) < dist(Pi, Pj) for all j ≠ i.
It's easy to see that each input point induces a region R in the plane where this point is the answer for any query in R. You can find these regions using a BFS: first insert all N points in the queue (insert them sorted by X,Y to solve ties), then run your usual BFS, marking the unvisited nodes with the same ID as its predecessor. Since the coordinates are limited to 10^3, your BFS will have to visit exactly 10^6 positions. Now, you can answer every query in O(1) printing the value stored in that position of the grid.
Small example: if your coordinates are restricted from (1,1) to (5,5) and your input points are P1 = (1,1) and P2 = (5,5), these are the regions:
1 1 1 1 1
1 1 1 1 2
1 1 1 2 2
1 1 2 2 2
1 2 2 2 2
The value v in position (i,j) means that Pv is the answer if you query this position. I hope i explained it well enough :)
Total complexity for Q queries is O(Q) + 10^6 operations for pre-processing
EDIT: Actually the complexity is O(NlogN+Q) because of sorting input points
My code if anyone is interested: http://pastebin.com/p1FSkZyq
m,n are the coordinate limits (<=10^3)
k is the amount of input points
l is the amount of query points