pistone's blog

By pistone, 12 years ago, In English

given , there are 3 points in 2d space . also there are 10000 points in 2d space . each of the 3 points in the first set can be related to at max 3500 points from the 2nd set. how to assign all the 10000 points in the 2nd set , to the 3 points in 1st set , so that the sum of distances of points in 2nd set from points in 1st set is minimised ?

  • Vote: I like it
  • +4
  • Vote: I do not like it

| Write comment?
»
12 years ago, # |
  Vote: I like it -10 Vote: I do not like it

I do not understand problem. point from second set can be related only with one point from first set?

  • »
    »
    12 years ago, # ^ |
      Vote: I like it +2 Vote: I do not like it

    asssume that the 3 points in first set represent school locations. now in the 2nd set , there are 10000 points representing students .

    we have to assign each student a school , so that sum of the euclidean distances of the students from their assigned schools is minimum.

    • »
      »
      »
      12 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      Wow, schools are very populated in india.

      I think this problem is harder than min cost perfect matching, because of the restrictions. Where did you find the problem? What complexity are you expecting?

      • »
        »
        »
        »
        12 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        I think min cost max flow will work .... but the complexity won't be good.

»
12 years ago, # |
  Vote: I like it -8 Vote: I do not like it

This can be solved by dynamic programming.

  • »
    »
    12 years ago, # ^ |
      Vote: I like it -8 Vote: I do not like it

    The only DP solution I can think of has an O(3500^3) ... is there a faster way ?

    • »
      »
      »
      12 years ago, # ^ |
        Vote: I like it -8 Vote: I do not like it

      whats ur dp approach for this problem

      • »
        »
        »
        »
        12 years ago, # ^ |
          Vote: I like it 0 Vote: I do not like it

        the DP state is (p1, p2, p3) means that the first point from the first set matches with p1 points from the second ... and the second point from the first set matches with p2 points from the second set ... and the third point from the first set matches with p3 points from the second set ... the transition will be matching one point from the second set with any one from the first set .. goind to states (p1 + 1, p2, p3) and (p1, p2 + 1, p3) and (p1, p2, p3 + 1).

»
12 years ago, # |
  Vote: I like it 0 Vote: I do not like it

You can try:

  1. Make a source and a sink (S and T).

  2. C(S, i) = 1 (i — each point from the second set);
    D(S, i) = 0;

    C(i, j) = 1 (i — 1st set, j — 2nd set);
    D(i, j) = Distance from i to j;

    C(j, T) = 3500 (j — 2nd set);
    D(j, T) = 0;

    Where D(i, j) is the cost of edge from vertex i to vertex j.

  3. The answer should be the value of min-Cost-Flow.

But the approach is slow (10000*one-iteration-of-min-cost-flow).

  • »
    »
    12 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    so this can be solved by successive shortest path algo , is it ?

  • »
    »
    12 years ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Augmenting path search can be optimized here by storing all "basic" paths between vertices, corresponding to 3 points from the first set. By basic I mean paths, not passing through the other point in the first set. I believe, that structure like STL set would do the job. Then we also store for each of the 3 points list of unassigned points from the second set, ascending by distance. With this structure we would be able to find shortest augmenting path in about log(10000).

    • »
      »
      »
      12 years ago, # ^ |
        Vote: I like it 0 Vote: I do not like it

      could u explain further about ur algo

      • »
        »
        »
        »
        12 years ago, # ^ |
        Rev. 4   Vote: I like it 0 Vote: I do not like it

        Augmenting paths in the graph in question can be classified into 3 categories:

        1. S -> one of 3 points in first set -> one of points in second set -> T
        2. S -> one of 3 points in first set -> one of points in second set -> another one of 3 points in first set -> another one of points in second set -> T
        3. S -> one of 3 points in first set -> one of points in second set -> another one of 3 points in first set -> another one of points in second set -> last one of 3 points in first set -> another one of points in second set -> T

        We'll have a structure to figure out shortest augmenting path of each of those categories. So first of all, we store for each of vertices in first set all the points from second set, which aren't assigned yet, sorted by distance (std::set would be good for that). Now let's consider all possible sequences of points from 1st set on augmenting path: there are 6 of length 3, 6 of length 2, and 3 of length 1, so 15 in total. We will assume that augmenting path passes points of first set in chosen order. If we fixate last point from 1st set on the path, we can easily find next point on the path (which would be from 2nd set) in sorted collection, described above. So the question is how to figure out shortest paths between points from 1st set. The answer is — they can also be stored sorted by distance easily, since path containing "v1->x->v2", where v1 and v2 are from 1st set and x is from 2nd, can be augmented only if x is assigned to v2 at the moment (after augmentation x will be assigned to v2). So all those paths can be easily stored in a std::set as well.

        If this isn't clear, I can write a pseudo-code.

»
12 years ago, # |
  Vote: I like it +13 Vote: I do not like it

Another solution can by formulated as transportation problem.

Let schools are producers (with 3500 units available); students are recipients (with 1 unit requirements); we add fake_student (for balance the problem) with 500 units requirements.

Cost is simply distance between two points (0 in case of fake_student).

We have 3x10001 matrix. So complexity should be "O(n) smaller" than classic TP version.