DISCLAIMER: I really want to understand why this solution (which I think is quite reasonable) is wrong. If this is a nonsensical solution, please say so. I am not looking for a AC solution, I'd like to prove that this one is wrong, but i've been unable to do so..
Problem Statement
The statement is from timus 1588 — Jamaica:
Each two cities are to be connected by a road going along a straight line. One road may connect several cities if they lie on the same straight line. Jamaican economists are sure that the new network will minimize transportation expenses. In order to estimate the cost of the project, it is required to determine the total length of the roads to be constructed, and this is the job for Andrey.
Solution to be proven wrong
So, we have N points and we want the sum of a subset of the lines.
Consider all point combinations, this give all lines. A line is described by its slope and its point of intersection with the Y axis. The ideas is simply to find the biggest segment in each line.
Some code:
from fractions import Fraction # available since python 2.6 . Irreducible fractions, very slow
...
for a,b in combinations(pts,2):
dx,dy = a[0]-b[0], a[1]-b[1]
dist = _dist(a,b)
# Vertical, horizontal and default line
if a[1] == b[1]:
line = (0, a[1])
elif a[0] == b[0]:
line = (INF, a[0])
else:
# irreducible fraction Fraction(8,2) = 4/1
# M = dy/dx and B = Y - MX
M = Fraction(dy,dx)
# line is tuple of integers. No division is done to avoid precision errors.
B = Fraction(dx*b[1] - (dy*b[0]), dx)
line = ((M.numerator,M.denominator, (B.numerator,B.denominator))
# Find biggest line
if line not in lines:
lines[line] = dist
elif lines[line] <= dist:
lines[line] = dist
Here's the example input: (correct answer is 412)
0 0
0 100
100 0
50 50
And a verbose output:
(A,B) DIST , where (A,B) describes segment AB
((0, 0), (100, 0)) 100.0
((0, 0), (50, 50)) 70.7106781187
((0, 0), (0, 100)) 100.0
((0, 100), (100, 0)) 141.421356237
412
What may be wrong
Why is it wrong? I couldn't find a test in which this algorithm fail.
Is the general idea Faulty? I'm quite 'preocupied' because the problem's discussion said that the "best algo" for this problem was O(N^2 log N) or O(N^2 log N^2) or something like that, but in my solution I don't see a O(log X) factor hah (i.e I could be overlooking some case). In fact, both log(N) solutions mentioned sort (sort by angle or sort by distance), but I didn`t find a reason to sort..
Is it a precision problem? Or a rounding problem? I've used python's Fraction to (try) avoid precision problems and math.round(X + 0.5) to print the answer. Is that not enough?
OBS: Also I'm aware of some O(N^3) AC solutions, such as the one I put at the bottom of this post. But I didn't find they interesting..
I think my solution is quite similar to one mentioned in the problem's discussion:
- Sort n^2 lines by distance.
- Iterate from largest to smallest.
- We have to add line to the answer, if we not have bigger line in set, covering this. We can easily check it with set.
Where did I go wrong?
EDIT: As the user @ffao said, my problem could be anywhere in my code, so here's the whole thing.
Maybe it's a precision problem while printing the answer (i.e floor( X + 0.5) not enought) I don't think so, I kind of tested with some 'round problematic answers'.
I really suspect that has got something todo with the general idea of the solution and that I'm overlooking some sort of case, unfortunately haha
Here`s the whole code:
def main():
from sys import stdin
from itertools import izip, combinations
from math import sqrt, floor
from fractions import Fraction
tkns = iter(map(int,stdin.read().strip().split()))
n = tkns.next()
pts = [ (x,y) for x,y in izip(tkns, tkns) ]
_dist = lambda A,B: sqrt((A[0]-B[0])*(A[0]-B[0]) + (A[1]-B[1])*(A[1]-B[1]))
lines = {}
INF = 10000005
for a,b in combinations(pts,2):
dx,dy = a[0]-b[0], a[1]-b[1]
dist = _dist(a,b)
if a[1] == b[1]:
line = (0, a[1])
elif a[0] == b[0]:
line = (INF, a[0])
else:
M = Fraction(dy,dx)
B = Fraction(dx*b[1] - (dy*b[0]), dx)
line = ((M.numerator,M.denominator), (B.numerator,B.denominator))
#M = float(dy)/dx
#line = (M, b[1] - M*b[0])
if line not in lines:
lines[line] = dist
elif lines[line] <= dist:
lines[line] = dist
print int(floor(sum(lines.values())+0.5))
Also, here's an AC solution which I think is O(n^3): (most timus1588 solutions I googled were analogous to this straightforward one)
#include <bits/stdc++.h>
using namespace std;
typedef long long ll;
int x[303], y[303];
int dist(int i, int j) {
return (x[i] - x[j]) * (x[i] - x[j]) + (y[i] - y[j]) * (y[i] - y[j]);
}
int cross(int i, int j, int k) {
return (x[j] - x[i]) * (y[k] - y[i]) - (y[j] - y[i]) * (x[k] - x[i]);
}
map<vector<int>, int> ss;
int main() {
int i, j, k, n;
scanf("%d", &n);
for(i = 0; i < n; i++) {
scanf("%d %d", &x[i], &y[i]);
}
for(i = 0; i < n; i++)
for(j = i + 1; j < n; j++) {
vector<int> v;
for(k = 0; k < n; k++)
if(k == i || k == j || cross(i, j, k) == 0)
v.push_back(k);
ss[v] = max(ss[v], dist(i, j));
}
double tot = 0;
for(auto &e : ss)
tot += sqrt(e.second);
printf("%.0f\n", tot);
}
Since you are using python, a much better way to store a line would be by representing it as Ax + By + C = 0 and using a dictionary to store the points as values on the basis of keys
tuple (A,B,C)
, whereA,B,C
should have no common factor apart from 1. This will avoid the floating point errors , that's where I think you are going wrong.Thank you for the answer.
How is that diferent from using a tuple (M,B), where M,B are also integers? I mean, after finding A,B,C I still have to guarantee there is no common factor, right?
Isn't that the same I did with M,B by using Fraction? (there's never a division)
ignore, you are using Fractions
Your explanation of your solution was pretty good (thanks for that), but why didn't you post your own working code? Maybe there is a mistake there you didn't copy here, or maybe there is a mistake in a different place of the program.