Mapping a permutation to a number is a problem that attracts a lot of attention. In shortest-path problems where a node (or state) is a permutation, we need to convert states to integers so that we can run BFS or Dijkstra on the resulting graph comfortably. Many coders know how to do this with std::map or a trie. However, both have certain disadvantages. This post introduces a new way to solve the problem.
It is hard to write a blog that is both long and detailed, so feel free to comment on anything you did not understand well. I will reply (or update this post if necessary).
To understand the role of mapping a permutation to a number, consider problem POSLOZI from COCI. Our goal is to find the length of the shortest path from a permutation S to another permutation T using allowed operations. A valid operation is swapping two elements of the permutation: we are given a list of pairs (p, q) denoting that we may swap the element at index p with the element at index q, and no other swaps are allowed. A possible strategy is to BFS simultaneously from both S and T.
1. Disadvantages of std::map and trie
First, let me point out the inefficiency of the widely known methods: std::map has a large time complexity, and a trie demands quite a long coding time.
1. Using std::map is the easiest way to code, but each query costs O(log N * L), where L is the length of the permutation and N is the number of permutations (in other words, N = L!).
2. A trie gives fast queries, but it uses quite a lot of memory, O(N * L), and takes a long time to code.
Therefore, neither std::map nor a trie is a perfect solution, which encouraged me to look for a better one.
2. My new method
Second, I will describe my method for this problem. Assume the length of the permutations is L = 11. We will use a 7-dimensional array, called C1, to map the first 7 elements to a number.
int C1[11][11][11][11][11][11][11];
int MaxC1=0;
At this point, the number associated with the first 7 elements of permutation P is C1[P0][P1][P2][P3][P4][P5][P6].
For the first 7 elements, there are 11! / 4! = 1663200 states, and 4 elements remain.
Let's first look at a naive solution: we create an array C2 declared as follows:
int C2[1663200+1][11][11][11][11];
int MaxC2=0;
To get the number associated with permutation P, we use the following function:
int index(const vector<int> &P) {
    int &I1 = C1[P[0]][P[1]][P[2]][P[3]][P[4]][P[5]][P[6]];
    if (I1 == 0) I1 = ++MaxC1;  // assign a fresh id on first sight
    int &I2 = C2[I1][P[7]][P[8]][P[9]][P[10]];
    if (I2 == 0) I2 = ++MaxC2;
    return I2;
}
We now have a function index that maps a permutation to a number (although it is not yet memory-efficient). If something is unclear, please reread the above carefully before continuing.
You may notice that the size of C1 is fine but C2 is far too large, so we need to make C2 smaller. Once the first 7 elements are fixed, the set of the 4 remaining values is determined, so their relative order identifies them uniquely. We can therefore represent them by the 6 order relations between their 6 pairs. Let x = P7, y = P8, z = P9, t = P10. Instead of C2[I1][x][y][z][t], we use C2[I1][x > y][x > z][x > t][y > z][y > t][z > t]. This reduces the size of C2 to 1663200 * 2^6 instead of 1663200 * 11^4.
The final implementation is here:
int C1[11][11][11][11][11][11][11]; // 77 MB
int MaxC1 = 0;
int C2[1663200+1][2][2][2][2][2][2]; // 425 MB
int MaxC2 = 0;

int index(const vector<int> &P) {
    int &I1 = C1[P[0]][P[1]][P[2]][P[3]][P[4]][P[5]][P[6]];
    if (I1 == 0) I1 = ++MaxC1;
    int x = P[7], y = P[8], z = P[9], t = P[10];
    int &I2 = C2[I1][x>y][x>z][x>t][y>z][y>t][z>t];
    if (I2 == 0) I2 = ++MaxC2;
    return I2;
}
3. Effectiveness
Third, let us consider the effectiveness of my method.
Time complexity
The time complexity of the function index is O(L) per access, with a quite small constant.
Memory complexity
The memory usage is quite big but acceptable. If it is too much for you, you can use 3 or 4 arrays (C1, C2, C3, C4, …) to reduce it.
Coding time
The function index is short and can be coded in very little time.
Effectiveness in DP
Suppose you are memoizing a function dp(u), where u is a permutation. You can then drop the separate bool array: to determine whether dp(u) has already been calculated, just check whether C2[I1][x>y][x>z][x>t][y>z][y>t][z>t] is zero.
4. Magic numbers
Lastly, the chosen magic numbers are noteworthy. Let me explain why I split the 11 elements into two groups: the first 7 elements and the last 4. Since I wanted to use only two arrays C1 and C2, I had to write 11 as a sum of two numbers; the options are 11 = 10+1 = 9+2 = 8+3 = 7+4 = 6+5, and I found that 7+4 is the best choice. 8+3 is impossible because C1 would be too large: 11^8 ints = 857 MB. 6+5 is no good because C2 would be too large: 936 MB. Therefore 7+4 is reasonable. More notably, if I used [x][y>z][y>t][z>t] instead of [x>y][x>z][x>t][y>z][y>t][z>t], the last dimensions would take 11 * 2^3 = 88 cells instead of 2^6 = 64. You should weigh such magic numbers carefully to balance the memory used.
To handle problems with L = 12, from a naive viewpoint no data structure (not even a trie) fits in memory, because 12! ints = 1.9 GB. However, not all states are reachable: in problem POSLOZI, fewer than 10^7 states are reached. Here is my code: http://paste.ubuntu.com/9686778/. It passed 9/10 tests and ran the last test in 3 seconds. That is what I expected, since my strategy is not the perfect solution (some say the optimal one is A* with suitable heuristics).
Conclusion
In summary, given all the advantages mentioned above, the new method is quite efficient. I recommend considering it carefully whenever you need to map a permutation to a number.
In problem POSLOZI, since not all states are reached, the natural way would be to use map<vector<int>, int> to map a state to an index (which you mentioned). To be more efficient, you can hash the vector<int> to a long long and use map<long long, int>. It would be much better if you tried implementing this and compared running times. Without a meaningful comparison, I'm not convinced to use this method. For me, it would be better to just use map and then optimize using hashing and A* (it's not hard to implement).
I tried map<vector<int>, int> and it took a lot of time (about 10x slower). My method runs considerably faster.
It showed its speed in the contest PREVNOI 2015, last problem. Some contestants used std::map, but I used this method and gained a higher score (on this problem) thanks to it.
map<vector<int>, int> being 10x slower is understandable, but what about map + hash?

Last test:
hash + std::map: 8.68s
my method: 3.71s
(with enough optimization for both implementations)
Why are people giving you negatives? It is quite a good optimization.
Well, because there are people who don't understand that a hash is probably slower than this method. Unfortunately, kien_coi_1997 didn't understand the question (probably he thinks that hash means map); anyway, the hash is not a good method because you still have a log factor from the map. Unfortunately, I didn't have enough time to look at the entire solution, but it seems to be a pretty cool one. Anyway, I have discovered that there is a big community of haters who, when they don't understand something, consider it wrong and give downvotes, which is not a good thing :(
This is exactly what I think. Furthermore, even if some algorithmic comments are wrong, I don't think they deserve to be downvoted. In most cases, the commenter genuinely wants to contribute to the community, so giving him negative contribution does not make sense. Instead of downvotes, I hope people ask for clarification, prove the comment wrong, etc.
Regarding "hash is not a good method because you still have log because of map": the complexity of hash + map is O(L + log(states)). In usual problems, log(states) is approximately L, so the complexity of this method can be considered equally good. The difference is which method has the lower constant.

I don't get why "in usual problems, log(states) is approximately L". I think that in this problem log(states) is log(L!), hence the overall complexity is O(L + L*log(L)), which explains why kien_coi_1997's method is faster. Or am I getting things wrong?

EDIT: ahh, I see, log(L) is pretty small: log(11) is essentially a constant. You can ignore my comment above :)
In usual competitive programming problems, the time limit is around a few seconds, so the number of states we can explore is usually around 10^5. When L! is bigger, e.g. 11! to 12!, the problem is usually solvable using the fact that the number of explored states is much smaller than L!.

So in a typical situation you will have around 10^5 states, meaning log(states) ≈ 16.
Why can't we simply calculate the rank of permutation p among all permutations of length L in O(L) time and then use a HashSet?
Is it possible to get the rank of permutation p in O(L)? The only method I know (description in Russian) works in O(L * log(L)).
Of course you are right) But for L ~ 12, O(L) is about equal to O(L * log(L)) (we can use a Fenwick tree with a small constant).
In fact, we can use bit compression for small L: keep the used values as bits of an int and compute popcount on its prefix.
Since 2^L < L!, we can even precompute popcount for all masks :)
It is possible, at least if we don't insist on maintaining lexicographical order, which is true when we need numbers just to distinguish search states. See here for an answer to a related question.
I loved your explanation and the links you provided. I didn't know anything about ranking and unranking. Thank you very much :)
I have been reading this article for three hours. As you know, I am a beginner, but this method is really simple and easy to code. I tried the test data to understand the author's (kien_coi_1997's) idea.
I still can't understand it entirely. I want to know the purpose of the arrays color[] and d[], and why color[v] != color[u] ends the BFS. (Does this search tree have two roots? Does d[] record the depth of a node? Does color[] record where a node comes from?) (My English is not good, so please forgive any mistakes.)
"this search tree has two roots ? d[] records the depth of the node? the color[] records where the node comes from?"
All of these are right.
"I want to know the use of the array color[] and d[]"
Color[u] can be 0 or 1 or 2. d[u] is the depth of u.
Initially, d[S] and d[T] are zero. Color[S]=1 and Color[T]=2.
We will BFS simultaneously from both S and T. We stop whenever two colors touch.
"And why the color[v]!=color[u] is the end of the BFS"
When we use u to visit v, there are 3 cases:
Color[v]==0: v is not visited => use u to update v.
Color[v]==Color[u]: v was visited with the same color as u => just ignore v.
Color[v]!=Color[u]: the two colors met => stop the BFS.
Many thanks. Also, in "Initialy, d[S] and d[T] are zero. Color[S]=1 and Color[v]=2.", 'v' should be 'T'.
What is the reason behind searching from both ends? Shouldn't starting from one end and progressing until we reach the other be enough, or would that generate too many intermediate states to explore?
Well, you got it almost right by yourself. In fact, the BFS is running from two starting states, and it will continue until a permutation V is found such that V has been previously reached by the other starting state.
Let's call both starting states A and B. Let's assume, without loss of generality, that you have reached a permutation V from A that has been previously reached from B. This means that D[V] is currently storing the distance from B to V, and since the distance from A to V turns out to be D[U] + 1 (U is the state currently being explored in the BFS), it follows that the distance between A and B should be (D[U] + 1) + (D[V]).
The technique being used is known as Meet in the Middle. You can read more about it in the following links: a) http://www.infoarena.ro/blog/meet-in-the-middle b) http://www.quora.com/What-is-meet-in-the-middle-algorithm-w-r-t-competitive-programming
When L = 9, the DP function can be simplified as follows:
You may be interested in factoradic system (aka factorial number system) as well: link
That's a swift and handy trick you've got there. I can, however, suggest an alternative method of my own which is also linear, i.e. O(L) in time, but uses much less memory compared to yours.
Could you please explain it?
What is the problem with the following approach: