An efficient way to map a permutation (length 11 or 12) to an integer without using std::map

№	Пользователь	Рейтинг
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

№	Пользователь	Вклад
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

Mapping a permutation to a number has attracted much social concern. In shortest-path problems, in case a node (or state) is a permutation, we should convert states to integers, in order to BFS or Dijkstra on new graph comfortably. Many coders have known using std::map or trie to do this work. However, both have certain disadvantages. My writing will introduce a new way to solve this problem.

It is hard to write both long and detailed blog. Therefore, you can comment anything which you didn't understand well. I will reply (or update this blog if it is necessary).

To understand the role of mapping a permutation to a number, consider problem POSLOZI from COCI. Our goal is to find the length of the shortest path from a permutation S to an other permutation T using allowed operations. A valid operation is swapping two elements in the permutation. We are given a list of pair (p, q) denote we can swap the element indexed p and the element indexed q. Any other swapping operations are not allowed. A possible strategy is to BFS simultaneously from both S and T.

. .

1. Disadvantages of std::map and trie

In the first place, what I put in my priority is inefficiency of wide-known methods. Std::map has big time complexity, and trie demands a quite long conding-time. 1, Using std::map is the easiest way to code but it costs big complexity, i.e. O(logN * L) for each query (where L is length of the permutation, N is number of permutations, in other words N = L!). 2, Trie brings us the speed in queries but it uses quite much memory O(N * L) and requests a long time to code. Therefore, both std::map and trie are not perfect solutions. It encourages me to find a better solution.

2. My new method

In the second place, I will talk about my method treating this problem. Assume length of permutations L = 11. We will use a 7-directional array to map first 7 elements to a number, called C1.


int C1[11][11][11][11][11][11][11];
int MaxC1=0;

At this point, the number associate with the first 7 elements of permutation P is C1[P₀][P₁][P₂][P₃][P₄][P₅][P₆].

For the first 7 elements, we have 11! / 4! = 1663200 states. There are 4 remaining elements.

Now first let's talk about a naive solution, we will create array C2 declared as following:


int C2[1663200+1][11][11][11][11];
int MaxC2=0;

In order to get the number associate with the permutation P, we will use following function:


int index(const vector<int> &P) {
  int &I1 = C1[P[0]][P[1]][P[2]][P[3]][P[4]][P[5]][P[6]];
  if (I1==0) I1=++MaxC1;
  int &I2 = C2[I1][P[7]][P[8]][P[9]][P[10]];
  if (I2==0) I2=++MaxC2;
  return I2;
}

We have already created function index to map a permutation to a number (although it is not efficient in memory yet). If you have not understood well, please read again carefully what I wrote before continuing reading.

You may realize that size of C1 is ok but C2 is too large. Now we need to make C2 smaller. When 7 elements are determined, we can use the relations (6 relations of 6 pairs of 4 elements) between the 4 remaining elements to represent them (remaining elements). Let x = P₇, y = P₈, z = P₉, t = P₁₀ . Instead of using C2[I2][x][y][z][t], we will use C2[I2][x > y][x > z][x > t][y > z][y > t][z > t] instead. This makes the size of C2 become 1663200 * 64 instead of 1663200 * 11⁴.

The final implementation is here:


int C1[11][11][11][11][11][11][11]; // 77MB
int MaxC1=0;
int C2[1663200+1][2][2][2][2][2][2]; // 425 MB
int MaxC2=0;

int index(const vector<int> &P) {
  int &I1 = C1[P[0]][P[1]][P[2]][P[3]][P[4]][P[5]][P[6]];
  if (I1==0) I1=++MaxC1;
  int x=P[7], y=P[8], z=P[9], t=P[10];
  int &I2 = C2[I1][x>y][x>z][x>t][y>z][y>t][z>t];
  if (I2==0) I2=++MaxC2;
  return I2;
}

3. Effectiveness

In the third place, what cannot be left out is the effectiveness of my method.

Time complexity

The time complexity of function index is O(L) (to access), and the constant is quite small.

Memory complexity

The memory is quite big but acceptable. If you cannot accept this memory size, you can use 3 or 4 array (C1, C2, C3, C4, …) to reduce the memory size.

Coding time

Personally, function index is short and easy to code in a little time.

Effectiveness in DP

Assume you are using function dp(u) (which u is a permutation), you can reduce a bool array. To determine if f(u) is calculated, just check if C2[I1][x>y][x>z][x>t][y>z][y>t][z>t] is zero or not.

4. Magic numbers

Lastly, selected magic numbers is note-worthy. I will talk about reasons why I split 11 elements into 2 groups: first 7 elements and last 4 elements. Because I want to use only two array C1 and C2, I must write 11 into a sum of two number, such ways are: 11=10+1=9+2=8+3=7+4=6+5, and I found that 7+4 is the best selection. 8+3 is impossible because size of C1 is too large: 11^8 ints = 857 MB. 6+5 is not good because size of C2 is too large, 936 MB. Therefore 7+4 is reasonable. More notably, if I use [x][y>z][y>t][z>t] instead of [x>y][x>z][x>t][y>z][y>t][z>t], the size will be 11*2^3 (=88) instead of 2^6 (=64). You should consider those magic numbers carefully to balance the memory used.

To handle problems with L=12, in a naive view point, we can't use any data structure (even trie) because 12! ints = 1.9GB (too big). However, not all of states are reached. In problem POSLOZI, there are less than 10^7 reached states. This is my code http://paste.ubuntu.com/9686778/. It passed 9/10 tests and runs the last test in 3 seconds. It is what I expected because my strategy is not the perfect solution (someone said the optimal one is to use A* with some heuristics).

Conclusion

In summarize, for all the mentioned advantages above, the new method shows us a good efficiency. I highly recommend that you should carefully consider writing to make good decision on problem mapping a permutation to a number.

It is hard to write both long and detailed blog. Therefore, you can comment anything which you didn't understand well. I will reply (or update this blog if it is necessary).

int FF[9][9][9][9][9][9][2][2][2]; bool F[9][9][9][9][9][9][2][2][2]; #define C P[0]][P[1]][P[2]][P[3]][P[4]][P[5]][P[6]>P[7]][P[6]>P[8]][P[7]>P[8] int f(const vector<int> &P) { if (F[C]++) return FF[C]; ... return FF[C] = Result; }

Комментарии (25)

Написать комментарий?

I_love_Hoang_Yen

10 лет назад, # |

← Rev. 2 →

In problem POSLOZI, since not all states are reached, the natural way would be to use map< vector<int>, int> to map state to index (which you mentioned). To be more efficient, you can hash the vector<int> to long long, and get map<long long, int>.

It would be much better if you try implementing this and compare running time. Without meaningful comparison, I'm not convinced to use this method. For me, it would be better to just use map and then optimize using hash and A* (it's not hard to implement).

→ Ответить

kien_coi_1997

10 лет назад, # ^ |

I tried map<vector, int> and it took a lot of time (about 10x slower). My new method runs faster considerably.
It showed the speed in the contest PREVNOI 2015, last problem. There are some contestants used std::map, but I used this method. And I gained higher score (in this problems) thanks to this method.

+13

map<vector,int> 10x slower is understandable, but what about map + hash?

Last test:

hash + std::map: 8.68s

my method: 3.71s

(with enough optimization for both implementations)

ArepitaDePernil

why is people giving you negatives? it is quite a good optimization

geniucos

+35

Well because there are people who don't understand that a hash is probably slower than this method.Unfortunately, kien_coi_1997 didn't understand the question(prbably he thinks that hash means map), but anyway the hash is not a good method because you still have log because map.Unfotnutely, I didn't have enough time to take a look at entire solution, but it seems it is a preety cool one.Anyway, I discovered that there is a big community of haters which if don't undertand something, consider it wrong, and give downvotes, which is not a good thing :(

This is exactly what I think. Furthermore, even if some algorithmic comments are wrong, I don't think they deserve to be downvoted. In most case, the commeter genuinely want to contribute to the community, thus giving him negative contribution does not make sense. So, instead of downvotes, I hope people ask for clarification / prove the comment wrong / ...

Regarding "hash is not a good method because you still have log because map". The complexity of hash + map is O(L + log). In usual problems, log(state) is approximately L. So the complexity of this method can be considered equally good. The difference is which method have lower constant.

ibraaaa

I don't get why "In usual problems, log(state) is approximately L"?

I think that in this problem log(state) is log(L!), hence overall complexity is O(L + L*Log(L)) which explains why kien_coi_1997's method is faster or am I getting things wrong?

EDIT: ahh, I see Log(L) is pretty small Log(11) which is essentially a constant. You can ignore my comment above :)

In usual competitive programming problems, time limit is around few seconds, so the amount of states that we can explore is usually around 10⁵. When L! is bigger, i.e. 11! to 12!, the problem is usually solvable using the fact that the number of explored states is much smaller than L!

So in typical situation, you will need to have around 10⁵ states, meaning log(states) = 16

Edvard

Why we can't simply calculate the number of permutation p among all permutations of length L in O(L) time and then use HashSet?

yarrr

+11

Is is possible to get the number of permutation p in O(L)? Cause only method I know (Description in Russian) works in O(L * lg(L)).

Of course you are right) But for L ~ 12 O(L) is about equally to O(L * lg(L)) (we can use Fenwick tree to have small constant).

In fact, we can use bit compression for small L. Just set bits to some int & calc pop_count on it prefix.

Cause 2^L < L! we can even precompute pop_count for all masks :)

Gassa

It is possible, at least if we don't insist on maintaining lexicographical order. Which is true when we need numbers just to distinguish search states. See here for an answer to a related question.

alexpizarroj

← Rev. 3 →

I loved your explanation and the links you provided. I didn't know anything about ranking and unranking. Thank you very much :)

QiLiu_

I see the article for three hours.As you know ,i am a beginner. But this way is really simply and easy to code. I try the test data to understand the thought of the author (Kien coi 1997).
Now i still can't understand it entirely. I want to know the use of the array color[] and d[] . And why the color[v]!=color[u] is the end of the BFS.（this search tree has two roots ? d[] records the depth of the node? the color[] records where the node comes from?) (My English i not good ...So if there is something wrong ,please forgive me .)

"this search tree has two roots ? d[] records the depth of the node? the color[] records where the node comes from?"

all of these are right.

"I want to know the use of the array color[] and d[]"

Color[u] can be 0 or 1 or 2. d[u] is the depth of u.

Initialy, d[S] and d[T] are zero. Color[S]=1 and Color[T]=2.

We will BFS simultaneously from both S and T. We stop whenever two colors touch.

"And why the color[v]!=color[u] is the end of the BFS"

When we use u to visit v, there are 3 cases:

Color[v]==0: v is not visited => use u to update v.
Color[v]==Color[u]: v is visited and has the same color with u => just ignore v
Color[v]!=Color[u]: two colors met => stop BFS

Full thanks to you. And "Initialy, d[S] and d[T] are zero. Color[S]=1 and Color[v]=2." 'v' should be 'T'.

What is the reason behind searching from both ends? Shouldn't starting from one end and progressing until we reach the other end be enough? or will generate a lot of intermediate states to explore?

Well, you got it almost right by yourself. In fact, the BFS is running from two starting states, and it will continue until a permutation V is found such that V has been previously reached by the other starting state.

Let's call both starting states A and B. Let's assume, without loss of generality, that you have reached a permutation V from A that has been previously reached from B. This means that D[V] is currently storing the distance from B to V, and since the distance from A to V turns out to be D[U] + 1 (U is the state currently being explored in the BFS), it follows that the distance between A and B should be (D[U] + 1) + (D[V]).

The technique being used is known as Meet in the Middle. You can read more about it in the following links: a) http://www.infoarena.ro/blog/meet-in-the-middle b) http://www.quora.com/What-is-meet-in-the-middle-algorithm-w-r-t-competitive-programming

+18

When L=9, the DP function can be minimized as following:

magieNoire

You may be interested in factoradic system (aka factorial number system) as well: link

sgtlaugh

That's a swift and handy trick you've got there. I can however suggest an alternate method of my own which is also linear i.e, O(L) in time, but uses much less memory when compared to yours.

Could you please explain it?

I_love_natalia

What is the problem with the following approach:

Let each permutation will be encoded in 6 bytes, where each a half of byte is a number from permutation.
Create some random-shuffled array to encode a word (two bytes).
Use some integer hash function like hash1 = (x << 16) ^ (y) ^ (z << 14) ^ (z << 5) and hash2 = (hash1 << 1) | 1 for an open addressed hash table. Here x,y,z are encoded three words made from 6 bytes of permutation.
???
Accepted???