Heuristic algorithm for Hamiltonian path in directed graphs

#	User	Rating
1	jiangly	3898
2	tourist	3840
3	orzdevinwang	3706
4	ksun48	3691
5	jqdai0815	3682
6	ecnerwala	3525
7	gamegame	3477
8	Benq	3468
9	Ormlis	3381
10	maroonrk	3379

#	User	Contrib.
1	cry	168
2	-is-this-fft-	165
3	Dominater069	161
4	atcoder_official	160
5	Um_nik	159
6	djm03178	157
7	adamant	153
8	luogu_official	151
9	awoo	149
10	TheScrasse	146

(It's a translated version of my blog at https://peehs-moorhsum.blog.uoj.ac/blog/6384. )

Finding a Hamiltonian path in a directed graph is a well-known NP problem.

However, about a year ago, I came up with the following heuristic algorithm which has GREAT performance on random graphs(by first generating a hamiltonian path, adding random edges, then randomly permuting indices) and many CP problems.

Here is the idea of the algorithm:

We maintain a subset $$$S$$$ of edges, ensuring there's no cycle and each vertex has in-degree and out-degree at most 1. Whenever $$$|S|$$$ reaches $$$n-1$$$, we find a hamiltonian path.

We start with $$$S$$$ empty. Each time we pick a random edge $$$e \notin S$$$. If we can add $$$e$$$ to $$$S$$$ without violating any rule, we simply add it to $$$S$$$. Otherwise, if $$$e$$$ won't create a cycle and violate with exactly one edge e', we replace e' with $$$e$$$ with 50% probability. Repeat until $$$|S|$$$ reaches $$$n-1$$$.

We can use LCT to check cycles. Here's my code, which takes less than 1 second on most random graphs with $$$|V|=100000, |E|=500000$$$:

namespace hamil {
    template <typename T> bool chkmax(T &x,T y){return x<y?x=y,true:false;}
    template <typename T> bool chkmin(T &x,T y){return x>y?x=y,true:false;}
    #define vi vector<int>
    #define pb push_back
    #define mp make_pair
    #define pi pair<int, int>
    #define fi first
    #define se second
    #define ll long long
    namespace LCT {
        vector<vi> ch;
        vi fa, rev;
        void init(int n) {
            ch.resize(n + 1);
            fa.resize(n + 1);
            rev.resize(n + 1);
            for (int i = 0; i <= n; i++)
                ch[i].resize(2), 
                ch[i][0] = ch[i][1] = fa[i] = rev[i] = 0;
        }
        bool isr(int a)
        {
            return !(ch[fa[a]][0] == a || ch[fa[a]][1] == a);
        } 
        void pushdown(int a)
        {
            if(rev[a])
            {
                rev[ch[a][0]] ^= 1, rev[ch[a][1]] ^= 1;
                swap(ch[a][0], ch[a][1]);
                rev[a] = 0;
            }
        }
        void push(int a)
        {
            if(!isr(a)) push(fa[a]);
            pushdown(a); 
        }
        void rotate(int a)
        {
            int f = fa[a], gf = fa[f];
            int tp = ch[f][1] == a;
            int son = ch[a][tp ^ 1];
            if(!isr(f)) 
                ch[gf][ch[gf][1] == f] = a;    
            fa[a] = gf;

            ch[f][tp] = son;
            if(son) fa[son] = f;

            ch[a][tp ^ 1] = f, fa[f] = a;
        }
        void splay(int a)
        {
            push(a);
            while(!isr(a))
            {
                int f = fa[a], gf = fa[f];
                if(isr(f)) rotate(a);
                else
                {
                    int t1 = ch[gf][1] == f, t2 = ch[f][1] == a;
                    if(t1 == t2) rotate(f), rotate(a);
                    else rotate(a), rotate(a);    
                } 
            } 
        }
        void access(int a)
        {
            int pr = a;
            splay(a);
            ch[a][1] = 0;
            while(1)
            {
                if(!fa[a]) break; 
                int u = fa[a];
                splay(u);
                ch[u][1] = a;
                a = u;
            }
            splay(pr);
        }
        void makeroot(int a)
        {
            access(a);
            rev[a] ^= 1;
        }
        void link(int a, int b)
        {
            makeroot(a);
            fa[a] = b;
        }
        void cut(int a, int b)
        {
            makeroot(a);
            access(b);
            fa[a] = 0, ch[b][0] = 0;
        }
        int fdr(int a)
        {
            access(a);
            while(1)
            {
                pushdown(a);
                if (ch[a][0]) a = ch[a][0];
                else {
                    splay(a);
                    return a;
                }
            }
        }
    }
    vi out, in;
    vi work(int n, vector<pi> eg, ll mx_ch = -1) { 
        // mx_ch : max number of adding/replacing  default is (n + 100) * (n + 50) 
        // n : number of vertices. 1-indexed. 
        // eg: vector<pair<int, int> > storing all the edges. 
        // return a vector<int> consists of all indices of vertices on the path. return empty list if failed to find one.  
        out.resize(n + 1), in.resize(n + 1);
        LCT::init(n);
        for (int i = 0; i <= n; i++) in[i] = out[i] = 0;
        if (mx_ch == -1) mx_ch = 1ll * (n + 100) * (n + 50); //default
        vector<vi> from(n + 1), to(n + 1);
        for (auto v : eg)
            from[v.fi].pb(v.se), 
            to[v.se].pb(v.fi);
        unordered_set<int> canin, canout;
        for (int i = 1; i <= n; i++)
            canin.insert(i), 
            canout.insert(i); 
        mt19937 x(chrono::steady_clock::now().time_since_epoch().count());
        int tot = 0;
        while (mx_ch >= 0) {
        //    cout << tot << ' ' << mx_ch << endl;
            vector<pi> eg;
            for (auto v : canout)
                for (auto s : from[v])
                    if (in[s] == 0) {
                        assert(canin.count(s));
                        continue;
                    }
                    else eg.pb(mp(v, s));
            for (auto v : canin)
                for (auto s : to[v])
                    eg.pb(mp(s, v));
            shuffle(eg.begin(), eg.end(), x);
            if (eg.size() == 0) break;
            for (auto v : eg) {
                mx_ch--;
                if (in[v.se] && out[v.fi]) continue;
                if (LCT::fdr(v.fi) == LCT::fdr(v.se)) continue;
                if (in[v.se] || out[v.fi]) 
                    if (x() & 1) continue;
                if (!in[v.se] && !out[v.fi]) 
                    tot++;
                if (in[v.se]) {
                    LCT::cut(in[v.se], v.se);
                    canin.insert(v.se);
                    canout.insert(in[v.se]);
                    out[in[v.se]] = 0;
                    in[v.se] = 0;
                }
                if (out[v.fi]) {
                    LCT::cut(v.fi, out[v.fi]);
                    canin.insert(out[v.fi]);
                    canout.insert(v.fi);
                    in[out[v.fi]] = 0;
                    out[v.fi] = 0;
                }
                LCT::link(v.fi, v.se);
                canin.erase(v.se);
                canout.erase(v.fi);
                in[v.se] = v.fi;
                out[v.fi] = v.se;
            }
            if (tot == n - 1) {
                vi cur;
                for (int i = 1; i <= n; i++) 
                    if (!in[i]) {
                        int pl = i;
                        while (pl) {
                            cur.pb(pl), 
                            pl = out[pl];
                        }
                        break;
                    } 
                return cur;
            }
        }
        //failed to find a path
        return vi();
    }
}

Note that if the start vertex $$$S$$$ and/or end vertex $$$T$$$ of the path is given, we simply create two new vertices $$$A, B$$$ and add edges from $$$A$$$ to $$$S$$$, from $$$T$$$ to $$$B$$$. Any Hamiltonian path in the new graph automatically has $$$A\to S$$$ as the first edge and $$$T\to B$$$ as the last edge. If we want to find a Hamiltonian cycle, we simply enumerate an edge, then convert the problem to finding a path given start/end vertices. For bidirectional case, we can simply add each edge twice.

Comments (11)

Write comment?

heuristica

4 years ago, # |

Auto comment: topic has been updated by heuristica (previous revision, new revision, compare).

→ Reply

Benq

+11

I assume you meant $$$|V|=10^4$$$? With $$$|V|=10^5$$$ and $$$|E|=5\cdot 10^5$$$ directed edges the expected number of vertices with out-degree 0 is quite large ... ($$$\approx |V|\cdot e^{-5}\approx 674$$$)

geniucos

4 years ago, # ^ |

+23

He did say "by first generating a hamiltonian path, adding random edges, then randomly permuting indices" which I guess means he ensures there are no such vertices (and I guess there are ~674 vertices with degree 1 then?)

← Rev. 3 →

+14

aha, can't read

Well assuming $$$4\cdot 10^5$$$ edges in addition to the hamiltonian path then that's $$$|V|\cdot e^{-4}\approx 1832$$$ with out-degree 1.

Yeah, that's true. I also tried many other random graphs, and it turned out that the performance is approximate $$$O(|E|\log|V|)$$$, not depends on $$$|E|/|V|$$$ very much (but it sometimes stuck in cases $$$|E|~3|V|$$$). I guess there're many paths in random graphs, so random guesses work most of the time?

wery0

Very cool approach!

Instead of time(0) it's better to use chrono::steady_clock::now().time_since_epoch().count() inside mt19937, because we can call function work() several times for better probabilities. If we call this function with the same arguments several times per second, time(0) returns the same value, which is bad.

Or we can move mt19937(time(0)) out of work()'s body.

Yeah I see. Updated! >w<

← Rev. 2 →

You can test any input with the code above! Just generate graphs (store edges as vector<pair<int, int>>, where (u, v) denotes an edge from $$$u$$$ to $$$v$$$), and put them into hamil::work. Initialization is done automatically, so you can test graphs in series.

Elegia

+24

This idea is also useful in some problems which are related to Hamiltonian path! In JOISC 2021 Day1 T1 Aerobatics, applying this idea will get >80 points easily.

SoMuchDrama

+20

I tried to apply this idea here and looks like it doesn't perform great for the bidirectional case, where not so many random edges added, so it won't work for very simple undirected graphs. For example, it fails to find the path for 5000 vertices and undirected edges $$$1-2-3-\ldots-N$$$ in 2 seconds.

+40

I see. Seems like the algorithm works bad when there are lots of directional paths but few Hamiltonian paths(which is often true when bidirectional).

(It also depends on seed a lot. For 1-2-3-...-N with N = 5000, it runs rather fast with seed = 377921677545000 (the 1 out of 100 seeds that success at mx_ch = 1e6 lol))

But bidirectional case should be easier, as we can only restrict vertices to have $$$deg \le 2$$$ rather than $$$out \le 1$$$ and $$$in \le 1$$$. I'll write another code for bidirectional later. :)

heuristica's blog