(It's a translated version of my blog at https://peehs-moorhsum.blog.uoj.ac/blog/6384. )
Finding a Hamiltonian path in a directed graph is a well-known NP problem.
However, about a year ago, I came up with the following heuristic algorithm which has GREAT performance on random graphs(by first generating a hamiltonian path, adding random edges, then randomly permuting indices) and many CP problems.
Here is the idea of the algorithm:
We maintain a subset $$$S$$$ of edges, ensuring there's no cycle and each vertex has in-degree and out-degree at most 1. Whenever $$$|S|$$$ reaches $$$n-1$$$, we find a hamiltonian path.
We start with $$$S$$$ empty. Each time we pick a random edge $$$e \notin S$$$. If we can add $$$e$$$ to $$$S$$$ without violating any rule, we simply add it to $$$S$$$. Otherwise, if $$$e$$$ won't create a cycle and violate with exactly one edge e', we replace e' with $$$e$$$ with 50% probability. Repeat until $$$|S|$$$ reaches $$$n-1$$$.
We can use LCT to check cycles. Here's my code, which takes less than 1 second on most random graphs with $$$|V|=100000, |E|=500000$$$:
namespace hamil {
template <typename T> bool chkmax(T &x,T y){return x<y?x=y,true:false;}
template <typename T> bool chkmin(T &x,T y){return x>y?x=y,true:false;}
#define vi vector<int>
#define pb push_back
#define mp make_pair
#define pi pair<int, int>
#define fi first
#define se second
#define ll long long
namespace LCT {
vector<vi> ch;
vi fa, rev;
void init(int n) {
ch.resize(n + 1);
fa.resize(n + 1);
rev.resize(n + 1);
for (int i = 0; i <= n; i++)
ch[i].resize(2),
ch[i][0] = ch[i][1] = fa[i] = rev[i] = 0;
}
bool isr(int a)
{
return !(ch[fa[a]][0] == a || ch[fa[a]][1] == a);
}
void pushdown(int a)
{
if(rev[a])
{
rev[ch[a][0]] ^= 1, rev[ch[a][1]] ^= 1;
swap(ch[a][0], ch[a][1]);
rev[a] = 0;
}
}
void push(int a)
{
if(!isr(a)) push(fa[a]);
pushdown(a);
}
void rotate(int a)
{
int f = fa[a], gf = fa[f];
int tp = ch[f][1] == a;
int son = ch[a][tp ^ 1];
if(!isr(f))
ch[gf][ch[gf][1] == f] = a;
fa[a] = gf;
ch[f][tp] = son;
if(son) fa[son] = f;
ch[a][tp ^ 1] = f, fa[f] = a;
}
void splay(int a)
{
push(a);
while(!isr(a))
{
int f = fa[a], gf = fa[f];
if(isr(f)) rotate(a);
else
{
int t1 = ch[gf][1] == f, t2 = ch[f][1] == a;
if(t1 == t2) rotate(f), rotate(a);
else rotate(a), rotate(a);
}
}
}
void access(int a)
{
int pr = a;
splay(a);
ch[a][1] = 0;
while(1)
{
if(!fa[a]) break;
int u = fa[a];
splay(u);
ch[u][1] = a;
a = u;
}
splay(pr);
}
void makeroot(int a)
{
access(a);
rev[a] ^= 1;
}
void link(int a, int b)
{
makeroot(a);
fa[a] = b;
}
void cut(int a, int b)
{
makeroot(a);
access(b);
fa[a] = 0, ch[b][0] = 0;
}
int fdr(int a)
{
access(a);
while(1)
{
pushdown(a);
if (ch[a][0]) a = ch[a][0];
else {
splay(a);
return a;
}
}
}
}
vi out, in;
vi work(int n, vector<pi> eg, ll mx_ch = -1) {
// mx_ch : max number of adding/replacing default is (n + 100) * (n + 50)
// n : number of vertices. 1-indexed.
// eg: vector<pair<int, int> > storing all the edges.
// return a vector<int> consists of all indices of vertices on the path. return empty list if failed to find one.
out.resize(n + 1), in.resize(n + 1);
LCT::init(n);
for (int i = 0; i <= n; i++) in[i] = out[i] = 0;
if (mx_ch == -1) mx_ch = 1ll * (n + 100) * (n + 50); //default
vector<vi> from(n + 1), to(n + 1);
for (auto v : eg)
from[v.fi].pb(v.se),
to[v.se].pb(v.fi);
unordered_set<int> canin, canout;
for (int i = 1; i <= n; i++)
canin.insert(i),
canout.insert(i);
mt19937 x(chrono::steady_clock::now().time_since_epoch().count());
int tot = 0;
while (mx_ch >= 0) {
// cout << tot << ' ' << mx_ch << endl;
vector<pi> eg;
for (auto v : canout)
for (auto s : from[v])
if (in[s] == 0) {
assert(canin.count(s));
continue;
}
else eg.pb(mp(v, s));
for (auto v : canin)
for (auto s : to[v])
eg.pb(mp(s, v));
shuffle(eg.begin(), eg.end(), x);
if (eg.size() == 0) break;
for (auto v : eg) {
mx_ch--;
if (in[v.se] && out[v.fi]) continue;
if (LCT::fdr(v.fi) == LCT::fdr(v.se)) continue;
if (in[v.se] || out[v.fi])
if (x() & 1) continue;
if (!in[v.se] && !out[v.fi])
tot++;
if (in[v.se]) {
LCT::cut(in[v.se], v.se);
canin.insert(v.se);
canout.insert(in[v.se]);
out[in[v.se]] = 0;
in[v.se] = 0;
}
if (out[v.fi]) {
LCT::cut(v.fi, out[v.fi]);
canin.insert(out[v.fi]);
canout.insert(v.fi);
in[out[v.fi]] = 0;
out[v.fi] = 0;
}
LCT::link(v.fi, v.se);
canin.erase(v.se);
canout.erase(v.fi);
in[v.se] = v.fi;
out[v.fi] = v.se;
}
if (tot == n - 1) {
vi cur;
for (int i = 1; i <= n; i++)
if (!in[i]) {
int pl = i;
while (pl) {
cur.pb(pl),
pl = out[pl];
}
break;
}
return cur;
}
}
//failed to find a path
return vi();
}
}
Note that if the start vertex $$$S$$$ and/or end vertex $$$T$$$ of the path is given, we simply create two new vertices $$$A, B$$$ and add edges from $$$A$$$ to $$$S$$$, from $$$T$$$ to $$$B$$$. Any Hamiltonian path in the new graph automatically has $$$A\to S$$$ as the first edge and $$$T\to B$$$ as the last edge. If we want to find a Hamiltonian cycle, we simply enumerate an edge, then convert the problem to finding a path given start/end vertices. For bidirectional case, we can simply add each edge twice.
Auto comment: topic has been updated by Rewinding (previous revision, new revision, compare).
I assume you meant $$$|V|=10^4$$$? With $$$|V|=10^5$$$ and $$$|E|=5\cdot 10^5$$$ directed edges the expected number of vertices with out-degree 0 is quite large ... ($$$\approx |V|\cdot e^{-5}\approx 674$$$)
He did say "by first generating a hamiltonian path, adding random edges, then randomly permuting indices" which I guess means he ensures there are no such vertices (and I guess there are ~674 vertices with degree 1 then?)
aha, can't read
Well assuming $$$4\cdot 10^5$$$ edges in addition to the hamiltonian path then that's $$$|V|\cdot e^{-4}\approx 1832$$$ with out-degree 1.
Yeah, that's true. I also tried many other random graphs, and it turned out that the performance is approximate $$$O(|E|\log|V|)$$$, not depends on $$$|E|/|V|$$$ very much (but it sometimes stuck in cases $$$|E|~3|V|$$$). I guess there're many paths in random graphs, so random guesses work most of the time?
Very cool approach!
Instead of
time(0)
it's better to usechrono::steady_clock::now().time_since_epoch().count()
insidemt19937
, because we can call functionwork()
several times for better probabilities. If we call this function with the same arguments several times per second,time(0)
returns the same value, which is bad.Or we can move
mt19937(time(0))
out ofwork()
's body.Yeah I see. Updated! >w<
You can test any input with the code above! Just generate graphs (store edges as vector<pair<int, int>>, where (u, v) denotes an edge from $$$u$$$ to $$$v$$$), and put them into hamil::work. Initialization is done automatically, so you can test graphs in series.
This idea is also useful in some problems which are related to Hamiltonian path! In JOISC 2021 Day1 T1 Aerobatics, applying this idea will get >80 points easily.
I tried to apply this idea here and looks like it doesn't perform great for the bidirectional case, where not so many random edges added, so it won't work for very simple undirected graphs. For example, it fails to find the path for 5000 vertices and undirected edges $$$1-2-3-\ldots-N$$$ in 2 seconds.
I see. Seems like the algorithm works bad when there are lots of directional paths but few Hamiltonian paths(which is often true when bidirectional).
(It also depends on seed a lot. For 1-2-3-...-N with N = 5000, it runs rather fast with seed = 377921677545000 (the 1 out of 100 seeds that success at mx_ch = 1e6 lol))
But bidirectional case should be easier, as we can only restrict vertices to have $$$deg \le 2$$$ rather than $$$out \le 1$$$ and $$$in \le 1$$$. I'll write another code for bidirectional later. :)