Hello,
I was implementing a simple Trie when I found a behavior that I didn't understand. I followed the program using a debugger, the issue seems to be in the insert
function. I explained the issue in code comments.
struct Trie{
struct Node{
int id, nxt[26];
Node(int id) : id(id)
{
memset(nxt, -1, sizeof(nxt));
}
};
vector<Node> v;
vector<bool> leaf;
void init()
{
v.clear();
leaf.clear();
addNode();
}
int addNode()
{
v.push_back(v.size());
leaf.push_back(false);
return (int)(v.size()) - 1;
}
int getSizeMinusOne()
{
return (int)(v.size()) - 1;
}
void insert(const string &s)
{
int cur = 0;
for (char c : s)
{
c -= 'a';
if (v[cur].nxt[c] == -1)
{
// Note I only use one of the following NOT both at the same time
// Approach (1)
// v[cur].nxt[c] doesn't get updated. Stays -1 and RTEs later on due to out of boundary access
v[cur].nxt[c] = addNode();
// Approach (2)
// works fine
addNode();
v[cur].nxt[c] = getSizeMinusOne();
}
cur = v[cur].nxt[c];
}
leaf[cur] = true;
}
};
I also found out that both approaches seem to work find when changing the array nxt
to a vector instead of an array. Could someone explain why this happens when nxt
is an array? And why approach (2) works fine even if nxt
is an array?
Auto comment: topic has been updated by Bekh (previous revision, new revision, compare).
Bad stuff happens when a vector resizes. Saving
addNode()
to an intermediate variable and then settingv[cur].nxt[c]
equal to that variable suffices (or what you did in approach 2).Could you provide more insight on what kind of bad stuff happen? When do the bad stuff effects fade and I can safely use the vector again? Why is the program not affected by the bad stuff if
nxt
is a vector instead of an array (Or is it just a coincidence?)?I asked ecnerwala about something similar a while back, I think the same reasoning applies (although I can't say that I understand it that well).
Aha, got you. And yes, I'm using C++14. Calling
v.reserve()
with a big amount before running the code works fine so this is indeed the issue.And it seems to evaluate the rhs first when
nxt
is a vector for some reason lol (Tried generating many random strings and inserting them to make sure pointers are indeed getting invalidated).Thank you for the explanation.