Jomax100's blog

By Jomax100, history, 6 months ago, In English

According to https://cplusplus.com/reference/string/string/substr/ Complexity = Unspecified, but generally linear in the length of the returned object.

However, I believe in practice it's much faster, specially for repeated calls with same start_pos.

Example problem: https://leetcode.com/contest/weekly-contest-377/problems/minimum-cost-to-convert-string-ii/

Solution from contest winner below

Solution

My analysis of the time complexity for the code above: I think substr() call should result in timeout. STL says complexity of substr(x, len) = len. Therefore, the dp loop is n * lens.size() * max_len where, n = source.size(), and max_len = max(lens[i]) for all i.

Eg. in the case where n = 1000, and we have lens = [900, 901, ..., 999]. Therefore,

  1. Outer loop > for (int i = 0; i < n; ++i) n = 1000,

  2. Inner loop > for (int l : lens), lens = [900, 901, ..., 999]

  3. Inside inner loop. we call substr(st, l), in O(l). But max(l) = n

Thus, since max(l) = max_len = 999,

  • Time Complexity = n * lens.size() * max_len

  • Time Complexity = n * lens.size() * n

  • Time Complexity = 1000*100*1000, which should TLE

There must be something going on making substr() more efficient. My guess is caching susbtr() calls so substr(i, x+d), uses previously queried substr(i, x),

Would love to understand more about the optimization going on in substr(). Or would this solution always give TLE for this test case, indicating that it could be hacked (even if not supported in Leetcode)?

Only thing I found is from https://stackoverflow.com/questions/4679746/time-complexity-of-javas-substring

stackoverflow
  • Vote: I like it
  • +4
  • Vote: I do not like it

»
6 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by Jomax100 (previous revision, new revision, compare).

»
6 months ago, # |
  Vote: I like it 0 Vote: I do not like it
  1. $$$1000\cdot100\cdot1000=10^8$$$, which doesn't necessarily TLE.
  2. substr(i, len) creates a new copy with length len — that's impossible to do in $$$o(\mathrm{len})$$$.
  • »
    »
    6 months ago, # ^ |
    Rev. 2   Vote: I like it 0 Vote: I do not like it

    What about replacing

           for(int len: lens){
                    if(i+len > n) break;
                    string cur = source.substr(i, len);
                    string need = target.substr(i, len);
    

    with the following

           string s, t;
           for(int len = 1; len <= lens.back(); len++){
                    if(i+len > n) break;
                    s += source[i+len-1];
                    t += target[i+len-1];
    

    Previously inner loop = lens * n = 100*1000

    Now inner loop might be n since max(len) <= n

    However, inside the inner loop, we compute substr() in O(1), and the dominant term is the O(log(200)) call to label.find(s) in the map<string,int>.

    So inner loop = n * log(200)

    Final ans = n*n*log(200) = 1000*1000*8

    But that does TLE

    Maybe label.find(s) has it's time complexity log(label.size()) + some cost related to s.size()?

»
6 months ago, # |
Rev. 2   Vote: I like it +3 Vote: I do not like it

Actually tried replacing substr() with manual implementation and, while slower, still got Accepted:

Before

string cur = source.substr(i, cur_size);
string need = target.substr(i, cur_size);

After:

forn(j,cur_size) cur += source[i+j];
forn(j,cur_size) need += target[i+j];

So I guess it must be weak test cases.