Блог пользователя ed1d1a8d

Автор ed1d1a8d, 11 лет назад, По-английски

Here's the problem: SUB_PROB

I'm having trouble getting my solution to run in time, even though my algorithm should pass. If you didn't already read the problem statement, it asks you to check whether or not words exist in a piece of text. I'm currently using Aho-Corasick to solve the problem. Basically, I build up the search trie first in O(N*S), and then search through the text once in O(M). This should add up to a O(M + N*S) algorithm, which by my calculations — should AC.

Here is my code: OLD http://ideone.com/ndQcN4 OLD

EDIT — HERE IS UPDATED CODE: http://ideone.com/3P7XIj

Note: I had to split up the query strings into 3 batches since my trie would exceed the memory limit otherwise.

Is my algorithm flawed, or is my implementation not optimal. And if it is the latter case, what can I do to speed up the algorithm. I have already removed all recursive methods in exchange for iterative ones, so I don't know how I can speed it up further. Thanks.

  • Проголосовать: нравится
  • +1
  • Проголосовать: не нравится

»
11 лет назад, # |
Rev. 4   Проголосовать: нравится +3 Проголосовать: не нравится
  1. what is vector repeats[MAX_N]; ? use bitmasks to remember set of words for terminating vertices. it takes only 12.5 MB memory

struct node { int val; int terminal; int parent; map<char, int> childrens; int fail; }; or struct node { int val; int terminal; int parent; int fail; }; map<pair<int, char>, int> childrens; ... for (map<pair<int, char>, int>::iterator ch = childrens.lower_bound(pair<int, char>(x, 0)); ch != childrens.end() && ch->first.first == x; ++ch) // for_each_chilren_of(x)
  • »
    »
    11 лет назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    I went about and implemented your suggestions, but it still gets TLE. Although now, it does take up less memory (only 79MB total). As for your first suggestion, I don't really need to save that space so I stuck with using vector<int> repeats[MAX_N]. In addition, I also reverted scanf and printf to cin and cout since it made no difference when testing it with large cases on my own computer. Is there a trick I'm missing in Aho-Corasick, or should I approach the problem differently, maybe using Suffix Tree?

    Here is my updated code: http://ideone.com/3P7XIj

    Thanks for your help.

    • »
      »
      »
      11 лет назад, # ^ |
      Rev. 5   Проголосовать: нравится 0 Проголосовать: не нравится
      1. I don't undesrtand what is array "repeats". I think it can be very large and slow. use bitmasks.

      2. if(aho[idx].children.count(val) > 0) return aho[idx].children[val]; the is two searches, but you need only one.

      map<int, int>::iterator it = aho[idx].chilrdern.count(val);
      if (it != aho[idx].chilrdern.end()) return it->second;
      
      

      3 scanf & printf is faster than cin & cout

    • »
      »
      »
      11 лет назад, # ^ |
      Rev. 3   Проголосовать: нравится 0 Проголосовать: не нравится

      also you can store trie using hash-table, it is fast enough. code
      UPD: thre is a bug at line 81. has to be int hp = hash(parent, val);

»
4 года назад, # |
  Проголосовать: нравится 0 Проголосовать: не нравится

can someone tell why i'm getting wa, used string hashing link to code: https://ideone.com/oRX1SS

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    Firstly the strings will also contain numeric characters and secondly you have to consider 'A' and 'a' as different character.

  • »
    »
    4 года назад, # ^ |
      Проголосовать: нравится 0 Проголосовать: не нравится

    I have tried to solve the problem using string hashing. But I got TLE. I have solved the problem with O(n*m) complexity. How I can optimize it? Here is my code -- https://ideone.com/zrInz2