Rating changes for last rounds are temporarily rolled back. They will be returned soon. ×

Little_Sheep_Yawn's blog

By Little_Sheep_Yawn, history, 8 months ago, In English

I was practicing when I came across this problem. And I submitted a few codes which are almost the same, but their running time is quite different. Here's the thing:

The fastest one is here, which only runs 1590ms:

n, mod = map(int, input().split())

dp = [0] * (n * (n - 1) + 1)
diff = [0] * (n * (n - 1) + 1)
msk = n * (n - 1) // 2
v = 1

wing = 0
for i in range(n):
    
    for j in range(msk - wing, msk + wing + 1):
        if dp[j]:
            diff[j] += dp[j]
            diff[j + (n - i - 1) + 1] -= dp[j]
    
    cur = 0
    for j in range(msk - wing, msk + wing + n - i):
        cur += diff[j]
        cur %= mod
        dp[j] = cur
        diff[j] = 0
    
    for j in range(msk - wing, msk + wing + n - i):
        if dp[j]:
            diff[j - (n - i - 1)] += dp[j]
            diff[j + 1] -= dp[j]
    
    cur = 0
    for j in range(msk - wing - (n - i - 1), msk + wing + n - i):
        cur += diff[j]
        cur %= mod
        dp[j] = cur
        diff[j] = 0
    
    for j in range(-(n - i - 1), 0):
        dp[j + msk] += v * (n - i + j) % mod
        dp[j + msk] %= mod
    
    v = v * (n - i) % mod
    wing += n - i - 1

print(sum(dp[msk+1:]) % mod)

Yet, another code which is basically the same as the former one needs 3743ms:

def main():
    n, mod = map(int, input().split())

    dp = [0] * (n * (n - 1) + 1)
    diff = [0] * (n * (n - 1) + 1)
    msk = n * (n - 1) // 2
    v = 1
    
    wing = 0
    for i in range(n):
        
        for j in range(msk - wing, msk + wing + 1):
            if dp[j]:
                diff[j] += dp[j]
                diff[j + (n - i - 1) + 1] -= dp[j]
        
        cur = 0
        for j in range(msk - wing, msk + wing + n - i):
            cur += diff[j]
            cur %= mod
            dp[j] = cur
            diff[j] = 0
        
        for j in range(msk - wing, msk + wing + n - i):
            if dp[j]:
                diff[j - (n - i - 1)] += dp[j]
                diff[j + 1] -= dp[j]
        
        cur = 0
        for j in range(msk - wing - (n - i - 1), msk + wing + n - i):
            cur += diff[j]
            cur %= mod
            dp[j] = cur
            diff[j] = 0
        
        for j in range(-(n - i - 1), 0):
            dp[j + msk] += v * (n - i + j) % mod
            dp[j + msk] %= mod
        
        v = v * (n - i) % mod
        wing += n - i - 1
    
    print(sum(dp[msk+1:]) % mod)
    return

t = 1
for _ in range(t):
    main()

I'm quite puzzled as these two codes seems nothing too different. I really wonder how this could happen as it could cause some unexpected TLEs in other questions. It would really help a lot if you could offer some insights on it.

Thanks in advance!

  • Vote: I like it
  • +11
  • Vote: I do not like it

»
8 months ago, # |
  Vote: I like it 0 Vote: I do not like it

When I try Python3 the speed seems reversed. There are articles explaining why for CPython codes within a function are faster than global, but pypy might be doing something different to produce the opposite results...