lvisbl_'s blog

By lvisbl_, history, 7 months ago, In English

Hello mates, I'm trying to solve this DP question Coin Combinations II from CSES: https://cses.fi/problemset/task/1636/

Initially, I define an array dp[i][j]: meaning number of ways to form sum i, starting choosing coins from j-th index. And this solution using this DP array get me TLE.

But if I switch the two dimension, array dp[i][j]: number of ways to form sum j, starting choosing coins from i-th index. This solution give me accepted. But why?

Note:

  • Two solution is 99% the same the only different is that I switch two dimensions.

  • Sum i at most 10^6 and there are at most 10^2 coins.

Thank in advance.

Accepted code
TLE code
  • Vote: I like it
  • +3
  • Vote: I do not like it

»
7 months ago, # |
  Vote: I like it 0 Vote: I do not like it

Auto comment: topic has been updated by lvisbl_ (previous revision, new revision, compare).

»
7 months ago, # |
Rev. 3   Vote: I like it +1 Vote: I do not like it

In C/C++ (see here), a 2D array (or vector) will be stored in a row major format, i.e., an array arr[N][M] will be stored as N collections of M sized contiguous memory blocks. This might not be the case for other languages like Java, see here.

Now, when your program is run by the processor, and you access arr[i][j], the cache (which holds copies of the main memory for fast access by the processor) loads some contiguous blocks of memory due to their spatial locality, and thus, arr[i][j+1], arr[i][j+2], ... will be accessed much faster than say arr[i+1][j] or arr[i+2][j]. If the array size is large enough it can cause a significant difference in the efficiency of your code.

Hope this answers your query!!

  • »
    »
    7 months ago, # ^ |
      Vote: I like it +3 Vote: I do not like it

    Wow! Thats cool.

  • »
    »
    7 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Would using a 2D vector help?

    • »
      »
      »
      7 months ago, # ^ |
      Rev. 2   Vote: I like it 0 Vote: I do not like it

      No, even for a dynamically allocated vector, each row will be stored in a contiguous fashion, see here, so accessing arr[i][j+1] is still faster that arr[i+1][j], if arr[i][j] has been accessed.

  • »
    »
    7 months ago, # ^ |
      Vote: I like it 0 Vote: I do not like it

    Thank you so much for clear explanation, this is what I'm looking for.