1B - Spreadsheet was extremely challenging for me; it took me over 2 hours to get AC. I didn't find the editorial helpful, and then when I went to see practice solutions from red/orange people, it made even less sense (they were basically copy/paste with minor variation):
I'll talk about each part of the problem itself and approaches/alternatives, and then explain how the above solution (henceforth "the solution") handles it. I'll go into unnecessary detail since I'm very new to CP/CF and would've wanted this level of detail myself.
Problem statement
Input: First line is an int representing the number of test cases. Each following line is a string of characters specifying a spreadsheet cell, in one of two formats:
- "Excel":
^[A-Z]+[0-9]+\$
(e.g.BB352
). - "RC":
^R[0-9]+C[0-9]+\$
(e.g.R253C55
).
Output: For test case, we must output the same cell specified in the opposite format. Example:
2
R23C55
BC23
should output
BC23
R23C55
Constraints:
- Between 1 and 10000 test cases (all valid in either format)
- In each case, the row and column numbers are no larger than 1000000 (so each line is 16 chars at longest since we never exceed
R1000000C1000000
/BDWGN1000000
).
Approach
Our implementation needs to do 3 things:
- Convert from Excel to RC
- Convert from RC to Excel
- Determine which format the input is in, and do the right conversion on it
The heart of this problem is a change of base; RC-format uses two base-10 numbers, whereas Excel uses one for rows and a quasi-base-26 character string for columns.
Determine format
Note that for RC format, we will always find a C
after some numbers in the string, whereas with Excel we will never find any letters after numbers. We could use this as a test of format with something like:
The solution does this in a better way:
The string %*c%d%*c%d
will match on the RC format, but not Excel; %*c
matches on and throws away any non-numeric characters, whereas %d
matches on numeric characters and stores them in the respective variable. For an RC string, R
and C
are thrown away but the numbers following them are stored in x
and y
(so (sscanf()
will return 2
). On an Excel string, this won't match so nothing is stored in x
or y
(and sscanf()
will return 0
).
Excel to RC conversion
This is the easier conversion, conceptually. We can copy the row portion and print it as-is. For the column portion, we have to convert from base-26 to base-10. A base-10 number represents addition of successive powers of 10, each times a constant: 12345
in base-10 means 5*(10^0) + 4*(10^1) + 3*(10^2) + 2*(10^3) + 1*(10^4)
. So for the column string, we can map each letter to a number (i.e. A=1
, B=2
, ... , Z=26
) and then associate each with a power of 26
. Consider RZA
: A=1
, Z=26
, and R=18
, so we can write it as 1*(26^0) + 26*(26^1) + 18*(26^2) = 12845
.
We have a problem: C++ has no built-in for integer exponentiation. There are a couple workarounds. We could try writing one ourselves:
```