Recently I gave a second thought on the solution of 98E and got confused about some technical detail. It occurs to me that there may be something wrong with the problem.
In the editorial, it claims that the expected utility for "bluffing and move on" should be 1-P(n,m-1) and the reason was "In the same manner we fill other cells of the matrix.". Obviously not a compelling reason.
I guess the writer's proof was something like this: Since the opponent chooses to move on. It must lose when I'm not bluffing. So it chooses to behave optimally in the case that I am bluffing which is equivalent to the case that I showed and discarded this card.
The problem is that they are actually not equivalent because I did not discard that card. Since f[n][m] may not coincide with 1 — f[m][n], I can bluff about this card even though the opponent knows I'm bluffing. I waste a step and change the state to 1 — f[m][n] instead, and I may actually benefit from it. The cards known by the opponent but not discarded matters!
So now the claim made in the editorial seems quite suspicious. I'm looking forward to a clear and intuitive explanation. Any idea ?