DP with probabilities and cycles: theory behind solution and similar problems?

#	User	Rating
1	tourist	3985
2	jiangly	3814
3	jqdai0815	3682
4	Benq	3529
5	orzdevinwang	3526
6	ksun48	3517
7	Radewoosh	3410
8	hos.lyric	3399
9	ecnerwala	3392
9	Um_nik	3392

#	User	Contrib.
1	cry	169
2	maomao90	162
2	Um_nik	162
4	atcoder_official	161
5	djm03178	158
6	-is-this-fft-	157
7	adamant	155
8	awoo	154
8	Dominater069	154
10	luogu_official	150

I was trying to solve this problem, which requires computing probabilities of winning/losing in a turn-based two-player game. It's not hard to come up with a recurrent solution that pretty much resembles a DP, except for the fact that the states can repeat down the recurrence, producing cycles. Not knowing how to solve it, I resorted to this comment which describes the expected solution. In short, the idea is to have two arrays lo[x][y][z] and hi[x][y][z] that will maintain the lower and upper bounds respectively of the actual probability dp[x][y][z], and basically by initializing lo[x][y][z] to 0, hi[x][y][z] to 1.0 and then updating the arrays in function of each other in a smart way upon multiple iterations, the values "magically" converge to the actual probabilities with enough accuracy and fast enough given the time constraint. I just confirmed this approach works by getting accepted in the problem. However, I still don't understand why it works. So these are my questions:

Why does this algorithm work? Is there a broader theory that explains it or is this just an adhoc solution?
Why does this converge fast enough?
Are there similar problems out there to practice this approach?

Thank you very much.