PyPy2 vs PyPy3 - Codeforces

→ Обратите внимание

До соревнования
CodeTON Round 9 (Div. 1 + Div. 2, Rated, Prizes!)
19:29:46
Зарегистрироваться »

*есть доп. регистрация

→ Лидеры (рейтинг)

№	Пользователь	Рейтинг
1	tourist	4009
2	jiangly	3823
3	Benq	3738
4	Radewoosh	3633
5	jqdai0815	3620
6	orzdevinwang	3529
7	ecnerwala	3446
8	Um_nik	3396
9	ksun48	3390
10	gamegame	3386

Страны | Города | Организации

Всё →

→ Лидеры (вклад)

№	Пользователь	Вклад
1	cry	167
2	Um_nik	163
3	maomao90	162
3	atcoder_official	162
5	adamant	159
6	-is-this-fft-	158
7	awoo	157
8	TheScrasse	154
9	Dominater069	153
9	nor	153

Всё →

→ Найти пользователя

→ Прямой эфир

Детальнее →

Блог пользователя notTehlka

PyPy2 vs PyPy3

Автор notTehlka, 3 года назад, По-английски

In yesterday's contest's problem D2, my PyPy3 code runs in 4150ms and the same PyPy2 code runs in 2700ms. Can someone please explain why? And in what cases, PyPy2 is faster than PyPy3?

pajenegod c1729 Kiri8128 conqueror_of_tourist can help maybe.

#python, #doubt

notTehlka
3 года назад
15

Комментарии (14)

Показать архивные | Написать комментарий?

bjy

3 года назад, # |

-18

I mean, 'same' only applies if you treat the io templates as opaque black boxes (like I do by still not using them...?). Most py 3<2 performance issues can be attributed to 3 being unicode aware, whether it's the additional decode/encode calls or default int/str conversions.

A more direct work-on-the-bytes-like-c approach could maybe eek out some extra mileage, but my level of interest/desperation fell short of that (also during the contest I had a kid gnawing on my head telling me about her friend 'Buggy' who was an actual insect she had brought into the house -- so I commented out the input replacement as if it were a sync issue (ie cargo cult prayer) and it stupidly passed pretests, womp womp)

→ Ответить

pajenegod

3 года назад, # ^ |

I do not think the reason between the performance this time around is because of unicode. Here is a submission that almost entirely skips unicode in PyPy3 121736481, working only with bytestrings, and it runs slow.

→ Ответить

pajenegod

3 года назад, # |

← Rev. 2 →

+11

I've tested around a bit, and I found that only using os.write and os.read makes PyPy2 and PyPy3 run equally fast (121737546 and 121737672). So I think the culprit might be that PyPy3 has a slower implementation of BytesIO than PyPy2. But at the same time, it doesn't really make sense that the speed of BytesIO matters here (since the total amount of input is kind of tiny). My best guess is that something is badly optimized in PyPy3's implementation of BytesIO.

→ Ответить

at_f

3 года назад, # ^ |

I ran into this problem before for 1535E - Gold Transfer (which is also interactive).

Using the pyrival template TLE'd on pypy3 but passes on pypy2. Switching to os.write made it barely pass but there was still a large gap between pypy2 and pypy3 (pypy2 was 3s, pypy3 was 4.3s, and time limit was 4.5s). Using your suggestion of replacing os.read too finally made the gap disappear: 118470981 121755890 (note: using a implementation of input that I stole from you from another problem)

I wonder what the actual cause of the bug is and whether the pypy devs can fix it. It kind of sucks that pypy3 needs extra work to achieve the same performance as pypy2 for interactives (but at least there's a known workaround for it now, thanks!).

→ Ответить

pajenegod

3 года назад, # ^ |

In general PyPy should just have an overhaul of everything IO related. One of the big drawbacks of PyPy compared to CPython is that the built in IO is really not good. It is stupid that my fastIO code even speeds things up in the first place.

Here is an example of how PyPy really has unoptimized IO. Haven't you wondered why using input is much slower than readline() in PyPy (even though input should just be a wrapper to readline)? Well this kind of explains it:

Code example

import sys,os
class dummy:
    def flush(self):
        os.write(1, b'Flushing\n')
    def write(self, S):
        os.write(1, b'Writing: ' + str(S.encode('ascii')).encode('ascii') + b'\n')

sys.stdout = dummy()
os.write(1, b'Calling input\n')
input()
os.write(1, b'Done calling input\n')

Output CPython3

Calling input
Flushing
Done calling input
Flushing
Flushing

Output PyPy3

Calling input
Writing: b''
Writing: b''
Flushing
Done calling input
Flushing

→ Ответить

at_f

3 года назад, # ^ |

In your example is the issue the extra empty writes? I wonder if they are actually expensive. I remember looking at the system calls generated by pypy vs cpython and didn't see a significant difference. Command I used:

strace -Tfe trace=open,close,read,write,fsync,file python io-test.py < io-test.in

→ Ответить

notTehlka

3 года назад, # ^ |

To sum it all up, we should use PyRival template for normal problems and this IO for interactive. Am I right?

→ Ответить

at_f

3 года назад, # ^ |

Yea and it seems like you now have the fastest time on 1543D2 with it. That IO should probably be cleaned up and wrapped a little nicer before people copy and paste it all over the place though. (like rename input() to something like getint() since it's semantically very different from input)

Personally I don't like long IO templates so I will stick to using sys.stdin.buffer.readline and os.write unless it seems like TL is going to be a problem.

→ Ответить

FrozenKandy

3 года назад, # |

Hi, this may be out of topic but I am looking for a little help in python and this seems just the right place.

This problem Parsa's humongous tree, is throwing tle, and there is no accepted solution in python but in pypy3. Moreover, all accepted solutions in pypy uses stacks and not recursion (Is recursion slower?). My submission: Using recursion, Using stack.

Please correct me if I am doing something wrong, or is this because of the tight time limit? I would be really grateful for the help.

→ Ответить

Chef_17

3 года назад, # ^ |

I had the same issue during the contest- pypy3 was giving WA (stack limit) , Python 3 gave me TLE/WA- Testcase 4 (tried threading too link) and then finally I implemented the same logic in c++ and got AC (link 10 mins after the contest lol :) ).

Can someone help in writing a bootstrap version of DFS ( bypass stack limit ref) — cuz I didn't had any return value in dfs function call , so what was supposed to be yield ?.

→ Ответить

pajenegod

3 года назад, # ^ |

I've explained about no return value before here.

I tried adding bootstrap to your code 121758780. Unfortunately it gets TLE because the implementation is pretty slow, and also because Cf runs 32 bit PyPy (causing your implementation to use big integers everywhere). Hopefully Cf will be switching to 64 bit in the near future (link).

→ Ответить