I've listed my findings from experimenting with Python below; I hope they are useful to others who use Python too.
TL;DR: Links for the master template and algorithms.
Python Version
Based on a prior blog post on Codeforces, PyPy 2 is quite conclusively the fastest version of Python available on Codeforces.
I have not yet run into a case where my code passed in another version of Python but could not pass in PyPy 2.
Input & Output
Though there are faster ways to read input in general, the fastest way to safely redefine input and print is to buffer the entire input and output. Be careful when using the following code in problems with small memory limits, since the whole input and output are held in memory.
from __future__ import division, print_function

import sys
from atexit import register

if sys.version_info[0] < 3:
    from io import BytesIO as stream
else:
    from io import StringIO as stream

# Read all of stdin into an in-memory buffer; input() then reads lines from it.
sys.stdin = stream(sys.stdin.read())
input = lambda: sys.stdin.readline().rstrip('\r\n')

# Collect all output in memory and write it to the real stdout once, at exit.
sys.stdout = stream()
register(lambda: sys.__stdout__.write(sys.stdout.getvalue()))
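With the block above in place, a solution can keep using input() and print() as usual. Here is a small hypothetical sketch that reads an integer n, then n space-separated integers, and prints their sum:

n = int(input())
a = [int(x) for x in input().split()]
print(sum(a))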
Fast Imports
Not all imports take an equal amount of time and memory at startup. The following is a partial list of imports that I have found not to take significant time or memory at startup in PyPy 2 on Codeforces; the commented-out lines are the costly imports.
from __future__ import division, print_function
import cmath
import itertools
import math
import operator as op
# import random
import sys
from atexit import register
from bisect import bisect_left, bisect_right
# from collections import Counter, MutableSequence, defaultdict, deque
# from copy import deepcopy
# from decimal import Decimal
# from difflib import SequenceMatcher
# from fractions import Fraction
# from heapq import heappop, heappush
if sys.version_info[0] < 3:
    # from cPickle import dumps
    from io import BytesIO as stream
    # from Queue import PriorityQueue, Queue
else:
    # from functools import reduce
    from io import StringIO as stream
    from math import gcd
    # from pickle import dumps
    # from queue import PriorityQueue, Queue
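If you want to gauge the cost of an import yourself, one rough way (a sketch, not part of the template) is to time the import in a fresh process; the absolute numbers will vary between machines and between CPython and PyPy:

import time

start = time.time()
import random  # first import in this process, so the full module-loading cost is measured
print('import random: %.3f s' % (time.time() - start))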
Python 3 Compatibility
In Python 3, the dictionary view methods (items, keys, values) as well as range, filter, map, and zip return lazy views or iterators instead of the lists they produce in Python 2, and thus use less memory.
The Python 3 behaviour can be emulated in Python 2 with the code below.
if sys.version_info[0] < 3:
    class dict(dict):
        """dict() -> new empty dictionary"""

        def items(self):
            """D.items() -> a set-like object providing a view on D's items"""
            return dict.iteritems(self)

        def keys(self):
            """D.keys() -> a set-like object providing a view on D's keys"""
            return dict.iterkeys(self)

        def values(self):
            """D.values() -> an object providing a view on D's values"""
            return dict.itervalues(self)

    def gcd(x, y):
        """gcd(x, y) -> int
        greatest common divisor of x and y
        """
        while y:
            x, y = y, x % y
        return x

    input = raw_input
    range = xrange
    filter = itertools.ifilter
    map = itertools.imap
    zip = itertools.izip
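As a quick sanity check, assuming the compatibility block above (and the __future__ print_function import from the template) has run, the following hypothetical snippet behaves the same under Python 2 and Python 3 without building large lists. Note that the shim only affects dictionaries created through dict(...); literals like {} still use the built-in type:

squares = dict((i, i * i) for i in range(5))
for k, v in squares.items():  # lazy view/iterator in both versions
    print(k, v)

big = range(10 ** 9)   # xrange in Python 2, a range object in Python 3: O(1) memory
print(big[123456])     # both support indexing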
Recursion Limit
In CPython, we can use the threading module to increase the stack size and allow deep recursion; in PyPy this is not possible.
However, PyPy's docs describe how to work around the limit with continulets. I've added some code below that should work for both.
def main():
    pass


if __name__ == '__main__':
    if 'PyPy' in sys.version:
        # PyPy cannot grow its stack via threading; use continulets instead,
        # as described in the PyPy docs on application-level stackless features.
        from _continuation import continulet

        def bootstrap(c):
            callable, arg = c.switch()
            while True:
                to = continulet(lambda _, f, x: f(x), callable, arg)
                callable, arg = c.switch(to=to)

        c = continulet(bootstrap)
        c.switch()

        main()
    else:
        # CPython: raise the recursion limit and run main() in a thread
        # with a 128 MiB stack.
        import threading

        sys.setrecursionlimit(2097152)
        threading.stack_size(134217728)

        main_thread = threading.Thread(target=main)
        main_thread.start()
        main_thread.join()
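On the PyPy side, the trampoline set up above is used by having a recursive function hand control to c.switch every couple of hundred calls, following the example in the PyPy docs. A minimal sketch, assuming the PyPy branch above has already run so that c is in scope:

def recursive(n):
    if n == 0:
        return 0
    if n % 200 == 0:
        # Continue the next ~200 frames on a fresh continulet so the stack stays shallow.
        prev = c.switch((recursive, n - 1))
    else:
        prev = recursive(n - 1)
    return prev + 1

print(recursive(10 ** 6))  # prints 1000000, far past the normal recursion limit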
Credits
Special thanks to pajenegod and Mukundan314 for testing, debugging, and techniques.
Please feel free to message/mail me with your findings too!