For almost a decade now, I've been told to avoid using +=
on strings in Java and Python. Because they are immutable, the +=
creates an entirely new string and has to copy over all the characters from the left-hand side. Thus, putting a +=
in a loop could have your program's complexity degrade into quadratic in some cases if you're not careful.
But then just last week, I learned that CPython actually hands string concatenation smartly! On the Custom Test tab here on Codeforces, the following code runs in just 280ms when submitted to Python 3.8.10.
s = ''
for _ in range(10**6):
s += 'a'
print(len(s))
(In contrast, it badly TLEs when submitting to PyPy 3.7, so I suppose that's one thing Python has over PyPy, but it's not like join
was too hard to do, though, anyway).
Was this a recent change? How long have I been living a lie?
https://docs.python.org/2.4/lib/typesseq.html
i get that this is irrelevant to the blog, but did you know that you can cin character arrays. i have also been living under a lie my whole life.
you can also read from the k-th element of the char array
for example:
will read from s[1] (which kinda makes sense since ig operator<< and operator>> are overloaded for char*)
also, you can do the same with cout as well (printing from the k-th character of a char array)
how so orz sir
You still need to assume it's quadratic, otherwise you're just gonna be badly surprised one day. Python and assessing critical performance is a match made in hell.
PyPy talks about it in its documentation.
On the subject of how brittle this feature can be in CPython. The following code will do a slow string concatination
The reason why CPython can't optimize it is because
+=
forA[0]
is equvivalent toNote that both x and A[0] contains references to the string here, so the concatination will be slow. It is fast iff there is only a single reference to the string. So in order to make it into a fast string concatination, you need to do something like
Since Aug 7, 2004 .