Haskell solution to 702E

Правка en1, от codeonwort, 2016-09-21 23:40:10

Educational Round 15 — Problem 702E (Analysis of Pathes on Functional Graph): http://codeforces.net/problemset/problem/702/E
The official editorial: http://codeforces.net/blog/entry/46324?locale=en

This article is not about another solution, but how to implement the solution that editorial provides with Haskell, efficiently. After reading the editorial I tried to solve it with Haskell, but I got memory limit exceed on test 28: http://codeforces.net/contest/702/submission/19550505

First I wondered if my code is wrong, so I ported my Haskell code to C++ and it was accepted (0.7ms, 138MB): http://codeforces.net/contest/702/submission/20682955

So problem is the nature of Haskell — strong type system and rich expressivity at the cost of excessive memory usage and low performance. Clearly heavy memory usage came from all that state nodes. each state is an array of nodes. There can be at most 10^5 nodes and 34 states (k <= 10^10 and 10^10 is 34 characters in binary representation).

Each node consists of (next node, total weight, minimum weight) — whose type is Int, Int64, and Int64 respectively. In C++, this is (4+8+8 bytes) * 10^5 * 34 = 64.85MB. How many memory is needed for a node in Haskell?

type Node = (Int, Int64, Int64)

Node has a bunch of pointers and wrappers so one Node consumes 10 words. (You can see Haskell types' memory footprints here: https://www.youtube.com/watch?v=_pDUq0nNjhI) A word is 8 bytes on a 64-bit machine, so one Node takes 80 bytes. To hold all states, we need 259MB. This is only for holding states. Other operations need their memory therefore memory limit (512MB) is exceeded.

To save memory I unpacked all fields of Node:

{-# LANGUAGE BangPatterns #-}
data Node = Node {-# UNPACK #-} !Int {-# UNPACK #-} !Int64 {-# UNPACK #-} !Int64

It removes all pointers and wrappers inside a Node. Regarding constructor overhead, now a Node takes 4 words = 32 bytes. Now total states consume 103MB and total memory usage of the program is about 210MB. But it's not over. TLE on test 35 (n = 10^5, k = 10^10) :(

Again what matters is Haskell's powerful feature; lazy evaluation. Our states are memory-heavy and they need a lot of numeric calculations. There is no unnecessary work and we have to utilize all of the evaluated results. This is not a place for lazy evaluation, but for strict evaluation.

To force evaluation of states, I tried evaluate in the Control.Exception module: http://codeforces.net/contest/702/submission/20793048

evaluate :: a -> IO a

But there was no improvement. Stucked here, I was going mad but realized that evaluate only evaluates to weak head normal form. To fully evaluate something we need force in Control.DeepSeq.

To force a Node, Node should be an instance of NFData.

import Control.DeepSeq
instance NFData Node where
    rnf (Node a b c) = rnf a `seq` rnf b `seq` rnf c

Finally it was accepted (1.84s, 242MB): http://codeforces.net/contest/702/submission/20805671

With all my efforts, My Haskell solution is 2.6 times slower than C++ one which I wrote quite easy. The problem's time constraint is 2 seconds and Haskell version took 1.84s. So tough!

This problem made me give up to use Haskell on contests. Maybe I have to use Haskell only for practice :(

История

 
 
 
 
Правки
 
 
  Rev. Язык Кто Когда Δ Комментарий
en1 Английский codeonwort 2016-09-21 23:40:10 3395 Initial revision (published)