Blog entries - Codeforces

Writing efficient code on Codeforces matters for getting a solution Accepted. We will compare the runtime of the same program written two ways: first with the simple standard library tools, and then with a BufferedReader, a StringTokenizer, and long variables in place of int. Then we will look at why the buffered version is faster.

Why does Scanner take longer than BufferedReader and DataInputStream? Scanner does keep an internal buffer, but it starts small (1024 characters), and every token is matched and parsed through regular expressions. Each call to a method like nextInt() therefore carries significant overhead, and that overhead is repeated for every token in the input, which makes the process inefficient. In contrast, BufferedReader does very little work per read. Buffered input works by reading large chunks of data from the input stream into a buffer in one go (8192 characters by default for BufferedReader). The program then reads from this buffer, which greatly reduces the number of times it has to touch the slow input stream. DataInputStream does not buffer by itself, but it is commonly paired with a manually managed byte array that plays the same role. When I manage the buffer directly, I can also parse the bytes in place and avoid creating intermediate objects, which general-purpose standard library classes are not always tuned to do for a particular use case.
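To make the comparison concrete, here is a minimal sketch that feeds the same in-memory input to Scanner and to BufferedReader with StringTokenizer and times both. The class name `InputComparison`, the method names, and the generated test data are assumptions made for illustration; they are not taken from the original measurements, and the timings will vary by machine and JVM.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;
import java.util.StringTokenizer;

class InputComparison {
    // Sums every integer token using Scanner (regex-based token parsing).
    static long sumWithScanner(InputStream in) {
        Scanner scanner = new Scanner(in);
        long sum = 0;
        while (scanner.hasNextLong()) {
            sum += scanner.nextLong();
        }
        return sum;
    }

    // Sums every integer token using BufferedReader + StringTokenizer.
    static long sumWithBufferedReader(InputStream in) {
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        long sum = 0;
        try {
            String line;
            while ((line = reader.readLine()) != null) {
                StringTokenizer tokenizer = new StringTokenizer(line);
                while (tokenizer.hasMoreTokens()) {
                    sum += Long.parseLong(tokenizer.nextToken());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical test input: 200,000 numbers, ten per line.
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= 200_000; i++) {
            sb.append(i).append(i % 10 == 0 ? '\n' : ' ');
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.US_ASCII);

        long t0 = System.nanoTime();
        long a = sumWithScanner(new ByteArrayInputStream(data));
        long t1 = System.nanoTime();
        long b = sumWithBufferedReader(new ByteArrayInputStream(data));
        long t2 = System.nanoTime();

        System.out.println("Scanner:        sum=" + a + " time=" + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("BufferedReader: sum=" + b + " time=" + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Both methods must report the same sum; only the elapsed time should differ. A single run like this is a rough indication rather than a careful benchmark, since JIT warm-up affects whichever method runs first.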

This method minimizes the bottleneck caused by input stream read operations, making the overall reading process much faster. I have written two sample programs, one without a buffer and tokenizer and one with them. Check the equivalent code, and then I will attach a screenshot of their runtime.

The Java code below iterates over an input array and counts the frequency of each element. For this, it uses a HashMap whose keys are the distinct elements of the input array and whose values are the frequencies of those elements. The spoiler below is the version of the code without any buffer, tokenizer, or long variable.

testWithoutBuffer

import java.util.*;

public class Main {
    public static void main(String[] args) {
        // Baseline version: Scanner for input, no StringTokenizer, int values.
        Scanner scanner = new Scanner(System.in);
        // Read every integer available on standard input.
        List<Integer> values = new ArrayList<>();
        while (scanner.hasNextInt()) {
            values.add(scanner.nextInt());
        }
        int n = values.size();
        int[] input = new int[n];
        for (int i = 0; i < n; i++) {
            input[i] = values.get(i);
        }
        printFrequency(input);
    }

    static void printFrequency(int[] arr) {
        // Key: distinct element, value: number of occurrences.
        HashMap<Integer, Integer> freq = new HashMap<>();
        for (int num : arr) {
            freq.put(num, freq.getOrDefault(num, 0) + 1);
        }
        for (Map.Entry<Integer, Integer> entry : freq.entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}

testWithBuffer

import java.io.*;
import java.util.*;

public class Main {
    public static void main(String[] args) throws IOException {
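        // Buffered version: BufferedReader pulls the input in large chunks.
        BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in));
        String inputLine = bufferedReader.readLine();
        // readLine() returns null on empty input; treat that as an empty line.
        StringTokenizer tokenizer = new StringTokenizer(inputLine == null ? "" : inputLine);
        int n = tokenizer.countTokens();
        // long holds values beyond the int range without overflow.
        long[] input = new long[n];
        for (int i = 0; i < n; i++) {
            input[i] = Long.parseLong(tokenizer.nextToken());
        }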
        printFrequency(input);
    }

    static void printFrequency(long[] arr) {
        HashMap<Long, Integer> freq = new HashMap<>();
        for (long num : arr) {
            freq.put(num, freq.getOrDefault(num, 0) + 1);
        }
        for (Map.Entry<Long, Integer> entry : freq.entrySet()) {
            System.out.println(entry.getKey() + " " + entry.getValue());
        }
    }
}

The time taken by both programs at run time is shown below.

Result for the case where the buffer was not used.

Result for the case where the buffer was used.

For I/O, buffer management reduces overhead. With a buffer, I/O operations can be batched, which cuts the time spent on costly I/O system calls. Higher-level abstractions in the standard library, such as Scanner's regex-based token parsing, carry a performance penalty. Reading tokens directly with a StringTokenizer and parsing them into a fixed type like long avoids most of that penalty.
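The same batching idea applies to output. As a rough sketch, the frequency report can be assembled in a StringBuilder and written with a single call instead of one println per entry. The class name `BatchedOutput` and the method `formatFrequencies` are hypothetical, and LinkedHashMap is used here (instead of the HashMap in the sample programs) only so that the output order is predictable.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BatchedOutput {
    // Builds the whole "value count" report in memory so it can be written in one call.
    static String formatFrequencies(long[] arr) {
        // LinkedHashMap keeps first-seen order; this is a choice made for the example.
        Map<Long, Integer> freq = new LinkedHashMap<>();
        for (long num : arr) {
            freq.merge(num, 1, Integer::sum);
        }
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Long, Integer> entry : freq.entrySet()) {
            out.append(entry.getKey()).append(' ').append(entry.getValue()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        long[] sample = {5, 3, 5, 7, 3, 5};
        // One write for the whole report instead of one println per entry.
        System.out.print(formatFrequencies(sample));
    }
}
```

For the sample array this prints three lines: `5 3`, `3 2`, and `7 1`.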

For heavier workloads, a buffer significantly reduces the running time. The reduction can reach 3x to 4x. The image below shows the time taken by the two versions on a more demanding competitive programming problem.
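For inputs of that size, a hand-managed byte buffer is a common next step. The sketch below wraps the source in a DataInputStream, refills a byte array in large chunks, and parses integers straight from the bytes. The class name `ByteBufferReader`, the 64 KiB buffer size, and the assumption that the input contains only well-formed whitespace-separated integers are all illustrative choices, not part of the original programs.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ByteBufferReader {
    private static final int BUFFER_SIZE = 1 << 16; // assumed size: 64 KiB
    private final DataInputStream in;
    private final byte[] buffer = new byte[BUFFER_SIZE];
    private int pos = 0; // next unread index in buffer
    private int len = 0; // number of valid bytes in buffer

    ByteBufferReader(InputStream source) {
        this.in = new DataInputStream(source);
    }

    // Returns the next byte as 0..255, or -1 at end of input; refills the buffer when empty.
    private int read() {
        if (pos == len) {
            try {
                len = in.read(buffer, 0, BUFFER_SIZE);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            pos = 0;
            if (len <= 0) {
                len = 0;
                return -1;
            }
        }
        return buffer[pos++] & 0xFF;
    }

    // Parses the next whitespace-separated signed integer.
    // Assumes well-formed input; throws if no token is left.
    long nextLong() {
        int c = read();
        while (c != -1 && c <= ' ') {
            c = read();
        }
        if (c == -1) {
            throw new IllegalStateException("no more tokens");
        }
        boolean negative = (c == '-');
        if (negative) {
            c = read();
        }
        long value = 0;
        while (c >= '0' && c <= '9') {
            value = value * 10 + (c - '0');
            c = read();
        }
        return negative ? -value : value;
    }

    public static void main(String[] args) {
        // Hypothetical input: a count followed by that many numbers.
        byte[] data = "3\n10 -20 30\n".getBytes(StandardCharsets.US_ASCII);
        ByteBufferReader reader = new ByteBufferReader(new ByteArrayInputStream(data));
        int n = (int) reader.nextLong();
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += reader.nextLong();
        }
        System.out.println(sum); // prints 20
    }
}
```

In a real submission the reader would be constructed over `System.in`. The trade-off is that this reader handles only integers and does no validation, which is the price of skipping the general-purpose parsing.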

Is Using a Buffer Always Optimal and More Comprehensive?

Buffers consume memory to store the data read from the input stream. This additional memory usage might be a concern for applications with stringent memory constraints. Buffering can also introduce latency for applications that frequently read small amounts of data or need to process input immediately (like real-time systems). In such cases, unbuffered reads might be more responsive.
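The memory cost is adjustable, because BufferedReader accepts a buffer size as a second constructor argument. The sketch below reads the same text with a caller-chosen buffer size; the class name `BufferSizeDemo`, the method `countLines`, and the sizes used in `main` are assumptions for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class BufferSizeDemo {
    // Counts lines in text using a BufferedReader with the given buffer size (in chars).
    static int countLines(String text, int bufferChars) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text), bufferChars)) {
            int lines = 0;
            while (reader.readLine() != null) {
                lines++;
            }
            return lines;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "first line\nsecond line that is longer than sixteen characters\nthird";
        // A tiny buffer saves memory but needs more refills from the underlying reader.
        System.out.println(countLines(text, 16));
        // A large buffer uses more memory but refills far less often on big inputs.
        System.out.println(countLines(text, 1 << 16));
    }
}
```

Both calls return the same line count; only the memory held by the reader and the number of refills differ.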

While buffered readers like BufferedReader in Java are often the faster and more complete option, they are not a one-size-fits-all solution. The decision to use buffered input should be based on the application's specific requirements, considering factors like data size, memory constraints, latency requirements, and complexity.

References

To understand the input cases between these.

Implementing More Efficient Code in Java Using Buffer, Token, and Long


Flick__'s blog
