Fastest Scanner in Java

Revision en2, by wrick, 2015-10-25 10:08:36

In my previous post, I discussed a fast Scanner for Scala.

To benchmark it, I wrote the fastest possible Scanner I could in Java:

package better.files;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/**
 * Hand built using a char buffer
 */
public class ArrayBufferScanner extends AbstractScanner {
  private char[] buffer = new char[1 << 4];
  private int pos = 0;

  private BufferedReader reader;

  public ArrayBufferScanner(BufferedReader reader) {
    super(reader);
    this.reader = reader;
  }

  @Override
  public boolean hasNext() {
    return pos != -1;
  }

  private void loadBuffer() {
    pos = 0;
    while (true) {
      int i;
      try {
        i = reader.read();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
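      if (i == -1) {
        // End of input: keep a token that was already read (input may lack a
        // trailing newline); only signal exhaustion when nothing was read.
        if (pos == 0) {
          pos = -1;
        }
        break;
      }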
      char c = (char) i;
      if (c != ' ' && c != '\n' && c != '\t' && c != '\r' && c != '\f') {
        if (pos == buffer.length) {
          buffer = Arrays.copyOf(buffer, 2 * pos);
        }
        buffer[pos++] = c;
      } else if (pos != 0) {
        break;
      }
    }
  }

  @Override
  public String next() {
    loadBuffer();
    return String.copyValueOf(buffer, 0, pos);
  }

  @Override
  public String nextLine() {
    try {
      return reader.readLine();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  @Override
  public int nextInt() {
    loadBuffer();
    final int radix = 10;
    int result = 0;
    for (int i = buffer[0] == '-' || buffer[0] == '+' ? 1 : 0; i < pos; i++) {
      int digit = buffer[i] - '0';
      assert (0 <= digit && digit <= 9);
      result = result * radix + digit;
    }
    return buffer[0] == '-' ? -result : result;
  }
}
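
To try the token-splitting loop without the rest of the benchmark project, here is a minimal standalone sketch of the same idea. The class name `TokenReaderDemo` and its methods are hypothetical and not part of better-files; it does not extend `AbstractScanner`, and it assumes a separate `length` field and `eof` flag rather than reusing `pos` as an end-of-input marker.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical standalone variant of the char-buffer tokenizer, for experimentation only.
public class TokenReaderDemo {
  private final Reader reader;
  private char[] buffer = new char[1 << 4];
  private int length = 0;       // number of chars in the current token
  private boolean eof = false;  // set once read() has returned -1

  public TokenReaderDemo(Reader reader) {
    this.reader = reader;
  }

  // Reads the next whitespace-delimited token into buffer; returns false if none is left.
  private boolean loadToken() {
    length = 0;
    while (!eof) {
      int i;
      try {
        i = reader.read();
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      if (i == -1) {
        eof = true;
        break;
      }
      char c = (char) i;
      if (c != ' ' && c != '\n' && c != '\t' && c != '\r' && c != '\f') {
        if (length == buffer.length) {
          buffer = Arrays.copyOf(buffer, 2 * length); // grow by doubling
        }
        buffer[length++] = c;
      } else if (length != 0) {
        break; // whitespace after a token ends it; leading whitespace is skipped
      }
    }
    return length > 0;
  }

  // Returns the next token, or null when the input is exhausted.
  public String next() {
    return loadToken() ? new String(buffer, 0, length) : null;
  }

  public static void main(String[] args) {
    TokenReaderDemo tokens = new TokenReaderDemo(new StringReader("  12 abc\n-7"));
    for (String t = tokens.next(); t != null; t = tokens.next()) {
      System.out.println(t);
    }
  }
}
```

Passing a `BufferedReader` matters in practice, because each `read()` call here fetches a single char and relies on the reader's own buffering.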

Is this the best I can do?

It is barely faster than Java's StreamTokenizer (not StringTokenizer), in spite of being much simpler: http://docs.oracle.com/javase/8/docs/api/java/io/StreamTokenizer.html
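
For readers who have not used it, this is roughly what reading integers with StreamTokenizer looks like. It is a minimal sketch, not the benchmark code from the repository; the class name `StreamTokenizerDemo` and the helper `sumInts` are made up for illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;

// Hypothetical demo class; only StreamTokenizer itself is a real JDK API.
public class StreamTokenizerDemo {
  // Sums all whitespace-separated integers readable from the given reader.
  public static long sumInts(BufferedReader reader) throws IOException {
    StreamTokenizer tokenizer = new StreamTokenizer(reader);
    long sum = 0;
    while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
      // Numbers are parsed by default and exposed as a double in nval.
      if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
        sum += (long) tokenizer.nval;
      }
    }
    return sum;
  }

  public static void main(String[] args) throws IOException {
    BufferedReader reader = new BufferedReader(new StringReader("12 -7\n30\t5"));
    System.out.println(sumInts(reader)); // prints 40
  }
}
```

Because `nval` is a double, this approach is exact for int-sized values but can lose precision for longs beyond 2^53.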

Java source: https://github.com/pathikrit/better-files/blob/master/benchmarks/src/main/java/better/files/ArrayBufferScanner.java

Other Scanners: https://github.com/pathikrit/better-files/blob/master/benchmarks/src/main/scala/better/files/Scanners.scala

Benchmark results: https://github.com/pathikrit/better-files/tree/master/benchmarks

Tags java, scala, scanner, input reading
