In my previous post, I discussed a fast Scanner for Scala.
To benchmark it, I wrote the fastest Scanner I could in Java:
package better.files;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

/**
 * Hand built using a char buffer
 */
public class ArrayBufferScanner extends AbstractScanner {
    private char[] buffer = new char[1 << 4];
    private int pos = 0;
    private BufferedReader reader;

    public ArrayBufferScanner(BufferedReader reader) {
        super(reader);
        this.reader = reader;
    }

    @Override
    public boolean hasNext() {
        return pos != -1;
    }

    private void loadBuffer() {
        pos = 0;
        while (true) {
            int i;
            try {
                i = reader.read();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            if (i == -1) {
                pos = -1;
                break;
            }
            char c = (char) i;
            if (c != ' ' && c != '\n' && c != '\t' && c != '\r' && c != '\f') {
                if (pos == buffer.length) {
                    buffer = Arrays.copyOf(buffer, 2 * pos);
                }
                buffer[pos++] = c;
            } else if (pos != 0) {
                break;
            }
        }
    }

    @Override
    public String next() {
        loadBuffer();
        return String.copyValueOf(buffer, 0, pos);
    }

    @Override
    public String nextLine() {
        try {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public int nextInt() {
        loadBuffer();
        final int radix = 10;
        int result = 0;
        for (int i = buffer[0] == '-' || buffer[0] == '+' ? 1 : 0; i < pos; i++) {
            int digit = buffer[i] - '0';
            assert (0 <= digit && digit <= 9);
            result = result * radix + digit;
        }
        return buffer[0] == '-' ? -result : result;
    }
}
Is this the best I can do?
It is barely faster than Java's StreamTokenizer (not StringTokenizer), in spite of being much simpler: http://docs.oracle.com/javase/8/docs/api/java/io/StreamTokenizer.html
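
For readers who have not used StreamTokenizer, here is a minimal sketch of reading whitespace-separated integers with it. This is not the code from my benchmarks; the class name `StreamTokenizerDemo` and the method `sumInts` are made up for illustration. It relies only on the tokenizer's default settings, which parse numbers and treat spaces, tabs, and newlines as whitespace.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of better-files.
final class StreamTokenizerDemo {

    // Sums every numeric token in the input; non-numeric tokens are skipped.
    public static long sumInts(Reader reader) {
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        long sum = 0;
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    // nval is a double, so the cast is exact only for values
                    // that a double can represent (all ints qualify).
                    sum += (long) tokenizer.nval;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumInts(new StringReader("12 -7\n30\t5"))); // prints 40
    }
}
```

Two differences from the hand-built scanner are worth keeping in mind when comparing them: StreamTokenizer reports every number as a double, and it does not treat a leading '+' as part of a number.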
Java source: https://github.com/pathikrit/better-files/blob/master/benchmarks/src/main/java/better/files/ArrayBufferScanner.java
Other Scanners: https://github.com/pathikrit/better-files/blob/master/benchmarks/src/main/scala/better/files/Scanners.scala
Benchmark results: https://github.com/pathikrit/better-files/tree/master/benchmarks