Here are some implementations for outputting numbers that I found, with the time each one takes to write the first 10,000,000 numbers (based on Codeforces Custom Test):
Putchar Recursive Implementation: 3525ms
Putchar Non-recursive Implementation: 3525ms
Putchar Reverse Implementation: 3493ms
Printf Implementation: 1762ms
synchronized(off) Implementation: 1356ms
synchronized(true) Implementation: 1060ms
Normal Implementation: 1045ms
fwrite buffer Implementation: 545ms