Monday 6 February 2017

Do not use Scanner

The Scanner class is often used in programming tutorials and beginner code, usually to read user input from System.in or from a text file. But it is a bad class and should be avoided. Here I try to explain why.

The main problem is that it violates the single responsibility principle. The Scanner class does all of the following things:
  • reading data from a file or an InputStream
  • interpreting this data using a charset
  • detecting line endings
  • using complex regular expressions to (optionally) parse the data into Integers, Doubles, BigIntegers and so on.
Ideally one class should do one thing and do it well. Splitting a problem into easily manageable subproblems is a key aspect of programming. The Scanner class fails at this by trying to do all these things at once.

Because of that, beginners (and experts) are often confused by the Scanner class, and rightfully so. For example, nextInt() skips line endings only when they come before the number it is looking for and leaves the line ending after the number unread, while nextLine() always consumes everything up to and including the next line ending. So if you read mixed data, for example an Integer and then a String from the console, you easily run into problems: calling nextLine() after nextInt() does not return the next line but only the unconsumed remainder of the previous one, which is usually an empty String. This is explained here in detail.
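The problem can be reproduced with a few lines. The following is a minimal sketch; the class name ScannerPitfall and the sample input are made up for illustration, and a String is used as the source instead of System.in so that the result is repeatable.

```java
import java.util.Scanner;

final class ScannerPitfall {
    // Reads an int and then a line, the way many tutorials do it.
    static String[] readAgeAndName(String input) {
        Scanner scanner = new Scanner(input);
        int age = scanner.nextInt();       // consumes "42" but leaves the line ending unread
        String rest = scanner.nextLine();  // returns only the (empty) remainder of the first line
        String name = scanner.nextLine();  // only now returns the second line
        return new String[] { String.valueOf(age), rest, name };
    }

    public static void main(String[] args) {
        String[] result = readAgeAndName("42\nAlice\n");
        // prints: age=42 rest=[] name=Alice
        System.out.println("age=" + result[0] + " rest=[" + result[1] + "] name=" + result[2]);
    }
}
```

The first nextLine() call is the one that surprises people: it returns an empty String, not "Alice".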

Additionally, the Scanner class has really weird exception handling: it silently swallows any IOException thrown by the underlying source and treats it like the end of the input. The exception is only available afterwards through the ioException() method, which makes it easy to miss errors and accidentally process incomplete input.
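To see this behaviour, here is a small sketch. FailingReader is a hypothetical helper written only for this demonstration: it delivers some text once and then throws an IOException, roughly like a disk or network error in the middle of a file would.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Scanner;

final class SwallowedException {
    // Hypothetical reader: hands out its data on the first read, fails on the second.
    static final class FailingReader extends Reader {
        private final String data;
        private boolean delivered = false;

        FailingReader(String data) {
            this.data = data;
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException {
            if (delivered) {
                throw new IOException("simulated read error");
            }
            delivered = true;
            // Sketch only: anything longer than len is cut off.
            int n = Math.min(len, data.length());
            data.getChars(0, n, cbuf, off);
            return n;
        }

        @Override
        public void close() {
            // nothing to release
        }
    }

    // Typical tutorial loop: no sign of an error anywhere.
    static int sumInts(Scanner scanner) {
        int sum = 0;
        while (scanner.hasNextInt()) {
            sum += scanner.nextInt();
        }
        return sum;
    }

    public static void main(String[] args) {
        Scanner scanner = new Scanner(new FailingReader("1\n2\n"));
        System.out.println("sum = " + sumInts(scanner));
        // The failure is only visible if you explicitly ask for it.
        System.out.println("hidden exception = " + scanner.ioException());
    }
}
```

The loop ends normally and reports a sum, although the source failed before the end of the data was reached.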

What should you do instead? As usual, divide a problem into easily manageable subproblems.

Use a BufferedReader to read lines from an input source. The input source could be, for example, an InputStreamReader reading data from an InputStream in a specific encoding. If reading from the console, remember that System.in is also an InputStream. Or simply use Files.readAllLines(...) when reading from a file.
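As a rough sketch, reading all lines from any InputStream could look like this. The class and method names are my own, and UTF-8 is only an assumption; use whatever encoding your data really has.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

final class LineReading {
    // Reads every line from the stream using the given charset.
    // An IOException is passed on to the caller instead of being hidden.
    // Note: closing the reader also closes the stream that was passed in.
    static List<String> readLines(InputStream in, Charset charset) throws IOException {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(in, charset))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        return lines;
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for a file or System.in to keep the demo repeatable.
        byte[] bytes = "first\nsecond\r\nthird".getBytes(StandardCharsets.UTF_8);
        System.out.println(readLines(new ByteArrayInputStream(bytes), StandardCharsets.UTF_8));
    }
}
```

For a file, Files.readAllLines(path, StandardCharsets.UTF_8) gives you the same List<String> in one call.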

Then, once you have the line, split it into multiple elements if you need to. You could use String.split(), but it treats the separator as a regular expression and silently drops trailing empty Strings, so the Splitter class from Google Guava is usually the better choice. If you only need a single value per line, you need neither.
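The quirks of String.split() are easy to demonstrate with the JDK alone. This is only a sketch: the fields() helper is my own invention and shows one possible way to get predictable results without Guava.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class SplitDemo {
    // Splits on a literal separator and keeps empty fields at the end.
    static String[] fields(String line, String separator) {
        return line.split(Pattern.quote(separator), -1);
    }

    public static void main(String[] args) {
        // Trailing empty fields are dropped by default.
        System.out.println(Arrays.toString("a,b,,".split(",")));     // [a, b]
        // A negative limit keeps them.
        System.out.println(Arrays.toString("a,b,,".split(",", -1))); // [a, b, , ]
        // The separator is a regex: "." matches every character, so nothing is left.
        System.out.println(Arrays.toString("1.2.3".split(".")));     // []
        // Quoting the separator gives the expected result.
        System.out.println(Arrays.toString(fields("1.2.3", ".")));   // [1, 2, 3]
    }
}
```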

And then use for example Integer.parseInt() to convert the String into whatever data type you need. Or if you need a String, simply use it.
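A possible sketch of this last step, assuming the values on a line are separated by whitespace. The class and method names are hypothetical, and how a bad value is reported is your own decision; here it is rethrown with the offending token and line in the message.

```java
import java.util.ArrayList;
import java.util.List;

final class ParseDemo {
    // Parses all whitespace-separated integers from one line.
    static List<Integer> parseInts(String line) {
        List<Integer> values = new ArrayList<>();
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return values; // an empty line contains no values
        }
        for (String token : trimmed.split("\\s+")) {
            try {
                values.add(Integer.parseInt(token));
            } catch (NumberFormatException e) {
                // You decide what happens with bad input; here we add context and rethrow.
                throw new IllegalArgumentException(
                        "not an integer: '" + token + "' in line: " + line, e);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        System.out.println(parseInts("10 20  30")); // [10, 20, 30]
    }
}
```

Unlike with a Scanner, a malformed value does not quietly end the loop; you get an exception that says exactly which token was wrong.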

In the end, this sounds more complicated than using a Scanner, and you may need a little more code. But as a programmer you need to know anyway how to read a file, how to split data and how to convert a String into a number, so that should be easy. You do not need to know how to use a Scanner (except to understand stupid programming tutorials). And by doing the splitting and parsing yourself, you have much more control over the process and over the handling of possible errors.
