1) StringTokenizer is a legacy class that is kept only for compatibility. Prefer split(), because it is the one more likely to receive performance improvements, as happened in Java 7.
2) StringTokenizer doesn't support regular expressions, while split() does. You need to be careful, though: when split() is given a regular expression, it generally creates a new Pattern object and compiles the expression on every call. If you are applying the same pattern to many different inputs, consider compiling it once with Pattern.compile() and calling Pattern.split() instead, because compiling a pattern takes more time than checking whether a given string matches it.
3) String.split() returns an array (String[]), whereas StringTokenizer returns one token at a time. The array makes it easy to use a for-each loop:
for (String token : input.split("\\s+")) { ... }
4) StringTokenizer doesn't handle empty tokens well, but split() does. Suppose you need to parse a comma-separated line that contains empty fields, such as:
one,,three,,,six
Here the field values are "one", "", "three", "", "" and "six", where the three empty strings are indicated by the commas with nothing between them. Getting those values out of a StringTokenizer is a lot more work.
By default, it gives you just "one", "three", "six" and skips the empties. You can use the constructor that takes a boolean to tell the StringTokenizer to return the delimiters as tokens, but that gets complicated too, so I'll skip the details. It's much easier to use split(","), which immediately returns {"one", "", "three", "", "", "six"}, which is exactly what you want.
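To make the difference in point 4 concrete, here is a minimal sketch that runs both approaches on the same line. The class name EmptyTokenDemo and the helper tokenize() are made up for this illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, named only for this illustration.
class EmptyTokenDemo {

    // Collects every token that StringTokenizer yields for the given delimiters.
    static List<String> tokenize(String line, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String line = "one,,three,,,six";

        // StringTokenizer skips the empty fields between consecutive commas.
        System.out.println(tokenize(line, ","));              // [one, three, six]

        // split() keeps the empty fields between consecutive commas.
        System.out.println(Arrays.toString(line.split(","))); // [one, , three, , , six]
    }
}
```

One thing to keep in mind: split(",") discards trailing empty strings, so a line that ends in commas loses those last fields unless you pass a negative limit, as in split(",", -1).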
6) Actually, String.split() doesn't always compile a pattern. If you look at the Java 7 source, you will see a check for whether the delimiter is a single character with no special meaning in regular expressions. In that case it splits the string without using the regex engine, so it should be quite fast.
7) When the delimiter is a real regular expression, String.split() builds a new Pattern on every call, which can make it slower than StringTokenizer. If you have a lot of strings to process, creating the Pattern once and calling its split() method is the way to go for maximum speed.
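The advice about reusing a compiled pattern can be sketched roughly like this. The class name PrecompiledSplitDemo, the constant WHITESPACE and the helper words() are assumptions made for the example. Only Pattern.compile() and Pattern.split() come from the JDK.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

// Hypothetical demo class, named only for this illustration.
class PrecompiledSplitDemo {

    // Compiled once and reused for every input; Pattern instances are immutable.
    private static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits a line on runs of whitespace using the shared, precompiled pattern.
    static String[] words(String line) {
        return WHITESPACE.split(line);
    }

    public static void main(String[] args) {
        String[] lines = { "this is a test", "split   on\tany whitespace" };
        for (String line : lines) {
            System.out.println(Arrays.toString(words(line)));
        }
    }
}
```

Whether this is measurably faster than calling line.split("\\s+") in a loop depends on the input and the JVM, so it is worth benchmarking before relying on it.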
8) StringTokenizer has a constructor with a second parameter that lets you specify the delimiter characters. Each character in that string is treated as a separate delimiter.
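As a small illustration of that constructor, the sketch below passes three delimiter characters at once. The class name DelimiterSetDemo, the helper tokenize() and the sample timestamp are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, named only for this illustration.
class DelimiterSetDemo {

    // Every character in 'delimiters' is treated as a separate delimiter.
    static List<String> tokenize(String input, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(input, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Dash, colon and space all act as delimiters here.
        System.out.println(tokenize("2024-01-15 10:30:00", "-: "));
        // [2024, 01, 15, 10, 30, 00]
    }
}
```

The rough equivalent with split() would be a character class such as split("[-: ]"), which is where the regular expression support starts to matter.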
9) Here is code that splits a String into multiple smaller strings, first with StringTokenizer and then with split():
StringTokenizer st = new StringTokenizer("this is a test");
while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
// The same result using split(); note that length is a field on arrays, not a method
String[] result = "this is a test".split("\\s");
for (int x = 0; x < result.length; x++) {
    System.out.println(result[x]);
}
Output (printed by each of the two loops):
this
is
a
test
That's all about the difference between StringTokenizer and the split() method in Java. Both provide simple ways to split a big string into multiple strings based on a specific delimiter, but StringTokenizer is legacy and you should avoid it in new code. Prefer the split() method of the String class whenever possible, as it also supports regular expressions.