StringTokenizer

StringTokenizer is a classic Java utility that breaks a string into smaller pieces — called tokens — based on one or more delimiter characters. It lives in java.util and has been part of Java since version 1.0, making it one of the oldest string-processing tools in the standard library.

Why StringTokenizer Exists

Before String.split() arrived in Java 1.4, StringTokenizer was the go-to way to parse delimited text like CSV lines, command strings, or configuration values. Today it is considered a legacy class, but it still appears in older codebases and is occasionally useful when you need a lightweight, allocation-friendly tokenizer without regex overhead.

Note: The official Java documentation discourages StringTokenizer in new code and recommends String.split() or the java.util.regex package instead. Understanding StringTokenizer is still valuable, both for reading older code and for the rare cases where its simpler model fits perfectly.

Creating a StringTokenizer

StringTokenizer has three constructors:

import java.util.StringTokenizer;

// 1. Default delimiter: whitespace (\t, \n, \r, \f, and space)
StringTokenizer st1 = new StringTokenizer("hello world java");

// 2. Custom delimiter string
StringTokenizer st2 = new StringTokenizer("red,green,blue", ",");

// 3. Custom delimiter + return delimiters as tokens (true = include them)
StringTokenizer st3 = new StringTokenizer("a:b:c", ":", true);

The second argument is a delimiter string, not a regex. Every character in that string is treated as an independent delimiter. So "," means “comma is a delimiter”, not “the string , is a delimiter pattern”.
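The difference from a regex shows up most clearly with a character such as ".", which is a metacharacter for String.split() but an ordinary character for StringTokenizer. The following is a minimal sketch; the class name DelimiterNotRegex and the helper tokens() are made up for illustration and are not part of the JDK.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterNotRegex {
    // Hypothetical helper: collects every token into a list.
    static List<String> tokens(String text, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // "." is a plain character here, so no escaping is needed.
        System.out.println(tokens("192.168.0.1", "."));                    // [192, 168, 0, 1]

        // For split(), "." is a regex matching any character, so every piece
        // is empty and the resulting array has length 0.
        System.out.println("192.168.0.1".split(".").length);               // 0

        // Escaping the dot gives the intended result.
        System.out.println(Arrays.toString("192.168.0.1".split("\\.")));   // [192, 168, 0, 1]
    }
}
```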

Iterating Over Tokens

The primary API is a small set of methods:

Method	Returns	Description
`hasMoreTokens()`	`boolean`	`true` if more tokens remain
`nextToken()`	`String`	Returns the next token
`nextToken(String delim)`	`String`	Changes delimiter mid-stream, then returns next token
`countTokens()`	`int`	Counts the remaining tokens (without consuming them)
`hasMoreElements()`	`boolean`	Same as `hasMoreTokens()` (implements `Enumeration`)
`nextElement()`	`Object`	Same as `nextToken()` (implements `Enumeration`)

The classic usage pattern:

import java.util.StringTokenizer;

public class TokenExample {
    public static void main(String[] args) {
        String sentence = "Java is fun to learn";
        StringTokenizer st = new StringTokenizer(sentence);

        System.out.println("Token count: " + st.countTokens());

        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}

Output:

Token count: 5
Java
is
fun
to
learn

Using a Custom Delimiter

import java.util.StringTokenizer;

public class CsvTokenizer {
    public static void main(String[] args) {
        String csv = "Alice,30,Engineer";
        StringTokenizer st = new StringTokenizer(csv, ",");

        String name = st.nextToken();
        int age      = Integer.parseInt(st.nextToken());
        String role  = st.nextToken();

        System.out.println(name + " is " + age + " and works as " + role);
    }
}

Output:

Alice is 30 and works as Engineer

Multiple Delimiter Characters

Every character you include in the delimiter string acts as its own delimiter. You can split on both commas and semicolons at once:

import java.util.StringTokenizer;

public class MultiDelim {
    public static void main(String[] args) {
        String data = "apple,banana;cherry,date";
        StringTokenizer st = new StringTokenizer(data, ",;");

        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}

Output:

apple
banana
cherry
date

Returning Delimiters as Tokens

Pass true as the third constructor argument to include delimiter characters in the token stream. This is handy when you need to know which delimiter separated two values:

import java.util.StringTokenizer;

public class DelimAsToken {
    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("a+b-c", "+-", true);

        while (st.hasMoreTokens()) {
            System.out.println("[" + st.nextToken() + "]");
        }
    }
}

Output:

[a]
[+]
[b]
[-]
[c]
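Returned delimiters can also be used to notice empty fields: two delimiter tokens in a row mean nothing sat between them. The sketch below assumes a single delimiter character; the class EmptyFieldRecovery and the helper fields() are hypothetical names used only for illustration. In most new code String.split() is the simpler route.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class EmptyFieldRecovery {
    // Hypothetical helper: splits on one delimiter character and keeps empty fields.
    static List<String> fields(String line, char delim) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(line, String.valueOf(delim), true);
        // The start of the line behaves as if a delimiter had just been seen.
        boolean lastWasDelim = true;
        while (st.hasMoreTokens()) {
            String tok = st.nextToken();
            boolean isDelim = tok.length() == 1 && tok.charAt(0) == delim;
            if (isDelim) {
                if (lastWasDelim) {
                    out.add("");   // nothing between this delimiter and the previous one
                }
                lastWasDelim = true;
            } else {
                out.add(tok);
                lastWasDelim = false;
            }
        }
        if (lastWasDelim) {
            out.add("");           // trailing delimiter (or empty input) ends with an empty field
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(fields("Alice,,Engineer", ','));   // [Alice, , Engineer]
    }
}
```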

Changing the Delimiter Mid-Stream

You can call nextToken(String newDelim) to switch to a different delimiter set. The change is not limited to that one call: the new set stays in effect for all subsequent calls until you change it again:

import java.util.StringTokenizer;

public class ChangeDelim {
    public static void main(String[] args) {
        // First token split by whitespace, rest by comma
        StringTokenizer st = new StringTokenizer("section1 a,b,c");
        String section = st.nextToken();           // uses whitespace
        // Switches to comma. The space after "section1" is no longer a
        // delimiter, so the raw token is " a"; trim() drops the space.
        String rest    = st.nextToken(",").trim();
        String b       = st.nextToken();           // still comma, gets "b"
        String c       = st.nextToken();           // still comma, gets "c"

        System.out.println(section + " | " + rest + " | " + b + " | " + c);
    }
}

Output:

section1 | a | b | c

StringTokenizer vs String.split() vs Scanner

Choosing the right tool matters. Here is a quick comparison:

Feature	`StringTokenizer`	`String.split()`	`Scanner`
Regex support	No	Yes	Yes
Returns array	No (iterator style)	Yes	No (stream style)
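Empty tokens	Skipped silently	Included (trailing ones dropped by default)	Depends on the delimiter pattern
Performance	Fast (no regex)	Fast for a single plain character, slower for true regexes	Generally slowest (regex-based)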
Recommended for new code	No (legacy)	Yes (simple cases)	Yes (flexible parsing)
Java version	1.0+	1.4+	5+

Warning: StringTokenizer silently skips consecutive delimiters — it never gives you an empty token. If you have "a,,b" and split on ",", you get "a" and "b" with no indication of the missing middle field. String.split(",") returns ["a", "", "b"], preserving the empty slot, which is usually what you want for structured data.

For most new code, prefer String.split() for simple splitting or Scanner for interactive / stream-based parsing.
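The empty-token difference described in the warning can be checked with a short sketch. The class name EmptyTokenComparison and the helper tokenizerCount() are made up for illustration. Note that split() keeps empty fields in the middle but drops trailing ones unless a negative limit is passed.

```java
import java.util.Arrays;
import java.util.StringTokenizer;

public class EmptyTokenComparison {
    // Hypothetical helper: how many tokens StringTokenizer sees on a comma-separated line.
    static int tokenizerCount(String line) {
        return new StringTokenizer(line, ",").countTokens();
    }

    public static void main(String[] args) {
        String line = "a,,b,,";

        // StringTokenizer only ever reports the two non-empty tokens.
        System.out.println(tokenizerCount(line));                    // 2

        // split() preserves the empty middle field but drops trailing empties.
        System.out.println(Arrays.toString(line.split(",")));        // [a, , b]

        // A negative limit keeps the trailing empty fields as well.
        System.out.println(Arrays.toString(line.split(",", -1)));    // [a, , b, , ]
    }
}
```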

Under the Hood

StringTokenizer is intentionally simple. Alongside the original string, its core state consists of three fields:

currentPosition — index into the original string where scanning should resume.
maxPosition — the length of the string (end boundary).
delimiters — the delimiter string you provided (or the default whitespace set).

When you call nextToken(), it:

Skips forward past any delimiter characters starting at currentPosition.
Scans forward until it hits the next delimiter or the end of the string.
Returns the substring between those two positions and advances currentPosition.
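The three steps above can be sketched as a small class. This is a simplified illustration, not the JDK source: MiniTokenizer is a hypothetical name, and the sketch ignores the return-delimiters option and mid-stream delimiter changes.

```java
import java.util.NoSuchElementException;

public class MiniTokenizer {
    private final String str;
    private final String delimiters;
    private final int maxPosition;      // end boundary: length of the string
    private int currentPosition = 0;    // where scanning resumes

    MiniTokenizer(String str, String delimiters) {
        this.str = str;
        this.delimiters = delimiters;
        this.maxPosition = str.length();
    }

    // Step 1: advance past any delimiter characters.
    private int skipDelimiters(int pos) {
        while (pos < maxPosition && delimiters.indexOf(str.charAt(pos)) >= 0) {
            pos++;
        }
        return pos;
    }

    // Step 2: advance until the next delimiter or the end of the string.
    private int scanToken(int pos) {
        while (pos < maxPosition && delimiters.indexOf(str.charAt(pos)) < 0) {
            pos++;
        }
        return pos;
    }

    boolean hasMoreTokens() {
        return skipDelimiters(currentPosition) < maxPosition;
    }

    String nextToken() {
        int start = skipDelimiters(currentPosition);
        if (start >= maxPosition) {
            throw new NoSuchElementException();
        }
        currentPosition = scanToken(start);
        // Step 3: return the substring between the two positions.
        return str.substring(start, currentPosition);
    }

    public static void main(String[] args) {
        MiniTokenizer mt = new MiniTokenizer("a,,b;c", ",;");
        while (mt.hasMoreTokens()) {
            System.out.println(mt.nextToken());   // prints a, b, c on separate lines
        }
    }
}
```

Because step 1 skips every delimiter it meets, consecutive delimiters collapse and an empty token can never be produced.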

Because it works directly on the original String and uses String.substring() internally, there is no regex compilation, no array allocation, and no Pattern/Matcher overhead. For tight loops that parse millions of simple delimited lines, this can be measurably faster than split().

However, modern JVMs have closed most of that gap. Since Java 7, String.split() takes a fast path that avoids the regex engine entirely when the delimiter is a single character with no special regex meaning.

StringTokenizer also implements the Enumeration<Object> interface (a legacy precursor to Iterator), which is why it has hasMoreElements() and nextElement() alongside the more readable hasMoreTokens() / nextToken() pair.
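One practical consequence is that any API accepting an Enumeration also accepts a StringTokenizer. A minimal sketch using Collections.list() follows; the class name EnumerationBridge and the helper toList() are hypothetical. The element type is Object because of the Enumeration<Object> declaration, even though every element is a String at runtime.

```java
import java.util.Collections;
import java.util.List;
import java.util.StringTokenizer;

public class EnumerationBridge {
    // Hypothetical helper: drains a tokenizer into a list via its Enumeration view.
    static List<Object> toList(String text, String delims) {
        // Collections.list() calls hasMoreElements()/nextElement() until exhausted.
        return Collections.list(new StringTokenizer(text, delims));
    }

    public static void main(String[] args) {
        System.out.println(toList("red,green,blue", ","));   // [red, green, blue]
    }
}
```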

Common Pitfalls

Missing empty tokens. As noted above, consecutive delimiters produce no empty token. This will silently corrupt structured data with optional fields.
Not thread-safe. Each StringTokenizer instance is stateful; never share one across threads without external synchronization.
Delimiter characters, not strings. new StringTokenizer(s, "->") treats - and > as two separate one-character delimiters, not the literal two-character sequence ->. Use String.split("->") if you need a multi-character delimiter.
countTokens() counts remaining tokens, not total tokens. The count is exact, but it covers only the text from currentPosition to the end under the current delimiter set, so its value decreases as you consume tokens. Calling it before iterating gives the total; calling it mid-loop gives what is left.

import java.util.StringTokenizer;

public class PitfallDemo {
    public static void main(String[] args) {
        // Consecutive delimiters: empty field is lost!
        StringTokenizer st = new StringTokenizer("Alice,,Engineer", ",");
        System.out.println(st.countTokens()); // 2, not 3!
        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}

Output:

2
Alice
Engineer

The age field disappears entirely. With String.split(",") you would get ["Alice", "", "Engineer"] and could detect the missing value.
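The delimiter-character and countTokens() pitfalls from the list above can be seen in a similar sketch. The class name OtherPitfalls and the helper tokens() are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

public class OtherPitfalls {
    // Hypothetical helper: collects every token into a list.
    static List<String> tokens(String text, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // '-' and '>' are separate delimiters, so the hyphen in "x-ray" also splits.
        System.out.println(tokens("x-ray->beam", "->"));                   // [x, ray, beam]

        // split() treats "->" as one two-character sequence.
        System.out.println(Arrays.toString("x-ray->beam".split("->")));    // [x-ray, beam]

        // countTokens() reports what is left, so it shrinks as tokens are consumed.
        StringTokenizer st = new StringTokenizer("a b c");
        System.out.println(st.countTokens());   // 3
        st.nextToken();
        System.out.println(st.countTokens());   // 2
    }
}
```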

Related Topics

String Methods: the full reference for String instance methods including split(), indexOf(), and substring()
Scanner: a flexible, regex-powered alternative for parsing strings, files, and streams token by token
Strings: the foundation page covering String creation, immutability, and the string pool
Regular Expressions: Pattern and Matcher for powerful pattern-based splitting and matching
StringBuilder: for building strings efficiently when constructing output from parsed tokens
