The java.util.StringTokenizer class splits a String into tokens through a small, simple API. It is a legacy class in Java, retained for compatibility.
Unlike the StreamTokenizer class, StringTokenizer does not distinguish between numbers, quoted strings, identifiers, and so on. The StreamTokenizer class is discussed in detail in the upcoming I/O chapter.
Delimiters can be specified either when the StringTokenizer is created or on a per-token basis through nextToken(String delim).
Constructors of the StringTokenizer Class
The StringTokenizer class defines three constructors.
| Constructor | Description |
|---|---|
| StringTokenizer(String str) | It creates a StringTokenizer for the specified string, using the default whitespace delimiters. |
| StringTokenizer(String str, String delim) | It creates a StringTokenizer for the specified string. Every character in delim acts as a delimiter. |
| StringTokenizer(String str, String delim, boolean returnDelims) | It creates a StringTokenizer for the specified string and delimiters. If returnDelims is true, each delimiter character is also returned as a token. If it is false, delimiter characters only separate tokens and are skipped. |
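As a minimal sketch of the three constructors in the table, the listing below tokenizes a short arithmetic expression. The class name ConstructorDemo and the helper drain are illustrative names, not part of the Java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class; the name and the helper are illustrative only.
class ConstructorDemo {

    // Collects every remaining token of the tokenizer into a list.
    static List<String> drain(StringTokenizer st) {
        List<String> tokens = new ArrayList<>();
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // One-argument form: splits on the default whitespace delimiters.
        System.out.println(drain(new StringTokenizer("my name\tis khan")));

        // Two-argument form: '+' and '-' are both delimiters and are dropped.
        System.out.println(drain(new StringTokenizer("10+20-5", "+-")));

        // Three-argument form with returnDelims = true: delimiters come back as tokens.
        System.out.println(drain(new StringTokenizer("10+20-5", "+-", true)));
    }
}
```

The third call keeps the operators, which is the behaviour to reach for when both operands and delimiters matter.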
StringTokenizer belongs to the java.util package. Its constructors let us choose the delimiter characters and decide whether the delimiters themselves are returned as tokens. Once a StringTokenizer object is created, we can use its methods to iterate over the tokens or retrieve them one at a time.
When creating a StringTokenizer, you typically pass the string to tokenize and the delimiter characters to the constructor. If no delimiters are given, the default set is the whitespace characters: space, tab, newline, carriage return, and form feed. You can supply your own set instead; for instance, to tokenize a string on commas, pass "," as the second argument.
The StringTokenizer class also offers a method called countTokens, which returns the number of tokens remaining, that is, how many more times nextToken can be called. This is useful when you need the token count before iterating through the tokens.
Because StringTokenizer is a legacy class, its use is discouraged in new code. The split method of the String class is usually the better choice for basic tokenization, since it accepts a regular expression and returns all tokens as an array. The Scanner class suits more involved requirements, including parsing tokens directly into various data types.
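The two recommended alternatives can be sketched as follows. The class and method names are illustrative, and the sample inputs are made up for the demonstration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;

// Hypothetical demo class showing the alternatives to StringTokenizer.
class AlternativesDemo {

    // String.split takes a regular expression: a comma with optional spaces around it.
    static List<String> splitOnCommas(String line) {
        return Arrays.asList(line.split("\\s*,\\s*"));
    }

    // Scanner parses typed values directly; this sums whitespace-separated ints.
    static int sumInts(String text) {
        int sum = 0;
        try (Scanner sc = new Scanner(text)) {
            while (sc.hasNextInt()) {
                sum += sc.nextInt();
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(splitOnCommas("red , green,blue")); // [red, green, blue]
        System.out.println(sumInts("10 20 30"));               // 60
    }
}
```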
Advantages of StringTokenizer Class
- It provides a straightforward way to tokenize a string based on a specified delimiter without requiring complex logic. This makes it easy to use for simple tokenization tasks where you just need to split a string into parts.
- Another advantage is its ability to include or exclude delimiters as tokens. It can be useful in certain cases where you need to process both the tokens and the delimiters separately. For example, if we are parsing a mathematical expression, we might want to tokenize the expression while also keeping track of the operators.
- StringTokenizer is also lightweight. It scans the input with simple character comparisons rather than a regular-expression engine, and it produces tokens one at a time instead of building an array of all tokens up front. This can help when working with large strings or when memory is a concern.
- Additionally, StringTokenizer is part of the standard Java API, so it is available everywhere Java runs. We can use it in our Java applications without adding any external libraries or dependencies.
Disadvantages of StringTokenizer Class
- Its functionality is limited compared to other tokenization methods.
- StringTokenizer treats every character of the delimiter string as a separate single-character delimiter. It cannot split on a multi-character delimiter such as "::" or on a regular-expression pattern, and it silently skips the empty tokens between adjacent delimiters. Those needs call for a different approach.
- StringTokenizer is a legacy class, which means it is not recommended for use in new code.
- It has been largely superseded by the split method of the String class and by the Scanner class, which provide more flexibility for tokenization tasks. Code that relies on StringTokenizer may therefore be harder to maintain.
- By default, the delimiters are discarded, so the original string cannot be reconstructed from the tokens alone. Reconstruction is only possible when the tokenizer is created with returnDelims set to true.
In general, although StringTokenizer is adequate for basic tokenization, its limitations and legacy status make it a poor fit for advanced tokenization tasks in modern Java development.
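Here is a small sketch of two limitations in practice: empty fields are dropped, and a multi-character delimiter string is treated as a set of single characters. String.split is shown alongside for comparison. The class and helper names are illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class comparing StringTokenizer with String.split.
class LimitationsDemo {

    static List<String> withTokenizer(String text, String delims) {
        List<String> out = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delims);
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    static List<String> withSplit(String text, String regex) {
        // A limit of -1 keeps empty strings, including trailing ones.
        return Arrays.asList(text.split(regex, -1));
    }

    public static void main(String[] args) {
        // The empty field between the two commas disappears with StringTokenizer.
        System.out.println(withTokenizer("a,,c", ","));          // [a, c]
        System.out.println(withSplit("a,,c", ","));              // [a, , c]

        // "::" is read as "the character ':'", so the single ':' also splits.
        System.out.println(withTokenizer("one::two:three", "::")); // [one, two, three]
        System.out.println(withSplit("one::two:three", "::"));     // [one, two:three]
    }
}
```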
Applications of StringTokenizer Class
StringTokenizer in Java finds several important applications due to its ability to break down strings into tokens using specified delimiters.
- One common use is in text parsing and analysis, where it is crucial to extract individual words or phrases from a larger text body. The functionality is particularly useful in natural language processing tasks like sentiment analysis, where the sentiment of a sentence can be determined by analyzing the individual words.
- Another significant application is in data processing, especially with data stored in a delimited format such as CSV (Comma-Separated Values) files. StringTokenizer can extract the fields of simple records, which supports data mining, data validation, and data cleaning operations. Because it skips empty fields and does not understand quoted values, it only suits simple, well-formed input.
- In addition, StringTokenizer is often used in networking applications to parse incoming messages or commands. For example, in a client-server architecture, the server might receive commands from clients as strings with specific delimiters. StringTokenizer can parse these commands and extract the information needed to execute the requested action.
- Furthermore, StringTokenizer can be used in web applications to process URL query strings. The query string in a URL contains parameters and values separated by delimiters such as "&" and "=", and StringTokenizer can extract these parameters and values for further processing.
In short, StringTokenizer has applications in text manipulation, data analysis, network communication, and web development, wherever input arrives as simply delimited strings.
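The query-string use case mentioned above might look like the following sketch. The class name, method name, and sample query are assumptions made for illustration, and no URL decoding is attempted.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringTokenizer;

// Hypothetical demo class; a simplified parser, not a complete URL handler.
class QueryStringDemo {

    // Parses "k1=v1&k2=v2" into a map that preserves parameter order.
    static Map<String, String> parse(String query) {
        Map<String, String> params = new LinkedHashMap<>();
        StringTokenizer pairs = new StringTokenizer(query, "&");
        while (pairs.hasMoreTokens()) {
            String pair = pairs.nextToken();
            int eq = pair.indexOf('=');
            if (eq < 0) {
                params.put(pair, ""); // a parameter without a value
            } else {
                params.put(pair.substring(0, eq), pair.substring(eq + 1));
            }
        }
        return params;
    }

    public static void main(String[] args) {
        System.out.println(parse("user=khan&lang=java&debug"));
        // prints {user=khan, lang=java, debug=}
    }
}
```

The key and value are separated with indexOf rather than a second tokenizer so that an empty value such as "debug=" is not silently skipped.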
Methods of the StringTokenizer Class
StringTokenizer in Java provides several methods to manipulate and access tokens in a string. One of the key methods is hasMoreTokens, which returns true if there are more tokens in the string and false otherwise. This method is commonly used in a loop to iterate over all tokens in a string.
The nextToken method returns the next token as a string and advances the tokenizer past it. It is commonly used together with hasMoreTokens to loop through all tokens in a string; calling it when no tokens remain throws a NoSuchElementException.
The countTokens method returns the number of tokens still remaining in the string. Called before any token is consumed, it gives the overall token count; called later, it shows how far tokenization has progressed.
By default, StringTokenizer uses whitespace as the delimiter, but a string of custom delimiter characters can be supplied to the constructor or to nextToken(String delim).
The StringTokenizer class provides the following six methods:
| Methods | Description |
|---|---|
| boolean hasMoreTokens() | It checks whether more tokens are available. |
| String nextToken() | It returns the next token from the StringTokenizer object. |
| String nextToken(String delim) | It switches to the given delimiter set, which stays in effect for later calls, and returns the next token. |
| boolean hasMoreElements() | It is the same as the hasMoreTokens() method. |
| Object nextElement() | It is the same as nextToken(), but its return type is Object. |
| int countTokens() | It returns the number of tokens remaining, that is, how many times nextToken() can still be called. |
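One detail of nextToken(String delim) is easy to miss: the new delimiter set replaces the old one for all later calls, and characters that were delimiters before are now ordinary token characters. The following sketch, with illustrative names, shows the effect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class for the delimiter-switching behaviour.
class DelimiterSwitchDemo {

    // Reads three tokens, switching from whitespace to "," after the first.
    static List<String> switchDelimiters(String text) {
        StringTokenizer st = new StringTokenizer(text);
        List<String> out = new ArrayList<>();
        out.add(st.nextToken());    // whitespace-delimited
        out.add(st.nextToken(",")); // comma-delimited from here on
        out.add(st.nextToken());    // still comma-delimited
        return out;
    }

    public static void main(String[] args) {
        for (String token : switchDelimiters("my name,is khan")) {
            System.out.println("[" + token + "]");
        }
        // prints [my], then [ name] with its leading space, then [is khan]
    }
}
```

The leading space in the second token appears because the space stopped being a delimiter once "," took over.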
Example of StringTokenizer Class
Consider the following demonstration using the StringTokenizer class to tokenize the string "my name is khan" based on whitespace.
Simple.java
import java.util.StringTokenizer;

public class Simple {
    public static void main(String[] args) {
        // Tokenize on the space character
        StringTokenizer st = new StringTokenizer("my name is khan", " ");
        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}
Output:
my
name
is
khan
The code above uses hasMoreTokens as the loop condition and nextToken to read each token in turn.
Example of StringTokenizer.nextToken(String delim) Method
Test.java
import java.util.StringTokenizer;

public class Test {
    public static void main(String[] args) {
        StringTokenizer st = new StringTokenizer("my,name,is,khan");
        // Switch the delimiter to a comma and print the next token
        System.out.println("Next token is : " + st.nextToken(","));
    }
}
Output:
Next token is : my
Note: The StringTokenizer class is not formally deprecated, but it is a legacy class whose use is discouraged in new code. It is recommended to use the split method of the String class or the Pattern class that belongs to the java.util.regex package.
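The regex-based alternative from the note above can be sketched as follows. The class name, the SEPARATORS constant, and the chosen pattern are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical demo class using java.util.regex.Pattern for tokenization.
class PatternSplitDemo {

    // Compiled once and reused: one or more commas, semicolons, or whitespace characters.
    private static final Pattern SEPARATORS = Pattern.compile("[,;\\s]+");

    static List<String> tokens(String text) {
        // trim() removes leading and trailing whitespace before splitting.
        return Arrays.asList(SEPARATORS.split(text.trim()));
    }

    public static void main(String[] args) {
        System.out.println(tokens("my,name; is  khan")); // [my, name, is, khan]
    }
}
```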
Example of StringTokenizer.hasMoreTokens Method
The hasMoreTokens method returns true if at least one more token is available in the tokenizer's string; otherwise, it returns false.
StringTokenizer1.java
import java.util.StringTokenizer;

public class StringTokenizer1
{
    /* Driver Code */
    public static void main(String[] args)
    {
        /* StringTokenizer object */
        StringTokenizer st = new StringTokenizer("Demonstrating methods from StringTokenizer class", " ");
        /* Checks if the String has any more tokens */
        while (st.hasMoreTokens())
        {
            System.out.println(st.nextToken());
        }
    }
}
Output:
Demonstrating
methods
from
StringTokenizer
class
The Java code above demonstrates the utilization of the StringTokenizer class's methods hasMoreTokens and nextToken.
Example of StringTokenizer.hasMoreElements Method
The hasMoreElements method returns the same value as hasMoreTokens. It exists so that StringTokenizer can implement the Enumeration interface.
StringTokenizer2.java
import java.util.StringTokenizer;

public class StringTokenizer2
{
    public static void main(String[] args)
    {
        StringTokenizer st = new StringTokenizer("Hello everyone I am a Java developer", " ");
        /* hasMoreElements behaves exactly like hasMoreTokens */
        while (st.hasMoreElements())
        {
            System.out.println(st.nextToken());
        }
    }
}
Output:
Hello
everyone
I
am
a
Java
developer
The code snippet above showcases the utilization of the hasMoreElements function.
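Because StringTokenizer implements Enumeration, it can be handed to any API that accepts an Enumeration. As a brief sketch (the class and method names are illustrative), Collections.list drains a tokenizer into a list in one call.

```java
import java.util.Collections;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class treating a StringTokenizer as an Enumeration.
class EnumerationDemo {

    // StringTokenizer implements Enumeration<Object>, so the elements are typed as Object.
    static List<Object> toList(String text) {
        return Collections.list(new StringTokenizer(text));
    }

    public static void main(String[] args) {
        System.out.println(toList("Hello everyone I am a Java developer"));
        // prints [Hello, everyone, I, am, a, Java, developer]
    }
}
```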
Example of StringTokenizer.nextElement Method
The nextElement method returns the same value as nextToken, except that its declared return type is Object rather than String. Like hasMoreElements, it exists so that StringTokenizer can implement the Enumeration interface.
StringTokenizer3.java
import java.util.StringTokenizer;

public class StringTokenizer3
{
    /* Driver Code */
    public static void main(String[] args)
    {
        /* StringTokenizer object */
        StringTokenizer st = new StringTokenizer("Hello Everyone Have a nice day", " ");
        /* Checks if the String has any more tokens */
        while (st.hasMoreTokens())
        {
            /* Prints the elements from the String */
            System.out.println(st.nextElement());
        }
    }
}
Output:
Hello
Everyone
Have
a
nice
day
The code snippet above showcases the implementation of the nextElement function.
Example of StringTokenizer.countTokens Method
The countTokens method returns the number of tokens that remain in the tokenizer's string, without advancing the tokenizer.
StringTokenizer4.java
import java.util.StringTokenizer;

public class StringTokenizer4
{
    /* Driver Code */
    public static void main(String[] args)
    {
        /* StringTokenizer object */
        StringTokenizer st = new StringTokenizer("Hello Everyone Have a nice day", " ");
        /* Prints the number of tokens present in the String */
        System.out.println("Total number of Tokens: " + st.countTokens());
    }
}
Output:
Total number of Tokens: 6
The Java code provided above showcases the utilization of the countTokens function from the StringTokenizer class.
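Since countTokens reports the tokens that remain rather than a fixed total, its result shrinks as tokens are consumed. The sketch below, with illustrative class and method names, shows this and uses the count to size an array up front.

```java
import java.util.StringTokenizer;

// Hypothetical demo class for countTokens.
class CountTokensDemo {

    // Returns the remaining count after consuming the given number of tokens.
    static int remainingAfter(String text, int consumed) {
        StringTokenizer st = new StringTokenizer(text);
        for (int i = 0; i < consumed; i++) {
            st.nextToken();
        }
        return st.countTokens();
    }

    // Uses countTokens() to allocate an array of exactly the right size.
    static String[] toArray(String text) {
        StringTokenizer st = new StringTokenizer(text);
        String[] result = new String[st.countTokens()];
        for (int i = 0; i < result.length; i++) {
            result[i] = st.nextToken();
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "Hello Everyone Have a nice day";
        System.out.println(remainingAfter(text, 0)); // 6
        System.out.println(remainingAfter(text, 2)); // 4
        System.out.println(toArray(text).length);    // 6
    }
}
```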