
Revision: 64631
at August 30, 2013 18:09 by jarlah


Initial Code
package com.java.test;

import java.io.File;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;
import java.util.StringTokenizer;
 
/***
 * input file format:
 * 
 *  foo|bar
 *  foo2|bar2
 *  .
 *  .
 */
public class Indexer {
	private final Map<String, String> indexerMap;
	
	public Indexer(String filename) throws IOException{
		indexerMap = parseFileIntoMap(filename);
	}
	
	private Map<String, String> parseFileIntoMap(String filename) throws IOException {
		Map<String, String> map = new HashMap<String, String>();
		
		Scanner scanner = null;
		try {
			scanner = new Scanner(new File(filename), "UTF-8");
			scanner.useDelimiter("[\r\n]+");
			while (scanner.hasNext())
			{
				// StringTokenizer delimiters are literal characters, not a regex, so "|" needs no escaping.
				StringTokenizer tokenizer = new StringTokenizer(scanner.next(), "|");
				if (tokenizer.countTokens() == 2) {
					String key = tokenizer.nextToken();
					String value = tokenizer.nextToken();
					map.put(key, value);
				}
			}
		} catch (FileNotFoundException e) {
			throw new IllegalArgumentException("Could not load file indexing! The file "+filename+" does not exist.", e);
		}finally {
			if(scanner != null)
				scanner.close();
		}
		
		return map;
	}
	
	/**
	 * Returns the value mapped to the given key, or null if the key is null
	 * or has no entry in the map.
	 */
	public String getValue(String key) {
		if (key == null) {
			return null;
		} else {
			return indexerMap.get(key);
		}
	}	
}

Initial URL


Initial Description
Use Scanner instead of BufferedReader for reading the file, and StringTokenizer for parsing each line. I see potential for using Scanner for both use cases, but it was already a major improvement to get rid of the split arrays. In addition, the BufferedReader was never closed; the Scanner is now closed in the finally block. Also, use of static is highly discouraged. An object-oriented approach is better, and there is no reason to expose the map or list, or the parse method. Both are now private. Strict expectations about file encoding: let's assume we only handle UTF-8 files. No fishy Norwegian characters.
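
The idea of using Scanner for both use cases could look roughly like the sketch below. This is only an illustration and not part of the snippet above: the class name ScannerIndexerSketch and the static parse helper are made up for the example, and it assumes Java 7 or later for try-with-resources. One Scanner reads the file line by line as UTF-8, and a second Scanner splits each line on the pipe character.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.Scanner;

// Hypothetical sketch: the class and method names are assumptions for illustration only.
class ScannerIndexerSketch {

    // Parses "key|value" lines from a UTF-8 file; lines without exactly two fields are skipped.
    static Map<String, String> parse(String filename) throws IOException {
        Map<String, String> map = new HashMap<String, String>();
        // try-with-resources closes the file Scanner even if an exception is thrown.
        try (Scanner lines = new Scanner(new File(filename), "UTF-8")) {
            while (lines.hasNextLine()) {
                String line = lines.nextLine();
                // Unlike StringTokenizer, Scanner.useDelimiter takes a regex, so the pipe is escaped.
                try (Scanner fields = new Scanner(line).useDelimiter("\\|")) {
                    if (!fields.hasNext()) {
                        continue; // empty line
                    }
                    String key = fields.next();
                    if (!fields.hasNext()) {
                        continue; // no separator, so no value
                    }
                    String value = fields.next();
                    if (fields.hasNext()) {
                        continue; // more than two fields
                    }
                    map.put(key, value);
                }
            }
        }
        return map;
    }
}
```

Calling ScannerIndexerSketch.parse("index.txt") (the file name is a placeholder) would return the same kind of map the Indexer constructor builds. One difference to be aware of: Scanner can return empty tokens between adjacent delimiters, whereas StringTokenizer silently skips them, so a line such as foo||bar is treated differently by the two approaches.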

Initial Title
Scanner and Tokenizer for file indexing

Initial Tags


Initial Language
Java