com.Ostermiller.util Java Utilities


com.Ostermiller.util
Class LabeledCSVParser

java.lang.Object
  extended by com.Ostermiller.util.LabeledCSVParser
All Implemented Interfaces:
CSVParse

public class LabeledCSVParser
extends Object
implements CSVParse

Decorates a CSVParse object to provide an index of field names. Many CSV files carry a list of field names (labels) on the first line. A LabeledCSVParser consumes this line automatically. The methods getLabels(), getLabelIdx(String) and getValueByLabel(String) allow these labels to be discovered and used while parsing CSV data. This class can also be used to skip field labels that are present in a CSV file but not wanted.
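
The label handling described above can be modeled in a few lines. The following is a minimal sketch, not the library's implementation: LabeledRowsSketch is a hypothetical stand-in that splits on commas naively (no quoting or comment support) and assumes the first line holds the labels. Its method names mirror getLabels(), getLabelIdx(String), getLine() and getValueByLabel(String) only to show how they relate.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

// Hypothetical stand-in that models the documented behavior; not the library class.
class LabeledRowsSketch {
    private final String[] labels;                               // field names from the first line
    private final Deque<String[]> pending = new ArrayDeque<>();  // unread data lines
    private String[] lastLine;                                   // buffer read by getValueByLabel

    LabeledRowsSketch(String csv) {
        String[] lines = csv.split("\n");
        // The first line is consumed as labels and is never returned as data.
        labels = lines[0].split(",", -1);
        for (int i = 1; i < lines.length; i++) {
            if (!lines[i].isEmpty()) {
                pending.add(lines[i].split(",", -1));
            }
        }
    }

    String[] getLabels() {
        return labels.clone();
    }

    // Zero-based index of the column with the given label, or -1 if there is none.
    int getLabelIdx(String label) {
        return Arrays.asList(labels).indexOf(label);
    }

    // Next data line, or null once the data is exhausted.
    String[] getLine() {
        lastLine = pending.poll();
        return lastLine;
    }

    // Value under the given label in the last line read, or null if there is none.
    String getValueByLabel(String label) {
        int idx = getLabelIdx(label);
        if (lastLine == null || idx < 0 || idx >= lastLine.length) {
            return null;
        }
        return lastLine[idx];
    }

    public static void main(String[] args) {
        LabeledRowsSketch rows = new LabeledRowsSketch("name,age\nAlice,30\nBob,25\n");
        while (rows.getLine() != null) {
            System.out.println(rows.getValueByLabel("name") + " is " + rows.getValueByLabel("age"));
        }
    }
}
```

The loop in main shows the intended usage pattern: getLine() advances through the data, and getValueByLabel(String) reads from the line just returned.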

Since:
ostermillerutils 1.03.00
Author:
Campbell, Allen T., Stephen Ostermiller http://ostermiller.org/contact.pl?regarding=Java+Utilities

Constructor Summary
LabeledCSVParser(CSVParse parse)
          Construct a LabeledCSVParser on a CSVParse implementation.
 
Method Summary
 void changeDelimiter(char newDelim)
          Change this parser so that it uses a new delimiter.
 void changeQuote(char newQuote)
          Change this parser so that it uses a new character for quoting.
 void close()
          Close any stream upon which this parser is based.
 String[][] getAllValues()
          Get all the values from the file.
 int getLabelIdx(String label)
          Get the index of the column having the given label.
 int getLabelIndex(String label)
          Deprecated. May swallow an IOException while reading the labels; use getLabelIdx(String) instead.
 String[] getLabels()
          Return an array of all field names from the top of the CSV file.
 int getLastLineNumber()
          Get the line number that the last token came from.
 String[] getLine()
          Get all the values from a line.
 String getValueByLabel(String label)
          Given the label for the column, get the value in that column from the last line that was read.
 int lastLineNumber()
          Get the line number that the last token came from.
 String nextValue()
          Read the next value from the file.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

LabeledCSVParser

public LabeledCSVParser(CSVParse parse)
                 throws IOException
Construct a LabeledCSVParser on a CSVParse implementation.

Parameters:
parse - CSVParse implementation
Throws:
IOException - if an error occurs while reading.
Since:
ostermillerutils 1.03.00
Method Detail

changeDelimiter

public void changeDelimiter(char newDelim)
                     throws BadDelimiterException
Change this parser so that it uses a new delimiter.

The initial delimiter is a comma. The delimiter cannot be changed to a quote or any other character that has special meaning in CSV.

Specified by:
changeDelimiter in interface CSVParse
Parameters:
newDelim - delimiter to which to switch.
Throws:
BadDelimiterException - if the character cannot be used as a delimiter.
Since:
ostermillerutils 1.03.00

changeQuote

public void changeQuote(char newQuote)
                 throws BadQuoteException
Change this parser so that it uses a new character for quoting.

The initial quote character is a double quote ("). The quote character cannot be changed to a comma or any other character that has special meaning in CSV.

Specified by:
changeQuote in interface CSVParse
Parameters:
newQuote - character to use for quoting.
Throws:
BadQuoteException - if the character cannot be used as a quote.
Since:
ostermillerutils 1.03.00

getAllValues

public String[][] getAllValues()
                        throws IOException
Get all the values from the file.

If the file has already been partially read, only the values that have not already been read will be included.

Each line of the file that has at least one value will be represented. Comments and empty lines are ignored.

The resulting two-dimensional array may be jagged: lines with different numbers of values produce rows of different lengths.

The last line of the values is saved and may be accessed by getValueByLabel().

Specified by:
getAllValues in interface CSVParse
Returns:
all the values from the file or null if there are no more values.
Throws:
IOException - if an error occurs while reading.
Since:
ostermillerutils 1.03.00
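
Because the result may be jagged, and may be null when nothing is left to read, code that indexes into it by column is safer with a bounds check. A minimal sketch, assuming a hypothetical helper named cell and a hand-built array standing in for the value returned by getAllValues():

```java
// Hypothetical helper for reading a jagged String[][] such as the one getAllValues() returns.
class JaggedValuesSketch {
    // Returns the value at (row, col), or null if the array is null or the position does not exist.
    static String cell(String[][] values, int row, int col) {
        if (values == null || row < 0 || row >= values.length) {
            return null;
        }
        String[] line = values[row];
        if (line == null || col < 0 || col >= line.length) {
            return null;
        }
        return line[col];
    }

    public static void main(String[] args) {
        // Stand-in for parser output: the second line has fewer values than the first.
        String[][] values = {
            {"Alice", "30", "Paris"},
            {"Bob"}
        };
        for (int row = 0; row < values.length; row++) {
            System.out.println(cell(values, row, 0) + " / " + cell(values, row, 2));
        }
    }
}
```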

getLastLineNumber

public int getLastLineNumber()
Get the line number that the last token came from.

Line breaks that occur in the middle of a token do not count towards the line number.

The first line of labels does not count towards the line number.

Specified by:
getLastLineNumber in interface CSVParse
Returns:
line number or -1 if no tokens have been returned yet.
Since:
ostermillerutils 1.03.00

lastLineNumber

public int lastLineNumber()
Get the line number that the last token came from.

Line breaks that occur in the middle of a token do not count towards the line number.

The first line of labels does not count towards the line number.

Specified by:
lastLineNumber in interface CSVParse
Returns:
line number or -1 if no tokens have been returned yet.
Since:
ostermillerutils 1.03.00

getLine

public String[] getLine()
                 throws IOException
Get all the values from a line.

If the line has already been partially read, only the values that have not already been read will be included.

In addition to returning all the values from a line, LabeledCSVParser maintains a buffer of the values. This feature allows getValueByLabel(String) to function. In this case getLine() is used simply to iterate CSV data. The iteration ends when null is returned.

Note: The methods nextValue() and getAllValues() are incompatible with getValueByLabel(String) because the former methods cause the offset of field values to shift and corrupt the internal buffer maintained by getLine().

Specified by:
getLine in interface CSVParse
Returns:
all the values from the line or null if there are no more values.
Throws:
IOException - if an error occurs while reading.
Since:
ostermillerutils 1.03.00

nextValue

public String nextValue()
                 throws IOException
Read the next value from the file. The line number from which this value was taken can be obtained from getLastLineNumber().

This method is not compatible with getValueByLabel(). Using this method will make getValueByLabel() throw an IllegalStateException for the rest of the line.

Specified by:
nextValue in interface CSVParse
Returns:
the next value or null if there are no more values.
Throws:
IOException - if an error occurs while reading.
Since:
ostermillerutils 1.03.00

getLabels

public String[] getLabels()
                   throws IOException
Return an array of all field names from the top of the CSV file.

Returns:
Field names.
Throws:
IOException - if an IO error occurs
Since:
ostermillerutils 1.03.00

getLabelIndex

@Deprecated
public int getLabelIndex(String label)
Deprecated. May swallow an IOException while reading the labels; use getLabelIdx(String) instead.

Get the index of the column having the given label. The getLine() method returns an array of field values for a single record of data. This method returns the index of a member of that array based on the specified field name. The first field has the index 0.

Parameters:
label - The field name.
Returns:
The index of the field name, or -1 if the label does not exist.
Since:
ostermillerutils 1.03.00

getLabelIdx

public int getLabelIdx(String label)
                throws IOException
Get the index of the column having the given label. The getLine() method returns an array of field values for a single record of data. This method returns the index of a member of that array based on the specified field name. The first field has the index 0.

Parameters:
label - The field name.
Returns:
The index of the field name, or -1 if the label does not exist.
Throws:
IOException - if an IO error occurs
Since:
ostermillerutils 1.04.02
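
Since -1 signals a missing label, it is worth checking the result once before using it to index the arrays returned by getLine(). A sketch of that pattern using plain arrays; indexOfLabel and requireLabel are hypothetical helpers that model the documented return values and are not library methods:

```java
import java.util.NoSuchElementException;

class LabelIndexSketch {
    // Models the documented contract: zero-based index of the label, or -1 if absent.
    static int indexOfLabel(String[] labels, String label) {
        for (int i = 0; i < labels.length; i++) {
            if (labels[i].equals(label)) {
                return i;
            }
        }
        return -1;
    }

    // Fails fast instead of letting -1 reach an array access.
    static int requireLabel(String[] labels, String label) {
        int idx = indexOfLabel(labels, label);
        if (idx == -1) {
            throw new NoSuchElementException("No column labeled " + label);
        }
        return idx;
    }

    public static void main(String[] args) {
        String[] labels = {"name", "age"};                     // stand-in for getLabels()
        String[][] lines = {{"Alice", "30"}, {"Bob", "25"}};   // stand-in for getLine() results
        int ageIdx = requireLabel(labels, "age");              // resolved once, reused per line
        for (String[] line : lines) {
            System.out.println(line[ageIdx]);
        }
    }
}
```

Resolving the index before the loop avoids repeating the lookup for every line and surfaces a misspelled label immediately.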

getValueByLabel

public String getValueByLabel(String label)
                       throws IllegalStateException
Given the label for the column, get the value in that column from the last line that was read. If the column cannot be found in the line, null is returned.

Parameters:
label - The field name.
Returns:
the value from the last line read or null if there is no such value
Throws:
IllegalStateException - if nextValue() was used to read any part of the last line; nextValue() is not compatible with this method.
Since:
ostermillerutils 1.03.00

close

public void close()
           throws IOException
Close any stream upon which this parser is based.

Specified by:
close in interface CSVParse
Throws:
IOException - if an error occurs while closing the stream.
Since:
ostermillerutils 1.03.00

Copyright © 2001-2012 by Stephen Ostermiller