public class CSVParser extends java.lang.Object implements CSVParse
If a field includes a comma or a new line, the whole field must be surrounded with double quotes. When the field is in quotes, any quote literals must be escaped as \" and any backslash literals must be escaped as \\. Otherwise a backslash and the character following it will be treated as the following character, i.e. "\n" is equivalent to "n". Other escape sequences may be set using the setEscapes() method. Text that comes after the closing quote of a field but before the next comma will be ignored.
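The escape rule above can be sketched in a few lines of plain Java. This is a minimal illustration, not the parser's actual implementation: the class `EscapeSketch` and the method `unescape` are hypothetical names, the two string arguments mirror the arguments of setEscapes(), and keeping a lone trailing backslash as is is an assumption.

```java
final class EscapeSketch {
    /**
     * Hypothetical helper (not part of CSVParser): resolves backslash
     * sequences in the text found between a field's quotes.
     */
    static String unescape(String quoted, String escapes, String replacements) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < quoted.length(); i++) {
            char c = quoted.charAt(i);
            if (c != '\\' || i + 1 == quoted.length()) {
                // Ordinary character, or a lone trailing backslash (kept as is: an assumption).
                out.append(c);
                continue;
            }
            char next = quoted.charAt(++i);
            int idx = escapes.indexOf(next);
            if (idx >= 0 && idx < replacements.length()) {
                // Configured escape, for example n mapped to a newline.
                out.append(replacements.charAt(idx));
            } else {
                // Default rule: the backslash is dropped and the next character kept.
                out.append(next);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(unescape("a\\nb", "", ""));           // prints anb
        System.out.println(unescape("say \\\"hi\\\"", "", ""));  // prints say "hi"
    }
}
```

With no extra escapes configured, the sequence \n collapses to the letter n. Passing "nrtf" and "\n\r\t\f" as the last two arguments turns it into a real newline instead.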
Empty fields are returned as a String of length zero: "". The following line has three empty
fields and three non-empty fields in it. There is an empty field on each end, and one in the
middle. One token is returned as a space.
,second,," ",fifth,
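To make the field count for that line concrete, here is a small stand-alone sketch that splits one line on commas while respecting double quotes. It is not the library's code: `EmptyFieldSketch` and `splitLine` are hypothetical names, and the sketch deliberately ignores backslash escapes and the rule about text after a closing quote.

```java
import java.util.ArrayList;
import java.util.List;

final class EmptyFieldSketch {
    /** Hypothetical helper: splits a single line into fields, honoring double quotes. */
    static List<String> splitLine(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '"') {
                inQuotes = !inQuotes;           // quotes delimit the field and are not part of it
            } else if (c == ',' && !inQuotes) {
                fields.add(current.toString()); // an empty builder yields an empty field
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        fields.add(current.toString());         // the field after the last comma
        return fields;
    }

    public static void main(String[] args) {
        List<String> fields = splitLine(",second,,\" \",fifth,");
        System.out.println(fields.size()); // prints 6
    }
}
```

The six fields are "", "second", "", " ", "fifth" and "": empty strings at both ends and in the middle, plus one field that is a single space.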
Blank lines are always ignored. Other lines will be ignored if they start with a comment character as set by the setCommentStart() method.
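The line-skipping rule can be pictured as a filter applied before any field parsing. The sketch below is an illustration only: `CommentFilterSketch` and `keepDataLines` are hypothetical names, and it assumes that "blank" means a line of length zero and that the comment character must be the very first character of the line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CommentFilterSketch {
    /** Hypothetical helper: drops blank lines and lines starting with a comment character. */
    static List<String> keepDataLines(List<String> lines, String commentDelims) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // blank lines are always ignored
            }
            if (commentDelims.indexOf(line.charAt(0)) >= 0) {
                continue; // first character is one of the comment characters
            }
            kept.add(line);
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("# header", "", "a,b", "; note", "c,d");
        System.out.println(keepDataLines(input, "#;!")); // prints [a,b, c,d]
    }
}
```

Passing an empty string as `commentDelims` models the default behavior, where only blank lines are dropped.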
An example of how CSVParser might be used:
CSVParser shredder = new CSVParser(System.in);
shredder.setCommentStart("#;!");
shredder.setEscapes("nrtf", "\n\r\t\f");
String t;
while ((t = shredder.nextValue()) != null) {
    System.out.println("" + shredder.lastLineNumber() + " " + t);
}
Some applications do not output CSV according to the generally accepted standards, and this parser may not be able to handle their output. One such application is the Microsoft Excel spreadsheet. A separate class must be used to read Excel CSV.
See Also: ExcelCSVParser
Constructor | Description |
---|---|
CSVParser(java.io.InputStream in) | Create a parser to parse comma separated values from an InputStream. |
CSVParser(java.io.InputStream in, char delimiter) | Create a parser to parse delimited values from an InputStream. |
CSVParser(java.io.InputStream in, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Create a parser to parse delimited values from an InputStream. |
CSVParser(java.io.InputStream in, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Create a parser to parse comma separated values from an InputStream. |
CSVParser(java.io.Reader in) | Create a parser to parse comma separated values from a Reader. |
CSVParser(java.io.Reader in, char delimiter) | Create a parser to parse delimited values from a Reader. |
CSVParser(java.io.Reader in, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Create a parser to parse delimited values from a Reader. |
CSVParser(java.io.Reader in, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Create a parser to parse comma separated values from a Reader. |
Modifier and Type | Method | Description |
---|---|---|
void | changeDelimiter(char newDelim) | Change this parser so that it uses a new delimiter. |
void | changeQuote(char newQuote) | Change this parser so that it uses a new character for quoting. |
void | close() | Close any stream upon which this parser is based. |
java.lang.String[][] | getAllValues() | Get all the values from the file. |
int | getLastLineNumber() | Get the number of the line from which the last value was retrieved. |
java.lang.String[] | getLine() | Get all the values from a line. |
int | lastLineNumber() | Get the line number that the last token came from. |
java.lang.String | nextValue() | Get the next value. |
static java.lang.String[][] | parse(java.io.Reader in) | Parse the comma delimited data from a stream. |
static java.lang.String[][] | parse(java.io.Reader in, char delimiter) | Parse the delimited data from a stream. |
static java.lang.String[][] | parse(java.io.Reader in, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Parse the delimited data from a stream. |
static java.lang.String[][] | parse(java.io.Reader in, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Parse the comma delimited data from a stream. |
static java.lang.String[][] | parse(java.lang.String s) | Parse the comma delimited data from a string. |
static java.lang.String[][] | parse(java.lang.String s, char delimiter) | Parse the delimited data from a string. |
static java.lang.String[][] | parse(java.lang.String s, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Parse the delimited data from a string. |
static java.lang.String[][] | parse(java.lang.String s, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) | Parse the comma delimited data from a string. |
void | setCommentStart(java.lang.String commentDelims) | Set the characters that indicate a comment at the beginning of the line. |
void | setEscapes(java.lang.String escapes, java.lang.String replacements) | Specify escape sequences and their replacements. |
public CSVParser(java.io.InputStream in)
Byte to character conversion is done using the platform default locale.
Parameters:
in - stream that contains comma separated values.

public CSVParser(java.io.InputStream in, char delimiter) throws BadDelimiterException
Byte to character conversion is done using the platform default locale.
Parameters:
in - stream that contains comma separated values.
delimiter - record separator
Throws:
BadDelimiterException - if the specified delimiter cannot be used

public CSVParser(java.io.Reader in)
Parameters:
in - reader that contains comma separated values.

public CSVParser(java.io.Reader in, char delimiter) throws BadDelimiterException
Parameters:
in - reader that contains comma separated values.
delimiter - record separator
Throws:
BadDelimiterException - if the specified delimiter cannot be used

public CSVParser(java.io.InputStream in, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) throws BadDelimiterException
Byte to character conversion is done using the platform default locale.
Parameters:
in - stream that contains comma separated values.
escapes - a list of characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.
delimiter - record separator
Throws:
BadDelimiterException - if the specified delimiter cannot be used

public CSVParser(java.io.InputStream in, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims)
Byte to character conversion is done using the platform default locale.
Parameters:
in - stream that contains comma separated values.
escapes - a list of characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.

public CSVParser(java.io.Reader in, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) throws BadDelimiterException
Parameters:
in - reader that contains comma separated values.
escapes - a list of characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.
delimiter - record separator
Throws:
BadDelimiterException - if the specified delimiter cannot be used

public CSVParser(java.io.Reader in, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims)
Parameters:
in - reader that contains comma separated values.
escapes - a list of characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.

public void close() throws java.io.IOException
public java.lang.String nextValue() throws java.io.IOException

public int lastLineNumber()
New line breaks that occur in the middle of a token are not counted in the line number count.
Specified by:
lastLineNumber in interface CSVParse

public java.lang.String[] getLine() throws java.io.IOException
If the line has already been partially read, only the values that have not already been read will be included.

public java.lang.String[][] getAllValues() throws java.io.IOException
If the file has already been partially read, only the values that have not already been read will be included.
Each line of the file that has at least one value will be represented. Comments and empty lines are ignored.
The resulting double array may be jagged.
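Because rows may differ in length, code that reads the result by column index should check each row's length first. The sketch below shows one defensive way to do that. It runs on a hand-written array rather than real parser output, and `JaggedSketch`, `maxColumns` and `cell` are hypothetical names that are not part of this class.

```java
final class JaggedSketch {
    /** Hypothetical helper: width of the widest row in a possibly jagged array. */
    static int maxColumns(String[][] rows) {
        int max = 0;
        for (String[] row : rows) {
            max = Math.max(max, row.length);
        }
        return max;
    }

    /** Hypothetical helper: returns the cell, or "" when the row or column does not exist. */
    static String cell(String[][] rows, int row, int col) {
        if (row < 0 || row >= rows.length) {
            return "";
        }
        String[] r = rows[row];
        return (col >= 0 && col < r.length) ? r[col] : "";
    }

    public static void main(String[] args) {
        // Stand-in for a String[][] result: three rows of different lengths.
        String[][] rows = { {"a", "b", "c"}, {"d"}, {"e", "f"} };
        System.out.println(maxColumns(rows));             // prints 3
        System.out.println("[" + cell(rows, 1, 2) + "]"); // prints []
    }
}
```

Returning an empty string for a missing cell matches how empty fields are represented elsewhere in this class. A caller that needs to tell "missing" apart from "empty" would need a different sentinel.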
Specified by:
getAllValues in interface CSVParse
Throws:
java.io.IOException - if an error occurs while reading.

public void setEscapes(java.lang.String escapes, java.lang.String replacements)
setEscapes("nrtf", "\n\r\t\f");
Parameters:
escapes - a list of characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.

public void changeDelimiter(char newDelim) throws BadDelimiterException
The initial delimiter is a comma. The delimiter cannot be changed to a quote or other character that has special meaning in CSV.
Specified by:
changeDelimiter in interface CSVParse
Parameters:
newDelim - delimiter to which to switch.
Throws:
BadDelimiterException - if the character cannot be used as a delimiter.

public void changeQuote(char newQuote) throws BadQuoteException
The initial quote character is a double quote ("). The quote character cannot be changed to a comma or other character that has special meaning in CSV.
Specified by:
changeQuote in interface CSVParse
Parameters:
newQuote - character to use for quoting.
Throws:
BadQuoteException - if the character cannot be used as a quote.

public void setCommentStart(java.lang.String commentDelims)
# Comment
; Another Comment
! Yet another comment
By default there are no comments in CSV files. Commas and quotes may not be used to indicate comment lines.
Parameters:
commentDelims - list of characters a comment line may start with.

public int getLastLineNumber()
Specified by:
getLastLineNumber in interface CSVParse

public static java.lang.String[][] parse(java.lang.String s)
Only escaped backslashes and quotes will be recognized as escape sequences. The data will be treated as having no comments.
Parameters:
s - string with comma delimited data to parse.

public static java.lang.String[][] parse(java.lang.String s, char delimiter) throws BadDelimiterException
Only escaped backslashes and quotes will be recognized as escape sequences. The data will be treated as having no comments.
Parameters:
s - string with delimited data to parse.
delimiter - record separator
Throws:
BadDelimiterException - if the character cannot be used as a delimiter.

public static java.lang.String[][] parse(java.lang.String s, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims)
Parameters:
s - string with comma delimited data to parse.
escapes - a list of additional characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.

public static java.lang.String[][] parse(java.lang.String s, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) throws BadDelimiterException
Parameters:
s - string with delimited data to parse.
escapes - a list of additional characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.
delimiter - record separator
Throws:
BadDelimiterException - if the character cannot be used as a delimiter.

public static java.lang.String[][] parse(java.io.Reader in, char delimiter) throws java.io.IOException, BadDelimiterException
Only escaped backslashes and quotes will be recognized as escape sequences. The data will be treated as having no comments.
Parameters:
in - Reader with comma delimited data to parse.
delimiter - record separator
Throws:
BadDelimiterException - if the character cannot be used as a delimiter.
java.io.IOException - if an error occurs while reading.

public static java.lang.String[][] parse(java.io.Reader in) throws java.io.IOException
Only escaped backslashes and quotes will be recognized as escape sequences. The data will be treated as having no comments.
Parameters:
in - Reader with comma delimited data to parse.
Throws:
java.io.IOException - if an error occurs while reading.

public static java.lang.String[][] parse(java.io.Reader in, char delimiter, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) throws java.io.IOException, BadDelimiterException
Parameters:
in - Reader with delimited data to parse.
delimiter - record separator
escapes - a list of additional characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.
Throws:
BadDelimiterException - if the character cannot be used as a delimiter.
java.io.IOException - if an error occurs while reading.

public static java.lang.String[][] parse(java.io.Reader in, java.lang.String escapes, java.lang.String replacements, java.lang.String commentDelims) throws java.io.IOException
Parameters:
in - Reader with comma delimited data to parse.
escapes - a list of additional characters that will represent escape sequences.
replacements - the list of replacement characters for those escape sequences.
commentDelims - list of characters a comment line may start with.
Throws:
java.io.IOException - if an error occurs while reading.

Copyright (c) 2001-2020 by Stephen Ostermiller