Package edu.mit.eecs.parserlib
Interface Parser<NT extends Enum<NT>>
- Type Parameters:
NT - an Enum type with one value for every nonterminal in the grammar
public interface Parser<NT extends Enum<NT>>
A Parser is an immutable object that takes a sequence of characters and returns a parse tree according to some grammar.
Parsers are constructed by calling compile() with a grammar, which might be stored in a string, in a file, or read from a stream.
Once constructed, a Parser object is used by calling parse() on a sequence of characters (represented as a string or file or stream). Its result is a ParseTree showing how that string matches the grammar.
The type parameter NT should be an Enum type with the same (case-insensitive) names as the nonterminals in the grammar. This allows your Java code to refer to nonterminals with static checking and type safety. For example, if your grammar is:
    String sumGrammar = "expression ::= number '+' number ; number ::= [0-9]+;";
then you should create a nonterminal enum like this:
    enum SumGrammar { EXPRESSION, NUMBER };
and then use:
    Parser.compile(sumGrammar, SumGrammar.EXPRESSION)
to compile it into a parser.
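The case-insensitive correspondence between nonterminal names and enum values can be illustrated with a small standalone sketch. NonterminalNames and lookup are hypothetical names invented for this illustration; they are not part of ParserLib, and the library's own matching may be implemented differently.

```java
// Hypothetical helper, not part of ParserLib: it only illustrates the naming rule.
final class NonterminalNames {
    enum SumGrammar { EXPRESSION, NUMBER }

    /** Returns the enum constant whose name equals grammarName, ignoring case. */
    static <NT extends Enum<NT>> NT lookup(Class<NT> type, String grammarName) {
        for (NT constant : type.getEnumConstants()) {
            if (constant.name().equalsIgnoreCase(grammarName)) {
                return constant;
            }
        }
        // Assumption: a grammar name with no matching constant is treated as an error.
        throw new IllegalArgumentException("no enum value for nonterminal: " + grammarName);
    }

    public static void main(String[] args) {
        System.out.println(lookup(SumGrammar.class, "expression")); // prints EXPRESSION
        System.out.println(lookup(SumGrammar.class, "Number"));     // prints NUMBER
    }
}
```

Iterating over the constants, rather than upper-casing the name and calling Enum.valueOf, keeps the comparison case-insensitive even if an enum does not use all-uppercase constant names.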
The grammar of a grammar is as follows.
@skip whitespaceAndComments {
    grammar ::= ( production | skipBlock )+
    production ::= nonterminal '::=' union ';'
    skipBlock ::= '@skip' nonterminal '{' production* '}'
    union ::= concatenation ('|' concatenation)*
    concatenation ::= repetition*
    repetition ::= unit repeatOperator?
    unit ::= nonterminal | terminal | '(' union ')'
}
nonterminal ::= [a-zA-Z_][a-zA-Z_0-9]*
terminal ::= quotedString | characterSet | anyChar | characterClass
quotedString ::= "'" ([^'\r\n\\] | '\\' . )* "'"    // e.g. 'hello', '\'', '\r\n\t', ''
               | '"' ([^"\r\n\\] | '\\' . )* '"'    // e.g. "world", "\"", "\r\n\t", ""
characterSet ::= '[' ([^\]\r\n\\] | '\\' . )+ ']'   // e.g. [abc], [a-z], [^a-z], [\]], [\r\n\t]
anyChar ::= '.'
repeatOperator ::= [*+?] | '{' ( number | range | upperBound | lowerBound ) '}'
number ::= [0-9]+
range ::= number ',' number
upperBound ::= ',' number
lowerBound ::= number ','
characterClass ::= '\\' [dsw]    // e.g. \d, \s, \w
whitespaceAndComments ::= (whitespace | oneLineComment | blockComment)*
whitespace ::= [ \t\r\n]
oneLineComment ::= '//' [^\r\n]* [\r\n]+
blockComment ::= '/*' [^*]* '*' ([^/]* '*')* '/'
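To make two of the lexical rules above concrete, the sketch below transcribes nonterminal and repeatOperator into java.util.regex patterns. The class and method names are hypothetical, and the regular expressions are one reading of the rules for illustration, not ParserLib's implementation. Both rules sit outside the @skip block, so the sketch assumes no whitespace is permitted inside them.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of ParserLib.
final class MetaGrammarLexemes {
    // nonterminal ::= [a-zA-Z_][a-zA-Z_0-9]*
    static final Pattern NONTERMINAL = Pattern.compile("[a-zA-Z_][a-zA-Z_0-9]*");

    // repeatOperator ::= [*+?] | '{' ( number | range | upperBound | lowerBound ) '}'
    // where number ::= [0-9]+, range ::= number ',' number,
    // upperBound ::= ',' number, lowerBound ::= number ','
    static final Pattern REPEAT_OPERATOR = Pattern.compile(
            "[*+?]|\\{(?:[0-9]+|[0-9]+,[0-9]+|,[0-9]+|[0-9]+,)\\}");

    /** True if the whole string is a nonterminal name under the rule above. */
    static boolean isNonterminal(String s) {
        return NONTERMINAL.matcher(s).matches();
    }

    /** True if the whole string is a repeat operator under the rule above. */
    static boolean isRepeatOperator(String s) {
        return REPEAT_OPERATOR.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isNonterminal("expression")); // true
        System.out.println(isNonterminal("9lives"));     // false: must not start with a digit
        System.out.println(isRepeatOperator("{2,5}"));   // true: a range
        System.out.println(isRepeatOperator("{}"));      // false: braces need a number or bound
    }
}
```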
- Author:
- 6.005/6.031 course staff
-
Field Summary
-
Method Summary
static <NT extends Enum<NT>> Parser<NT> compile(File f, NT rootNonterminal)
    Compile a Parser from a grammar stored in a file.
static <NT extends Enum<NT>> Parser<NT> compile(InputStream in, NT rootNonterminal)
    Compile a Parser from a grammar represented as an InputStream.
static <NT extends Enum<NT>> Parser<NT> compile(Reader in, NT rootNonterminal)
    Compile a Parser from a grammar represented as a Reader stream.
static <NT extends Enum<NT>> Parser<NT> compile(String grammar, NT rootNonterminal)
    Compile a Parser from a grammar represented as a string.
default ParseTree<NT> parse(File f)
    Parses a file based on the grammar internally represented by the parser.
default ParseTree<NT> parse(InputStream stream)
    Parses a stream based on the grammar internally represented by the parser.
ParseTree<NT> parse(Reader in)
    Parses a stream based on the grammar internally represented by the parser.
ParseTree<NT> parse(String string)
    Parses a string based on the grammar internally represented by the parser.
-
Field Details
-
VERSION
- See Also:
- Constant Field Values
-
-
Method Details
-
compile
static <NT extends Enum<NT>> Parser<NT> compile(String grammar, NT rootNonterminal) throws UnableToParseException
Compile a Parser from a grammar represented as a string.
- Type Parameters:
NT - an Enum type that contains one value for every nonterminal in the grammar.
- Parameters:
grammar - the grammar to use
rootNonterminal - the desired root nonterminal in the grammar
- Returns:
a parser for the given grammar that will start parsing at rootNonterminal.
- Throws:
UnableToParseException - if the grammar has a syntax error
-
compile
static <NT extends Enum<NT>> Parser<NT> compile(Reader in, NT rootNonterminal) throws UnableToParseException, IOException
Compile a Parser from a grammar represented as a Reader stream.
- Type Parameters:
NT - an Enum type that contains one value for every nonterminal in the grammar.
- Parameters:
in - contains the grammar
rootNonterminal - the desired root nonterminal in the grammar
- Returns:
a parser for the given grammar that will start parsing at rootNonterminal.
- Throws:
UnableToParseException - if the grammar has a syntax error
IOException - if the stream has an I/O error
-
compile
static <NT extends Enum<NT>> Parser<NT> compile(File f, NT rootNonterminal) throws UnableToParseException, IOException
Compile a Parser from a grammar stored in a file.
- Type Parameters:
NT - an Enum type that contains one value for every nonterminal in the grammar.
- Parameters:
f - file containing the grammar. Required to have UTF-8 encoding; if you need a different encoding, use compile(new FileReader(...), ...) to choose the encoding yourself instead.
rootNonterminal - the desired root nonterminal in the grammar
- Returns:
a parser for the given grammar that will start parsing at rootNonterminal.
- Throws:
UnableToParseException - if the grammar has a syntax error
IOException - if the file is missing or has an I/O error
-
compile
static <NT extends Enum<NT>> Parser<NT> compile(InputStream in, NT rootNonterminal) throws UnableToParseException, IOException
Compile a Parser from a grammar represented as an InputStream.
- Type Parameters:
NT - an Enum type that contains one value for every nonterminal in the grammar.
- Parameters:
in - stream containing the grammar. Required to have UTF-8 encoding; if you need a different encoding, use compile(new InputStreamReader(...), ...) to choose the encoding yourself instead.
rootNonterminal - the desired root nonterminal in the grammar
- Returns:
a parser for the given grammar that will start parsing at rootNonterminal.
- Throws:
UnableToParseException - if the grammar has a syntax error
IOException - if the stream has an I/O error
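The File and InputStream overloads assume UTF-8, so a grammar stored in another encoding has to be wrapped in a Reader that names its charset. The sketch below shows why that matters, using only the standard library: bytes written as ISO-8859-1 decode correctly only when the Reader is given the same charset. EncodingDemo and its methods are hypothetical names for this illustration, and the call to compile is left as a comment so the sketch runs without ParserLib.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of ParserLib.
final class EncodingDemo {
    /** Wraps a byte stream in a Reader that decodes with an explicitly chosen charset. */
    static Reader readerFor(InputStream in, Charset charset) {
        // A Reader built this way is what compile(Reader, NT) would be given.
        return new InputStreamReader(in, charset);
    }

    /** Encodes text with one charset, then decodes the bytes through a Reader using another. */
    static String roundTrip(String text, Charset encodeWith, Charset decodeWith) {
        byte[] bytes = text.getBytes(encodeWith);
        StringBuilder decoded = new StringBuilder();
        try (Reader reader = readerFor(new ByteArrayInputStream(bytes), decodeWith)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                decoded.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return decoded.toString();
    }

    public static void main(String[] args) {
        // \u00e9 is e-acute: a single byte in ISO-8859-1, two bytes in UTF-8.
        String grammar = "word ::= 'caf\u00e9';";
        System.out.println(grammar.equals(
                roundTrip(grammar, StandardCharsets.ISO_8859_1, StandardCharsets.ISO_8859_1))); // true
        System.out.println(grammar.equals(
                roundTrip(grammar, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8)));      // false
    }
}
```

The same wrapping applies to text passed to parse(Reader) when the input to be parsed is not UTF-8.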
-
parse
ParseTree<NT> parse(String string) throws UnableToParseException
Parses a string based on the grammar internally represented by the parser.
- Parameters:
string - string to parse
- Returns:
ParseTree representing a successful parse of the string
- Throws:
UnableToParseException - if the string cannot be parsed, describing approximately where the parsing error occurred
-
parse
ParseTree<NT> parse(Reader in) throws UnableToParseException, IOException
Parses a stream based on the grammar internally represented by the parser.
- Parameters:
in - stream from which to read the text to be parsed.
- Returns:
ParseTree representing a successful parse of the content of the stream
- Throws:
UnableToParseException - if the stream cannot be parsed, describing approximately where the parsing error occurred
IOException - if the stream has an I/O error.
-
parse
default ParseTree<NT> parse(File f) throws UnableToParseException, IOException
Parses a file based on the grammar internally represented by the parser.
- Parameters:
f - file containing the text to be parsed. Required to have UTF-8 encoding; if you need a different encoding, use parse(new FileReader(...)) to choose the encoding yourself instead.
- Returns:
ParseTree representing a successful parse of the content of the file
- Throws:
UnableToParseException - if the file cannot be parsed, describing approximately where the parsing error occurred
IOException - if the file has an I/O error.
-
parse
default ParseTree<NT> parse(InputStream stream) throws UnableToParseException, IOException
Parses a stream based on the grammar internally represented by the parser.
- Parameters:
stream - stream from which to read the text to be parsed. Required to have UTF-8 encoding; if you need a different encoding, use parse(new InputStreamReader(...)) to choose the encoding yourself instead.
- Returns:
ParseTree representing a successful parse of the content of the stream
- Throws:
UnableToParseException - if the stream cannot be parsed, describing approximately where the parsing error occurred
IOException - if the stream has an I/O error.
-