Class StreamTokenizer
A StreamTokenizer similar to Java's. It breaks an input stream (read from a TextReader) into Tokens according to configurable settings. The settings are stored in the Settings property, which is a StreamTokenizerSettings instance.
Namespace: RTools_NTS.Util
Assembly: NetTopologySuite.dll
Syntax
public class StreamTokenizer : IEnumerable<Token>, IEnumerable
Remarks
The tokenizer is configurable: you can modify the Settings.CharTypes[] array to specify the type of each character, and change other settings such as whether to look for comments.
WARNING: This is not internationalized. All characters beyond the 7-bit ASCII range (above decimal 127) are treated as Word characters.
There are two main ways to use this class: 1) parse the entire stream at once and collect the Tokens in a list (see the Tokenize* methods), and 2) call NextToken() repeatedly. The tokenizer reads from a TextReader, which you can set directly, and it also provides convenience methods for parsing files and strings. It returns an Eof token when the end of the input is reached.
Here's an example of the NextToken() style of use:
StreamTokenizer tokenizer = new StreamTokenizer();
// GrabWhitespace is a tokenizer setting, so it is set through Settings.
tokenizer.Settings.GrabWhitespace = true;
tokenizer.TextReader = File.OpenText(fileName);
Token token;
while (tokenizer.NextToken(out token)) Console.WriteLine("Token = '{0}'", token);
Here's an example of the Tokenize... style of use:
StreamTokenizer tokenizer = new StreamTokenizer("some string");
// Tokenize takes an IList<Token>, so use a generic list (System.Collections.Generic).
List<Token> tokens = new List<Token>();
if (!tokenizer.Tokenize(tokens))
{
// error handling
}
foreach (Token t in tokens) Console.WriteLine("t = {0}", t);
Comment delimiters are hardcoded (// and /*) and are not affected by the character type table.
The tokenizer sets a line number on each token it produces, normally the line on which the token starts. There is one known caveat: when the GrabWhitespace setting is true and a whitespace token contains a newline, that token's line number is set to the following line rather than the line on which the token started.
Constructors
StreamTokenizer()
Default constructor.
Declaration
public StreamTokenizer()
StreamTokenizer(TextReader)
Construct and set this object's TextReader to the one specified.
Declaration
public StreamTokenizer(TextReader sr)
Parameters
Type | Name | Description |
---|---|---|
TextReader | sr | The TextReader to read from. |
StreamTokenizer(TextReader, StreamTokenizerSettings)
Construct and set this object's TextReader and tokenizer settings to the ones specified.
Declaration
public StreamTokenizer(TextReader sr, StreamTokenizerSettings tokenizerSettings)
Parameters
Type | Name | Description |
---|---|---|
TextReader | sr | The TextReader to read from. |
StreamTokenizerSettings | tokenizerSettings | Tokenizer settings. |
StreamTokenizer(string)
Construct and set a string to tokenize.
Declaration
public StreamTokenizer(string str)
Parameters
Type | Name | Description |
---|---|---|
string | str | The string to tokenize. |
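As a minimal sketch of the string constructor: the class declaration shows that StreamTokenizer implements IEnumerable<Token>, so a tokenizer built from a string can be walked with foreach. The input text and the class name EnumerateTokens are illustrative only, and the printed form of each token depends on Token's ToString(), which this page does not describe.

```csharp
using System;
using RTools_NTS.Util;

class EnumerateTokens
{
    static void Main()
    {
        // The string constructor supplies the input; no TextReader is set by hand.
        var tokenizer = new StreamTokenizer("POINT (10 20)");

        // StreamTokenizer implements IEnumerable<Token>, so foreach calls GetEnumerator().
        foreach (Token token in tokenizer)
        {
            Console.WriteLine("Token = '{0}'", token);
        }
    }
}
```

This form suits small inputs where no per-token error handling is needed; use NextToken(out Token) when you want to inspect the return value of each read.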
Fields
NChars
This is the number of characters in the character table.
Declaration
public static readonly int NChars
Field Value
Type | Description |
---|---|
int |
Properties
Settings
The settings which govern the behavior of the tokenization.
Declaration
public StreamTokenizerSettings Settings { get; }
Property Value
Type | Description |
---|---|
StreamTokenizerSettings |
TextReader
This is the TextReader that this object will read from. Set this to set the input reader for the parse.
Declaration
public TextReader TextReader { get; set; }
Property Value
Type | Description |
---|---|
TextReader |
Methods
Display()
Display the state of this object.
Declaration
public void Display()
Display(string)
Display the state of this object, with a per-line prefix.
Declaration
public void Display(string prefix)
Parameters
Type | Name | Description |
---|---|---|
string | prefix | The per-line prefix. |
GetEnumerator()
Returns an enumerator that iterates through the collection.
Declaration
public IEnumerator<Token> GetEnumerator()
Returns
Type | Description |
---|---|
IEnumerator<Token> | An IEnumerator<T> that can be used to iterate through the collection. |
NextToken(out Token)
Get the next token. The last token will be an EofToken, unless there is an unterminated quote or unterminated block comment and Settings.DoUntermCheck is true; in that case this method throws a StreamTokenizerUntermException or a subclass of it.
Declaration
public bool NextToken(out Token token)
Parameters
Type | Name | Description |
---|---|---|
Token | token | The output token. |
Returns
Type | Description |
---|---|
bool | bool - true for success, false for failure. |
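The unterminated-input behavior described for NextToken can be guarded with a try/catch. This is a sketch under assumptions: it assumes Settings.DoUntermCheck is a settable bool and that EofToken and StreamTokenizerUntermException are public types in RTools_NTS.Util, as their mention above suggests. The input string and the class name NextTokenWithUntermCheck are illustrative.

```csharp
using System;
using System.IO;
using RTools_NTS.Util;

class NextTokenWithUntermCheck
{
    static void Main()
    {
        // Input with a quote that is never closed.
        var tokenizer = new StreamTokenizer(new StringReader("name = \"unterminated"));

        // Assumption: DoUntermCheck is a settable bool on StreamTokenizerSettings.
        tokenizer.Settings.DoUntermCheck = true;

        try
        {
            Token token;
            while (tokenizer.NextToken(out token))
            {
                // Stop explicitly at end of input, whatever NextToken returns for it.
                if (token is EofToken) break;
                Console.WriteLine("Token = '{0}'", token);
            }
        }
        catch (StreamTokenizerUntermException ex)
        {
            // Raised for an unterminated quote or block comment when the check is on.
            Console.WriteLine("Unterminated input: {0}", ex.Message);
        }
    }
}
```

The explicit EofToken test keeps the loop correct regardless of whether NextToken reports the end of input through its return value.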
Tokenize(IList<Token>)
Parse the rest of the stream and append all the tokens to the supplied list. This resets the line number to 1.
Declaration
public bool Tokenize(IList<Token> tokens)
Parameters
Type | Name | Description |
---|---|---|
IList<Token> | tokens | The list to append to. |
Returns
Type | Description |
---|---|
bool | bool - true for success |
TokenizeFile(string)
Tokenize a file completely and return the tokens in a Token[].
Declaration
public Token[] TokenizeFile(string fileName)
Parameters
Type | Name | Description |
---|---|---|
string | fileName | The file to tokenize. |
Returns
Type | Description |
---|---|
Token[] | A Token[] with all tokens. |
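A minimal sketch of TokenizeFile(string), assuming nothing beyond the signature shown above. It writes a small temporary file so the example is self-contained; the file content and the class name TokenizeFileExample are illustrative. The null check is defensive, since this page does not say what is returned on failure.

```csharp
using System;
using System.IO;
using RTools_NTS.Util;

class TokenizeFileExample
{
    static void Main()
    {
        // Create a throwaway input file so the sketch runs on its own.
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, "LINESTRING (0 0, 1 1)\n");

            var tokenizer = new StreamTokenizer();
            Token[] tokens = tokenizer.TokenizeFile(path);

            // Defensive: the failure behavior is not documented on this page.
            if (tokens == null)
            {
                Console.WriteLine("Tokenizing failed.");
                return;
            }

            Console.WriteLine("Read {0} tokens", tokens.Length);
            foreach (Token t in tokens) Console.WriteLine("t = {0}", t);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```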
TokenizeFile(string, IList<Token>)
Parse all tokens from the specified file and put them into the supplied list.
Declaration
public bool TokenizeFile(string fileName, IList<Token> tokens)
Parameters
Type | Name | Description |
---|---|---|
string | fileName | The file to read. |
IList<Token> | tokens | The list to put tokens in. |
Returns
Type | Description |
---|---|
bool | bool - true for success, false for failure. |
TokenizeReader(TextReader, IList<Token>)
Parse all tokens from the specified TextReader and put them into the supplied list.
Declaration
public bool TokenizeReader(TextReader tr, IList<Token> tokens)
Parameters
Type | Name | Description |
---|---|---|
TextReader | tr | The TextReader to read from. |
IList<Token> | tokens | The list to append to. |
Returns
Type | Description |
---|---|
bool | bool - true for success, false for failure. |
TokenizeStream(Stream, IList<Token>)
Parse all tokens from the specified Stream and put them into the supplied list.
Declaration
public bool TokenizeStream(Stream s, IList<Token> tokens)
Parameters
Type | Name | Description |
---|---|---|
Stream | s | The Stream to read from. |
IList<Token> | tokens | The list to put tokens in. |
Returns
Type | Description |
---|---|
bool | bool - true for success, false for failure. |
TokenizeString(string, IList<Token>)
Parse all tokens from the specified string and put them into the supplied list.
Declaration
public bool TokenizeString(string str, IList<Token> tokens)
Parameters
Type | Name | Description |
---|---|---|
string | str | The string to tokenize. |
IList<Token> | tokens | The list to put tokens in. |
Returns
Type | Description |
---|---|
bool | bool - true for success, false for failure. |
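A short sketch of TokenizeString(string, IList<Token>), using only the signature documented above. A List<Token> satisfies the IList<Token> parameter. The input text and the class name TokenizeStringExample are illustrative.

```csharp
using System;
using System.Collections.Generic;
using RTools_NTS.Util;

class TokenizeStringExample
{
    static void Main()
    {
        var tokenizer = new StreamTokenizer();
        var tokens = new List<Token>();

        // The bool result reports success or failure, per the Returns table.
        if (!tokenizer.TokenizeString("POLYGON ((0 0, 1 0, 1 1, 0 0))", tokens))
        {
            Console.WriteLine("Tokenizing failed.");
            return;
        }

        foreach (Token t in tokens) Console.WriteLine("t = {0}", t);
    }
}
```

The same pattern applies to TokenizeReader and TokenizeStream, which differ only in the input source they accept.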