com.holub.text
Class Scanner

java.lang.Object
  extended by com.holub.text.Scanner

public class Scanner
extends Object

A Scanner lets you read input, supplied as either a Reader or a String, as a sequence of tokens.

See the source code for Database (in the distribution jar) for an example of how a token set is used in conjunction with a Scanner. Here's a stripped-down version:

First create a token set:

        private static final TokenSet tokens = new TokenSet();

        private static final Token
                COMMA      = tokens.create( "',"      ),
                EQUAL      = tokens.create( "'="      ),
                LP         = tokens.create( "'("      ),
                RP         = tokens.create( "')"      ),
                DOT        = tokens.create( "'."      ),
                STAR       = tokens.create( "'*"      ),
                SLASH      = tokens.create( "'/"      ),
                AND        = tokens.create( "'AND"    ),
                BEGIN      = tokens.create( "'BEGIN"  ),
                CREATE     = tokens.create( "'CREATE" ),
                //...
                INTEGER    = tokens.create( "(small|tiny|big)?int(eger)?" ),
                IDENTIFIER = tokens.create( "[a-zA-Z_0-9/\\\\:~]+"        );
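
The INTEGER and IDENTIFIER specifications above read as regular expressions. The following is a minimal sketch of what those two patterns accept, assuming they behave like ordinary java.util.regex patterns applied to a whole lexeme. That is an assumption: this page does not say how TokenSet compiles its specifications or whether it matches case-sensitively. TokenPatternSketch and its helper methods are hypothetical names used only for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of com.holub.text. It compiles the same
// pattern strings with java.util.regex to show what they accept.
class TokenPatternSketch {
    static final Pattern INTEGER    = Pattern.compile("(small|tiny|big)?int(eger)?");
    static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z_0-9/\\\\:~]+");

    // True if the whole lexeme matches the INTEGER pattern.
    static boolean isInteger(String lexeme) {
        return INTEGER.matcher(lexeme).matches();
    }

    // True if the whole lexeme matches the IDENTIFIER pattern.
    static boolean isIdentifier(String lexeme) {
        return IDENTIFIER.matcher(lexeme).matches();
    }

    public static void main(String[] args) {
        System.out.println(isInteger("bigint"));           // true
        System.out.println(isInteger("smallinteger"));     // true
        System.out.println(isInteger("inte"));             // false
        System.out.println(isIdentifier("C:\\temp\\db"));  // true: backslash and colon are allowed
        System.out.println(isIdentifier("my-table"));      // false: '-' is not in the character class
    }
}
```

Note that a lexeme such as "bigint" satisfies both patterns under this reading. How TokenSet resolves that overlap is not described on this page.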
        
Then create and initialize the scanner. The following method scans input from a string rather than from a file:

        private Scanner in;

        public Table execute( String expression ) throws IOException, ParseFailure
        {       try
                {       this.expression = expression;
                        in              = new Scanner(tokens, expression);
                        in.advance();   // advance to the first token.
                        return statement();
                }
                catch( ParseFailure e )
                {       if( transactionLevel > 0 )
                                rollback();
                }
                //...
        }
        
The Scanner uses a "match/advance" strategy. Rather than read tokens that you might have to push back, you first check if the next token is the one you want, and then advance past it if so.

        // statement
        //      ::= CREATE  DATABASE IDENTIFIER
        //      |   CREATE  TABLE    IDENTIFIER LP idList RP
        //
        Table statement() throws ParseFailure
        {
                // The matchAdvance(CREATE) call skips past (and returns)
                // the CREATE token if it's the next input token, otherwise
                // it returns null.

                if( in.matchAdvance(CREATE) != null )
                {
                        // Here, I'm doing match and advance as two separate
                        // operations.

                        if( in.match( DATABASE ) )
                        {       in.advance();
                                createDatabase( in.required( IDENTIFIER ) );
                        }
                        else // must be CREATE TABLE
                        {       
                                // This required() call throws an exception
                                // if the next input token isn't a TABLE. If
                                // a TABLE token is found, then we'll advance past
                                // it automatically.

                                in.required( TABLE );
                                String tableName = in.required( IDENTIFIER );
                                in.required( LP );
                                createTable( tableName, declarations() );
                                in.required( RP );
                        }
                }
                //...
        }
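
The fragment above depends on the rest of Database, so it cannot be run on its own. As a self-contained illustration of the same match/advance idea, here is a minimal sketch. ToyScanner is a hypothetical stand-in, not Holub's implementation: it splits the input on whitespace and compares words case-insensitively instead of using a TokenSet, and it throws IllegalStateException rather than ParseFailure. Only the method names match, advance, matchAdvance, and required mirror the API documented on this page. requiredName is an invented helper standing in for required( IDENTIFIER ).

```java
// Hypothetical stand-in for com.holub.text.Scanner. Each whitespace-separated
// word is one token; the real class recognizes tokens through a TokenSet.
class ToyScanner {
    private final String[] words;
    private int current = 0;   // index of the current (not yet consumed) token

    ToyScanner(String input) {
        String trimmed = input.trim();
        words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // True if the current token equals the candidate; consumes nothing.
    boolean match(String candidate) {
        return current < words.length && words[current].equalsIgnoreCase(candidate);
    }

    // Step past the current token and return it (null at end of input).
    String advance() {
        return current < words.length ? words[current++] : null;
    }

    // Match and advance in one call: the lexeme on success, null otherwise.
    String matchAdvance(String candidate) {
        return match(candidate) ? advance() : null;
    }

    // Like matchAdvance, but a mismatch is an error.
    String required(String candidate) {
        String lexeme = matchAdvance(candidate);
        if (lexeme == null)
            throw new IllegalStateException(candidate + " expected");
        return lexeme;
    }

    // Invented helper: consume whatever word comes next as a name.
    String requiredName() {
        String lexeme = advance();
        if (lexeme == null)
            throw new IllegalStateException("name expected");
        return lexeme;
    }
}

class MatchAdvanceSketch {
    // statement ::= CREATE DATABASE name | CREATE TABLE name
    static String statement(String input) {
        ToyScanner in = new ToyScanner(input);
        if (in.matchAdvance("CREATE") != null) {
            if (in.match("DATABASE")) {   // look first...
                in.advance();             // ...then consume
                return "database:" + in.requiredName();
            }
            in.required("TABLE");         // throws if TABLE is not next
            return "table:" + in.requiredName();
        }
        return "unrecognized";
    }

    public static void main(String[] args) {
        System.out.println(statement("CREATE DATABASE inventory")); // database:inventory
        System.out.println(statement("create table parts"));        // table:parts
    }
}
```

Because match never consumes input, the parser can test several alternatives in a row without any push-back mechanism; input moves forward only when advance, matchAdvance, or required succeeds.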
        

©2004 Allen I. Holub. All rights reserved.

This code may be used freely by yourself with the following restrictions:

  1. Your splash screen, about box, or equivalent, must include Allen Holub's name, copyright, and URL. For example:

    This program contains Allen Holub's SQL package.
    (c) 2005 Allen I. Holub. All Rights Reserved.
    http://www.holub.com


    If your program does not run interactively, then the foregoing notice must appear in your documentation.
  2. You may not redistribute (or mirror) the source code.
  3. You must report any bugs that you find to me. Use the form at http://www.holub.com/company/contact.html or send email.
  4. The software is supplied as is. Neither Allen Holub nor Holub Associates are responsible for any bugs (or any problems caused by bugs, including lost productivity or data) in any of this code.

Nested Class Summary
static class Scanner.Test
           
 
Constructor Summary
Scanner(TokenSet tokens, Reader inputReader)
          Create a Scanner for the indicated token set, which will get input from the indicated Reader.
Scanner(TokenSet tokens, String input)
          Create a Scanner for the indicated token set, which will get input from the indicated string.
 
Method Summary
 Token advance()
          Advance the input to the next token and return the current token (the one in the input before the advance).
 ParseFailure failure(String message)
          Throw a ParseFailure object initialized for the current input position.
 boolean match(Token candidate)
          Return true if the current token matches the candidate token.
 String matchAdvance(Token candidate)
          Combines the match and advance operations.
 String required(Token candidate)
          If the specified candidate is the current token, advance past it and return the lexeme; otherwise, throw an exception with the error message "XXX Expected".
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Scanner

public Scanner(TokenSet tokens,
               String input)
Create a Scanner for the indicated token set, which will get input from the indicated string.


Scanner

public Scanner(TokenSet tokens,
               Reader inputReader)
Create a Scanner for the indicated token set, which will get input from the indicated Reader.

Method Detail

match

public boolean match(Token candidate)
Return true if the current token matches the candidate token.


advance

public Token advance()
              throws ParseFailure
Advance the input to the next token and return the current token (the one in the input before the advance). This returned token is valid only until the next advance() call (at which time the lexeme may change, for example).

Throws:
ParseFailure

failure

public ParseFailure failure(String message)
Throw a ParseFailure object initialized for the current input position. This method lets a parser that's using the current scanner report an error in a way that identifies where in the input the error occurred.

Parameters:
message - the "message" (as returned by Throwable.getMessage) to attach to the ParseFailure object.
Throws:
ParseFailure - always.
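
As an illustration of position-aware error reporting, here is a minimal self-contained sketch. Everything in it is hypothetical: ToyFailure is not ParseFailure, FailureSketch is not Scanner, and the character offset it records is an assumption about what "current input position" might include. In this sketch failure() builds the exception and the caller throws it, which is one plausible reading of the signature above; it is not a statement about how the real method behaves.

```java
// Hypothetical exception that remembers where in the input the error occurred.
class ToyFailure extends Exception {
    final int offset;

    ToyFailure(String message, int offset) {
        super(message + " at offset " + offset);
        this.offset = offset;
    }
}

// Hypothetical mini-parser for comma-separated numbers such as "12,7".
class FailureSketch {
    private final String input;
    private int position = 0;   // index of the next unread character

    FailureSketch(String input) {
        this.input = input;
    }

    // Build a failure tagged with the current position; the caller throws it.
    ToyFailure failure(String message) {
        return new ToyFailure(message, position);
    }

    // Consume a run of digits; anything else is reported where it was found.
    int number() throws ToyFailure {
        int start = position;
        while (position < input.length() && Character.isDigit(input.charAt(position)))
            position++;
        if (position == start)
            throw failure("digit expected");
        return Integer.parseInt(input.substring(start, position));
    }

    // Consume one specific character or report its absence.
    void skip(char c) throws ToyFailure {
        if (position < input.length() && input.charAt(position) == c)
            position++;
        else
            throw failure("'" + c + "' expected");
    }

    public static void main(String[] args) {
        FailureSketch in = new FailureSketch("12,x");
        try {
            System.out.println(in.number());   // 12
            in.skip(',');
            in.number();                       // fails on 'x'
        } catch (ToyFailure e) {
            System.out.println(e.getMessage()); // digit expected at offset 3
        }
    }
}
```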

matchAdvance

public String matchAdvance(Token candidate)
                    throws ParseFailure
Combines the match and advance operations. Advance automatically if the match occurs.

Returns:
the lexeme if there was a match and the input was advanced, null if there was no match (the input is not advanced).
Throws:
ParseFailure

required

public final String required(Token candidate)
                      throws ParseFailure
If the specified candidate is the current token, advance past it and return the lexeme; otherwise, throw an exception with the error message "XXX Expected".

Throws:
ParseFailure - if the required token isn't the current token.