Character Classification and Conversion

Technical Notes

Overview
Classifying Characters
Converting Characters
Sample Program

Overview

This unit consists of functions which perform character classification and character conversion.

Any strings involved in character conversions are assumed to be ASCIIZ strings (i.e., terminated by a NUL byte).
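
As a brief illustration of the ASCIIZ convention, the sketch below spells out the terminating zero byte explicitly. The names `greeting` and `asciiz_length` are hypothetical and are not part of Spontaneous Assembly; the helper simply counts bytes up to the terminator, as the standard strlen function does.

```c
#include <stddef.h>

/* An ASCIIZ string: the text bytes followed by a single zero byte.
   This is the same layout the compiler produces for the literal "Hi". */
const char greeting[] = { 'H', 'i', '\0' };

/* Hypothetical helper: counts the bytes that precede the terminator. */
size_t asciiz_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}
```

The array occupies three bytes, but the helper reports a length of two because the terminator is not counted.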

Classifying Characters

_is_instr
Determines if a character matches any specified character from a given string.
Spontaneous Assembly provides a rich collection of functions that determine whether a character is a member of a family of characters (uppercase characters, digit characters, etc.). However, because most C compilers provide macros which perform character classification using a table look-up, C interfaces are not provided for many of the Spontaneous Assembly functions in this unit.

Although the table look-up technique is very fast and tight, it requires a 257-byte table to be linked into the program. If this is unacceptable, inline assembly can be used to call the Spontaneous Assembly routines, which do not have the overhead of the table. A C interface is provided for _is_instr because an equivalent function is generally unavailable in C libraries.
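
The table look-up technique mentioned above can be sketched in portable C. This is a minimal illustration, not the code of any particular compiler library: the names `class_table`, `CLASS_DIGIT`, `CLASS_UPPER`, `CLASS_LOWER`, `init_class_table`, and `IS_CLASS` are hypothetical, and the ranges assume the ASCII character set. The table has 257 entries so that it can be indexed by EOF (-1) as well as by every code from 0 through 255.

```c
/* Hypothetical classification bits; real libraries define their own. */
#define CLASS_DIGIT 0x01
#define CLASS_UPPER 0x02
#define CLASS_LOWER 0x04

/* 257 entries: slot 0 is for EOF (-1), slots 1..256 are for codes 0..255. */
unsigned char class_table[257];

/* Look-up macro: one add, one load, one mask.
   The argument must be -1 or a value in the range 0..255. */
#define IS_CLASS(c, mask) (class_table[(c) + 1] & (mask))

/* Fill the table once at start-up (ASCII ranges only). */
void init_class_table(void)
{
    int c;

    for (c = '0'; c <= '9'; c++)
        class_table[c + 1] |= CLASS_DIGIT;
    for (c = 'A'; c <= 'Z'; c++)
        class_table[c + 1] |= CLASS_UPPER;
    for (c = 'a'; c <= 'z'; c++)
        class_table[c + 1] |= CLASS_LOWER;
}
```

After init_class_table has been called, a test such as `IS_CLASS(chr, CLASS_DIGIT)` costs only an indexed load and a mask, which is why the technique is fast; the price is the 257 bytes of data that every program using it must carry.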

Examples of how to use the remaining Spontaneous Assembly character classification functions are given below. Each of these functions returns the JE condition if the character matches the classification and the JNE condition if it does not.

The following code assumes INLINE.H has been included:

Example 1 (is_alnum):                        Example 2 (is_alpha):

    get_chr ();                                  get_chr ();
    is_alnum ();                                 is_alpha ();
    jne     alnum_10                             jne     alpha_10
/* Alphanumeric keystroke */                 /* Handle an "alpha" keystroke */
    jmp     short alnum_20                       jmp     short alpha_20
alnum_10:                                    alpha_10:
/* Non-alphanumeric keystroke */             /* Handle a "non-alpha" keystroke */
alnum_20:                                    alpha_20:

Example 3 (is_cntrl):                        Example 9 (is_punct):

    get_chr ();                                  get_chr ();
    is_cntrl ();                                 is_punct ();
    jne     cntrl_10                             jne     punct_10
/* Handle a control keypress */              /* Punctuation character */
    jmp     short cntrl_20                       jmp     short punct_20
cntrl_10:                                    punct_10:
/* Handle a non-control keypress */          /* Non-punctuation character */
cntrl_20:                                    punct_20:
                                                 put_newline ();

Example 4 (is_digit):                        Example 10 (is_space):

    get_chr ();                                  get_chr ();
    is_digit ();                                 is_space ();
    jne     digit_10                             jne     space_10
/* Numeric keypress */                       /* Space character */
    jmp     short digit_20                       jmp     short space_20
digit_10:                                    space_10:
/* Non-numeric keypress */                   /* Non-space character */
digit_20:                                    space_20:
                                                 put_newline ();

Example 5 (is_graph):                        Example 11 (is_wspace):

    get_chr ();                                  get_chr ();
    is_graph ();                                 is_wspace ();
    jne     graph_10                             jne     wspace_10
/* Graphics character */                     /* A whitespace character */
    jmp     short graph_20                       jmp     short wspace_20
graph_10:                                    wspace_10:
/* Non-graphics character */                 /* Non-whitespace character */
graph_20:                                    wspace_20:
                                                 put_newline ();

Example 6 (is_lower):                        Example 12 (is_xdigit):

    get_chr ();                                  get_chr ();
    is_lower ();                                 is_xdigit ();
    jne     lower_10                             jne     xdigit_10
/* Lowercase character */                    /* Hexadecimal character */
    jmp     short lower_20                       jmp     short xdigit_20
lower_10:                                    xdigit_10:
/* Non-lowercase character */                /* Non-hexadecimal character */
lower_20:                                    xdigit_20:
                                                 put_newline ();

Example 7 (is_upper):                        Example 13 (is_xspace):

    get_chr ();                                  get_chr ();
    is_upper ();                                 is_xspace ();
    jne     is_upper_10                          jne     xspace_10
/* Uppercase character */                    /* Extended space character */
    jmp     short is_upper_20                    jmp     short xspace_20
is_upper_10:                                 xspace_10:
/* Non-uppercase character */                /* Non-extended space character */
is_upper_20:                                 xspace_20:
                                                 put_newline ();

Example 8 (is_print):

    get_chr ();
    is_print ();
    jne     print_10
/* Printable character */
    jmp     short print_20
print_10:
/* Non-printable character */
print_20:
    put_newline ();
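
The kind of test that _is_instr performs, matching a character against any character from a given ASCIIZ string, can be approximated in portable C with the standard strchr function. This is only a sketch of the idea: `chr_in_set` is a hypothetical name, and the actual _is_instr prototype and return convention should be taken from the Spontaneous Assembly reference rather than from this example.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if chr appears in the NUL-terminated
   string set, 0 otherwise. */
int chr_in_set(char chr, const char *set)
{
    /* strchr also matches the terminating NUL, so reject chr == 0 first.
       Treating NUL as "never a member" is a choice made for this sketch. */
    if (chr == '\0')
        return 0;
    return strchr(set, chr) != NULL;
}
```

A call such as `chr_in_set(chr, "+-*/")` then answers whether a keystroke is one of four operator characters without any classification table.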

Converting Characters

_to_lower
Converts an uppercase ASCII letter (A-Z) to lowercase (a-z).
_to_upper
Converts a lowercase ASCII letter (a-z) to uppercase (A-Z).
These functions convert ASCII alphabetic characters between uppercase and lowercase.

They affect only ASCII alphabetic characters that are not already in the indicated case. Note that IBM PC international characters (character codes above 127) are not affected by these functions.
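
The behavior described above can be sketched in portable C. The functions `ascii_to_upper` and `ascii_to_lower` are hypothetical stand-ins written for illustration; they are not the Spontaneous Assembly implementations, and they assume the ASCII character set.

```c
/* Hypothetical portable equivalents of the described behavior. */
unsigned char ascii_to_upper(unsigned char c)
{
    /* Only a-z are changed; every other code, including the IBM PC
       international characters above 127, is returned unchanged. */
    if (c >= 'a' && c <= 'z')
        return (unsigned char)(c - ('a' - 'A'));
    return c;
}

unsigned char ascii_to_lower(unsigned char c)
{
    /* Only A-Z are changed; every other code is returned unchanged. */
    if (c >= 'A' && c <= 'Z')
        return (unsigned char)(c + ('a' - 'A'));
    return c;
}
```

Under these assumptions, digits, punctuation, and codes such as 0x81 pass through both functions untouched, and a letter already in the requested case is returned as is.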

Sample Program

The sample program shown below demonstrates converting characters to uppercase and to lowercase. The program is listed in its entirety.

#include 
#include 
#define ESC 27
void main (void)
{
   char chr;
   _put_str ("Press escape to exit...");
   _put_newline ();
   _put_str ("Press a key... ");
   chr = _get_chr ();
   while (chr != ESC)
   {
      if (chr == 0) /* ignore extended keypress */
         _get_chr ();
      else {
         char high, low;
         _put_str ("  Character: ");
         _put_chr (chr);
         _put_str ("  Uppercase: ");
         _put_chr (_to_upper (chr));
         _put_str ("  Lowercase: ");
         _put_chr (_to_lower (chr));
         _put_newline ();
         _put_str ("Press a key... ");
      }
      chr = _get_chr ();
   }
}

The sample program shown above (CHAR.C) is provided on the distribution diskettes and may be compiled and linked using the following Microsoft C and Borland C command lines:

Microsoft C:

cl /c /I\msc\include /I\sa\include char.c
link char,char,,\sa\lib\_sas \msc\lib\slibce

Borland C:

bcc /c /ms /I\bc\include /I\sa\include char.c
tlink \bc\lib\c0s char,char,,\sa\lib\_sas \bc\lib\cs