Saturday, November 5, 2011

Masking and Matching

When you have to find things in a list that match things in another list, that's fairly easy.  You simply spin through one list (the shorter one) and for each element there, you see if "WordPos" delivers a match in the other list.  WORDPOS' processing is probably faster by orders of magnitude than anything you can code in REXX using the Bigger Hammer method.

What do you do when you have to find elements in one list that are merely like elements in another list?  Say you need to find all the membernames in a PDS that look like 'BRT**4'.  That's a different problem.  Yes, you could just go get a bigger hammer, or you can reach for the jeweler's screwdrivers.

REXX has two built-in functions, BITOR and BITAND, that can double-team that problem.  BITAND operates on two strings by ANDing them at the bit-level.  BITOR operates on two strings by ORing them at the bit-level.

BITAND returns a 1-bit in every bit position for which both strings have a 1-bit:  BITAND( '73'x , '27'x ) yields '23'x.  '0111 0011' & '0010 0111' = '0010 0011'.

BITOR returns a 1-bit in every bit position for which either string has a 1-bit:  BITOR( '15'x , '24'x ) yields '35'x.  '0001 0101' | '0010 0100' = '0011 0101'.

If there are characters for which it doesn't matter whether they match, those characters can be BITANDed with 'FF'x and BITORed with '00'x.  The result in each case is that the "I don't care" characters are returned intact by the BITAND/BITOR.  For the other characters, the ones for which it does matter whether they match, only an exact match will deliver the proper bit-pattern when they are both BITANDed and BITORed.  Like this:

      memmask = Strip(memmask,"T","*")
      maskl   = Length(memmask)
      lomask  = Translate(memmask, '00'x , "*")
      himask  = Translate(memmask, 'FF'x , "*")
      do Words(mbrlist)                /* each membername            */
         parse var mbrlist mbr mbrlist
         if BitAnd(himask,Left(mbr,maskl)) = ,
             BitOr(lomask,Left(mbr,maskl)) then,
            mbrlist = mbrlist mbr
      end                              /* words                      */

In this case, we have been given 'memmask' which may look like 'BRT**4'.  MASKL is '6'.  LOMASK gets '00'x characters in place of the asterisks.  HIMASK gets 'FF'x characters in place of the asterisks.  We spin through the list of membernames stored in MBRLIST.  For each word ("MBR") in that list, we BITAND it against the HIMASK and BITOR it against the LOMASK.  When the comparison is exactly equal, we have a match.  In this case, we merely attach the matched membername at the back of MBRLIST.  When we have iterated this loop "Words(mbrlist)" times (once for each original word), MBRLIST will contain only those membernames matching "BTR**4".

As with all 'tips and tricks', this is merely one way to 'skin the cat'.

No comments:

Post a Comment