Tuesday, August 21, 2012

What's Wrong With This Picture?

How many times have you seen this and thought nothing of it?

do nn = 1 to next.0
   /* parse token from next.nn */
   /* 35 lines of binary search thru first. */
   if notfound then do
      /* diagnostic message */
   end
   else do
      /* process the match */
   end
end

Not only have I seen this kind of code, I have written this kind of code.  A startling revelation, an epiphany, has rocked my world.

The 'revelation' is this:  in MVS, the first time you say 'READ', you don't just read one record; you read five buffers.  If you're talking about modern DASD and 80-byte records, you've just 'read' something like 1400 or 1500 records. They're all in main storage.  Whatever operation you do to them, you do at main storage speeds.  And "EXECIO * DISKR" doesn't stop at five buffers;  it reads all of it.  All the heavy lifting, the SIOs, has already been done.  You've just spent $11 to read from DASD, and now you propose to save four cents by doing a binary search.  Are we all nuts?

In a situation like this, we should leverage the power of REXX's associative arrays (stems with symbolic tails) by spinning through one of those stems, parsing everything we can find, and storing it under its key.  Then, for each token from the second file, a single reference to the stem tells us directly whether there is or is not a match.  It's all in main storage, right?  You would need highly-sophisticated and very expensive equipment to discern how much time you saved by doing a binary search instead of a sequential process.  The cost of having a programmer write those thirty-five lines of binary search will never be paid back by the time saved.
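A minimal sketch of the idea, assuming each record starts with a blank-delimited key.  The sample data stands in for two files already loaded by EXECIO, and the names where., key, token, matched and missing are mine, invented for illustration:

```rexx
/* REXX: match tokens from next. against first. by direct stem lookup */
/* Sample data stands in for two files already read into stems        */
first.0 = 3
first.1 = 'ALPHA   some data'
first.2 = 'BRAVO   more data'
first.3 = 'DELTA   other data'

next.0 = 3
next.1 = 'BRAVO   request 1'
next.2 = 'CHARLIE request 2'
next.3 = 'DELTA   request 3'

/* Pass 1: index every key in first. ; any unknown key yields 0 */
where. = 0
do ff = 1 to first.0
   parse var first.ff key .
   where.key = ff                 /* remember which line holds this key */
end

/* Pass 2: one direct reference per token, no searching */
matched = 0
missing = ''
do nn = 1 to next.0
   parse var next.nn token .
   if where.token = 0 then do
      say 'Not found:' token      /* diagnostic message */
      missing = missing token
   end
   else do
      ff = where.token            /* process the match */
      say 'Matched:' token 'on line' ff 'of first.'
      matched = matched + 1
   end
end
```

Two short sequential passes replace the thirty-five lines of binary search, and first. doesn't even have to be sorted.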

Edsger Dijkstra once proposed that a programmer worrying about how fast (or slow) hir code would execute was worrying about the wrong thing.  "Get a faster computer", he advised.  Easier said than done in many cases, but always the optimal solution.

That's not, however, what we're seeing here.  It is truly "penny-wise and pound-foolish" to do all that I/O and then waste the advantage of having it all immediately accessible for (let's face it) free.

I think I may have written my last binary search.  What do you think?

Friday, August 10, 2012

Embedding ISPF (and other) assets

When I write a tool for myself — for my own use — I will typically include ISPF assets at the bottom of the code and have software extract them as part of the initialization phase.  I rarely load panel text to ISPPLIB or skeletons to ISPSLIB.  There are several advantages to keeping your ISPF assets 'local':

  • I/O is reduced, sometimes very substantially reduced
  • changes made to these assets are reflected immediately without having to be in TEST-mode and certainly without having to leave ISPF and restart it
  • there is no doubt about the identity of subsidiary elements
  • there is no danger of duplicate member names
  • when distributing or installing, there is only one element to distribute or install: the enclosing REXX code

When ISPF is invoked by a non-developer (call it 'standard mode') its habit is to cache any panels, skeletons, or messages that it uses.  It keeps them in storage so that if the same element is re-used, ISPF can get it from the cache rather than doing I/O to get a fresh copy.  Obviously, if you're modifying that element, saving it to its library won't do a thing for your current session.  To get that new panel, you have to exit ISPF to READY-mode and restart ISPF.  That takes a lot of I/O because on start-up, ISPF opens and reads ISPPLIB, ISPSLIB, ISPTLIB, ISPMLIB, and ISPLLIB so that it knows all the available membernames and where they're located — in case you ask for one of them.

Developers who work with ISPF services generally invoke ISPF in TEST-mode when developing because in TEST-mode, ISPF caches nothing and always does I/O to handle service requests.  If you've just saved a change to a panel and you're in TEST-mode, your next DISPLAY request will retrieve the new version.  The penalty you pay for this is that every service request is handled via I/O.

Embedding your ISPF assets gives you the best of both worlds.  Re-extracting the assets at execution time creates a new dataset (in VIO), and because ISPF caches elements based on DSN+membername, it recognizes that this member XYZ is not the same XYZ as the one in the cache, so it loads a fresh copy.  All other service requests are handled via the cache, because you aren't using TEST-mode.

It gets better:  When you invoke the enclosing REXX, it all gets read into storage immediately, and that includes your panels and skeletons.  Extracting them thus happens at 'core speed' and if they're written to VIO datasets, that happens at 'core speed' as well.  No need to read the panel(s) from ISPPLIB or the skeleton(s) from ISPSLIB separately.  It's already here.
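To give a feel for the extraction step, here is a rough sketch that uses the SOURCELINE built-in to walk the exec's own source.  It is not the code from my tools page:  the ")))ASSET name" and ")))END" markers, the asset. stem and the DEMOPNL panel are all invented for this illustration, and it assumes the source is available to SOURCELINE and that no asset line contains a REXX comment delimiter.

```rexx
/* REXX: collect embedded assets from this exec's own source lines      */
/* Start and end markers are a made-up convention and must be in col 1  */
asset.  = ''                      /* asset.name.0 = count, asset.name.n = text */
names   = ''
current = ''
do ss = 1 to sourceline()
   line = sourceline(ss)
   select
      when left(line, 9) == ')))ASSET ' then do   /* start of an asset    */
         current = word(line, 2)
         names = names current
         asset.current.0 = 0
      end
      when left(line, 6) == ')))END' then current = ''   /* end of asset  */
      when current <> '' then do                  /* body of open asset   */
         n = asset.current.0 + 1
         asset.current.n = line
         asset.current.0 = n
      end
      otherwise nop                               /* ordinary REXX code   */
   end
end

do i = 1 to words(names)
   name = word(names, i)
   say name 'holds' asset.name.0 'lines'
end

/* The asset sits inside a comment so REXX never tries to execute it.
)))ASSET DEMOPNL
)BODY
%Demo panel
+Press ENTER to continue
)END
)))END
*/
```

In a real tool the collected lines would then be written out as a member of a temporary library;  that part is left out here.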

You no longer have to search through the libraries to ensure you're not using a duplicate membername — because your data is not going to live in any of those libraries.  The elements embedded in your application are extracted and loaded to a library which is then LIBDEF'd into a preferential position to all others.  If you use a name replicated elsewhere, you will only use your element;  the other will be 'masked' by virtue of being too far down the concatenation.  Of course, when the application ends, LIBDEF tears down those purpose-built libraries and the environment is restored to its pre-invocation state.  Neat.
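The LIBDEF bracket itself might look something like the sketch below.  It only runs under ISPF, and it assumes a DD name MYPLIB has already been allocated to the library holding the extracted panel DEMOPNL;  both names are hypothetical.

```rexx
/* REXX: bracket a DISPLAY with LIBDEF (runs only under ISPF)             */
/* Assumes DD name MYPLIB is already allocated to the library that holds */
/* the extracted panel DEMOPNL; both names are hypothetical              */
address ISPEXEC
"CONTROL ERRORS RETURN"                      /* let this exec see the rc  */
"LIBDEF ISPPLIB LIBRARY ID(MYPLIB) STACK"    /* search MYPLIB first       */
if rc <> 0 then do
   say 'LIBDEF failed, rc='rc
   exit rc
end
"DISPLAY PANEL(DEMOPNL)"                     /* resolved from MYPLIB      */
display_rc = rc
"LIBDEF ISPPLIB"                             /* remove our definition     */
exit display_rc
```

STACK saves any ISPPLIB LIBDEF that was already in effect, and the bare LIBDEF at the end removes ours and restores it.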

When you install the application, you install one element.  All the other panels and skeletons used by the main REXX routine do not get separately installed — there's no need to formally install them because they will be regenerated dynamically as and when needed.

If anyone out there can find a 'down-side' to any of this, I'd be very interested to hear it.  To me, it all looks like 'up-side'.

Code for extracting ISPF assets can be found on my REXX Tools page and a short example of how it's implemented at ALIST.