Sunday, May 22, 2016

Quick! Get me member (Q47MMT1)!

Imagine a partitioned dataset with 8,000 members (or more).  This is getting into the range where finding the directory entry for a specific member is becoming a real chore and is chewing up cycles.  I heard of an imaginative way to speed up the process.

Define the partitioned dataset as a Generation Data Group and make the group large enough that, once the dataset is split, each generation's directory is much quicker to search, if only because each fragment is a fraction of the size of the whole.  Let's say, for the sake of argument, that we break it into 27 generations, one for each letter of the alphabet plus a catch-all.  Now copy all the members beginning with non-alphabetics into generation #1, all the "A"s into #2, all the "B"s into #3, etc.  When you access the group via its base name (without specifying a generation) you get all the generations concatenated newest-first, that is, in 27-26-25...3-2-1 order.

When you look for member 'Q47MMT1', the directory of generation #27 (G0027V00) is scanned first, but directory entries are always kept in order by member name and this directory starts with 'Z...'.  That's not it; skip to G0026V00.  Its first entry starts with 'Y...'.  Nope.  G0025V00 starts with 'X...', G0024V00 starts with 'W...', G0023V00 starts with 'V...', G0022V00 starts with 'U...', G0021V00 starts with 'T...', G0020V00 starts with 'S...', G0019V00 starts with 'R...', G0018V00 starts with 'Q...'.  Got it!  You quickly find the required member and processing continues.  What's happening here is that instead of searching through 8,000+ directory entries and finding what you seek in (on average) 4,000 or so lookups, you skipped (on average) 13 directories and then looked at roughly 150 entries (8000 / 27 / 2) in the one that matters.  As the original partitioned dataset gathers more members, the gap only widens.  At some point the difference is so large that someone will wonder whether the quicker method failed, because surely nothing could finish that fast.
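The arithmetic above can be sketched in a few lines of Python.  This is only a back-of-the-envelope model, not a measurement of what BPAM really does: it assumes members are spread evenly across the 27 generations and that a directory is scanned entry by entry.  The function names (generation_for, directories_opened, and so on) are made up for the illustration.

```python
import string

GENERATIONS = 27  # one catch-all plus one per letter, as in the example above


def generation_for(member: str) -> int:
    """Generation number (1..27) holding a member: non-alphabetics -> 1, 'A' -> 2, ... 'Z' -> 27."""
    first = member[0].upper()
    if first in string.ascii_uppercase:
        return string.ascii_uppercase.index(first) + 2
    return 1


def directories_opened(member: str) -> int:
    """Directories touched when the concatenation runs 27, 26, ..., 1 (the hit is counted too)."""
    return GENERATIONS - generation_for(member) + 1


def average_lookups_flat(total_members: int) -> float:
    """Assumed model: a linear scan of one big directory finds the entry halfway, on average."""
    return total_members / 2


def average_lookups_split(total_members: int) -> float:
    """Assumed model: skip 13 directories on average, then scan half of one small directory."""
    skipped = (GENERATIONS - 1) / 2
    within = total_members / GENERATIONS / 2
    return skipped + within


if __name__ == "__main__":
    print(generation_for("Q47MMT1"), directories_opened("Q47MMT1"))
    print(average_lookups_flat(8000), round(average_lookups_split(8000), 1))
```

Under those assumptions 'Q47MMT1' lands in generation 18 and is found in the tenth directory opened, and the average cost for 8,000 members drops from about 4,000 lookups to about 161.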

Monday, May 9, 2016

ALIAS is not a four-letter word

Are you one of those who think "Alias?  Why bother?"  Aliases do have their uses, and with a little imagination they can be leveraged to deliver surprising productivity gains.

Aliases come in two flavors:  member aliases and dataset aliases.

Member aliases are nothing more than entries in a partitioned dataset's directory.  Each such entry holds the TTR (relative track and record address) of an existing member, called the "base member".  If you edit an alias and save it, BPAM writes the saved text at the back of the dataset and records the new TTR in the directory entry for the alias, making it a base member in its own right (no longer an alias of some other base member).  But as long as it is an alias, any reference to the base name or any of its aliases points to the same code.  Most languages offer some way for a routine to find out by which name it was called, and the logic may branch differently for each name (or not; it depends).

Dataset aliases provide the same sort of facility but at the dataset level.  An alias must be kept in the same catalog that holds the entry for the dataset it aliases.  The kicker here is that, in practice, the alias and the dataset it aliases must have the same high-level qualifier.  If they didn't, the two names would normally be routed to different user catalogs and couldn't live in the same one.

So, what can you do with a dataset alias?  Why would you bother?  Well, here's a practical application that can save hours of updating and weeks of grief:  You have a dataset (or a series of datasets) that IBM or some other maintainer periodically updates.  Maybe it's the PL/I compiler or something similar.  You have a cataloged procedure, a PROC, or possibly several of them that programmers use to compile programs.  If your PROC(s) all reference SYS1.COMPLIB.R012V04.LOADLIB, then when IBM sends down the next update, somebody is going to have to change all those PROCs to reference ...R013V01... and slip them into the PROCLIB at exactly the right moment.  Usually that means Sunday afternoon when the system is quiesced for maintenance and only the sysprogs are doing any work.  Or...

You could alias whichever is the currently supported version as SYS1.COMPLIB.CURRENT.LOADLIB.  When the new version has been adequately tested and is ready to be installed for everyone's use, you use IDCAMS to DELETE ALIAS the old one and DEFINE ALIAS the new one.  These two operations will happen so fast it will be like flipping a switch:  one instant everyone is using R012V04, and the next they're using R013V01.  The system doesn't even have to be down.  You can do it Tuesday during lunch.  Nobody's JCL has to change, but (more importantly) none of the PROCs have to change, either.  Your favorite beta-testers can access the next level just by overriding the STEPLIB.  Everybody else just uses the PROC as-is.  If somebody reallyreallyreally needs to get to the prior version, that, too, is just a STEPLIB override.
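To make the "flipping a switch" idea concrete, here is a toy model in Python.  It is not IDCAMS and knows nothing about real catalogs; the ToyCatalog class and its methods are invented purely for illustration, and the full name of the new library is assumed to follow the same pattern as the old one.  The point it shows is that anything referring to the alias never has to change.

```python
class ToyCatalog:
    """A hypothetical, much-simplified stand-in for a catalog: real names plus aliases."""

    def __init__(self):
        self.datasets = set()   # real dataset names
        self.aliases = {}       # alias name -> real dataset name

    def define_dataset(self, name):
        self.datasets.add(name)

    def define_alias(self, alias, related):
        # An alias can only relate to a dataset that is already cataloged here.
        if related not in self.datasets:
            raise KeyError(f"{related} is not cataloged")
        if alias in self.datasets or alias in self.aliases:
            raise ValueError(f"{alias} already exists")
        self.aliases[alias] = related

    def delete_alias(self, alias):
        del self.aliases[alias]

    def resolve(self, name):
        """Return the real dataset a name leads to, whether it is an alias or a real name."""
        if name in self.aliases:
            return self.aliases[name]
        if name in self.datasets:
            return name
        raise KeyError(f"{name} not found")


OLD = "SYS1.COMPLIB.R012V04.LOADLIB"
NEW = "SYS1.COMPLIB.R013V01.LOADLIB"      # assumed full name of the new level
CURRENT = "SYS1.COMPLIB.CURRENT.LOADLIB"  # the only name the PROCs ever mention

catalog = ToyCatalog()
catalog.define_dataset(OLD)
catalog.define_dataset(NEW)
catalog.define_alias(CURRENT, OLD)
before = catalog.resolve(CURRENT)

# The "switch": drop the alias from the old level and point it at the new one.
catalog.delete_alias(CURRENT)
catalog.define_alias(CURRENT, NEW)
after = catalog.resolve(CURRENT)

print(before, "->", after)
```

Beta-testers and holdouts in this model simply resolve OLD or NEW directly, which is the toy equivalent of the STEPLIB override.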

I think (but I don't know for certain) that you can write an ACF2 rule that allows certain privileges to an ALIAS that are prohibited to the BASE (and vice versa), but the most amazing ALIAS-trick (as far as I'm concerned) is the ability to swap one dataset for another with none of the users being any the wiser.