Thursday, December 29, 2016

Saving Time; Saving Compiles

Let's face it:  programs fail, and if they're like my programs, they'll fail at the most inopportune time, generally around 2 in the morning.

A thought just occurred to me:  if you live or work in an environment where you are dependent on a compiler — a PL/I, COBOL, or FORTRAN environment, for instance — you need to be saving your compiler listings.  "OMG!  You can't be serious!  That would take up mountains of disk space!"  Yes, I'm serious.  The mountains of disk space are peanuts compared to the time your programmers will spend at 2am trying to recreate — if it's even possible — a listing that matches the failing load module, especially if the compiler itself has changed since that module was built.

When some piece of compiled code FOOPs in the wee hours, the last thing you need is to have to locate the code and compile it.  The compile listing you get from doing that may not, in fact, be an accurate representation of the load module that failed.  Comforting thought, no? 

The only plausible solution is to save the compiler listing for the module you will eventually roll into production.  Whenever you last produce a load module that is destined for production, that compiler listing must be saved to a place where, when you need it at 2am, you can find it quickly and easily.  How you pull this off likely depends on the system programmers in charge of your infrastructure and on the processes you use to get software into the production libraries.  The greatest challenge here is to get those process-oriented folk to understand that the process isn't just one step:  actions taken in the past have serious consequences for actions that will take place in the future.  Apologies in advance to the sysprogs this doesn't apply to, but my experience is that most of them are fundamentally incapable of thinking three moves ahead.  The compiler listing must be saved now in case it is needed in the future.  So...

If your protocol is that you always compile at 'acceptance test' time and move the resultant load module to production when it's accepted, you must capture the a/t compiler listing.  If you re-compile before the move to production, that compiler listing is the one you need to preserve.  You need a large compiler-listing dataset whose members will match one-to-one with the production load library.  You will also want to capture the linkeditor's output since that forms a critical mass of information about the load module, and therein lies a problem:  the DCB of the LKED listing is potentially different than that of the compiler output.  I got yer solution right here:

Go to the REXX Language Assn's free code repository and get COMBINE.  In the JCL for doing the compile/link, insert an IKJEFT01 step after the LKED has run and execute COMBINE there specifying the DDs for the compiler output and the LKED output.  COMBINE will combine (naturally) those two files and produce a single file that can be saved to your listing dataset.  See the HELP text for COMBINE for the calling sequence, but it will generally be something like:
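A sketch of what that step might look like (the exec library and the temporary listing datasets are assumptions; here we suppose the compile and LKED steps routed their SYSPRINT to &&CLIST and &&LLIST, and that the combined listing lands in a PDS named PROD.LISTINGS):

```jcl
//LISTING EXEC PGM=IKJEFT01                       TSO in batch
//SYSEXEC  DD DISP=SHR,DSN=YOUR.REXX.EXECLIB      where COMBINE lives
//COMPILE  DD DISP=(OLD,DELETE),DSN=&&CLIST       compiler SYSPRINT from earlier step
//LKED     DD DISP=(OLD,DELETE),DSN=&&LLIST       linkeditor SYSPRINT
//$PRINT   DD DISP=SHR,DSN=PROD.LISTINGS(MYMOD)   the saved, combined listing
//SYSTSPRT DD SYSOUT=*
//SYSTSIN  DD *
 %COMBINE
/*
```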

with //COMPILE DD referencing the compiler output and //LKED DD referencing the LKED output.  Your combined output will be on //$PRINT DD.

Start today.  Nothing moved to production before you start doing this will be included, so the sooner you start, the better.

Wednesday, June 8, 2016

Profiles Inefficiency

I got into this racket in 1971 when IBM decided I should be a programmer rather than an accountant, and who was I to argue?  They introduced me to the IBM 029 keypunch machine and the PL/I(F) language along with JCL and utilities.  I began to program, such as it was.  Write a description of the program, convert it to a flowchart, code the program by hand onto keypunch sheets, give the sheets to the keypunch operator and wait for her (always 'her') to produce a deck of cards with neat rectangular holes, proof-read the cards, deliver them to the RJE (Remote Job Entry) station downstairs (we had a Mod 25, I think, that was used as a glorified card reader), pick up the output when ready along with the cards, check the SYSOUT, correct the program, repeat until success.

A few years later, the department got a shipment of 2741 terminals, Selectric-y things with the ability to communicate with computers far away.  There was a sign-up sheet where one would bid for time on the few terminals, and we would wait in our offices (yes, real offices) for the phone to ring with the news that it was 'time' to go work on the terminal.  Log on to TSO, edit the datasets, compile the program, build the JCL, run it, check the output, fix the program, fix the JCL, repeat until success.  Later, the 2741s were replaced by 3270-family 'green screen' terminals, but the process, while faster, remained essentially the same.

Then, along came SPF, the Structured Programming Facility, with a suite of utility functions and an editor...  a WYSIWYG editor!  What You See Is What You Get.  Each improvement made the process easier and faster and less error-prone.  It was a long time before anyone realized those three things were operationally connected.

SPF mutated into ISPF, the Interactive System Productivity Facility, and with it came a whole raft of new features...  and new problems.  One of the new features was something called "edit profiles".  ISPF would handle different types of data differently, and the user could control this by strategically naming datasets.  The profiles were named based on the low-level qualifier (LLQ) of the particular dataset plus its record format (F or V) plus its record length, so you wouldn't get a PLI-F-80 profile mixed up with a CNTL-F-80 profile, even though the data had the same general 'shape'.  Alas, ISPF only made provision for 25 profiles, and when a user created a 26th profile, the least-recently-used profile would be summarily discarded to make room for the new one.  Because users were never given much education in 'profiles and their care and feeding', and since the system administrators were often system programmers whose use of ISPF was rarely more than rudimentary, the stage was set for all kinds of mischief:

The Jones Company uses COBOL and PL/I, along with CLIST and REXX, in a z/OS TSO setting.  Everyone uses ISPF.  There are no standards regarding dataset names beyond the first two nodes;  the LLQ is never even considered as something that ought to be standardized.

Arthur is a programmer.  He has datasets named (we ignore here the high-level qualifiers and concentrate on the LLQs) PAYROLL.PLI, TOOLS.PLI, REPORTS.PLI, and MISC.PLI.  They all contain PL/I source and all use the same profile, PLI-F-80.  He also has PAYROLL.CNTL, TOOLS.CNTL, REPORTS.CNTL, and MISC.CNTL.  These all contain JCL and similar data and all use the same profile, CNTL-F-80.

Betty has datasets named PLICODE.STUFF, COBOL.STUFF, and JCL.STUFF.  They all use the same profile, STUFF-F-80, even though their contents are radically different.  Betty is constantly changing profiles in ISPF Edit to get the behavior she wants from the editor.

Cheryl has datasets named ACCTG.SOURCE (a mixture of PL/I and COBOL) and PROTOTYP.SRC (a similar mixture of languages), along with ACCTG.JCL and PROTOTYP.JCLLIB.  Cheryl has four profiles: SOURCE-F-80, SRC-F-80, JCL-F-80, and JCLLIB-F-80.

Betty asks Cheryl to look at a module in COBOL.STUFF that won't compile in hopes that Cheryl might see something wrong.  Cheryl views Betty's COBOL.STUFF and suddenly gets a new profile, STUFF-F-80, that she never had before, and it's different than Betty's STUFF-F-80.  Cheryl is stumped by the problem and asks for Arthur's help.  Presently, Arthur has a STUFF-F-80 profile and his, too, is different than either Betty's or Cheryl's.  Neither Arthur nor Cheryl edited Betty's data;  they both used VIEW which, unfortunately, is affected by the problem since it uses edit profiles.

We're dealing here with just three people and already the problem is showing its potential.  Imagine a setting with 200 programmers, technicians, testers, and so on, all operating guidance-free.  One day you go into edit on one of your datasets and the highlighting is wrong, the data is shifted into all-CAPS whenever you type something new, and your tab settings seem to have gone away.  "What the hell happened to my profile?" you ask.  The answer is that it got purged when Larry had you look at a stack of ISPF skeletons he found in an archive.  You created profile #26 and lost one that you had counted on keeping.  P.S.:  when you went back into edit on that all-wrong data, you re-created a profile for that dataset and purged another, different profile.  Which one?  I have no idea and neither does anyone else.

Is there an answer to the problem of "where did my profile go?"?  There is, and the answer is 'standards'.  Someone in authority — and the higher, the better — must say:  "All xxx data must reside in a dataset whose LLQ is aaa" and repeat as necessary for all known common types of corporate data.  Exceptions, where a case can be made for an exception, are granted by managers.  There also needs to be a catch-all category for material that doesn't fit neatly anywhere else.

In recent years, ISPF changed the protocol a little.  Profiles can be locked or unlocked.  When EDIT goes looking for a profile to purge, it selects the least-recently-used unlocked profile.  If there aren't any of those, the least-recently-used locked profile gets tossed.  Also, there's something the sysprogs can do post-installation that allows each user to have 37 profiles, somewhat easing the problem if not eliminating it.  Regardless, be careful about how you name your datasets and urge everyone else to be just as careful.  If corporate doesn't set standards, try to get your department or division to do it. 

Sunday, May 22, 2016

Quick! Get me member (Q47MMT1)!

Imagine a partitioned dataset with 8,000 members (or more).  This is getting into the range where finding the directory entry for a specific member is becoming a real chore and is chewing up cycles.  I heard of an imaginative way to speed up the process.

Define the partitioned dataset as a Generation Data Group and make the group large enough that, when the dataset is split, searching the directory of each is less of a chore (it will be even if only because each fragment of the whole is smaller).  Let's say, for the sake of argument, that we break it into 27 generations, one for each letter of the alphabet plus a catch-all.  Now copy all the members beginning with non-alphabetics into generation #1, all the "A"s into #2, all the "B"s into #3, etc.  When you access the group via its base-name (without specifying a generation) you get them all concatenated in 27-26-25...3-2-1 order.
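Defining such a group is a one-step IDCAMS job; a minimal sketch (the group name and attributes here are made up, and this assumes your installation permits partitioned datasets as generations):

```jcl
//DEFGDG  EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE GENERATIONDATAGROUP ( -
         NAME(PROD.BIGPDS)    -
         LIMIT(27)            -
         NOEMPTY SCRATCH )
/*
```

Thereafter, a DD that codes just DSN=PROD.BIGPDS, with no generation number, retrieves all 27 generations concatenated, newest first.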

When you look for member 'Q47MMT1', the directory of generation #27 is scanned, but member names are always in alphabetical order and this directory starts with 'Z...'.  That's not it; skip to G0026V00.  Its first entry starts with 'Y...'.  Nope.  G0025V00 starts with 'X...', G0024V00 starts with 'W...', G0023V00 starts with 'V...', G0022V00 starts with 'U...', G0021V00 starts with 'T...', G0020V00 starts with 'S...', G0019V00 starts with 'R...', G0018V00 starts with 'Q...'.  Got it!  You quickly find the required member and processing continues.  What's happening here is that instead of searching through 8,000+ directory entries and finding what you seek in (on average) 4,000-or-so lookups, you made roughly 13 first-entry checks plus ~150 lookups within the right directory (8000 / 27 / 2).  As the original partitioned dataset gathers more members, this comparison gets more stark.  At some point it is so stark that someone will suspect the quick method failed, simply because it couldn't possibly have finished that fast.

Monday, May 9, 2016

ALIAS is not a four-letter word

Are you one of those who thinks "Alias?  Why bother?"?  Aliases do have their uses, and with a little imagination they can be leveraged to deliver surprising productivity gains.

Aliases come in two flavors:  member aliases and dataset aliases.

Member aliases are nothing more than entries in a partitioned dataset's directory.  Each such entry holds the TTR (track and record) of an existing member — called the "base member".  If you edit an alias and save it, BPAM writes the saved text at the back of the dataset and records the new TTR in the directory entry for the alias, making it a base member in its own right (no longer an alias of some other base member).  But as long as it is an alias, any reference to the base name or any of its aliases points to the same code.  Most languages give a routine a way to discover the name by which it was called, so the logic can branch differently for each name (or not — it depends).
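Creating a member alias is trivial.  From TSO, something like this (dataset and member names invented for the example) adds an alias entry named QSORT pointing at the same TTR as SORTER:

```
RENAME 'DEPT.TOOLS.EXEC(SORTER)' (QSORT) ALIAS
```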

Dataset aliases provide the same sort of facility but at the dataset level.  These aliases must be kept in the same catalog that holds the dataset name for which the alias is created.  The kicker here is that the alias and the dataset it aliases must have the same high-level qualifier.  If they didn't, the master catalog could route the two names to different user catalogs, and they couldn't coexist in the same catalog.

So, what can you do with a dataset alias?  Why would you bother?  Well, here's a practical application that can save hours of updating and weeks of grief:  You have a dataset (or a series of datasets) that IBM or some other maintainer periodically updates.  Maybe it's the PL/I compiler or something similar.  You have a cataloged procedure, a PROC, or possibly several of them that programmers use to compile programs.  If your PROC(s) all reference SYS1.COMPLIB.V04R012.LOADLIB, then when IBM sends down the next update, somebody is going to have to change all those PROCs to reference ...V04R013... and slip them into the PROCLIB at exactly the right moment.  Usually that means Sunday afternoon when the system is quiesced for maintenance and only the sysprogs are doing any work.  Or...

You could alias whichever is the currently supported version as SYS1.COMPLIB.CURRENT.LOADLIB.  When the new version has been adequately tested and is ready to be installed for everyone's use, you use IDCAMS to DELETE ALIAS the old one and DEFINE ALIAS the new one.  These two operations will happen so fast it will be like flipping a switch:  one instant everyone is using V04R012, and the next they're using V04R013.  The system doesn't even have to be down.  You can do it Tuesday during lunch.  Nobody's JCL has to change, but (more importantly) none of the PROCs have to change, either.  Your favorite beta-testers can access the next level just by overriding the STEPLIB.  Everybody else just uses the PROC as-is.  If somebody reallyreallyreally needs to get to the prior version, that, too, is just a STEPLIB override.
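The swap itself is nothing more than a two-command IDCAMS job, along these lines (version numbers follow the example above; dataset names are illustrative):

```jcl
//SWAP    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DELETE SYS1.COMPLIB.CURRENT.LOADLIB ALIAS
  DEFINE ALIAS ( -
         NAME(SYS1.COMPLIB.CURRENT.LOADLIB) -
         RELATE(SYS1.COMPLIB.V04R013.LOADLIB) )
/*
```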

I think (but I don't know for certain) that you can write an ACF2 rule that allows certain privileges to an ALIAS that are prohibited to the BASE (and vice versa), but the most amazing ALIAS-trick (as far as I'm concerned) is the ability to swap one dataset for another with none of the users being any the wiser.