Calcite | Level 5 MXG

Moderator's note: The author originally shared this on the SAS-L mailing list, and we have republished here -- under his name -- with his permission.


Fifty years ago today, October 9, 1972, I ran my first SAS Program.


I left the Navy in June, 1972, and in August, my Psychologist friend, Dr. L. Rogers Taylor, now working at State Farm Automobile HQ in Bloomington, IL, suggested I might find a home there and arranged for an interview. At Purdue in 1966, I had written FORTRAN programs for his dissertation, using pattern recognition techniques, cluster analysis, and vector distance tools from my Master's Research in EE at LARS, the Laboratory for Agricultural Remote Sensing. These tools had not been previously used in his then-new field of Industrial Psychology. His actual application analyzed questionnaires completed by Humble Oil Petroleum Engineers, which were then correlated with a separate data file that distinguished those Engineers who HAD found oil from those who hadn't, to construct a predictive questionnaire (very successfully; he received accolades from his peers for introducing pattern recognition to them). The interview he arranged was with the Vice President for Data Processing, Dr. Norman Vincent.


After completing the required HR forms, my escort very nervously drove me to the Corporate HQ Building; he had never even MET a State Farm Corporate VP, let alone been in a VP's office! I immediately clicked with Norm and met the manager of the brand new "Measurement Unit", Dave Vitek, and then spent the day interviewing members of that group (and being interviewed/evaluated by them). I started Sept 18, 1972 at $13,800.


In 1972, the state of the art for IBM mainframe computer capacity planning was simple: your company's IBM salesman would visit with your company's vice president for data processing and hand him the contract for a newer and faster and larger computer for only a few million dollars. Dave Vitek had attended (the first?) Boole and Babbage User Group (BBUG) annual meeting, where the idea of actually measuring the computer system utilization was THE topic. Dave decided that rather than just trusting the IBM salesman as your capacity planner, State Farm should be able to figure out how to measure its own computers, and Dave got Norm to fund a ten-person Measurement Unit for three years for a feasibility study.


Steve Cullen had drafted an excellent attack plan to investigate the four possible tools, SMF Accounting, Software Monitors, Hardware Monitors, and Simulation, and in short order, we had Kommand/PACES for accounting, Software Monitors (SYSTEM LEAP and PROGRAM LEAP), Hardware Monitors (TESDATA XRAY), and Simulation (SAM). But Kommand was only for billing, with only a few canned reports and no tool for data extraction, so Denny Maguire had started to write PL/1 programs to extract fields directly from the raw SMF records. When he mentioned he wanted to plot his data, I called Purdue's LARS and they sent me the FORTRAN "PLOT" subroutine that I had written there that did simple plots on line printers, but could also print detailed graphics on CalComp paper plotters. Denny was still having problems reading the complex data in SMF records, so my PLOT program was still untested, when, in the September, 1972, Datamation, I found this announcement:


"The Institute of Statistics at North Carolina State University announces the availability of the Statistical Analysis System, a package of 100,000 lines, one third each in Fortran, PL/1 and Assembler, that does printing, analysis and plotting of data. The package is available, including source code, for $100.00."

I wrote for information, and got typical university documentation, with some pages dittoed, some pages typed, some printed, each on paper of a different color, but I immediately realized the power and simplicity and the beauty of the SAS language, and especially the power of its INPUT statement, which could clearly handle the complexity of SMF data. However, in their list of supported data field formats, there was no reference to support for Packed Decimal fields. You only need to get seven bytes into an SMF record to encounter a Packed Decimal field, so I called the Institute of Statistics at North Carolina State University, and was connected with Tony Barr, the designer of the SAS language and the author of the SAS compiler, and asked him about support for that data type. In his North Carolina accent, he replied, "Wheall, we haven't got around to documenting it yet, but if you type in P D 4 Point, it'll work jest fine", so I convinced State Farm to risk the 1972 purchase price of $100 for the SAS package.
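
For readers who have not met packed decimal, here is a minimal sketch in Python (not SAS) of the decoding that an informat such as PD4. has to perform on a 4-byte field: two decimal digits per byte, with the sign in the final half-byte. The function name `unpack_packed_decimal` and the sample bytes are made up for illustration; they are not taken from any SMF layout or from SAS itself.

```python
def unpack_packed_decimal(raw: bytes, decimals: int = 0):
    """Decode an IBM packed-decimal field (hypothetical helper, for illustration).

    Each byte holds two 4-bit decimal digits, except the last byte,
    whose low nibble is the sign: B or D means negative, A/C/E/F positive.
    """
    if not raw:
        raise ValueError("packed-decimal field is empty")

    # Split every byte into its high and low nibble.
    nibbles = []
    for byte in raw:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)

    sign = nibbles.pop()  # the final nibble is the sign, not a digit
    if sign < 0xA:
        raise ValueError("invalid sign nibble")
    if any(digit > 9 for digit in nibbles):
        raise ValueError("invalid digit nibble")

    value = 0
    for digit in nibbles:
        value = value * 10 + digit
    if sign in (0xB, 0xD):
        value = -value

    # `decimals` plays the role of the d in a w.d style width specification.
    return value / 10 ** decimals if decimals else value


# Made-up 4-byte sample: digits 0072283 followed by sign nibble F.
sample = bytes([0x00, 0x72, 0x28, 0x3F])
print(unpack_packed_decimal(sample))
```

A 4-byte field therefore carries seven digits plus a sign, which is why a single width of 4 is enough to describe it.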


Starting in 1964, Tony Barr and Dr. Jim Goodnight had collaborated to develop an ANOVA routine for the Department of Agriculture. Tony had been an IBM developer of the data base for the cold war's Distant Early Warning (DEW line) radar system, and Jim was a well-known statistician. Both recognized the weakness of the existing stat packages: they were only subroutines that had to be invoked by other programs that had to prepare and manage the data to be analyzed. By creating a language, a database, and the statistics, the Statistical Analysis System expanded well beyond the original ANOVA routine and had been tested at several Agricultural Experimental Stations and other universities, but the 1972 announcement was the first public release of the Statistical Analysis System, and in October, 1972, State Farm was the FIRST real customer to install the SAS package from NCSU's Statistics Department.


Within days of receipt of SAS, I was extracting CPU time and PROGRAM name and Core-Hours to produce reports on resource consumption direct from SMF records. When the CPU time recorded in the Kommand billing records was found to be many hours less than the CPU time that my SAS program found reading SMF directly, we discovered that Kommand times were truncated (because COBOL fixed length fields were used), but because SAS stores all numerics as floating point numbers, SAS effectively had eliminated the exposure to truncation and to un-initialization, the two most common causes of numerical errors in computer programs!
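
The kind of loss described above can be sketched in a few lines of Python. This is an illustration only: the Kommand record layout is not described here, so the 5-digit field width, the helper `store_fixed_width`, and the CPU-second values are all assumptions chosen to show how a fixed-length numeric field can silently drop high-order digits while a floating point value does not.

```python
def store_fixed_width(value: int, digits: int) -> int:
    """Keep only the low-order `digits` decimal digits of `value`.

    Hypothetical model of an unchecked fixed-length numeric field:
    anything that does not fit is silently dropped from the high-order end.
    """
    return value % (10 ** digits)


# Made-up CPU seconds for three long-running jobs.
job_cpu_seconds = [86_400, 250_000, 1_200_000]

# Total as it would appear after passing through a 5-digit field.
fixed_total = sum(store_fixed_width(s, 5) for s in job_cpu_seconds)

# Total when every value is simply held as a floating point number.
float_total = sum(float(s) for s in job_cpu_seconds)

lost_hours = (float_total - fixed_total) / 3600
print(fixed_total, float_total, round(lost_hours, 1))
```

In this made-up case the fixed-width total comes up several hundred hours short, which is the same flavor of discrepancy as the one found between the billing records and the raw SMF data.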


Over the next months, I made presentations on the use of SAS software and began to discuss the design of the "PDB", the "Performance Data Base", a daily repository of performance and capacity related datasets created from SMF data.


Presentations were given to the Bloomington and Chicago chapters of the ACM and DPMA; the SAS data base was mentioned in my paper (on the use of the SAS data base to create simulation input for the System Analysis Machine directly from actual SMF data) presented at the 1973 SSCS (Symposium on the Simulation of Computer Systems) at the National Bureau of Standards, and at a BOF (Birds of a Feather) informal session at the Seventh Annual Interface Symposium at Iowa State. Many XRAY hardware monitor users became aware of State Farm's PDB through the Midwest TESDATA Users Group, which held its inaugural meeting in 1973 at State Farm. These presentations were only half technical; I also had to convince attendees that staffing of this new measurement concept was cost justified by the real dollar savings. John Chapman had used an XRAY at Standard Oil and invited me to join SHARE's Computer Measurement and Evaluation (CME) project, and I described SAS and the PDB in a closed session of the CME project at my first SHARE meeting, SHARE 42 in Houston in March of 1974. The first open session presentation on the use of the SAS System to process SMF data was at the next meeting, SHARE 43, that August in Chicago, before an audience of over 750 (half of the attendees!).


That session was split with an IBM presentation on their new SGP, Statistics Gathering Package, an FDP that selected a few fields from a few SMF records. IBM spoke first, then I showed what we had done with SAS at State Farm. One attendee stood and asked the IBM author of SGP, Bill Tetzlaff, "Now that you have seen SAS, is there any reason why you would still recommend your SGP product?" Several hundred SHARE sites acquired SAS that fall as a result of this SHARE session!


I developed my Doctoral Thesis while working at State Farm Insurance, 1972-1976, proved it while at Sun Oil Company, 1976-1984, and in 1984, at the urging of my wife, Judith, Vice President, left Sun Oil to create Merrill Consultants (I write software and support it, she runs the business). We commercialized my dissertation into our MXG Software Product, which has been licensed by over 7000 corporations worldwide, where it is used by senior technicians for the Measurement of the Performance of the Large Scale commercial (IBM) mainframes, providing response time, utilization, and bottleneck detection, for Capacity Planning, for cost accounting of departmental resource usage, and for security auditing of who's using what program, what files, etc. among its many facilities, and is delivered in 100% Source Code. At its peak approximately 10,000 technicians used MXG and SAS daily.

SAS Employee rlw

Talk about a Deja Vu moment!  I read this article and my own career flashed before my eyes.  I spent 8 years at Piedmont Airlines/US Airways as a Capacity Planner for their Mainframe MVS and TPF systems.  We lived on SMF data and used it to reduce hardware expenditures by locating the programs that were heavy on resource usage and modifying them to use less.


I started using SAS in 1977 and have continuously used it at every job I have been in since.  I was always intrigued by how I could do things so simply compared to Cobol's verbose nature and Fortran's finicky nature.  Having my Master's in Statistics, I understood the value of having a reliable tool like SAS in my professional life.  If a company did not have SAS, I made them get me a copy so I could do the job they hired me to do.


Thanks for the trip down memory lane!


Randy Wagner 

