09-24-2014 12:04 PM
We are migrating from an MVS SAS environment to a Linux SAS 9.3 EBI platform.
We have thousands of applications to migrate.
After a technical validation (are all applications still working?) we also need to do a content validation: are the results the same on both platforms?
This validation should be without too many manual interventions.
I have already created dynamic jobs to compare structural information such as the number of observations, number of variables, variable types, formats, lengths, etc. for each table.
To compare content I built the following:
- calculate for every numeric variable in a table the sum throughout the table and compare these values for both platforms
- calculate the number of unique values for each character variable.
- create frequency tables for variables with fewer than 50 unique values.
- calculate the number of missing values per variable per table.
The problem is that these jobs require (too) many resources on MVS.
Does anybody have other ideas for this content validation?
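For readers following along, the four checks listed above can be sketched outside SAS as follows. This is only an illustration in Python, not the jobs described in this post: the names `profile` and `compare`, the list-of-dicts table layout, and the tolerance values are all assumptions made for the example. The sums are compared with a relative tolerance because numeric precision differs between the two platforms.

```python
import math
from collections import Counter

def profile(rows, max_levels=50):
    """Summarise a table given as a list of dicts (column -> value, None = missing)."""
    columns = sorted({name for row in rows for name in row})
    result = {}
    for name in columns:
        values = [row.get(name) for row in rows]
        present = [v for v in values if v is not None]
        stats = {"missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: one sum over the whole table.
            stats["sum"] = math.fsum(present)
        else:
            # Character column: distinct count, plus frequencies if there are few levels.
            counts = Counter(present)
            stats["distinct"] = len(counts)
            if len(counts) < max_levels:
                stats["freq"] = dict(counts)
        result[name] = stats
    return result

def compare(left, right, rel_tol=1e-9):
    """Return the (column, statistic) pairs that differ between two profiles."""
    diffs = []
    for name in sorted(set(left) | set(right)):
        a, b = left.get(name, {}), right.get(name, {})
        for key in sorted(set(a) | set(b)):
            x, y = a.get(key), b.get(key)
            if key == "sum" and x is not None and y is not None:
                # Sums may differ in the last bits between platforms.
                same = math.isclose(x, y, rel_tol=rel_tol, abs_tol=1e-12)
            else:
                same = x == y
            if not same:
                diffs.append((name, key))
    return diffs
```

Only the small profile per table has to leave each platform; the comparison itself can then run wherever it is cheapest.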
09-24-2014 01:05 PM
Sounds like a very ambitious testing strategy.
If you want to do it the way you describe, it's hard to suggest a different approach; the data resides where it resides.
In my mind, SAS is the same regardless of platform. So I would focus testing on charset conversion, numerical precision, and physical search paths.
Apart from that, can't you run both environments in parallel and compare the existing data marts/reports at the top?
Things that potentially don't work would most likely trigger errors/warnings, which should raise alerts in the batch job scheduler.
09-24-2014 05:56 PM
Henri, I am a little bit astonished by your approach. In the old days of the ENIAC and acoustic signals, things could be unreliable. These days that part works very well.
I would advise taking a step back to survey your migration.
You state that you are migrating to SAS EBI 9.3 on Linux. Why not go for 9.4, as 9.3 is getting old?
One thing to be sure of on this new environment: DR (Disaster Recovery) and BR (Backup-Restore) should have been done and validated.
When those two have been evaluated as part of business continuity planning and management (BCP/BCM), proceed with getting the data onto it.
You say you are migrating thousands of applications. The word "applications" is as meaningless as the word "thing".
MVS is the old OS name (more than 8 years old); I hope it is a recent z/OS version. Do you have HFS and/or SAS/Connect (BPXAS address space) as options?
Separate the work into datasets (preferably SAS datasets) and SAS code. I assume you have thought about release management (DTAP) and version management (developers).
Then there are two separate questions:
1- migration of SAS datasets
2- migration of SAS code
Migration of SAS datasets
You have several options that are:
- Using proc cport/cimport. This works one way, with the requirement that the destination is not an older SAS version
- Using HFS SAS libraries with CEDA and a binary transfer
- SAS/Connect (Works in both directions)
- (other look-alikes and combinations)
There is no need to check every byte; you should check and design the conversion process (all byte types and options). The differences:
+ z/OS is still limited to 32-bit addressing and EBCDIC as charset (single byte). Being in Belgium, you have Dutch and French as languages. There is an encoding problem with that; it could be a non-standard one. In any case, the cent sign (¢) and the not-sign (¬, uppercase 6 on a 3270 terminal) do not exist in ASCII.
+ The precision of SAS numerics on z/OS differs from Windows/Unix; see: https://support.sas.com/techsup/technote/ts654.pdf
+ With Unix you are running a 64-bit version in an ASCII encoding, most likely a Latin1 one. You could use UTF-8 (a multibyte SAS system is needed) as the common standard with most known systems today. The bad thing is that interoperability for UTF-8 with SAS is not as easy as it should be. That is one of the reasons to prefer 9.4 (improvements). To check for any impact on encoding, see SAS(R) 9.4 National Language Support (NLS): Reference Guide, Third Edition
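To make the encoding point concrete, here is a small Python sketch (not SAS) that checks whether the cent sign and the not-sign survive a conversion. The code page names are assumptions: `cp037` and `cp500` are two common EBCDIC pages available in Python and merely stand in for whatever the mainframe session really uses; the function names are invented for the example.

```python
# Assumption: cp037 (US/Canada EBCDIC) and cp500 (international EBCDIC)
# stand in for the real mainframe session encoding, which is not known here.
CENT, NOT_SIGN = "\u00a2", "\u00ac"   # the cent sign and the not-sign

def survives(text, source="cp037", target="latin-1"):
    """True if text decodes from EBCDIC and can be re-encoded in the target charset."""
    raw = text.encode(source)
    try:
        return raw.decode(source).encode(target).decode(target) == text
    except UnicodeError:
        return False

def decode_differences(first="cp037", second="cp500"):
    """Byte values that decode to different characters in the two code pages."""
    return [b for b in range(256)
            if bytes([b]).decode(first) != bytes([b]).decode(second)]
```

With these assumptions, `survives(CENT + NOT_SIGN, target="ascii")` is false while the Latin1 target works, and `decode_differences()` lists the byte positions where picking the wrong EBCDIC page silently changes characters. Those positions are the ones worth testing first.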
Unix behaves differently as it is case-sensitive. All dataset names should preferably be lower-case. Upper case will not work with SAS macros in an autocall library; their names are required to be lower-case. A workaround could be storing code in SAS catalogs. With SAS catalogs you will need a SAS-base version.
The best strategy (SAS datasets) is:
- Check all transfers for all possible encoding issues and validate that also with other tools in use (e.g. Excel).
- For numerics use a new length of 4 or 8, and for the shorter lengths add 1 byte to the length.
- For optimization on the Unix environment, think of ALIGNSASIOFILES, BUFSIZE, and binary compression (long records/many variables).
- Test with some bigger datasets whether the performance of the transfer is acceptable.
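The advice to add 1 byte to the shorter numeric lengths can be illustrated with a Python sketch. It assumes, as I understand the two layouts, that a shortened numeric keeps only the leading bytes of the 8-byte value: an IEEE double on Unix (1 sign bit, 11 exponent bits) and IBM hexadecimal floating point on z/OS (1 sign bit, 7 exponent bits). The function names are invented for the example.

```python
import struct

def truncate_ieee(value, length):
    """Keep the first `length` bytes of a big-endian IEEE 754 double, zero the rest."""
    if not 3 <= length <= 8:
        raise ValueError("length must be between 3 and 8")
    raw = struct.pack(">d", value)
    return struct.unpack(">d", raw[:length] + bytes(8 - length))[0]

def largest_exact_integer_ieee(length):
    """Largest n such that every integer 0..n survives truncation (IEEE layout)."""
    # 1 sign bit + 11 exponent bits leave 8*length - 12 stored mantissa bits,
    # plus the implicit leading 1.
    return 2 ** (8 * length - 11)

def largest_exact_integer_ibm(length):
    """Same limit for the IBM hexadecimal layout (1 sign bit + 7 exponent bits)."""
    return 2 ** (8 * (length - 1))
```

Under these assumptions a length 3 numeric holds integers up to 65,536 on z/OS but only up to 8,192 on Unix, so `truncate_ieee(65535.0, 3)` no longer returns 65535, while length 4 on Unix covers the z/OS length 3 range again.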
Transferring the data will give you all the counters you may need for some validation (nobs).
There is no need to check the content anymore; that should have been solved. For a first user acceptance, let some of those users accept part of the data before doing it all.
The best strategy (SAS code) is:
- SAS code should be exchangeable unless:
+ there is some physical naming hard-coded in the SAS code. With the z/OS JCL and COBOL history it was often a mandatory coding standard not to do it that way.
You will never know it has not been done until verified. If this is not done, you must implement a similar alternative on Unix.
+ There are specifics in the SAS coding on the mainframe on top of naming. VTOC analyses and tape headers are dedicated tools just for mainframes.
You will have to find those and mitigate them. SMF analysis of SAS-generated records could help with that. PDSE contents are another special case.
The not-sign is a code character that will be recognized by a SAS/Connect download.
When the SAS-code has been downloaded it should be verified for correct functionality in a user acceptance approach (DTAP release management).
You said thousands of applications; does this mean thousands of SAS source code files/members?
09-25-2014 02:05 AM
Thanks for your extended reply....... BUT.
Our project is a lot more complex than the situation that you describe above. We migrate 'a history of more than 30 years'.
1) You suggest migrating to 9.4. We are the Belgian office of a big multinational company. We have a local mainframe, and will work on the international Linux environment. So we have no decision power about SAS versions. It is an existing environment.
2) It is not just a matter of transferring/modifying SAS data and SAS programs. A lot of our input data comes from DB2 extracts, GDG files, IMS/DLI, etc. We do NOT have SAS/Connect available, and we are NOT allowed to use, for instance, FTP access from Linux to read data on our mainframe. So our batch jobs on Linux start with a file transfer (binary) of sequential files from mainframe to Linux. These mainframe files often contain PD and ZD data. We have modules to 'update' our SAS programs automatically to be Linux compatible (lengths for numeric fields, JCL translated into LIBNAMEs and FILENAMEs, specific mainframe options (like DYNALLOC for instance) removed, etc.).
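For readers less familiar with PD and ZD fields, the sketch below decodes both in Python to show why the transfer has to be binary: the digits and the sign live in half-bytes, so any character translation on the way would corrupt them. It is an illustration only, not the modules mentioned above; the function names and the `scale` parameter (number of implied decimals) are invented for the example. In SAS on Linux I would expect the S370FPD and S370FZD informats to do this job.

```python
from decimal import Decimal

def unpack_pd(raw, scale=0):
    """Decode IBM packed decimal (PD): two digits per byte, sign in the last nibble."""
    nibbles = []
    for byte in raw:
        nibbles.extend((byte >> 4, byte & 0x0F))
    *digits, sign = nibbles
    if any(d > 9 for d in digits) or sign < 0x0A:
        raise ValueError("not a valid packed decimal field")
    number = int("".join(map(str, digits)) or "0")
    if sign in (0x0B, 0x0D):          # B and D mark negative values
        number = -number
    return Decimal(number).scaleb(-scale)

def unpack_zd(raw, scale=0):
    """Decode EBCDIC zoned decimal (ZD): one digit per byte, sign in the last zone."""
    digits = [byte & 0x0F for byte in raw]
    zones = [byte >> 4 for byte in raw]
    if (not raw or any(d > 9 for d in digits)
            or any(z != 0x0F for z in zones[:-1]) or zones[-1] < 0x0A):
        raise ValueError("not a valid zoned decimal field")
    number = int("".join(map(str, digits)))
    if zones[-1] in (0x0B, 0x0D):     # B and D mark negative values
        number = -number
    return Decimal(number).scaleb(-scale)
```

For example, the three bytes 12 34 5C with two implied decimals decode to 123.45, and the zoned bytes F1 F2 D3 decode to -123. Returning `Decimal` rather than a float keeps every digit of long fields.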
3) We have several applications that need a redesign: for instance, SAS/AF does not work anymore in EBI, SAS/FSP windows do not work anymore in EBI, IMS/DLI data access is not available on Linux, calling COBOL routines is not available, etc. So there is a need for real in-depth testing.
09-25-2014 03:21 AM
Henri, there is no need for a BUT... what you are describing is quite common for big companies with a long history.
The issues you are describing, including all the politics, are business as usual, and they are not right; they stem from the bokito culture, see http://en.wikipedia.org/wiki/Bokito_(gorilla). When you are keeping 30 years of history, that is fine for life insurance; in other cases data should be wiped under retention policies.
ad 1) Nice that it is a big international company and that they support it with an international Linux environment. But there are some issues to mention on that.
There are guidelines on IT governance (ISO27k series, SOX, Basel III, COBIT, etc.); the international differences are rather small, but the control checks on them can differ. Prudent IT governance also states that you should not have outdated software in use and should keep it up to date (Life Cycle Management).
There is something to work on regarding not knowing your influence.
Well, they say there is an existing EBI environment; then there should be a SAS platform admin (all hats) supporting you. If not, then there is another political issue to solve.
ad 2) All those other mainframe types being used are nothing special to me, although you did not mention them before. I did mention SMF (special type VBS). You did not mention VSAM. SAS uses a QSAM-like type with EXCP access methods. The numeric conversion was one of my pieces of advice; you have it already. I am not seeing a response on that 4-8 byte standard, but that is just a further detail.
Translating JCL into LIBNAMEs/FILENAMEs and putting that into the SAS sources is not very sensible. The best place to do that in an EBI (BI/DI) environment is to define them in association with an appserver context. The appserver context is a logical application environment. You did not mention how many of those you have.
With filenames, macros, and formats involved you get into updating the usermod files at the OS level at .../Lev-/<SASApp>/... You need a platform admin.
For those PD and ZD fields you will most likely run into conceptual issues with the precision of floating-point numbers. A bank account number is not suitable for being processed as a numeric (floating point) in SAS.
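The floating-point point can be demonstrated with a short Python check; Python floats are 8-byte IEEE doubles, like SAS numerics on Linux. The function name and the sample digit strings are made up for the example.

```python
from decimal import Decimal

def survives_as_double(digits):
    """True if a digit string keeps every digit after a round trip through a float."""
    return Decimal(float(digits)) == Decimal(digits)
```

A 16-digit value such as "1234567890123456" survives, but "9007199254740993" (one above 2**53) and an 18-digit identifier such as "123456789012345678" do not. Long identifiers read from PD or ZD fields are therefore safer kept as character variables.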
Not being allowed to use FTP (not encrypted, not even password encryption) while not being given SAS/Connect (which has password encryption, and data encryption with 9.4 SAS/Secure AES included) is an example of the typical bokito behavior of the old classic IT department.
They are forcing you into their sFTP tooling no matter whether it is a good fit or not. sFTP is logically the same as FTP; the only difference is the encryption. That encryption is done as part of the mentioned regulations. It is an often-practiced joke: following only the regulations you like, without really wanting to follow them.
ad 3) With EBI you cannot run SAS/AF, as EBI is client/server based. You need to go for SAS VA (9.4 strongly recommended) or the SAS portal (web-based reporting). Both are heavily based on the SAS metadata server. That is where you need SAS platform admin support, from architectural vision to operational actions.
I expect this change will be followed by the statement that SAS should be replaced by ..., where ... is another tool preferred by your multinational.
You could, however, run SAS/AF on Linux (using X11) or on Windows with Base SAS. That may put you in the position that users are not allowed on Unix.
Nothing new here.
My preference for SAS/Connect is that it makes most of all that transparent. I have run that, supporting Windows, Unix, and mainframe processing for some analytic users (marketing/business analysts), running it all at the same time with the data seen as federated.
Yep, I have seen that all before.
My history is about 30 years in this (SAS and systems programmer in an IBM context), personally being subject to some retention policies.