I have a requirement to compare two mainframe file NAMES. (not the records) . If match write to output#1 if no match write to output#2
The file names are inside two different PDS
PDS#1 looks like
FILE1.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!
FILE2.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!
FILE3.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!
! is a delimiter in above file
PDS#2 looks like
FILE1.DSRA0201.$EFG00.G4461V00
FILE2.DSRA0201.$EFG00.G4461V00
FILE3.DSRA0500.$EFG00.G4461V00
The challenge is file#1 contains names with G*
File#2 has full name of the file with gdg name
So I need to take file name up to 'G; from from #2 and compare with file name up to 'G' in #1
File length (number of nodes) may vary. If it was fixed, I can do it easily in JCL sort.
Is there a way read record up to ".G" and compare both in SAS?
Hi @Phil20
A small change is needed. Pds1 has 4 significant name levels, and Pds2 has 5 name levels. So you need to
The SAS part of the code should be changed to the this - the 3 changed lines are marked:
data pds1 (drop=e1-e4); 
	infile IFILE1 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn1 = catx('.',e1,e2,e3,e4);
run;
data pds2 (drop=e1-e4); 
	infile IFILE2 dsd dlm='.';
	input e1$ e2$ e3$ e4$ e5$;       /* changed */
	fn2 = catx('.',e1,e2,e3,e4,e5);  /* changed */
run;
proc sql;
	create table tmp as
	select distinct pds1.fn1, pds2.fn2
	from pds1 full outer join pds2
	on pds1.fn1 EQT pds2.fn2;        /* changed */
quit;
data _null_; set tmp (where=(fn1 ne '' and fn2 ne ''));
	file ofile1;
	put fn2;
run;
data _null_; set tmp (where=(fn1 = '' and fn2 ne ''));
	file ofile2;
	put fn2;
run;
I hope this works. Given the same test data, the output files contains:
OFILE1
FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00
OFILE2
FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00
I'm sure there is some way with regular expressions.
You need to show what the desired output looks like as I am not sure what you expect. You mention comparing but did not show how to express the result of the comparison.
If your character G is the actual "delimiter" for where things change then you can use the SCAN function to extract the bits before for comparison:
data pds1; str1= "FILE1.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!"; str2 = "FILE1.DSRA0201.$EFG00.G4461V00"; same = (scan(str1,1,'G') = scan(str2,1,'G')); run;
Same will have the value 1 (true) if the strings are identical up to the G, or 0 otherwise.
input filefile1
file2
desired output if match
Hi Ballard,
Can I use two input files in SAS and scan each record from file 2 against file 1 .
Here is my inputs and desired outputs
input file1
FDI18F.DSSR8801.V2524.$ICBD47.G*!
FDI18F.DSSR8801.V2524.$ICBD48.G*!
 
input file2 (to compare against file1)
FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00
FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00
Process:
1. Read first record from file 2 up to .G and scan in file1 . If matched write record to output file
2. repeat step1
-----------------------------------------------------------------------------
desired output (matched file) is
   FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
   FDI18F.DSSR8801.V2524.$ICBD48.G4395V00
-----------------------------------------------------------------------------
3. repeat step 1. If not matched write output to unmatched file
-----------------------------------------------------------------------------
desired output (unmatched file) is
FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00
-----------------------------------------------------------------------------
Hi @Phil20
From yout reply to @ballardw it seems that you want a program that can be written in JCL, reads the files to compare and writes the results to other files. You can use SAS as the programming language as shown below.
It would be rather difficult to code without using SAS data sets for the comparison, so the following code reads the two input files with pds member names into two SAS data sets. The input files are treated as delimited files with point as separator, because it is an easy and straightforward way to get the first 4 member name elements, when the length of each element is umknown. The elements are concatenated with a point to give a member name prefix used in the comparison.
Then the two data sets are joined in a full join, and the results are written to two files, one with match and another with unmatched names from pds2. The input is copied from your answer to Ballard, and the code is tested in windows, because I dont have access to mainframe SAS, Please complete the JCL to run in your environment and using your files.
//<your job card info>
//SASSTEP   EXEC SAS
//IFILE1 DD DSN=<your infile1 - long names>
//IFILE2 DD DSN=<your infile2 - short names>
//OFILE1 DD DSN=<your new outfile>
//OFILE2 DD DSN=<your new outfile>
//SYSIN  DD *
data pds1 (drop=e1-e4); 
	infile IFILE1 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn1 = catx('.',e1,e2,e3,e4);
run;
data pds2 (drop=e1-e4); 
	infile IFILE2 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn2 = catx('.',e1,e2,e3,e4);
run;
proc sql;
	create table tmp as
	select distinct pds1.fn1, pds2.fn2
	from pds1 full outer join pds2
	on pds1.fn1 = pds2.fn2;
quit;
data _null_; set tmp (where=(fn1 ne '' and fn2 ne ''));
	file ofile1;
	put fn2;
run;
data _null_; set tmp (where=(fn1 = '' and fn2 ne ''));
	file ofile2;
	put fn2;
run;
the input is
OFILE1
FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00
OFILE2
FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00
Thank you so much Eric. you solution almost working.
output file OFILE1 is cutting at position 28
=COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
****** ***************************** Top of Data ******************************
000001 FDI02F.DSRR8801.V2547.$ARS00
 
The desired OFILE1 is
----+----1----+----2----+----3----+----4----+
FDI02F.DSRR8801.V2547.$ARS00.G8205V00
what you suggest to change?
Hi @Phil20
A small change is needed. Pds1 has 4 significant name levels, and Pds2 has 5 name levels. So you need to
The SAS part of the code should be changed to the this - the 3 changed lines are marked:
data pds1 (drop=e1-e4); 
	infile IFILE1 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn1 = catx('.',e1,e2,e3,e4);
run;
data pds2 (drop=e1-e4); 
	infile IFILE2 dsd dlm='.';
	input e1$ e2$ e3$ e4$ e5$;       /* changed */
	fn2 = catx('.',e1,e2,e3,e4,e5);  /* changed */
run;
proc sql;
	create table tmp as
	select distinct pds1.fn1, pds2.fn2
	from pds1 full outer join pds2
	on pds1.fn1 EQT pds2.fn2;        /* changed */
quit;
data _null_; set tmp (where=(fn1 ne '' and fn2 ne ''));
	file ofile1;
	put fn2;
run;
data _null_; set tmp (where=(fn1 = '' and fn2 ne ''));
	file ofile2;
	put fn2;
run;
I hope this works. Given the same test data, the output files contains:
OFILE1
FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00
OFILE2
FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00
Erick, you are the best
 Solved my problem 
Hi @Phil20
It seems from your test data that you are interested in the difference between the two lists. One way is to join the two tables with a join condition that takes the file name from Pds1 up to .G* and match it with file names in Pds2 beginning with the same string, i.e comparing the first 3 elements of the file names.
* test data;
data pds1; 
	fn = 'FILE1.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!'; output;
	fn = 'FILE2.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!'; output;
	fn = 'FILE3.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!'; output;
run;
data pds2;
	fn = 'FILE1.DSRA0201.$EFG00.G4461V00'; output;
	fn = 'FILE2.DSRA0201.$EFG00.G4461V00'; output;
	fn = 'FILE3.DSRA0500.$EFG00.G4461V00'; output;
run;
* compare;
proc sql;
	create table want as
		select 
			pds1.fn as pds1 label='',
			pds2.fn as pds2 label='',
			case
				when pds1.fn ne '' and pds2.fn ne '' then 'Both'
				when pds1.fn ne '' and pds2.fn =  '' then 'Pds1'
				when pds1.fn =  '' and pds2.fn ne '' then 'Pds2'
				else ''
			end as FoundIn
		from pds1 full outer join pds2
		on substr(pds1.fn,1,index(pds1.fn,'.G*')-1) eqt  pds2.fn;
quit;
Result:
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.
