BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Phil20
Obsidian | Level 7

I have a requirement to compare two mainframe file NAMES. (not the records) . If match write to output#1 if no match write to output#2

The file names are inside two different PDS 

 

PDS#1 looks like

FILE1.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!
FILE2.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!

FILE3.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!

 

! is a delimiter in above file

 

PDS#2 looks like

FILE1.DSRA0201.$EFG00.G4461V00
FILE2.DSRA0201.$EFG00.G4461V00

FILE3.DSRA0500.$EFG00.G4461V00

 

The challenge is file#1 contains names with G*

File#2 has full name of the file with gdg name

 

So I need to take file name up to 'G; from from #2 and compare with file name up to 'G' in #1

File length (number of nodes) may vary. If it was fixed, I can do it easily in JCL sort.

 

Is there a way read record up to ".G" and compare both in SAS?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Phil20 

 

A small change is needed. Pds1 has 4 significant name levels, and Pds2 has 5 name levels. So you need to 

  1. Change the step reading Pds2, so it reads 5 levels instead of 4, and
  2. Change the comparison in the SQL Where-condition, so it allows a truncated match on 4 levels from Pds1 against a longer string.

The SAS part of the code should be changed to the this - the 3 changed lines are marked:

 

data pds1 (drop=e1-e4); 
	infile IFILE1 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn1 = catx('.',e1,e2,e3,e4);
run;
data pds2 (drop=e1-e4); 
	infile IFILE2 dsd dlm='.';
	input e1$ e2$ e3$ e4$ e5$;       /* changed */
	fn2 = catx('.',e1,e2,e3,e4,e5);  /* changed */
run;
proc sql;
	create table tmp as
	select distinct pds1.fn1, pds2.fn2
	from pds1 full outer join pds2
	on pds1.fn1 EQT pds2.fn2;        /* changed */
quit;
data _null_; set tmp (where=(fn1 ne '' and fn2 ne ''));
	file ofile1;
	put fn2;
run;
data _null_; set tmp (where=(fn1 = '' and fn2 ne ''));
	file ofile2;
	put fn2;
run;

 

I hope this works. Given the same test data, the output files contains:

 

OFILE1

FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00

 

OFILE2

FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00

 

 

 

View solution in original post

10 REPLIES 10
Phil20
Obsidian | Level 7
Hello expertise,
Let me make a clarification. you can consider both datasets as files.
Grab the first colume from each file where it ends with a .G and compare
ballardw
Super User

I'm sure there is some way with regular expressions.

 

You need to show what the desired output looks like as I am not sure what you expect. You mention comparing but did not show how to express the result of the comparison.

 

 

If your character G is the actual "delimiter" for where things change then you can use the SCAN function to extract the bits before for comparison:

 

data pds1;
   str1=  "FILE1.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!";
   str2 = "FILE1.DSRA0201.$EFG00.G4461V00";
   same = (scan(str1,1,'G') = scan(str2,1,'G'));
run;

Same will have the value 1 (true) if the strings are identical up to the G, or 0 otherwise.

Phil20
Obsidian | Level 7

input filefile1file1file2file2desired output if matchdesired output if match

Phil20
Obsidian | Level 7

Hi Ballard,

Can I use two input files in SAS and scan each record from file 2 against file 1 .

Here is my inputs and desired outputs

 

input file1

FDI18F.DSSR8801.V2524.$ICBD47.G*!
FDI18F.DSSR8801.V2524.$ICBD48.G*!
 

 

input file2 (to compare against file1)

 

FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00
FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00

 

Process:

1. Read first record from file 2 up to .G and scan in file1 . If matched write record to output file

2. repeat step1

 

-----------------------------------------------------------------------------

    desired output (matched file) is 

   FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
   FDI18F.DSSR8801.V2524.$ICBD48.G4395V00

-----------------------------------------------------------------------------

3. repeat step 1. If not matched write output to unmatched file

 

-----------------------------------------------------------------------------

    desired output (unmatched file) is 

FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00

-----------------------------------------------------------------------------

 

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Phil20 

 

From yout reply to @ballardw it seems that you want a program that can be written in JCL, reads the files to compare and writes the results to other files. You can use SAS as the programming language as shown below.

 

It would be rather difficult to code without using SAS data sets for the comparison, so the following code reads the two input files with pds member names into two SAS data sets. The input files are treated as delimited files with point as separator, because it is an easy and straightforward way to get the first 4 member name elements, when the length of each element is umknown. The elements are concatenated with a point to give a member name prefix used in the comparison.

 

Then the two data sets are joined in a full join, and the results are written to two files, one with match and another with unmatched names from pds2. The input is copied from your answer to Ballard, and the code is tested in windows, because I dont have access to mainframe SAS, Please complete the JCL to run in your environment and using your files.

 

//<your job card info>
//SASSTEP   EXEC SAS
//IFILE1 DD DSN=<your infile1 - long names>
//IFILE2 DD DSN=<your infile2 - short names>
//OFILE1 DD DSN=<your new outfile>
//OFILE2 DD DSN=<your new outfile>
//SYSIN  DD *
data pds1 (drop=e1-e4); 
	infile IFILE1 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn1 = catx('.',e1,e2,e3,e4);
run;
data pds2 (drop=e1-e4); 
	infile IFILE2 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn2 = catx('.',e1,e2,e3,e4);
run;
proc sql;
	create table tmp as
	select distinct pds1.fn1, pds2.fn2
	from pds1 full outer join pds2
	on pds1.fn1 = pds2.fn2;
quit;
data _null_; set tmp (where=(fn1 ne '' and fn2 ne ''));
	file ofile1;
	put fn2;
run;
data _null_; set tmp (where=(fn1 = '' and fn2 ne ''));
	file ofile2;
	put fn2;
run;

 

the input is 

 

OFILE1
FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00

 

OFILE2
FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00

 

 

 

 

Phil20
Obsidian | Level 7

Thank you so much Eric. you solution almost working. 

 

output file OFILE1 is cutting at position 28 

=COLS> ----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
****** ***************************** Top of Data ******************************
000001 FDI02F.DSRR8801.V2547.$ARS00
 

The desired OFILE1 is 

 

----+----1----+----2----+----3----+----4----+
FDI02F.DSRR8801.V2547.$ARS00.G8205V00

 

what you suggest to change?

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Phil20 

 

A small change is needed. Pds1 has 4 significant name levels, and Pds2 has 5 name levels. So you need to 

  1. Change the step reading Pds2, so it reads 5 levels instead of 4, and
  2. Change the comparison in the SQL Where-condition, so it allows a truncated match on 4 levels from Pds1 against a longer string.

The SAS part of the code should be changed to the this - the 3 changed lines are marked:

 

data pds1 (drop=e1-e4); 
	infile IFILE1 dsd dlm='.';
	input e1$ e2$ e3$ e4$;
	fn1 = catx('.',e1,e2,e3,e4);
run;
data pds2 (drop=e1-e4); 
	infile IFILE2 dsd dlm='.';
	input e1$ e2$ e3$ e4$ e5$;       /* changed */
	fn2 = catx('.',e1,e2,e3,e4,e5);  /* changed */
run;
proc sql;
	create table tmp as
	select distinct pds1.fn1, pds2.fn2
	from pds1 full outer join pds2
	on pds1.fn1 EQT pds2.fn2;        /* changed */
quit;
data _null_; set tmp (where=(fn1 ne '' and fn2 ne ''));
	file ofile1;
	put fn2;
run;
data _null_; set tmp (where=(fn1 = '' and fn2 ne ''));
	file ofile2;
	put fn2;
run;

 

I hope this works. Given the same test data, the output files contains:

 

OFILE1

FDI18F.DSSR8801.V2524.$ICBD47.G4395V00
FDI18F.DSSR8801.V2524.$ICBD48.G4395V00

 

OFILE2

FDI18F.DSSR8801.V2524.$ICBD49.G4395V00
FDI18F.DSSR8801.V2524.$ICBE00.G4201V00

 

 

 

Phil20
Obsidian | Level 7

Erick, you are the best

 Solved my problem Smiley LOL

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Phil20 

 

It seems from your test data that you are interested in the difference between the two lists. One way is to join the two tables with a join condition that takes the file name from Pds1 up to .G* and match it with file names in Pds2 beginning with the same string, i.e comparing the first 3 elements of the file names.

 

* test data;
data pds1; 
	fn = 'FILE1.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!'; output;
	fn = 'FILE2.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!'; output;
	fn = 'FILE3.DSRA0201.$EFG00.G*!COLUME2!COLUME3!COLUME4!'; output;
run;

data pds2;
	fn = 'FILE1.DSRA0201.$EFG00.G4461V00'; output;
	fn = 'FILE2.DSRA0201.$EFG00.G4461V00'; output;
	fn = 'FILE3.DSRA0500.$EFG00.G4461V00'; output;
run;

* compare;
proc sql;
	create table want as
		select 
			pds1.fn as pds1 label='',
			pds2.fn as pds2 label='',
			case
				when pds1.fn ne '' and pds2.fn ne '' then 'Both'
				when pds1.fn ne '' and pds2.fn =  '' then 'Pds1'
				when pds1.fn =  '' and pds2.fn ne '' then 'Pds2'
				else ''
			end as FoundIn
		from pds1 full outer join pds2
		on substr(pds1.fn,1,index(pds1.fn,'.G*')-1) eqt  pds2.fn;
quit;

Result:

 

 

cmpfn.gif

ErikLund_Jensen
Rhodochrosite | Level 12
You need to have the files read into SAS to do it that way.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1402 views
  • 2 likes
  • 3 in conversation